Nucleic Acids Research. 2015 Nov 2;44(Database issue):D523–D526. doi: 10.1093/nar/gkv1117

MetaNetX/MNXref – reconciliation of metabolites and biochemical reactions to bring together genome-scale metabolic networks

Sébastien Moretti 1,2, Olivier Martin 1, T Van Du Tran 1, Alan Bridge 3, Anne Morgat 3,4, Marco Pagni 1,*
PMCID: PMC4702813  PMID: 26527720

Abstract

MetaNetX is a repository of genome-scale metabolic networks (GSMNs) and biochemical pathways from a number of major resources imported into a common namespace of chemical compounds, reactions, cellular compartments—namely MNXref—and proteins. The MetaNetX.org website (http://www.metanetx.org/) provides access to these integrated data as well as a variety of tools that allow users to import their own GSMNs, map them to the MNXref reconciliation, and manipulate, compare, analyze, simulate (using flux balance analysis) and export the resulting GSMNs. MNXref and MetaNetX are regularly updated and freely available.

INTRODUCTION

A genome-scale metabolic network (GSMN), or stoichiometric model, describes the set of biochemical reactions that may occur in a given organism, together with the requisite enzymes, and may also include information on sub-cellular compartments, transport reactions and transporters. By design, GSMNs focus on the metabolism of low-molecular-weight compounds, to which the laws of energy and mass conservation can be applied, and are not suited to representing gene regulation or signaling pathways. In practice, a GSMN serves a dual purpose: it is both a repository of knowledge about an organism's metabolism and a model that can be simulated using flux balance analysis (FBA). Such simulations can address different questions: (i) establishing the essentiality of genes in specific growth conditions; (ii) revealing opportunities for metabolic engineering and optimization; (iii) suggesting new drug targets (1). To permit simulations, a GSMN usually includes artificial reactions that describe the growth medium, a growth equation (which implies the composition of the biomass) and possibly hypothetical reactions that are not (yet) supported by experimental biology but are required to make the model functional.
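For readers unfamiliar with FBA, its standard textbook formulation is a linear program. The statement below is the generic form, not a description of the MetaNetX implementation, and the symbols S, v, c, l and u are notation introduced here for illustration:

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& l \le v \le u
\end{aligned}
```

Here S is the stoichiometric matrix (one row per metabolite, one column per reaction), v is the vector of reaction fluxes, and c selects the objective, typically the growth equation. The bounds l and u encode reaction reversibility and the composition of the growth medium, and the constraint S v = 0 expresses mass conservation at steady state.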

A relatively small number of high-quality GSMNs have been published to date, mainly for model organisms, and are made available by a few dedicated databases (2–6). The development of such models requires significant human effort and curation, and the fully automated reconstruction of a GSMN from an annotated genome sequence remains a challenge (7,8). Such methods require the integration of high-quality curated data covering the known biochemistry of a vast range of organisms, as well as methods that address the specific requirements of a functional GSMN, including the elemental balancing of individual reactions. These considerations form the major motivation for the development of the resource presented here.

MNXREF RECONCILIATION

The metabolite identifiers found in the early-published GSMNs were often specific to the individual groups developing and curating them, and did not generally reference the major databases of chemical compounds. In recent years there have been a few attempts to ‘reconcile’ the different nomenclatures of these compounds (9,10) including our own effort MNXref (11). The principles of the reconciliation algorithm used in MNXref can be summarized as follows:

  1. Reconciliation of common metabolites based on chemical structures;

  2. Reconciliation of metabolites through shared chemical nomenclature;

  3. Reconciliation of reactions through shared metabolites;

  4. Identification of candidate reactions for reconciliation through shared cross-references;

  5. Iterative reconciliation of metabolites through reaction context.
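The first three principles can be illustrated with a minimal sketch. It is not the MNXref algorithm itself: the identifiers, structure keys and names below are hypothetical toy data, identifiers are merged with a simple union-find structure, and two reactions are considered identical only if they are written in the same direction. Steps 4 and 5 are not covered.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure used to merge identifiers judged equivalent."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Keep the lexicographically smaller root so results are deterministic.
            if rb < ra:
                ra, rb = rb, ra
            self.parent[rb] = ra


def reconcile_metabolites(structures, names):
    """Merge metabolite IDs sharing a structure key (step 1) or a name (step 2)."""
    uf = UnionFind()
    for evidence in (structures, names):
        by_key = defaultdict(list)
        for met_id, key in evidence.items():
            uf.find(met_id)  # register the identifier even if it stays alone
            by_key[key].append(met_id)
        for ids in by_key.values():
            for other in ids[1:]:
                uf.union(ids[0], other)
    return uf


def reconcile_reactions(reactions, uf):
    """Group reactions whose equations coincide after metabolite merging (step 3)."""
    groups = defaultdict(list)
    for rxn_id, equation in reactions.items():
        canonical = frozenset((uf.find(m), coef) for m, coef in equation.items())
        groups[canonical].append(rxn_id)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: two source databases, "dbA" and "dbB".
structures = {"dbA:1": "KEY-X", "dbB:m1": "KEY-X", "dbA:2": "KEY-Y"}
names = {"dbA:2": "compound y", "dbB:m2": "compound y"}
reactions = {
    "dbA:R1": {"dbA:1": -1, "dbA:2": 1},
    "dbB:r9": {"dbB:m1": -1, "dbB:m2": 1},
}

uf = reconcile_metabolites(structures, names)
groups = reconcile_reactions(reactions, uf)
print(groups)
```

In this toy case the two reactions end up in the same group because their metabolites were merged first, once through a shared structure and once through a shared name.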

Figure 1 illustrates the reconciliation process using malonyl-CoA as an example; individual steps in the reconciliation are color-coded according to the type of evidence used. Table 1 summarizes the overlaps between the various sources of biochemical data and GSMNs according to the results of the MNXref reconciliation. The MNXref namespace is regularly updated with metabolite and reaction data from new resources; recent additions include the EAWAG-BBD/UMBBD pathway database (12).

Figure 1.

Evidence used to reconcile different chemical compounds for the metabolite malonyl-CoA (MNXM40 in MNXref): magenta, using structure supplied by the source databases; red, using recomputed structure; orange, recomputed structure protonated at pH 7.3; yellow, recomputed structure protonated at pH 7.3 but ignoring the stereo layer of the InChI representation; green, using the cross-references supplied by the source databases; dark blue, based on compound primary names; light blue, based on compound synonym names. Triangle: in a post-processing step, chebi:57384 was chosen as the best representative of the targeted metabolite.
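To make the "ignoring the stereo layer" comparison concrete, the sketch below drops the stereochemical layers of an InChI string (those prefixed /b, /t, /m and /s) before comparing two structures. The function name is hypothetical and the source does not describe how MNXref performs this step, so this is only one plausible way to do it; the alanine strings are used purely as an illustration.

```python
def strip_stereo_layers(inchi):
    """Return the InChI string without its stereochemical layers (/b, /t, /m, /s)."""
    prefix, *layers = inchi.split("/")
    # The formula layer starts with an uppercase element symbol, so testing the
    # lowercase layer prefix never removes it.
    kept = [layer for layer in layers if layer[:1] not in ("b", "t", "m", "s")]
    return "/".join([prefix, *kept])


# Illustrative inputs: the two enantiomers of alanine differ only in the /m layer.
l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

print(strip_stereo_layers(l_alanine) == strip_stereo_layers(d_alanine))
```

Matching on the reduced string is weaker evidence than matching on the full structure, which is consistent with the separate color assigned to it in the figure.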

Table 1. Numbers of reconciled metabolites (a) and reactions (b) in MNXref 2.0, and mapped proteins (c), found in common between published GSMNs and major biochemical databases in MetaNetX.org.

| Resource (version or date) | MNXref | BiGG (2), 18 GSMNs | BioCyc (3), 19 GSMNs | Path2Models (4), 132 GSMNs | The SEED (18), 50 GSMNs | YeastNet (6), 1 GSMN |
|---|---|---|---|---|---|---|
| (a) Metabolites | | | | | | |
| BiGG [a] (2) (version 2beta) | 4039 | 3414 | 2610 | 2836 | 1829 | 1021 |
| BioPath (19) (2010–05–03) | 1313 | 649 | 875 | 943 | 567 | 427 |
| ChEBI (20) (version 131) | 46 477 | 8507 | 15 631 | 17 973 | 7108 | 4416 |
| HMDB (21) (version 3.6) | 42 542 | 1292 | 2525 | 3044 | 1054 | 714 |
| KEGG (22) (version 75.1) | 28 429 | 1958 | 3945 | 5356 | 1560 | 908 |
| LIPIDMAPS (23) (2015–06–28) | 40 719 | 412 | 1382 | 1587 | 280 | 252 |
| MetaCyc (3) (version 19.1) | 15 472 | 1835 | 5380 | 5637 | 1399 | 826 |
| Reactome (24) (2015–07–13) | 4576 | 1799 | 2539 | 2770 | 1467 | 1521 |
| The SEED [a] (18) (2013–06–19) | 16 280 | 2040 | 3120 | 4098 | 1551 | 678 |
| UMBBD–EAWAG (12) (2014–06–30) | 1395 | 206 | 347 | 588 | 150 | 67 |
| UniPathway (25) (version 2015_03) | 1113 | 692 | 874 | 928 | 657 | 393 |
| (b) Reactions | | | | | | |
| BiGG [a] (2) (version 2beta) | 11 458 | 6055 | 3380 | 2580 | 1876 | 1730 |
| BioPath (19) (2010–05–03) | 1545 | 456 | 684 | 725 | 328 | 285 |
| KEGG (22) (version 75.1) | 9925 | 1335 | 3085 | 4309 | 877 | 528 |
| MetaCyc (3) (version 19.1) | 13 793 | 1419 | 5040 | 4220 | 828 | 549 |
| Reactome (24) (2015–07–13) | 23 592 | 4111 | 5848 | 5147 | 2849 | 2604 |
| Rhea (16) (version 64) | 32 256 | 5101 | 10 603 | 10 293 | 3190 | 2050 |
| The SEED [a] (18) (2013–06–19) | 13 260 | 3069 | 2980 | 3337 | 1738 | 932 |
| UniPathway (25) (version 2015_03) | 1994 | 1065 | 1435 | 1471 | 836 | 559 |
| (c) Proteins | | | | | | |
| UniProt (15) (version 2015_08) | | 11 142 | 15 670 | 76 293 | 27 154 | 912 |

[a] BiGG and The SEED distribute collections of metabolites and reactions that are not necessarily retrieved in one of their GSMNs.

In the construction and use of GSMNs, every reaction must be balanced with respect to elemental composition and charge; failure to balance reactions will lead to violations in mass conservation that can have detrimental effects on the downstream simulations. The case of protons is worthy of particular attention in this regard. Protons provide a means to balance chemical equations occurring in aqueous solution, but they are also responsible for creating membrane potentials whose dissipation is a major driving force in cell metabolism. In order to distinguish these two roles we have introduced separate identifiers for those protons transported across a membrane (MNXM01 in MNXref), and those protons introduced for the purposes of balancing a reaction (MNXM1 in MNXref). An artificial spontaneous reaction is then added to every compartment of the GSMN to permit the free exchange between transported and balanced protons (MNXR01 in MNXref). In this way, the original properties of the GSMN are preserved.
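The balance requirement can be checked mechanically. The following is a minimal sketch, not the MetaNetX implementation: the function names and the short metabolite keys are hypothetical, equations are dictionaries of stoichiometric coefficients (negative for substrates, positive for products), and the formulas and charges for ATP hydrolysis are illustrative values for the protonation states commonly assumed near neutral pH. Only MNXM1, the identifier for balancing protons, is taken from the text.

```python
from collections import Counter


def imbalance(equation, formulas, charges):
    """Return (net element counts, net charge); ({}, 0) means the reaction is balanced."""
    elements = Counter()
    net_charge = 0
    for met, coef in equation.items():
        for element, count in formulas[met].items():
            elements[element] += coef * count
        net_charge += coef * charges[met]
    return {e: n for e, n in elements.items() if n != 0}, net_charge


def add_balancing_protons(equation, formulas, charges, proton="MNXM1"):
    """Balance a reaction with protons when H and charge are off by the same amount."""
    elements, charge = imbalance(equation, formulas, charges)
    if not elements and charge == 0:
        return dict(equation)
    if set(elements) != {"H"} or elements["H"] != charge:
        raise ValueError("imbalance cannot be fixed with protons alone")
    fixed = dict(equation)
    fixed[proton] = fixed.get(proton, 0) - charge
    if fixed[proton] == 0:
        del fixed[proton]
    return fixed


# Illustrative data for ATP + H2O -> ADP + phosphate, written without the proton.
formulas = {
    "atp": {"C": 10, "H": 12, "N": 5, "O": 13, "P": 3},
    "adp": {"C": 10, "H": 12, "N": 5, "O": 10, "P": 2},
    "pi": {"H": 1, "O": 4, "P": 1},
    "h2o": {"H": 2, "O": 1},
    "MNXM1": {"H": 1},
}
charges = {"atp": -4, "adp": -3, "pi": -2, "h2o": 0, "MNXM1": 1}
unbalanced = {"atp": -1, "h2o": -1, "adp": 1, "pi": 1}

print(imbalance(unbalanced, formulas, charges))
balanced = add_balancing_protons(unbalanced, formulas, charges)
print(balanced)
```

Because the balancing proton carries its own identifier, a reaction repaired in this way remains distinguishable from one that moves a proton across a membrane.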

METANETX REPOSITORY AND TOOLS

MetaNetX.org (13) is a website that provides free access to the MNXref reconciliation data and a collection of published GSMNs and biochemical pathways mapped onto MNXref. The website also allows users to upload, manipulate, analyze or modify their own GSMNs and export them in SBML or in our own tab-delimited format. MetaNetX.org also offers a selection of tools for analyses including network structure, FBA or nested pattern methods (14).

Gene names have been widely used in published GSMNs to describe the protein complexes that act as enzymes or transporters. Gene nomenclature is, however, essentially organism specific, if not dependent on a particular genome assembly. In MetaNetX we use UniProt accession numbers (15) to identify gene products, which greatly facilitates the comparison of GSMNs from different sources across organisms.

Although the MNXref reconciliation algorithm is essentially automated, the compilation of the MetaNetX repository requires some manual intervention and a certain number of editorial choices. These include the definition of an accepted list of species and strains that covers important model organisms. Preference is given to the most comprehensive GSMNs from external sources that use accepted standard formats and have sufficient protein coverage (full acknowledgement is given to these external sources). We are closely collaborating with Rhea (16), a database of manually curated biochemical reactions, as part of the ongoing effort to further improve the quality of annotation of our resource.

CONCLUSION

The www.metanetx.org resource provides a comprehensive suite of tools for the analysis of genome-scale metabolic models, based on a single integrated namespace of metabolites and metabolic reactions that integrates the most widely used biochemical databases and model repositories – MNXref. The reconciliation process used in MNXref greatly simplifies the development and analysis of genome-scale metabolic models, allowing users to concentrate on model analysis rather than the time-consuming problem of identifier mapping. Future developments will include the provision of tools and the integration of new resources such as the SwissLipids knowledgebase (17), which provides lipid structures and curated data on enzymatic reactions.

Acknowledgments

Computation and maintenance of the MetaNetX.org server are provided by the Vital-IT center for high-performance computing of the SIB Swiss Institute of Bioinformatics (http://www.vital-it.ch). We thank Ioannis Xenarios and Joerg Stelling for support and feedback.

FUNDING

Swiss Initiative for Systems Biology [SystemsX.ch projects MetaNetX, HostPathX and SyBIT] evaluated by the Swiss National Science Foundation; Swiss Federal Government through the Federal Office of Education and Science. Funding for open access charge: SIB Swiss Institute of Bioinformatics.

Conflict of interest statement. None declared.

REFERENCES

  1. Bordbar A., Monk J.M., King Z.A., Palsson B.O. Constraint-based models predict metabolic and associated cellular functions. Nat. Rev. Genet. 2014;15:107–120. doi: 10.1038/nrg3643.
  2. Schellenberger J., Park J.O., Conrad T.M., Palsson B.O. BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics. 2010;11:213. doi: 10.1186/1471-2105-11-213.
  3. Caspi R., Altman T., Billington R., Dreher K., Foerster H., Fulcher C.A., Holland T.A., Keseler I.M., Kothari A., Kubo A., et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res. 2014;42:D459–D471. doi: 10.1093/nar/gkt1103.
  4. Büchel F., Rodriguez N., Swainston N., Wrzodek C., Czauderna T., Keller R., Mittag F., Schubert M., Glont M., Golebiewski M., et al. Path2Models: large-scale generation of computational models from biochemical pathway maps. BMC Syst. Biol. 2013;7:116. doi: 10.1186/1752-0509-7-116.
  5. Henry C.S., DeJongh M., Best A.A., Frybarger P.M., Linsay B., Stevens R.L. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat. Biotechnol. 2010;28:977–982. doi: 10.1038/nbt.1672.
  6. Kim H., Shin J., Kim E., Kim H., Hwang S., Shim J.E., Lee I. YeastNet v3: a public database of data-specific and integrated functional gene networks for Saccharomyces cerevisiae. Nucleic Acids Res. 2014;42:D731–D736. doi: 10.1093/nar/gkt981.
  7. Aziz R.K., Bartels D., Best A.A., DeJongh M., Disz T., Edwards R.A., Formsma K., Gerdes S., Glass E.M., Kubal M., et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9:75. doi: 10.1186/1471-2164-9-75.
  8. Karp P.D., Paley S.M., Krummenacker M., Latendresse M., Dale J.M., Lee T.J., Kaipa P., Gilham F., Spaulding A., Popescu L., et al. Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology. Brief Bioinform. 2010;11:40–79. doi: 10.1093/bib/bbp043.
  9. Lang M., Stelzer M., Schomburg D. BKM-react, an integrated biochemical reaction database. BMC Biochem. 2011;12:42. doi: 10.1186/1471-2091-12-42.
  10. Kumar A., Suthers P.F., Maranas C.D. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases. BMC Bioinformatics. 2012;13:6. doi: 10.1186/1471-2105-13-6.
  11. Bernard T., Bridge A., Morgat A., Moretti S., Xenarios I., Pagni M. Reconciliation of metabolites and biochemical reactions for metabolic networks. Brief Bioinform. 2014;15:123–135. doi: 10.1093/bib/bbs058.
  12. Gao J., Ellis L.B.M., Wackett L.P. The University of Minnesota Biocatalysis/Biodegradation Database: improving public access. Nucleic Acids Res. 2010;38:D488–D491. doi: 10.1093/nar/gkp771.
  13. Ganter M., Bernard T., Moretti S., Stelling J., Pagni M. MetaNetX.org: a website and repository for accessing, analysing and manipulating metabolic networks. Bioinformatics. 2013;29:815–816. doi: 10.1093/bioinformatics/btt036.
  14. Ganter M., Kaltenbach H.-M., Stelling J. Predicting network functions with nested patterns. Nat. Commun. 2014;5:3006. doi: 10.1038/ncomms4006.
  15. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43:D204–D212. doi: 10.1093/nar/gku989.
  16. Morgat A., Axelsen K.B., Lombardot T., Alcántara R., Aimo L., Zerara M., Niknejad A., Belda E., Hyka-Nouspikel N., Coudert E. Updates in Rhea - a manually curated resource of biochemical reactions. Nucleic Acids Res. 2015;43:D459–D464. doi: 10.1093/nar/gku961.
  17. Aimo L., Liechti R., Hyka-Nouspikel N., Niknejad A., Gleizes A., Götz L., Kuznetsov D., David F.P.A., van der Goot F.G., Riezman H., et al. The SwissLipids knowledgebase for lipid biology. Bioinformatics. 2015;31:2860–2866. doi: 10.1093/bioinformatics/btv285.
  18. Overbeek R., Olson R., Pusch G.D., Olsen G.J., Davis J.J., Disz T., Edwards R.A., Gerdes S., Parrello B., Shukla M., et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42:D206–D214. doi: 10.1093/nar/gkt1226.
  19. Forster M., Pick A., Raitner M., Schreiber F., Brandenburg F.J. The system architecture of the BioPath system. In Silico Biol. (Gedrukt) 2002;2:415–426.
  20. Hastings J., de Matos P., Dekker A., Ennis M., Harsha B., Kale N., Muthukrishnan V., Owen G., Turner S., Williams M., et al. The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res. 2013;41:D456–D463. doi: 10.1093/nar/gks1146.
  21. Wishart D.S., Jewison T., Guo A.C., Wilson M., Knox C., Liu Y., Djoumbou Y., Mandal R., Aziat F., Dong E., et al. HMDB 3.0 - The Human Metabolome Database in 2013. Nucleic Acids Res. 2013;41:D801–D807. doi: 10.1093/nar/gks1065.
  22. Kanehisa M., Goto S., Sato Y., Kawashima M., Furumichi M., Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014;42:D199–D205. doi: 10.1093/nar/gkt1076.
  23. Sud M., Fahy E., Cotter D., Dennis E.A., Subramaniam S. LIPID MAPS-nature lipidomics gateway: an online resource for students and educators interested in lipids. J. Chem. Educ. 2012;89:291–292. doi: 10.1021/ed200088u.
  24. Croft D., Mundo A.F., Haw R., Milacic M., Weiser J., Wu G., Caudy M., Garapati P., Gillespie M., Kamdar M.R., et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 2014;42:D472–D477. doi: 10.1093/nar/gkt1102.
  25. Morgat A., Coissac E., Coudert E., Axelsen K.B., Keller G., Bairoch A., Bridge A., Bougueleret L., Xenarios I., Viari A. UniPathway: a resource for the exploration and annotation of metabolic pathways. Nucleic Acids Res. 2012;40:D761–D769. doi: 10.1093/nar/gkr1023.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
