Abstract
MetaNetX is a repository of genome-scale metabolic networks (GSMNs) and biochemical pathways from a number of major resources imported into a common namespace of chemical compounds, reactions, cellular compartments—namely MNXref—and proteins. The MetaNetX.org website (http://www.metanetx.org/) provides access to these integrated data as well as a variety of tools that allow users to import their own GSMNs, map them to the MNXref reconciliation, and manipulate, compare, analyze, simulate (using flux balance analysis) and export the resulting GSMNs. MNXref and MetaNetX are regularly updated and freely available.
INTRODUCTION
A genome-scale metabolic network (GSMN), or stoichiometric model, describes the set of biochemical reactions which may occur in a given organism, as well as the requisite enzymes, and may also include information on sub-cellular compartments, transport reactions and transporters. By design GSMNs are focused on the metabolism of small molecular weight compounds when energy and mass conservation law can be applied, and are not suited to represent gene regulation or signaling pathways. In practice, a GSMN has a double purpose, as it is both a repository of knowledge about an organism's metabolism, and a model that can be simulated, using flux balance analysis (FBA). Such simulations can address different questions: (i) establish the essentiality of genes in specific growth conditions; (ii) reveal opportunities for metabolic engineering and optimization; (iii) suggest new drug targets (1). To permit simulations, a GSMN usually includes artificial reactions that describe the growth medium, a growth equation (which implies the composition of the biomass) and possibly hypothetical reactions not (yet) supported by experimental biology but required to make a model functional.
A relatively small number of high quality GSMNs have been published to date, essentially for model organisms, and are made available by a few dedicated databases (2–6). The development of such models requires significant human effort and curation, and the fully automated reconstruction of a GSMN from an annotated genome sequence remains a challenge (7,8). Such methods require the integration of high quality curated data covering the known biochemistry of a vast range of organisms, as well as methods that address the specific requirements of a functional GSMN, including the elemental balancing of individual reactions. These considerations form the major motivation for the development of the resource presented here.
MNXREF RECONCILIATION
The metabolite identifiers found in the early-published GSMNs were often specific to the individual groups developing and curating them, and did not generally reference the major databases of chemical compounds. In recent years there have been a few attempts to ‘reconcile’ the different nomenclatures of these compounds (9,10) including our own effort MNXref (11). The principles of the reconciliation algorithm used in MNXref can be summarized as follows:
Reconciliation of common metabolites based on chemical structures;
Reconciliation of metabolites through shared chemical nomenclature;
Reconciliation of reactions through shared metabolites;
Identification of candidate reactions for reconciliation through shared cross-references;
Iterative reconciliation of metabolites through reaction context.
Figure 1 illustrates the reconciliation process using malonyl-CoA as an example; individual steps in the reconciliation are color-coded according to the type of evidence used. Table 1 summarizes the overlaps between the various sources of biochemical data and GSMNs according to the results of the MNXref reconciliation. The MNXref namespace is regularly updated with metabolite and reaction data from new resources; recent additions include the EAWAG-BBD/UMBBD pathway database (12).
Table 1. Numbers of reconciled metabolites (a) and reactions (b) in MNXref 2.0, and mapped proteins (c), found in common between published GSMNs and major biochemical databases in MetaNetX.org.
MNXref | BiGG (2) 18 GSMNs | BioCyc (3) 19 GSMNs | Path2Models (4) 132 GSMNs | The SEED (18) 50 GSMNs | YeastNet (6) 1 GSMN | |
---|---|---|---|---|---|---|
(a) Metabolites | ||||||
BiGGa (2) (version 2beta) | 4039 | 3414 | 2610 | 2836 | 1829 | 1021 |
BioPath (19) (2010–05–03) | 1313 | 649 | 875 | 943 | 567 | 427 |
ChEBI (20) (version 131) | 46 477 | 8507 | 15 631 | 17 973 | 7108 | 4416 |
HMDB (21) (version 3.6) | 42 542 | 1292 | 2525 | 3044 | 1054 | 714 |
KEGG (22) (version 75.1) | 28 429 | 1958 | 3945 | 5356 | 1560 | 908 |
LIPIDMAPS (23) (2015–06–28) | 40 719 | 412 | 1382 | 1587 | 280 | 252 |
MetaCyc (3) (version 19.1) | 15 472 | 1835 | 5380 | 5637 | 1399 | 826 |
Reactome (24) (2015–07–13) | 4576 | 1799 | 2539 | 2770 | 1467 | 1521 |
The SEEDa (18) (2013–06–19) | 16 280 | 2040 | 3120 | 4098 | 1551 | 678 |
UMBBD–EAWAG (12) (2014–06–30) | 1395 | 206 | 347 | 588 | 150 | 67 |
UniPathway (25) (version 2015_03) | 1113 | 692 | 874 | 928 | 657 | 393 |
(b) Reactions | ||||||
BiGGa (2) (version 2beta) | 11 458 | 6055 | 3380 | 2580 | 1876 | 1730 |
BioPath (19) (2010–05–03) | 1545 | 456 | 684 | 725 | 328 | 285 |
KEGG (22) (version 75.1) | 9925 | 1335 | 3085 | 4309 | 877 | 528 |
MetaCyc (3) (version 19.1) | 13 793 | 1419 | 5040 | 4220 | 828 | 549 |
Reactome (24) (2015–07–13) | 23 592 | 4111 | 5848 | 5147 | 2849 | 2604 |
Rhea (16) (version 64) | 32 256 | 5101 | 10 603 | 10 293 | 3190 | 2050 |
The SEEDa (18) (2013–06–19) | 13 260 | 3069 | 2980 | 3337 | 1738 | 932 |
UniPathway (25) (version 2015_03) | 1994 | 1065 | 1435 | 1471 | 836 | 559 |
(c) Proteins | ||||||
UniProt (15) (version 2015_08) | 11 142 | 15 670 | 76 293 | 27 154 | 912 |
aBiGG and The SEED distribute collections of metabolites and reactions that are not necessarily retrieved in one of their GSMN.
In the construction and use of GSMNs, every reaction must be balanced with respect to elemental composition and charge; failure to balance reactions will lead to violations in mass conservation that can have detrimental effects on the downstream simulations. The case of protons is worthy of particular attention in this regard. Protons provide a means to balance chemical equations occurring in aqueous solution, but they are also responsible for creating membrane potentials whose dissipation is a major driving force in cell metabolism. In order to distinguish these two roles we have introduced separate identifiers for those protons transported across a membrane (MNXM01 in MNXref), and those protons introduced for the purposes of balancing a reaction (MNXM1 in MNXref). An artificial spontaneous reaction is then added to every compartment of the GSMN to permit the free exchange between transported and balanced protons (MNXR01 in MNXref). In this way, the original properties of the GSMN are preserved.
METANETX REPOSITORY AND TOOLS
MetaNetX.org (13) is a website that provides free access to the MNXref reconciliation data and a collection of published GSMNs and biochemical pathways mapped onto MNXref. The website also allows users to upload, manipulate, analyze or modify their own GSMNs and export them in SBML or in our own tab-delimited format. MetaNetX.org also offers a selection of tools for analyses including network structure, FBA or nested pattern methods (14).
Gene names have been widely used in published GSMNs to describe the protein complexes that act as enzymes or transporters. Gene nomenclature is, however, essentially organism specific, if not dependent on a particular genome assembly. In MetaNetX we use UniProt accession numbers (15) to identify gene products: it greatly facilitates the inter organisms comparison of GSMNs from different sources.
Although the MNXref reconciliation algorithm is essentially automated, the compilation of the MetaNetX repository requires some manual intervention and a certain number of editorial choices. This includes definition of an accepted list of species and strains that includes important model organisms. Preference is given to the most comprehensive GSMNs from external sources that use accepted standard formats and have sufficient protein coverage (full acknowledgement is given to these external sources). We are closely collaborating with Rhea (16), which is a database of manually curated biochemical reactions, as part of the ongoing effort to further improve the quality of annotation of our resource.
CONCLUSION
The www.metanetx.org resource provides a comprehensive suite of tools for the analysis of genome-scale metabolic models, based on a single integrated namespace of metabolites and metabolic reactions that integrates the most widely used biochemical databases and model repositories – MNXref. The reconciliation process used in MNXref greatly simplifies the development and analysis of genome-scale metabolic models, allowing users to concentrate on model analysis rather than the time-consuming problem of identifier mapping. Future developments will include the provision of tools and the integration of new resources such as the SwissLipids knowledgebase (17), which provides lipid structures and curated data on enzymatic reactions.
Acknowledgments
Computation and maintenance of the MetaNetX.org server are provided by the Vital-IT center for high-performance computing of the SIB Swiss Institute of Bioinformatics (http://www.vital-it.ch). We thank Ioannis Xenarios and Joerg Stelling for support and feedback.
FUNDING
Swiss Initiative for Systems Biology [SystemsX.ch projects MetaNetX, HostPathX and SyBIT] evaluated by the Swiss National Science Foundation; Swiss Federal Government through the Federal Office of Education and Science. Funding for open access charge: SIB Swiss Institute of Bioinformatics.
Conflict of interest statement. None declared.
REFERENCES
- 1.Bordbar A., Monk J.M., King Z.A., Palsson B.O. Constraint-based models predict metabolic and associated cellular functions. Nat. Rev. Genet. 2014;15:107–120. doi: 10.1038/nrg3643. [DOI] [PubMed] [Google Scholar]
- 2.Schellenberger J., Park J.O., Conrad T.M., Palsson B.O. BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics. 2010;11:213. doi: 10.1186/1471-2105-11-213. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Caspi R., Altman T., Billington R., Dreher K., Foerster H., Fulcher C.A., Holland T.A., Keseler I.M., Kothari A., Kubo A., et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucl. Acids Res. 2014;42:D459–D471. doi: 10.1093/nar/gkt1103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Büchel F., Rodriguez N., Swainston N., Wrzodek C., Czauderna T., Keller R., Mittag F., Schubert M., Glont M., Golebiewski M., et al. Path2Models: large-scale generation of computational models from biochemical pathway maps. BMC Syst. Biol. 2013;7:116. doi: 10.1186/1752-0509-7-116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Henry C.S., DeJongh M., Best A.A., Frybarger P.M., Linsay B., Stevens R.L. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat. Biotechnol. 2010;28:977–982. doi: 10.1038/nbt.1672. [DOI] [PubMed] [Google Scholar]
- 6.Kim H., Shin J., Kim E., Kim H., Hwang S., Shim J.E., Lee I. YeastNet v3: a public database of data-specific and integrated functional gene networks for Saccharomyces cerevisiae. Nucleic Acids Res. 2014;42:D731–D736. doi: 10.1093/nar/gkt981. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Aziz R.K., Bartels D., Best A.A., DeJongh M., Disz T., Edwards R.A., Formsma K., Gerdes S., Glass E.M., Kubal M., et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9:75. doi: 10.1186/1471-2164-9-75. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Karp P.D., Paley S.M., Krummenacker M., Latendresse M., Dale J.M., Lee T.J., Kaipa P., Gilham F., Spaulding A., Popescu L., et al. Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology. Brief Bioinform. 2010;11:40–79. doi: 10.1093/bib/bbp043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Lang M., Stelzer M., Schomburg D. BKM-react, an integrated biochemical reaction database. BMC Biochem. 2011;12:42. doi: 10.1186/1471-2091-12-42. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Kumar A., Suthers P.F., Maranas C.D. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases. BMC Bioinformatics. 2012;13:6. doi: 10.1186/1471-2105-13-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Bernard T., Bridge A., Morgat A., Moretti S., Xenarios I., Pagni M. Reconciliation of metabolites and biochemical reactions for metabolic networks. Brief Bioinform. 2014;15:123–135. doi: 10.1093/bib/bbs058. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Gao J., Ellis L.B.M., Wackett L.P. The University of Minnesota Biocatalysis/Biodegradation Database: improving public access. Nucleic Acids Res. 2010;38:D488–D491. doi: 10.1093/nar/gkp771. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Ganter M., Bernard T., Moretti S., Stelling J., Pagni M. MetaNetX.org: a website and repository for accessing, analysing and manipulating metabolic networks. Bioinformatics. 2013;29:815–816. doi: 10.1093/bioinformatics/btt036. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Ganter M., Kaltenbach H.-M., Stelling J. Predicting network functions with nested patterns. Nat. Commun. 2014;5:3006. doi: 10.1038/ncomms4006. [DOI] [PubMed] [Google Scholar]
- 15.UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43:D204–D212. doi: 10.1093/nar/gku989. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Morgat A., Axelsen K.B., Lombardot T., Alcántara R., Aimo L., Zerara M., Niknejad A., Belda E., Hyka-Nouspikel N., Coudert E. Updates in Rhea - a manually curated resource of biochemical reactions. Nucleic Acids Res. 2015;43:D459–D464. doi: 10.1093/nar/gku961. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Aimo L., Liechti R., Hyka-Nouspikel N., Niknejad A., Gleizes A., Götz L., Kuznetsov D., David F.P.A., van der Goot F.G., Riezman H., et al. The SwissLipids knowledgebase for lipid biology. Bioinformatics. 2015;31:2860–2866. doi: 10.1093/bioinformatics/btv285. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Overbeek R., Olson R., Pusch G.D., Olsen G.J., Davis J.J., Disz T., Edwards R.A., Gerdes S., Parrello B., Shukla M., et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST) Nucleic Acids Res. 2014;42:D206–D214. doi: 10.1093/nar/gkt1226. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Forster M., Pick A., Raitner M., Schreiber F., Brandenburg F.J. The system architecture of the BioPath system. In Silico Biol. (Gedrukt) 2002;2:415–426. [PubMed] [Google Scholar]
- 20.Hastings J., de Matos P., Dekker A., Ennis M., Harsha B., Kale N., Muthukrishnan V., Owen G., Turner S., Williams M., et al. The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res. 2013;41:D456–D463. doi: 10.1093/nar/gks1146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Wishart D.S., Jewison T., Guo A.C., Wilson M., Knox C., Liu Y., Djoumbou Y., Mandal R., Aziat F., Dong E., et al. HMDB 3.0 - The Human Metabolome Database in 2013. Nucleic Acids Res. 2013;41:D801–D807. doi: 10.1093/nar/gks1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Kanehisa M., Goto S., Sato Y., Kawashima M., Furumichi M., Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014;42:D199–D205. doi: 10.1093/nar/gkt1076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Sud M., Fahy E., Cotter D., Dennis E.A., Subramaniam S. LIPID MAPS-nature lipidomics gateway: an online resource for students and educators interested in lipids. J. Chem. Educ. 2012;89:291–292. doi: 10.1021/ed200088u. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Croft D., Mundo A.F., Haw R., Milacic M., Weiser J., Wu G., Caudy M., Garapati P., Gillespie M., Kamdar M.R., et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 2014;42:D472–D477. doi: 10.1093/nar/gkt1102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Morgat A., Coissac E., Coudert E., Axelsen K.B., Keller G., Bairoch A., Bridge A., Bougueleret L., Xenarios I., Viari A. UniPathway: a resource for the exploration and annotation of metabolic pathways. Nucleic Acids Res. 2012;40:D761–D769. doi: 10.1093/nar/gkr1023. [DOI] [PMC free article] [PubMed] [Google Scholar]