Abstract
MetaBridge is a web-based tool designed to facilitate the integration of metabolomics with other “omics” data types such as transcriptomics and proteomics. It uses data from MetaCyc and KEGG (Kyoto Encyclopedia of Genes and Genomes) to map metabolite compounds to directly interacting upstream or downstream enzymes in enzymatic reactions and metabolic pathways. The resulting list of enzymes can then be integrated with transcriptomics or proteomics data via protein-protein interaction networks to perform integrative multi-omics analyses. MetaBridge was developed to be intuitive and easy to use, requiring little to no prior computational experience. The protocols described here detail all steps involved in the use of MetaBridge, from preparing input data and performing metabolite mapping to using the results to build a protein-protein interaction network.
Keywords: metabolomics, multi-omics integration, metabolite mapping
Introduction
Recent advances in sequencing technology and other high-throughput platforms have led to the increasing prevalence of “omics” methods, including transcriptomics, proteomics, and metabolomics. The field of systems biology has expanded rapidly as researchers work to analyze and interpret the large amounts of data that are routinely generated. While these “omics” methods can provide significant insights when examined independently, recent work (Connor et al., 2010; Hirai et al., 2004; Lee et al., 2019; Li et al., 2017) has highlighted the value of integrating “multi-omics” data, both to discover recurrent biological pathways uncovered by multiple platforms and to generate new insights into the underlying biological phenomena by identifying novel pathways.
There are a few tools currently available that are designed to facilitate this integrative approach, such as DIABLO and OmicsNet (Singh et al., 2018; Zhou & Xia, 2018; Zhou & Xia, 2018). These tools, together with databases such as MetaCyc and KEGG (Caspi et al., 2017; Kanehisa et al., 2000; Kanehisa et al., 2018; Kanehisa et al., 2019), are designed to facilitate integration of different “omics” data types. Metabolomics in particular can be difficult to combine with results from other methods, since it produces lists of metabolic compounds, while methods such as transcriptomics and proteomics provide genes or their protein products. Even though databases such as MetaCyc and KEGG (Caspi et al., 2017; Kanehisa et al., 2000; Kanehisa et al., 2018; Kanehisa et al., 2019) link metabolites with interacting genes and proteins, there is no simple way to obtain a list of genes/proteins corresponding to identified metabolites. MetaBridge (Hinshaw et al., 2018) was designed specifically with this goal in mind, allowing the user to easily obtain a list of genes (which represent metabolite “interactors”) that can then be directly integrated with other “omics” data. Ultimately, integrating different omics data types can reveal novel biological insights that would not be detected by any individual method (Connor et al., 2010; Hirai et al., 2004; Lee et al., 2019).
Metabolomics as a method is designed to identify a wide range of compounds present in a particular sample, providing unique insights into the metabolic state of a sample or an organism. Many studies have used metabolomics to study a wide range of conditions, treatments, and diseases, demonstrating the broad applicability of this method. Metabolomics has even been used to generate a metabolite signature that can distinguish septic patients (Langley et al., 2014; Mickiewicz et al., 2013; Stretch et al., 2018).
The two databases used by MetaBridge are MetaCyc (Caspi et al., 2017) and KEGG (Kanehisa et al., 2000; Kanehisa et al., 2018; Kanehisa et al., 2019). Both of these resources are available online and contain a wide array of information on metabolite compounds, the pathways in which these compounds are involved, and the enzymes from these pathways that interact with a given compound. While data from MetaCyc is freely available, full access to KEGG data beyond a web browser requires a license, purchased from the KEGG organization. Academic users may pay for a yearly license by completing a form which includes the individual’s name, institution, and contact information. By leveraging the data available from both of these resources, MetaBridge is able to provide the user with directly interacting upstream and downstream enzymes of any particular identified metabolites.
This article provides a step-by-step guide for using MetaBridge to map metabolites to their directly interacting synthetic and degradative metabolic enzymes. MetaBridge was designed to be simple and intuitive to use, requiring no programming knowledge. MetaBridge is open source, developed in the Shiny framework created by RStudio (Chang et al., 2017), and can be accessed through any modern web browser. The two main protocols for this article are “Basic Protocol 1: Mapping Metabolite Data using MetaCyc Identifiers” and “Basic Protocol 2: Mapping Metabolite Data using KEGG Identifiers”. These protocols largely follow the same steps; the main differences are which database is used to perform the mapping, and the inclusion of pathway visualization via Pathview (Luo & Brouwer, 2013) when mapping with KEGG data (Basic Protocol 2).
Basic Protocol 1: Mapping Metabolite Data using MetaCyc Identifiers
This protocol describes the process of uploading metabolite data to MetaBridge and retrieving gene annotations via data obtained from MetaCyc (Caspi et al., 2017). This protocol will utilize example data supplied by MetaBridge; no additional files are needed. Mapping one’s metabolites using MetaCyc will leverage the high-quality, curated data that it provides.
Necessary Requirements
An Internet connection.
A modern and up-to-date web browser, such as Google Chrome, Mozilla Firefox, Apple Safari, or Microsoft Edge.
A table of metabolites in comma- or tab-separated format (CSV or TSV/TXT), containing either HMDB, KEGG, PubChem or CAS IDs (Figure 1). If the user only has compound names, please read Support Protocol 1 and use MetaboAnalyst (Chong et al., 2018) to convert compound names to one of the accepted ID types.
Figure 1:

Example data for input to MetaBridge; in this case, a comma-separated values (CSV) formatted file, opened in A. a plain text editor and B. a spreadsheet program.
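For users who assemble their input table programmatically rather than in a spreadsheet, the following is a minimal Python sketch of writing a comma-separated file with a header row and one ID type per column. The column names and the two rows are illustrative assumptions and are not taken from the MetaBridge example file.

```python
import csv
import io

# Hypothetical rows for illustration only; the column names are assumptions
# and do not necessarily match the MetaBridge example data.
rows = [
    {"Compound": "L-Phenylalanine", "HMDB": "HMDB0000159", "KEGG": "C00079"},
    {"Compound": "D-Glucose", "HMDB": "HMDB0000122", "KEGG": "C00031"},
]


def to_csv_text(records):
    """Serialize a list of dicts as comma-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv_text(rows)
print(csv_text)
```

Saving `csv_text` to a plain text file with a `.csv` extension should give a table of the kind shown in Figure 1, with one column per ID type.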
Protocol Steps
Navigate to MetaBridge (https://metabridge.org) and click “Get Started” to begin (Figure 2).
- To upload a user file, click the “Browse…” button, which will open a new dialog window and allow the user to select a file from their computer.
- If this file contains a header (i.e. column names), leave the “Header” box checked. If it does not, uncheck the “Header” box before proceeding.
- Select the appropriate separator from the list: comma or CSV; tab or TSV/TXT; or semicolon (sometimes denoted as CSV or CSV2).
- For this protocol, the provided example data will be used; click the “Try Examples” link, which will automatically load a preview of the data (Figure 3). A direct link to the example data is also provided in the Critical Parameters section.
- By default, the first column of this data will be highlighted, indicating that this column is to be used in the mapping. For this protocol, the “HMDB” column will be used; click anywhere in the column to highlight and select it (Figure 3).
- From the dropdown menu, ensure that the proper ID type, HMDB, is selected.
- If the selected column and chosen ID type do not match, the mapping will not be successful. If an error is received, check that these items match.
Click the “Proceed” button to begin mapping. The page will automatically switch to the “Map” tab (visible along the top of the page in Figures 2 and 3) and present the user with a choice of using either MetaCyc or KEGG annotations. For this protocol we use the default option, MetaCyc (KEGG is covered next in Basic Protocol 2).
- Click the “Map” button to map uploaded metabolite compounds and see the results.
- The top table displays a summary of the mapping results (Figure 4). For each uploaded metabolite, the HMDB ID, KEGG ID, and compound names are displayed. The subsequent columns indicate the number of mappings for each category; in this case Reactions, MetaCyc Genes, Gene Names, and Ensembl Genes.
Clicking on any row will bring up the full mapping results for that particular metabolite in a second table below the first. In the example below, PHE (L-Phenylalanine) was selected, and the lower table displays the full set of mapped reactions and genes (Figure 4).
Clicking on any of the names or IDs underlined and in green (e.g. PHE) will open a new tab linking to that metabolite’s HMDB, MetaCyc or Ensembl page (depending on the name or ID clicked).
Clicking the “Download” button allows the user to save the full mapping results as a CSV or TSV file for further inspection, or input into other programs such as NetworkAnalyst for visualization and multi-omic integration (Xia et al., 2015).
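Once the results have been downloaded, the gene names usually need to be collapsed to a unique list before being pasted into another tool. The sketch below shows one way to do this in Python. The column name "Gene Name", the other column names, and the placeholder rows are assumptions made for illustration; the actual downloaded file should be inspected for its real column names.

```python
import csv
import io

# Stand-in for a downloaded results file. The column names and the rows are
# hypothetical placeholders, not real MetaBridge output.
downloaded = """Metabolite,Reaction,Gene Name
M1,R1,GENE1
M1,R2,GENE2
M2,R3,GENE1
M3,R4,GENE3
M3,R5,
"""


def unique_gene_names(text, column="Gene Name"):
    """Return non-empty values of `column` in first-seen order, without duplicates."""
    reader = csv.DictReader(io.StringIO(text))
    seen = []
    for row in reader:
        name = (row.get(column) or "").strip()
        if name and name not in seen:
            seen.append(name)
    return seen


genes = unique_gene_names(downloaded)
print("\n".join(genes))
```

The printed list, one name per line, is in a form that can be pasted into a gene list input field such as the one used in Support Protocol 2.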
Figure 2:

Landing page for MetaBridge.
Figure 3:

The uploaded example data, with HMDB column selected.
Figure 4:

Summary and per-metabolite results for the example data.
Basic Protocol 2: Mapping Metabolite Data using KEGG Identifiers
This protocol covers how to map metabolites using data from the KEGG database. Typically, this will generate more results than mapping with MetaCyc, because MetaCyc is manually curated and thus more conservative, with higher-confidence annotations. One substantial advantage of KEGG is that the user will be able to visualize pathways associated with a given metabolite after mapping has completed. For more information regarding the differences between KEGG and MetaCyc annotations, please see the “Guidelines for Understanding Results” section. The initial steps for mapping to KEGG are the same as when mapping to MetaCyc. Please refer to Basic Protocol 1: Mapping Metabolite Data using MetaCyc Identifiers, Step 1 to Step 5.
Protocol Steps
After selecting the example data and the HMDB column and clicking the “Proceed” button, choose “KEGG” as the mapping database, then click the “Map” button.
- Again, a summary of the mapping results will appear. Selecting a single row (again choosing Phe) brings up the detailed mapping results for that metabolite.
- Somewhat different information is displayed here, namely enzyme names and IDs, and the KEGG IDs (Figure 5). Note: There are more entries for PHE using KEGG (8 enzymes, 12 gene names) than were retrieved for MetaCyc (2 and 3, respectively). From here, click the “Download” button to save the full mapping results.
- On the left-hand side of the screen there is the option to “Visualize Results”. Clicking the “Visualize” button will open a new page, as shown in Figure 6.
- MetaBridge will load the first applicable pathway and display the image generated by Pathview (Luo & Brouwer, 2013) on the screen. The selected compound (Phe) is highlighted in yellow, while mapped enzymes are highlighted in red.
Note: The enzymes highlighted in the image (in red) correspond to those from all compounds mapped, not only the one selected (Phe in this case).
To save the Pathway image, right click anywhere on the image and select “Save Image As…”. This will open a new dialog window and allow the user to choose a name and location for the image on their computer.
Figure 5:

Mapping results from using the KEGG database.
Figure 6:

Visualization results for PHE, KEGG pathway “00360: Phenylalanine metabolism”.
Support Protocol 1: Converting Compound Names to HMDB IDs
This protocol is provided to guide users through the steps needed to map their metabolite compound names to standardized IDs, which are required as input for MetaBridge. For this, we will be using MetaboAnalyst (Chong et al., 2018).
Necessary Requirements
Internet connection.
A modern and up-to-date web browser, such as Google Chrome, Mozilla Firefox, Apple Safari, or Microsoft Edge.
A table or list of metabolites in comma- or tab-separated format (CSV or TSV/TXT), containing compound names to be converted to standard IDs.
Protocol Steps
Go to MetaboAnalyst (https://metaboanalyst.ca), and follow the “>> click here to start <<” link to be taken to the main navigation page. Then, click the “Other Utilities” button (bottom of the circle in Figure 7) to proceed. When prompted, select “Compound ID Conversion” and hit “OK”.
In the input field, paste your list of compound names, one per line. Alternatively, you can upload a file from your computer. If you are submitting names such as “glucose”, leave the specified input type as “Common Name”. Once you have entered all your compounds, click the “Submit” button to continue (Figure 8).
You are now presented with a results screen showing the mapping of your compounds, including the HMDB ID (among others). Click the link just below your results table to download a CSV file of your results (Figure 9). This file can then be provided as input to MetaBridge (Basic Protocol 1 or 2).
Figure 7:

Analysis options and tools presented by MetaboAnalyst. For Support Protocol 1, we will use the “Other Utilities” option located at the bottom of the circle.
Figure 8:

Here we have populated the input field with a few metabolites as an example.
Figure 9:

Results of metabolite mapping with MetaboAnalyst.
Support Protocol 2: Submitting Mapped Genes Produced by MetaBridge for Protein-Protein Interaction (PPI) Network Construction
This protocol is included to demonstrate one of the potential applications of the results produced by MetaBridge: generating a PPI network with NetworkAnalyst (Xia et al., 2015; Zhou et al., 2019).
Necessary Requirements
Internet connection.
A modern and up-to-date web browser, such as Google Chrome, Mozilla Firefox, Apple Safari, or Microsoft Edge.
A table or list of gene names or ID in one of the following types: Ensembl Gene ID,
Protocol Steps
1. For this protocol, the example data provided by MetaBridge is used (available by clicking the “Try Examples” link on the Upload page; see step 3 of Basic Protocol 1). Mapping this data from HMDB ID to Gene Name is performed using the MetaCyc database (i.e. following Basic Protocol 1).
2. Navigate to NetworkAnalyst (https://networkanalyst.ca) and select “Gene List Input” to proceed. Then, select “H. sapiens (human)” from the dropdown menu for “Organism”, and “Official Gene Symbol” for “ID type”. Paste in the “Gene Names” from the MetaBridge results, and click the “Upload” button. A message will appear to confirm that the IDs have been uploaded successfully. Click the “Proceed” button to continue (Figure 10).
Note: If this data is used, NetworkAnalyst will warn you that duplicate IDs were removed.
4. The next page will present the user with many options. A protein-protein interaction network will be constructed, here labeled “Generic PPI” (right-most column, top entry). The user will be prompted to select a database to use; choose the default “IMEx Interactome” option and press “OK” to continue.
5. The next page shows a table containing the information for each generated network, including number of nodes (proteins) and edges (interactions). By default, the networks initially presented are first-order networks. For the purpose of this tutorial, a zero-order network (only directly interacting proteins) will be made by clicking the button on the right (Figure 11). Once the page updates, click the “Proceed” button to visualize the network.
Note: A first-order network (the default, which includes the uploaded proteins plus their direct interactors) can be restored by clicking the “Reset Network” button.
6. Now the network generated from our mapped metabolites can be seen (Figure 12). To download an image of this network, click the “-- Specify --” button to the right of “Download”, and select the desired format (e.g. PNG). Right-click anywhere on the pop-up image to save to your computer.
Note: There are myriad other options for visualizing and exploring networks in NetworkAnalyst, which are beyond the scope of this protocol but are described in Xia et al. (2015) and Zhou et al. (2019).
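The distinction between zero-order and first-order networks used in the steps above can be illustrated with a small Python sketch. It follows the descriptions given in this protocol (zero-order: interactions only among the submitted proteins; first-order: submitted proteins plus their direct interactors) and is not NetworkAnalyst's implementation. The interaction list and protein names are toy placeholders.

```python
# Toy interaction list; the protein names are placeholders, not real data.
interactions = [
    ("A", "B"),  # both submitted
    ("A", "X"),  # submitted protein and a direct interactor
    ("B", "Y"),  # submitted protein and a direct interactor
    ("X", "Y"),  # neither protein was submitted
    ("C", "Z"),  # submitted protein and a direct interactor
]
seeds = {"A", "B", "C"}  # proteins submitted by the user


def zero_order(edges, seed_set):
    """Keep only interactions where both partners were submitted."""
    return [(u, v) for u, v in edges if u in seed_set and v in seed_set]


def first_order(edges, seed_set):
    """Keep interactions where at least one partner was submitted."""
    return [(u, v) for u, v in edges if u in seed_set or v in seed_set]


print(zero_order(interactions, seeds))
print(first_order(interactions, seeds))
```

In this toy case the zero-order network contains a single edge, while the first-order network also pulls in the interactors X, Y, and Z, which is why first-order networks are typically much larger.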
Figure 10:

Gene list input page for NetworkAnalyst. Here we have entered the IDs output from the example data provided by MetaBridge.
Figure 11:

Overview of mapping results for our metabolites in a zero-order network.
Figure 12:

Zero-order network generated from the mapped metabolites in the MetaBridge example data.
Guidelines for Understanding Results
The output from MetaBridge is a list of genes mapped from input metabolites. Users can expect to retrieve multiple genes for each submitted metabolite, as most metabolites will be involved in multiple pathways/reactions. This output can then be used as input to other analyses such as construction of PPI networks as described in Support Protocol 2. The number of genes retrieved will vary depending on the input metabolites. Similarly, the number of pathways that can be visualized with Pathview (if metabolites are mapped using KEGG) will depend on the particular metabolites submitted.
The primary difference between mapping with MetaCyc or KEGG stems from the underlying annotations. MetaCyc data undergoes more thorough curation, and as a result, will likely yield fewer but higher confidence results than KEGG, when considering a particular metabolite. The benefit of using KEGG comes mainly when submitting a short list of metabolites, or one which contains compounds which are not well characterized, as it will likely produce more results.
Commentary
Background Information
MetaBridge was developed to facilitate the integration of metabolomics results with other “omics” data types, so that metabolite data can be analyzed together with results produced by other methods such as transcriptomics or proteomics. MetaBridge was designed to be simple and intuitive to use, requiring minimal or no computational proficiency. As shown in work such as Lee et al. (2019), the integration of metabolomics with other data types can produce novel results that would likely be missed by any single method. One may also refer to the original MetaBridge publication (Hinshaw et al., 2018) for more information.
Critical Parameters & Troubleshooting
It is important that the submitted metabolites come in the form of standard identifiers (e.g. HMDB, KEGG, or CAS IDs), as this is necessary for the mapping of metabolites to genes via MetaCyc or KEGG data. If the user only has compound names, please refer to Support Protocol 1, which describes how to convert names to IDs using MetaboAnalyst.
Most errors encountered in MetaBridge are due to improper formatting of input data; as such, please ensure all data is saved as a plain text file (CSV or TSV). If the user runs into an error, refreshing the page and re-uploading the data will likely solve the issue.
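Before uploading, it can help to confirm that a file really is a consistently delimited plain text table. The following Python sketch guesses the separator by checking which candidate character appears the same non-zero number of times on every line. It is a rough pre-check written for this article, not part of MetaBridge, and it does not handle quoted fields (for example, compound names that themselves contain commas).

```python
def guess_separator(text, candidates=(",", "\t", ";")):
    """Return the candidate separator that splits every line into the same
    number of fields, or None if no candidate does so."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return None
    best = None  # (separator, occurrences per line)
    for sep in candidates:
        counts = {line.count(sep) for line in lines}
        # A usable separator occurs the same non-zero number of times per line.
        if len(counts) == 1 and counts != {0}:
            per_line = counts.pop()
            if best is None or per_line > best[1]:
                best = (sep, per_line)
    return best[0] if best else None


sample = "HMDB\tKEGG\nHMDB0000159\tC00079\nHMDB0000122\tC00031\n"
print(repr(guess_separator(sample)))
```

A result of `None` suggests ragged rows or an unexpected separator, which is worth fixing before selecting the separator on the upload page.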
It is important when uploading data to ensure that the selected ID type from the dropdown is consistent with the column chosen to use for the mapping of data. If these do not match, the user will receive an error message and will not see any results. It is also worth noting that using HMDB or KEGG IDs will yield the best results, so the user is advised to employ one of these options whenever possible.
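A quick way to catch a mismatch between the chosen column and ID type before uploading is to check the values against the usual shape of each identifier. The Python sketch below uses regular expressions based on the commonly seen formats (HMDB followed by five or seven digits, KEGG compound IDs as "C" plus five digits, CAS registry numbers as digit groups separated by hyphens, and PubChem IDs as plain integers). These patterns are assumptions for illustration and are not the validation that MetaBridge itself performs.

```python
import re

# Approximate identifier shapes; illustrative assumptions only.
ID_PATTERNS = {
    "HMDB": re.compile(r"^HMDB(\d{5}|\d{7})$"),
    "KEGG": re.compile(r"^C\d{5}$"),
    "CAS": re.compile(r"^\d{2,7}-\d{2}-\d$"),
    "PubChem": re.compile(r"^\d+$"),
}


def mismatched_ids(values, id_type):
    """Return the values that do not look like the given ID type."""
    pattern = ID_PATTERNS[id_type]
    return [v for v in values if not pattern.match(v.strip())]


column = ["HMDB0000159", "HMDB0000122", "C00079"]
print(mismatched_ids(column, "HMDB"))
```

An empty list means every value has the expected shape; any values returned point to rows, or a whole column, that do not match the ID type selected in the dropdown.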
Additionally, MetaBridge is ultimately dependent on KEGG and MetaCyc for its annotations; in some cases, there may be limited information available for the submitted compounds. The application is designed to provide feedback to the user regarding commonly encountered errors, such as incorrectly parsed input data or a lack of results for submitted metabolites. It will describe what has gone wrong and at which step the error occurred, so that the user can narrow in on what needs to be fixed. If other errors are encountered, the user can submit an issue on the GitHub page (https://github.com/hancockinformatics/MetaBridgeShiny) for feedback.
The example data used by MetaBridge (when clicking the “Try Examples” link) can be found in CSV form on the GitHub page: https://github.com/hancockinformatics/MetaBridgeShiny/blob/master/example_data/sam_example_data.csv
Suggestions for Further Analysis
As the primary use of MetaBridge is to allow for the integration of metabolomics with other “omics” data types, we have included a brief protocol for PPI network construction based on the output of MetaBridge via NetworkAnalyst (Xia et al., 2015; Zhou et al., 2019) in Support Protocol 2. From a list of differentially expressed metabolites mapped to genes, there are myriad other analyses which can be performed including: gene set analysis, pathway enrichment, and more in-depth network analysis and integration.
Internet Resources with Annotations
https://metaboanalyst.ca
Resource for conversion of metabolite names to standard IDs prior to submission to MetaBridge.
https://www.networkanalyst.ca/
Web tool for construction, visualization, and analysis of PPI networks from lists of genes.
Database of metabolic pathways and reactions
Resource for metabolic pathways and reactions
https://www.bioinformatics.jp/en/keggftp.html
Information on acquiring a KEGG license
Acknowledgement of Funding
The creation of MetaBridge was supported by a Canadian Institutes of Health Research grant (FDN-154287), by funding from the Human Immunology Project Consortium of the National Institutes of Health (U19AI118608), and by funding from Canada Research Chairs and a UBC Killam Professorship to REWH.
Literature Cited
- 1. Caspi R, Billington R, Fulcher CA, Keseler IM, Kothari A, Krummenacker M, … & Paley S (2017). The MetaCyc database of metabolic pathways and enzymes. Nucleic Acids Research, 46(D1), D633–D639.
- 2. Chang W, Cheng J, Allaire JJ, Xie Y, & McPherson J (2017). Shiny: web application framework for R. R package version, 1(5).
- 3. Chong J, Soufan O, Li C, Caraus I, Li S, Bourque G, … & Xia J (2018). MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Research, 46(W1), W486–W494.
- 4. Connor SC, Hansen MK, Corner A, Smith RF, & Ryan TE (2010). Integration of metabolomics and transcriptomics data to aid biomarker discovery in type 2 diabetes. Molecular BioSystems, 6(5), 909–921.
- 5. Hinshaw SJ, Lee AHY, Gill EE, & Hancock REW (2018). MetaBridge: enabling network-based integrative analysis via direct protein interactors of metabolites. Bioinformatics, 34(18), 3225–3227.
- 6. Hirai MY, Yano M, Goodenowe DB, Kanaya S, Kimura T, Awazuhara M, … & Saito K (2004). Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proceedings of the National Academy of Sciences, 101(27), 10205–10210.
- 7. Kanehisa M (2019). Toward understanding the origin and evolution of cellular organisms. Protein Science.
- 8. Kanehisa M, & Goto S (2000). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research, 28(1), 27–30.
- 9. Kanehisa M, Sato Y, Furumichi M, Morishima K, & Tanabe M (2018). New approach for understanding genome variations in KEGG. Nucleic Acids Research, 47(D1), D590–D595.
- 10. Langley RJ, Tipper JL, Bruse S, Baron RM, Tsalik EL, Huntley J, … & Keaton M (2014). Integrative “omic” analysis of experimental bacteremia identifies a metabolic signature that distinguishes human sepsis from systemic inflammatory response syndromes. American Journal of Respiratory and Critical Care Medicine, 190(4), 445–455.
- 11. Lee AH, Shannon CP, Amenyogbe N, Bennike TB, Diray-Arce J, Idoko OT, … & Lê Cao KA (2019). Dynamic molecular changes during the first week of human life follow a robust developmental trajectory. Nature Communications, 10(1), 1092.
- 12. Li S, Sullivan NL, Rouphael N, Yu T, Banton S, Maddur MS, … & Liu K (2017). Metabolic phenotypes of response to vaccination in humans. Cell, 169(5), 862–877.
- 13. Luo W, & Brouwer C (2013). Pathview: An R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics, 29(14), 1830–1831.
- 14. Mickiewicz B, Vogel HJ, Wong HR, & Winston BW (2013). Metabolomics as a novel approach for early diagnosis of pediatric septic shock and its mortality. American Journal of Respiratory and Critical Care Medicine, 187(9), 967–976.
- 15. Singh A, Shannon CP, Gautier B, Rohart F, Vacher M, Tebbutt SJ, & Lê Cao KA (2018). DIABLO: from multi-omics assays to biomarker discovery, an integrative approach. bioRxiv, 067611.
- 16. Stretch C, Aubin JM, Mickiewicz B, Leugner D, Al-manasra T, Tobola E, … & Vogel HJ (2018). Sarcopenia and myosteatosis are accompanied by distinct biological profiles in patients with pancreatic and periampullary adenocarcinomas. PLoS ONE, 13(5), e0196235.
- 17. Xia J, Gill E, & Hancock REW (2015). NetworkAnalyst for Statistical, Visual and Network-based Approaches for Meta-analysis of Expression Data. Nature Protocols, 10, 823–844.
- 18. Zhou G, Soufan O, Ewald J, Hancock REW, Basu N, & Xia J (2019). NetworkAnalyst 3.0: a visual analytics platform for comprehensive gene expression profiling and meta-analysis. Nucleic Acids Research. doi: 10.1093/nar/gkz240
- 19. Zhou G, & Xia J (2018). Using OmicsNet for Network Integration and 3D Visualization. Current Protocols in Bioinformatics. doi: 10.1002/cpbi.69
- 20. Zhou G, & Xia J (2018). OmicsNet - a web-based tool for creation and visual analysis of biological networks in 3D space. Nucleic Acids Research. doi: 10.1093/nar/gky510
