Abstract
Introduction:
Mass spectrometry-based proteomics reveals dynamic molecular signatures underlying phenotypes reflecting normal and perturbed conditions in living systems. Although valuable on its own, the proteome is only one level of molecular information, with the genome, epigenome, transcriptome, and metabolome all providing complementary information. Multi-omic analysis integrating information from one or more of these other domains with proteomic information provides a more complete picture of molecular contributors to dynamic biological systems.
Areas Covered:
Here, we discuss the improvements to mass spectrometry-based technologies, focused on peptide-based, bottom-up approaches, that have enabled deep, quantitative characterization of complex proteomes. These advances are facilitating integration of proteomics data with other ‘omic information, providing a more complete picture of living systems. We also describe the current state of bioinformatics software and approaches for integrating proteomic and other ‘omics data, critical for enabling new discoveries driven by multi-omics.
Expert Commentary:
Multi-omics, centered on the integration of proteomics information with other ‘omic information, has tremendous promise for biological and biomedical studies. Continued advances in approaches for generating deep, reliable proteomic data and bioinformatics tools aimed at integrating data across ‘omic domains will ensure the discoveries offered by these multi-omic studies continue to increase.
Keywords: Mass spectrometry, bottom-up proteomics, bioinformatics, multi-omics, proteogenomics
Plain Language Summary
Proteomics uses mass spectrometry to identify as many of the proteins in a system of interest as possible, making it extremely useful in biomedical research and basic biological research. Unlike next-generation DNA/genome sequencing, proteomics directly measures the changes in gene translation in response to a disease state, injury, etc. However, when proteomics data is coupled to and examined together with other forms of “omics” data, such as transcriptomics, genomics, and metabolomics, a full biological picture emerges that can demonstrate the underlying regulatory networks of living systems and how they respond to positive and negative stimuli. This integration is called multi-omics and represents a powerful paradigm shift in systems biology. To be fully compatible with other ‘omics datasets, proteomics must be as complete and accurate as possible; in addition, the task of integrating multiple different kinds of datasets can be daunting to novice researchers. With this in mind, we review in this manuscript the technologies that allow for the generation of the best possible proteomics data for multi-omics analysis, as well as the software tools needed to integrate proteomics data with other ‘omics data. We believe this review will enable other researchers to begin applying multi-omics approaches to answer their research questions.
1. Introduction: the utility of proteomics alone and as a centerpiece of multi-omics analyses
The twenty-first century has seen the rise of advanced nucleic acid sequencing technologies driving the new discipline of systems biology, in which large-scale molecular data from a model system are examined in an integrated fashion to understand the molecular networks underlying normal biological processes and dynamic responses to stimuli(1). Beginning with the development of DNA sequencing in the 1970s(2), improvements to sequencing technology enabled the sequencing of whole genomes of model organisms and eventually led to the development of RNA sequencing (RNA-Seq) technology. These “next-generation” sequencing (NGS) approaches provide rapid genome sequencing and qualitative and quantitative information on transcribed messenger RNA(3). This transcriptome sequencing information offers a picture of the genes expressed under a given set of conditions, providing insights into gene regulation mechanisms and potential biochemical functional response within the system.
While, on their own, these powerful sequencing techniques have shown considerable utility in fields ranging from cancer research(4) to microbiology(5), these technologies lack a direct measurement of functional molecules responsible for the biochemistry driving phenotypic changes that occur in a cell, tissue, or organism. This is due in part to higher-level epigenetic regulatory mechanisms such as DNA methylation(6), histone acetylation(7), and siRNA and miRNA suppression of mRNA translation(8). To complete the molecular picture, it is essential to examine the expression of proteins present in a system (i.e., the proteome). This can be done using liquid chromatography (LC) coupled to mass spectrometry (MS). In so-called “bottom-up” MS-based proteomics, proteins are isolated from a system and enzymatically digested into their constituent peptides. Complex peptide mixtures are analyzed by LC-MS, collecting tandem mass spectra (MS/MS) of fragmentation signatures of each detected peptide. Each MS/MS spectrum is matched to sequences contained within a database of known or predicted proteins expressed by the organism(s) being studied, using customized bioinformatic software to determine the proteins present. Bottom-up proteomics can also be subdivided into the realms of a) global or untargeted proteomics, in which the goal is the detection and potential quantitation of as many proteins in a sample as is possible, or b) targeted proteomics, in which specific peptides or proteins are selected based on their m/z values for highly accurate quantitation using selected reaction monitoring (SRM) or multiple reaction monitoring (MRM) methods. Having advanced considerably with the introduction of high-resolution and high scan-rate instrumentation(9), the assorted varieties of MS-based proteomics are now mature fields with many research applications in the biomedical(10), biotechnological(11), and ecological research spaces(12).
In contrast to the bottom-up approach, “top-down” proteomics performs LC-MS/MS analyses on whole, undigested proteins using high-resolution mass spectrometers. Top-down mass spectrometry is performed with the goal of detecting the chemical modifications and post-translational modifications (PTMs) that distinguish individual proteins, even chemically distinct proteoforms(14) encoded by the same gene, from one another in vivo(13). Top-down has proven valuable in many contexts, although difficulties in characterizing higher-mass proteins and proteins that are not easily solubilized have limited its ability to comprehensively characterize complex proteomes.
Despite its advantages, bottom-up MS-based proteomics is not without its own limitations. Historically, proteomics methodologies have only been able to identify a portion of the proteome within complex biological systems, largely due to the variable abundances of different proteins in a cell(15) as well as chemical heterogeneity resulting in a complex array of proteoforms(14) expressed by any given coding gene. In addition, untargeted bottom-up proteomics is reliant on the use of genomic data of the organism under study to refine and predict proteins expressed. Although convenient, using a reference library of predicted canonical proteins misses detection of potentially biologically relevant proteoforms expressed from sample-specific coding sequence variants and processing events not found within the reference proteome(16). Untargeted and targeted bottom-up proteomics experiments are also dependent on the detection of peptides specific for proteins of interest to confidently detect their presence in samples. It is important to maximize the number of peptides detected to ensure optimal sequence coverage to accurately infer protein identities. This is especially a challenge against the realities of sample complexity, limitations of coverage when using trypsin-based proteolysis, and occasional poor peptide ionization during MS. Finally, conventional proteomics data does not always measure activity of the many enzymes comprising signaling and metabolic pathways critical to living systems(17). The measurement of metabolites using LC-MS and/or other methods (e.g., NMR), known as metabolomics, can be used to investigate these changes in protein activity.
Given these now maturing, sensitive and accessible technologies across the ‘omic domains, the concept of multi-omic analysis has become a viable option for many researchers. Multi-omics seeks to integrate system-wide information generated by different ‘omic technologies to gain a more comprehensive molecular picture within biological systems. Given the array of ‘omic technologies now available, multi-omics can take on many flavors, depending on the types of information being generated and integrated(18). Here, we review multi-omic approaches which rely on MS-based bottom-up proteomics as the centerpiece. We describe some of the recent advances in experimental methods and sample preparation, and MS instrumentation that have helped overcome some of the past limitations of MS-based proteomics, facilitating the generation of deep proteomic information necessary for multi-omic analysis. We also provide an overview of bioinformatic tools and approaches available for the integration of proteomic data with other ‘omic information. Collectively, this review should help to guide researchers seeking to integrate MS-based proteomics data with other ‘omic information to drive new discoveries across a wide variety of research fields.
2. Recent advances in MS-based proteomics enabling proteome-centered multi-omics
Despite its promise, MS-based proteomics has traditionally lacked the depth of information afforded by NGS technologies focused on sequencing DNA and/or expressed RNA transcripts. Because low-abundance proteins cannot be amplified, and because of the sheer chemical complexity of the potentially millions of expressed proteoforms(14), even the most cutting-edge MS-based proteomic methods reliably detected only a portion of the proteome within complex samples. Fortunately, a combination of advances in the last several years has significantly improved this situation (Figure 1), dramatically increasing the depth of information now attainable by MS-based proteomics. Here we review some of these advances that are now available to the wider research community.
Figure 1.

Overview of methodologies to improve proteome coverage. Sections are highlighted in accordance with their provenance as sample preparation strategies (red), improvements to methods for protein quantitation and experimental design (blue), or innovations to instrument design and/or operation (yellow). Generated using biorender.com.
2.1. Sample preparation for bottom-up proteomics
Increasing the number of detected and quantified proteins begins with optimizing the processing of samples prior to LC-MS analysis -- especially important when analyzing precious, material-limited samples. A conventional protein sample preparation strategy involves the reduction of thiol groups of cysteine amino acids, followed by an alkylation step to prevent the reformation of disulfide bonds, after which the protein samples are enzymatically digested in situ. Following their digestion, samples can then be desalted to remove contaminants and injected into an LC-MS platform for analysis. Although standardized, this workflow of sample preparation can be altered to increase the sensitivity for detection of lower abundance proteins.
Peptides from highly abundant proteins suppress the detection of those from lower abundance proteins. Biofluids such as serum and plasma, as well as others (e.g., urine, lung lavage, cerebrospinal fluid, etc.), are known to contain high-abundance proteins such as albumin, which are many orders of magnitude more abundant than other proteins of interest(19); as such, many products and protocols based on immunoprecipitation strategies have been developed to remove albumin and other carrier proteins from blood(20). Similar strategies have been employed for cerebrospinal fluid (CSF)(21) and urine(22). Many epithelial tissues such as intestinal villi, lung tissue, etc. are also infused with blood, and can also be immunodepleted following homogenization(23) to improve the depth of detection of lower abundance proteins. Indeed, depletion methods coupled with the most sensitive MS instrumentation can now detect proteins across ten orders of magnitude in abundance from serum or plasma samples(24).
Despite the utility of in situ protein digestion with trypsin, the standard protocols incorporate multiple sample-handling steps, making them less than ideal for material-limited samples. Handling steps introduce sample loss and make processing of large sample cohorts cumbersome. As a solution, Filter-Aided Sample Preparation (FASP)(25) was introduced, wherein reduction, alkylation, buffer exchange, and digestion can all occur in a single “pot”, using molecular weight cutoff filters within a single microcentrifuge tube as a reaction vessel. Other strategies have followed and extended the FASP methods, including sample processing and clean-up using small-scale solid-phase extraction stage tips(26), digestion of sequestered proteins in the three-dimensional S-Trap(27), immobilization of proteins onto solid sphere supports in enzymatic reactors(28), and precipitation of proteins onto magnetic beads(29) with subsequent digestion, all of which allow materials to be processed while minimizing sample-handling steps.
Fractionation of complex peptide samples generated via protein digestion prior to LC-MS analysis also provides a means to increase sensitivity. Fractionation using orthogonal LC methods has long been known to increase sensitivity via simplification of mixtures introduced into the MS, thereby relieving suppression(30). SDS-PAGE, followed by in-gel protease digestion, allows for the pre-fractionation of proteins prior to digestion, though this approach may be unfeasible for large cohorts of protein samples. A common alternative strategy is to perform high-pH, reverse-phase prefractionation of peptide samples using either high-performance liquid chromatography(31) or commercial centrifuge-based kits. Because LC-MS analysis for bottom-up proteomics is generally performed at low pH on reverse-phase columns, this gives results akin to two-dimensional liquid chromatography (2D-LC) coupling orthogonal separation methods. Other stationary phases useful for peptide pre-fractionation include ion exchange resins(32), mixtures of ion exchange and reverse-phase modalities(33), and pentafluorophenyl (PFP) resin(34). For experiments involving extremely limited amounts of material, stage tip-based fractionation can be performed to increase the number of proteins identified(35).
2.2. Experimental methods for quantitative proteomics: advances in experimental methods, design, and instrumental analysis
The experimental design of proteomics experiments, which is guided by the experimental methods employed, has direct bearing on their inherent utility in the context of multi-omics analyses. To determine abundance changes in the proteome in response to stimuli and integrate these changes with other ‘omic information, it is important to get accurate and deep quantitative data on the proteome. The main strategies for quantitative proteomics break down along two lines, namely the unlabeled and labeled methods. With unlabeled quantitation, also called label-free quantitation (LFQ), the digested and desalted peptides are analyzed via LC-MS with no chemical modification to the peptides themselves. Quantitative information on peptide and protein levels in LFQ comes from the spectral counts (counting the number of peptide spectral matches, or PSMs, that map to a given protein), or through the area under the curve (AUC) in the MS1 chromatogram for peptides identified by PSMs(36). Label-free quantitation is now a mature methodology with numerous software suites available for these analyses(37), though the method is not without its limitations. Since the peptide samples in LFQ proteomics experiments are analyzed in the mass spectrometer individually, stochastic variances in the intensity of the same species across multiple replicates can introduce uncertainty in the measurements(38) or even the loss of signal between runs(39) and must be accounted for in normalization strategies. Aside from potential methodological problems, LFQ can be impractical when processing many samples due to the large amount of instrument time. In addition, the fractionation methodologies needed to perform deep sequencing and quantitation on LFQ samples may be difficult to do with small amounts of input sample.
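The two label-free readouts described above can be illustrated with a minimal Python sketch. It assumes that PSMs are already available as (peptide, protein) pairs and that an MS1 extracted ion chromatogram is given as paired retention times and intensities. The function names `spectral_counts` and `peak_area`, the identifiers, and the numbers are hypothetical and are not taken from any particular LFQ software suite.

```python
from collections import Counter


def spectral_counts(psms):
    """Count peptide-spectrum matches (PSMs) per protein.

    `psms` is an iterable of (peptide_sequence, protein_id) pairs.
    """
    return Counter(protein for _peptide, protein in psms)


def peak_area(times, intensities):
    """Trapezoidal area under an MS1 extracted ion chromatogram."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have equal length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


# Hypothetical PSM list and chromatogram, for illustration only.
psms = [("PEPTIDEA", "P1"), ("PEPTIDEB", "P1"), ("PEPTIDEC", "P2")]
counts = spectral_counts(psms)
area = peak_area([0.0, 1.0, 2.0, 3.0], [0.0, 10.0, 10.0, 0.0])
```

In practice, both readouts would still require the between-run normalization discussed above before samples are compared.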
In contrast to LFQ, labeled mass spectrometry methods utilize labels containing stable, heavy isotopes to differentiate peptides from different samples by mass signatures. One strategy is stable isotope labeling by amino acids in cell culture (SILAC), in which cell lines may be grown in media supplemented with isotopically labeled amino acids, resulting in cells that constitutively express either normal (“light”) or stable isotope labeled (“heavy”) proteins(40). “Heavy” cells are treated or perturbed in some fashion, alongside “light” control cells, after which the proteins from each cell population are isolated, digested, combined, and analyzed in the same LC-MS experiment. The creation of an intermediate “medium” channel can also be done using combinations of heavy isotope-labelled amino acids, allowing for the analysis of multiple conditions in triple SILAC. Detected heavy, medium, and light labeled peptides have distinct masses, but similar chromatographic and ionization behavior. The heavy, medium, and light peptides are identified from the MS/MS spectra and are quantified using the AUC values from the MS1 chromatogram, minimizing the need for between-run normalizations. Comparison of the AUC values provides relative abundance measures for the peptides and inferred proteins. The downsides of the SILAC methodology are the limited number of conditions that can be tested at present(41) as well as the potential difficulty in producing cell lines or testing subjects with identical degrees of protein labeling.
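As a rough illustration of how heavy and light AUC values translate into relative abundance, the following Python sketch computes a log2 heavy/light ratio per peptide and summarizes a protein by the median of its peptide ratios. The use of the median as the protein-level summary is an assumption made here for simplicity, and the function names and AUC values are hypothetical.

```python
import math
from statistics import median


def log2_ratio(heavy_auc, light_auc):
    """log2 of the heavy/light MS1 area ratio for one peptide pair."""
    if heavy_auc <= 0 or light_auc <= 0:
        raise ValueError("AUC values must be positive")
    return math.log2(heavy_auc / light_auc)


def protein_log2_ratio(peptide_pairs):
    """Median log2(heavy/light) over a protein's peptide pairs.

    `peptide_pairs` is an iterable of (heavy_auc, light_auc) tuples.
    The median is an assumed summary statistic, chosen for robustness.
    """
    return median(log2_ratio(h, l) for h, l in peptide_pairs)


# Hypothetical AUC pairs for three peptides of one protein.
pairs = [(4.0e6, 1.0e6), (2.0e6, 1.0e6), (8.0e6, 1.0e6)]
ratio = protein_log2_ratio(pairs)
```

Because both members of each pair come from the same LC-MS run, no between-run correction is applied in this sketch.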
Finally, another strategy uses stable-isotope labeling reagents, called isobaric tags. These tags usually are synthesized to react with the primary amines of N-termini and nucleophilic side chains of peptides, primarily lysine(42), thereby comprehensively and covalently tagging every peptide within a complex mixture generated by trypsin digestion, or potentially other proteases. The tags are isobaric, such that the overall mass added to peptides by the different labels is the same across the different samples being compared. Differentially labeled peptides are detected as a single MS1 peak by LC-MS. Relative quantities of peptides within each sample condition are determined by reporter ions that are generated from peptides selected for MS/MS analysis. Stable isotopes incorporated at different locations in the chemical tag give rise to mass differences in these reporter ions that distinguish the different labeled samples. Comparison of their mass spectral intensity provides a relative abundance measure for each peptide subjected to MS/MS analysis and identified by sequence database searching. Identified peptides are then used to infer protein identities and associated relative abundance compared across experimental conditions. The use of isobaric tags allows for multiplexing samples processed across many conditions. This improves efficiency of instrumentation time needed for the analysis and increases the overall amount of digested peptides being handled due to pooling of labeled samples, allowing for the prefractionation of samples even with low amounts of material derived from each individually labeled sample. While initially limited to comparing only a few different sample conditions, current commercial isobaric labeling strategies can multiplex as many as eighteen individual samples together(43), with modifications being demonstrated for even higher levels of multiplexing(44).
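The reporter-ion logic described above can be sketched in Python as follows. The example assumes that reporter intensities have already been extracted per PSM, sums them per channel across the PSMs assigned to one protein, and expresses each channel as a fraction of the total. The channel labels (`ch1`, `ch2`, `ch3`), function names, and intensities are hypothetical placeholders rather than the labels of any commercial reagent, and summation is only one of several possible ways to roll PSMs up to proteins.

```python
def sum_reporters(psm_reporters):
    """Sum reporter-ion intensities per channel over a protein's PSMs.

    `psm_reporters` is a list of dicts mapping channel label -> intensity.
    """
    sums = {}
    for reporters in psm_reporters:
        for channel, intensity in reporters.items():
            sums[channel] = sums.get(channel, 0.0) + intensity
    return sums


def channel_fractions(channel_sums):
    """Express each channel as a fraction of the summed reporter signal."""
    total = sum(channel_sums.values())
    if total <= 0:
        raise ValueError("total reporter intensity must be positive")
    return {channel: value / total for channel, value in channel_sums.items()}


# Hypothetical reporter intensities for two PSMs of the same protein.
psm_reporters = [
    {"ch1": 100.0, "ch2": 200.0, "ch3": 100.0},
    {"ch1": 300.0, "ch2": 200.0, "ch3": 100.0},
]
protein_sums = sum_reporters(psm_reporters)
fractions = channel_fractions(protein_sums)
```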
Historically, bottom-up proteomics has utilized data-dependent acquisition (DDA) mass spectrometry experiments, in which the most abundant peptides in every chromatogram peak detected in the MS1 scan are selected for fragmentation and detection in the MS/MS scan(45). The use of DDA experiments is still widespread and has had great utility in proteomics and multi-omics analyses(46). While this has been a largely successful approach, DDA can miss peaks from very low abundance peptides in a complex sample, limiting the number of identified peptides and inferred proteins. Indeed, many studies have noted that the semi-stochastic sampling of DDA experiments results in irreproducible measurements of peptides across multiple samples(38). This can be mitigated, though not entirely, using the isobaric labelling and fractionation strategies described previously. An alternative strategy to improve the depth and reproducibility of quantitative proteomics is data-independent acquisition (DIA). Here ions are continuously collected and fragmented by collecting MS/MS in overlapping m/z windows(47) across the entire range of expected peptide m/z values. Results are deconvoluted by extraction of co-eluting fragment peaks that belong to a single starting peptide detected within any m/z window. Spectral libraries of fragments derived from all the detectable peptides within a proteome are used to confirm the identity of co-eluting fragment ions. Quantification is achieved by AUC measurements of the peptide-specific fragment ions. While initially limited to DDA-generated spectral libraries, DIA-based bottom-up proteomics can now be performed using libraries generated using the DIA data itself(48) or wholly generated using deep learning prediction strategies(49).
As an alternative to DDA, the DIA approach has been shown to provide more accurate quantitation, with more consistent detection of peptides across samples and from low-abundance proteins, with potential for high-throughput analysis(50). Such results can greatly benefit the quantitative aspect of multi-omic analyses centered on MS-based proteomics data.
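To make the idea of overlapping DIA isolation windows concrete, the Python sketch below tiles a precursor m/z range with fixed-width windows that overlap their neighbors by a set margin. The function name `dia_windows` and the range, width, and overlap values are illustrative assumptions, not settings recommended by the cited work or by any instrument vendor.

```python
def dia_windows(mz_start, mz_end, width, overlap):
    """Tile [mz_start, mz_end] with fixed-width, overlapping m/z windows.

    Each window starts `width - overlap` above the previous one, so
    adjacent windows share `overlap` m/z units. The last window is
    truncated at `mz_end`.
    """
    if overlap < 0 or width <= overlap:
        raise ValueError("require 0 <= overlap < width")
    step = width - overlap
    windows = []
    low = mz_start
    while low < mz_end:
        high = min(low + width, mz_end)
        windows.append((low, high))
        if high >= mz_end:
            break
        low += step
    return windows


# Illustrative values only: 25 m/z windows with 1 m/z overlap.
windows = dia_windows(400.0, 500.0, 25.0, 1.0)
```

Narrower windows reduce the number of co-fragmented precursors per MS/MS spectrum at the cost of a longer cycle through the full range.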
2.3. Improvements in MS instrumentation
Novel innovations have also been made to MS instrumentation allowing for more sensitive peptide and protein identification in proteomics. One way of increasing the depth of proteome coverage is through pre-fractionation in the mass spectrometer itself using gas-phase fractionation (GPF), which is performed by multiple injections of an individual sample using variable isolation windows covering small 100–200 m/z ranges(51); by doing this, the mass spectrometer isolates and examines discrete mass ranges of precursor ions and reduces the amount of potential ion suppression by coeluting peptides. This simple technique has been shown to be powerful enough to potentially eliminate the need for LC separations(52) and has been put to effective use in creating deep spectral libraries for DIA experiments(48).
A notable instrumental advance is the improved scan-rates of mass spectrometers. Faster acquisition of MS and MS/MS spectra enables more comprehensive sampling of peptides in complex mixtures, increasing depth of detection while maintaining high mass resolution(53). For example, the scan rate of the Orbitrap-based family of mass spectrometers was doubled with the introduction of ultra-high-field orbital traps in the newer QExactive HF(54) and Fusion Tribrid(55) mass spectrometers. Increased sensitivity has enabled improved performance such as detection of the entire yeast proteome in as little as an hour(56).
In recent years, ion mobility spectrometry (IMS) has been integrated into many mass spectrometers, adding a new separation function for peptide mixtures through their collision cross section (CCS), increasing the number of peptides identified in complex mixtures(57). IMS can be accomplished by different platforms engineered for compatibility with MS instruments. One of these is high-field asymmetric waveform ion mobility spectrometry (FAIMS). In FAIMS, ions are drawn into a separation chamber via a carrier gas with an alternating RF signal and an applied compensation voltage (CV)(58); by varying the CV, ions can be selectively separated by their CCS and fractionated before entering the mass spectrometer. Notably, FAIMS sources have been integrated into the latest generation of Orbitrap-based mass spectrometers and have shown their utility in improving the depth of coverage in protein sequencing in short gradient runs(59) and detecting low-abundance peptides(60) primarily due to the improved quality of MS/MS data generated. FAIMS coupled with Orbitrap instruments promises to increase the number of protein identifications, important when integrating quantitative proteomics data with NGS information in multi-omics studies.
Another IMS format is trapped ion mobility spectrometry (TIMS). Here, ions are drawn into a tunnel by a gas and held in place by an applied electric field; by incrementally lowering the applied field, ions are sequentially released into the mass spectrometer in order of decreasing CCS values(61). This scanning and fractionation process can be made even faster with a longer tunnel and two applied electric fields, in a process called parallel accumulation-serial fragmentation (PASEF). Here, ions can be continuously trapped and released into the mass spectrometer, greatly improving the sensitivity of TIMS(62). This technology has been coupled with a fast-scanning time-of-flight (TOF) instrument in the Bruker timsTOF instrument, proving adept at sensitive and reproducible peptide detection from even material-limited samples(63). The timsTOF should benefit multi-omics analyses by offering extremely sensitive and reproducible quantitative proteomic results, with potential for analysis of larger sample cohorts in a high-throughput format, providing more complete results when integrated with other ‘omics data.
3. Bioinformatic integration of proteomics and other omics-type approaches
As detailed above, the generation of deep MS-based proteomics and other ‘omics data (e.g., NGS data) has now become accessible for most research laboratories. However, the task of integrating different levels of ‘omic information with proteomic data is not necessarily trivial. Fortunately, significant innovations have occurred in the development of bioinformatic software tools and platforms that address this challenge. In this latter half of the review, we explore bioinformatic methodologies and software for proteomics-based multi-omics and their applications across multiple fields of research.
A basic workflow for the analysis and integration of proteomics data is presented in Figure 2. Once the raw data has been normalized to account for batch effects and render it amenable to statistical analysis, it can then be integrated with NGS data or metabolomics data and interpreted along tracks such as functional analyses, differential expression comparisons, and network analyses.
Figure 2.

Bioinformatic analysis of proteomics and integration with transcriptomic data. In this example, unpublished proteomics and transcriptomic data from a murine IBD model are integrated with QuanTP, their shared decreased genes are subjected to functional analysis in gprofiler, and network analysis is performed via the STRING database.
3.1. Common problems and strategies for data integration
When integrating proteomics with transcriptomic, genomic, metabolomic, or other data, there are several challenges that must be considered and addressed. Annotation of corresponding genes and their protein products is one such challenge; for example, unsynchronized annotations of proteomic and transcriptomic data make comparisons between coding regions and their expressed protein products difficult(64). As a solution, the UniProt database(65) provides a well-curated repository of characterized proteins from diverse organisms. Entries contain annotations for proteins including unique UniProt identifiers cross-referenced with coding gene names, and other identifiers (e.g., RefSeq, Ensembl IDs, etc.) useful for matching proteins to corresponding genomic or transcriptomic sequences. In addition, computational tools such as biomaRt can be used to automatically map protein sequences to common genome or transcriptome sequence coordinates(66).
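Once cross-reference tables of this kind are in hand, matching the two data types reduces to a join on a shared gene-level key. The Python sketch below illustrates this under simplifying assumptions: each protein and each transcript maps to at most one gene, and the identifier-to-gene tables have already been exported from an annotation resource. All identifiers, values, and the function name `pair_by_gene` are hypothetical, and no real database API is called.

```python
def pair_by_gene(protein_values, transcript_values,
                 protein_to_gene, transcript_to_gene):
    """Pair protein and transcript measurements through a shared gene key.

    Returns {gene: (protein_value, transcript_value)} for genes present in
    both datasets. Features without a mapping are dropped. If several
    features map to the same gene, the last one seen wins (a simplification).
    """
    prot_by_gene = {}
    for protein_id, value in protein_values.items():
        gene = protein_to_gene.get(protein_id)
        if gene is not None:
            prot_by_gene[gene] = value

    tx_by_gene = {}
    for transcript_id, value in transcript_values.items():
        gene = transcript_to_gene.get(transcript_id)
        if gene is not None:
            tx_by_gene[gene] = value

    shared = prot_by_gene.keys() & tx_by_gene.keys()
    return {gene: (prot_by_gene[gene], tx_by_gene[gene]) for gene in sorted(shared)}


# Hypothetical log2 fold changes and hypothetical cross-reference tables.
protein_values = {"PROT_A": 1.5, "PROT_B": -0.7, "PROT_C": 0.2}
transcript_values = {"TX_1": 1.1, "TX_3": 0.4}
protein_to_gene = {"PROT_A": "GENE1", "PROT_B": "GENE2"}
transcript_to_gene = {"TX_1": "GENE1", "TX_3": "GENE3"}

paired = pair_by_gene(protein_values, transcript_values,
                      protein_to_gene, transcript_to_gene)
```

Real datasets often contain one-to-many mappings (for example, several isoforms per gene), which would need an explicit aggregation rule rather than the last-one-wins behavior used here.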
Integrating proteomic and metabolomic data presents a different challenge. Unlike genes and their coding sequences, metabolites are not easily mapped directly to a protein’s amino acid sequence; rather, the metabolites may be mapped to those enzymatically active proteins involved in their synthesis, accumulation, excretion, or degradation, as well as those proteins with which they have allosteric interactions(67). This can be done using metabolite databases such as the Human Metabolome Database(68), ConsensusPathDB(69), PathBank(70), etc., and is a functionality of many multi-omics software packages (see below).
Another important consideration for multi-omic analysis is the normalization of the quantitative data (e.g., protein and transcript abundance values), such that dynamic responses at these different levels of ‘omic information can be compared directly. Common strategies for normalization include logarithmic transformation, TMM normalization(71), or normalization relative to a standard in the data. These strategies can be implemented via one’s own data manipulations or through specialized software such as NormalyzerDE(72).
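Two of the simpler strategies named above, logarithmic transformation and normalization relative to a standard, can be combined in a few lines of Python. The sketch below assumes one sample's abundances are held in a dictionary and that one feature serves as the standard; the identifiers, values, and function name are hypothetical, and dropping non-positive values is an assumption made here rather than a recommended treatment of missing data. TMM normalization is not shown.

```python
import math


def log2_relative_to_standard(abundances, standard_id):
    """log2-transform one sample and express it relative to a standard.

    `abundances` maps feature id -> raw abundance for a single sample.
    Non-positive values cannot be log-transformed and are dropped here.
    """
    if standard_id not in abundances or abundances[standard_id] <= 0:
        raise ValueError("standard must be present with a positive abundance")
    reference = math.log2(abundances[standard_id])
    return {
        feature: math.log2(value) - reference
        for feature, value in abundances.items()
        if value > 0
    }


# Hypothetical sample with a standard feature labeled "STD".
sample = {"STD": 100.0, "A": 400.0, "B": 50.0, "C": 0.0}
normalized = log2_relative_to_standard(sample, "STD")
```

Applying the same transformation to protein and transcript tables places both on a comparable log2 scale before they are compared.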
For comparing large scale MS-based proteomic results with corresponding ‘omic data (e.g., transcriptome data derived from RNA-Seq analysis, quantitative MS-based metabolomics data), several different approaches exist. Commonly, researchers will conduct their analyses considering the intersection of expressed genes and/or identified metabolites and the corresponding proteins which were confidently identified and quantified. With this intersection of the ‘omics data, similarities and differences between proteins and corresponding transcripts/metabolites in response to stimuli can be compared. Methods such as component analyses(73) and hierarchical clustering(74) examine the altered system responses that occur under a given condition. These comparative analyses provide insights into potential mechanisms of post-transcriptional or post-translational regulation, offering a unique look at molecular signatures underlying biological function and disease. When considering the union of the complete multi-omics data (e.g., all quantified proteins compared with all quantified transcripts or metabolites), enrichment analysis is often employed on each separate set of results, revealing information on biological pathways and molecular functionalities(75) that may be in common or different between ‘omic domains. In addition, functional relationships between ‘omic datasets can be examined using topographical network analyses to establish changes in the expression of known clusters of genes/gene products, discover new clusters of features, and examine common regulatory elements that may be of interest across datasets(76). When ‘omics data is collected as a part of a time-course study, modelling software can be used to establish the dynamic patterns of biomolecule abundance by calculating their kinetic parameters and identifying elements (e.g., genes, proteins, and metabolites) with similar responses(77).
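As one concrete example of the enrichment step, an over-representation test asks whether a pathway's members appear among the selected features more often than expected by chance. The Python sketch below uses a one-sided hypergeometric test for this purpose; that choice of statistic is an assumption made for illustration, since the tools cited here may use other tests and typically add multiple-testing correction. The function name and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: all features considered (e.g., all quantified proteins)
    n_annotated:  background features annotated to the pathway
    n_selected:   features in the list of interest (e.g., changed proteins)
    n_overlap:    selected features that are annotated to the pathway
    """
    if not 0 <= n_overlap <= min(n_annotated, n_selected):
        raise ValueError("overlap is inconsistent with the set sizes")
    max_overlap = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 10 background features, 3 in the pathway,
# 3 selected, all 3 selected features belong to the pathway.
p_all = enrichment_p_value(10, 3, 3, 3)
p_none = enrichment_p_value(10, 3, 3, 0)
```

Running the same test separately on the protein and transcript lists gives pathway-level results that can then be compared between ‘omic domains.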
3.2. Data curation and annotation of proteomics data for multi-omics analyses
The rise of proteomics-based multi-omics has been greatly facilitated by the standardization of proteomics data storage and interpretation. This in turn has made the information readily accessible to bioinformaticians and improved the ease with which multi-omics data can be analyzed. The Human Proteome Organization(78) has played a large role in this, fostering community engagement and cooperation on large-scale projects and setting forth guidelines for standardizing metadata annotation, data interpretation, and archiving of data in publicly available repositories(79).
3.3. Current software applications for integrative analysis of multi-omics results
Although the computational methods for integrating MS-proteomic and other ‘omic data are known, implementing these different algorithms presents a daunting challenge for many researchers. Fortunately, computational biologists and bioinformaticians have developed accessible software to automate these tasks and generate useful readouts for interpreting this data (Table 1). Given these developments, the challenge for 21st-century systems biologists engaged in multi-omic analyses is not to find suitable software, but to decide which of the many available tools will best suit their purpose. To this end, in this section we offer some insights into software with high value for MS-based proteomics-centered multi-omics (Table 1). While our listing of software is not exhaustive, the tools shown were selected either through our own experience or through in-depth exploration of available tools, with a focus on those showing the most promise for multi-omic applications. We hope this serves as a starting point for researchers entering MS-based proteomics-centric multi-omic studies.
Table 1:
Software tools used for multi-omics data analysis
| Software | Data Types | Functionality | Language | Reference |
|---|---|---|---|---|
| PANTHER | gene features (data agnostic) | functional analysis | R | Mi et al.(80) |
| gProfiler | gene features (data agnostic) | functional analysis | R | Raudvere et al.(81) |
| reString | gene features (data agnostic) | functional analysis | R | Manzini et al.(82) |
| MOGSA | proteome, transcriptome | functional analysis | R | Meng et al.(83) |
| WGCNA | proteome, transcriptome | network analysis | R | Langfelder et al.(84) |
| STRING | gene features (data agnostic) | network analysis | R | Szklarczyk et al.(86) |
| MONGKIE | proteome, phosphoproteome, transcriptome | network analysis | Java | Jang et al.(87) |
| moCluster | proteome, transcriptome | data clustering | R | Meng et al.(89) |
| mixOmics | data agnostic | data clustering, data correlation, network analysis | R | Rohart et al.(90) |
| STATegRa | data agnostic | component analyses, functional analysis | R | Planell et al.(91) |
| iOmicsPASS | genome, proteome, transcriptome | network analysis, functional analysis | C++, R | Koh et al.(94) |
| netOmics | metabolome, proteome, transcriptome | network analysis, functional analysis | R | Bodein et al.(95) |
| QuanTP | proteome, transcriptome (data agnostic) | hierarchical clustering, differential analysis, multivariate analysis | R | Kumar et al.(99) |
Functional analyses, focused on revealing enriched biochemical processes indicated by ‘omics results, are a key aspect of multi-omics analyses. Several software tools have been created to perform this functionality, including PANTHER(80), gProfiler(81), and reString(82). Functional analysis tools like these are generally written with a single set of ‘omics data in mind (e.g., genes, proteins, metabolites), such that analysis is done for each separate ‘omic data set, with comparison of end results across the different levels of information; an exception to this is MOGSA, which was purpose-built to do gene-set analyses on multi-omics data(83).
Topographic network analysis of multi-omics data can yield important information about clusters of molecular features that interact with each other and undergo systemic changes in response to stimuli, and for this reason many applications have been created for this purpose. The WGCNA(84) package in R was designed to perform many aspects of weighted gene correlation network analysis on transcriptomic data, though it can also be used to analyze multiple sets of disparate ‘omics data(85). In addition, the STRING database program, which can be used as an R package or via a graphical user interface online, is able to display known protein-protein interactions and coexpression in datasets(86). Another package that has been found to be useful is the MONGKIE package, providing visualization capabilities of complex multi-omics networks, enabling easier interpretation of results(87).
Clustering MS-based proteomics data with other ‘omics data (most commonly quantitative transcriptomic data) illuminates potential mechanisms of regulation and response to stimuli. For such analysis, it is necessary to employ a clustering of clusters algorithm(88) which first clusters the individual ‘omics data, then clusters the clusters together to identify overarching patterns in multi-omics data. The package moCluster(89) is an especially useful iteration of this strategy, as it is able to perform clustering analysis on multiple levels of ‘omics data in a fraction of the time of similar packages.
Many of these multi-omics software packages are in fact a suite of different algorithms packaged together into a single integrated tool. The mixOmics package(90) represents an exhaustive option for supervised multi-omic analyses, being capable of analyzing individual ‘omics datasets, multiple ‘omics datasets containing measurements of the same features, or meta-analyses of multiple instances of a single ‘omics analysis. Datasets in mixOmics are uploaded as pre-normalized matrices containing rows of features (e.g., genes, proteins etc.) and columns of conditional values with a categorical column containing meta-data on the system of interest. Up to three different datasets can be analyzed together, outputting clustering results, correlation analyses, and network analyses, among other possibilities. In addition, tutorials for this software are readily available at mixomics.org. Another R package, STATegRa(91), is wholly agnostic to the kind of ‘omics input and can accommodate multiple datasets. This package was developed through the STATegra consortium, an international effort to generate statistical analysis tools for ‘omics data(92). The input datasets for STATegRa also require pre-normalization as well as categorical metadata concerning their status as control or case experimental data. Each of the datasets is first subjected to quality control analyses, followed by joined component analyses of sets of two datasets to determine the ‘omics pairing that has the most significant relationship to the condition of interest. These two datasets are then subjected to nonparametric combination(93) to increase their statistical power and determine the features of both datasets that have the most significant bearing on the condition of interest; these features are ultimately subjected to functional analysis via gene-set enrichment analysis.
A notable, recently described platform is iOmicsPASS(94). This platform is unique in that it utilizes proteomics data, transcriptomics data, databases of transcription factor interactions and protein-protein interactions, and conditional metadata to determine the presence of subnetworks within the data, which can then be scored with a pathway enrichment module for networks that are significantly enriched or depleted under varying conditions. Ultimately, iOmicsPASS yields both topographic networks of interacting biomolecules that are enriched and depleted, as well as functional analyses on these pathways to reveal the changes these networks are effecting in response to stimuli. A similar software package is netOmics, which was designed to process multiple ‘omics datasets over extended periods of time(95). Unlike other bioinformatics packages, netOmics uses raw data as inputs, which it can pre-process before analyses. Following normalization, netOmics selects models for each molecule detected to establish their changes over time, after which it creates networks to show linkages between them using protein-protein interaction networks and KEGG pathway databases. Ultimately, the researcher is left with multi-omic interaction networks as well as functional enrichment analysis results over the course of the experiment.
The multi-omic analyses performed depend largely on the background of the researcher and the ‘omic data types gathered as a part of the experiment. The functional analysis and topographical analysis tools detailed in the initial portion of this section were developed for use with individual datasets of genes; aspiring bioinformaticians interested in using these tools for analyzing their proteomics data integrated with other ‘omics data can use these on the intersections of proteomics and other ‘omics datasets representing a relationship of interest (e.g., shared significant changes in abundance). In addition, using these tools as a part of a larger series of analyses may require a level of coding sophistication to input the desired parameters, submit the data, and run the analysis; researchers who are less experienced in crafting and running scripts may then prefer the platforms with multiple functionalities, especially netOmics, as this platform allows for the input of raw data without a priori normalization or other processing on the part of the researcher. Other platforms that allow for automated queuing of tools as workflows may be of use to researchers with limited bioinformatic or programming experience (see below).
3.4. User-friendly multi-omics platforms for increased access and flexibility
Many software applications capable of MS-based proteomics-centered multi-omics analysis were developed as a stand-alone script or bundled package in R, Python, or C++, which are run through the command line or through an interpreter program. While this is not a problem for the skilled bioinformatician, many researchers who are less computationally savvy are hindered by these software implementations. As such, many multi-omic software suites incorporate point-and-click graphical user interfaces (GUIs) that are user friendly and accessible to a wider range of researchers (Table 2). While there are some commercial options, such as Qiagen’s Ingenuity Pathway Analysis (IPA)(96), there are myriad open-source options that are as powerful and simple to use as they are affordable.
Table 2:
Multi-omics data analysis suites
| Suite | Data Types | Functionality | Reference |
|---|---|---|---|
| Galaxy | data agnostic | function agnostic | Jalili et al.(97) |
| Perseus | proteome (data agnostic) | statistical analysis, functional analysis, network analysis | Tyanova et al.(102) |
| OpenOmics | genome, proteome, transcriptome, epigenome | network analysis, functional analysis | Tran et al.(100) |
| multiSLIDE | data agnostic | hierarchical clustering, differential analysis | Ghosh et al.(104) |
| MiBiOmics | data agnostic | network analysis | Zoppi et al.(105) |
Although useful, stand-alone software has some limitations related to multi-omic data analysis. Scalability to handle the processing and memory requirements of large volume data and the ability to integrate disparate software for automated analysis of data from across ‘omic domains are at the forefront of these limitations. To address these issues, bioinformatic workflow platforms have emerged. The Galaxy platform(97) is an open-source bioinformatics platform where bioinformatic tools can be integrated into automated workflows, implemented on powerful high-performance computing infrastructure, and accessed via a user-friendly GUI, designed for wet-bench researchers. Through collective work of our lab and a global network of others, the Galaxy for proteomics (Galaxy-P) project has implemented numerous tools for MS-based proteomics informatics into the platform, making it an ideal environment for multi-omic analysis(98). One example of a Galaxy-P tool is QuanTP, which can perform hierarchical clustering and differential analysis on quantitative proteomics and transcriptomics data, in addition to plotting the fold changes of features in the proteomic data against the transcriptomic data to examine the linear relationship between these results(99). QuanTP can also identify genes and corresponding proteins that are discordant in their quantitative response, in addition to performing k-means clustering to determine clusters of discordant transcripts and proteins that may be regulated post-transcriptionally. Another multi-omics tool currently available in Galaxy is OpenOmics, a Python library and multi-omic workspace that interfaces with public ‘omics databases and can accommodate proteomics, transcriptomics, genomics, and epigenomics data(100). Finally, Galaxy provides a suite of metabolomics data analysis tools(101), enabling analysis of metabolite data within the same environment and integration of results with other ‘omics data.
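To make the idea of discordant transcript and protein responses concrete, the sketch below labels each shared gene according to how far its protein and transcript log2 fold changes diverge. This is an illustrative simplification and not the QuanTP algorithm; the function name, the fixed threshold, and the example values are all assumptions.

```python
def classify_response(protein_lfc, transcript_lfc, threshold=1.0):
    """Label each shared gene as 'concordant' or 'discordant'.

    A gene is called discordant when its protein and transcript log2 fold
    changes differ by at least `threshold` (an arbitrary, assumed cutoff).
    """
    labels = {}
    for gene in sorted(set(protein_lfc) & set(transcript_lfc)):
        difference = protein_lfc[gene] - transcript_lfc[gene]
        labels[gene] = "discordant" if abs(difference) >= threshold else "concordant"
    return labels


# Hypothetical log2 fold changes keyed by gene
protein_lfc = {"GENE_A": 1.2, "GENE_B": 2.0, "GENE_C": -0.1}
transcript_lfc = {"GENE_A": 1.0, "GENE_B": 0.1, "GENE_C": -1.5, "GENE_D": 0.5}

labels = classify_response(protein_lfc, transcript_lfc)
```

Genes labeled discordant (GENE_B and GENE_C in this toy example) are candidates for post-transcriptional regulation and would be the natural input to a subsequent clustering step.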
One platform which shows promise for multi-omic analysis is Perseus, the open-source matrix manipulation software developed for analysis of MS-based proteomics data(102). While developed for proteomics data, the software itself is data agnostic by design and has recently been updated to allow for R and Python scripts to be run within the software, and to enable access to Bioconductor, Conda, and other software repositories(103), making this a potential entrée for proteomics researchers into multi-omic analyses. For researchers who prefer heatmaps to other forms of data visualization, the multiSLIDE web application creates two heatmaps from raw tabular datasets, allowing a bird’s-eye view of both datasets simultaneously as well as direct comparison of a gene in two datasets using the lines that connect shared measurements between datasets(104). Finally, the MiBiOmics platform is a new web application that accepts up to three sets of ‘omics data, performs individual data processing steps and data explorations (in the form of component and network analyses) on each dataset, and integrates the results to give multi-omic networks, co-inertia plots, and hive plots that show relationships between the different datasets(105).
3.5. Integration of specialized proteomics with other ‘omics applications
The software packages discussed above, while useful, were generally built for the analysis of global bottom-up proteomics data with other forms of ‘omics data, which represents only a part of the MS-based proteomics space. We detail below a few other options for dealing with specialized proteomics experiments.
PTMs such as phosphorylation, glycosylation, acetylation etc. are critical to controlling and modulating the functions of proteins in vivo, and as such are important levels of information to integrate with other ‘omics. Generally, PTM data that characterizes protein phosphorylation or acetylation can be analyzed with many of the same tools introduced above, as they are generated through the enrichment of modified peptides followed by bottom-up proteomics analyses. However, tools like the KEA2(106) program, which predicts the kinases and phosphatases responsible for interacting with the enriched or depleted phosphorylation sites in the data, are helpful for assessing signaling networks related to PTMs. In addition, PTM data can be generated ex post facto with tools like ProteoSushi(107) to add an extra layer of information on previously collected bottom-up proteomics data. Others have even shown how top-down proteomics can be combined with bottom-up proteomics to determine quantitative changes in specific proteoforms in a biological system using software such as ProteoCombiner(108).
The use of SRM and MRM for targeted peptide detection has allowed for highly reproducible and accurate quantitation via LC-MS; the resulting data are then interpreted using programs such as Skyline(109) or OpenMS(110) to determine peptide abundances. The resulting abundance dynamics across peptides from panels of proteins can be treated similarly to global proteomics data and integrated with transcriptomic data or metabolomics data using tools like Metascape(111) or Ingenuity Pathway Analysis(96).
3.6. Prominent examples of proteomics-based multi-omics
In the context of describing the technologies that are enabling MS-based proteomics-centric multi-omics, it is worth pointing out some success stories in the application of this still maturing approach. For brevity’s sake, we present here five exemplary studies, which represent the potential of multi-omics centered on MS-based proteomics data to impact diverse fields of biological research (Table 3).
Table 3:
Notable multi-omics studies centered on MS-based proteomics
| Study | ‘Omics used | Application |
|---|---|---|
| Cavalli et al.(112) | proteomics, single-nuclei transcriptomics, chromosome conformation (epigenomics) | Regulatory networks for genes involved in hepatocellular carcinoma |
| Fornecker et al.(113) | proteomics, transcriptomics | Biomarkers for drug resistance in B-cell lymphoma |
| Alcazar et al.(114) | proteomics, transcriptomics, metabolomics, lipidomics | Biomarkers for the development of Type 1 Diabetes Mellitus |
| Lee et al.(115) | proteomics, transcriptomics, metabolomics | Molecular mechanisms of PFOS toxicity |
| McLoughlin et al.(116) | proteomics, transcriptomics, metabolomics | Nutrient stress reactions of maize |
Biomedical research represents an especially ripe opportunity for multi-omic analysis, as the high levels of information provide a holistic picture of molecular underpinnings of health and disease. An excellent example of this is a study by Cavalli et al.(112), in which proteomics is integrated with single-nuclei transcriptomics and chromosomal conformation changes to demonstrate the regulatory networks at play during the onset of hepatocellular carcinoma (HCC). Multi-omic analyses are particularly useful in discerning biomarkers for diseases, as in Fornecker et al.(113), where drug resistance in B-cell lymphoma was investigated via multi-omics to reveal increased abundances of Hexokinase 3, S100 proteins, and others as drivers of this phenotype. Similarly, Alcazar et al. were able to determine through multi-omic integration of plasma sample data that patients with inhibition of miRNA Let-7a-5p and increased activation of the inflammatory pathway proteins are more prone to the development of Type 1 diabetes(114). The use of proteomics-based multi-omic analyses is not limited to biomedical research, having utility in ecotoxicological and agricultural studies. Integration of proteomics, transcriptomics, and metabolomics by Lee et al.(115) was able to show the molecular mechanisms of perfluorooctanesulfonic acid (PFOS) neurotoxicity in zebrafish, while McLoughlin et al. demonstrated the autophagic pathways that occur in maize in response to nutrient deprivation using multi-omic analysis(116).
4. Proteogenomics: genome- and transcriptome-driven proteomics
The nature of data analysis for bottom-up MS-based proteomics, coupled with the proliferation of NGS technologies for DNA and RNA sequencing, has given rise to proteogenomics, a multi-omics approach unique to the integration of data from these ‘omic domains(117). In bottom-up proteomics, MS/MS peptide data is searched against a FASTA-formatted database containing sequences encompassing the proteome or proteomes of interest. Historically this approach relied on the a priori selection of a reference sequence database which may not include some sample-specific sequences of potential significance. For example, alternative splicing and amino acid substitutions are known to be the underlying etiology of many cancers(118), and these sample-specific sequences may not be contained in reference databases. Proteogenomics addresses these limitations by employing NGS of DNA or RNA within the biological sample of interest to generate a sample-specific sequence database that captures potentially translated, novel protein sequences arising from variant gene sequences and/or novel transcription and RNA processing events. MS/MS data are then searched against a combined database of both the reference and novel protein sequences of interest to gain conclusive evidence on the expression of unique protein sequences that may play a key role in biology or disease.
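The combined database described above can be assembled with very little code. The following is a minimal sketch, assuming the reference and novel sequences are already available as header-to-sequence mappings; the `ref|` and `novel|` header prefixes, the function names, and the example sequences are hypothetical and do not follow any particular tool's conventions.

```python
def merge_databases(reference, novel):
    """Combine reference and novel protein sequences, skipping exact duplicates.

    Both inputs map a header to an amino acid sequence. Reference entries are
    added first, so a novel entry identical to a reference sequence is dropped.
    """
    seen = set()
    merged = []
    for source, entries in (("ref", reference), ("novel", novel)):
        for header, sequence in entries.items():
            if sequence in seen:
                continue
            seen.add(sequence)
            merged.append((f"{source}|{header}", sequence))
    return merged


def to_fasta(entries, width=60):
    """Render (header, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for header, sequence in entries:
        lines.append(f">{header}")
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# Hypothetical sequences: one single amino acid variant and one redundant entry
reference = {"P1": "MAKT"}
novel = {"P1_T4S": "MAKS", "P1_copy": "MAKT"}

combined = merge_databases(reference, novel)
fasta_text = to_fasta(combined)
```

Tagging each header with its source makes it straightforward to separate canonical from non-canonical identifications after the database search.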
Proteogenomics is generally conducted using some variation of the workflow presented in Figure 3. This workflow fuses algorithms traditionally used for specific ‘omic domains (DNA/RNA sequencing and MS-based proteomics), also incorporating a number of customized tools necessary to integrate different datatypes and visualize outputs(119). A proteogenomics workflow begins with the alignment of DNA or RNA sequencing data against a reference genome. The sequencing data can be taken from available open-source data or, ideally, generated from DNA or mRNA isolated from the same sample analyzed by MS-based proteomics. In the case of whole genome or exome sequencing data (Figure 3a), the sequences can either be mapped against reference genomes using programs such as Bowtie(120), Minimap2(121), or BWA(122), or be assembled de novo into contigs and then whole genomes using tools such as Velvet(123), SGA(124), or others depending on the read length of the DNA. The sequenced genome can then be subjected either to 6-frame translation into potential proteins(125) or to protein prediction software such as Peptimapper(126) or getorf in EMBOSS(127). For many researchers, RNA-Seq data on expressed transcripts is a popular choice, as it provides a template of transcribed sequences that may give rise to the translated proteome (Figure 3b). Here, sequencing data is aligned against a reference genome using programs like HiSat2(128) or TopHat(129), followed by detection of variants and other novel transcripts using programs such as FreeBayes(130) or GATK(131). Novel RNA sequences can then be converted to protein sequences using programs such as CustomProDB(132). A recent alternative is the Spritz Database engine(133), which takes in raw FASTQ sequences and a reference proteome and generates a FASTA library containing non-canonical sequences.
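The 6-frame translation step mentioned above can be illustrated with a short sketch. It assumes the standard genetic code and an unambiguous DNA alphabet (codons containing other characters are rendered as `X`); it is not the implementation used by any of the cited tools, and the function names and example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three forward (+1..+3) and three reverse (-1..-3) frames."""
    dna = dna.upper()
    frames = {}
    for strand, sequence in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(sequence[offset:])
    return frames


frames = six_frame_translation("ATGGCCAAATAA")
```

In practice each frame would be split at stop codons and filtered by length before being written to the FASTA database, which is one reason 6-frame databases grow so much larger than reference proteomes.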
Figure 3.

Proteogenomics workflows. a) For generating a FASTA protein sequence database from genomic sequencing data, the FASTQ files are either aligned against a reference genome or assembled into contigs and then a working genome. In either case, the resulting assembled sequencing data is either translated into proteins in all six reading frames or submitted to analysis using gene identifying software, the results of which are translated into proteins. b) For generating a FASTA protein sequence database from RNA sequencing data or exome sequencing data, FASTQ files are aligned to a reference genome. The assembled data is then analyzed with a variant sequence detector, the results of which are then converted into FASTA files. Unaligned sequences from UTR transcription or novel RNA processing events are subjected to three-frame translation. c) Raw proteomic data is searched against bespoke FASTA protein sequence databases to detect noncanonical peptide sequences. Flowcharts made using Lucidchart.
Once a proteogenomics FASTA database is created, it can theoretically be used to query raw MS data using analysis software such as MaxQuant(37), SearchGUI(134), or many others (Figure 3c). However, FASTA databases generated from genomic and transcriptomic data have the potential to be much larger than those of the conventional proteome, inviting the potential for increased false positive identifications(135). While this can be controlled for via more stringent false discovery rate (FDR) cutoffs during analysis, this in turn can result in decreased sensitivity as genuine identifications are removed along with false positives; this can be mitigated using strategies to decrease the database size such as two-step searching(136) as well as database sectioning and enrichment strategies(137). Another concern is that potential non-canonical peptides matched to the proteogenomics database may be mismatches that correspond to normal peptides; tools such as BLAST-P(138) and the PepQuery search engine(139) can be employed to ensure confidence in candidate novel peptide sequences identified via proteogenomics. When non-canonical peptides are identified in proteogenomics assays, it is useful to examine their differences to the canonical sequence by mapping them to the genome. This can be done using tools such as the Multi-omics Visualization Platform (MVP)(140) or the Proteogenomic Mapping Tool(141).
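The trade-off between FDR stringency and sensitivity can be made concrete with a small example. The sketch below assumes a target-decoy search, a strategy this section does not name explicitly, and estimates FDR as the ratio of decoy to target matches above a score cutoff. The function name, the input format, and the scores are hypothetical and do not correspond to the output of any particular search engine.

```python
def fdr_threshold(psms, max_fdr=0.01):
    """Find the most permissive score cutoff whose estimated FDR is <= max_fdr.

    `psms` is a list of (score, is_decoy) pairs where a higher score is better.
    FDR at a cutoff is estimated as (# decoys) / (# targets) scoring at or
    above it. Returns None when no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical peptide-spectrum matches: (score, is_decoy)
psms = [
    (9.0, False), (8.0, False), (7.0, False), (6.0, False),
    (5.0, True), (4.0, False), (3.0, True),
]

lenient_cutoff = fdr_threshold(psms, max_fdr=0.2)
strict_cutoff = fdr_threshold(psms, max_fdr=0.0)
```

With the lenient setting, five target matches pass; with the strict setting only four do. A larger search database tends to place more decoy matches among the high scores, which pushes the cutoff upward and removes genuine identifications in the same way.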
Top-down MS allows for detection of distinct proteoforms that are often difficult to characterize fully using tryptic digestion and peptide-centric analysis offered by bottom-up proteomics, wherein variant amino acid sequences or PTM-carrying peptides are detected but mapping to specific proteoforms is not possible. As with bottom-up proteogenomics, tools such as TopPG(142) are being developed to generate reference databases from genetic information to aid in the identification of sample-specific proteoforms which provide a full sequence landscape of any PTMs and/or amino acid sequence variants that might make up a distinct proteoform.
Due to the extensive number of tools necessary for proteogenomics analyses, conventional command line triggering of integrated but disparate bioinformatics tools is a very cumbersome process. Multi-omics workflow platforms such as Galaxy, Peptimapper(126), or PANOPLY(143) allow for the automated generation of proteogenomics databases, searching mass spectrometry data against these databases, and statistical analysis of the results.
5. Conclusion
Bottom-up proteomics holds a valuable place within the hierarchy of ‘omics technologies, directly detecting the functional molecules that collectively drive biochemical mechanisms within a cell, tissue, or organism. While informative, proteome data is only one piece of the network of interconnected biomolecules responsible for cellular function and phenotypes. Integration with DNA or RNA sequencing information that may give rise to translated proteins, or metabolite information which indicates their biochemical activity state, provides a more complete picture. Recent advances in bottom-up MS-based proteomics methodologies and instrumentation now make deeper characterization of the proteome a reality, improving the value of integration with other ‘omic data (e.g., DNA/RNA sequencing results). At the same time, bioinformatics tools have emerged to facilitate analysis of large ‘omics data sets, including options for integration of MS-based proteomics data with other ‘omic levels of information. The linking of DNA and/or RNA NGS data with deep MS-based proteomics data has given rise to the area of proteogenomics, which offers promise in detecting previously unseen protein sequences belonging to proteoforms that may be key to biological processes and disease. As advances continue to make MS-based proteomics more cost-effective, sensitive, and high-throughput, multi-omic analyses centered around this data have the potential to become a pillar of 21st-century systems biology-based research, impacting diverse fields from translational clinical applications to the study of complex environmental phenomena.
6. Expert Opinion
The value of multi-omics centered around MS-based proteomics data has been demonstrated in recent years. High profile studies via the Clinical Proteomic Tumor Analysis Consortium (CPTAC) have been notable examples(144), along with other biomedical studies(145). These approaches have also contributed significantly to research progress combating the ongoing SARS-CoV-2 pandemic(146). The technologies contributing to these multi-omic studies (NGS, high-resolution MS) have become ubiquitous and are now accessible not only to basic laboratory researchers, but also to translational and clinical settings(147). Thus, we are poised to usher in a new era of precision medicine which may bring together these multi-omic technologies to determine the best course of action related to therapeutic interventions and increase the possibility of high value diagnostic and prognostic biomarkers. The outputs of these multi-omic studies (proteins and/or metabolites) nicely feed downstream clinical assays based on targeted MS methods(148), capable of sensitive, rapid, and accurate quantitative analysis across large patient cohorts.
However, to realize the potential of MS-based proteomics-centered multi-omics for clinical translation, advances are still needed. The processing and analysis of the raw data needs to be simplified, so that biologists and clinicians with minimal backgrounds in computer science, and limited time, can efficiently perform these analyses and generate reports with clear outcomes and suggested actions. Much of this review has discussed the bioinformatics suites with GUIs which require no coding experience per se; continuing to develop such software platforms, with input from clinical partners, will be critical to making and keeping multi-omics a regular part of the lab and the clinic. Another challenge needing a solution is the incorporation of patient metadata into the multi-omic workflows deployed for analysis of this data, although progress is being made on this front(149). Additionally, although the depth of MS-based proteomics has significantly improved in recent years, improvements in sequence coverage to identify novel proteoforms, possibly via “middle-down” approaches(150), could increase the value of information from proteogenomics. Lastly, single-cell proteomics, which has high potential for advancing biomedicine, has lagged behind genomic and transcriptomic approaches for analyzing these highly valuable, material-limited sample types. Promising methods by a few specialist labs(151) offer hope, but these need to be proven reliable for use by the broader community.
Another emerging area that fits in the scope of MS-based proteomic-centered multi-omics is the field of metaproteomics(152). Metaproteomics incorporates metagenome information on microbial communities from a wide variety of settings, from human host samples to complex samples (e.g., wastewater, soil) relevant to environmental studies. This multi-omic data can be used to create large protein sequence databases of potential microbe-derived proteins within these samples, which are then used for searching MS/MS data generated from these samples. When analyzed with specialized multi-omic tools(153), the results provide a unique snapshot of the functional proteins expressed by microbial communities which may drive host biology or regulate characteristics of complex ecological systems. These results can also help identify potential metabolic pathways and small molecules generated by the microbiota that play a role in interactions and regulatory mechanisms. Metaproteomics also expands the reach of proteomic-centered multi-omics to studying flora, fauna and microbial communities responding to environmental factors (e.g., climate change, pollution(154), bioremediation(155)) in addition to biomedical applications(156).
The continuous progress in proteome-centric multi-omics points to a promising future where this approach becomes routine. Portable mass spectrometers for deployment both in the field and clinic(157), coupled with automated and portable sample collection and processing devices(158), could make sampling and MS-based proteomics analysis in the field and clinic a reliable option, complementing such approaches that are already emerging for NGS(159). Continued advances in multi-omic software platforms towards customized and automated pipelines would rapidly provide results from the generated data, aiding clinical decisions or guiding mitigation actions for environmental applications.
Article highlights.
- Improvements to bottom-up proteomic technologies, spanning experimental methods, sample preparation, and instrumentation, are providing improved depth and quality of proteome information.
- New software is making it easier to perform multi-omic analyses on proteomics, transcriptomics, and metabolomics data.
- Integration of genomic and transcriptomic sequencing data with mass spectrometry-based proteomics data has driven the emergence of proteogenomic analysis.
- Data generated by bottom-up proteomics, combined with other ‘omics data, result in more thorough molecular descriptions of dynamic biological systems.
Funding
Support for this work was provided by the National Cancer Institute – Informatics Technology for Cancer Research (NCI-ITCR) grant 1U24CA199347 to T.J.G.; A.T.R. was supported by Biotechnology Training Grant NIH T32GM008347.
Declaration of Interests
The authors have no affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the review, including employment, consultancies, honoraria, stock ownership/options, expert testimony, grants, patents, or royalties. The funding agencies supporting this work contributed no information to or influence on the contents of the review.
References
- 1.Chifman J, Laubenbacher R, Torti SV. A systems biology approach to iron metabolism. A Systems Biology Approach to Blood. 2014:201–25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Men AE, Wilson P, Siemering K, Forrest S. Sanger DNA sequencing. Next Generation Genome Sequencing: Towards Personalized Medicine. 2008:1–11. [Google Scholar]
- 3.Costa V, Angelini C, De Feis I, Ciccodicola A. Uncovering the complexity of transcriptomes with RNA-Seq. Journal of Biomedicine and Biotechnology. 2010;2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Ren S, Peng Z, Mao J-H, Yu Y, Yin C, Gao X, et al. RNA-seq analysis of prostate cancer in the Chinese population identifies recurrent gene fusions, cancer-associated long noncoding RNAs and aberrant alternative splicings. Cell research. 2012;22(5):806–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Leimena MM, Ramiro-Garcia J, Davids M, van den Bogert B, Smidt H, Smid EJ, et al. A comprehensive metatranscriptome analysis pipeline and its validation using human small intestine microbiota datasets. BMC genomics. 2013;14(1):1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Mager S, Schönberger B, Ludewig U. The transcriptome of zinc deficient maize roots and its relationship to DNA methylation loss. BMC Plant Biology. 2018;18(1):372. doi: 10.1186/s12870-018-1603-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Balmer NV, Klima S, Rempel E, Ivanova VN, Kolde R, Weng MK, et al. From transient transcriptome responses to disturbed neurodevelopment: role of histone acetylation and methylation as epigenetic switch between reversible and irreversible drug effects. Archives of toxicology. 2014;88(7):1451–68. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Hall J, Taylor J, Valentine HR, Irlam JJ, Eustace A, Hoskin P, et al. Enhanced stability of microRNA expression facilitates classification of FFPE tumour samples exhibiting near total mRNA degradation. British journal of cancer. 2012;107(4):684–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Hecht ES, Scigelova M, Eliuk S, Makarov A. Fundamentals and advances of orbitrap mass spectrometry. Encyclopedia of Analytical Chemistry: Applications, Theory and Instrumentation. 2006:1–40. [Google Scholar]
- 10.Welton JL, Khanna S, Giles PJ, Brennan P, Brewis IA, Staffurth J, et al. Proteomics analysis of bladder cancer exosomes. Molecular & cellular proteomics. 2010;9(6):1324–38. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Rešetar D, Martinović T, Pavelić SK, Andjelković U, Josić D. Proteomics and Peptidomics as Tools for Detection of Food Contamination by Bacteria. Advances in Food Diagnostics 2017. p. 97–137. [Google Scholar]
- 12.Mueller RS, Denef VJ, Kalnejais LH, Suttle KB, Thomas BC, Wilmes P, et al. Ecological distribution and population physiology defined by proteomics in a natural microbial community. Molecular systems biology. 2010;6(1):374. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Smith LM, Kelleher NL. Proteoform: a single term describing protein complexity. Nature methods. 2013;10(3):186–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Smith LM, Kelleher NL, Consortium for Top Down Proteomics. Proteoform: a single term describing protein complexity. Nat Methods. 2013;10(3):186–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Beck M, Schmidt A, Malmstroem J, Claassen M, Ori A, Szymborska A, et al. The quantitative proteome of a human cell line. Molecular Systems Biology. 2011;7(1):549. doi: 10.1038/msb.2011.82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Graveley BR. Alternative splicing: increasing diversity in the proteomic world. Trends in Genetics. 2001;17(2):100–7. doi: 10.1016/S0168-9525(00)02176-4. [DOI] [PubMed] [Google Scholar]
- 17.Rabiller M, Getlik M, Klüter S, Richters A, Tückmantel S, Simard JR, et al. Proteus in the World of Proteins: Conformational Changes in Protein Kinases. Archiv der Pharmazie. 2010;343(4):193–206. doi: 10.1002/ardp.201000028. [DOI] [PubMed] [Google Scholar]
- 18.Pinu FR, Beale DJ, Paten AM, Kouremenos K, Swarup S, Schirra HJ, et al. Systems Biology and Multi-Omics Integration: Viewpoints from the Metabolomics Research Community. Metabolites. 2019;9(4):76. doi: 10.3390/metabo9040076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Anderson NL, Anderson NG. The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics. 2002;1(11):845–67. doi: 10.1074/mcp.r200007-mcp200. [DOI] [PubMed] [Google Scholar]
- 20.Polaskova V, Kapur A, Khan A, Molloy MP, Baker MS. High-abundance protein depletion: Comparison of methods for human plasma biomarker discovery. ELECTROPHORESIS. 2010;31(3):471–82. doi: 10.1002/elps.200900286. [DOI] [PubMed] [Google Scholar]
- 21.Jankovska E, Lipcseyova D, Svrdlikova M, Pavelcova M, Kubala Havrdova E, Holada K, et al. Quantitative proteomic analysis of cerebrospinal fluid of women newly diagnosed with multiple sclerosis. International Journal of Neuroscience. 2020:1–11. doi: 10.1080/00207454.2020.1837801. [DOI] [PubMed] [Google Scholar]
- 22.Duangkumpha K, Stoll T, Phetcharaburanin J, Yongvanit P, Thanan R, Techasen A, et al. Urine proteomics study reveals potential biomarkers for the differential diagnosis of cholangiocarcinoma and periductal fibrosis. PLOS ONE. 2019;14(8):e0221024. doi: 10.1371/journal.pone.0221024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Prieto DA, Chan KC, Johann DJ, Ye X, Whitely G, Blonder J. Preparation and Immunoaffinity Depletion of Fresh Frozen Tissue Homogenates for Mass Spectrometry-Based Proteomics in the Context of Drug Target/Biomarker Discovery. In: Lazar IM, Kontoyianni M, Lazar AC, editors. Proteomics for Drug Discovery: Methods and Protocols. New York, NY: Springer New York; 2017. p. 71–90. [DOI] [PubMed] [Google Scholar]
- 24.Blume JE, Manning WC, Troiano G, Hornburg D, Figa M, Hesterberg L, et al. Rapid, deep and precise profiling of the plasma proteome with multi-nanoparticle protein corona. Nature Communications. 2020;11(1):3662. doi: 10.1038/s41467-020-17033-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Wiśniewski JR, Zougman A, Nagaraj N, Mann M. Universal sample preparation method for proteome analysis. Nature Methods. 2009;6(5):359–62. doi: 10.1038/nmeth.1322. [DOI] [PubMed] [Google Scholar]
- 26.Kulak NA, Pichler G, Paron I, Nagaraj N, Mann M. Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nature Methods. 2014;11(3):319–24. doi: 10.1038/nmeth.2834. [DOI] [PubMed] [Google Scholar]
- 27.HaileMariam M, Eguez RV, Singh H, Bekele S, Ameni G, Pieper R, et al. S-Trap, an Ultrafast Sample-Preparation Approach for Shotgun Proteomics. Journal of Proteome Research. 2018;17(9):2917–24. doi: 10.1021/acs.jproteome.8b00505. [DOI] [PubMed] [Google Scholar]
- 28.Yuan H, Zhang S, Zhao B, Weng Y, Zhu X, Li S, et al. Enzymatic Reactor with Trypsin Immobilized on Graphene Oxide Modified Polymer Microspheres To Achieve Automated Proteome Quantification. Analytical Chemistry. 2017;89(12):6324–9. doi: 10.1021/acs.analchem.7b00682. [DOI] [PubMed] [Google Scholar]
- 29.Hughes CS, Foehr S, Garfield DA, Furlong EE, Steinmetz LM, Krijgsveld J. Ultrasensitive proteome analysis using paramagnetic bead technology. Molecular Systems Biology. 2014;10(10):757. doi: 10.15252/msb.20145625. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Yates JR, Carmack E, Hays L, Link AJ, Eng JK. Automated protein identification using microcolumn liquid chromatography-tandem mass spectrometry. 2-D Proteome Analysis Protocols. 1999:553–69. [DOI] [PubMed] [Google Scholar]
- 31.Wang Z, Ma H, Smith K, Wu S. Two-dimensional separation using high-pH and low-pH reversed phase liquid chromatography for top-down proteomics. International Journal of Mass Spectrometry. 2018;427:43–51. doi: 10.1016/j.ijms.2017.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Chan KC, Issaq HJ. Fractionation of peptides by strong cation-exchange liquid chromatography. Methods Mol Biol. 2013;1002:311–5. doi: 10.1007/978-1-62703-360-2_23. [DOI] [PubMed] [Google Scholar]
- 33.Yu F, Haynes SE, Nesvizhskii AI. IonQuant Enables Accurate and Sensitive Label-Free Quantification With FDR-Controlled Match-Between-Runs. Molecular & Cellular Proteomics. 2021;20:100077. doi: 10.1016/j.mcpro.2021.100077. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Grassetti AV, Hards R, Gerber SA. Offline pentafluorophenyl (PFP)-RP prefractionation as an alternative to high-pH RP for comprehensive LC-MS/MS proteomics and phosphoproteomics. Analytical and Bioanalytical Chemistry. 2017;409(19):4615–25. doi: 10.1007/s00216-017-0407-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Dimayacyac-Esleta BR, Tsai CF, Kitata RB, Lin PY, Choong WK, Lin TD, et al. Rapid High-pH Reverse Phase StageTip for Sensitive Small-Scale Membrane Proteomic Profiling. Anal Chem. 2015;87(24):12016–23. Epub 20151120. doi: 10.1021/acs.analchem.5b03639. [DOI] [PubMed] [Google Scholar]
- 36.Bubis JA, Levitsky LI, Ivanov MV, Tarasova IA, Gorshkov MV. Comparative evaluation of label-free quantification methods for shotgun proteomics. Rapid Commun Mass Spectrom. 2017;31(7):606–12. doi: 10.1002/rcm.7829. [DOI] [PubMed] [Google Scholar]
- 37.Tyanova S, Temu T, Cox J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nature Protocols. 2016;11(12):2301–19. doi: 10.1038/nprot.2016.136. [DOI] [PubMed] [Google Scholar]
- 38.Stead DA, Paton NW, Missier P, Embury SM, Hedeler C, Jin B, et al. Information quality in proteomics. Briefings in Bioinformatics. 2008;9(2):174–88. doi: 10.1093/bib/bbn004. [DOI] [PubMed] [Google Scholar]
- 39.Lazar C, Gatto L, Ferro M, Bruley C, Burger T. Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. J Proteome Res. 2016;15(4):1116–25. Epub 20160301. doi: 10.1021/acs.jproteome.5b00981. [DOI] [PubMed] [Google Scholar]
- 40.Chen X, Wei S, Ji Y, Guo X, Yang F. Quantitative proteomics using SILAC: Principles, applications, and developments. Proteomics. 2015;15(18):3175–92. Epub 20150714. doi: 10.1002/pmic.201500108. [DOI] [PubMed] [Google Scholar]
- 41.Deng J, Erdjument-Bromage H, Neubert TA. Quantitative Comparison of Proteomes Using SILAC. Curr Protoc Protein Sci. 2019;95(1):e74. Epub 20180920. doi: 10.1002/cpps.74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Thompson A, Schäfer J, Kuhn K, Kienle S, Schwarz J, Schmidt G, et al. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal Chem. 2003;75(8):1895–904. doi: 10.1021/ac0262560. [DOI] [PubMed] [Google Scholar]
- 43.Li J, Cai Z, Bomgarden RD, Pike I, Kuhn K, Rogers JC, et al. TMTpro-18plex: The Expanded and Complete Set of TMTpro Reagents for Sample Multiplexing. Journal of Proteome Research. 2021;20(5):2964–72. doi: 10.1021/acs.jproteome.1c00168. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Jiang H, Zhang L, Zhang Y, Xie L, Wang Y, Lu H. HST-MRM-MS: a novel high-sample-throughput multiple reaction monitoring mass spectrometric method for multiplex absolute quantitation of hepatocellular carcinoma serum biomarker. Journal of proteome research. 2018;18(1):469–77. [DOI] [PubMed] [Google Scholar]
- 45.Han X, Aslanian A, Yates JR 3rd. Mass spectrometry for proteomics. Current opinion in chemical biology. 2008;12(5):483–90. doi: 10.1016/j.cbpa.2008.07.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Meyer JG. Fast Proteome Identification and Quantification from Data-Dependent Acquisition–Tandem Mass Spectrometry (DDA MS/MS) Using Free Software Tools. Methods and protocols. 2019;2(1):8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Chapman JD, Goodlett DR, Masselon CD. Multiplexed and data-independent tandem mass spectrometry for global proteome profiling. Mass Spectrometry Reviews. 2014;33(6):452–70. doi: 10.1002/mas.21400. [DOI] [PubMed] [Google Scholar]
- 48.Pino LK, Just SC, MacCoss MJ, Searle BC. Acquiring and Analyzing Data Independent Acquisition Proteomics Experiments without Spectrum Libraries. Molecular & Cellular Proteomics. 2020;19(7):1088–103. doi: 10.1074/mcp.P119.001913. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Sinitcyn P, Hamzeiy H, Salinas Soto F, Itzhak D, McCarthy F, Wichmann C, et al. MaxDIA enables library-based and library-free data-independent acquisition proteomics. Nature Biotechnology. 2021;39(12):1563–73. doi: 10.1038/s41587-021-00968-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Zhou Y, Tan Z, Xue P, Wang Y, Li X, Guan F. High-throughput, in-depth and estimated absolute quantification of plasma proteome using data-independent acquisition/mass spectrometry (“HIAP-DIA”). Proteomics. 2021;21(5):e2000264. Epub 20210223. doi: 10.1002/pmic.202000264. [DOI] [PubMed] [Google Scholar]
- 51.Kennedy J, Yi EC. Use of gas-phase fractionation to increase protein identifications: application to the peroxisome. Methods Mol Biol. 2008;432:217–28. doi: 10.1007/978-1-59745-028-7_15. [DOI] [PubMed] [Google Scholar]
- 52.Meyer JG, Niemi NM, Pagliarini DJ, Coon JJ. Quantitative shotgun proteome analysis by direct infusion. Nature methods. 2020;17(12):1222–8. Epub 2020/11/23. doi: 10.1038/s41592-020-00999-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Trujillo EA, Hebert AS, Brademan DR, Coon JJ. Maximizing Tandem Mass Spectrometry Acquisition Rates for Shotgun Proteomics. Analytical Chemistry. 2019;91(20):12625–9. doi: 10.1021/acs.analchem.9b02979. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Kelstrup CD, Jersie-Christensen RR, Batth TS, Arrey TN, Kuehn A, Kellmann M, et al. Rapid and deep proteomes by faster sequencing on a benchtop quadrupole ultra-high-field Orbitrap mass spectrometer. Journal of proteome research. 2014;13(12):6187–95. [DOI] [PubMed] [Google Scholar]
- 55.Senko MW, Remes PM, Canterbury JD, Mathur R, Song Q, Eliuk SM, et al. Novel parallelized quadrupole/linear ion trap/Orbitrap tribrid mass spectrometer improving proteome coverage and peptide identification rates. Anal Chem. 2013;85(24):11710–4. Epub 20131127. doi: 10.1021/ac403115c. [DOI] [PubMed] [Google Scholar]
- 56.Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics. 2014;13(1):339–47. Epub 20131019. doi: 10.1074/mcp.M113.034769. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Distler U, Kuharev J, Navarro P, Tenzer S. Label-free quantification in ion mobility-enhanced data-independent acquisition proteomics. Nat Protoc. 2016;11(4):795–812. Epub 20160324. doi: 10.1038/nprot.2016.042. [DOI] [PubMed] [Google Scholar]
- 58.Swearingen KE, Moritz RL. High-field asymmetric waveform ion mobility spectrometry for mass spectrometry-based proteomics. Expert review of proteomics. 2012;9(5):505–17. doi: 10.1586/epr.12.50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Bekker-Jensen DB, Martínez-Val A, Steigerwald S, Rüther P, Fort KL, Arrey TN, et al. A Compact Quadrupole-Orbitrap Mass Spectrometer with FAIMS Interface Improves Proteome Coverage in Short LC Gradients. Molecular & Cellular Proteomics. 2020;19(4):716–29. doi: 10.1074/mcp.TIR119.001906. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Klaeger S, Apffel A, Clauser KR, Sarkizova S, Oliveira G, Rachimi S, et al. Optimized Liquid and Gas Phase Fractionation Increases HLA-Peptidome Coverage for Primary Cell and Tissue Samples. Molecular & Cellular Proteomics. 2021;20. doi: 10.1016/j.mcpro.2021.100133. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Ridgeway ME, Lubeck M, Jordens J, Mann M, Park MA. Trapped ion mobility spectrometry: A short review. International Journal of Mass Spectrometry. 2018;425:22–35. doi: 10.1016/j.ijms.2018.01.006. [DOI] [Google Scholar]
- 62.Meier F, Brunner AD, Koch S, Koch H, Lubeck M, Krause M, et al. Online Parallel Accumulation-Serial Fragmentation (PASEF) with a Novel Trapped Ion Mobility Mass Spectrometer. Mol Cell Proteomics. 2018;17(12):2534–45. Epub 20181101. doi: 10.1074/mcp.TIR118.000900. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Aballo TJ, Roberts DS, Melby JA, Buck KM, Brown KA, Ge Y. Ultrafast and Reproducible Proteomics from Small Amounts of Heart Tissue Enabled by Azo and timsTOF Pro. Journal of Proteome Research. 2021;20(8):4203–11. doi: 10.1021/acs.jproteome.1c00446. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Reeves GA, Talavera D, Thornton JM. Genome and proteome annotation: organization, interpretation and integration. Journal of The Royal Society Interface. 2009;6(31):129–47. doi: 10.1098/rsif.2008.0341. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Research. 2021;49(D1):D480–D9. doi: 10.1093/nar/gkaa1100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, et al. BioMart – biological queries made easy. BMC Genomics. 2009;10(1):22. doi: 10.1186/1471-2164-10-22. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Luzarowski M, Vicente R, Kiselev A, Wagner M, Schlossarek D, Erban A, et al. Global mapping of protein–metabolite interactions in Saccharomyces cerevisiae reveals that Ser-Leu dipeptide regulates phosphoglycerate kinase activity. Communications Biology. 2021;4(1):181. doi: 10.1038/s42003-021-01684-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, et al. HMDB: the Human Metabolome Database. Nucleic acids research. 2007;35(Database issue):D521–D6. doi: 10.1093/nar/gkl923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Kamburov A, Stelzl U, Lehrach H, Herwig R. The ConsensusPathDB interaction database: 2013 update. Nucleic Acids Research. 2013;41(D1):D793–D800. doi: 10.1093/nar/gks1055. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Wishart DS, Li C, Marcu A, Badran H, Pon A, Budinski Z, et al. PathBank: a comprehensive pathway database for model organisms. Nucleic Acids Res. 2020;48(D1):D470–D8. doi: 10.1093/nar/gkz861. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology. 2010;11(3):R25. doi: 10.1186/gb-2010-11-3-r25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Willforss J, Chawade A, Levander F. NormalyzerDE: Online Tool for Improved Normalization of Omics Expression Data and High-Sensitivity Differential Expression Analysis. Journal of Proteome Research. 2019;18(2):732–40. doi: 10.1021/acs.jproteome.8b00523. [DOI] [PubMed] [Google Scholar]
- 73.Hyvärinen A. Independent component analysis: recent advances. Philosophical transactions Series A, Mathematical, physical, and engineering sciences. 2012;371(1984):20110534. doi: 10.1098/rsta.2011.0534. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Murtagh F, Contreras P. Algorithms for hierarchical clustering: an overview. WIREs Data Mining and Knowledge Discovery. 2012;2(1):86–97. doi: 10.1002/widm.53. [DOI] [Google Scholar]
- 75.Mooney MA, Wilmot B. Gene set analysis: A step-by-step guide. American journal of medical genetics Part B, Neuropsychiatric genetics : the official publication of the International Society of Psychiatric Genetics. 2015;168(7):517–27. Epub 2015/06/08. doi: 10.1002/ajmg.b.32328. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Santamaría C, García-Mora B, Rubio G, Falcó A. Topographic representation of cancer data using Boolean Networks. Modelling for Engineering & Human Behaviour 2019.180. [Google Scholar]
- 77.Hou J, Acharya L, Zhu D, Cheng J. An overview of bioinformatics methods for modeling biological pathways in yeast. Briefings in functional genomics. 2016;15(2):95–108. Epub 2015/10/17. doi: 10.1093/bfgp/elv040. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Adhikari S, Nice EC, Deutsch EW, Lane L, Omenn GS, Pennington SR, et al. A high-stringency blueprint of the human proteome. Nat Commun. 2020;11(1):5301. Epub 20201016. doi: 10.1038/s41467-020-19045-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Deutsch EW, Lane L, Overall CM, Bandeira N, Baker MS, Pineau C, et al. Human Proteome Project Mass Spectrometry Data Interpretation Guidelines 3.0. Journal of proteome research. 2019;18(12):4108–16. Epub 2019/10/21. doi: 10.1021/acs.jproteome.9b00542. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Mi H, Ebert D, Muruganujan A, Mills C, Albou L-P, Mushayamaha T, et al. PANTHER version 16: a revised family classification, tree-based classification tool, enhancer regions and extensive API. Nucleic Acids Research. 2020;49(D1):D394–D403. doi: 10.1093/nar/gkaa1106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Raudvere U, Kolberg L, Kuzmin I, Arak T, Adler P, Peterson H, et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic acids research. 2019;47(W1):W191–W8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Manzini S, Busnelli M, Colombo A, Franchi E, Grossano P, Chiesa G. reString: an open-source Python software to perform automatic functional enrichment retrieval, results aggregation and data visualization. Scientific Reports. 2021;11(1):23458. doi: 10.1038/s41598-021-02528-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Meng C, Basunia A, Peters B, Gholami AM, Kuster B, Culhane AC. MOGSA: Integrative Single Sample Gene-set Analysis of Multiple Omics Data. Mol Cell Proteomics. 2019;18(8 suppl 1):S153–S68. Epub 20190626. doi: 10.1074/mcp.TIR118.001251. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9(1):559. doi: 10.1186/1471-2105-9-559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Kelly RS, Chawes BL, Blighe K, Virkud YV, Croteau-Chonka DC, McGeachie MJ, et al. An Integrative Transcriptomic and Metabolomic Study of Lung Function in Children With Asthma. Chest. 2018;154(2):335–48. Epub 20180613. doi: 10.1016/j.chest.2018.05.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research. 2018;47(D1):D607–D13. doi: 10.1093/nar/gky1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Jang Y, Yu N, Seo J, Kim S, Lee S. MONGKIE: an integrated tool for network analysis and visualization for multi-omics data. Biology direct. 2016;11(1):10. doi: 10.1186/s13062-016-0112-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Hoadley KA, Yau C, Wolf DM, Cherniack AD, Tamborero D, Ng S, et al. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell. 2014;158(4):929–44. Epub 20140807. doi: 10.1016/j.cell.2014.06.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Meng C, Helm D, Frejno M, Kuster B. moCluster: Identifying Joint Patterns Across Multiple Omics Data Sets. J Proteome Res. 2016;15(3):755–65. Epub 20151230. doi: 10.1021/acs.jproteome.5b00824. [DOI] [PubMed] [Google Scholar]
- 90.Rohart F, Gautier B, Singh A, Lê Cao K-A. mixOmics: An R package for ‘omics feature selection and multiple data integration. PLOS Computational Biology. 2017;13(11):e1005752. doi: 10.1371/journal.pcbi.1005752. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Planell N, Lagani V, Sebastian-Leon P, van der Kloet F, Ewing E, Karathanasis N, et al. STATegra: Multi-Omics Data Integration – A Conceptual Scheme With a Bioinformatics Pipeline. Frontiers in Genetics. 2021;12(143). doi: 10.3389/fgene.2021.620453. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Conesa A. The STATegra project: new statistical tools for analysis and integration of diverse omics data. EMBnet journal. 2014;20(A):768. [Google Scholar]
- 93.Karathanasis N, Tsamardinos I, Lagani V. OmicsNPC: applying the non-parametric combination methodology to the integrative analysis of heterogeneous omics data. PloS one. 2016;11(11):e0165545. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Koh HWL, Fermin D, Vogel C, Choi KP, Ewing RM, Choi H. iOmicsPASS: network-based integration of multiomics data for predictive subnetwork discovery. npj Systems Biology and Applications. 2019;5(1):22. doi: 10.1038/s41540-019-0099-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Bodein A, Scott-Boyer MP, Perin O, Lê Cao KA, Droit A. Interpretation of network-based integration from multi-omics longitudinal data. Nucleic Acids Res. 2021. Epub 20211209. doi: 10.1093/nar/gkab1200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Krämer A, Green J, Pollard J Jr., Tugendreich S. Causal analysis approaches in Ingenuity Pathway Analysis. Bioinformatics. 2014;30(4):523–30. Epub 20131213. doi: 10.1093/bioinformatics/btt703. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Jalili V, Afgan E, Gu Q, Clements D, Blankenberg D, Goecks J, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2020 update. Nucleic Acids Research. 2020;48(W1):W395–W402. doi: 10.1093/nar/gkaa434. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Afgan E, Baker D, Batut B, van den Beek M, Bouvier D, Čech M, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Research. 2018;46(W1):W537–W44. doi: 10.1093/nar/gky379. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Kumar P, Panigrahi P, Johnson J, Weber WJ, Mehta S, Sajulga R, et al. QuanTP: A Software Resource for Quantitative Proteo-Transcriptomic Comparative Data Analysis and Informatics. Journal of Proteome Research. 2019;18(2):782–90. doi: 10.1021/acs.jproteome.8b00727. [DOI] [PubMed] [Google Scholar]
- 100.Tran NC, Gao JX. OpenOmics: A bioinformatics API to integrate multi-omics datasets and interface with public databases. Journal of Open Source Software. 2021;6(61):3249. [Google Scholar]
- 101.Giacomoni F, Le Corguillé G, Monsoor M, Landi M, Pericard P, Pétéra M, et al. Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics. Bioinformatics. 2015;31(9):1493–5. Epub 20141219. doi: 10.1093/bioinformatics/btu813. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Tyanova S, Temu T, Sinitcyn P, Carlson A, Hein MY, Geiger T, et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nature Methods. 2016;13(9):731–40. doi: 10.1038/nmeth.3901. [DOI] [PubMed] [Google Scholar]
- 103.Rudolph JD, Cox J. A Network Module for the Perseus Software for Computational Proteomics Facilitates Proteome Interaction Graph Analysis. Journal of Proteome Research. 2019;18(5):2052–64. doi: 10.1021/acs.jproteome.8b00927. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Ghosh S, Datta A, Choi H. multiSLIDE is a web server for exploring connected elements of biological pathways in multi-omics data. Nature Communications. 2021;12(1):2279. doi: 10.1038/s41467-021-22650-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Zoppi J, Guillaume J-F, Neunlist M, Chaffron S. MiBiOmics: an interactive web application for multi-omics data exploration and integration. BMC Bioinformatics. 2021;22(1):6. doi: 10.1186/s12859-020-03921-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Lachmann A, Ma’ayan A. KEA: kinase enrichment analysis. Bioinformatics. 2009;25(5):684–6. Epub 20090128. doi: 10.1093/bioinformatics/btp026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Seymour RW, van der Post S, Mooradian AD, Held JM. ProteoSushi: A Software Tool to Biologically Annotate and Quantify Modification-Specific, Peptide-Centric Proteomics Data Sets. Journal of Proteome Research. 2021;20(7):3621–8. doi: 10.1021/acs.jproteome.1c00203. [DOI] [PubMed] [Google Scholar]
- 108.Lima DB, Dupré M, Duchateau M, Gianetto QG, Rey M, Matondo M, et al. ProteoCombiner: integrating bottom-up with top-down proteomics data for improved proteoform assessment. Bioinformatics. 2021;37(15):2206–8. doi: 10.1093/bioinformatics/btaa958. [DOI] [PubMed] [Google Scholar]
