Qualitative and Quantitative Shotgun Proteomics Data Analysis from Data-Dependent Acquisition Mass Spectrometry

Jesse G Meyer

doi:10.1007/978-1-0716-1178-4_19

Author manuscript; available in PMC: 2021 Nov 11.

Published in final edited form as: Methods Mol Biol. 2021;2259:297–308. doi: 10.1007/978-1-0716-1178-4_19

PMCID: PMC8583964 NIHMSID: NIHMS1746008 PMID: 33687723

Abstract

Shotgun proteomics is the inferential analysis of proteoforms using peptide proxies produced by enzyme-catalyzed hydrolysis of entire proteomes. Such peptides are usually identified by nanoflow liquid chromatography coupled to tandem mass spectrometry analysis (nLC-MS/MS). Traditionally, MS/MS analysis is performed in data-dependent acquisition (DDA) mode, which usually produces a pattern of fragment masses unique to a single peptide’s fragmentation. Here, I describe a statistically rigorous qualitative and quantitative computational analysis for shotgun proteomics DDA analysis using free open-source software tools. MS/MS data are used to identify peptides, and the area of peptide mass/charge over chromatographic elution is used to quantify peptides. All peptides that uniquely map to a protein sequence predicted from the genome are combined into a single protein quantity, which can then be compared across experimental conditions. Statistically significant protein changes can be summarized using gene ontology or pathway term enrichment analysis.

Keywords: Proteomics, Mass spectrometry, Data-dependent acquisition, Quantification, Peptides, Gene ontology analysis, Pathway analysis

1. Introduction

Shotgun (or bottom-up) proteomics using mass spectrometry has become the most widely used method for unbiased protein quantification from biological samples. Major progress over the last 30 years has enabled over 10,000 proteins to be identified and quantified from a single analysis [1]. The field has developed many free tools for proteomic data analysis. Although the availability of competing tools is healthy, too many options can make it difficult for a novice to know where to start. There are many potential pitfalls that a new proteomics practitioner may not be aware of during the analysis, so standardized and rigorous analysis pipelines are needed. This chapter describes one such data analysis pipeline for qualitative and quantitative proteomic data collected by data-dependent acquisition (DDA) mass spectrometry (MS). This chapter further describes one potential downstream summary analysis of the quantitative proteomic results by term enrichment analysis.

The proteomic sample preparation workflow has many variations, but they all follow the same skeleton of necessary steps. First, proteins are isolated from tissues or cultured cells using aqueous buffer containing a solubilizing and denaturing agent (e.g., guanidine, urea, sodium dodecyl sulfate). Disulfide bonds between cysteine residues in those proteins are reduced and covalently alkylated. “Shotgun proteomics” refers to the fact that before proteome analysis proteins are enzymatically hydrolyzed into peptides, which is done because direct analysis of intact proteins is analytically challenging compared to peptides. Usually, trypsin is used to catalyze peptide production because trypsin’s specificity for arginine and lysine residues produces peptides with good length and charge character for MS detection. Finally, peptides produced from enzymatic hydrolysis are purified from interfering substances (e.g., buffer, denaturant, undigested protein) using solid phase extraction. This highly complex mixture of purified peptides is ready for shotgun proteomic analysis.

Peptides are most commonly analyzed by liquid chromatography (LC) coupled to a hybrid mass spectrometer capable of isolating intact ions and fragmenting them before measuring fragment masses, which is a process called tandem mass spectrometry (MS/MS). There are many ways to operate a hybrid mass spectrometer for peptide analysis. One common method for data collection is data-dependent acquisition, in which the mass spectrometer surveys all peptide masses eluting from the LC column at every point in time (precursor ion scan or MS1 scan) and then selects a number of the most abundant masses to be isolated for fragmentation by collision with an inert gas. The peptide fragment masses resulting from peptide degradation in the gas phase are measured to produce a discrete snapshot called a tandem mass spectrum (or MS/MS spectrum). Many thousands of MS/MS spectra are collected from each LC-MS/MS experiment, which are used for qualitative analysis of peptides in a sample.

After tandem mass spectrometry data are collected, the first step in data analysis is to identify peptides and proteins (Fig. 1a). First, the raw data files produced by the mass spectrometer store data in proprietary formats that must be converted to an open format before subsequent analysis. Most commonly, the raw data are converted to “mzXML” or “mzML” formats, which are human-readable XML. Those files are then used to generate peptide identifications. Peptides are matched to tandem mass spectra by a process called “database search” to produce peptide-spectrum matches (PSMs). Peptide sequences are predicted from the organism’s genome sequence, and the theoretical fragmentation patterns of those peptides are scored against the observed spectra. The peptide with the highest score is assigned to the spectrum, and then the probability that the PSM is true is usually assessed using the target-decoy approach [2, 3]. The target-decoy strategy is based on inclusion of shuffled peptide sequences (decoys) as possible matches for spectra. When a decoy peptide sequence produces a PSM, we assume that it is a wrong answer. Because we include decoy peptides, we can model the distribution of true and false PSMs, and draw a score cutoff that contains a defined proportion of false matches (or false discovery rate, FDR).
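
The score cutoff described above can be sketched in a few lines of Python. This is only an illustration of the target-decoy idea, not the model used by PeptideProphet or iProphet: the function name, the input layout, and the estimator (number of decoys divided by number of targets above the cutoff, one common choice) are assumptions made for this example.

```python
from typing import List, Tuple


def score_cutoff_at_fdr(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms holds (score, is_decoy) pairs, where a higher score is a better match.
    The FDR at a cutoff is estimated as (#decoys >= cutoff) / (#targets >= cutoff).
    Returns float("inf") when no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = float("inf")
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Evaluate only after the last PSM of a group of tied scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Hypothetical PSM scores: five targets, then a decoy, a target, and a decoy.
example_psms = [
    (10.0, False), (9.0, False), (8.0, False), (7.0, False), (6.0, False),
    (5.0, True), (4.0, False), (3.0, True),
]
```

With these hypothetical scores, a 10% limit stops just above the first decoy, while a looser 20% limit admits one decoy among six targets.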

After peptides are identified, if our study is focused on proteins, we must infer the presence of proteins in the sample from the peptides we identified. Conceptually we can assume the presence of any protein where we have identified a peptide sequence that uniquely matches only that protein entry. To rigorously claim a protein identity, we must transfer our peptide scores into protein scores and apply statistics. There are many methods to achieve protein inference [4], including ProteinProphet, whose output is produced as part of this workflow [5], but in this work, we will use the simple and generally stringent criterion of at least two unique peptides identified per protein.
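
The two-unique-peptide criterion can be expressed as a small filter. The sketch below is a minimal illustration under the assumption that peptide-to-protein mappings are already available as a dictionary; the function name and the example identifiers are hypothetical, and in this protocol the filtering is actually performed inside Skyline.

```python
from collections import defaultdict
from typing import Dict, List, Set


def proteins_with_unique_peptides(
    peptide_to_proteins: Dict[str, Set[str]], min_unique: int = 2
) -> Dict[str, List[str]]:
    """Keep proteins supported by at least min_unique peptides that map to them alone."""
    unique_by_protein = defaultdict(list)
    for peptide, proteins in peptide_to_proteins.items():
        # A peptide is "unique" when it matches exactly one protein entry.
        if len(proteins) == 1:
            (protein,) = proteins
            unique_by_protein[protein].append(peptide)
    return {
        protein: sorted(peptides)
        for protein, peptides in unique_by_protein.items()
        if len(peptides) >= min_unique
    }


# Hypothetical mapping: PEPTIDEC is shared, so it supports neither protein.
example_mapping = {
    "PEPTIDEA": {"ProtA"},
    "PEPTIDEB": {"ProtA"},
    "PEPTIDEC": {"ProtA", "ProtB"},
    "PEPTIDED": {"ProtB"},
}
```

Here only ProtA is retained, because ProtB has a single unique peptide once the shared peptide is set aside.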

To quantify proteins in shotgun proteomics, we quantify the peptides that uniquely match proteins. Peptides identified by nLC-MS/MS are quantified using the area under the curve of the peptide’s intact mass/charge over chromatographic elution time (Fig. 1b) [6]. The quantities of peptides for each protein are then combined into a single protein quantity (Fig. 1c). Finally, we use statistical tests to determine whether protein quantities are different across conditions (assessed by a p-value) and the magnitude of that difference (assessed by fold change). Significance and magnitude of changes are often visualized simultaneously using a volcano plot (Fig. 1d).
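
To make these quantities concrete, the following sketch integrates a peptide's extracted ion chromatogram with the trapezoidal rule and computes a log2 fold change between two groups. It is a simplified illustration only: Skyline and MSstats use their own peak integration and statistical models, and the function names and the use of simple group means are assumptions made for this example.

```python
import math
from typing import Sequence


def peak_area(times: Sequence[float], intensities: Sequence[float]) -> float:
    """Area under an intensity-versus-retention-time trace (trapezoidal rule)."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


def log2_fold_change(treated: Sequence[float], control: Sequence[float]) -> float:
    """log2 of the ratio of mean quantities (treated over control)."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2(mean_treated / mean_control)


# Hypothetical triangular elution profile sampled once per minute.
example_area = peak_area([0, 1, 2, 3, 4], [0, 10, 20, 10, 0])
```

A protein whose mean quantity doubles between conditions has a log2 fold change of 1, which is the threshold used later in this chapter.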

This chapter describes a fast and statistically rigorous workflow for analysis of peptides from data-dependent acquisition proteomics data. This analysis includes standard protein quantitation and statistical testing to find differences. This chapter also describes and demonstrates one downstream strategy for analysis of the protein changes induced by biological treatments, gene ontology (GO) term enrichment analysis. These analyses are demonstrated using public data from proteomic analysis of primary microglia from mouse brain across controls, ethanol, and lipopolysaccharide (LPS) stresses [7], but the protocol herein can be used to analyze any data collected with the same common DDA strategy.

2. Materials

This protocol requires a computer running Windows 10 with a multi-core 64-bit processor (at least quad core i5 recommended) and at least 8 GB of RAM with at least 50 GB of free disk space.

2.1. Peptide Identification by Database Search

Tutorial data files (.RAW from PRIDE repository PXD014466 [8], https://www.ebi.ac.uk/pride/archive/projects/PXD014466, original publication [7]).
MSconvert.exe (part of ProteoWizard, latest version, used version 3.0.19039 here, http://proteowizard.sourceforge.net/) [9].
FragPipe.exe (version 12.1, https://github.com/Nesvilab/FragPipe/releases/tag/12.1).
MSfragger.jar (version 2.2, http://msfragger.nesvilab.org/) [10].
Philosopher.exe (version 2.0, contains PeptideProphet, ProteinProphet, and iProphet, https://philosopher.nesvilab.org/) [5, 11, 12].

2.2. Quantification of Peptides and Detection of Protein Changes

Skyline-Daily Software (latest version, used version 20.1 here, https://skyline.ms) [13].
R statistical computing software (version 3.5.1, base version: www.r-project.org, RStudio suggested: https://rstudio.com/) [14].

2.3. Gene Ontology Term Enrichment Analysis and Network Visualization

Cytoscape network analysis software (latest version, used 3.5.1 here, https://cytoscape.org/) [15].
Cytoscape plug-in ClueGO (latest version, download within Cytoscape from App Manager, http://apps.cytoscape.org/apps/cluego) [16].

3. Methods

3.1. Peptide Identification by Database Search

Download the tutorial MS files and install the software listed in Subheading 2.
Convert .RAW files to mzML format using MSconvert. Right click on one .RAW file, and then select “Open with MSconvertGUI.” At the top left, click “Browse,” and add all the files. Under “Options” in the bottom left, set the output format to “mzML,” and the “Binary encoding precision” to 64-bit. Under “Filters” on the right, from the top drop-down menu, select “Peak Picking” and click “Add” below. There should be two filters in the box at the bottom right as pictured in Fig. S1. NOTE: Ensure you have at least an additional 25 GB of space on the drive that holds your mzML files.
Open FragPipe.exe and start with the first tab named “Config.” You will notice pop-up prompts asking you to enter the location of the MSFragger.jar file and the Philosopher.exe file. Click “Browse” for each prompt and navigate to the location of the corresponding file.
Under the second tab “Select LC/MS Files,” click “Add files” at the top left, and navigate to the mzML files you created in the conversion step. Add group names to the “Experiment” column and replicate numbers to the “Replicate” column matching the file names (see Note 1).
On the third “Database” tab, click “Download” on the top right to download the mouse database (if you are using the tutorial data or your data are from mouse). In the pop-up box, click the box for “add isoforms,” and then select “Mus musculus” below, and then click “Yes” at the bottom to start downloading the FASTA database.
On the fourth tab “MSFragger,” near “Load defaults” at the top, click “Closed Search” to load the default parameters. Under “Peak Matching” near the top, change the precursor mass tolerance to −20 and 20.
Skip the fifth tab, and on the sixth “Report” tab, in the “Report” group at the top, check the box next to “Generate peptide level summary.” In the middle group “Quantitation,” uncheck the box “Run Quantitation” (see Note 2).
On the final tab “Run,” set the output directory to an appropriate location (e.g., C:\FragPipe\tutorial_output\), and then click “RUN” near the top left. The entire analysis workflow will now commence. This will take about an hour depending on the speed of your computer. After successful processing, you should find several new files in the folder you specified including “combined.pep.xml,” which we will use in the next section.
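
For readers unfamiliar with mass tolerances, the sketch below shows what a window of −20 to 20 means numerically, assuming the tolerance is expressed in parts per million (ppm). The unit is an assumption here, since the step above gives only the numbers, and the function names are hypothetical; MSFragger applies its own matching logic internally.

```python
from typing import Tuple


def ppm_window(mass: float, tol_ppm: float = 20.0) -> Tuple[float, float]:
    """Lower and upper bounds of a symmetric tolerance window around a mass."""
    delta = mass * tol_ppm / 1e6
    return (mass - delta, mass + delta)


def within_tolerance(observed: float, theoretical: float, tol_ppm: float = 20.0) -> bool:
    """True when the observed mass deviates from the theoretical mass by <= tol_ppm."""
    error_ppm = abs(observed - theoretical) / theoretical * 1e6
    return error_ppm <= tol_ppm


# At a mass of 500, 20 ppm corresponds to a deviation of 0.01 in either direction.
example_low, example_high = ppm_window(500.0, 20.0)
```

Because the window scales with mass, the same ppm setting allows a larger absolute deviation for heavier peptides.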

3.2. Quantification of Peptides and Detection of Protein Changes

The combined peptide identifications from Subheading 3.1 are imported into the Skyline software for peptide-level quantification. Open Skyline and select “Import DDA Peptide Search” from the options on the start page. You will be prompted to save the document.
In the “Import Peptide Search” box that pops up, change the cutoff score to “0.99,” click “Add Files…” on the right side, and select the FragPipe output from the previous section, “combined.pep.xml.” Click “Next” to start building the spectral library. This step will take about 10 min with the tutorial data, and the duration is proportional to the number of confident peptide-spectrum matches produced by FragPipe/MSFragger analysis (see Note 3).
Once the library is finished building, the box will change to “Extract Chromatograms,” and Skyline should find your source mzML files and populate the “Results files found” list. If your files are not in the list, browse the file system and add them. Click “Next.” A box should pop up asking if you want to remove the common file prefix. This makes viewing the replicates easier. Click “OK” to remove the prefix.
The box will change to “Add Modifications” and will show a list of modifications found in the library that you can choose to include in the quantification. For the purpose of our analysis, check the box next to “Acetyl (N-term) = [42]” and “Oxidation (M) = M[16],” and then click “Next >” at the bottom.
“Configure Full-Scan Settings” will appear. Set precursor charges to “2, 3, 4, 5.” Under “Isotope peaks included,” select “Count” and set “Peaks” to “3.” Set “Precursor mass analyzer” to “Centroided” and set “Mass Accuracy” to “10.” Under the bottom box “Retention time filtering,” set “Use only scans within 5 min of MS/MS IDs” (see Note 4).
“Import FASTA” will appear. Set “Enzyme” to “trypsin” and “Max missed cleavages” to “1.” Click “Browse…” to find the database used for the MSFragger database search. Skyline will now match the peptide identifications back to the proteins in the database (see Note 5).
The “Import FASTA” dialog box will change to show the numbers of proteins, peptides, precursors, and transitions detected during the data import. If you followed the tutorial data up to this point, it should say something like “This operation has created the following targets: 7788 proteins, 63,342 peptides, 75,126 precursors, 225,378 transitions.” At this step, you have the option to filter the proteins by the number of peptides per protein and whether the peptides are unique for a single protein. For the purpose of the tutorial including downstream peptide correlation analysis, we will only keep proteins with at least 2 peptides, but for your own data, it may be acceptable to keep proteins with 1 peptide. Check the bottom box “Remove duplicate peptides,” which will remove peptides that match multiple proteins. Peptides that are not unique to one protein can confuse quantification because their quantity results from multiple sources. For the tutorial data, these filters should reduce the data to about 2918 proteins and 27,685 peptides. Once you click “Finish,” Skyline will start extracting the peptide signals for all peptides in the document and all 15 files. This step will take time depending on your computer’s processor speed. After this step completes, save the document to ensure you keep your progress.
Prepare the document for MSstats comparison by adding the “Condition” and “BioReplicate” annotations. From the “Settings” menu, select “Document Settings.” Under the “Annotations” tab, click “Add…” on the right side. Under “Name:,” type “Condition,” set the “Type” drop-down to “Text,” and at the bottom under “Applies to,” check the box next to “Replicates.” Click “OK.” Repeat the same steps to add an annotation named “BioReplicate” with the same settings. Ensure that the boxes next to your new annotation categories are checked in the “Annotations” tab of the “Document Settings” window and then click OK.
Assign values to the annotations you added in the previous step. Go to “View” > “Document Grid.” Within the Document Grid window, at the top left, select “Reports” > “Replicates.” Add the annotations for “Condition” and “BioReplicate” that match the file names (if using the tutorial data). For example, for the file named “50mM_EtOH_3,” type “EtOH” in the condition column, and type the appropriate replicate number “3” under the “BioReplicate” column.
Install the MSstats tool by going to “Tools” > “Tool Store…” and selecting MSstats. Follow the prompts to install MSstats.
Run MSstats analysis to determine protein quantity changes. From the “Tools” menu, select “MSstats” > “Group Comparison.” Skyline will take a moment to write a report table for quantitative analysis, and then a box will pop up named “MSstats Group Comparisons.” Name the comparison, select the normalization method “Equalize medians,” ensure that the “control group” is set to “control,” and under “Select group(s) to compare against:,” select only “LPS.” Click “OK” to start the computation. Results will be written in the same directory as the Skyline document, including the processed data, statistical test results, a volcano plot of changes, and comparison plots of all the proteins versus control. The volcano plot produced by MSstats can be found in the same directory as the Skyline document with the name “VolcanoPlot.pdf” and will look like Fig. 2 (except with many labeled points).
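
The idea behind the “Equalize medians” normalization selected above can be illustrated with a short sketch: log2 intensities in each run are shifted by a constant so that all runs share the same median. This shows the general principle only and is not the MSstats implementation; the function name, the choice of the median of run medians as the reference, and the run names are assumptions for this example.

```python
from statistics import median
from typing import Dict, List


def equalize_medians(log2_by_run: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Shift each run's log2 intensities so that every run has the same median."""
    run_medians = {run: median(values) for run, values in log2_by_run.items()}
    # Assumed reference point: the median of the per-run medians.
    reference = median(run_medians.values())
    return {
        run: [value - run_medians[run] + reference for value in values]
        for run, values in log2_by_run.items()
    }


# Hypothetical log2 intensities for three runs with different overall signal.
example_runs = {
    "control_1": [10.0, 12.0, 14.0],
    "LPS_1": [11.0, 13.0, 15.0],
    "LPS_2": [9.0, 11.0, 13.0],
}
example_normalized = equalize_medians(example_runs)
```

A constant shift on the log scale corresponds to a multiplicative correction of raw intensities, which is why this kind of normalization compensates for differences in total signal between injections.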

Fig. 2. Example volcano plot from the LPS/control comparison showing the −log₁₀(adjusted p-value) on the y-axis and the log₂(fold change) on the x-axis. Points colored blue represent significantly downregulated proteins, and points colored red represent significantly upregulated proteins. Thresholds: adjusted p-value <0.01, log₂(fold change) <−1 or >1
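
The thresholds in the caption can be written as a simple classification rule. The sketch below is an illustration with hypothetical function names; it reproduces only the coloring logic and the axis transformation, not the MSstats plotting code.

```python
import math
from typing import Tuple


def classify_protein(
    log2fc: float, adj_p: float, p_cut: float = 0.01, fc_cut: float = 1.0
) -> str:
    """Label a protein as 'up', 'down', or 'ns' (not significant)."""
    if adj_p < p_cut and log2fc > fc_cut:
        return "up"
    if adj_p < p_cut and log2fc < -fc_cut:
        return "down"
    return "ns"


def volcano_point(log2fc: float, adj_p: float) -> Tuple[float, float]:
    """(x, y) coordinates on the volcano plot; adj_p must be greater than zero."""
    return (log2fc, -math.log10(adj_p))
```

A protein must pass both the significance and the magnitude threshold to be colored, so a large fold change with an adjusted p-value of 0.05 remains unlabeled.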

3.3. Term Enrichment Analysis of Protein Changes

Prepare lists of up- and downregulated proteins. Open the MSstats output file “TestingResult.csv” using Excel or a similar spreadsheet editing software. Filter the results to contain only rows with adjusted p-value <0.01, and sort the remaining rows based on the log2FC. Then, filter by the column “log2FC” to include only those with <−1 or >1 (corresponds to twofold change up or down).
Get the 6-character UniProt accessions separate from the FASTA header text using the “Text-to-Columns” feature in Excel (under the “Data” tab); separate the column “Protein” by the delimiter “|” into three new columns. To do this without overwriting the adjacent columns, you will need to create two empty columns to the right of the “Protein” column.
Open Cytoscape, and install the “ClueGO+CluePedia” plug-ins using the “App Manager” under the “Apps” menu. This will require registration of the app and may take a day or two to get your license.
Open the ClueGO app in Cytoscape by selecting it from the “Apps” menu.
Download the mouse (Mus musculus) ontologies. Under the “ClueGO+CluePedia” tab that pops up under the control panel at the left, click the button with a paw icon under “Load Marker Lists” near the top. Select Mus musculus from the list and click “download.” After the download, select “Mus musculus” from the drop-down box next to the button with the paw icon (see Note 6).
Copy the 6-character UniProt accessions with negative log2FC values into the box near the top left under “Load Marker List(s),” and then click the plus sign below the box to “Add an additional input cluster.” Copy the 6-character UniProt accessions with positive log2FC values into the second box.
Under the next section “Visual Style,” select the middle option “Clusters” to color the functions based on whether they are in the up- or downregulated group.
Set the ontology options. Under “ClueGO Settings,” in the list of “Ontologies/Pathways,” select “GO Biological Process” for this initial analysis. About three-fourths of the way down the options, click the box next to “Show only Pathways with pV…,” and set the threshold to 0.00001. Keep all other settings as the defaults for this initial analysis, and click “Start” at the bottom. This will produce a number of enriched GO Biological Process terms similar to Fig. 3 (see Note 7).
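
The spreadsheet filtering and accession extraction at the start of this section can also be scripted. The sketch below is one possible approach, not part of the original protocol: it assumes the results table has columns named “Protein” and “log2FC” as described above, and it additionally assumes the adjusted p-value column is named “adj.pvalue”; the function names and the example accessions are hypothetical.

```python
import csv
import io
import math
from typing import List, Tuple


def accession_from_header(protein: str) -> str:
    """Return the middle field of a 'db|ACCESSION|NAME' identifier."""
    parts = protein.split("|")
    return parts[1] if len(parts) >= 3 else protein


def split_regulated(
    csv_text: str, p_cut: float = 0.01, fc_cut: float = 1.0
) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) accession lists from results text."""
    up: List[str] = []
    down: List[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            log2fc = float(row["log2FC"])
            adj_p = float(row["adj.pvalue"])  # assumed column name
        except ValueError:
            continue  # skip rows with non-numeric entries such as "NA"
        if not math.isfinite(log2fc) or adj_p >= p_cut:
            continue
        accession = accession_from_header(row["Protein"])
        if log2fc > fc_cut:
            up.append(accession)
        elif log2fc < -fc_cut:
            down.append(accession)
    return up, down


# Hypothetical results table with made-up accessions.
example_csv = """Protein,Label,log2FC,adj.pvalue
sp|P11111|AAA_MOUSE,LPS-control,2.5,0.0001
sp|P22222|BBB_MOUSE,LPS-control,-1.8,0.002
sp|P33333|CCC_MOUSE,LPS-control,0.4,0.0001
sp|P44444|DDD_MOUSE,LPS-control,3.0,0.2
sp|P55555|EEE_MOUSE,LPS-control,NA,NA
"""
example_up, example_down = split_regulated(example_csv)
```

To use this with a real file, the text could be read with `open(path, newline="").read()`; the two resulting lists correspond to the two input clusters pasted into ClueGO.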

Fig. 3. Example network of enriched GO Biological Process terms produced by the ClueGO plug-in within Cytoscape. The proportions of blue and red in the circles reflect the proportion of proteins assigned to each term that were increased or decreased, respectively
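
As background on how a term can be called enriched, the sketch below computes a one-sided hypergeometric p-value, a test commonly used for this kind of analysis. It is an assumption-laden illustration: the statistical test and multiple-testing correction actually applied are configured within ClueGO, and the function name and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k annotated proteins by chance.

    N: proteins in the background, K: background proteins annotated with the term,
    n: proteins in the submitted list, k: submitted proteins annotated with the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities for k, k + 1, ..., upper.
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical counts: 10 background proteins, 4 annotated, 3 submitted, 2 overlap.
example_p = enrichment_p_value(k=2, n=3, K=4, N=10)
```

Small p-values indicate that the overlap between the submitted list and the term is larger than expected from random sampling of the background.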

Supplementary Material

Figure S1

NIHMS1746008-supplement-Figure_S1.tif (108.2 KB, tif)

Acknowledgments

This work was supported by the National Institutes of Health (NIH) through a training grant (T15 LM007359) and a subcontract development project award (P30 AG062715).

Footnotes

1. Ensure that the experiment and replicate columns are populated. If you skip this step, the program will not run iProphet to combine the search results from each of the files.

2. It is very important to check the boxes “Multi-experiment report” and “Generate peptide level summary” under the “Report” tab in FragPipe, because adding these options ensures that iProphet is run to combine the individual peptide searches in a statistically rigorous way.

3. If you get an error while attempting to build the spectral library, move the file “combined.pep.xml” to the same directory as the .mzML files.

4. Precursor mass analyzer and Mass Accuracy settings should reflect the mass spectrometer and data type you are using. For example, if your data are from a TripleTOF 5600, set the mass accuracy to 20. Retention time filtering should be based on the reproducibility of retention time in your chromatographic system. For example, if you know that your peptide retention drifts by up to 5 min, set this value to 5 min.

5. If using your own data, the enzyme and missed cleavage settings should match the settings used for the database search by MSFragger in Subheading 3.1.

6. If analyzing your own data from another organism, make sure you download the appropriate ontology here.

7. For term enrichment analysis, there is a balance between selecting the appropriate number of ontologies, the p-value threshold, and the ability to visualize and interpret the results. For example, a large number of enriched terms might be useful to understand and explore the results but might be difficult to meaningfully display in a publication.

References

1. Muntel J, Gandhi T, Verbeke L, Bernhardt OM, Treiber T, Bruderer R, Reiter L (2019) Surpassing 10 000 identified and quantified proteins in a single run by optimizing current LC-MS instrumentation and data analysis strategy. Mol Omics 15:348–360
2. Elias JE, Gygi SP (2007) Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Methods 4:207–214
3. Choi H, Nesvizhskii AI (2008) False discovery rates and related statistical concepts in mass spectrometry-based proteomics. J Proteome Res 7:47–50
4. Audain E, Uszkoreit J, Sachsenberg T, Pfeuffer J, Liang X, Hermjakob H, Sanchez A, Eisenacher M, Reinert K, Tabb DL, Kohlbacher O, Perez-Riverol Y (2017) In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics. J Proteomics 150:170–182
5. Nesvizhskii AI, Keller A, Kolker E, Aebersold R (2003) A statistical model for identifying proteins by tandem mass spectrometry. Anal Chem 75:4646–4658
6. Schilling B, Rardin MJ, MacLean BX, Zawadzka AM, Frewen BE, Cusack MP, Sorensen DJ, Bereman MS, Jing E, Wu CC, Verdin E, Kahn CR, MacCoss MJ, Gibson BW (2012) Platform-independent and label-free quantitation of proteomic data using MS1 extracted ion chromatograms in Skyline: application to protein acetylation and phosphorylation. Mol Cell Proteomics 11:202–214
7. Guergues J, Wohlfahrt J, Zhang P, Liu B, Stevens SM (2020) Deep proteome profiling reveals novel pathways associated with pro-inflammatory and alcohol-induced microglial activation phenotypes. J Proteomics 220:103753
8. Perez-Riverol Y, Csordas A, Bai J, Bernal-Llinares M, Hewapathirana S, Kundu DJ, Inuganti A, Griss J, Mayer G, Eisenacher M, Pérez E, Uszkoreit J, Pfeuffer J, Sachsenberg T, Yilmaz S, Tiwary S, Cox J, Audain E, Walzer M, Jarnuczak AF, Ternent T, Brazma A, Vizcaíno JA (2019) The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res 47:D442–D450
9. Chambers MC, Maclean B, Burke R, Amodei D, Ruderman DL, Neumann S, Gatto L, Fischer B, Pratt B, Egertson J, Hoff K, Kessner D, Tasman N, Shulman N, Frewen B, Baker TA, Brusniak MY, Paulse C, Creasy D, Flashner L, Kani K, Moulding C, Seymour SL, Nuwaysir LM, Lefebvre B, Kuhlmann F, Roark J, Rainer P, Detlev S, Hemenway T, Huhmer A, Langridge J, Connolly B, Chadick T, Holly K, Eckels J, Deutsch EW, Moritz RL, Katz JE, Agus DB, MacCoss M, Tabb DL, Mallick P (2012) A cross-platform toolkit for mass spectrometry and proteomics. Nat Biotechnol 30:918–920
10. Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, Nesvizhskii AI (2017) MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nat Methods 14:513–520
11. Shteynberg D, Deutsch EW, Lam H, Eng JK, Sun Z, Tasman N, Mendoza L, Moritz RL, Aebersold R, Nesvizhskii AI (2011) iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates. Mol Cell Proteomics 10:M111.007690
12. Keller A, Nesvizhskii AI, Kolker E, Aebersold R (2002) Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 74:5383–5392
13. MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler DC, MacCoss MJ (2010) Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26:966–968
14. R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing
15. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498–2504
16. Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pagès F, Trajanoski Z, Galon J (2009) ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25:1091–1093
