Author manuscript; available in PMC: 2024 Nov 12.
Published in final edited form as: J Proteome Res. 2023 Aug 9;22(9):2836–2846. doi: 10.1021/acs.jproteome.3c00085

Real-time spectral library matching for sample multiplexed quantitative proteomics

Chris D McGann 1, Will Barshop 2, Jesse Canterbury 2, Chuwei Lin 1, Wassim Gabriel 3, Jingjing Huang 2, David Bergen 2, Vlad Zubraskov 2, Rafael Melani 2, Mathias Wilhelm 3, Graeme McAlister 2, Devin K Schweppe 1,#
PMCID: PMC11554524  NIHMSID: NIHMS2030407  PMID: 37557900

Abstract

Sample multiplexed quantitative proteomics assays have proved to be highly versatile means to assay molecular phenotypes. Yet, stochastic precursor selection and precursor co-isolation can dramatically reduce the efficiency of data acquisition and quantitative accuracy. To address this, intelligent data acquisition (IDA) strategies have recently been developed to improve instrument efficiency and quantitative accuracy for both discovery and targeted methods. Towards this end, we sought to develop and implement a new real-time library searching (RTLS) workflow that could enable intelligent scan triggering and peak selection within milliseconds of scan acquisition. To ensure ease of use and general applicability, we built an application to read in diverse spectral libraries and file types from both empirical and predicted spectral libraries. We demonstrate that RTLS methods enable improved quantitation of multiplexed samples, particularly with consideration for quantitation from chimeric fragment spectra. We used RTLS to profile proteome responses to small molecule perturbations and were able to quantify up to 15% more significantly regulated proteins in half the gradient time as traditional methods. Taken together, the development of RTLS expands the IDA toolbox to improve instrument efficiency and quantitative accuracy in sample multiplexed analyses.

Introduction

Nearly every data-dependent analysis suffers from stochastic precursor selection effects. These effects reduce run-to-run coverage of the proteome, increase the number of missing values, and limit quantitative access to the proteome1. This problem is particularly challenging for methods that require longer scan or fill times to reach necessary resolutions or sensitivity. For example, the most accurate methods for sample multiplexed quantitative proteomics require tertiary quantitative scans - synchronous precursor selection (SPS)-MS3 scans2,3. Acquisition of these scans leads to multifold increases in the time necessary to quantify proteomics samples at sufficient depth (~8000 proteins).

To combat these challenges, intelligent data acquisition (IDA) methods have been employed to improve both the throughput and quantitative accuracy of sample multiplexed proteomics workflows4,5. Initially IDA methods were largely focused on optimizing fragmentation schemes for precursors in real-time with instrument acquisition or adjusting acquisition settings based on retention time features4–6. Since then, these efforts have expanded to improve targeted7,8 and discovery-based proteomic workflows9. IDA strategies rely on interpretation and utilization of raw spectral data within a few milliseconds of scan acquisition. This data can then be used to trigger additional scans, optimize scan parameters, or filter low utility spectra10.

In previous work, we showed that real-time database searching (RTS) could double instrument acquisition efficiency for whole proteome profiling10. Comet-based11 scoring enabled real-time peptide spectral matching that could be used to inform the instrument when to collect an SPS-MS3 scan and to target b- and y-ions for SPS selection. Leveraging the increased flexibility, users have improved efficiency and sensitivity in a range of applications from drug development to aging to immunology12,13. For example, RTS was recently applied to chemical proteomics studies to identify cysteine reactive compounds as well as diverse mechanisms of action14,15. In the cysteine study, the integration of TMTpro and RTS led to a 42-fold increase in the sample throughput for reactive cysteine quantification14. In addition, RTS-based decision making was used to improve the sensitivity for the detection of single cell proteomes13.

While RTS is a powerful means to improve instrument acquisition efficiency, we know from post hoc analysis of proteomics datasets that additional information can be used to increase the sensitivity of peptide detection16–19. Outside of RTS-based spectral matching, intelligent data acquisition methods have been used for tasks such as retention time shifting-correction in targeted or DIA methods and heavy-labelled peptide triggered methods7,8,20–24. Recently, new tools for Real-Time spectral Library Search (RTLS) were developed to enable the real-time characterization of metabolites25. Several groups have now used spectral libraries of predicted or empirical spectra to identify fragmentation signatures to triage useful spectra or to select fragmentation schemes for further analysis26,27. This has now been applied to identify metabolites, improve structural characterization of lipids, and enrich for crosslinked peptide species25–27. Though these methods have largely focused on spectral libraries of less than 200 individual spectra, they evidenced how IDA methods, and RTLS in particular, can be applied to a diverse array of biological samples.

Building from the efforts of real-time methods for quantitative proteomics and now RTLS, we wanted to explore the use of spectral library peptide matching for proteome-wide quantitative methods28. Early work with spectral library searching for proteomics relied on the construction of empirically derived spectra to generate libraries using well established workflows such as SpectraST to confidently match peptides based on common score metrics (dot product, cosine score, spectral similarity)29,30. Recent advances in deep learning have now contributed multiple pipelines for the in silico prediction of peptide spectra16,31. Algorithms such as Prosit enable users to predict peptide spectra for whole proteomes (2.6 million peptide spectra for human cells) and have recently been extended to incorporate isobaric labelled samples32,33. These predicted spectral libraries can then be used to efficiently score new empirical spectra or combined with database searching algorithms to re-score spectra for improved sensitivity34–36. Spectral library searching has been shown to be a sensitive and accurate way to identify peptides, especially those from complex spectra such as data-independent acquisition experiments37–41. Lessons learned from using spectral libraries in data-independent acquisition experiments have recently been leveraged to improve spectral library searches for data-dependent acquisition methods as well36. These latest spectral library search algorithms show that, even when using a predicted library, there is a sensitivity gain compared to cutting-edge database search methods15. Yet, as noted, these algorithms generate libraries of millions of individual spectra, which can be challenging to process and compare in real-time with instrument acquisition.

Here, we sought to develop and implement RTLS for whole proteome, sample multiplexed quantitative proteomics. Previous work established that RTLS could be used for small molecule spectral matching with spectral libraries of a single small molecule, and to improve analysis of cross-linked peptides with a two-spectrum library of diagnostic ions42. To enable whole proteome RTLS, we needed to enable RTLS to process full proteome libraries of up to 4 million spectra in a few milliseconds to allow for real time processing with spectral acquisition. To accomplish this, we (1) developed a flexible library processing workflow for both predicted and experimental libraries from common proteomics resources, (2) optimized the online scoring for whole proteome sample multiplexed analyses, (3) determined optimal parameters for improved instrument efficiency and quantitative accuracy, and (4) benchmarked the RTLS workflow compared to traditional acquisition methods using established standards and complex proteomic samples. Using our optimized workflow and methods, RTLS increased instrument acquisition efficiency 2-fold, and consistently outperformed RTS for high-resolution spectral matching. In addition, RTLS improved quantitative accuracy for chimeric spectra from whole proteome single shot samples and enabled fast and efficient identification of post-translationally modified peptides. Thus, RTLS has proved to be a useful addition to the IDA toolkit with great potential for sample multiplexed quantitative proteomics and future development.

Experimental Procedures

Sample collection and preparation.

Human cell lines were grown to confluence in DMEM containing 10% fetal bovine serum and 1% streptomycin/puromycin. Cells were harvested by manual scraping and washed twice with PBS. Cells were syringe lysed in lysis buffer (8M urea, 50mM EPPS pH 8.5, 150mM NaCl, and Roche protease inhibitor tablet) and the resulting lysates were cleared via centrifugation.

Saccharomyces cerevisiae (BY4742) was grown in YPD cultures to an OD600 of 0.8, then washed twice with PBS, pelleted, and stored at −80 °C until use. Cells were resuspended in lysis buffer (8 M urea, 50 mM EPPS pH 8.5, 150 mM NaCl, Roche protease inhibitor tablet) and lysed by bead beating. After lysis and bead removal, the lysate was centrifuged to remove cellular debris and the supernatant was collected for use.

A two proteome (human and yeast) HyPro standard labeled with TMTpro was prepared as previously described43. In brief, HCT116 cells were prepared according to the SL-TMT protocol44 and labeled at 1:1 across all channels. S. cerevisiae (BY4716) was prepared similarly but with ratios of 0:1:1:1:2:2:2:4:4:4:8:8:8:10:10:10 across the 16 TMTpro reporter ion channels. The HCT116 and S. cerevisiae were combined so the final sample was 90% human peptides and 10% yeast peptides (w/w). In the small molecule perturbation studies, A549, H292, or PSC1 cells were treated with 10 μM of a given HDAC inhibitor for 24 hours (belinostat, abexinostat, CUDC-101, vorinostat).

Mass spectrometry data acquisition methods and analysis.

Samples were resuspended in 5% acetonitrile/2% formic acid prior to being loaded onto an in-house pulled C18 (Thermo Accucore, 2.6 Å, 150 μm) 30 cm column. Peptides were eluted over 30, 60, 90, 120, or 180 minute gradients running from 96% Buffer A (5% acetonitrile, 0.125% formic acid) and 4% buffer B (95% acetonitrile, 0.125% formic acid) to 30% buffer B. Sample eluate was electrosprayed (2,700 V) into a Thermo Scientific Orbitrap Eclipse or Orbitrap Ascend mass spectrometer for analysis. The scan procedure for MS1 scans (Orbitrap scan at 120,000 resolving power, 50 ms max injection time, and AGC target set to 100%) and MS2 scans (linear ion trap, “rapid” scan rate, 50 ms max injection time, AGC target set to 200%, CID collision energy of 35% with 10 ms activation time, and 0.5 m/z isolation width) was constant for all analyses with a gradient length above 30 minutes. For 30 minute gradient methods, the ion trap scan rate was set to “turbo” and the maximum injection time lowered to 11ms. Peaks from the MS1 scans were filtered by intensity (minimum intensity > 5e3), charge state (2≤z≤6), and detection of a monoisotopic mass (monoisotopic precursor selection, MIPS). Dynamic exclusion was used, with a duration of 90s, repeat count of 1, mass tolerance of 10ppm, and with the “exclude isotopes” option checked. High field asymmetric waveform ion mobility spectrometry (FAIMS) was set at “standard” resolution, 4.6 L/minute gas flow and 3 CVs: −40/−60/−80. Each CV value was set to top N mode with number of dependent scans set to 6. For MS3 scans, SPS ions were set to 10, MS1 Isolation window was 2 m/z and MS2 isolation window was 3 m/z. MS3 scans were performed at a resolving power of either 45,000 (Ascend) or 50,000 (Eclipse) with an HCD collision energy of 45%.

Real-time spectral library searching.

Predicted spectral libraries were generated using Prosit-TMT (https://www.proteomicsdb.org/prosit/) and output to ‘.msp’ files. MSP files were converted to an RTLS/mzVault compatible format (‘.db’ extension) based on an indexed SQLite database structure. These files can be consumed by the RTLS filter in the instrument’s method editor. Conversion was done using the in-house developed DBKey R Shiny application (https://github.com/SchweppeLab/DBKey). Unless otherwise noted, libraries were generated with no missed cleavages, fixed TMTpro at n-term and lysine and variable oxidation on methionines. For each library candidate within the specified precursor tolerance, a cosine similarity score was calculated against the acquired spectra, with the cosine similarity score defined as such (Equation 1):

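$$\text{cosine score} = \frac{\sum_{\text{matched}} \left(I_L^{a}\, mz_L^{b}\right)\left(I_U^{a}\, mz_U^{b}\right)}{\sqrt{\sum_{\text{all}} \left(I_L^{a}\, mz_L^{b}\right)^{2}}\;\sqrt{\sum_{\text{all}} \left(I_U^{a}\, mz_U^{b}\right)^{2}}} \qquad \text{Eq. 1}$$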

where L denotes library peaks, U denotes acquired spectral peaks, I is the intensity of a peak, mz is the m/z of a peak, and a and b are the weight factors. Unless otherwise specified, the minimum cosine score threshold was set to 20 to approximate the triggering rate of RTS-based methods. We note that the thresholds used in this work are specific to the underlying spectral libraries that were used. This is due to factors such as the number of peaks allowed in each spectrum of the library, the minimum base peak intensity used to filter out ‘noise’, and the origin of the libraries (empirical or predicted).
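As an illustration of Equation 1, the following is a minimal Python sketch of a weighted cosine score, not the instrument's implementation. The function name `weighted_cosine`, the greedy nearest-peak matching within a fixed m/z tolerance, and the scaling of the result to a 0–100 range are assumptions made for this example; the default weights a = 0.4 and b = 0.9 are the values reported in the Results.

```python
import math


def weighted_cosine(library, acquired, a=0.4, b=0.9, tol=0.5):
    """Weighted cosine score between two peak lists of (mz, intensity) tuples.

    Each peak is weighted as intensity**a * mz**b (Equation 1). Matching a
    library peak to the closest unused acquired peak within `tol` m/z units,
    and scaling the score to 0-100, are assumptions of this sketch.
    """
    def weight(mz, intensity):
        return (intensity ** a) * (mz ** b)

    used = set()
    numerator = 0.0
    for lib_mz, lib_int in library:
        best, best_delta = None, tol
        for idx, (acq_mz, _) in enumerate(acquired):
            delta = abs(acq_mz - lib_mz)
            if idx not in used and delta <= best_delta:
                best, best_delta = idx, delta
        if best is not None:
            used.add(best)
            acq_mz, acq_int = acquired[best]
            numerator += weight(lib_mz, lib_int) * weight(acq_mz, acq_int)

    # Denominator: norms over all peaks of each spectrum, matched or not.
    lib_norm = math.sqrt(sum(weight(mz, i) ** 2 for mz, i in library))
    acq_norm = math.sqrt(sum(weight(mz, i) ** 2 for mz, i in acquired))
    if lib_norm == 0 or acq_norm == 0:
        return 0.0
    return 100.0 * numerator / (lib_norm * acq_norm)
```

Under these assumptions, a candidate would trigger an SPS-MS3 scan when `weighted_cosine(library_peaks, scan_peaks) >= 20`. Because the denominator sums over all peaks, unmatched peaks in either spectrum lower the score.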

Real-time spectral searching and analysis was done using Orbitrap Eclipse instrument control software version 4.0, unmodified from the released version except where noted. Unless otherwise specified, the minimum cosine score threshold used was 20. Precursor tolerance was set to 10 ppm and isotope correction was set to 0/1. TMT-SPS mode was selected, such that only matched fragments with a TMTpro tag were selected for SPS-MS3 scans. For methods targeting yeast in HyPro, peptides derived from human proteins were rejected from triggering further SPS-MS3 scans, using the keyword promote/reject feature. Spectral libraries built with SpectraST 5.0 were processed through the Trans-Proteomic Pipeline45. Raw files were converted to mzXMLs with msconvert and subsequently searched with Comet using settings described in the data analysis section46. Results were then processed using PeptideProphet before SpectraST was used to build a consensus library with default settings. Decoy library entries were generated via precursor swap47–49.

Real-time database searching and concatenate methods.

Real-time search was done using the instrument control software, Tune version 4.0, unmodified from the release version except where noted. For the fractionated HDAC-treated cell line experiment, a human database was used with fixed Cys(Cam) and variable Met(Ox), one missed cleavage, with TMT mode enabled, a minimum XCorr of 1.0, a minimum dCn of 0.05, precursor tolerance of 10 ppm and FDR/protein closeout disabled. Unless otherwise noted for HyPro runs, a concatenated human-yeast database (a single FASTA file starting with the human proteome then the yeast proteome) was used with the same modifications, “TMT mode” enabled, a minimum XCorr of 1.4, a minimum dCn of 0.05, and precursor tolerance of 10 ppm.

For sequential RTS/RTLS analysis, a single CV of −50 was used. MS2 nodes were set up sequentially with a real-time filter after each scan: the first filter was set to “trigger-only” and configured to pass along every MS2 (Cosine Score less than 100/minimum XCorr of 0), and the second filter was set to block any further scans (Cosine Score/XCorr greater than 100). For FTMS2 analysis in this experiment, a resolving power of 15,000, a max fill time of 54 ms, and an AGC target of 200% were used. All raw data are available through accession PXD039855.

Post hoc data analysis.

Raw files were searched using the Proteome Discoverer 3.1 software. Unless otherwise noted, Comet and Sequest searches were performed against databases downloaded from Uniprot, with 2 missed cleavages and a 20 ppm precursor mass tolerance. For ITMS2-MS3 methods, a 0.6 Dalton fragment tolerance was used, and for FTMS2 methods, a 10 ppm fragment tolerance was used. Charge state was restricted to 2–6 and peptide length was limited to 7–30 amino acids to comply with INFERYS. All searches were performed with variable methionine oxidation (+15.99491), static cysteine carboxyamidomethylation (+57.02146), and static TMTpro modifications on lysine and the peptide N-termini (+304.207126). MSPepSearch was run with an INFERYS predicted library with 1 missed cleavage, fragment charges 2 to 4, variable methionine oxidation (+15.99491), static cysteine carboxyamidomethylation (+57.02146), and static TMTpro modifications on lysine and the peptide N-termini (+304.207126). MSPepSearch precursor tolerance was 10 ppm and the fragment tolerance was 0.6 Dalton for ITMS2-MS3 methods and 10 ppm for FTMS2 methods. Peptide spectral matches and spectrum spectral matches were filtered to a peptide and protein false discovery rate (FDR) of less than 1%50. Quantification was done by selecting the centroid with the highest signal-to-noise ratio within a 0.003 Dalton tolerance of the reporter ion’s theoretical m/z. Unless otherwise noted, peptides identified using FTMS2 or SPS-MS3 methods were considered quantified if the sum of the reporter ions’ signal-to-noise (“s:n”) ratios was greater than 100 and precursor isolation specificity was greater than 0.5. Peptides identified using RTS or RTLS methods were considered quantified if the sum of the reporter ions’ signal-to-noise ratios was greater than 100, regardless of precursor isolation specificity. Chimeric spectra analysis was performed using CHIMERYS under the same settings as above, using the “inferys_2.1_fragmentation” prediction model.
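The reporter ion selection and quantification filters described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Proteome Discoverer implementation: the function names, the `(mz, signal_to_noise)` centroid representation, and the `real_time_method` flag are hypothetical, and only three TMTpro reporter m/z values are shown for brevity.

```python
def pick_reporter_sn(centroids, theoretical_mz, tol=0.003):
    """Highest s:n among centroids within `tol` Da of the reporter m/z (0.0 if none)."""
    candidates = [sn for mz, sn in centroids if abs(mz - theoretical_mz) <= tol]
    return max(candidates, default=0.0)


def is_quantified(centroids, reporter_mzs, isolation_specificity,
                  real_time_method=False, min_sum_sn=100.0, min_specificity=0.5):
    """Apply the summed s:n filter and, for non-real-time methods, the
    precursor isolation specificity filter described in the text."""
    total_sn = sum(pick_reporter_sn(centroids, mz) for mz in reporter_mzs)
    if total_sn <= min_sum_sn:
        return False
    if real_time_method:
        # RTS/RTLS peptides are kept regardless of isolation specificity.
        return True
    return isolation_specificity > min_specificity


# Illustrative subset of TMTpro reporter ions (126, 127N, 127C).
REPORTERS = [126.127726, 127.124761, 127.131081]
# Hypothetical centroided MS3 peaks as (m/z, s:n) pairs.
peaks = [(126.1277, 60.0), (126.1290, 5.0), (127.1248, 30.0), (127.1311, 25.0)]

print(is_quantified(peaks, REPORTERS, isolation_specificity=0.4))
print(is_quantified(peaks, REPORTERS, isolation_specificity=0.4, real_time_method=True))
```

In this hypothetical spectrum the summed s:n passes the threshold, so the peptide is rejected only by the isolation specificity filter, which the real-time branch skips.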
Quantitative accuracy was assessed with the HYPER interference-free index (IFI, Equation 2), a modified version of the original TKO interference free index where the empty channels in the HyPro standard serve as substitute for the empty KO channels51,52.

$$\mathrm{IFI} = 1 - \frac{s{:}n_{\mathrm{KO}}}{s{:}n_{\text{non-KO}}} \qquad \text{Eq. 2}$$
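A short Python sketch of Equation 2 is given below for a single yeast protein. How the s:n values are aggregated across channels is not spelled out here, so averaging within the empty and non-empty channel groups is an assumption of this example, as are the function and argument names.

```python
def interference_free_index(empty_sn, occupied_sn):
    """IFI = 1 - (s:n in empty channels) / (s:n in non-empty channels).

    Assumption of this sketch: each group is summarized by its mean s:n.
    """
    if not empty_sn or not occupied_sn:
        raise ValueError("both channel groups need at least one value")
    mean_empty = sum(empty_sn) / len(empty_sn)
    mean_occupied = sum(occupied_sn) / len(occupied_sn)
    if mean_occupied == 0:
        raise ValueError("non-empty channels have no signal")
    return 1.0 - mean_empty / mean_occupied
```

An IFI of 1 indicates no signal in the empty channel (no measurable interference), while values approaching 0 indicate that the empty channel carries as much signal as the occupied channels.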

Statistical analyses and plotting were done using the R project for statistical computing53. When comparing real-time and post hoc data, true positives were considered as those PSM/SSMs that were confidently identified post hoc (‘ground truth’) and in real time (‘test’) (Equations 3 & 4).

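$$\text{sensitivity} = \frac{\text{True Positive}_{\text{Passed RT \& Post hoc}}}{\text{True Positive}_{\text{Passed RT \& Post hoc}} + \text{False Negative}_{\text{Passed Post hoc only}}} \qquad \text{Eq. 3}$$

$$\text{specificity} = \frac{\text{True Negative}_{\text{Failed RT \& Post hoc}}}{\text{True Negative}_{\text{Failed RT \& Post hoc}} + \text{False Positive}_{\text{Failed Post hoc only}}} \qquad \text{Eq. 4}$$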
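Equations 3 and 4 can be computed from sets of scan identifiers, as in the following Python sketch. The function name and the use of sets of scan numbers are assumptions of this example; "Failed Post hoc only" is interpreted here as scans that passed the real-time filter but failed post hoc.

```python
def sensitivity_specificity(passed_rt, passed_post_hoc, all_scans):
    """Return (sensitivity, specificity), treating post hoc results as ground truth.

    All arguments are sets of scan identifiers; `all_scans` is every MS2 scan
    that was evaluated in real time.
    """
    true_pos = len(passed_rt & passed_post_hoc)    # passed RT and post hoc
    false_neg = len(passed_post_hoc - passed_rt)   # passed post hoc only
    false_pos = len(passed_rt - passed_post_hoc)   # failed post hoc only
    true_neg = len(all_scans - passed_rt - passed_post_hoc)  # failed both

    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else float("nan")
    return sensitivity, specificity


# Hypothetical example: 10 scans, 4 passed in real time, 5 passed post hoc.
sens, spec = sensitivity_specificity({0, 1, 2, 3}, {0, 1, 2, 4, 5}, set(range(10)))
print(sens, spec)
```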

Important considerations of RTLS analyses

As noted above, the spectral library used can have strong effects on the underlying score distributions for RTLS matching and therefore future users should make sure to adjust the score thresholds accordingly. Additionally, post hoc analysis of peptides quantified from RTLS triggered MS3 scans should be carefully analyzed. We have pointed out that database searching with rescoring based on spectral libraries is a powerful means to improve sensitivity, but users should also consider key metrics of successful quantitation, such as the number of SPS ions (selected by RTLS) derived from the matched peptide (selected by post hoc analysis). Ideally this fraction should be close to 100%, otherwise inaccurate quantitation may have occurred.

Results and Discussion

Spectral library searching has been shown previously to increase the sensitivity of peptide detection in quantitative proteomics by leveraging either acquired spectra or predicted fragment ion intensities41,54. With a diverse array of scoring metrics, analysis pipelines, and applications, spectral library searching has proved to be a robust method for a wide array of quantitative proteomics methods, though it is predominantly used for label-free quantitation and data-independent acquisition (DIA) methods55. To enable spectral library searching for real-time decision making we had to (1) determine a method- and memory-compatible means to store full-proteome spectral libraries, (2) enable conversion of diverse spectral library formats into this common format, and (3) optimize scoring functions for performant real-time decision making (Figure 1A).

Figure 1.


Overview of RTLS workflow. A) The DBKey-RTLS workflow enables the use of spectral libraries from repositories, deep neural network (DNN) predictions, and empirical data. Spectral library matching and scoring are then used to define subsequent scans (SPS-MS3). B) (Left) Boxplots of the search time of Prosit-TMT libraries for yeast, human, and concatenated human-yeast. (Right) Library candidates scored per scan for the same libraries. C) Target-decoy separation for a yeast sample labelled with TMTpro. D) Scatterplot comparing RTLS cosine score to that of an established offline spectral library search, MSPepSearch.

Development of a unified tool for spectral library processing for RTLS

We began by building a library processing tool, called DBKey, that can take in spectral libraries from both empirical and predicted spectral sources including repositories such as NIST, SpectraST, and Prosit. DBKey then converts these input file types to a single, RTLS-compatible data type for real-time processing. Our common data type is also compatible with mzVault and is based on an indexed SQLite data structure that can be held in memory during instrument acquisition and stored for repeated use (‘.db’ file extension). By integrating DBKey into a Docker image, we can process these files efficiently in a Docker environment for workstation or cloud extensible deployment.
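To illustrate why an indexed SQLite structure suits fast candidate lookup, the sketch below builds a small in-memory library and retrieves candidates within a precursor ppm window using Python's standard `sqlite3` module. The table and column names are hypothetical and do not reflect the actual mzVault/DBKey schema; the peptide names and m/z values are made up for the example.

```python
import sqlite3


def build_library(entries):
    """Create an in-memory library indexed on precursor m/z.

    `entries` is an iterable of (peptide, precursor_mz, peaks_text) tuples.
    The schema is a hypothetical stand-in for the real '.db' format.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spectra ("
        "id INTEGER PRIMARY KEY, peptide TEXT, precursor_mz REAL, peaks TEXT)"
    )
    conn.executemany(
        "INSERT INTO spectra (peptide, precursor_mz, peaks) VALUES (?, ?, ?)",
        entries,
    )
    # The index lets range queries avoid scanning every library spectrum.
    conn.execute("CREATE INDEX idx_precursor_mz ON spectra (precursor_mz)")
    conn.commit()
    return conn


def find_candidates(conn, precursor_mz, ppm=10.0):
    """Return (peptide, precursor_mz) rows within +/- `ppm` of the query m/z."""
    delta = precursor_mz * ppm / 1e6
    cursor = conn.execute(
        "SELECT peptide, precursor_mz FROM spectra "
        "WHERE precursor_mz BETWEEN ? AND ? ORDER BY precursor_mz",
        (precursor_mz - delta, precursor_mz + delta),
    )
    return cursor.fetchall()


library = build_library([
    ("PEPTIDEA", 500.0000, "200.1:10;300.2:50"),
    ("PEPTIDEB", 500.0040, "210.1:20;310.2:40"),
    ("PEPTIDEC", 500.0100, "220.1:30;320.2:30"),
])
print(find_candidates(library, 500.002))
```

Each returned candidate would then be scored against the acquired spectrum; only the precursor lookup is shown here.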

Using DBKey, we built libraries from both predicted libraries and publicly available datasets to evaluate their compatibility with a multiplexed RTLS-MS3 workflow. Prosit-TMT32,33 was used to make whole-proteome predicted libraries from S. cerevisiae and human FASTA files (497,718 and 2,225,832 spectra, respectively) (Table S1). The total size on disk of these databases was 0.41 GB for yeast, 2.95 GB for human, and 3.23 GB for the concatenated human-yeast databases (Table S1). Queries to the database generally require less than 1 ms for single proteome spectral libraries even when the library consists of millions of individual spectra (Figure 1B).

Optimization of RTLS cosine score weights

To enable rapid scoring of spectral-spectral matching, we chose to use the weighted, or ‘modified’, cosine score because it is very fast to calculate and has been used and tested extensively30,56–58. In particular, we used the weighted cosine score, which has been shown to be more sensitive than the unweighted cosine score for larger spectral libraries40,56. While cosine score weights have been optimized previously for label-free proteomes, they had never been tuned for TMTpro-labelled peptides59. We performed a parameter sweep for the scoring based on Equation 1 and found that the weights a = 0.4 (applied to the intensity) and b = 0.9 (applied to the m/z) provided the highest sensitivity for peptide spectral matching of TMTpro-labelled peptides (Figure S1). This was in line with previous work that found down-weighting dominant peaks and up-weighting larger fragment ions improves sensitivity30. PSMs scored with our optimized cosine score discriminate targets and decoys well and show strong agreement with Comet database search results (Figure S2).

Using the optimized score weights, we sought to establish RTLS’ potential for efficiently matching spectra in real time with instrument acquisition. Ideally, spectral matching should occur fast enough to run in parallel with scan acquisition. Using optimized indexing from the .db files, we were able to match experimental spectra to library spectra from predicted and empirical libraries with median search times of 1 ms for single proteome databases (human or yeast) and 2 ms for a concatenated human-yeast library (Figure 1B). Importantly, even as the number of candidate spectra considered for each search increased, the search time remained well below our target of 35 ms. RTLS maintained accurate discrimination of target and decoy peptides using the weighted cosine score (Figure 1C) and these scores correlated well with post hoc spectral match scoring from MSPepSearch (Figure 1D, Figure S2C).

Evaluation of post hoc search methods for RTLS quantified peptides

While we performed our initial post hoc analysis with MSPepSearch, we wanted to evaluate multiple search pipelines to determine the optimal post-RTLS methods for sensitive detection of labeled peptides. To this end, we searched a 12-fraction whole-proteome multiplexed sample set collected with RTLS methods with three different search workflows: Comet (canonical database searching), MSPepSearch (spectral library searching), and Sequest with INFERYS36 rescoring – a database search rescored on spectral similarity to predicted fragment ion intensities (Figure S3). Settings were kept as similar as possible across informatics workflows (see methods). Though we observed a high degree of overlap between search methods, Sequest+INFERYS returned the highest number of confidently identified PSMs (Figure S4). Due to the improved detection sensitivity for PSMs and the incorporation of aspects of database and library searching with Sequest+INFERYS, we proceeded to use this pipeline for post hoc analysis of RTLS acquired data.

Spectral library searching is a highly flexible approach, but can be influenced by library spectra sourcing, peptide fragmentation, spectral quality and spectral purity during instrument acquisition60. To test this, we measured peptide detection sensitivity and quantitative accuracy across a panel of acquisition methods comparing: (1) predicted or empirically derived spectral libraries, (2) with or without FAIMS, and (3) using both CID and HCD MS2 fragmentation (Figure S5–S7). First, we generated an empirical spectral library from fractionated yeast samples labeled with TMTpro (PXD014546)61 and assembled a library using SpectraST (66,415 spectra). We compared this empirical yeast library to a predicted library built using Prosit-TMT. Using these two libraries we found that RTLS could efficiently match spectra and trigger SPS-MS3 scans for both empirical and predicted libraries. We observed that the cosine score distribution skewed higher in the empirical library results, most likely due to incorporation of non b/y fragments, yet the number of quantified peptides and the concordance with offline searching was lower (Figure S5). Thus, when using empirical libraries, it is important to determine score distributions to establish an optimal score threshold for a given filter-library pairing. Overall, we observed that the larger predicted library led to 22.4% more quantified peptides compared to the empirically derived library (Figure S5) while only 5.9% of the peptides were missing from the empirical library. While scoring for an empirical library would likely be improved using data generated on the same instrument and optimized library construction, given the promising results, robustness, and flexibility of predicted libraries, we primarily used Prosit-derived libraries for this work.

Second, when comparing RTLS methods with or without FAIMS, we observed that using FAIMS gave a broader score distribution, higher median score (20.7 to 26), and better target/decoy discrimination (Figure S6). We believe this is due to the ability of FAIMS to reduce precursor co-isolation52, thereby generating spectra with fewer interfering peptide fragments and higher spectral match scores. For this reason, we proceeded with FAIMS for further experiments. Third, peptide fragmentation methods used during acquisition can greatly influence library-spectral matching16. We examined both CID and HCD RTLS workflows for use in multiplexed methods. For these analyses, we tested if predicted library spectra based on a specific fragmentation type and energy would affect the sensitivity of RTLS peptide detection (Figure S5). Due to the similar score distributions and small (3.8%) difference in peptides detected, we chose to use CID peptide fragmentation as it is the most similar to canonical SPS-MS3 workflows (Figure S7).

Comparison of RTLS to canonical methods for detection of quantified proteins and peptides

Having established an optimized set of method parameters, we began by comparing RTLS to traditional SPS methods for whole proteome single-shot methods (Figure 2A). Running 180 minute gradients with HyPro, we confirmed that RTLS resulted in more selective triggering and an increase in quantified peptides and proteins of 19.4% and 8.4%, respectively (Figure 2B). We then moved to the challenging task of sub-proteome analysis, quantifying only a subset of our sample, in this case the lower-abundant yeast proteins in our standard samples (10% of the peptides by mass; Figure 2A, inset). When comparing SPS-MS3 to RTLS methods of equal length (180 minute gradients), RTLS increased the number of unique quantified peptides and proteins by 23.1% and 36.5%, respectively. Strikingly, when we compared RTLS methods targeting only yeast proteins and using a shortened gradient time (90 minute), the difference in total quantified yeast peptides and proteins was not significant between SPS-MS3 at 180 minute gradients and RTLS at 90 minute gradients (Figure 2B). When we compared the proportion of MS3 spectra triggered during instrument acquisition that were used to obtain our final set of quantified peptides and proteins, we found that RTLS increased MS3 usage from 11.6% to 59.8% and maintained high quantitative accuracy (Figure 2C, 2D). We observed that increasing the gradient length resulted in a slight decrease in the percentage of useful MS3 scans. We believe this is a function of sampling more low quality MS2 spectra as the gradient length increases, which would in part explain why the percentages for either the yeast-only or human-yeast library methods remain consistent (Figure 2C).

Figure 2.


Quantifying peptides and proteins from human and yeast in HyPro. A) Depiction of the four methods used for the comparison of SPS-MS3 and RTLS methods. B) Total quantified peptides from 1 μg HyPro standard for each of the four methods highlighted in A for 3 replicates. Inset: mean quantified yeast PSMs for each method. Bars are the average number of quantified PSMs for triplicate HyPro injections; error bars represent standard error of the mean. C) Percent of total acquired MS3 scans that led to quantified peptides for the SPS-MS3 and Human + Yeast Library data as well as the percentage of total acquired MS3 scans that led to quantified yeast peptides for the Yeast-Only library data for 3 replicates. Bars represent the average number of quantified PSMs for three replicate injections; error bars represent standard error of the mean. D) Box plot of the HyPro interference free index for quantified yeast proteins for each of 3 replicates per method.

We next sought to investigate whether RTS or RTLS was better suited to sample multiplexed proteomics analysis. To do this, we ran a series of single-shot methods with settings matched as closely as possible between RTS and RTLS so as to generate similar rates of triggering quantitative MS3 scans (Figure 3A). For this work, we initially chose an RTLS weighted cosine score threshold of 20 because it resulted in a similar number of total quantified proteins and peptides for ITMS2 RTLS-based methods compared to RTS methods (Experimental Procedures, Figure 4A). We observed a slight but consistent increase in the number of quantified peptides when using the RTLS methods for FTMS2 analyses (Figure 3A). To determine why RTLS consistently generated more quantified peptides, we ran a sequential real-time method to directly compare RTS and RTLS on the same set of precursors in the same analytical run. In this method, the MS2 level was branched so that each precursor produced two MS2 spectra. Each of these MS2 scans was then analyzed in real time by either RTS or RTLS to generate a matched set of filtering and quantification events. We then processed this data through the common Sequest+INFERYS pipeline, which combines database searching and library rescoring. With the matched set of RTS and RTLS triggering events, we compared XCorr (Comet-based RTS) and weighted cosine score (from RTLS) for their ability to classify target and decoy peptides (Figure 3B). In replicate analyses, we found that RTLS was more sensitive at low FDR thresholds (1–5%) at detecting confirmed peptide spectral matches from the post hoc analysis.
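The comparison of XCorr and weighted cosine score as classifiers can be illustrated with a receiver-operator characteristic computed from real-time scores and post hoc labels. The Python sketch below is a generic ROC calculation under the assumption that each scan carries one real-time score and one boolean post hoc outcome; the function name and example values are hypothetical and it is not the analysis code used for Figure 3B.

```python
def roc_points(scores, confirmed_post_hoc):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    `scores` are real-time scores (e.g. XCorr or cosine score) and
    `confirmed_post_hoc` are booleans marking scans confirmed post hoc.
    A scan "passes" a threshold when its score is >= that threshold.
    """
    pairs = list(zip(scores, confirmed_post_hoc))
    positives = sum(1 for _, ok in pairs if ok)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one confirmed and one unconfirmed scan")

    points = []
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, ok in pairs if s >= threshold and ok)
        false_pos = sum(1 for s, ok in pairs if s >= threshold and not ok)
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Hypothetical scores for four scans, two of them confirmed post hoc.
print(roc_points([90, 70, 40, 10], [True, True, False, False]))
```

Running the same function on the RTS and RTLS scores for a matched set of precursors gives two curves that can be compared at the same false positive rate.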

Figure 3.

Comparing IDA methods. A) 60 minute HyPro runs (n=3) comparing RTS and RTLS done with MS/MS from both the ion trap (IT) and Orbitrap (FT). MS2s, MS3s, quantified peptides and quantified proteins are plotted. For FTMS methods: *, p-value (proteins) ~ 0.0001; **, p-value (peptides) = 0.0192. B) Receiver operating characteristic plots of both ion-trap and Orbitrap MS2 spectral matching with either RTS or RTLS based on XCorr (RTS) or Cosine Score (RTLS). Sensitivity and selectivity were calculated by comparing RTS and RTLS spectral matching to post hoc peptide spectral matching (‘ground truth’).

Figure 4.

RTLS methods comparison on fractionated, whole-proteome samples. A) Total gradient time used per fraction in minutes. B) Total number of quantified peptide spectral matches for each method type. C) Total number of significant differentially abundant proteins (FC > 1.5 and q-value < 0.05) for H292, A549, and PSC1 cell lines. D) Scatter plots comparing the log2 fold changes of common significantly differential proteins found when comparing RTLS to SPS-MS3, HRMS2, and RTS, as well as their respective coefficients of determination (R²). E) Same as D for the comparison of RTS versus SPS-MS3 and HRMS2 methods.

Deep, quantitative proteome analysis of belinostat treated cells with RTLS

While single-shot methods are valuable, one of the most common uses for multiplexed proteomics is the analysis of fractionated proteomes, quantifying differential abundance across conditions. To test RTLS methods across fractionated samples, we treated three human cell lines (A549, H292, PSC1) with the histone deacetylase inhibitor belinostat and analyzed them using four different acquisition methods: SPS-MS3, HRMS2, RTS, and RTLS. Perturbation with belinostat treatment alters chromatin state and leads to pleiotropic remodeling of the proteome after sustained treatments. In keeping with previous analyses, we compared belinostat treatment responses with SPS samples run as twelve 180 minute runs while the HRMS2, RTS, and RTLS methods were collected using half of the gradient time (90 min per fraction, Figure 4A). In total, the canonical SPS-MS3 (180 min per fraction) and HRMS2 (90 min per fraction) methods quantified 119,189 and 100,044 PSMs, respectively (Figure 4B). The two real-time methods, RTS and RTLS (90 min per fraction), generated fewer total quantified PSMs, 93,294 and 85,057, respectively (Figure 4B). Interestingly, though RTLS methods for single-shot whole-proteome analyses performed similarly to RTS in terms of quantified peptides (Figure 3A), for the large, fractionated sample comparison RTLS was more conservative and quantified 91% of the peptides obtained with RTS (Figure 4B). However, even though both RTS and RTLS quantified fewer total PSMs, we observed more proteins with significant fold-change differences with RTS and RTLS compared to SPS-MS3 or HRMS2 methods (Figure 4C). In total, RTS and RTLS methods quantified 11.8–14.4% more significant differentially abundant proteins (q-value < 0.05, fold-change > 1.5) across all three cell lines when compared to the SPS-MS3 method (Figure 4C).
Comparing the log2 fold-changes of shared significantly quantified proteins, the IDA methods had higher absolute fold-changes, owing to SPS ion selection of b- and y-ions that leads to improved quantitative accuracy (Figure 4D, 4E, Figure S8).
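The arithmetic behind these comparisons can be sketched as follows. The PSM totals are taken from the text above; the significance rule mirrors the stated cutoffs (q-value < 0.05, fold-change > 1.5), under the assumption that the fold-change cutoff applies in either direction. The function names and the example protein values are hypothetical.

```python
import math


def is_significant(log2_fc, q_value, fc_cutoff=1.5, q_cutoff=0.05):
    """Apply the fold-change and q-value cutoffs to one protein.

    Assumes the fold-change cutoff is symmetric (up- or down-regulated).
    """
    return abs(log2_fc) > math.log2(fc_cutoff) and q_value < q_cutoff


def percent_more(n_method, n_reference):
    """Percent increase of n_method over n_reference."""
    return 100.0 * (n_method - n_reference) / n_reference


# Quantified PSM totals reported in the text (Figure 4B).
psms = {"SPS-MS3": 119_189, "HRMS2": 100_044, "RTS": 93_294, "RTLS": 85_057}

# RTLS as a percentage of RTS: about 91%.
rtls_vs_rts = 100.0 * psms["RTLS"] / psms["RTS"]
print(f"RTLS quantified {rtls_vs_rts:.1f}% of the RTS PSM total")

# Hypothetical (log2 fold-change, q-value) pairs for illustration only.
example_proteins = [(0.9, 0.01), (-1.2, 0.03), (0.4, 0.001), (1.5, 0.20)]
n_significant = sum(is_significant(fc, q) for fc, q in example_proteins)
print(n_significant, "of", len(example_proteins), "example proteins pass both cutoffs")
```

A method that reports 1,144 significant proteins against a reference of 1,000 would give `percent_more(1144, 1000)` of 14.4; these counts are illustrative and are not values from the study.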

Extension of RTLS methods to new instrumentation

Using our study of cellular responses to belinostat, we assessed the utility of RTLS on new instrumentation for sample multiplexed quantitative proteomics workflows. In particular, we wanted to determine whether newer instrumentation improved quantitation of peptides when running RTLS methods. Therefore, we compared the reporter ion sensitivity for matched RTLS methods and matched LC systems to quantify the proteomes of A549 cells treated with three HDAC inhibitors (abexinostat, CUDC-101, vorinostat). From fractionated analysis of short gradient runs (60 minute) on both an Orbitrap Eclipse and Orbitrap Ascend using RTLS, we found that the Orbitrap Ascend increased the detected reporter ion signal-to-noise by 127% for peptides and 145% for proteins when running RTLS methods (Figure S9).

Deconvolution and simultaneous triggering of multiple MS3 scans from chimeric spectra using RTLS

Including the work above, most multiplexed proteomics methods are designed to minimize or mitigate precursor co-isolation, as failure to do so leads to chimeric spectra, which impair accurate quantitation. However, co-isolation remains a general challenge in complex sample analyses. Interestingly, recent work with library searching of label-free samples illustrated that purposefully generating chimeric spectra can increase identifications62. We hypothesized that if RTLS can correctly identify multiple precursors in a chimeric MS2 spectrum, we could potentially trigger multiple separate MS3 scans from the same MS2 scan, leading to an increase in sensitivity and quantified peptides across the run. To this end, we tested methods in which the MS2 isolation width was increased from 0.4 Th to 2.0 Th and the multiple precursor search in RTLS was enabled. This option allows the search engine to consider multiple precursors within the isolation window when performing the search on a single MS2 spectrum.
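The effect of widening the isolation window on the set of candidate library precursors can be sketched as below. This is a simplified illustration, not the RTLS search engine: the function name, the symmetric window around the isolation center, and the example precursor m/z values are assumptions.

```python
from bisect import bisect_left, bisect_right


def candidates_in_window(sorted_precursor_mzs, center_mz, width_th):
    """Return indices of library precursors inside the isolation window.

    Assumes a symmetric window of total width `width_th` (in Th) centered
    on `center_mz`, and a precursor m/z list sorted in ascending order.
    """
    low = center_mz - width_th / 2.0
    high = center_mz + width_th / 2.0
    start = bisect_left(sorted_precursor_mzs, low)
    stop = bisect_right(sorted_precursor_mzs, high)
    return list(range(start, stop))


# Hypothetical library precursor m/z values (sorted).
library_mzs = [499.95, 500.28, 500.31, 501.05, 502.60]

narrow = candidates_in_window(library_mzs, center_mz=500.30, width_th=0.4)
wide = candidates_in_window(library_mzs, center_mz=500.30, width_th=2.0)
print(len(narrow), "candidates at 0.4 Th;", len(wide), "candidates at 2.0 Th")
```

Each candidate returned by such a lookup could then be scored against the same MS2 spectrum, and every candidate that clears the score threshold could trigger its own MS3 scan.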

In developing multi-precursor RTLS methods, we found that canonical post hoc searching could not efficiently identify peptides from chimeric spectra. To address this, we performed post hoc searching using CHIMERYS, a newly developed search engine focused on deconvolution of chimeric spectra36,63. CHIMERYS successfully validated hundreds of chimeric spectra from our multiple precursor methods. Importantly, we found that RTLS could also correctly identify multiple precursors from a single MS2 scan. As a proof of principle, CHIMERYS identified 77.9% of all PSMs as being derived from chimeric spectra when using a wide 2.4 Da isolation width (Figure S10). In 43.6% of the chimeric spectra, RTLS returned at least one of the CHIMERYS-validated peptides as the top precursor match. RTLS and CHIMERYS agreed on multiple PSMs within a spectrum for 13.5% of the validated chimeras. Because the human and yeast peptides in our HyPro standard have different known quantitative profiles, we were able to validate detection of chimeric peptide spectra from a single MS2 scan.

We found that RTLS could properly assign both a human and a yeast peptide to a chimeric MS2 spectrum and then trigger MS3-based quantification consistent with the known concentrations across TMTpro channels (Figure 5C). We calculated interference free indices (IFI) for both the human peptide (IFI = 0.12) and yeast peptide (IFI = 0.89) and observed distinct quantification profiles across TMTpro channels that were consistent with the co-isolation and fragmentation of two peptides (Figure 5C). In general, we also observed that RTLS more accurately picked the 10 SPS ions selected for quantifying MS3 scans. We found that 94.9% of quantified peptides analyzed with RTLS methods had 0/10 or 1/10 unmapped SPS ions, and this outperformed RTS and SPS-MS3 methods, which had 91.9% and 47.2% of quantified peptides with 0/10 or 1/10 unmapped SPS ions, respectively (Figure S11). This suggests high concordance between the peptide identified for real-time triggering and the peptide identified post hoc. Future improvements in methods for chimeric spectra generation and detection sensitivity are still needed to implement this method. Despite these considerations, chimeric spectra triggering with RTLS serves as the first step towards addressing chimeric spectra isolation in sample multiplexed proteomics and potentially leveraging wider isolation width methods for isobaric multiplexed samples.
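The unmapped SPS ion statistic can be illustrated with a short sketch. It assumes that an SPS ion counts as mapped when it lies within a ppm tolerance of any theoretical fragment of the post hoc identified peptide; the tolerance, the function names, and the example values are hypothetical and are not taken from the study.

```python
def count_unmapped(sps_mzs, fragment_mzs, tol_ppm=20.0):
    """Count SPS ions that match no theoretical fragment within tol_ppm."""
    unmapped = 0
    for mz in sps_mzs:
        matched = any(abs(mz - frag) / frag * 1e6 <= tol_ppm for frag in fragment_mzs)
        if not matched:
            unmapped += 1
    return unmapped


def fraction_well_mapped(unmapped_counts, max_unmapped=1):
    """Fraction of peptides with at most `max_unmapped` unmapped SPS ions."""
    if not unmapped_counts:
        return 0.0
    return sum(c <= max_unmapped for c in unmapped_counts) / len(unmapped_counts)


# Hypothetical SPS ions and theoretical b-/y-ion m/z values for one peptide.
sps_ions = [300.000, 400.000, 500.000]
theoretical_fragments = [300.001, 400.000]
print(count_unmapped(sps_ions, theoretical_fragments))

# Hypothetical per-peptide unmapped counts across a run.
print(fraction_well_mapped([0, 1, 2, 5]))
```

Applied to every quantified peptide in a run, the second function yields the kind of percentage reported above for peptides with 0/10 or 1/10 unmapped SPS ions.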

Figure 5. Chimeric spectra and RTLS.

A) Acquired MS1 scan with overlapping precursor isotopic envelopes. The red peak for QPMILEK comes from the human EFTU (P49411) and the blue peak for EGPIFGEEMR comes from the yeast EF2 (P32324). B) ddMS2 triggered from the MS1 in A with matched RTLS fragments colored for both precursors. (Right) Mirror plots of the fragments matched by RTLS and the library entry from a concatenated Prosit library of human and yeast peptides. C) HyPro interference free indices (IFI) of the two MS3 spectra that were triggered from the single MS2 spectrum in panel B. Channels are grouped by their expected ratios (see Materials and Methods).

Conclusions

We have reported the development and use of RTLS, a modular, integrated intelligent data acquisition strategy based on spectral library searching in real time for sample multiplexed proteomics. The use of RTLS methods resulted in improved instrument efficiency and increased the number of quantified proteins and peptides when compared to traditional methods. Establishing RTLS for multiplexed proteomics lays the groundwork for future work utilizing library searches in areas where library searching can potentially make a large impact. This includes studies of post-translational modifications, where including modifications in the match scoring can lead to large search spaces. As we have shown, RTLS will also be useful in chimeric spectra deconvolution, and in combination with other IDA methods, such as RTS, for highly selective and adaptive instrument methods. In addition to the core methods, we present optimized RTLS scoring weights for sample multiplexed analyses and demonstrate the utility of integrating RTLS and FAIMS for improved discrimination of low confidence peptides. These optimizations can be further improved upon using new instrumentation and the addition of spectral library close out procedures to increase the sensitivity of detection of quantified peptides and proteins. Finally, we demonstrate that RTLS is capable of triggering multiple, quantitatively distinct MS3 spectra from the same MS2 spectrum. Together these findings highlight how new IDA methods can be used to improve sample multiplexed quantitative proteomics methods.

Supplementary Material

Supporting Information

Acknowledgements

We would like to thank the Schweppe and Villen labs at UW and Jimmy Eng of the UWPR for helpful advice and comments. This work was in part funded by the German Federal Ministry of Education and Research (BMBF; Grant No 031L0168) and European Union’s Horizon 2020 Program under Grant Agreement 823839 (EPIC-XS) to M.W. as well as the Andy Hill CARE Fund Distinguished Researcher Award to D.K.S.

M.W. is founder and shareholder of MSAID GmbH and OmicScouts GmbH, with no operational role in either company. W.B., J.C., J.H., D.B., V.Z., R.M., and G.M. are employees of Thermo Fisher Scientific.

References

  • 1.Liu H, Sadygov RG & Yates JR 3rd. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem 76, 4193–4201, doi: 10.1021/ac0498563 (2004).
  • 2.McAlister GC et al. MultiNotch MS3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes. Anal Chem 86, 7150–7158, doi: 10.1021/ac502040v (2014).
  • 3.Ting L, Rad R, Gygi SP & Haas W MS3 eliminates ratio distortion in isobaric multiplexed quantitative proteomics. Nat Methods 8, 937–940, doi: 10.1038/nmeth.1714 (2011).
  • 4.Bailey DJ, McDevitt MT, Westphall MS, Pagliarini DJ & Coon JJ Intelligent data acquisition blends targeted and discovery methods. J Proteome Res 13, 2152–2161, doi: 10.1021/pr401278j (2014).
  • 5.Graumann J, Scheltema RA, Zhang Y, Cox J & Mann M A framework for intelligent data acquisition and real-time database searching for shotgun proteomics. Mol Cell Proteomics 11, M111 013185, doi: 10.1074/mcp.M111.013185 (2012).
  • 6.Bailey DJ et al. Instant spectral assignment for advanced decision tree-driven mass spectrometry. Proc Natl Acad Sci U S A 109, 8411–8416, doi: 10.1073/pnas.1205292109 (2012).
  • 7.Yu Q et al. Sample multiplexing for targeted pathway proteomics in aging mice. Proc Natl Acad Sci U S A 117, 9723–9732, doi: 10.1073/pnas.1919410117 (2020).
  • 8.Rose CM et al. TomahaqCompanion: A Tool for the Creation and Analysis of Isobaric Label Based Multiplexed Targeted Assays. J Proteome Res 18, 594–605, doi: 10.1021/acs.jproteome.8b00767 (2019).
  • 9.Erickson BK et al. Active Instrument Engagement Combined with a Real-Time Database Search for Improved Performance of Sample Multiplexing Workflows. J Proteome Res 18, 1299–1306, doi: 10.1021/acs.jproteome.8b00899 (2019).
  • 10.Schweppe DK et al. Full-Featured, Real-Time Database Searching Platform Enables Fast and Accurate Multiplexed Quantitative Proteomics. J Proteome Res 19, 2026–2034, doi: 10.1021/acs.jproteome.9b00860 (2020).
  • 11.Eng JK, McCormack AL & Yates JR An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom 5, 976–989, doi: 10.1016/1044-0305(94)80016-2 (1994).
  • 12.Keele GR et al. Global and tissue-specific aging effects on murine proteomes. bioRxiv, 2022.2005.2017.492125, doi: 10.1101/2022.05.17.492125 (2022).
  • 13.Furtwangler B et al. Real-Time Search-Assisted Acquisition on a Tribrid Mass Spectrometer Improves Coverage in Multiplexed Single-Cell Proteomics. Mol Cell Proteomics 21, 100219, doi: 10.1016/j.mcpro.2022.100219 (2022).
  • 14.Kuljanin M et al. Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries. Nat Biotechnol 39, 630–641, doi: 10.1038/s41587-020-00778-3 (2021).
  • 15.Mitchell DC et al. A proteome-wide atlas of drug mechanism of action. Nat Biotechnol, doi: 10.1038/s41587-022-01539-0 (2023).
  • 16.Gessulat S et al. Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning. Nat Methods 16, 509–518, doi: 10.1038/s41592-019-0426-7 (2019).
  • 17.Dorfer V, Maltsev S, Winkler S & Mechtler K CharmeRT: Boosting Peptide Identifications by Chimeric Spectra Identification and Retention Time Prediction. J Proteome Res 17, 2581–2589, doi: 10.1021/acs.jproteome.7b00836 (2018).
  • 18.AS CS, Bouwmeester R, Martens L & Degroeve S Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions. Bioinformatics 35, 5243–5248, doi: 10.1093/bioinformatics/btz383 (2019).
  • 19.Li K, Jain A, Malovannaya A, Wen B & Zhang B DeepRescore: Leveraging Deep Learning to Improve Peptide Identification in Immunopeptidomics. Proteomics 20, e1900334, doi: 10.1002/pmic.201900334 (2020).
  • 20.van Bentum M & Selbach M An Introduction to Advanced Targeted Acquisition Methods. Mol Cell Proteomics 20, 100165, doi: 10.1016/j.mcpro.2021.100165 (2021).
  • 21.Plank MJ Modern Data Acquisition Approaches in Proteomics Based on Dynamic Instrument Control. J Proteome Res 21, 1209–1217, doi: 10.1021/acs.jproteome.2c00096 (2022).
  • 22.Remes PM, Yip P & MacCoss MJ Highly Multiplex Targeted Proteomics Enabled by Real-Time Chromatographic Alignment. Anal Chem 92, 11809–11817, doi: 10.1021/acs.analchem.0c02075 (2020).
  • 23.Stopfer LE et al. High-Density, Targeted Monitoring of Tyrosine Phosphorylation Reveals Activated Signaling Networks in Human Tumors. Cancer Res 81, 2495–2509, doi: 10.1158/0008-5472.CAN-20-3804 (2021).
  • 24.Yu Q et al. Sample multiplexing-based targeted pathway proteomics with real-time analytics reveals the impact of genetic variation on protein expression. Nat Commun 14, 555, doi: 10.1038/s41467-023-36269-7 (2023).
  • 25.Bills B et al. Novel Real-Time Library Search Driven Data Acquisition Strategy for Identification and Characterization of Metabolites. Anal Chem 94, 3749–3755, doi: 10.1021/acs.analchem.1c04336 (2022).
  • 26.Brademan DR et al. Improved Structural Characterization of Glycerophospholipids and Sphingomyelins with Real-Time Library Searching. Anal Chem 95, 7813–7821, doi: 10.1021/acs.analchem.2c04633 (2023).
  • 27.Ruwolt M et al. Real-Time Library Search Increases Cross-Link Identification Depth across All Levels of Sample Complexity. Anal Chem 95, 5248–5255, doi: 10.1021/acs.analchem.2c05141 (2023).
  • 28.Yates JR 3rd, Morgan SF, Gatlin CL, Griffin PR & Eng JK Method to compare collision-induced dissociation spectra of peptides: potential for library searching and subtractive analysis. Anal Chem 70, 3557–3565, doi: 10.1021/ac980122y (1998).
  • 29.Lam H Building and searching tandem mass spectral libraries for peptide identification. Mol Cell Proteomics 10, R111 008565, doi: 10.1074/mcp.R111.008565 (2011).
  • 30.Stein SE & Scott DR Optimization and testing of mass spectral library search algorithms for compound identification. J Am Soc Mass Spectrom 5, 859–866, doi: 10.1016/1044-0305(94)87009-8 (1994).
  • 31.Degroeve S, Maddelein D & Martens L MS2PIP prediction server: compute and visualize MS2 peak intensity predictions for CID and HCD fragmentation. Nucleic Acids Res 43, W326–330, doi: 10.1093/nar/gkv542 (2015).
  • 32.Gabriel W et al. Prosit-TMT: Deep Learning Boosts Identification of TMT-Labeled Peptides. Anal Chem 94, 7181–7190, doi: 10.1021/acs.analchem.1c05435 (2022).
  • 33.Gabriel W, Giurcoiu V, Lautenbacher L & Wilhelm M Predicting fragment intensities and retention time of iTRAQ- and TMTPro-labeled peptides with Prosit-TMT. Proteomics 22, e2100257, doi: 10.1002/pmic.202100257 (2022).
  • 34.Yang KL et al. MSBooster: Improving Peptide Identification Rates using Deep Learning-Based Features. bioRxiv, 2022.2010.2019.512904, doi: 10.1101/2022.10.19.512904 (2022).
  • 35.Declercq A et al. MS(2)Rescore: Data-Driven Rescoring Dramatically Boosts Immunopeptide Identification Rates. Mol Cell Proteomics 21, 100266, doi: 10.1016/j.mcpro.2022.100266 (2022).
  • 36.Zolg DP et al. INFERYS rescoring: Boosting peptide identifications and scoring confidence of database search results. Rapid Commun Mass Spectrom, e9128, doi: 10.1002/rcm.9128 (2021).
  • 37.Yang Y et al. In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics. Nat Commun 11, 146, doi: 10.1038/s41467-019-13866-z (2020).
  • 38.Shiferaw GA et al. Sensitive and Specific Spectral Library Searching with CompOmics Spectral Library Searching Tool and Percolator. J Proteome Res 21, 1365–1370, doi: 10.1021/acs.jproteome.2c00075 (2022).
  • 39.Dorl S, Winkler S, Mechtler K & Dorfer V MS Ana: Improving Sensitivity in Peptide Identification with Spectral Library Search. J Proteome Res 22, 462–470, doi: 10.1021/acs.jproteome.2c00658 (2023).
  • 40.Cranney CW & Meyer JG CsoDIAq Software for Direct Infusion Shotgun Proteome Analysis. Anal Chem 93, 12312–12319, doi: 10.1021/acs.analchem.1c02021 (2021).
  • 41.Searle BC, Shannon AE & Wilburn DB Scribe: Next Generation Library Searching for DDA Experiments. J Proteome Res 22, 482–490, doi: 10.1021/acs.jproteome.2c00672 (2023).
  • 42.Ruwolt M et al. Real-time library search increases cross-link identification depth across all levels of sample complexity. bioRxiv, 2022.2011.2016.516769, doi: 10.1101/2022.11.16.516769 (2022).
  • 43.Navarrete-Perea J, Gygi SP & Paulo JA HYpro16: A Two-Proteome Mixture to Assess Interference in Isobaric Tag-Based Sample Multiplexing Experiments. J Am Soc Mass Spectrom 32, 247–254, doi: 10.1021/jasms.0c00299 (2021).
  • 44.Navarrete-Perea J, Yu Q, Gygi SP & Paulo JA Streamlined Tandem Mass Tag (SL-TMT) Protocol: An Efficient Strategy for Quantitative (Phospho)proteome Profiling Using Tandem Mass Tag-Synchronous Precursor Selection-MS3. J Proteome Res 17, 2226–2236, doi: 10.1021/acs.jproteome.8b00217 (2018).
  • 45.Deutsch EW et al. Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics. Proteomics Clin Appl 9, 745–754, doi: 10.1002/prca.201400164 (2015).
  • 46.Adusumilli R & Mallick P Data Conversion with ProteoWizard msConvert. Methods Mol Biol 1550, 339–368, doi: 10.1007/978-1-4939-6747-6_23 (2017).
  • 47.Cheng CY, Tsai CF, Chen YJ, Sung TY & Hsu WL Spectrum-based method to generate good decoy libraries for spectral library searching in peptide identifications. J Proteome Res 12, 2305–2310, doi: 10.1021/pr301039b (2013).
  • 48.Ma K, Vitek O & Nesvizhskii AI A statistical model-building perspective to identification of MS/MS spectra with PeptideProphet. BMC Bioinformatics 13 Suppl 16, S1, doi: 10.1186/1471-2105-13-S16-S1 (2012).
  • 49.Lam H et al. Development and validation of a spectral library searching method for peptide identification from MS/MS. Proteomics 7, 655–667, doi: 10.1002/pmic.200600625 (2007).
  • 50.Elias JE & Gygi SP Target-decoy search strategy for mass spectrometry-based proteomics. Methods Mol Biol 604, 55–71, doi: 10.1007/978-1-60761-444-9_5 (2010).
  • 51.Paulo JA, O’Connell JD & Gygi SP A Triple Knockout (TKO) Proteomics Standard for Diagnosing Ion Interference in Isobaric Labeling Experiments. J Am Soc Mass Spectrom 27, 1620–1625, doi: 10.1007/s13361-016-1434-9 (2016).
  • 52.Schweppe DK et al. Characterization and Optimization of Multiplexed Quantitative Analyses Using High-Field Asymmetric-Waveform Ion Mobility Mass Spectrometry. Anal Chem 91, 4010–4016, doi: 10.1021/acs.analchem.8b05399 (2019).
  • 53.R Core Team. R: A language and environment for statistical computing. (2013).
  • 54.Zhang X, Li Y, Shao W & Lam H Understanding the improved sensitivity of spectral library searching over sequence database searching in proteomics data analysis. Proteomics 11, 1075–1085, doi: 10.1002/pmic.201000492 (2011).
  • 55.Shen JQ et al. Spectral Library Search Improves Assignment of TMT Labeled MS/MS Spectra. Journal of Proteome Research 17, 3325–3331, doi: 10.1021/acs.jproteome.8b00594 (2018).
  • 56.Huber F et al. Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships. PLoS Comput Biol 17, e1008724, doi: 10.1371/journal.pcbi.1008724 (2021).
  • 57.Koo I, Kim S & Zhang X Comparative analysis of mass spectral matching-based compound identification in gas chromatography-mass spectrometry. J Chromatogr A 1298, 132–138, doi: 10.1016/j.chroma.2013.05.021 (2013).
  • 58.Schollee JE et al. Similarity of High-Resolution Tandem Mass Spectrometry Spectra of Structurally Related Micropollutants and Transformation Products. J Am Soc Mass Spectrom 28, 2692–2704, doi: 10.1007/s13361-017-1797-6 (2017).
  • 59.Yen CY, Houel S, Ahn NG & Old WM Spectrum-to-spectrum searching using a proteome-wide spectral library. Mol Cell Proteomics 10, M111 007666, doi: 10.1074/mcp.M111.007666 (2011).
  • 60.Frewen BE, Merrihew GE, Wu CC, Noble WS & MacCoss MJ Analysis of peptide MS/MS spectra from large-scale proteomics experiments using spectrum libraries. Anal Chem 78, 5678–5684, doi: 10.1021/ac060279n (2006).
  • 61.Paulo JA, Navarrete-Perea J & Gygi SP Multiplexed proteome profiling of carbon source perturbations in two yeast species with SL-SP3-TMT. J Proteomics 210, 103531, doi: 10.1016/j.jprot.2019.103531 (2020).
  • 62.Mayer RL et al. Wide Window Acquisition and AI-based data analysis to reach deep proteome coverage for a wide sample range, including single cell proteomic inputs. bioRxiv, 2022.2009.2001.506203, doi: 10.1101/2022.09.01.506203 (2022).
  • 63.Frejno M et al. CHIMERYS: An AI-Driven Leap Forward in Peptide Identification. (Annual Meeting of the American Society for Mass Spectrometry, 2021).
