Abstract
Immunopeptidomics aims to identify major histocompatibility complex (MHC)-presented peptides on almost all cells that can be used in anti-cancer vaccine development. However, existing immunopeptidomics data analysis pipelines are hampered by the nontryptic nature of immunopeptides, which complicates their identification. Previously, peak intensity predictions by MS2PIP and retention time predictions by DeepLC have been shown to improve tryptic peptide identifications when rescoring peptide-spectrum matches with Percolator. However, as MS2PIP was tailored toward tryptic peptides, we have here retrained MS2PIP to include nontryptic peptides. Interestingly, the new models not only greatly improve predictions for immunopeptides but also yield further improvements for tryptic peptides. We show that the integration of new MS2PIP models, DeepLC, and Percolator in one software package, MS2Rescore, increases the spectrum identification rate and the number of unique identified peptides by 46% and 36%, respectively, compared to standard Percolator rescoring at 1% FDR. Moreover, MS2Rescore also outperforms the current state-of-the-art in immunopeptide-specific identification approaches. Altogether, MS2Rescore thus allows substantially improved identification of novel epitopes from existing immunopeptidomics workflows.
Keywords: bioinformatics, immunopeptidomics, machine learning, proteomics, mass spectrometry, peptide identification
Abbreviations: CE, collision energy; FDR, false discovery rate; HCD, higher-energy collision-induced dissociation; MS2, tandem mass spectrometry; MGF, mascot generic format; PCC, Pearson correlation coefficient; PSM, peptide-spectrum match.
Graphical Abstract
Highlights
• MS2Rescore significantly boosts immunopeptide identification rates
• Data-driven post-processing allows for a ten-fold increase in specificity
• MS2PIP and DeepLC predictors are integrated with Percolator post-processing
• MS2Rescore accepts identification results from MaxQuant, PEAKS, MS-GF+ and X!Tandem
• MS2Rescore shows great promise to extend current neo- and xeno-epitope landscapes
In Brief
The integration of newly trained immunopeptide MS2PIP models, DeepLC, and Percolator into one software package called MS2Rescore allows for a significant boost in immunopeptide identification rate as well as a substantial increase in specificity. MS2Rescore is search engine-agnostic and unbiased toward HLA types. MS2Rescore, therefore, shows great promise to extend the current neo- and xeno-epitope landscape in existing and future immunopeptidomics experiments.
The immune system is a complex, yet remarkable system that protects us from both invaders from outside the body, that is, pathogens, as well as from inside the body, that is, malignancies (1). Increased understanding of the immune system allowed for great medical achievements such as vaccination, which is currently available for over 29 diseases, enabled the eradication of smallpox, and prevents over 3 million deaths each year (https://www.cdc.gov/vaccines/vpd/vaccines-diseases.html?CDC_AA_refVal=https%3A%2F%2Fwww.cdc.gov%2Fvaccines%2Fvpd-vac%2Fdefault.htm). However, many diseases such as Mycobacterium tuberculosis or malignancies lack effective vaccines due to improper T-cell activation. A key issue in developing effective vaccines for these diseases is the lack of accurately identified major histocompatibility complex (MHC)-presented epitopes or immunopeptides. These epitopes are presented on the cell surface and enable T-cells to discern healthy cells from infected or malignant cells. While much effort has recently been invested in accurate prediction of these epitopes in silico (2), these are mostly limited to viruses as these contain fewer potential protein antigens (3). Moreover, these tools are not yet sufficiently precise to confidently identify epitopes (4, 5). Therefore, experimental immunopeptidomics workflows, such as epitope detection through LC-MS, are still the best way to accurately identify these immunopeptides (6).
While immunopeptidomics workflows have been readily developed and applied (7), acquisition of immunopeptides through LC-MS suffers from some major problems. First, the acquisition of immunopeptide spectra is hampered due to the low abundance of immunopeptides and even more so of neo-epitopes. Infrequently occurring epitopes are still very challenging to identify through LC-MS, despite enrichment efforts and sample preprocessing (8). Second, in contrast to standard proteomics experiments where proteins are usually digested with trypsin before LC-MS, immunopeptides are captured through immune purification with antibodies followed by acidic elution resulting in mostly nontryptic peptides. The nontryptic nature of immunopeptides results in one less positive charge due to the missing arginine or lysine at the peptide’s C-terminus, causing many immunopeptides to be singly charged during MS acquisition. These singly charged peptides are much harder to analyze because, during fragmentation of the peptide, the charge resides on one of the fragments, leaving the other uncharged and thus lost (9). Moreover, most contaminants are singly charged as well, making identifications of immunopeptides much harder (10). The nontryptic nature of immunopeptides hampers not only the acquisition but also the identification of immunopeptide spectra. To match each acquired spectrum with the peptide from which the spectrum originated, proteomics database search engines such as SEQUEST (11), X!Tandem (12), Andromeda (13), or PEAKS DB (14) generate in silico spectra for all potentially matching peptides. The complete list of peptides that could be in the sample is called the search space. It is important to note that spectra from peptides that are not included in this search space cannot be identified, even though they were acquired. In standard shotgun proteomics experiments, in silico tryptic digestion of relevant proteins yields a broad yet representable search space. 
In immunopeptidomics, however, the search space tends to be two orders of magnitude larger due to (i) seemingly random cleavage from the protein of origin, (ii) the variable length of MHC class I bound peptides, 8 to 11 amino acids, and MHC-II peptides, 6 to 24 amino acids (15) and (iii) the potential occurrence of conformational cis- and trans-spliced immunopeptides, which are nonlinear peptides that originate from the same or different proteins, respectively (16). Additionally, sequence variants and noncanonical protein sequences are often considered as well, even further increasing the search space. Such a search space expansion leads to considerably more ambiguity between candidate peptide-spectrum matches (PSMs) (17), lower PSM scores, drastically elevated false discovery rate (FDR) score thresholds, and ultimately in fewer identified immunopeptides (18). Furthermore, because tryptic peptides have been the longtime standard in proteomics, search engines as well as bioinformatics tools that aid in identifying LC-MS spectra are tailored toward tryptic peptides, making them less accurate or not applicable at all for immunopeptidomics.
The high need for neo- and xeno-epitope discovery led to the development of many bioinformatics tools to improve or validate identifications in immunopeptidomics. On the one hand, motif deconvolution tools have been developed that leverage binding motifs of immunopeptides to validate immunopeptide identifications. On the other hand, full pipelines have been developed to improve immunopeptide identification. Examples include MHCquant (19), a recent computational workflow designed specifically for neo-epitope identification, and PEAKS DB (14). Even though PEAKS DB is not specifically designed for immunopeptides, it is highly interesting due to its de novo–assisted database searches, which tend to work well for large search spaces. Although these tools can help with immunopeptide identification, they do not use all available information, such as retention time and fragment ion intensity patterns. Previously, it has been shown that integrating retention time predictions in standard proteomics workflows can improve identification rates (20). Similarly, adding peak intensity predictions to postprocessing tools such as Percolator can also improve identification rates drastically (21), which has already been shown to work for immunopeptides as well by efforts such as Prosit (22, 23). Likewise, tools such as DeepLC (24) and MS2PIP (25, 26, 27) can provide accurate retention time predictions and peak intensity predictions, respectively, to aid in postprocessing. Indeed, when combined with Percolator, identification rates at a fixed FDR have been shown to substantially increase (21). However, currently, DeepLC and MS2PIP are solely trained on tryptic peptides. For DeepLC, the absence of lysine and arginine at the C-terminus of nontryptic peptides is less of a problem, as the effect on retention time is small and is accounted for through feature encoding (28).
However, this is not the case for MS2PIP, as alterations in peptide composition as well as fragmentation patterns and labeling methods heavily alter peak intensity patterns (25). Therefore, we here present greatly improved MS2PIP models that include immunopeptides and nontryptic peptides in general. Moreover, we have integrated MS2PIP and DeepLC with Percolator into the free and open-source MS2Rescore software tool, which enables improved rescoring of peptide identifications from various proteomics search engines. Altogether, we show that well-adapted fragmentation spectrum and retention time predictions integrated into MS2Rescore drastically increase immunopeptide identification rates and outperform existing postprocessing methods.
Experimental Procedures
Training and Evaluation of New MS2PIP Spectrum Prediction Models for Immunopeptides
To train and test new MS2PIP models, five publicly available immunopeptide data sets and one publicly available chymotrypsin-digestion data set were downloaded from PRIDE Archive (29, 30). Similarly, for evaluation on representative unseen data, four distinct data sets were downloaded: (i) a data set containing HLA-I immunopeptides, (ii) a data set containing HLA-II immunopeptides, (iii) the data set with tryptic peptides that was previously used to evaluate the existing MS2PIP higher-energy collision-induced dissociation (HCD) models, and (iv) a data set containing chymotrypsin-digested peptide data. The corresponding ProteomeXchange project identifiers as well as the number of unique peptides and HLA patterns for each data set are listed in supplemental Table S1.
As tandem mass spectrometry (MS2) fragmentation patterns are highly dependent on the instrument, instrument settings, fragmentation method, and any applied labeling methods (25), all MS2PIP train, test, and evaluation data must originate from experiments with the same experimental parameters. Unlabeled HCD data from Quadrupole-Orbitrap instruments was used, as this makes the newly trained models widely applicable and plenty of training data is readily available on public repositories. For each PRIDE Archive project, the raw mass spectrometry files were converted to Mascot Generic Format (MGF) files using ThermoRawFileParser (v1.3.4) (31). The corresponding identification files were converted to MS2PIP input files using custom Python scripts and were further filtered to retain unique combinations of peptide sequence, modifications, and precursor charge at 1% FDR. Next, all spectra were combined into one MGF file. The universal spectrum identifier (32) was used as unique identifier for each PSM to ensure reproducibility and a one-on-one mapping between peptide identifications and spectra. Data from each PRIDE Archive project was either used as train/test data or as evaluation data to ensure fully independent data sets, except for the chymotrypsin data, where the same project was used to provide both training/testing data (70%) and evaluation data (30%). This split was made after selecting unique peptide-modification-charge combinations, to ensure no overlap in samples between both splits.
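The filtering and splitting steps described above can be sketched in Python as follows. This is a minimal illustration and not the custom scripts used in this work: the dictionary keys (`sequence`, `modifications`, `charge`, `score`, `q_value`), the choice to keep the best-scoring PSM per combination, and the fixed random seed are assumptions made for the example, and the PSMs are toy data.

```python
import random

# Toy peptide sequences; all field names below are hypothetical.
SEQUENCES = [
    "SLYNTVATL", "GILGFVFTL", "NLVPMVATV", "KLGGALQAK", "ELAGIGILTV",
    "YLQPRTFLL", "LLFGYPVYV", "RMFPNAPYL", "IMDQVPFSV", "FLPSDFFPSV",
]


def unique_psms(psms, max_q_value=0.01):
    """Keep one PSM per (sequence, modifications, charge) below the FDR cutoff.

    Assumption: the highest-scoring PSM represents each combination.
    """
    best = {}
    for psm in psms:
        if psm["q_value"] > max_q_value:
            continue
        key = (psm["sequence"], psm["modifications"], psm["charge"])
        if key not in best or psm["score"] > best[key]["score"]:
            best[key] = psm
    return list(best.values())


def split_train_eval(psms, eval_fraction=0.3, seed=42):
    """Split already-unique PSMs so that no combination ends up in both sets."""
    shuffled = list(psms)
    random.Random(seed).shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


# 40 toy PSMs that collapse to 10 unique peptide-modification-charge keys.
psms = [
    {
        "sequence": SEQUENCES[i % 10],
        "modifications": "",
        "charge": 1 + (i % 2),
        "score": float(i),
        "q_value": 0.001 if i < 36 else 0.05,
    }
    for i in range(40)
]
unique = unique_psms(psms)
train_test, evaluation = split_train_eval(unique)
print(len(unique), len(train_test), len(evaluation))
```

Deduplicating before splitting is what guarantees that the same peptide-modification-charge combination cannot leak from the training/testing portion into the evaluation portion.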
Similarly to the 2019 MS2PIP models, new models were trained with the XGBoost machine learning algorithm (33). The Hyperopt package (v0.2.5) (34) was used in combination with four-fold cross-validation for hyperparameter optimization, allowing 400 boosting rounds and early stopping fixed at 10 rounds. Hyperparameters were optimized for each training data set separately, as well as for b- and y-ion models. All selected hyperparameters are shown in supplemental Table S2. To evaluate each model, the Pearson correlation coefficient (PCC) was calculated between observed and predicted b- and y-ion peak intensities for each spectrum. The model performances were further analyzed by peptide length and precursor charge. In total, three models were trained: the immunopeptide model solely trained on immunopeptides, the immuno-chymotrypsin model trained on immunopeptides supplemented with chymotrypsin-digested peptides, and the nontryptic immunopeptide model solely trained on nontryptic immunopeptides. Of these, two models were integrated into MS2PIP: (i) the immunopeptide model and (ii) the immuno-chymotrypsin model. The former can be used for immunopeptide peak intensity predictions and the latter for tryptic and more general nontryptic peptide predictions. For further analysis into rescoring immunopeptide PSMs, only the immunopeptide model was used, as it showed the best performance for both HLA-I and HLA-II immunopeptides.
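As an illustration of the evaluation metric, the per-spectrum PCC and its median over a set of spectra can be computed as sketched below. The function names and the toy intensity values are assumptions for the example; any intensity preprocessing applied by MS2PIP before correlation is not reproduced here.

```python
import math
import statistics


def pearson(observed, predicted):
    """Pearson correlation coefficient between two equal-length intensity lists."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return float("nan")  # correlation is undefined for constant vectors
    return cov / math.sqrt(var_o * var_p)


# Toy (observed, predicted) b-/y-ion intensity pairs, one tuple per spectrum.
spectra = [
    ([0.0, 0.2, 0.9, 0.4], [0.1, 0.25, 0.8, 0.35]),
    ([0.5, 0.1, 0.3], [0.4, 0.2, 0.3]),
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
]
pccs = [pearson(obs, pred) for obs, pred in spectra]
median_pcc = statistics.median(pccs)
print(f"median PCC: {median_pcc:.3f}")
```

Summarizing with the median rather than the mean keeps a few poorly predicted spectra from dominating the reported model accuracy.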
To compare MS2PIP predictions with Prosit predictions, the same evaluation data sets were used as mentioned above. Prosit (v1.1.2) was downloaded from GitHub (https://github.com/kusterlab/prosit) and the hcd_hla and irt_prediction models were downloaded from Figshare (https://figshare.com/projects/prosit/35582). MS2PIP predictions were acquired for the general proteomics and chymotrypsin evaluation data with the immuno-chymotrypsin model and for the HLA-I and HLA-II evaluation data with the immunopeptide model. Peptides that were not included in the Prosit output were filtered out of the MS2PIP predictions. The performance was measured in both PCC and spectral angle to ensure a thorough comparison. Only correlations for singly charged fragment ions were taken into account, as the newly trained MS2PIP models only predict intensities for these ions.
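For reference, one common definition of the normalized spectral angle, based on the angle between the L2-normalized observed and predicted intensity vectors, is sketched below. Whether this matches the exact implementation used for the comparison is an assumption; the function name and example vectors are illustrative only.

```python
import math


def spectral_angle(observed, predicted):
    """Normalized spectral angle in [0, 1] for non-negative intensities.

    1 means identical direction of the two vectors, 0 means orthogonal.
    """
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0  # no signal in one of the vectors
    cosine = sum(o * p for o, p in zip(observed, predicted)) / (norm_o * norm_p)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding errors
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


same = spectral_angle([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])
orthogonal = spectral_angle([1.0, 0.0], [0.0, 1.0])
print(f"identical: {same:.3f}, orthogonal: {orthogonal:.3f}")
```

Unlike the PCC, the spectral angle does not center the intensities, which is one reason the two metrics can rank predictors differently.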
Evaluation of MS2Rescore on HLA Class I Peptides and Comparison with Prosit Rescoring
To validate the capacity of the new MS2PIP models to improve immunopeptide identification rates, the new models were integrated with DeepLC (v0.1.36) and Percolator (v3.5) into MS2Rescore. MS2Rescore calculates various meaningful features based on (i) the search engine output, (ii) the DeepLC-predicted and the observed retention times, and (iii) the MS2PIP-predicted and the observed MS2 peak intensities. These features are then passed to Percolator for PSM rescoring. Search engine features were selected based on the previous publication by Granholm et al. (35) and replicated for use with MaxQuant search results (36). MS2PIP features were used as first described by Silva et al. (21). All features generated by MS2Rescore are listed in supplemental Table S3.
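To make the three feature groups concrete, the sketch below derives a handful of illustrative features for a single PSM and serializes them as one tab-separated row in the column order of Percolator's tab-delimited input (SpecId, Label, ScanNr, features, Peptide, Proteins). The feature names and the helper functions are hypothetical and far simpler than the actual feature set in supplemental Table S3.

```python
import math


def psm_features(search_score, observed_rt, predicted_rt, observed, predicted):
    """Return a small, illustrative feature dictionary for one PSM."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    dot = sum(o * p for o, p in zip(observed, predicted))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(
        sum(p * p for p in predicted)
    )
    return {
        "search_engine_score": search_score,               # (i) search engine
        "rt_abs_error": abs(observed_rt - predicted_rt),   # (ii) retention time
        "intensity_mse": mse,                              # (iii) peak intensities
        "intensity_cosine": dot / norm if norm else 0.0,   # (iii) peak intensities
    }


def pin_row(spec_id, is_decoy, scan_nr, features, peptide, protein):
    """Serialize one PSM as a tab-separated row; label is 1 (target) or -1 (decoy)."""
    label = -1 if is_decoy else 1
    values = [spec_id, label, scan_nr, *features.values(), peptide, protein]
    return "\t".join(str(v) for v in values)


features = psm_features(
    search_score=100.0,
    observed_rt=30.5,
    predicted_rt=31.0,
    observed=[0.1, 0.5, 0.4],
    predicted=[0.1, 0.5, 0.4],
)
row = pin_row("run1_scan42", False, 42, features, "-.SLYNTVATL.-", "toy_protein")
print(row)
```

Because every PSM, target or decoy, receives the same feature columns, the rescoring model can weigh the prediction-based features against the search engine score when separating the two classes.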
MS2Rescore was validated on a large-scale HLA class I data set (37), which was also used to validate the recently published Prosit-rescoring effort for immunopeptides (PXD021398) (23). This allows both an evaluation of the improved identification rates due to the new MS2PIP models and a straight-forward comparison with Prosit rescoring. First, the msms.txt identification files for the projects’ two MaxQuant searches (alkylated and nonalkylated samples), the Prosit-rescored Percolator output files, and the raw mass spectrometry files were downloaded from PRIDE Archive. The mass spectrometry files were then further processed with ThermoRawFileParser (v1.3.4) (31) and the PSMs for each of the two MaxQuant searches were rescored separately. Two rescoring methods were evaluated: (i) using only search engine features, replicating a normal Percolator run, and (ii) using the full MS2Rescore feature set, including search engine-, MS2PIP-, and DeepLC-features. Additionally, these rescoring methods were compared with the original MaxQuant results and with the downloaded Prosit-rescoring results.
Each rescoring method was evaluated at varying FDR thresholds in terms of identification rate and number of unique identified peptides. The contribution of the different feature sets in MS2Rescore was visualized using Percolator’s model weights, and the distributions of retention time difference and MS2PIP prediction correlations were compared between decoy PSMs, accepted target PSMs, and rejected target PSMs.
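The evaluation at varying FDR thresholds can be illustrated with a basic target-decoy calculation, sketched below. This is a simplified estimate (decoys divided by targets above each score threshold, made monotone into q-values) and not Percolator's own q-value estimation; function names and toy scores are assumptions.

```python
def q_values(scores, is_decoy):
    """Simple target-decoy q-values; higher scores are better."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # q-value: lowest FDR at which the PSM would still be accepted.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def identification_rate(scores, is_decoy, n_spectra, fdr):
    """Fraction of all acquired spectra with an accepted target PSM."""
    q = q_values(scores, is_decoy)
    accepted = sum(1 for qi, d in zip(q, is_decoy) if not d and qi <= fdr)
    return accepted / n_spectra


scores = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0]
is_decoy = [False, False, False, True, False, True]
for threshold in (0.01, 0.25):
    rate = identification_rate(scores, is_decoy, n_spectra=10, fdr=threshold)
    print(f"FDR {threshold}: identification rate {rate:.0%}")
```

Note that the identification rate is expressed relative to all acquired spectra, not only those with a PSM, so unmatched spectra lower the rate.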
Additionally, as reported by Wilhelm et al. (23), sequence motif patterns for HLA pattern C∗12:03 were further analyzed with GibbsCluster (v2.0) (38), for the gained and lost peptides compared to rescoring with only search engine features.
Evaluation of MS2Rescore Across Collision Energy Settings and Peptide Abundances
To further analyze MS2Rescore performance for various experimental collision energy settings, replicate LC-MS/MS runs were performed on HL60 cells at collision energy values of 25, 27, 30, 32, and 35 NCE (supplemental Methods). The resulting spectra were searched with the Andromeda search engine (MaxQuant v1.6.14.0) against the human UniProtKB-SwissProt database (14-09-2020; 20,388 sequences, Taxonomy ID 9606) without any enzyme specificity. A minimal peptide length of seven amino acids was required. Oxidation (M) was set as variable modification with a maximum of three modifications per peptide. Mass tolerances were set at 5 ppm and 20 ppm for MS1 and MS2 spectra, respectively. FDR was kept at 100% with the use of a decoy strategy for downstream rescoring with (i) only search engine features and (ii) the full MS2Rescore feature set. Furthermore, for all LC-MS/MS runs at all collision energy settings, precursor intensities were obtained from the MaxQuant msms.txt file to assess any differences in the performance of MS2Rescore between low and high abundant peptides.
Evaluation of MS2Rescore on HLA Class II Peptides
To validate MS2Rescore for HLA class II peptides, another set of raw mass spectrometry files was downloaded from PRIDE Archive (PXD015408). As the uploaded search engine results were already filtered at 5% FDR, the spectra were reanalyzed with PEAKS DB (v10.5) (14) with the same search parameters that were used in the original publication, that is, no enzyme specificity, precursor error tolerance of 10 ppm, fragment ion tolerance of 0.01 Da, with oxidation (M), deamidation (NQ), and trioxidation (C) as variable modifications, searched against the UniProtKB-SwissProt database (01-2021; 22,235 sequences, Taxonomy ID 10090). The mzIdentML identification file as well as the corresponding MGF files were exported from PEAKS DB and were rescored with MS2Rescore with (i) only search engine features and (ii) the full MS2Rescore feature set, as described above for the evaluation on HLA class I peptides.
General Data Processing and Data Visualization
All plots, unless specified, were generated in Jupyter notebooks (v6.4.0) using Python (v3.8.3) with the Matplotlib (v3.4.2) (39), Seaborn (v0.11.0) (40), UpSetPlot (v0.6.0), and spectrum_utils (v0.3.5) (41) libraries.
Results
Newly Trained MS2PIP Models Accurately Predict Immunopeptide Spectrum Peak Intensities
In order to improve the identification rate of immunopeptides by leveraging peak intensity predictions, new models for MS2PIP were trained specifically for immunopeptides. Despite using different training set compositions, all newly trained models drastically improve predictions for both HLA-I and HLA-II data in comparison with the tryptic 2019 HCD model (Fig. 1A). Surprisingly, even for standard tryptic shotgun proteomics data, the predictions from the new models are slightly better, largely due to the portion of tryptic peptides within the immunopeptide training data. Indeed, when these peptides are left out of the training data, accuracy drops in comparison with the 2019 HCD model. While both immunopeptide models are well suited to predict peak intensities for tryptic and immunopeptides, the performance on chymotrypsin-digested peptides is not as high (supplemental Fig. S1). Thus, even though immuno- and chymotrypsin-digested peptides are both considered nontryptic, they are still very different for MS2PIP peak intensity predictions. Overall, immunopeptide peak intensity predictions are drastically improved by all the newly trained models, with the immunopeptide model showing the highest accuracy (median PCC of 0.94). The exact median PCCs are listed in supplemental Table S4. Examples of a prediction with median PCC values with the immunopeptide model and the corresponding, less accurate, 2019 model prediction are shown in Figure 1, B and C.
MS2Rescore Drastically Improves Immunopeptide Identification Rates
Ultimately, the goal of these newly trained models is to improve immunopeptide identification rates by providing more accurate peak intensity predictions. Therefore, (i) identification results without rescoring, (ii) rescoring with solely search engine features (replicating a normal Percolator run), (iii) rescoring with MS2Rescore (including DeepLC and the new MS2PIP models), and (iv) rescoring with the recently published Prosit models were compared in terms of the total number of identifications as well as the number of unique identifications based on sequence. Overall, rescoring with both MS2Rescore and Prosit substantially improved the spectrum identification rate in comparison with rescoring with search engine features alone or not rescoring at all, at both 1% and 0.1% FDR. Indeed, MS2Rescore achieves an identification rate of 11.1%, out of 18 million spectra, compared to 7.6% for traditional rescoring (an increase of 46%), and only 1.9% for the MaxQuant search results, all at 1% FDR (Fig. 2, A and B). Moreover, 83% of the identified spectra at 1% FDR are retained when restricting the threshold to 0.1% FDR. Thus, providing peak intensities and retention time predictions to Percolator substantially increases the number of identified immunopeptides. This is clearly illustrated by analyzing the Percolator weights for each separate feature as well as the combined absolute weights for search engine features, MS2PIP features, and DeepLC features (supplemental Fig. S2). Similarly, the number of unique identified immunopeptides increases by 36% when adding MS2PIP and DeepLC features at 1% FDR and even more so at 0.1% FDR, where the number of unique identified peptides reaches nearly 300% of the number of traditional Percolator identification results (Fig. 2, C–F). These gains are consistent across all 95 HLA class I alleles included in the data (supplemental Figs. S3–S4), showing that the newly trained MS2PIP model, and therefore MS2Rescore, is generalizable across different HLA types. In fact, MS2Rescore allows for a substantial increase in identification rate for HLA types with initially fewer identifications (e.g., A0101, A0204, B4402…), indicating that MS2Rescore especially improves the peptide identification coverage for harder-to-identify HLA alleles.
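The relative gain in identification rate quoted above follows directly from the reported rates at 1% FDR, as the short check below shows. The function name `relative_gain` is illustrative only; the input values are the identification rates reported in this section.

```python
def relative_gain(new_rate, old_rate):
    """Relative increase of new_rate over old_rate, as a fraction."""
    return (new_rate - old_rate) / old_rate


# Identification rates (% of spectra) at 1% FDR, as reported in the text.
ms2rescore_rate = 11.1
percolator_rate = 7.6
maxquant_rate = 1.9

gain_vs_percolator = relative_gain(ms2rescore_rate, percolator_rate)
fold_vs_maxquant = ms2rescore_rate / maxquant_rate
print(f"gain over traditional rescoring: {gain_vs_percolator:.0%}")
print(f"fold change over MaxQuant results: {fold_vs_maxquant:.1f}x")
```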
The power of providing these predictions to Percolator is further illustrated when visualizing the distributions for decoy PSMs, rejected target PSMs, and accepted target PSMs. Indeed, the distributions for decoy and rejected target PSMs are highly similar for both the retention time error as well as the PCC, while the accepted target PSMs accumulate around low retention time errors and high PCCs (Fig. 3, A and B). The accepted target PSMs are clearly separable from the decoy and rejected target PSMs using only the PCC and retention time error distributions (Fig. 3, C and D). Furthermore, while both metrics correlate with the search engine score, a large amount of decoy and rejected target PSMs can only be separated from the target PSMs by also including PCC or retention time error information (Fig. 3, E and F). This clearly illustrates how Percolator achieves its much-improved separation between true and false target PSMs when provided with peak intensity and retention time prediction features. Furthermore, PSMs that previously would have been incorrectly accepted below a 1% FDR because of a high search engine score alone are now rejected due to a low PCC, a high retention time error, or both. This most likely accounts for the small percentage of identified peptides that are lost after rescoring.
MS2Rescore Outperforms the Current State-of-the-Art
The integration of MS2PIP-, DeepLC-, and search engine-based features in MS2Rescore has proven to substantially increase the identification rate of immunopeptides and furthermore, outperforms the recently published Prosit-rescoring method (23). In comparison with Prosit, MS2Rescore gains 5% and 35% more identifications at 1% FDR and 0.1% FDR, respectively. This trend continues for the number of unique identified peptides with a respective increase of 8% and 57% (Fig. 2). Indeed, over 32,000 unique peptides were identified below the 0.1% FDR threshold by MS2Rescore, while Prosit rescoring only identified these peptides at the 1% FDR threshold, highlighting the gain in confidence in terms of unique identified immunopeptides (supplemental Fig. S5). MS2Rescore thus substantially increases the identification rate, especially for more stringent FDR thresholds.
Peptides for which Prosit cannot predict MS2 peak intensities, that is, unmodified (C), cysteinylated (C), and acetylation (N-terminus), were left out of the Prosit-rescoring output. That is why MS2Rescore includes more PSMs in the unfiltered data set at 100% FDR, especially for the noIAA sample (supplemental Fig. S6, A and B). For a second, thorough comparison, these PSMs were left out of the MS2Rescore output as well. However, this seemed to have a rather negligible impact on the number of identifications at 1% and 0.1% FDR (supplemental Fig. S6, C and D), and thus the difference in identification rates cannot be attributed to these filtered peptides.
Furthermore, a comparison of the peptide spectrum prediction accuracy of the newly trained MS2PIP models and Prosit indicates that higher performance of MS2Rescore cannot be attributed to improved peak intensity predictions. Indeed, depending on the correlation metric that is used, either MS2PIP or Prosit performs slightly better than the other on the evaluation data (supplemental Fig. S7). It is important to note, however, that for Prosit, the correct collision energy (CE) value should be selected for optimal performance. To determine if the difference in rescoring between MS2Rescore and Prosit is driven by a difference in features, all MS2Rescore features that do not have a close counterpart in Prosit rescoring were removed for a separate run. This includes all DeepLC retention time features and various search engine features (supplemental Table S3). As MS2Rescore and Prosit both include various similar peak intensity prediction features, these were retained. With this reduced feature set, the performance of MS2Rescore is slightly lower than for Prosit rescoring (supplemental Fig. S6, A–D), which confirms that the retention time features and additional search engine features result in improved MS2Rescore performance over Prosit rescoring.
Similarly to Prosit, the identified sequence motif for HLA type C∗12:03 was highly similar to the motif reported in the original publication of the data set (23, 37), while the peptides that were removed by MS2Rescore compared to search engine rescoring showed quite different, less conserved sequence motifs (supplemental Fig. S8).
MS2Rescore Generalizes Well Across Collision Energy Settings and Peptide Abundances
Because MS2PIP does not account for CE in its predictions, MS2PIP and consequently MS2Rescore could potentially be biased toward spectra obtained with certain CE values. Therefore, search results of replicate mass spectrometry runs with CE settings varying from 25 to 35 were postprocessed with MS2Rescore. For each CE value, MS2Rescore shows a significant increase in identification rate. However, for larger (less optimal) CE values, the overall identification rate decreases (supplemental Fig. S9A). This is most likely due to a reduced quality in fragmentation spectra, which is reflected in the decreased explained ion current and in line with the b- and y-ion MS2PIP PCC distributions for accepted PSMs when using suboptimal CE values (supplemental Fig. S9, D–F). Most interestingly, the relative gain in unique identified peptides for MS2Rescore increases for higher (and therefore less optimal) CE values, approaching a 60% increase for CE 35, by slightly shifting the feature weights away from fragmentation features in favor of DeepLC retention time features (supplemental Fig. S9, B and C). Consequently, MS2Rescore is able to recover peptides that would otherwise be lost due to lower-quality fragmentation spectra.
A similar effect is observed for low abundant peptides. Indeed, the largest relative gain achieved by MS2Rescore in terms of number of unique identified peptides is seen for the lowest precursor intensities (supplemental Fig. S10, A and B), where traditional rescoring fails to recover most identifications. MS2Rescore is thus not only able to increase the amount of identifications for immunopeptides in general, it can recover peptides previously lost due to low precursor intensities and thus lower-quality spectra or nonoptimal instrument settings.
MS2Rescore is Unbiased to Different HLA Classes
To evaluate the performance of MS2Rescore on HLA class II peptides, another publicly available data set was reanalyzed. However, while for the HLA class I data set human immunopeptides and MaxQuant search engine (13) results were used, for this HLA class II data set, mouse data was searched with PEAKS DB (14). As was the case for HLA class I peptides, MS2Rescore significantly increases the identification rate for HLA class II peptides, by 15% and 57% at the 1% and 0.1% FDR thresholds, respectively (supplemental Fig. S11). These increases are, however, slightly lower than for the HLA class I data set. Moreover, whereas for the HLA class I data conventional rescoring showed a significant increase in comparison with the unrescored search engine results, here, the gain of conventional rescoring in comparison with no rescoring is lower for both identification rate and number of unique identified peptides. This is likely due to (i) the less extensive search engine features that are calculated for the PEAKS DB pipeline in MS2Rescore (supplemental Table S3) and (ii) the fact that PEAKS DB is likely better equipped to identify immunopeptides than MaxQuant due to its de novo–assisted database search (14). Nevertheless, the full MS2Rescore feature set, including peak intensity and retention time predictions, still results in a significantly higher identification rate. Altogether, these results show that MS2Rescore generalizes well across HLA class I and class II immunopeptides and across different species, and can boost the performance of different search engines.
Discussion
By training new peak intensity prediction models, we were able to greatly enhance the immunopeptide identification rate through PSM rescoring. While all newly trained MS2PIP models greatly enhance peak intensity predictions for immunopeptides, the model trained solely on immunopeptides performed best. Even though the immuno-chymotrypsin model contained the same immunopeptide training set, the addition of the chymotrypsin-digested peptides did lower the performance slightly. Similarly, not including chymotrypsin-digested peptides in the training data resulted in lower accuracies for the chymotrypsin-digested peptides. Indeed, immunopeptides are generally much smaller and consequently carry a lower charge state; as a consequence, these immunopeptide-specific MS2PIP models are not able to predict the behavior of longer and more highly charged peptides in the mass spectrometer. While both immuno- and chymotrypsin-digested peptides are considered nontryptic, their properties can be very different, leading to reduced accuracy of MS2PIP peak intensity predictions when a model is applied to a different type of nontryptic peptides. Interestingly, while immuno- and chymotrypsin-digested peptides are antagonistic, immunopeptides and tryptic peptides seem synergistic in terms of training data. This can be explained by the fact that almost 50% of the immunopeptide training data consists of tryptic peptides. Immunopeptides are thus not necessarily nontryptic. However, the actual occurrence of tryptic peptides in immunopeptidomics samples is most likely much lower. This unrepresentatively sized tryptic portion most likely originates from the tryptic bias in current immunopeptidomics workflows. Indeed, in previous studies, tryptic MHC peptide coverage could rise to 70% (42). By training new, nontryptic models of MS2PIP, we take a first step in decreasing this tryptic bias to ultimately be able to analyze an unbiased immunopeptide landscape.
Moreover, by integrating the new immunopeptide model with retention time predictions and search engine features into MS2Rescore, we greatly enhanced the ability of Percolator to rescore immunopeptide PSMs, resulting in a much-improved immunopeptide identification workflow. Furthermore, rescoring drastically increases the number of unique identified peptides, which is of crucial importance for the discovery of potential neo-epitopes for cancer vaccination or xeno-epitopes for anti-bacterial and, to a lesser extent, anti-viral vaccines. In addition, while previously almost no identifications were found at the more confident 0.1% FDR threshold, MS2Rescore allows the FDR threshold to be lowered to 0.1% while retaining 83% of the peptides identified at 1% FDR. This illustrates the large increase in confidence of the identified PSMs that MS2Rescore introduces. Besides the increase in both PSM confidence and identification rate, MS2Rescore has been shown to be unbiased with regard to HLA patterns and CE settings. Most importantly, the relative identification gain introduced by MS2Rescore is even larger for HLA patterns that initially had fewer identifications, showing that MS2Rescore is able to broaden the view on the immunopeptide landscape for traditionally harder-to-identify HLA patterns. By making use of DeepLC retention time features, MS2Rescore is also able to recover peptide identifications that would otherwise have been lost due to lower-quality spectra, and can therefore recover substantial additional identifications for low-abundance peptides. This potentially enables the recovery of biologically relevant neo- or xeno-epitopes that occur less frequently in the sample. Finally, MS2Rescore is able to gain immunopeptide identifications regardless of the search engine used, for both HLA class I and class II peptides, and across different species.
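To make the comparison between the 1% and 0.1% FDR thresholds concrete, the following minimal sketch shows how identifications can be counted at a given threshold using target-decoy q-values. It is an illustration under stated assumptions, not the procedure implemented in Percolator or MS2Rescore: the function names `q_values` and `count_targets`, the simple decoys/targets FDR estimate, and the toy scores are all hypothetical, and tied scores are not treated specially.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Return (score, is_decoy, q_value) tuples sorted by descending score.

    The FDR at each rank is estimated as #decoys / #targets among all PSMs
    ranked at or above that position. The q-value of a PSM is the lowest
    FDR at its own rank or at any lower-scoring rank.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Running FDR estimate while walking down the ranked list.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Make the estimates monotonic by taking the running minimum
    # from the bottom of the list upward.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(score, is_decoy, qv) for (score, is_decoy), qv in zip(ranked, q)]


def count_targets(psms: List[Tuple[float, bool]], threshold: float) -> int:
    """Count target PSMs whose q-value is at or below the FDR threshold."""
    return sum(
        1 for _, is_decoy, qv in q_values(psms) if not is_decoy and qv <= threshold
    )


if __name__ == "__main__":
    # Toy (score, is_decoy) pairs; real rescoring output contains many thousands.
    toy = [(10.0, False), (9.0, False), (8.0, False),
           (7.0, True), (6.0, False), (5.0, True)]
    print(count_targets(toy, 0.01), count_targets(toy, 0.30))
```

With such a function, the fraction of identifications retained when tightening the threshold is simply `count_targets(psms, 0.001) / count_targets(psms, 0.01)`, provided the denominator is nonzero.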
Additionally, MS2Rescore with DeepLC and the new immunopeptide MS2PIP models shows an improved identification rate over the recently published Prosit effort, especially at lower FDR thresholds. As Prosit has been shown to provide more accurate predictions than previous MS2PIP models, it is unlikely that MS2Rescore’s higher performance can be attributed to superior peak intensity predictions. Indeed, the peak intensity prediction accuracies of the new MS2PIP models and of Prosit are highly similar for immunopeptides (supplemental Fig. S7), even when Prosit has been optimized for the right CE. These negligible differences in peak intensity prediction correlations are therefore unlikely to be the reason for the higher performance of MS2Rescore over Prosit. Instead, it is more likely that the main difference in rescoring performance results from the generation of more relevant MS2PIP-, DeepLC-, and search engine–derived features. Indeed, when the majority of the search engine features and all DeepLC retention time features were omitted, reflecting the more limited Prosit feature set, the performance of MS2Rescore drops as well (supplemental Fig. S6). By providing a more extensive feature set, MS2Rescore creates a unique feature space that allows Percolator to separate true from false identifications much better than when provided with limited features without retention time or peak intensity information (Fig. 3). The combination of all these calculated features is therefore likely to be the driver of MS2Rescore’s superior performance.
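As an illustration of what combining peak intensity, retention time, and search engine information into one feature vector can look like, the sketch below computes a Pearson correlation between observed and predicted fragment intensities and an absolute retention time error for a single PSM. This is a simplified, hypothetical example: the function `psm_features` and the feature names `spec_pearson`, `rt_abs_error`, and `search_engine_score` are assumptions made for this illustration and do not reproduce the actual MS2Rescore feature set.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant vector; fall back to 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def psm_features(
    observed_intensities: Sequence[float],
    predicted_intensities: Sequence[float],
    observed_rt: float,
    predicted_rt: float,
    search_engine_score: float,
) -> Dict[str, float]:
    """Assemble a small, hypothetical rescoring feature vector for one PSM."""
    return {
        # Agreement between observed and predicted fragment ion intensities.
        "spec_pearson": pearson(observed_intensities, predicted_intensities),
        # Deviation between observed and predicted retention time.
        "rt_abs_error": abs(observed_rt - predicted_rt),
        # Original search engine score, passed through unchanged.
        "search_engine_score": search_engine_score,
    }


if __name__ == "__main__":
    features = psm_features(
        observed_intensities=[0.1, 0.5, 0.9, 0.3],
        predicted_intensities=[0.2, 0.4, 0.8, 0.3],
        observed_rt=30.0,
        predicted_rt=32.5,
        search_engine_score=85.2,
    )
    print(features)
```

A semi-supervised classifier such as Percolator can then weigh these complementary features jointly, which is the kind of extended feature space discussed above.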
MS2Rescore is freely available under the permissive Apache 2.0 open-source license on GitHub (https://github.com/compomics/ms2rescore) and can easily be installed locally through the cross-platform PyPI Python package as well as with a standalone Windows install script. Both a command line interface and a graphical user interface are available, identification files from various search engines are accepted, and both MS2PIP and DeepLC can handle a variety of modifications, eliminating the need to filter identification files before rescoring. Altogether, these new models show great promise for extending the immunopeptide landscape in existing and future immunopeptidomics experiments.
Data Availability
MS2Rescore is available at https://github.com/compomics/ms2rescore. All additional code used in this work is available at compomics/ms2rescore-immunopeptidomics-manuscript (github.com). The data used for training and evaluation of the newly trained models, the models themselves, and the MS2Rescore output are available on Zenodo at https://doi.org/10.5281/zenodo.6532013. Additional mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (30) partner repository with the data set identifier PXD033868.
Supplemental data
This article contains supplemental data (6, 37, 43, 44, 45, 46, 47, 48, 49, 50).
Conflict of interest
The authors declare that they have no conflict of interest with the contents of the article.
Acknowledgments
NanoLC-MS/MS instruments were supported by the French Proteomic Infrastructure (ProFI FR2048; ANR-10-INBS-08-03).
Funding and additional information
R. B. acknowledges funding from the Vlaams Agentschap Innoveren en Ondernemen under project number HBC.2020.2205; R. G. received funding from the Research Foundation Flanders (FWO) [1S50918N]. S. D. and L. M. acknowledge funding from the European Union’s Horizon 2020 Programme (H2020-INFRAIA-2018-1) [823839]; L. M. acknowledges funding from the Research Foundation Flanders (FWO) [G028821N] and from Ghent University Concerted Research Action [BOF21/GOA/033]. A. D. received funding from the Research Foundation Flanders (FWO) [1SE3722].
Author contributions
A. D., A. H., and R. G. methodology; A. D. software; A. D., R. B., and R. G. validation; A. D. and R. G. formal analysis; A. D. and R. G. writing–original draft; A. D., R. B., A. H., C. C., S. D., L. M., and R. G. writing–review and editing; A. D., R. B., C. C., S. D., L. M., and R. G. funding acquisition; A. D., R. B., S. D., L. M., and R. G. conceptualization; A. D., A. H., C. C., and R. G. investigation; C. C. and L. M. resources; L. M. and R. G. project administration; R. G. supervision.
References
- 1. Sattler S. Advances in Experimental Medicine and Biology. Springer New York LLC; New York: 2017. pp. 3–14.
- 2. Raoufi E., Hemmati M., Eftekhari S., Khaksaran K., Mahmodi Z., Farajollahi M.M., et al. Epitope prediction by novel immunoinformatics approach: a state-of-the-art Review. Int. J. Pept. Res. Ther. 2020;26:1155–1163. doi: 10.1007/s10989-019-09918-z.
- 3. Mayer R.L., Impens F. Immunopeptidomics for next-generation bacterial vaccine development. Trends Microbiol. 2021;29:1034–1045. doi: 10.1016/j.tim.2021.04.010.
- 4. Larsen M.V., Lundegaard C., Lamberth K., Buus S., Lund O., Nielsen M. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction. BMC Bioinformatics. 2007;8:1–12. doi: 10.1186/1471-2105-8-424.
- 5. Zhang H., Lundegaard C., Nielsen M. Pan-specific MHC class I predictors: a benchmark of HLA class I pan-specific prediction methods. Bioinformatics. 2009;25:83–89. doi: 10.1093/bioinformatics/btn579.
- 6. Bassani-Sternberg M., Pletscher-Frankild S., Jensen L.J., Mann M. Mass spectrometry of human leukocyte antigen class I peptidomes reveals strong effects of protein abundance and turnover on antigen presentation. Mol. Cell Proteomics. 2015;14:658–673. doi: 10.1074/mcp.M114.042812.
- 7. Solleder M., Guillaume P., Racle J., Michaux J., Pak H.S., Müller M., et al. Mass spectrometry based immunopeptidomics leads to robust predictions of phosphorylated HLA class I ligands. Mol. Cell Proteomics. 2020;19:390–404. doi: 10.1074/mcp.TIR119.001641.
- 8. Faridi P., Purcell A.W., Croft N.P. In Immunopeptidomics we need a sniper instead of a shotgun. Proteomics. 2018;18. doi: 10.1002/pmic.201700464.
- 9. Pfammatter S., Bonneil E., Lanoix J., Vincent K., Hardy M.-P.P., Courcelles M., et al. Extending the comprehensiveness of immunopeptidome analyses using isobaric peptide labeling. Anal. Chem. 2020;92:9194–9204. doi: 10.1021/acs.analchem.0c01545.
- 10. Purcell A.W., Ramarathinam S.H., Ternette N. Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics. Nat. Protoc. 2019;14:1687–1707. doi: 10.1038/s41596-019-0133-y.
- 11. Eng J.K., McCormack A.L., Yates J.R. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 1994;5:976–989. doi: 10.1016/1044-0305(94)80016-2.
- 12. Craig R., Beavis R.C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004;20:1466–1467. doi: 10.1093/bioinformatics/bth092.
- 13. Cox J., Neuhauser N., Michalski A., Scheltema R.A., Olsen J.V., Mann M. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 2011;10:1794–1805. doi: 10.1021/pr101065j.
- 14. Zhang J., Xin L., Shan B., Chen W., Xie M., Yuen D., et al. Peaks DB: de novo sequencing assisted database search for sensitive and accurate peptide identification. Mol. Cell Proteomics. 2012;11. doi: 10.1074/mcp.M111.010587.
- 15. Jiang J., Natarajan K., Margulies D.H. Advances in Experimental Medicine and Biology. Springer New York LLC; New York: 2019. pp. 21–62.
- 16. Faridi P., Li C., Ramarathinam S.H., Vivian J.P., Illing P.T., Mifsud N.A., et al. A subset of HLA-I peptides are not genomically templated: evidence for cis- and trans-spliced peptide ligands. Sci. Immunol. 2018;3. doi: 10.1126/sciimmunol.aar3947.
- 17. Colaert N., Degroeve S., Helsens K., Martens L. Analysis of the resolution limitations of peptide identification algorithms. J. Proteome Res. 2011;10:5555–5561. doi: 10.1021/pr200913a.
- 18. Verheggen K., Ræder H., Berven F.S., Martens L., Barsnes H., Vaudel M. Anatomy and evolution of database search engines—a central component of mass spectrometry based proteomic workflows. Mass Spectrom. Rev. 2020;39:292–306. doi: 10.1002/mas.21543.
- 19. Bichmann L., Nelde A., Ghosh M., Heumos L., Mohr C., Peltzer A., et al. MHCquant: automated and reproducible data analysis for immunopeptidomics. J. Proteome Res. 2019;18:3876–3884. doi: 10.1021/acs.jproteome.9b00313.
- 20. Dorfer V., Maltsev S., Winkler S., Mechtler K. CharmeRT: boosting peptide identifications by chimeric spectra identification and retention time prediction. J. Proteome Res. 2018;17:2581–2589. doi: 10.1021/acs.jproteome.7b00836.
- 21. Silva A.S.C., Bouwmeester R., Martens L., Degroeve S. Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions. Bioinformatics. 2019;35:1401–1403. doi: 10.1093/bioinformatics/btz383.
- 22. Li K., Jain A., Malovannaya A., Wen B., Zhang B. DeepRescore: leveraging deep learning to improve peptide identification in immunopeptidomics. Proteomics. 2020;20. doi: 10.1002/pmic.201900334.
- 23. Wilhelm M., Zolg D.P., Graber M., Gessulat S., Schmidt T., Schnatbaum K., et al. Deep learning boosts sensitivity of mass spectrometry-based immunopeptidomics. Nat. Commun. 2021;12:3346. doi: 10.1038/s41467-021-23713-9.
- 24. Bouwmeester R., Gabriels R., Hulstaert N., Martens L., Degroeve S. DeepLC can predict retention times for peptides that carry as-yet unseen modifications. Nat. Methods. 2021:1–7. doi: 10.1038/s41592-021-01301-5.
- 25. Gabriels R., Martens L., Degroeve S. Updated MS2PIP web server delivers fast and accurate MS2 peak intensity prediction for multiple fragmentation methods, instruments and labeling techniques. Nucl. Acids Res. 2019;47:W295–W299. doi: 10.1093/nar/gkz299.
- 26. Degroeve S., Maddelein D., Martens L. MS2PIP prediction server: compute and visualize MS2 peak intensity predictions for CID and HCD fragmentation. Nucl. Acids Res. 2015;43:W326–W330. doi: 10.1093/nar/gkv542.
- 27. Degroeve S., Martens L. MS2PIP: a tool for MS/MS peak intensity prediction. Bioinformatics. 2013;29:3199–3203. doi: 10.1093/bioinformatics/btt544.
- 28. Ruiz Cuevas M.V., Hardy M.-P., Holly J., Bonneil É., Durette C., Courcelles M., et al. Most non-canonical proteins uniquely populate the proteome or immunopeptidome. Cell Rep. 2020;34:108815. doi: 10.1016/j.celrep.2021.108815.
- 29. Martens L., Hermjakob H., Jones P., Adamski M., Taylor C., States D., et al. PRIDE: the proteomics identifications database. Proteomics. 2005;5:3537–3545. doi: 10.1002/pmic.200401303.
- 30. Perez-Riverol Y., Csordas A., Bai J., Bernal-Llinares M., Hewapathirana S., Kundu D.J., et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucl. Acids Res. 2019;47:D442–D450. doi: 10.1093/nar/gky1106.
- 31. Hulstaert N., Shofstahl J., Sachsenberg T., Walzer M., Barsnes H., Martens L., et al. ThermoRawFileParser: modular, scalable, and cross-platform RAW file conversion. J. Proteome Res. 2020;19:537–542. doi: 10.1021/acs.jproteome.9b00328.
- 32. Deutsch E.W., Perez-Riverol Y., Carver J., Kawano S., Mendoza L., Van Den Bossche T., et al. Universal spectrum identifier for mass spectra. Nat. Methods. 2021;18:768–770. doi: 10.1038/s41592-021-01184-6.
- 33. Chen T., Guestrin C. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; New York, NY: 2016.
- 34. Bergstra J., Yamins D., Cox D.D. 30th international conference on machine learning. ICML. 2013;2013:115–123.
- 35. Granholm V., Kim S., Navarro J.C.F., Sjölund E., Smith R.D., Käll L. Fast and accurate database searches with MS-GF+percolator. J. Proteome Res. 2014;13:890–897. doi: 10.1021/pr400937n.
- 36. Tyanova S., Temu T., Cox J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protoc. 2016;11:2301–2319. doi: 10.1038/nprot.2016.136.
- 37. Sarkizova S., Klaeger S., Le P.M., Li L.W., Oliveira G., Keshishian H., et al. A large peptidome dataset improves HLA class I epitope prediction across most of the human population. Nat. Biotechnol. 2020;38:199–209. doi: 10.1038/s41587-019-0322-9.
- 38. Andreatta M., Alvarez B., Nielsen M. GibbsCluster: unsupervised clustering and alignment of peptide sequences. Nucl. Acids Res. 2017;45:W458–W463. doi: 10.1093/nar/gkx248.
- 39. Hunter J.D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 2007;9:90–95.
- 40. Waskom M. seaborn: statistical data visualization. J. Open Source Softw. 2021;6:3021.
- 41. Bittremieux W. Spectrum-utils: a Python package for mass spectrometry data processing and visualization. Anal. Chem. 2020;92:659–661. doi: 10.1021/acs.analchem.9b04884.
- 42. Chen R., Fauteux F., Foote S., Stupak J., Tremblay T.L., Gurnani K., et al. Chemical derivatization strategy for extending the identification of MHC class I immunopeptides. Anal. Chem. 2018;90:11409–11416. doi: 10.1021/acs.analchem.8b02420.
- 43. Racle J., Michaux J., Rockinger G.A., Arnaud M., Bobisse S., Chong C., et al. Robust prediction of HLA class II epitopes by deep motif deconvolution of immunopeptidomes. Nat. Biotechnol. 2019;37:1283–1286. doi: 10.1038/s41587-019-0289-6.
- 44. Chong C., Marino F., Pak H., Racle J., Daniel R.T., Müller M., et al. High-throughput and sensitive immunopeptidomics platform reveals profound interferon γ-mediated remodeling of the human leukocyte antigen (HLA) ligandome. Mol. Cell Proteomics. 2018;17:533–548. doi: 10.1074/mcp.TIR117.000383.
- 45. Gfeller D., Guillaume P., Michaux J., Pak H.-S., Daniel R.T., Racle J., et al. The length distribution and multiple specificity of naturally presented HLA-I ligands. J. Immunol. 2018;201:3705–3716. doi: 10.4049/jimmunol.1800914.
- 46. Bassani-Sternberg M., Bräunlein E., Klar R., Engleitner T., Sinitcyn P., Audehm S., et al. Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry. Nat. Commun. 2016;7:1–16. doi: 10.1038/ncomms13404.
- 47. Wang D., Eraslan B., Wieland T., Hallström B., Hopf T., Zolg D.P., et al. A deep proteome and transcriptome abundance atlas of 29 healthy human tissues. Mol. Syst. Biol. 2019;15. doi: 10.15252/msb.20188503.
- 48. Bassani-Sternberg M., Chong C., Guillaume P., Solleder M., Pak H.S., Gannon P.O., et al. Deciphering HLA-I motifs across HLA peptidomes improves neo-antigen predictions and identifies allostery regulating HLA specificity. PLOS Comput. Biol. 2017;13:1–28. doi: 10.1371/journal.pcbi.1005725.
- 49. Marino F., Semilietof A., Michaux J., Pak H.S., Coukos G., Müller M., et al. Biogenesis of HLA ligand presentation in immune cells upon activation reveals changes in peptide length preference. Front. Immunol. 2020;11:1981. doi: 10.3389/fimmu.2020.01981.
- 50. Gravina F., Sanchuki H.S., Rodrigues T.E., Gerhardt E.C.M., Pedrosa F.O., Souza E.M., et al. Proteome analysis of an Escherichia coli ptsN-null strain under different nitrogen regimes. J. Proteomics. 2018;174:28–35. doi: 10.1016/j.jprot.2017.12.006.