Table 1.
Methods for fragment ion intensity prediction
Year | Name | Neural network details | Comments | Citations |
---|---|---|---|---|
2005 | PeptideART | feedforward network | engineered peptide feature inputs, outputs of fragment probabilities | Arnold et al. (2005), Li et al. (2011) |
2017 | pDeep | bidirectional LSTM, multi-output regression; Keras v1.2.1, TensorFlow 0.12.1 | limited to peptides of up to 20 amino acids | Zhou et al. (2017) |
2018 | DeepMatch | bidirectional LSTM, weak supervision | direct integration with peptide spectrum matching algorithm outperforms COMET | Schoenholz et al. (2018) |
2018a | Prosit (Latin for “of benefit”) | encoder: bidirectional GRU with dropout and attention, parallel encoding of precursor charge and collision energy; decoder: bidirectional GRU with dropout and time-distributed dense; multi-output regression; Keras 2.1.1 and TensorFlow 1.4.0 | over half a million training peptides and 21 million MS/MS spectra at multiple collision energies, predicts MS/MS spectra and retention time, integration with database search to decrease FDR, integration with Skyline (MacLean et al., 2010), web tool https://www.proteomicsdb.org/prosit/ | Gessulat et al. (2019) |
2019a | DeepMass | encoder: three bidirectional LSTM layers with 385 units each; decoder: four fully connected dense layers with 768 units each; multi-output regression; TensorFlow v1.7.0 | predicts fragmentation with accuracy similar to that of repeated measurements of the same peptide's fragmentation; predicted spectra used for DIA data analysis perform nearly equivalently to spectral libraries | Tiwary et al. (2019) |
2019 | pDeep2 | bidirectional LSTM, multi-output regression | original pDeep model adapted to predict spectra of modified peptides using transfer learning | Zeng et al. (2019) |
2019a | N/A | encoder: bidirectional LSTM with dropout; iRT model, two dense layers, tanh, single output regression. Charge state distribution model, two dense layers, softmax activation, multi-output regression length 5 for charge 1–5. Spectral prediction model, a time-distributed dense layer with sigmoid activation function, multi-output regression; Keras | predicts retention time, precursor charge state distribution, and fragment ion spectra | Guan et al. (2019) |
2019 | MS2CNN | basic CNN architecture, engineered peptide features as input with a CNN kernel size of 4 | better than pDeep for prediction of spectra from +3 charge state peptide precursors | Lin et al. (2019) |
2020a | DeepDIA | hybrid CNN and bidirectional LSTM: CNN first extracts features from pairs of amino acids, then LSTM, then dense layer. Multi-output regression of the b/y ions, including water/ammonia losses. Keras 2.2.4 and TensorFlow 1.11 | predicts MS/MS spectra and indexed retention time (iRT). Slightly more protein identifications from DIA analysis of the HeLa proteome than with libraries from DDA or Prosit | Yang et al. (2020) |
2020 | N/A | sequence-to-sequence CNN | full-spectrum prediction, not only fragment ions | Liu et al. (2020) |
Abbreviations are as follows: FDR, false discovery rate; N/A, not applicable.
a Indicates methods that predict other factors apart from fragment ion spectra.
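
Several entries in Table 1 describe their output as "multi-output regression", meaning the network emits one intensity value per candidate fragment ion rather than a single number. The sketch below illustrates how such an output vector might be laid out for one peptide. It is a minimal illustration only: the function name `fragment_ion_labels`, the restriction to b and y ions, the default maximum fragment charge of 2, and the label format are all assumptions made for this example and are not taken from any of the tools listed.

```python
from typing import List


def fragment_ion_labels(peptide: str, max_fragment_charge: int = 2) -> List[str]:
    """List the b/y fragment ions that one output vector would cover.

    Hypothetical helper for illustration; real tools define their own
    ion types, charge ranges, and ordering.
    """
    # A peptide of n residues has n - 1 backbone cleavage sites.
    n_sites = len(peptide) - 1
    labels = []
    for site in range(1, n_sites + 1):
        for ion_type in ("b", "y"):
            for charge in range(1, max_fragment_charge + 1):
                # One regression target (a predicted intensity) per label.
                labels.append(f"{ion_type}{site}^{charge}+")
    return labels


labels = fragment_ion_labels("PEPTIDEK")
print(len(labels))  # 7 sites x 2 ion types x 2 charges = 28 targets
print(labels[:4])   # ['b1^1+', 'b1^2+', 'y1^1+', 'y1^2+']
```

Under these assumptions, a model trained by multi-output regression would predict a vector of this length for each peptide, with each position holding the intensity of the corresponding fragment ion.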