Table 2.
Methods for prediction of retention time
| Year | Name | Neural network details | Comment | Citation |
|---|---|---|---|---|
| 2003 | N/A | fully connected neural network with 2 hidden layers, 20 inputs, and 1 output | 95% of retention predictions within 10% of the true value | Petritis et al. (2003) |
| 2006 | N/A | fully connected neural network with 16 inputs, 4 hidden neurons, and 1 output | mean prediction error ~5.8% | Shinoda et al. (2006) |
| 2006 | N/A | 1,052 input nodes, 24 hidden nodes, 1 output node | average elution time precision of 1.5% | Petritis et al. (2006) |
| 2017 | DeepRT | feature extraction by LSTM and CNN, retention prediction from bagged ensemble of standard prediction models. Theano (0.9.0 dev1), Keras (1.0.1), and sklearn (0.17.1) | 95% of retention predictions within 28 min versus best benchmark of 45.8 min | Ma et al. (2017) |
| 2018 | DeepRT+ | capsule network (a type of CNN) | 95% of retention predictions within 15.7 min versus DeepRT at 24.7 min or best benchmark of 45.8 min | Ma et al. (2018) |
| 2019 | Prosit^a (Latin for "of benefit") | encoder: bidirectional GRU with dropout and attention, parallel encoding of precursor charge and collision energy; decoder: bidirectional GRU with dropout and time-distributed dense; multi-output regression. Keras 2.1.1 and TensorFlow 1.4.0 | over half a million training peptides and 21 million MS/MS spectra at multiple collision energies, predicts MS/MS spectra and retention time, integration with database search to decrease FDR, integration with Skyline (cite), web tool https://www.proteomicsdb.org/prosit/ | Gessulat et al. (2019) |
| 2019 | DeepMass^a | encoder: three bidirectional LSTMs with 385 units each; decoder: four fully connected dense layers with 768 units each; multi-output regression. TensorFlow v1.7.0 | predicted fragmentation with accuracy similar to repeated measurement of the same peptide's fragmentation; predicted spectra used for DIA data analysis were nearly equivalent to spectral libraries | Tiwary et al. (2019) |
| 2019 | N/A | encoder: bidirectional LSTM with dropout; iRT model: two dense layers with tanh activation, single-output regression. Charge state distribution model: two dense layers with softmax activation, multi-output regression of length 5 for charges 1–5. Spectral prediction model: a time-distributed dense layer with sigmoid activation, multi-output regression. Keras | predicts retention time, precursor charge state distribution, and fragment ion spectra | Guan et al. (2019) |
| 2020 | DeepDIA^a | hybrid CNN and bidirectional LSTM: CNN first extracts features from pairs of amino acids, then LSTM, then dense layer. Multi-output regression of the b/y ions, including water/ammonia losses. Keras 2.2.4 and TensorFlow 1.11 | predicts MS/MS spectra and indexed retention time (iRT). Slightly more protein identifications from DIA analysis of the HeLa proteome than libraries from DDA or Prosit | Yang et al. (2020) |
| 2020 | DeepLC | hybrid network: three CNN input paths (1) one-hot amino acid sequence, (2) amino acid pairs, and (3) amino acid composition. One dense input of peptide features. Inputs concatenated and processed through dense layers | predicts retention time for previously unseen peptide modifications | Bouwmeester et al. (2020b) |
| 2020 | AutoRT | ensemble of the 10 best CNN and LSTM networks returned by transfer learning. Keras 2.2.4 and TensorFlow 1.13.1 | used predicted retention time as a filter to assess identification strategies for mutated peptides | Wen et al. (2020b) |
Abbreviations are as follows: CNN, convolutional neural network; DDA, data-dependent acquisition; DIA, data-independent acquisition; FDR, false discovery rate; GRU, gated recurrent unit; iRT, indexed retention time; LSTM, long short-term memory; N/A, not applicable.
^a Indicates methods that predict other factors beyond retention time.
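Several of the tabulated networks (e.g., DeepLC's first CNN input path) take a zero-padded one-hot encoding of the peptide sequence as input. The sketch below illustrates that representation; it is a generic illustration, not code from any cited method, and the maximum length of 50 residues and the 20-letter alphabet are assumptions.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one column each
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_peptide(sequence: str, max_len: int = 50) -> np.ndarray:
    """Encode a peptide as a (max_len, 20) one-hot matrix, zero-padded.

    max_len is an assumed fixed input length; real tools choose it to
    cover the longest peptide in their training data.
    """
    if len(sequence) > max_len:
        raise ValueError("peptide longer than max_len")
    matrix = np.zeros((max_len, len(AMINO_ACIDS)), dtype=np.float32)
    for pos, aa in enumerate(sequence):
        matrix[pos, AA_INDEX[aa]] = 1.0  # one "hot" entry per residue
    return matrix

encoded = one_hot_peptide("PEPTIDE")
print(encoded.shape)       # (50, 20)
print(int(encoded.sum()))  # 7, one nonzero entry per residue
```

A matrix of this shape can feed a 1D CNN (convolving over sequence positions) or a bidirectional LSTM/GRU (reading residues in order), which is why the same encoding recurs across the hybrid architectures in the table.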