
Table 2. Methods for prediction of retention time

Year | Name | Neural network details | Comment | Citation
2003 | N/A | fully connected neural network with 2 hidden layers, 20 inputs, and one output | 95% of retention predictions within 10% of the true value | Petritis et al. (2003)
2006 | N/A | fully connected neural network with 16 inputs, 4 hidden neurons, and 1 output | mean prediction error ~5.8% | Shinoda et al. (2006)
2006 | N/A | 1,052 input nodes, 24 hidden nodes, 1 output node | average elution time precision of 1.5% | Petritis et al. (2006)
2017 | DeepRT | feature extraction by LSTM and CNN, retention prediction from a bagged ensemble of standard prediction models; Theano (0.9.0 dev1), Keras (1.0.1), and sklearn (0.17.1) | 95% of retention predictions within 28 min versus best benchmark of 45.8 min | Ma et al. (2017)
2018 | DeepRT+ | capsule network (a type of CNN) | 95% of retention predictions within 15.7 min versus DeepRT at 24.7 min or best benchmark of 45.8 min | Ma et al. (2018)
2019 | Prosit^a (Latin for "of benefit") | encoder: bidirectional GRU with dropout and attention, parallel encoding of precursor charge and collision energy; decoder: bidirectional GRU with dropout and time-distributed dense; multi-output regression; Keras 2.1.1 and TensorFlow 1.4.0 | over half a million training peptides and 21 million MS/MS spectra at multiple collision energies; predicts MS/MS spectra and retention time; integration with database search to decrease FDR; integration with Skyline (cite); web tool https://www.proteomicsdb.org/prosit/ | Gessulat et al. (2019)
2019 | DeepMass^a | encoder: three bidirectional LSTMs with 385 units each; decoder: four fully connected dense layers with 768 units each; multi-output regression; TensorFlow v.1.7.0 | predicted fragmentation with accuracy similar to repeated measurement of the same peptide's fragmentation; predicted spectra used for DIA data analysis were nearly equivalent to spectral libraries | Tiwary et al. (2019)
2019 | N/A | encoder: bidirectional LSTM with dropout; iRT model: two dense layers, tanh, single-output regression; charge state distribution model: two dense layers, softmax activation, multi-output regression of length 5 for charges 1–5; spectral prediction model: a time-distributed dense layer with sigmoid activation, multi-output regression; Keras | predicts retention time, precursor charge state distribution, and fragment ion spectra | Guan et al. (2019)
2020 | DeepDIA^a | hybrid CNN and bidirectional LSTM: CNN first extracts features from pairs of amino acids, then LSTM, then dense layer; multi-output regression of the b/y ions, including water/ammonia losses; Keras 2.2.4 and TensorFlow 1.11 | predicts MS/MS spectra and indexed retention time (iRT); slightly more protein identifications from DIA analysis of the HeLa proteome than libraries from DDA or Prosit | Yang et al. (2020)
2020 | DeepLC | hybrid network with three CNN input paths: (1) one-hot amino acid sequence, (2) amino acid pairs, and (3) amino acid composition; plus one dense input of peptide features; inputs concatenated and processed through dense layers | predicts retention time for previously unseen peptide modifications | Bouwmeester et al. (2020b)
2020 | AutoRT | ensemble of the 10 best CNN and LSTM networks returned by transfer learning; Keras 2.2.4 and TensorFlow 1.13.1 | used predicted retention time as a filter to assess identification strategies for mutated peptides | Wen et al. (2020b)

Abbreviations are as follows: FDR, false discovery rate; N/A, not applicable.

^a Indicates methods that predict other factors beyond retention time.
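Most of the methods tabulated above share a common skeleton: the peptide sequence is encoded residue by residue, a recurrent or convolutional encoder summarizes it, and a small dense head regresses retention time. The sketch below is a generic, minimal illustration of that skeleton in tf.keras; it is not a reimplementation of any cited tool, and the alphabet, layer sizes, maximum length, and the tiny synthetic peptide/retention-time pairs are assumptions made only for the example.

# Minimal sketch (illustrative assumptions throughout): one-hot encoded peptides,
# a bidirectional LSTM encoder, and a dense head for single-output regression of
# retention time, in the spirit of the encoder-plus-dense designs summarized in Table 2.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 canonical residues (assumption: no modifications)
MAX_LEN = 30                            # illustrative maximum peptide length

def encode_peptide(sequence: str) -> np.ndarray:
    """One-hot encode a peptide, zero-padded to MAX_LEN."""
    x = np.zeros((MAX_LEN, len(AMINO_ACIDS)), dtype=np.float32)
    for i, aa in enumerate(sequence[:MAX_LEN]):
        x[i, AMINO_ACIDS.index(aa)] = 1.0
    return x

def build_rt_model() -> Model:
    """Bidirectional LSTM encoder followed by dense layers; single-output regression."""
    inputs = layers.Input(shape=(MAX_LEN, len(AMINO_ACIDS)))
    x = layers.Bidirectional(layers.LSTM(64))(inputs)   # sequence encoder
    x = layers.Dropout(0.2)(x)
    x = layers.Dense(32, activation="relu")(x)
    outputs = layers.Dense(1)(x)                        # predicted retention time
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mae")         # MAE in the units of the training RTs
    return model

if __name__ == "__main__":
    # Tiny synthetic example; real tools train on tens of thousands to millions of peptides.
    peptides = ["LGEYGFQNALIVR", "AEFVEVTK", "YLYEIARR"]
    rts = np.array([55.2, 32.1, 40.7], dtype=np.float32)  # made-up retention times (min)
    X = np.stack([encode_peptide(p) for p in peptides])
    model = build_rt_model()
    model.fit(X, rts, epochs=2, verbose=0)
    print(model.predict(X, verbose=0).ravel())

In this framing, the published methods differ mainly in the encoder (CNN, LSTM, GRU, capsule, or hybrids), in whether additional inputs such as modifications, precursor charge, or collision energy are encoded in parallel, and in whether the head is a single-output retention-time regressor or a multi-output regressor that also predicts fragment ion spectra.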