2022 Oct 10;36:gzac009. doi: 10.1093/protein/gzac009

Table I. Continued.

| Section | Input | Predictive model | Output | Performance | Paper |
|---|---|---|---|---|---|
| 3.4 | Sequence; Substrate structure; Substrate physicochemical parameter | Gradient boost model regression | KM | MSE = 0.80 (log10-scale); R² = 0.42 | 2021-Kroll (Kroll et al., 2021) |
| 3.4 | Sequence; Substrate SMILES | CNN | kcat | Pearson r = 0.94 (log10-scale) | 2022-Li (Li et al., 2022) |
| 3.4 | Sequence; Substrate SMILES | Feed forward network | KD classifier | AUROC = 0.89 | 2022-Goldman (Goldman et al., 2022) |
| 3.5 | Sequence | Ridge regression | Fitness | MSE = 0.74 | 2020-Favor (Favor and Jayapurna, 2020) |
| 3.5 | Sequence | CNN; Tweedie regression | Fitness | Spearman ρ = 0.61 | 2021-Wittmann (Wittmann et al., 2021b) |
| 3.5 | Sequence | Iterative MSA and conservation analysis | Conserved AA | N/A | 2021-Teze (Teze et al., 2021) |
| 3.5 | Sequence | RNN | Fitness classifier; Fitness regressor | AUROC = 0.88; Spearman ρ = 0.91 | 2021-Luo (Luo et al., 2021) |
| 3.5 | Sequence | Regularized linear regression | Fitness | Spearman ρ = 0.93 | 2021-Biswas (Biswas et al., 2021) |
| 3.5 | Sequence | Ridge regression | Fitness | Spearman ρ ~ 0.66 | 2022-Hsu (Hsu et al., 2022) |
| 3.6 | Sequence | Generative adversarial network | Artificial enzyme sequence | 24% with catalytic activity | 2021-Repecka (Repecka et al., 2021) |
| 3.6 | Sequence | Protein language model | Artificial enzyme sequence | AUC = 0.85 | 2021-Madani (Madani et al., 2021) |
| 3.6 | Sequence | Direct coupling statistical analysis of sequence MSA | Artificial enzyme sequence | Hit rate = 30% | 2020-Russ (Russ et al., 2020) |
| 3.6 | Sequence | Variational autoencoder model of Blast sequence | Artificial enzyme sequence | Pearson R² = 0.99 | 2022-Giessel (Giessel et al., 2022) |

(a) For each research work, only the best-performing models are shown. Abbreviations: artificial neural network (ANN), convolutional neural network (CNN), deep neural network (DNN), recurrent neural network (RNN).

(b) Q (von der Esch et al., 2019): Leave-one-out cross-fold validation score.
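
For readers comparing entries in the Performance column, the following minimal sketch shows how the regression and correlation metrics named there (MSE on a log10 scale, R², Pearson r, Spearman ρ) are conventionally computed. The function names are illustrative assumptions and are not taken from any of the cited works; the individual papers may differ in details such as data splits or tie handling.

```python
import math


def mse_log10(y_true, y_pred):
    """Mean squared error computed after log10-transforming both lists."""
    log_true = [math.log10(v) for v in y_true]
    log_pred = [math.log10(v) for v in y_pred]
    return sum((t - p) ** 2 for t, p in zip(log_true, log_pred)) / len(log_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))
```

Because kinetic parameters such as KM and kcat span several orders of magnitude, errors are typically reported on log10-transformed values, as noted for the 2021-Kroll and 2022-Li entries. Spearman ρ depends only on the ordering of predictions, which is why it is common for the fitness-ranking tasks in Section 3.5.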