Table 1.
Summary of existing ML-based models for thermophilic protein prediction.
Author (year) | Classifier a | Features b | Evaluation strategy c | Web server availability d |
---|---|---|---|---|
Zhang et al.31 | PLS | AAC | 5CV/IND | No |
Zhang et al.32 | LogitBoost | AAC | 5CV/IND | No |
Gromiha et al.27 | NN | AAC | 5CV/IND | No |
Montanucci et al.21 | SVM | AAC, DPC | 5CV | Not accessible |
Lin et al.20 | SVM | AAC, GGAC | Jackknife | Yes |
Wang et al.24 | SVM | AAC, DPC, PCP, CTD | 5CV | No |
Nakariyakul et al.28 | SVM | AAC, DPC | 5CV/IND | No |
Zuo et al.33 | KNN | AAC | Jackknife | Not accessible |
Wang et al.30 | SVM | AAC, GGAC | 5CV/IND | No |
Fan et al.25 | SVM | AAC, pka, PSSM | 10CV/IND | No |
Tang et al.29 | SVM | k-mer | 5CV | No |
Feng et al.26 | SVM | ACC, DPC, PCP, RAAC | 10CV/IND | No |
Charoenkwan et al. (this study) | SCM | DPS | 10CV/IND | Yes |
a KNN: k-nearest neighbor; NN: neural network; PLS: partial least squares regression; SCM: scoring card method; SVM: support vector machine.
b AAC: amino acid composition; CTD: composition-transition-distribution; DPC: dipeptide composition; DPS: dipeptide propensity scores; GGAC: g-gap dipeptide composition; k-mer: fragment-based technique; pka: acid dissociation constant; PCP: physicochemical properties; PseACC: pseudo amino acid composition; PSSM: position-specific scoring matrix; RAAC: reduced amino acid composition; TC: tripeptide composition.
c 5CV: fivefold cross-validation; 10CV: tenfold cross-validation; Jackknife: jackknife cross-validation; IND: independent test.
d Not accessible: the web server was not functional during the preparation of this manuscript.
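
To make the two most common feature types in the table concrete, the following is a minimal sketch of how AAC and DPC are typically computed from a protein sequence. It is an illustration only, not the implementation used by any of the listed studies. The function names `aac` and `dpc` are hypothetical, and the choice to skip residues outside the 20 standard amino acids is an assumption.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino acid composition: fraction of each of the 20 residues (20 features)."""
    counts = Counter(sequence.upper())
    # Assumption: non-standard residues (e.g. X, B, Z) are ignored.
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def dpc(sequence):
    """Dipeptide composition: fraction of each adjacent residue pair (400 features)."""
    seq = sequence.upper()
    # Overlapping adjacent pairs; pairs with a non-standard residue are skipped.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no valid dipeptides")
    counts = Counter(pairs)
    return {
        a + b: counts[a + b] / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }
```

For a sequence such as `"ACAC"`, `aac` assigns 0.5 each to A and C, while `dpc` assigns 2/3 to the dipeptide AC and 1/3 to CA. Feature vectors of this kind are what classifiers such as SVM or KNN in the table take as input.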