2021 Jul 1;297(2):100931. doi: 10.1016/j.jbc.2021.100931

Table 5.

Performance (%) of random forest classifiers in predicting the presence of CBM (a)

Performance metric  Validation                                                                        Testing
                    All 5933 features  Top 50 features  44 features (no C-terminus)  Top 20 features  Top 20 features
Accuracy            90.8 ± 2.1         90.9 ± 2.1       88.2 ± 2.5                   89.3 ± 2.4       89.7
Sensitivity         93.7 ± 2.8         92.2 ± 2.9       89.6 ± 3.4                   90.0 ± 3.2       95.7
Specificity         87.9 ± 3.5         89.7 ± 3.3       86.9 ± 3.7                   88.5 ± 3.6       87.4
MCC                 0.80 ± 0.05        0.81 ± 0.05      0.76 ± 0.05                  0.78 ± 0.05      0.68
(a) Validation and testing were performed on the 90% and 10% portions of the dataset, respectively. Validation performance is reported as the mean ± 1 standard deviation over 100 repetitions of 5-fold cross-validation.
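
For readers who want to reproduce figures of this kind, the following is a minimal sketch of how the four metrics in the table can be computed from the counts of a binary confusion matrix, and how repeated runs can be summarized as mean ± standard deviation. The function names `binary_metrics` and `summarize` are hypothetical, and the use of the sample standard deviation is an assumption; the source does not state which estimator was used. Note that MCC is a correlation coefficient in the range -1 to 1, not a percentage.

```python
import math
import statistics


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity (in %) and MCC.

    Assumes both classes are present, so that tp + fn > 0 and tn + fp > 0.
    """
    total = tp + fp + tn + fn
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)  # true-positive rate
    specificity = 100.0 * tn / (tn + fp)  # true-negative rate
    # Matthews correlation coefficient; set to 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


def summarize(values):
    """Return (mean, sample standard deviation) over repeated runs.

    Assumption: sample (n - 1) standard deviation; needs at least two values.
    """
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Illustrative counts only; these are not taken from the study.
    print(binary_metrics(tp=45, fp=10, tn=40, fn=5))
    # Illustrative accuracies from three hypothetical cross-validation runs.
    print(summarize([90.0, 92.0, 88.0]))
```

In a repeated cross-validation setting, `binary_metrics` would be evaluated once per repetition (or per fold) and each metric's values passed to `summarize` to obtain entries of the form "mean ± SD".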