2017 Aug 29;7:9751. doi: 10.1038/s41598-017-10203-6

Table 2.

Performance evaluation of EC subclass-specific RF models using three different validation methods (10-fold cross-validation, splitting and testing, and blind set).

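Validation on test sets, hybrid fingerprint. Columns marked [without] refer to the RF model trained on the without-upsampling dataset; columns marked [with] refer to the RF model trained on the with-upsampling dataset.

| EC class | Validation method | TPR (%) [without] | TNR (%) [without] | PPV (%) [without] | ACC (%) [without] | MCC [without] | TPR (%) [with] | TNR (%) [with] | PPV (%) [with] | ACC (%) [with] | MCC [with] |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EC1 | CV-10 FOLD | 62.41 | 97.1 | 65.11 | 95.9 | 0.61 | 97.03 | 99.82 | 97.17 | 99.67 | 0.97 |
| EC1 | Splitting and Testing | 82.14 | 97.02 | 75.71 | 95.26 | 0.75 | 97.09 | 99.82 | 97.02 | 99.66 | 0.97 |
| EC1 | Blind Set | 87.62 | 98.18 | 95.07 | 97.5 | 0.89 | 83.17 | 98.73 | 83.67 | 97.3 | 0.81 |
| EC2 | CV-10 FOLD | 55.65 | 93.39 | 56.96 | 89.27 | 0.50 | 86.67 | 98.52 | 86.26 | 97.33 | 0.85 |
| EC2 | Splitting and Testing | 66.94 | 94.23 | 64.75 | 90.34 | 0.6 | 85.36 | 98.44 | 85.83 | 97.17 | 0.84 |
| EC2 | Blind Set | 81.76 | 94.36 | 81.75 | 91.02 | 0.76 | 83.09 | 96.12 | 80.95 | 92.86 | 0.77 |
| EC3 | CV-10 FOLD | 65.38 | 97.38 | 78.82 | 96.3 | 0.68 | 97.04 | 99.63 | 97 | 99.34 | 0.97 |
| EC3 | Splitting and Testing | 88.77 | 96.56 | 93.91 | 93.93 | 0.87 | 95.4 | 99.42 | 95.3 | 98.96 | 0.95 |
| EC3 | Blind Set | 95 | 97.5 | 94.44 | 96.3 | 0.92 | 95 | 97.5 | 94.44 | 96.3 | 0.92 |
| EC4 | CV-10 FOLD | 59.21 | 88.86 | 69.42 | 86.27 | 0.53 | 91.86 | 98.84 | 91.47 | 97.96 | 0.9 |
| EC4 | Splitting and Testing | 70 | 76.98 | 48.91 | 73.15 | 0.34 | 88.83 | 98.48 | 89.06 | 97.26 | 0.87 |
| EC4 | Blind Set | 78.57 | 82.5 | 80.55 | 83.33 | 0.63 | 80.83 | 91.22 | 75 | 86.54 | 0.67 |
| EC5* | CV-10 FOLD | 76.67 | 83.54 | 85 | 91.16 | 0.7 | 95.62 | 98.91 | 95.93 | 98.25 | 0.95 |
| EC5* | Splitting and Testing | 88.89 | 75 | 93.75 | 86.36 | 0.62 | 97.77 | 99.39 | 97.5 | 99 | 0.97 |
| EC6* | CV-10 FOLD | 95.74 | 93.77 | 92.61 | 95.16 | 0.89 | 97.78 | 99.44 | 97.79 | 99.11 | 0.97 |
| EC6* | Splitting and Testing | 95 | 95 | 90 | 92.86 | 0.85 | 98 | 99.46 | 97.78 | 99.11 | 0.97 |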

*For the EC5 and EC6 classes, validation on the blind set could not be performed because these classes were represented by too few molecules. For the models trained on the with-upsampling dataset, the average accuracy for cross-validation, splitting and testing, and the blind set was 98.61%, 98.52%, and 93.25%, respectively.
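
The averages quoted in this footnote can be checked against the table. The sketch below assumes they are plain arithmetic means of the ACC (%) values reported for the with-upsampling models, taken over the EC classes available for each validation method; the dictionary and function names are illustrative and not from the original study.

```python
from statistics import mean

# ACC (%) values copied from the with-upsampling columns of Table 2.
# Assumption: the footnote averages are unweighted means over EC classes.
ACC_WITH_UPSAMPLING = {
    "CV-10 FOLD": [99.67, 97.33, 99.34, 97.96, 98.25, 99.11],             # EC1-EC6
    "Splitting and Testing": [99.66, 97.17, 98.96, 97.26, 99.0, 99.11],   # EC1-EC6
    "Blind Set": [97.3, 92.86, 96.3, 86.54],                              # EC1-EC4 only
}


def average_accuracy(acc_by_method):
    """Return the unweighted mean accuracy (%) for each validation method."""
    return {method: mean(values) for method, values in acc_by_method.items()}


if __name__ == "__main__":
    for method, avg in average_accuracy(ACC_WITH_UPSAMPLING).items():
        print(f"{method}: {avg:.4f}%")
```

Under this assumption, the cross-validation and blind-set means agree with the footnote values. The splitting-and-testing mean comes out slightly above 98.52, which is consistent with the footnote value having been truncated rather than rounded.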

TPR = True Positive Rate or Sensitivity, TNR = True Negative Rate or Specificity, PPV = Positive Predictive Value or Precision, ACC = Accuracy, MCC = Matthews correlation coefficient.
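
For reference, the five metrics defined above can be computed from the counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Compute TPR, TNR, PPV, ACC (as percentages) and MCC from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives. Undefined ratios (zero denominator)
    are returned as 0.0, which is a convention chosen for this sketch.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "TPR": 100 * ratio(tp, tp + fn),                  # sensitivity
        "TNR": 100 * ratio(tn, tn + fp),                  # specificity
        "PPV": 100 * ratio(tp, tp + fp),                  # precision
        "ACC": 100 * ratio(tp + tn, tp + fp + tn + fn),   # accuracy
        "MCC": ratio(tp * tn - fp * fn, mcc_denominator), # ranges from -1 to 1
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.2f}")
```

MCC is reported on a -1 to 1 scale rather than as a percentage, which matches how the table lists it alongside the percentage-valued metrics.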