Table 1.
Model performance in 5-fold cross-validation
| Model | CA | SE | SP | AUC | MCC |
|---|---|---|---|---|---|
| NB-ExtFP | 0.8244 | 0.8792 | 0.7697 | 0.8633 | 0.6528 |
| KNN-ExtFP | 0.8693 | 0.9270 | 0.8118 | 0.9171 | 0.7437 |
| RF-ExtFP | 0.8750 | 0.8680 | 0.8820 | 0.9450 | 0.7501 |
| SVM-ExtFP | 0.8834 | 0.9270 | 0.8399 | 0.9407 | 0.7698 |
| NB-MACCSFP | 0.7921 | 0.8146 | 0.7697 | 0.8532 | 0.5849 |
| KNN-MACCSFP | 0.8707 | 0.9045 | 0.8371 | 0.9302 | 0.7433 |
| RF-MACCSFP | 0.8693 | 0.8961 | 0.8427 | 0.9475 | 0.7398 |
| SVM-MACCSFP | 0.8665 | 0.9410 | 0.7921 | 0.9153 | 0.7414 |
| NB-PubChemFP | 0.7950 | 0.8371 | 0.7528 | 0.8544 | 0.5920 |
| KNN-PubChemFP | 0.8539 | 0.8764 | 0.8315 | 0.9044 | 0.7086 |
| RF-PubChemFP | 0.8652 | 0.8961 | 0.8343 | 0.9408 | 0.7317 |
| SVM-PubChemFP | 0.8539 | 0.9354 | 0.7725 | 0.9103 | 0.7175 |
| NB-AP2D | 0.7710 | 0.8483 | 0.6938 | 0.8151 | 0.5487 |
| KNN-AP2D | 0.8357 | 0.8680 | 0.8034 | 0.8883 | 0.6728 |
| RF-AP2D | 0.8314 | 0.9354 | 0.7275 | 0.9056 | 0.6777 |
| SVM-AP2D | 0.8132 | 0.8933 | 0.7331 | 0.8453 | 0.6346 |
Abbreviations: NB, naïve Bayes; KNN, k-nearest neighbor; RF, random forest; SVM, support vector machine; Ext, extended; AP2D, 2D atom pairs; FP, fingerprints; CA, classification accuracy; SE, sensitivity; SP, specificity; AUC, area under the receiver operating characteristic curve; MCC, Matthews correlation coefficient.
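
For readers less familiar with these metrics, the following is a minimal sketch of how CA, SE, SP, and MCC are conventionally computed from the counts of a binary confusion matrix. It is not the authors' code: the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the study. AUC is omitted because it requires the ranked prediction scores rather than the confusion-matrix counts.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Compute CA, SE, SP and MCC from binary confusion-matrix counts.

    Hypothetical helper for illustration; assumes both classes are present
    (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fn + tn + fp
    ca = (tp + tn) / total          # classification accuracy
    se = tp / (tp + fn)             # sensitivity (true positive rate)
    sp = tn / (tn + fp)             # specificity (true negative rate)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"CA": ca, "SE": se, "SP": sp, "MCC": mcc}


# Made-up counts for illustration only (not data from Table 1)
metrics = confusion_metrics(tp=90, fn=10, tn=80, fp=20)
print({name: round(value, 4) for name, value in metrics.items()})
```

With these made-up counts the sketch gives CA = 0.85, SE = 0.90, SP = 0.80, and MCC ≈ 0.70. In a cross-validation setting the counts would typically be accumulated over, or the metrics averaged across, the held-out folds; which of the two the table reports is not stated here.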