Table 2.
Data Set | Model | CA | AUC | SE | SP | TP | TN | FP | FN |
---|---|---|---|---|---|---|---|---|---|
Training set | Ext-RF | 0.865 | 0.926 | 0.88 | 0.85 | 44 | 46 | 8 | 6 |
Training set | Ext-LR | 0.885 | 0.922 | 0.90 | 0.87 | 45 | 47 | 7 | 5 |
Training set | Ext-ANN | 0.846 | 0.913 | 0.88 | 0.81 | 44 | 44 | 10 | 6 |
Training set | Ext-SVM | 0.865 | 0.912 | 0.88 | 0.85 | 44 | 46 | 8 | 6 |
Training set | Graph-RF | 0.817 | 0.897 | 0.84 | 0.80 | 42 | 43 | 11 | 8 |
Training set | PubChem-LR | 0.798 | 0.887 | 0.74 | 0.85 | 37 | 46 | 8 | 13 |
Training set | Ext-Tree | 0.837 | 0.879 | 0.78 | 0.89 | 39 | 48 | 6 | 11 |
Training set | PubChem-RF | 0.750 | 0.871 | 0.68 | 0.81 | 34 | 44 | 10 | 16 |
Training set | Graph-LR | 0.779 | 0.870 | 0.76 | 0.80 | 38 | 43 | 11 | 12 |
Training set | PubChem-Tree | 0.827 | 0.867 | 0.82 | 0.83 | 41 | 45 | 9 | 9 |
External test set | Ext-RF | 0.840 | 0.930 | 0.75 | 0.92 | 9 | 12 | 1 | 3 |
External test set | Ext-LR | 0.840 | 0.974 | 0.75 | 0.92 | 9 | 12 | 1 | 3 |
External test set | Ext-ANN | 0.800 | 0.962 | 0.67 | 0.92 | 8 | 12 | 1 | 4 |
External test set | Ext-SVM | 0.880 | 0.904 | 0.83 | 0.92 | 10 | 12 | 1 | 2 |
External test set | Graph-RF | 0.880 | 0.920 | 0.92 | 0.85 | 11 | 11 | 2 | 1 |
External test set | PubChem-LR | 0.840 | 0.936 | 0.92 | 0.77 | 11 | 10 | 3 | 1 |
External test set | Ext-Tree | 0.880 | 0.901 | 0.83 | 0.92 | 10 | 12 | 1 | 2 |
External test set | PubChem-RF | 0.800 | 0.917 | 0.75 | 0.85 | 9 | 11 | 2 | 3 |
External test set | Graph-LR | 0.840 | 0.936 | 0.75 | 0.92 | 9 | 12 | 1 | 3 |
External test set | PubChem-Tree | 0.640 | 0.667 | 0.67 | 0.62 | 8 | 8 | 5 | 4 |
CA, classification accuracy; AUC, the area under the ROC curve; SE, sensitivity; SP, specificity; TP, the number of true positive compounds; TN, the number of true negative compounds; FP, the number of false positive compounds; FN, the number of false negative compounds.
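
The CA, SE, and SP columns appear to follow from the TP, TN, FP, and FN counts under the standard confusion-matrix definitions. The source does not state these formulas explicitly, so treat them as an assumption, although the reported values are consistent with them. The sketch below recomputes the three metrics for the two Ext-RF rows. The function name `confusion_metrics` is hypothetical, and AUC is omitted because it cannot be derived from the four counts alone.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return CA, SE and SP from confusion-matrix counts.

    Assumes the standard definitions (not stated in the source):
      CA = (TP + TN) / (TP + TN + FP + FN)
      SE = TP / (TP + FN)
      SP = TN / (TN + FP)
    """
    total = tp + tn + fp + fn
    return {
        "CA": (tp + tn) / total,
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
    }


# Counts copied from the Ext-RF rows of Table 2.
train_ext_rf = confusion_metrics(tp=44, tn=46, fp=8, fn=6)
test_ext_rf = confusion_metrics(tp=9, tn=12, fp=1, fn=3)

for label, metrics in [("Training set", train_ext_rf),
                       ("External test set", test_ext_rf)]:
    print(label,
          f"CA={metrics['CA']:.3f}",
          f"SE={metrics['SE']:.2f}",
          f"SP={metrics['SP']:.2f}")
```

Under these assumed definitions, the recomputed values round to the entries reported for Ext-RF in both data sets (CA 0.865 and 0.840, SE 0.88 and 0.75, SP 0.85 and 0.92).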