Table 4.
Cross-validation | Test dataset | Classifier | Model | Accuracy | Precision | Recall | F1-measure | MCCᵃ
---|---|---|---|---|---|---|---|---
Entire tenfold cross-validation | | Fingerprint-based classifier (random forest) | Addition + subtraction | 89.55% | 92.25% | 87.53% | 89.83% | 79.23%
Entire tenfold cross-validation | | Text-based classifier (random forest) | Addition + subtraction | 84.16% | 87.52% | 82.01% | 84.67% | 68.48%
Composition tenfold cross-validation | ODITᵇ test dataset | Fingerprint-based classifier (random forest) | Addition + Hadamard | 80.01% | 75.80% | 82.81% | 79.07% | 60.33%
Composition tenfold cross-validation | ODITᵇ test dataset | Text-based classifier (random forest) | Addition + Hadamard | 77.55% | 72.92% | 80.37% | 76.42% | 55.39%
Composition tenfold cross-validation | NDITᶜ test dataset | Fingerprint-based classifier (random forest) | Addition + Hadamard | 65.06% | 42.58% | 77.55% | 54.49% | 33.81%
Composition tenfold cross-validation | NDITᶜ test dataset | Text-based classifier (random forest) | Addition + Hadamard | 66.97% | 54.38% | 72.80% | 61.73% | 35.26%
ᵃ Matthews correlation coefficient. ᵇ ODIT: One Drug In Train set. ᶜ NDIT: No Drug In Train set.
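
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, F1-measure, and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. The table reports MCC as a percentage, which corresponds to the value below multiplied by 100. How the authors aggregated the metrics over the ten folds is not stated here, so values computed from pooled counts need not match fold-averaged figures.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Each ratio falls back to 0.0 if its denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient, in the range [-1, 1]
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the study)
metrics = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it less forgiving when the classes are imbalanced.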