Table 3.
Algorithm comparison using the full feature set over all versus common ADRs
| Method | ADR_All | | | | ADR_50+ | | | |
| | AUC | ACC | Precision | Recall | AUC | ACC | Precision | Recall |
| LR | 0.9102 | 0.9486 | 0.4152 | 0.5671 | 0.7648 | 0.8023 | 0.5321 | 0.6908 |
| NB | 0.9116 | 0.9527 | 0.3537 | 0.6302 | 0.8627 | 0.8431 | 0.3929 | 0.7214 |
| KNN | 0.9161 | 0.9595 | 0.5300 | 0.5787 | 0.8508 | 0.8530 | 0.5633 | 0.6401 |
| RF | 0.9491 | 0.9653 | 0.6310 | 0.6250 | 0.9052 | 0.8784 | 0.6522 | 0.7057 |
| SVM | 0.9524 | 0.9669 | 0.6617 | 0.6306 | 0.9141 | 0.8857 | 0.6750 | 0.7227 |
The full feature set comprises chemical, biological, and phenotypic properties. ADR_All covers all ADRs; ADR_50+ covers only the common ADRs, i.e., those caused by at least 50 drugs. AUC, ACC, Precision, and Recall are all micro-averaged across the ADRs in the corresponding dataset.
ACC, accuracy; ADR, adverse drug reaction; AUC, area under the receiver operating characteristic curve; KNN, K-nearest neighbor; LR, logistic regression; NB, naïve Bayes; RF, random forest; SVM, support vector machine.
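The micro-averaging mentioned in the table note can be illustrated with a minimal sketch. It assumes binary drug-by-ADR label matrices and pools every (drug, ADR) decision into one confusion matrix before computing the metrics. The function name `micro_average` and the toy data are hypothetical and are not taken from the study; a micro-averaged AUC would pool the predicted scores in the same way and is omitted here.

```python
def micro_average(y_true, y_pred):
    """Micro-averaged accuracy, precision, and recall.

    y_true, y_pred: lists of rows (one row per drug), each row a list of
    0/1 labels (one label per ADR). Hypothetical helper for illustration.
    """
    tp = fp = fn = tn = 0
    # Pool every (drug, ADR) decision into a single confusion matrix.
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
            else:
                tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy example (made-up labels): 3 drugs x 4 ADRs.
y_true = [[1, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0]]
y_pred = [[1, 0, 1, 0], [0, 0, 1, 0], [1, 1, 0, 0]]
scores = micro_average(y_true, y_pred)
print(scores)
```

Because true negatives enter the pooled accuracy, a dataset with many rare ADRs can show high ACC alongside modest precision and recall, which is one plausible reading of the ADR_All columns.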