Table 1.
Performance of the ensemble models (including the non-machine-learning score-only classifier) under different activity thresholds and machine learning approaches.
Threshold | 1 µM | 10 µM | 100 µM |
---|---|---|---|
Number of active ligands | |||
Training set (n = 230) | 76 (33%) | 107 (46%) | 127 (55%) |
Test set (n = 64) | 19 (30%) | 33 (52%) | 36 (56%) |
Vina score (no statistical model) | |||
Ensemble AUC (training set) | 0.54 | 0.57 | 0.63 |
Best AUC among all individual structures | 0.58 | 0.62 | 0.66 |
Ensemble MLP model | |||
AUC (std. dev.), 5-fold CV (training set) | 0.84 (0.05) | 0.81 (0.03) | 0.75 (0.06) |
AUC (test set) | 0.82 | 0.90 | 0.92 |
Matthews correlation coeff. (test set) | 0.49 | 0.56 | 0.59 |
Ensemble LDA model | |||
AUC (std. dev.), 5-fold CV (training set) | 0.78 (0.10) | 0.77 (0.08) | 0.81 (0.07) |
AUC (test set) | 0.77 | 0.77 | 0.77 |
Matthews correlation coeff. (test set) | 0.38 | 0.34 | 0.43 |
Ensemble QDA model | |||
AUC (std. dev.), 5-fold CV (training set) | 0.79 (0.12) | 0.77 (0.08) | 0.81 (0.10) |
AUC (test set) | 0.76 | 0.78 | 0.75 |
Matthews correlation coeff. (test set) | 0.31 | 0.47 | 0.39 |
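
For readers less familiar with the two metrics reported in Table 1, the following is a minimal sketch of how an AUC and a Matthews correlation coefficient can be computed for a binary active/inactive classifier. It is not the code used in this study: the function names `roc_auc` and `matthews_corrcoef`, the 0.5 decision cutoff, and the six labels and scores are all made-up illustrations, not data from the table.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the fraction of (active, inactive) pairs in which the active
    receives the higher score; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive ligand")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def matthews_corrcoef(labels, predictions):
    """Matthews correlation coefficient from the binary confusion matrix."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when any marginal is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy data (hypothetical): 1 = active, 0 = inactive; higher score = more
# likely active.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Hypothetical cutoff of 0.5 to turn scores into class predictions.
predictions = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(labels, scores)
mcc = matthews_corrcoef(labels, predictions)
print(f"AUC = {auc:.2f}, MCC = {mcc:.2f}")
```

The AUC depends only on how the ligands are ranked, whereas the Matthews coefficient depends on the chosen cutoff. When ranking by a raw docking score such as the Vina score, where more negative values indicate stronger predicted binding, the score would need to be negated before being passed to a function like `roc_auc` above.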