Table 1.
Performance of GBDT models trained using different features on the DUD-E test set (N = 12).
Features used in the model | Active compounds in top 10, mean [90% CI] | Active compounds in top 50, mean [90% CI] | Active compounds in top 100, mean [90% CI] | Diversity*b | AUROC*d |
---|---|---|---|---|---|
GlideSP and multi-resolution ExptGMS*a | 5.4 [5.0, 5.8] | 21.6 [20.4, 22.9] | 33.2 [30.9, 35.1] | 0.64 [0.60, 0.67] | 0.66 |
GlideSP and TF3P | 5.4 [4.9, 5.8] | 19.5 [18.1, 20.8] | 27.7 [26.0, 29.6] | 0.62 [0.60, 0.64] | 0.64 |
GlideSP and MM/GBSA | 4.5 [4.0, 5.2] | 18.5 [17.0, 19.9] | 30.1 [28.1, 32.2] | 0.66 [0.62, 0.68] | 0.65 |
GlideSP and USRCAT | 4.5 [3.9, 5.1] | 18.1 [16.9, 19.3] | 30.4 [28.0, 32.3] | 0.63 [0.59, 0.66] | 0.63 |
GlideSP and Alpha sphere | 5.1 [4.6, 5.6] | 18.4 [17.0, 19.8] | 29.2 [27.2, 31.0] | 0.64 [0.61, 0.68] | 0.63 |
GlideSP*c | 4.3 [3.8, 4.8] | 17.3 [16.1, 18.6] | 29.9 [27.9, 31.9] | 0.66 [0.62, 0.68] | 0.62 |
*a Multi-resolution indicates ExptGMS at resolutions of 2.5, 3.0, 3.5, 4.5, and 5.5 Å.
*b Diversity of active compounds among the top 100 ranked compounds.
*c Ranking used the GlideSP score directly, without a trained model.
*d AUROC was calculated from the GBDT-predicted probability of a compound being active.
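
As a reading aid for the columns above, the following minimal Python sketch shows one way the "active compounds in top N" count and the AUROC could be computed from per-compound scores. It is an illustration under stated assumptions, not the authors' pipeline: the function names `actives_in_top_n` and `auroc` and the toy data are hypothetical, higher scores are assumed to mean "more likely active" (a docking score for which lower is better would need to be negated first), and the confidence intervals and diversity metric are not reproduced because the table does not specify how they were derived.

```python
def actives_in_top_n(scores, labels, n):
    """Count active compounds (label 1) among the n highest-scoring compounds."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n])


def auroc(scores, labels):
    """AUROC as the probability that a random active outscores a random decoy.

    Ties between an active and a decoy count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives
        for d in decoys
    )
    return wins / (len(actives) * len(decoys))


# Toy data, invented for illustration only (not taken from the DUD-E test set).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # assumed: higher = more likely active
toy_labels = [1, 0, 1, 0, 1, 0]              # 1 = active, 0 = decoy

top3 = actives_in_top_n(toy_scores, toy_labels, 3)
toy_auc = auroc(toy_scores, toy_labels)
print(top3, round(toy_auc, 3))
```

The pairwise AUROC formulation is quadratic in the number of compounds; it is used here only because it makes the definition explicit, and a rank-based implementation would be preferable for library-sized screens.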