Table 3.
Scoring functions | ROC AUC | EF0.5% | EF1% | EF2% | EF5% | F1 | MCC | Kappa |
---|---|---|---|---|---|---|---|---|
Glide@sp | 0.634 | 10.386 | 7.779 | 5.289 | 3.353 | – | – | – |
Gold@chemplp | 0.725 | 14.025 | 11.078 | 8.380 | 5.262 | – | – | – |
Dock | 0.770 | – | 15.769 | – | – | – | – | – |
sp_free_svm | 0.972 | 41.676 | 41.672 | 40.211 | 22.597 | 0.715 | 0.711 | 0.707 |
sp_free_xgb | 0.977 | 41.147 | 38.353 | 26.178 | 10.940 | 0.661 | 0.692 | 0.655 |
sp_free_rf | 0.955 | 41.743 | 41.618 | 38.099 | 17.828 | 0.607 | 0.604 | 0.598 |
sp_all_svm | 0.972 | 41.607 | 41.583 | 40.486 | 21.924 | 0.731 | 0.728 | 0.724 |
chemplp_free_svm | 0.993 | 53.625 | 46.801 | 41.027 | 21.272 | 0.897 | 0.897 | 0.894 |
The average performance of the customized SFs built with 3 ML algorithms (SVM, XGBoost and RF) in terms of 8 metrics (ROC AUC, EF at the 0.5% level, EF at the 1% level, EF at the 2% level, EF at the 5% level, F1 score, MCC and Cohen’s kappa), and the performance of 2 traditional SFs (Glide SP and ChemPLP) in terms of 5 metrics (ROC AUC and EF at the 0.5%, 1%, 2% and 5% levels), on Dataset I. In the SF labels in this table, ‘sp’ and ‘chemplp’ denote the docking methods (Glide SP and Gold ChemPLP) used for binding-pose generation, ‘free’ and ‘all’ denote the descriptor combinations, and ‘svm’, ‘xgb’ and ‘rf’ are the ML algorithms used for modelling
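
As a point of reference, the sketch below shows one way the metrics in this table can be computed for a ranked screening result. It is a minimal illustration assuming binary active/decoy labels and higher-is-better prediction scores; the `enrichment_factor` helper, the 0.5 classification threshold and the synthetic example data are illustrative assumptions, not part of the authors' pipeline.

```python
import numpy as np
from sklearn.metrics import (roc_auc_score, f1_score,
                             matthews_corrcoef, cohen_kappa_score)

def enrichment_factor(y_true, scores, fraction):
    """EF at a given top fraction: hit rate among the top-scored
    compounds divided by the hit rate in the whole library."""
    order = np.argsort(scores)[::-1]                 # rank by descending score
    n_top = max(1, int(round(fraction * len(scores))))
    top_hits = y_true[order][:n_top].sum()
    return (top_hits / n_top) / (y_true.sum() / len(y_true))

def evaluate(y_true, scores, threshold=0.5):
    """ROC AUC, EF at several levels, and threshold-based F1/MCC/kappa."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    y_pred = (scores >= threshold).astype(int)       # assumed decision threshold
    return {
        "ROC AUC": roc_auc_score(y_true, scores),
        "EF0.5%": enrichment_factor(y_true, scores, 0.005),
        "EF1%":   enrichment_factor(y_true, scores, 0.01),
        "EF2%":   enrichment_factor(y_true, scores, 0.02),
        "EF5%":   enrichment_factor(y_true, scores, 0.05),
        "F1":     f1_score(y_true, y_pred),
        "MCC":    matthews_corrcoef(y_true, y_pred),
        "Kappa":  cohen_kappa_score(y_true, y_pred),
    }

# Example with synthetic data: 50 actives among 10 000 compounds
rng = np.random.default_rng(0)
y = np.zeros(10_000, dtype=int); y[:50] = 1
s = rng.random(10_000) + 0.5 * y                     # actives tend to score higher
print(evaluate(y, s))
```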