Table 3. Average performance of various SFs on Dataset I

| Scoring function | ROC AUC | EF 0.5% | EF 1% | EF 2% | EF 5% | F1 | MCC | Kappa |
|------------------|---------|---------|--------|--------|--------|-------|-------|-------|
| Glide@sp | 0.634 | 10.386 | 7.779 | 5.289 | 3.353 | – | – | – |
| Gold@chemplp | 0.725 | 14.025 | 11.078 | 8.380 | 5.262 | – | – | – |
| Dock | 0.770 | 15.769 | – | – | – | – | – | – |
| sp_free_svm | 0.972 | 41.676 | 41.672 | 40.211 | 22.597 | 0.715 | 0.711 | 0.707 |
| sp_free_xgb | 0.977 | 41.147 | 38.353 | 26.178 | 10.940 | 0.661 | 0.692 | 0.655 |
| sp_free_rf | 0.955 | 41.743 | 41.618 | 38.099 | 17.828 | 0.607 | 0.604 | 0.598 |
| sp_all_svm | 0.972 | 41.607 | 41.583 | 40.486 | 21.924 | 0.731 | 0.728 | 0.724 |
| chemplp_free_svm | 0.993 | 53.625 | 46.801 | 41.027 | 21.272 | 0.897 | 0.897 | 0.894 |

The average performance of the customized SFs built with three ML algorithms (SVM, XGBoost and RF) in terms of eight metrics (ROC AUC, EF at the 0.5%, 1%, 2% and 5% levels, F1 score, MCC and Cohen's kappa), and the performance of the two traditional SFs (Glide SP and ChemPLP) in terms of five metrics (ROC AUC and EF at the 0.5%, 1%, 2% and 5% levels), on Dataset I. In the SF labels, 'sp' and 'chemplp' denote the docking method (Glide SP or Gold ChemPLP) used for binding-pose generation, 'free' and 'all' denote the descriptor combinations, and 'svm', 'xgb' and 'rf' are the ML algorithms used for modelling.
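For readers unfamiliar with the metrics: the enrichment factor at level x% is the hit rate among the top-ranked x% of compounds divided by the hit rate of the whole library, so EF = 1 corresponds to random ranking. The paper's own implementation is not shown here; the sketch below is a minimal illustration assuming scikit-learn for the classification metrics and a hypothetical 0.5 score cutoff for the threshold-based metrics (F1, MCC, kappa) — the actual cutoff used in the study may differ.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score, matthews_corrcoef, cohen_kappa_score

def enrichment_factor(y_true, y_score, fraction):
    """EF at a screening fraction: hit rate in the top-ranked slice
    divided by the hit rate of the whole library."""
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(y_score))        # best-scored compounds first
    n_top = max(1, int(round(fraction * len(y_true))))
    hits_top = y_true[order[:n_top]].sum()          # actives recovered in the top slice
    return (hits_top / n_top) / (y_true.sum() / len(y_true))

# Toy example: y_true marks actives (1) vs. decoys (0); y_score is the SF output.
y_true = np.array([1, 0, 1, 0, 0, 0, 1, 0, 0, 0])
y_score = np.array([0.9, 0.1, 0.8, 0.3, 0.2, 0.4, 0.7, 0.5, 0.6, 0.05])
y_pred = (y_score >= 0.5).astype(int)               # hypothetical decision cutoff

print("ROC AUC:", roc_auc_score(y_true, y_score))
print("EF 5%  :", enrichment_factor(y_true, y_score, 0.05))
print("F1     :", f1_score(y_true, y_pred))
print("MCC    :", matthews_corrcoef(y_true, y_pred))
print("Kappa  :", cohen_kappa_score(y_true, y_pred))
```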