Table 2. Assessment of performance using the original datasets through fivefold cross-validation.
Values are mean ± S.D.; each column header gives the sample ratio, DTI type, and metric.
Model | 1:1, Activatory DTIs, AUROC | 1:1, Activatory DTIs, AUPR | 1:1, Inhibitory DTIs, AUROC | 1:1, Inhibitory DTIs, AUPR | 1:10, Activatory DTIs, AUROC | 1:10, Activatory DTIs, AUPR | 1:10, Inhibitory DTIs, AUROC | 1:10, Inhibitory DTIs, AUPR |
---|---|---|---|---|---|---|---|---|
Logit | 0.725±0.032 | 0.690±0.041 | 0.823±0.007 | 0.806±0.006 | 0.697±0.027 | 0.166±0.016 | 0.823±0.007 | 0.340±0.004 |
RF | 0.868±0.022 | 0.890±0.021 | 0.921±0.004 | 0.932±0.003 | 0.858±0.026 | 0.559±0.054 | 0.923±0.004 | 0.729±0.010 |
MLP | 0.841±0.029 | 0.851±0.035 | 0.920±0.004 | 0.918±0.006 | 0.835±0.020 | 0.379±0.038 | 0.916±0.004 | 0.559±0.018 |
CDF# | **0.876±0.021** | **0.899±0.020** | **0.934±0.004** | **0.945±0.004** | **0.871±0.021** | **0.611±0.046** | **0.936±0.004** | **0.775±0.011** |
Boldface indicates the highest value for each performance metric. Logit, logistic regression; RF, random forest; MLP, multilayer perceptron; CDF, cascade deep forest.
# CDF model with 2 estimators in each cascade layer and 100 trees in each forest.
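
As a rough illustration of how entries of the form mean±S.D. could be derived from fivefold cross-validation, the following Python sketch computes AUROC and AUPR for one fold and summarizes per-fold values. It is not the authors' code: the function names `auroc`, `average_precision`, and `summarize` are hypothetical, AUPR is approximated here by average precision, and the use of the sample standard deviation is an assumption, since the table does not state which estimator was used.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Tied scores share the average of their 1-based ranks.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """AUPR approximated as average precision (ties broken by input order)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive
    return total / hits


def summarize(values):
    """Format per-fold values as mean±S.D. (sample S.D., an assumption)."""
    return f"{mean(values):.3f}±{stdev(values):.3f}"


# Toy labels and scores for a single held-out fold (illustrative only).
fold_labels = [1, 0, 1, 0]
fold_scores = [0.9, 0.8, 0.7, 0.1]
print(auroc(fold_labels, fold_scores), average_precision(fold_labels, fold_scores))

# Hypothetical per-fold AUROC values, summarized as one table cell.
print(summarize([0.85, 0.88, 0.90, 0.86, 0.89]))
```

In this sketch each metric is computed on the held-out fold only, and the five per-fold values are then reduced to a single cell; both functions expect every fold to contain at least one positive and one negative example.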