Table 5.
| Grouping factors | Groups | N | Mean | St. Dev. | Min | Pctl(25) | Pctl(75) | Max | Range |
|---|---|---|---|---|---|---|---|---|---|
| Feature selection strategy | Original strategy | 75 | 0.6547 | 0.0428 | 0.5125 | 0.6372 | 0.6819 | 0.7292 | 0.2167 |
| | LASSO-based strategy | 75 | 0.6582 | 0.0441 | 0.4948 | 0.6449 | 0.6821 | 0.7239 | 0.2291 |
| | Forward selection strategy | 75 | 0.6426 | 0.0626 | 0.5085 | 0.6307 | 0.6846 | 0.7355 | 0.2270 |
| | Backward selection strategy | 75 | 0.6580 | 0.0427 | 0.5152 | 0.6435 | 0.6788 | 0.7331 | 0.2179 |
| | Importance-based strategy | 75 | 0.6550 | 0.0436 | 0.5125 | 0.6350 | 0.6823 | 0.7307 | 0.2182 |
| Machine learning model | RF | 75 | 0.6889 | 0.0314 | 0.6100 | 0.6713 | 0.7129 | 0.7355 | 0.1255 |
| | LR | 75 | 0.6666 | 0.0222 | 0.6297 | 0.6514 | 0.6792 | 0.7161 | 0.0864 |
| | XGBoost-tree | 75 | 0.6574 | 0.0256 | 0.6110 | 0.6353 | 0.6748 | 0.7171 | 0.1061 |
| | NNET | 75 | 0.6552 | 0.0259 | 0.6163 | 0.6362 | 0.6702 | 0.7239 | 0.1076 |
| | SVM-linear | 75 | 0.6005 | 0.0668 | 0.4948 | 0.5360 | 0.6558 | 0.7184 | 0.2236 |
| Sampling method | Oversampling | 125 | 0.6604 | 0.0403 | 0.5085 | 0.6385 | 0.6908 | 0.7355 | 0.2270 |
| | Undersampling | 125 | 0.6548 | 0.0354 | 0.5085 | 0.6398 | 0.6756 | 0.7239 | 0.2154 |
| | Original sampling | 125 | 0.6460 | 0.0626 | 0.4948 | 0.6365 | 0.6911 | 0.7340 | 0.2392 |
N: number of cases. Mean: mean AUC of the models in each group. St. Dev.: standard deviation of AUC. Min and Max: minimum and maximum AUC. Pctl(25): first quartile (25th percentile) of the AUC distribution. Pctl(75): third quartile (75th percentile) of the AUC distribution. Range: Max minus Min. LR: logistic regression model. NNET: neural networks. RF: random forest. SVM-linear: support vector machine-linear. XGBoost-tree: extreme gradient boosting-tree.
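
The summary columns defined in the note can be computed per group as in the minimal sketch below. It is an illustration only, not the authors' code: the function name `summarize_auc`, the record layout, and the toy AUC values are hypothetical, and the choice of sample standard deviation and of the inclusive (linear interpolation) percentile method are assumptions.

```python
from collections import defaultdict
from statistics import mean, quantiles, stdev


def summarize_auc(records, factor):
    """Group AUC values by one factor and compute the table's summary columns."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[factor]].append(rec["auc"])

    summary = {}
    for level, aucs in groups.items():
        # Assumption: percentiles use linear interpolation between order
        # statistics (method="inclusive"); the source does not state its method.
        q1, _median, q3 = quantiles(aucs, n=4, method="inclusive")
        summary[level] = {
            "N": len(aucs),
            "Mean": round(mean(aucs), 4),
            "St. Dev.": round(stdev(aucs), 4),  # sample SD (n - 1 denominator)
            "Min": min(aucs),
            "Pctl(25)": round(q1, 4),
            "Pctl(75)": round(q3, 4),
            "Max": max(aucs),
            "Range": round(max(aucs) - min(aucs), 4),  # Range = Max - Min
        }
    return summary


# Hypothetical toy records, not values from the study.
toy = [
    {"model": "RF", "auc": 0.68},
    {"model": "RF", "auc": 0.70},
    {"model": "RF", "auc": 0.72},
    {"model": "LR", "auc": 0.65},
    {"model": "LR", "auc": 0.66},
    {"model": "LR", "auc": 0.67},
]
summary = summarize_auc(toy, "model")
for level, stats in summary.items():
    print(level, stats)
```

Calling the same function with a different `factor` key (for example a sampling-method field) would give the corresponding block of rows, one per group level.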