Table 3.
Mean AUC rank for each dataset, averaged over all repetitions, when (a) selecting a classifier at random (Random classifier), (b) preselecting the classifier with the best average AUC rank across all other datasets, that is, without any information about the current dataset (Preselected classifier), or (c) selecting the classifier that yielded the highest AUC in the inner CV (Set‐specific classifier). Improvements in average AUC and average AUC rank relative to (a) are reported in the "Increase" columns. The average AUC improvements from preselection and from set‐specific selection were each tested for statistical significance (one‐sided Wilcoxon signed‐rank test, P < 0.05); both were significant (*). No other statistical tests were conducted.
Dataset | Random classifier: mean rank | Preselected classifier: name | Preselected classifier: mean rank | Preselected classifier: rank increase | Preselected classifier: AUC increase | Set‐specific classifier: mean rank | Set‐specific classifier: rank increase | Set‐specific classifier: AUC increase |
---|---|---|---|---|---|---|---|---|
Set A | 3.59 | glmnet | 3.64 | −0.05 | 0.00 | 3.10 | 0.49 | 0.02 |
Set B | 3.48 | rf | 2.92 | 0.56 | 0.02 | 3.31 | 0.17 | 0.01 |
Set C | 3.50 | glmnet | 3.12 | 0.37 | 0.03 | 2.78 | 0.72 | 0.03 |
Set D | 3.57 | rf | 2.60 | 0.97 | 0.04 | 3.31 | 0.26 | 0.02 |
Set E | 3.53 | glmnet | 3.35 | 0.18 | 0.01 | 1.75 | 1.78 | 0.05 |
Set F | 3.39 | rf | 1.89 | 1.50 | 0.04 | 2.58 | 0.81 | 0.03 |
Set G | 3.47 | rf | 2.99 | 0.47 | 0.04 | 3.52 | −0.06 | 0.01 |
Set H | 3.44 | rf | 3.81 | −0.37 | 0.00 | 1.70 | 1.74 | 0.05 |
Set I | 3.45 | rf | 1.59 | 1.86 | 0.06 | 1.72 | 1.73 | 0.05 |
Set J | 3.52 | rf | 4.18 | −0.66 | −0.02 | 3.41 | 0.11 | 0.00 |
Set K | 3.50 | rf | 3.33 | 0.16 | 0.01 | 3.20 | 0.30 | 0.01 |
Set L | 3.58 | rf | 3.50 | 0.08 | 0.01 | 3.66 | −0.08 | 0.00 |
Mean | 3.50 | | 3.08 | 0.42 | 0.02 * | 2.84 | 0.66 | 0.02 * |
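
The rank columns can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not the authors' code: it assumes that each rank increase is the random-selection mean rank minus the mean rank of the selected classifier (so a positive value means a better, lower rank), which is consistent with the tabulated values up to rounding. The names `TABLE`, `rank_increase`, `max_deviation`, and `column_means` are hypothetical.

```python
from statistics import mean

# Values transcribed from Table 3. Per dataset:
# (random mean rank, preselected mean rank, reported preselected rank increase,
#  set-specific mean rank, reported set-specific rank increase)
TABLE = {
    "Set A": (3.59, 3.64, -0.05, 3.10, 0.49),
    "Set B": (3.48, 2.92, 0.56, 3.31, 0.17),
    "Set C": (3.50, 3.12, 0.37, 2.78, 0.72),
    "Set D": (3.57, 2.60, 0.97, 3.31, 0.26),
    "Set E": (3.53, 3.35, 0.18, 1.75, 1.78),
    "Set F": (3.39, 1.89, 1.50, 2.58, 0.81),
    "Set G": (3.47, 2.99, 0.47, 3.52, -0.06),
    "Set H": (3.44, 3.81, -0.37, 1.70, 1.74),
    "Set I": (3.45, 1.59, 1.86, 1.72, 1.73),
    "Set J": (3.52, 4.18, -0.66, 3.41, 0.11),
    "Set K": (3.50, 3.33, 0.16, 3.20, 0.30),
    "Set L": (3.58, 3.50, 0.08, 3.66, -0.08),
}


def rank_increase(random_rank: float, selected_rank: float) -> float:
    """Assumed definition: improvement in mean rank over random selection."""
    return random_rank - selected_rank


# Largest gap between the recomputed and the reported rank increases.
# Small gaps are expected because the table shows values rounded to 2 decimals.
max_deviation = 0.0
for rand, pre, pre_inc, spec, spec_inc in TABLE.values():
    max_deviation = max(
        max_deviation,
        abs(rank_increase(rand, pre) - pre_inc),
        abs(rank_increase(rand, spec) - spec_inc),
    )

# Column means of the three mean-rank columns, for comparison with the Mean row.
column_means = {
    "random": mean(row[0] for row in TABLE.values()),
    "preselected": mean(row[1] for row in TABLE.values()),
    "set_specific": mean(row[3] for row in TABLE.values()),
}

print(f"largest deviation from reported increases: {max_deviation:.3f}")
for name, value in column_means.items():
    print(f"{name}: {value:.2f}")
```

Because the inputs are already rounded, recomputed increases can differ from the reported ones in the last digit. The AUC increases and the significance tests cannot be reproduced from this table alone, since the underlying per-repetition AUC values are not listed.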