2018 Jun 13;45(7):3449–3459. doi: 10.1002/mp.12967

Table 3.

For each dataset, the AUC rank averaged over all repetitions when (a) selecting a classifier at random (Random classifier), (b) preselecting the classifier with the best average AUC rank across all other datasets, that is, without any information about the current dataset (Preselected classifier), and (c) selecting the classifier that yielded the highest AUC in the inner CV (Set‐specific classifier). Improvements in average AUC and average AUC rank relative to (a) are reported. The average AUC improvements from preselection and from set‐specific selection were each tested for statistical significance (one‐sided Wilcoxon signed‐rank test, P < 0.05), and both were statistically significant (*). No other statistical tests were conducted.

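Dataset	Random classifier: mean rank	Preselected classifier: name	Preselected classifier: mean rank	Preselected classifier: rank increase	Preselected classifier: AUC increase	Set‐specific classifier: mean rank	Set‐specific classifier: rank increase	Set‐specific classifier: AUC increase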
Set A	3.59	glmnet	3.64	−0.05	0.00	3.10	0.49	0.02
Set B	3.48	rf	2.92	0.56	0.02	3.31	0.17	0.01
Set C	3.50	glmnet	3.12	0.37	0.03	2.78	0.72	0.03
Set D	3.57	rf	2.60	0.97	0.04	3.31	0.26	0.02
Set E	3.53	glmnet	3.35	0.18	0.01	1.75	1.78	0.05
Set F	3.39	rf	1.89	1.50	0.04	2.58	0.81	0.03
Set G	3.47	rf	2.99	0.47	0.04	3.52	−0.06	0.01
Set H	3.44	rf	3.81	−0.37	0.00	1.70	1.74	0.05
Set I	3.45	rf	1.59	1.86	0.06	1.72	1.73	0.05
Set J	3.52	rf	4.18	−0.66	−0.02	3.41	0.11	0.00
Set K	3.50	rf	3.33	0.16	0.01	3.20	0.30	0.01
Set L	3.58	rf	3.50	0.08	0.01	3.66	−0.08	0.00
Mean	3.50		3.08	0.42	0.02*	2.84	0.66	0.02*
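
The rank columns above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code. It assumes that rank 1 denotes the best classifier, so that a rank increase is the random-selection rank minus the strategy's rank, which matches the signs reported in the table. The names `ROWS`, `rank_increase`, and `column_mean` are illustrative. Because the inputs are the rounded values printed in the table, recomputed increases can differ from the printed ones by about 0.01.

```python
# Values transcribed from Table 3:
# (dataset, random mean rank, preselected mean rank, set-specific mean rank)
ROWS = [
    ("Set A", 3.59, 3.64, 3.10),
    ("Set B", 3.48, 2.92, 3.31),
    ("Set C", 3.50, 3.12, 2.78),
    ("Set D", 3.57, 2.60, 3.31),
    ("Set E", 3.53, 3.35, 1.75),
    ("Set F", 3.39, 1.89, 2.58),
    ("Set G", 3.47, 2.99, 3.52),
    ("Set H", 3.44, 3.81, 1.70),
    ("Set I", 3.45, 1.59, 1.72),
    ("Set J", 3.52, 4.18, 3.41),
    ("Set K", 3.50, 3.33, 3.20),
    ("Set L", 3.58, 3.50, 3.66),
]


def rank_increase(random_rank, strategy_rank):
    """Improvement over random selection (assumption: lower rank is better)."""
    return random_rank - strategy_rank


def column_mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Column means, corresponding to the "Mean" row of the table.
random_mean = column_mean([row[1] for row in ROWS])
preselected_mean = column_mean([row[2] for row in ROWS])
set_specific_mean = column_mean([row[3] for row in ROWS])

# Per-dataset rank increases relative to random selection.
preselected_increase = [rank_increase(row[1], row[2]) for row in ROWS]
set_specific_increase = [rank_increase(row[1], row[3]) for row in ROWS]

# Datasets on which a strategy ranked worse than random selection.
preselected_worse = [row[0] for row, d in zip(ROWS, preselected_increase) if d < 0]
set_specific_worse = [row[0] for row, d in zip(ROWS, set_specific_increase) if d < 0]

print(f"mean ranks: {random_mean:.2f} {preselected_mean:.2f} {set_specific_mean:.2f}")
print("preselected worse than random on:", preselected_worse)
print("set-specific worse than random on:", set_specific_worse)
```

Working from the printed values keeps the check self-contained. It does not reproduce the significance test, which would need the per-repetition AUC values that the table does not list.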