2019 Jul 23;24:101954. doi: 10.1016/j.nicl.2019.101954

Table 4.

Area under the curve (AUC), sensitivity (Sens), specificity (Spec), and balanced accuracy (BalAcc) at a specific threshold (k_T) for subjects staged with the EBM and DEBM methods on the training and test data sets.

	EBM					DEBM
	k_T	Sens	Spec	BalAcc	AUC	k_T	Sens	Spec	BalAcc	AUC	p-value (AUC, EBM vs DEBM)
Training set
AD vs CN	7	0.97	0.96	0.96	0.97*	5	0.92	0.94	0.93	0.95*	1.88·10⁻³
AD vs MCI	9	0.59	0.96	0.77	0.81	5	0.48	0.94	0.71	0.76	5.30·10⁻⁵
MCI vs CN	6	0.88	0.52	0.70	0.73*	5	0.92	0.52	0.72	0.73*	0.537

Test set
AD vs CN	5	0.71	0.91	0.81	0.87	7	0.78	0.85	0.81	0.86	3.99·10⁻²
AD vs MCI	12	0.77	0.71	0.74	0.78	11	0.70	0.75	0.73	0.77	0.393
MCI vs CN	1	0.63	0.62	0.62	0.63	1	0.68	0.60	0.64	0.64	0.676

Thresholds are chosen to maximize the balanced accuracy in each classification task. The last column reports p-values of the DeLong test comparing the AUCs of the EBM and DEBM methods. Training-set AUCs marked with * differ significantly from the corresponding test-set AUCs (DeLong test, p-value ≤ 0.05).
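
As an illustration of the threshold selection described in the footnote, the following minimal Python sketch scans candidate stage thresholds and keeps the one with the highest balanced accuracy, BalAcc = (Sens + Spec) / 2. It is not the authors' code. The function names `balanced_accuracy` and `best_threshold`, the decision rule "stage ≥ k means positive", the tie-breaking toward the smallest threshold, and the toy data are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def balanced_accuracy(sens: float, spec: float) -> float:
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return 0.5 * (sens + spec)


def best_threshold(
    stages: Sequence[int], labels: Sequence[int]
) -> Tuple[int, float, float, float]:
    """Return (k_T, Sens, Spec, BalAcc) for the threshold maximizing BalAcc.

    labels: 1 for the more advanced group (e.g. AD), 0 for the other (e.g. CN).
    Assumption: a subject is classified as positive when stage >= k.
    Ties are resolved in favor of the smallest threshold (also an assumption).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best = None
    for k in sorted(set(stages)):
        # True positives: positive subjects staged at or above the threshold.
        tp = sum(1 for s, y in zip(stages, labels) if y == 1 and s >= k)
        # True negatives: negative subjects staged below the threshold.
        tn = sum(1 for s, y in zip(stages, labels) if y == 0 and s < k)
        sens = tp / n_pos
        spec = tn / n_neg
        bal = balanced_accuracy(sens, spec)
        if best is None or bal > best[3]:
            best = (k, sens, spec, bal)
    return best


# Toy data, made up for illustration only (not taken from the study).
toy_stages = [0, 1, 2, 3, 6, 7, 8, 9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(best_threshold(toy_stages, toy_labels))
```

The same helper reproduces the BalAcc column from the reported Sens and Spec values up to rounding; for example, `balanced_accuracy(0.92, 0.94)` gives approximately 0.93, matching the DEBM training entry for AD vs CN.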