Author manuscript; available in PMC: 2023 Nov 14.

Published in final edited form as: IEEE Trans Pattern Anal Mach Intell. 2023 Jun 6;45(7):8081–8093. doi: 10.1109/TPAMI.2023.3234291

TABLE IV.

Alzheimer’s Disease Diagnosis From MRI

	AUROC		Balanced accuracy (%)		Sensitivity (%)		Specificity (%)

Model	Mean	95% CI	Mean	95% CI	Mean	95% CI	Mean	95% CI

Seen sites
Conventional CNN	0.703	0.621 – 0.785	69.7	63.7 – 75.7	69.3	56.7 – 81.8	70.1	58.8 – 81.5
Cluster input CNN	0.654	0.585 – 0.722	72.6	68.2 – 76.9	76.0	67.4 – 84.7	69.1	56.5 – 81.8
Cluster input+ CNN	0.901	0.871 – 0.930	89.1	86.0 – 92.1	89.1	82.6 – 95.6	89.0	84.4 – 93.7
DA-CNN	0.823	0.730 – 0.917	79.9	74.2 – 85.6	77.2	61.4 – 93.0	82.6	76.8 – 88.5
MeNet	0.923	0.894 – 0.952	89.6	86.7 – 92.5	87.7	82.0 – 93.4	91.5	88.9 – 94.2
LMMNN	0.938	0.917 – 0.959	90.4	87.7 – 93.1	88.9	82.4 – 95.4	91.9	88.4 – 95.3
ARMED-CNN	0.900	0.861 – 0.939	88.7	83.8 – 91.6	91.8	87.0 – 96.7	83.6	75.2 – 92.0
w/o Adv.	0.816	0.729 – 0.903	79.7	73.3 – 86.2	91.6	86.1 – 97.2	67.8	52.6 – 83.1
randomized Z	0.585	0.506 – 0.664	63.3	58.3 – 68.2	65.5	48.3 – 82.7	61.0	47.2 – 74.8

Unseen sites
Conventional CNN	0.603	0.531 – 0.675	59.5	55.4 – 63.5	56.8	41.5 – 72.1	62.1	46.8 – 77.5
Cluster input CNN	0.587	0.520 – 0.653	58.3	54.4 – 62.2	57.6	42.3 – 72.9	59.0	42.9 – 75.2
Cluster input+ CNN	0.538	0.481 – 0.594	55.0	51.9 – 58.1	54.1	33.7 – 74.6	55.9	35.7 – 76.1
DA-CNN	0.652	0.614 – 0.690	62.5	59.7 – 65.3	71.8	66.7 – 77.0	53.1	48.9 – 57.4
MeNet	0.517	0.463 – 0.571	53.9	51.6 – 56.3	64.9	45.7 – 84.1	43.0	23.6 – 62.3
LMMNN	0.534	0.491 – 0.576	54.2	51.8 – 56.5	45.3	27.1 – 63.4	63.1	45.7 – 80.6
ARMED-CNN	0.645	0.606 – 0.684	61.2	58.6 – 63.9	65.9	60.3 – 71.6	56.6	50.8 – 62.3
w/o Adv.	0.655	0.608 – 0.701	62.2	58.9 – 65.6	68.6	61.6 – 75.6	55.9	46.6 – 65.1
randomized Z	0.551	0.526 – 0.576	54.6	53.0 – 56.2	47.5	33.9 – 61.1	61.8	47.4 – 76.1

CNN: convolutional neural network; DA: domain adversarial; Adv.: adversary; AUROC: area under ROC curve; CI: confidence interval. Note: Cluster is inferred via our Z-predictor for the unseen sites for Cluster input(+) CNN models.

Metrics were computed through 10 Monte Carlo cross-validation replicates. Sensitivity and specificity were computed at the Youden point. The best results for each metric are bolded.
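To illustrate how sensitivity and specificity can be read off at the Youden point, the following is a minimal sketch, not the authors' code. It assumes binary labels (1 = positive class) and continuous classifier scores, and it picks the threshold that maximizes Youden's J = sensitivity + specificity - 1. The function names `youden_point` and `balanced_accuracy` are hypothetical and are introduced here for illustration only.

```python
import numpy as np


def youden_point(y_true, y_score):
    """Return (threshold, sensitivity, specificity) at the maximum of
    Youden's J = sensitivity + specificity - 1.

    Illustrative assumptions: a sample is predicted positive when its
    score is >= threshold; ties in J are resolved by the lowest threshold.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)

    best = None
    # Every distinct score is a candidate threshold (sorted ascending).
    for t in np.unique(y_score):
        pred = y_score >= t
        sens = np.mean(pred[y_true])      # true positive rate
        spec = np.mean(~pred[~y_true])    # true negative rate
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, float(t), float(sens), float(spec))

    _, threshold, sens, spec = best
    return threshold, sens, spec


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return 0.5 * (sensitivity + specificity)


if __name__ == "__main__":
    # Toy data, not taken from the study.
    labels = [0, 0, 0, 1, 1, 1]
    scores = [0.1, 0.2, 0.6, 0.4, 0.7, 0.9]
    thr, se, sp = youden_point(labels, scores)
    print(thr, se, sp, balanced_accuracy(se, sp))
```

The relation in `balanced_accuracy` agrees with the table: for the Conventional CNN on seen sites, the mean of 69.3 and 70.1 is 69.7. In practice each cross-validation replicate would get its own threshold, and the per-replicate metrics would then be averaged.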