. 2022 Jul 5;35(6):1514–1529. doi: 10.1007/s10278-022-00674-z

Table 4.

External testing on Hoboken and Seoul datasets with 95% confidence intervals

	Hoboken dataset			Seoul dataset
	EHR-based	CXR-based	Fusion	EHR-based	CXR-based	Fusion
AUROC (CI)	0.74 (0.68–0.80)	0.72 (0.66–0.78)	0.76 (0.70–0.82)	0.92 (0.88–0.96)	0.90 (0.86–0.94)	0.95 (0.92–0.98)
Sensitivity (CI)	0.68 (0.59–0.77)	0.68 (0.57–0.80)	0.68 (0.60–0.76)	0.64 (0.25–0.86)	0.63 (0.20–0.86)	0.64 (0.20–0.86)
Specificity (CI)	0.72 (0.62–0.82)	0.65 (0.55–0.78)	0.78 (0.70–0.85)	0.88 (0.85–0.93)	0.86 (0.80–0.93)	0.93 (0.89–0.96)
PPV (CI)	0.65 (0.56–0.75)	0.60 (0.52–0.69)	0.71 (0.61–0.79)	0.09 (0.02–0.17)	0.07 (0.02–0.15)	0.13 (0.03–0.25)
NPV (CI)	0.75 (0.68–0.81)	0.73 (0.66–0.80)	0.76 (0.70–0.82)	0.99 (0.99–1.00)	0.99 (0.99–1.00)	1.00 (0.99–1.00)
F1-score (CI)	0.66 (0.59–0.73)	0.64 (0.57–0.70)	0.69 (0.62–0.76)	0.15 (0.04–0.28)	0.13 (0.03–0.25)	0.21 (0.06–0.38)
Accuracy (CI)	0.70 (0.65–0.76)	0.67 (0.61–0.72)	0.74 (0.68–0.79)	0.92 (0.88–0.96)	0.90 (0.86–0.94)	0.95 (0.92–0.98)

AUROC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value; CI, confidence interval
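
The F1-score row can be cross-checked against the sensitivity and PPV rows, since F1 is by definition the harmonic mean of PPV (precision) and sensitivity (recall). The following is a minimal sketch of that check, not the authors' evaluation code: the point estimates are copied from Table 4, while the names `TABLE4`, `f1_from`, and `f1_discrepancies` are hypothetical helpers introduced here for illustration.

```python
# Point estimates copied from Table 4 as (sensitivity, PPV, reported F1-score).
# The dictionary and function names are illustrative, not from the paper.
TABLE4 = {
    ("Hoboken", "EHR-based"): (0.68, 0.65, 0.66),
    ("Hoboken", "CXR-based"): (0.68, 0.60, 0.64),
    ("Hoboken", "Fusion"): (0.68, 0.71, 0.69),
    ("Seoul", "EHR-based"): (0.64, 0.09, 0.15),
    ("Seoul", "CXR-based"): (0.63, 0.07, 0.13),
    ("Seoul", "Fusion"): (0.64, 0.13, 0.21),
}


def f1_from(sensitivity, ppv):
    """Return the harmonic mean of sensitivity (recall) and PPV (precision)."""
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


def f1_discrepancies(table=TABLE4):
    """Return recomputed F1 minus reported F1 for every dataset/model pair."""
    return {
        key: f1_from(sens, ppv) - reported
        for key, (sens, ppv, reported) in table.items()
    }


if __name__ == "__main__":
    for (dataset, model), diff in f1_discrepancies().items():
        print(f"{dataset:8s} {model:10s} recomputed - reported = {diff:+.3f}")
```

Because the inputs are already rounded to two decimals, the recomputed values are only expected to agree with the reported F1-scores to within about 0.01. The check also makes the low Seoul F1-scores easy to read: with sensitivity near 0.64, the harmonic mean is pulled down by the low PPV.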