2020 Jun 24;9(6):1981. doi: 10.3390/jcm9061981

Table 3.

Diagnostic Performance of DL algorithm and ED physicians (visible pneumonia on CR vs. non-pneumonia).

| | AUROC (95% CI) | p Value | Sensitivity (95% CI) | p Value | Specificity (95% CI) | p Value |
|---|---|---|---|---|---|---|
| DL algorithm | 0.940 (0.910–0.962) | NA | 0.817 * (0.696–0.905) | NA | 0.944 * (0.912–0.967) | NA |
| Session 1 (ED physicians only) | | | | | | |
| Observer 1 | 0.856 (0.816–0.891) | 0.003 a | 0.833 (0.715–0.917) | 1.000 a | 0.690 (0.634–0.741) | <0.0001 a |
| Observer 2 | 0.887 (0.850–0.918) | 0.053 a | 0.700 (0.568–0.812) | 0.119 a | 0.974 (0.949–0.989) | 0.093 a |
| Observer 3 | 0.920 (0.887–0.946) | 0.455 a | 0.683 (0.550–0.797) | 0.022 a | 0.997 (0.982–1.000) | 0.0001 a |
| Group | 0.871 (0.849–0.890) | 0.007 a | 0.739 (0.668–0.801) | 0.034 a | 0.887 (0.864–0.907) | <0.0001 a |
| Session 2 (ED physicians with DL algorithm assistance) | | | | | | |
| Observer 1 | 0.936 (0.905–0.958) | 0.007 b | 0.867 (0.754–0.941) | 0.774 b | 0.954 (0.924–0.975) | <0.0001 b |
| Observer 2 | 0.907 (0.873–0.935) | 0.412 b | 0.783 (0.658–0.879) | 0.227 b | 1.000 (0.988–1.000) | 0.008 b |
| Observer 3 | 0.907 (0.872–0.934) | 0.609 b | 0.817 (0.696–0.905) | 0.022 b | 0.990 (0.971–0.998) | 0.625 b |
| Group | 0.916 (0.898–0.931) | 0.002 b | 0.822 (0.758–0.875) | 0.014 b | 0.981 (0.970–0.989) | <0.0001 b |

AUROC = area under the receiver operating characteristic curve, CI = confidence interval, CR = chest radiograph, DL = deep learning, ED = emergency department, NA = not applicable. * Sensitivity and specificity of the DL algorithm were determined at the high-sensitivity threshold. a Comparison of performance with the DL algorithm. b Comparison of performance with session 1.
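
As an illustration of how threshold-dependent figures such as the sensitivity and specificity in this table arise, the following minimal sketch computes both at a chosen operating threshold and attaches an approximate 95% interval to a proportion. It is not the study's analysis code: the function names, the toy labels and scores, and the choice of the Wilson score interval are all assumptions made for illustration, and the table does not state which interval method was used.

```python
import math


def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity), calling a case positive when score >= threshold.

    labels: 1 = pneumonia, 0 = non-pneumonia (hypothetical encoding).
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%).

    Assumption: chosen here because it has a closed form; the source table
    may have used a different method.
    """
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05]

# A lower threshold trades specificity for sensitivity, which is the idea
# behind a "high-sensitivity" operating point.
for threshold in (0.5, 0.3):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.3f}, specificity={spec:.3f}")

low, high = wilson_interval(3, 4)  # e.g. 3 of 4 positive cases detected
print(f"sensitivity 3/4: approx. 95% interval {low:.3f} to {high:.3f}")
```

Lowering the threshold in this toy example raises sensitivity and lowers specificity, which is why the footnote specifies the operating point at which the DL algorithm's values were read.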