Table 1. Radiologists and algorithm AUC with CIs.
Pathology | Radiologists (95% CI) | Algorithm (95% CI) | Algorithm − Radiologists Difference (99.6% CI)^a | Advantage |
---|---|---|---|---|
Atelectasis | 0.808 (0.777 to 0.838) | 0.862 (0.825 to 0.895) | 0.053 (0.003 to 0.101) | Algorithm |
Cardiomegaly | 0.888 (0.863 to 0.910) | 0.831 (0.790 to 0.870) | −0.057 (−0.113 to −0.007) | Radiologists |
Consolidation | 0.841 (0.815 to 0.870) | 0.893 (0.859 to 0.924) | 0.052 (−0.001 to 0.101) | No difference |
Edema | 0.910 (0.886 to 0.930) | 0.924 (0.886 to 0.955) | 0.015 (−0.038 to 0.060) | No difference |
Effusion | 0.900 (0.876 to 0.921) | 0.901 (0.868 to 0.930) | 0.000 (−0.042 to 0.040) | No difference |
Emphysema | 0.911 (0.866 to 0.947) | 0.704 (0.567 to 0.833) | −0.208 (−0.508 to −0.003) | Radiologists |
Fibrosis | 0.897 (0.840 to 0.936) | 0.806 (0.719 to 0.884) | −0.091 (−0.198 to 0.016) | No difference |
Hernia | 0.985 (0.974 to 0.991) | 0.851 (0.785 to 0.909) | −0.133 (−0.236 to −0.055) | Radiologists |
Infiltration | 0.734 (0.688 to 0.779) | 0.721 (0.651 to 0.786) | −0.013 (−0.107 to 0.067) | No difference |
Mass | 0.886 (0.856 to 0.913) | 0.909 (0.864 to 0.948) | 0.024 (−0.041 to 0.080) | No difference |
Nodule | 0.899 (0.869 to 0.924) | 0.894 (0.853 to 0.930) | −0.005 (−0.058 to 0.044) | No difference |
Pleural thickening | 0.779 (0.740 to 0.809) | 0.798 (0.744 to 0.849) | 0.019 (−0.056 to 0.094) | No difference |
Pneumonia | 0.823 (0.779 to 0.856) | 0.851 (0.781 to 0.911) | 0.028 (−0.087 to 0.125) | No difference |
Pneumothorax | 0.940 (0.912 to 0.962) | 0.944 (0.915 to 0.969) | 0.004 (−0.040 to 0.051) | No difference |
^a The AUC difference was calculated as the AUC of the algorithm minus the AUC of the radiologists. To account for multiple hypothesis testing, a Bonferroni-corrected CI (1 − 0.05/14; 99.6%) was computed around the difference.
The nonparametric bootstrap was used to estimate the variability around each of the performance measures; 10,000 bootstrap replicates from the validation set were drawn, and each performance measure was calculated for the algorithm and the radiologists on these same 10,000 bootstrap replicates. This produced a distribution for each estimate, and the 95% bootstrap percentile intervals (2.5th and 97.5th percentiles) are reported.
Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval.
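
The interval construction described in the table notes can be sketched as follows. This is a minimal illustration, not the study's code. It assumes that each case has a binary label and one continuous score from each reader (the notes do not say how the radiologists' AUC was derived), and all function names are hypothetical. The same resampled cases are scored for both readers, as the notes describe, and the percentile cut-offs are widened by the Bonferroni factor 0.05/14. The Advantage column is consistent with checking whether the resulting interval excludes zero.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when a class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bonferroni_level(alpha=0.05, n_tests=14):
    """Confidence level after Bonferroni correction, e.g. 1 - 0.05/14."""
    return 1 - alpha / n_tests


def percentile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list, with q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def bootstrap_auc_difference(labels, algo_scores, rad_scores,
                             n_boot=10_000, n_tests=14, alpha=0.05, seed=0):
    """Bonferroni-corrected percentile interval for AUC(algorithm) - AUC(radiologists)."""
    if len(set(labels)) < 2:
        raise ValueError("need both positive and negative cases")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        # Resample cases with replacement; both readers see the same replicate.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        a = auc(y, [algo_scores[i] for i in idx])
        r = auc(y, [rad_scores[i] for i in idx])
        if a is None:
            continue  # replicate contained a single class; draw again (an assumption)
        diffs.append(a - r)
    diffs.sort()
    tail = (1 - bonferroni_level(alpha, n_tests)) / 2
    return percentile(diffs, tail), percentile(diffs, 1 - tail)
```

Calling `bootstrap_auc_difference(labels, algo_scores, rad_scores)` returns the lower and upper bounds of the corrected interval. With the defaults it draws 10,000 replicates, which is slow in pure Python on large validation sets, so a smaller `n_boot` is practical for experimentation.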