Nat Biomed Eng. 2022 Sep 15;6(12):1399–1406. doi: 10.1038/s41551-022-00936-9

Table 1.

Performance of the self-supervised model, CheXzero, on the five CheXpert competition pathologies in the CheXpert dataset, compared with the performance of three board-certified radiologists

|  | Average | Atelectasis | Cardiomegaly | Consolidation | Oedema | Pleural effusion |
| --- | --- | --- | --- | --- | --- | --- |
| AUC |  |  |  |  |  |  |
| CheXzero | 0.889 | 0.816 (0.777, 0.852) | 0.906 (0.876, 0.930) | 0.892 (0.823, 0.947) | 0.897 (0.864, 0.928) | 0.932 (0.906, 0.955) |
| MCC |  |  |  |  |  |  |
| Radiologists (mean) | 0.530 (0.499, 0.558) | 0.548 (0.496, 0.606) | 0.566 (0.511, 0.620) | 0.359 (0.262, 0.444) | 0.507 (0.431, 0.570) | 0.548 (0.496, 0.606) |
| CheXzero | 0.523 (0.486, 0.561) | 0.468 (0.396, 0.541) | 0.625 (0.553, 0.700) | 0.374 (0.290, 0.458) | 0.520 (0.424, 0.616) | 0.628 (0.558, 0.696) |
| Difference (CheXzero − radiologist) | −0.005 (−0.043, 0.034) | −0.078 (−0.154, 0.000) | 0.058 (−0.016, 0.133) | 0.018 (−0.090, 0.123) | 0.015 (−0.070, 0.099) | −0.040 (−0.096, 0.013) |
| F1 |  |  |  |  |  |  |
| Radiologists (mean) | 0.619 (0.585, 0.642) | 0.692 (0.646, 0.731) | 0.678 (0.634, 0.718) | 0.385 (0.280, 0.485) | 0.583 (0.511, 0.645) | 0.737 (0.689, 0.783) |
| CheXzero | 0.606 (0.571, 0.638) | 0.646 (0.593, 0.700) | 0.743 (0.685, 0.793) | 0.333 (0.239, 0.424) | 0.602 (0.517, 0.678) | 0.704 (0.634, 0.764) |
| Difference (CheXzero − radiologist) | −0.009 (−0.038, 0.018) | −0.045 (−0.090, −0.001) | 0.065 (0.013, 0.115) | −0.050 (−0.146, 0.036) | 0.018 (−0.053, 0.086) | −0.034 (−0.078, 0.008) |

For both MCC and F1, there is no statistically significant difference between the mean performance of the model and that of the radiologists averaged over the pathologies. Numbers in parentheses are 95% confidence intervals (CIs).
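The significance criterion behind the table's note (a model-radiologist difference is not significant when the 95% CI of the difference contains zero) is straightforward to reproduce. Below is a minimal Python sketch, not the paper's evaluation code: the arrays `y_true`, `y_model` and `y_rad` are hypothetical toy data, and the CI comes from a simple case-resampling bootstrap, which may differ from the paper's exact procedure.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef, f1_score

rng = np.random.default_rng(0)

def bootstrap_diff_ci(y_true, y_model, y_rad, metric, n_boot=1000, alpha=0.05):
    """95% CI for metric(model) - metric(radiologist) via case resampling."""
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        diffs.append(metric(y_true[idx], y_model[idx])
                     - metric(y_true[idx], y_rad[idx]))
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

# Toy data standing in for one pathology's ground truth and binary predictions.
y_true = rng.integers(0, 2, size=500)
y_model = np.where(rng.random(500) < 0.8, y_true, 1 - y_true)  # ~80% agreement
y_rad = np.where(rng.random(500) < 0.8, y_true, 1 - y_true)

for name, metric in [("MCC", matthews_corrcoef), ("F1", f1_score)]:
    point = metric(y_true, y_model) - metric(y_true, y_rad)
    lo, hi = bootstrap_diff_ci(y_true, y_model, y_rad, metric)
    # As in the table's note: a difference whose 95% CI includes 0
    # is not statistically significant.
    print(f"{name}: diff={point:+.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

Reading the table through this lens: every per-pathology MCC difference and the average MCC and F1 differences have CIs straddling zero, while the F1 differences for atelectasis and cardiomegaly are the only intervals that exclude it.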