Table 5.
| | Mean AUC | Mean F1 | Mean MCC |
|---|---|---|---|
| Radiologists (mean) | N/A* | 0.619 | 0.530 |
| Best single model | 0.878 | 0.563 (0.527, 0.598) | 0.473 (0.434, 0.510) |
| Ensemble model | 0.889 | 0.606 (0.571, 0.638) | 0.523 (0.486, 0.561) |
Comparison between the ensemble of the top-ten model checkpoints and the single best model on the CheXpert validation dataset. Results are averaged across the five CheXpert competition pathologies. Numbers in parentheses indicate 95% confidence intervals (CI). *The mean AUC of radiologists is not available (N/A) because a radiologist's binary predictions correspond to a single point on the receiver operating characteristic (ROC) curve, so an area under the curve cannot be computed.
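To make the footnote concrete, the sketch below (an illustrative helper, not the paper's evaluation code; the toy labels and predictions are invented for demonstration) computes F1 and MCC directly from a confusion matrix, and shows that binary predictions yield only a single (FPR, TPR) operating point, which is why F1 and MCC are defined for the radiologists while an AUC is not:

```python
import math

def confusion(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels/predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(y_true, y_pred):
    tp, _, fp, fn = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy example: hard binary predictions (no scores/probabilities).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion(y_true, y_pred)
tpr = tp / (tp + fn)  # sensitivity
fpr = fp / (fp + tn)  # 1 - specificity

# F1 and MCC are well defined from this single confusion matrix,
# but the ROC reduces to the one operating point (fpr, tpr):
# without graded scores to sweep a threshold over, no curve (and
# hence no area under it) can be traced out.
print(f1_score(y_true, y_pred))  # -> 0.75
print(mcc(y_true, y_pred))       # -> 0.5
print((fpr, tpr))                # -> (0.25, 0.75)
```

A model, by contrast, outputs continuous probabilities, so sweeping a decision threshold traces out the full ROC curve from which the mean AUC values in the table are computed.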