Table 3.
Radiologist performance in distinguishing normal and abnormal CXRs across the 6 datasets.
| Scenario | Dataset (reference label used for evaluation) | No. predicted negative (%) | NPV (95% CI) | Sensitivity (95% CI) | No. predicted positive (%) | PPV (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|---|---|---|
| Abnormality detection | DS-1 (normal/abnormal) | 6567 (84.8%) | 0.86 (0.85–0.86) | 0.48 (0.46–0.51) | 1180 (15.2%) | 0.76 (0.74–0.78) | 0.95 (0.95–0.96) |
| | | 6380 (82.4%) | 0.87 (0.86–0.88) | 0.54 (0.52–0.57) | 1367 (17.6%) | 0.74 (0.71–0.76) | 0.94 (0.93–0.94) |
| | CXR-14 (normal/abnormal) | 284 (35.1%) | 0.73 (0.67–0.77) | 0.87 (0.84–0.89) | 526 (64.9%) | 0.95 (0.93–0.97) | 0.89 (0.85–0.93) |
| | | 325 (40.1%) | 0.67 (0.62–0.72) | 0.81 (0.78–0.84) | 485 (59.9%) | 0.97 (0.96–0.99) | 0.94 (0.91–0.97) |
| Unseen disease: TB | TB-1 (TB status) | 282 (61.0%) | 0.74 (0.69–0.80) | 0.70 (0.65–0.76) | 180 (39.0%) | 0.93 (0.89–0.97) | 0.95 (0.91–0.97) |
| | TB-2 (TB status) | 88 (66.2%) | 0.88 (0.81–0.94) | 0.79 (0.68–0.90) | 45 (33.8%) | 0.93 (0.85–1.0) | 0.96 (0.92–1.0) |
| Unseen disease: COVID-19 | COV-1 (COVID-19 status) | 1194 (65.6%) | 0.78 (0.76–0.80) | 0.55 (0.51–0.59) | 625 (34.4%) | 0.51 (0.47–0.54) | 0.75 (0.73–0.77) |
| | COV-2 (COVID-19 status) | 352 (58.2%) | 0.62 (0.57–0.66) | 0.53 (0.48–0.59) | 253 (41.8%) | 0.60 (0.55–0.66) | 0.68 (0.64–0.74) |
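
The six quantities reported in each row of the table can all be derived from a 2×2 confusion matrix of the reader's binary call against the reference label. The following is a minimal sketch of that arithmetic, not the authors' evaluation code: `ConfusionCounts` and `summarize` are hypothetical names, the counts in the example are illustrative and not taken from the study, and the 95% CIs are omitted because the table does not state how they were computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary reader decision against the reference label."""
    tp: int  # predicted positive, reference positive
    fp: int  # predicted positive, reference negative
    tn: int  # predicted negative, reference negative
    fn: int  # predicted negative, reference positive


def summarize(c: ConfusionCounts) -> dict[str, float]:
    """Return the six per-row quantities as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    predicted_negative = c.tn + c.fn
    predicted_positive = c.tp + c.fp
    return {
        "predicted_negative_fraction": predicted_negative / total,
        "npv": c.tn / predicted_negative,          # negative predictive value
        "sensitivity": c.tp / (c.tp + c.fn),       # true positive rate
        "predicted_positive_fraction": predicted_positive / total,
        "ppv": c.tp / predicted_positive,          # positive predictive value
        "specificity": c.tn / (c.tn + c.fp),       # true negative rate
    }


# Hypothetical counts for illustration only; not taken from the study.
example = summarize(ConfusionCounts(tp=80, fp=20, tn=170, fn=30))

# Consistency check using counts that do appear in the table (first DS-1 row):
# the percentage is the share of all images that were predicted negative.
ds1_predicted_negative_pct = round(100 * 6567 / (6567 + 1180), 1)
```

The sketch does not guard against empty denominators (for example, a reader who never predicts positive), so a production version would need to handle those cases explicitly.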