Table 7.
Retrospective validation of DGDN on the MIMIC-CXR dataset.
Model variant | In-hospital split (seen patients) | | | | Out-of-hospital split (unseen patients) | | | |
---|---|---|---|---|---|---|---|---|
| | Accuracy ↑ | Precision ↑ | Recall ↑ | F1 score ↑ | Accuracy ↑ | Precision ↑ | Recall ↑ | F1 score ↑ |
MIMIC-IV (43) | 83.6 ± 0.3 | 81.4 ± 0.4 | 79.5 ± 0.3 | 80.4 ± 0.3 | 77.5 ± 0.3 | 75.2 ± 0.4 | 73.0 ± 0.3 | 74.1 ± 0.3 |
MIMIC-CXR (42) | 85.1 ± 0.3 | 83.0 ± 0.4 | 81.8 ± 0.3 | 82.4 ± 0.3 | 79.2 ± 0.3 | 77.0 ± 0.4 | 75.8 ± 0.3 | 76.4 ± 0.3 |
NIH ChestX-ray14 (44) | 86.5 ± 0.3 | 84.6 ± 0.3 | 83.2 ± 0.4 | 83.9 ± 0.3 | 80.4 ± 0.3 | 78.1 ± 0.3 | 77.3 ± 0.4 | 77.7 ± 0.3 |
Ours (DGDN) | **88.0 ± 0.3** | **86.2 ± 0.3** | **85.0 ± 0.4** | **85.6 ± 0.3** | **83.1 ± 0.3** | **81.2 ± 0.3** | **80.0 ± 0.4** | **80.6 ± 0.3** |
Bold values indicate the best performance in each column.
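
As a sanity check on how the columns relate, the sketch below recomputes each F1 score from the reported precision and recall means. It assumes that F1 is the harmonic mean of the two reported values; the table does not state the averaging scheme, so this is an assumption, not the authors' stated procedure. The names `ROWS`, `f1_from_pr`, and `max_f1_gap` are hypothetical and used only for illustration.

```python
# Reported (precision, recall, F1) means from Table 7, in percent.
# Keys are (model variant, split); the values are copied from the table.
ROWS = {
    ("MIMIC-IV (43)", "in-hospital"): (81.4, 79.5, 80.4),
    ("MIMIC-CXR (42)", "in-hospital"): (83.0, 81.8, 82.4),
    ("NIH ChestX-ray14 (44)", "in-hospital"): (84.6, 83.2, 83.9),
    ("Ours (DGDN)", "in-hospital"): (86.2, 85.0, 85.6),
    ("MIMIC-IV (43)", "out-of-hospital"): (75.2, 73.0, 74.1),
    ("MIMIC-CXR (42)", "out-of-hospital"): (77.0, 75.8, 76.4),
    ("NIH ChestX-ray14 (44)", "out-of-hospital"): (78.1, 77.3, 77.7),
    ("Ours (DGDN)", "out-of-hospital"): (81.2, 80.0, 80.6),
}


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    return 2 * precision * recall / (precision + recall)


def max_f1_gap(rows):
    """Largest absolute gap between the reported and recomputed F1 scores."""
    return max(abs(f1_from_pr(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for (model, split), (p, r, f1) in ROWS.items():
        recomputed = f1_from_pr(p, r)
        print(f"{model:22s} {split:16s} reported={f1:.1f} recomputed={recomputed:.2f}")
    print(f"largest gap: {max_f1_gap(ROWS):.3f}")
```

Under that assumption, every recomputed value agrees with the reported F1 score to within rounding at one decimal place.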