2024 Feb 8;8:10. doi: 10.1186/s41747-023-00411-3

Table 5.

Comparison of pretrained weights: self-supervised learning with large non-medical images versus supervised learning with a large, task-specific chest radiograph dataset

| Labels | VinDr-CXR (DINOv2) | VinDr-CXR (MIMIC-CXR) | ChestX-ray14 (DINOv2) | ChestX-ray14 (MIMIC-CXR) | CheXpert (DINOv2) | CheXpert (MIMIC-CXR) | UKA-CXR (DINOv2) | UKA-CXR (MIMIC-CXR) | PadChest (DINOv2) | PadChest (MIMIC-CXR) |
|---|---|---|---|---|---|---|---|---|---|---|
| Cardiomegaly | 94.53 ± 0.52 | 97.17 ± 0.34 | 88.51 ± 0.47 | 89.54 ± 0.44 | 87.96 ± 0.31 | 87.27 ± 0.31 | 85.86 ± 0.18 | 85.45 ± 0.18 | 92.30 ± 0.27 | 92.68 ± 0.26 |
| Pleural effusion | 97.62 ± 0.68 | 98.31 ± 0.52 | 81.01 ± 0.32 | 82.00 ± 0.32 | 87.81 ± 0.20 | 87.64 ± 0.20 | 91.23 ± 0.19 | 91.41 ± 0.19 | 95.66 ± 0.26 | 95.85 ± 0.24 |
| Pneumonia | 91.99 ± 0.98 | 94.46 ± 0.66 | 70.17 ± 1.03 | 69.85 ± 1.04 | 76.42 ± 0.88 | 76.29 ± 0.84 | 92.15 ± 0.18 | 91.94 ± 0.18 | 83.93 ± 0.67 | 84.96 ± 0.66 |
| Atelectasis | 88.55 ± 1.71 | 92.21 ± 1.48 | 75.56 ± 0.43 | 75.87 ± 0.41 | 69.57 ± 0.40 | 69.28 ± 0.39 | 86.36 ± 0.23 | 86.30 ± 0.24 | 83.62 ± 0.58 | 83.59 ± 0.55 |
| Consolidation | 91.35 ± 1.56 | 94.82 ± 0.74 | 73.60 ± 0.57 | 75.11 ± 0.54 | 75.14 ± 0.56 | 74.13 ± 0.56 | N/A | N/A | 88.26 ± 0.82 | 89.95 ± 0.76 |
| Pneumothorax | 90.96 ± 2.91 | 97.39 ± 1.27 | 84.70 ± 0.38 | 85.93 ± 0.37 | 87.29 ± 0.33 | 86.03 ± 0.34 | N/A | N/A | 86.37 ± 2.01 | 92.89 ± 1.00 |
| Lung opacity | 86.86 ± 1.27 | 87.89 ± 1.26 | N/A | N/A | 73.98 ± 0.28 | 73.62 ± 0.29 | N/A | N/A | N/A | N/A |
| Lung lesion | N/A | N/A | N/A | N/A | 76.56 ± 0.73 | 75.79 ± 0.73 | N/A | N/A | N/A | N/A |
| Fracture | N/A | N/A | N/A | N/A | 77.93 ± 0.67 | 76.92 ± 0.66 | N/A | N/A | N/A | N/A |
| No finding (healthy) | 90.79 ± 0.56 | 93.51 ± 0.46 | 72.37 ± 0.33 | 72.48 ± 0.33 | 87.61 ± 0.30 | 87.53 ± 0.31 | 86.86 ± 0.18 | 86.49 ± 0.18 | 85.11 ± 0.26 | 85.20 ± 0.26 |
| Average | 91.58 ± 3.45 | 94.47 ± 3.30 | 77.99 ± 6.38 | 78.68 ± 6.77 | 80.03 ± 6.60 | 79.45 ± 6.60 | 88.49 ± 2.65 | 88.32 ± 2.77 | 87.89 ± 4.30 | 89.30 ± 4.45 |

p-value (DINOv2 versus MIMIC-CXR, one per dataset): VinDr-CXR 0.001; ChestX-ray14 0.001; CheXpert 0.001; UKA-CXR 0.001; PadChest 0.001

The table reports the area under the receiver operating characteristic curve (ROC-AUC), in percent, for each individual label on five datasets: VinDr-CXR, ChestX-ray14, CheXpert, UKA-CXR, and PadChest. The models were pretrained either with self-supervised learning (SSL) on non-medical images (DINOv2) or with fully supervised learning on a dedicated chest radiograph dataset (MIMIC-CXR). The numbers of fine-tuning training images for VinDr-CXR, ChestX-ray14, CheXpert, UKA-CXR, and PadChest were n = 15,000, n = 86,524, n = 128,356, n = 153,537, and n = 88,480, respectively; the corresponding numbers of test images were n = 3,000, n = 25,596, n = 39,824, n = 39,824, and n = 22,045. The p-values refer to the comparison between the average ROC-AUCs obtained with DINOv2 and with MIMIC-CXR pretraining. For details about each dataset’s labels, refer to Table 3.

N/A: not available
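
As a reading aid, the "Average" row can be checked against the label-wise values. The following is a minimal sketch, assuming the average is the unweighted mean over the labels available for a dataset (N/A labels left out). It uses the VinDr-CXR values copied from the table; the names `vindr_auc` and `average_auc` are illustrative and not from the source. The spread values after "±" are not recomputed, because the table does not state how they were derived.

```python
from statistics import mean

# Label-wise ROC-AUC (%) for VinDr-CXR, copied from the table in row order
# (Cardiomegaly, Pleural effusion, Pneumonia, Atelectasis, Consolidation,
# Pneumothorax, Lung opacity, No finding). Labels marked N/A are omitted.
vindr_auc = {
    "DINOv2": [94.53, 97.62, 91.99, 88.55, 91.35, 90.96, 86.86, 90.79],
    "MIMIC-CXR": [97.17, 98.31, 94.46, 92.21, 94.82, 97.39, 87.89, 93.51],
}


def average_auc(values):
    """Unweighted mean over the available labels, rounded to two decimals."""
    return round(mean(values), 2)


# Assumption: the table's "Average" row is this simple mean over labels.
averages = {name: average_auc(vals) for name, vals in vindr_auc.items()}

# Gap between the two pretraining strategies on this dataset, in AUC points.
difference = round(averages["MIMIC-CXR"] - averages["DINOv2"], 2)

print(averages)
print(difference)
```

Under this assumption the computed means agree with the "Average" row for VinDr-CXR (91.58 and 94.47). The same check can be repeated for the other datasets by swapping in their columns.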