Table 3.
Results of the comparison between local and FFL-based training for 5 different datasets.
| Dataset name | Training set size | Included labels | Training setup | AUROC | P-value |
|---|---|---|---|---|---|
| VinDr-CXR | n = 15,000 | No finding, aortic enlargement, pleural thickening, cardiomegaly, pleural effusion | Local | 0.867 ± 0.045 | 0.001 |
| | | | FFL | 0.885 ± 0.049 | |
| ChestX-ray14 | n = 83,525 | Cardiomegaly, lung opacity, lung lesion, pneumonia, edema | Local | 0.744 ± 0.076 | 0.363 |
| | | | FFL | 0.744 ± 0.080 | |
| CheXpert | n = 126,141 | Cardiomegaly, lung opacity, lung lesion, pneumonia, edema | Local | 0.796 ± 0.064 | 0.243 |
| | | | FFL | 0.797 ± 0.061 | |
| MIMIC-CXR-JPG-v2.0 | n = 237,972 | Enlarged cardiomediastinum, consolidation, pleural effusion, pneumothorax, atelectasis | Local | 0.772 ± 0.072 | 0.004 |
| | | | FFL | 0.786 ± 0.066 | |
| UKA-CXR | n = 122,297 | Pleural effusion left, pleural effusion right, cardiomegaly, pneumonic infiltrates left, pneumonic infiltrates right | Local | 0.916 ± 0.031 | 0.001 |
| | | | FFL | 0.918 ± 0.031 | |
AUROC values are averaged over all included labels for each dataset and were evaluated on the test benchmark of the corresponding dataset. Each P-value refers to the comparison between the local and FFL results for that dataset. For each dataset, the FFL process was performed in combination with the other 4 datasets, with 5 different labels included for each dataset.
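
The label-averaged AUROC reported in the table can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `auroc` and `mean_auroc` and the toy labels and scores are hypothetical. The sketch also leaves out the ± spread and the P-values, because the table does not say how they were computed.

```python
from statistics import mean


def auroc(y_true, y_score):
    """AUROC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auroc(labels_true, labels_score):
    """Compute the AUROC per label, then average over all labels."""
    per_label = {
        name: auroc(labels_true[name], labels_score[name]) for name in labels_true
    }
    return mean(list(per_label.values())), per_label


# Toy data (hypothetical, for illustration only): two labels, four images each.
y_true = {
    "cardiomegaly": [0, 0, 1, 1],
    "pleural effusion": [0, 1, 0, 1],
}
y_score = {
    "cardiomegaly": [0.1, 0.4, 0.35, 0.8],
    "pleural effusion": [0.2, 0.9, 0.3, 0.6],
}

avg, per_label = mean_auroc(y_true, y_score)
print(per_label)  # per-label AUROC values
print(avg)        # unweighted mean over the labels
```

The pairwise formulation is quadratic in the number of samples and is used here only for readability. For datasets of the sizes listed in the table, a rank-based implementation such as the one in a standard metrics library would be the practical choice.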