2023 Apr 13;13:6046. doi: 10.1038/s41598-023-33303-y

Table 3.

Results of the comparison between local and FFL-based training for 5 different datasets.

Dataset name	Training set size	Included labels	Training setup	AUROC	P-value
VinDr-CXR	n = 15,000	No finding, aortic enlargement, pleural thickening, cardiomegaly, pleural effusion	Local	0.867 ± 0.045	0.001
VinDr-CXR	n = 15,000	No finding, aortic enlargement, pleural thickening, cardiomegaly, pleural effusion	FFL	0.885 ± 0.049	0.001
ChestX-ray14	n = 83,525	Cardiomegaly, lung opacity, lung lesion, pneumonia, edema	Local	0.744 ± 0.076	0.363
ChestX-ray14	n = 83,525	Cardiomegaly, lung opacity, lung lesion, pneumonia, edema	FFL	0.744 ± 0.080	0.363
CheXpert	n = 126,141	Cardiomegaly, lung opacity, lung lesion, pneumonia, edema	Local	0.796 ± 0.064	0.243
CheXpert	n = 126,141	Cardiomegaly, lung opacity, lung lesion, pneumonia, edema	FFL	0.797 ± 0.061	0.243
MIMIC-CXR-JPG-v2.0	n = 237,972	Enlarged cardiomediastinum, consolidation, pleural effusion, pneumothorax, atelectasis	Local	0.772 ± 0.072	0.004
MIMIC-CXR-JPG-v2.0	n = 237,972	Enlarged cardiomediastinum, consolidation, pleural effusion, pneumothorax, atelectasis	FFL	0.786 ± 0.066	0.004
UKA-CXR	n = 122,297	Pleural effusion left, pleural effusion right, cardiomegaly, pneumonic infiltrates left, pneumonic infiltrates right	Local	0.916 ± 0.031	0.001
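UKA-CXR	n = 122,297	Pleural effusion left, pleural effusion right, cardiomegaly, pneumonic infiltrates left, pneumonic infiltrates right	FFL	0.918 ± 0.031	0.001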

AUROC values are averaged over all included labels for each dataset and were obtained on the test benchmark of the corresponding dataset. For each dataset, the FFL process was performed jointly with the other 4 datasets, with each dataset contributing its own 5 labels.
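
As an illustration of how a label-averaged AUROC of the kind reported in the table could be computed, the following is a minimal sketch and not the authors' evaluation code. The function names, the toy labels, and the toy scores are hypothetical. Reporting the spread across labels as a sample standard deviation is also an assumption, since the table does not state what the ± values denote.

```python
from statistics import mean, stdev


def auroc(y_true, y_score):
    """AUROC as the probability that a positive case scores higher than a
    negative case; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def label_averaged_auroc(labels_true, labels_score):
    """Return (mean, sample standard deviation) of the per-label AUROC values.

    Assumption: the spread is taken across labels; the source does not
    specify how its +/- values were derived.
    """
    per_label = [auroc(labels_true[name], labels_score[name]) for name in labels_true]
    return mean(per_label), stdev(per_label)


# Hypothetical toy data: two labels, four images each (not from the paper).
toy_true = {
    "cardiomegaly": [0, 0, 1, 1],
    "pleural effusion": [0, 1, 0, 1],
}
toy_score = {
    "cardiomegaly": [0.1, 0.4, 0.35, 0.8],
    "pleural effusion": [0.2, 0.9, 0.3, 0.6],
}

avg, spread = label_averaged_auroc(toy_true, toy_score)
print(f"{avg:.3f} ± {spread:.3f}")
```

The pairwise formulation is quadratic in the number of cases and is used here only for clarity; for datasets of the sizes listed in the table, a rank-based implementation from an established library would be the practical choice.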