Author manuscript; available in PMC: 2022 Jan 1.

Published in final edited form as: Med Image Anal. 2020 Oct 9;67:101857. doi: 10.1016/j.media.2020.101857

Table 4. CT volume test set AUROC for models trained on 9 vs. 83 labels.

The area under the receiver operating characteristic curve (AUROC) is shown for CT-Net-9 (trained only on the 9 labels shown) and CT-Net-83 (trained on the 9 labels shown plus 74 additional labels) on the test set of 7,209 examples. CT-Net-83 outperforms CT-Net-9 on all 9 abnormalities (DeLong p < 0.05 for each), emphasizing the value of the additional 74 labels. Note that we also experimented with training a separate binary classifier for each of the 9 labels, but these models did not converge (AUROC ~0.5, i.e., chance-level performance). Positive Count and Positive Percent give the number and percentage of test set examples that are positive for each abnormality.

Abnormality	Positive Count	Positive Percent (%)	CT-Net-9 AUROC	CT-Net-9 95% CI	CT-Net-83 AUROC	CT-Net-83 95% CI	DeLong p-value
nodule	5,617	77.9	0.682	0.667–0.698	0.718	0.703–0.732	3.346×10⁻⁷
opacity	3,877	53.8	0.617	0.605–0.630	0.740	0.728–0.751	<4.950×10⁻¹⁶
atelectasis	2,037	28.3	0.683	0.668–0.697	0.765	0.753–0.777	<4.950×10⁻¹⁶
pleural effusion	1,404	19.5	0.945	0.937–0.952	0.951	0.945–0.958	1.882×10⁻²
consolidation	1,086	15.1	0.719	0.703–0.736	0.816	0.804–0.829	<4.950×10⁻¹⁶
mass	863	12.0	0.624	0.604–0.644	0.773	0.755–0.791	<4.950×10⁻¹⁶
pericardial eff.	1,078	15.0	0.659	0.640–0.677	0.697	0.679–0.714	8.315×10⁻⁸
cardiomegaly	649	9.0	0.791	0.774–0.807	0.851	0.836–0.867	7.000×10⁻¹³
pneumothorax	205	2.8	0.816	0.785–0.847	0.904	0.882–0.926	8.810×10⁻¹¹
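
The quantities in the table can be reproduced with a short sketch. The Python below is illustrative only and is not the authors' evaluation code: the names `ROWS`, `positive_percent`, `auroc_gain`, and `auroc` are hypothetical, the row values are copied from the table, and the pairwise AUROC function uses the standard rank-based definition (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half). It does not reproduce the confidence intervals or the DeLong test.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.
TEST_SET_SIZE = 7209  # test set size stated in the caption

# (abnormality, positive count, CT-Net-9 AUROC, CT-Net-83 AUROC), copied from the table
ROWS = [
    ("nodule", 5617, 0.682, 0.718),
    ("opacity", 3877, 0.617, 0.740),
    ("atelectasis", 2037, 0.683, 0.765),
    ("pleural effusion", 1404, 0.945, 0.951),
    ("consolidation", 1086, 0.719, 0.816),
    ("mass", 863, 0.624, 0.773),
    ("pericardial eff.", 1078, 0.659, 0.697),
    ("cardiomegaly", 649, 0.791, 0.851),
    ("pneumothorax", 205, 0.816, 0.904),
]


def positive_percent(count, total=TEST_SET_SIZE):
    """Percentage of test examples positive for an abnormality, to one decimal."""
    return round(100.0 * count / total, 1)


def auroc_gain(auroc_9, auroc_83):
    """Absolute AUROC improvement of CT-Net-83 over CT-Net-9, to three decimals."""
    return round(auroc_83 - auroc_9, 3)


def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is fine
    for a small demonstration but slow for thousands of examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Map each abnormality to (positive percent, AUROC gain).
summary = {
    name: (positive_percent(count), auroc_gain(a9, a83))
    for name, count, a9, a83 in ROWS
}

for name, (pct, gain) in summary.items():
    print(f"{name}: {pct}% positive, AUROC gain {gain:+.3f}")
```

On these rows the absolute gain ranges from 0.006 (pleural effusion) to 0.149 (mass). A classifier that assigns the same score to every example yields `auroc(...) == 0.5` under this definition, which is the chance level the caption refers to for the models that did not converge.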