. 2023 Jun 21;7:33. doi: 10.1186/s41747-023-00346-9

Table 3.

Performance metrics of the network trained five times on the experiment A data

Base model	DenseNet-121
Augmentations	3° rotations in-plane, 10° rotations in 3D
Set and model	Test set from experiment A, single model (mean ± SD)	Test set from experiment A, ensemble model	Test set from experiment B, single model (mean ± SD)
AUC, all	0.80 ± 0.02	0.80	0.63 ± 0.01
BAcc (ad hoc), all^a	0.71 ± 0.02	0.74	0.60 ± 0.01
BAcc (post hoc), all^b	0.73 ± 0.02	0.75	0.61 ± 0.01
AUC, LR_max	0.79 ± 0.02	0.80	0.63 ± 0.01
BAcc (ad hoc), LR_max^a	0.70 ± 0.02	0.71	0.60 ± 0.01
BAcc (post hoc), LR_max^b	0.73 ± 0.01	0.73	0.62 ± 0.02

The operating points for the test-set balanced accuracy calculations were chosen as the threshold nearest to the top-left corner of the receiver operating characteristic curve, computed either from the early stopping set (^a) or from the final test set (^b). AUC, area under the receiver operating characteristic curve; BAcc, balanced accuracy; LR_max, maximum of the left and right lung predictions; SD, standard deviation
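
The operating-point rule in the footnote above can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, "nearest" is assumed to mean Euclidean distance in (false-positive rate, true-positive rate) space, a score at or above the threshold is assumed to count as a positive prediction, and ties are broken in favour of the higher threshold. Only NumPy is used.

```python
import numpy as np


def roc_points(y_true, scores):
    """ROC points, one per distinct score, thresholds in descending order."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)[::-1]
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    # Assumption: a case is predicted positive when score >= threshold.
    tpr = np.array([(scores[y_true] >= t).sum() / n_pos for t in thresholds])
    fpr = np.array([(scores[~y_true] >= t).sum() / n_neg for t in thresholds])
    return fpr, tpr, thresholds


def closest_to_top_left(y_true, scores):
    """Threshold whose ROC point lies nearest to (FPR=0, TPR=1)."""
    fpr, tpr, thresholds = roc_points(y_true, scores)
    # Assumption: Euclidean distance; argmin keeps the first (highest) threshold on ties.
    dist = np.hypot(fpr, 1.0 - tpr)
    return float(thresholds[np.argmin(dist)])


def balanced_accuracy(y_true, scores, threshold):
    """Mean of sensitivity and specificity at a fixed threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    pred = np.asarray(scores, dtype=float) >= threshold
    sensitivity = (pred & y_true).sum() / y_true.sum()
    specificity = (~pred & ~y_true).sum() / (~y_true).sum()
    return float((sensitivity + specificity) / 2)


def lr_max(left_scores, right_scores):
    """Per-case maximum of the left-lung and right-lung predictions (LR_max)."""
    return np.maximum(np.asarray(left_scores, dtype=float),
                      np.asarray(right_scores, dtype=float))


if __name__ == "__main__":
    # Toy data, purely illustrative; not taken from the study.
    y_val = [0, 0, 0, 0, 1, 1]
    s_val = [0.1, 0.2, 0.3, 0.6, 0.5, 0.9]
    y_test = [0, 0, 1, 1]
    s_test = [0.2, 0.7, 0.6, 0.9]

    # "Ad hoc" (^a): threshold from a held-out set, then applied to the test set.
    t_ad_hoc = closest_to_top_left(y_val, s_val)
    print("ad hoc BAcc:", balanced_accuracy(y_test, s_test, t_ad_hoc))

    # "Post hoc" (^b): threshold chosen on the test set itself.
    t_post_hoc = closest_to_top_left(y_test, s_test)
    print("post hoc BAcc:", balanced_accuracy(y_test, s_test, t_post_hoc))
```

In this sketch the only difference between the two variants is which set supplies the threshold. Because the post hoc threshold is tuned on the same data it is evaluated on, it gives an optimistic estimate compared with the ad hoc one.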