Table 3.
Performance metrics of the network trained five times on the experiment A data
| Base model | DenseNet-121 | ||
|---|---|---|---|
| Augmentations | 3° rotations in-plane, 10° rotations in 3D | ||
| Set and model | Test set from experiment A, single model (mean ± SD) | Test set from experiment A, ensemble model | Test set from experiment B, single model (mean ± SD) |
| AUC, all | 0.80 ± 0.02 | 0.80 | 0.63 ± 0.01 |
| BAcc (ad hoc), all (a) | 0.71 ± 0.02 | 0.74 | 0.60 ± 0.01 |
| BAcc (post hoc), all (b) | 0.73 ± 0.02 | 0.75 | 0.61 ± 0.01 |
| AUC, LRmax | 0.79 ± 0.02 | 0.80 | 0.63 ± 0.01 |
| BAcc (ad hoc), LRmax (a) | 0.70 ± 0.02 | 0.71 | 0.60 ± 0.01 |
| BAcc (post hoc), LRmax (b) | 0.73 ± 0.01 | 0.73 | 0.62 ± 0.02 |
The operating points for the test set balanced accuracy calculations were chosen by selecting the threshold nearest to the top-left corner of the receiver operating characteristic curve, calculated either from the early stopping set (a) or from the final test set (b). AUC: area under the receiver operating characteristic curve; BAcc: balanced accuracy; LRmax: maximum of the left and right lung predictions; SD: standard deviation.
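
The operating-point rule in the footnote can be illustrated with a minimal sketch. It is not the authors' code: it assumes NumPy, binary labels with both classes present, and the convention that a case is predicted positive when its score is at or above the threshold. The function names (`roc_points`, `nearest_top_left_threshold`, `balanced_accuracy`, `lr_max`) and the toy numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def roc_points(y_true, scores):
    """Return (fpr, tpr, thresholds), one ROC point per distinct score."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)[::-1]  # distinct scores, descending
    # Assumed convention: predicted positive when score >= threshold.
    pred = scores[None, :] >= thresholds[:, None]
    tpr = (pred & y_true).sum(axis=1) / y_true.sum()
    fpr = (pred & ~y_true).sum(axis=1) / (~y_true).sum()
    return fpr, tpr, thresholds


def nearest_top_left_threshold(y_true, scores):
    """Threshold whose ROC point lies closest to the corner (FPR=0, TPR=1)."""
    fpr, tpr, thresholds = roc_points(y_true, scores)
    distance = np.hypot(fpr, 1.0 - tpr)
    return thresholds[np.argmin(distance)]


def balanced_accuracy(y_true, scores, threshold):
    """Mean of sensitivity and specificity at a fixed threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    pred = np.asarray(scores, dtype=float) >= threshold
    sensitivity = (pred & y_true).sum() / y_true.sum()
    specificity = (~pred & ~y_true).sum() / (~y_true).sum()
    return 0.5 * (sensitivity + specificity)


def lr_max(left_scores, right_scores):
    """Per-case maximum of the left and right lung predictions (LRmax)."""
    return np.maximum(left_scores, right_scores)


# Toy numbers, invented for illustration only (not data from the study).
val_y, val_s = [0, 0, 0, 0, 1, 1], [0.1, 0.2, 0.3, 0.6, 0.5, 0.9]
test_y, test_s = [0, 0, 0, 1, 1], [0.2, 0.3, 0.55, 0.45, 0.8]

# (a) ad hoc: threshold chosen on the early stopping set, applied to the test set.
t_ad_hoc = nearest_top_left_threshold(val_y, val_s)
# (b) post hoc: threshold chosen on the test set itself.
t_post_hoc = nearest_top_left_threshold(test_y, test_s)

bacc_ad_hoc = balanced_accuracy(test_y, test_s, t_ad_hoc)
bacc_post_hoc = balanced_accuracy(test_y, test_s, t_post_hoc)
```

Because the post hoc threshold is tuned on the same data it is evaluated on, it tends to give an optimistic balanced accuracy. This is consistent with the table, where the post hoc values are equal to or higher than the ad hoc ones.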