Table 3.
Metric | Development set, per slice | Development set, per patient a | Test set, per slice | Test set, per patient a |
---|---|---|---|---|
AUROC (95% CI) | 0.960 (0.947–0.973) | 0.913 (0.851–0.975) | 0.947 (0.927–0.968) | 0.930 (0.828–1.000) |
AUPRC (95% CI) | 0.968 (0.956–0.977) | 0.887 (0.787–0.954) | 0.964 (0.930–0.978) | 0.941 (0.792–1.000) |
Acc (95% CI) | 91.4 (89.3–93.0) | 91.4 (83.2–95.8) | 90.8 (88.0–93.0) | 93.6 (79.3–98.2) |
Sen (95% CI) | 91.6 (88.5–93.9) | 92.7 (79.0–98.1) | 92.1 (88.5–94.6) | 95.0 (73.1–99.7) |
Spe (95% CI) | 91.1 (88.0–93.5) | 90.0 (75.4–96.7) | 88.5 (82.9–92.5) | 90.9 (57.1–99.5) |
PPV (95% CI) | 91.4 (88.3–93.7) | 90.5 (76.5–96.9) | 93.4 (90.1–95.7) | 95.0 (73.1–99.7) |
NPV (95% CI) | 91.3 (88.2–93.7) | 92.3 (78.0–98.0) | 86.2 (80.4–90.6) | 90.9 (57.1–99.5) |
AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision–recall curve; CI, confidence interval; Acc, accuracy; Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value. Acc, Sen, Spe, PPV, and NPV are reported as percentages.
a Because each patient yielded multiple tumor slices, per-patient diagnostic performance was calculated from the mean of all predicted probabilities for that patient's slices.
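The per-patient aggregation described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the example patient IDs and probabilities, and the 0.5 decision threshold are all assumptions, since the source states only that slice-level probabilities were averaged per patient.

```python
from collections import defaultdict
from statistics import fmean


def patient_level_probabilities(patient_ids, slice_probs):
    """Average slice-level predicted probabilities within each patient."""
    grouped = defaultdict(list)
    for pid, prob in zip(patient_ids, slice_probs):
        grouped[pid].append(prob)
    # One mean probability per patient, as described in the table footnote.
    return {pid: fmean(probs) for pid, probs in grouped.items()}


def patient_level_labels(patient_probs, threshold=0.5):
    """Binarize per-patient probabilities.

    The 0.5 threshold is an assumption; the source does not state a cutoff.
    """
    return {pid: int(prob >= threshold) for pid, prob in patient_probs.items()}


# Hypothetical data: three slices from patient "P1", two from patient "P2".
ids = ["P1", "P1", "P1", "P2", "P2"]
probs = [0.9, 0.7, 0.8, 0.2, 0.4]

patient_probs = patient_level_probabilities(ids, probs)
patient_preds = patient_level_labels(patient_probs)
```

The per-patient probabilities would then be compared against one reference label per patient to obtain metrics such as those in the per-patient columns.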