Table 2.
Fully automated model performance at all levels (patient-level classification through to voxel-level segmentation), calculated on the dedicated test set.
| Task | Metric | Value |
|---|---|---|
| Patient-level classification | Accuracy (%) | 94.5 (121/128) |
| | Sensitivity (%) | 93.3 (70/75) |
| | PPV (%) | 97.2 (70/72) |
| | Specificity (%) | 96.2 (51/53) |
| | NPV (%) | 91.1 (51/56) |
| Lesion-level detection | PPV (%) | 88.2 (224/254) |
| | Sensitivity (%) | 73.0 (224/307) |
| | F1 score (%) | 79.9 |
| Lesion sub-groups detection | | |
| Local prostate | Sensitivity (%) | 90.0 (36/40) |
| Regional nodal | Sensitivity (%) | 68.3 (41/60) |
| Distant nodal | Sensitivity (%) | 76.2 (115/151) |
| Osseous | Sensitivity (%) | 58.2 (32/55) |
| Visceral | Sensitivity (%) | 0 (0/1) |
| Voxel-level segmentation | DSC (mean ± SD) | 43.5 ± 21.5 |
| | Sensitivity (mean ± SD) | 45.0 ± 29.2 |
| | PPV (mean ± SD) | 58.5 ± 28.2 |
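
The patient-level and lesion-level percentages follow directly from the counts shown in parentheses. As a minimal sketch, assuming the standard confusion-matrix definitions, the values can be recomputed as below. The helper names `pct` and `binary_metrics` are hypothetical and not taken from the source. The false-positive and false-negative counts are inferred from the reported denominators: 72 − 70 = 2 and 75 − 70 = 5 at the patient level, 254 − 224 = 30 and 307 − 224 = 83 at the lesion level.

```python
# Illustrative only: counts are read off Table 2; helper names are hypothetical.

def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


def binary_metrics(tp, fp, fn, tn=None):
    """Standard detection metrics from confusion-matrix counts.

    Metrics that need true negatives (specificity, NPV, accuracy) are
    only computed when `tn` is given; lesion-level detection has no
    well-defined true-negative count.
    """
    metrics = {
        "sensitivity": pct(tp, tp + fn),
        "ppv": pct(tp, tp + fp),
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of PPV and sensitivity
        "f1": pct(2 * tp, 2 * tp + fp + fn),
    }
    if tn is not None:
        metrics["specificity"] = pct(tn, tn + fp)
        metrics["npv"] = pct(tn, tn + fn)
        metrics["accuracy"] = pct(tp + tn, tp + fp + fn + tn)
    return metrics


# Patient level: TP = 70, FP = 2, FN = 5, TN = 51 (128 patients in total)
patient = binary_metrics(tp=70, fp=2, fn=5, tn=51)

# Lesion level: TP = 224, FP = 30, FN = 83
lesion = binary_metrics(tp=224, fp=30, fn=83)

print(patient)
print(lesion)
```

The per-region true positives (36 + 41 + 115 + 32 + 0) sum to 224 and the per-region lesion totals (40 + 60 + 151 + 55 + 1) sum to 307, so the sub-group rows are consistent with the overall lesion-level sensitivity.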