Table 3.
Comparison of patient-level, lesion-level, and voxel-level results for the automated model and observer 2, both measured against the observer 1 segmentations
Task | Metric | Automated model | Observer 2 |
---|---|---|---|
Patient-level classification | Accuracy (%) | 100 (28/28) | 92.9 (26/28) |
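Patient-level classification | Sensitivity (%) | 100 (20/20) | 100 (20/20) |
Patient-level classification | PPV (%) | 100 (20/20) | 90.9 (20/22) |
Patient-level classification | Specificity (%) | 100 (8/8) | 75 (6/8) |
Patient-level classification | NPV (%) | 100 (8/8) | 100 (6/6) |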
Lesion-level detection | PPV (%) | 95.5 (63/66) | 91.7 (66/72) |
Lesion-level detection | Sensitivity (%) | 68.5 (63/92) | 71.7 (66/92) |
Lesion-level detection | F1 score (%) | 79.7 | 80.5 |
Lesion sub-groups detection | | | |
Local prostate | Sensitivity (%) | 100 (15/15) | 93.3 (14/15) |
Regional nodal | Sensitivity (%) | 42.1 (8/19) | 73.7 (14/19) |
Distant nodal | Sensitivity (%) | 62.8 (27/43) | 62.8 (27/43) |
Osseous | Sensitivity (%) | 92.9 (13/14) | 78.6 (11/14) |
Visceral | Sensitivity (%) | 0 (0/1) | 0 (0/1) |
Voxel-level segmentation | DSC (%, mean ± SD) | 49.3 ± 18.9 | 33.1 ± 18.2 |
Voxel-level segmentation | Sensitivity (%, mean ± SD) | 47.7 ± 28.4 | 62.6 ± 37.7 |
Voxel-level segmentation | PPV (%, mean ± SD) | 67.9 ± 21.6 | 31.7 ± 24.3 |
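
The count-based entries in the table follow from the standard confusion-matrix definitions, with observer 1 taken as the reference. The sketch below shows one way to recompute them from the fractions given in parentheses. The function names (`pct`, `detection_metrics`, `classification_metrics`) are illustrative assumptions and do not come from the source. The false-positive and false-negative counts are derived from the reported denominators, for example FP = 66 - 63 for the automated model at lesion level.

```python
from typing import Dict


def pct(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def detection_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Lesion-level detection metrics; true negatives are not used here."""
    return {
        "ppv": pct(tp, tp + fp),
        "sensitivity": pct(tp, tp + fn),
        # F1 is the harmonic mean of PPV and sensitivity.
        "f1": pct(2 * tp, 2 * tp + fp + fn),
    }


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Patient-level classification metrics from a 2x2 confusion matrix."""
    return {
        "accuracy": pct(tp + tn, tp + fp + tn + fn),
        "sensitivity": pct(tp, tp + fn),
        "ppv": pct(tp, tp + fp),
        "specificity": pct(tn, tn + fp),
        "npv": pct(tn, tn + fn),
    }


# Lesion-level counts read off the table (92 reference lesions in total).
model = detection_metrics(tp=63, fp=66 - 63, fn=92 - 63)
observer2 = detection_metrics(tp=66, fp=72 - 66, fn=92 - 66)

for name, metrics in (("Automated model", model), ("Observer 2", observer2)):
    print(name, {key: round(value, 1) for key, value in metrics.items()})
# Automated model {'ppv': 95.5, 'sensitivity': 68.5, 'f1': 79.7}
# Observer 2 {'ppv': 91.7, 'sensitivity': 71.7, 'f1': 80.5}
```

The same `classification_metrics` helper can be applied to the patient-level counts, for instance `classification_metrics(tp=20, fp=0, tn=8, fn=0)` for the automated model. The voxel-level DSC has the same algebraic form as F1, but it is reported as mean ± SD, so it cannot be recomputed from pooled counts in this table.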