Table 1.
Comparison | Segmented tumor volume: correlation (R²) * | Segmented tumor volume: agreement ** | Quantitative performance: DSC | Quantitative performance: 95HD (mm) | Quantitative performance: ΔCOM (mm) |
---|---|---|---|---|---|
Annotator 1 vs. Annotator 2 | 0.79 (p < 0.001) | No (p = 0.005) | 0.76 ± 0.13 | 0.84 ± 0.63 | 0.51 ± 0.41 |
Annotator 1 vs. Annotator 3 | 0.88 (p < 0.001) | No (p = 0.011) | 0.76 ± 0.13 | 0.82 ± 0.55 | 0.41 ± 0.54 |
Annotator 2 vs. Annotator 3 | 0.77 (p < 0.001) | Yes (p = 0.251) | 0.75 ± 0.13 | 0.66 ± 0.47 | 0.39 ± 0.36 |
Annotator 1 vs. Deep learning | 0.93 (p < 0.001) | Yes (p = 0.365) | 0.81 ± 0.09 | 0.78 ± 0.53 | 0.30 ± 0.23 |
Annotator 2 vs. Deep learning | 0.87 (p < 0.001) | No (p = 0.001) | 0.74 ± 0.12 | 0.82 ± 0.41 | 0.51 ± 0.37 |
Annotator 3 vs. Deep learning | 0.92 (p < 0.001) | No (p = 0.007) | 0.76 ± 0.10 | 0.77 ± 0.39 | 0.45 ± 0.45 |
DSC: Dice similarity coefficient; 95HD: 95th percentile Hausdorff distance; ΔCOM: center of mass displacement. * Linear regression. ** Agreement: one-sample t-test on the difference between annotators. Quantitative performance metrics (DSC, 95HD, ΔCOM) are presented as mean ± SD.
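
The footnote names the overlap and displacement metrics without giving formulas. As a rough illustration only, the sketch below computes DSC and ΔCOM using their standard definitions, assuming each segmentation is available as a set of voxel index tuples and that the voxel spacing in mm is known. The function names, the toy masks, and the 0.5 mm spacing are hypothetical and are not taken from the study. 95HD is left out because it additionally requires surface extraction.

```python
import math


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for two voxel sets."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks count as perfect overlap
    return 2.0 * len(a & b) / (len(a) + len(b))


def center_of_mass(voxels, spacing):
    """Mean voxel position in mm, given the per-axis voxel spacing in mm."""
    n = len(voxels)
    return tuple(sum(v[i] for v in voxels) / n * spacing[i] for i in range(3))


def com_displacement(a, b, spacing=(1.0, 1.0, 1.0)):
    """Euclidean distance in mm between the centers of mass of two voxel sets."""
    return math.dist(center_of_mass(a, spacing), center_of_mass(b, spacing))


# Hypothetical toy masks: two 4 x 4 single-slice squares, offset by one voxel in x.
mask_a = {(x, y, 0) for x in range(0, 4) for y in range(4)}
mask_b = {(x, y, 0) for x in range(1, 5) for y in range(4)}

dsc = dice(mask_a, mask_b)
dcom = com_displacement(mask_a, mask_b, spacing=(0.5, 0.5, 0.5))
print(f"DSC = {dsc:.2f}, dCOM = {dcom:.2f} mm")
```

Representing masks as sets keeps the sketch dependency-free. With array-based masks the same definitions would apply to boolean volumes instead.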