2021 Sep 13;13(18):4585. doi: 10.3390/cancers13184585

Table 1.

Subjectivity and bias in human interpretation.

| Comparison | Segmented tumor volume: correlation (R²) * | Segmented tumor volume: agreement ** | DSC | 95HD (mm) | ΔCOM (mm) |
|---|---|---|---|---|---|
| Annotator 1 vs. Annotator 2 | 0.79 (p < 0.001) | No (p = 0.005) | 0.76 ± 0.13 | 0.84 ± 0.63 | 0.51 ± 0.41 |
| Annotator 1 vs. Annotator 3 | 0.88 (p < 0.001) | No (p = 0.011) | 0.76 ± 0.13 | 0.82 ± 0.55 | 0.41 ± 0.54 |
| Annotator 2 vs. Annotator 3 | 0.77 (p < 0.001) | Yes (p = 0.251) | 0.75 ± 0.13 | 0.66 ± 0.47 | 0.39 ± 0.36 |
| Annotator 1 vs. Deep learning | 0.93 (p < 0.001) | Yes (p = 0.365) | 0.81 ± 0.09 | 0.78 ± 0.53 | 0.30 ± 0.23 |
| Annotator 2 vs. Deep learning | 0.87 (p < 0.001) | No (p = 0.001) | 0.74 ± 0.12 | 0.82 ± 0.41 | 0.51 ± 0.37 |
| Annotator 3 vs. Deep learning | 0.92 (p < 0.001) | No (p = 0.007) | 0.76 ± 0.10 | 0.77 ± 0.39 | 0.45 ± 0.45 |

DSC: Dice similarity coefficient; 95HD: 95th percentile Hausdorff distance; ΔCOM: center of mass displacement. * Linear regression. ** Agreement: one-sample t-test on the difference between annotators. Quantitative performance metrics are presented as mean ± SD.
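
For readers who want to reproduce metrics of this kind, the following is a minimal sketch of how DSC, 95HD, and ΔCOM could be computed for a pair of binary segmentation masks. It is not the authors' implementation. The function names (`dice`, `com_displacement`, `hd95`, `surface_points`) and the `spacing` parameter (voxel size in mm per axis) are assumptions made for illustration. The 95HD variant shown takes the larger of the two directed 95th-percentile surface distances; other definitions exist and give slightly different values.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    # Convention (assumption): two empty masks count as perfect overlap.
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


def com_displacement(a, b, spacing):
    """Euclidean distance (mm) between the centers of mass of two masks.

    Both masks must be non-empty.
    """
    spacing = np.asarray(spacing, float)
    ca = np.argwhere(np.asarray(a, bool)).mean(axis=0) * spacing
    cb = np.argwhere(np.asarray(b, bool)).mean(axis=0) * spacing
    return float(np.linalg.norm(ca - cb))


def surface_points(mask, spacing):
    """Physical coordinates (mm) of foreground voxels that touch background."""
    mask = np.asarray(mask, bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    all_neighbors_fg = np.ones(mask.shape, dtype=bool)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # Face-connected neighbor along this axis; padding acts as background.
            all_neighbors_fg &= np.roll(padded, shift, axis=axis)[core]
    surface = mask & ~all_neighbors_fg
    return np.argwhere(surface) * np.asarray(spacing, float)


def hd95(a, b, spacing):
    """95th percentile Hausdorff distance (mm) between two mask surfaces.

    Brute-force pairwise distances: fine for small masks, memory-hungry
    for large 3D volumes. Both masks must be non-empty.
    """
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    d_ab = d.min(axis=1)  # each surface point of a to nearest point of b
    d_ba = d.min(axis=0)  # each surface point of b to nearest point of a
    return float(max(np.percentile(d_ab, 95), np.percentile(d_ba, 95)))


if __name__ == "__main__":
    # Toy example: two 4 x 4 squares offset by one pixel, 1 mm isotropic spacing.
    m1 = np.zeros((10, 10), dtype=bool)
    m2 = np.zeros((10, 10), dtype=bool)
    m1[2:6, 2:6] = True
    m2[2:6, 3:7] = True
    print(dice(m1, m2), hd95(m1, m2, (1.0, 1.0)), com_displacement(m1, m2, (1.0, 1.0)))
```

In a study like the one summarized in the table, such functions would be applied per case to each annotator pair, and the resulting values summarized as mean ± SD.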