Table 4.
Dataset | #Images | Dice (R1 vs. R2) | PD-R1 | PD-R2 | %(R1 > R2) | GED |
---|---|---|---|---|---|---|
GVA(train) | 2496 | 0.76 ± 0.17 | 0.19 ± 0.13 * | 0.16 ± 0.12 | 73.31 | 3.13 ± 0.44 |
GVA(test) | 844 | 0.77 ± 0.15 | 0.19 ± 0.14 * | 0.16 ± 0.13 | 68.36 | 3.16 ± 0.40 |
IMIM | 381 | 0.72 ± 0.20 | 0.17 ± 0.15 | 0.14 ± 0.11 | 62.2 | 2.80 ± 0.56 |
The Dice score between R1 and R2 and the percent density (PD) for each annotator are reported as mean ± standard deviation. The column %(R1 > R2) gives the percentage of images in which PD-R1 was larger than PD-R2. The last column shows the generalized energy distance (GED) between each annotator's labels and the estimated segmentation for that annotator. * The difference between the PD obtained by R1 and by R2 was statistically significant (p < 0.00001).
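As a rough illustration of how the per-image quantities in the table could be computed, the sketch below evaluates the Dice score between two annotators' masks, the percent density of each annotation, and the percentage of images where PD-R1 exceeds PD-R2. It assumes binary masks stored as nested lists of 0/1 values and takes PD to be the dense-tissue area divided by the breast area; that definition, the handling of empty masks, and all function names are assumptions for illustration, not the authors' implementation. The GED column is not sketched because its distance function is not specified here.

```python
from itertools import chain


def _flatten(mask):
    """Flatten a 2D mask (rows of 0/1 values) into a list of booleans."""
    return [bool(v) for v in chain.from_iterable(mask)]


def dice_score(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a, b = _flatten(mask_a), _flatten(mask_b)
    if len(a) != len(b):
        raise ValueError("masks must have the same shape")
    overlap = sum(x and y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * overlap / total


def percent_density(dense_mask, breast_mask):
    """Fraction of breast pixels labelled as dense tissue (assumed definition)."""
    dense, breast = _flatten(dense_mask), _flatten(breast_mask)
    if len(dense) != len(breast):
        raise ValueError("masks must have the same shape")
    breast_area = sum(breast)
    if breast_area == 0:
        raise ValueError("breast mask is empty")
    return sum(d and b for d, b in zip(dense, breast)) / breast_area


def percent_r1_larger(pd_r1, pd_r2):
    """Percentage of images in which PD-R1 is strictly larger than PD-R2."""
    if len(pd_r1) != len(pd_r2) or not pd_r1:
        raise ValueError("need two non-empty lists of equal length")
    return 100.0 * sum(p1 > p2 for p1, p2 in zip(pd_r1, pd_r2)) / len(pd_r1)


# Toy 4x4 image: the whole image is breast; R1 marks 4 dense pixels, R2 marks 2.
breast = [[1, 1, 1, 1]] * 4
dense_r1 = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
dense_r2 = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(dice_score(dense_r1, dense_r2))     # Dice between the two annotators
print(percent_density(dense_r1, breast))  # PD-R1 for this image
print(percent_density(dense_r2, breast))  # PD-R2 for this image
```

Applying these functions to every image and then taking the mean and standard deviation (for example with `statistics.mean` and `statistics.stdev`) would give entries in the same form as the table's Dice and PD columns.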