Table 1.
Mean metric values for human vs. human and computer vs. human comparisons.
Comparison (reference) | Recall | Recall ≥ 10 mm | FP per case | Dice per case | Dice per correspondence | Merge error | Split error |
---|---|---|---|---|---|---|---|
Human vs. Human | |||||||
MTRA (LiTS) | 0.92 | 0.94 | 2.6 | 0.70 ± 0.27 | 0.72 ± 0.11 | 11 | 5 |
LiTS (MTRA) | 0.62 | 0.85 | 0.3 | 0.70 ± 0.27 | 0.72 ± 0.11 | 5 | 12 |
Computer vs. Human | |||||||
FCN (MTRA) | 0.47 | 0.75 | 4.7 | 0.53 ± 0.37 | 0.72 ± 0.11 | 7 | 13 |
FCN (LiTS) | 0.72 | 0.86 | 4.6 | 0.51 ± 0.37 | 0.65 ± 0.16 | 12 | 14 |
FCN + RF (LiTS) | 0.63 | 0.77 | 0.7 | 0.58 ± 0.36 | 0.69 ± 0.18 | 11 | 10 |
The parentheses denote the dataset used as a reference for the computation of evaluation metrics.