2022 Nov 28;24:62. doi: 10.1186/s12968-022-00899-5

Table 3.

Internal validation: segmentation performance on images with artifacts

	LV ED				LV ES
	DSC	HD (mm)	Recall	Precision	DSC	HD (mm)	Recall	Precision
CNN vs GT	0.93 (0.91–0.95)⁎	7.8 (4.7–11.8)	0.92 (0.86–0.94)⁎	0.95 (0.93–0.98)⁎	0.91 (0.84–0.93)⁎	6.0 (4.6–9.5)⁎	0.88 (0.84–0.94)⁎	0.94 (0.83–0.95)⁎
Circle vs GT	0.43 (0.25–0.86)†	10.6 (2.7–20.0)	0.34 (0.18–0.80)†	0.61 (0.41–0.90)†	0.63 (0.14–0.83)†	11.1 (4.1–20.3)†	0.56 (0.08–0.77)†	0.74 (0.30–0.88)†
O1 vs O2	0.93 (0.82–0.95)	5.2 (3.5–10.2)	0.91 (0.78–0.97)	0.93 (0.88–0.95)	0.92 (0.85–0.96)	6.1 (3.4–8.1)	0.94 (0.85–0.97)	0.91 (0.84–0.93)

	RV ED				RV ES
	DSC	HD (mm)	Recall	Precision	DSC	HD (mm)	Recall	Precision
CNN vs GT	0.87 (0.84–0.91)⁎	10.5 (6.8–14.3)⁎	0.85 (0.81–0.90)⁎	0.91 (0.86–0.93)⁎†	0.83 (0.73–0.90)⁎	9.3 (6.7–13.9)⁎†	0.81 (0.70–0.89)⁎	0.88 (0.80–0.92)⁎†
Circle vs GT	0.59 (0.21–0.78)†	25.5 (15.7–47.3)	0.45 (0.14–0.72)†	0.74 (0.42–0.85)†	0.50 (0.19–0.82)†	21.4 (7.9–38.2)	0.43 (0.13–0.78)†	0.69 (0.31–0.86)
O1 vs O2	0.85 (0.70–0.90)	13.0 (6.5–20.6)	0.86 (0.70–0.95)	0.82 (0.73–0.86)	0.76 (0.70–0.85)	14.1 (9.5–24.3)	0.85 (0.65–0.93)	0.72 (0.65–0.79)

	LVM ED				LVM ES
	DSC	HD (mm)	Recall	Precision	DSC	HD (mm)	Recall	Precision
CNN vs GT	0.77 (0.71–0.82)⁎†	6.5 (5.2–9.0)⁎	0.82 (0.79–0.87)⁎†	0.70 (0.67–0.79)⁎†	0.82 (0.73–0.84)⁎	7.1 (5.1–11.4)	0.84 (0.76–0.91)⁎†	0.75 (0.70–0.81)⁎
Circle vs GT	0.41 (0.15–0.70)†	14.8 (6.8–30.4)	0.31 (0.11–0.68)†	0.50 (0.29–0.74)†	0.59 (0.10–0.79)†	10.7 (5.5–25.4)†	0.51 (0.07–0.79)†	0.59 (0.26–0.83)†
O1 vs O2	0.71 (0.62–0.78)	6.4 (3.7–11.8)	0.75 (0.68–0.79)	0.70 (0.61–0.80)	0.76 (0.66–0.81)	8.2 (4.3–9.9)	0.72 (0.62–0.85)	0.76 (0.71–0.84)

Values are reported as median (interquartile range).

Segmentation performance on images with artifacts for the proposed CNN and for the commercial software (Circle), each compared with the manual gold standard (GT). The comparison between two observers (O1 vs O2) is also reported.

Abbreviations as in Table 2. ⁎p < 0.05 CNN vs. Circle; †p < 0.05 vs. inter-observer variability
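
For readers who want to see how the four metrics in the table relate to a pair of segmentations, the following is a minimal sketch and not the authors' implementation. It assumes each segmentation is given as a set of (row, column) pixel coordinates, that pixel spacing is isotropic, and that the Hausdorff distance (HD) is taken over all mask pixels rather than over contours; the function names `overlap_metrics`, `hausdorff_mm`, and `median_iqr` are hypothetical. The quartile method used for the interquartile range is also an assumption, since the table does not state one.

```python
from math import dist
from statistics import quantiles


def overlap_metrics(pred, truth):
    """Return (DSC, recall, precision) for two sets of (row, col) pixels."""
    tp = len(pred & truth)   # pixels labelled in both masks
    fp = len(pred - truth)   # pixels only in the prediction
    fn = len(truth - pred)   # pixels only in the reference
    # Convention assumed here: two empty masks count as perfect agreement.
    dsc = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return dsc, recall, precision


def hausdorff_mm(pred, truth, spacing_mm=1.0):
    """Symmetric Hausdorff distance in mm (brute force, non-empty masks)."""
    if not pred or not truth:
        raise ValueError("Hausdorff distance needs two non-empty masks")

    def directed(a, b):
        # Largest distance from any point in a to its nearest point in b.
        return max(min(dist(p, q) for q in b) for p in a)

    return spacing_mm * max(directed(pred, truth), directed(truth, pred))


def median_iqr(values):
    """Return (median, Q1, Q3) using linear interpolation between samples."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


if __name__ == "__main__":
    # Two toy 2x2 masks that overlap in one column.
    pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
    truth = {(0, 1), (1, 1), (0, 2), (1, 2)}
    print(overlap_metrics(pred, truth))
    print(hausdorff_mm(pred, truth, spacing_mm=1.5))
    print(median_iqr([0.90, 0.92, 0.93, 0.95, 0.96]))
```

In this sketch a low recall with a higher precision, as in the Circle vs GT rows, corresponds to a predicted mask that misses part of the reference structure (many false negatives) while including comparatively few pixels outside it.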