2022 May 28;12:30. doi: 10.1186/s13550-022-00901-2

Table 3.

Metrics for AI segmentations without manual adjustments applied to the test cohort (n = 41)

	AI model
Metric	Model 1	Model 2	Model 3	Model 4	Ensemble
Pixel-wise
Dice	0.801 (0.206)	0.817 (0.176)	0.768 (0.234)	0.763 (0.233)	0.801 (0.196)
Precision	0.772 (0.258)	0.816 (0.223)	0.752 (0.279)	0.787 (0.258)	0.786 (0.250)
Sensitivity	0.893 (0.173)	0.860 (0.180)	0.869 (0.182)	0.821 (0.231)*	0.872 (0.177)

Lesion-wise
Dice	0.847 (0.286)	0.828 (0.264)	0.809 (0.268)	0.803 (0.258)	0.850 (0.278)
Sensitivity	0.854 (0.230)	0.827 (0.234)	0.843 (0.228)	0.831 (0.243)	0.844 (0.238)

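All values are calculated as the mean over the 41 patients of the test cohort, with the standard deviation in parentheses. The highest values across the models/ensemble in each evaluation metric are: pixel-wise Dice 0.817 (Model 2), pixel-wise precision 0.816 (Model 2), pixel-wise sensitivity 0.893 (Model 1), lesion-wise Dice 0.850 (Ensemble) and lesion-wise sensitivity 0.854 (Model 1). *Denotes a statistically significant difference in sensitivity between Model 4 and Model 1 (p = 0.017).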
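As an illustration of how pixel-wise entries of this kind could be produced, the following is a minimal sketch that uses the standard definitions of Dice, precision and sensitivity on binary masks and then reports a per-patient mean with standard deviation. It is not the authors' code. The function names `pixel_metrics` and `summarize`, the flat-list mask representation, the handling of empty masks and the use of the sample standard deviation are all assumptions made for the example; the table does not state which standard deviation estimator was used.

```python
from statistics import mean, stdev


def pixel_metrics(pred, truth):
    """Return (dice, precision, sensitivity) for two equal-length binary masks.

    Masks are flat sequences of 0/1 values (hypothetical representation).
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # false negatives

    # Assumed conventions for empty masks: Dice is 1.0 when both masks are
    # empty; precision and sensitivity fall back to 0.0 when undefined.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, precision, sensitivity


def summarize(values):
    """Format per-patient values as 'mean (SD)', as in the table cells.

    Uses the sample standard deviation (an assumption) and needs at least
    two values.
    """
    return f"{mean(values):.3f} ({stdev(values):.3f})"


# Toy data: two "patients", each a (predicted mask, reference mask) pair.
patients = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 1, 0], [1, 1, 1, 0]),
]
per_patient = [pixel_metrics(pred, truth) for pred, truth in patients]
precision_cell = summarize([m[1] for m in per_patient])
print("Precision", precision_cell)
```

Averaging per patient, rather than pooling all voxels, gives every patient equal weight regardless of lesion volume, which is what the footnote's "mean of the 41 patients" describes.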