2023 Mar 6;38:103368. doi: 10.1016/j.nicl.2023.103368

Table 6.

Results on dataset2 (the clinical testing dataset), presented as mean ± standard error of the mean across the dataset. VER denotes the Volume Error Rate and AVER the absolute VER. The best performance with respect to each annotator is denoted in boldface.

| Method | Data Augmentation | Annotator | Dice | Recall | Precision | VER | AVER | Pearson r |
|---|---|---|---|---|---|---|---|---|
| 1-step | no | 1 | 0.61 ± 0.02 | 0.63 ± 0.02 | 0.63 ± 0.03 | 0.01 ± 0.06 | 0.24 ± 0.03 | 0.68 |
| 1-step | no | 2 | 0.56 ± 0.02 | 0.49 ± 0.02 | 0.69 ± 0.03 | 0.49 ± 0.10 | 0.60 ± 0.08 | 0.47 |
| 1-step | yes | 1 | 0.64 ± 0.01 | 0.58 ± 0.02 | 0.74 ± 0.02 | 0.32 ± 0.06 | 0.34 ± 0.05 | 0.64 |
| 1-step | yes | 2 | 0.56 ± 0.02 | 0.44 ± 0.02 | 0.80 ± 0.01 | 0.94 ± 0.11 | 0.94 ± 0.11 | 0.48 |
| 2-step | no | 1 | 0.62 ± 0.02 | 0.64 ± 0.02 | 0.61 ± 0.03 | −0.04 ± 0.05 | 0.19 ± 0.03 | 0.75 |
| 2-step | no | 2 | 0.56 ± 0.02 | 0.51 ± 0.02 | 0.68 ± 0.03 | 0.42 ± 0.09 | 0.51 ± 0.07 | 0.52 |
| 2-step | yes | 1 | 0.67 ± 0.01 | 0.62 ± 0.01 | 0.75 ± 0.02 | 0.22 ± 0.03 | 0.24 ± 0.03 | 0.84 |
| 2-step | yes | 2 | 0.59 ± 0.02 | 0.47 ± 0.02 | 0.81 ± 0.01 | 0.80 ± 0.08 | 0.80 ± 0.08 | 0.62 |
| Inter-rater agreement | | | 0.64 ± 0.02 | 0.78 ± 0.01 | 0.55 ± 0.02 | −0.29 ± 0.03 | 0.29 ± 0.03 | 0.64 |
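As an illustration of how per-case values like those summarized in Table 6 might be computed and aggregated, the following is a minimal sketch, not the authors' implementation. It assumes binary segmentation masks with at least one foreground voxel each. The function names `overlap_metrics` and `mean_sem` are hypothetical. The table does not define VER, so the form used here, (V_ref − V_pred) / V_pred, is an assumption about both the sign convention and the denominator; AVER is taken as its absolute value.

```python
import numpy as np


def overlap_metrics(pred, ref):
    """Per-case overlap and volume metrics for two binary masks.

    Assumes both masks contain at least one foreground voxel.
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)

    tp = np.logical_and(pred, ref).sum()  # voxels labeled in both masks
    v_pred = pred.sum()                   # predicted volume in voxels
    v_ref = ref.sum()                     # reference volume in voxels

    # ASSUMPTION: VER form is not stated in the table; this is one plausible choice.
    ver = (v_ref - v_pred) / v_pred

    return {
        "dice": 2.0 * tp / (v_pred + v_ref),
        "recall": tp / v_ref,
        "precision": tp / v_pred,
        "ver": ver,
        "aver": abs(ver),
    }


def mean_sem(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(values.size)
```

Under these assumptions, each table cell of the form "mean ± SEM" would correspond to calling `mean_sem` on the list of per-case values of one metric, for example `mean_sem([m["dice"] for m in per_case_metrics])`, where `per_case_metrics` is a hypothetical list of `overlap_metrics` results, one per case in the dataset.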