Author manuscript; available in PMC: 2024 Aug 1.
Published in final edited form as: J Magn Reson Imaging. 2022 Dec 30;58(2):429–441. doi: 10.1002/jmri.28564

Table 2.

Testing results with various training settings and inter- and intra-reader reproducibility. The full YOLOv3 network performed better than the tiny network. Although augmentation methods such as reflection and translation/scaling/contrast reduced the mis-classification rate in individual slices, these augmentation methods did not improve the performance of 3D image prescription for either the full or the tiny network. Compared to the performance of the readers (radiologists), the full network’s mis-classification rate in 2D detection was low and on par with the disagreement rate from the intra- and inter-reader reproducibility studies. The full network’s performance in 3D liver detection and axial, coronal, and sagittal prescription was comparable to that of inter-reader reproducibility. Overlap: percentage of 3D volume from manual labeling covered by AI prescription.

| Metric | Statistic | Full network | Full network + reflection | Tiny network | Tiny network + reflection | Tiny network + all augmentation | Reproducibility of manual annotation: inter-reader | Reproducibility of manual annotation: intra-reader |
|---|---|---|---|---|---|---|---|---|
| 2D Annotation: IoU between AI and manual (%) | Mis-classification rate | 4.39 | 3.86 | 9.45 | 9.92 | 6.73 | 6.72 | 4.43 |
| | Median | 91.26 | 92.21 | 88.21 | 88.01 | 89.67 | 95.87 | 96.64 |
| | IQR | 8.90 | 8.37 | 10.08 | 10.72 | 10.66 | 4.69 | 5.03 |
| 3D Liver Detection: overlap between AI and manual (%) | Median | 97.62 | 97.66 | 96.65 | 97.17 | 95.17 | 97.02 | 98.91 |
| | IQR | 6.51 | 6.52 | 6.16 | 6.49 | 7.49 | 4.96 | 2.46 |
| 3D Axial Prescription: overlap between AI and manual (%) | Median | 98.48 | 98.41 | 95.26 | 95.62 | 96.51 | 97.09 | 99.05 |
| | IQR | 3.00 | 2.89 | 6.27 | 5.59 | 5.27 | 1.87 | 1.68 |
| | S/I shift for ≥99.5% of patients | 2.3 cm | 2.3 cm | 2.3 cm | 4.0 cm | 4.6 cm | 2.4 cm | 1.3 cm |
| 3D Coronal Prescription: overlap between AI and manual (%) | Median | 98.32 | 98.17 | 96.30 | 96.91 | 95.95 | 97.73 | 98.71 |
| | IQR | 3.76 | 3.71 | 6.13 | 5.13 | 5.65 | 2.18 | 2.25 |
| 3D Sagittal Prescription: overlap between AI and manual (%) | Median | 97.89 | 97.90 | 96.53 | 96.92 | 96.55 | 97.31 | 99.53 |
| | IQR | 5.50 | 5.37 | 5.79 | 5.92 | 6.79 | 3.64 | 1.29 |
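The two agreement measures in Table 2 can be illustrated with a minimal sketch. The caption defines overlap as the percentage of the manually labeled 3D volume covered by the AI prescription. The table does not spell out how IoU is computed, so the standard intersection-over-union definition is assumed here. The function names are hypothetical, and the boxes are assumed to be axis-aligned and given as `(lower_corner, upper_corner)` pairs, which may differ from how the prescriptions were actually represented in the study.

```python
from typing import Sequence, Tuple

# A box is (lower_corner, upper_corner); works for 2D or 3D coordinates.
Box = Tuple[Sequence[float], Sequence[float]]


def box_volume(lo: Sequence[float], hi: Sequence[float]) -> float:
    """Area (2D) or volume (3D) of an axis-aligned box; 0 if it is empty."""
    size = 1.0
    for a, b in zip(lo, hi):
        size *= max(0.0, b - a)
    return size


def intersection_volume(box_a: Box, box_b: Box) -> float:
    """Area or volume shared by two axis-aligned boxes."""
    lo = [max(a, b) for a, b in zip(box_a[0], box_b[0])]
    hi = [min(a, b) for a, b in zip(box_a[1], box_b[1])]
    return box_volume(lo, hi)


def iou_percent(box_ai: Box, box_manual: Box) -> float:
    """Intersection over union, in percent (assumed standard definition)."""
    inter = intersection_volume(box_ai, box_manual)
    union = box_volume(*box_ai) + box_volume(*box_manual) - inter
    return 100.0 * inter / union if union > 0 else 0.0


def overlap_percent(box_ai: Box, box_manual: Box) -> float:
    """Percent of the manual volume covered by the AI box (caption definition)."""
    manual = box_volume(*box_manual)
    if manual == 0:
        return 0.0
    return 100.0 * intersection_volume(box_ai, box_manual) / manual


# Illustrative values only; these are not data from the study.
iou_2d = iou_percent(((0, 0), (2, 2)), ((1, 1), (3, 3)))
overlap_3d = overlap_percent(((0, 0, 1), (10, 10, 12)), ((0, 0, 0), (10, 10, 10)))
print(round(iou_2d, 2), round(overlap_3d, 2))  # prints 14.29 90.0
```

Unlike IoU, the overlap measure is asymmetric: an AI box that is larger than the manual box can still reach 100% overlap, so it measures coverage of the manual volume rather than how tightly the two boxes match.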