Table 4. Comparison of Segmentation Performance and Volumetric Measurement Results of Deep Learning Algorithm between Internal CT Data and External CT Data in Test Dataset-2.
| Performance Statistics | External CT Data | Internal CT Data | P* |
|---|---|---|---|
| Dice similarity score† | | | |
| Liver | 0.982 ± 0.011 (0.932–0.999) | 0.983 ± 0.007 (0.954–0.998) | 0.28 |
| Spleen | 0.969 ± 0.011 (0.930–0.994) | 0.968 ± 0.010 (0.936–0.993) | 0.41 |
| 95% LOA of volumetric indices‡ | | | |
| Liver volume | -0.34 ± 2.67 | -0.03 ± 3.47 | NA |
| Spleen volume | -0.63 ± 4.34 | -0.33 ± 3.70 | NA |
| Liver/spleen volume ratio | 0.30 ± 5.22 | 0.50 ± 4.02 | NA |
*P values are for comparison of the Dice similarity score between internal CT data and external CT data using the Wilcoxon test. †Data are expressed as mean ± standard deviation; data in parentheses are range. ‡Data are Bland-Altman 95% LOA expressed in percentage as mean difference ± 1.96 × standard deviation of the difference. LOA = limits of agreement, NA = not applicable.
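
To make the two kinds of entries in Table 4 concrete, the following is a minimal sketch of how a Dice similarity score and a percentage-based Bland-Altman 95% LOA could be computed. It is not the study's code. The function names `dice_score` and `percent_loa` are hypothetical, the input volumes are made-up numbers, and the choice to express each difference as a percentage of the reference volume is an assumption, since the table does not state the denominator.

```python
from statistics import mean, stdev


def dice_score(pred, truth):
    """Dice similarity score for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def percent_loa(auto, ref):
    """Bland-Altman 95% LOA of percentage differences, as (bias, half_width)."""
    # Assumption: each difference is expressed relative to the reference value.
    diffs = [100.0 * (a - r) / r for a, r in zip(auto, ref)]
    bias = mean(diffs)                # mean difference
    half_width = 1.96 * stdev(diffs)  # 1.96 x sample standard deviation (n - 1)
    return bias, half_width


# Made-up volumes in mL, for illustration only (not study data).
auto_ml = [1510.0, 1320.0, 1195.0, 1680.0]
ref_ml = [1500.0, 1330.0, 1200.0, 1650.0]

bias, half_width = percent_loa(auto_ml, ref_ml)
print(f"LOA: {bias:.2f} ± {half_width:.2f} %")
print(f"Dice: {dice_score([1, 1, 0, 0], [1, 0, 0, 0]):.3f}")
```

Read this way, an entry such as -0.34 ± 2.67 corresponds to limits of agreement running from -3.01% to 2.33%.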