[Preprint]. 2025 May 1:arXiv:2501.14066v2. [Version 2]

Table 4: Phase classification performance on the C4KC-KiTS dataset for five models: XGBoost, ResNet3D 18-layer (r3d_18), Mixed Convolution Network 18-layer (mc3_18), R(2+1)D 18-layer (r2plus1d_18), and TotalSegmentator (ts_phase). Models are evaluated using area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), F1 score, and accuracy.

Phase / Model    AUC    Sensitivity  Specificity  PPV    F1-score  Accuracy  p-value
Non-contrast
  XGBoost        0.994  0.981        0.996        0.990  0.985     0.981
  r3d_18         0.992  0.981        0.973        0.929  0.954     0.981     NaN
  mc3_18         0.989  0.971        0.940        0.852  0.908     0.971     1.000
  r2plus1d_18    0.992  0.981        0.986        0.963  0.972     0.981     NaN
  ts_phase       0.984  0.971        0.996        0.990  0.981     0.990     1.000
Arterial/Venous
  XGBoost        0.994  0.961        0.974        0.975  0.968     0.961
  r3d_18         0.961  0.876        0.878        0.884  0.880     0.876     <0.001
  mc3_18         0.925  0.838        0.772        0.796  0.816     0.838     <0.001
  r2plus1d_18    0.917  0.800        0.777        0.792  0.796     0.800     <0.001
  ts_phase       0.620  0.614        0.626        0.635  0.624     0.620     <0.001
Delayed
  XGBoost        0.991  0.956        0.974        0.915  0.935     0.956
  r3d_18         0.926  0.670        0.917        0.701  0.685     0.670     <0.001
  mc3_18         0.862  0.406        0.911        0.569  0.474     0.406     <0.001
  r2plus1d_18    0.877  0.494        0.867        0.517  0.505     0.494     <0.001
  ts_phase       0.469  0.197        0.741        0.180  0.188     0.620     <0.001

P-values indicate the significance of accuracy differences compared to XGBoost (p<0.001 considered significant).
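
For readers who want to reproduce numbers of this kind, the following is a minimal sketch of how the threshold-based metrics in the table and a paired accuracy comparison against a reference model could be computed. Several things here are assumptions rather than details taken from the source: the one-vs-rest treatment of each phase, the choice of an exact McNemar test (the source does not name the test behind its p-values), and the function names and toy labels, which are purely illustrative. AUC is omitted because it requires predicted scores rather than hard labels.

```python
from math import comb


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Metrics for one phase, treating every other phase as negative (assumed setup)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value for two models scored on the same cases.

    Only discordant cases (exactly one model correct) carry information.
    Returns NaN when there are no discordant cases.
    """
    triples = list(zip(y_true, pred_a, pred_b))
    b = sum(a == t and m != t for t, a, m in triples)  # only model A correct
    c = sum(a != t and m == t for t, a, m in triples)  # only model B correct
    n = b + c
    if n == 0:
        return float("nan")
    # Binomial(n, 0.5) tail probability up to the smaller discordant count.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical labels: "nc" = non-contrast, "av" = arterial/venous, "d" = delayed.
    truth = ["nc", "nc", "av", "av", "d", "d"]
    model_a = ["nc", "nc", "av", "av", "d", "av"]
    model_b = ["nc", "av", "av", "d", "av", "av"]
    print(one_vs_rest_metrics(truth, model_a, "av"))
    print(mcnemar_exact(truth, model_a, model_b))
```

Under this particular test, two models that are right and wrong on exactly the same cases give an undefined p-value, which the sketch returns as NaN. Whether that is what the NaN entries in the table reflect is not stated in the source.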