2024 Dec 30;24:353. doi: 10.1186/s12880-024-01548-2

Table 4.

Prediction performance of each model in the training, testing, and validation sets

| Dataset | Model | AUC (95% CI) | Accuracy | F1 score | Sensitivity | Specificity | DeLong test p |
|---|---|---|---|---|---|---|---|
| Training set | Clinical | 0.982 (0.964–0.995) | 0.927 | 0.927 | 0.911 | 0.944 | 0.061 |
| Training set | MRS | 0.856 (0.798–0.911) | 0.770 | 0.757 | 0.941 | 0.578 | <0.001* |
| Training set | Intra-radiomics | 0.939 (0.905–0.968) | 0.874 | 0.875 | 0.871 | 0.878 | <0.001* |
| Training set | Peri-radiomics | 0.957 (0.928–0.980) | 0.911 | 0.911 | 0.891 | 0.933 | 0.004* |
| Training set | Combined | 1.000 (0.999–1.000) | 0.990 | 0.989 | 1.000 | 0.978 | Ref |
| Testing set | Clinical | 0.868 (0.782–0.947) | 0.795 | 0.795 | 0.795 | 0.795 | 0.024 |
| Testing set | MRS | 0.824 (0.726–0.904) | 0.735 | 0.720 | 0.909 | 0.538 | 0.004* |
| Testing set | Intra-radiomics | 0.936 (0.884–0.980) | 0.880 | 0.879 | 0.886 | 0.872 | 0.352 |
| Testing set | Peri-radiomics | 0.943 (0.894–0.983) | 0.904 | 0.902 | 0.955 | 0.846 | 0.442 |
| Testing set | Combined | 0.968 (0.924–0.995) | 0.928 | 0.927 | 0.932 | 0.923 | Ref |
| Validation set | Clinical | 0.834 (0.718–0.936) | 0.737 | 0.732 | 0.821 | 0.688 | <0.001* |
| Validation set | MRS | 0.787 (0.660–0.896) | 0.750 | 0.743 | 0.786 | 0.729 | <0.001* |
| Validation set | Intra-radiomics | 0.913 (0.843–0.974) | 0.816 | 0.811 | 0.893 | 0.771 | 0.365 |
| Validation set | Peri-radiomics | 0.893 (0.795–0.969) | 0.882 | 0.874 | 0.857 | 0.896 | 0.117 |
| Validation set | Combined | 0.940 (0.881–0.988) | 0.895 | 0.890 | 0.923 | 0.875 | Ref |

AUC, area under the curve; CI, confidence interval; Ref, reference model for the DeLong test

*, p < 0.0125. Multiple comparisons were accounted for using the Bonferroni correction (alpha = 0.05, adjusted significance threshold = 0.0125).
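The adjusted threshold in the footnote is consistent with dividing alpha by four comparisons, presumably the four single-source models each tested against the Combined model. The minimal Python sketch below reproduces that arithmetic and applies it to the testing-set p-values from Table 4. The count of four is an inference from 0.05 / 0.0125, not something the table states, and the function name `bonferroni_threshold` is illustrative only.

```python
# Sketch: Bonferroni-adjusted threshold and the resulting significance flags.
# Assumption: 4 comparisons (Clinical, MRS, Intra-radiomics, Peri-radiomics,
# each against the Combined reference model); inferred from 0.05 / 0.0125.

ALPHA = 0.05
N_COMPARISONS = 4


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Return the per-comparison significance threshold."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# DeLong test p-values for the testing set, copied from Table 4.
testing_p_values = {
    "Clinical": 0.024,
    "MRS": 0.004,
    "Intra-radiomics": 0.352,
    "Peri-radiomics": 0.442,
}

threshold = bonferroni_threshold(ALPHA, N_COMPARISONS)

# A comparison is flagged (marked * in the table) when p is below the threshold.
flags = {model: p < threshold for model, p in testing_p_values.items()}

for model, significant in flags.items():
    marker = "*" if significant else ""
    print(f"{model}: p = {testing_p_values[model]}{marker}")
```

Under this threshold the testing-set Clinical model (p = 0.024) is not flagged even though it falls below the unadjusted 0.05 level, which matches the absence of an asterisk in the table.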