Table 4.
| Dataset | Model | AUC | 95% CI | P1 | P2 | P3 |
|---|---|---|---|---|---|---|
| Training | Clinical | 0.58 | 0.52–0.65 | < 0.01* | | |
| | Semantic | 0.85 | 0.81–0.89 | 0.01* | | |
| | Volume | 0.84 | 0.80–0.88 | 0.01* | | |
| | FS | 0.90 | 0.87–0.93 | 0.82 | 0.58 | < 0.01* |
| | Radiomics | 0.89 | 0.86–0.93 | | | |
| | RV | 0.90 | 0.87–0.94 | 0.18 | < 0.01* | |
| | CSRV | 0.91 | 0.88–0.94 | | | |
| | FSV | 0.94 | 0.91–0.96 | 0.06 | | |
| | FSRV | 0.96 | 0.94–0.98 | | | |
| | CSFSRV | 0.96 | 0.94–0.98 | < 0.01* | 0.74 | |
| Testing | Clinical | 0.55 | 0.45–0.65 | < 0.01* | | |
| | Semantic | 0.85 | 0.78–0.92 | 0.28 | | |
| | Volume | 0.87 | 0.81–0.93 | 0.54 | | |
| | FS | 0.93 | 0.88–0.97 | 0.20 | 0.21 | 0.01* |
| | Radiomics | 0.89 | 0.83–0.94 | | | |
| | RV | 0.88 | 0.82–0.93 | 0.10 | < 0.01* | |
| | CSRV | 0.89 | 0.84–0.94 | 0.21 | | |
| | FSV | 0.98 | 0.96–1 | 0.50 | | |
| | FSRV | 0.97 | 0.94–1 | | | |
| | CSFSRV | 0.97 | 0.94–0.99 | < 0.01* | 0.25 | |
| Validation | Clinical | 0.61 | 0.51–0.72 | < 0.01* | | |
| | Semantic | 0.87 | 0.81–0.92 | 0.75 | | |
| | Volume | 0.93 | 0.88–0.98 | 0.16 | | |
| | FS | 0.92 | 0.87–0.96 | 0.29 | 0.97 | 0.01* |
| | Radiomics | 0.88 | 0.81–0.94 | | | |
| | RV | 0.91 | 0.86–0.96 | 0.71 | 0.03* | |
| | CSRV | 0.92 | 0.87–0.96 | | | |
| | FSV | 0.97 | 0.94–0.99 | 0.62 | | |
| | FSRV | 0.96 | 0.93–0.99 | | | |
| | CSFSRV | 0.96 | 0.94–0.99 | 0.01* | 0.30 | |
FS, frozen section; RV, radiomics combined with volume; CSRV, radiomics combined with clinical, semantic, and volume; FSV, frozen section combined with volume; FSRV, frozen section combined with radiomics and volume; CSFSRV, radiomics combined with clinical, semantic, volume, and frozen section; AUC, area under the curve; CI, confidence interval. *p < 0.05. P1, p values for radiomics versus other models; P2, p values for CSRV versus other models; P3, p values for FSRV versus other models. p values were calculated with the ROC test using the DeLong method.
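
To illustrate the kind of pairwise comparison reported in P1 to P3, the following is a minimal Python sketch of DeLong's test for two correlated AUCs measured on the same cases. It is not the authors' code and does not reproduce the values in the table. The function names `delong_test` and `_placements` are hypothetical, and the sketch assumes binary labels (1 = positive), at least two cases per class, and a two-sided test.

```python
import math
import numpy as np


def _placements(pos, neg):
    """Return the AUC and the DeLong structural components V10 and V01."""
    # psi[i, j] = 1 if pos[i] > neg[j], 0.5 on ties, 0 otherwise
    diff = pos[:, None] - neg[None, :]
    psi = (diff > 0).astype(float) + 0.5 * (diff == 0)
    v10 = psi.mean(axis=1)  # one component per positive case
    v01 = psi.mean(axis=0)  # one component per negative case
    return psi.mean(), v10, v01


def delong_test(y_true, scores_a, scores_b):
    """Two-sided DeLong test for two models scored on the same cases.

    Returns (auc_a, auc_b, p_value). Hypothetical helper for illustration.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    m = int(y_true.sum())      # number of positive cases
    n = int((~y_true).sum())   # number of negative cases

    auc_a, v10_a, v01_a = _placements(scores_a[y_true], scores_a[~y_true])
    auc_b, v10_b, v01_b = _placements(scores_b[y_true], scores_b[~y_true])

    # 2x2 sample covariance matrices of the components (ddof=1 by default)
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    s = s10 / m + s01 / n

    # Variance of the AUC difference, accounting for the correlation
    var_diff = s[0, 0] + s[1, 1] - 2.0 * s[0, 1]
    if var_diff <= 0:
        # Degenerate case (e.g. identical scores): the statistic is undefined
        return auc_a, auc_b, float("nan")

    z = (auc_a - auc_b) / math.sqrt(var_diff)
    # Two-sided p value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return auc_a, auc_b, p_value
```

Calling `delong_test(labels, scores_model_1, scores_model_2)` with the predicted scores of two models on one dataset returns both AUCs and the p value for their difference. The test is paired, so both score vectors must refer to the same cases in the same order.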