Table 2.
Average performance of the models on the validation datasets, with 95% confidence intervals in parentheses.
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) | AUC |
|---|---|---|---|---|---|
| Model trained with clinical features only | 75.0 (73.86, 76.13) | 74.0 (72.04, 75.96) | 73.33 (69.87, 76.79) | 73.0 (71.86, 74.13) | 0.85 (0.83, 0.86) |
| Model trained with the combination of clinical and deep learning features | 78.0 (76.04, 79.96) | 75.33 (72.48, 78.18) | 83.0 (79.08, 86.92) | 78.33 (77.67, 78.98) | 0.82 (0.80, 0.84) |
| Model trained with the combination of clinical and radiomics features | 79.33 (78.02, 80.64) | 77.66 (75.93, 79.39) | 80.0 (77.73, 82.26) | 78.0 (78.0, 78.0) | 0.86 (0.84, 0.88) |
| Model trained with the combination of clinical, radiomics and deep learning features | 81.66 (77.69, 85.64) | 81.33 (79.60, 83.06) | 80.33 (75.75, 84.90) | 86.44 (83.27, 89.60) | 0.88 (0.85, 0.91) |
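
For readers who want to reproduce entries of the form "mean (lower, upper)", the sketch below shows one common way to turn per-fold validation scores into a mean with a 95% confidence interval. It assumes a normal-approximation interval, mean ± 1.96·s/√n, with s the sample standard deviation. The table does not state which interval method was used, so this is an assumption rather than the authors' procedure. The function name `mean_ci` and the fold scores are hypothetical and are not taken from the study.

```python
import math
import statistics


def mean_ci(scores, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation confidence interval.

    Assumption: interval = mean +/- z * s / sqrt(n), where s is the sample
    standard deviation. z = 1.96 corresponds to a 95% interval.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate the spread")
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Hypothetical per-fold accuracies (%), for illustration only.
fold_accuracy = [74.0, 75.0, 76.0]
mean, lower, upper = mean_ci(fold_accuracy)
print(f"{mean:.2f} ({lower:.2f}, {upper:.2f})")
```

With very few folds, an interval based on the t-distribution would be wider than this normal approximation. When every fold gives the same score, the sample standard deviation is zero and the interval collapses to a single point.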