Table 2.
Model and metric | AUC (95% CI) | Accuracy (%) | Sensitivity (%) | Specificity (%) | P value vs DLRad_DB | P value vs DLRad_FB | P value vs 3D DL | P value vs Clinical |
---|---|---|---|---|---|---|---|---|
Training set (n = 300) | ||||||||
DLRad_DB | 0.90 (0.86–0.93) | 82 (245/300) | 86 (107/124) | 78 (138/176) | Reference | 0.025 | 0.86 | <0.0001 |
DLRad_FB | 0.83 (0.79–0.89) | 75 (225/300) | 82 (102/124) | 70 (123/176) | 0.025 | Reference | 0.039 | 0.028 |
3D DL | 0.89 (0.86–0.93) | 80 (239/300) | 82 (102/124) | 78 (137/176) | 0.86 | 0.039 | Reference | <0.0001 |
2D DL | 0.83 (0.79–0.88) | 77 (231/300) | 74 (92/124) | 79 (139/176) | 0.029 | 0.99 | 0.042 | 0.031 |
Radiomics | 0.82 (0.77–0.86) | 75 (224/300) | 73 (91/124) | 76 (133/176) | 0.0072 | 0.61 | 0.013 | 0.10 |
Clinical | 0.76 (0.70–0.81) | 63 (190/300) | 67 (83/124) | 61 (107/176) | <0.0001 | 0.028 | <0.0001 | Reference |
Internal test set (n = 89) | ||||||||
DLRad_DB | 0.89 (0.82–0.96) | 82 (73/89) | 88 (29/33) | 79 (44/56) | Reference | 0.34 | 0.97 | 0.042 |
DLRad_FB | 0.84 (0.75–0.92) | 78 (69/89) | 79 (26/33) | 77 (43/56) | 0.34 | Reference | 0.36 | 0.29 |
3D DL | 0.89 (0.82–0.96) | 80 (71/89) | 85 (28/33) | 77 (43/56) | 0.97 | 0.36 | Reference | 0.043 |
2D DL | 0.83 (0.75–0.92) | 76 (68/89) | 76 (25/33) | 77 (43/56) | 0.28 | 0.91 | 0.30 | 0.35 |
Radiomics | 0.79 (0.70–0.89) | 70 (62/89) | 73 (24/33) | 68 (38/56) | 0.10 | 0.50 | 0.11 | 0.69 |
Clinical | 0.77 (0.66–0.85) | 67 (60/89) | 70 (23/33) | 66 (37/56) | 0.042 | 0.29 | 0.043 | Reference |
External test set 1 (n = 120) | ||||||||
DLRad_DB | 0.89 (0.83–0.95) | 83 (100/120) | 88 (37/42) | 81 (63/78) | Reference | 0.16 | 0.95 | 0.0042 |
DLRad_FB | 0.82 (0.73–0.90) | 76 (91/120) | 74 (31/42) | 77 (60/78) | 0.16 | Reference | 0.17 | 0.15 |
3D DL | 0.89 (0.83–0.94) | 81 (97/120) | 83 (35/42) | 79 (62/78) | 0.95 | 0.17 | Reference | 0.0041 |
2D DL | 0.85 (0.77–0.92) | 78 (93/120) | 76 (32/42) | 78 (61/78) | 0.39 | 0.58 | 0.41 | 0.04 |
Radiomics | 0.79 (0.71–0.87) | 70 (84/120) | 74 (31/42) | 68 (53/78) | 0.062 | 0.66 | 0.062 | 0.31 |
Clinical | 0.73 (0.64–0.82) | 67 (80/120) | 69 (29/42) | 65 (51/78) | 0.0042 | 0.15 | 0.0041 | Reference |
External test set 2 (n = 44) | ||||||||
DLRad_DB | 0.89 (0.81–0.98) | 84 (37/44) | 82 (9/11) | 85 (28/33) | Reference | 0.46 | 0.75 | 0.042 |
DLRad_FB | 0.83 (0.71–0.95) | 73 (32/44) | 64 (7/11) | 76 (25/33) | 0.46 | Reference | 0.67 | 0.62 |
3D DL | 0.86 (0.76–0.97) | 77 (34/44) | 82 (9/11) | 76 (25/33) | 0.75 | 0.67 | Reference | 0.40 |
2D DL | 0.86 (0.74–0.97) | 75 (33/44) | 73 (8/11) | 76 (25/33) | 0.66 | 0.75 | 0.91 | 0.43 |
Radiomics | 0.79 (0.66–0.88) | 68 (30/44) | 73 (8/11) | 67 (22/33) | 0.083 | 0.67 | 0.40 | 0.83 |
Clinical | 0.78 (0.66–0.85) | 73 (32/44) | 64 (7/11) | 76 (25/33) | 0.042 | 0.62 | 0.40 | Reference |
Abbreviations: AUC, area under the curve; DL, deep learning; 3D, three-dimensional; 2D, two-dimensional.
Bold text indicates a P value less than 0.05.
Unless otherwise specified, data are percentages, with proportions of patients (numerator/denominator) in parentheses.
For AUC, data in parentheses are 95% confidence intervals (CIs).
P values were calculated with the DeLong test.
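The percentages in the table can be reproduced from the reported proportions. The sketch below is illustrative only and is not the authors' code. It assumes the sensitivity denominator is the number of positive cases, the specificity denominator is the number of negative cases, and percentages are rounded to the nearest whole number. The function names `pct` and `summarize` are hypothetical.

```python
from fractions import Fraction


def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded half up to a whole number (assumed rounding rule)."""
    return int(Fraction(numerator * 100, denominator) + Fraction(1, 2))


def summarize(tp: int, n_pos: int, tn: int, n_neg: int) -> dict:
    """Derive accuracy, sensitivity and specificity (%) from raw counts.

    tp / n_pos is the sensitivity proportion, tn / n_neg the specificity
    proportion; accuracy pools the correct calls over all patients.
    """
    return {
        "accuracy": pct(tp + tn, n_pos + n_neg),
        "sensitivity": pct(tp, n_pos),
        "specificity": pct(tn, n_neg),
    }


# DLRad_DB on the training set: 107/124 sensitivity, 138/176 specificity.
row = summarize(tp=107, n_pos=124, tn=138, n_neg=176)
print(row)  # {'accuracy': 82, 'sensitivity': 86, 'specificity': 78}
```

The same check applies to every row: the sensitivity and specificity numerators sum to the accuracy numerator, and the two denominators sum to the set size (for example, 124 + 176 = 300 for the training set).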