TABLE 2.
Comparison of quantitative indices of the clinical model, radiomics signature, deep learning model and fused model applied to the three cohorts
Cohort | Training cohort | | | | Internal validation cohort | | | | Independent test cohort | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | CM | RS | DL | FM | CM | RS | DL | FM | CM | RS | DL | FM |
AUC | 0.7044 | 0.8223 | 0.8510 | 0.8941 | 0.6264 | 0.7616 | 0.8073 | 0.8301 | 0.6626 | 0.7475 | 0.7513 | 0.8042 |
ACC | 0.6383 | 0.7459 | 0.7735 | 0.8160 | 0.6264 | 0.7177 | 0.7500 | 0.7702 | 0.6150 | 0.6578 | 0.6631 | 0.7273 |
SENS | 0.6157 | 0.7622 | 0.8535 | 0.8535 | 0.5724 | 0.6965 | 0.8137 | 0.8137 | 0.6698 | 0.7453 | 0.6226 | 0.7075 |
SPEC | 0.6707 | 0.7225 | 0.6585 | 0.7622 | 0.5825 | 0.7476 | 0.6601 | 0.7087 | 0.5432 | 0.5432 | 0.7160 | 0.7531 |
PPV | 0.7286 | 0.7978 | 0.7821 | 0.8375 | 0.6587 | 0.7953 | 0.7712 | 0.7973 | 0.6574 | 0.6810 | 0.7416 | 0.7895 |
NPV | 0.5486 | 0.6790 | 0.7578 | 0.7837 | 0.4918 | 0.6363 | 0.7157 | 0.7300 | 0.5570 | 0.6197 | 0.5918 | 0.6630 |
F1 score | 0.6036 | 0.7001 | 0.7047 | 0.7728 | 0.5333 | 0.6875 | 0.6868 | 0.7192 | 0.5499 | 0.5789 | 0.6480 | 0.7052 |

Significance level of DeLong test for models compared with FM

Cohort | CM | RS | DL |
---|---|---|---|
Training cohort | <0.0001 | <0.0001 | <0.0001 |
Internal validation cohort | <0.0001 | 0.0035 | 0.1083 |
Independent test cohort | 0.0005 | 0.0295 | 0.0132 |
Abbreviations: AUC, area under curve; ACC, accuracy; CM, clinical model; DL, deep learning model; FM, fused model; NPV, negative predictive value; PPV, positive predictive value; RS, radiomics signature; SENS, sensitivity; SPEC, specificity.
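
For readers who want to relate the indices in Table 2 to one another, the following is a minimal sketch of the standard confusion-matrix definitions of ACC, SENS, SPEC, PPV, NPV and the F1 score. The `Confusion` class, the `indices` and `harmonic_mean` helpers, and the example counts are hypothetical illustrations, not the authors' code or data. One observation on the tabulated numbers only: for the fused model in the training cohort, the reported F1 score (0.7728) agrees with the harmonic mean of SPEC and NPV, whereas the harmonic mean of SENS and PPV is about 0.8454. Which class was treated as positive for the F1 score is not stated in the table.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of a binary confusion matrix (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two rates; F1 is the harmonic mean of precision and recall."""
    return 2 * a * b / (a + b)


def indices(c: Confusion) -> dict:
    """Standard textbook definitions of the indices listed in the table."""
    sens = c.tp / (c.tp + c.fn)  # sensitivity (recall of the positive class)
    spec = c.tn / (c.tn + c.fp)  # specificity (recall of the negative class)
    ppv = c.tp / (c.tp + c.fp)   # positive predictive value (precision)
    npv = c.tn / (c.tn + c.fn)   # negative predictive value
    acc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    return {
        "ACC": acc,
        "SENS": sens,
        "SPEC": spec,
        "PPV": ppv,
        "NPV": npv,
        "F1": harmonic_mean(ppv, sens),
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = indices(Confusion(tp=80, fp=30, tn=70, fn=20))

# Rounded training-cohort FM values copied from the table.
f1_positive_class = harmonic_mean(0.8375, 0.8535)  # PPV, SENS -> about 0.8454
f1_negative_class = harmonic_mean(0.7837, 0.7622)  # NPV, SPEC -> about 0.7728
```

Because the inputs are rounded to four decimals, agreement to the fourth decimal is the most that this check can show.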