Table 2.
Performance of severity prediction models
Model | ROC-AUC (95% CI) | F score (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | p value |
---|---|---|---|---|---|
Internal test set | |||||
Image-based model | 0·803 (0·773–0·817) | 0·792 (0·776–0·807) | 0·671 (0·660–0·696) | 0·819 (0·815–0·824) | <0·0001 |
Clinical-data-based model | 0·821 (0·796–0·828) | 0·799 (0·785–0·815) | 0·683 (0·669–0·709) | 0·827 (0·823–0·831) | <0·0001 |
Image and clinical data combined model | 0·846 (0·815–0·852) | 0·830 (0·813–0·847) | 0·738 (0·727–0·761) | 0·853 (0·850–0·856) | ref |
Severity-score-based model | 0·723 (0·710–0·762) | 0·752 (0·719–0·781) | 0·611 (0·601–0·633) | 0·777 (0·769–0·790) | <0·0001 |
Severity score and clinical data combined model | 0·837 (0·820–0·849) | 0·806 (0·790–0·817) | 0·724 (0·712–0·739) | 0·820 (0·810–0·830) | 0·067 |
External test set | |||||
Image-based model | 0·753 (0·746–0·772) | 0·688 (0·676–0·707) | 0·662 (0·639–0·676) | 0·696 (0·691–0·706) | <0·0001 |
Clinical-data-based model | 0·731 (0·712–0·738) | 0·721 (0·708–0·737) | 0·632 (0·609–0·641) | 0·688 (0·680–0·695) | <0·0001 |
Image and clinical data combined model | 0·792 (0·780–0·803) | 0·792 (0·775–0·802) | 0·728 (0·711–0·739) | 0·701 (0·695–0·709) | ref |
Severity-score-based model | 0·655 (0·617–0·685) | 0·658 (0·638–0·667) | 0·621 (0·609–0·632) | 0·643 (0·638–0·660) | <0·0001 |
Severity score and clinical data combined model | 0·736 (0·717–0·754) | 0·690 (0·674–0·703) | 0·625 (0·615–0·639) | 0·687 (0·679–0·702) | <0·0001 |
A larger ROC-AUC represents better severity prediction performance. The p value, from a binomial test, assesses whether performance differs between the image and clinical data combined model (ref) and each of the other prediction models; a smaller p value indicates stronger evidence of a difference between the combined model and the model being compared. ROC-AUC=area under the receiver operating characteristic curve.
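For readers less familiar with the metrics reported in the table, the following is a minimal sketch of how each could be computed for a binary severity label. It is an illustration only and not the study's code: the function names, the toy labels and scores, and the 0·5 decision threshold are all assumptions, and the F score is assumed here to be the F1 score (the harmonic mean of precision and sensitivity). The confidence intervals and the binomial test are not reproduced, because the table does not state how they were derived.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = severe)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Proportion of truly severe cases that the model flags as severe."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of truly non-severe cases that the model flags as non-severe."""
    return tn / (tn + fp)


def f1_score(tp, fp, fn):
    """F1 score (assumed reading of 'F score'): 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = severe outcome, 0 = non-severe outcome.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
auc = roc_auc(y_true, scores)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
f1 = f1_score(tp, fp, fn)
```

ROC-AUC is computed from the continuous scores and so does not depend on the threshold, whereas sensitivity, specificity, and the F score all change with the threshold chosen.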