2021 Mar 24;3(5):e286–e294. doi: 10.1016/S2589-7500(21)00039-X

Table 2.

Performance of severity prediction models

Model	ROC-AUC (95% CI)	F score (95% CI)	Sensitivity (95% CI)	Specificity (95% CI)	p value
Internal test set
Image-based model	0·803 (0·773–0·817)	0·792 (0·776–0·807)	0·671 (0·660–0·696)	0·819 (0·815–0·824)	<0·0001
Clinical-data-based model	0·821 (0·796–0·828)	0·799 (0·785–0·815)	0·683 (0·669–0·709)	0·827 (0·823–0·831)	<0·0001
Image and clinical data combined model	0·846 (0·815–0·852)	0·830 (0·813–0·847)	0·738 (0·727–0·761)	0·853 (0·850–0·856)	ref
Severity-score-based model	0·723 (0·710–0·762)	0·752 (0·719–0·781)	0·611 (0·601–0·633)	0·777 (0·769–0·790)	<0·0001
Severity score and clinical data combined model	0·837 (0·820–0·849)	0·806 (0·790–0·817)	0·724 (0·712–0·739)	0·820 (0·810–0·830)	0·067
External test set
Image-based model	0·753 (0·746–0·772)	0·688 (0·676–0·707)	0·662 (0·639–0·676)	0·696 (0·691–0·706)	<0·0001
Clinical-data-based model	0·731 (0·712–0·738)	0·721 (0·708–0·737)	0·632 (0·609–0·641)	0·688 (0·680–0·695)	<0·0001
Image and clinical data combined model	0·792 (0·780–0·803)	0·792 (0·775–0·802)	0·728 (0·711–0·739)	0·701 (0·695–0·709)	ref
Severity-score-based model	0·655 (0·617–0·685)	0·658 (0·638–0·667)	0·621 (0·609–0·632)	0·643 (0·638–0·660)	<0·0001
Severity score and clinical data combined model	0·736 (0·717–0·754)	0·690 (0·674–0·703)	0·625 (0·615–0·639)	0·687 (0·679–0·702)	<0·0001

A larger ROC-AUC indicates better severity prediction performance. The p value, from a binomial test, compares the performance of each prediction model with that of the image and clinical data combined model (ref); a smaller p value indicates stronger evidence of a difference between that model and the combined model. ROC-AUC=area under the receiver operating characteristic curve. ref=reference model.
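
For readers unfamiliar with the metrics reported in the table, the following is a minimal sketch of how ROC-AUC, F score, sensitivity, and specificity can be computed for a binary severity label. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are illustrative assumptions, and the F score is assumed here to be the F1 score (the table does not state which variant was used). ROC-AUC is computed with the standard pairwise-ranking formulation, in which ties count as half.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = severe and 0 = non-severe."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly severe cases that the model flags as severe."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly non-severe cases that the model flags as non-severe."""
    return tn / (tn + fp)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity (assumed F score variant)."""
    return 2 * tp / (2 * tp + fp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random severe case scores higher than a random
    non-severe case; tied scores contribute one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

# Assumed decision threshold of 0.5 for turning scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("F1 score:", f1_score(tp, fp, fn))
print("ROC-AUC:", roc_auc(y_true, scores))
```

Sensitivity, specificity, and the F score depend on the chosen threshold, whereas ROC-AUC summarises ranking quality across all thresholds. That difference is one reason two models can have similar ROC-AUC values but different sensitivity and specificity.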