eBioMedicine. 2024 Sep 24;108:105333. doi: 10.1016/j.ebiom.2024.105333

Table 2.

Model scores across the testing datasets, including AUROCs achieved (with 95% CIs) and qualitative evaluation scores.

| Team name | Team institution(s) | Hold-out AUROC (95% CI) | Two-site AUROC (95% CI) | AUROC difference | Mean AUROC | Qualitative score |
| --- | --- | --- | --- | --- | --- | --- |
| Convalesco | University of Chicago | 0.879 (0.873, 0.884) | 0.911 (0.907, 0.915) | 0.032 | 0.895 | 8.29 |
| GAIL | Geisinger | 0.889 (0.884, 0.894) | 0.805 (0.799, 0.812) | −0.084 | 0.847 | 7.52 |
| UC Berkeley Center for Targeted Machine Learning | UC Berkeley | 0.864 (0.858, 0.874) | 0.859 (0.854, 0.865) | −0.005 | 0.862 | 7.39 |
| UW-Madison-BMI | University of Wisconsin–Madison | 0.886 (0.880, 0.893) | 0.841 (0.835, 0.846) | −0.045 | 0.864 | 6.84 |
| Ruvos | Ruvos | 0.851 (0.832, 0.844) | 0.838 (0.832, 0.844) | −0.013 | 0.844 | 6.77 |
| Anonymous Group 1 | — | 0.884 (0.877, 0.891) | 0.835 (0.829, 0.841) | −0.050 | 0.860 | 5.78 |
| Anonymous Group 2 | — | 0.853 (0.846, 0.860) | 0.824 (0.816, 0.830) | −0.029 | 0.839 | 5.57 |
| Penn | Penn | 0.889 (0.883, 0.895) | 0.841 (0.834, 0.847) | −0.048 | 0.865 | 5.37 |
| Anonymous Group 4 | — | 0.905 (0.900, 0.910) | 0.836 (0.830, 0.841) | −0.070 | 0.870 | 4.80 |
| Anonymous Group 5 | — | 0.837 (0.832, 0.846) | 0.836 (0.830, 0.842) | −0.001 | 0.836 | 4.69 |

Models not explicitly named have been masked as anonymous groups. The final rankings were based on the qualitative scores, which combined aspects of reproducibility, interpretability, and translational feasibility (see Supplemental materials for more information).
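For readers who want to reproduce metrics of the kind shown above on their own predictions, the following is a minimal sketch of an AUROC computation with a percentile-bootstrap 95% CI. It is illustrative only: the paper's exact resampling procedure is not specified here, so the function names, the number of bootstrap replicates, and the choice of a percentile bootstrap are all assumptions.

```python
import numpy as np

def midranks(x):
    """Ranks (1-based) with ties assigned their midrank."""
    order = np.argsort(x)
    sx = x[order]
    ranks = np.empty(len(x))
    i = 0
    while i < len(sx):
        j = i
        while j < len(sx) and sx[j] == sx[i]:
            j += 1
        ranks[order[i:j]] = 0.5 * (i + j - 1) + 1  # average rank of the tied run
        i = j
    return ranks

def auroc(y_true, y_score):
    """AUROC via the Mann-Whitney U statistic (equivalent to the ROC-curve area)."""
    y_true = np.asarray(y_true, dtype=bool)
    r = midranks(np.asarray(y_score, dtype=float))
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (r[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for AUROC (assumed method, for illustration only)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample cases with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # AUROC is undefined without both classes
        stats.append(auroc(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

A table row such as "0.879 (0.873, 0.884)" would then correspond to `auroc(...)` on the full test set plus the `bootstrap_ci(...)` bounds, though the published intervals may have been computed with a different (e.g. DeLong) method.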