Table 2.
Test set performance (AUC with 95% confidence intervals (CI)) of M1, M2 and CovSafeNet on D1test, D2, D3 and D4 datasets. The performance of CovSafeNet was compared with models M1 and M2 using DeLong’s test.
| Model | D1test (N=419) | D2 (N=113) | D3 (N=2000) | D4 (N=282) | Combined (N=2814) |
|---|---|---|---|---|---|
| M1 | 0.850 (0.814, 0.888) | 0.714 (0.612, 0.816) | 0.709 (0.681, 0.738) | 0.610 (0.542, 0.678) | 0.655 (0.633, 0.678) |
| M2 | 0.867 (0.833, 0.901) | 0.770 (0.667, 0.873) | 0.697 (0.666, 0.728) | 0.650 (0.579, 0.723) | 0.680 (0.658, 0.702) |
| CovSafeNet | 0.890* (0.860, 0.921) | 0.769 (0.667, 0.870) | 0.732 (0.704, 0.761) | 0.654 (0.583, 0.724) | 0.693* (0.671, 0.716) |
| DeLong’s Test with M1 | p=0.0342 | p=0.9877 | p<0.0001 | p=0.812 | p=0.0123 |
| DeLong’s Test with M2 | p=0.0001 | p=0.0589 | p=0.0548 | p=0.0558 | p<0.0001 |
\* indicates a statistically significant improvement according to DeLong’s test.
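For readers unfamiliar with the comparison reported in the table, the following is a minimal sketch of a two-sided DeLong test for two correlated AUCs computed on the same test cases. It is not the authors' code. The function name `delong_test`, its arguments, and the toy data are hypothetical, and the sketch uses a plain O(mn) pairwise comparison rather than a fast rank-based variant.

```python
import math
import numpy as np


def _placements(scores, is_pos):
    """Return the AUC plus per-positive and per-negative placement values."""
    pos, neg = scores[is_pos], scores[~is_pos]
    diff = pos[:, None] - neg[None, :]
    # Kernel: 1 if positive scored higher, 0.5 on ties, 0 otherwise.
    psi = (diff > 0).astype(float) + 0.5 * (diff == 0)
    return psi.mean(), psi.mean(axis=1), psi.mean(axis=0)


def delong_test(y_true, scores_a, scores_b):
    """Hypothetical helper: two-sided DeLong test for paired AUCs.

    Returns (auc_a, auc_b, z, p). Assumes binary labels with at least
    two positives and two negatives.
    """
    is_pos = np.asarray(y_true).astype(bool)
    auc_a, v10_a, v01_a = _placements(np.asarray(scores_a, dtype=float), is_pos)
    auc_b, v10_b, v01_b = _placements(np.asarray(scores_b, dtype=float), is_pos)

    m, n = len(v10_a), len(v01_a)
    # 2x2 sample covariances of the placement values (ddof=1).
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    cov = s10 / m + s01 / n

    var_diff = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    if var_diff <= 0:
        # Degenerate case, e.g. identical score vectors: no evidence of a difference.
        return auc_a, auc_b, 0.0, 1.0

    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Toy data (illustrative only, not from the study).
y = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [0.1, 0.2, 0.3, 0.4]
auc_a, auc_b, z, p = delong_test(y, model_a, model_b)
print(f"AUC A={auc_a:.3f}, AUC B={auc_b:.3f}, z={z:.3f}, p={p:.4f}")
```

Both score vectors must refer to the same cases in the same order, since the test relies on the covariance between the two models' placement values.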