Table 2.
Classification performance of FundusNet framework for referable vs non-referable DR compared to fully supervised baseline models.
Models | AUC (95% CI) cross-validated on EyePACS dataset | AUC (95% CI) on test dataset | P-value for AUC (vs FundusNet) | Sensitivity (95% CI) | P-value for sensitivity (vs FundusNet) | Specificity (95% CI) | P-value for specificity (vs FundusNet) |
---|---|---|---|---|---|---|---|
FundusNet | 0.96 (0.938–0.972) | 0.91 (0.898–0.930) | Ref | 0.90 (0.895–0.917) | Ref | 0.85 (0.830–0.862) | Ref |
Baseline1 (ResNet50) | 0.94 (0.919–0.953) | 0.80 (0.783–0.820) | P < 0.001 | 0.81 (0.793–0.834) | P < 0.001 | 0.74 (0.731–0.758) | P < 0.005 |
Baseline2 (InceptionV3) | 0.92 (0.905–0.961) | 0.83 (0.801–0.853) | P < 0.001 | 0.84 (0.822–0.848) | P < 0.001 | 0.79 (0.786–0.819) | P < 0.05 |
P values were obtained using DeLong’s test for pairwise comparison of AUCs.
DR diabetic retinopathy, CI confidence interval, AUC area under the ROC curve, Ref reference.
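For readers unfamiliar with DeLong’s test, the following is a minimal sketch of how two correlated AUCs, computed on the same test images, can be compared. It is not the authors' code: the function name `delong_test` and its arguments are hypothetical, and it uses a simple O(m·n) pairwise formulation rather than an optimized one.

```python
import math
import numpy as np

def delong_test(y_true, scores_a, scores_b):
    """Two-sided DeLong test for two correlated AUCs (illustrative sketch).

    y_true: binary labels (1 = referable DR, 0 = non-referable), an assumed encoding.
    scores_a, scores_b: predicted scores from two models on the same samples.
    Returns (auc_a, auc_b, z, p_value).
    """
    y_true = np.asarray(y_true).astype(bool)
    aucs, v10s, v01s = [], [], []
    for scores in (scores_a, scores_b):
        scores = np.asarray(scores, dtype=float)
        pos, neg = scores[y_true], scores[~y_true]
        # psi = 1 if positive outranks negative, 0.5 on ties, 0 otherwise
        diff = pos[:, None] - neg[None, :]
        psi = (diff > 0) + 0.5 * (diff == 0)
        v10 = psi.mean(axis=1)  # structural component per positive sample
        v01 = psi.mean(axis=0)  # structural component per negative sample
        aucs.append(v10.mean())
        v10s.append(v10)
        v01s.append(v01)

    m, n = len(v10s[0]), len(v01s[0])
    # 2x2 covariance matrices of the structural components across models
    s10 = np.cov(np.vstack(v10s))
    s01 = np.cov(np.vstack(v01s))
    s = s10 / m + s01 / n
    var_diff = s[0, 0] + s[1, 1] - 2.0 * s[0, 1]
    if var_diff <= 0:
        # Degenerate case (e.g. identical scores): the statistic is undefined
        return aucs[0], aucs[1], float("nan"), float("nan")

    z = (aucs[0] - aucs[1]) / math.sqrt(var_diff)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return aucs[0], aucs[1], z, p_value
```

Calling `delong_test(labels, fundusnet_scores, baseline_scores)` with per-image scores from two models (hypothetical variable names) would return both AUCs, the z statistic, and the two-sided P value. Because the test compares AUCs, it does not by itself produce P values for sensitivity or specificity.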