Table 1. Performance of the DNN to detect diabetes using PPG in three validation datasets.
Sample sizes indicate numbers of individual people. User-level performance metrics are based on the average DNN score across all recordings from an individual user. Recording-level performance metrics are calculated treating each recording independently. Because each Clinic Cohort participant received only one measurement, only recording-level metrics are reported for this cohort.
Dataset and analysis level | AUC (95% CI) | Sensitivity* | Specificity* | PPV* | NPV* |
---|---|---|---|---|---|
Test Dataset, n=11,313 | |||||
User-level | 0.766 (0.750–0.782) | 75.0% (72.0–77.8%) | 65.4% (64.6–66.3%) | 13.3% (12.3–14.3%) | 97.4% (97.0–97.7%) |
Recording-level | 0.680 (0.678–0.683) | 66.2% (65.8–66.7%) | 60.2% (60.1–60.3%) | 10.2% (10.0–10.3%) | 96.3% (96.3–96.4%) |
Contemporary Cohort, n=7,806 | |||||
User-level | 0.740 (0.722–0.756) | 80.7% (77.7–83.6%) | 54.4% (53.2–55.5%) | 14.5% (13.3–15.5%) | 96.7% (96.2–97.2%) |
Recording-level | 0.664 (0.661–0.667) | 72.8% (72.2–73.3%) | 51.6% (51.4–51.8%) | 14.6% (14.5–14.8%) | 94.3% (94.2–94.4%) |
Clinic Cohort, n=181 | |||||
Recording-level | 0.682 (0.605–0.755) | 81.7% (69.2–93.1%) | 53.4% (45.8–61.1%) | 31.9% (22.9–40.7%) | 91.6% (85.7–97.0%) |
Newly Diagnosed Diabetes, Recording-level, n=164 | 0.644 (0.546–0.744) | 75.9% (56.3–92.9%) | 53.0% (45.2–61.2%) | 19.1% (11.2–28.3%) | 93.8% (88.2–98.4%) |
*Metrics are reported at a DNN score threshold of 0.427; this threshold can be adjusted to optimize DNN performance on specific metrics as suitable for future applications. Abbreviations: AUC: area under the ROC curve; CI: confidence interval; NPV: negative predictive value; PPV: positive predictive value.
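
The difference between user-level and recording-level metrics, and the dependence of the starred metrics on the score threshold, can be illustrated with a minimal Python sketch. The function names, the toy data, and the convention that a score at or above the threshold counts as a positive prediction are assumptions made for illustration; they are not taken from the study's code or data.

```python
from collections import defaultdict

THRESHOLD = 0.427  # DNN score threshold stated in the table footnote


def user_level_scores(recordings):
    """Average recording-level DNN scores per user.

    `recordings` is an iterable of (user_id, dnn_score, has_diabetes) tuples.
    Returns one (mean_score, has_diabetes) pair per user.
    """
    scores = defaultdict(list)
    labels = {}
    for user_id, score, label in recordings:
        scores[user_id].append(score)
        labels[user_id] = label
    return [(sum(s) / len(s), labels[u]) for u, s in scores.items()]


def threshold_metrics(scored, threshold=THRESHOLD):
    """Sensitivity, specificity, PPV and NPV for (score, label) pairs.

    Assumption: a score >= threshold is treated as a positive prediction.
    """
    tp = fp = tn = fn = 0
    for score, label in scored:
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are returned as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data: (user_id, dnn_score, has_diabetes)
recordings = [
    ("a", 0.60, True), ("a", 0.30, True),
    ("b", 0.50, False), ("b", 0.20, False),
    ("c", 0.10, False),
    ("d", 0.40, True),
]

# User-level: one averaged score per person.
user_metrics = threshold_metrics(user_level_scores(recordings))

# Recording-level: every recording is scored independently.
recording_metrics = threshold_metrics(
    [(score, label) for _, score, label in recordings]
)

print(user_metrics)
print(recording_metrics)
```

In this toy example, averaging moves user "b" below the threshold even though one of that user's recordings scores above it, which shows how the same recordings can yield different user-level and recording-level results. Passing a different `threshold` value trades sensitivity against specificity, as the footnote describes.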