Author manuscript; available in PMC: 2021 Sep 30.

Published in final edited form as: Nat Med. 2020 Aug 17;26(10):1576–1582. doi: 10.1038/s41591-020-1010-5

Table 1. Performance of the DNN to detect diabetes using PPG in three validation datasets.

Sample sizes indicate the number of individual people. User-level performance metrics are based on the average DNN score across all recordings from an individual user. Recording-level performance metrics are calculated by treating each recording independently. Because each Clinic Cohort participant received only one measurement, only the recording-level metric is reported for this cohort.

	AUC (95% CI)	Sensitivity^*	Specificity^*	PPV^*	NPV^*
Test Dataset, n=11,313
User-level	0.766 (0.750–0.782)	75.0% (72.0–77.8%)	65.4% (64.6–66.3%)	13.3% (12.3–14.3%)	97.4% (97.0–97.7%)
Recording-level	0.680 (0.678–0.683)	66.2% (65.8–66.7%)	60.2% (60.1–60.3%)	10.2% (10.0–10.3%)	96.3% (96.3–96.4%)
Contemporary Cohort, n=7,806
User-level	0.740 (0.722–0.756)	80.7% (77.7–83.6%)	54.4% (53.2–55.5%)	14.5% (13.3–15.5%)	96.7% (96.2–97.2%)
Recording-level	0.664 (0.661–0.667)	72.8% (72.2–73.3%)	51.6% (51.4–51.8%)	14.6% (14.5–14.8%)	94.3% (94.2–94.4%)
Clinic Cohort, n=181
Recording-level	0.682 (0.605–0.755)	81.7% (69.2–93.1%)	53.4% (45.8–61.1%)	31.9% (22.9–40.7%)	91.6% (85.7–97.0%)
Newly Diagnosed Diabetes, Recording-level (n=164)	0.644 (0.546–0.744)	75.9% (56.3–92.9%)	53.0% (45.2–61.2%)	19.1% (11.2–28.3%)	93.8% (88.2–98.4%)

^*Metrics are reported at a DNN score threshold of 0.427; this threshold can be adjusted to optimize DNN performance on specific metrics, as appropriate for future applications. Abbreviations: CI: Confidence interval; PPV: Positive predictive value; NPV: Negative predictive value; AUC: Area under the ROC curve.
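
The difference between the user-level and recording-level rows can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that a score at or above the 0.427 threshold counts as a positive prediction (the table does not say how ties are handled), and the function names `user_level_scores` and `threshold_metrics`, as well as the toy scores and labels, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

THRESHOLD = 0.427  # DNN score threshold stated in the table footnote


def user_level_scores(recordings):
    """Average the DNN score over all recordings from each user.

    `recordings` is an iterable of (user_id, dnn_score) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # user_id -> [score sum, count]
    for user_id, score in recordings:
        totals[user_id][0] += score
        totals[user_id][1] += 1
    return {user_id: total / count for user_id, (total, count) in totals.items()}


def threshold_metrics(scores, labels, threshold=THRESHOLD):
    """Sensitivity, specificity, PPV and NPV at a fixed score threshold.

    `scores` and `labels` are parallel sequences; label 1 means diabetes.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        positive = score >= threshold  # assumption: ties count as positive
        if positive and label:
            tp += 1
        elif positive:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data, NOT taken from the study.
toy_recordings = [
    ("a", 0.60), ("a", 0.30),
    ("b", 0.20), ("b", 0.50),
    ("c", 0.70),
    ("d", 0.10), ("d", 0.30),
]
toy_labels = {"a": 1, "b": 0, "c": 0, "d": 1}  # 1 = diabetes

# Recording-level: every recording is scored independently.
recording_metrics = threshold_metrics(
    [score for _, score in toy_recordings],
    [toy_labels[user_id] for user_id, _ in toy_recordings],
)

# User-level: one averaged score per person.
per_user = user_level_scores(toy_recordings)
users = sorted(per_user)
user_metrics = threshold_metrics(
    [per_user[u] for u in users],
    [toy_labels[u] for u in users],
)

print("recording-level:", recording_metrics)
print("user-level:", user_metrics)
```

Passing a different `threshold` to `threshold_metrics` shifts the balance between sensitivity and specificity, which is the adjustment the footnote refers to. AUC does not depend on the threshold and is not computed in this sketch.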