Table 3.
Comparing the Performance of the Baseline DLS (Left Column) and Debiased DLSs (Middle and Right Columns) on Accuracy, Specificity, and Sensitivity for Darker-Skin Versus Lighter-Skin Individuals
| | Baseline DLS | Debiased DLS (Retina Appearance Optimized) | Debiased DLS (DR-Status Optimized) |
|---|---|---|---|
| Testing dataset (400 images, see Table 2): | | | |
| Accuracy (overall) | 66.75 (4.62) [62.13, 71.37] | 74.75 (4.26) [70.49, 79.01] | 71.75 (4.41) [67.34, 76.16] |
| Accuracy (lighter-skin individuals) | 73.0 (6.15) [66.85, 79.15] | 78.5 (5.69) [72.81, 84.19] | 72.0 (6.22) [65.78, 78.22] |
| Accuracy (darker-skin individuals) | 60.5 (6.78) [53.72, 67.28] | 71.0 (6.29) [64.71, 77.29] | 71.5 (6.26) [65.24, 77.76] |
| Delta-parity (signed) value | 12.5 (9.15) [3.35, 21.7] | 7.5 (8.48) [‒1.0, 16.0] | 0.5 (8.8) [‒8.3, 9.3] |
| Specificity (lighter-skin individuals) | 61.0 (9.56) [51.44, 70.56] | 83.0 (7.36) [75.64, 90.36] | 66.0 (9.28) [56.72, 75.28] |
| Sensitivity (lighter-skin individuals) | 85.0 (7.0) [78.0, 92.0] | 74.0 (8.6) [65.4, 82.6] | 78.0 (8.12) [69.88, 86.12] |
| Specificity (darker-skin individuals) | 86.0 (6.8) [79.2, 92.8] | 86.0 (6.8) [79.2, 92.8] | 85.0 (7.0) [78.0, 92.0] |
| Sensitivity (darker-skin individuals) | 35.0 (9.35) [25.65, 44.35] | 56.0 (9.73) [46.27, 65.73] | 58.0 (9.67) [48.33, 67.67] |
| Larger leftover set of darker-skin individuals with DR (6291 images): | | | |
| Sensitivity (darker-skin individuals) (= accuracy) | 38.48 (1.2) [37.28, 39.68] | 52.63 (1.23) [51.4, 53.86] | 49.75 (1.24) [48.51, 50.99] |
Also shown are the 95% margins of error in parentheses and the 95% confidence intervals in brackets. Values are in %.
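The margins of error in the table appear consistent with a normal-approximation (Wald) interval for a proportion, with the delta-parity margin obtained by combining the two group margins in quadrature. The sketch below illustrates that reading. It is an assumption rather than a procedure stated in the table: the function names, the z value of 1.96, and the group size of 200 images per skin-tone group are all hypothetical choices made for the illustration.

```python
import math

Z_95 = 1.96  # assumed two-sided 95% normal quantile


def margin_of_error(pct, n, z=Z_95):
    """Wald margin of error, in percentage points, for a percentage over n samples."""
    p = pct / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


def confidence_interval(pct, n, z=Z_95):
    """Symmetric interval: estimate minus/plus the Wald margin of error."""
    m = margin_of_error(pct, n, z)
    return (pct - m, pct + m)


def difference_margin(margin_a, margin_b):
    """Margin for the difference of two independent estimates (sum in quadrature)."""
    return math.hypot(margin_a, margin_b)


# Baseline DLS values taken from the table; group sizes of 200 are assumed.
overall = margin_of_error(66.75, 400)
lighter = margin_of_error(73.0, 200)
darker = margin_of_error(60.5, 200)
delta = difference_margin(lighter, darker)

print(round(overall, 2), round(lighter, 2), round(darker, 2), round(delta, 2))
```

Under these assumptions the four printed values round to 4.62, 6.15, 6.78, and 9.15, which agree with the baseline column. The Wald interval is the simplest choice. For proportions near 0% or 100%, or for small groups, a Wilson interval would generally be more reliable.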