2022 Jun 15;9:860574. doi: 10.3389/fmed.2022.860574

TABLE 2.

The discriminative performance of the multi-task 3D deep learning model for detecting glaucomatous optic neuropathy and the comparison to average retinal nerve fibre layer thickness, a single-task 3D deep learning model, and a multi-task 2D deep learning model in all datasets.

| Dataset / model | AUROC (95% CI) | p-value | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Accuracy, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) |
|---|---|---|---|---|---|---|---|
| Internal validation | | | | | | | |
| 3D multi-task DL | 0.949 (0.930–0.969) | \ | 88.0 (80.9–95.9) | 91.6 (81.7–97.2) | 89.4 (86.6–92.1) | 92.2 (85.5–97.0) | 87.1 (81.3–94.7) |
| Average RNFL thickness | 0.913 (0.888–0.939) | **< 0.001** | 80.1 (72.2–88.4) | 92.5 (84.0–97.2) | 85.5 (82.2–88.6) | 92.0 (85.7–96.9) | 80.5 (75.0–86.7) |
| 3D single-task DL | 0.941 (0.920–0.961) | 0.53 | 86.3 (73.4–95.0) | 88.3 (78.4–98.6) | 87.0 (84.1–90.1) | 89.4 (82.8–98.4) | 85.2 (76.4–93.5) |
| 2D multi-task DL | 0.940 (0.919–0.961) | 0.53 | 84.7 (78.4–92.1) | 92.5 (84.0–97.2) | 88.3 (85.5–91.0) | 92.6 (86.2–97.0) | 84.4 (79.4–90.7) |
| External testing 1 | | | | | | | |
| 3D multi-task DL | 0.890 (0.864–0.917) | \ | 78.9 (70.4–86.4) | 86.1 (77.3–92.8) | 82.0 (78.7–85.1) | 86.9 (81.4–92.4) | 77.7 (72.1–83.3) |
| Average RNFL thickness | 0.890 (0.864–0.916) | 0.96 | 69.9 (63.7–76.7) | 94.8 (89.2–98.0) | 81.2 (78.3–84.2) | 94.0 (88.7–97.5) | 73.0 (69.5–77.3) |
| 3D single-task DL | 0.893 (0.867–0.919) | 0.88 | 82.3 (70.1–89.8) | 81.7 (72.9–92.0) | 81.8 (78.7–85.0) | 84.2 (79.2–91.6) | 79.7 (72.0–86.2) |
| 2D multi-task DL | 0.900 (0.876–0.925) | 0.58 | 82.7 (75.5–91.8) | 82.1 (70.5–88.8) | 82.3 (78.9–85.3) | 84.3 (78.4–89.2) | 80.2 (74.7–88.2) |
| External testing 2 | | | | | | | |
| 3D multi-task DL | 0.903 (0.867–0.939) | \ | 77.6 (67.1–86.7) | 91.9 (83.1–98.4) | 84.3 (80.2–88.4) | 92.1 (85.0–98.2) | 78.4 (72.0–85.4) |
| Average RNFL thickness | 0.915 (0.881–0.949) | 0.38 | 85.3 (78.3–93.7) | 88.7 (77.4–93.6) | 86.5 (82.4–90.3) | 89.4 (82.5–94.1) | 83.7 (77.9–91.6) |
| 3D single-task DL | 0.883 (0.841–0.925) | 0.48 | 83.9 (69.2–93.7) | 83.1 (70.2–94.4) | 83.2 (79.0–87.3) | 85.2 (78.1–94.1) | 81.8 (72.1–90.8) |
| 2D multi-task DL | 0.882 (0.843–0.922) | 0.45 | 81.1 (67.1–89.5) | 83.1 (73.4–93.6) | 82.0 (77.5–86.2) | 84.9 (78.5–92.8) | 79.3 (70.8–86.8) |
| External testing 3 | | | | | | | |
| 3D multi-task DL | 0.906 (0.880–0.933) | \ | 79.7 (68.5–88.1) | 88.9 (79.1–96.7) | 82.1 (76.5–86.6) | 94.4 (90.5–98.2) | 64.9 (56.2–74.7) |
| Average RNFL thickness | 0.913 (0.885–0.941) | 0.53 | 84.8 (80.4–90.9) | 88.9 (81.1–94.1) | 86.2 (82.9–89.3) | 94.8 (91.7–97.1) | 71.4 (65.6–79.3) |
| 3D single-task DL | 0.898 (0.868–0.928) | 0.70 | 87.1 (78.0–92.5) | 79.9 (70.5–88.5) | 84.5 (79.4–88.2) | 90.6 (87.4–94.3) | 72.8 (62.2–81.4) |
| 2D multi-task DL | 0.903 (0.876–0.931) | 0.89 | 82.4 (68.2–89.0) | 84.9 (76.3–95.7) | 82.9 (76.4–86.9) | 92.7 (89.2–97.4) | 67.6 (56.6–76.4) |
| External testing 4 | | | | | | | |
| 3D multi-task DL | 0.950 (0.936–0.963) | \ | 85.2 (79.0–92.5) | 94.0 (86.5–98.1) | 87.3 (83.2–91.1) | 97.9 (95.8–99.3) | 65.6 (58.1–77.4) |
| Average RNFL thickness | 0.950 (0.937–0.963) | 0.85 | 87.0 (79.5–90.5) | 90.6 (85.9–96.2) | 87.9 (83.5–90.3) | 96.9 (95.5–98.7) | 67.5 (58.4–73.5) |
| 3D single-task DL | 0.929 (0.911–0.947) | 0.08 | 83.3 (74.1–92.9) | 87.4 (77.2–95.4) | 84.5 (78.8–89.6) | 95.7 (93.0–98.1) | 61.2 (52.3–76.8) |
| 2D multi-task DL | 0.939 (0.923–0.955) | 0.31 | 88.0 (76.3–93.2) | 84.7 (77.7–94.9) | 87.1 (80.1–90.4) | 95.1 (93.2–98.0) | 67.8 (54.0–77.8) |
| External testing 5 | | | | | | | |
| 3D multi-task DL | 0.930 (0.915–0.946) | \ | 83.9 (80.4–87.2) | 92.2 (89.6–94.7) | 88.2 (86.3–90.0) | 90.9 (88.2–93.6) | 86.1 (83.6–88.6) |
| Average RNFL thickness | 0.921 (0.905–0.937) | 0.15 | 80.2 (73.1–90.2) | 89.1 (78.1–94.5) | 84.5 (82.5–86.6) | 87.2 (79.1–92.9) | 83.1 (78.9–89.7) |
| 3D single-task DL | 0.936 (0.922–0.951) | 0.31 | 84.1 (80.8–88.8) | 92.4 (87.1–94.9) | 88.3 (86.3–90.2) | 91.1 (86.2–93.8) | 86.2 (83.8–89.5) |
| 2D multi-task DL | 0.938 (0.924–0.953) | 0.45 | 84.1 (80.2–88.2) | 93.8 (89.6–96.4) | 89.0 (87.2–90.8) | 92.5 (88.6–95.4) | 86.4 (83.7–89.4) |

AUROC, area under the receiver operating characteristic curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; DL, deep learning; RNFL, retinal nerve fibre layer. p-values in bold indicate a significant difference in AUROC.
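
For readers less familiar with the metrics reported in this table, the following is a minimal sketch of how sensitivity, specificity, accuracy, PPV, NPV, and AUROC are conventionally defined. It is not the authors' analysis code: the names `ConfusionCounts`, `diagnostic_metrics`, and `auroc` are hypothetical, and the counts and scores in the usage lines are invented for illustration and do not come from the study. The sketch also does not reproduce the confidence intervals or p-values.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier at one threshold (hypothetical container)."""
    tp: int  # diseased eyes correctly flagged
    fp: int  # healthy eyes wrongly flagged
    tn: int  # healthy eyes correctly cleared
    fn: int  # diseased eyes missed


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the threshold-dependent metrics as percentages."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": 100 * c.tp / (c.tp + c.fn),
        "specificity": 100 * c.tn / (c.tn + c.fp),
        "accuracy": 100 * (c.tp + c.tn) / total,
        "ppv": 100 * c.tp / (c.tp + c.fp),
        "npv": 100 * c.tn / (c.tn + c.fn),
    }


def auroc(scores_pos: list, scores_neg: list) -> float:
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Illustrative, made-up inputs (not data from the study).
example = diagnostic_metrics(ConfusionCounts(tp=80, fp=10, tn=90, fn=20))
example_auc = auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
```

Note that PPV and NPV depend on the proportion of diseased eyes in each dataset, so they are not directly comparable across datasets with different case mixes, whereas sensitivity and specificity do not depend on prevalence in this way.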