2020 Jul 22;9(2):42. doi: 10.1167/tvst.9.2.42

Table 1.

Summary of Studies Using Deep Learning Models in Glaucoma

| Citation | Training/Validation Dataset | Test Dataset | Reference | Network | Data type | Output | Results |
|---|---|---|---|---|---|---|---|
| Ting et al.27 | Train: 125,189 | Test: 71,896 | Subjective grading of photographs | Custom deep learning system | Color Fundus Photos | “Referable for glaucoma” vs. not | AUC 0.942; Sensitivity 96.4%, Specificity 87.2% |
| Li et al.9 | Train: 31,745 | 8000 | Subjective grading of photographs | Inception-v3 | Color Fundus Photos | “Referable for glaucoma” vs. not | AUC 0.986; Sensitivity 95.6%, Specificity 92.0% |
| Christopher et al.36 | 9189 healthy, 5633 GON: divided randomly into multiple folds for 10-fold cross-validation. | 10% test | Subjective grading of photographs | VGG16, Inception-v3, ResNet50 | Color Fundus Photos | “GON” vs. healthy | ResNet50 AUC 0.91; Sensitivity 85% at 80% Specificity |
| Liu et al.28 | Train: 29,865 GON, 11,046 probable GON, 200,121 unlikely GON | Validation: 4514 GON, 571 Probable GON, 23,484 unlikely GON | Subjective grading of photographs | ResNet | Color Fundus Photos | “Referable GON” vs. not | AUC 0.996, Sensitivity 96.2%, Specificity 97.7% |
| Ahn et al.29 | Train: 228 Advanced glaucoma, 131 Early glaucoma, 385 Normal; Validation: 98 Advanced glaucoma, 61 Early glaucoma, 165 Normal | Test: 141 Advanced glaucoma, 87 Early glaucoma, 236 Normal | Subjective grading of visual field, OCT and RNFL photographs | Inception-v3; Custom 3-layer CNN | Color Fundus Photos | Glaucoma vs. Normal | Inception-v3 model: AUC 0.93; Average accuracy 84.5%; Custom 3-layer CNN: AUC 0.94, Average accuracy 87.9% |
| Phene et al.31 | Train: 35,877 Non-glaucomatous, 20,740 Low-risk GS, 13,180 High-risk GS, 5307 Likely glaucoma, 18,487 Referable glaucoma; Tuning: 849 Non-glaucomatous, 259 Low-risk GS, 268 High-risk GS, 110 Likely glaucoma, 378 Referable glaucoma | Validation set A: 687 Non-glaucomatous, 290 Low-risk GS, 170 High-risk GS, 48 Likely glaucoma, 218 Referable glaucoma; Validation set B: 8753 Non-glaucomatous, N/A Low-risk GS, N/A High-risk GS, 890 Likely glaucoma, 890 Referable glaucoma; Validation set C: 63 Non-glaucomatous, N/A Low-risk GS, 175 High-risk GS, 108 Likely glaucoma, 283 Referable glaucoma | Validation set A: Referable GON based on subjective gradings of photographs; Validation set B: Referable GON based on glaucoma-related International Classification of Diseases codes; Validation set C: Referable GON based on full glaucoma workup by glaucoma specialists including clinical exam, history, VF assessment, and OCT | Inception-v3 | Color Fundus Photos | “Referable glaucoma” vs. Not | Validation set A: AUC 0.945; Validation set B: AUC 0.855; Validation set C: AUC 0.881 |
| Shibata et al.8 | Train: 1364 glaucomatous appearance vs. 1768 not glaucomatous appearance; 3-fold cross-validation | Test: 33 non-highly myopic glaucoma, 28 highly myopic glaucoma, 27 non-highly myopic normal, 22 highly myopic normal | Train: Subjective gradings of photographs; Test: Subjective gradings of photographs and categorization of RNFL and macular inner retinal thickness measurements based on OCT normative database | ResNet | Color Fundus Photos | Glaucomatous vs. Not | AUC 96.5 |
| Li et al.30 | Train 20,793/Validation 2,311: 11,176 GON-confirmed, 599 GON-suspected, 11,329 Normal; 10-fold cross-validation with a random selection of 9:1 for participants within each fold | Test: 1442 GON-confirmed, 515 GON-suspected, 1524 Normal | Subjective grading of photographs | ResNet101 | Color Fundus Photos | GON-confirmed vs. GON-suspected vs. Normal; Referrals (GON-confirmed and GON-suspected) vs. Observation (Normal) | Comparison of GON-confirmed vs. GON-suspected vs. Normal: Accuracy 0.941, Sensitivity 0.957, Specificity 0.929. AUC 0.992 for Referrals (GON-confirmed and GON-suspected) vs. Observation (Normal) |
| Medeiros et al.24 | Train + validation (80% train, 20% validation): 9,136 Glaucoma, 13,410 Suspect, 3982 Healthy | Test: 2070 Glaucoma, 3345 Suspect, 877 Healthy | SDOCT global RNFL value; Abnormal (Glaucoma) vs. Normal (Normal + Borderline) RNFL based on classification of global RNFL by SDOCT normative database | ResNet34 | Color Optic Disc Photos paired to SDOCT global RNFL | SDOCT global RNFL value; Abnormal (Glaucoma) vs. Normal RNFL | Pearson r = 0.832, P < 0.001 between DL predicted and actual SDOCT value; Mean absolute error 7.39 µm; AUC 0.944 for DL vs. 0.94 for SDOCT (P = 0.72); 90% sensitivity at 80% specificity for both. |
| Thompson et al.26 | Train + validation (80% train, 20% validation): 4,570 Glaucoma, 1924 Suspect, 1046 Healthy | Test: 970 Glaucoma, 432 Suspect, 340 Healthy | Global and sector BMO-MRW thickness values; Abnormal (Glaucoma) vs. Normal (Suspect + Normal) based on classification of BMO-MRW global and sector values by SDOCT normative database | ResNet34 | Color Optic Disc Photos paired to SDOCT global BMO-MRW | Global and sector BMO-MRW thickness values; Abnormal (Glaucoma) vs. Normal | Global BMO-MRW Pearson r = 0.88 (P < 0.001) between DL predicted and actual SDOCT value; Mean absolute error 27.8; AUC for DL 0.945 vs. actual SDOCT 0.933 (P = 0.59). |
| Devalla et al.67 | 40 control/60 glaucoma; training on datasets of 10, 20, 30 or 40 B-scans, with equal number of glaucoma and healthy scans in each cross-validation experiment | Cross-validation experiments with test sets of 90, 80, 70, or 60. | Manual segmentation of ONH OCT | Custom eight-layer CNN | Horizontal B-scan through ONH | Digital stain of RNFL+prelamina, RPE, all other retinal layers, choroid, peripapillary sclera, lamina cribrosa | Dice coefficient 0.84, Sensitivity 92%, Specificity 99%, Accuracy 94% |
| Mariottoni et al.65 | Train 10,520/Validation 2742 | Test Set 1 (images without segmentation errors or artifacts) 11,010; Test Set 2 (low-quality images with segmentation errors) 237; Test Set 3 (images with other artifacts) 776 | Global RNFL thickness value | ResNet34 | SDOCT raw B-scans of peripapillary RNFL | Global RNFL thickness value | Test set 1: Pearson r 0.983 (P < 0.001) between predicted segmentation-free and actual SDOCT global RNFL; MAE 2.41; Test set 2: DL correlation with BAE r = 0.972 vs. with conventional algorithm r = 0.94, P < 0.001. Test set 3: DL correlation with BAE r = 0.94 vs. with conventional algorithm r = 0.64, P < 0.001. |
| Thompson et al.25 | Train + Validation (50% + 20%): 4828 Glaucoma, 9638 Normal | Test (30%): 3897 Glaucoma, 2443 Normal | Glaucoma (based on GON and reproducible glaucomatous visual field defects) vs. Healthy | ResNet34 | SDOCT raw B-scans of peripapillary RNFL | Glaucoma vs. Healthy | AUC 0.96 for DL algorithm vs. AUC 0.87 for global RNFL thickness (P < 0.001) |
| Maetschke et al.66 | Train (80%): 672 POAG, 216 Healthy; Validation (10%): 30 Healthy, 82 POAG | Test (10%): 93 POAG, 17 Healthy | Glaucoma (based on glaucomatous VF defects on 2 consecutive tests) vs. Healthy | Custom 5-layer CNN | OCT of the ONH | Glaucoma vs. Healthy | AUC 0.94 |
| Asaoka et al.7 | Pretraining: 1371 Open angle glaucoma, 193 Healthy; Training: 94 Open angle glaucoma, 84 Healthy | Test: 114 Open angle glaucoma and MD >−5 dB, 82 Healthy | Glaucoma (based on GON and glaucomatous VF defects) vs. Healthy | Custom 6-layer CNN | 8 × 8 macular grid | Glaucoma vs. Healthy | AUC 0.937 |
| Xu et al.69 | Cross-validation (85%: 80% training/20% validation) 1632 OAG, 1764 closed | Test (15%): 311 open, 329 closed | Angle closed vs. open based on gonioscopic grade | ResNet18; Inception-v3 | Anterior Segment-OCT | Angle closed vs. open | AUC 0.928 |
| Fu et al.70 | 7375 open angle, 895 angle closure: 5-fold cross-validation (four groups, each with 1654 angle closure tests for training, and one group of 1654 angle closure for testing) | 1654 angle closure for testing within each fold | Angle closed vs. open based on gonioscopic grade | VGG-16 | Anterior Segment-OCT | Angle closed vs. open | AUC 0.96, Sensitivity 90%, Specificity 92% |
| Mariottoni et al.76 | Training/Validation: 3980 Glaucoma, 3732 Normal | Test: 1061 Glaucoma, 1057 Normal | GON vs. GON suspects vs. Normal based on SAP and OCT objective criteria (see Table 2) | ResNet50 | Optic Disc Photos | GON vs. Normal | AUC 0.92, Sensitivity 77% at Specificity 95% |
| Li et al.71 | Overall: 2389 Glaucoma, 1623 Non-glaucoma; Train: 3712 | Test: 300 | Glaucoma (based on glaucomatous damage to ONH and reproducible glaucomatous VF defects) vs. Healthy | VGG | Pattern Deviation plots from Humphrey Field Analyzer 30-2 or 24-2 visual field tests | Glaucoma vs. Healthy | AUC 0.966, Sensitivity 93.2%, Specificity 82.6% |
| Kucur et al.72 | 1979 control (Rotterdam 244; Budapest 1735), 2811 Early glaucoma (Rotterdam 2,279; Budapest 532); 10-fold cross-validation | 10-fold cross-validation; unclear if separate test and validation datasets were used | Early glaucoma (based on glaucomatous neuroretinal rim loss, reproducible VF defects, and IOP) vs. Healthy | Custom 7-layer CNN | OCTOPUS 101 G1 and Humphrey Field Analyzer 24-2 visual field tests | Early Glaucoma vs. Healthy | Average Precision: Rotterdam 87.4%, Budapest 98.6% |
| Asaoka et al.10 | 171 Preperimetric glaucoma vs. 108 Normal and 63 artificially generated Normal; leave-one-out cross-validation | Leave-one-out cross-validation; a separate test dataset was not used | Preperimetric OAG (based on ONH changes, VF preceding perimetric field changes) vs. Healthy | Custom DL feed-forward neural network | Humphrey Field Analyzer 24-2 | Preperimetric glaucoma vs. Healthy | AUC 0.926 |
| Berchuck et al.89 | Train (81%): 768 Glaucoma, 1793 Glaucoma suspects, 547 Normal; Validation (9%): 83 Glaucoma, 222 Glaucoma suspect, 58 Normal; 5-fold cross-validation | Test (9%): 93 Glaucoma, 206 Glaucoma suspect, 62 Normal | Glaucoma (repeatable glaucomatous VF defect and corresponding optic nerve damage) vs. Glaucoma suspect (high IOP or suspicious optic nerve but no VF defect) vs. Normal (no visual field or optic nerve defect) | Deep variational autoencoder | Humphrey Field Analyzer 24-2 | Rates of VF progression compared to SAP MD; Prediction of future VF compared to point-wise regression predictions | Rate of progression significantly higher for VAE than MD at 2 years (25% vs. 9%) and 4 years (35% vs. 15%) from baseline. MAE for prediction of 4th, 6th, and 8th visits significantly smaller for VAE than PW (P < 0.001) |
| Wen et al.91 | Train + validation (80%): 25,723 and 10-fold cross-validation | Test (20%): 6720 | Actual HFA points and Mean Deviation from HVF | CascadeNet-5 | Humphrey Field Analyzer 24-2 | HFA points and Mean Deviation | PMAE 2.47; Mean difference in MD between predicted and actual MD = 0.41 dB, Pearson r = 0.92, P < 0.001 |

BAE, best available estimate; DL, deep learning; GON, glaucomatous optic neuropathy; VF, visual field; HFA, Humphrey Field Analyzer; HVF, Humphrey Visual Field; IOP, intraocular pressure; MAE, mean absolute error; PMAE, point-wise mean absolute error; POAG, primary open angle glaucoma; OAG, open angle glaucoma; ONH, optic nerve head; RPE, retinal pigment epithelium.
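
Several of the performance measures in the Results column (sensitivity, specificity, MAE, Dice coefficient) follow standard definitions. As a reading aid, the following minimal Python sketch shows how these quantities are conventionally computed. The function names and toy inputs are illustrative assumptions; they are not taken from any of the cited studies and do not reproduce their data or code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = disease, 0 = healthy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def mean_absolute_error(actual, predicted):
    """Average absolute difference, e.g. between measured and predicted thickness."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def dice_coefficient(pixels_a, pixels_b):
    """Overlap between two segmentations given as collections of pixel indices."""
    a, b = set(pixels_a), set(pixels_b)
    return 2 * len(a & b) / (len(a) + len(b))


# Toy inputs (hypothetical, for illustration only).
sens, spec = sensitivity_specificity(
    y_true=[1, 1, 1, 1, 0, 0, 0, 0],
    y_pred=[1, 1, 1, 0, 0, 0, 0, 1],
)
mae = mean_absolute_error([90.0, 80.0, 100.0], [92.0, 77.0, 100.0])
dice = dice_coefficient([1, 2, 3, 4], [3, 4, 5, 6])
print(sens, spec, mae, dice)
```

Sensitivity and specificity depend on the classification threshold, which is why some rows report sensitivity at a fixed specificity rather than a single pair of values.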