Table 1.
Summary of Studies Using Deep Learning Models in Glaucoma
Citation | Training/Validation Dataset | Test Dataset | Reference Standard | Network | Data Type | Output | Results |
---|---|---|---|---|---|---|---|
Ting et al.27 | Train: 125,189 | Test: 71,896 | Subjective grading of photographs | Custom deep learning system | Color Fundus Photos | “Referable for glaucoma” vs. not | AUC 0.942; Sensitivity 96.4%, Specificity 87.2% |
Li et al.9 | Train: 31,745 | 8000 | Subjective grading of photographs | Inception-v3 | Color Fundus Photos | “Referable for glaucoma” vs. not | AUC 0.986; Sensitivity 95.6%, Specificity 92.0% |
Christopher et al.36 | 9189 healthy, 5633 GON; divided randomly into multiple folds for 10-fold cross-validation | 10% test | Subjective grading of photographs | VGG16, Inception-v3, ResNet50 | Color Fundus Photos | “GON” vs. healthy | ResNet50 AUC 0.91; Sensitivity 85% at 80% Specificity |
Liu et al.28 | Train: 29,865 GON, 11,046 probable GON, 200,121 unlikely GON | Validation: 4514 GON, 571 Probable GON, 23,484 unlikely GON | Subjective grading of photographs | ResNet | Color Fundus Photos | “Referable GON” vs. not | AUC 0.996, Sensitivity 96.2%, Specificity 97.7% |
Ahn et al.29 | Train: 228 Advanced glaucoma, 131 Early glaucoma, 385 Normal; Validation: 98 Advanced glaucoma, 61 Early glaucoma, 165 Normal | Test: 141 Advanced glaucoma, 87 Early glaucoma, 236 Normal | Subjective grading of visual field, OCT and RNFL photographs | Inception-v3; Custom 3-layer CNN | Color Fundus Photos | Glaucoma vs. Normal | Inception-v3 model: AUC 0.93; Average accuracy 84.5%; Custom 3-layer CNN: AUC 0.94, Average accuracy 87.9% |
Phene et al.31 | Train: 35,877 Non-glaucomatous, 20,740 Low-risk GS, 13,180 High-risk GS, 5307 Likely glaucoma, 18,487 Referable glaucoma; Tuning: 849 Non-glaucomatous, 259 Low-risk GS, 268 High-risk GS, 110 Likely glaucoma, 378 Referable glaucoma | Validation set A: 687 Non-glaucomatous, 290 Low-risk GS, 170 High-risk GS, 48 Likely glaucoma, 218 Referable glaucoma; Validation set B: 8753 Non-glaucomatous, N/A Low-risk GS, N/A High-risk GS, 890 Likely glaucoma, 890 Referable glaucoma; Validation set C: 63 Non-glaucomatous, N/A Low-risk GS, 175 High-risk GS, 108 Likely glaucoma, 283 Referable glaucoma | Validation set A: Referable GON based on subjective gradings of photographs; Validation set B: Referable GON based on glaucoma-related International Classification of Diseases codes; Validation set C: referable GON based on full glaucoma workup by glaucoma specialists including clinical exam, history, VF assessment, and OCT | Inception-v3 | Color Fundus Photos | “Referable glaucoma” vs. Not | Validation set A: AUC 0.945; Validation set B: AUC 0.855; Validation set C: AUC 0.881 |
Shibata et al.8 | Train: 1364 glaucomatous appearance vs. 1768 not glaucomatous appearance; 3-fold cross-validation | Test: 33 non-highly myopic glaucoma, 28 highly myopic glaucoma, 27 non-highly myopic normal, 22 highly myopic normal | Train: Subjective gradings of photographs; Test: Subjective gradings of photographs and categorization of RNFL and macular inner retinal thickness measurements based on OCT normative database | ResNet | Color Fundus Photos | Glaucomatous vs. Not | AUC 0.965 |
Li et al.30 | Train 20,793/Validation 2,311: 11,176 GON-confirmed, 599 GON-suspected, 11,329 Normal; 10-fold cross-validation with a random selection of 9:1 for participants within each fold | Test: 1442 GON-confirmed, 515 GON-suspected, 1524 Normal | Subjective grading of photographs | ResNet101 | Color Fundus Photos | GON-confirmed vs. GON-suspected vs. Normal; Referrals (GON-confirmed and GON-suspected) vs. Observation (Normal) | Comparison of GON-confirmed vs. GON-suspected vs. Normal: Accuracy 0.941, Sensitivity 0.957, Specificity 0.929; AUC 0.992 for Referrals (GON-confirmed and GON-suspected) vs. Observation (Normal) |
Medeiros et al.24 | Train + validation (80% train, 20% validation): 9,136 Glaucoma, 13,410 Suspect, 3982 Healthy | Test: 2070 Glaucoma, 3345 Suspect, 877 Healthy | SDOCT global RNFL value; Abnormal (Glaucoma) vs. Normal (Normal + Borderline) RNFL based on classification of global RNFL by SDOCT normative database | ResNet34 | Color Optic Disc Photos paired to SDOCT global RNFL | SDOCT global RNFL value; Abnormal (Glaucoma) vs. Normal RNFL | Pearson r = 0.832, P < 0.001 between DL predicted and actual SDOCT value; Mean absolute error 7.39 µm; AUC 0.944 for DL vs. 0.94 for SDOCT (P = 0.72); 90% sensitivity at 80% specificity for both. |
Thompson et al.26 | Train + validation (80% train, 20% validation): 4,570 Glaucoma, 1924 Suspect, 1046 Healthy | Test: 970 Glaucoma, 432 Suspect, 340 Healthy | Global and sector BMO-MRW thickness values; Abnormal (Glaucoma) vs. Normal (Suspect + Normal) based on classification of BMO-MRW global and sector values by SDOCT normative database | ResNet34 | Color Optic Disc Photos paired to SDOCT global BMO-MRW | Global and sector BMO-MRW thickness values; Abnormal (Glaucoma) vs. Normal | Global BMO-MRW Pearson r = 0.88 (P < 0.001) between DL predicted and actual SDOCT value; Mean absolute error 27.8; AUC for DL 0.945 vs. actual SDOCT 0.933 (P = 0.59). |
Devalla et al.67 | 40 control/60 glaucoma; training on datasets of 10, 20, 30 or 40 B-scans, with equal number of glaucoma and healthy scans in each cross-validation experiment | Cross-validation experiments with test sets of 90, 80, 70, or 60. | Manual segmentation of ONH OCT | Custom eight-layer CNN | Horizontal B-scan through ONH | Digital stain of RNFL+prelamina, RPE, all other retinal layers, choroid, peripapillary sclera, lamina cribrosa | Dice coefficient 0.84, Sensitivity 92%, specificity 99%, accuracy 94% |
Mariottoni et al.65 | Train 10,520/Validation 2742 | Test Set 1 (images without segmentation errors or artifacts) 11,010; Test Set 2 (low-quality images with segmentation errors) 237; Test Set 3 (images with other artifacts) 776 | Global RNFL thickness value | ResNet34 | SDOCT raw B-scans of peripapillary RNFL | Global RNFL thickness value | Test set 1: Pearson r 0.983 (P < 0.001) between predicted segmentation-free and actual SDOCT global RNFL; MAE 2.41; Test set 2: DL correlation with BAE r = 0.972 vs. with conventional algorithm 0.94, P < 0.001. Test set 3: DL correlation with BAE r = 0.94 vs. with conventional algorithm r = 0.64, P < 0.001. |
Thompson et al.25 | Train + Validation (50%+20%): 4828 Glaucoma, 9638 Normal | Test (30%): 3897 Glaucoma, 2443 Normal | Glaucoma (based on GON and reproducible glaucomatous visual field defects) vs. Healthy | ResNet34 | SDOCT raw B-scans of peripapillary RNFL | Glaucoma vs. Healthy | AUC 0.96 for DL algorithm vs. AUC 0.87 for global RNFL thickness (P < 0.001) |
Maetschke et al.66 | Train (80%): 672 POAG, 216 Healthy; Validation (10%): 30 Healthy, 82 POAG | Test (10%): 93 POAG, 17 Healthy | Glaucoma (based on glaucomatous VF defects on 2 consecutive tests) vs. Healthy | Custom 5-layer CNN | OCT of the ONH | Glaucoma vs. Healthy | AUC 0.94 |
Asaoka et al.7 | Pretraining: 1371 Open angle glaucoma, 193 Healthy; Training: 94 Open angle glaucoma, 84 Healthy | Test: 114 Open angle glaucoma and MD >−5 dB, 82 Healthy | Glaucoma (based on GON and glaucomatous VF defects) vs. Healthy | Custom 6-layer CNN | 8 × 8 macular grid | Glaucoma vs. Healthy | AUC 0.937 |
Xu et al.69 | Cross-validation (85%: 80% training/20% validation) 1632 OAG, 1764 closed | Test (15%): 311 open, 329 closed | Angle Closed vs. Open based on gonioscopic grade | ResNet18; Inception v3 | Anterior Segment-OCT | Angle closed vs. Open | AUC 0.928 |
Fu et al.70 | 7375 open angle, 895 angle closure: 5-fold cross-validation - four groups, each with 1654 angle closure tests for training, and one group of 1654 angle closure for testing | 1654 angle closure for testing within each fold | Angle closed vs. open based on gonioscopic grade | VGG-16 | Anterior Segment-OCT | Angle closed vs. open | AUC 0.96, sensitivity 90%, specificity 92% |
Mariottoni et al.76 | Training/Validation: 3980 Glaucoma, 3732 Normal | Test: 1061 Glaucoma, 1057 Normal | GON vs. GON suspects vs. Normal based on SAP and OCT objective criteria (see Table 2) | ResNet50 | Optic Disc Photos | GON vs. Normal | AUC 0.92, Sensitivity 77% at Specificity 95% |
Li et al.71 | Overall: 2389 Glaucoma, 1623 Non-glaucoma; Train: 3712 | Test: 300 | Glaucoma (based on glaucomatous damage to ONH and reproducible glaucomatous VF defects) vs. Healthy | VGG | Pattern Deviation plots from Humphrey Field Analyzer 30-2 or 24-2 visual field tests | Glaucoma vs. Healthy | AUC 0.966, Sensitivity 93.2%, Specificity 82.6% |
Kucur et al.72 | 1979 control (Rotterdam 244; Budapest 1735), 2811 Early glaucoma (Rotterdam 2,279; Budapest 532); 10-fold cross-validation | 10-fold cross-validation; unclear if separate test and validation datasets were used | Early glaucoma (based on glaucomatous neuroretinal rim loss, reproducible VF defects, and IOP) vs. Healthy | Custom 7-layer CNN | OCTOPUS 101 G1 and Humphrey Field Analyzer 24-2 visual field tests | Early Glaucoma vs. Healthy | Average Precision: Rotterdam 87.4%, Budapest 98.6% |
Asaoka et al.10 | 171 Preperimetric glaucoma vs. 108 Normal and 63 artificially generated Normal; leave-one-out cross-validation | Leave-one-out cross-validation; a separate test dataset was not used | Preperimetric OAG (based on ONH changes, VF preceding perimetric field changes) vs. Healthy | Custom DL feed-forward neural network | Humphrey Field Analyzer 24-2 | Preperimetric glaucoma vs. Healthy | AUC 0.926 |
Berchuck et al.89 | Train (81%): 768 Glaucoma, 1793 Glaucoma suspects, 547 Normal; Validation (9%): 83 Glaucoma, 222 Glaucoma suspect, 58 Normal; 5-fold cross-validation | Test (9%): 93 Glaucoma, 206 Glaucoma suspect, 62 Normal | Glaucoma (repeatable glaucomatous VF defect and corresponding optic nerve damage) vs. Glaucoma suspect (high IOP or suspicious optic nerve but no VF defect) vs. Normal (No visual field or optic nerve defect) | Deep variational autoencoder | Humphrey Field Analyzer 24-2 | Rates of VF progression compared to SAP MD; Prediction of future VF compared to point-wise regression predictions | Rate of progression significantly higher for VAE than MD at 2 years (25% vs. 9%) and 4 years (35% vs. 15%) from baseline. MAE for prediction of 4th, 6th, and 8th visits significantly smaller for VAE than PW (P < 0.001) |
Wen et al.91 | Train + validation (80%): 25,723 and 10-fold cross-validation | Test (20%): 6720 | Actual HFA points and Mean Deviation from HVF | CascadeNet-5 | Humphrey Field Analyzer 24-2 | HFA points and Mean Deviation | PMAE 2.47; Mean difference in MD between predicted and actual MD = 0.41 dB, Pearson r = 0.92, P < 0.001 |
BAE, best available estimate; DL, deep learning; GON, glaucomatous optic neuropathy; VF, visual field; HFA, Humphrey Field Analyzer; HVF, Humphrey Visual Field; IOP, intraocular pressure; MAE, mean absolute error; PMAE, point-wise mean absolute error; POAG, primary open angle glaucoma; OAG, open angle glaucoma; ONH, optic nerve head; RPE, retinal pigment epithelium.