2020 Jul 22;9(2):42. doi: 10.1167/tvst.9.2.42

Table 1.

Summary of Studies Using Deep Learning Models in Glaucoma

Citation	Training/Validation Dataset	Test Dataset	Reference	Network	Data type	Output	Results
Ting et al.²⁷	Train: 125,189	Test: 71,896	Subjective grading of photographs	Custom deep learning system	Color Fundus Photos	“Referable for glaucoma” vs. not	AUC 0.942; Sensitivity 96.4%, Specificity 87.2%
Li et al.⁹	Train: 31,745	8000	Subjective grading of photographs	Inception-v3	Color Fundus Photos	“Referable for glaucoma” vs. not	AUC 0.986; Sensitivity 95.6%, Specificity 92.0%
Christopher et al.³⁶	9189 healthy, 5633 GON: divided randomly into multiple folds for 10-fold cross-validation.	10% test	Subjective grading of photographs	VGG16, Inception-v3, ResNet50	Color Fundus Photos	“GON” vs. healthy	ResNet50 AUC 0.91; Sensitivity 85% at 80% Specificity
Liu et al.²⁸	Train: 29,865 GON, 11,046 probable GON, 200,121 unlikely GON	Validation: 4514 GON, 571 Probable GON, 23,484 unlikely GON	Subjective grading of photographs	ResNet	Color Fundus Photos	“Referable GON” vs. not	AUC 0.996, Sensitivity 96.2%, Specificity 97.7%
Ahn et al.²⁹	Train: 228 Advanced glaucoma, 131 Early glaucoma, 385 Normal; Validation: 98 Advanced glaucoma, 61 Early glaucoma, 165 Normal	Test: 141 Advanced glaucoma, 87 Early glaucoma, 236 Normal	Subjective grading of visual field, OCT and RNFL photographs	Inception-v3; Custom 3-layer CNN	Color Fundus Photos	Glaucoma vs. Normal	Inception-v3 model: AUC 0.93; Average accuracy 84.5%; Custom 3-layer CNN: AUC 0.94, Average accuracy 87.9%
Phene et al.³¹	Train: 35,877 Non-glaucomatous, 20,740 Low-risk GS, 13,180 High-risk GS, 5307 Likely glaucoma, 18,487 Referable glaucoma; Tuning: 849 Non-glaucomatous, 259 Low-risk GS, 268 High-risk GS, 110 Likely glaucoma, 378 Referable glaucoma	Validation set A: 687 Non-glaucomatous, 290 Low-risk GS, 170 High-risk GS, 48 Likely glaucoma, 218 Referable glaucoma; Validation set B: 8753 Non-glaucomatous, N/A Low-risk GS, N/A High-risk GS, 890 Likely glaucoma, 890 Referable glaucoma; Validation set C: 63 Non-glaucomatous, N/A Low-risk GS, 175 High-risk GS, 108 Likely glaucoma, 283 Referable glaucoma	Validation set A: Referable GON based on subjective gradings of photographs; Validation set B: Referable GON based on glaucoma-related International Classification of Diseases codes; Validation set C: referable GON based on full glaucoma workup by glaucoma specialists including clinical exam, history, VF assessment, and OCT	Inception-v3	Color Fundus Photos	“Referable glaucoma” vs. Not	Validation set A: AUC 0.945; Validation set B: AUC 0.855; Validation set C: AUC 0.881
Shibata et al.⁸	Train: 1364 glaucomatous appearance vs. 1768 not glaucomatous appearance; 3-fold cross-validation	Test: 33 non-highly myopic glaucoma, 28 highly myopic glaucoma, 27 non-highly myopic normal, 22 highly myopic normal	Train: Subjective gradings of photographs; Test: Subjective gradings of photographs and categorization of RNFL and macular inner retinal thickness measurements based on OCT normative database	ResNet	Color Fundus Photos	Glaucomatous vs. Not	AUC 0.965
Li et al.³⁰	Train 20,793/Validation 2,311: 11,176 GON-confirmed, 599 GON-suspected, 11,329 Normal; 10-fold cross-validation with a random selection of 9:1 for participants within each fold	Test: 1442 GON-confirmed, 515 GON-suspected, 1524 Normal	Subjective grading of photographs	ResNet101	Color Fundus Photos	GON-confirmed vs. GON-suspected vs. Normal; Referrals (GON-confirmed and GON-suspected) vs. Observation (Normal)	Comparison of GON-confirmed vs. GON-suspected vs. Normal: Accuracy 0.941, Sensitivity 0.957, Specificity 0.929. AUC 0.992 for Referrals (GON-confirmed and GON-suspected) vs. Observation (Normal)
Medeiros et al.²⁴	Train + validation (80% train, 20% validation), 9,136 Glaucoma, 13,410 Suspect, 3982 Healthy	Test: 2070 Glaucoma, 3345 Suspect, 877 Healthy	SDOCT global RNFL value; Abnormal (Glaucoma) vs. Normal (Normal + Borderline) RNFL based on classification of global RNFL by SDOCT normative database	ResNet34	Color Optic Disc Photos paired to SDOCT global RNFL	SDOCT global RNFL value; Abnormal (Glaucoma) vs. Normal RNFL	Pearson r = 0.832, P < 0.001 between DL predicted and actual SDOCT value; Mean absolute error 7.39 µm; AUC 0.944 for DL vs. 0.94 for SDOCT (P = 0.72); 90% sensitivity at 80% specificity for both.
Thompson et al.²⁶	Train + validation (80% train, 20% validation): 4,570 Glaucoma, 1924 Suspect, 1046 Healthy	Test: 970 Glaucoma, 432 Suspect, 340 Healthy	Global and sector BMO-MRW thickness values; Abnormal (Glaucoma) vs. Normal (Suspect + Normal) based on classification of BMO-MRW global and sector values by SDOCT normative database	ResNet34	Color Optic Disc Photos paired to SDOCT global BMO-MRW	Global and sector BMO-MRW thickness values; Abnormal (Glaucoma) vs. Normal	Global BMO-MRW Pearson r = 0.88 (P < 0.001) between DL predicted and actual SDOCT value; Mean absolute error 27.8; AUC for DL 0.945 vs. actual SDOCT 0.933 (P = 0.59).
Devalla et al.⁶⁷	40 control/60 glaucoma; training on datasets of 10, 20, 30 or 40 B-scans, with equal number of glaucoma and healthy scans in each cross-validation experiment	Cross-validation experiments with test sets of 90, 80, 70, or 60.	Manual segmentation of ONH OCT	Custom eight-layer CNN	Horizontal B-scan through ONH	Digital stain of RNFL+prelamina, RPE, all other retinal layers, choroid, peripapillary sclera, lamina cribrosa	Dice coefficient 0.84, Sensitivity 92%, specificity 99%, accuracy 94%
Mariottoni et al.⁶⁵	Train 10,520/Validation 2742	Test Set 1 (images without segmentation errors or artifacts) 11,010; Test Set 2 (low-quality images with segmentation errors) 237; Test Set 3 (images with other artifacts) 776	Global RNFL thickness value	ResNet34	SDOCT raw B-scans of peripapillary RNFL	Global RNFL thickness value	Test set 1: Pearson r 0.983 (P < 0.001) between predicted segmentation-free and actual SDOCT global RNFL; MAE 2.41; Test set 2: DL correlation with BAE r = 0.972 vs. with conventional algorithm 0.94, P < 0.001. Test set 3: DL correlation with BAE r = 0.94 vs. with conventional algorithm r = 0.64, P < 0.001.
Thompson et al.²⁵	Train + Validation (50%+20%): 4828 Glaucoma, 9638 Normal	Test (30%): 3897 Glaucoma, 2443 Normal	Glaucoma (based on GON and reproducible glaucomatous visual field defects) vs. Healthy	ResNet34	SDOCT raw B-scans of peripapillary RNFL	Glaucoma vs. Healthy	AUC 0.96 for DL algorithm vs. AUC 0.87 for global RNFL thickness (P < 0.001)
Maetschke et al.⁶⁶	Train (80%): 672 POAG, 216 Healthy; Validation (10%): 30 Healthy, 82 POAG	Test (10%): 93 POAG, 17 Healthy	Glaucoma (based on glaucomatous VF defects on 2 consecutive tests) vs. Healthy	Custom 5-layer CNN	OCT of the ONH	Glaucoma vs. Healthy	AUC 0.94
Asaoka et al.⁷	Pretraining: 1371 Open angle glaucoma, 193 Healthy; Training: 94 Open angle glaucoma, 84 Healthy	Test: 114 Open angle glaucoma and MD >−5 dB, 82 Healthy	Glaucoma (based on GON and glaucomatous VF defects) vs. Healthy	Custom 6-layer CNN	8 × 8 macular grid	Glaucoma vs. Healthy	AUC 0.937
Xu et al.⁶⁹	Cross-validation (85%: 80% training/20% validation) 1632 OAG, 1764 closed	Test (15%): 311 open, 329 closed	Angle Closed vs. Open based on gonioscopic grade	ResNet18; Inception v3	Anterior Segment-OCT	Angle closed vs. Open	AUC 0.928
Fu et al.⁷⁰	7375 open angle, 895 angle closure: 5-fold cross-validation - four groups, each with 1654 angle closure tests for training, and one group of 1654 angle closure for testing	1654 angle closure for testing within each fold	Angle closed vs. open based on gonioscopic grade	VGG-16	Anterior Segment-OCT	Angle closed vs. open	AUC 0.96, sensitivity 90%, specificity 92%
Mariottoni et al.⁷⁶	Training/Validation: 3980 Glaucoma, 3732 Normal	Test: 1061 Glaucoma, 1057 Normal	GON vs. GON suspects vs. Normal based on SAP and OCT objective criteria (see Table 2)	ResNet50	Optic Disc Photos	GON vs. Normal	AUC 0.92, Sensitivity 77% at Specificity 95%
Li et al.⁷¹	Overall: 2389 Glaucoma, 1623 Non-glaucoma: Train: 3712	Test: 300	Glaucoma (based on glaucomatous damage to ONH and reproducible glaucomatous VF defects) vs. Healthy	VGG	Pattern Deviation plots from Humphrey Field Analyzer 30-2 or 24-2 visual field tests	Glaucoma vs. Healthy	AUC 0.966, Sensitivity 93.2%, Specificity 82.6%
Kucur et al.⁷²	1979 control (Rotterdam 244; Budapest 1735), 2811 Early glaucoma (Rotterdam 2,279; Budapest 532); 10-fold cross-validation	10-fold cross-validation; unclear if separate test and validation datasets were used	Early glaucoma (based on glaucomatous neuroretinal rim loss, reproducible VF defects, and IOP) vs. Healthy	Custom 7-layer CNN	OCTOPUS 101 G1 and Humphrey Field Analyzer 24-2 visual field tests	Early Glaucoma vs. Healthy	Average Precision: Rotterdam 87.4%, Budapest 98.6%
Asaoka et al.¹⁰	171 Preperimetric glaucoma vs. 108 Normal and 63 artificially generated Normal; leave-one-out cross-validation	Leave-one-out cross-validation; a separate test dataset was not used	Preperimetric OAG (based on ONH changes, VF preceding perimetric field changes) vs. Healthy	Custom DL feed-forward neural network	Humphrey Field Analyzer 24-2	Preperimetric glaucoma vs. Healthy	AUC 0.926
Berchuck et al.⁸⁹	Train (81%): 768 Glaucoma, 1793 Glaucoma suspects, 547 Normal; Validation (9%): 83 Glaucoma, 222 Glaucoma suspect, 58 Normal; 5-fold cross-validation	Test (9%): 93 Glaucoma, 206 Glaucoma suspect, 62 Normal	Glaucoma (repeatable glaucomatous VF defect and corresponding optic nerve damage) vs. Glaucoma suspect (high IOP or suspicious optic nerve but no VF defect) vs. Normal (No visual field or optic nerve defect)	Deep variational autoencoder	Humphrey Field Analyzer 24-2	Rates of VF progression compared to SAP MD; Prediction of future VF compared to point-wise regression predictions	Rate of progression significantly higher for VAE than MD at 2 years (25% vs. 9%) and 4 years (35% vs. 15%) from baseline. MAE for prediction of 4th, 6th, and 8th visits significantly smaller for VAE than PW (P < 0.001)
Wen et al.⁹¹	Train + validation (80%): 25,723 and 10-fold cross-validation	Test (20%): 6720	Actual HFA points and Mean Deviation from HVF	CascadeNet-5	Humphrey Field Analyzer 24-2	HFA points and Mean Deviation	PMAE 2.47; Mean difference in MD between predicted and actual MD = 0.41 dB, Pearson r = 0.92, P < 0.001

BAE, best available estimate; DL, deep learning; GON, glaucomatous optic neuropathy; VF, visual field; HFA, Humphrey Field Analyzer; HVF, Humphrey Visual Field; IOP, intraocular pressure; MAE, mean absolute error; PMAE, point-wise mean absolute error; POAG, primary open angle glaucoma; OAG, open angle glaucoma; ONH, optic nerve head; RPE, retinal pigment epithelium.
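
For readers comparing the Results column, the following is a minimal Python sketch of how three of the reported metrics (sensitivity/specificity, MAE, and Pearson r) are conventionally computed. It is not taken from any of the cited studies; the function names and the toy values are illustrative assumptions only, and the studies themselves may have used different software or thresholds.

```python
from math import sqrt

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels (1 = glaucoma, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def mean_absolute_error(actual, predicted):
    """MAE between measured and predicted values (e.g., RNFL thickness)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical toy data, not drawn from any study in the table.
labels      = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(labels, predictions)

measured  = [90.0, 75.0, 60.0, 100.0]   # hypothetical measured thickness values
estimated = [88.0, 79.0, 63.0, 99.0]    # hypothetical model predictions
mae = mean_absolute_error(measured, estimated)
r = pearson_r(measured, estimated)
```

Sensitivity and specificity depend on the chosen classification threshold, which is why several rows report sensitivity at a fixed specificity, whereas AUC summarizes performance across all thresholds.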