Abstract
Purpose:
To validate the efficacy of a fully-automatic, deep learning-based segmentation algorithm beyond conventional performance metrics by measuring the primary outcome of a clinical trial for macular telangiectasia type 2 (MacTel2).
Design:
Evaluation of diagnostic test or technology
Participants:
92 eyes from 62 participants with MacTel2 in a phase 2 clinical trial (NCT01949324), randomized to one of two treatment groups.
Methods:
The ellipsoid zone (EZ) defect areas were measured on spectral domain optical coherence tomography images of each eye at two time points (Baseline and Month 24) by a fully-automatic, deep learning-based segmentation algorithm. The change in EZ defect area from Baseline to Month 24 was calculated and analyzed according to the clinical trial protocol.
Primary Outcome Measure:
Difference in the change in EZ defect area from Baseline to Month 24 between the two treatment groups.
Results:
The difference in the change in EZ defect area from Baseline to Month 24 between the two treatment groups measured by the fully-automatic segmentation algorithm was 0.072 ± 0.035 mm² (p = 0.021). This was comparable to the outcome of the clinical trial using semi-automatic measurements by expert Readers, 0.065 ± 0.033 mm² (p = 0.025).
Conclusions:
The fully-automatic segmentation algorithm was as accurate as semi-automatic expert segmentation in assessing EZ defect areas and reliably reproduced the statistically significant primary outcome measure of the clinical trial. This approach of validating the performance of an automatic segmentation algorithm on the primary clinical trial endpoint provides a robust gauge of its clinical applicability.
Introduction
Idiopathic macular telangiectasia type 2 (MacTel2) is a bilateral degenerative retinal disease that leads to progressive loss of visual function.1 Structural loss of the photoreceptor ellipsoid zone (EZ), also known as the inner segment/outer segment (IS/OS) junction,2 is a biomarker of the disease. EZ defects are visible on optical coherence tomography (OCT) images1 and correlate with a functional loss of retinal sensitivity.3-5 Thus, the assessment of EZ defects is an important outcome measure in observational and interventional clinical studies of MacTel2, including an ongoing clinical trial of an encapsulated cell-based implant delivering ciliary neurotrophic factor (CNTF) to slow photoreceptor loss during retinal degeneration.6, 7
Many automatic segmentation algorithms have been developed to analyze retinal OCT images of eyes with a variety of conditions,8-21 including algorithms that quantify EZ defects.22-27 To date, the performance of these algorithms has been described by metrics such as sensitivity and specificity. For segmentation algorithms, the Dice similarity coefficient (DSC) is one of the most widely used performance metrics.28 However, as there are no established thresholds that define acceptable performance, it is unclear whether these metrics are informative for clinical applications and whether these algorithms can be used reliably. Therefore, although automatic algorithms are often used to aid segmentation, in clinical trials a human expert manually corrects segmentation errors; accordingly, semi-automatic segmentation has been the most common approach in clinical applications.3, 29-35
In many applications, the segmentation itself is not necessarily the primary clinically-relevant endpoint. For example, consider the case illustrated in Figure 1 of a longitudinal clinical trial in which the change in biomarker size, here the length of EZ defect regions between two time points, is used to monitor the effect of treatment on disease progression. Due to inherent bias (e.g. different thresholds of hypo-reflectivity), two image analysis systems, whether manual or automatic, may measure statistically significantly different EZ defect lengths at each time point. However, if both systems are well-designed and consistently measure corresponding EZ defect regions, the difference between their measurements of the change in length over time may be statistically insignificant, thus reflecting the same treatment effect. Therefore, we believe that evaluating automatic segmentation algorithms on clinically-relevant endpoints would better reflect their clinical applicability.
Figure 1:
The white and yellow lines illustrate measurements of EZ defect length by two image analysis systems (a manually-corrected method7 in white and a fully-automated method26 in yellow). While there is a relatively large difference between the two measurements of the EZ defect length at each time point, there is an insignificant difference between the two measurements of the change, δ, in length between time points, as both image analysis systems are consistent in their respective measurements.
In this article, we validate the clinically-relevant performance of a fully-automatic segmentation method by evaluating its ability to reliably reproduce the primary outcome measure of a clinical trial, in addition to more conventional performance metrics. We use Deep OCT Atrophy Detection (DOCTAD), a recently developed, fully-automatic, deep learning-based segmentation method,26 to assess EZ defects on OCT images from the phase 2 clinical trial of CNTF for MacTel2.7 Herein, we show that the fully-automatic deep learning-based segmentation method produced a statistically significant primary clinical trial endpoint result comparable to that of the much more labor-intensive, expert-evaluated semi-automatic method. We demonstrate the potential to replace the current standard practice of semi-automatic segmentation, which is both expensive and time-consuming, and recommend that, for clinical trial applications, automatic segmentation algorithms be validated using clinically-relevant endpoints.
Methods
Participants:
The study data set consisted of spectral domain (SD)-OCT volumes of 92 eyes from 62 per-protocol participants enrolled in the international, multicenter, randomized, phase 2 clinical trial of CNTF for MacTel2 (NCT01949324; NTMT02; Neurotech, Cumberland, RI, USA).7 These 92 eyes were drawn from an intent-to-treat population of 99 eyes from 67 participants; seven eyes were later considered ineligible because of evidence of subretinal neovascular proliferation, inadequate EZ defect area at Baseline, or other protocol deviations such as incorrect imaging modality.7 The study complied with the Health Insurance Portability and Accountability Act (HIPAA) and Clinical Trials (United States and Australia) guidelines, adhered to the tenets of the Declaration of Helsinki, and was approved by the institutional ethics committees at each participating center.
Participants with two eligible eyes were randomized to receive a CNTF implant in one eye and sham treatment in the other eye. Participants with one eligible eye were randomized to receive either the CNTF implant or sham treatment. All participants were imaged at multiple pre-specified time points over the course of two years (Baseline to Month 24) on Spectralis SD-OCT systems (Heidelberg Engineering GmbH, Heidelberg, Germany) at 11 study sites in the United States and Australia. All Baseline and Month 24 SD-OCT volumes consisted of 97 B-scans × 1024 A-scans within a 20° × 20° (approximately 6 mm × 6 mm) retinal area, except for 12 Baseline volumes that consisted of 512 A-scans. All eyes were evaluated through Month 24, except for three eyes that were lost to follow-up owing to the deaths of two participants.
Semi-automatic Segmentation (Clinical Trial Gold Standard):
Semi-automatic segmentation of the EZ defect areas was performed as described in previous publications.3, 7, 26 Briefly, for each B-scan in a given SD-OCT volume, the retinal layer boundaries defining the EZ were automatically segmented by graph search using the Duke OCT Retinal Analysis Program (DOCTRAP; Duke University, Durham, NC, USA).36 The automatic segmentations were reviewed and manually corrected by an expert Reader at Duke Reading Center. A second, more senior Reader reviewed the corrected segmentations, and further corrected the segmentations as necessary.
The EZ thicknesses were axially projected onto a 2-dimensional (2-D) image to generate an en face EZ thickness map. The EZ thickness map was interpolated to obtain a pixel pitch of 10μm in each direction. EZ thicknesses less than 12μm were classified as EZ defects3 and the EZ thickness map was thresholded to obtain a binary map of EZ defects. Figure 2 illustrates the semi-automatic segmentation process.
Figure 2:
Semi-automatic segmentation process used as the clinical trial gold standard. (a) SD-OCT volume scan. (b) Segmentation of the inner (magenta) and outer (cyan) retinal layer boundaries that corresponded to the EZ, after manual correction by expert Readers. (c) En face EZ thickness map generated by axial projection of EZ thicknesses. (d) En face binary map of EZ defects obtained by thresholding the EZ thickness map.
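As a minimal illustration of the projection and thresholding steps above (not the DOCTRAP implementation), the following sketch converts hypothetical inner and outer EZ boundary positions into an en face binary defect map; the array shapes, boundary convention, and function name are our assumptions.

```python
# Minimal sketch (see assumptions in the text): build an en face binary map of
# EZ defects from inner/outer EZ boundary positions given in micrometers.
import numpy as np
from scipy.ndimage import zoom

def ez_defect_map(inner_um, outer_um, scan_mm=(6.0, 6.0),
                  pitch_um=10.0, thresh_um=12.0):
    # Axial EZ thickness per A-scan, projected en face (one value per A-scan).
    thickness = outer_um - inner_um                      # (n_bscans, n_ascans)
    # Interpolate to an isotropic 10 um pixel pitch over the ~6 x 6 mm area.
    rows = int(round(scan_mm[0] * 1000 / pitch_um))
    cols = int(round(scan_mm[1] * 1000 / pitch_um))
    factors = (rows / thickness.shape[0], cols / thickness.shape[1])
    thickness_10um = zoom(thickness, factors, order=1)   # bilinear interpolation
    # Thicknesses below 12 um are classified as EZ defects.
    return thickness_10um < thresh_um

# Example: a 97 x 1024 (B-scan x A-scan) grid yields a 600 x 600 binary map.
binary_map = ez_defect_map(np.zeros((97, 1024)), np.full((97, 1024), 8.0))
print(binary_map.shape, binary_map.all())                # (600, 600) True
```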
Fully-automatic Segmentation:
Fully-automatic segmentation of the EZ defect areas was obtained by a modified version of DOCTAD, the original version of which was described in our recent publication.26 Briefly, a convolutional neural network (CNN) was designed and trained to classify clusters of SD-OCT A-scans as “normal” or “EZ defects”. During prediction, the trained CNN was used to generate an en face probability map of EZ defects for each SD-OCT volume. The probability map was interpolated to obtain a pixel pitch of 10μm in each direction, thresholded, and post-processed to obtain the binary map of EZ defects from which the EZ defect areas could be measured. Figure 3 illustrates the fully-automatic segmentation process.
Figure 3:
Fully-automatic segmentation process. (a) SD-OCT volume scan. (b) Clusters of A-scans extracted from the SD-OCT volume scan. (c) A trained CNN was used to classify clusters of A-scans as “normal” or “EZ defects”. (d) En face probability map of EZ defects generated by the trained CNN. (e) En face binary map of EZ defects obtained by thresholding and post-processing the probability map.
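The prediction pipeline can be summarized in the sketch below; `cnn_predict`, the per-A-scan cluster handling, and the scan dimensions are illustrative placeholders rather than DOCTAD's actual interface.

```python
# Schematic prediction pipeline (placeholder CNN interface, see text).
import numpy as np
from scipy.ndimage import zoom, binary_fill_holes

def predict_defect_map(volume, cnn_predict, scan_mm=(6.0, 6.0),
                       pitch_um=10.0, thresh=0.65):
    n_bscans, _, n_ascans = volume.shape          # (B-scans, depth, A-scans)
    # 1. Classify clusters of A-scans; here, one probability per A-scan position.
    prob = np.array([[cnn_predict(volume[b, :, a]) for a in range(n_ascans)]
                     for b in range(n_bscans)])   # en face probability map
    # 2. Interpolate to an isotropic 10 um pixel pitch.
    rows = int(round(scan_mm[0] * 1000 / pitch_um))
    cols = int(round(scan_mm[1] * 1000 / pitch_um))
    prob = zoom(prob, (rows / prob.shape[0], cols / prob.shape[1]), order=1)
    # 3. Threshold at the value chosen on the hold-out validation set (0.65),
    # 4. then post-process by filling holes in the binary map.
    return binary_fill_holes(prob > thresh)

# Example with a small dummy volume and a dummy classifier:
dummy = lambda ascan: float(ascan.mean() > 0.5)
print(predict_defect_map(np.random.rand(10, 64, 32), dummy).shape)  # (600, 600)
```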
All modifications made since the publication of DOCTAD26 are fully-automatic and are as follows:
Pre-processing: As previously described,26 all B-scans in a volume were cropped about a single estimated center during cluster extraction to remove as much of the background as possible. However, for images in which the retina is tilted, whether due to off-axis scan acquisition or the natural curvature of the eye, this method is not ideal and may crop away a portion of the retina. Therefore, for each B-scan, estimates of the retinal nerve fiber layer (RNFL) and retinal pigment epithelium (RPE), the innermost and outermost retinal layers, were obtained by a simple smoothing and thresholding method, as previously described.26 The estimated center was then shifted up or down as necessary to ensure that the cropping did not remove any portion of the image between the RNFL and RPE estimates.
Thresholding: Grid search was used to determine the best threshold to apply to the probability map by maximizing the DSC on the hold-out validation set (a brief sketch follows this list). The best threshold was determined to be 0.65.
Post-processing: An additional binary morphological operation was applied to fill any holes in the binary map of EZ defects.37
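The grid search for the threshold can be written in a few lines; the candidate grid and variable names below are ours, not the published implementation.

```python
# Grid search for the probability-map threshold: sweep candidates and keep the
# one that maximizes the mean DSC over the hold-out validation set.
import numpy as np

def dsc(pred, truth):
    # Dice similarity coefficient between two binary maps.
    tp = np.logical_and(pred, truth).sum()
    return 2.0 * tp / (pred.sum() + truth.sum() + 1e-9)

def best_threshold(prob_maps, truth_maps,
                   candidates=np.arange(0.05, 1.00, 0.05)):
    scores = [np.mean([dsc(p > t, g) for p, g in zip(prob_maps, truth_maps)])
              for t in candidates]
    return float(candidates[int(np.argmax(scores))])  # 0.65 for this data set
```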
Six-fold cross-validation at the participant level, identical to the original publication,26 was used to ensure independence of the training and testing sets. The 67 enrolled participants were divided into six groups: five groups contained 11 participants and one group contained 12. As both eyes were imaged regardless of eligibility, the Baseline volumes of both eyes were used during training. Each training set consisted of the Baseline volumes of participants in five groups; from this training set, one group was set aside as the hold-out validation set. The groups were then rotated such that each group was excluded from the training set exactly once. As a result, there were six CNNs, each trained on a different set of groups. During prediction, for a given SD-OCT volume of a participant, the CNN that did not include that participant in its training set was used to generate the en face probability map of EZ defects.
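A participant-level split of this kind can be reproduced with scikit-learn's GroupKFold, as in the sketch below; the two-eyes-per-participant layout is illustrative.

```python
# Participant-level six-fold cross-validation: eyes of the same participant
# never appear in both the training and testing sets.
import numpy as np
from sklearn.model_selection import GroupKFold

participants = np.repeat(np.arange(67), 2)  # two eyes per participant (illustrative)
volumes = np.arange(participants.size)      # stand-ins for the Baseline volumes

for fold, (train, test) in enumerate(
        GroupKFold(n_splits=6).split(volumes, groups=participants)):
    # Within each training set, one group would additionally be set aside as
    # the hold-out validation set, rotating across folds (not shown here).
    assert not set(participants[train]) & set(participants[test])
    print(f"fold {fold}: {train.size} training volumes, {test.size} test volumes")
```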
Performance Metrics:
We calculated three performance metrics – the DSC, absolute error (Ea), and signed error (Es) – of the fully-automatic segmentation method as compared to the semi-automatic segmentation method (clinical trial gold standard). The performance metrics were calculated as
$$\mathrm{DSC} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \tag{1}$$

$$E_a = k \cdot (FP + FN) \tag{2}$$

$$E_s = k \cdot (FP - FN) \tag{3}$$

where TP was the number of true positives, FP was the number of false positives, FN was the number of false negatives, and k = 0.0001 was the conversion factor from pixels to mm². Es is also equivalent to the difference between EZ defect areas measured by the fully- and semi-automatic segmentation methods.
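For concreteness, a direct transcription of Eqs. (1)-(3) into code (a minimal sketch; the variable names are ours):

```python
# Performance metrics computed from the fully-automatic (pred) and
# semi-automatic gold standard (gold) binary maps of EZ defects.
import numpy as np

def performance_metrics(pred, gold, k=1e-4):
    tp = np.sum(pred & gold)    # true positives
    fp = np.sum(pred & ~gold)   # false positives
    fn = np.sum(~pred & gold)   # false negatives
    dsc = 2 * tp / (2 * tp + fp + fn)
    e_a = k * (fp + fn)         # absolute error in mm^2 (10 um x 10 um pixels)
    e_s = k * (fp - fn)         # signed error = fully - semi area, in mm^2
    return dsc, e_a, e_s
```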
Primary Outcome Measure:
The primary outcome measure was the change in total EZ defect area from Baseline to Month 24 for each per-protocol eye. A mixed effects model incorporating random effects (to account for the correlation between fellow eyes) and fixed effects (treatment group) was used to compute the mean difference in change between the sham and CNTF implant treatment groups and the corresponding one-sided p-values. Both the Readers and the algorithm-development team were masked to the treatment group of each eye. All statistical analysis was conducted by a biostatistician (TEC), remote from and independent of the algorithm-development team, using commercially-available software (SAS Version 9.3; SAS Institute, Cary, NC, USA).
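The trial statistics were computed in SAS; purely as an illustrative analogue (not the trial's code), a comparable linear mixed-effects model can be specified with statsmodels, using a random intercept per participant to absorb the correlation between fellow eyes. The column names are hypothetical.

```python
# Illustrative analogue of the trial's mixed effects model (the actual
# analysis used SAS): random intercept per participant, fixed treatment effect.
import pandas as pd
import statsmodels.formula.api as smf

def treatment_effect(df: pd.DataFrame):
    # Hypothetical columns: 'change'      - Month 24 minus Baseline EZ defect
    #                                       area (mm^2)
    #                       'treatment'   - 'sham' or 'cntf'
    #                       'participant' - grouping variable for fellow eyes
    result = smf.mixedlm("change ~ treatment", df,
                         groups=df["participant"]).fit()
    coef = "treatment[T.sham]"   # label depends on the category ordering
    # Halve the two-sided p-value for a one-sided test in the expected direction.
    return result.params[coef], result.pvalues[coef] / 2.0
```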
Results
Quantitative Analysis:
Table 1 shows the performance metrics of the fully-automatic segmentation method as compared to the semi-automatic segmentation method (clinical trial gold standard). Overall, there was good agreement between the semi- and fully-automatic binary maps of EZ defects at Baseline and Month 24 with high DSC and low errors. There was a high positive correlation (Pearson’s r = 0.79) between the change in total EZ defect area measured by the semi- and fully-automatic segmentation methods. Figure 4 shows the Bland-Altman plot of the agreement between the measurements.
Table 1:
Performance metrics (mean ± standard deviation) of the fully-automatic segmentation method at Baseline and Month 24.
| Performance metric | Baseline, semi-automatic | Baseline, fully-automatic | Month 24, semi-automatic | Month 24, fully-automatic |
|---|---|---|---|---|
| EZ defect area (mm²) | 0.702 ± 0.444 | 0.747 ± 0.455 | 0.871 ± 0.535 | 0.861 ± 0.525 |
| DSC | – | 0.880 ± 0.094 | – | 0.894 ± 0.091 |
| Ea (mm²) | – | 0.146 ± 0.103 | – | 0.150 ± 0.103 |
| Es (mm²) | – | 0.045 ± 0.097 | – | −0.010 ± 0.125 |
Figure 4:
Bland-Altman plot of the semi- and fully-automatic change in total EZ defect area from Baseline to Month 24. The solid and dotted lines show the mean and 95% limits of agreement of the difference, respectively.
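For reference, the Bland-Altman summary statistics underlying Figure 4 can be computed as below (a minimal sketch; the inputs are the per-eye changes in mm²).

```python
# Bland-Altman statistics: mean difference and 95% limits of agreement between
# the semi- and fully-automatic changes in total EZ defect area (mm^2).
import numpy as np

def bland_altman(semi, fully):
    diff = np.asarray(fully) - np.asarray(semi)  # per-eye differences
    mean_diff = diff.mean()                      # bias (solid line)
    half_width = 1.96 * diff.std(ddof=1)         # 95% limits (dotted lines)
    return mean_diff, mean_diff - half_width, mean_diff + half_width
```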
Qualitative Analysis:
Overall, there was good agreement between the semi- and fully-automatic segmentations of EZ defects at Baseline and Month 24. Figure 5 shows an example in which the change in total EZ defect area measured by the semi- and fully-automatic segmentation methods was very similar, with differences occurring only around the boundaries of the EZ defect areas.
Figure 5:
Segmentation of EZ defect areas at Baseline and Month 24. The B-scans correspond to the position marked by the horizontal line on the en face image. The change in total EZ defect area measured by both semi- and fully-automatic segmentation methods was very similar and minor differences occurred only around the boundaries of the EZ defect areas. (Legend: As/f = semi/fully-automatic EZ defect areas, δs/f = semi/fully-automatic change in total EZ defect area, white = EZ defects, green = true positives, blue = false positives, red = false negatives)
Further qualitative assessment of the segmentations showed that the smaller change in total EZ defect area measured by the fully-automatic segmentation method can be attributed to the algorithm’s tendency to classify “borderline defective” areas as EZ defects, especially at Baseline. In these areas, the EZ is often faintly visible, a sign that it is diseased but not yet completely lost. Figure 6 shows an example in which a small “borderline defective” area at Baseline was classified as normal by the semi-automatic segmentation method but as EZ defects by the fully-automatic segmentation method. By Month 24, the same area had progressively degenerated and was classified as EZ defects by both segmentation methods.
Figure 6:
Segmentation of EZ defect areas at Baseline and Month 24. The B-scans correspond to the position marked by the horizontal line on the en face image. The change in total EZ defect area measured by the fully-automatic segmentation method was less than that measured by the semi-automatic segmentation method. This result was caused by a “borderline defective” area (hypo-reflective EZ region indicated by the yellow arrow) at Baseline that was classified as normal by the semi-automatic segmentation method but classified as EZ defects by the fully-automatic segmentation method. By Month 24, the same area (orange arrow) had progressively degenerated and was classified as EZ defects by both segmentation methods. (Legend: As/f = semi/fully-automatic EZ defect areas, δs/f = semi/fully-automatic change in total EZ defect area, white = EZ defects, green = true positives, blue = false positives, red = false negatives)
“Borderline defective” areas are difficult to segment and correct even for expert Readers, and the final corrected segmentation often comes down to a judgment call, which may vary between Readers, or even within the same Reader at different time points. To illustrate this point, in contrast to Figure 6, in which a small “borderline defective” area at Baseline was classified as normal by the semi-automatic segmentation method, Figure 7 shows an example in which a small “borderline defective” area at Baseline was instead classified as EZ defects by the semi-automatic segmentation method.
Figure 7:
Segmentation of EZ defect areas at Baseline. The B-scans correspond to the position marked by the horizontal line on the en face image. A “borderline defective” area (yellow arrow) was classified as EZ defects by both semi- and fully-automatic segmentation methods. (Legend: As/f = semi/fully-automatic EZ defect areas, white = EZ defects, green = true positives, blue = false positives, red = false negatives)
Upon re-review, a small number of definite manual correction errors were also found in the semi-automatic segmentations, although these were rare. Figure 8 shows examples in which the EZ was misclassified by the semi-automatic segmentation method but correctly classified by the fully-automatic segmentation method.
Figure 8:
Manual correction errors in the semi-automatic segmentation method. (a) An EZ defect area misclassified as normal by the semi-automatic segmentation method (yellow arrow) was correctly classified as EZ defects by the fully-automatic segmentation method. (b) A normal area misclassified as EZ defects by the semi-automatic segmentation method (orange arrow) was correctly classified as normal by the fully-automatic segmentation method. (Legend: white = EZ defects, green = true positives, blue = false positives, red = false negatives)
Primary Outcome Measure:
Overall, both the semi- and fully-automatic segmentation methods measured a statistically significantly smaller change in total EZ defect area in the treatment group that received the CNTF implant than in the group that received sham treatment. Table 2 shows the change in total EZ defect area from Baseline to Month 24 for the sham and CNTF implant treatment groups, the difference between the treatment groups, and the corresponding one-sided p-value, for both the semi- and fully-automatic segmentation methods.
Table 2:
Change in total EZ defect area (mean ± standard deviation) and difference between treatment groups as measured by the semi- and fully-automatic segmentation methods. Statistical significance was based on a p-value < 0.05.
| Segmentation method | Sham (mm²) | CNTF implant (mm²) | Difference (mm²) | p-value |
|---|---|---|---|---|
| Semi-automatic | 0.213 ± 0.028 | 0.148 ± 0.029 | 0.065 ± 0.033 | 0.025 |
| Fully-automatic | 0.161 ± 0.026 | 0.089 ± 0.027 | 0.072 ± 0.035 | 0.021 |
Discussion
Several studies have evaluated the performance of deep learning-based algorithms for segmenting biomarkers or diagnosing ophthalmic diseases from ophthalmic imaging.14, 18, 38-50 Unlike these studies, this work successfully used fully-automatic deep learning-based analysis to measure the primary outcome of a clinical trial assessing the efficacy of an ophthalmic therapeutic agent, while the algorithm-development team remained completely masked to the treatment groups. This masked design reduces the probability of developing a system that is merely tailored to a particular dataset.
We have shown that the fully-automatic, deep learning-based algorithm segments EZ defect areas to a degree that matches expert manual correction of semi-automatic segmentation, the current gold standard. The performance metrics were high by conventional standards, by the common rule of thumb that DSC and correlation values above 0.70 indicate high agreement.51 The differences between the semi- and fully-automatic segmentation methods occurred mostly around the boundaries of the EZ defect areas or in “borderline defective” areas that are difficult to segment even for expert Readers. As illustrated in Figure 6 and Figure 7, the final segmentations of such areas come down to judgment calls, which may vary between Readers or within the same Reader at different time points. In addition, manual correction errors naturally occur in any large-scale, multicenter clinical trial (Figure 8), although such cases were rare. In contrast, the fully-automatic segmentation method consistently classified “borderline defective” areas as EZ defects and correctly classified the areas in which manual correction errors were made. Consistency is important when assessing the change in the size of a biomarker between two time points, especially when the size of the biomarker, and therefore any change over time, may be small.
Ultimately, the difference measured between the two treatment groups by the fully-automatic segmentation method was comparable to that measured by the semi-automatic segmentation method. Both segmentation methods produced the same clinical trial outcome, showing that the CNTF implant did indeed slow the progression of EZ defects in eyes with MacTel2, data that support the CNTF implant as the first potential treatment for MacTel2. We have thus demonstrated, beyond conventional performance metrics, that a fully-automatic segmentation algorithm can reproduce the statistically significant expert-evaluated results of a clinical trial for an ophthalmic treatment.
The major limitation of this study is that we have demonstrated the clinical applicability of only one automatic segmentation algorithm, but we believe there is a benefit to validating other automatic segmentation algorithms in this way. Many automatic algorithms have been developed in the past few years to quantify retinal layers and pathological features. While segmentation based on classic image analysis methods such as graph search is useful for eyes with limited pathology, recent studies have shown the superior performance of learning-based algorithms on the diverse pathological and anatomical features often encountered in large-scale clinical trials.8, 26, 52 As the rapid development of automatic segmentation algorithms continues, using real clinical endpoints allows us to better determine the usefulness of these algorithms and the impact they may have on clinical decisions and outcomes.
Our current algorithm may be applied to other clinical trials or datasets. While ensemble methods13, 38, 52 could be used with the existing six trained CNNs, as in the sketch below, they incur a cost in computational resources and test time. Therefore, the algorithm can instead be retrained on all the data to produce a single high-performing CNN for application to future datasets.
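For instance, if the six CNNs were retained, a simple ensemble (assumed here to average the en face probability maps before thresholding; not the deployed method) would trade inference time for robustness:

```python
# One possible ensemble (an assumption, not the deployed method): average the
# six CNNs' en face probability maps before thresholding at 0.65.
import numpy as np

def ensemble_defect_map(prob_maps, thresh=0.65):
    return np.mean(prob_maps, axis=0) > thresh  # six maps -> one binary map
```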
In the United States, any such technology requires approval or clearance by the Food and Drug Administration (FDA) before it can be legally marketed and distributed, so the question arises whether such a fully-automatic deep learning-based algorithm would be approved for clinical use by the FDA. In the past few years, the FDA has been actively considering and regulating artificial intelligence/machine learning (AI/ML)-based software as medical devices, most recently publishing a proposed regulatory framework53 for discussion and feedback from the community. To date, several AI/ML-based devices in various medical fields have been approved,54 including one for the detection of referable diabetic retinopathy in ophthalmology.39 Recent history and current trends therefore suggest an optimistic outlook for FDA approval of well-evaluated and well-regulated AI/ML-based devices. We emphasize that the current software is not yet approved by the FDA, which is expected to require specific additional experiments for further evaluation.
Highlights.
A deep learning-based automatic segmentation algorithm was as accurate as semi-automatic expert segmentation to assess ellipsoid zone defect areas and was able to reliably reproduce the statistically significant primary outcome measure of a clinical trial.
Acknowledgments
Financial Support: Funding was provided by The Lowy Medical Research Institute, the National Institutes of Health (R01 EY022691 and P30 EY005722), a Google Faculty Research Award, and a 2018 Unrestricted Grant from Research to Prevent Blindness. The funding organizations had no role in the design or conduct of this research.
Abbreviations and Acronyms:
- CNN
convolutional neural network
- CNTF
ciliary neurotrophic factor
- DOCTAD
Deep Optical Coherence Tomography Atrophy Detection
- DOCTRAP
Duke Optical Coherence Tomography Retinal Analysis Program
- DSC
Dice similarity coefficient
- Ea
absolute error
- Es
signed error
- EZ
ellipsoid zone
- FP
false positive
- FN
false negative
- IS/OS
inner segment/outer segment
- MacTel2
macular telangiectasia type 2
- RNFL
retinal nerve fiber layer
- RPE
retinal pigment epithelium
- SD-OCT
spectral domain optical coherence tomography
- TP
true positive
Footnotes
Conflict of Interest:
JL, TEC, EYC, MF: These authors have no financial interest
GJJ: Consultant to Heidelberg Engineering
SF: Patent holder for Duke Optical Coherence Tomography Retinal Analysis Program (DOCTRAP) software for segmenting the OCT images at Duke University
Meeting Presentation: Ophthalmic Technologies XXIX, San Francisco, CA, 2019; ARVO Annual Meeting, Vancouver, BC, 2019.
References
- 1.Issa PC, Gillies MC, Chew EY, et al. Macular telangiectasia type 2. Prog. Retin. Eye Res. 2013;34:49–77.
- 2.Jonnal RS, Kocaoglu OP, Zawadzki RJ, et al. The cellular origins of the outer retinal bands in optical coherence tomography images. Invest. Ophth. Vis. Sci. 2014;55(12):7904–7918.
- 3.Mukherjee D, Lad EM, Vann RR, et al. Correlation between macular integrity assessment and optical coherence tomography imaging of ellipsoid zone in macular telangiectasia type 2. Invest. Ophth. Vis. Sci. 2017;58(6):291–299.
- 4.Peto T, Heeren TF, Clemons TE, et al. Correlation of clinical and structural progression with visual acuity loss in macular telangiectasia type 2: MacTel Project Report No. 6 - The MacTel Research Group. Retina 2018;38:S8–S13.
- 5.Sallo FB, Leung I, Clemons TE, et al. Correlation of structural and functional outcome measures in a phase one trial of ciliary neurotrophic factor in type 2 idiopathic macular telangiectasia. Retina 2018;38:S27–S32.
- 6.Chew EY, Clemons TE, Peto T, et al. Ciliary neurotrophic factor for macular telangiectasia type 2: results from a phase 1 safety trial. Am. J. Ophthalmol. 2015;159(4):659–666.
- 7.Chew EY, Clemons TE, Jaffe GJ, et al. Effect of ciliary neurotrophic factor on retinal neurodegeneration in patients with macular telangiectasia type 2: a randomized clinical trial. Ophthalmology 2019;126(4):540–549.
- 8.Fang L, Cunefare D, Wang C, et al. Automatic segmentation of nine retinal layer boundaries in OCT images of non-exudative AMD patients using deep learning and graph search. Biomed. Opt. Express 2017;8(5):2732–2744.
- 9.Venhuizen FG, van Ginneken B, Liefers B, et al. Robust total retina thickness segmentation in optical coherence tomography images using convolutional neural networks. Biomed. Opt. Express 2017;8(7):3292–3316.
- 10.Chiu SJ, Izatt JA, O'Connell RV, et al. Validated automatic segmentation of AMD pathology including drusen and geographic atrophy in SD-OCT images. Invest. Ophth. Vis. Sci. 2012;53(1):53–61.
- 11.Chiu SJ, Allingham MJ, Mettu PS, et al. Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema. Biomed. Opt. Express 2015;6(4):1172–1194.
- 12.Shi F, Chen X, Zhao H, et al. Automated 3-D retinal layer segmentation of macular optical coherence tomography images with serous pigment epithelial detachments. IEEE T. Med. Imaging 2015;34(2):441–452.
- 13.Roy AG, Conjeti S, Karri SPK, et al. ReLayNet: Retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks. Biomed. Opt. Express 2017;8(8):3627–3642.
- 14.Lee CS, Baughman DM, Lee AY. Deep learning is effective for classifying normal versus age-related macular degeneration OCT images. Ophthalmol. Retina 2017;1(4):322–327.
- 15.Lee CS, Tyring AJ, Deruyter NP, et al. Deep-learning based, automated segmentation of macular edema in optical coherence tomography. Biomed. Opt. Express 2017;8(7):3440–3448.
- 16.Lu D, Heisler M, Lee S, et al. Deep-learning based multiclass retinal fluid segmentation and detection in optical coherence tomography images using a fully convolutional neural network. Med. Image Anal. 2019;54:100–110.
- 17.Lu W, Tong Y, Yu Y, et al. Deep learning-based automated classification of multi-categorical abnormalities from optical coherence tomography images. Transl. Vis. Sci. Technol. 2018;7(6):41.
- 18.Schlegl T, Waldstein SM, Bogunovic H, et al. Fully automated detection and quantification of macular fluid in OCT using deep learning. Ophthalmology 2018;125(4):549–558.
- 19.Tian J, Varga B, Somfai GM, et al. Real-time automatic segmentation of optical coherence tomography volume data of the macular region. PLoS One 2015;10(8):e0133908.
- 20.Venhuizen FG, van Ginneken B, Liefers B, et al. Deep learning approach for the detection and quantification of intraretinal cystoid fluid in multivendor optical coherence tomography. Biomed. Opt. Express 2018;9(4):1545–1569.
- 21.Garvin MK, Abramoff MD, Wu X, et al. Automated 3-D intraretinal layer segmentation of macular spectral-domain optical coherence tomography images. IEEE T. Med. Imaging 2009;28(9):1436–1447.
- 22.Zhu W, Chen H, Zhao H, et al. Automatic three-dimensional detection of photoreceptor ellipsoid zone disruption caused by trauma in the OCT. Sci. Rep. 2016;6:25433.
- 23.Wang Z, Camino A, Zhang M, et al. Automated detection of photoreceptor disruption in mild diabetic retinopathy on volumetric optical coherence tomography. Biomed. Opt. Express 2017;8(12):5384–5398.
- 24.Camino A, Wang Z, Wang J, et al. Deep learning for the segmentation of preserved photoreceptors on en face optical coherence tomography in two inherited retinal diseases. Biomed. Opt. Express 2018;9(7):3092–3105.
- 25.Itoh Y, Vasanji A, Ehlers JP. Volumetric ellipsoid zone mapping for enhanced visualisation of outer retinal integrity with optical coherence tomography. Brit. J. Ophthalmol. 2016;100(3):295–299.
- 26.Loo J, Fang L, Cunefare D, et al. Deep longitudinal transfer learning-based automatic segmentation of photoreceptor ellipsoid zone defects on optical coherence tomography images of macular telangiectasia type 2. Biomed. Opt. Express 2018;9(6):2681–2698.
- 27.de Sisternes L, Hu J, Rubin DL, Leng T. Visual prognosis of eyes recovering from macular hole surgery through automated quantitative analysis of spectral-domain optical coherence tomography (SD-OCT) scans. Invest. Ophth. Vis. Sci. 2015;56(8):4631–4643.
- 28.Dice LR. Measures of the amount of ecologic association between species. Ecology 1945;26(3):297–302.
- 29.Banaee T, Singh RP, Champ K, et al. Ellipsoid zone mapping parameters in retinal venous occlusive disease with associated macular edema. Ophthalmol. Retina 2018.
- 30.Farsiu S, Chiu SJ, O'Connell RV, et al. Quantitative classification of eyes with and without intermediate age-related macular degeneration using optical coherence tomography. Ophthalmology 2014;121(1):162–172.
- 31.Folgar FA, Yuan EL, Sevilla MB, et al. Drusen volume and retinal pigment epithelium abnormal thinning volume predict 2-year progression of age-related macular degeneration. Ophthalmology 2016;123(1):39–50.
- 32.Simonett JM, Huang R, Siddique N, et al. Macular sub-layer thinning and association with pulmonary function tests in amyotrophic lateral sclerosis. Sci. Rep. 2016;6:29187.
- 33.Francis AW, Wanek J, Lim JI, Shahidi M. Enface thickness mapping and reflectance imaging of retinal layers in diabetic retinopathy. PLoS One 2015;10(12):e0145628.
- 34.Wu Z, Cunefare D, Chiu E, et al. Longitudinal associations between microstructural changes and microperimetry in the early stages of age-related macular degeneration. Invest. Ophth. Vis. Sci. 2016;57(8):3714–3722.
- 35.Boynton GE, Stem MS, Kwark L, et al. Multimodal characterization of proliferative diabetic retinopathy reveals alterations in outer retinal function and structure. Ophthalmology 2015;122(5):957–967.
- 36.Chiu SJ, Li XT, Nicholas P, et al. Automatic segmentation of seven retinal layers in SDOCT images congruent with expert manual segmentation. Opt. Express 2010;18(18):19413–19428.
- 37.Soille P. Morphological image analysis: principles and applications. Springer Science & Business Media; 2013.
- 38.De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat. Med. 2018;24(9):1342.
- 39.Abràmoff MD, Lavin PT, Birch M, et al. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit. Med. 2018;1(1):39.
- 40.Peng Y, Dharssi S, Chen Q, et al. DeepSeeNet: A deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs. Ophthalmology 2019;126(4):565–575.
- 41.Abràmoff MD, Lou Y, Erginay A, et al. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Invest. Ophth. Vis. Sci. 2016;57(13):5200–5206.
- 42.Schmidt-Erfurth U, Bogunovic H, Sadeghipour A, et al. Machine learning to analyze the prognostic value of current imaging biomarkers in neovascular age-related macular degeneration. Ophthalmol. Retina 2018;2(1):24–30.
- 43.Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016;316(22):2402–2410.
- 44.Gargeya R, Leng T. Automated identification of diabetic retinopathy using deep learning. Ophthalmology 2017;124(7):962–969.
- 45.Coyner AS, Swan R, Campbell JP, et al. Automated fundus image quality assessment in retinopathy of prematurity using deep convolutional neural networks. Ophthalmol. Retina 2019;3(5):444–450.
- 46.Ting DSW, Cheung CY-L, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA 2017;318(22):2211–2223.
- 47.Li Z, He Y, Keel S, et al. Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology 2018;125(8):1199–1206.
- 48.Grassmann F, Mengelkamp J, Brandl C, et al. A deep learning algorithm for prediction of age-related eye disease study severity scale for age-related macular degeneration from color fundus photography. Ophthalmology 2018;125(9):1410–1420.
- 49.Burlina PM, Joshi N, Pekala M, et al. Automated grading of age-related macular degeneration from color fundus images using deep convolutional neural networks. JAMA Ophthalmol. 2017;135(11):1170–1176.
- 50.Brown JM, Campbell JP, Beers A, et al. Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. 2018;136(7):803–810.
- 51.Mukaka MM. A guide to appropriate use of correlation coefficient in medical research. Malawi Medical Journal 2012;24(3):69–71.
- 52.Ji Z, Chen Q, Niu S, et al. Beyond retinal layers: a deep voting model for automated geographic atrophy segmentation in SD-OCT images. Transl. Vis. Sci. Technol. 2018;7(1):1.
- 53.US Food and Drug Administration. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based Software as a Medical Device (SaMD) - discussion paper and request for feedback. 2019.
- 54.Topol EJ. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019;25(1):44.