PLOS One. 2024 Dec 18;19(12):e0314707. doi: 10.1371/journal.pone.0314707

Inter-rater reliability in labeling quality and pathological features of retinal OCT scans: A customized annotation software approach

Katherine Du 1, Stavan Shah 1, Sandeep Chandra Bollepalli 1, Mohammed Nasar Ibrahim 1, Adarsh Gadari 2, Shan Sutharahan 2, José-Alain Sahel 1, Jay Chhablani 1, Kiran Kumar Vupparaboina 1,*
Editor: Jiro Kogo 3
PMCID: PMC11654994  PMID: 39693322

Abstract

Objectives

Various imaging features on optical coherence tomography (OCT) are crucial for identifying and defining disease progression. Establishing a consensus on these imaging features is essential, particularly for training deep learning models for disease classification. This study aims to analyze the inter-rater reliability in labeling the quality and common imaging signatures of retinal OCT scans.

Methods

A total of 500 OCT scans obtained from CIRRUS HD-OCT 5000 devices were displayed at 512 × 1024 × 128 resolution in customizable, in-house annotation software. Each patient’s eye was represented by 16 random scans. Two masked reviewers independently labeled the quality and specific pathological features of each scan. Evaluated features included overall image quality, presence of fovea, and disease signatures including subretinal fluid (SRF), intraretinal fluid (IRF), drusen, pigment epithelial detachment (PED), and hyperreflective material. The raw percentage agreement and Cohen’s kappa (κ) coefficient were used to evaluate concurrence between the two sets of labels.

Results

Our analysis revealed κ = 0.60 for the inter-rater reliability of overall scan quality, indicating moderate to substantial agreement. In contrast, there was slight agreement in determining the cause of poor image quality (κ = 0.18). The binary determination of presence and absence of retinal disease signatures showed almost perfect agreement between reviewers (κ = 0.85). Individual labels, including the foveal location of the scan (0.78), IRF (0.63), drusen (0.73), and PED (0.87), exhibited substantial to almost perfect concordance. However, less agreement was found in identifying SRF (0.52), hyperreflective dots (0.41), and hyperreflective foci (0.33).

Conclusions

Our study demonstrates significant inter-rater reliability in labeling the quality and retinal pathologies on OCT scans. While some features show stronger agreement than others, these standardized labels can be utilized to create automated machine learning tools for diagnosing retinal diseases and capturing valuable pathological features in each scan. This standardization will aid in the consistency of medical diagnoses and enhance the accessibility of OCT diagnostic tools.

Introduction

Retinal diseases, encompassing conditions like age-related macular degeneration (AMD) and diabetic retinopathy (DR), represent significant contributors to global vision impairment and blindness. Optical coherence tomography (OCT) serves as the cornerstone imaging modality in ophthalmology, leveraging light waves to capture detailed cross-sectional scans of the retina [1]. In particular, OCT scans play a pivotal role in diagnosing and guiding treatment decisions for various eye pathologies, thus contributing significantly to vision preservation.

Currently, the interpretation of OCT images by ophthalmologists relies heavily on their individual experiences and expertise, leading to variability in diagnoses. For instance, studies focusing on macular disease diagnosis and glaucoma detection using OCT have reported varying levels of interobserver agreement, underscoring the impact of individual interpretation on diagnostic consistency and treatment plans [2,3]. Specifically for AMD, a study showed significant variability in identifying complete retinal pigment epithelium and outer retinal atrophy (cRORA) between retina-trained ophthalmologists [4]. On the other hand, another study demonstrated substantial interrater agreement for qualitatively graded AMD features associated with atrophy, while other classification systems had varying amounts of agreement [5]. Other research using AMD OCT scans showed that a nine-point summary scale for grading may exhibit stronger intergrader agreement compared to binary grading [6].

As OCT data annotation by ophthalmologists requires specialist knowledge and is substantially time-consuming, there is a sparsity of labeled OCT datasets for a plethora of eye conditions including AMD. Traditionally, deep learning methods have required large amounts of labeled data to be used as input in the creation of effective algorithms. Recently, to combat the sparsity in labels for OCT data, studies have been conducted that enhance learning on limited data. For example, using a self-supervised learning phase followed by an OCT image classification learning phase, Fang et al. (2022) created a machine learning algorithm that can first learn inherent representations from the OCT images without manual labels which provided useful initialization parameters for the downstream OCT image classification model [7]. This tool classified features on OCT images, such as pigment epithelium detachment (PED), intraretinal fluid (IRF), subretinal fluid (SRF), normal retina, and retinal edema area (REA) with accuracies greater than 95% on the public RETOUCH dataset and greater than 98% on the AI Challenger dataset. While creating these promising techniques helps enhance learning on OCT data, a robust set of manually labeled images is still needed as a foundation for input as well as model evaluation in deep learning.

In response to current clinical challenges, the integration of artificial intelligence (AI) algorithms with OCT data offers promising avenues for improving early disease detection and expanding diagnostic capabilities, particularly in underserved regions [8–15]. Automated report generation using AI on OCT data represents a significant opportunity for enhancing diagnosis and treatment of retinal diseases, providing valuable decision-support tools for clinicians. Recent advancements have demonstrated the practical clinical application of AI algorithms in proficiently identifying pathologic retinal cases and efficiently triaging patients [16–23]. However, the efficacy of these algorithms relies heavily on accurate labeling of pathological features by medical specialists, highlighting the importance of evaluating inter-rater reliability in OCT scan interpretations to establish robust "gold-standard" labels [24,25].

Moreover, the scarcity of labeled OCT datasets for various eye conditions poses a significant challenge for traditional deep learning methods, which typically require large amounts of labeled data for effective algorithm creation [26–30]. Recent studies have explored innovative approaches to address this challenge, such as self-supervised learning phases followed by OCT image classification learning phases [7,31–35]. These efforts aim to enhance learning on limited data and improve the performance of AI algorithms in diagnosing retinal diseases. Additionally, the quality of OCT scans significantly impacts the performance of machine learning algorithms [36–39]. Variations in image quality can affect the accuracy of diagnoses, as evidenced by studies focusing on retinopathy screening in infants. Strategies for enhancing image quality, such as deep learning-based image quality assessment and enhancement systems, have shown promising results in improving diagnostic accuracy [40–43].

The objective of this study is to evaluate the inter-rater reliability of OCT scan interpretation for AMD eyes among independent reviewers, specifically focusing on labeling for scan quality and pathological retinal features. We aim to assess the ability of independent reviewers to reach a consensus on annotating different aspects of OCT B-scans with AMD. We believe this work is a stepping stone towards generating standardized ground-truth labeled data for building AI-based diagnostic tools and, in turn, enhancing the accuracy of disease screening in clinical practice.

Methods

Dataset

This retrospective study was conducted in accordance with the principles outlined in the Declaration of Helsinki, with approval from the institutional review board of the University of Pittsburgh Medical Center (UPMC), Pittsburgh. Written informed consent was obtained from all participants for the inclusion of their retrospective data in the study. We utilized 500 OCT volumes obtained from the Cirrus 5000 OCT device (Carl Zeiss Meditec) corresponding to individual subjects diagnosed with age-related macular degeneration (AMD). Each Cirrus OCT volume comprised 128 B-scans, with a lateral resolution of 6 mm and a depth resolution of 2 mm, resulting in a pixel resolution of 512 × 1024. To yield a total of 500 B-scans for analysis, 16 B-scans were uniformly sampled from each volume until this count was reached, which required 32 unique eyes. Patients were screened between 2017 and 2019. These data were accessed on May 31, 2023 for research purposes, and authors did not have access to information that could identify individual participants during or after data collection.
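The sampling arithmetic above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the paper does not state which slice positions were chosen or how the final eye was handled, so the evenly spaced scheme and the function names `uniform_bscan_indices` and `eyes_needed` are assumptions made for the example.

```python
import math

def uniform_bscan_indices(n_slices: int = 128, n_samples: int = 16) -> list[int]:
    """Pick n_samples evenly spaced B-scan indices (0-based) from one volume."""
    if not 0 < n_samples <= n_slices:
        raise ValueError("n_samples must be between 1 and n_slices")
    step = n_slices / n_samples
    # Assumption: take the middle slice of each equal-width bin.
    return [int(step * i + step / 2) for i in range(n_samples)]

def eyes_needed(target_scans: int = 500, scans_per_eye: int = 16) -> tuple[int, int]:
    """Return (eyes required, scans taken from the last eye) to reach the target."""
    n_eyes = math.ceil(target_scans / scans_per_eye)
    last_eye_scans = target_scans - (n_eyes - 1) * scans_per_eye
    return n_eyes, last_eye_scans

print(uniform_bscan_indices()[:4])  # first four sampled slice positions
print(eyes_needed())                # eyes needed for 500 scans at 16 per eye
```

Under these assumptions, 31 eyes contribute 496 B-scans and a 32nd eye supplies the remaining 4, which is consistent with the 32 unique eyes reported.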

Feature description

As the image quality of OCT scans may affect diagnostic interpretations [44,45], it is critical to evaluate it alongside labeling scans with pathological features. In this study, image quality was graded as good, usable, or bad. If the scan exhibited bad quality, the reason for poor quality was specified as speckle noise, artifacts, low contrast, or cropping. In the following, we describe the varying levels of scan quality considered for annotation:

Good quality: All retinal sublayers are distinct, pathological features can be clearly observed.

Usable quality: Some clarity lacking so that there may be slight blur or contrast issues, but most pathological features can be observed.

Bad quality: Retinal sublayers cannot be clearly seen or are covered. If scan falls in this category, the reason for poor quality is classified as follows (Fig 1):

Fig 1. Examples of labels for reason for poor quality scan: Speckle noise, artifact, low contrast, and cropping.

Speckle noise: Speckle noise in OCT images is a granular pattern caused by random light scattering and interference, reducing image clarity and obscuring fine details [46]. It makes it difficult for clinicians to visualize retinal structures and corresponding structural changes and can hinder accurate detection of retinal abnormalities.

Artifact: Artifacts in OCT images often happen when a patient blinks or moves their eyes during the scan. Blinking can create gaps or missing parts in the image, while movement causes blurring or misalignment of the retinal layers. These distortions can make it harder to detect and analyze disease features accurately [47].

Low contrast: A bad quality OCT image due to low contrast occurs when the differences in intensity between adjacent structures or layers in the eye are too subtle for clear visualization [48]. Low contrast can make it difficult to distinguish critical features, such as retinal layers or disease markers like drusen and fluid. This results in a "washed-out" appearance, where boundaries between structures are blurred, reducing the clarity of important diagnostic details. Factors contributing to low contrast include poor illumination, scattering of light, media opacities in the eye (like cataracts), or suboptimal OCT device settings.

Cropping: A side of the retinal tissue layers in a scan is cut off, so that the image is clipped to a smaller dimension and part of the scan is not visible.

Retinal OCT scans provide detailed in vivo visualization of the structural changes of the posterior segment substructures including retinal sublayers, capturing various pathological features indicative of retinal diseases. This study aims to identify features that are associated with AMD. Fig 2 presents representative OCT scans depicting various disease manifestations of AMD. In the following, we describe the AMD disease signatures considered for annotation:

Fig 2. Examples of labels for drusen, pigment epithelial detachment, intraretinal fluid, subretinal fluid, hyperreflective dots, and hyperreflective foci.

Subretinal Fluid (SRF): SRF refers to the accumulation of fluid between the retina and the retinal pigment epithelium (RPE) layer in the eye. Normally, the subretinal space is devoid of fluid, but in various retinal diseases and conditions, fluid can accumulate in this space, leading to SRF. On OCT, SRF appears as dark spaces between the RPE and the neurosensory retina [49].

Intraretinal Fluid (IRF): IRF refers to the accumulation of fluid within the layers of the retina itself. Unlike subretinal fluid, which accumulates between the retina and the RPE layer, IRF initially presents as diffuse thickening of the outer nuclear layer of the retina which if more severe may form cystoid spaces that may involve all retinal layers [49]. IRF appears as dark spaces within the retinal layers.

Pigment Epithelial Detachment (PED): PED is a condition characterized by the accumulation of fluid or material underneath the RPE, appearing on OCT as elevations of RPE band relative to Bruch’s membrane [49].

Drusen: Drusen are tiny yellow or white deposits that accumulate under the retina [50]. They are found in AMD eyes, but may also be a normal feature of aging eyes [51]. Drusen are composed of lipids and proteins that have been deposited in the blood [51].

Hyperreflective Dots: Small, bright reflections within the retinal layers, often indicating cellular debris or inflammatory cells [52].

Hyperreflective Foci: Larger, more distinct bright spots within the retinal layers, associated with retinal inflammation, ischemia, or neovascularization [53,54].

In early AMD, drusen and hyperreflective dots are key features present on OCT scans [55,56]. In intermediate AMD, drusen, PED, IRF, and hyperreflective foci are prevalent on OCT scans [57–59]. In late AMD, pathological features observed include drusen, PED, IRF, SRF, hyperreflective dots, and hyperreflective foci [60,61]. Many of these features may also be present in other retinal diseases such as diabetic retinopathy, central serous chorioretinopathy, and retinal vein occlusion.

Annotation tool

We developed OCT image labeling software in-house for annotating retinal features on scans (Fig 3). This software, built with PHP and SQL, was deployed as a standalone application on an Apache server. The software enables users to create separate accounts and independently label a set of OCT scans. Accordingly, we created separate accounts for the two reviewers (Katherine Du and Stavan Shah) to perform the annotations. The interface presents a table on the home screen listing each OCT scan along with its relevant characteristics, including patient research ID, laterality (right or left eye), imaging modality, resolution, and image type. Each B-scan has a corresponding “View” button that reviewers select to begin annotation. Clicking the “View” button opens a separate page showing the full B-scan, with the list of features to annotate to the right of the image (Fig 4). “Yes” and “No” check boxes are provided for binary classification, i.e., for annotating the presence or absence of certain disease features. Dropdown menus are provided for other aspects of annotation, including the reason for poor scan quality and the type of PED. The reviewer thus annotates each OCT image by selecting dropdown options or checking boxes for all aspects of the image, including scan quality, reason for poor quality, and presence of specific retinal features, and can add free-text comments. Once an image is labeled by a reviewer, the table updates the scan status from unlabeled to labeled. Annotations are saved, and reviewers can proceed to the next OCT scan. The aggregated data of all labels can be downloaded for further analysis.

Fig 3. Home page of OCT image labeling software created and used for labeling OCT scans.

Fig 4. Interactive window pop-up of OCT scan used to label a single scan.

Annotation strategy

Two medical students (Katherine Du and Stavan Shah) underwent training by a retinal ophthalmologist (Jay Chhablani) to independently label the OCT B-scan features under consideration. In particular, the medical students separately reviewed a range of OCT scans to gauge scan quality as well as several examples of each disease feature to be labeled. The medical students next trained one-on-one with the retinal ophthalmologist to label sample scans, then worked together to label other sample scans to further solidify their understanding of these features. Lastly, the retinal ophthalmologist reviewed the initial 50 OCT scans in the official dataset independently labeled by the medical students to ensure their validity, and also provided clarifications on any of the 500 scans in which the medical students had uncertainty.

In the labeling process, the scan quality was first determined as good (clear), usable (identifiable but not optimal), or bad (unidentifiable) (Fig 5). The reasons for poor-quality scans were then classified as speckle noise, artifacts, low contrast, or cropping (Fig 5). Next, reviewers graded whether the scan was a foveal or a non-foveal scan (Fig 5). Subsequently, reviewers looked for the presence or absence of disease features on the OCT scans and graded them as healthy (scans with no retinal disease signature) or diseased (scans with retinal disease signatures). If the scan was labeled as diseased, reviewers proceeded to annotate the presence or absence of specific disease signatures, including subretinal fluid, intraretinal fluid, drusen, PED, hyperreflective dots, and hyperreflective foci. If PED was present, the type of PED was indicated as fibrovascular, flat irregular, serous, drusenoid, or hemorrhagic. Reviewers were also provided with a “comments” box for any specific remarks about the image under consideration. This comprehensive annotation strategy aimed to capture the full spectrum of retinal characteristics and disease manifestations present in the OCT scans.
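The hierarchical labeling scheme described above can be represented as a small record with consistency checks. The sketch below is illustrative only: the in-house tool was built with PHP and SQL and its actual schema is not described, so the class name `ScanLabel`, its field names, and the specific checks are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Label vocabularies taken from the annotation strategy described in the text.
QUALITY = {"good", "usable", "bad"}
REASONS = {"speckle noise", "artifact", "low contrast", "cropping"}
FEATURES = {"SRF", "IRF", "drusen", "PED",
            "hyperreflective dots", "hyperreflective foci"}
PED_TYPES = {"fibrovascular", "flat irregular", "serous", "drusenoid", "hemorrhagic"}

@dataclass
class ScanLabel:
    """One reviewer's annotation of one B-scan (hypothetical structure)."""
    quality: str
    foveal: bool
    diseased: bool
    poor_quality_reason: Optional[str] = None
    features: Dict[str, bool] = field(default_factory=dict)
    ped_type: Optional[str] = None
    comments: str = ""

    def problems(self) -> List[str]:
        """Return a list of inconsistencies; an empty list means the label is coherent."""
        found = []
        if self.quality not in QUALITY:
            found.append("unknown quality grade")
        if self.poor_quality_reason is not None:
            if self.poor_quality_reason not in REASONS:
                found.append("unknown poor-quality reason")
            if self.quality == "good":
                found.append("poor-quality reason given for a good scan")
        if set(self.features) - FEATURES:
            found.append("unknown feature name")
        if not self.diseased and any(self.features.values()):
            found.append("feature marked present on a healthy scan")
        has_ped = self.features.get("PED", False)
        if self.ped_type is not None and (not has_ped or self.ped_type not in PED_TYPES):
            found.append("PED type inconsistent with PED label")
        return found

label = ScanLabel(quality="usable", foveal=True, diseased=True,
                  poor_quality_reason="speckle noise",
                  features={"PED": True, "drusen": True}, ped_type="drusenoid")
print(label.problems())
```

Checks of this kind could be applied to the downloaded labels before agreement statistics are computed, so that the two reviewers' records are compared on the same footing.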

Fig 5. Examples of labels for overall quality (good, usable, bad), reasons for poor quality, and foveal scan.

Statistical analysis

To assess inter-observer variance, raw percentage agreement was calculated for each feature. Additionally, Cohen’s kappa (κ) coefficient was computed as a measure of inter-rater reliability [62]. Kappa values range from –1 to 1, with < 0 indicating worse than chance agreement, 0.01–0.20 indicating slight agreement, 0.21–0.40 indicating fair agreement, 0.41–0.60 indicating moderate agreement, 0.61–0.80 indicating substantial agreement, and 0.81–1.00 indicating almost perfect agreement. Both percentage agreement and kappa statistics were utilized as metrics in this study.
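For readers who wish to reproduce these two metrics, a minimal sketch follows. It uses the standard definition of Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's label frequencies. The function names and the toy labels are illustrative and are not the study's data; the interpretation bands follow the scale quoted above, with a value of exactly 0 grouped under "slight" as an assumption, since the quoted scale leaves it unassigned.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Fraction of items on which the two raters gave the same label."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters labeling the same items."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: product of the raters' marginal label frequencies.
    p_e = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)

def interpret(kappa):
    """Map a kappa value onto the agreement bands used in this study."""
    if kappa < 0:
        return "worse than chance"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Toy example with six scans graded for overall quality (not study data).
reviewer_1 = ["good", "good", "bad", "usable", "good", "bad"]
reviewer_2 = ["good", "usable", "bad", "usable", "good", "good"]
k = cohen_kappa(reviewer_1, reviewer_2)
print(round(percent_agreement(reviewer_1, reviewer_2), 2), round(k, 2), interpret(k))
```

In the toy example, the raters agree on 4 of 6 scans, the chance agreement is 13/36, and kappa is therefore 11/23, or about 0.48, which falls in the moderate band.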

Results

The results of the inter-rater reliability analysis for labeling OCT retinal scans revealed significant agreement between independent reviewers for various pathological features and scan quality assessments (Fig 6).

Fig 6. Inter-rater reliability for 500 OCT scans, reported as raw percentage match and Cohen’s kappa coefficient.

blue = % match, green = Cohen’s kappa coefficient.

Scan Quality Agreement: The agreement between reviewers regarding the overall quality of OCT scans as bad, usable, or good was moderate to substantial, with a Cohen’s kappa coefficient (κ) of 0.60 (raw percent agreement of 84%). This indicates a consistent assessment of scan quality, which is crucial for ensuring the reliability of subsequent analyses and machine learning algorithms.

Reasons for Poor Quality Scan Agreement: The reasons for poor quality on a scan showed slight agreement, with a κ coefficient of 0.18 (raw percent agreement of 59%). For usable and bad scans, where quality affects the ability to observe retinal sublayers or pathological features, the breakdown of reasons for poor quality is presented in Fig 7. For both reviewers, the majority of OCT scans labeled as “bad quality” were due to low contrast in the scan; reviewer 1 graded more than 60% of bad scans as due to low contrast, while reviewer 2 graded more than 80% of bad scans as due to low contrast. Speckle noise was present in both bad and usable scans, with a higher percentage of usable scans exhibiting speckle noise (about 55% of usable scans graded by reviewer 1, about 70% of usable scans graded by reviewer 2). Artifacts occurred less frequently than speckle noise for both reviewers and were present in both bad and usable scans. Lastly, cropped images were sparsely represented, making up none of the scans of reviewer 1 and a small percentage of the usable scans of reviewer 2.

Fig 7. Percent breakdown of reasons for poor quality in bad and usable scans independently labeled by reviewer 1 and reviewer 2.

The total numbers of bad-quality scans labeled by reviewer 1 and reviewer 2 were 52 and 51, respectively. The total numbers of usable-quality scans labeled by reviewer 1 and reviewer 2 were 87 and 65, respectively. blue = speckle noise, green = low contrast, yellow = artifact, purple = cropped.

Pathological Features Agreement: For the binary grading of scans for the presence or absence of disease signatures, inter-rater agreement was almost perfect, with a κ coefficient of 0.85 (raw percent agreement of 95%). This high level of agreement suggests robust consistency in distinguishing scans with and without disease signatures in the retinal layers of the OCT B-scans.

Among specific retinal pathologies, substantial to almost perfect agreement was observed for several features. The presence of a foveal scan exhibited a κ coefficient of 0.78 (raw percent agreement of 96%), indicating strong concurrence between the reviewers. Similarly, agreement was almost perfect for identifying PED type, with a κ coefficient of 0.87 (raw percent agreement of 95%). Moderate to substantial agreement was found for subretinal fluid, intraretinal fluid, and drusen, with κ coefficients of 0.52, 0.63, and 0.73, respectively. However, some features showed weaker agreement between the reviewers. In particular, the κ coefficient was 0.41 for identifying hyperreflective dots and 0.33 for hyperreflective foci. Although these features still demonstrated fair to moderate agreement, there was more variability in their interpretation compared to other features.
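The foveal scan label and PED type have similar raw agreement (96% and 95%) but different κ values (0.78 and 0.87). One general reason raw agreement and kappa can diverge is label prevalence: when one label dominates, chance agreement is high and kappa is penalized. The sketch below illustrates this with two hypothetical 2 × 2 tables of 500 scans each; the counts are invented for illustration and are not the study's data, and `kappa_2x2` is a helper defined only for this example.

```python
def kappa_2x2(both_yes, only_r1, only_r2, both_no):
    """Return (raw agreement, Cohen's kappa) for a binary label from a 2x2 table."""
    n = both_yes + only_r1 + only_r2 + both_no
    p_o = (both_yes + both_no) / n
    r1_yes = (both_yes + only_r1) / n   # share of scans reviewer 1 marked "yes"
    r2_yes = (both_yes + only_r2) / n   # share of scans reviewer 2 marked "yes"
    p_e = r1_yes * r2_yes + (1 - r1_yes) * (1 - r2_yes)
    return p_o, (p_o - p_e) / (1 - p_e)

# Hypothetical tables: both have 480/500 = 96% raw agreement.
balanced = kappa_2x2(both_yes=240, only_r1=10, only_r2=10, both_no=240)
skewed = kappa_2x2(both_yes=460, only_r1=10, only_r2=10, both_no=20)

for name, (p_o, kappa) in (("balanced", balanced), ("skewed", skewed)):
    print(f"{name}: agreement={p_o:.2f}, kappa={kappa:.2f}")
```

With balanced labels, 96% raw agreement corresponds to κ = 0.92, whereas with a 94% "yes" prevalence the same raw agreement gives κ of roughly 0.65. This is why both metrics are reported side by side.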

Discussion

Our study shows that there is significant inter-rater reliability in labeling the quality and retinal pathologies on OCT scans of the retina. As the quality of scans is an important determinant for clinical decision-making and building AI-based tools for automated detection of disease signatures, the considerable agreement between the two reviewers indicates that scans deemed better quality can be exclusively used or weighted more heavily in models to support more accurate model learning. However, the reason given for a poor-quality scan differed between the reviewers, indicating that there may be multiple factors contributing to an unclear scan, which leaves more interpretation up to the reviewer. This indicates that it may also be difficult for AI models to determine a singular cause for poor-quality scans. For example, some of the OCT scans exhibited multiple reasons for quality deficits, such as speckle noise and low contrast, and reviewers were asked to indicate the one predominant reason for poor quality (Fig 8). These differences could have led to the lower κ coefficient of 0.18 for the “quality bad due to” category, while the “overall quality” of good, usable, or bad received higher concurrence with κ = 0.60. However, some common themes arise in the “quality bad due to” category, as both reviewers labeled “low contrast” as the predominant reason for a bad quality scan (more than 60% by reviewer 1, more than 80% by reviewer 2). This indicates that low contrast may be a key factor impacting scan quality and the interpretability of the OCT scan for pathological features assessment. Hence, machine learning algorithms that read OCT scans may increase their accuracy by excluding low contrast scans, or by learning to differentiate these low contrast scans and interpreting them differently than they would a good quality scan.
As differing amounts of speckle noise are present in most OCT scans, if artifact, low contrast, or cropping were not present, the default reason for a poor-quality scan was mostly speckle noise. Artifacts seem to be minimally present in scans with bad or usable quality, with cropping occurring even less. Hence, the overarching reason for poor image quality that may result in unusable or incorrectly interpreted scans in downstream analysis is low contrast due to the OCT imaging collection process.

Fig 8. Disagreements between reviewers for features in OCT scans.

(L) disagreements in reason for bad quality (R) disagreements in presence of hyperreflective material.

In terms of pathological features of the retina, there is almost perfect agreement in making the binary decision about whether a scan shows the presence or absence of disease signatures. This requires the synthesis of information about the identification of several other pathological features to come to a determination. Since most of these individual features (SRF, IRF, drusen, and PED) also show moderate to almost perfect concurrence between the two reviewers, they strengthen the determination of a healthy or diseased eye by providing the grounds for this interpretation. There was also strong concurrence for the specific PED type out of the five types, with a κ coefficient of 0.87 for the 500 scans. Thus, we have stronger confidence that this feature is standardized amongst independent reviewers and can readily be used in future AI models. As for hyperreflective dots and foci, their κ coefficients were 0.41 and 0.33, indicating a weaker match. This could be explained by the fact that many of the OCT scans had hyperreflective material that differed in size and shape, which the reviewers may have interpreted differently (Fig 8). Specifically, the relatively smaller size of hyperreflective dots and the relatively larger size of hyperreflective foci lie on a spectrum, and their presence in a scan is up to the qualitative judgment of reviewers. Additionally, some scans exhibit minimal hyperreflective material, and the reviewer determines to what extent it is significant enough to record. Hence, hyperreflective dots and foci could be grouped together into a single feature for AI models, or each distinct feature could be weighted less heavily. Lastly, the foveal scan label has strong agreement as well, with a κ coefficient of 0.78.
Retinal pathologies in the fovea may cause more noticeable visual impairment, and the retinal layers are more curved on foveal scans, which may distort the appearance of pathological features. Identifying whether a scan is foveal, alongside its pathological features, may therefore be key to accurate triage or diagnosis using OCT scans.
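The suggestion to group hyperreflective dots and foci into a single feature can be sketched as a logical OR over the two binary labels. The example below uses invented toy labels for four scans, and the helper names are hypothetical; it shows only the mechanics of merging, and whether merging improves agreement on the study's data was not evaluated here.

```python
def merge_hyperreflective(dots, foci):
    """Combine per-scan dots and foci labels into one 'hyperreflective material' label."""
    if len(dots) != len(foci):
        raise ValueError("label lists must have equal length")
    return [bool(d) or bool(f) for d, f in zip(dots, foci)]

def raw_agreement(a, b):
    """Fraction of scans on which two reviewers gave the same binary label."""
    return sum(bool(x) == bool(y) for x, y in zip(a, b)) / len(a)

# Toy labels for four scans (1 = present, 0 = absent); not study data.
r1_dots, r1_foci = [1, 0, 1, 0], [0, 1, 0, 0]
r2_dots, r2_foci = [0, 1, 1, 0], [1, 0, 0, 0]

r1_merged = merge_hyperreflective(r1_dots, r1_foci)
r2_merged = merge_hyperreflective(r2_dots, r2_foci)

print(raw_agreement(r1_dots, r2_dots))      # agreement on dots alone
print(raw_agreement(r1_foci, r2_foci))      # agreement on foci alone
print(raw_agreement(r1_merged, r2_merged))  # agreement on the merged feature
```

In this toy case the reviewers disagree on whether the material on the first two scans is a dot or a focus, yet they agree that hyperreflective material is present, so the merged label agrees on every scan.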

Limitations to this study include the particular scope of OCT scans used for concurrence analysis, which was capturing age-related macular degeneration eyes with CIRRUS HD-OCT 5000 imaging devices. Hence, the assessment of quality and pathological retinal features of other eye diseases such as diabetic retinopathy and central serous chorioretinopathy can be further explored. Additionally, quality of scans and reasons for poor quality scans may be examined on other OCT imaging devices, such as Heidelberg Engineering SPECTRALIS OCT or Topcon 3D OCT-2000. Lastly, on the annotation software, reviewers added comments onto individual OCT scans to indicate the presence of other retinal signatures that were not explicitly listed. However, the interrater reliability of these other retinal signatures, such as epiretinal membrane and geographic atrophy, can be analyzed in future studies.

These standardized labels can be used to create automated algorithms that help standardize medical diagnoses and increase the accessibility of OCT diagnostic tools. Firstly, OCT image quality should be taken into consideration, as our study demonstrates that low contrast is a common reason for a poor-quality scan, which may lead to limited readability or diagnostic misinterpretation by machine learning algorithms. Either excluding poor quality scans or training machine learning models to interpret these scans differently may alleviate downstream analysis issues resulting from varying scan quality. Our study demonstrates that labels created for retinal pathological features are generally agreeable between reviewers, with some features displaying stronger concurrence compared to others. This indicates that there may be slight variability in ophthalmologists’ diagnosis of retinal diseases using OCT images, and that supporting OCT information with patient records may present a more definite diagnosis. Additionally, it shows us which features may have stronger diagnostic capacity, such as PED, subretinal fluid, and intraretinal fluid, compared to others. The widespread use of electronic medical records, with deep neural networks in machine learning, allows for the adaptation of artificial intelligence image analysis tools for computer-aided diagnosis. The conclusions from this analysis can be integrated into engineering decisions in the creation of these AI models for the diagnosis of retinal diseases.
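One way to act on the quality labels when preparing training data is to convert them into per-scan weights, so that bad scans are excluded and usable scans count less than good ones. The sketch below is one possible scheme under stated assumptions: the weight values in `QUALITY_WEIGHT` and the rule of taking the lower of the two reviewers' grades are hypothetical choices, not something the study tested.

```python
# Hypothetical weights: bad scans are excluded (0.0), usable scans are down-weighted.
QUALITY_WEIGHT = {"good": 1.0, "usable": 0.5, "bad": 0.0}

def training_weights(r1_quality, r2_quality, weights=QUALITY_WEIGHT):
    """Weight each scan by the lower of the two reviewers' quality grades."""
    if len(r1_quality) != len(r2_quality):
        raise ValueError("quality lists must have equal length")
    return [min(weights[a], weights[b]) for a, b in zip(r1_quality, r2_quality)]

# Toy grades for three scans (not study data).
w = training_weights(["good", "usable", "bad"], ["good", "good", "usable"])
print(w)
```

Taking the lower grade is a conservative choice for the scans on which the reviewers disagreed; averaging the two weights would be an equally simple alternative.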

Data Availability

Data cannot be shared publicly as it contains potentially sensitive patient information and is not in line with the institutional review board approval of the University of Pittsburgh Medical Center (UPMC), Pittsburgh. However, data may be available to researchers who meet the criteria for access to confidential data, pending approval from the University of Pittsburgh Medical Center's institutional ethics committee. Interested researchers can request access by contacting the corresponding author at kiran1559@gmail.com or the clinical trials manager Rose Carla Aubourg at aubourgrc@upmc.edu.

Funding Statement

The work was funded by the NIH CORE Grant to the Dept. of Ophthalmology (Grant No. P30 EY08098); The Eye and Ear Foundation of Pittsburgh; The Hillman Challenge Grant - Exploratory Research Project (received by Dr. Kiran Kumar Vupparaboina); and the Research to Prevent Blindness Medical Student Eye Research Fellowship, Research to Prevent Blindness, 360 Lexington Avenue, Floor 22, New York, NY 10017-6528 (received by Dr. Katherine Du). The sponsor or funding organizations had no role in the design or conduct of this research.


Articles from PLOS ONE are provided here courtesy of PLOS
