Abstract
Retinal pigment epithelium (RPE) cells are essential for normal retinal function. Morphological defects in these cells are associated with a number of retinal neurodegenerative diseases. Owing to its cellular resolution and depth-sectioning capability, adaptive optics-optical coherence tomography (AO-OCT) can visualize individual RPE cells in vivo. Rapid, cost-efficient, and objective quantification of the RPE mosaic’s structural properties necessitates the development of an automated cell segmentation algorithm. This paper presents a deep learning-based method with partial annotation training for detecting RPE cells in AO-OCT images with accuracy exceeding human performance. We have made the code, imaging datasets, and the manual expert labels available online.
1. Introduction
Retinal pigment epithelium (RPE) cells form the outermost layer of the retina and are situated between the retinal photoreceptor (PR) layer and the choriocapillaris (CC). These pigmented cells play a crucial role in light absorption, nutrient and waste transport, and maintenance of visual function and the visual cycle [1]. Characteristics of the RPE cell mosaic (for example, cell density, cell area, and number of neighbors) undergo changes with aging and in some retinal degenerative diseases such as Stargardt disease, Best disease, retinitis pigmentosa, age-related macular degeneration (AMD), and others [1–5]. Investigations of the RPE cell mosaic topography and of the interaction between the RPE and photoreceptor layers have mainly involved ex vivo experiments [1–5] and cell culture studies [6–8]. Clinical imaging systems such as fundus autofluorescence imaging, confocal scanning laser ophthalmoscopy (cSLO), and optical coherence tomography (OCT) are useful tools for assessing RPE health, but are limited to aggregate measurements encompassing many cells [9–15]. These clinical imaging systems lack the cellular-level resolution needed to evaluate individual RPE cells and the mosaics that they form. In vivo visualization of individual RPE cells offers considerable promise for aiding in the early diagnosis, prognosis, and treatment of RPE-related retinal degenerative diseases.
While it is possible to image a subset of retinal cells in some subjects using conventional ophthalmic imaging systems [16–25], adaptive optics (AO)-enabled imagers allow cellular-level imaging in most eyes and provide the sharpest images of retinal cells [26–31]. In vivo visualization of human RPE cells has been demonstrated with dark-field AO-SLO [32,33], AO-SLO autofluorescence [34–38] and intravenously injected indocyanine-green (ICG) fluorescence imaging [39–41], transscleral optical phase imaging (TOPI) with AO [42,43], and AO-OCT [28,30,44–47]. AO-OCT RPE imaging has recently gained attention for clinical vision research as it provides high-quality three-dimensional (3D) visualization of the PR-RPE-CC complex. Understanding the mechanistic relationships between the different layers of the PR-RPE-CC complex in different retinal diseases is crucial for disease staging and the development of targeted therapies, especially in AMD [48,49]. The information-rich AO-OCT volumetric images offer the potential to detect minute changes in the PR-RPE-CC complex caused by pathology [50,51], and thus provide imaging outcomes that are more sensitive for monitoring disease progression, for example in interventional trials. To achieve this, however, automated analysis methods are necessary for rapid, objective, and cost-efficient quantification of the different structures in the PR-RPE-CC complex.
While several reliable automated PR and ganglion cell segmentation methods for AO-SLO [52–57] and AO-OCT [58–62] have been developed, reliable automated RPE segmentation in AO-OCT images has not been comprehensively addressed. However, automated RPE cell segmentation methods have been proposed for other imaging modalities, such as RPE flat mount images [63–65] and in vivo AO fluorescence [66,67] and TOPI images [42,68]. These methods range from traditional image processing approaches [42,63,66,68] to deep learning methods based on fully supervised [67], semi-supervised [64], or self-supervised [65] learning strategies. The fully supervised method in [67] was developed for AO-ICG images, in which the authors integrated Voronoi diagrams in addition to cell center maps as labels for the training of their cell detection model. In AO-ICG images, cell borders are not directly visible, and thus the Voronoi diagrams were used to introduce additional spatial information about the hexagonal packing pattern of the cells [67]. The semi-supervised [64] and self-supervised [65] methods were developed to leverage large sets of unannotated flat mount images for the training of the deep learning cell segmentation models. In these methods, the reconstruction of patches from damaged image regions (fluorescence labeling of cell nuclei or weak/missing cell borders) into high-quality images facilitated effective representation extraction for the segmentation task [64,65]. These studies reflect the success of different deep learning approaches to automate the task of RPE cell segmentation in different imaging modalities.
In this paper, we automated RPE cell detection in AO-OCT images with a deep learning-based approach. The results showed that our method exceeded human-level performance in identifying RPE cells across different retinal locations and imagers, and maintained human-level performance on images of a diseased retina and across different ranges of image quality on healthy subjects. The results also showed better performance of our method compared to a popular generalist algorithm for cell segmentation. Our work provides the first step towards automatic quantification of RPE cells in AO-OCT volumes, potentially facilitating the integration of this technology into standard eye care and clinical studies through efficient and robust analysis of the imaging data. To promote future studies, we have made the code, imaging datasets, and the manual expert labels available online.
2. Materials and methods
2.1. AO-OCT dataset and annotation
Our AO-OCT dataset includes images acquired with two imaging systems (Table 1). The first set was acquired by the multimodal AO system developed at the U.S. Food and Drug Administration (FDA), which collects volumes at 13.4 Hz using Fourier domain mode-locked (FDML) laser technology [28]. The dataset consisted of AO-OCT volumes with the AO focus set to the PR-RPE complex from eleven healthy subjects (2°×2° FOV with a lateral sampling of 1.17 µm/pixel) and one subject with sub-clinical drusen deposits (‘diseased’ subject with ID 6242 in Table 1; 1.5°×1.5° FOV with a lateral sampling of 0.88 µm/pixel). For each healthy subject, 9-13 locations were imaged, spanning the retina from the fovea to 12° temporal to the fovea, and the diseased subject was imaged at the locations of the drusen deposits. At each retinal location, multiple volumes were acquired (90 and 340 volumes in ∼7 and 30 minutes for the healthy and diseased subjects, respectively, temporally separated by 0.5-6 seconds for improved RPE contrast by organelle motility [45]), dewarped to correct for sinusoidal scanner motion, automatically registered in 3D [69], and averaged to yield the final high-quality volume for analysis. The second set includes the AO-OCT images from [45] that were obtained by the imager developed at Indiana University (IU). The IU system [70,71] (a spectral domain AO-OCT imager) has a different optical design than the FDA’s FDML imager (a swept source AO-OCT imager). The IU set was captured with a protocol different from the FDA set (e.g., lateral pixel sampling of 1 µm/pixel and volume rate of 5.5 Hz) and consists of 1°×1° RPE en face images from 3D registered and averaged volumes of four healthy subjects acquired at 7° temporal to the fovea. The number of averaged volumes for each subject is summarized in Table 1. All protocols adhered to the tenets of the Helsinki Declaration and were approved by the Institutional Review Boards of the FDA and IU.
Table 1. Summary of data.
| Imager | Subject ID | Age | Sex | # Cropped ROIs | Location of Full FOV | Number of Averaged Images |
|---|---|---|---|---|---|---|
| FDA | 0420 | 27.7 | F | 11 | 10.5°T | 90 |
| FDA | 0571 | 42.2 | M | 10 | 9°T | 90 |
| FDA | 1610 | 32.0 | M | 13 | 3°T | 90 |
| FDA | 2875 | 32.8 | M | 13 | 9°T | 90 |
| FDA | 3339 | 36.6 | M | 13 | 0° | 90 |
| FDA | 5291 | 36.6 | M | 13 | 12°T | 90 |
| FDA | 5810 | 36.5 | F | 13 | 4.5°T | 90 |
| FDA | 7473 | 36.6 | M | 13 | 0° | 90 |
| FDA | 7743 | 42.9 | F | 13 | 12°T | 90 |
| FDA | 8195 | 33.3 | M | 13 | 7.5°T | 90 |
| FDA | 0201 | 32.9 | F | 11 | 12°T | 90 |
| FDA | 6242ᵃ | 62.5 | M | - | 0.8°N/0.8°I & 0.8°N/1°S | 340 |
| IU | S1 | 21.0 | M | - | 7°T | 50 |
| IU | S2 | 26.0 | M | - | 7°T | 45 |
| IU | S3 | 47.0 | F | - | 7°T | 55 |
| IU | S4 | 49.0 | M | - | 7°T | 59 |
ᵃ Subject 6242 is the ‘diseased’ subject with sub-clinical drusen deposits. An extended imaging protocol was used for this subject as part of a different study; the total imaging time and number of averaged volumes required for RPE imaging in this subject are not expected to be higher than for the healthy subjects. T: Temporal, N: Nasal, S: Superior, I: Inferior.
The FDA healthy dataset was manually labeled by expert graders in a recent study [72], which we used as our training set. Briefly, for each AO-OCT volume, a 2D en face projection was created by averaging 2-3 axial pixel planes centered on the RPE layer at its peak intensity location. Manual annotations were created for ∼0.5°×0.5° regions-of-interest (ROI) cropped across the retina with ∼1° separation. An example annotated ROI cropped from the full FOV image is shown in Fig. 1(A). This procedure yielded a total of 136 ROIs across all eleven subjects. Table 1 summarizes the number of ROIs that were extracted from the data of each subject. Two expert graders independently annotated the center of individual RPE cells in these ROI images using custom software (Matlab, Mathworks, Natick MA). We used the annotations of the senior grader as the gold standard ground truth labels and the other annotation set (2nd grader) to obtain the expert-level performance for benchmarking model performance.
Fig. 1.
Curation of RPE training data. A) ROI selection and annotation from a sample RPE full FOV en face image. Manual annotations (x and y coordinates of cell centers; yellow cross markers) were used to create the RPE cell center map, where the pixels at the location of cell centers have values of 1 and 0 elsewhere. These maps were convolved with a Gaussian filter with standard deviation σ to create a smoothed center map used for training. B) Example images created by averaging different numbers of registered images (numbers in the top-left corner). 200 × 200 pixel regions cropped from the full FOV images are shown for better visualization of the RPE cells.
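To make the label-generation step of Fig. 1(A) concrete, the minimal sketch below builds a smoothed cell-center target map from a list of manually annotated cell-center coordinates, using the 1.5-pixel Gaussian standard deviation reported in Section 2.2; the image size and example coordinates are placeholders, not values from the dataset.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_center_map(cell_xy, image_shape, sigma=1.5):
    """Convert manual cell-center coordinates into a smoothed target map in [0, 1].

    cell_xy: (N, 2) array of (x, y) cell-center coordinates in pixels.
    image_shape: (height, width) of the en face ROI image.
    """
    center_map = np.zeros(image_shape, dtype=np.float32)
    for x, y in np.round(cell_xy).astype(int):
        center_map[y, x] = 1.0               # impulse at each annotated cell center
    smoothed = gaussian_filter(center_map, sigma=sigma)
    return smoothed / smoothed.max()         # rescale to [0, 1] for training

# Toy usage: a 512 x 512 ROI with three hypothetical cell centers
roi_shape = (512, 512)
centers = np.array([[100.0, 120.0], [130.0, 118.0], [160.0, 125.0]])
target_map = make_center_map(centers, roi_shape)
```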
Beyond the cropped ROIs of the FDA dataset, we created a set of annotated full FOV images to test the generalizability of our framework trained with partial annotations (labeled ROIs) across a larger area. One 2°×2° region at a specific retinal location was selected per healthy subject for manual annotation. The set of images was selected to cover the entire imaged region over all subjects (fovea to 12°T) while favoring locations that exhibited the best qualitative visualization of RPE cells in the 90-volume averaged data (see Table 1 for the selected location for each subject).
For the diseased subject, the presence of drusen below the RPE layer elevated the outer retinal layers. We used our previously developed layer segmentation network in [61] to segment the displaced photoreceptor layer, and then segmented the RPE layer by following the curvature of the cone outer segment boundary down by 8-12 pixels (12-18 µm). Using the RPE layer segmentation, we created the en face images and used them for further analysis. Two expert graders independently annotated the full FOV images, the diseased images, and images from the IU dataset for performance evaluation.
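As a rough illustration of this step, the sketch below extracts an en face image by following a previously segmented cone outer segment (COS) boundary surface shifted deeper by a fixed axial offset; the axis ordering of the volume, the 10-pixel offset, and the 3-plane averaging are assumptions made for this example, not the exact parameters used in the paper.

```python
import numpy as np

def en_face_along_surface(volume, cos_depth, offset=10, half_thickness=1):
    """Project an en face image by averaging a few axial planes that follow
    a segmented surface shifted deeper by `offset` pixels.

    volume: 3D array ordered (depth, rows, cols).
    cos_depth: 2D array (rows, cols) of COS boundary depths in pixels.
    """
    depth, rows, cols = volume.shape
    en_face = np.zeros((rows, cols), dtype=np.float32)
    for r in range(rows):
        for c in range(cols):
            z = int(round(cos_depth[r, c])) + offset       # follow the COS curvature downward
            z0 = max(z - half_thickness, 0)
            z1 = min(z + half_thickness + 1, depth)
            en_face[r, c] = volume[z0:z1, r, c].mean()     # average a few axial planes
    return en_face

# Toy usage with synthetic data
vol = np.random.rand(300, 64, 64).astype(np.float32)
cos = np.full((64, 64), 150.0)
rpe_en_face = en_face_along_surface(vol, cos)
```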
2.2. Overall framework: network architectures and training process
Our developed method (Fig. 2(A)) consists of two independently trained modules: one for RPE identification and the other for vessel segmentation. The vessel segmentation module was included to remove regions where RPE information was compromised by the low signal-to-noise ratio caused by shadowing from overlying vessels in the full FOV en face images (note that vessel shadows were not present in the cropped ROIs). At inference, the binary mask of the hypo-reflective regions marked by the vessel segmentation module is used to filter out cells found in these areas from further analysis.
Fig. 2.
Overall framework for automatic RPE cell detection from AO-OCT images. A) Full FOV images are processed both by the RPE cell identification network and DconnNet (vessel segmentation). The automatically identified cells in the segmented regions determined by DconnNet are excluded from further analysis (final cells are the yellow markers on the zoomed-in section of the image indicated by the blue box). B) Network architecture for RPE cell center localization, consisting of the DenseNet121 network as the encoder for extracting multi-dimensional features. TL: transition layer, BN: batch normalization, ReLU: rectified linear unit. The numbers in each DenseBlock denote the number of dense layers in the block. C) Example images created by Fourier-based intensity augmentation of a sample ROI image.
The inputs to the cell identification network (Fig. 2(B)) are the 2D AO-OCT images, and the network outputs a probability map predicting the locations of RPE cell centers, denoted by bright clusters of pixels. Input AO-OCT images were preprocessed by normalizing the intensity values to the range [0,1]. For cell identification, we used a network architecture similar to the one in [67], which demonstrated better performance compared to U-Net or VGG16-U-Net for RPE cell detection in AO-ICG images. Briefly, the network was composed of the DenseNet121 network [73] as the encoder, and the decoder path was constructed with multiple layers of deconvolution and dense blocks to restore the feature maps to the input image resolution. Skip connections between the encoder and decoder levels were also included, and batch normalization, ReLU activation, and dropout were used at each level of the network. Instead of using a single deconvolution layer at the output stage, we used bilinear upsampling followed by convolution at the last stage of the network, followed by a sigmoid activation function. This modification was made to avoid the checkerboard artifact common to deconvolution layers [74].
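The sketch below contrasts a strided transposed-convolution output stage with the bilinear-upsampling-plus-convolution stage described above. It is a minimal PyTorch illustration of this design choice only; the channel count and kernel sizes are placeholders and it is not the full network.

```python
import torch
import torch.nn as nn

# Prone to checkerboard artifacts: a strided transposed convolution at the output
deconv_head = nn.Sequential(
    nn.ConvTranspose2d(64, 1, kernel_size=2, stride=2),
    nn.Sigmoid(),
)

# Alternative used here: bilinear upsampling, then a convolution, then sigmoid
upsample_head = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(64, 1, kernel_size=3, padding=1),
    nn.Sigmoid(),
)

features = torch.randn(1, 64, 128, 128)          # toy decoder feature maps
cell_probability_map = upsample_head(features)    # shape: (1, 1, 256, 256)
```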
To optimize the weights of the network, we used the average L1 difference between the ground truth centroid map M(x,y) and the network’s predicted map P(x,y), defined as:

$$\mathcal{L} = \frac{1}{N}\sum_{x,y}\left|P(x,y)-M(x,y)\right| \qquad (1)$$

where N is the number of pixels in P. Given the ground truth cell markings of the ROI images, we created cell probability maps by centering Gaussian kernels with a standard deviation of 1.5 pixels at the cell coordinates and mapping the intensity values to the range [0,1] (Fig. 1(A)). The network was trained with a batch size of 8 for a maximum of 250 epochs using the Adam optimizer with a learning rate of 0.0001. During training, data augmentation in the form of random spatial transforms (horizontal flip, vertical flip, and diagonal flip) and intensity augmentation was performed. We used a Fourier-based approach for intensity augmentation in which the amplitude information of images in the Fourier domain was manipulated [75]. Since the phase component in the Fourier domain retains most of the important features of an image, perturbing the amplitude information generates augmented images with the same high-level semantics as the original image [76]. Given an image I(x,y) with Fourier transform F(u,v), an augmented image J(x,y) is created by:

$$J(x,y) = \mathcal{F}^{-1}\left\{U(u,v)\,A(u,v)\,e^{\,j\varphi(u,v)}\right\} \qquad (2)$$
in which A(u,v) and φ(u,v) are the magnitude and phase of F(u,v), respectively. The amplitude information A(u,v) is augmented by element-wise multiplication with a noise image, U(u,v), consisting of values randomly drawn from a uniform distribution in the range [0.5, 1]. Illustrative examples of such augmented images are shown in Fig. 2(C). An ablation experiment on this step is reported in Section 1.1 of Supplement 1.
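A minimal NumPy sketch of this amplitude-perturbation augmentation, assuming the uniform noise range [0.5, 1] given above, might look like:

```python
import numpy as np

def fourier_intensity_augment(image, rng, low=0.5, high=1.0):
    """Perturb the Fourier amplitude of an image while keeping its phase (cf. Eq. (2))."""
    F = np.fft.fft2(image)
    amplitude = np.abs(F)
    phase = np.angle(F)
    noise = rng.uniform(low, high, size=F.shape)          # noise image U(u, v)
    F_aug = noise * amplitude * np.exp(1j * phase)         # perturbed spectrum
    augmented = np.real(np.fft.ifft2(F_aug))               # back to image space
    return np.clip(augmented, 0.0, 1.0)                    # keep intensities in [0, 1]

rng = np.random.default_rng(0)
image = np.random.rand(256, 256)           # stand-in for a normalized ROI image
augmented = fourier_intensity_augment(image, rng)
```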
The best-performing model was determined based on the minimum loss value achieved on the validation set. At the inference stage, individual RPE cells were localized as local maxima in the predicted probability map. Local maxima with a relative height of at least α within a 3 × 3 pixel neighborhood were kept as predicted cell locations. The threshold α was selected as the value that resulted in the highest F1 score on the validation data. Implementation details and processing times are reported in Section 1.2 of Supplement 1.
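The sketch below illustrates this localization step: 3 × 3 local maxima are detected and kept only if they clear the threshold α. The interpretation of "relative height" as the peak value minus the minimum value in the same neighborhood is an assumption of this example, not necessarily the exact criterion used in the implementation.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def localize_cells(prob_map, alpha):
    """Detect cell centers as 3x3 local maxima with relative height >= alpha."""
    local_max = maximum_filter(prob_map, size=3)
    local_min = minimum_filter(prob_map, size=3)
    is_peak = (prob_map == local_max) & (local_max - local_min >= alpha)
    rows, cols = np.nonzero(is_peak)
    return np.stack([cols, rows], axis=1)    # (x, y) coordinates of detected cells

prob_map = np.random.rand(256, 256)           # stand-in for the network's output map
cells_xy = localize_cells(prob_map, alpha=0.1)
```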
To segment vessel regions, we used DconnNet [77] as our segmentation network. DconnNet is based on directional connectivity modeling [78] with enhanced topology-preserving capability. Following our proposed approach in [61], we created the training dataset by simulating and adding dark tubular regions to the 0.5°×0.5° ROI images. Images of one subject were used as the validation set for monitoring the training process. DconnNet was trained with the Adam optimizer with a learning rate of 0.0001, a batch size of 8, and a maximum of 50 epochs. Once trained, the network was used, without any modifications, to segment vessel regions from the full FOV RPE en face images when needed. A threshold of 0.5 was used to determine pixels belonging to the vessel regions.
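The exact simulation procedure follows [61] and is not reproduced here; the sketch below shows one simple way such a dark tubular region could be added to an ROI image, with the curve shape, tube width, and attenuation factor chosen arbitrarily for illustration.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def add_dark_tube(image, rng, width=6, attenuation=0.2):
    """Darken a random curved tubular region to mimic a vessel shadow (illustrative only)."""
    h, w = image.shape
    y0 = rng.uniform(0.2 * h, 0.8 * h)
    amp = rng.uniform(0.05 * h, 0.2 * h)
    period = rng.uniform(0.5 * w, 2.0 * w)
    cols = np.arange(w)
    rows = (y0 + amp * np.sin(2 * np.pi * cols / period)).astype(int).clip(0, h - 1)
    mask = np.zeros((h, w), dtype=bool)
    mask[rows, cols] = True                           # 1-pixel-wide curved centerline
    mask = binary_dilation(mask, iterations=width)    # thicken the centerline into a tube
    shadowed = image.copy()
    shadowed[mask] *= attenuation                     # attenuate signal under the "vessel"
    return shadowed, mask

rng = np.random.default_rng(1)
roi = np.random.rand(256, 256)
shadowed_roi, vessel_mask = add_dark_tube(roi, rng)
```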
2.3. Competing approach
We used an enhanced version of the popular cell segmentation method named CellPose [79] as an alternative approach to our method for RPE cell detection. The CellPose method represents cell masks as vector flows for model training and test-time cell segmentation. The neural network in CellPose is a modified U-Net with the standard blocks replaced with residual blocks and using addition for the aggregation of encoder features with the decoder path. Additionally, a style vector was introduced for re-adjusting the network computations on an image-by-image basis for diverse cell datasets [79]. For any given image, CellPose outputs a three-channel map predicting (1) the probability that a pixel is inside or outside of a cell (binary cell mask), and (2) the horizontal and (3) vertical flow gradients towards the cell center. CellPose uses a combined loss (weighted sum of cross-entropy loss for the binary mask and L2 loss for the vector flow) for optimizing the network weights. Similar to [67], we adopted the Voronoi diagrams created with the manual cell centers as the RPE cell masks. We used the code provided online by the authors of CellPose to transform the cell masks into the necessary labels for model training and trained the network with its combined loss using the default hyperparameter values. At the inference stage, CellPose groups pixels based on the predicted gradient maps to segment individual cells by running a dynamic system for ni iterations for each pixel. We increased ni from 200 to 500 to ensure convergence and changed the flow threshold hyperparameter value from 0.4 to 0 as no cells were found with the default value of 0.4. We used the predicted center points of each segmented cell for assessing the detection performance. The original CellPose, unlike the proposed method, does not have a vessel removal mechanism. To help CellPose in these regions and to fairly compare CellPose’s mask-based approach to our cell center-based detection approach in the same image regions, we applied the segmented vessel regions identified with our DconnNet module to the predictions of CellPose. We named this enhanced version Dconn-CellPose to distinguish it from the original CellPose method.
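To make the label-conversion step concrete, the sketch below builds a Voronoi-style label image by assigning every pixel to its nearest annotated cell center (labels 1..N); nearest-center assignment is used here as a simple stand-in for a geometric Voronoi construction, and the subsequent conversion of these masks into CellPose flow labels is handled by the CellPose code and is not reproduced.

```python
import numpy as np
from scipy.spatial import cKDTree

def voronoi_cell_masks(cell_xy, image_shape):
    """Create a label image where each pixel takes the index (1..N) of its
    nearest cell center, i.e., a discrete Voronoi diagram used as cell masks."""
    h, w = image_shape
    yy, xx = np.mgrid[0:h, 0:w]
    pixels = np.stack([xx.ravel(), yy.ravel()], axis=1)
    tree = cKDTree(cell_xy)                   # nearest-center lookup
    _, nearest = tree.query(pixels)
    return nearest.reshape(h, w).astype(np.int32) + 1   # labels start at 1

# Toy usage with three hypothetical cell centers
centers = np.array([[40.0, 50.0], [120.0, 60.0], [90.0, 150.0]])
masks = voronoi_cell_masks(centers, (200, 200))
```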
2.4. Experimental design
We conducted five experiments to evaluate the model performance. In the first experiment, we evaluated performance on cropped ROIs. We used 6-fold cross-validation for training and validation of the cell identification networks, in which subjects were divided into six groups. Five groups contained two subjects and one group consisted of images from one subject. In this strategy, at each fold of cross-validation all images of one group were used as the test set and the remaining data were used in the training and optimization process. From the remaining subjects, images belonging to one subject were separated as the validation set to monitor the training process (selection of the best-performing model) and to determine the best threshold α for our method. At inference (for both our method and Dconn-CellPose), test images were augmented by identity, horizontal, or vertical flipping before being input to the trained networks, and the output probability maps were averaged after reversing the augmentation. RPE cells were then localized on the averaged maps.
In the second experiment, we evaluated model performance on the FDA full FOV images. The full FOV image of each subject was passed through the network that was trained on all cropped ROIs, excluding those belonging to the same subject. We used test-time-augmentation for both models to enhance the predicted cell maps, in which multiple versions of the input image were passed through the trained neural network. Test-time-augmentation consisted of an image-level augmentation step followed by a tiling step. In the first step, an input image was augmented by a spatial transformation in the form of identity, horizontal, or vertical flipping before feeding into the network. In the tiling step, the input image was partitioned into overlapping windows of size 256 × 256 pixels with 10% overlap. The whole image prediction map was then constructed by first placing the windowed predictions in their corresponding locations and averaging in the overlap areas, and then reversing the whole-image augmentation if flipping was applied. The predicted maps from all image augmentations were averaged to yield the final map for cell localization. To segment vessels, predictions were made on the full FOV images with test-time-augmentation without the tiling step. The segmented vessel regions were used to calculate performance scores (Section 2.5).
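A simplified sketch of the tiling step (without the flip augmentations) is shown below; the window size and overlap follow the values above, and the trained network is represented by a placeholder `predict` function.

```python
import numpy as np

def _starts(length, window, step):
    """Window start positions covering the full extent, including the far edge."""
    if length <= window:
        return [0]
    positions = list(range(0, length - window, step))
    positions.append(length - window)     # last window flush with the image edge
    return positions

def tiled_prediction(image, predict, window=256, overlap=0.1):
    """Run `predict` on overlapping windows and average in the overlap regions."""
    h, w = image.shape
    step = max(int(window * (1 - overlap)), 1)
    pred_sum = np.zeros((h, w), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.float32)
    for top in _starts(h, window, step):
        for left in _starts(w, window, step):
            tile = image[top:top + window, left:left + window]
            pred_sum[top:top + window, left:left + window] += predict(tile)
            counts[top:top + window, left:left + window] += 1.0
    return pred_sum / counts

# Toy usage with an identity "network" standing in for the trained model
full_fov = np.random.rand(700, 700).astype(np.float32)
prob_map = tiled_prediction(full_fov, predict=lambda tile: tile)
```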
In the third experiment, the goal was to evaluate model performance across different levels of image quality. We created images with different qualities in terms of signal-to-noise ratio (SNR) for each of the selected full FOV images of the FDA healthy dataset by averaging different numbers of registered volumes. Specifically, after registering all 90 volumes to a selected reference volume, we sorted these volumes using the quality metric of our registration software [69] in descending order and selected the top K for averaging. We set K to 1, 5, 10, 20, 30, 45, 60, and 90 volumes. Example images are shown in Fig. 1(B). We quantified SNR as the ratio between the RPE peak signal (cusp of concentrated energy) and the average of the noise floor (determined by the optical cut-off frequency) in the 2D power spectra. Similar to the previous experiment, we used test-time-augmentation for cell detection. We used the human annotations and predicted vessel mask from the 90-volume averaged image for all other images when calculating the performance scores, as all image instances were aligned to a common reference space.
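One way such an SNR estimate could be computed is sketched below: the 2D power spectrum is radially averaged, the peak in a band around the RPE cell frequency is taken as the signal, and the mean beyond a cut-off radius as the noise floor. The band indices and cut-off bin here are placeholders for illustration only, not the values used in the paper.

```python
import numpy as np

def radial_average(power, n_bins=128):
    """Average a 2D power spectrum over rings of constant spatial frequency."""
    h, w = power.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)
    bins = np.linspace(0, r.max(), n_bins + 1)
    idx = np.clip(np.digitize(r.ravel(), bins) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return sums / np.maximum(counts, 1)

def estimate_snr(image, rpe_band=(10, 40), cutoff_bin=90):
    """SNR as the RPE spectral peak divided by the mean noise floor beyond the cut-off."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(image - image.mean()))) ** 2
    profile = radial_average(power)
    rpe_peak = profile[rpe_band[0]:rpe_band[1]].max()   # cusp of concentrated energy
    noise_floor = profile[cutoff_bin:].mean()           # beyond the optical cut-off
    return rpe_peak / noise_floor

snr = estimate_snr(np.random.rand(512, 512))
```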
In the fourth experiment, we evaluated the generalizability of the trained models to AO-OCT images of healthy subjects obtained by an imager with a different optical design and type (IU imager) than that of the training dataset (FDA imager). To detect RPE cells, we ensembled the predictions of the six trained models by averaging their predictions and then localizing cells (or segmenting vessel shadows) on the averaged map. For our method, we used the maximum value of the hyperparameter α among all the trained models for the final result. To account for differences in the lateral pixel sizes between the two imagers, the IU images were resized by a factor of 0.8 before being input to the neural networks.
Lastly, we evaluated the generalizability of the models trained on healthy images to a diseased case. We used the two images from the subject with drusen deposits in the FDA dataset for this purpose. Similar to the previous experiment, we accounted for differences in the lateral pixel sizes between the FOVs by resizing the diseased images by a factor of 0.75 before inputting to the neural networks and ensembled the predictions of the trained models for final cell detection and segmentation of hypo-reflective regions. The segmented hypo-reflective mask was used for all models in performance score calculations. For our method, we used the minimum value of the hyperparameter α among all the trained models for the final result.
2.5. Evaluation metrics
In all experiments, detection performance was quantified using recall, precision, and F1 scores [61]. We used the Euclidean distance between the manually marked and the automatically identified cells to find matches between the two sets. If the distance between a manually marked cell and its nearest automatically identified cell was smaller than Dcell, the pair was considered a true match. To remove border artifacts, cells located within Dedge pixels of the image edges were disregarded as border cells; if either cell in a matched pair was a border cell, both cells were removed. Dcell was set to ∼6 µm for the FDA healthy dataset, which contained multiple retinal locations, and ∼7 µm for the IU and FDA diseased images. We set a larger value for Dcell for the IU set as it was collected at 7°T (average RPE cell size ∼14 µm), and also for the subject with drusen deposits as RPE cells lose their shape and enlarge at the affected sites [73]. We set Dedge to 5 pixels for the FDA ROI images, 50 and 15 pixels (fast-axis and slow-axis directions, respectively) for the FDA full FOV images, and 15 pixels for the IU images. We used larger values of Dedge for the full FOV images to remove regions with potential residual motion or uncorrected aberrations. Cells (manually and automatically detected) overlapping the hypo-reflective regions were also disregarded. Note that a manual label for the hypo-reflective regions was not available. Thus, we relied on the segmentations by our DconnNet module to identify these regions for the analysis.
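The matching and scoring can be sketched as follows: each manually marked cell is paired with its nearest detected cell, the pair counts as a true positive only if their distance is below Dcell, and recall, precision, and F1 are computed from the resulting counts. The one-to-one matching constraint and the border and vessel-region exclusions described above are simplifications or omissions of this sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def detection_scores(gt_xy, pred_xy, d_cell):
    """Nearest-neighbor matching of ground-truth and predicted cell centers."""
    tree = cKDTree(pred_xy)
    used = set()
    tp = 0
    for gt in gt_xy:
        dist, j = tree.query(gt)
        if dist < d_cell and j not in used:   # greedy one-to-one matching (assumption)
            tp += 1
            used.add(j)
    recall = tp / len(gt_xy)
    precision = tp / len(pred_xy)
    f1 = 2 * precision * recall / (precision + recall + 1e-12)
    return recall, precision, f1

# Toy usage with hypothetical coordinates (in pixels)
gt = np.array([[10.0, 10.0], [30.0, 12.0], [50.0, 15.0]])
pred = np.array([[11.0, 9.0], [29.0, 13.0], [80.0, 80.0]])
print(detection_scores(gt, pred, d_cell=6.0))
```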
For quantifying the inter-observer agreement, the markings of the 2nd grader were compared to the gold-standard markings in the same way. We used the Wilcoxon signed-rank test as the statistical test for all pairwise comparisons unless noted otherwise. The threshold p-value for statistical significance was set to 0.05 and 0.05/n for single and multiple comparisons, respectively, with n representing the number of comparisons.
3. Results
3.1. RPE cell detection on cropped ROIs
Examples of automated and 2nd grader annotations for three subjects are illustrated in Fig. 3(A). These images were selected to cover the range of difficulties in marking RPE cells, from an easy (top image) to a difficult case (bottom image). Overall, our method was significantly better than the 2nd grader in all performance scores (all p-values < 0.0001) when processing the 136 ROI images that covered the retina from the fovea to 12°T (Table 2). The higher score values were also maintained at the subject level, as shown in Fig. 3(B). Although Dconn-CellPose exceeded the 2nd grader in precision and F1 (p-values < 0.0001), its performance was significantly lower than that of our method (F1 p-value = 0.0004; significance level: 0.05/3 = 0.0167). Next, we used the 2nd grader annotations as the ground truth to assess whether the results depended on the choice of human grader. The results in Table 2 show that our method still performed better than Dconn-CellPose (F1 p-value = 0.0018).
Fig. 3.
RPE cell detection results on ROI images. A) Example markings of the 2nd grader, our automated method, and Dconn-CellPose compared to the ground truth. Cyan asterisk: true positive, red triangle: false positive, and yellow circle: false negative. B) Box chart plots of the performance scores across different subjects. The line in each box is the median and the top and bottom edges are the upper and lower quartiles, respectively. Outliers are displayed as circle markers, and the whiskers extend to the maximum and minimum values that are not outliers. DCP: Dconn-CellPose.
Table 2. Performance scores on 136 FDA ROI images. The first and second sets of scores denote the results for the case of using Grader1’s and 2nd grader’s (Grader2) annotations as the ground truth, respectively. For each score in each set, the higher value is written in bold.
| Ground Truth | Score | 2nd Grader | Ours | Dconn-CellPose |
|---|---|---|---|---|
| Grader1 | Recall | 0.915 ± 0.043 | **0.929 ± 0.037** | 0.920 ± 0.043 |
| Grader1 | Precision | 0.891 ± 0.055 | **0.920 ± 0.043** | 0.911 ± 0.047 |
| Grader1 | F1 | 0.901 ± 0.035 | **0.924 ± 0.029** | 0.914 ± 0.035 |
| Grader2 | Recall | - | **0.916 ± 0.049** | 0.908 ± 0.052 |
| Grader2 | Precision | - | **0.929 ± 0.038** | 0.923 ± 0.038 |
| Grader2 | F1 | - | **0.921 ± 0.030** | 0.914 ± 0.031 |
3.2. Evaluation of cell detection on full FOV images
Figures 4 and 5 show example results of the 2nd grader and our method compared to the ground truth for an easy and a challenging case (defined by high and low performance scores, respectively). As the results in Table 3 show, our method outperformed the 2nd grader (F1 p-value = 0.0019; significance level: 0.05/3 = 0.0167) and achieved a marginally higher F1 score than Dconn-CellPose, while Dconn-CellPose was on par with the 2nd grader (F1 p-value = 0.042; significance level: 0.05/3 = 0.0167).
Fig. 4.
RPE cell detection of the 2nd grader and our automated method on the full FOV image for an easy case (subject 8195). The automatically segmented vessel region by DconnNet is overlaid as the magenta region. The magenta boxes indicate the location of the zoomed-in images for better visualization. Cyan asterisk: true positive (TP), red triangle: false positive (FP), and yellow circle: false negative (FN).
Fig. 5.
RPE cell detection of the 2nd grader and our automated method on the full FOV image for a challenging case (subject 7743). The automatically segmented vessel region by DconnNet is overlaid as the magenta region. The magenta boxes indicate the location of the zoomed-in images for better visualization. Cyan asterisk: true positive (TP), red triangle: false positive (FP), and yellow circle: false negative (FN).
Table 3. Performance scores on the 11 FDA full FOV images. The first and second sets of scores denote the results for the case of using Grader1’s and 2nd grader’s (Grader2) annotations as the ground truth, respectively. For each score in each set, the higher value is written in bold.
| Ground Truth | Score | 2nd Grader | Ours | Dconn-CellPose |
|---|---|---|---|---|
| Grader1 | Recall | **0.916 ± 0.043** | **0.916 ± 0.042** | 0.908 ± 0.051 |
| Grader1 | Precision | 0.907 ± 0.026 | **0.938 ± 0.011** | **0.938 ± 0.015** |
| Grader1 | F1 | 0.911 ± 0.022 | **0.926 ± 0.020** | 0.922 ± 0.027 |
| Grader2 | Recall | - | **0.918 ± 0.042** | 0.912 ± 0.056 |
| Grader2 | Precision | - | 0.949 ± 0.022 | **0.951 ± 0.021** |
| Grader2 | F1 | - | **0.933 ± 0.020** | 0.930 ± 0.028 |
To show that our better performance scores were valid regardless of which grader was selected as the ground truth, we quantified the performances using the 2nd grader annotations as the ground truth. Results in Table 3 show that our method was marginally better than Dconn-CellPose.
3.3. Evaluation of cell detection across different levels of SNR
The performance of our automated approach for different image qualities is shown in Fig. 6. With the decrease in the number of averaged volumes (Fig. 6(A)) and, in turn, the reduction in SNR (Fig. 6(B); binned to 0.5 intervals), the performance scores gradually decrease. Compared to the 2nd grader, who annotated the highest quality images of each subject (90-volume averaged images), our method exceeded the expert-level F1 score on the 90- and 60-volume averages (p-values < 0.005), and was on par with the expert grader for the 45- and 30-volume averages (p-values = 0.067 and 0.898, respectively). For averaged images with 20 volumes or fewer, the F1 scores decreased below those of the 2nd grader (p-values ≤ 0.024). In terms of SNR, our method attained expert-level F1 scores down to SNR ≈ 3 (p-values > 0.1; two-sided Wilcoxon rank sum test). A similar trend was observed for Dconn-CellPose.
Fig. 6.
Performance scores across different levels of A) number of averaged volumes and B) binned signal-to-noise ratio (SNR). In B, the scores are plotted against the mean of each binned SNR interval (interval lengths of 0.5). The 2nd grader labeled the 90-volume averaged images, for which the performance scores are shown with the yellow markers/shaded area in A. The circle markers and shaded areas denote the mean and standard deviation (STD), respectively. The mean ± STD of the 2nd grader scores are also shown in B as the shaded area. DCP: Dconn-CellPose.
3.4. Generalizability to a different AO-OCT imager
RPE cell detection results for the IU dataset (four subjects) are summarized in Table 4. The F1 score of our method was higher than that of the 2nd grader and Dconn-CellPose for three and two cases, respectively. On average, our method achieved a marginally higher F1 score compared to Dconn-CellPose and the 2nd grader. As compared to the results for the FDA dataset, all average F1 scores are higher on the IU images (2nd grader: 0.935 vs. 0.911, ours: 0.941 vs. 0.926, and Dconn-CellPose: 0.939 vs. 0.922), though the sample size for the IU set is much smaller than the FDA set. An example image from the IU set with the cell detection results is illustrated in Fig. 7.
Table 4. Generalizability of models trained on the FDA healthy ROI images to the IU dataset and diseased images. Scores are reported as F1 (recall, precision) for individual cases.
| Dataset | Subject ID | 2nd Grader | Ours | Dconn-CellPose |
|---|---|---|---|---|
| IU Healthy | S1 | 0.932 (0.922, 0.941) | 0.941 (0.943, 0.939) | 0.943 (0.949, 0.937) |
| IU Healthy | S2 | 0.952 (0.965, 0.940) | 0.969 (0.973, 0.965) | 0.973 (0.976, 0.971) |
| IU Healthy | S3 | 0.915 (0.896, 0.934) | 0.908 (0.876, 0.942) | 0.905 (0.865, 0.950) |
| IU Healthy | S4 | 0.941 (0.945, 0.937) | 0.946 (0.924, 0.970) | 0.934 (0.923, 0.946) |
| IU Healthy | Overall F1 | 0.935 ± 0.016 | 0.941 ± 0.025 | 0.939 ± 0.028 |
| FDA Diseased | 6242 – Image1 | 0.897 (0.879, 0.915) | 0.899 (0.900, 0.899) | 0.901 (0.898, 0.903) |
| FDA Diseased | 6242 – Image2 | 0.868 (0.848, 0.888) | 0.874 (0.879, 0.870) | 0.874 (0.885, 0.864) |
Fig. 7.
RPE cell detection results on subject S4 from the IU dataset. Cyan asterisk: true positive, red triangle: false positive, and yellow circle: false negative.
3.5. Generalizability to diseased images
RPE cell detection results for the images from the diseased subject are summarized in Table 4 and Fig. 8. Overall, both our method and Dconn-CellPose achieved scores similar to those of the 2nd grader. Note that the drusen deposits occurred near the fovea for this subject, where vessel shadows are not present. We used the trained DconnNet network to identify the hypo-reflective regions associated with the presence of drusen and removed them from the performance score calculations. These segmented regions are overlaid as the magenta masks in Fig. 8.
Fig. 8.
RPE detection results on the two images of the diseased subject with drusen deposits. The automatically segmented hypo-reflective regions (corresponding to the drusen areas) by DconnNet are overlaid as the magenta region. Cyan asterisk: true positive, red triangle: false positive, and yellow circle: false negative.
4. Discussion
In this paper, we developed a deep learning method, trained with partial annotations, to automatically detect RPE cells from AO-OCT images of the human retina. Our work is the first to demonstrate automated detection of RPE cells in this imaging modality with accuracy exceeding human performance. Our framework consisted of a cell detection module to localize cells in en face images and a vessel segmentation module to remove low-confidence regions for cell localization.
Fully annotating the full FOV en face images across different subjects and retinal locations (either for cellular-level clinical investigations or for model development purposes) is an extremely labor-intensive task. To alleviate the annotation cost for model training and avoid redundancy in image appearance, we partially annotated the images, as RPE cells in healthy eyes exhibit repetitive patterns across the 2°×2° FOV. The training data were created by restricting the markings to selected ROIs covering 1/16th of the area of the full FOV image, thus significantly reducing the annotation cost while fully leveraging the diverse images captured across subjects and retinal locations.
Given the annotated ROIs for training, our method achieved above expert-level performance across different retinal locations and maintained human-level performance down to SNR = 3 on the full FOV data, which was ∼0.5 of the maximum SNR in our set of healthy images. In the case of non-averaged images (en face RPE images from single volumes), although our method was not on par with the 2nd grader (who had the advantage of annotating the 90-volume averaged images), the scores were encouraging (recall = 0.831 ± 0.055, precision = 0.812 ± 0.038, F1 = 0.819 ± 0.027). With the decrease in SNR as a result of less averaging, it is expected that human graders will also struggle to locate RPE cells. We have not quantified the drop in the 2nd grader’s performance on a lower number of averaged volumes as it was beyond the scope of this paper.
The results of the generalizability experiments showed that our method attained human-level scores on images taken with a different AO-OCT system and images from a diseased eye with drusen deposits. As the results in Section 3.3 showed, image SNR affected the performance of the model trained on only the high SNR 90-volume averaged images. Thus, system parameters and imaging protocols that affect RPE image quality (e.g., optical properties that control speckle size and shape, and the time between consecutive AO-OCT volumes) and the lateral pixel size of the RPE image could influence generalizability. For our IU dataset, we obtained high performance by only correcting for the differences in pixel sizes. Larger datasets from a more diverse healthy population (e.g., with respect to age), different diseased eyes (e.g., with respect to severity and the type of disease, including inherited retinal dystrophies), and imagers with different optical designs (e.g., with respect to imaging wavelength, resolution, and power) are required for a more comprehensive generalizability assessment of our method on AO-OCT images. Beyond AO-OCT, imagers with different contrast modes for in vivo RPE cell visualization (e.g., AO-TOPI) could also benefit from our method.
In this work, we tackled the problem of RPE cell localization with partial annotations of cell centers. As RPE cells are tightly packed without empty spaces between them, except for regions with vessel shadowing, Voronoi-based analysis of the cell centers often gives adequate representations of the cell borders. When we used the Voronoi diagrams as cell masks to train the popular CellPose model and combined its predictions with our vessel segmentation module to eliminate predictions under vessels (which we called Dconn-CellPose), the comparison to our method did not show an advantage of the extra spatial information for RPE cell detection. Thus, our approach of directly detecting the cell centers is sufficient for future RPE cell mosaic analyses. A potential avenue for future studies is to investigate transformer-based methods. These models have recently shown promise in other types of segmentation tasks but require larger datasets to learn from, which were not available in this study.
In this work, we used Gaussian kernels with a fixed standard deviation to represent cell centers across different retinal locations, as the increase in RPE cell-to-cell spacing was observed to be only ∼0.3 µm/degree from the fovea to 6°, plateauing at higher eccentricities [72]. Future studies could explore the potential advantage of using adaptive values for the Gaussian kernel’s standard deviation across the retina. Another avenue for future investigation is to use an auxiliary task for the unlabeled image regions through self-supervised learning, which might further reduce the amount of labeled data needed to achieve human-level performance.
In AO-OCT imaging, speckle noise limits the visibility of low-contrast structures such as the RPE cells. Averaging of multiple AO-OCT images is required to enhance the RPE cell contrast to a level at which individual cells can be discerned, thus increasing the imaging time. Data collected in longer imaging sessions are more susceptible to different artifacts, such as eye motion, blinks, and tear film breakup, which degrade image quality. As our method maintained human-level performance on lower SNR images, the imaging time per retinal location can be reduced. This improvement can be used either to decrease patient wait times (improving clinical efficiency) or to acquire images from additional retinal locations to yield a more complete picture of retinal health (improving clinical diagnosis). In combination with a recent work on enhancing RPE cell structures from single volumes [80], the gain in imaging speed and cell quantification for future clinical investigations could be even more substantial.
Supplemental information
Acknowledgment
We thank Osamah Saeedi for clinical support. This project was supported in part by an appointment to the Research Participation Program at the U.S. FDA administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. DoE and the FDA.
Funding
National Institutes of Health (P30EY005722, R01-EY018339, R01-EY035675); Foundation Fighting Blindness (BR-CL-0621-0812-DUKE); U.S. Food and Drug Administration (Critical Path Initiative); Research to Prevent Blindness (Unrestricted Grant to Duke University); Hartwell Foundation (Postdoctoral Fellowship).
Disclosures
The mention of commercial products, their sources, or their use in connection with material reported herein is not to be construed as either an actual or implied endorsement of such products by the US Department of Health and Human Services.
Data availability
Code for our algorithm, imaging data, and the manual expert labels underlying the results presented in this paper are available online [81].
Supplemental document
See Supplement 1 for supporting content.
References
- 1.Strauss O., “The retinal pigment epithelium in visual function,” Physiol. Rev. 85(3), 845–881 (2005). 10.1152/physrev.00021.2004 [DOI] [PubMed] [Google Scholar]
- 2.Boulton M., Dayhaw-Barker P., “The role of the retinal pigment epithelium: Topographical variation and ageing changes,” Eye 15(3), 384–389 (2001). 10.1038/eye.2001.141 [DOI] [PubMed] [Google Scholar]
- 3.Ach T., Huisingh C., McGwin G., et al. , “Quantitative autofluorescence and cell density maps of the human retinal pigment epithelium,” Invest. Ophthalmol. Visual Sci. 55(8), 4832 (2014). 10.1167/iovs.14-14802 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Bhatia S. K., Rashid A., Chrenek M. A., et al. , “Analysis of RPE morphometry in human eyes,” Mol. Vis. 22, 898–916 (2016). [PMC free article] [PubMed] [Google Scholar]
- 5.Kim Y.-K., Yu H., Summers V. R., et al. , “Morphometric analysis of retinal pigment epithelial cells from C57BL/6J mice during aging,” Invest. Ophthalmol. Visual Sci. 62(2), 32 (2021). 10.1167/iovs.62.2.32 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Esteve-Rudd J., Hazim R. A., Diemer T., et al. , “Defective phagosome motility and degradation in cell nonautonomous RPE pathogenesis of a dominant macular degeneration,” Proc. Natl. Acad. Sci. U.S.A. 115(21), 5468–5473 (2018). 10.1073/pnas.1709211115 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Lakkaraju A., Umapathy A., Tan L. X., et al. , “The cell biology of the retinal pigment epithelium,” Prog. Retinal Eye Res. 78, 100846 (2020). 10.1016/j.preteyeres.2020.100846 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Umapathy A., Torten G., Paniagua A. E., et al. , “Spatiotemporal live-cell analysis of photoreceptor outer segment membrane ingestion by the retinal pigment epithelium reveals actin-regulated scission,” J. Neurosci. 43(15), 2653–2664 (2023). 10.1523/JNEUROSCI.1726-22.2023 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.von Rückmann A., Fitzke F. W., Bird A. C., “Distribution of fundus autofluorescence with a scanning laser ophthalmoscope,” Br. J. Ophthalmol. 79(5), 407–412 (1995). 10.1136/bjo.79.5.407 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Keilhauer C. N., Delori F. C., “Near-infrared autofluorescence imaging of the fundus: visualization of ocular melanin,” Invest. Ophthalmol. Visual Sci. 47(8), 3556 (2006). 10.1167/iovs.06-0122 [DOI] [PubMed] [Google Scholar]
- 11.Holz F. G., Bindewald-Wittich A., Fleckenstein M., et al. , “Progression of geographic atrophy and impact of fundus autofluorescence patterns in age-related macular degeneration,” Am. J. Ophthalmol. 143(3), 463–472.e2 (2007). 10.1016/j.ajo.2006.11.041 [DOI] [PubMed] [Google Scholar]
- 12.Spaide R. F., Curcio C. A., “Drusen characterization with multimodal imaging,” Retina 30(9), 1441–1454 (2010). 10.1097/IAE.0b013e3181ee5ce8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Ahlers C., Götzinger E., Pircher M., et al. , “Imaging of the retinal pigment epithelium in age-related macular degeneration using polarization-sensitive optical coherence tomography,” Invest. Ophthalmol. Visual Sci. 51(4), 2149 (2010). 10.1167/iovs.09-3817 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Götzinger E., Pircher M., Geitzenauer W., et al. , “Retinal pigment epithelium segmentation by polarization sensitive optical coherence tomography,” Opt. Express 16(21), 16410 (2008). 10.1364/OE.16.016410 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Allingham M. J., Nie Q., Lad E. M., et al. , “Semiautomatic segmentation of rim area focal hyperautofluorescence predicts progression of geographic atrophy due to dry age-related macular degeneration,” Invest. Ophthalmol. Visual Sci. 57(4), 2283 (2016). 10.1167/iovs.15-19008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Miller D. T., Williams D. R., Morris G. M., et al. , “Images of cone photoreceptors in the living human eye,” Vision Res. 36(8), 1067–1079 (1996). 10.1016/0042-6989(95)00225-1 [DOI] [PubMed] [Google Scholar]
- 17.LaRocca F., Dhalla A.-H., Kelly M. P., et al. , “Optimization of confocal scanning laser ophthalmoscope design,” J. Biomed. Opt 18(7), 076015 (2013). 10.1117/1.JBO.18.7.076015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.LaRocca F., Nankivil D., Farsiu S., et al. , “Handheld simultaneous scanning laser ophthalmoscopy and optical coherence tomography system,” Biomed. Opt. Express 4(11), 2307 (2013). 10.1364/BOE.4.002307 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.LaRocca F., Nankivil D., DuBose T., et al. , “In vivo cellular-resolution retinal imaging in infants and children using an ultracompact handheld probe,” Nat. Photonics 10(9), 580–584 (2016). 10.1038/nphoton.2016.141 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.DuBose T. B., LaRocca F., Farsiu S., et al. , “Super-resolution retinal imaging using optically reassigned scanning laser ophthalmoscopy,” Nat. Photonics 13(4), 257–262 (2019). 10.1038/s41566-019-0369-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Mecê P., Groux K., Scholler J., et al. , “Coherence gate shaping for wide field high-resolution in vivo retinal imaging with full-field OCT,” Biomed. Opt. Express 11(9), 4928 (2020). 10.1364/BOE.400522 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Castanos M. V., Zhou D. B., Linderman R. E., et al. , “Imaging of macrophage-like cells in living human retina using clinical oct,” Invest. Ophthalmol. Visual Sci. 61(6), 48 (2020). 10.1167/iovs.61.6.48 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Mendonça L. S. M., Braun P. X., Martin S. M., et al. , “Repeatability and reproducibility of photoreceptor density measurement in the macula using the spectralis high magnification module,” Ophthalmology Retina 4(11), 1083–1092 (2020). 10.1016/j.oret.2020.04.021 [DOI] [PubMed] [Google Scholar]
- 24.Konstantinou E. K., Mendonça L. S. M., Braun P., et al. , “Retinal imaging using a confocal scanning laser ophthalmoscope-based high-magnification module,” Ophthalmology Retina 5(5), 438–449 (2021). 10.1016/j.oret.2020.08.014 [DOI] [PubMed] [Google Scholar]
- 25.Zhang F., Kovalick K., Raghavendra A., et al. , “In vivo imaging of human retinal ganglion cells using optical coherence tomography without adaptive optics,” Biomed. Opt. Express 15(8), 4675 (2024). 10.1364/BOE.533249 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Mujat M., Ferguson R. D., Hammer D. X., et al. , “high-resolution retinal imaging: technology overview and applications,” Photonics 11(6), 522 (2024). 10.3390/photonics11060522 [DOI] [Google Scholar]
- 27.Williams D. R., Burns S. A., Miller D. T., et al. , “Evolution of adaptive optics retinal imaging [Invited],” Biomed. Opt. Express 14(3), 1307 (2023). 10.1364/BOE.485371 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Liu Z., Zhang F., Zucca K., et al. , “Ultrahigh-speed multimodal adaptive optics system for microscopic structural and functional imaging of the human retina,” Biomed. Opt. Express 13(11), 5860 (2022). 10.1364/BOE.462594 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.DuBose T., Nankivil D., LaRocca F., et al. , “Handheld adaptive optics scanning laser ophthalmoscope,” Optica 5(9), 1027 (2018). 10.1364/OPTICA.5.001027 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Liu Z., Tam J., Saeedi O., et al. , “Trans-retinal cellular imaging with multimodal adaptive optics,” Biomed. Opt. Express 9(9), 4246 (2018). 10.1364/BOE.9.004246 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Liu Z., Kurokawa K., Zhang F., et al. , “Imaging and quantifying ganglion cells and other transparent neurons in the living human retina,” Proc. Natl. Acad. Sci. U. S. A. 114(48), 12803–12808 (2017). 10.1073/pnas.1711734114 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Scoles D., Sulai Y. N., Dubra A., “In vivo dark-field imaging of the retinal pigment epithelium cell mosaic,” Biomed. Opt. Express 4(9), 1710 (2013). 10.1364/BOE.4.001710 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Baraas R. C., Pedersen H. R., Knoblauch K., et al. , “Human foveal cone and RPE cell topographies and their correspondence with foveal shape,” Invest. Ophthalmol. Visual Sci. 63(2), 8 (2022). 10.1167/iovs.63.2.8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Gray D. C., Merigan W., Wolfing J. I., et al. , “In vivo fluorescence imaging of primate retinal ganglion cells and retinal pigment epithelial cells,” Opt. Express 14(16), 7144 (2006). 10.1364/OE.14.007144 [DOI] [PubMed] [Google Scholar]
- 35.Morgan J. I. W., Dubra A., Wolfe R., et al. , “In vivo autofluorescence imaging of the human and macaque retinal pigment epithelial cell mosaic,” Invest. Ophthalmol. Visual Sci. 50(3), 1350 (2009). 10.1167/iovs.08-2618 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Rossi E. A., Rangel-Fonseca P., Parkins K., et al. , “In vivo imaging of retinal pigment epithelium cells in age related macular degeneration,” Biomed. Opt. Express 4(11), 2527 (2013). 10.1364/BOE.4.002527 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Roorda A., Zhang Y., Duncan J. L., “High-resolution in vivo imaging of the RPE mosaic in eyes with retinal disease,” Invest. Ophthalmol. Visual Sci. 48(5), 2297 (2007). 10.1167/iovs.06-1450 [DOI] [PubMed] [Google Scholar]
- 38.Granger C. E., Yang Q., Song H., et al. , “Human retinal pigment epithelium: in vivo cell morphometry, multispectral autofluorescence, and relationship to cone mosaic,” Invest. Ophthalmol. Visual Sci. 59(15), 5705 (2018). 10.1167/iovs.18-24677 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Jung H., Liu T., Liu J., et al. , “Combining multimodal adaptive optics imaging and angiography improves visualization of human eyes with cellular-level resolution,” Commun. Biol. 1(1), 189 (2018). 10.1038/s42003-018-0190-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Jung H., Liu J., Liu T., et al. , “Longitudinal adaptive optics fluorescence microscopy reveals cellular mosaicism in patients,” JCI Insight 4(6), e124904 (2019). 10.1172/jci.insight.124904 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Tam J., Liu J., Dubra A., et al. , “In vivo imaging of the human retinal pigment epithelial mosaic using adaptive optics enhanced indocyanine green ophthalmoscopy,” Invest. Ophthalmol. Visual Sci. 57(10), 4376 (2016). 10.1167/iovs.16-19503 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Kowalczuk L., Dornier R., Kunzi M., et al. , “In vivo retinal pigment epithelium imaging using transscleral optical imaging in healthy eyes,” Ophthalmology Science 3(1), 100234 (2023). 10.1016/j.xops.2022.100234 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Laforest T., Künzi M., Kowalczuk L., et al. , “Transscleral optical phase imaging of the human retina,” Nat. Photonics 14(7), 439–445 (2020). 10.1038/s41566-020-0608-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Liu Z., Kocaoglu O. P., Miller D. T., “3D imaging of retinal pigment epithelial cells in the living human retina,” Invest. Ophthalmol. Visual Sci. 57(9), OCT533 (2016). 10.1167/iovs.16-19106 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Liu Z., Kurokawa K., Hammer D. X., et al. , “In vivo measurement of organelle motility in human retinal pigment epithelial cells,” Biomed. Opt. Express 10(8), 4142 (2019). 10.1364/BOE.10.004142 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Bower A. J., Liu T., Aguilera N., et al. , “Integrating adaptive optics-SLO and OCT for multimodal visualization of the human retinal pigment epithelial mosaic,” Biomed. Opt. Express 12(3), 1449 (2021). 10.1364/BOE.413438 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Shirazi M. F., Brunner E., Laslandes M., et al. , “Visualizing human photoreceptor and retinal pigment epithelium cell mosaics in a single volume scan over an extended field of view with adaptive optics optical coherence tomography,” Biomed. Opt. Express 11(8), 4520 (2020). 10.1364/BOE.393906 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Bhutto I., Lutty G., “Understanding age-related macular degeneration (AMD): Relationships between the photoreceptor/retinal pigment epithelium/Bruch’s membrane/choriocapillaris complex,” Mol. Aspects Med. 33(4), 295–317 (2012). 10.1016/j.mam.2012.04.005 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Cuenca N., Fernández-Sánchez L., Campello L., et al. , “Cellular responses following retinal injuries and therapeutic approaches for neurodegenerative diseases,” Prog. Retinal Eye Res. 43, 17–75 (2014). 10.1016/j.preteyeres.2014.07.001 [DOI] [PubMed] [Google Scholar]
- 50.Liu Z., Aghayee S., Soltanian-Zadeh S., et al. , “Microscopic imaging of photoreceptor-RPE-choriocapillaris complex in late-onset retinal degeneration with 3.4 MHz AO-OCT,” Invest. Ophthalmol. Visual Sci. 64(8), 1972 (2023). [Google Scholar]
- 51.Aguilera N., Liu T., Bower A. J., et al. , “Widespread subclinical cellular changes revealed across a neural-epithelial-vascular complex in choroideremia using adaptive optics,” Commun. Biol. 5(1), 893 (2022). 10.1038/s42003-022-03842-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Cunefare D., Cooper R. F., Higgins B., et al., “Automatic detection of cone photoreceptors in split detector adaptive optics scanning light ophthalmoscope images,” Biomed. Opt. Express 7(5), 2036 (2016). 10.1364/BOE.7.002036
- 53. Cunefare D., Langlo C. S., Patterson E. J., et al., “Deep learning based detection of cone photoreceptors with multimodal adaptive optics scanning light ophthalmoscope images of achromatopsia,” Biomed. Opt. Express 9(8), 3740 (2018). 10.1364/BOE.9.003740
- 54. Cunefare D., Huckenpahler A. L., Patterson E. J., et al., “RAC-CNN: multimodal deep learning based automatic detection and classification of rod and cone photoreceptors in adaptive optics scanning light ophthalmoscope images,” Biomed. Opt. Express 10(8), 3815 (2019). 10.1364/BOE.10.003815
- 55. Morgan J. I. W., Chen M., Huang A. M., et al., “Cone identification in choroideremia: repeatability, reliability, and automation through use of a convolutional neural network,” Trans. Vis. Sci. Tech. 9(2), 40 (2020). 10.1167/tvst.9.2.40
- 56. Liu J., Shen C., Aguilera N., et al., “Active cell appearance model induced generative adversarial networks for annotation-efficient cell segmentation and identification on adaptive optics retinal images,” IEEE Trans. Med. Imaging 40(10), 2820–2831 (2021). 10.1109/TMI.2021.3055483
- 57. Zhou M., Doble N., Choi S. S., et al., “Using deep learning for the automated identification of cone and rod photoreceptors from adaptive optics imaging of the human retina,” Biomed. Opt. Express 13(10), 5082 (2022). 10.1364/BOE.470071
- 58. Heisler M., Ju M. J., Bhalla M., et al., “Automated identification of cone photoreceptors in adaptive optics optical coherence tomography images using transfer learning,” Biomed. Opt. Express 9(11), 5353 (2018). 10.1364/BOE.9.005353
- 59. Wells-Gray E. M., Choi S. S., Ohr M., et al., “Photoreceptor identification and quantitative analysis for the detection of retinal disease in AO-OCT imaging,” in Ophthalmic Technologies XXIX, Manns F., Söderberg P. G., Ho A., eds. (SPIE, 2019), p. 22.
- 60. Soltanian-Zadeh S., Kurokawa K., Liu Z., et al., “Weakly supervised individual ganglion cell segmentation from adaptive optics OCT images for glaucomatous damage assessment,” Optica 8(5), 642 (2021). 10.1364/OPTICA.418274
- 61. Soltanian-Zadeh S., Liu Z., Liu Y., et al., “Deep learning-enabled volumetric cone photoreceptor segmentation in adaptive optics optical coherence tomography images of normal and diseased eyes,” Biomed. Opt. Express 14(2), 815 (2023). 10.1364/BOE.478693
- 62. Zhou M., Zhang Y., Karimi A., et al., “Reducing manual labeling requirements and improved retinal ganglion cell identification in 3D AO-OCT volumes using semi-supervised learning,” Biomed. Opt. Express 15(8), 4540 (2024). 10.1364/BOE.526053
- 63. Chiu S. J., Toth C. A., Bowes Rickman C., et al., “Automatic segmentation of closed-contour features in ophthalmic images using graph theory and dynamic programming,” Biomed. Opt. Express 3(5), 1127 (2012). 10.1364/BOE.3.001127
- 64. Yu H., Wang F., Teodoro G., et al., “MultiHeadGAN: A deep learning method for low contrast retinal pigment epithelium cell segmentation with fluorescent flatmount microscopy images,” Comput. Biol. Med. 146, 105596 (2022). 10.1016/j.compbiomed.2022.105596
- 65. Yu H., Wang F., Teodoro G., et al., “Self-supervised semantic segmentation of retinal pigment epithelium cells in flatmount fluorescent microscopy images,” Bioinformatics 39(4), btad191 (2023). 10.1093/bioinformatics/btad191
- 66. Rangel-Fonseca P., Gómez-Vieyra A., Malacara-Hernández D., et al., “Automated segmentation of retinal pigment epithelium cells in fluorescence adaptive optics images,” J. Opt. Soc. Am. A 30(12), 2595 (2013). 10.1364/JOSAA.30.002595
- 67. Liu J., Han Y.-J., Liu T., et al., “Spatially aware dense-LinkNet based regression improves fluorescent cell detection in adaptive optics ophthalmic images,” IEEE J. Biomed. Health Inform. 24(12), 3520–3528 (2020). 10.1109/JBHI.2020.3004271
- 68. Caetano Dos Santos F. L., Laforest T., Künzi M., et al., “Fully automated detection, segmentation, and analysis of in vivo RPE single cells,” Eye 35(5), 1473–1481 (2021). 10.1038/s41433-020-1036-4
- 69. Kurokawa K., Crowell J. A., Do N., et al., “Multi-reference global registration of individual A-lines in adaptive optics optical coherence tomography retinal images,” J. Biomed. Opt. 26(01), 016001 (2021). 10.1117/1.JBO.26.1.016001
- 70. Liu Z., Kocaoglu O. P., Miller D. T., “In-the-plane design of an off-axis ophthalmic adaptive optics system using toroidal mirrors,” Biomed. Opt. Express 4(12), 3007 (2013). 10.1364/BOE.4.003007
- 71. Kocaoglu O. P., Turner T. L., Liu Z., et al., “Adaptive optics optical coherence tomography at 1 MHz,” Biomed. Opt. Express 5(12), 4186 (2014). 10.1364/BOE.5.004186
- 72. Liu Z., Aghayee S., Soltanian-Zadeh S., et al., “Quantification of human photoreceptor—retinal pigment epithelium macular topography with adaptive optics-optical coherence tomography,” Diagnostics 14(14), 1518 (2024). 10.3390/diagnostics14141518
- 73. Huang G., Liu Z., Van Der Maaten L., et al., “Densely connected convolutional networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2017), pp. 2261–2269.
- 74. Odena A., Dumoulin V., Olah C., “Deconvolution and checkerboard artifacts,” Distill 1(10) (2016).
- 75. Anaya-Isaza A., Zequera-Diaz M., “Fourier transform-based data augmentation in deep learning for diabetic foot thermograph classification,” Biocybernetics and Biomedical Engineering 42(2), 437–452 (2022). 10.1016/j.bbe.2022.03.001
- 76. Oppenheim A. V., Lim J. S., “The importance of phase in signals,” Proc. IEEE 69(5), 529–541 (1981). 10.1109/PROC.1981.12022
- 77. Yang Z., Farsiu S., “Directional connectivity-based segmentation of medical images,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2023), pp. 11525–11535.
- 78. Yang Z., Soltanian-Zadeh S., Farsiu S., “BiconNet: An edge-preserved connectivity-based approach for salient object detection,” Pattern Recognition 121, 108231 (2022). 10.1016/j.patcog.2021.108231
- 79. Stringer C., Wang T., Michaelos M., et al., “Cellpose: a generalist algorithm for cellular segmentation,” Nat. Methods 18(1), 100–106 (2021). 10.1038/s41592-020-01018-x
- 80. Das V., Zhang F., Bower A. J., et al., “Revealing speckle obscured living human retinal cells with artificial intelligence assisted adaptive optics optical coherence tomography,” Commun. Med. 4(1), 68 (2024). 10.1038/s43856-024-00483-1
- 81. Soltanian-Zadeh S., Kovalick K., Aghayee S., et al., “Identifying retinal pigment epithelium cells in adaptive optics-optical coherence tomography images with partial annotations and superhuman accuracy,” GitHub (2024), https://github.com/soltanianzadeh/AOOCT-RPE-cell-detection.
Data Availability Statement
The code for our algorithm, the imaging datasets, and the manual expert labels underlying the results presented in this paper are available online [81].