Abstract
Background
This study aimed to evaluate the effectiveness of DeepLabv3+ with Squeeze-and-Excitation (DeepLabv3+SE) architectures for segmenting the choroid in optical coherence tomography (OCT) images of patients with diabetic retinopathy.
Methods
A total of 300 B-scans were selected from 21 patients with mild to moderate diabetic retinopathy. Six DeepLabv3+SE variants, each utilizing a different pre-trained convolutional neural network (CNN) for feature extraction, were compared. Segmentation performance was assessed using the Jaccard index, Dice score (DSC), precision, recall, and F1-score. Binarization and Bland-Altman analysis were employed to evaluate the agreement between automated and manual measurements of choroidal area, luminal area (LA), and Choroidal Vascularity Index (CVI).
Results
DeepLabv3+SE with EfficientNetB0 achieved the highest segmentation performance, with a Jaccard index of 95.47, DSC of 98.29, precision of 98.80, recall of 97.41, and F1-score of 98.10 on the validation set. Bland-Altman analysis indicated good agreement between automated and manual measurements of LA and CVI.
Conclusions
DeepLabv3+SE with EfficientNetB0 demonstrates promise for accurate choroid segmentation in OCT images. This approach offers a potential solution for automated CVI calculation in diabetic retinopathy patients. Further evaluation of the proposed method on a larger and more diverse dataset can strengthen its generalizability and clinical applicability.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12880-024-01459-2.
Keywords: Convolutional neural network, Transfer learning, Choroidal vascular index, Choroid segmentation, Optical coherence tomography
Introduction
The choroid, a highly vascular layer between the retina and sclera, plays a critical role in ocular health by supplying blood to the outer retina and optic nerve [1]. Changes in choroidal structure are strongly linked to various eye diseases [2, 3]. Therefore, understanding these changes alongside retinal information is crucial for disease diagnosis and for assessing treatment response.
Optical coherence tomography (OCT) offers high-resolution, non-invasive imaging of retinal and choroidal layers [4]. However, current OCT imaging techniques encounter limitations in choroidal segmentation due to non-uniform brightness, weak choroid-sclera interface (CSI), and low contrast between vasculature and stroma [5]. Manual segmentation is also labor-intensive and subjective, leading to inconsistencies between clinicians [6].
Enhanced Depth Imaging (EDI) and swept-source OCT (SS-OCT) have revolutionized choroidal imaging by significantly improving the visualization of choroidal structures. EDI-OCT is an imaging method that utilizes the enhanced depth of field resulting from the inverted picture acquired by positioning a spectral-domain OCT (SD-OCT) device in close proximity to the eye. SS-OCT utilizes a longer wavelength that can penetrate deeper into the choroid, and offers faster acquisition times [7–9]. Studies have demonstrated that SS-OCT provides superior clarity of the CSI, with 100% visibility compared to SD-OCT [10, 11]. These advancements have established EDI and SS-OCT as superior techniques for detailed choroidal analysis, particularly in cases where conventional SD-OCT without EDI falls short.
Deep learning models for choroid segmentation in OCT images encounter various limitations, including the need for large annotated datasets, which are scarce, particularly those with well-defined CSI [12]. These models can also struggle with generalization due to variations in image acquisition settings across different OCT devices and protocols. Image quality issues, such as noise, shadows, and artifacts, further complicate segmentation tasks and often require pre-processing, which can introduce variability [13]. Additionally, the black-box nature of deep learning models limits interpretability, making it difficult to understand and trust their decisions in clinical settings [14]. These challenges underscore the need for ongoing research to develop more robust, interpretable, and generalizable deep learning approaches for choroid segmentation in OCT images.
While retinal layer segmentation methods have been developed for OCT images, their performance often falls short in real-world scenarios [13]. Transfer learning, a deep learning approach where a pre-trained model is adapted to a new task, offers potential for improved choroidal segmentation [15]. This study aims to train and validate various pre-trained transfer learning models on our OCT dataset. We will compare their performance to identify the best model for accurate choroidal area segmentation and choroidal vascularity index (CVI) measurement. Additionally, we will evaluate the proposed method against existing segmentation techniques.
By providing accurate and reproducible measurements, automated choroidal segmentation algorithms can facilitate faster, more objective analysis for early disease detection and monitoring [16]. This research not only advances technological capabilities but also holds significant promise for improving patient care by enabling more precise diagnoses and personalized treatment strategies.
Materials and methods
Main objective and methodology overview
This study's primary objective is to leverage a convolutional neural network (CNN) model for the segmentation of the choroid area in OCT images. To achieve this goal, the following sections will detail the methodology employed:
Data Preprocessing: This section will elucidate the various methods utilized to enhance the OCT image dataset before the primary processing stage.
Model Selection: We will introduce and evaluate the performance of different models for choroidal segmentation, ultimately identifying the most effective model for our task.
EfficientNetB0 Model: Following the model selection process, we will present a detailed description of the chosen model, EfficientNetB0, which serves as the backbone of our approach.
Dataset
This study utilizes a dataset obtained from the Farabi Eye Hospital at Tehran University of Medical Sciences, Tehran, Iran. The dataset comprises EDI-OCT images captured using the RTVue XR 100 Avanti device (Optovue, Inc., USA). Each 8 mm × 12 mm volume scan contains 25 B-scans. A total of 300 B-scans were selected from 21 patients with varying stages of diabetic retinopathy. B-scans with significant motion artifacts or poor image quality were excluded, while the remaining scans were chosen based on their clear depiction of the choroidal and retinal layers. In accordance with the study's exclusion criteria, no patients with diabetic macular edema (DME) were included, and diabetic retinopathy severity ranged from mild to moderate non-proliferative diabetic retinopathy (NPDR).
Pre-processing
Pre-processing of the dataset was crucial to improve data quality and ensure compatibility with subsequent stages of the analysis. This section details the pre-processing steps implemented:
Optic Nerve Removal: During pre-processing, the optic nerve region was removed from the images to create a suitable training set for the network.
Resizing: Images were resized to a standard dimension for consistency within the network.
Normalization: Normalization techniques were applied to standardize the intensity values of the images.
Augmentation: Data augmentation techniques were employed to artificially increase the dataset size and improve model generalizability.
Subsequent sections will provide a detailed explanation of each pre-processing step.
Removal of the optic nerve
During the pre-processing stage, removing the optic nerve from the images required locating the upper border of the choroid. A combination of edge detection, morphological operations, and thresholding was employed to identify the optic nerve region and remove it from the images.
To locate the Bruch membrane (BM), we first applied a 3 × 3 median filter and then a Gabor filter. The Gabor filter [17] used in this section is given by:

$$g(x, y; \lambda, \theta, \Psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \Psi\right)$$

where

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta$$

Here, λ is the wavelength of the sinusoidal component, whose value is equal to […]; θ is the orientation of the normal to the parallel stripes of the Gabor function and is set to […]; Ψ is the phase offset of the sinusoidal function and is set to 10.9; σ is the standard deviation (SD) of the Gaussian envelope, whose value is 9; and γ is the spatial aspect ratio, which specifies the ellipticity of the support of the Gabor function and is equal to 5.
After applying the Gabor filter with these parameters, we used morphological operators to isolate the BM layer, which allowed the optic nerve area to be separated from the rest of the image. Once this pre-processing was complete, the images were ready for training the target network. Fig. 1 (a-c) illustrates these steps for a left eye (OS), and Fig. 1 (d-f) illustrates them for a right eye (OD).
Fig. 1.
Image preparation steps for model training (a) Original left eye image, (b) BM segmentation, (c) Optic nerve removal based on BM, (d) Original right eye image, (e) BM identification, (f) Optic nerve removal from the original image
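The filtering stage described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the values σ = 9, γ = 5, and Ψ = 10.9 come from the text, whereas the wavelength `lam`, the orientation `theta`, the kernel size `ksize`, and the use of the real-valued (cosine) form of the Gabor function are assumptions made only so that the example runs. The function names are hypothetical, and the morphological step that follows is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median3x3(img):
    """3 x 3 median filter with edge replication at the borders."""
    padded = np.pad(img, 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))
    return np.median(windows, axis=(-2, -1))


def gabor_kernel(ksize, sigma, theta, lam, gamma, psi):
    """Real-valued Gabor kernel sampled on an odd ksize x ksize grid."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the sampling grid by theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / lam + psi)
    return envelope * carrier


def correlate_same(img, kernel):
    """Same-size 2-D correlation (kernel not flipped) with edge replication."""
    half = kernel.shape[0] // 2
    padded = np.pad(img, half, mode="edge")
    windows = sliding_window_view(padded, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


# From the text: sigma = 9, gamma = 5, psi = 10.9.
# ASSUMED for illustration only: ksize, theta, lam.
kernel = gabor_kernel(ksize=15, sigma=9.0, theta=np.pi / 2,
                      lam=10.0, gamma=5.0, psi=10.9)

rng = np.random.default_rng(0)
bscan = rng.random((32, 64))  # synthetic stand-in for a B-scan
response = correlate_same(median3x3(bscan), kernel)
```

The median filter runs first so that speckle noise is suppressed before the oriented Gabor response is computed.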
Resizing
To ensure compatibility with the CNN architecture, all images in the dataset were resized to a uniform dimension of 128 × 512 × 1. This standardization allows the network to efficiently process the input data.
Normalization
Normalization was performed to standardize the intensity values of the images within a range of 0 to 1. This step involved dividing each pixel value by 255. Normalization facilitates improved training convergence and reduces the impact of varying intensity scales within the dataset.
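A minimal sketch of the resizing and normalization steps is given below. The target size of 128 × 512 × 1 and the division by 255 come from the text; the nearest-neighbour interpolation and the input size used in the example are assumptions, since the interpolation method and the native B-scan resolution are not stated.

```python
import numpy as np


def resize_nearest(img, out_h=128, out_w=512):
    """Nearest-neighbour resize of a 2-D image (interpolation is an assumption)."""
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows[:, None], cols[None, :]]


def preprocess(img_uint8):
    """Resize to 128 x 512, scale intensities to [0, 1], add a channel axis."""
    resized = resize_nearest(img_uint8)
    scaled = resized.astype(np.float32) / 255.0
    return scaled[..., np.newaxis]


rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(640, 1024), dtype=np.uint8)  # hypothetical size
net_input = preprocess(raw)
```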
Data augmentation
Data augmentation techniques were implemented to artificially expand the training dataset. This strategy is particularly beneficial for datasets with limited samples, as it helps to improve model generalizability and prevent overfitting. The following random augmentation techniques were applied:
Rotation: Images were randomly rotated within a range of 0° to 15°.
Shift and Zoom: Images underwent random horizontal and vertical shifts within 10% of their width and height, along with a 10% zoom.
Mirroring: Images were randomly flipped along the horizontal axis.
These augmentation techniques effectively increased the dataset size and diversity, leading to a more robust and generalizable model.
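The augmentations listed above can be sketched as one random affine transform applied identically to an image and its label mask. This is an illustrative NumPy version, not the pipeline used in the study. The ranges (0° to 15° rotation, shifts within 10%, 10% zoom) come from the text; interpreting the zoom as a factor between 0.9 and 1.1, interpreting the mirroring as a left-right flip with probability 0.5, and using nearest-neighbour sampling with zero fill are all assumptions.

```python
import numpy as np


def random_affine_pair(image, mask, rng, max_rot_deg=15.0,
                       max_shift=0.10, max_zoom=0.10):
    """Apply the same random rotation, shift, zoom and flip to image and mask."""
    h, w = image.shape
    angle = np.deg2rad(rng.uniform(0.0, max_rot_deg))
    zoom = rng.uniform(1.0 - max_zoom, 1.0 + max_zoom)
    dy = rng.uniform(-max_shift, max_shift) * h
    dx = rng.uniform(-max_shift, max_shift) * w
    flip = rng.random() < 0.5

    # Output pixel grid, expressed relative to the shifted image centre.
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    yc, xc = (h - 1) / 2.0, (w - 1) / 2.0
    y0, x0 = yy - yc - dy, xx - xc - dx

    # Inverse mapping: undo rotation and zoom to find each source pixel.
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    src_y = np.rint((cos_a * y0 + sin_a * x0) / zoom + yc).astype(int)
    src_x = np.rint((-sin_a * y0 + cos_a * x0) / zoom + xc).astype(int)
    inside = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)

    def warp(arr):
        out = np.zeros_like(arr)  # pixels mapped from outside stay zero
        out[inside] = arr[src_y[inside], src_x[inside]]
        return out[:, ::-1] if flip else out

    return warp(image), warp(mask)


rng = np.random.default_rng(0)
img = rng.random((128, 512))
lbl = (img > 0.5).astype(np.uint8)
aug_img, aug_lbl = random_affine_pair(img, lbl, rng)
```

Warping the mask with the same sampled parameters keeps the labels aligned with the transformed pixels, which is essential for segmentation training.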
Proposed method
Squeeze and excitation block (SE-Block)
The SE-block [18] is designed to prioritize crucial feature maps while diminishing the significance of less important ones. This enhancement strengthens the model's capacity to discern the target object within the input image, potentially leading to improved segmentation outcomes. The SE-block is incorporated following the utilization of three convolutional excitations, accompanied by two activation functions: Rectified Linear Unit (ReLU) and a sigmoid activation function (Fig. 2).
Fig. 2.
This block diagram illustrates a method for enhancing feature maps within a deep learning model. SENets consist of three stages: squeeze, excitation, and scale. The squeeze stage extracts global spatial information by applying global average pooling to each feature map, reducing an input of size H × W × C to a channel-wise descriptor of size 1 × 1 × C, where H and W represent the height and width of the input feature map, and C denotes the number of channels (typically corresponding to the number of filters used in the previous convolutional layer). The excitation stage refines this descriptor by employing two fully-connected layers with a ReLU activation function in between, followed by a sigmoid activation function. The final output of the excitation stage scales the original feature maps, highlighting important features and suppressing less significant ones [18]. This process strengthens the model's capacity to discriminate between objects within the input image, potentially leading to improved segmentation outcomes in medical imaging applications
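The squeeze, excitation, and scale stages can be illustrated with a short NumPy forward pass. This is a sketch under stated assumptions: the channel count, the reduction ratio of 4, and the random weights are hypothetical and stand in for learned parameters.

```python
import numpy as np


def se_block(features, w1, b1, w2, b2):
    """Forward pass of a squeeze-and-excitation block on an (H, W, C) tensor."""
    # Squeeze: global average pooling gives one value per channel, shape (C,).
    squeezed = features.mean(axis=(0, 1))
    # Excitation: fully-connected -> ReLU -> fully-connected -> sigmoid.
    hidden = np.maximum(0.0, squeezed @ w1 + b1)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))
    # Scale: reweight every channel of the input by its gate in (0, 1).
    return features * gates, gates


rng = np.random.default_rng(0)
channels, reduction = 16, 4  # hypothetical sizes
feats = rng.random((8, 32, channels))
w1 = rng.normal(size=(channels, channels // reduction))
b1 = np.zeros(channels // reduction)
w2 = rng.normal(size=(channels // reduction, channels))
b2 = np.zeros(channels)
scaled, gates = se_block(feats, w1, b1, w2, b2)
```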
EfficientNetB0: a composite scaling approach for CNNs
CNNs have become a mainstay in medical image analysis tasks. However, a key challenge lies in balancing model accuracy with computational efficiency. EfficientNetB0 addresses this issue by introducing a novel composite (compound) scaling technique. Unlike traditional methods that scale depth, width, and resolution independently, EfficientNetB0 utilizes a composite scaling factor in conjunction with specific scaling coefficients. This approach allows for the uniform expansion of all three network dimensions, ensuring balanced resource allocation and potentially leading to improved performance. The core architecture of EfficientNetB0 leverages MobileNetV2's inverted bottleneck residual block, further enhanced by the integration of SE-blocks (Fig. 3). This combination of architectural elements contributes to EfficientNetB0's ability to achieve high accuracy while maintaining computational efficiency.
Fig. 3.
This schematic illustrates the architecture of the EfficientNetB0 CNN model. EfficientNetB0 employs a composite scaling approach, utilizing a composite scaling factor and specific scaling coefficients to uniformly scale the network’s depth, width, and resolution. This approach differs from traditional scaling methods that adjust these dimensions independently. The EfficientNetB0 architecture incorporates MobileNetV2’s inverted bottleneck residual blocks and integrates SE-blocks for enhanced performance
EfficientNetB0 was selected as the backbone network due to its efficient architecture and composite scaling approach, which allows for a balanced expansion of depth, width, and resolution. This approach helps to optimize the model's performance while maintaining computational efficiency.
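For reference, the scaling rule from the original EfficientNet formulation can be written as below. The symbols are those of that formulation and are not defined elsewhere in this paper: d, w, and r are the depth, width, and resolution multipliers, φ is the single composite coefficient, and α, β, γ are constants found by a small grid search (γ here is unrelated to the Gabor aspect ratio used earlier). EfficientNetB0 is the base network of the family, to which these multipliers are applied to obtain the larger variants.

```latex
d = \alpha^{\phi}, \qquad w = \beta^{\phi}, \qquad r = \gamma^{\phi},
\qquad \text{subject to } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
\quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
```

Under this constraint, increasing φ by one roughly doubles the floating-point operations of the network.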
Atrous Spatial Pyramid Pooling (ASPP) block
DeepLabv3+ with Squeeze-and-Excitation (DeepLabv3+SE) incorporates an ASPP block to capture multiscale contextual information crucial for semantic segmentation tasks. Unlike standard DeepLabv3+, which utilizes standard convolutions alongside pooling operations, the DeepLabv3+SE network employs ASPP modules with dilation rates of 1, 6, 12, and 18 combined with depth-wise and point-wise convolutions. This approach reduces computational complexity compared to using standard convolutions.
The ASPP module operates by sampling the input feature map at varying rates as defined by the dilation rate "r." The atrous convolution underlying the ASPP module is:

$$y[i] = \sum_{k} x[i + r \cdot k]\, w[k]$$

where "x" represents the input signal, "w" signifies the filter, "y" is the output, and "r" denotes the dilation rate. When "r = 1," the operation becomes a standard convolution. Atrous convolution, a core component of the ASPP block, extends the receptive field of the convolution kernel without increasing the number of parameters or computations. This is achieved by introducing zeros between filter values (Fig. 4).
Fig. 4.
Structure of the ASPP module used in DeepLabv3+SE. This module consists of two stages, including (a) Atrous convolution and (b) Image Pooling, and produces the final output using a convolution layer after concatenating the feature maps
In DeepLabv3+SE, the ASPP block is applied to high-level features extracted from the sixth layer of the pre-trained EfficientNetB0 model (Fig. 4). This placement allows the network to capture multiscale characteristics of choroid areas of varying sizes. The ASPP block uses four dilation rates (1, 6, 12, and 18) with corresponding kernel sizes (1, 3, 3, and 3). The resulting feature maps are then concatenated and fused by a 1 × 1 convolution.
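A one-dimensional sketch of atrous convolution shows how the dilation rate widens the receptive field while the number of filter weights stays fixed. The function name and the toy signal are illustrative; the rates 6, 12, and 18 with a 3-tap filter mirror the configuration described above.

```python
import numpy as np


def atrous_conv1d(x, w, rate):
    """Valid-mode 1-D atrous convolution: y[i] = sum_k x[i + rate * k] * w[k]."""
    k = len(w)
    span = rate * (k - 1)          # distance between first and last tap
    n_out = len(x) - span
    y = np.zeros(n_out)
    for i in range(n_out):
        for j in range(k):
            y[i] += x[i + rate * j] * w[j]
    return y


x = np.arange(10.0)
w = np.array([1.0, 0.0, -1.0])

y_rate1 = atrous_conv1d(x, w, rate=1)   # same as an ordinary 3-tap filter
y_rate2 = atrous_conv1d(x, w, rate=2)   # same 3 weights, taps 2 samples apart

# Receptive field of a 3-tap filter at each dilation rate.
receptive_fields = {r: r * (len(w) - 1) + 1 for r in (6, 12, 18)}
```

With three weights, the receptive field grows from 3 samples at rate 1 to 13, 25, and 37 samples at rates 6, 12, and 18, which is what lets the parallel ASPP branches see context at several scales.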
DeepLabv3+ network architecture
DeepLabv3+ is a deep learning model designed for semantic segmentation tasks. It builds upon DeepLabv3 by incorporating a decoder module specifically tailored to refine segmentation results, particularly at object boundaries. DeepLabv3+ leverages an encoder-decoder architecture. The encoder network is responsible for feature extraction and dimensionality reduction of feature maps. Conversely, the decoder module focuses on restoring edge information and increasing feature map resolution to achieve accurate semantic segmentation.
To preserve feature map resolution while expanding the receptive field, the DeepLabv3+ encoder substitutes standard convolutions in the final layers with atrous convolutions. Atrous convolutions, also employed within the ASPP module of DeepLabv3+SE, utilize varying dilation rates to capture multiscale semantic context information from the input image. These architectural enhancements contribute to DeepLabv3+SE’s ability to deliver precise semantic segmentation performance across diverse datasets (Fig. 5).
Fig. 5.
DeepLabv3+ network architecture. This schematic illustrates the DeepLabv3+ network architecture, a deep learning model for semantic segmentation tasks. DeepLabv3+ incorporates an encoder-decoder structure, where the encoder extracts features and reduces dimensionality, while the decoder refines segmentation results, particularly at object boundaries. To expand the receptive field while preserving feature map resolution, atrous convolutions are employed in the later stages of the encoder. The ASPP module, also utilizing atrous convolutions with varying dilation rates, captures multiscale semantic context information
Our previous study showed that DeepLabv3+SE and EfficientNetB0 significantly improve segmentation accuracy and computational efficiency. EfficientNetB0’s balanced scaling of network dimensions allows for better accuracy with fewer parameters, while SE-blocks enhance the model’s ability to focus on important image regions [19]. We hypothesized that this combination could improve performance, particularly in complex areas of the choroid, and reduce the computational load, making the model more suitable for real-time applications and deployment in environments with limited resources.
In a comparative study of well-known models such as U-Net, U-Net++, LinkNet, PSPNet, and DeepLabv3+, the DeepLabv3+SE model emerged as the most promising architecture. Supplementary material 1 compares these models on various criteria.
The DeepLabv3+SE architecture with EfficientNetB0 is expected to offer several improvements in choroid segmentation in OCT imaging:
Increased Segmentation Precision: By leveraging the strengths of DeepLabv3+SE and EfficientNetB0, we anticipate a significant increase in segmentation precision, as measured by metrics such as Jaccard index, Dice Score (DSC), and F1-score. This improvement is attributed to the model's ability to capture multi-scale contextual information, refine segmentation results at object boundaries, and extract informative features efficiently.
Reduced Computational Resources: EfficientNetB0's efficient architecture and composite scaling approach contribute to a reduction in computational resources required for training and inference. This translates to faster model training times and the ability to deploy the model on devices with limited computational power, such as smartphones or tablets.
Leaner Parameter Configuration: EfficientNetB0's design emphasizes computational efficiency while maintaining high performance. This means that the model can achieve comparable or even superior results with a leaner parameter configuration compared to other state-of-the-art architectures. This can lead to reduced memory usage and faster inference times.
While specific percentage gains in segmentation precision and computational resource reductions may vary depending on the dataset and hardware used, we expect substantial improvements based on the theoretical and empirical evidence supporting the DeepLabv3+SE and EfficientNetB0 architecture.
Model training
Following the pre-processing and choroidal area isolation stages, the DeepLabv3+SE architecture was employed for model training. This network combines the benefits of DeepLabv3+ and SE networks, potentially leading to improved segmentation performance.
The selection of DeepLabv3+SE and EfficientNetB0 was driven by their demonstrated ability to balance segmentation accuracy and computational efficiency. EfficientNetB0’s composite scaling was particularly appealing due to its capability to uniformly scale network dimensions, which improves performance without a significant increase in computational resources. The SE-blocks were incorporated to enhance the model's attention mechanisms, prioritizing critical features in the segmentation task. Hyperparameters, including learning rate, batch size, and the number of epochs, were fine-tuned through extensive cross-validation. The Dice coefficient (DC) was selected as the loss function because it is robust to class imbalance, which is common in medical image segmentation. These choices were validated against multiple baseline models, ensuring that the selected configuration provided the best trade-off between accuracy and efficiency for choroidal segmentation in OCT images.
The choice of loss function plays a crucial role in model optimization. In this study, the DC was chosen as the loss function to guide the training process. Adam, a widely used optimization algorithm, served as the model optimizer. Each convolutional layer within the DeepLabv3+SE architecture utilized 128 filters. Additionally, the ReLU activation function was implemented within the network.
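A common way to turn the Dice coefficient into a trainable loss is to minimize one minus a soft Dice score computed on predicted probabilities. The sketch below assumes that formulation and a small smoothing constant `eps`; neither detail is specified in the text, and the function name is hypothetical.

```python
import numpy as np


def soft_dice_loss(y_true, y_pred, eps=1e-6):
    """1 - soft Dice coefficient for a binary mask and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    intersection = np.sum(y_true * y_pred)
    denominator = np.sum(y_true) + np.sum(y_pred)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice


truth = np.array([[0, 1], [1, 1]])
loss_perfect = soft_dice_loss(truth, truth)          # prediction equals truth
loss_disjoint = soft_dice_loss([1, 0], [0, 1])       # no overlap at all
```

Because the score is normalized by the size of the foreground rather than by the total pixel count, a large background region does not dominate the loss.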
To capture a comprehensive range of image features, the model received input from two layers of the pre-trained EfficientNetB0 model. High-level semantic information was extracted from the sixth layer, while the third layer provided low-level image details. This combination of features potentially contributes to the model's ability to accurately segment the choroid area.
Hyperparameter tuning
The following hyperparameters were tuned through experimentation and validation:
Learning Rate: The learning rate was adjusted to achieve a balance between stability and convergence speed. The optimal learning rate was determined through a grid search method. Based on preliminary experiments, the learning rate was set to 0.001, which gave stable and efficient convergence in this investigation.
Batch Size: The batch size was set to 32 to balance computational resources and training stability. Although larger batch sizes can expedite convergence, they also require a substantial amount of memory, which was a constraint in this investigation.
Number of Epochs: The number of epochs was determined based on the model's convergence behavior and validation performance. Early stopping was implemented to prevent overfitting.
Optimizer: The Adam optimizer was chosen due to its efficiency and robustness.
Loss Function: The DC was used as the loss function, as it is well-suited for medical image segmentation tasks with imbalanced classes.
Hyperparameter tuning was conducted using a grid search or random search approach, where different combinations of hyperparameters were evaluated on a validation set. The best-performing combination of hyperparameters was selected based on metrics such as validation loss and segmentation accuracy.
By carefully considering the model's architecture, its suitability for the task, and the impact of hyperparameters on performance, we were able to select and tune the DeepLabv3+SE model with EfficientNetB0 to achieve the best results for choroid segmentation in OCT images.
Pre-trained transfer learning models for choroid segmentation
This study investigated the efficacy of transfer learning for choroid area segmentation in OCT images. Transfer learning leverages a pre-trained model, initially designed for a different task, as a foundation for a new model focused on the specific problem of interest. Here, pre-trained models were utilized to extract image features, which were subsequently fed into the DeepLabv3+SE architecture for segmentation.
Six pre-trained models were employed: Xception, SE-ResNet50, VGG19, DenseNet121, InceptionResNetV2, and EfficientNetB0. These models offer a variety of architectural elements, including:
Xception: This model is built from depthwise separable convolutions, which reduce the parameter count relative to standard convolutions.
SE-ResNet50: This variant of ResNet incorporates SE-blocks, potentially improving feature extraction.
VGG19: This widely used model offers a deep architecture pre-trained on a massive image dataset, providing a strong understanding of image features.
DenseNet121: This architecture eases the training of deep convolutional networks by adding short, direct connections from each layer to the subsequent layers within a dense block.
InceptionResNetV2: This model combines inception and residual connections, achieving high accuracy on image classification tasks.
EfficientNetB0: This model utilizes a composite scaling method to efficiently scale depth, width, and resolution for optimal performance.
The performance of a segmentation model is directly related to the choice of loss function. In this study, we employed the DC loss, which is well-suited for medical image segmentation tasks with imbalanced classes. The results of the proposed model are visualized in Fig. 6, which shows the OCT scans used, the choroidal area segmented using the automated method, and the choroidal area segmented using the manual method. Figure 7 further illustrates the overlap between the choroidal areas segmented by the two methods, using the same OCT scans with both manual and automated segmentation lines (Figs. 6 and 7).
Fig. 6.
The original images (left column), automated segmentation by our proposed model (center column), and manual segmentation by experts using ImageJ software (right column)
Fig. 7.
The original images (left column), overlapping of borders detection from manual and automated segmentation (center column), and automated segmented area overlay on the original image (right column). The red lines represent the automated segmentation, and the blue lines represent the manual segmentation
Evaluation metrics
This study employed various metrics to evaluate the performance of the proposed DeepLabv3+ architecture for choroid segmentation in OCT images. Addressing the challenge of imbalanced data in medical image segmentation, the study explored several loss functions, including the DC loss. Two- and three-class segmentation approaches were investigated, with the final model utilizing a three-class strategy to account for the distinct textures of the upper and lower choroid. To ensure a reliable CVI estimate, a region of interest (ROI) extending from the optic nerve to the temporal side of the image was considered.
Unsigned boundary localization errors of 3 μm and 20.7 μm were chosen for the BM border and the CSI, respectively. Niblack's auto local thresholding method, established by prior clinical studies, was used to calculate the luminal area (LA) for subsequent CVI calculation.
To assess the agreement between the automated and manual segmentation of the choroid boundaries, we conducted a Bland–Altman analysis. In this analysis, the x-axis represents the average choroidal thickness between the two methods, while the y-axis represents the difference between the manual and automated measurements. The distance between the segmentation lines was measured in pixels (each pixel equals 1.56 microns). By examining the Bland–Altman plot, we were able to visualize the level of agreement between the automated and manual measurements and identify any potential biases in choroid boundary detection.
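The quantities plotted in a Bland–Altman analysis can be computed as in the sketch below. The pixel size of 1.56 µm comes from the text; the use of 1.96 standard deviations for the limits of agreement is the conventional choice and is assumed here, and the measurement values are made up purely for illustration.

```python
import numpy as np

MICRONS_PER_PIXEL = 1.56  # pixel size stated in the text


def bland_altman(manual, automated):
    """Return per-sample means and differences, the bias, and the 95% limits."""
    manual = np.asarray(manual, dtype=float)
    automated = np.asarray(automated, dtype=float)
    mean = (manual + automated) / 2.0      # x-axis of the plot
    diff = manual - automated              # y-axis of the plot
    bias = diff.mean()
    sd = diff.std(ddof=1)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return mean, diff, bias, limits


# Hypothetical boundary positions in pixels, converted to microns.
manual_px = np.array([100.0, 110.0, 120.0, 130.0])
auto_px = np.array([98.0, 111.0, 119.0, 133.0])
mean_um, diff_um, bias_um, limits_um = bland_altman(
    manual_px * MICRONS_PER_PIXEL, auto_px * MICRONS_PER_PIXEL)
```

A bias close to zero with narrow limits indicates that the automated boundary neither systematically over- nor under-estimates the manual one.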
For manual delineation of choroidal boundaries as ground truth, raw OCT images were imported using ImageJ software (http://imagej.nih.gov/ij; accessed on 13 November 2022), which is provided in the public domain by the National Institutes of Health, Bethesda, MD, USA. Choroidal borders of all images were delineated using the polygonal selection tool in the software toolbar. The retinal pigment epithelium (RPE)–BM complex and the CSI were selected as the upper and lower margins of the choroid, respectively. The edge of the optic nerve head and the most temporal border of the image were selected as the nasal and temporal margins of the choroidal area. All manual segmentations were conducted by a skilled grader (E.K) and verified by another independent grader (H.R.E). In case of any disputes, the outlines were segmented by consensus.
According to the method introduced by Sonoda et al., the ground truth for the CVI values was calculated using ImageJ software. For this purpose, the total choroidal area (TCA) was manually selected from the optic nerve to the temporal side of the image. The selected area was added as a ROI with the ROI manager tool. The CVI was calculated in selected images by randomly selecting three sample choroidal vessels with lumens larger than 100 µm using the oval selection tool in the toolbar. The average reflectivity of these areas was determined by the software. The average brightness was set as the minimum value to minimize the noise in the OCT image. Then, the image was converted to 8 bits and adjusted with the auto local threshold of Niblack (using default parameters). The binarized image was reconverted into a red green blue (RGB) image, and the LA was determined using the color threshold tool. The light pixels were defined as the choroidal stroma or interstitial area, and the dark pixels were defined as the LA (Fig. 8). The TCA, LA, and stromal area (SA) were automatically calculated [20].
Fig. 8.
The original images (left column), image binarization using Niblack thresholding method (center column), and binary area overlay on the original image (right column)
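The binarization step and the resulting ratio of luminal to total choroidal area can be sketched as follows. This is an approximation for illustration only and does not reproduce the ImageJ plugin: the square window, the radius of 15 pixels, the coefficient k = -0.2, and the rule that pixels below the local threshold count as lumen are assumptions, and the function names are hypothetical. Areas are expressed in pixels.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def niblack_dark_mask(img, radius=15, k=-0.2):
    """True where a pixel is darker than the local threshold mean + k * std."""
    img = img.astype(float)
    size = 2 * radius + 1
    padded = np.pad(img, radius, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    local_mean = windows.mean(axis=(-2, -1))
    local_std = windows.std(axis=(-2, -1))
    return img < local_mean + k * local_std


def choroidal_vascularity_index(img, choroid_mask, radius=15, k=-0.2):
    """Return (CVI, luminal area, total choroidal area) inside choroid_mask."""
    dark = niblack_dark_mask(img, radius=radius, k=k)
    tca = int(choroid_mask.sum())             # total choroidal area
    la = int((dark & choroid_mask).sum())     # dark (luminal) pixels in the ROI
    return la / tca, la, tca


rng = np.random.default_rng(0)
bscan = rng.integers(0, 256, size=(48, 64)).astype(np.uint8)  # synthetic image
roi = np.zeros(bscan.shape, dtype=bool)
roi[10:40, 5:60] = True                                        # synthetic choroid
cvi, la, tca = choroidal_vascularity_index(bscan, roi)
```

Restricting the count to the segmented choroid is what ties the accuracy of the CVI to the accuracy of the boundary segmentation.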
TCA: In this study, the choroidal area refers to the total cross-sectional area of the choroid as segmented from the OCT images. It includes both the vascular and non-vascular components of the choroid.
LA: The luminal area (choroidal vascular area) is the portion of the choroidal area that is occupied by blood vessels. It is calculated by identifying and quantifying the vascular structures within the segmented choroid.
CVI: The CVI is a ratio that represents the proportion of the choroidal area occupied by blood vessels. It is calculated as:

$$\mathrm{CVI} = \frac{\mathrm{LA}}{\mathrm{TCA}}$$
To evaluate the inter-rater reliability of the CVI measurement, the absolute-agreement model of the intraclass correlation coefficient (ICC) was applied to 20 OCT images that were segmented independently by two graders. An ICC of 0.81–1.00 was taken to indicate good agreement. The ICC for CVI measurement was 0.969, with a 95% confidence interval (CI) of 0.918–0.988, indicating strong agreement between observers.
Performance metrics
The Jaccard index, DSC, precision, recall, and F1 score were employed to quantify the segmentation performance. These metrics range from 0 to 1 (reported as percentages in the tables), with higher values signifying greater overlap between the predicted segmentation and the ground truth (manually segmented) labels.
Jaccard Index: Measures the overlap between the predicted segmentation and the ground truth, ranging from 0 (no overlap) to 1 (perfect overlap).
DSC: Similar to the Jaccard index, it measures the overlap between predicted and ground truth segmentation, with values between 0 and 1, where 1 indicates perfect overlap.
Precision: Represents the proportion of true positive pixels over the total number of pixels identified as positive by the model.
Recall: Represents the proportion of true positive pixels over the total number of actual positive pixels in the image.
F1 Score: Combines precision and recall into a single metric, providing a measure of overall segmentation quality.
The formulas for each metric are:
Jaccard index = (Area of overlap) / (Area of union)
DSC = (2 × Area of overlap) / (Predicted area + Ground-truth area)
Precision = (True positives) / (True positives + False positives)
Recall = (True positives) / (True positives + False negatives)
F1 score = (2 × Precision × Recall) / (Precision + Recall)
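As a hedged illustration of the formulas above, the following sketch computes all five metrics for one pair of binary masks by counting true positives, false positives, and false negatives. The function name, the nested-list mask format, and the example masks are assumptions for illustration; returning 0.0 when a denominator is zero is likewise a convention chosen here, not one stated in the study.

```python
def segmentation_metrics(pred, truth):
    """Compute Jaccard, DSC, precision, recall and F1 for two binary masks.

    pred, truth: 2D lists of 0/1 with identical shape.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted choroid, truly choroid
            elif p:
                fp += 1      # predicted choroid, truly background
            elif t:
                fn += 1      # missed choroid pixel

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "jaccard": ratio(tp, tp + fp + fn),        # overlap / union
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),    # 2*overlap / (|pred| + |truth|)
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical 2 x 3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

On a single pair of binary masks the DSC and the F1 score are algebraically identical, so the sketch returns the same number for both.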
Validation strategy
To assess model generalizability, 20% of the images were allocated for internal validation. Additionally, 50 OCT B-scans from a separate group of diabetic patients without DME, acquired with the same OCT device and imaging protocol, were used for external validation. The five performance metrics described above were used for evaluation, along with Bland–Altman plots to visually assess the accuracy of upper and lower choroid border detection. Manual segmentation of the choroid area was performed by two retinal specialists using ImageJ software for ground truth labeling.
Results
In the present study, we analyzed 300 OCT B-scans from 21 patients, of whom 52% were male (11 patients). The mean age of the patients was 62 years (range, 49 to 75 years). These patients were diagnosed with mild (10 patients) to moderate (11 patients) diabetic retinopathy without DME.
Table 1 compares the performance of the six architectures without utilizing feature maps extracted from pre-trained networks; the best-performing network was selected for the remainder of the study. DeepLabv3 + SE with EfficientNetB0 achieved the highest performance on both the training and validation datasets on all five metrics.
Table 1.
The performance of the six architectures without utilizing feature maps extracted from pre-trained networks
| Data | CNN Architecture | Jaccard Index | DSC | Precision | Recall | F1-Score |
|---|---|---|---|---|---|---|
| Train | DeepLabV3 + SE_Xception | 93.61 | 96.01 | 97.94 | 95.27 | 96.23 |
| Train | DeepLabV3 + SE_seresnet50 | 93.72 | 96.21 | 97.53 | 94.82 | 95.96 |
| Train | DeepLabV3 + SE_VGG19 | 93.91 | 96.44 | 98.24 | 94.52 | 96.73 |
| Train | DeepLabV3 + SE_DenseNet121 | 94.59 | 97.23 | 98.41 | 95.33 | 96.53 |
| Train | DeepLabV3 + SE_InceptionResnetv2 | 94.98 | 97.85 | 98.36 | 96.73 | 97.46 |
| Train | DeepLabV3 + SE_EfficientNetB0 | 95.47 | 98.29 | 98.80 | 97.41 | 98.10 |
| Validation | DeepLabV3 + SE_Xception | 92.34 | 93.44 | 94.56 | 91.42 | 93.96 |
| Validation | DeepLabV3 + SE_seresnet50 | 92.85 | 93.89 | 94.26 | 91.59 | 93.01 |
| Validation | DeepLabV3 + SE_VGG19 | 92.89 | 94.51 | 94.33 | 91.29 | 93.26 |
| Validation | DeepLabV3 + SE_DenseNet121 | 92.81 | 94.26 | 94.64 | 92.15 | 94.17 |
| Validation | DeepLabV3 + SE_InceptionResnetv2 | 93.03 | 95.74 | 95.35 | 92.94 | 94.42 |
| Validation | DeepLabV3 + SE_EfficientNetB0 | 93.41 | 96.25 | 96.80 | 94.50 | 95.63 |
The radar plots below also show the evaluation functions for architecture selection (a) and the result of using pre-trained networks (b) (Fig. 9).
Fig. 9.
Overall performance metrics on train (left picture) and validation (right picture) data
Bland–Altman plots were employed to assess the agreement between the automated segmentation of the upper and lower choroid boundaries and the manually labeled ground truth (Fig. 10). Bland–Altman plots are a graphical tool used to visualize the level of agreement between two measurement methods. In Fig. 10, the x-axis represents the mean between the distance obtained from the automated segmentation and the distance measured from the manual labeling for the upper (a) and lower (b) choroid boundaries. The y-axis represents the difference between these two distance measurements. The Bland–Altman lines (mean difference ± 1.96 SDs of the difference) are also shown.
Fig. 10.
The x-axis represents the mean between the distance obtained from the automated segmentation and the distance measured from the manual labeling for the upper (a) and lower (b) choroid boundaries. The y-axis represents the difference between these two distance measurements. The Bland–Altman lines (mean difference ± 1.96 SDs of the difference) are also shown
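The quantities plotted in a Bland–Altman analysis can be sketched as follows. This is a minimal illustration, assuming the difference is taken as automated minus manual and that the sample standard deviation is used; the function name and the example measurements are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(automated, manual):
    """Return the points and reference lines of a Bland-Altman plot.

    automated, manual: equal-length sequences of paired measurements.
    The sign convention (automated minus manual) is an assumption.
    """
    if len(automated) != len(manual) or len(automated) < 2:
        raise ValueError("need at least two paired measurements")
    means = [(a + m) / 2 for a, m in zip(automated, manual)]  # x-axis
    diffs = [a - m for a, m in zip(automated, manual)]        # y-axis
    bias = mean(diffs)   # mean difference
    sd = stdev(diffs)    # sample SD of the differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd,  # lower limit of agreement
        "upper": bias + 1.96 * sd,  # upper limit of agreement
    }


# Hypothetical paired boundary distances (arbitrary units).
automated = [10.0, 12.0, 11.0, 13.0]
manual = [9.0, 12.0, 12.0, 12.0]
result = bland_altman(automated, manual)
print(result["bias"], result["lower"], result["upper"])
```

Points falling between the lower and upper limits indicate pairs whose disagreement lies within the expected range of the differences.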
We assessed the agreement between the automated CVI measurements obtained using the proposed method and the manual CVI measurements performed by experts. Bland–Altman analysis was conducted on 50 samples to evaluate this agreement. The Bland–Altman plot in Fig. 11 demonstrates acceptable agreement between the manually measured CVIs and the CVIs calculated using the proposed automated method, suggesting that the proposed method offers a reliable approach for CVI calculation.
Fig. 11.
The Bland–Altman plot for CVI measurements, where the x-axis represents the mean of the manual and automated CVI values, and the y-axis represents the difference between the two measurements. The Bland–Altman lines (mean difference ± 1.96 SDs of the difference) are also shown on the plot
Discussion
This study investigated the efficacy of DeepLabv3 + SE architectures for choroid segmentation in OCT images of patients with diabetic retinopathy. We evaluated six DeepLabv3 + SE variants, each incorporating a different pre-trained CNN for feature extraction. DeepLabv3 + SE with EfficientNetB0 achieved the highest performance on both the training and validation datasets across all five metrics (Jaccard index, DSC, precision, recall, and F1-score). This suggests that EfficientNetB0's compound scaling approach, which jointly scales network depth, width, and input resolution, provides a strong foundation for feature extraction in the context of choroid segmentation.
The success of DeepLabv3 + SE with EfficientNetB0 aligns with previous research highlighting the effectiveness of DeepLabv3+ for medical image segmentation tasks. The incorporation of SE-blocks likely contributed to improved feature learning by emphasizing informative feature channels within the network. Our results support the growing body of evidence demonstrating the potential of deep learning for automated medical image analysis in ophthalmology.
DeepLabv3 + SE's ASPP module captures multi-scale contextual information, enabling the model to better understand the spatial relationships between different features within the image. The decoder module in DeepLabv3 + SE refines the segmentation results, particularly at object boundaries, leading to more accurate segmentation. EfficientNetB0's compound scaling approach balances the expansion of depth, width, and resolution, which may improve feature extraction for a given computational budget. The SE-block in DeepLabv3 + SE helps to prioritize the most informative features, enhancing the model's ability to focus on relevant information for segmentation.
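The channel reweighting performed by an SE-block can be sketched in three steps: squeeze (global average pooling), excitation (a small bottleneck that produces one gate per channel), and scale. The sketch below follows the general squeeze-and-excitation design only; the function name, the weights, the reduction ratio, and the omission of biases are illustrative assumptions and do not reflect the configuration used in this study.

```python
import math


def se_block(feature_maps, w_reduce, w_expand):
    """Apply squeeze-and-excitation recalibration to C feature maps.

    feature_maps: list of C channels, each a 2D list (H x W) of floats.
    w_reduce:     (C // r) x C weight matrix of the bottleneck layer.
    w_expand:     C x (C // r) weight matrix restoring the channel count.
    Biases are omitted to keep the sketch short (an assumption).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation, step 1: fully connected bottleneck followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(feature_maps, gates)]
    return scaled, gates


# Two hypothetical 2 x 2 channels and hand-picked weights (reduction ratio 2).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w_reduce = [[1.0, 1.0]]      # 1 x 2
w_expand = [[1.0], [-1.0]]   # 2 x 1
scaled, gates = se_block(features, w_reduce, w_expand)
print(gates)
```

With these hand-picked weights the first channel is passed almost unchanged and the second is strongly suppressed, which is the sense in which the block "prioritizes" channels; in a trained network the weights are learned.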
Beyond its segmentation accuracy, EfficientNetB0's compact architecture and compound scaling approach reduce computational cost, which may make the model suitable for real-time applications and large-scale deployment. The combination of DeepLabv3 + SE and EfficientNetB0 may also improve the model's ability to generalize to new, unseen OCT images, although this requires confirmation on larger external datasets.
Transfer learning was employed in this study to leverage the knowledge of CNN models (Xception, SeResNet50, VGG19, DenseNet121, InceptionResNetV2, and EfficientNetB0) pre-trained on a related task. By initializing the DeepLabv3 + SE architecture with these pre-trained weights, we aimed to accelerate training, improve performance, and reduce overfitting. Although this study did not explicitly quantify the benefits of transfer learning in terms of specific performance improvements, the results obtained with the pre-trained models suggest that transfer learning was effective in enhancing the segmentation accuracy and efficiency of the proposed approach. Future work could explore the impact of different pre-trained models, fine-tuning strategies, and hyperparameter tuning on the performance of the proposed method.
Beyond segmentation, we aimed to facilitate the automated calculation of the CVI using the Niblack binarization method. This method adapts to local image variations, which is particularly beneficial for OCT images with non-uniform backgrounds. Additionally, the good agreement between automated and manual measurements of the LA and the CVI, as indicated by Bland–Altman analysis, suggests the reliability of the proposed approach for CVI calculation. This offers a promising avenue for streamlining clinical workflows and potentially improving diagnostic accuracy.
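To illustrate why a local method such as Niblack's adapts to non-uniform backgrounds, the sketch below computes a separate threshold for every pixel from the mean and standard deviation of its neighbourhood, T = m + k·s. The window size, the value of k, the border handling, and the choice to mark pixels darker than the threshold are assumptions made for illustration; the study's actual parameters are not stated in this section.

```python
from statistics import mean, pstdev


def niblack_binarize(image, window=3, k=-0.2):
    """Return a 0/1 map with 1 where a pixel is darker than its local threshold.

    image:  2D list of grey levels.
    window: odd side length of the square neighbourhood (assumed value).
    k:      weight of the local standard deviation (assumed value).
    The neighbourhood is clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    dark = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            neighbourhood = [
                image[i][j]
                for i in range(max(0, r - half), min(height, r + half + 1))
                for j in range(max(0, c - half), min(width, c + half + 1))
            ]
            # Niblack threshold: local mean plus k times local SD.
            threshold = mean(neighbourhood) + k * pstdev(neighbourhood)
            dark[r][c] = 1 if image[r][c] < threshold else 0
    return dark


# Hypothetical 5 x 5 patch: bright background with one dark pixel in the centre.
patch = [[200] * 5 for _ in range(5)]
patch[2][2] = 50
dark = niblack_binarize(patch)
print(dark)
```

In this sketch, the dark pixels that fall inside the segmented choroid would be counted as luminal area and the remaining choroidal pixels as stromal area; that mapping is stated here as an assumption about the usual CVI workflow rather than a detail taken from this section.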
However, further investigation is warranted to comprehensively evaluate the generalizability of our findings. Here, we discuss key limitations and highlight promising avenues for future research.
Comparison with existing work
The field of automated choroidal segmentation has witnessed significant advancements in recent years. Traditional machine learning methods often relied on algorithms like gradient-based graph search, neural networks, and three-dimensional (3D) graph-cut algorithms. However, these methods struggled with the specific challenges of OCT images, such as blurred boundaries and low contrast.
Recent research on automatic choroidal segmentation has employed either conventional models or deep learning techniques. Increased computing power and access to large datasets have driven growing interest in deep learning methods for medical imaging [21]. Sui et al. proposed a method that uses a CNN to learn graph-edge weights directly from raw OCT pixels; the network detects both the BM boundary and the CSI boundary [22]. Masood et al. used deep learning to build a segmentation framework that delineates the outer choroidal surface; the OCT image is partitioned into segments for data sampling and conversion before being passed to the CNN [16]. Chai et al. developed a method that segments the choroidal boundary while compensating for variations between OCT acquisition devices: adversarial and perceptual losses are used for domain adaptation, and OCT images from other domains are fed into a U-Net-based network [23]. In another study, the choroid was automatically segmented using a mask region-based convolutional neural network (R-CNN) model initialized with weights pre-trained on the Common Objects in Context (COCO) database [24]. In 2023, Burke et al. distilled a hand-crafted pipeline into a U-Net with a MobileNetV3 backbone pre-trained on ImageNet to achieve efficient, fully automated choroid segmentation in OCT [25].
The absence of open-source algorithms that are accessible for entirely automatic choroidal segmentation is a substantial challenge in all of the aforementioned research. At present, the methods in use are heavily reliant on proprietary software, which frequently necessitates modifications to accurately depict the distinctive characteristics of choroidal images. The segmentation accuracy of these images is suboptimal due to the presence of distinct challenges, including low contrast and indistinct boundaries. To resolve these concerns, it is necessary to create innovative segmentation algorithms for choroidal imaging. Despite the success of CNNs in medical image analysis, this discipline requires more robust research on transfer learning for choroid layer segmentation in OCT images.
In order to achieve near-to-fully automatic image segmentation with fewer manual interventions and higher accuracy, new models rely on robust CNNs and Deep Learning methods, including group-wise attention based fusion network (GAF-Net), perceptual-assisted adversarial adaptation (PAAA), attention-based dense (AD), and generative adversarial network (GAN) [25–28].
Our literature evaluation indicates that encoder-decoder networks built on CNNs offer a substantial advantage. U-Net, U-Net++, and the Pix2Pix GAN have all reported strong performance [26, 27, 29]. These networks capture both local and global features from the input images, which allows precise segmentation of the choroidal region. This is consistent with the performance of the encoder-decoder models evaluated in our study.
The efficacy of CNNs has been significantly enhanced by transfer learning. The literature indicates that combining a pre-trained backbone with an encoder-decoder network yields state-of-the-art performance. Apart from our investigation, this approach has been implemented in only three other studies [26–28]. The DSCs reported in these articles were high, ranging from approximately 96% to 98%. Table 2 provides a summary of the additional studies that have been conducted in the past decade in the field of choroidal segmentation.
Table 2.
Literature review of choroidal segmentation using artificial intelligence over the past decade [30]
| No. | Authors | Year | Purpose | Samples | Methods | Results |
|---|---|---|---|---|---|---|
| 1 | Tian J et al. [21] | 2012 | Automatic measurement of the choroidal thickness in EDI-OCT images | 45 B-scans from 45 healthy adult participants | Gradient-based graph search (dynamic programming) | Mean DC = 90.5% (SD = 3%) |
| 2 | Torzicky T et al. [31] | 2012 | Automatic segmentation of choroid using swept source polarization-sensitive OCT (PS-OCT) | 5 healthy participants (25–54 years old) | Based on tissue-specific polarization contrast | SD of thickness measurement = 18.3 μm |
| 3 | Zhang L et al. [32] | 2012 | Evaluation of macular choriocapillaris and choroidal vasculature thickness and choroidal vessels segmentation on 3D SD-OCT cross-sectional slabs | 24 normal participants | Graph-based multilayer segmentation method | Average macular choriocapillaris thickness = 23.1 µm, average thickness of the choroidal vasculature in normal participants = 172.1 µm, DC = 0.78 ± 0.08 |
| 4 | Kajić V et al. [33] | 2012 | Automatic segmentation of choroid in SD-OCT slabs (both normal and pathologic participants) | 12 eyes (871 B-scans) of adult participants | Neural network, stochastic modeling, convex hull, active appearance model, and Dijkstra’s shortest path (Machine learning) | Average error = 13% (Proportion of misclassified pixels) |
| 5 | Alonso-Caneiro D et al. [1] | 2013 | Automatic segmentation of choroidal thickness in EDI-OCT images | 1083 B scans (104 pediatric healthy participants) and 90 B scans (15 adult healthy participants) | Edge filter, directional weight, dual brightness probability gradient, and the Dijkstra's shortest path algorithm | The mean absolute error (MAE) was 12.6 µm (SD = 9.00 µm) for pediatric and 16.27 µm (SD = 11.48 µm) for adult, also the mean DC was 97.3% (SD = 1.5%) for pediatric and 96.7% (SD = 2.1%) for adult participants |
| 6 | Lu H et al. [34] | 2013 | Automated segmentation of the choroid in OCT images | 30 eyes of 30 adult diabetic participants | Two-stage fast active contour model and real-time human-supervised automated segmentation | Mean DC: 92.7% (SD = 3.6%) |
| 7 | Lee S et al. [35] | 2013 | Comparison of the manual measurements of subfoveal choroidal thickness by experts and an automated algorithm in EDI-OCT images | 88 eyes of 44 patients (bilateral non-neovascular age related macular degeneration (AMD)) and above 55 years old | 3D graph-cut algorithm | ICCs were 0.91 (CI: 0.86–0.94) for the first rater, 0.96 (CI: 0.94–0.97) for the second rater, and 0.87 (CI: 0.80–0.92) for the automated algorithm |
| 8 | Hu Z et al. [36] | 2013 | Comparing the performance of automatic identification of choroidal layer in SD-OCT slabs with manual delineation | 37 B-scans from each eye (20 healthy and 10 non-neovascular AMD eyes) | Gradient-based multistage segmentation | Mean and absolute differences between the algorithm and manual segmentation for the outer RPE boundary was -0.74 ± 3.27 μm and 3.15 ± 3.07 μm; and for the CSI was -3.90 ± 15.93 μm and 21.39 ± 10.71 μm |
| 9 | Srinath N et al. [37] | 2014 | Automated detection of choroid boundary and vessels in OCT Images | OCT slabs with 30 μm thickness | Structural similarity and adaptive Hessian analysis | Thickness variation along the length of the image |
| 10 | Gerendas BS et al. [38] | 2014 | Automatic measurement of choroidal thickness in SD-OCT images | 284 eyes of 142 patients with clinically significant diabetic macular edema (CS-DME) and 20 healthy participants (control) | Iowa reference algorithm (graph-based) | Total choroidal thickness is significantly reduced in DME (175 ± 23 µm) and non-edematous fellow eyes (177 ± 20 µm) compared with healthy control eyes (190 ± 23 µm) |
| 11 | Danesh H et al. [39] | 2014 | Segmentation of choroidal boundary in EDI-OCT | 100 B-scans from 10 eyes of 6 healthy adult participants | Dynamic programming, largest gradient, wavelet features and Gaussian mixture model | Unsigned error of 2.48 ± 0.32 pixels for BM extraction and 9.79 ± 3.29 pixels for choroid detection |
| 12 | Chen Q et al. [40] | 2015 | Automatic choroidal segmentation of SD-OCT images | 212 high-definition OCT (HD-OCT) images (110 eyes of 66 participants) | Thresholding, graph minimum-cut and maximum-flow, gradual intensity distance, and the energy minimization technique | Thickness difference (TD) = 6.72 ± 8.26, correlation coefficients (CC) = 0.970 |
| 13 | Vupparaboina KK et al. [41] | 2015 | Automated measurement of choroidal thickness in SD-OCT images | 97 B-scans per eye (5 healthy adult participants) | Structural similarity index, tensor voting, and eigenvalue analysis of the Hessian matrix | Mean CC = 99.64% (SD = 0.27%), Mean DC = 95.47% (SD = 1.73%), Mean absolute volume difference (VD) = 0.3046 mm³ |
| 14 | Twa MD et al. [42] | 2016 | A new image segmentation method to evaluate automatic choroidal thickness compared to manual segmentation | 30 young adults (24 ± 2 years); a total of 180 B-scan images were analyzed (left eye of each participant) | Graph theory, dynamic programming, and wavelet-based texture analysis | TD (Manual–Auto) Central (fovea) = 4 ± 5 μm, inferior (12.5° below fovea) = −1 ± 6 μm, superior (12.5° above fovea) = 8 ± 5 μm |
| 15 | Shi F et al. [43] | 2016 | Segmentation of the choroid using 1-μm wide view SS-OCT | 32 eyes (normal population) | 3D graph search (with gradient-based cost) | Mean TD = 20.64 ± 4.16 μm, mean DSC = 93.17 ± 1.30% |
| 16 | Wang C et al. [44] | 2017 | Choroid segmentation in 3D OCT images | 30 3D OCT scans of 30 healthy participants (20–85 years old) | 3D nonlinear anisotropic diffusion filter for image enhancement and the Markov random field methods for approximation of the boundary of choroid | Mean DC = 90 ± 4%, mean signed difference = 1.59 ± 1.65 pixels, mean unsigned difference = 2.17 ± 1.77 pixels |
| 17 | Chen M et al. [45] | 2017 | Automatic segmentation of choroid on EDI-OCT images (patients diagnosed with AMD) | 62 EDI-OCT images of AMD patients (manually segmented) | CNN | DC = 82 ± 1% |
| 18 | Al-Bander B et al. [46] | 2017 | Choroidal segmentation of EDI-OCT images | 169 EDI-OCT images | Deep learning algorithm (CNN) | Accuracy = 98.01%, DC = 89.76% |
| 19 | Mazzaferri J et al. [12] | 2017 | Detection of the choroid boundaries on OCT images | 280 participants | Graph-based method | The mean successful fraction was always above 96% with SD below 5% (for all patients) |
| 20 | Chen Q et al. [47] | 2017 | Segmentation of choroid from 3D-SD-OCT images | 5248 SD-OCT B-scan images (41 eyes) | 3D graph search | VD (µm³ × 10³) = -1.96 ± 4.42, TD = -3.53 ± 7.99, CC = 0.9175 |
| 21 | Sui X et al. [22] | 2017 | Choroidal segmentation of OCT slabs | 912 OCT B-scans (42 normal participants and 31 patients with macular edema (34–68 years old)) | Graph-Edge weights learned from deep CNN | Mean Squared Error (*1000) = 5.2, MAE = 8.0, TD = 8.5 |
| 22 | Salafian B et al. [48] | 2018 | Segmentation of choroid in neutrosophic space using EDI-OCT images | 32 EDI-OCT images from 11 participants | Dijkstra’s algorithm in neutrosophic space | Unsigned error of 25.3 μm (6.55 pixels) for prepapillary images and 12.9 μm (3.34 pixels) for macular images |
| 23 | Hussain MA et al. [49] | 2018 | Choroidal segmentation using EDI-OCT images | 190 B-scans of 10 participants | Dijkstra’s shortest path algorithm | Mean root mean square error (RMSE) = 7.71 ± 6.29 pixels, CC = 0.76 |
| 24 | George N et al. [50] | 2019 | Segment the choroidal layer in OCT images | Same dataset as Danesh et al. [39] (589 OCT images in 19 volumes; each volume contained 31 OCT images from one eye) | Rotating kernel transformation (RKT) filter for image enhancement and multi-level contour evolution based on the Chan-Vese method for choroidal segmentation | Mean error of BM = 0.15 ± 0.8 pixels, MAE of BM = 1.78 ± 1.3 pixels, mean error of CSI = 0.48 ± 3.8 pixels, MAE of CSI = 4.7 ± 3.2 pixels |
| 25 | Masood S et al. [16] | 2019 | Segmentation of choroidal layer in OCT images | 525 OCT images of 21 participants | A series of morphological operations and deep learning | The average DC = 97.35%, SD = 2.3% |
| 26 | Xuena Cheng et al. [51] | 2020 | GAF-Net for choroid segmentation in OCT images | Dataset composed of 1650 clinically obtained B-scans | Novel GAF-Net | DC = 95.21 ± 0.73% |
| 27 | Zhenjie Chai et al. [23] | 2020 | PAAA for choroid segmentation in OCT | The first dataset, TOPCON consists of 640 OCT images with 512 × 512, and the second dataset, NIDEK, includes 120 OCT images with 512 × 1024 | Introduced an unsupervised PAAA framework for efficiently segmenting the choroid area by narrowing the domain discrepancies between different domains | Intersection over union (IOU) = 85.77 and Average unsigned surface detection error (AUSDE) = 3.21 |
| 28 | Ruchir Srivastava et al. [52] | 2021 | Choroid segmentation in OCT images using deep learning | Dataset of 1280 OCT images | Use a deep-learning architecture called U-Net, which utilizes the texture of the choroid to segment it | IOU of 0.85 |
| 29 | Shan Suthaharan et al. [28] | 2021 | An automated choroid segmentation approach using transfer learning and encoder-decoder networks | 120 OCT image-mask pairs of the choroid with 128 × 128 resolution | Built U-Net and U-Net++ (encoder-decoder network) models on a VGG-19 backbone, allowing pretrained ImageNet weights in the encoder | Average precision scores of 0.92 for U-Net and U-Net++. The integration of transfer learning yielded a mean increase of 2% in U-Net and 4% in U-Net++ |
| 30 | Xiangcong Xu et al. [27] | 2022 | Automatic segmentation and measurement of choroid layer in high myopia for OCT imaging using deep learning | 800 OCT B-scans of the choroid layers from both normal eyes and high myopia | Combines image enhancement and AD U-Net | Area under curve (AUC) of 99.51% and a DSC of 97.91% |
| 31 | Menghan Li et al. [29] | 2022 | Choroid automatic segmentation and thickness quantification on SS-OCT images of high myopic patients | 720 SS-OCT images were obtained from 40 high myopic and 20 non-high myopic eyes | A network named GCSnet that extracts multiscale features corresponding to different perception fields, giving the model a strong ability to distinguish the choroid from other retinal structures | IOU, DSC, sensitivity, and specificity of 87.89%, 93.40%, 92.42%, and 99.82% in high myopic patients, respectively |
| 32 | Jamie Burke et al. [25] | 2023 | Efficient and fully automated retinal choroid segmentation in OCT through deep learning-based distillation of a hand-crafted pipeline | 715 EDI-OCT scans at 768 × 768-pixel resolution | Fine-tuned a U-Net with a MobileNetV3 backbone pre-trained on ImageNet | AUC = 0.9994, DC = 0.9664 |
| 33 | Kiran Kumar Vupparaboina et al. [26] | 2023 | Choroid layer segmentation using OCT B-scans: An image translation approach based on Pix2Pix GANs | 994 OCT B-scan images of healthy subjects | Pix2Pix GAN architecture for pixel-by-pixel image translation, with a residual encoder-decoder network as the generator to map OCT images to the corresponding choroid-annotated images | Mean DC of 97.50% |
To comprehensively evaluate the generalizability of our findings, several aspects require further investigation. First, the study's reliance on data from a limited number of patients with diabetic retinopathy necessitates incorporating a broader spectrum of retinal diseases. This includes pathologies with extreme choroidal characteristics, such as the very thin choroids observed in AMD or high myopia, and the significantly thickened choroids associated with infiltrative choroidal diseases like Vogt-Koyanagi-Harada syndrome. Additionally, while prior research suggests minimal influence of device type on CVI measurement, broader applicability demands evaluation on data acquired with different OCT devices beyond the RTVue XR 100 Avanti used in this study. Furthermore, the current assessment focused on a single B-scan through the macula. Future advancements in OCT technology offering ultra-wide field imaging and 3D assessments hold promise for more comprehensive and generalizable choroid segmentation across the entire retina. Finally, transitioning from retrospective to prospective and clinical trial studies can provide more robust and reliable results, ultimately leading to stronger clinical validation and wider adoption of the proposed method.
This study has several limitations that should be acknowledged. First, the dataset was relatively small and came from a single center and a specific patient population with diabetic retinopathy. Expanding the dataset to include a broader range of retinal diseases and patient populations would enhance the generalizability of the findings. Second, the external validation dataset was even smaller, limiting the ability to assess the model's performance on a more diverse set of OCT images. Additional details about the external validation dataset, such as the source, device, and imaging parameters, should be provided to improve the transparency and reproducibility of the study. Furthermore, the EDI-OCT volumes used in the study comprised only 25 B-scans each, so structural information may have been missed because of the wide spacing between B-scans in this low-density scan. Future research should employ higher-density scans or alternative imaging procedures to obtain more comprehensive structural information. Third, the removal of the optic nerve region from the OCT images could affect the field of view and introduce biases in the choroid segmentation. Future studies could explore alternative approaches to optic nerve removal or incorporate techniques that preserve more of the optic nerve margin. Fourth, the model's performance was evaluated using traditional segmentation metrics only. Incorporating clinical data such as visual acuity and optical coherence tomography angiography (OCTA) images during model training could improve the model's ability to differentiate between healthy and diseased states and enhance its clinical relevance. Finally, the use of Explainable AI (XAI) techniques could provide insights into how the deep learning model makes segmentation decisions, fostering trust and transparency in its clinical application.
Additionally, investigating the use of deep learning models for segmentation on OCTA images could offer new opportunities for improved disease diagnosis and monitoring.
Table 2 summarizes additional studies on choroidal segmentation conducted over the past decade.
Conclusion
This study explored DeepLabv3 + SE architectures with transfer learning for choroid segmentation in OCT images from diabetic retinopathy patients. DeepLabv3 + SE with EfficientNetB0 achieved the best performance among the six variants, highlighting its potential for automated segmentation and CVI calculation. The proposed method offers promise for streamlining clinical workflows. However, limitations in dataset size and disease focus mean that further research is needed with larger datasets, a broader range of retinal pathologies, and multiple OCT platforms. Prospective studies and OCTA image analysis would further enhance generalizability and clinical utility. Overall, this study demonstrates the potential of deep learning for automated choroid segmentation, paving the way for future advancements in diabetic retinopathy diagnosis and management.
Acknowledgements
Not applicable.
Abbreviations
- OCT: Optical Coherence Tomography
- CSI: Choroid-Sclera Interface
- EDI-OCT: Enhanced-Depth Imaging Optical Coherence Tomography
- SS-OCT: Swept-Source Optical Coherence Tomography
- SD-OCT: Spectral-Domain Optical Coherence Tomography
- CVI: Choroidal Vascularity Index
- CNN: Convolutional Neural Network
- DME: Diabetic Macular Edema
- mm: Millimeter
- NPDR: Non-Proliferative Diabetic Retinopathy
- BM: Bruch Membrane
- SD: Standard Deviation
- OS: Left eye
- OD: Right eye
- SE-Block: Squeeze and Excitation Block
- ReLU: Rectified Linear Unit
- ASPP: Atrous Spatial Pyramid Pooling
- DeepLabv3 + SE: DeepLabv3+ with Squeeze-and-Excitation
- DSC: Dice Score
- DC: Dice Coefficient
- ROI: Region Of Interest
- μm: Micrometer
- LA: Luminal Area
- RPE: Retinal Pigment Epithelium
- TCA: Total Choroidal Area
- RGB: Red Green Blue
- SA: Stromal Area
- ICC: Intraclass Correlation Coefficient
- CI: Confidence Interval
- 3D: Three-Dimensional
- R-CNN: Region-Based Convolutional Neural Network
- COCO: Common Objects in Context Database
- GAF-Net: Group-wise Attention based Fusion Network
- PAAA: Perceptual-Assisted Adversarial Adaptation
- AD: Attention-based Dense
- GAN: Generative Adversarial Network
- AMD: Age-Related Macular Degeneration
- OCTA: Optical Coherence Tomography Angiography
- XAI: Explainable Artificial Intelligence
- PS-OCT: Polarization-Sensitive Optical Coherence Tomography
- MAE: Mean Absolute Error
- CS-DME: Clinically Significant Diabetic Macular Edema
- HD-OCT: High-Definition Optical Coherence Tomography
- TD: Thickness Difference
- CC: Correlation Coefficient
- VD: Volume Difference
- RMSE: Root Mean Square Error
- RKT: Rotating Kernel Transformation
- IOU: Intersection Over Union
- AUSDE: Average Unsigned Surface Detection Error
- AUC: Area Under Curve
Authors’ contributions
Code Programming, H.A. and Z.A.; Conceptualization, H.R.E. and E.K.; Methodology, J.S., Z.A. and E.K.; Validation, H.R.E. and E.K.; Formal analysis, H.A.; Investigation, J.S.; Resources, H.R.E. and E.K.; Data curation, H.R.E., A.H., K.D., and E.K.; Writing—original draft, H.A.G, J.S., Z.A. and H.A.; Writing—review & editing, H.R.E., H.A.G, P.P. and E.K.; Supervision, J.S.; Project administration, H.R.E. and E.K. All authors have read and agreed to the published version of the manuscript.
Funding
Not applicable.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Declarations
Ethics approval and consent to participate
In accordance with the ethical principles of the Declaration of Helsinki, this research was approved by a local committee at Tehran University of Medical Sciences (IR.TUMS.FARABIH.REC.1400.035). Informed consent was obtained from all subjects.
Consent for publication
Participants were asked for permission to share their information anonymously and gave their informed consent for publication by signing a form.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Alonso-Caneiro D, Read SA, Collins MJ. Automatic segmentation of choroidal thickness in optical coherence tomography. Biomed Opt Express. 2013;4(12):2795–812.
- 2. Chirco K, Sohn E, Stone E, Tucker B, Mullins R. Structural and molecular changes in the aging choroid: implications for age-related macular degeneration. Eye. 2017;31(1):10–25.
- 3. Bazvand F, Asadigandomani H, Nezameslami A, Sadeghi R, Soleymanzadeh M, Khodabande A, et al. Short term choroidal microvascular changes following photodynamic therapy in chronic central serous chorioretinopathy. Photodiagn Photodyn Ther. 2023;44:103807.
- 4. Hitzenberger CK, Götzinger E, Sticker M, Pircher M, Fercher AF. Measurement and imaging of birefringence and optic axis orientation by phase resolved polarization sensitive optical coherence tomography. Opt Express. 2001;9(13):780–90.
- 5. Lee EJ, Lee KM, Lee SH, Kim T-W. OCT angiography of the peripapillary retina in primary open-angle glaucoma. Invest Ophthalmol Vis Sci. 2016;57(14):6265–70.
- 6. Wenger E, Mårtensson J, Noack H, Bodammer NC, Kühn S, Schaefer S, et al. Comparing manual and automatic segmentation of hippocampal volumes: reliability and validity issues in younger and older brains. Hum Brain Mapp. 2014;35(8):4236–48.
- 7. Zeppieri M, Marsili S, Enaholo ES, Shuaibu AO, Uwagboe N, Salati C, et al. Optical coherence tomography (OCT): a brief look at the uses and technological evolution of ophthalmology. Medicina. 2023;59(12):2114.
- 8. Lains I, Wang JC, Cui Y, Katz R, Vingopoulos F, Staurenghi G, et al. Retinal applications of swept source optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA). Prog Retin Eye Res. 2021;84:100951.
- 9. Aumann S, Donner S, Fischer J, Müller F. Optical Coherence Tomography (OCT): Principle and Technical Realization. In: Bille JF, editor. High Resolution Imaging in Microscopy and Ophthalmology: New Frontiers in Biomedical Optics. Cham (CH): Springer; 2019. p. 59–85.
- 10. Adhi M, Liu JJ, Qavi AH, Grulkowski I, Lu CD, Mohler KJ, et al. Choroidal analysis in healthy eyes using swept-source optical coherence tomography compared to spectral domain optical coherence tomography. Am J Ophthalmol. 2014;157(6):1272–81.e1.
- 11. Lee M-W, Park H-J, Shin Y-I, Lee W-H, Lim H-B, Kim J-Y. Comparison of choroidal thickness measurements using swept source and spectral domain optical coherence tomography in pachychoroid diseases. PLoS ONE. 2020;15(2):e0229134.
- 12. Mazzaferri J, Beaton L, Hounye G, Sayah DN, Costantino S. Open-source algorithm for automatic choroid segmentation of OCT volume reconstructions. Sci Rep. 2017;7:42112.
- 13. Ting DSW, Pasquale LR, Peng L, Campbell JP, Lee AY, Raman R, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103(2):167–75.
- 14. Romero K, Bagherinia H, Lu J, Shi Y, Rosenfeld PJ, Wang RK. Comparison of the deep learning-based choroid segmentation with and without optical attenuation corrected inputs. Invest Ophthalmol Vis Sci. 2024;65(9):PB0011–PB.
- 15. Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, et al. A comprehensive survey on transfer learning. Proc IEEE. 2020;109(1):43–76.
- 16. Masood S, Fang R, Li P, Li H, Sheng B, Mathavan A, et al. Automatic choroid layer segmentation from optical coherence tomography images using deep learning. Sci Rep. 2019;9(1):3058.
- 17. Gabor D. Theory of communication. Part 1: The analysis of information. J Institut Electric Eng-Part III: radio commun eng. 1946;93(26):429–41.
- 18. Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition; 2018. p. 7132–41.
- 19. Naeeni Davarani M, Arian Darestani A, Guillen Cañas V, Azimi H, Havadaragh SH, Hashemi H, et al. Efficient segmentation of active and inactive plaques in FLAIR-images using DeepLabV3Plus SE with efficientnetb0 backbone in multiple sclerosis. Sci Rep. 2024;14(1):16304.
- 20. Sonoda S, Sakamoto T, Yamashita T, Uchino E, Kawano H, Yoshihara N, et al. Luminal and stromal areas of choroid determined by binarization method of optical coherence tomographic images. Am J Ophthalmol. 2015;159(6):1123–31.e1.
- 21. Tian J, Marziliano P, Baskaran M, Tun TA, Aung T. Automatic measurements of choroidal thickness in EDI-OCT images. Annu Int Conf IEEE Eng Med Biol Soc. 2012;2012:5360–3.
- 22. Sui X, Zheng Y, Wei B, Bi H, Wu J, Pan X, et al. Choroid segmentation from optical coherence tomography with graph-edge weights learned from deep convolutional neural networks. Neurocomputing. 2017;237:332–41.
- 23. Chai Z, Zhou K, Yang J, Ma Y, Chen Z, Gao S, et al. Perceptual-assisted adversarial adaptation for choroid segmentation in optical coherence tomography. In 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI): IEEE; 2020. p. 1966–70.
- 24. Chen HJ, Huang YL, Tse SL, Hsia WP, Hsiao CH, Wang Y, et al. Application of Artificial Intelligence and Deep Learning for Choroid Segmentation in Myopia. Transl Vis Sci Technol. 2022;11(2):38.
- 25. Burke J, Engelmann J, Hamid C, Reid-Schachter M, Pearson T, Pugh D, et al. Efficient and fully-automatic retinal choroid segmentation in OCT through DL-based distillation of a hand-crafted pipeline. arXiv preprint arXiv:2307.00904. 2023.
- 26. Vupparaboina KK, Bollepalli SC, Manne SR, Sahel J, Chhablani J. Choroid layer segmentation using OCT B-scans: An image translation approach based on Pix2Pix generative adversarial networks. Invest Ophthalmol Vis Sci. 2023;64(8):1123.
- 27. Xu X, Wang X, Lin J, Xiong H, Wang M, Tan H, et al. Automatic Segmentation and Measurement of Choroid Layer in High Myopia for OCT Imaging Using Deep Learning. J Digit Imaging. 2022;35(5):1153–63.
- 28. Suthaharan S, Chhablani G, Vupparaboina KK, Sahel J-A, Dansingani KK, Chhablani J. An automated choroid segmentation approach using transfer learning and encoder-decoder networks. Invest Ophthalmol Vis Sci. 2021;62(8):2158.
- 29. Li M, Zhou J, Chen Q, Zou H, He J, Zhu J, et al. Choroid automatic segmentation and thickness quantification on swept-source optical coherence tomography images of highly myopic patients. Ann Transl Med. 2022;10(11):620.
- 30. Alizadeh Eghtedar R, Esmaeili M, Peyman A, Akhlaghi M, Rasta SH. An Update on Choroidal Layer Segmentation Methods in Optical Coherence Tomography Images: a Review. J Biomed Phys Eng. 2022;12(1):1–20.
- 31. Torzicky T, Pircher M, Zotter S, Bonesi M, Götzinger E, Hitzenberger CK. Automated measurement of choroidal thickness in the human eye by polarization sensitive optical coherence tomography. Opt Express. 2012;20(7):7564–74.
- 32. Zhang L, Lee K, Niemeijer M, Mullins RF, Sonka M, Abramoff MD. Automated segmentation of the choroid from clinical SD-OCT. Invest Ophthalmol Vis Sci. 2012;53(12):7510–9.
- 33. Kajić V, Esmaeelpour M, Považay B, Marshall D, Rosin PL, Drexler W. Automated choroidal segmentation of 1060 nm OCT in healthy and pathologic eyes using a statistical model. Biomed Opt Express. 2012;3(1):86–103.
- 34. Lu H, Boonarpha N, Kwong MT, Zheng Y. Automated segmentation of the choroid in retinal optical coherence tomography images. In 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC): IEEE; 2013. p. 5869–72.
- 35. Lee S, Fallah N, Forooghian F, Ko A, Pakzad-Vaezi K, Merkur AB, et al. Comparative analysis of repeatability of manual and automated choroidal thickness measurements in nonneovascular age-related macular degeneration. Invest Ophthalmol Vis Sci. 2013;54(4):2864–71.
- 36. Hu Z, Wu X, Ouyang Y, Ouyang Y, Sadda SR. Semiautomated segmentation of the choroid in spectral-domain optical coherence tomography volume scans. Invest Ophthalmol Vis Sci. 2013;54(3):1722–9.
- 37. Srinath N, Patil A, Kumar VK, Jana S, Chhablani J, Richhariya A. Automated detection of choroid boundary and vessels in optical coherence tomography images. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society: IEEE; 2014. p. 166–9.
- 38. Gerendas BS, Waldstein SM, Simader C, Deak G, Hajnajeeb B, Zhang L, et al. Three-dimensional automated choroidal volume assessment on standard spectral-domain optical coherence tomography and correlation with the level of diabetic macular edema. Am J Ophthalmol. 2014;158(5):1039–48.
- 39. Danesh H, Kafieh R, Rabbani H, Hajizadeh F. Segmentation of choroidal boundary in enhanced depth imaging OCTs using a multiresolution texture based modeling in graph cuts. Comput Math Methods Med. 2014;2014(1):479268.
- 40. Chen Q, Fan W, Niu S, Shi J, Shen H, Yuan S. Automated choroid segmentation based on gradual intensity distance in HD-OCT images. Opt Express. 2015;23(7):8974–94.
- 41. Vupparaboina KK, Nizampatnam S, Chhablani J, Richhariya A, Jana S. Automated estimation of choroidal thickness distribution and volume based on OCT images of posterior visual section. Comput Med Imaging Graph. 2015;46(Pt 3):315–27.
- 42. Twa MD, Schulle KL, Chiu SJ, Farsiu S, Berntsen DA. Validation of Macular Choroidal Thickness Measurements from Automated SD-OCT Image Segmentation. Optom Vis Sci. 2016;93(11):1387–98.
- 43. Shi F, Tian B, Zhu W, Xiang D, Zhou L, Xu H, et al. Automated choroid segmentation in three-dimensional 1-μm wide-view OCT images with gradient and regional costs. J Biomed Opt. 2016;21(12):126017.
- 44. Wang C, Wang YX, Li Y. Automatic choroidal layer segmentation using markov random field and level set method. IEEE J Biomed Health Inform. 2017;21(6):1694–702.
- 45. Chen M, Wang J, Oguz I, VanderBeek BL, Gee JC. Automated segmentation of the choroid in EDI-OCT images with retinal pathology using convolution neural networks. Fetal Infant Ophthalmic Med Image Anal. 2017;2017(10554):177–84.
- 46. Al-Bander B, Williams BM, Al-Taee MA, Al-Nuaimy W, Zheng Y. A novel choroid segmentation method for retinal diagnosis using deep learning. In 2017 10th International Conference on Developments in eSystems Engineering (DeSE): IEEE; 2017.
- 47. Chen Q, Niu S, Fang W, Shuai Y, Fan W, Yuan S, et al. Automated choroid segmentation of three-dimensional SD-OCT images by incorporating EDI-OCT images. Comput Methods Programs Biomed. 2018;158:161–71.
- 48. Salafian B, Kafieh R, Rashno A, Pourazizi M, Sadri S. Automatic segmentation of choroid layer in edi oct images using graph theory in neutrosophic space. arXiv preprint arXiv:1812.01989. 2018.
- 49. Hussain MA, Bhuiyan A, Ishikawa H, Smith RT, Schuman JS, Kotagiri R. An automated method for choroidal thickness measurement from Enhanced Depth Imaging Optical Coherence Tomography images. Comput Med Imaging Graph. 2018;63:41–51.
- 50. George N, Jiji C. Two stage contour evolution for automatic segmentation of choroid and cornea in OCT images. Biocybernetics and Biomedical Engineering. 2019;39(3):686–96.
- 51. Cheng X, Chen X, Feng S, Zhu W, Xiang D, Chen Q, et al. Group-wise attention fusion network for choroid segmentation in OCT images. In Medical Imaging 2020: Image Processing, Vol. 11313: SPIE; 2020. p. 773–9.
- 52. Srivastava R, Ong EP, Lee B-H. Choroid segmentation in optical coherence tomography images using deep learning. In 17th International Conference on Biomedical Engineering: Selected Contributions to ICBME-2019, December 9–12, 2019, Singapore: Springer; 2021.