Abstract
Age-related macular degeneration is a visual disorder caused by abnormalities in part of the eye’s retina and is a leading cause of blindness. The correct detection, precise localization, classification, and diagnosis of choroidal neovascularization (CNV) may be challenging if the lesion is small or if Optical Coherence Tomography (OCT) images are degraded by projection and motion artifacts. This paper develops an automated quantification and classification system for CNV in neovascular age-related macular degeneration using OCT angiography images. OCT angiography is a non-invasive imaging tool that visualizes retinal and choroidal physiological and pathological vascularization. The presented system is built around a new feature extractor for macular disease in retinal OCT layers, Multi-Size Kernels ξcho-Weighted Median Patterns (MSKξMP). Computer simulations show that the proposed method: (i) outperforms current state-of-the-art methods, including deep learning techniques; and (ii) achieves an overall accuracy of 99% using ten-fold cross-validation on the Duke University dataset and over 96% on the noisy Noor Eye Hospital dataset. In addition, MSKξMP performs well in binary eye disease classification and is more accurate than recent image texture descriptors.
Keywords: optical coherence tomography, binary patterns, weighted median filter, macular disease, age-related macular degeneration
1. Introduction
Age-related macular degeneration is a visual disorder caused by abnormalities in part of the eye’s retina and is a leading source of visual impairment Ref. [1]. Therefore, early diagnosis and treatment are critical Ref. [2]. Recently, retinal optical coherence tomography (OCT) images have been used to obtain information regarding the health of the posterior eye (e.g., the retina and choroid) Ref. [3]. OCT is a quick, non-invasive medical imaging tool that uses low-coherence interferometry to produce cross-sectional images of the retina and optic nerve head (ONH), the most anterior part of the visual pathway from the retina to the lamina cribrosa, enabling qualitative and quantitative assessment of visual disorders such as optic nerve disease. High-resolution (µm-range) OCT scanners produce three-dimensional cross-sectional images of the eye’s biological tissues and visualize the individual layers of the posterior segment of the eye, allowing the diagnosis and monitoring of ocular diseases and anomalies Ref. [4]. Advances in image acquisition have also reduced storage costs, allowing ophthalmologists to utilize these images to diagnose various eye diseases Refs. [5,6]. However, to reach a diagnosis, ophthalmologists must manually interpret each OCT image in a volume, and the growing amount of data makes manual interpretation of OCT volumes time-consuming Ref. [7].
Recent research in academia and industry has applied artificial intelligence, machine learning, and deep learning (DL) to develop computerized algorithms that classify retinal disorders. These retinal disorders include diabetic macular edema (DME), age-related macular degeneration (AMD), and choroidal neovascularization (CNV); see Figure 1. DME is a complication of diabetes in which fluid builds up within the retinal layers, causing vision impairment. AMD is associated with a build-up of drusen particles within the retinal pigment epithelium layer. Dry AMD, which is usually diagnosed in older individuals, may disrupt daily life because it impairs the patient’s central vision. CNV, or wet AMD, occurs when abnormal blood vessels grow into the retinal layers, causing fluid to leak and making the retina wet. In some patients diagnosed with AMD, too many vascular blood vessels are produced: these new vessels sprout from the choroid and extend into the retinal layers. The vessels are also leaky, allowing fluid and blood with red blood cells to enter the retinal layers and distort vision. For these reasons, early diagnosis and automated detection are essential in treating AMD Refs. [6,8,9].
Figure 1.
(a) Images taken from [8], top: CNV class, bottom: Drusen class; (b) Images taken from [9], top: DME class, bottom: Normal class.
The correct detection, precise localization, classification, and diagnosis of choroidal neovascularization (CNV) may also be challenging if the lesion is small or if OCT images are degraded by speckle noise. Speckle noise in OCT images must be addressed to achieve good classification performance. Speckle noise, also called multiplicative noise, is a granular noise texture. It degrades image quality as a consequence of interference among wavefronts in coherent imaging systems, such as radar, laser imaging, medical ultrasound, and optical coherence tomography. Speckle noise is signal-dependent and governed by the Fisher-Tippett distribution. Mathematically, speckle noise Refs. [10,11,12,13] is expressed as u(x,y) = v(x,y) + v(x,y)η(x,y), where (x,y) is the pixel position, v(x,y) is the clean image, u(x,y) is the noisy image, and η is zero-mean Gaussian noise with variance σ². A minimal simulation of this noise model is sketched below. It is well known that smoothing filters can reduce speckle noise. These techniques slide a window of various sizes (3 × 3, 5 × 5, etc.) over the image and replace each pixel with a weighted average or a median value, i.e., a non-weighted median filter. However, selecting a median value without any weights in a high-noise-density image is not practical, because a noise pixel may be selected instead of an image pixel. Therefore, we propose a new texture descriptor, MSKξMP, which echoes or repeats pixel values in a kernel to encode OCT images while avoiding high levels of speckle noise.
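For concreteness, the following minimal sketch corrupts a clean image with the multiplicative model above; the function name and σ value are illustrative assumptions rather than the exact simulation settings used in this paper.

```python
import numpy as np

def add_speckle(clean, sigma=0.2, seed=0):
    """Corrupt a clean image v with signal-dependent speckle:
    u(x,y) = v(x,y) + v(x,y) * eta(x,y)."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, sigma, clean.shape)  # zero-mean Gaussian field eta
    return clean + clean * eta                 # noise scales with the signal
```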
This paper makes the following contributions:
- Develops a new, simple, and highly accurate local texture descriptor algorithm, Multi-Size Kernels ξcho-Weighted Median Patterns (MSKξMP), to (i) avoid speckle noise; (ii) perform eye disease classification between choroidal neovascularization and age-related macular degeneration; and (iii) perform highly accurate eye disease classification between diabetic macular edema, AMD, and normal eyes.
- Offers a unique weighted feature selection method, based on Singular Value Decomposition and Neighborhood Component Analysis, for establishing the optimal accuracy using SVM and Random Forest classification techniques.
- Presents computer simulation results that show: (i) 99.78% accuracy on the Duke dataset and 96.63% and 88.51% on the noisy Noor OCT volume datasets; (ii) good eye disease diagnosis and recognition outcomes compared with recent texture descriptors.
This paper has the following structure: the next section contains a literature review offering an overview of related methods; the methods section gives a detailed description of the MSKξMP algorithm; the computer simulations section reports numerous experiments showing the performance of MSKξMP; finally, conclusions and future work are presented in the last section.
2. Literature Review
In recent years, deep learning networks have achieved vision recognition in various applications such as self-driving cars, natural language and image processing, and medical diagnosis (e.g., ocular diseases). Ref. [7] proposes an algorithm that imitates how ophthalmologists form diagnoses by focusing on the information provided by OCT images during the classification process, called a B-scan attentive convolutional neural network (BACNN). A self-attention module aggregates features based on their clinical importance to obtain feature vectors. Ref. [1] proposes a multipath CNN architecture for the diagnosis of AMD. This architecture has five convolutional layers to classify images as AMD or normal.
The multipath convolution layers extract critical global structures with a large filter kernel and use a sigmoid function as the classifier. Ref. [14] presented a CNN based on surrogate-assisted classification that classifies retinal OCT images automatically. Initial preprocessing was performed on each image using denoising, thresholding, and morphological dilation to locate binary masks of the retinal regions. The preprocessed images were employed to produce surrogate images through image augmentation, which were used to train the CNN model.
Several hand-crafted feature techniques show good classification performance on ocular disorder classification in OCT images. Ref. [4]’s algorithm extracts features using multiscale histogram of oriented gradients descriptors and classifies them using a support vector machine. Ref. [15] proposes the automated detection of AMD and DME from retinal OCT images based on sparse coding, spatial pyramids, global representations, and dictionary learning, coupled with preprocessing and a support vector machine classifier. Ref. [6] proposed a multiclass model for detecting AMD, DME, and normal cases using linear configuration patterns (LCPs) to extract pyramid and multiscale features. Recently, Ref. [10] proposed a texture descriptor called Alpha Mean Trimmed Local Binary Patterns (AMT-LBP), based on the alpha mean trimmed filter, which encodes image pixels while partially avoiding and exploiting speckle noise. Table 1 summarizes other recent techniques in ocular disorder classification.
Table 1.
Prior works and their results.
| Article | Dataset | Algorithm Used | Results |
|---|---|---|---|
| Srinivasan [4] | Duke dataset: 15 normal subjects, 15 patients with age-related macular degeneration (AMD), 15 patients with diabetic macular edema (DME) | Preprocessing: denoised, flattened, and cropped images. Feature extraction: HOG descriptor over an image pyramid. Classifier: SVM | Normal 13/15 = 86.67%; AMD 15/15 = 100%; DME 15/15 = 100% |
| Sun [15] | | Preprocessing: retinal alignment using polynomial regression fitting. Feature extraction: dictionary learning, sparse coding, and max pooling. Classifier: linear SVM | Normal 14/15 = 93.33%; AMD 15/15 = 100%; DME 15/15 = 100% |
| Wang [16] | | Feature extraction: spatial pyramid and multi-scale spatial pyramid features. Classifier: SVM | AMD 0.978 ± 0.008; DME 0.940 ± 0.004; Normal 0.996 ± 0.001; overall accuracy 0.980 ± 0.001 |
| Hussain [17] | | Feature extraction: 3D segmentation, Canny edge detection, extraction of curved boundaries using polynomials. Classifier: Random Forest | 2 classes (Normal vs. DME): sensitivity 94.67%, specificity 100.00%, F1-score 97.22, accuracy 97.33%. 3 classes: accuracy 96.89%, AUC 0.99 |
| Thomas [1] | | Feature extraction: multipath CNN with residual connections. Classifier: sigmoid fully connected layers | 2 classes (AMD vs. Normal): accuracy 96.67%, AUC 100% |
| Rong [14] | | Preprocessing: denoising and flattening. Feature extraction: surrogate-assisted retinal CNN | 3 classes (with surrogates / with raw images): AUC 0.9856/0.9491; Acc 0.9509/0.9205; Sen 0.9639/0.9059; Spe 0.9360/0.9371 |
| Wang [18] | | Preprocessing: mean and bilateral filters, image cropping. Feature extraction and classifier: CliqueNet | 3 classes: accuracy 0.990; precision/recall: AMD 0.956/1.00; DME 1.00/0.99; Normal 1.00/0.986 |
| Das [7] | | Feature extraction: B-scan attentive CNN. Classifier: fully connected layer | 3 classes: Acc 97.12 ± 2.78; Spe 95.61 ± 4.35; Sen 97.76 ± 2.07; AUC 0.97 ± 0.03 |
| | Noor Eye Hospital (NEH) dataset: 148 volumes: 48 AMD (1565 images), 50 DME (1104 images), 50 normal (1585 images) | | AMD: Acc 93.2 ± 2.7, Spe 95 ± 0.1, Sen 92.0 ± 4.4. DME: Acc 99.3 ± 1.5, Spe 98.9 ± 2.4, Sen 100. Normal: Acc 92.2 ± 2.3, Spe 93.2 ± 2.3, Sen 87.8 ± 4.8 |
| Wang [4] | | Preprocessing: flattening and cropping. Feature extraction: ResNet50 with LSTM modules. Classifier: fully connected layers | AMD 90.0 ± 7.1, 94.6 ± 9.2; DME 94.0 ± 9.4, 97.5 ± 5.6; Normal 98.0 ± 4.5, 91.2 ± 5.9; overall accuracy 94.0 ± 4.3; overall precision 94.4 ± 4.4; overall sensitivity 93.8 ± 4.8 |
| Rasti [9] | | Preprocessing: retinal flattening. Feature extraction: multi-scale convolutional neural network ensemble | 3 classes, highest performance: precision 99.39 ± 1.21; recall 99.36 ± 1.33; F1 99.34 ± 1.34; AUC 0.998 |
Current works using hand-crafted features fall short of deep learning techniques in achieving extremely high classification accuracies. However, deep learning models usually require long training times and heavy computational hardware such as GPUs; they are also data-hungry and complex. A simple, effective machine-learning approach is proposed here to overcome these shortcomings. The technique contains a smaller number of weights, making it easier to implement; it requires less training time and does not require specialized hardware Ref. [19]. Textural information is vital in analyzing eye diseases. Thus, this paper presents a novel texture descriptor, Multi-Size Kernels ξcho-Weighted Median Patterns, to distinguish between the various ocular diseases while achieving very high classification accuracies.
3. Methods
Our algorithm has the following steps:
- (i) Preprocessing: performs image segmentation using thresholding, then flattens and aligns the retinal layers;
- (ii) Generating hand-crafted features: generates hand-crafted features using Multi-Size Kernels ξcho-Weighted Median Patterns (MSKξMP);
- (iii) Feature weight selection and classification: the best features are selected using Singular Value Decomposition and neighborhood component analysis (NCA) and classified using Support Vector Machines with Gaussian and polynomial kernels, Random Forest, Naïve Bayes, AdaBoost, and RUSBoost.
3.1. Preprocessing
OCT scanner misalignment between the eye and the sensor during acquisition of the retinal images causes white areas in the image. Therefore, preprocessing Ref. [10] starts with removing white areas by setting white pixels to black. Then, the image is resized from a square to a rectangular 256 × 512 image. Next, the image is binarized using Otsu thresholding Ref. [20], which generates a binary (black-and-white) image, and a non-weighted median filter is applied afterward. The purpose of this median filter is not to remove noise in the image but to remove white areas in the binary image outside the retinal regions. A morphological dilation operator with a large structuring element is then applied to the median-filtered image to close fluid-region structures in DME and CNV images. Finally, the retinal layers are flattened by fitting a polynomial of either 2nd or 3rd order, chosen according to the R² fitness function. In most cases, the 3rd-order polynomial is selected to flatten the retinal layers. A sketch of this pipeline follows.
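The following is a minimal sketch of the preprocessing pipeline, assuming scikit-image is available; the saturation cutoff, footprint sizes, and the column-shift flattening heuristic are illustrative assumptions rather than the exact settings of Ref. [10].

```python
import numpy as np
from skimage.filters import threshold_otsu, median
from skimage.morphology import binary_dilation, disk
from skimage.transform import resize

def preprocess(oct_image):
    """Flatten the retina in a grayscale OCT B-scan (float array in [0, 1])."""
    img = oct_image.copy()
    img[img > 0.99] = 0.0                            # white areas -> black
    img = resize(img, (256, 512))                    # square -> 256 x 512
    binary = img > threshold_otsu(img)               # Otsu binarization
    binary = median(binary.astype(np.uint8), footprint=disk(5)) > 0  # clean stray blobs
    binary = binary_dilation(binary, disk(15))       # close fluid regions (DME/CNV)
    rows, cols = np.nonzero(binary)
    # Lower retinal boundary: deepest foreground pixel in each column.
    xs = np.unique(cols)
    ys = np.array([rows[cols == c].max() for c in xs])
    coeffs = np.polyfit(xs, ys, deg=3)               # 3rd-order fit (2nd also possible)
    fit = np.polyval(coeffs, np.arange(img.shape[1]))
    shifts = np.round(fit.max() - fit).astype(int)   # per-column shift that flattens the fit
    flattened = np.zeros_like(img)
    for c in range(img.shape[1]):                    # shift each column to align the boundary
        flattened[:, c] = np.roll(img[:, c], shifts[c])
    return flattened
```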
3.2. Hand-Crafted Features
This section presents the new Multi-Size Kernels ξcho-Weighted Median Patterns (MSKξMP) hand-crafted features. MSKξMP is an extension of the Local Binary Patterns (LBP) feature extractor [21], which is used in many applications such as texture analysis, face recognition, object detection, fault diagnosis, and image retrieval. The advantages of LBP are that it is computationally simple, efficient, and invariant to illumination. Its disadvantage is its sensitivity to artifacts and noise, which distort the central pixel value and degrade classification. To improve the robustness of LBP, different modifications have been proposed.
In this proposal, we start with the generalized form of the LBP given by the following:
$$\mathrm{LBP}(I_C) = \sum_{i:\, I_i \in R} s(I_i - I_C)\, f(i) \tag{1}$$
where IC is the center pixel, Ii are the surrounding pixels, f(i) = 2^i, and R is a region defined by the kernel size. The kernel size is usually selected to be 3 × 3; however, sizes such as 5 × 5 and 7 × 7 are also used. Here s(Ii − IC) = 1 if Ii − IC ≥ T and 0 otherwise, where T is selected to be zero. The generalized LBP operates on all the pixels in an image using a specific kernel size. Each kernel is placed over a pixel, IC, which is compared to its surrounding neighboring pixels Ii using Equation (1). If a neighbor pixel is greater than or equal to the center pixel, then s(·) is assigned 1; if a neighbor pixel is less than the center pixel, then s(·) is assigned 0. This yields a binary sequence; each bit is multiplied by its decimal weight, 2^i, and the weighted bits are summed and encoded into structural information, generating an LBP image (see the sketch below). The LBP is simple, fast, easy, and robust to illumination changes; however, it suffers when speckle noise is present.
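As a reference point for the modifications that follow, here is a minimal 3 × 3 LBP sketch (Equation (1) with T = 0); borders wrap around, which a production implementation would handle explicitly.

```python
import numpy as np

def lbp_3x3(img):
    """Classical 3x3 LBP: compare the 8 neighbors of each pixel with the
    center and pack the comparison bits into an 8-bit decimal code."""
    img = img.astype(np.float64)
    out = np.zeros(img.shape, dtype=np.uint8)
    # Neighbor offsets in a fixed order; bit i gets decimal weight 2**i.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for i, (dr, dc) in enumerate(offsets):
        neighbor = np.roll(np.roll(img, -dr, axis=0), -dc, axis=1)
        out |= (neighbor >= img).astype(np.uint8) << i   # s(I_i - I_C) * 2**i
    return out
```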
The median LBP Ref. [22] was proposed to address this noise. It can be implemented by replacing IC in (1) with the median value of all the pixels within the region R: IC = Imedian = median(R) = median(I0, I1, I2, …, IN−1), where N is the number of pixels in region R. However, the median LBP does not go far enough to prevent speckle noise from entering the texture calculation. For example, Figure 2a illustrates a 3 × 3 kernel with the values [135, 75, 75; 135, 75, 75; 160, 135, 135]. The 75s are image pixels without noise, and the 135s and 160 are noisy pixels. When we select the median value within this 3 × 3 kernel, we obtain 135, a noisy pixel. To overcome this deficiency, Multi-Size Kernels ξcho-Weighted Median Patterns are proposed.
Figure 2.
(a) A noisy DME OCT image taken from [9] with a 3 × 3 kernel of clean and noisy pixels; median is calculated; (b) Finding the median using echo-based weights; (c) A sample calculation of medianLBP5 × 5.
3.3. Multi-Size Kernels Echo-Weighted Median Patterns
MSKξMP is a textural feature descriptor similar to the median LBP. MSKξMP encodes textural pattern information by comparing surrounding pixels to the median pixel Imedian. However, this median value is calculated differently than in Ref. [22]. The surrounding pixels are defined using six different kernel sizes, represented as Rn×m, where n × m takes the following values: 3 × 3, 5 × 5, 7 × 7, 3 × 5, 5 × 7, and 3 × 7. Numerous kernel sizes are used because each kernel captures a different variation of texture within the OCT image; one kernel can miss a specific feature that another kernel size picks up. A larger kernel such as 7 × 7 can capture regional details that a 3 × 3 kernel misses, and vice versa.
The MSKξMP is calculated in three steps for each kernel size Rn×m. The first step is to calculate Imedian, the second is to find the sub-region medians mi within Rn×m, and the third is to calculate each medianLBPn×m using the parameters obtained from steps one and two. The medianLBPn×m is defined by the following equation:
$$\mathrm{medianLBP}_{n\times m}(I_C) = \sum_{i=1}^{8} s(m_i - I_{\mathrm{median}})\, 2^{\,i-1} \tag{2}$$
where IC is the pixel currently being encoded, i.e., the center pixel of the kernel. The combination of all medianLBPn×m constitutes the MSKξMP. The calculation of Imedian is motivated by the weighted median filter in Ref. [23], which echoes (repeats) pixel values a predetermined number of times. Figure 2b illustrates the same 3 × 3 kernel, but now with echo weights applied. The integer weight of each element indicates its number of repetitions; e.g., the second element, 75, is repeated two times, and the middle element, 75, is repeated three times. After applying the echo weights to the 3 × 3 kernel, the median is 75, i.e., Imedian = 75, which is not a noise pixel. These weights are determined by centering a two-dimensional Gaussian function on IC, defined by the following:
$$G(x,y) = \exp\!\left(-\left(\frac{x^2}{2\sigma_x^2} + \frac{y^2}{2\sigma_y^2}\right)\right)$$

where σx and σy are the standard deviations in the x and y directions, respectively. When a square kernel is used, the standard deviations are equal, σx = σy; when a rectangular kernel is used, they may differ, σx ≠ σy. The selection of σx and σy is flexible, as long as the center pixel has the largest weight and the surrounding weights are not too small compared with the center; see Figure 3, “MSKξMP Weights”. Notice that the center weight is only one value higher than its neighboring values. The Gaussian function is then normalized by its minimum value, min(G(x,y)), to ensure its minimum equals 1. We then take the ceiling of each element of the normalized Gaussian function to obtain integer (repetition) values. The echo formula is as follows:
$$\xi(x,y) = \left\lceil \frac{G(x,y)}{\min\left(G(x,y)\right)} \right\rceil \tag{3}$$
where ⌈·⌉ is the ceiling operator and ξ(x,y) represents the echo weights of our kernels. The standard deviations used in the 3 × 3 weight kernel shown in Figure 3 are σx = σy = 1.2; the standard deviations should increase with increasing kernel size. The weight of the center pixel should always have the highest repetition count, because this gives a non-noise pixel a better chance of being selected as the median. A sketch of this weight construction follows.
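A minimal sketch of this construction; the function name is ours, and for the 3 × 3 case with σ = 1.2 it reproduces the weights described above.

```python
import numpy as np

def echo_weights(n, m, sigma_y, sigma_x):
    """Echo (repetition) weights of Equation (3): a 2-D Gaussian centered on
    the kernel, normalized by its minimum, then rounded up to integers."""
    ys = np.arange(n) - n // 2                 # row offsets from the center
    xs = np.arange(m) - m // 2                 # column offsets from the center
    X, Y = np.meshgrid(xs, ys)
    G = np.exp(-(X**2 / (2 * sigma_x**2) + Y**2 / (2 * sigma_y**2)))
    return np.ceil(G / G.min()).astype(int)    # ceil of min-normalized Gaussian

print(echo_weights(3, 3, 1.2, 1.2))   # [[1 2 1] [2 3 2] [1 2 1]]: center repeated 3x
```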
Figure 3.
MSKξMP images associated with their weights, generated using Echo Formula.
Figure 3 shows the MSKξMP images associated with their echo weights for all kernel sizes used in this paper. Notice that the MSKξMP images with 5 × 7 and 7 × 7 weights have the coarsest texture, meaning more texture variation, and they provide the highest performance. For example, the MSKξMP of the CNV images extracts drusen particles better than the Fibonacci patterns. The DME image had the most noise; however, the 3 × 3 weighted MSKξMP image encodes less noise than the classical LBP.
Finally, Imedian can be written in terms of the pixel intensities in Rn×m:
$$I_{\mathrm{median}} = \operatorname{median}\!\left(\varepsilon_0 \,\triangle\, I_0,\ \varepsilon_1 \,\triangle\, I_1,\ \ldots,\ \varepsilon_{N-1} \,\triangle\, I_{N-1}\right) \tag{4}$$
where △ is a duplication operator, ε0, ε1, …, εN−1 are the elements of the matrix ξ(x,y), and Ii, i = 0, 1, …, N − 1, are the pixel values in Rn×m. For example, ε0 △ I0 reads: I0 is duplicated ε0 times. A sketch of this echo-weighted median follows.
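Continuing the sketch above, the echo-weighted median can be computed by literally repeating each pixel εi times before taking a plain median; the 3 × 3 patch below is the noisy example of Figure 2a.

```python
def echo_weighted_median(patch, weights):
    """I_median of Equation (4): duplicate each pixel eps_i times with the
    duplication operator, then take the median of the expanded sample."""
    echoed = np.repeat(patch.ravel(), weights.ravel())
    return np.median(echoed)

patch = np.array([[135, 75, 75], [135, 75, 75], [160, 135, 135]])
print(echo_weighted_median(patch, echo_weights(3, 3, 1.2, 1.2)))   # -> 75.0
```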
The second step is to determine the mi values. These median values are selected within sub-regions of Rn×m and are color-coded in red, yellow, and blue; see the top of Figure 4. The pair of regions to the left and right of the center pixel are coded in yellow, the corner sub-regions are coded in red, and the top and bottom sub-regions are coded in blue. A sample calculation of the mi’s within a 5 × 5 kernel is as follows. East (yellow): m1 = median(0, 135) = 67.5; northeast (red): m2 = median(123, 147, 0, 135) = 106; north (blue): m3 = median(106, 123, 123, 106, 0, 0) = 106; northwest (red): m4 = median(106, 106, 106, 106) = 106; west (yellow): m5 = median(106, 106) = 106; southwest (red): m6 = median(70, 70, 70, 70) = 70; south (blue): m7 = median(70, 153, 153, 70, 153, 153) = 153; and southeast (red): m8 = median(153, 128, 153, 128) = 140.5.
Figure 4.
MSKξMP OCT Image recognition system.
Once all the mi’s are calculated, we move to the third step, where the mi’s are compared with the Imedian found in the first step using (2). The comparison encodes the mi’s as binary digits, which are multiplied by their respective decimal weights and summed to obtain medianLBP5 × 5(IC); a small sketch of this encoding follows. The remaining medianLBPn×m’s are obtained similarly. Figure 3 shows that each medianLBPn×m image displays different textural information, which is beneficial in distinguishing the different types of ocular disorders.
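A minimal sketch of this encoding step; the mi values are those of the 5 × 5 example above, and the Imedian value is a hypothetical stand-in, since the example’s first-step median is not listed in the text.

```python
def median_lbp_code(m, i_median):
    """Equation (2): compare the eight sub-region medians m_1..m_8 with
    I_median and pack the resulting bits into one decimal code."""
    return sum(1 << i for i, mi in enumerate(m) if mi >= i_median)

m = [67.5, 106, 106, 106, 106, 70, 153, 140.5]     # m_1..m_8 from the example
print(median_lbp_code(m, i_median=106))            # hypothetical I_median
```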
Four histograms are generated from each medianLBPn×m image to extract its features. One histogram is generated from the entire medianLBPn×m image, representing the global features. The other three histograms are generated from local overlapping regions, labeled with different color brackets in Figure 4. Each local region has a size of 256 × 256 to capture local features. The four histograms from each medianLBPn×m are concatenated to form the MSKξMP feature vector for each OCT image, as sketched below.
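A sketch of the feature assembly, assuming 256-bin histograms and evenly spaced overlapping windows over the 256 × 512 image; the exact window offsets are our assumption.

```python
import numpy as np

def mskemp_features(coded_images):
    """Concatenate, for each 256 x 512 medianLBP image, one global histogram
    and three histograms over overlapping 256 x 256 windows."""
    feats = []
    for img in coded_images:                       # one coded image per kernel size
        windows = [img, img[:, 0:256], img[:, 128:384], img[:, 256:512]]
        for w in windows:
            hist, _ = np.histogram(w, bins=256, range=(0, 256))
            feats.append(hist / hist.sum())        # normalized 256-bin histogram
    return np.concatenate(feats)                   # 6 kernels x 4 x 256 bins
```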
4. Computer Simulation
4.1. Datasets
Three datasets were used to test the performance of MSKξMP. The first dataset, designated Dataset 1, was collected by Duke University, Harvard University, and the University of Michigan Ref. [4]. This dataset consisted of SD-OCT volumetric scans acquired from 45 patients: 15 normal subjects, 15 patients with dry AMD, and 15 patients with DME. All SD-OCT volumes were acquired under Institutional Review Board-approved protocols using a Spectralis SD-OCT (Heidelberg Engineering Inc., Heidelberg, Germany). The second dataset, designated Dataset 2, was acquired from the Noor Eye Hospital dataset Ref. [9] and consisted of 148 SD-OCT volumes (48 AMD, 50 DME, and 50 normal) acquired with the Heidelberg SD-OCT imaging system at Noor Eye Hospital in Tehran (NEH). Each volume consisted of 19 to 61 B-scans; the resolution of the B-scans was 3.5 μm, and the scan dimension was 8.9 × 7.4 mm².
The third dataset, Dataset 3, was also collected with the Heidelberg SD-OCT imaging system at Noor Eye Hospital (NEH) and was obtained from the Mendeley database website Ref. [8]. It contains 16,822 OCT images of normal (120 volumes), drusen (160 volumes), and CNV (161 volumes) cases. However, 12,641 images (3234 CNV, 3740 drusen, and 5667 normal) were selected for our experiments, because only the worst-case condition images of each volume were kept: if a patient was labeled a CNV case, only the CNV B-scans within that volume were included in the training and testing procedures, and the normal and drusen B-scans of that patient were excluded. Note that drusen particles are early signs of age-related macular degeneration, so drusen can be treated as the same class as AMD.
4.2. Feature Weightings, Selection, and Classification
Principal Component Analysis (PCA) is one of the oldest and most widely used statistical techniques for reducing feature dimensionality. PCA improves interpretability while preserving variability, which minimizes information loss. This means finding new variables Fi that are linear functions of the original features, successively maximizing variance while remaining uncorrelated. This allows the representation of a class of features in the following form:

$$F_i = a_i^{\mathsf{T}} x, \qquad \operatorname{Var}(F_i) = \lambda_i, \qquad i = 1, 2, \ldots, p,$$

where the eigenvalues λi are sorted by size as λ1 ≥ λ2 ≥ … ≥ λp ≥ 0, F1 is the best feature, and the 2nd, 3rd, …, kth (k ≥ 4) are the other principal components.
The PCA used in this paper is based on Singular Value Decomposition (SVD), which is used to select our best features. X = UΣVᵀ represents the SVD of an input matrix X, where U contains the left singular vectors of X, the diagonal of Σ holds the variabilities σ1, …, ση up to the ηth feature, and the rows of Vᵀ are the right singular vectors (principal directions) of X. Each column of X is centered by subtracting its mean, Xmean, to obtain Xm = X − Xmean; this subtraction is the standard normalization for PCA data. We then calculate XPCA, which is defined by the following:
$$X_{PCA} = X_m V_M \tag{5}$$
where V_M contains the first M columns of V, M is selected based on the percentage of variability up to η (η ≥ M), N is the total number of observations in the dataset, and the dimensions of XPCA are N rows by M columns. A sketch of this projection follows.
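A minimal sketch of Equation (5) with NumPy’s SVD; M would be chosen from the retained-variability criterion described above.

```python
import numpy as np

def svd_pca(X, M):
    """Center X, take its SVD, and project onto the first M right singular
    vectors (the principal directions), as in Equation (5)."""
    Xm = X - X.mean(axis=0)                        # subtract each column mean
    U, s, Vt = np.linalg.svd(Xm, full_matrices=False)
    return Xm @ Vt[:M].T                           # N x M score matrix X_PCA
```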
Our feature selection also includes feature weighting using neighborhood component analysis (NCA) Ref. [24] to refine the feature search. NCA is a feature-weighting algorithm that uses the nearest neighbor to maximize the expected leave-one-out classification accuracy. The NCA objective is optimized using stochastic gradient ascent to learn a feature-weighting vector. The output of the NCA is a vector ω of length η representing the importance weight of each feature; each element of ω is assigned to a column of the matrix XPCA.
We then search for sets of optimal features by iteratively reducing the number of columns of XPCA. The values of the weights in ω are used to remove features from XPCA at each iteration. The weights can be written as follows:
$$\omega^{(l)} = \left[\omega_1,\ \omega_2,\ \ldots,\ \omega_{M_l}\right] \tag{6}$$
where Ml is the number of features remaining after some features have been removed at the lth iteration. This idea is taken from the “dropout” of Ref. [25], where a neural network with a large number of parameters tends to overfit a dataset, decreasing performance. Dropout removes neuron units with low weights from the neural network during training, after multiplying feature weights by feature vectors. Our algorithm uses this idea to drop features whose weights fall below a certain threshold. By dropping these unimportant features, we achieve faster training times and prevent overfitting. To multiply the feature weights by the feature vectors, ω is rewritten in matrix form by replicating it row-wise; Ω(l) is defined by the following:

$$\Omega^{(l)} = \begin{bmatrix} \omega^{(l)} \\ \omega^{(l)} \\ \vdots \\ \omega^{(l)} \end{bmatrix}$$
where Ω(l) has N rows. We can then update our input classification matrix XPCA(l) in terms of Ω(l) at the lth iteration using the following:
$$X_{PCA}^{(l+1)} = \left[\, X_{PCA}^{(l)} \circ \Omega^{(l)} \,\right]_{:,\ \omega_j \ge \tau^{(l)}} \tag{7}$$
where the ith row of XPCA(l) is the feature vector of the ith observation, “∘” indicates element-wise (Hadamard) matrix multiplication, τ(l) is the weight-removal threshold at the lth iteration, and XPCA(l+1) is the updated feature matrix with Ml+1 features. The features dropped at the lth iteration are those whose weights fall below τ(l). XPCA(l) is a new classification input matrix and is fed to each of the classifiers. The classifiers used in each iteration are RUSBoost Ref. [26], Naïve Bayes Ref. [27], support vector machines with polynomial and Gaussian kernels Ref. [28], Random Forest Ref. [29], and AdaBoost Ref. [30]. The τ(l) values selected for this paper ranged from 0.1 to 1 in increments of 0.01. The classification accuracies a(l,k) were recorded at each lth iteration for the kth classifier; a(l,k) is defined by the following:
$$a(l,k) = \operatorname{Acc}\!\left(C_k\!\left(X_{PCA}^{(l)}\right)\right)$$

where Ck is the kth classifier listed above. The maximum accuracy is found using the following:
$$(l^{*},\, k^{*}) = \underset{l,\ k}{\operatorname{argmax}}\ a(l,k) \tag{8}$$
where argmax(·) returns the iteration l and classifier k at which the accuracy is highest. Datasets 1, 2, and 3 were trained in this fashion; η was selected to be 100 for Dataset 1 and 70 for Datasets 2 and 3. A sketch of this threshold search is given below. Figure 4 shows the MSKξMP OCT image recognition system from start to finish.
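A minimal sketch of the threshold search of Equations (6)-(8), shown for a single classifier (the polynomial-kernel SVM) with scikit-learn; the NCA weight vector ω is assumed to have been computed beforehand (e.g., with a feature-weighting NCA implementation such as MATLAB’s fscnca).

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def threshold_search(X_pca, y, weights, taus=np.arange(0.10, 1.01, 0.01)):
    """Weight the PCA features by the NCA weights, drop columns whose weight
    falls below each threshold tau, and keep the best ten-fold accuracy."""
    best_acc, best_tau = 0.0, None
    for tau in taus:
        keep = weights >= tau                      # columns surviving this tau
        if not keep.any():
            break                                  # all features dropped
        Xl = (X_pca * weights)[:, keep]            # Equation (7): weight, then drop
        acc = cross_val_score(SVC(kernel="poly"), Xl, y, cv=10).mean()
        if acc > best_acc:
            best_acc, best_tau = acc, tau          # Equation (8): argmax over tau
    return best_acc, best_tau
```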
4.3. Performance Evaluation
To evaluate the performance of the proposed feature extraction approach, the accuracy, sensitivity, specificity, precision, and F1-score were compared with the results of HOG Ref. [4], BACNN Ref. [7], Alpha Mean Trimmed Patterns Ref. [10], Fibonacci Patterns Ref. [19], classical LBP Ref. [21], Vision Transformer Ref. [31], ResNet Ref. [32], VGG16 Ref. [33], and Inception V3 Ref. [34]. By definition, higher values on these indexes imply better classification quality. The mathematical formulas of these measurements are given below:
$$\mathrm{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n}, \qquad \mathrm{Sensitivity} = \frac{T_p}{T_p + F_n}, \qquad \mathrm{Specificity} = \frac{T_n}{T_n + F_p},$$

$$\mathrm{Precision} = \frac{T_p}{T_p + F_p}, \qquad \mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Sensitivity}}{\mathrm{Precision} + \mathrm{Sensitivity}},$$

where Tp and Tn are the true positives and true negatives, and Fp and Fn are the false positives and false negatives, respectively.
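For completeness, a small helper implementing the formulas above:

```python
def metrics(tp, tn, fp, fn):
    """Confusion-matrix measures used throughout the results tables."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)                   # also called recall
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision":   precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }
```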
5. Results and Discussions
For Datasets 1 and 2, four classification schemes were tested using MSKξMP: AMD vs. DME, DME vs. Normal, Normal vs. AMD, and all classes (AMD vs. DME vs. Normal). Since Dataset 1 is slightly imbalanced, with AMD (723 images), DME (1101), and Normal (1407), using accuracy as the only measurement would not be enough to detail the efficacy of the MSKξMP; measurements such as recall, specificity, precision, and F1-score are more informative. For Dataset 3, we also tested four classification schemes: Normal vs. CNV, CNV vs. Drusen, Normal vs. Drusen, and all classes (Normal vs. CNV vs. Drusen). Our experiments included kernel sizes up to 3 × 9 and 5 × 9; however, these windows were too large, and important features may not lie within the local area of the pixel IC. Therefore, kernel sizes up to n × 9 were omitted from the results section. Using more kernels of varying sizes yields diminishing returns, and performance improvements would be minimal; using six kernels strikes the right balance between computation time and the performance of our MSKξMP.
5.1. Dataset 1, 2 and 3 Results
The highlights of our experiments on Datasets 1 and 2 are shown in Table 2 and Figure 5, which report the performance measurements of our MSKξMP. The highest accuracy was achieved by the SVM with a polynomial kernel: 99.78% on Dataset 1 and 96.69% on Dataset 2. We achieved 100% on the Normal vs. DME binary classification and 99.95% on Normal vs. AMD, both also with the polynomial SVM. RUSBoost and Naïve Bayes performed the worst, while AdaBoost and Random Forest performed similarly to each other; this is because both utilize an ensemble of weaker classifiers to create one stronger classifier: k-nearest-neighbor classifiers for AdaBoost and decision trees for Random Forest. Regarding misclassifications on Dataset 1, the polynomial-kernel SVM did not misclassify any normal images and misclassified only two DME images, and with the Gaussian SVM only two DME images were misclassified; see the confusion matrices in Figure 5. Something to note in Dataset 2’s results is the specificity of 98.33%, which is higher than the accuracy and sensitivity; this suggests high confidence in the true negatives.
Table 2.
Results from datasets 1, 2 and 3.
| Dataset 1 | ||||||
|---|---|---|---|---|---|---|
| 3 Classes | AdaBoost | Naïve Bayes | SVM: Poly | R. Forest | SVM: RBF | Rus Boost |
| precision | 0.9714 | 0.8809 | 0.9976 | 0.9806 | 0.9958 | 0.8340 |
| sensitivity | 0.9696 | 0.8696 | 0.9971 | 0.9754 | 0.9941 | 0.8385 |
| specificity | 0.9857 | 0.9317 | 0.9989 | 0.9885 | 0.9978 | 0.9240 |
| accuracy | 0.9725 | 0.8737 | 0.9978 | 0.9790 | 0.9957 | 0.8480 |
| F-measure | 0.9705 | 0.8740 | 0.9974 | 0.9779 | 0.9949 | 0.8354 |
| 2 Classes | Normal vs. AMD | Normal vs. DME | DME vs. AMD | |||
| Classifier | SVM: Poly | SVM: Poly | SVM: Poly | |||
| precision | 0.9996 | 1.0000 | 0.9970 | |||
| sensitivity | 0.9993 | 1.0000 | 0.9961 | |||
| specificity | 0.9993 | 1.0000 | 0.9961 | |||
| accuracy | 0.9995 | 1.0000 | 0.9967 | |||
| F-measure | 0.9995 | 1.0000 | 0.9966 | |||
| Dataset 2 | ||||||
| 3 Classes | AdaBoost | Naïve Bayes | SVM: Poly | R. Forest | SVM: RBF | Rus Boost |
| precision | 0.8500 | 0.8066 | 0.9662 | 0.9256 | 0.9601 | 0.7347 |
| sensitivity | 0.8476 | 0.7975 | 0.9663 | 0.9260 | 0.9601 | 0.7410 |
| specificity | 0.9223 | 0.8977 | 0.9833 | 0.9615 | 0.9799 | 0.8655 |
| accuracy | 0.8481 | 0.8002 | 0.9669 | 0.9243 | 0.9603 | 0.7332 |
| F-measure | 0.8487 | 0.7997 | 0.9663 | 0.9249 | 0.9599 | 0.7326 |
| 2 Classes | Normal vs. AMD | Normal vs. DME | DME vs. AMD | |||
| Classifier | SVM: Poly | SVM: Poly | SVM: Poly | |||
| precision | 0.9740 | 0.9877 | 0.9760 | |||
| sensitivity | 0.9740 | 0.9870 | 0.9753 | |||
| specificity | 0.9740 | 0.9870 | 0.9753 | |||
| accuracy | 0.9740 | 0.9877 | 0.9764 | |||
| F-measure | 0.9740 | 0.9873 | 0.9757 | |||
| Dataset 3 | ||||||
| 3 Classes | AdaBoost | Naïve Bayes | SVM: Poly | R. Forest | SVM: RBF | Rus Boost |
| precision | 0.7350 | 0.6730 | 0.8931 | 0.8343 | 0.8952 | 0.6608 |
| sensitivity | 0.6907 | 0.6609 | 0.8851 | 0.7908 | 0.8758 | 0.6327 |
| specificity | 0.8499 | 0.8385 | 0.9431 | 0.8985 | 0.9395 | 0.8191 |
| accuracy | 0.7256 | 0.6894 | 0.8920 | 0.8175 | 0.8887 | 0.6616 |
| F-measure | 0.7022 | 0.6652 | 0.8888 | 0.8026 | 0.8837 | 0.6414 |
| 2 Classes | Normal vs. Drusen | Normal vs. CNV | Drusen vs. CNV | |||
| Classifier | SVM: Poly | SVM: Poly | SVM: Poly | |||
| precision | 0.9001 | 0.9817 | 0.9358 | |||
| sensitivity | 0.8887 | 0.9782 | 0.9327 | |||
| specificity | 0.8887 | 0.9782 | 0.9327 | |||
| accuracy | 0.8993 | 0.9815 | 0.9345 | |||
| F-measure | 0.8935 | 0.9799 | 0.9339 | |||
Figure 5.
(Left) Confusion matrices for Dataset 1; (Middle) confusion matrices for Dataset 2; (Right) confusion matrices for Dataset 3.
The main takeaway from the results on Dataset 3 is that our MSKξMP achieved 89.20% accuracy using the polynomial-kernel SVM, with a specificity of 94.31%, implying high confidence in the true-negative diagnoses. The other classifiers performed similarly to Datasets 1 and 2. Normal vs. CNV achieved the highest accuracy, 98.15%, due to the significant differences between their visual features. MSKξMP also achieved 93.45% accuracy on the CNV vs. Drusen classification scheme. This is a good score: although CNV images contain drusen particles that may cause confusion with AMD images, our feature extractor differentiates between these two classes reliably.
Figure 6 shows the ROC curves of our best results (“SVM: Polynomial”) on all three datasets. For each dataset, we plot three ROC curves, each taking one class as the positive class against the other two. For example, Dataset 1 has three classes: Normal, DME, and AMD; one curve plots the true positives vs. the false positives using Normal as the positive class and DME and AMD as the negative classes. The area under the curve (AUC) is also computed for each respective curve. Notice that the AUCs for Dataset 1 are higher and the AUCs for Dataset 3 are lower, which reflects their respective accuracies, sensitivities, and specificities.
Figure 6.
SVM: Poly produced the best results (Left) ROC curves of Dataset 1; (Middle) ROC curves of Dataset 2; (Right) ROC curves of Dataset 3.
5.2. Ablation Study
For our ablation study, we tested the feature extraction capability of each kernel size and its weights (shown in Figure 3) on Dataset 1. From Table 3, it can be observed that the measured accuracies were highest with the 5 × 7 and 7 × 7 kernels. For each individual kernel size, the accuracy was about 0.5–3% below our highest achieved accuracy of 99.78% (all kernels). Concatenating the feature vectors generated by each kernel size improved the performance of our MSKξMP and indicates that each kernel size captures different textural information from the OCT image, both locally and regionally. Based on this ablation study, we decided to omit the 3 × 3 kernel and test the combined features generated by the 5 × 5, 7 × 7, 3 × 5, 5 × 7, and 3 × 7 kernels; we were still able to achieve an accuracy of 99.78% with the polynomial-kernel SVM without the 3 × 3 kernel.
Table 3.
Results of ablation study.
| Ablation Study | ||||||
|---|---|---|---|---|---|---|
| Kernel Size | 3 × 3 | 5 × 5 | 7 × 7 | 3 × 5 | 5 × 7 | 3 × 7 |
| precision | 0.9684 | 0.9832 | 0.9902 | 0.9772 | 0.9904 | 0.9746 |
| sensitivity | 0.9636 | 0.9808 | 0.9878 | 0.9743 | 0.9890 | 0.9734 |
| specificity | 0.9821 | 0.9920 | 0.9950 | 0.9872 | 0.9955 | 0.9868 |
| accuracy | 0.9666 | 0.9842 | 0.9904 | 0.9762 | 0.9910 | 0.9749 |
| F-measure | 0.9659 | 0.9820 | 0.9890 | 0.9757 | 0.9897 | 0.9739 |
5.3. Comparisons to Recent State of the Art
Comparing Dataset 1 with the recent state of the art shows that our MSKξMP achieves slightly higher accuracies, as shown in Table 4 and Table 5. Das Ref. [7]’s BACNN achieved an accuracy of 97.76% plus or minus 2.24%, meaning that BACNN could reach 100% but with extremely high fluctuation. Even though our experiments were based on ten-fold cross-validation, our fluctuation was only about 0.3–0.4%, which is more stable than BACNN. Compared with other state-of-the-art methods, our MSKξMP outperformed very deep CNNs such as Refs. [13,32,34] as well as the recent vision transformer algorithm Ref. [31]. Our MSKξMP also outperformed recent state-of-the-art texture descriptors such as AMT-LBP Ref. [10] and Fibonacci patterns Ref. [19], at 99.78% versus 99.16% and 94.96%, respectively; see Table 6. Table 7 compares Datasets 2 and 3 between the recent state of the art and our MSKξMP. Our accuracies either outperformed or were comparable to deep learning techniques such as Ref. [35]. Note that Ref. [35] showed high fluctuations across all performance measurements, ±0.8–3%, whereas our MSKξMP results fluctuated by only 0.1–0.2%.
Table 4.
Dataset 1 comparisons to recent state of the art.
Table 5.
Dataset 1 comparisons to recent state of the art.
Table 6.
Dataset 1 comparisons to recent texture descriptors.
| Comparisons to Other Recent Texture Descriptors: Dataset 1 | ||||
|---|---|---|---|---|
| Technique | AMT-LBP [10] | LBP [21] | Fibonacci [19] | MSKξMP |
| precision (%) | 99.24 | 97.62 | 95.11 | 99.76 |
| sensitivity (%) | 98.95 | 97.01 | 94.60 | 99.71 |
| specificity (%) | 99.61 | 98.67 | 97.33 | 99.89 |
| accuracy (%) | 99.16 | 97.31 | 94.96 | 99.78 |
| F-measure | 99.09 | 97.30 | 94.84 | 99.74 |
Table 7.
Dataset 2 and 3 comparisons to recent state of the art.
| Dataset 2: Three Class Classification | ||||
|---|---|---|---|---|
| Author | Technique | Sensitivity (%) | Precision (%) | Accuracy (%) |
| Wang [4] | OCANet-CK | 88.50 | 89.40 | 88.60 |
| VOCT-RNN | 94.00 | 94.40 | 93.80 | |
| Our Work | MSKξMP | 96.63 | 96.61 | 96.68 |
| Dataset 3: Three Class Classification | ||||
| Author | Technique | Sensitivity (%) | Specificity (%) | Accuracy (%) |
| Sotoudeh-Paima [35] | FPN-EfficientNetB0 | 86.60 | 93.30 | 87.80 |
| FPN-ResNet50 | 89.80 | 94.80 | 90.10 | |
| Our Work | MSKξMP | 88.51 | 94.31 | 89.20 |
6. Conclusions and Future Work
This work presents a new image texture descriptor, MSKξMP, for differentiating between ocular diseases such as AMD, drusen, DME, and CNV in OCT images. The presented method encodes textural patterns at local and regional scales and enhances edges in various directions while avoiding speckle noise. The computer simulations show that our method outperformed (a) current state-of-the-art deep learning techniques, including the vision transformer, and (b) the FPN-EfficientNetB0 on Dataset 3, while being comparable to the FPN-ResNet50 network. Some of our binary classifications on Dataset 1 also performed well, achieving perfect accuracy or close to it. MSKξMP had high specificity, suggesting high confidence in its true negatives. Future work will focus on constructing 3D retinal images, such as 3D OCT volumes, by extracting volume and depth information to determine the spread of the disease; these 3D structures can then be used for multi-view classification.
Acknowledgments
We would like to thank Anna Liew for assisting us in understanding and analyzing the OCT image features.
Author Contributions
A.L. and S.A., methodology; A.L., validation; A.L. and S.A., formal analysis, investigation; A.L. and S.A., data generation; A.L. and S.B., writing—original draft preparation; S.A. and S.B., writing—review and editing. All authors have read and agreed to the published version of the manuscript.
Informed Consent Statement
Not applicable.
Data Availability Statement
Duke Dataset is found here: https://people.duke.edu/~sf59/Srinivasan_BOE_2014_dataset.htm, NOOR datasets are found here: https://hrabbani.site123.me/available-datasets/dataset-for-oct-classification-50-normal-48-amd-50-dme and https://data.mendeley.com/datasets/8kt969dhx6/1 (accessed on 1 January 2020).
Conflicts of Interest
The authors declare no conflict of interest.
Funding Statement
This research received no external funding.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
References
- 1. Thomas A., Harikrishnan P.M., Krishna A.K., Palanisamy P., Gopi V.P. Automated Detection of Age-Related Macular Degeneration from OCT Images Using Multipath CNN. J. Comput. Sci. Eng. 2021;15:34–46. doi:10.5626/JCSE.2021.15.1.34.
- 2. Bhende M., Shetty S., Parthasarathy M.K., Ramya S. Optical coherence tomography: A guide to interpretation of common macular diseases. Indian J. Ophthalmol. 2018;66:20–35. doi:10.4103/ijo.IJO_902_17.
- 3. Thomas A., Harikrishnan P., Ramachandran R., Ramachandran S., Manoj R., Palanisamy P., Gopi V.P. A novel multiscale and multipath convolutional neural network based age-related macular degeneration detection using OCT images. Comput. Methods Programs Biomed. 2021;209:106294. doi:10.1016/j.cmpb.2021.106294.
- 4. Srinivasan P.P., Kim L.A., Mettu P.S., Cousins S.W., Comer G.M., Izatt J.A., Farsiu S. Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images. Biomed. Opt. Express. 2014;5:3568–3577. doi:10.1364/BOE.5.003568.
- 5. Fujimoto J.G., Pitris C., Boppart S.A., Brezinski M.E. Optical Coherence Tomography: An Emerging Technology for Biomedical Imaging and Optical Biopsy. Neoplasia. 2000;2:9–25. doi:10.1038/sj.neo.7900071.
- 6. Drexler W., Fujimoto J. Optical Coherence Tomography: Technology and Applications. Springer; Berlin/Heidelberg, Germany: 2008.
- 7. Das V., Prabhakararao E., Dandapat S., Bora P.K. B-Scan Attentive CNN for the Classification of Retinal Optical Coherence Tomography Volumes. IEEE Signal Process. Lett. 2020;27:1025–1029. doi:10.1109/LSP.2020.3000933.
- 8. Sotoudeh-Paima S. Labeled Retinal Optical Coherence Tomography Dataset for Classification of Normal, Drusen, and CNV Cases. Mendeley Data, V1, 2021. Available online: https://paperswithcode.com/dataset/labeled-retinal-optical-coherence-tomography (accessed on 4 January 2023).
- 9. Rasti R., Rabbani H., Mehridehnavi A., Hajizadeh F. Macular OCT Classification Using a Multi-Scale Convolutional Neural Network Ensemble. IEEE Trans. Med. Imaging. 2017;37:1024–1034. doi:10.1109/TMI.2017.2780115.
- 10. Liew A., Ryan L., Agaian S.S. Alpha mean trim texture descriptors for optical coherence tomography eye classification. Multimodal Image Exploit. Learn. 2022;12100:157–167. doi:10.1117/12.2618059.
- 11. Szkulmowski M., Gorczynska I., Szlag D., Sylwestrzak M., Kowalczyk A., Wojtkowski M. Efficient reduction of speckle noise in Optical Coherence Tomography. Opt. Express. 2012;20:1337–1359. doi:10.1364/OE.20.001337.
- 12. Choi H., Jeong J. Speckle Noise Reduction Technique for SAR Images Using Statistical Characteristics of Speckle Noise and Discrete Wavelet Transform. Remote Sens. 2019;11:1184. doi:10.3390/rs11101184.
- 13. Yu H., Ding M., Zhang X., Wu J. PCANet based non-local means method for speckle noise removal in ultrasound images. PLoS ONE. 2018;13:e0205390. doi:10.1371/journal.pone.0205390.
- 14. Rong Y., Xiang D., Zhu W., Yu K., Shi F., Fan Z., Chen X. Surrogate-Assisted Retinal OCT Image Classification Based on Convolutional Neural Networks. IEEE J. Biomed. Health Inform. 2018;23:253–263. doi:10.1109/JBHI.2018.2795545.
- 15. Sun Y., Li S., Sun Z. Fully automated macular pathology detection in retina optical coherence tomography images using sparse coding and dictionary learning. J. Biomed. Opt. 2017;22:016012. doi:10.1117/1.JBO.22.1.016012.
- 16. Wang Y., Zhang Y., Yao Z., Zhao R., Zhou F. Machine learning based detection of age-related macular degeneration (AMD) and diabetic macular edema (DME) from optical coherence tomography (OCT) images. Biomed. Opt. Express. 2016;7:4928–4940. doi:10.1364/BOE.7.004928.
- 17. Hussain M., Bhuiyan A., Luu C.D., Smith R.T., Guymer R.H., Ishikawa H., Schuman J.S., Ramamohanarao K. Classification of healthy and diseased retina using SD-OCT imaging and Random Forest algorithm. PLoS ONE. 2018;13:e0198281. doi:10.1371/journal.pone.0198281.
- 18. Wang D., Wang L. On OCT Image Classification via Deep Learning. IEEE Photonics J. 2019;11:3900714. doi:10.1109/JPHOT.2019.2934484.
- 19. Panetta K., Sanghavi F., Agaian S., Madan N. Automated Detection of COVID-19 Cases on Radiographs using Shape-Dependent Fibonacci-p Patterns. IEEE J. Biomed. Health Inform. 2021;25:1852–1863. doi:10.1109/JBHI.2021.3069798.
- 20. Otsu N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979;9:62–66. doi:10.1109/TSMC.1979.4310076.
- 21. Morales S., Engan K., Naranjo V., Colomer A. Retinal Disease Screening through Local Binary Patterns. IEEE J. Biomed. Health Inform. 2015;21:184–192. doi:10.1109/JBHI.2015.2490798.
- 22. Hafiane A., Seetharaman G., Zavidovique B. Median Binary Pattern for Textures Classification. In Proceedings of the International Conference on Image Analysis and Recognition, Montreal, QC, Canada, 22–24 August 2007; pp. 387–398.
- 23. Zhang Q., Xu L., Jia J. 100+ Times Faster Weighted Median Filter. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014.
- 24. Yang W., Wang K., Zuo W. Neighborhood Component Feature Selection for High-Dimensional Data. J. Comput. 2012;7:161–168. doi:10.4304/jcp.7.1.161-168.
- 25. Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014;15:1929–1958.
- 26. Friedman J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001;29:1189–1232. doi:10.1214/aos/1013203451.
- 27. Manning C.D., Raghavan P., Schütze H. Introduction to Information Retrieval. Cambridge University Press; Cambridge, UK: 2008.
- 28. Cristianini N., Shawe-Taylor J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press; Cambridge, UK: 2000.
- 29. Breiman L. Random Forests. Mach. Learn. 2001;45:5–32. doi:10.1023/A:1010933404324.
- 30. Freund Y., Schapire R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997;55:119–139. doi:10.1006/jcss.1997.1504.
- 31. Jiang Z., Wang L., Wu Q., Shao Y., Shen M., Jiang W., Dai C. Computer-aided diagnosis of retinopathy based on vision transformer. J. Innov. Opt. Health Sci. 2022;15:2250009. doi:10.1142/S1793545822500092.
- 32. He K., Zhang X., Ren S., Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- 33. Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- 34. Szegedy C., Vanhoucke V., Ioffe S., Shlens J., Wojna Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- 35. Sotoudeh-Paima S., Jodeiri A., Hajizadeh F., Soltanian-Zadeh H. Multiscale convolutional neural network for automated AMD classification using retinal OCT images. Comput. Biol. Med. 2022;144:105368. doi:10.1016/j.compbiomed.2022.105368.
- 36. Luo Y., Xu Q., Jin R., Wu M., Liu L. Automatic detection of retinopathy with optical coherence tomography images via a semi-supervised deep learning method. Biomed. Opt. Express. 2021;12:2684–2702. doi:10.1364/BOE.418364.
- 37. Mousavi E., Kafieh R., Rabbani H. Classification of dry age-related macular degeneration and diabetic macular oedema from optical coherence tomography images using dictionary learning. IET Image Process. 2020;14:1571–1579. doi:10.1049/iet-ipr.2018.6186.
- 38. Karri S.P.K., Chakraborty D., Chatterjee J. Transfer learning based classification of optical coherence tomography images with diabetic macular edema and dry age-related macular degeneration. Biomed. Opt. Express. 2017;8:579–592. doi:10.1364/BOE.8.000579.