Abstract
Objective
We developed an optimized decision support system for retinal fundus image-based glaucoma screening.
Methods
We combined computer vision algorithms with a convolutional network for fundus images and applied a faster region-based convolutional neural network (FRCNN) and artificial algae algorithm with support vector machine (AAASVM) classifiers. Optic disc boundary detection and optic cup and optic disc segmentation were conducted using TernausNet. Glaucoma screening was performed using the optimized FRCNN. The Softmax layer was replaced with an SVM classifier layer and optimized with an AAA to attain enhanced accuracy.
Results
Using three retinal fundus image datasets (G1020, digital retinal images vessel extraction, and high-resolution fundus), we obtained accuracies of 95.11%, 92.87%, and 93.7%, respectively. Framework accuracy was improved with an adaptive gradient algorithm optimizer FRCNN (AFRCNN), which achieved an average accuracy of 94.85%, sensitivity of 93.353%, and specificity of 94.061%. AAASVM obtained an average accuracy of 96.52%, approximately 3% ahead of the FRCNN classifier. With the AAASVM, the three datasets yielded areas under the curve of 0.9, 0.85, and 0.87, respectively.
Conclusion
Based on the Friedman statistical evaluation, AAASVM was the best glaucoma screening model. Segmented and classified images can be directed to the health care system to assess patients’ progress. This computer-aided decision support system will be useful for optometrists.
Keywords: TernausNet, faster region-based convolutional neural network, artificial algae algorithm, support vector machine, glaucoma, screening, fundus
Introduction
Glaucoma is the second leading cause of irremediable vision loss globally; it progressively damages the optic nerve without early warning signs. 1 An abnormal increase in intraocular pressure is the focal origin of vision loss among patients with glaucoma. 2 As a result, detection of glaucoma in the preliminary phase is essential for well-timed management. Eye pressure measurement, vision testing, 3 and optic nerve head evaluation 4 are the primary methods used to identify glaucoma. Manual segmentation of the optic cup (OC) and optic disc (OD) is a highly time-consuming and complex process. To address this, various automatic segmentation algorithms have been proposed, such as clustering, thresholding, 5 and enhancement based on multiple features. 6
Zilly et al. 7 attained Dice coefficients of 0.97 and 0.871 for OD and OC segmentation, respectively, using entropy sampling and ensemble algorithms. Sevastopolsky 8 proposed U-Net convolutional neural network (CNN)-based automatic OD and OC segmentation. Carmona et al. 9 implemented genetic algorithms for optic nerve head detection. Al-Bander et al. 10 proposed a dense, fully convolutional network for glaucoma detection based on OD and OC segmentation.
Yu et al. 11 took advantage of a modified U-Net with a pre-trained ResNet-34 encoder and traditional decoding layers. This achieved average Dice values of 97.31% for OD segmentation and 87.6% for OC segmentation, evaluated with the DRISHTI-GS and RIM-ONE retinal datasets.
A fully convolutional network can be effectively combined with an adversarial network for OD and OC segmentation. 12 The novel M-Net architecture can be joined with polar transformation to segment the OC and OD. 13 At present, researchers have adopted many deep learning techniques over traditional systems.14,15 Iglovikov and Shvets 16 proposed TernausNet, a novel architecture built on a pre-trained VGG11 encoder with initialized weights for effective feature extraction in image segmentation.
Koh et al. 17 designed a model with different classifiers for glaucoma detection, such as support vector machine (SVM), Naïve Bayes, and probabilistic neural network, with an accuracy of 0.93.
Related works
A deep-learning CNN was proposed for glaucoma detection, implemented using image datasets; the first dataset yielded an accuracy of 95.6% and the second reached 96.95%. 18 Al-Bander et al. 19 implemented a CNN for feature extraction to distinguish normal and glaucoma images using an SVM classifier. The team used the 23 layers of AlexNet to pre-train the extracted features and attained an accuracy of 88.2%, sensitivity of 85%, and specificity of 90.8%. Another CNN was designed with 18 layers, including a Softmax layer, to extract deep features of the fundus to diagnose glaucoma; this model achieved an accuracy of 98.13%. 20
Transfer learning using a deep system was designed by Norouzifard et al. 21 for glaucomatous fundus image classification. This process was implemented by cropping the input image and executed with the VGG19 and ResNet models, with a 70%/30% training/testing split. The VGG19 model achieved an accuracy of 92.3% at the 30th epoch.
Diaz-Pinto et al. 22 expanded convolution-based glaucoma classification and tested it using various pre-trained models such as VGG19, GoogleNet, ResNet, and DenseNet. VGG19 outperformed the other models, with an area under the receiver operating characteristic curve (AUROC) of 0.94. Kim et al. 23 proposed a computer-aided deep learning method to detect glaucoma, using gradient-weighted activation mapping with a CNN. The VGG16 model was customized by training the base network into a fully convolutional layer with a learning rate of 0.0001 and a batch size of 32. This model was tested against various pre-trained models; the Xception model had the best performance, with an AUROC of 0.96. Juneja et al. 24 designed a deep learning model, G-Net, to increase the optimal performance of a CNN by removing irrelevant details to effectively identify normal and abnormal retinal images for glaucoma screening.
Adaptive weighted locality-constrained sparse coding is based on referencing the OD and automatic determination of the cup-to-disc ratio (CDR) to detect glaucoma. 25 Gour and Khanna 26 used a pre-trained CNN model along with fine-tuned stochastic gradient descent and Adam optimizers for effective detection of multi-class, multi-label retinal disease. The VGG16 model with a stochastic optimizer achieved an AUROC of 0.84 and an accuracy of 96.49% in the training set. For OC and OD segmentation, Jiang et al. 27 proposed the atrous joint region-based CNN (RCNN). This model achieved AUROCs of 0.85 and 0.90 with the ORIGA and Singapore Chinese Eye Study datasets, respectively.
Gidaris and Komodakis 28 proposed a new multi-region CNN to improve object detection using the bounding box regression approach. A deep multi-scale RCNN was implemented to extract semantic segmentation-aware parameters, and the bounding boxes were assessed twice. Guindel et al. 29 designed a new system based on a faster RCNN (FRCNN) to detect objects and obstacles. An unmanned ground vehicle object was detected using a boosted region proposal network (RPN) and FRCNN with swarm optimization algorithms. 30 Wang et al. 31 proposed an autonomous driving object detection technique based on an FRCNN, with particle swarm optimization and bacterial foraging optimization learning methods to obtain optimal results. In the biomedical field, lung nodules have been detected using an FRCNN framework, with results compared against an RCNN; this method achieved an accuracy of 91.4%. 32 Wan and Goudos 33 designed a multi-class fruit detection system based on an FRCNN and robotic vision. This system obtained greater accuracy and less processing time than the state-of-the-art system.
Currently, optimization problems are solved by recently developed meta-heuristic algorithms. The slime mould algorithm 34 is based on the oscillation mode of slime mould and applies adaptive weights to simulate the positive and negative feedback that governs slime propagation. The Harris hawk optimizer 35 mimics Harris hawks’ element of surprise and the dynamic escape patterns of their prey. Another well-known algorithm, the Grey Wolf Optimizer, simulates the leadership and hunting processes of grey wolves to obtain the optimal solution. 36 Huang et al. 37 reviewed 312 papers related to ophthalmology and provided a clear summary of the emergence of artificial intelligence (AI) in glaucoma. Jain and Salau 38 used a tensor-based empirical wavelet transform for glaucoma detection and achieved an accuracy of 98.7%.
Problem statement
As per the literature reviewed, researchers are enhancing deep learning methods over traditional state-of-the-art methods to eradicate vision loss owing to glaucoma. Each method has several drawbacks: 1) most feature extractions are based on traditional methods; 2) localization of the OD is difficult; 3) classifiers are used repetitively; and 4) image quality is a major concern in deep framework implementation.
Related works on feature extraction 39 and the AI literature 37 address the above research gap and motivate a new system. To overcome the aforementioned limitations, we designed a novel system to predict glaucoma using a deep framework with an artificial algae algorithm (AAA) that optimizes features beyond traditional methods and is not limited to a graphics processing unit.
Methods
In this practical experimental study, the proposed framework was implemented according to the following steps: 1) multi-level preprocessing; 2) TernausNet OD localization, boundary detection, and segmentation; 3) feature extraction using an FRCNN; 4) classification using an FRCNN and an adaptive gradient algorithm optimizer FRCNN (AFRCNN); 5) replacing the Softmax layer with an SVM and optimizing its parameters with the AAA; 6) evaluating and ranking the models using Friedman statistical analysis; and 7) testing the optimized AAASVM in FRCNN-based glaucoma screening using different fundus datasets. Figure 1 depicts the workflow of the study process. No ethics approval or informed consent was needed because we used public datasets.
Figure 1.
Proposed hybrid framework. AAASVM, artificial algae algorithm with support vector machine; RCNN, region-based convolutional neural network; A-CLAHE, adaptive enhanced local contrast histogram equalization.
Multi-level preprocessing
Multi-level preprocessing is a marked improvement over classical preprocessing procedures in that the input image is processed based on metrics such as color constancy, image enhancement, homomorphic filtration, and removal of background portions. These metrics are applied to the retinal input image to attain an enhanced image for further processing. The stages are described below.
Color constancy
The proposed FRCNN classification approach attains color constancy based on the second-order gray-edge procedure. This color constancy norm estimates the color level of the image without considering light sources; the second-order gray edge is among the most popular edge detection and color estimation algorithms for determining the color factors in a given retinal image. Fundus images are generally in RGB format, and the same logic was followed in the present second-order gray-edge color constancy procedure. The RGB color values were estimated based on the camera sensitivity function c(λ), as in Equation 1:
$$f(x) = \int_{\omega} e(\lambda)\, s(\lambda)\, c(\lambda)\, d(\lambda) \tag{1}$$
where f(x) indicates the functional value of color constancy, λ indicates the wavelength, e(λ) indicates the light source associated with processing, s(λ) indicates the surface reflectance of the input image, d(λ) indicates the distance vector specification of the input image, and ω is the visible spectrum over which the integral is taken. The main intention of color constancy is to analyze the RGB color factors independent of the light source. The projection value e is estimated using Equation 2:
$$e = \int_{\omega} e(\lambda)\, c(\lambda)\, d(\lambda) \tag{2}$$
where the light source-independent color constancy variations are analyzed using the form factor e(λ) with respect to the RGB color coordinates.
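As an illustration of this step, the following is a minimal NumPy sketch of second-order gray-edge illuminant estimation and correction. It assumes a Minkowski norm of p = 1 and omits the Gaussian pre-smoothing used in the full gray-edge family; the function name and defaults are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gray_edge_2nd_order(img, p=1.0, eps=1e-8):
    """Estimate the illuminant of an RGB image with the second-order
    gray-edge hypothesis and apply a von Kries-style correction.

    img: float array of shape (H, W, 3) scaled to [0, 1].
    p:   Minkowski norm applied to the second-derivative magnitudes.
    """
    e = np.zeros(3)
    for c in range(3):
        # Second-order spatial derivatives along rows and columns.
        dxx = np.gradient(np.gradient(img[..., c], axis=0), axis=0)
        dyy = np.gradient(np.gradient(img[..., c], axis=1), axis=1)
        mag = np.sqrt(dxx ** 2 + dyy ** 2)
        # Minkowski-norm pooling of the derivative magnitudes.
        e[c] = np.mean(mag ** p) ** (1.0 / p)
    e /= (np.linalg.norm(e) + eps)          # unit-norm illuminant estimate
    # Divide each channel by its illuminant component.
    corrected = img / (np.sqrt(3.0) * e + eps)
    return np.clip(corrected, 0.0, 1.0), e
```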
Image enhancement
The proposed multi-level image preprocessing uses adaptive enhanced local contrast histogram equalization (A-CLAHE) principles. This method enhances the image features with respect to the RGB color factors, and this color enhancement is used to attain high levels of accuracy. A-CLAHE is distinct from classical image feature enhancement norms in that it processes the input image with respect to three form factors (image block size estimation, histogram estimation, and maximum slope region estimation) instead of checking only histogram values. These features enhance the accuracy of categorization: the block size is estimated with respect to local areas of the image while preserving the RGB color coefficients, the histograms are used to optimize the histogram features of the input image, and the maximum slope values are used to estimate the contrast intensity values of the input image.
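For reference, the sketch below applies the contrast-limited adaptive equalization step per RGB channel with scikit-image. The block size and clip limit are illustrative assumptions, and the slope-based adaptation of A-CLAHE is not reproduced here.

```python
import numpy as np
from skimage import exposure

def enhance_fundus(img, block=(64, 64), clip=0.02):
    """Contrast-limited adaptive histogram equalization applied
    channel-wise so the RGB color coefficients are preserved.

    img: float RGB array in [0, 1], shape (H, W, 3).
    """
    out = np.empty_like(img)
    for c in range(3):
        # kernel_size sets the local block over which each histogram
        # is computed; clip_limit bounds the histogram slope.
        out[..., c] = exposure.equalize_adapthist(
            img[..., c], kernel_size=block, clip_limit=clip)
    return out
```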
Homomorphic filtration
Homomorphic filtration is a general image processing scheme used to eliminate noise from the input image, which improves the accuracy of image processing; the prediction results are more accurate following this approach. The intensity of the given image and the associated color levels were estimated using Equation 3, with respect to the maximum level of illumination (Mi) and the image contrast level enhancement (Ri):
$$f_n(x, y) = M_i(x, y)\, R_i(x, y) \tag{3}$$
Here, the regions of an input image are described by the spatial coordinates x and y, and n indicates the number of associated image illustrations presented in the training model. The homomorphic filtration procedure is a principal method for enhancing digital images, particularly when the image data are captured under uneven illumination. This separation method has been applied in various imaging applications, including biometric, clinical, and machine vision assessment. The procedure works in the frequency domain by applying a high-pass filter to reduce the contribution of low-frequency components, and several numeric formulations can be used to construct this filter. Fourier transformation (FT) is applied to attain the best result: the FT analyzes the frequency ranges and selects the proper frequency levels of the input image based on the RGB color coefficients. The FT function in Equation 4 was used to obtain the homomorphic image filtration levels:
$$F(u, v) = ft\left\{\ln C_{Int}(x, y)\right\} \tag{4}$$
where ft indicates the FT operation and C_Int indicates the color intensity levels of the input retinal image.
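A minimal NumPy sketch of this filtration follows, assuming a Gaussian high-emphasis filter; the cutoff and gain values are illustrative assumptions rather than the study's settings.

```python
import numpy as np

def homomorphic_filter(channel, cutoff=30.0, gamma_l=0.5, gamma_h=2.0):
    """Homomorphic filtering of one image channel: take the log to
    separate illumination (low frequency) from reflectance (high
    frequency), attenuate the former and boost the latter in the
    Fourier domain, then exponentiate back.

    channel: 2-D float array of non-negative intensities.
    """
    log_img = np.log1p(channel.astype(np.float64))
    spec = np.fft.fftshift(np.fft.fft2(log_img))

    rows, cols = channel.shape
    u = np.arange(rows) - rows / 2.0
    v = np.arange(cols) - cols / 2.0
    d2 = u[:, None] ** 2 + v[None, :] ** 2
    # Gaussian high-emphasis filter: gain gamma_l at DC rising to
    # gamma_h at high spatial frequencies.
    h = (gamma_h - gamma_l) * (1.0 - np.exp(-d2 / (2.0 * cutoff ** 2))) + gamma_l

    filtered = np.fft.ifft2(np.fft.ifftshift(spec * h)).real
    return np.expm1(filtered)
```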
Background exclusion
The background removal process is a major concern in digital image processing, especially in the medical image processing domain. Separating the background and foreground portions of an image is necessary so that only the required data are processed from the input image instead of the overall image. Background subtraction ensures extraction of the correct foreground portions from the input image, which improves region selection and extraction and thereby the use of the region of interest (ROI). The pixel ratios are important for estimating ROI levels in the input image. The following factors are also important in the background removal process: 1) the time-based image frame acquisition t(i); 2) the binary pixel ratio of the input image, denoted B(x), in which x indicates the image to be processed; and 3) the actual frame positions AFrame(i, j), in which i indexes the rows of the extracted pixels and j the columns. The background is subtracted from the actual frame to attain the best possible prediction in a real-time glaucoma detection scenario.
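A sketch of the idea for fundus images, where the camera background is nearly black; the threshold value and function name are assumptions:

```python
import numpy as np

def fundus_roi_mask(img, thresh=0.06):
    """Separate the circular fundus field (foreground) from the dark
    camera background so that later stages process only the ROI.

    img: float RGB array in [0, 1].
    Returns a boolean mask B(x) that is True inside the retina.
    """
    intensity = img.mean(axis=2)            # per-pixel brightness
    mask = intensity > thresh               # binary pixel ratio B(x)
    return mask

# Background subtraction on an actual frame AFrame(i, j):
# masked = img * fundus_roi_mask(img)[..., None]
```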
TernausNet OD boundary detection and joint segmentation
For clear-cut detection of the OD borderline, we introduced an enhanced version of U-Net with a VGG11 pre-trained encoder, called TernausNet. The encoder has seven convolutional layers, each followed by a rectified linear unit activation function, and a maximum of five pooling operations that each halve the feature map resolution. The initial convolutional layer had weights of 5 × 5 × 3 × 32, with a mask size of 3 × 3 over 64 channels. As the network deepens, the channels double up to 512. The encoder then works on 3 × 3 convolutions and pares down blocks of the input image to 1 × 1 convolutions. After this process, the output is fed into a decoder, which doubles the size of the features and reduces the channels from 512 to 64. Up-sampling layers restore the resolution of the OD boundary, and a depth concatenation layer concatenates the prior output. This output is fed into the final convolution, and the Softmax layer interprets the probabilities of the OD edge values between 0 and 1 with an activation of 227 × 227 × 96. In the proposed work, the output layer was implemented as a segmentation layer to detect the OD localization based on joint OC and OD segmentation.
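For readers who want to reproduce the architecture, the following is a minimal PyTorch sketch of a TernausNet-style network (a U-Net with a VGG11 encoder pre-trained on ImageNet). The channel widths follow the published TernausNet; the sigmoid output stands in for the probability layer described above, and the exact layer sizes of this study's variant (e.g., 227 × 227 × 96) are not reproduced.

```python
import torch
import torch.nn as nn
from torchvision import models

class DecoderBlock(nn.Module):
    """Up-sample by 2, then convolve; the forward pass of the full
    network fuses each decoder stage with the matching encoder
    feature map through depth concatenation, as in TernausNet."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(in_ch, mid_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class TernausNet11(nn.Module):
    """Minimal TernausNet-style U-Net; slice indices follow
    torchvision's vgg11().features layout."""
    def __init__(self):
        super().__init__()
        enc = models.vgg11(weights="IMAGENET1K_V1").features
        self.conv1, self.conv2 = enc[0:2], enc[3:5]      # 64, 128 channels
        self.conv3, self.conv4 = enc[6:10], enc[11:15]   # 256, 512 channels
        self.conv5 = enc[16:20]                          # 512 channels
        self.pool = nn.MaxPool2d(2, 2)
        self.center = DecoderBlock(512, 512, 256)
        self.dec5 = DecoderBlock(256 + 512, 512, 256)
        self.dec4 = DecoderBlock(256 + 512, 512, 128)
        self.dec3 = DecoderBlock(128 + 256, 256, 64)
        self.dec2 = DecoderBlock(64 + 128, 128, 32)
        self.dec1 = nn.Sequential(nn.Conv2d(32 + 64, 32, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.final = nn.Conv2d(32, 1, kernel_size=1)

    def forward(self, x):
        c1 = self.conv1(x)
        c2 = self.conv2(self.pool(c1))
        c3 = self.conv3(self.pool(c2))
        c4 = self.conv4(self.pool(c3))
        c5 = self.conv5(self.pool(c4))
        center = self.center(self.pool(c5))
        d5 = self.dec5(torch.cat([center, c5], dim=1))
        d4 = self.dec4(torch.cat([d5, c4], dim=1))
        d3 = self.dec3(torch.cat([d4, c3], dim=1))
        d2 = self.dec2(torch.cat([d3, c2], dim=1))
        d1 = self.dec1(torch.cat([d2, c1], dim=1))
        # Per-pixel probability of the OD region/boundary in [0, 1].
        return torch.sigmoid(self.final(d1))
```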
FRCNN-based feature extraction
Salau and Jain 39 discussed various feature extractions for image-based retrieval, which help in developing new feature extractions based on an FRCNN. In this step, the image output of TernausNet is supplied to the FRCNN to extract features such as shape, color, and texture. Instead of a selective search, the FRCNN uses an RPN. Input images are passed into a ConvNet, which returns the feature map along with objectness scores and anchor boxes of different sizes. These proposals are sent to an ROI pooling layer, which reduces each proposal to a size equal to that of the bounding box. The output is then supplied to the Softmax layer. The RPN slides a window over the extracted features. The proposed method, with stochastic gradient descent with momentum as the solver, steers the gradients of the feature vectors in the right direction for faster convergence; the momentum term averages the gradients (Equation 5). 40
$$v_t = \gamma\, v_{t-1} + \eta\, \nabla_{\theta} J(\theta) \tag{5}$$
and
$$\theta = \theta - v_t$$
Here, γ is a hyperparameter with a value between 0 and 1.
The classification layer is referred to here as a feature extraction layer. The box regression layer draws the box on the featured image based on the above step. Gray-level intensities support texture parameters such as contrast, correlation, energy, entropy, and homogeneity, which are extracted from the training retinal dataset using the above FRCNN. Color features (mean, variance, standard deviation, skewness, and kurtosis) are then extracted to identify abnormalities in the retinal image, followed by shape feature extraction. These features are stored in a file for further processing to produce the feature map for predicting the classification of glaucoma.
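The deep shape, color, and texture features are learned by the FRCNN itself; as a complementary illustration, the sketch below computes the named texture parameters and color moments with scikit-image and SciPy. The function name and GLCM settings are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from skimage.feature import graycomatrix, graycoprops

def color_texture_features(gray_u8, channel):
    """Hand-crafted texture and color statistics of the kind named
    above, computed on a segmented ROI.

    gray_u8: 2-D uint8 image for the texture features.
    channel: 1-D float array of ROI pixel values for the color moments.
    """
    glcm = graycomatrix(gray_u8, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]
    texture = {
        'contrast': graycoprops(glcm, 'contrast')[0, 0],
        'correlation': graycoprops(glcm, 'correlation')[0, 0],
        'energy': graycoprops(glcm, 'energy')[0, 0],
        'homogeneity': graycoprops(glcm, 'homogeneity')[0, 0],
        # GLCM entropy computed directly from the normalized matrix.
        'entropy': -np.sum(p[p > 0] * np.log2(p[p > 0])),
    }
    color = {
        'mean': channel.mean(),
        'variance': channel.var(),
        'std': channel.std(),
        'skewness': stats.skew(channel),
        'kurtosis': stats.kurtosis(channel),
    }
    return {**texture, **color}
```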
Glaucoma screening using a modified RCNN and tuned AFRCNN
Extracted feature maps are fed into the classification layer of the FRCNN to discriminate normal and abnormal retinal images and, hence, to predict glaucoma. Fully connected Softmax and classification layers are newly added for effective classification. The box regression layer draws boxes on the input image, connected over the average pooling. The classification layer predicts glaucoma based on the CDR of the retinal image, according to the above steps.
An adaptive gradient algorithm is used to increase the classification accuracy and eliminate manual tuning of the learning rate. This optimizer is related to stochastic gradient descent but adapts to frequently appearing features over occasional features based on past observations. The learning rate is adjusted according to Equation 6, for each parameter θ_i at every time step t, based on the past gradients:
$$\theta_{t+1,\, i} = \theta_{t,\, i} - \frac{\eta}{\sqrt{G_{t,\, ii} + \epsilon}}\; g_{t,\, i} \tag{6}$$
where g_{t,i} is the gradient with respect to θ_i at time t, G_t accumulates the squared past gradients, and ε avoids division by zero.
Here, the initial learning rate η was defined as 3e−4, with a squared gradient decay factor of 0.99 and the maximum number of epochs run at the chosen batch size.
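A minimal sketch of one such per-parameter update follows. The learning rate 3e−4 and the 0.99 decay applied to the squared-gradient accumulator follow the values stated above; the function name, the cache initialization (zeros), and the ε constant are assumptions.

```python
import numpy as np

def adaptive_step(theta, grad, cache, lr=3e-4, decay=0.99, eps=1e-8):
    """One adaptive gradient update in the spirit of Equation 6: each
    parameter gets its own effective learning rate, which shrinks
    fastest where past gradients have been large."""
    cache = decay * cache + (1.0 - decay) * grad ** 2   # decayed G_t
    theta = theta - lr * grad / (np.sqrt(cache) + eps)
    return theta, cache

# Usage sketch: cache starts as np.zeros_like(theta) and is carried
# across iterations together with the parameters.
```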
Support vector machine (SVM)
In this proposed work, the SVM was used to replace the Softmax classifier in the RPN to boost the anchor box performance. SVM is a well-known machine learning classifier with a strong theoretical foundation; it can effectively find globally optimal solutions with a limited number of training examples. The SVM approach is often used for object detection, object classification, non-linear regression, and pattern recognition.
By mapping input vectors into a multi-dimensional feature space, a linear model can be used to derive non-linear class boundaries. An appropriate separating hyperplane is developed in this multi-dimensional space. As a result, the benefit of SVM is that it searches for the maximum-margin hyperplane separating the output classes.
To discriminate the foreground and background region boxes, an SVM classifier with a radial basis function (RBF) kernel is used. The SVM’s RBF kernel has two parameters, C and γ, whereas the Softmax function has no such parameters; as a result, the SVM algorithm outperforms the Softmax method in terms of optimization. In Equation 7, x_i and x_j are samples input to the kernel, and γ is a kernel parameter:
$$K(x_i, x_j) = \exp\left(-\gamma\, \lVert x_i - x_j \rVert^{2}\right) \tag{7}$$
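A minimal scikit-learn sketch of this replacement classifier; `train_features`, `train_labels`, and `test_features` are placeholders for the FRCNN feature vectors and labels, and the C and γ values shown are the placeholders that the AAA later tunes.

```python
from sklearn.svm import SVC

# RBF-kernel SVM trained on FRCNN feature vectors in place of the
# Softmax layer; C and gamma are the two kernel parameters above.
clf = SVC(kernel='rbf', C=1.0, gamma=0.1)
clf.fit(train_features, train_labels)     # features from the FRCNN
pred = clf.predict(test_features)         # normal vs. glaucoma
```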
Artificial algae algorithm (AAA)
The activities of microalgae inspired the development of this algorithm. 41 Algae development is determined by the amount of light the algae absorb to obtain nutrients. In the algorithm, artificial algal cells are created to replicate natural algae and their life activities. Colonies are formed by groups of algal cells, which together make up the total population.
Every algal cell in the jth dimension of the ith algal colony is given as y_{i,j}, and the ith algal colony is symbolized as Y_i = [y_{i,1}, y_{i,2}, …, y_{i,D}]; the population of algae colonies is represented as a matrix (Equation 8):
$$Y = \begin{bmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,D} \\ y_{2,1} & y_{2,2} & \cdots & y_{2,D} \\ \vdots & \vdots & \ddots & \vdots \\ y_{NP,1} & y_{NP,2} & \cdots & y_{NP,D} \end{bmatrix} \tag{8}$$
Here, D specifies the dimension of algal colonies and NP represents the algal group number in the population.
Each algal group is made up by a sequence of algal cells that can be thought of as solution dimensions. The algal group forms a cluster and moves toward a nutrient-rich environment. By migrating, adjusting, and evolving, the algal colony strives to improve its position. When the colony is placed in the perfect position, the best solution is obtained.
This algorithm goes through three stages, namely, reproduction or evolution, adaption, and helical movement. In the reproduction phase, the algal group that discovers a suitable result develops and grows. If the algal colonies act but do not find an appropriate result, they will eventually perish because they will be unable to develop. The algal colonies are ordered according to their fitness during the evolutionary process (G). The smallest single-dimension algal colony is randomly destroyed, and the identical dimension of the largest algal colony is copied, as represented in Equations 9–11:
$$\text{biggest}^{t} = \max_{i}\, G\!\left(Y_i^{t}\right), \quad i = 1, 2, \ldots, NP \tag{9}$$
$$\text{smallest}^{t} = \min_{i}\, G\!\left(Y_i^{t}\right), \quad i = 1, 2, \ldots, NP \tag{10}$$
$$\text{smallest}_{m}^{t} = \text{biggest}_{m}^{t} \tag{11}$$
Here, smallest^t is the smallest colony in the population at time t, biggest^t is the largest colony in the population at time t, and m is the randomly selected dimension that is copied.
In the helical phase, the energy is computed in relation to the size of the algal colonies at the start of each cycle (i.e., reflecting their fitness). The energy of each algal group determines how many times it moves helically during each cycle. The colony’s energy is proportional to the amount of nutrients it absorbs from its surroundings. The algal group that finds an enhanced solution experiences an energy loss, per Equations 12–14:
$$x_i^{t+1} = x_i^{t} + \left(x_j^{t} - x_i^{t}\right)\left(\Delta - \tau\!\left(Y_i\right)\right) p \tag{12}$$
$$y_i^{t+1} = y_i^{t} + \left(y_j^{t} - y_i^{t}\right)\left(\Delta - \tau\!\left(Y_i\right)\right) \cos\alpha \tag{13}$$
$$z_i^{t+1} = z_i^{t} + \left(z_j^{t} - z_i^{t}\right)\left(\Delta - \tau\!\left(Y_i\right)\right) \sin\beta \tag{14}$$
Here, x_i^t, y_i^t, and z_i^t are the coordinates of the ith algal colony at time t in the x, y, and z dimensions; j indexes a randomly chosen neighbor colony; p is a random value in [−1, 1]; α and β are random angles in [0, 2π]; Δ is the shear force; and τ(Y_i) is the friction surface, calculated using Equation 15:
$$\tau\!\left(Y_i\right) = 2\pi \left(\sqrt[3]{\frac{3\, G\!\left(Y_i\right)}{4\pi}}\right)^{2} \tag{15}$$
In the proposed study, the shear force value is 3, the energy loss is 0.3, and the parameter of adaptation is 0.2. Adaptation is the process through which a colony that has survived but is unable to grow adequately strives to be the best colony. Each algal colony begins with a zero hunger (starvation) level. The algal colony’s hunger level rises with each helical movement because it is unable to find a better alternative. After each helical movement cycle, the algal group with the highest level of hunger is exposed to an adaptation method; see Equations 16 and 17:
$$\text{starving}^{t} = \max_{i}\, A_i^{t} \tag{16}$$
$$\text{starving}^{t+1} = \text{starving}^{t} + \left(\text{biggest}^{t} - \text{starving}^{t}\right) \times \text{rand} \tag{17}$$
Here, starving^t represents the colony with the highest level of starvation at time t, and A_i^t indicates the starvation value of the ith algal colony at time t. The adaptation constraint Ap takes values between 0 and 1 and determines whether the adaptation process is applied at time t.
The above process proceeds via the following steps. The mean square error (MSE) between forecasted and real values is defined as the fitness value, based on the initialized weights and biases of the SVM. The MSE serves as the fitness function over the SVM learning data for each colony, and this value is updated based on the adaptation equations. These steps are repeated up to a maximum cycle count until a defined error rate is reached. The parameters yielding the smallest error rate are then used for testing the SVM.
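The following is a greatly simplified sketch of how the AAA can tune the SVM's (C, γ) pair against the MSE fitness described above. The shear force (3), energy loss (0.3), and adaptation parameter (0.2) follow the values stated earlier; the two-dimensional search space, log-scale bounds, and all names are illustrative assumptions, and the full three-dimensional helical movement and colony-size model are condensed into a single step rule.

```python
import numpy as np
from sklearn.svm import SVC

def aaa_tune_svm(X_tr, y_tr, X_val, y_val, n_colonies=10, cycles=30,
                 shear=3.0, energy_loss=0.3, adapt_p=0.2, seed=0):
    """Simplified AAA search over log10(C) and log10(gamma)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array([-1.0, -3.0]), np.array([3.0, 1.0])
    pop = rng.uniform(lo, hi, size=(n_colonies, 2))

    def fitness(col):
        C, gamma = 10.0 ** col
        clf = SVC(kernel='rbf', C=C, gamma=gamma).fit(X_tr, y_tr)
        return np.mean((clf.predict(X_val) - y_val) ** 2)   # MSE fitness

    fit = np.array([fitness(c) for c in pop])
    starve = np.zeros(n_colonies)                           # hunger levels
    for _ in range(cycles):
        for i in range(n_colonies):
            j = int(rng.integers(n_colonies))               # neighbor colony
            alpha = rng.uniform(0.0, 2.0 * np.pi)
            # Helical move toward the neighbor, scaled by the shear
            # force minus an energy-loss term (friction condensed).
            step = (pop[j] - pop[i]) * (shear - energy_loss)
            trial = np.clip(pop[i] + step * np.array([np.cos(alpha),
                                                      np.sin(alpha)]), lo, hi)
            f = fitness(trial)
            if f < fit[i]:
                pop[i], fit[i] = trial, f                   # colony grows
                starve[i] = 0
            else:
                starve[i] += 1                              # hunger rises
        # Evolution: the worst colony copies one dimension of the best.
        b, w = int(fit.argmin()), int(fit.argmax())
        d = int(rng.integers(2))
        pop[w, d] = pop[b, d]
        # Adaptation: the hungriest colony drifts toward the best.
        if rng.random() < adapt_p:
            h = int(starve.argmax())
            pop[h] = np.clip(pop[h] + (pop[b] - pop[h]) * rng.random(), lo, hi)
            starve[h] = 0
        fit = np.array([fitness(c) for c in pop])
    return 10.0 ** pop[fit.argmin()]                        # tuned (C, gamma)
```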
Description of experimental datasets
The proposed technique was implemented using Matlab 2020a (The MathWorks, Inc., Natick, MA, USA) on a system with 8 GB of RAM. The proposed OD boundary detection, OD and OC segmentation, and glaucoma classification were accomplished using the G1020, high-resolution fundus (HRF), and digital retinal images vessel extraction (DRIVE) retinal datasets as the test sets.
G1020 retinal dataset
This dataset has 1020 retinal images from 432 patients. 42 Among these, 296 images from 110 patients show glaucoma and 724 images from 322 patients show a normal retina. Unlike other retinal fundus image datasets, this dataset was collected without strict imaging restrictions, reflecting realistic conditions. The repository comprises fundus images with a 45-degree field of view, captured after dilation drops between 2005 and 2017 in a private hospital in Germany.
HRF image set
This image dataset includes 45 retinal fundus images taken from healthy individuals and from patients with diabetic retinopathy and glaucoma. 43 These images were captured using a Canon CR-1 fundus camera (Canon, Tokyo, Japan) with a 45-degree field of view and a dimension of 3504 × 2336 pixels.
DRIVE retinal dataset
This dataset contains 40 retinal images drawn from a diabetic retinopathy screening population of 400 subjects aged 25 to 90 years. These images were taken using a Canon CR5 non-mydriatic 3CCD camera with a 45-degree field of view and stored in JPEG format. 44
Results
System workflow
The present work on glaucoma prediction classification has four steps. Initially, the G1020 images underwent multi-level preprocessing to obtain improved images as output. The preprocessed images were then fed into the TernausNet input layer for accurate detection of the OD boundary. This accurately segments the ROI with the help of the encoder, decoder, max pooling, and depth concatenation, a Softmax layer of 227 × 227 × 96, and a segmentation layer of 1 × 1 × 96 with weights of 5 × 5 × 32 × 32, attaining better segmentation than at the patch level (see Figure 2). From this, the OC was found at the center of the detected OD boundary. The Hough transform was then incorporated to position the center and radius of the disc circle. After the post-processing module, the individually segmented portions return the OC and OD boundaries. The CDR was examined using the vertical distance of the OC versus the OD. The retinal image segmentation was generalized with a loss function that combines the binary cross-entropy of the classification task, defined as H, with the Jaccard index J, as defined in Equation 18:
$$J = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i\, \hat{y}_i}{y_i + \hat{y}_i - y_i\, \hat{y}_i}, \qquad L = H - \log J \tag{18}$$
Figure 2.
Proposed optic disc boundary detection using TernausNet.
From Equation 18, the system derives the chance of diminishing the loss function while increasing the probability of correctly predicting the pixel values of the intersected region. Here, feature maps were fed into an RPN, which proposed a number of candidate regions with bounding box probability scores on the disc boundary. Prediction of the output images depended on the input image, with every pixel value matched to a probability of the required area detection. Based on the loss function, the threshold value was set to 0.3. The predicted OD boundary image is shown in black and white after multiplying every pixel by 255: a pixel whose probability is lower than the assigned threshold is set to 0, and a pixel whose probability is higher than the threshold is set to 1.
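A PyTorch sketch of the loss behind Equation 18 follows, using binary cross-entropy minus the log of a soft Jaccard index, as in the TernausNet formulation; eps is an assumed numerical-stability constant.

```python
import torch

def ternaus_loss(pred, target, eps=1e-7):
    """Loss of Equation 18: binary cross-entropy H minus log of a soft
    Jaccard index J, so minimizing the loss jointly maximizes per-pixel
    accuracy and the overlap of the predicted OD region.

    pred:   predicted probabilities in (0, 1).
    target: binary ground-truth mask of the same shape.
    """
    h = torch.nn.functional.binary_cross_entropy(pred, target)
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    j = (inter + eps) / (union + eps)       # soft Jaccard index
    return h - torch.log(j)
```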
In the proposed work, color, shape, and texture features were extracted using an FRCNN, which can better detect the objects (OC and OD) as joint bounding boxes. Feature maps of shape, color, and texture were input to the RPN to generate candidate region proposals. All proposals had a score signifying the probability of the box containing the OD and OC. Anchors were generated to produce bounding boxes, with sliding windows used for anchor generation. The outputs are encoded coordinates of the convolution, not of the bounding boxes. Our aim was to detect the OD and OC in retinal images based on region proposals scored by shape, color, and texture. After this, ROI pooling was used to select small parts of the feature maps together with the coordinates of the candidate bounding boxes. ROI pooling lets the classifier concentrate on the feature map within each bounding box, using max pooling over an m × m region.
Deep convolutional layers have classifiers, which indicate the probability of the target class based on generating the encoded bounding box coordinates. Here, the Softmax layer was used to create the probability of the scores. The optimal bounding box prediction was obtained according to the correlation between RPN candidate bounding box coordinates and their encoded coordinates.
After discriminating normal and glaucomatous retinal images, the system calculates the ROC curves of the given dataset and the AUROC, which account for the performance of glaucoma classification. A higher AUROC indicates better execution of the newly designed framework.
Evaluation of OD boundary detection and joint segmentation results
Next, performance measures were introduced to enhance the confidence level of the proposed segmentation. Outputs of the segmented objects were initialized as true positive (TP) and false positive (FP) of the OD/OC boundary, and the backdrop part was initialized as true negative (TN) and false negative (FN). Based on this, a confusion matrix was designed to measure performance.
TernausNet segmentation outputs are illustrated in Figure 3. This method was compared with the superpixel method, 46 quadratic divergence regularized SVM, 45 U-Net-based segmentation, 8 and the FRCNN method. 32
Figure 3.
TernausNet and FRCNN glaucoma screening using the (a) G1020 and (b) HRF and DRIVE retinal datasets. FRCNN, faster region-based convolutional neural network; HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
We created a comparison chart based on OD and OC error and organized the segmentation tasks as bounding box detection. The proposed method simplifies complex methods, making them faster and easier, and it can be run with or without a pre-trained model and with transfer learning. This task reduced the error of OD and OC segmentation on a perfect disc boundary (see Table 1).
Table 1.
Evaluation of optic disc and optic cup segmentation with different techniques.
| Techniques | Disc error | Cup error |
|---|---|---|
| Superpixel 46 | 0.102 | 0.264 |
| Quadratic divergence SVM 45 | 0.11 | – |
| Unet 8 | 0.115 | 0.287 |
| FRCNN 32 | 0.069 | 0.222 |
| Proposed TernausNet | 0.061 | 0.207 |
SVM, support vector machine; FRCNN, faster region-based convolutional neural network.
In this work, various performance measures were incorporated into glaucoma classification screening. These performance indicators are listed as Equations 19–24:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{19}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{20}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{21}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{22}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{23}$$
$$\text{F-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{24}$$
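A direct translation of these equations into a minimal sketch (the function name is illustrative; note that recall duplicates sensitivity, as in the equations above):

```python
def screening_metrics(tp, tn, fp, fn):
    """Performance indicators of Equations 19-24, computed from the
    confusion matrix counts of the glaucoma classifier."""
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)        # true-positive rate (Eq. 20)
    specificity = tn / (tn + fp)        # true-negative rate (Eq. 21)
    precision   = tp / (tp + fp)        # Eq. 22
    recall      = sensitivity           # recall equals sensitivity (Eq. 23)
    f_score     = 2 * precision * recall / (precision + recall)
    return accuracy, sensitivity, specificity, precision, recall, f_score
```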
Discussion
Table 2 depicts the performance of the FRCNN classifier without an optimizer in the training set (G1020) and test sets (HRF and DRIVE). The new proposed framework achieved an accuracy of 95.11%, 93.7%, and 92.87%, respectively, with the datasets listed in the comparison charts (Tables 2–4). The AUROC was 0.92 for the G1020, 0.84 for the HRF, and 0.87 for the DRIVE retinal dataset classification (see Figure 4). The output of the above was sent to an optimizer (AAA) to improve the accuracy of the glaucoma classifier by reducing the MSE.
Table 2.
Comparison of FRCNN classification using different retinal datasets.
| Dataset name | Accuracy % | Sensitivity % | Specificity % | Precision % | Recall % | F-score % |
|---|---|---|---|---|---|---|
| G1020 | 95.11 | 93.25 | 94.13 | 87.3 | 92.05 | 89.12 |
| HRF | 93.7 | 92.43 | 93.12 | 85.67 | 92.78 | 88.31 |
| DRIVE | 92.87 | 91.57 | 92.09 | 85.23 | 90.10 | 87.12 |
FRCNN, faster region-based convolutional neural network; HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
Table 3.
Assessment of AFRCNN glaucoma classification.
| Dataset name | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-score (%) |
|---|---|---|---|---|---|---|
| G1020 | 96.15 | 94.10 | 95.12 | 88.92 | 93.05 | 90.93 |
| HRF | 94.47 | 93.35 | 94.05 | 86.20 | 92.56 | 89.92 |
| DRIVE | 93.92 | 92.61 | 93.01 | 86.57 | 91.10 | 88.17 |
AFRCNN, adaptive gradient algorithm optimizer faster region-based convolutional neural network; HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
Table 4.
Performance of AAASVM glaucoma classification.
| Dataset name | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-score (%) |
|---|---|---|---|---|---|---|
| G1020 | 97.15 | 93.70 | 94.12 | 88.72 | 92.05 | 91.82 |
| HRF | 96.51 | 94.25 | 95.15 | 85.15 | 93.51 | 88.12 |
| DRIVE | 95.90 | 93.61 | 94.87 | 87.38 | 92.49 | 89.36 |
AAASVM, artificial algae algorithm with support vector machine; HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
Figure 4.
Results of AAASVM area under curve on ROC for (a) G1020, (b) HRF, and (c) DRIVE. AAA, artificial algae algorithm; SVM, support vector machine; ROC, receiver operating characteristic; HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
The evaluation accuracy was increased by adopting the AAA as the solver for the SVM–FRCNN (see Table 5). This improved the learning rate and yielded good accuracy; the optimizer increased the accuracy by 1.02% for each dataset. The AUROC was between 0.87 and 0.93 on the training and testing sets, with the curves concentrated toward the upper left of the graph. This also yielded an inference time 0.02 s faster than that of the normal method.
Table 5.
Comparison of the performance of different classifiers.
| Classifiers | Average sensitivity (%) | Average specificity (%) | Average accuracy (%) |
|---|---|---|---|
| FRCNN | 92.416 | 93.113 | 93.89 |
| AFRCNN | 93.353 | 94.061 | 94.85 |
| SVM | 93.051 | 94.021 | 95.02 |
| AAASVM | 93.856 | 94.706 | 96.52 |
AFRCNN, adaptive gradient algorithm optimizer faster region-based convolutional neural network; AAASVM, artificial algae algorithm with support vector machine.
Figures 5–7 depict the performance assessment of the FRCNN, optimized AFRCNN, SVM, and AAASVM techniques. The AAASVM method had optimal accuracy compared with the FRCNN, tuned AFRCNN, and SVM.
Figure 5.
Performance measures of the FRCNN without tuning. FRCNN, faster region-based convolutional neural network; HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
Figure 6.
Performance measures of the tuned AFRCNN. HRF, high-resolution fundus; DRIVE, digital retinal images vessel extraction.
Figure 7.
Performance measures of the different classifiers. AFRCNN, adaptive gradient algorithm optimizer faster region-based convolutional neural network; AAASVM, artificial algae algorithm with support vector machine.
Performance evaluation based on the Friedman test
We compared the AUROCs obtained by the classifiers at a significance level of 0.05; 47 a null hypothesis was set for evaluation purposes.
The null hypothesis states that all methods are equal and that their individual rankings are also equal. Here, the classifier with the highest AUROC value is considered the best classifier and is ranked 1. The average rank R of each classifier i is calculated using Equation 25:
$$R_i = \frac{1}{N} \sum_{j=1}^{N} r_i^{\,j} \tag{25}$$
where N = 3 is the number of datasets, r_i^j is the rank of the ith classifier on the jth dataset, and k = 4 is the number of classifiers. The resulting probability was below the 0.05 level of chance; therefore, the null hypothesis was rejected and each classifier was listed based on its own ranking. According to this, the AUROC of the AAASVM surpassed that of the other classifiers.
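As an illustration of this evaluation, the sketch below computes the mean ranks of Equation 25 and the Friedman statistic with SciPy. The FRCNN and AAASVM rows use the per-dataset AUROCs reported in this article; the AFRCNN and SVM rows are placeholders because their per-dataset AUROCs are not reported here.

```python
import numpy as np
from scipy import stats

# AUROC of each classifier (rows) on the N = 3 datasets
# (columns: G1020, HRF, DRIVE).
auroc = np.array([
    [0.92, 0.84, 0.87],   # FRCNN (per the Discussion)
    [0.88, 0.85, 0.86],   # AFRCNN (placeholder values)
    [0.89, 0.86, 0.86],   # SVM (placeholder values)
    [0.90, 0.87, 0.85],   # AAASVM (per the Conclusions)
])

# Equation 25: mean rank of each classifier across datasets
# (rank 1 = highest AUROC on that dataset).
ranks = stats.rankdata(-auroc, axis=0)     # rank within each dataset
mean_rank = ranks.mean(axis=1)

# Friedman test over the k = 4 classifiers; reject the null
# hypothesis of equal performance when p < 0.05.
stat, p = stats.friedmanchisquare(*auroc)
print(mean_rank, stat, p)
```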
The current proposed system can serve as a good resource for assessing glaucoma classification at the primary stage, with an increased optimum value of 0.86, for better prediction of glaucoma. Some prior classifiers are listed in Table 6 for comparative analysis.
Table 6.
Assessment in comparison with other glaucoma screenings
| Authors | Classifiers | No. of features | Accuracy (%) |
|---|---|---|---|
| Koh et al. 48 | Random forest | 15 | 92.48 |
| Mookiah et al. 49 | SVM | 35 | 95 |
| Kausu et al. 50 | MLP | 4 | 97.6 |
| Dua et al. 51 | SMO | 14 | 93.3 |
| Elangovan and Nath 52 | CNN | – | 96.6 |
| Proposed | AFRCNN, AAASVM | 14 | 96.15, 97.01 |
AFRCNN, adaptive gradient algorithm optimizer faster region-based convolutional neural network; AAASVM, artificial algae algorithm with support vector machine; MLP, multilayer perceptron.
Conclusions
In this work, TernausNet was used for effective boundary detection of the OD and joint segmentation of the OD and OC. Minimum error was obtained using three retinal datasets to predict the onset of glaucoma, including general identification of the OD, which is quite difficult. This technique provided OD and OC errors of 0.061 and 0.207, respectively (Table 1). This led to better prediction of glaucoma using the AFRCNN, which attained an accuracy of 96.15% with the G1020 dataset, 94.47% with the HRF dataset, and 93.92% with the DRIVE dataset. The Softmax layer was then replaced by an SVM with weights optimized using the AAA, which reached optimal AUROCs of 0.9, 0.87, and 0.85 for these three datasets, respectively. This system, which combines computer vision algorithms on fundus images with AAASVM optimization, can serve as a useful computer-aided decision support system with low inference time. One advantage of the proposed framework is that it can be executed on any processor, like a traditional method, rather than requiring dedicated deep learning hardware. In the future, this output could be augmented as a synthetic dataset for a deep learning framework. Finally, the segmented and classified retinal images can be updated in the patient medical database for frequent updating of retinal status through the Internet of Medical Things and cloud computing. Using this approach, optometrists and ophthalmologists can easily conduct follow-up and predict patient prognosis, thereby helping to prevent vision loss among patients with glaucoma.
The proposed system was developed using a small retinal database and only for the prediction of glaucoma, not other retinal disorders, owing to the limited availability of retinal information. No databases include retinal data along with the other factors needed to predict vision loss. In the future, this work will be extended to a large retinal image dataset with the aim of effectively predicting other retinal diseases.
Acknowledgement
The first author would like to thank the management of Kalasalingam Academy of Research and Education for providing a fellowship to carry out the present research work.
Author contributions: Conceptualization, design, first draft of the manuscript: M. Shanmuga Eswari.
Analysis: Lakshmana Kumar Ramasamy.
Evaluation and revision of the manuscript: S. Balamurali.
The authors declare that there is no conflict of interest.
Funding: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
ORCID iD: Lakshmana Kumar Ramasamy https://orcid.org/0000-0002-0643-6599
Data availability
Publicly available datasets were used for this research.
References
- 1.Tham YC, Li X, Wong TY, et al. Global prevalence of glaucoma and projections of glaucoma burden through 2040: a systematic review and meta-analysis. Ophthalmology 2014; 121: 2081–2090. [DOI] [PubMed] [Google Scholar]
- 2.Calkins DJ, Pekny M, Cooper ML, et al.; Lasker/IRRF Initiative on Astrocytes and Glaucomatous Neurodegeneration Participants. The challenge of regenerative therapies for the optic nerve in glaucoma. Exp Eye Res 2017; 157: 28–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Drance S, Anderson DR, Schulzer M; Collaborative Normal-Tension Glaucoma Study Group. Risk factors for progression of visual field abnormalities in normal-tension glaucoma. Am J Ophthalmol 2001; 131: 699–708. [DOI] [PubMed] [Google Scholar]
- 4.Tjandrasa H, Wijayanti A, Suciati N. Optic nerve head segmentation using hough transform and active contours. Telkomnika 2012; 10: 531–536. [Google Scholar]
- 5.Pathan S, Kumar P, Pai R, et al. Automated detection of optic disc contours in fundus images using decision tree classifier. Biocybernetics and Biomedical Engineering 2020; 40: 52–64. [Google Scholar]
- 6.Gao Y, Yu X, Wu C, et al. Accurate and efficient segmentation of optic disc and optic cup in retinal images integrating multi-view information. IEEE Access 2019; 7: 148183–148197. [Google Scholar]
- 7.Zilly J, Buhmann JM, Mahapatra D. Glaucoma detection using entropy sampling and ensemble learning for automatic optic cup and disc segmentation. Comput Med Imaging Graph 2017; 55: 28–41. [DOI] [PubMed] [Google Scholar]
- 8.Sevastopolsky A. Optic disc and cup segmentation methods for glaucoma detection with modification of U-Net convolutional neural network. Pattern Recognit Image Anal 2017; 27: 618–624. [Google Scholar]
- 9.Carmona EJ, Rincón M, García-Feijoó J, et al. Identification of the optic nerve head with genetic algorithms. Artif Intell Med 2008; 43: 243–259. [DOI] [PubMed] [Google Scholar]
- 10.Al-Bander B, Williams BM, Al-Nuaimy W, et al. Dense fully convolutional segmentation of the optic disc and cup in colour fundus for glaucoma diagnosis. Symmetry 2018; 10: 87. [Google Scholar]
- 11.Yu S, Xiao D, Frost S, et al. Robust optic disc and cup segmentation with deep learning for glaucoma detection. Comput Med Imaging Graph 2019; 74: 61–71. [DOI] [PubMed] [Google Scholar]
- 12.Shankaranarayana SM, Ram K, Mitra K, et al. Joint optic disc and cup segmentation using fully convolutional and adversarial networks. In 4th International Workshop on Fetal, Infant and Ophthalmic Medical Image Analysis 2017; 168–176. [Google Scholar]
- 13.Fu H, Cheng J, Xu Y, et al. Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Trans Med Imaging 2018; 37: 1597–1605. [DOI] [PubMed] [Google Scholar]
- 14.Thakur N, Juneja M. Survey on segmentation and classification approaches of optic cup and optic disc for diagnosis of glaucoma. Biomedical Signal Processing and Control 2018; 42: 162–189. [Google Scholar]
- 15.Joshi S, Partibane B, Hatamleh WA, et al. Glaucoma Detection Using Image Processing and Supervised Learning for Classification. J Healthc Eng 2022; 2022: 2988262. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Iglovikov V, Shvets A. TernausNet: U-Net with VGG11 encoder pre-trained on ImageNet for image segmentation. 2018; arXiv:1801.05746.
- 17.Koh JE, Mookiah MRK, Kadri NA. Application of multiresolution analysis for the detection of glaucoma. J Med Imaging Hlth Inform 2013; 3: 401–408. [Google Scholar]
- 18.Benzebouchi NE, Azizi N, Bouziane SE. Glaucoma diagnosis using cooperative convolutional neural networks. Proceedings of ISER 88th International Conference 2017; 1–6. [Google Scholar]
- 19.Al-Bander B, Al-Nuaimy W, Al-Taee MA, et al. Automated glaucoma diagnosis using deep learning approach. Proceedings of 14th IEEE International Multi-Conference on Systems, Signals & Devices (SSD) 2017; 207–210. [Google Scholar]
- 20.Raghavendra U, Fujita H, Bhandary SV, et al. Deep convolution neural network for accurate diagnosis of glaucoma using digital fundus images. Information Sciences 2018; 441: 41–49. [Google Scholar]
- 21.Norouzifard M, Nemati A, Gholam HH, et al. Automated glaucoma diagnosis using deep and transfer learning: proposal of a system for clinical testing. Proceedings of IEEE International Conference on Image and Vision Computing New Zealand (IVCNZ) 2018; 1–6. [Google Scholar]
- 22.Diaz-Pinto A, Morales S, Naranjo V, et al. CNNs for automatic glaucoma assessment using fundus images: an extensive validation. Biomedical Engineering 2019; 18: 29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Kim M, Janssens O, Park HM, et al. Web Applicable Computer-aided Diagnosis of Glaucoma Using Deep Learning. 2018; arXiv:1812.02405.
- 24.Juneja M, Singh S, Agarwal N, et al. Automated detection of glaucoma using deep learning convolution network (G-net). Multimed Tools Appl 2019; 79: 15531–15553. [Google Scholar]
- 25.Zhou W, Yi Y, Bao J, et al. Adaptive weighted locality-constrained sparse coding for glaucoma diagnosis. Med Biol Eng Comput 2019; 57: 2055–2067. [DOI] [PubMed] [Google Scholar]
- 26.Gour N, Khanna P. Multi-class multi-label ophthalmological disease detection using transfer learning based convolutional neural network. Biomedical Signal Processing and Control (In Press) 2020; 66: 102329. [Google Scholar]
- 27.Jiang Y, Duan L, Cheng J, et al. JointRCNN: a region-based convolutional neural network for optic disc and cup segmentation. IEEE Trans Biomed Eng 2019; 67: 335–343. [DOI] [PubMed] [Google Scholar]
- 28.Gidaris S, Komodakis N. Object detection via a multi-region and semantic segmentation-aware CNN model. Proceedings of the IEEE international conference on computer vision 2015; 1134–1142. [Google Scholar]
- 29.Guindel C, Martin D, Armingol JM. Fast joint object detection and viewpoint estimation for traffic scene understanding. IEEE Intell Transport Syst Mag 2018; 10: 74–86. [Google Scholar]
- 30.Xu Q, Wang G, Li Y, et al. A comprehensive swarming intelligent method for optimizing deep learning-based object detection by unmanned ground vehicles. Plos One 2021; 16: e0251339. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Wang G, Guo J, Chen Y, et al. A PSO and BFO-based learning strategy applied to faster R-CNN for object detection in autonomous driving. IEEE Access 2019; 7: 18840–18859. [Google Scholar]
- 32.Su Y, Li D, Chen X. Lung nodule detection based on faster R-CNN framework. Comput Methods Programs Biomed 2021; 200: 105866. [DOI] [PubMed] [Google Scholar]
- 33.Wan S, Goudos S. Faster R-CNN for multi-class fruit detection using a robotic vision system. Computer Networks 2020; 168: 107036. [Google Scholar]
- 34.Li S, Chen H, Wang M, et al. Slime mould algorithm: A new method for stochastic optimization. Future Generation Computer Systems 2020; 111: 300–323. [Google Scholar]
- 35.Heidari AA, Mirjalili S, Faris H, et al. Harris hawks optimization: Algorithm and applications. Future Generation Computer Systems 2019; 97: 849–872. [Google Scholar]
- 36.Mirjalili S, Saremi S, Mirjalili SM, et al. Multi-objective grey wolf optimizer: a novel algorithm for multi-criterion optimization. Expert Systems with Applications 2016; 47: 106–119. [Google Scholar]
- 37.Huang X, Islam MR, Akter S, et al. Artificial intelligence in glaucoma: opportunities, challenges, and future directions. Biomed Eng Online 2023; 22: 126. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Jain S, Salau AO. Detection of glaucoma using two dimensional tensor empirical wavelet transform. SN Appl Sci 2019; 1: 1417. [Google Scholar]
- 39.Salau AO, Jain S. Feature Extraction: A Survey of the Types, Techniques, Applications. Proceedings of IEEE International Conference on Signal Processing and Communication (ICSC). NOIDA, India 2019; 158–164.
- 40.Qian N. On the momentum term in gradient descent learning algorithms. Neural Netw 1999; 12: 145–151. [DOI] [PubMed] [Google Scholar]
- 41.Uymaz SA, Tezel G, Yel E. Artificial algae algorithm (AAA) for nonlinear global optimization. Applied Soft Computing 2015; 31: 153–171. [Google Scholar]
- 42.G1020 retinal public dataset. https://www.dfki.de/SDS-Info/G1020/. 2019. (accessed on August 2019).
- 43.HRF retinal image dataset. https://www5.cs.fau.de/research/data/fundus-images/. 2019. (accessed on Dec 2019).
- 44.Drive retinal dataset. https://drive.grand-challenge.org/. 2019. (accessed on Dec 2019).
- 45.Cheng J, Tao D, Wong DWK, et al. Quadratic divergence regularized SVM for optic disc segmentation. Biomed Opt Express 2017; 8: 2687–2696. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Cheng J, Liu J, Xu Y, et al. Superpixel classification based optic disc and optic cup segmentation for glaucoma screening. IEEE Trans Med Imaging 2013; 32: 1019–1032. [DOI] [PubMed] [Google Scholar]
- 47.Friedman M. A comparison of alternative tests of significance for the problem of m rankings. Ann Math Statist 1940; 11: 86–92. [Google Scholar]
- 48.Koh JE, Acharya UR, Hagiwara Y, et al. Diagnosis of retinal health in digital fundus images using continuous wavelet transform (CWT) and entropies. Comput Biol Med 2017; 84: 89–97. [DOI] [PubMed] [Google Scholar]
- 49.Mookiah MRK, Acharya UR, Lim CM, et al. Data mining technique for automated diagnosis of glaucoma using higher order spectra and wavelet energy features. Knowledge Based Systems 2012; 33: 73–82. [Google Scholar]
- 50.Kausu TR, Gopi VP, Wahid KA, et al. Combination of clinical and multiresolution features for glaucoma detection and its classification using fundus images. Biocybernetics and Biomedical Engineering 2018; 38: 329–341. [Google Scholar]
- 51.Dua S, Acharya UR, Chowriappa P, et al. Wavelet-based energy features for glaucomatous image classification. IEEE Trans Inf Technol Biomed 2011; 16: 80–87. [DOI] [PubMed] [Google Scholar]
- 52.Elangovan P, Nath MK. Glaucoma assessment from color fundus images using convolutional neural network. Int J Imaging Syst Tech 2021; 31: 955–971. [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
Publicly available datasets were used for this research.