Neural Process Lett. 2022 Jan 24;54(3):2363–2384. doi: 10.1007/s11063-021-10734-0

Multithreshold Image Segmentation Technique Using Remora Optimization Algorithm for Diabetic Retinopathy Detection from Fundus Images

V Desika Vinayaki 1, R Kalaiselvi 1
PMCID: PMC8784591  PMID: 35095328

Abstract

One of the most common complications of diabetes mellitus is diabetic retinopathy (DR), which produces lesions on the retina. A novel framework for DR detection and classification is proposed in this study. The proposed work includes four stages: pre-processing, segmentation, feature extraction, and classification. Initially, image pre-processing is performed; after that, the Multi threshold-based Remora Optimization (MTRO) algorithm performs the vessel segmentation. Feature extraction and classification are performed using a Region-based Convolutional Neural Network (R-CNN) with the Wild Geese Algorithm (WGA). Finally, the proposed R-CNN with WGA effectively classifies the different stages of DR: Non-DR, Mild DR, Moderate DR, Severe DR, and Proliferative DR. The experimental images were collected from the DRIVE database, and the proposed framework exhibited superior DR detection performance. Compared to existing methods such as the fully convolutional deep neural network (FCDNN), genetic-search feature selection (GSFS), Convolutional Neural Networks (CNN), and deep learning (DL) techniques, the proposed R-CNN with WGA provided 95.42% accuracy, 93.10% specificity, 93.20% sensitivity, and a 98.28% F-score.

Keywords: Diabetic retinopathy, DRIVE database, Multi threshold-based Remora Optimization, Faster R-CNN, Wild Geese Algorithm

Introduction

Diabetic retinopathy (DR) is one of the most prevalent causes of blindness in humans [1]. The most common cause of DR is injury to the blood vessels in the eye tissue. Blurred vision, floaters, difficulty perceiving colors, and other symptoms are some of the first signs. DR can result in vision loss, and uncontrolled blood sugar is a risk factor. If DR is detected early enough, it can be treated; there are usually no severe symptoms during the early phases, so a failure of detection may cause serious damage. As a result, it is important to identify DR from images captured using various modalities [2].

The pupil can be dilated (mydriatic) or not dilated (non-mydriatic) for the purpose of acquiring retinal images [3]. Professionally trained centers or readers can examine the captured images. Meanwhile, exudates are classified into two types: soft and hard exudates [4]. Soft exudates occur when the white portion of the eye has faint yellow margins, whereas hard exudates develop when yellow dots appear in the retina. Manually identifying exudates is inaccurate and error-prone. Therefore, automatic detection of exudates is proposed, i.e. by employing artificial intelligence algorithms on the machine.

When the retina is affected, abnormal features such as exudates, microaneurysms, neovascularization, cotton wool patches, and hemorrhages appear [5]. The optic disc, blood vessels, fovea, and macula are all normal features of the retina [6]. The extracted characteristics are not always correct and often supply only a small amount of information, making it difficult to continue with the procedure. Because determining the locations of the optic nerve head, fovea, and macula is difficult, the classification accuracy drops.

In order to diagnose retinal diseases, image processing techniques such as automated segmentation of retinal fundus images, localization and identification of blood vessels and optic discs, and detection of exudates, hemorrhages, and macula are used. Furthermore, the screening [7] procedure entails identifying blood vessels in the retina and extracting some of their characteristics, such as angle, breadth, length, and color. Meanwhile, computerized surgery and biometric person identification also involve segmentation of blood vessels in eye images. Since manual segmentation is a complex process, automatic segmentation is often suggested to address this issue, reducing the error rate and time. Moreover, segmenting the retinal blood vessels from the image helps to provide better treatment [8].

Although the global thresholding strategy was used in existing methodologies [9], it was found to be ineffective in segmenting retinal vascular systems, possibly because of limitations in the preprocessing step [10]. An effective preprocessing phase, a good global thresholding method, and an efficient postprocessing technique are all necessary for successful vessel segmentation. Popular image processing algorithms that were developed and assessed using low-resolution fundus pictures have exhibited limits in clinical application because of the increased spatial resolution of modern fundus images. To achieve this purpose, a new generation of methods must be developed, capable of dealing with high-resolution images while remaining computationally simple [11–13].

In this context, we propose a novel approach, the Multi threshold-based Remora Optimization (MTRO) algorithm, for the segmentation of blood vessels from fundus images. The retinal images from the DRIVE dataset are converted to grayscale and preprocessed; this procedure turns the images into the needed contrast. The adoption of ROA steadily enhances the performance of the multi-thresholding (MT). The segmented images are then classified based on the Faster R-CNN with WGA: the features are extracted and classified using the Faster R-CNN, and the parameters are fine-tuned using the WGA method. As a result, the classification accuracy is improved. This method classifies the images as Non-DR, Proliferative DR, Moderate DR, Severe DR, and Mild DR.

The major contributions of this work are listed below:

  • We present a novel vessel segmentation method using the MTRO algorithm that needs less processing time and allows us to distinguish specular reflexes of thick vessels that are not visible in lower resolution fundus images.

  • Because the searching capability of MT is weak and it is susceptible to local trapping, the ROA is utilized to improve the image segmentation performance.

  • The MTRO algorithm helps dynamically select the optimal grey-level threshold values for segmenting retinal vessels from background tissue in an image based on their intensity distribution.

The rest of the paper is organized as follows: in Sect. 2, the literature related to the proposed work is surveyed. The proposed methodology is explained in Sect. 3. The experimental analysis is elaborated in Sect. 4. Finally, the work is summarized in Sect. 5.

Review of Related Works

The fully convolutional deep neural network (FCDNN) was proposed by AlBadawi et al. [14] for venule and arteriole classification in retinal images. Inference and feature learning are applied directly, without requiring a segmented vasculature, and intricate patterns are automatically extracted from the retinal picture without handcrafted features. Based on the experimental results, a detection rate of 93.5% is obtained with the images in the DRIVE database. However, the model suffered from higher costs and computational complexity.

Based on computerized retinal image analysis, data mining and image processing techniques were introduced by GeethaRamani et al. [15] for retinal blood vessel segmentation. During the pre-processing stage, they applied halfwave rectification, Gabor filtering, contrast enhancement, color channel extraction, color transformation, and image cropping. K-means clustering was used to group pixels into vessel and non-vessel clusters, and an ensemble classification procedure combining bagging and decision trees was applied to the vessel cluster. In the experimental analysis, the publicly accessible DRIVE database yielded 95.36% accuracy. The vessel segmentation output was post-processed using morphological approaches; however, the method was unable to classify the fundus classes.

Huang et al. [16] introduced a genetic-search feature selection (GSFS) model for the classification of retinal arteries and veins. During classification, the optimal feature subsets were obtained by applying a genetic-search-based feature selection method over a large number of extracted features. On the INSPIRE dataset, classification results of 91.3% specificity, 89.6% sensitivity, and 90.2% accuracy were attained, but the larger feature set caused computational difficulties. Convolutional Neural Networks (CNN) were suggested by Lyu et al. [17] for deep tessellated retinal image detection. The classification performance was evaluated using 12,000 color retinal images from the database. The transfer learning technique with a pre-trained GoogLeNet provided a 0.9659 AUC value and 97.73% accuracy.

However, missing patches are not estimated by that approach. For diabetic retinopathy (DR) classification, Yu et al. [18] proposed a deep learning (DL) model in which unsupervised features from the saliency map were combined to distinguish poor- and high-quality retinal fundus images. Chen et al. [19] presented a deep learning-based active contour model for medical image segmentation. It mainly overcomes the pixel-wise fitting problem in the segmentation map by taking the areas inside and outside the boundaries, along with the boundaries themselves, as the region of interest; area and size information is represented via a new loss function design. The approach was evaluated on more than 2000 cardiac MRI scans. Zhou et al. [20] presented a Graded-Feature Multilabel-Learning Network (GMNet) that divides multilevel features into different levels (junior, intermediate, and senior). Ke et al. [21] presented a three-stage self-training framework to extract the statistical region of the pseudo masks whose prediction probability is highly uncertain.

Different authors have used neural networks [22] for various purposes in the medical domain; some of them are listed below. Rodrigues et al. [23] evaluated the performance of convolutional neural networks in classifying HEp-2 cells in immunofluorescence images. Pérez-García et al. [24] presented TorchIO, a Python library for loading, preprocessing, augmentation, and patch-based sampling of medical images for deep learning classifiers; it mainly addresses challenges associated with deep learning techniques such as high computational costs and the need for large labeled datasets. Quan et al. [25] utilized a dense capsule neural network (DenseCapsNet) for identifying COVID-19 from X-ray images; using a total of 750 chest X-ray images, pneumonia and COVID-19 were identified in the patients, and the model obtained an accuracy of 90.7%. Li et al. [26] proposed a structural convolutional neural network architecture for speckle noise removal in medical images; the model comprises three subnetworks: a rough clean image estimate subnetwork, a noise estimate network, and an information fusion network.

Wang et al. [27] presented an approach known as deep interactive geodesic (DeepIGeoS) for medical image segmentation, which minimizes user interaction during refinement and offers improved accuracy. The efficiency of this technique was verified on 2D placenta segmentation from fetal MRI and 3D brain tumor segmentation from FLAIR images. To improve the compression ratio, recognition accuracy, and scalability of deep learning classifiers, Yu et al. [28, 29] presented two models, namely multimodal hypergraph learning-based sparse coding (HLSE) and Hierarchical Deep Word Embedding (HDWE).

Proposed Approach

This section proposes novel approaches for retinal image analysis and classification. Figure 1 illustrates the overall flow diagram. The images were obtained from the DRIVE database. First, image pre-processing is performed. Next, retinal image segmentation is performed with the help of the Multi threshold-based Remora Optimization (MTRO) algorithm. After that, both feature extraction and retinal image classification are carried out via R-CNN with WGA.

Fig. 1.

Fig. 1

Overall flow diagram

Data Preprocessing

In this stage, the DRIVE dataset is preprocessed to help differentiate diabetic retinopathy from non-diabetic retinopathy. The pre-processing stage helps to identify the retinal disease clearly; hence it is essential to conduct preprocessing before the feature extraction stage. This process can also be used to recognize blood vessels and microaneurysms. The grayscale conversion technique is performed to accomplish the required contrast, and the shade correction method estimates the background of the image and subtracts it. Figure 2 depicts the fundus image both in color and in grayscale. A minimal sketch of this stage is given below.
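As a rough illustration of this stage, the following Python sketch performs grayscale conversion followed by shade correction. It assumes OpenCV and a median-filter background estimate, since the paper does not specify the exact estimator; the function name and kernel size are illustrative choices, not the authors' code.

```python
import cv2

def preprocess_fundus(path, bg_kernel=51):
    """Minimal sketch of the preprocessing stage: grayscale conversion
    followed by shade correction (background subtraction)."""
    bgr = cv2.imread(path)                       # DRIVE images are color fundus photos
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)

    # Shade correction: estimate the slowly varying background with a
    # large median filter, then subtract the image from it. Dark vessels
    # become bright peaks after the subtraction.
    background = cv2.medianBlur(gray, bg_kernel)
    corrected = cv2.subtract(background, gray)

    # Stretch the result back to the full 8-bit contrast range.
    return cv2.normalize(corrected, None, 0, 255, cv2.NORM_MINMAX)
```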

Fig. 2.

Fig. 2

Fundus image

Proposed Multithreshold Image Segmentation with ROA

Image segmentation is the process of dividing the fundus image into several different regions of interest. Let the image set be $A$, divided into non-empty subsets (subregions) $A_1, A_2, \ldots, A_l$. These subsets must satisfy the following constraints [30].

Constraint 1: The l subregions together contain every pixel of the original image, i.e. $\bigcup_{i=1}^{l} A_i = A$.

Constraint 2: When $i \neq j$, $A_i \cap A_j = \emptyset$; the ith and jth nonempty subsets are disjoint, so a pixel of the image cannot belong to different subregions at the same time.

Constraint 3: The pixels in the same subregion share a property to the maximum degree, i.e. $P(A_i) = \text{True}$ for $i = 1, 2, \ldots, l$.

Constraint 4: The pixels of different subregions have different features, i.e. $P(A_i \cup A_j) = \text{False}$ for $i \neq j$ and nonempty subsets.

For the segmentation of fundus images, a threshold segmentation algorithm is a good option, since it offers easy implementation, low computational complexity, and good performance. Images are partitioned based on threshold values, and the methods divide into single-threshold and multi-threshold image segmentation (a small helper applying a set of thresholds is sketched below). Here, we utilize multi-threshold image segmentation that relies on the highest entropy values.
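The helper below, an illustrative NumPy snippet rather than the authors' code, shows how l threshold values split a grayscale image into l + 1 labeled regions.

```python
import numpy as np

def apply_thresholds(gray, thresholds):
    """Label each pixel with its region index: 0 for values below t1,
    1 for [t1, t2), ..., l for values >= tl."""
    return np.digitize(gray, bins=sorted(thresholds))
```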

The maximum entropy thresholding method was proposed in 1985 [31]. It estimates the optimal threshold value by maximizing the information contained in the target and background regions. Hence, the entropy of the image grayscale histogram is estimated to extract the targeted regions. Let the image's grayscale range be $[0, M-1]$, where $k$ denotes a grayscale value and $t$ the threshold value. The pixels whose grayscale values are at most $t$ form the target region $(T)$, and the values greater than $t$ form the background region $(B)$. The expressions are described below,

$\sum_{k=0}^{M-1} P_k = 1$ (1)
$T: \frac{P_0}{P_n}, \frac{P_1}{P_n}, \ldots, \frac{P_t}{P_n}$ (2)
$B: \frac{P_{t+1}}{1-P_n}, \frac{P_{t+2}}{1-P_n}, \ldots, \frac{P_{M-1}}{1-P_n}$ (3)
$P_n = \sum_{k=0}^{t} P_k$ (4)

The entropies of these two regions, incorporating the probability densities, are given by the following equations,

$E(T) = -\sum_{k=0}^{t} \frac{P_k}{P_n} \ln \frac{P_k}{P_n}$ (5)
$E(B) = -\sum_{k=t+1}^{M-1} \frac{P_k}{1-P_n} \ln \frac{P_k}{1-P_n}$ (6)
$\Psi(t) = E(T) + E(B)$ (7)

The optimal threshold $t$ is the value at which $\Psi(t)$ attains its maximum. The maximum entropy thresholding method extends naturally to multi-threshold segmentation, where finding $l$ thresholds can be treated as an $l$-dimensional optimization problem. Suppose the grayscale values of the image are $0, 1, 2, \ldots, M-1$ and the image is to be segmented into $l+1$ regions. The objective function of the maximum entropy criterion is given as follows,

$O(t_1, t_2, \ldots, t_l) = E_0 + E_1 + \cdots + E_l = -\sum_{k=0}^{t_1-1} \frac{P_k}{\omega_0} \ln \frac{P_k}{\omega_0} - \sum_{k=t_1}^{t_2-1} \frac{P_k}{\omega_1} \ln \frac{P_k}{\omega_1} - \cdots - \sum_{k=t_l}^{M-1} \frac{P_k}{\omega_l} \ln \frac{P_k}{\omega_l}$ (8)

Here, $P_k$ is the probability of occurrence of gray value $k$, and $\omega_j$ is the cumulative probability of the gray values in class $j$. The optimal thresholds are the values $t_1, t_2, \ldots, t_l$ at which $O(t_1, t_2, \ldots, t_l)$ reaches its maximum. Although this criterion achieves good segmentation accuracy, its time complexity grows rapidly, from linearly to exponentially, as the number of thresholds increases. This degrades segmentation efficiency; to address the problem, we adopt the Remora Optimization Algorithm together with the multi-thresholding segmentation approach.
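To make Eq. (8) concrete, the following is a minimal Python (NumPy) sketch of the entropy objective over a normalized grayscale histogram. The function name, the empty-class guard, and the 256-bin convention are illustrative assumptions, not the authors' code.

```python
import numpy as np

def kapur_entropy(thresholds, hist):
    """Maximum-entropy objective O(t1, ..., tl) of Eq. (8) for a
    normalized grayscale histogram `hist` (e.g., 256 bins)."""
    # Class boundaries: [0, t1), [t1, t2), ..., [tl, M).
    edges = [0] + sorted(int(t) for t in thresholds) + [len(hist)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        p = hist[lo:hi]
        w = p.sum()                      # omega_j: cumulative class probability
        if w <= 0:
            return -np.inf               # empty class: invalid threshold vector
        q = p[p > 0] / w
        total += -(q * np.log(q)).sum()  # class entropy E_j
    return total

# Example: hist = np.bincount(gray.ravel(), minlength=256) / gray.size
```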

ROA

To reduce the time complexity of the multi-thresholding approach, we utilize the ROA. This algorithm is based on the behaviors of the remora, a suckerfish from the family Echeneidae [32]. It can execute a global search over images with comparable grayscale values, reducing the multi-thresholding's time complexity, and it helps prevent the search from being confined by local trapping. As a result, accurate segmentation can be conducted for retinal image analysis of the fundus image.

The Numerical Expression of ROA

The remora optimization algorithm is mainly inspired by the remora, an intelligent traveler of the ocean that attaches itself to larger hosts. The algorithm has two stages: exploration and exploitation. Remora behaviors such as free travel and mindful eating are used to derive the numerical expressions. A small trial step is used for mode switching, and the precision of the optimization is improved via the remora factor, which promotes convergence. When making mode-switching decisions, the phases of free travel, eating thoughtfully, and experience play a vital role. These strategies help the ROA achieve optimal results. The steps of the ROA are presented below:

  • (i) Initialization

The best solution is called Remora, and its current position $D$ represents a candidate solution in the search space. Remora's position changes according to the size of the search space. The current location is denoted $D_i = (D_{i1}, D_{i2}, \ldots, D_{id})$, where $d$ is the dimension of remora swimming and $i$ indexes the remora in the population. Similarly, the optimal solution of the algorithm is represented as $D_{Opt} = (D_1, D_2, \ldots, D_d)$. Each candidate solution has its own fitness value, expressed as $F(D_i) = F(D_{i1}, D_{i2}, \ldots, D_{id})$, where $F$ is the fitness function. $F(D_{Opt}) = F(D_1, D_2, \ldots, D_d)$ denotes the optimal fitness of the corresponding remora position.

  • SFO Strategy

The position of the Remora can be updated when it is attached to the swordfish and it can be expressed as,

$D_i^{t+1} = D_{Opt}^t - \left(rand(0,1) \times \frac{D_{Opt}^t + D_{rand}^t}{2} - D_{rand}^t\right)$ (9)

Here, $D_i^{t+1}$ is the updated position of remora $i$, $T$ is the maximum number of iterations, and $t$ denotes the current iteration. The random position of a remora is denoted $D_{rand}^t$, and $D_{Opt}^t$ is the best position identified so far. These variables ensure the global search capability of the algorithm. Moreover, the random selection of a remora is based on the fitness value, and the fitness value of the current iteration is obtained in the experience attack step.

  • Experience Attack

This phase can be used to estimate Remora's change of host. It may be stated as follows:

$D_{att} = D_i^t + \left(D_i^t - D_{pre}\right) \times randn$ (10)

The position of the previous generation is represented by $D_{pre}$, and the tentative step is denoted $D_{att}$; this is a global search movement, and the random number $randn$ is chosen appropriately. This step compares the fitness values of the current solution $F(D_i^t)$ and the attempted solution $F(D_{att})$. When $F(D_{att})$ is the smaller one, i.e. $F(D_i^t) > F(D_{att})$, the remora adopts a new feeding strategy to achieve local optimization. If instead the current solution's fitness is lower than that of the attempted one, the remora reverts to the prior position, which may be expressed as:

$F(D_i^t) < F(D_{att})$ (11)
  • (ii) Eat Thoughtfully (Exploitation)

  • WOA Strategy

This is based on the attachment of Remora to the whale and the position updates are formulated as,

$D_{i+1} = L \times e^{\delta} \times \cos(2\pi\delta) + D_i$ (12)
$\delta = rand(0,1) \times (a - 1) + 1$ (13)
$a = -\left(1 + \frac{t}{T}\right)$ (14)
$L = |D_{Opt} - D_i|$ (15)

The position of the remora depends on the whale: $L$ is the distance between the hunter and the prey (the current optimal solution), $\delta$ is a random number in the range [−1, 1], and $a$ decreases linearly from −1 to −2 over the iterations.

  • Host Feeding

It is the subsection in the exploitation stage. The solution space of this stage can be reduced to the host's position space which is formulated as follows:

$D_i^t = D_i^t + R$ (16)
$R = S \times \left(D_i^t - C \times D_{Opt}\right)$ (17)
$S = 2 \times V \times rand(0,1) - V$ (18)
$V = 2 \times \left(1 - \frac{t}{I_{Max}}\right)$ (19)

The small movement step is designated as $R$, and it is proportional to the volume space of the host and the remora; $C$ is the remora factor, a small constant. $S$ is employed in the solution space to narrow down the location of the remora. The flowchart of the ROA is shown in Fig. 3.

Fig. 3.

Fig. 3

Flow diagram of ROA

Proposed MTRO Segmentation Approach for the Retinal Analysis

The MTRO-based retinal segmentation involves the steps described below:

  • The retinal images are converted into grayscale images in order to obtain the grayscale distribution. Based on these grayscale values, the search boundaries of the ROA are also set.

  • The Remora individuals are initialized in the search space; this step sets the dimension, the population size, and the number of iterations.

  • The fitness function (the multi-threshold entropy objective) is evaluated for each Remora.

  • The best fitness values are preserved for the selection process and saved at each iteration.

  • The previous steps are repeated until the terminating condition is reached, and the resulting output is regarded as the best threshold vector for retinal image segmentation. The segmentation results are graphically provided in Table 1, and a simplified sketch of this optimization loop is given after the table.

Table 1.

Graphical description of MTRO segmentation

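As a concrete illustration of these steps, the sketch below drives the Kapur objective of Eq. (8) (reusing the kapur_entropy helper given earlier) with a simplified Remora-style search. The 50/50 mode switch, parameter values, and names are illustrative assumptions; the authors' actual implementation may differ.

```python
import numpy as np

def mtro_thresholds(hist, n_thresh=3, pop=30, iters=100, C=0.1, seed=0):
    """Simplified MTRO sketch: Remora-style population search that
    maximizes the Kapur entropy objective (kapur_entropy, above)."""
    rng = np.random.default_rng(seed)
    lo, hi = 1, len(hist) - 1                         # grey-level search bounds
    X = rng.uniform(lo, hi, (pop, n_thresh))          # remora positions
    fit = np.array([kapur_entropy(x, hist) for x in X])
    best = X[fit.argmax()].copy()

    for t in range(iters):
        V = 2 * (1 - t / iters)                       # Eq. (19)
        a = -(1 + t / iters)                          # Eq. (14)
        for i in range(pop):
            if rng.random() < 0.5:                    # exploration: SFO strategy, Eq. (9)
                r = X[rng.integers(pop)]
                cand = best - (rng.random() * (best + r) / 2 - r)
            else:                                     # exploitation: WOA strategy, Eqs. (12)-(15)
                delta = rng.random() * (a - 1) + 1
                cand = np.abs(best - X[i]) * np.exp(delta) * np.cos(2 * np.pi * delta) + X[i]
            S = 2 * V * rng.random(n_thresh) - V      # host feeding, Eqs. (16)-(18)
            cand = np.clip(cand + S * (cand - C * best), lo, hi)
            f = kapur_entropy(cand, hist)
            if f > fit[i]:                            # greedy acceptance of improvements
                X[i], fit[i] = cand, f
        best = X[fit.argmax()].copy()
    return np.sort(best).astype(int)
```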

Feature Extraction and Classification

The Region-based Convolution Neural Network (R-CNN) with Wild Geese Algorithm (WGA) performs both feature extraction and retinal image classification. The classification steps are discussed as follows:

Faster R-CNN

Figure 4 describes the schematic diagram of the Faster R-CNN model, which consists of three major sections, namely classification, boundary regression, and feature extraction. Convolution layers are used first to extract feature maps while conducting object detection [33]. A softmax layer judges the foreground/background attribute of anchor points through the Region Proposal Network (RPN) and then uses bounding box regression to correct the anchor points, resulting in reasonably accurate candidate regions [34]. In each layer, significant information is retrieved gradually by employing a new representation of the input image. The Faster R-CNN network structure is as follows: a ResNet101 backbone is partitioned into five stages {conv1, conv2_a, conv3_a, conv4_a, conv5_a}. The Region of Interest (ROI) pooling and the RPN share the conv4_a output. ROI pooling resizes each candidate region to the 14 × 14 × 1024 feature map required as input to conv5_a. To derive a 2048-dimensional feature, average pooling is applied before classification.

Fig. 4.

Fig. 4

Structure of R-CNN

From the fundus image, R-CNN extracts meaningful information such as microaneurysm areas, optical distance, exudates, blood vessel identification, circularity, pixel area, and minor and major axis length [35]. Ultimately, ROI pooling is performed on the feature maps derived by the convolution layers, the candidate regions are used to extract candidate feature maps, and further classification produces the class and location information.

The RPN plays the vital role in Faster R-CNN: the accuracy of the candidate regions directly affects both detection accuracy and computational efficiency. The RPN is a significant improvement over prior R-CNN algorithms; instead of selective search, it produces candidate areas using a sliding-window technique [36]. The convolution feature layer feeds a 256-dimensional eigenvector through the sliding window. The center point is regarded as an anchor as the window moves on the convolution feature map, producing k anchor boxes at each position.
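As a concrete illustration of this anchor mechanism, the sketch below enumerates k anchor boxes at every feature-map position. The stride, scales, and aspect ratios are assumed values for illustration, not those tuned in the paper.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """RPN-style anchors: k = len(scales) * len(ratios) boxes centered
    at each sliding-window position of the feature map."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx = x * stride + stride / 2           # anchor center in image coords
            cy = y * stride + stride / 2
            for s in scales:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.asarray(anchors)                     # (feat_h * feat_w * k, 4) boxes

# A 38 x 38 feature map with k = 9 yields 38 * 38 * 9 = 12996 anchors.
```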

Wild Geese Algorithm

A novel strategy inspired by the group mobility and group search of animals has recently been proposed for large-scale global continuous optimization [37].

The Ordered and Coordinated Group Migration

The displacement and velocity formulas, in relation to the geese's coordinated velocity, are described in the equations below [38].

$V_{j,D}^{Itr+1} = R_{1,D} \times V_{j,D}^{Itr} + R_{2,D} \times \left(V_{j+1,D}^{Itr} - V_{j-1,D}^{Itr}\right) + R_{3,D} \times \left(P_{j,D}^{Itr} - Y_{j-1,D}^{Itr}\right) + R_{4,D} \times \left(P_{j+1,D}^{Itr} - Y_{j,D}^{Itr}\right) + R_{5,D} \times \left(P_{j+2,D}^{Itr} - Y_{j+1,D}^{Itr}\right) - R_{6,D} \times \left(P_{j-1,D}^{Itr} - Y_{j+2,D}^{Itr}\right)$ (20)

Here $Y_{j,D}$, $P_{j,D}$, and $V_{j,D}$ denote the $D$th dimension of the current position, personal best position, and velocity of goose $j$, and the random numbers $R$ lie in the interval (0, 1). Using Eq. (20), the change in each wild goose's velocity and location is measured from the velocities of the upfront and rear members $(V_{j+1}^{Itr}, V_{j-1}^{Itr})$, and Eq. (21) incorporates the global best member.

$Y_{j,D}^{V} = P_{j,D}^{Itr} + R_{7,D} \times R_{8,D} \times \left(G_{D}^{Itr} + P_{j+1,D}^{Itr} - 2 \times P_{j,D}^{Itr} + V_{j,D}^{Itr+1}\right)$ (21)

The global optimal position among all members is GD.

Walking and Searching for Food

The jth wild goose moves toward the member in front of it, trying to reach the position of the (j+1)th goose. The equation below describes the walking and food searching $Y_j^M$ of the wild geese.

$Y_{j,D}^{M} = P_{j,D}^{Itr} + R_{9,D} \times R_{10,D} \times \left(P_{j+1,D}^{Itr} - P_{j,D}^{Itr}\right)$ (22)
Wild Geese Reproduction and Evolution

Reproduction and evolution constitute another stage of wild geese life. This process is modeled by combining the walking and food search $Y_j^M$ with the migration $Y_j^V$, where CR is the control parameter.

$Y_{j,D}^{Itr+1} = \begin{cases} Y_{j,D}^{V} & \text{if } R_{11,D} \leq CR \\ Y_{j,D}^{M} & \text{otherwise} \end{cases}$ (23)
Evolution of Order, Death, and Migration

In order to balance the algorithm's performance, a compromise solution is established: the algorithm initializes with a maximal population size $M_{Initial}$ and linearly reduces the population to the final value $M_{Final}$ by the last iteration.

$M = \text{Round}\left(M_{Initial} - (M_{Initial} - M_{Final}) \times \frac{FE}{FE_{maximum}}\right)$ (24)

$FE$ and $FE_{maximum}$ denote the current and maximal number of function evaluations, respectively.
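A compact sketch tying Eqs. (20)-(24) together is given below. It simplifies the coordination term of Eq. (20) to four random components and treats the flock as circular; it is an illustrative reading of WGA under these assumptions, not the reference implementation.

```python
import numpy as np

def wga_minimize(f, dim, lb, ub, m_init=50, m_final=10, fe_max=5000, cr=0.5, seed=0):
    """Simplified Wild Geese Algorithm sketch (Eqs. (20)-(24))."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (m_init, dim))   # goose positions
    V = np.zeros_like(X)                     # goose velocities
    P = X.copy()                             # personal best positions
    pf = np.array([f(x) for x in X])         # personal best fitness
    fe = m_init

    while fe < fe_max:
        order = np.argsort(pf)               # ordered flock: best goose leads
        X, V, P, pf = X[order], V[order], P[order], pf[order]
        G, m = P[0], len(X)                  # global best and flock size
        for j in range(m):
            up, down = P[(j + 1) % m], P[j - 1]      # upfront and rear members
            R = rng.random((4, dim))
            # Coordinated migration velocity (reduced form of Eq. (20)).
            V[j] = (R[0] * V[j] + R[1] * (up - down)
                    + R[2] * (P[j] - X[j - 1]) + R[3] * (up - X[j]))
            Yv = P[j] + rng.random(dim) * rng.random(dim) * (
                G + up - 2 * P[j] + V[j])            # migration, Eq. (21)
            Ym = P[j] + rng.random(dim) * rng.random(dim) * (up - P[j])  # foraging, Eq. (22)
            # Reproduction: per-dimension crossover of the two moves, Eq. (23).
            cand = np.clip(np.where(rng.random(dim) <= cr, Yv, Ym), lb, ub)
            fc = f(cand)
            fe += 1
            if fc < pf[j]:
                P[j], pf[j] = cand, fc
            X[j] = cand
        # Linear population decline, Eq. (24).
        m_new = round(m_init - (m_init - m_final) * fe / fe_max)
        if m_final <= m_new < m:
            keep = np.argsort(pf)[:m_new]
            X, V, P, pf = X[keep], V[keep], P[keep], pf[keep]
    return P[pf.argmin()], pf.min()

# Example: wga_minimize(lambda x: np.sum(x ** 2), dim=5, lb=-10.0, ub=10.0)
```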

Faster R-CNN with WGA for Retinal Image Classification

When performing target localization for DR detection and classification from fundus images, the ratio between the images and the target regions remains stable, so it is unnecessary to generate a large number of anchor boxes of varied ratios and sizes when the RPN is used for candidate region selection [34]. Based on the distribution information and location of the targets in the image, the overlaps, ratios, and sizes of the anchor boxes need to be optimized. Hence, we use the Wild Geese Algorithm (WGA) for this purpose: its ability to handle high-dimensional, real-world, and large-scale optimization problems, its good convergence speed, and its single control parameter CR are its major advantages. The WGA therefore effectively optimizes parameters such as the overlaps, ratios, and sizes of the anchor boxes during DR detection and classification. Figure 5 shows the DR classification model using Faster R-CNN with WGA. This method detects DR and classifies its stages as Non-DR, Mild DR, Moderate DR, Severe DR, and Proliferative DR.

Fig. 5.

Fig. 5

DR detection and classification using faster R-CNN with WGA

Result and Discussion

This section portrays the experimental analysis of our proposed method in a wider context, covering both segmentation analysis and classification analysis, and it encloses the dataset description. It investigates the performance of DR detection and classification through various experimental analyses with state-of-art comparisons. The proposed system was implemented on a PC with an Intel i5 processor at 2.53 GHz and 4 GB RAM, using Weka 3.6.11 and Matlab r2008a [39].

Dataset Description and Performance Metrics

For the experiments, we used the DRIVE [40] dataset. This dataset was established to analyze the segmentation of blood vessels in retinal images. The images were taken from participants of a diabetic retinopathy screening program in the Netherlands, in which about 450 subjects aged 25–90 years took part. The images were captured using a Canon CR5 non-mydriatic 3CCD camera with a 45-degree field of view (FOV). The FOV of each image is circular, with a diameter of approximately 544 pixels. To analyze the segmentation approach of our proposed MTRO method, we consider several metrics: false discovery rate, false-positive rate, positive predictive value, negative predictive value, false-negative rate, accuracy, specificity, sensitivity, and F-score.

  • False-positive rate (FPR)

    It can be defined as the rate at which the segmentation of retinal images depicts positive results instead of negative. It is expressed as,
    $FPR = \frac{\text{False positive}}{\text{True negative} + \text{False positive}}$ (25)
  • False-negative rate (FNR)

    It can be defined as the rate at which the segmentation depicts negative results instead of positive results.
    $FNR = \frac{\text{False negative}}{\text{True positive} + \text{False negative}}$ (26)
  • Accuracy

    Accuracy is defined as the ratio of correctly assigned pixels in the segmented image to the total number of pixels in the image.
    $A = \frac{TN + TP}{TN + FN + FP + TP}$ (27)
  • Specificity

    Specificity is the ratio of correctly identified non-vessel pixels to the total number of non-vessel pixels.
    $Spec = \frac{TN}{FP + TN}$ (28)
  • Sensitivity

    Sensitivity is defined as the ratio of accurately identified vessel pixels to the total number of vessel pixels.
    $Sen = \frac{TP}{FN + TP}$ (29)
  • F-score

    The F-score measures test accuracy as the harmonic mean of precision and recall.
    $F\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}$ (30)
  • Positive Predictive Value (PPV)

    It is the probability that pixels segmented as vessels are truly vessels, i.e. $PPV = \frac{TP}{TP + FP}$.

  • Negative predictive value (NPV)

    It is the probability that pixels segmented as non-vessels are truly non-vessels, i.e. $NPV = \frac{TN}{TN + FN}$.

  • False discovery rate (FDR)

    It is related to the false positives and is defined as the expected proportion of errors among the positive predictions, i.e. $FDR = \frac{FP}{FP + TP}$.
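For concreteness, the sketch below computes these metrics from binary vessel masks as a direct NumPy rendering of Eqs. (25)-(30); zero-denominator handling is omitted for brevity.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Pixel-level metrics from binary (0/1) predicted and ground-truth masks."""
    tp = int(np.sum((pred == 1) & (truth == 1)))
    tn = int(np.sum((pred == 0) & (truth == 0)))
    fp = int(np.sum((pred == 1) & (truth == 0)))
    fn = int(np.sum((pred == 0) & (truth == 1)))
    sens = tp / (tp + fn)                             # Eq. (29)
    prec = tp / (tp + fp)                             # PPV
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Eq. (27)
        "sensitivity": sens,
        "specificity": tn / (tn + fp),                # Eq. (28)
        "fpr": fp / (fp + tn),                        # Eq. (25)
        "fnr": fn / (fn + tp),                        # Eq. (26)
        "ppv": prec,
        "npv": tn / (tn + fn),
        "fdr": fp / (fp + tp),
        "f_score": 2 * prec * sens / (prec + sens),   # Eq. (30)
    }
```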

Segmentation Analysis

The resultant values are compared with state-of-art works, namely DOFE [41], QUINCUNX [42], CNN [43], and the plain MT approach. Figure 6 depicts the comparative analysis of the proposed MTRO approach with these methods in terms of positive predictive value. From Fig. 6, it is evident that the proposed approach achieves a better rate of about 99%; without the inclusion of ROA, the PPV is 97%. All other approaches show lower PPV values.

Fig. 6.

Fig. 6

Segmentation analysis based on the positive predictive values

The segmentation analysis based on the false positive rate is depicted in Fig. 7. Since we have adopted ROA, the false positive rate is lower than that of the other approaches: the false-positive rate of the proposed MTRO is 4, whereas DOFE reaches 19, the highest. Thus our proposed segmentation predicts the blood vessels accurately. The graphical representation based on the NPV is illustrated in Fig. 8. The NPV of our proposed method is the highest, since the ROA can search the entire blood vessel image and predict both the positive and negative pixels accurately. The NPV of the proposed method is about 98%, and without ROA the multi-thresholding itself achieves 90%. Moreover, DOFE accomplishes the lowest NPV, about 76%.

Fig. 7.

Fig. 7

Segmentation analysis based on the false positive rate

Fig. 8.

Fig. 8

Segmentation analysis based on the negative predictive value

The segmentation analysis based on the FNR is illustrated in Fig. 9. The FNR of our proposed method is low, about 0.4, whereas the QUINCUNX method yields a value of 1.6 and multi-thresholding without the ROA an FNR of 2. The adoption of ROA thus reduces the wrong predictions. Meanwhile, Fig. 10 illustrates the segmentation analysis based on the FDR. The FDR of our proposed method is low, about 4; the DOFE method exhibits a higher FDR of about 7.5, and the method without ROA achieves an FDR of 4.4, as shown in Fig. 10.

Fig. 9.

Fig. 9

Segmentation analysis based on the false-negative rate

Fig. 10.

Fig. 10

Segmentation analysis based on the false discovery rate

Table 2 presents the performance of the best segmentation techniques in the literature (K-means clustering [15], the deep learning-based active contour model [19], GMNet [20], the three-stage self-training framework [21], and DeepIGeoS [27]) alongside our proposed work. These methodologies are evaluated on the DRIVE dataset in terms of average accuracy, average sensitivity, and average specificity. The proposed methodology offers higher average accuracy, sensitivity, and specificity due to the reliable preprocessing technique used along with the RO algorithm. Compared to the previous techniques, higher accuracy and superior performance were achieved on the large retinal fundus image dataset, while the existing techniques suffered slightly from increasing computational complexity at higher feature dimensionality.

Table 2.

Segmentation results obtained using different classifiers for the DRIVE dataset

Techniques Average accuracy Average sensitivity Average specificity
K-means clustering [15] 87.36 87.54 87.63
Deep learning-based active contour model 91.21 91.23 91.04
GMNet [20] 93.14 93.27 93.24
Three-stage self-training framework [20] 94.65 94.21 94.36
DeepIGeoS [27] 95.14 95.65 95.645
Proposed MTRO segmentation 98.54 98.35 98.48

Classification Results

The feature extraction results are depicted in Table 3. This study uses a varying number of features for FCDNN, GSFS, CNN, DL, and the proposed approach. The proposed method achieved 91.7% accuracy, 92.34% specificity, and 90.36% sensitivity during feature extraction.

Table 3.

Feature extraction results

Techniques Number of features used for classification Measures
Accuracy (%) Specificity (%) Sensitivity (%)
FCDNN 20 81.77 90.44 80.39
GSFS 14 88.09 67.87 84.90
CNN 19 82.7 78.09 80.90
DL 12 80.35 87.09 88.90
Proposed 50 91.7 92.34 90.36

The confusion matrix results based on classification accuracy are described in Table 4. We obtained 95.20%, 95.34%, 96.34%, 96.93%, and 95.93% accuracies for Non-DR, Proliferative DR, Severe DR, Moderate DR, and Mild DR, respectively.

Table 4.

Confusion matrix results

Predicted value Non-DR (%) Proliferative DR (%) Severe DR (%) Moderate DR (%) Mild DR (%)
Actual value
Non-DR 95.20 2.42 0 1.3 1.08
Proliferative DR 3.45 95.34 0.35 0 0.86
Severe DR 1.45 0 96.34 2.10 0.11
Moderate DR 2.34 0.45 0.28 96.93 0
Mild DR 0.78 2.67 0 0.62 95.93

Best values are indicated in bold

Figure 11 describes the state-of-art comparison of accuracy, conducted using FCDNN, GSFS, CNN, DL, and the proposed method. The proposed method demonstrated an accuracy of 95.42%, which is higher than the other existing techniques. The state-of-art outputs of specificity are delineated in Fig. 12. The proposed method accomplished 93.10% specificity; compared to the existing methods FCDNN, GSFS, CNN, and DL, the specificity result of the proposed method is higher.

Fig. 11.

Fig. 11

State-of-art comparison of accuracy

Fig. 12.

Fig. 12

State-of-art comparison of specificity

Figure 13 delineates the state-of-art comparative result of sensitivity. Against the state-of-art methods FCDNN, GSFS, CNN, and DL, the proposed method provided higher sensitivity, demonstrating 93.20%. The state-of-art comparison of F-score values is delineated in Fig. 14; the proposed method provided a higher F-score (98.28%) than FCDNN, GSFS, CNN, and DL.

Fig. 13.

Fig. 13

State-of-art comparison of sensitivity

Fig. 14.

Fig. 14

State-of-art comparison of F-score

Cross-validation is used in our work to divide the input data into six folds, where five folds are used for training and the remaining fold for testing. The input dataset is randomly sampled during this process, and in each iteration a different fold is selected for validating the model while the remaining folds are used for training; a minimal sketch of this split is given below. The computational time analysis conducted for the proposed methodology against existing techniques is presented in Table 5. The state-of-art techniques taken for comparison are CNN [17], TorchIO [24], DenseCapsNet [25], HLSE [28], and HDWE [29]. The computational time of the proposed Faster R-CNN with WGA is 96.3 s, which is considerably lower than that of the existing techniques (CNN (259.45 s), TorchIO (159.359 s), DenseCapsNet (154.55 s), HLSE (126.5 s), and HDWE (188.25 s)). Since the computational time of the proposed methodology is lower than that of the existing techniques, it is efficient for retinopathy detection.
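A minimal sketch of this six-fold split, assuming scikit-learn; the file list and the train_and_evaluate call are hypothetical placeholders for the actual pipeline.

```python
from sklearn.model_selection import KFold

# Hypothetical file list standing in for the DRIVE images.
image_paths = [f"drive_{i:02d}.tif" for i in range(40)]

kf = KFold(n_splits=6, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(kf.split(image_paths)):
    train_set = [image_paths[i] for i in train_idx]
    test_set = [image_paths[i] for i in test_idx]
    # train_and_evaluate(train_set, test_set)  # placeholder for the pipeline
    print(f"fold {fold}: {len(train_set)} train / {len(test_set)} test")
```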

Table 5.

Computational time analysis

Techniques Computational time (s)
CNN [17] 259.45
TorchIO [24] 159.369
DenseCapsNet [25] 154.55
HLSE [28] 126.5
HDWE [29] 188.25
Proposed Faster R-CNN with WGA 96.3

Conclusion

The proposed approach utilizes both MT and ROA to perform the segmentation process. The proposed MTRO was utilized to accurately segment the blood vessels in retinal images. Furthermore, the features were extracted from the segmented image and classified. We used a Faster R-CNN with WGA technique for classification, which improved the classification performance. For the experiments, we used the DRIVE dataset. To analyze the segmentation process, the performance metrics FPR, FNR, PPV, NPV, and FDR were used, and our proposed method achieved better segmentation performance. Further, the classification performance was analyzed against state-of-art works using the metrics accuracy, sensitivity, and specificity. The attained accuracy, F1-score, sensitivity, and specificity of our proposed method are 95.42%, 98.28%, 93.20%, and 93.10%, respectively. Thus our proposed method accomplished better performance than the other approaches.

Authors’ Contributions

All authors agreed on the content of the study. VDV and RK collected all the data for analysis. VDV agreed on the methodology. VDV and RK completed the analysis based on agreed steps. Results and conclusions are discussed and written together. All authors read and approved the final manuscript.

Funding

Not applicable.

Availability of Data and Material

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Code Availability

Not applicable.

Declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Kempen JH, O’Colmain BJ, Leske MC, Haffner SM, Klein R, Moss SE, Taylor HR, Hamman RF. The prevalence of diabetic retinopathy among adults in the United States. Arch Ophthalmol (Chicago, Ill.: 1960) 2004;122(4):552–563. doi: 10.1001/archopht.122.4.552.
  • 2. Fong DS, Aiello L, Gardner TW, King GL, Blankenship G, Cavallerano JD, Ferris FL, Klein R. Retinopathy in diabetes. Diabetes Care. 2004;27(suppl 1):s84–s87. doi: 10.2337/diacare.27.2007.S84.
  • 3. Murgatroyd H, Ellingford A, Cox A, Binnie M, Ellis JD, MacEwen CJ, Leese GP. Effect of mydriasis and different field strategies on digital image screening of diabetic eye disease. Br J Ophthalmol. 2004;88(7):920–924. doi: 10.1136/bjo.2003.026385.
  • 4. El Abbadi NK, Al-Saadi EH. Automatic detection of exudates in retinal images. Int J Comput Sci Issues. 2013;10(2 Part 1):237.
  • 5. Ehrenhofer MC, Deeg CA, Reese S, Liebich HG, Stangassinger M, Kaspers B. Normal structure and age-related changes of the equine retina. Vet Ophthalmol. 2002;5(1):39–47. doi: 10.1046/j.1463-5224.2002.00210.x.
  • 6. Morita A, Sawada S, Mori A, Arima S, Sakamoto K, Nagamitsu T, Nakahara T. Establishment of an abnormal vascular patterning model in the mouse retina. J Pharmacol Sci. 2018;136(4):177–188. doi: 10.1016/j.jphs.2018.03.002.
  • 7. Garg S, Davis RM. Diabetic retinopathy screening update. Clin Diabetes. 2009;27(4):140–145. doi: 10.2337/diaclin.27.4.140.
  • 8. Usher D, Dumskyj M, Himaga M, Williamson TH, Nussey S, Boyce J. Automated detection of diabetic retinopathy in digital retinal images: a tool for diabetic retinopathy screening. Diabet Med. 2004;21(1):84–90. doi: 10.1046/j.1464-5491.2003.01085.x.
  • 9. Mapayi T, Viriri S, Tapamo JR. Comparative study of retinal vessel segmentation based on global thresholding techniques. Comput Math Methods Med. 2015. doi: 10.1155/2015/895267.
  • 10. Budai A, Bock R, Maier A, Hornegger J, Michelson G. Robust vessel segmentation in fundus images. Int J Biomed Imaging. 2013. doi: 10.1155/2013/154860.
  • 11. Sundararaj V, Selvi M. Opposition grasshopper optimizer based multimedia data distribution using user evaluation strategy. Multimed Tools Appl. 2021;80(19):29875–29891. doi: 10.1007/s11042-021-11123-4.
  • 12. Jose J, Gautam N, Tiwari M, Tiwari T, Suresh A, Sundararaj V, Rejeesh MR. An image quality enhancement scheme employing adolescent identity search algorithm in the NSST domain for multimodal medical image fusion. Biomed Signal Process Control. 2021;66:102480. doi: 10.1016/j.bspc.2021.102480.
  • 13. Sundararaj V. An efficient threshold prediction scheme for wavelet based ECG signal noise reduction using variable step size firefly algorithm. Int J Intell Eng Syst. 2016;9(3):117–126.
  • 14. AlBadawi S, Fraz MM (2018) Arterioles and venules classification in retinal images using fully convolutional deep neural network. In: International conference image analysis and recognition. Springer, Cham, pp 659–668.
  • 15. GeethaRamani R, Balasubramanian L. Retinal blood vessel segmentation employing image processing and data mining techniques for computerized retinal image analysis. Biocybern Biomed Eng. 2016;36(1):102–118. doi: 10.1016/j.bbe.2015.06.004.
  • 16. Huang F, Dashtbozorg B, Tan T, ter Haar Romeny BM. Retinal artery/vein classification using genetic-search feature selection. Comput Methods Programs Biomed. 2018;161:197–207. doi: 10.1016/j.cmpb.2018.04.016.
  • 17. Lyu X, Li H, Zhen Y, Ji X, Zhang S (2017) Deep tessellated retinal image detection using Convolutional Neural Networks. In: 2017 39th annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE, pp 676–680.
  • 18. Yu F, Sun J, Li A, Cheng J, Wan C, Liu J (2017) Image quality classification for DR screening using deep learning. In: 2017 39th annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE, pp 664–667.
  • 19. Chen X, Williams BM, Vallabhaneni SR, Czanner G, Williams R, Zheng Y (2019) Learning active contour models for medical image segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11632–11640.
  • 20. Zhou W, Liu J, Lei J, Yu L, Hwang JN. GMNet: graded-feature multilabel-learning network for RGB-thermal urban scene semantic segmentation. IEEE Trans Image Process. 2021;30:7790–7802. doi: 10.1109/TIP.2021.3109518.
  • 21. Ke R, Aviles-Rivero A, Pandey S, Reddy S, Schönlieb CB (2020) A three-stage self-training framework for semi-supervised semantic segmentation. arXiv preprint arXiv:2012.00827.
  • 22. Ignatov A, Romero A, Kim H, Timofte R (2021) Real-time video super-resolution on smartphones with deep learning, mobile ai 2021 challenge: report. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2535–2544.
  • 23. Rodrigues LF, Naldi MC, Mari JF. Comparing convolutional neural networks and preprocessing techniques for HEp-2 cell classification in immunofluorescence images. Comput Biol Med. 2020;116:103542. doi: 10.1016/j.compbiomed.2019.103542.
  • 24. Pérez-García F, Sparks R, Ourselin S. TorchIO: a Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning. Comput Methods Programs Biomed. 2021;208:106236. doi: 10.1016/j.cmpb.2021.106236.
  • 25. Quan H, Xu X, Zheng T, Li Z, Zhao M, Cui X. DenseCapsNet: detection of COVID-19 from X-ray images using a capsule neural network. Comput Biol Med. 2021;133:104399. doi: 10.1016/j.compbiomed.2021.104399.
  • 26. Li D, Yu W, Wang K, Jiang D, Jin Q. Speckle noise removal based on structural convolutional neural networks with feature fusion for medical image. Signal Process Image Commun. 2021;99:116500. doi: 10.1016/j.image.2021.116500.
  • 27. Wang G, Zuluaga MA, Li W, Pratt R, Patel PA, Aertsen M, Doel T, David AL, Deprest J, Ourselin S, Vercauteren T. DeepIGeoS: a deep interactive geodesic framework for medical image segmentation. IEEE Trans Pattern Anal Mach Intel. 2018;41(7):1559–1572. doi: 10.1109/TPAMI.2018.2840695.
  • 28. Yu J, Rui Y, Tao D. Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process. 2014;23(5):2019–2032. doi: 10.1109/TIP.2014.2311377.
  • 29. Yu J, Tan M, Zhang H, Tao D, Rui Y. Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intel. 2019. doi: 10.1109/TPAMI.2019.2932058.
  • 30. Jiao W, Chen W, Zhang J. An improved cuckoo search algorithm for multithreshold image segmentation. Secur Commun Netw. 2021;2021:6036410. doi: 10.1155/2021/6036410.
  • 31. Kapur JN, Sahoo PK, Wong AK. A new method for gray-level picture thresholding using the entropy of the histogram. Comput Vis Graph Image Process. 1985;29(3):273–285. doi: 10.1016/0734-189X(85)90125-2.
  • 32. Jia H, Peng X, Lang C. Remora optimization algorithm. Expert Syst Appl. 2021;185:115665. doi: 10.1016/j.eswa.2021.115665.
  • 33. Ren Y, Zhu C, Xiao S. Deformable faster r-cnn with aggregating multi-layer features for partially occluded object detection in optical remote sensing images. Remote Sens. 2018;10(9):1470. doi: 10.3390/rs10091470.
  • 34. Ren S, He K, Girshick R, Sun J. Faster r-cnn: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst. 2015;28:91–99. doi: 10.1109/TPAMI.2016.2577031.
  • 35. Barros DM, Moura JC, Freire CR, Taleb AC, Valentim RA, Morais PS. Machine learning applied to retinal image processing for glaucoma detection: review and perspective. Biomed Eng Online. 2020;19(1):1–21. doi: 10.1186/s12938-020-00767-2.
  • 36. Bai T, Yang J, Xu G, Yao D. An optimized railway fastener detection method based on modified Faster R-CNN. Measurement. 2021;182:109742. doi: 10.1016/j.measurement.2021.109742.
  • 37. Mahdavi S, Shiri ME, Rahnamayan S. Metaheuristics in large-scale global continues optimization: a survey. Inf Sci. 2015;295:407–428. doi: 10.1016/j.ins.2014.10.042.
  • 38. Ghasemi M, Rahimnejad A, Hemmati R, Akbari E, Gadsden SA. Wild Geese Algorithm: a novel algorithm for large scale optimization based on the natural life and death of wild geese. Array. 2021;11:100074. doi: 10.1016/j.array.2021.100074.
  • 39. Welikala RA, Foster PJ, Whincup PH, Rudnicka AR, Owen CG, Strachan DP, Barman SA. Automated arteriole and venule classification using deep learning for retinal images from the UK Biobank cohort. Comput Biol Med. 2017;90:23–32. doi: 10.1016/j.compbiomed.2017.09.005.
  • 40. http://www.isi.uu.nl/Research/Databases/DRIVE/
  • 41. Wang S, Yu L, Li K, Yang X, Fu CW, Heng PA. Dofe: domain-oriented feature embedding for generalizable fundus image segmentation on unseen datasets. IEEE Trans Med Imaging. 2020;39(12):4237–4248. doi: 10.1109/TMI.2020.3015224.
  • 42. Sathya N, Rathika N. Different classification methods of fundus image segmentation using quincunx wavelet decomposition. J Ambient Intel Humaniz Comput. 2021;12(7):6947–6953. doi: 10.1007/s12652-020-02340-0.
  • 43. Shirokanev AS, Ilyasova NY, Demin NS. Analysis of convolutional neural network for fundus image segmentation. J Phys Conf Ser. 2020;1438(1):012016. doi: 10.1088/1742-6596/1438/1/012016.
