Abstract
Objective
Diabetic retinopathy (DR) is a common complication of diabetes and a leading cause of blindness worldwide. Detecting diabetic retinopathy at an early stage is crucial for preventing vision loss. In this work, a deep learning-based binary classification of DR images has been proposed to classify DR images into healthy and unhealthy. Twenty pre-trained networks have been fine-tuned using transfer learning on a robust dataset of diabetic retinopathy images. The combined dataset has been collected from three robust databases of diabetic patients, annotated by experienced ophthalmologists as healthy or unhealthy retinal images.
Method
This work has improved robust models by pre-processing the DR images with a denoising algorithm, normalization, and data augmentation. Three robust datasets of diabetic retinopathy images, named DRD-EyePACS, IDRiD, and APTOS-2019, have been selected for the extensive experiments, and a combined diabetic retinopathy image dataset has been generated. The datasets have been divided into training, testing, and validation sets, and classification accuracy, sensitivity, specificity, precision, F1-score, and ROC-AUC have been used to evaluate network performance. The present work has selected 20 different pre-trained networks based on three categories: Series, DAG, and lightweight.
Results
This study uses pre-processing, data augmentation, and normalization of the data to address overfitting. From the exhaustive experiments, the three best pre-trained networks have been selected based on the best classification accuracy from each category. It is concluded that the trained ResNet101 model in the DAG category accurately identifies diabetic retinopathy from retinal images in all cases. It is noted that 97.33% accuracy has been achieved using ResNet101 in the DAG network category.
Conclusion
Based on the experimental results, the proposed ResNet101 model helps healthcare professionals detect retinal disease early and provides practical solutions for diabetes patients. It also gives patients and experts a second opinion for early detection of diabetic retinopathy.
Keywords: Series, DAG, Lightweight, Pre-trained networks, Classification accuracy
Introduction
The most common eye conditions that cause blindness or reduced vision are diabetic retinopathy, cataracts, glaucoma, and age-related macular degeneration (AMD) [1]. Diabetes is a condition that is common all over the world and is ranked seventh among deadly diseases according to a World Health Organization (WHO) report [2–5]. The number of cases and the incidence of diabetes have been increasing over the past few decades, with an estimated 422 million people living with the disease [3–6]. Approximately 62 million Indians aged 25–75 suffer from diabetic retinopathy, and this number is expected to rise to 102 million by 2030 [1–5]. The condition is familiar to people with diabetes, in whom the retina is damaged due to high blood sugar [6]. The blood vessels leak, swell, and fail to pass blood in the retina, causing abnormalities. Diabetic retinopathy damages the retina as a complication of diabetes mellitus and is the leading cause of blindness [7–10]. Diabetic retinopathy has been classified into two stages: (a) Proliferative Diabetic Retinopathy (PDR) and (b) Non-proliferative Diabetic Retinopathy (NPDR). Further, NPDR is divided into three classes: (a) mild, (b) moderate, and (c) severe [11–16]. The advanced form of diabetic retinopathy is called PDR, caused by the development of aberrant blood vessels or swelling in the retina [14]. The first stage of DR is known as mild NPDR, during which the patient does not notice any changes in vision or eye condition [15]. The blood vessel walls of the retina dilate and cause leakage, leading to microaneurysms, i.e., small lumps protruding from the vessel walls [16–19]. Generally, a minute red spot or yellow circle that appears in the eye indicates mild, moderate, or severe NPDR. Moderate NPDR is distinguished from mild NPDR by more significant damage to the retina's blood vessels. In moderate NPDR, the blood vessels develop microscopic balloons or swellings with microaneurysms, and fluid leaks from the retina or bleeds [20–24]. Severe NPDR is a serious condition characterized by more damage to the retinal blood vessels than mild and moderate NPDR [24–26].
The exhaustive literature survey shows that deep learning-based pre-trained networks with a transfer learning approach have been widely used for classifying diabetic retinopathy images [1–42]. Fundus retina images are widely used to detect abnormalities of diabetic retinopathy. In this work, a robust dataset of DR images has been prepared to classify images into binary classes [27]. PDR and NPDR are considered unhealthy eye conditions, while fundus images without diabetic retinopathy are considered healthy [28]. In unhealthy eye images, retinal vessels become irregular in shape, size, and diameter, whereas in healthy eye images they have regular shape, size, and diameter [29–33]. Three benchmarked datasets of diabetic retinopathy images have been considered in this work for the experiments. A brief description of healthy and diabetic (unhealthy) retinopathy eye images is shown in Fig. 1.
Fig. 1.
A brief description of healthy eye and diabetic or unhealthy eye images of retinopathy
In the present work, pre-processing of DR images has been used to prepare the raw DR images, namely (a) resizing of DR images and (b) removal of Gaussian and salt-and-pepper noise with a suitable filtering method [30–33]. The quality of DR images has been enhanced by selecting an optimal filtering algorithm in this work [33–37]. The optimal filtering algorithm has been chosen from a pool of different filtering categories in the context of smoothing homogeneous regions and preserving the edges of the DR images [38]. This work uses the structure and edge preservation index to evaluate the performance of the filtering algorithm on DR images [37–40]. Accordingly, in the present work, the pre-trained networks have been divided into three categories named (a) series, (b) DAG (residual and inception modules), and (c) lightweight networks [35–42]. In this work, 20 pre-trained networks have been selected from a robust pool of the three categories, i.e., Series, DAG, and Lightweight, and fine-tuned using the transfer learning approach [40–42]. Accordingly, classification accuracy, sensitivity, specificity, precision, F1-score, and ROC-AUC are used to evaluate network performance [37–42].
Table 1 shows an exhaustive literature review on the classification of diabetic retinopathy images using deep learning-based pre-trained networks and the transfer learning approach.
Table 1.
An exhaustive literature review on the classification of diabetic retinopathy images using deep learning-based pre-trained networks and the transfer learning approach
| Investigator(s) | Year | Pre-processed Method | Dataset Name | No. of Images | DL Based Pre-Trained Model | Classifier |
|---|---|---|---|---|---|---|
| Gulshan et al. [1] | 2016 | Resizing and normalization | EyePACS & MESSIDOR-2 | - | DCNN | Sen.-97.5% |
| Chandrakumar et al. [2] | 2016 | Resizing and Contrast Enhancement | EyePACS, Drive | - | DCNN | Acc.-94% |
| Zhou, L. et al. [3] | 2017 | Contrast Enhancement | Three Dataset | - | Self-Design Architecture | AUC-0.928 |
| Dutta, S et al. [4] | 2018 | Contrast Enhancement | EyePACS (5 Class) | 50000 | VGG16 | Acc.- 78.3% |
| Junjun, P. et al. [5] | 2018 | Contrast Enhancement | EyePACS (5 Class) | 35,126 | ResNet18 | Acc.- 78.4% |
| Kassani, S. H. et al. [6] | 2019 | Resizing and normalization | APTOS 2019 (5 Class) | 3662 | Xception | Acc.- 83.09% |
| Challa, U.K et al. [7] | 2019 | Gaussian filters | Kaggle dataset (Five Class) | 33,000 | Pre-Trained Networks | Acc. -86.64% |
| Qummar, S et al. [8] | 2019 | Scaling and Resizing | Kaggle dataset | 5608 | 5 Pre-trained Network | Sp and F1 Score |
| Bhardwaj, C. et al. [9] | 2020 | Contrast Enhancement | MESSIDOR (4 Class) | 1200 | QIV Model (Inception-V3) | Acc.- 93.33% |
| Saxena, G. et al. [10] | 2020 | Resizing and Augmentation | EyePACS | 88,702 | InceptionResNet, ResNet and Inception | AUC-0.927 |
| Yusaku Katada et al. [11] | 2020 | DA | EyePACS | 35,126 (3508 Selected) | Inception v3 | Sensitivity of 81.5% and 90.8% |
| Ali Usman et al. [12] | 2020 | Resizing, Data Augmentation | 7 Online Dataset | 2680 | Inception v3, ResNet50 and Alex Net | Acc.- 85.2% |
| Wejdan L. Alyoubi et al. [13] | 2021 | CLAHE, Cropping, and DA | DDR and APTOS-2019 | 47,870 | EfficientNetB0 | Acc.-89% |
| Bhardwaj, C. et al. [14] | 2021 | Contrast Enhancement | MESSIDOR (4 Class) | 1200 | QEIRV-2 Model (Inception-V3) | Acc.- 93.3% |
| Chen, P. N. et al. [15] | 2021 | Resizing and Grayscale | EyePACS | 88,702 | NASNet-Large | Acc.- 81.60% & 92.5% |
| San-Li Yi et al. [16] | 2021 | Resizing and Augmentation | APTOS 2019 (5 Class) | 3662 | RA-EfficientNet | Acc.- 93.55% |
| Z. Khan et al. [17] | 2021 | Scaling and Resizing | EyePACS | 88,702 | VGG16 and VGG-NiN | Sp.-91% |
| Sraddha Das et al. [18] | 2021 | Adaptive histogram equalization | DIARETDB1 | - | CNN | Acc.- 98.7% |
| AbdelMaksoud et al. [19] | 2022 | Resizing and Augmentation | 4 Dataset | 39,301 | E-DenseNet | Acc.- 91.3% |
| Kobat, S. G., et al. [20] | 2022 | Resizing | NDRD and APTOS 2019 | 2355 and 3662 | DenseNet201 | Acc. -87.43% & 84.90% |
| Mungloo-Dilmohamud, Z et al. [21] | 2022 | Rescaling and Augmentation | APTOS 2019 (5 Class) | 3662 | VGG16, ResNet50, DenseNet169 | Acc. -82% |
| Al-Omaisi Asia et al. [22] | 2022 | Cropping, Resizing, and Augmentation | XHO Dataset (5 Class) | 1607 | ResNet 50, 101 and VGG16 | Acc.-80.88% |
| Sambit S. Mondal et al. [23] | 2022 | CLAHE and DA | APTOS 2019 (5 Class) | 3662 | ResNext and DenseNet | Acc.-86.08% |
| Yasashvini, R. et al. [24] | 2022 | Wiener Filter | APTOS 2019 | 3662 | ResNet & DenseNet | Acc.- 96.22% |
| Dayana, A. M. et al. [25] | 2022 | ADF | Local | - | AFU-NET | - |
| Oulhadj, M. et al. [26] | 2022 | Scaling and Resizing | APTOS 2019 | 3662 | Densenet-121, Xception, Inception-v3 & Resnet-50 | Acc. -85.28% |
| Jabbar M. K. et al. [27] | 2022 | CLAHE, DA & Resizing | EyePACS | 35,126 | VggNet | Acc.-96.6% |
| Menaouer, B. et al. [28] | 2022 | Scaling and Resizing | APTOS-2019, Messidor-2 & Local public DR | 5584 | VGG 16 and 19 | Acc.-90.6% |
| Fayyaz, A. M. et al. [29] | 2023 | Resizing and Augmentation | ODIR (4 Class) | - | AlexNet and ResNet-101 | SVM Acc.-93% |
| Dolly Das et al. [30] | 2023 | Resizing and Augmentation | EyePACS (5 Class) | 35,126 | 19 Pre-Trained | Acc.-79.11% |
| C. Mohanty et al. [31] | 2023 | Cropped and Resized | APTOS 2019 (5 Class) | 3662 | DenseNet 121 | Acc.-97.30% |
| Pradeep Kumar Jena et al. [32] | 2023 | CLAHE | APTOS and MESSIDOR | 3662 & 1200 | Self-Design Architecture | Acc.—SVM (98.6% and 91.9%) |
| Bhimavarapu, U. et al. [33] | 2023 | CLAHE and Histogram Equalization | APTOS and Kaggle | 3662 & 35,126 | 5 Pre-trained Network | Acc. – 98.32% & 98.71% |
| Islam, N. et al. [34] | 2023 | Gaussian Filter | APTOS, IDRiD | 3662 & 516 | Xception | APTOS—99.04% IDRiD -94.17% |
| Sajid, M. et al. [35] | 2023 | DA and Image Enhancement | Public Dataset | 32,800 | DR-NASNet | Acc.-96% |
| Alwakid, G. et al. [36] | 2023 | CLAHE & Data Augmentation | APTOS 2019 | 3662 | DenseNet-121 | Acc.-98.36% |
| Vijayan, M. et al. [37] | 2023 | Scaling | DDR, IDRiD, and APTOS | 13,673, 516 & 3662 | 6 Pre-trained Network | Acc.-82.5% |
| Alwakid, G. et al. [38] | 2023 | CLAHE & DA | APTOS 2019 | 3662 | DenseNet-121 | - |
| Guefrachi, S et al. [39] | 2024 | Resizing and Augmentation | APTOS 2019 | 3662 | Resnet152-V2 | Acc.- 100% |
| Sunkari, S. et al. [40] | 2024 | Contrast and Brightness | APTOS and Kaggle | 3662 & 35,126 | 3 Pre-trained Network | Acc.—93.51% |
| Macsik, P. et al. [41] | 2024 | CLAHE & DA | DDR and APTOS 2019 | 3662 | Xception & EfficientNetB4 | Acc |
| Shakibania Bu-Ali et al. [42] | 2024 | CLAHE & DA | APTOS 2019 | 3662 | 4 Pre-trained Network | Acc.-96.44 |
T & V—Training & Validation, ODIR-Ocular Disease Intelligent Recognition, NDRD-New Diabetic Retinopathy Dataset, APTOS-Asia Pacific Tele-Ophthalmology Society, ADF- Anisotropic Diffusion Filter, AFU-NET-Attention-based Fusion Network, DA-Data Augmentation
From Table 1, it is observed that the APTOS 2019, Diabetic Retinopathy Dataset, IDRiD, and EyePACS datasets were used in most of the studies, with binary and multi-class classification using pre-trained networks tuned via transfer learning [35–42]. The highest accuracy, 100%, was achieved by Guefrachi S et al. [39]. It is noted that the interpretability of pre-trained networks improves classification accuracy and explains the ability of DR detection [30–42]. It is also observed that only 3 out of 42 studies (7%) used pre-trained networks as feature extractors and classifiers; SVM, KNN, and PNN classifiers have been used to classify features extracted by CNN architectures [1–42]. It is also noted that only 2 out of 42 studies (approx. 5%) used a self-designed CNN architecture to classify diabetic retinopathy images, and only 6 out of 42 studies used binary class classification [35–42]. It is also noted that Gaussian filters, CLAHE, data augmentation, histogram equalization, scaling, and resizing have been widely used as pre-processing methods [37–42]. According to the literature review, noise is removed from an image by convolving it with a Gaussian kernel while maintaining the underlying structures [7]. It is an easy way to reduce Gaussian noise without sacrificing performance [7, 28, 36–42].
In the present work, the pre-trained networks have been divided into three categories: series, DAG (Directed Acyclic Graph), and lightweight. Selecting which category of network to use for classifying medical images is crucial. This work provides a platform to classify DR images into binary classes by selecting the optimal pre-trained network together with its category. Accordingly, 20 pre-trained networks have been divided into three categories (series, DAG, and lightweight) and used to classify DR images into binary classes. Lightweight pre-trained networks provide effective, scalable, and readily available options for deploying deep learning models in resource-constrained situations, whereas DAG and series networks help manage and optimize complex workflows with clear dependencies.
In the present work, the optimal denoising algorithm (Gaussian filter) and the image resizing have been selected by optimizing the performance evaluation parameters while maintaining the aspect ratio of the DR images. The denoising and resizing methods have been selected from a robust pool of filtering and resizing algorithms. Structure and edge-preserving index (SEPI) [43–50] metrics have been evaluated to assess the performance of the denoising filters. The Gaussian filter, as the optimal denoising filter for DR images, performed outstandingly in DR image classification. In the present work, the effect of the denoising algorithm has been evaluated based on the classification performance parameters used in this work, and its impact has been reported.
Workflow adopted for classification of diabetic retinopathy images
The workflow adopted in this work for classifying diabetic retinopathy images using fine-tuned pre-trained networks with a transfer learning approach is presented in Fig. 2.
Fig. 2.
The workflow adopted in this work for the classification of diabetic retinopathy images using fine-tuned pre-trained networks with the transfer learning approach
The exhaustive literature review noted that very few studies based on binary class classification of DR images, differentiating retinal images into two classes (no sign of diabetic retinopathy and signs of diabetic retinopathy), have been reported by researchers [1–42]. It is also concluded that the feature engineering step has been eliminated by using deep learning or different types of pre-trained networks [44], and that handcrafted feature engineering yields significantly poorer results than deep learning. The conventional feature extraction and selection process has been replaced by a convolution-based network followed by ReLU, normalization, and max pooling [43–47].
From Fig. 2, it is noted that the classification of DR images involves different steps named (a) Benchmark diabetic Retinopathy Dataset Collection, (b) Data Pre-processing and Preparation Module, (c) Dataset Splitting for Training, Validation, and Testing Set, (d) Data Augmentation (e) Selection of pre-trained network, (f) Network training (g) Hyperparameters tuning using transfer learning, (h) evaluation parameters, (i) Validation and interpretation of the trained model and (j) Development and deployment.
Original benchmark diabetic retinopathy dataset collection
The dataset for diabetic retinopathy images has been collected from three benchmarked datasets consisting of annotated images by experienced ophthalmologists marked as healthy and unhealthy retinal images affected by diabetic disease [51–53]. The sample images of diabetic retinopathy in different stages are shown in Fig. 3.
Fig. 3.
Diabetic Retinopathy Stages (a) Normal retinal, (b) Mild Diabetic Retinopathy, (c) Moderate Diabetic Retinopathy, (d) Severe Diabetic Retinopathy, (e) Proliferative Diabetic Retinopathy, (f) Macular Edema
The first dataset available at Kaggle, the Diabetic Retinopathy Dataset (DRD—EyePACS) [51], consists of 2750 DR images, 1000 belonging to healthy classes and 1750 to unhealthy classes. The size of all images is 256 × 256.
The second dataset, IDRiD (Indian Diabetic Retinopathy Image Dataset) [52], consists of 516 DR images, 413 of which belong to the healthy class and 103 to the unhealthy class. The size of all images is 4288 × 2848.
The third dataset, APTOS (Asia Pacific Tele-Ophthalmology Society)-2019 [53], contains 3662 DR images collected from many participants in rural India. The dataset was prepared by India's Aravind Eye Hospital [35–42]. The fundus images have been taken over a long period in various settings and situations. The medical professionals examined and labeled the samples based on the standard provided by the International Clinical Diabetic Retinopathy Disease Severity Scale (ICDRDSS). It consists of 1805 samples with healthy and 1857 samples with unhealthy classes. The details of diabetic retinopathy datasets are shown in Table 2.
Table 2.
The Detail of Diabetic Retinopathy Datasets
Table 2 shows that four experiments were performed in this work. The DRD-EyePACS, IDRiD, and APTOS-2019 datasets have been considered for experiments 1, 2, and 3, respectively. Experiment 4 combines all three datasets to generate a robust dataset. Because the individual datasets are not robust on their own, all of them have been merged into two classes, and a collective dataset has been prepared in this work: 6928 DR images, with 2973 images in the healthy class and 3955 in the unhealthy class.
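A minimal sketch of how the three sources can be merged into one binary-class dataset is given below. The directory names, file extension, and relabelling step are assumptions for illustration, not the authors' actual pipeline.

```python
from pathlib import Path

# Illustrative sketch only: the folder names below are assumptions, not the
# authors' directory layout. Each source dataset is assumed to be organised
# as <dataset>/<healthy|unhealthy>/*.png after the original severity grades
# have been relabelled into the two binary classes used in this work.
DATASET_ROOTS = ["DRD-EyePACS", "IDRiD", "APTOS-2019"]

combined = []  # (image_path, label) pairs of the combined binary-class dataset
for root in DATASET_ROOTS:
    for label in ("healthy", "unhealthy"):
        for img_path in sorted(Path(root, label).glob("*.png")):
            combined.append((img_path, label))

print(f"Combined dataset size: {len(combined)} images")
```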
Data pre-processing and preparation module
In the present work, the data pre-processing and preparation module has been divided into four parts, named (a) resizing of DR images, (b) image enhancement using a suitable denoising algorithm, (c) data augmentation, and (d) dataset splitting into training, validation, and testing sets.
Resizing of DR images
The exhaustive literature review concludes that resizing of DR images is essential for the analysis and classification of diabetic retinopathy disease. It is also noted that deep learning-based pre-trained networks require a standard image size. To meet this requirement, the images have been resized to 256 × 256 in this work while maintaining the aspect ratio.
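A short sketch of aspect-ratio-preserving resizing is shown below. The use of Pillow and of black padding to square the image before resizing to 256 × 256 is an assumption, since the paper does not specify the library or padding strategy.

```python
from PIL import Image, ImageOps

def resize_fundus(path, target=256):
    """Resize a fundus image to target x target while preserving aspect ratio.

    ImageOps.pad scales the image to fit inside the target box and fills the
    remaining border; black padding is assumed here because fundus image
    borders are typically black anyway.
    """
    img = Image.open(path).convert("RGB")
    return ImageOps.pad(img, (target, target), color=(0, 0, 0))
```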
Image enhancement using suitable denoising algorithm
The present work has enhanced DR image quality using a pre-processing step. From the exhaustive literature review, denoising, contrast adjustment, and sharpening have been widely used to enhance the quality of DR images [1–42]. In the present work, an exhaustive pool of filtering and enhancement algorithms has been considered across the categories named (a) linear, (b) non-linear, (c) edge-preserving, and (d) contrast enhancement [7, 35–42]. It is observed that the Gaussian filter outperformed all categories; therefore, this work uses a Gaussian filter to pre-process DR images [7].
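A hedged sketch of the Gaussian pre-processing step is shown below. The kernel size and sigma are illustrative assumptions, since the paper selects the Gaussian filter from a pool of candidates using structure- and edge-preservation metrics rather than reporting fixed parameters.

```python
import cv2

def denoise_fundus(image_bgr, ksize=5, sigma=1.0):
    """Suppress Gaussian and impulse-like noise with a Gaussian blur.

    ksize and sigma are assumed values for illustration only; in practice they
    would be tuned against the structure and edge preservation index (SEPI).
    """
    return cv2.GaussianBlur(image_bgr, (ksize, ksize), sigma)
```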
Dataset splitting for training, validation and testing set
Dataset splitting is an essential step in deep learning model development to evaluate the performance of the networks. Initially, the dataset was divided into training and testing subsets, and then the training set was split into validation data and training data. A brief description of diabetic retinopathy datasets splitting for training, validation, and testing set is shown in Table 3.
Table 3.
The brief description of diabetic retinopathy datasets splitting for training and testing
| Dataset | DRD-EyePACS Dataset | IDRiD Dataset | APTOS-2019 Dataset | Combined Dataset | ||||
|---|---|---|---|---|---|---|---|---|
| Class | Training | Testing | Training | Testing | Training | Testing | Training | Testing |
| Healthy (Not Diabetic Retinopathy) | 800 | 200 | 134 | 50 | 1605 | 200 | 2505 | 450 |
| Unhealthy (Diabetic Retinopathy) | 1550 | 200 | 298 | 50 | 1657 | 200 | 3487 | 450 |
| Total DR Image | 2350 | 400 | 416 | 100 | 3262 | 400 | 5992 | 900 |
| 2750 | 516 | 3662 | 6928 | |||||
Table 3 reveals that the training data contains the largest portion of the set; the network learns patterns and relationships from this data. The deep learning hyperparameters are fine-tuned, and training performance is evaluated using the validation set, which helps prevent overfitting and measures the network's performance during each epoch. The trained model's performance is calculated using the testing set, which acts as a neutral gauge of the model on unseen data. The dataset is typically divided into three subsets in ratios that depend on the dataset's size and the problem's specific requirements.
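The sketch below illustrates one way to produce such a split. Scikit-learn, the dummy path and label lists, and the 10% validation fraction are assumptions for illustration; the 900-image test set follows Table 3.

```python
from sklearn.model_selection import train_test_split

# Dummy path/label lists matching the combined dataset sizes reported in the
# text (2973 healthy, 3955 unhealthy); they only make the snippet runnable.
paths = [f"img_{i}.png" for i in range(6928)]
labels = ["healthy"] * 2973 + ["unhealthy"] * 3955

# Hold out 900 images for testing, then carve a validation split out of the
# remaining training data. Note the paper fixes 450 test images per class
# (Table 3), whereas this stratified sketch preserves the overall class ratio.
train_p, test_p, train_y, test_y = train_test_split(
    paths, labels, test_size=900, stratify=labels, random_state=42)
train_p, val_p, train_y, val_y = train_test_split(
    train_p, train_y, test_size=0.1, stratify=train_y, random_state=42)
print(len(train_p), len(val_p), len(test_p))
```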
Data augmentation
Balancing a dataset for diabetic retinopathy using data augmentation involves creating additional samples of the minority class (e.g., severe diabetic retinopathy) to match the number of samples in the majority class (e.g., no diabetic retinopathy) [33–42]. Data augmentation techniques generate new samples by transforming existing images while preserving their semantic content [40]. The dataset has been randomly shuffled so that each sample type is represented in each subset [41]. To overcome class imbalance, data augmentation balanced the training and validation data and helped achieve good network performance [30–40]. In the present work, data augmentation has been applied to the training set of all three datasets, so that the healthy and unhealthy classes each reached approximately 4000 images in every dataset. The combined dataset was then prepared by merging all classes, producing a robust dataset for classifying DR images. The multipliers used as data augmentation operations for each dataset are shown in Table 4.
Table 4.
The number of multipliers as data augmentation operation with each dataset
| Dataset | DRD-EyePACS Dataset | IDRiD Dataset | APTOS-2019 Dataset | Combined Dataset | ||||
|---|---|---|---|---|---|---|---|---|
| Class | Training | Testing | Training | Testing | Training | Testing | Training | Testing |
| Healthy (Not Diabetic Retinopathy) | 800 × 5 = 4000 | 200 | 118 × 34 = 4012 | 50 | 1605 × 2.5 = 4012 | 200 | 4000 + 4012 + 4012 = 12,024 | 450 |
| Unhealthy (Diabetic Retinopathy) | 1550 × 2.6 = 4030 | 200 | 298 × 13.5 = 4023 | 50 | 1657 × 2.40 = 3976 | 200 | 4030 + 4023 + 3976 = 12,029 | 450 |
| Total | 8030 | 400 | 8035 | 100 | 7988 | 400 | 24,053 | 900 |
The dataset has been randomly shuffled to ensure each subset contains a representative sample of the entire dataset, which reduces the risk of bias toward any particular subgroup. In classification tasks, preserving the class distribution between subsets is essential, mainly when working with imbalanced datasets. In the present work, class subsets have been balanced using data augmentation, ensuring that each subset has an equal distribution of classes of diabetic retinopathy images.
The number of augmentation transform operations used as a multiplier for different images is shown in Table 5.
Table 5.
The number of augmentation transform operations used as multipliers for different images
| Datasets | Diabetic Retinopathy disease type | Rotation | Flipping | Flipping with Rotation | Translation | Multiplier | No. of Images | Total Images | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 90° | 180° | Horizontal | Vertical | 90° H | 180° H | 90° V | 180° V | ||||||
| DRD-EyePACS Dataset -1 | Healthy (Not Diabetic Retinopathy) | ✓ | ✓ | ✓ | ✓ | ✓ | × | × | × | × | 5 | 800 | 4000 |
| IDRiD Dataset -2 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | -13 to 13 | 34 | 118 | 4012 | |
| APTOS-2019 Dataset -3 | ✓ | ✓ | × | × | × | × | × | × | 0.5 × (-1) | 2.5 | 1605 | 4012 | |
| DRD-EyePACS Dataset -1 | Unhealthy (Diabetic Retinopathy) | ✓ | ✓ | × | × | × | × | × | × | 0.6 × (-1) | 2.6 | 1550 | 4030 |
| IDRiD Dataset -2 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | -3 to 3 | 13.5 | 298 | 4023 | |
| APTOS-2019 Dataset -3 | ✓ | ✓ | × | × | × | × | × | × | 0.4 × (-1) | 2.4 | 1657 | 3976 | |
V-Vertical, H-Horizontal
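A minimal sketch of the flip-and-rotate augmentation operations listed in Table 5 is shown below. The per-dataset multipliers and the small translations are omitted, and Pillow is an assumed choice of library.

```python
from PIL import Image, ImageOps

# Basic geometric augmentations from Table 5: 90°/180° rotation and
# horizontal/vertical flipping. Combined flip-with-rotation variants and the
# translation offsets are left out of this sketch.
AUGMENTATIONS = [
    lambda im: im.rotate(90),
    lambda im: im.rotate(180),
    ImageOps.mirror,  # horizontal flip
    ImageOps.flip,    # vertical flip
]

def augment(image_path):
    """Yield the original fundus image followed by its augmented variants."""
    img = Image.open(image_path).convert("RGB")
    yield img
    for op in AUGMENTATIONS:
        yield op(img)
```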
Splitting the diabetic retinopathy dataset into training, testing, and validation sets
The first step in developing a deep learning-based classification model is to divide the DR Dataset into training, testing, and validation sets to ensure the model performs efficiently when applied to new data. Table 6 provides a brief description of the dataset after data augmentation was applied to diabetic retinopathy images, with balanced training, validation, and testing sets, as well as the combined dataset.
Table 6.
The brief description of the dataset after data augmentation applied to diabetic retinopathy images with balanced training, validation, and testing sets
Table 6 shows that the diabetic retinopathy datasets are considered balanced when each class or category contains the same number of samples in the training, validation, and testing sets. In the present work, the combined dataset has been generated. In deep learning applications, particularly classification, a balanced dataset is important to prevent bias toward the majority class and to guarantee the representation of all classes during training. It is also noted that performance on each diabetic retinopathy dataset has been compared using classification accuracy, and the best combination of dataset and pre-trained network type has been selected. The performance of each pre-trained network has been calculated using the transfer learning approach, and the effect of denoising has been reported.
Selection of pre-trained network and classification module
Selecting a pre-trained network for diabetic retinopathy depends upon the type of dataset, the size of the images, and the availability of pre-trained models. In the present work, the pre-trained networks have been selected from the Series, DAG, and Lightweight categories, and a brief description of each category is given in Table 7. The selection of a pre-trained network is essential for classifying diabetic retinopathy images, and the number of parameters is the criterion used for assigning each pre-trained network to a category [44].
Table 7.
A brief category of selecting a pre-trained network based on each category
| S.No | Name of the Pre-trained Networks | Type of Categories | Size of Images | Number of Parameters (Millions) | Depth of the Network |
|---|---|---|---|---|---|
| 1 | AlexNet | Series | 227 × 227 × 3 | 61.0 | 8 |
| 2 | vgg16 | 224 × 224 × 3 | 138 | 16 | |
| 3 | vgg19 | 224 × 224 × 3 | 144 | 19 | |
| 4 | darknet19 | 256 × 256 × 3 | 20.8 | 19 | |
| 5 | darknet53 | 256 × 256 × 3 | 41.6 | 53 | |
| 6 | inceptionv3 | DAG | 299 × 299 × 3 | 23.9 | 48 |
| 7 | densenet201 | 224 × 224 × 3 | 20.0 | 201 | |
| 8 | Resnet50 | 224 × 224 × 3 | 25.6 | 50 | |
| 9 | Resnet101 | 224 × 224 × 3 | 44.6 | 101 | |
| 10 | xception | 299 × 299 × 3 | 22.9 | 71 | |
| 11 | inceptionresnetv2 | 299 × 299 × 3 | 55.9 | 164 | |
| 12 | nasnetlarge | 331 × 331 × 3 | 88.9 | - | |
| 13 | SqueezeNet | Lightweight Networks | 227 × 227 × 3 | 1.24 | 18 |
| 14 | mobilenetv2 | 224 × 224 × 3 | 3.5 | 53 | |
| 15 | shufflenet | 224 × 224 × 3 | 1.4 | 50 | |
| 16 | nasnetmobile | 224 × 224 × 3 | 5.3 | - | |
| 17 | efficientnetb0 | 224 × 224 × 3 | 5.3 | 82 | |
| 18 | GoogleNet | 224 × 224 × 3 | 7.0 | 22 | |
| 19 | googlenet-places365 | 224 × 224 × 3 | 7.0 | 22 | |
| 20 | resnet18 | 224 × 224 × 3 | 11.7 | 18 |
Table 7 shows that the numbers of pre-trained networks selected in the Series, DAG, and Lightweight categories are 5, 7, and 8, respectively. A total of 20 pre-trained networks have been selected, and the depth of the pre-trained network is important for classifying DR images.
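The sketch below shows the kind of transfer-learning modification applied to each pre-trained network: the ImageNet classification head is replaced with a two-class (healthy/unhealthy) layer. PyTorch/torchvision and ResNet101 are used only as an illustrative stand-in; the partial freezing shown is an assumption, not a setting reported by the authors.

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pre-trained DAG-category network (ResNet101) and replace
# its final fully connected layer with a binary DR classification head.
model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # healthy vs. unhealthy

# Optionally freeze the early convolutional blocks so fine-tuning only updates
# the last residual stage and the new head (an assumption for illustration).
for name, param in model.named_parameters():
    if not (name.startswith("layer4") or name.startswith("fc")):
        param.requires_grad = False
```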
Assessment parameters used in classification of DR images
From an exhaustive literature survey, it is noted that accuracy, sensitivity, specificity, log loss, precision, F-score, overlapping error, boundary-based evaluation, etc., are the performance metrics used by different researchers to evaluate diabetic retinopathy detection algorithms. The present work uses the accuracy, sensitivity, specificity, precision, and F1-score metrics to evaluate the classification of DR images. A sample confusion matrix from the classification of DR images is shown in Table 8.
Table 8.
The sample of the Confusion matrix from the classification of DR images
| Actual \ Predicted | Healthy (Not Diabetic Retinopathy) | Unhealthy (Diabetic Retinopathy) |
|---|---|---|
| Healthy (Not Diabetic Retinopathy) | TN | FP |
| Unhealthy (Diabetic Retinopathy) | FN | TP |
True positives (TP) are unhealthy DR images that are correctly detected, false negatives (FN) are unhealthy DR images that are incorrectly classified as healthy, false positives (FP) are healthy images that are incorrectly classified as unhealthy, and true negatives (TN) are healthy images that are correctly classified.
Accuracy is the ratio of the number of correctly identified images to the total number of images used for classification. The formula used for Accuracy is shown in Eq. 1.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
Sensitivity, or recall, measures how correctly patients with DR are identified. The formula used for Sensitivity is shown in Eq. 2.

$$\text{Sensitivity (Recall)} = \frac{TP}{TP + FN} \qquad (2)$$
It is calculated by dividing the number of patients correctly diagnosed with diabetic retinopathy by the total number of affected patients.
Specificity measures how correctly persons unaffected by DR are identified. The formula for Specificity is shown in Eq. 3, and the formulas for Precision and F1-Score are shown in Eqs. 4 and 5.

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (3)$$
Precision is the ratio of correctly classified unhealthy DR images to all images classified as unhealthy, i.e., including healthy images wrongly detected as DR.

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (4)$$
The F-Score, also known as the F1-Score, measures how accurate a model is on a given dataset and is commonly used to evaluate binary classification systems.

$$F1\text{-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (5)$$
The F1-Score is the harmonic mean of Precision and Recall.
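As a check on Eqs. 1-5, the short function below computes the five metrics from confusion-matrix counts; the example values are taken from the best ResNet101 result on the pre-processed, augmented DRD-EyePACS test set in Table 12(b).

```python
def dr_metrics(tp, tn, fp, fn):
    """Evaluation metrics of Eqs. 1-5 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)        # Eq. 1
    sensitivity = tp / (tp + fn)                      # Eq. 2 (recall)
    specificity = tn / (tn + fp)                      # Eq. 3
    precision = tp / (tp + fp)                        # Eq. 4
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Eq. 5
    return accuracy, sensitivity, specificity, precision, f1

# ResNet101, pre-processed DRD-EyePACS with augmentation (Table 12b):
# TN = 194, FP = 6, FN = 20, TP = 180 -> approx. 0.935, 0.90, 0.97, 0.97, 0.93
print(dr_metrics(tp=180, tn=194, fp=6, fn=20))
```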
Implementation details and number of experiments
The selected hyperparameters used to implement experiments are shown in Table 9.
Table 9.
The selected hyperparameters for implementation of experiments
| Hyperparameters | Details |
|---|---|
| Learning Rate | 10⁻⁴ |
| Mini-batch Size | 32 |
| Maximum Epochs | 30 |
| Optimizer | Adam |
| The selected machine for implementation of experiments | |
| Used Machine | Details |
| GPU | NVidia GEFORCE RTX 4060, 8 GB, 3072 CUDA CORE |
| Processor | 12th Gen Intel Core i7 Processor |
| Operating System | Windows 11 Home |
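A hedged PyTorch sketch of a training run with the hyperparameters of Table 9 (Adam, learning rate 10⁻⁴, mini-batch size 32, 30 epochs) is shown below. The placeholder network, random tensors, and cross-entropy loss are assumptions that only make the snippet self-contained; they are not the authors' actual setup.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and random data standing in for a fine-tuned pre-trained
# network and the augmented DR training set (assumptions for illustration).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))
train_set = TensorDataset(torch.randn(64, 3, 224, 224),
                          torch.randint(0, 2, (64,)))
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)  # mini-batch 32

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, lr 10^-4
criterion = nn.CrossEntropyLoss()  # assumed loss for the binary task

for epoch in range(30):  # maximum 30 epochs (Table 9)
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```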
In the present work, four experiments have been performed for each category of pre-trained network. The experiments performed to classify DR images are shown in Table 10.
Table 10.
Experiments Performed in the classification of DR images
| Experiment(s) | Details | Dataset |
|---|---|---|
| Experiment—1 | Using Original DR images without augmentation | DRD-EyePACS, IDRiD, APTOS-2019, and Combined dataset |
| Experiment—2 | Using Original DR images with augmentation | DRD-EyePACS, IDRiD, APTOS-2019, and Combined dataset |
| Experiment—3 | Using pre-processed dataset images without augmentation | DRD-EyePACS, IDRiD, APTOS-2019, and Combined dataset |
| Experiment—4 | Using pre-processed dataset images with augmentation | DRD-EyePACS, IDRiD, APTOS-2019, and Combined dataset |
Performance evaluation of experiments
In the present work, 20 pre-trained networks have been used to evaluate performance on the three individual DR datasets and the combined dataset, based on healthy and unhealthy classes. The pre-trained networks have been divided into series, DAG, and lightweight categories, and accuracy, sensitivity, specificity, precision, and F1-score have been calculated. Original and pre-processed DR images, with and without augmentation, have been used to evaluate the performance of the pre-trained network categories. The performance for each case is as follows:
Performance evaluation metrics using DRD-EyePACS dataset
Several factors, such as the choice of pre-trained network, data pre-processing, fine-tuning of the pre-trained network using optimal hyper-parameters, validation data, and evaluation metrics, are taken into consideration while evaluating network performance for the Series, DAG, and Lightweight architecture categories on the DRD-EyePACS Dataset.
Performance of series-based network architectures using DRD-EyePACS dataset
Table 11 shows the performance of series-based networks with and without augmentation using original and pre-processed DR images of the DRD-EyePACS Dataset.
Table 11.
The performance of series-based networks based on augmentation and without augmentation using original & pre-processed DR images of DRD-EyePACS Dataset
| (a) Using Original DRD-EyePACS Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | | | | |
| AlexNet | 136 | 64 | 65 | 0.62 | 0.68 | 0.66 | 0.64 | 176 | 24 | 85 | 0.82 | 0.88 | 0.87 | 0.85 |
| 76 | 124 | 36 | 164 | |||||||||||
| vgg16 | 140 | 60 | 67.5 | 0.65 | 0.70 | 0.68 | 0.67 | 180 | 20 | 86 | 0.82 | 0.90 | 0.89 | 0.85 |
| 70 | 130 | 36 | 164 | |||||||||||
| vgg19 | 152 | 48 | 73 | 0.70 | 0.76 | 0.74 | 0.72 | 186 | 14 | 89.5 | 0.86 | 0.93 | 0.92 | 0.89 |
| 60 | 140 | 28 | 172 | |||||||||||
| darknet19 | 130 | 70 | 66 | 0.67 | 0.65 | 0.66 | 0.66 | 182 | 18 | 86.5 | 0.82 | 0.91 | 0.90 | 0.86 |
| 66 | 134 | 36 | 164 | |||||||||||
| darknet53 | 144 | 56 | 71 | 0.70 | 0.72 | 0.71 | 0.71 | 176 | 24 | 87 | 0.86 | 0.88 | 0.88 | 0.87 |
| 60 | 140 | 28 | 172 | |||||||||||
| (b) Using Pre-processed DRD-EyePACS Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC | Sen | Sp | Pr | F1 | Confusion Matrix | ACC | Sen | Sp | Pr | F1 | ||
| AlexNet | 140 | 60 | 69 | 0.68 | 0.70 | 0.69 | 0.69 | 178 | 22 | 87 | 0.85 | 0.89 | 0.89 | 0.87 |
| 64 | 136 | 30 | 170 | |||||||||||
| vgg16 | 150 | 50 | 71 | 0.67 | 0.75 | 0.73 | 0.70 | 182 | 18 | 87.5 | 0.84 | 0.91 | 0.90 | 0.87 |
| 66 | 134 | 32 | 168 | |||||||||||
| vgg19 | 152 | 48 | 75 | 0.74 | 0.76 | 0.76 | 0.75 | 192 | 8 | 92.5 | 0.89 | 0.96 | 0.96 | 0.92 |
| 52 | 148 | 22 | 178 | |||||||||||
| darknet19 | 144 | 56 | 69 | 0.66 | 0.72 | 0.70 | 0.68 | 176 | 24 | 85 | 0.82 | 0.88 | 0.87 | 0.85 |
| 68 | 132 | 36 | 164 | |||||||||||
| darknet53 | 156 | 44 | 73 | 0.68 | 0.78 | 0.76 | 0.72 | 186 | 14 | 89.5 | 0.86 | 0.93 | 0.92 | 0.89 |
| 64 | 136 | 28 | 172 | |||||||||||
Table 11 shows that the vgg19 network in the series category achieved the highest classification accuracy using the DRD-EyePACS Dataset: 92.5% accuracy after augmentation of the pre-processed dataset. The vgg19 network achieved sensitivity, specificity, precision, and F1-score of 0.89, 0.96, 0.96, and 0.92, respectively. It is also noted that the individual class accuracy (ICA) is 192 for the healthy class and 178 for the unhealthy class, achieved using the series-based network vgg19. The best result is marked in grey.
Performance of DAG-based network architectures using DRD-EyePACS dataset
Table 12 shows the performance of DAG-based networks with and without augmentation using original and pre-processed DR images.
Table 12.
The performance of DAG-based networks based on augmentation and without augmentation using original & pre-processed DR images of DRD-EyePACS Dataset
| (a) Using Original DRD-EyePACS Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | | | | |
| inceptionv3 | 138 | 62 | 67.5 | 0.66 | 0.69 | 0.68 | 0.67 | 180 | 20 | 86.5 | 0.83 | 0.90 | 0.89 | 0.86 |
| 68 | 132 | 34 | 166 | |||||||||||
| densenet201 | 152 | 48 | 69.5 | 0.63 | 0.76 | 0.72 | 0.67 | 184 | 16 | 87.5 | 0.83 | 0.92 | 0.91 | 0.87 |
| 74 | 126 | 34 | 166 | |||||||||||
| Resnet50 | 152 | 48 | 72.5 | 0.69 | 0.76 | 0.74 | 0.72 | 182 | 18 | 88.5 | 0.86 | 0.91 | 0.91 | 0.88 |
| 62 | 138 | 28 | 172 | |||||||||||
| Resnet101 | 154 | 46 | 74.5 | 0.72 | 0.77 | 0.76 | 0.74 | 190 | 10 | 91.5 | 0.88 | 0.95 | 0.95 | 0.91 |
| 56 | 144 | 24 | 176 | |||||||||||
| xception | 152 | 48 | 71 | 0.66 | 0.76 | 0.73 | 0.69 | 184 | 16 | 87 | 0.82 | 0.92 | 0.91 | 0.86 |
| 68 | 132 | 36 | 164 | |||||||||||
| inceptionresnetv2 | 150 | 50 | 69 | 0.63 | 0.75 | 0.72 | 0.67 | 180 | 20 | 85 | 0.80 | 0.90 | 0.89 | 0.84 |
| 74 | 126 | 40 | 160 | |||||||||||
| nasnetlarge | 148 | 52 | 68.5 | 0.63 | 0.74 | 0.71 | 0.67 | 188 | 12 | 88.5 | 0.83 | 0.94 | 0.93 | 0.88 |
| 74 | 126 | 34 | 166 | |||||||||||
| (b) Using Pre-processed DRD-EyePACS Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC | Sen | Sp | Pr | F1 | Confusion Matrix | ACC | Sen | Sp | Pr | F1 | | |
| inceptionv3 | 152 | 48 | 72.5 | 0.69 | 0.76 | 0.74 | 0.72 | 184 | 16 | 88.5 | 0.85 | 0.92 | 0.91 | 0.88 |
| 62 | 138 | 30 | 170 | |||||||||||
| densenet201 | 154 | 46 | 73.5 | 0.70 | 0.77 | 0.75 | 0.73 | 184 | 16 | 89 | 0.86 | 0.92 | 0.91 | 0.89 |
| 60 | 140 | 28 | 172 | |||||||||||
| Resnet50 | 158 | 42 | 75 | 0.71 | 0.79 | 0.77 | 0.74 | 188 | 12 | 91.5 | 0.89 | 0.94 | 0.94 | 0.91 |
| 58 | 142 | 22 | 178 | |||||||||||
| Resnet101 | 162 | 38 | 77.5 | 0.74 | 0.81 | 0.80 | 0.77 | 194 | 6 | 93.5 | 0.90 | 0.97 | 0.97 | 0.93 |
| 52 | 148 | 20 | 180 | |||||||||||
| xception | 150 | 50 | 72 | 0.69 | 0.75 | 0.73 | 0.71 | 182 | 18 | 89.5 | 0.88 | 0.91 | 0.91 | 0.89 |
| 62 | 138 | 24 | 176 | |||||||||||
| inceptionresnetv2 | 148 | 52 | 70 | 0.66 | 0.74 | 0.72 | 0.69 | 184 | 16 | 89 | 0.86 | 0.92 | 0.91 | 0.89 |
| 68 | 132 | 28 | 172 | |||||||||||
| nasnetlarge | 142 | 58 | 69.5 | 0.68 | 0.71 | 0.70 | 0.69 | 176 | 24 | 86.5 | 0.85 | 0.88 | 0.88 | 0.86 |
| 64 | 136 | 30 | 170 | |||||||||||
From Table 12, it is concluded that the Resnet101 network in the DAG category achieved the highest classification accuracy using the DRD-EyePACS Dataset: 93.5% accuracy after augmentation of the pre-processed dataset. The Resnet101 network achieved sensitivity, specificity, precision, and F1-score of 0.90, 0.97, 0.97, and 0.93, respectively. It is also noted that the individual class accuracy (ICA) is 194 for the healthy class and 180 for the unhealthy class, achieved using the DAG-based network Resnet101. The best result is marked in grey.
Performance of lightweight based network architectures using DRD-EyePACS dataset
Table 13 shows the performance of Lightweight networks based on augmentation and without augmentation using original and pre-processed DR images of the DRD-EyePACS Dataset.
Table 13.
The performance of Lightweight networks based on augmentation and without augmentation using original & pre-processed DR images of the DRD-EyePACS Dataset
| (a) Using Original DRD-EyePACS Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC | Sen | Sp | Pr | F1 | ACC | Sen | Sp | Pr | F1 | |||||
| SqueezeNet | 142 | 58 | 69.5 | 0.68 | 0.71 | 0.70 | 0.69 | 178 | 22 | 85 | 0.81 | 0.89 | 0.88 | 0.84 |
| 64 | 136 | 38 | 162 | |||||||||||
| mobilenetv2 | 154 | 46 | 72 | 0.67 | 0.77 | 0.74 | 0.71 | 180 | 20 | 87.5 | 0.85 | 0.90 | 0.89 | 0.87 |
| 66 | 134 | 30 | 170 | |||||||||||
| shufflenet | 158 | 42 | 75.5 | 0.72 | 0.79 | 0.77 | 0.75 | 188 | 12 | 91 | 0.88 | 0.94 | 0.94 | 0.91 |
| 56 | 144 | 24 | 176 | |||||||||||
| nasnetmobile | 152 | 48 | 72 | 0.68 | 0.76 | 0.74 | 0.71 | 180 | 20 | 88.5 | 0.87 | 0.90 | 0.90 | 0.88 |
| 64 | 136 | 26 | 174 | |||||||||||
| efficientnetb0 | 140 | 60 | 67 | 0.64 | 0.70 | 0.68 | 0.66 | 176 | 24 | 85 | 0.82 | 0.88 | 0.87 | 0.85 |
| 72 | 128 | 36 | 164 | |||||||||||
| GoogleNet | 138 | 62 | 64.5 | 0.60 | 0.69 | 0.66 | 0.63 | 170 | 30 | 83.5 | 0.82 | 0.85 | 0.85 | 0.83 |
| 80 | 120 | 36 | 164 | |||||||||||
| googlenet-places365 | 150 | 50 | 71 | 0.67 | 0.75 | 0.73 | 0.70 | 182 | 18 | 87 | 0.83 | 0.91 | 0.90 | 0.86 |
| 66 | 134 | 34 | 166 | |||||||||||
| resnet18 | 142 | 58 | 68 | 0.65 | 0.71 | 0.69 | 0.67 | 176 | 24 | 90.5 | 0.93 | 0.88 | 0.89 | 0.91 |
| 70 | 130 | 14 | 186 | |||||||||||
| (b) Using Pre-processed DRD-EyePACS Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | | |
| SqueezeNet | 150 | 50 | 71 | 0.67 | 0.75 | 0.73 | 0.70 | 180 | 20 | 87.5 | 0.85 | 0.90 | 0.89 | 0.87 |
| 66 | 134 | 30 | 170 | |||||||||||
| mobilenetv2 | 158 | 42 | 74 | 0.69 | 0.79 | 0.77 | 0.73 | 186 | 14 | 90.5 | 0.88 | 0.93 | 0.93 | 0.90 |
| 62 | 138 | 24 | 176 | |||||||||||
| shufflenet | 160 | 40 | 76.5 | 0.73 | 0.80 | 0.78 | 0.76 | 192 | 8 | 94.5 | 0.93 | 0.96 | 0.96 | 0.94 |
| 54 | 146 | 14 | 186 | |||||||||||
| nasnetmobile | 158 | 42 | 74 | 0.69 | 0.79 | 0.77 | 0.73 | 184 | 16 | 90.5 | 0.89 | 0.92 | 0.92 | 0.90 |
| 62 | 138 | 22 | 178 | |||||||||||
| efficientnetb0 | 150 | 50 | 71 | 0.67 | 0.75 | 0.73 | 0.70 | 178 | 22 | 86.5 | 0.84 | 0.89 | 0.88 | 0.86 |
| 66 | 134 | 32 | 168 | |||||||||||
| GoogleNet | 142 | 58 | 68 | 0.65 | 0.71 | 0.69 | 0.67 | 180 | 20 | 86 | 0.82 | 0.90 | 0.89 | 0.85 |
| 70 | 130 | 36 | 164 | |||||||||||
| googlenet-places365 | 152 | 48 | 72.5 | 0.69 | 0.76 | 0.74 | 0.72 | 182 | 18 | 89.5 | 0.88 | 0.91 | 0.91 | 0.89 |
| 62 | 138 | 24 | 176 | |||||||||||
| resnet18 | 152 | 48 | 70 | 0.64 | 0.76 | 0.73 | 0.68 | 176 | 24 | 86.5 | 0.85 | 0.88 | 0.88 | 0.86 |
| 72 | 128 | 30 | 170 | |||||||||||
From Table 13, it is concluded that the shufflenet network in the Lightweight category achieved the highest classification accuracy using the DRD-EyePACS Dataset: 94.5% accuracy after augmentation of the pre-processed dataset. The shufflenet network achieved sensitivity, specificity, precision, and F1-score of 0.93, 0.96, 0.96, and 0.94, respectively. It is also noted that the individual class accuracy (ICA) is 192 for the healthy class and 186 for the unhealthy class, achieved using the lightweight network shufflenet. The best result is marked in grey.
Performance evaluation metrics using IDRiD Dataset
Several factors, such as the choice of pre-trained network, data pre-processing, fine-tuning of the pre-trained network using optimal hyper-parameters, validation data, and evaluation metrics, are taken into consideration while evaluating network performance for the Series, DAG, and Lightweight architecture categories on the IDRiD Dataset.
Performance of series-based network architectures using IDRiD dataset
The performance of series-based networks with and without augmentation using original and pre-processed DR images of the IDRiD Dataset is shown in Table 14.
Table 14.
The performance of series-based networks based on augmentation and without augmentation using original & pre-processed DR images of IDRiD Dataset
| (a) Using Original IDRiD Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | |||||
| AlexNet | 42 | 8 | 64 | 0.44 | 0.84 | 0.73 | 0.55 | 46 | 4 | 85 | 0.78 | 0.92 | 0.91 | 0.84 |
| 28 | 22 | 11 | 39 | |||||||||||
| vgg16 | 43 | 7 | 67 | 0.48 | 0.86 | 0.77 | 0.59 | 46 | 4 | 86 | 0.80 | 0.92 | 0.91 | 0.85 |
| 26 | 24 | 10 | 40 | |||||||||||
| vgg19 | 42 | 8 | 72 | 0.60 | 0.84 | 0.79 | 0.68 | 47 | 3 | 90 | 0.86 | 0.94 | 0.93 | 0.90 |
| 20 | 30 | 7 | 43 | |||||||||||
| darknet19 | 39 | 11 | 62 | 0.46 | 0.78 | 0.68 | 0.55 | 44 | 6 | 84 | 0.80 | 0.88 | 0.87 | 0.83 |
| 27 | 23 | 10 | 40 | |||||||||||
| darknet53 | 41 | 9 | 71 | 0.60 | 0.82 | 0.77 | 0.67 | 46 | 4 | 87 | 0.82 | 0.92 | 0.91 | 0.86 |
| 20 | 30 | 9 | 41 | |||||||||||
| (b) Using Pre-processed IDRiD Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | ||
| AlexNet | 45 | 5 | 68% | 0.56 | 0.90 | 0.85 | 0.67 | 47 | 3 | 87 | 0.80 | 0.94 | 0.93 | 0.86 |
| 22 | 28 | 10 | 40 | |||||||||||
| vgg16 | 46 | 4 | 71% | 0.50 | 0.92 | 0.86 | 0.63 | 47 | 3 | 88 | 0.82 | 0.94 | 0.93 | 0.87 |
| 25 | 25 | 9 | 41 | |||||||||||
| Vgg19 | 45 | 05 | 73% | 0.56 | 0.90 | 0.85 | 0.67 | 48 | 2 | 92 | 0.88 | 0.96 | 0.96 | 0.92 |
| 22 | 28 | 6 | 44 | |||||||||||
| darknet19 | 38 | 12 | 66% | 0.56 | 0.76 | 0.70 | 0.62 | 46 | 4 | 88 | 0.84 | 0.92 | 0.91 | 0.88 |
| 22 | 28 | 8 | 42 | |||||||||||
| darknet53 | 46 | 4 | 72% | 0.52 | 0.92 | 0.87 | 0.65 | 45 | 5 | 89 | 0.88 | 0.90 | 0.90 | 0.89 |
| 24 | 26 | 6 | 44 | |||||||||||
Table 14 shows that the vgg19 network in the series category achieved the highest classification accuracy using the IDRiD Dataset: 92% accuracy after augmentation of the pre-processed dataset. The vgg19 network achieved sensitivity, specificity, precision, and F1-score of 0.88, 0.96, 0.96, and 0.92, respectively. It is also noted that the individual class accuracy (ICA) is 48 for the healthy class and 44 for the unhealthy class, achieved using the series-based network vgg19. The best result is marked in grey.
Performance of DAG-based network architectures using IDRiD dataset
The performance of DAG-based networks with and without augmentation using original and pre-processed DR images of the IDRiD Dataset is shown in Table 15.
Table 15.
The performance of DAG-based networks based on augmentation and without augmentation using original & pre-processed DR images of IDRiD Dataset
| (a) Using Original IDRiD Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC | Sen | Sp | Pr | F1 | ACC | Sen | Sp | Pr | F1 | |||||
| inceptionv3 | 40 | 10 | 67 | 0.54 | 0.80 | 0.73 | 0.62 | 46 | 4 | 88 | 0.84 | 0.92 | 0.91 | 0.88 |
| 23 | 27 | 8 | 42 | |||||||||||
| densenet201 | 42 | 8 | 69 | 0.50 | 0.84 | 0.76 | 0.60 | 45 | 5 | 87 | 0.84 | 0.90 | 0.89 | 0.87 |
| 25 | 25 | 8 | 42 | |||||||||||
| Resnet50 | 41 | 9 | 70 | 0.58 | 0.82 | 0.76 | 0.66 | 47 | 3 | 88 | 0.82 | 0.94 | 0.93 | 0.87 |
| 21 | 29 | 9 | 41 | |||||||||||
| Resnet101 | 44 | 6 | 72 | 0.56 | 0.88 | 0.82 | 0.67 | 47 | 3 | 90 | 0.86 | 0.94 | 0.93 | 0.90 |
| 22 | 28 | 7 | 43 | |||||||||||
| xception | 43 | 7 | 69 | 0.52 | 0.86 | 0.79 | 0.63 | 42 | 8 | 85 | 0.86 | 0.84 | 0.84 | 0.85 |
| 24 | 26 | 7 | 43 | |||||||||||
| inceptionresnetv2 | 38 | 12 | 67 | 0.58 | 0.76 | 0.71 | 0.64 | 45 | 5 | 84 | 0.78 | 0.90 | 0.89 | 0.83 |
| 21 | 29 | 11 | 39 | |||||||||||
| nasnetlarge | 37 | 13 | 68 | 0.62 | 0.74 | 0.70 | 0.66 | 46 | 4 | 86 | 0.80 | 0.92 | 0.91 | 0.85 |
| 19 | 31 | 10 | 40 | |||||||||||
| (b) Using Pre-processed IDRiD Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC | Sen | Sp | Pr | F1 | Confusion Matrix | ACC | Sen | Sp | Pr | F1 | ||
| inceptionv3 | 41 | 9 | 68 | 0.54 | 0.82 | 0.75 | 0.63 | 46 | 4 | 89 | 0.86 | 0.92 | 0.91 | 0.89 |
| 23 | 27 | 7 | 43 | |||||||||||
| densenet201 | 43 | 7 | 71 | 0.56 | 0.86 | 0.80 | 0.66 | 47 | 3 | 90 | 0.86 | 0.94 | 0.93 | 0.90 |
| 22 | 28 | 7 | 43 | |||||||||||
| Resnet50 | 43 | 7 | 72 | 0.58 | 0.86 | 0.81 | 0.67 | 45 | 5 | 90 | 0.90 | 0.90 | 0.90 | 0.90 |
| 21 | 29 | 5 | 45 | |||||||||||
| Resnet101 | 44 | 6 | 73 | 0.58 | 0.88 | 0.83 | 0.68 | 47 | 3 | 92 | 0.90 | 0.94 | 0.94 | 0.92 |
| 21 | 29 | 5 | 45 | |||||||||||
| xception | 44 | 6 | 70 | 0.52 | 0.88 | 0.81 | 0.63 | 47 | 3 | 88 | 0.82 | 0.94 | 0.93 | 0.87 |
| 24 | 26 | 9 | 41 | |||||||||||
| inceptionresnetv2 | 40 | 10 | 69 | 0.58 | 0.80 | 0.74 | 0.65 | 43 | 7 | 86 | 0.86 | 0.86 | 0.86 | 0.86 |
| 21 | 29 | 7 | 43 | |||||||||||
| nasnetlarge | 39 | 11 | 70 | 0.62 | 0.78 | 0.74 | 0.67 | 47 | 3 | 88 | 0.82 | 0.94 | 0.93 | 0.87 |
| 19 | 31 | 9 | 41 | |||||||||||
From Table 15, it is concluded that the Resnet101 network in the DAG category achieved the highest classification accuracy using the IDRiD Dataset: 92% accuracy after augmentation of the pre-processed dataset. The Resnet101 network achieved sensitivity, specificity, precision, and F1-score of 0.90, 0.94, 0.94, and 0.92, respectively. It is also noted that the individual class accuracy (ICA) is 47 for the healthy class and 45 for the unhealthy class, achieved using the DAG-based network Resnet101. The best result is marked in grey.
Performance of Lightweight based network architectures using IDRiD dataset
Table 16 shows the performance of Lightweight networks based on augmentation and without augmentation using original and pre-processed DR images of the IDRiD Dataset.
Table 16.
The performance of Lightweight pre-trained networks based on augmentation and without augmentation using original & pre-processed DR images of IDRiD Dataset
| (a) Using Original IDRiD Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | | | | |
| SqueezeNet | 41 | 9 | 68 | 0.54 | 0.82 | 0.75 | 0.63 | 44 | 6 | 83 | 0.78 | 0.88 | 0.87 | 0.82 |
| 23 | 27 | 11 | 39 | |||||||||||
| mobilenetv2 | 44 | 6 | 73 | 0.58 | 0.88 | 0.83 | 0.68 | 45 | 5 | 89 | 0.88 | 0.90 | 0.90 | 0.89 |
| 21 | 29 | 6 | 44 | |||||||||||
| shufflenet | 44 | 6 | 74 | 0.60 | 0.88 | 0.83 | 0.70 | 46 | 4 | 90 | 0.88 | 0.92 | 0.92 | 0.90 |
| 20 | 30 | 6 | 44 | |||||||||||
| nasnetmobile | 40 | 10 | 69 | 0.58 | 0.80 | 0.74 | 0.65 | 45 | 5 | 88 | 0.86 | 0.90 | 0.90 | 0.88 |
| 21 | 29 | 7 | 43 | |||||||||||
| efficientnetb0 | 38 | 12 | 65 | 0.54 | 0.76 | 0.69 | 0.61 | 41 | 9 | 82 | 0.82 | 0.82 | 0.82 | 0.82 |
| 23 | 27 | 9 | 41 | |||||||||||
| GoogleNet | 39 | 11 | 65 | 0.52 | 0.78 | 0.70 | 0.60 | 43 | 7 | 83 | 0.80 | 0.86 | 0.85 | 0.82 |
| 24 | 26 | 10 | 40 | |||||||||||
| googlenet-places365 | 39 | 11 | 67 | 0.56 | 0.78 | 0.72 | 0.63 | 45 | 5 | 86 | 0.82 | 0.90 | 0.89 | 0.85 |
| 22 | 28 | 9 | 41 | |||||||||||
| resnet18 | 38 | 12 | 66 | 0.56 | 0.76 | 0.70 | 0.62 | 44 | 6 | 84 | 0.80 | 0.88 | 0.87 | 0.83 |
| 22 | 28 | 10 | 40 | |||||||||||
| (b) Using Pre-processed IDRiD Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | ||
| SqueezeNet | 40 | 10 | 70 | 0.60 | 0.80 | 0.75 | 0.67 | 45 | 5 | 86 | 0.82 | 0.90 | 0.89 | 0.85 |
| 20 | 30 | 9 | 41 | |||||||||||
| mobilenetv2 | 44 | 6 | 74 | 0.60 | 0.88 | 0.83 | 0.70 | 45 | 5 | 90 | 0.90 | 0.90 | 0.90 | 0.90 |
| 20 | 30 | 5 | 45 | |||||||||||
| shufflenet | 45 | 5 | 75 | 0.60 | 0.90 | 0.86 | 0.71 | 46 | 4 | 91 | 0.90 | 0.92 | 0.92 | 0.91 |
| 20 | 30 | 5 | 45 | |||||||||||
| nasnetmobile | 40 | 10 | 71 | 0.62 | 0.80 | 0.76 | 0.68 | 46 | 4 | 89 | 0.86 | 0.92 | 0.91 | 0.89 |
| 19 | 31 | 7 | 43 | |||||||||||
| efficientnetb0 | 39 | 11 | 66 | 0.54 | 0.78 | 0.71 | 0.61 | 44 | 6 | 84 | 0.80 | 0.88 | 0.87 | 0.83 |
| 23 | 27 | 10 | 40 | |||||||||||
| GoogleNet | 40 | 10 | 68 | 0.56 | 0.80 | 0.74 | 0.64 | 43 | 7 | 85 | 0.84 | 0.86 | 0.86 | 0.85 |
| 22 | 28 | 8 | 42 | |||||||||||
| googlenet-places365 | 40 | 10 | 70 | 0.60 | 0.80 | 0.75 | 0.67 | 45 | 5 | 88 | 0.86 | 0.90 | 0.90 | 0.88 |
| 20 | 30 | 7 | 43 | |||||||||||
| resnet18 | 40 | 10 | 68 | 0.56 | 0.80 | 0.74 | 0.64 | 44 | 6 | 85 | 0.82 | 0.88 | 0.87 | 0.85 |
| 22 | 28 | 9 | 41 | |||||||||||
From Table 16, it is concluded that the shufflenet network in the Lightweight category achieved the highest classification accuracy using the IDRiD Dataset: 91% accuracy after augmentation of the pre-processed dataset. The shufflenet network achieved sensitivity, specificity, precision, and F1-score of 0.90, 0.92, 0.92, and 0.91, respectively. It is also noted that the individual class accuracy (ICA) is 46 for the healthy class and 45 for the unhealthy class, achieved using the lightweight network shufflenet. The best result is marked in grey.
Performance evaluation metrics using the APTOS-2019 Dataset
Several factors, such as the choice of pre-trained network, data pre-processing, fine-tuning of the pre-trained network using optimal hyper-parameters, validation data, and evaluation metrics, are taken into consideration while evaluating network performance for the Series, DAG, and Lightweight architecture categories on the APTOS-2019 Dataset.
Performance of series-based network architectures using APTOS-2019 Dataset
The performance of series-based networks with and without augmentation using original & pre-processed DR images of the APTOS-2019 Dataset is shown in Table 17.
Table 17.
The performance of series-based networks based on augmentation and without augmentation using original & pre-processed DR images of APTOS-2019 Dataset
| (a) Using Original APTOS-2019 Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | | | | |
| AlexNet | 140 | 60 | 66 | 0.62 | 0.70 | 0.67 | 0.65 | 172 | 28 | 85 | 0.84 | 0.86 | 0.86 | 0.85 |
| 76 | 124 | 32 | 168 | |||||||||||
| vgg16 | 141 | 59 | 68.5 | 0.67 | 0.71 | 0.69 | 0.68 | 180 | 20 | 87.5 | 0.85 | 0.90 | 0.89 | 0.87 |
| 67 | 133 | 30 | 170 | |||||||||||
| vgg19 | 152 | 48 | 72.5 | 0.69 | 0.76 | 0.74 | 0.72 | 188 | 12 | 91.5 | 0.89 | 0.94 | 0.94 | 0.91 |
| 62 | 138 | 22 | 178 | |||||||||||
| darknet19 | 142 | 58 | 68 | 0.65 | 0.71 | 0.69 | 0.67 | 180 | 20 | 87 | 0.84 | 0.90 | 0.89 | 0.87 |
| 70 | 130 | 32 | 168 | |||||||||||
| darknet53 | 141 | 59 | 67 | 0.64 | 0.71 | 0.68 | 0.66 | 177 | 23 | 86.5 | 0.85 | 0.89 | 0.88 | 0.86 |
| 73 | 127 | 31 | 169 | |||||||||||
| (b) Using Pre-processed APTOS-2019 Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | | |
| AlexNet | 145 | 55 | 67.5 | 0.63 | 0.73 | 0.69 | 0.66 | 179 | 21 | 86.5 | 0.84 | 0.90 | 0.89 | 0.86 |
| 75 | 125 | 33 | 167 | |||||||||||
| vgg16 | 145 | 55 | 70 | 0.68 | 0.73 | 0.71 | 0.69 | 184 | 16 | 88.5 | 0.85 | 0.92 | 0.91 | 0.88 |
| 65 | 135 | 30 | 170 | |||||||||||
| vgg19 | 153 | 47 | 73 | 0.70 | 0.77 | 0.75 | 0.72 | 191 | 9 | 92.5 | 0.90 | 0.96 | 0.95 | 0.92 |
| 61 | 139 | 21 | 179 | |||||||||||
| darknet19 | 145 | 55 | 69.5 | 0.67 | 0.73 | 0.71 | 0.69 | 182 | 18 | 88 | 0.85 | 0.91 | 0.90 | 0.88 |
| 67 | 133 | 30 | 170 | |||||||||||
| darknet53 | 143 | 57 | 68 | 0.65 | 0.72 | 0.69 | 0.67 | 180 | 20 | 87.5 | 0.85 | 0.90 | 0.89 | 0.87 |
| 71 | 129 | 30 | 170 | |||||||||||
From Table 17, it is concluded that vgg19 networks in the category of series achieved the highest classification accuracy using the ATOS-2019 Dataset. It is also observed that 92.5% accuracy has been achieved after the augmentation of the ATOS-2019 Dataset. It is also noted that the vgg19-based network achieved 0.90, 0.96, 0.95, and 0.92 evaluation parameters named sensitivity, Specificity, Precision, and F1-Score, respectively. It also noted that individual class accuracy (ICA) for a healthy eye is 191 and ICA for an unhealthy eye is 179, which was achieved using series-based network vgg19. The best result is marked in grey.
Performance of DAG-based network architectures using APTOS-2019 dataset
Table 18 shows the performance of DAG-based networks with and without augmentation using original and pre-processed DR images of the APTOS-2019 Dataset.
Table 18.
The performance of DAG-based networks with and without augmentation using original and pre-processed DR images of the APTOS-2019 Dataset
| (a) Using Original APTOS-2019 Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 |
| inceptionv3 | 147 | 53 | 71.5 | 0.70 | 0.74 | 0.72 | 0.71 | 181 | 19 | 86 | 0.82 | 0.91 | 0.90 | 0.85 |
| 61 | 139 | 37 | 163 | |||||||||||
| densenet201 | 145 | 55 | 70.5 | 0.69 | 0.73 | 0.71 | 0.70 | 185 | 15 | 88 | 0.84 | 0.93 | 0.92 | 0.87 |
| 63 | 137 | 33 | 167 | |||||||||||
| Resnet50 | 146 | 54 | 71.5 | 0.70 | 0.73 | 0.72 | 0.71 | 185 | 15 | 90 | 0.88 | 0.93 | 0.92 | 0.90 |
| 60 | 140 | 25 | 175 | |||||||||||
| Resnet101 | 150 | 50 | 74 | 0.73 | 0.75 | 0.74 | 0.74 | 189 | 11 | 91.5 | 0.89 | 0.95 | 0.94 | 0.91 |
| 54 | 146 | 23 | 177 | |||||||||||
| xception | 144 | 56 | 70 | 0.68 | 0.72 | 0.71 | 0.69 | 181 | 19 | 86.5 | 0.83 | 0.91 | 0.90 | 0.86 |
| 64 | 136 | 35 | 165 | |||||||||||
| inceptionresnetv2 | 140 | 60 | 67 | 0.64 | 0.70 | 0.68 | 0.66 | 170 | 30 | 83.5 | 0.82 | 0.85 | 0.85 | 0.83 |
| 72 | 128 | 36 | 164 | |||||||||||
| nasnetlarge | 145 | 55 | 70.5 | 0.69 | 0.73 | 0.71 | 0.70 | 182 | 18 | 87.5 | 0.84 | 0.91 | 0.90 | 0.87 |
| 63 | 137 | 32 | 168 | |||||||||||
| (b) Using Pre-processed APTOS-2019 Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 |
| inceptionv3 | 146 | 54 | 72.5 | 0.72 | 0.73 | 0.73 | 0.72 | 185 | 15 | 87.5 | 0.83 | 0.93 | 0.92 | 0.87 |
| 56 | 144 | 35 | 165 | |||||||||||
| densenet201 | 146 | 54 | 71.5 | 0.70 | 0.73 | 0.72 | 0.71 | 187 | 13 | 89 | 0.85 | 0.94 | 0.93 | 0.88 |
| 60 | 140 | 31 | 169 | |||||||||||
| Resnet50 | 151 | 49 | 73 | 0.71 | 0.76 | 0.74 | 0.72 | 187 | 13 | 91 | 0.89 | 0.94 | 0.93 | 0.91 |
| 59 | 141 | 23 | 177 | |||||||||||
| Resnet101 | 152 | 48 | 74.5 | 0.73 | 0.76 | 0.75 | 0.74 | 193 | 7 | 94 | 0.92 | 0.97 | 0.96 | 0.94 |
| 54 | 146 | 17 | 183 | |||||||||||
| xception | 146 | 54 | 71 | 0.69 | 0.73 | 0.72 | 0.70 | 185 | 15 | 88 | 0.84 | 0.93 | 0.92 | 0.87 |
| 62 | 138 | 33 | 167 | |||||||||||
| inceptionresnetv2 | 141 | 59 | 68 | 0.66 | 0.71 | 0.69 | 0.67 | 175 | 25 | 85 | 0.83 | 0.88 | 0.87 | 0.85 |
| 69 | 131 | 35 | 165 | |||||||||||
| nasnetlarge | 147 | 53 | 72 | 0.71 | 0.74 | 0.73 | 0.72 | 187 | 13 | 89.5 | 0.86 | 0.94 | 0.93 | 0.89 |
| 59 | 141 | 29 | 171 | |||||||||||
From Table 18, it is concluded that the ResNet101 network in the DAG category achieved the highest classification accuracy on the APTOS-2019 Dataset. An accuracy of 94% has been achieved after augmentation of the APTOS-2019 Dataset, with sensitivity, specificity, precision, and F1-score of 0.92, 0.97, 0.96, and 0.94, respectively. The Individual Class Accuracy (ICA) is 193 for the healthy class and 183 for the unhealthy class, achieved using the DAG network ResNet101. The best result is marked in grey.
Performance of Lightweight-based network architectures using APTOS-2019 dataset
Table 19 shows the performance of Lightweight networks with and without augmentation using original and pre-processed DR images of the APTOS-2019 Dataset.
Table 19.
The performance of Lightweight networks with and without augmentation using original and pre-processed DR images of the APTOS-2019 Dataset
| (a) Using Original APTOS-2019 Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | |||||
| SqueezeNet | 143 | 57 | 70.5 | 0.70 | 0.72 | 0.71 | 0.70 | 180 | 20 | 86.5 | 0.83 | 0.90 | 0.89 | 0.86 |
| 61 | 139 | 34 | 166 | |||||||||||
| mobilenetv2 | 145 | 55 | 71 | 0.70 | 0.73 | 0.72 | 0.71 | 176 | 24 | 87 | 0.86 | 0.88 | 0.88 | 0.87 |
| 61 | 139 | 28 | 172 | |||||||||||
| shufflenet | 149 | 51 | 72.5 | 0.71 | 0.75 | 0.73 | 0.72 | 187 | 13 | 91.5 | 0.90 | 0.94 | 0.93 | 0.91 |
| 59 | 141 | 21 | 179 | |||||||||||
| nasnetmobile | 143 | 57 | 70 | 0.69 | 0.72 | 0.71 | 0.70 | 182 | 18 | 86.5 | 0.82 | 0.91 | 0.90 | 0.86 |
| 63 | 137 | 36 | 164 | |||||||||||
| efficientnetb0 | 135 | 65 | 66 | 0.65 | 0.68 | 0.66 | 0.65 | 170 | 30 | 83.5 | 0.82 | 0.85 | 0.85 | 0.83 |
| 71 | 129 | 36 | 164 | |||||||||||
| GoogleNet | 130 | 70 | 64.5 | 0.64 | 0.65 | 0.65 | 0.64 | 165 | 35 | 81.75 | 0.81 | 0.83 | 0.82 | 0.82 |
| 72 | 128 | 38 | 162 | |||||||||||
| googlenet-places365 | 141 | 59 | 69 | 0.68 | 0.71 | 0.70 | 0.69 | 175 | 25 | 85.5 | 0.84 | 0.88 | 0.87 | 0.85 |
| 65 | 135 | 33 | 167 | |||||||||||
| resnet18 | 136 | 64 | 66 | 0.64 | 0.68 | 0.67 | 0.65 | 174 | 26 | 84 | 0.81 | 0.87 | 0.86 | 0.84 |
| 72 | 128 | 38 | 162 | |||||||||||
| (b) Using Pre-processed APTOS-2019 Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 |
| SqueezeNet | 148 | 52 | 72 | 0.70 | 0.74 | 0.73 | 0.71 | 183 | 17 | 87.5 | 0.84 | 0.92 | 0.91 | 0.87 |
| 60 | 140 | 33 | 167 | |||||||||||
| mobilenetv2 | 149 | 51 | 72.5 | 0.71 | 0.75 | 0.73 | 0.72 | 180 | 20 | 89.5 | 0.89 | 0.90 | 0.90 | 0.89 |
| 59 | 141 | 22 | 178 | |||||||||||
| shufflenet | 153 | 47 | 73.5 | 0.71 | 0.77 | 0.75 | 0.73 | 191 | 9 | 92.5 | 0.90 | 0.96 | 0.95 | 0.92 |
| 59 | 141 | 21 | 179 | |||||||||||
| nasnetmobile | 145 | 55 | 71 | 0.70 | 0.73 | 0.72 | 0.71 | 185 | 15 | 89 | 0.86 | 0.93 | 0.92 | 0.89 |
| 61 | 139 | 29 | 171 | |||||||||||
| efficientnetb0 | 137 | 63 | 67 | 0.66 | 0.69 | 0.68 | 0.66 | 175 | 25 | 85.5 | 0.84 | 0.88 | 0.87 | 0.85 |
| 69 | 131 | 33 | 167 | |||||||||||
| GoogleNet | 135 | 65 | 66 | 0.65 | 0.68 | 0.66 | 0.65 | 172 | 28 | 85 | 0.84 | 0.86 | 0.86 | 0.85 |
| 71 | 129 | 32 | 168 | |||||||||||
| googlenet-places365 | 143 | 57 | 70 | 0.69 | 0.72 | 0.71 | 0.70 | 180 | 20 | 87.5 | 0.85 | 0.90 | 0.89 | 0.87 |
| 63 | 137 | 30 | 170 | |||||||||||
| resnet18 | 139 | 61 | 67 | 0.65 | 0.70 | 0.68 | 0.66 | 175 | 25 | 85.5 | 0.84 | 0.88 | 0.87 | 0.85 |
| 71 | 129 | 33 | 167 | |||||||||||
From Table 19, it is concluded that the shufflenet network in the Lightweight category achieved the highest classification accuracy on the APTOS-2019 Dataset. An accuracy of 92.5% has been achieved after augmentation of the APTOS-2019 Dataset, with sensitivity, specificity, precision, and F1-score of 0.90, 0.96, 0.95, and 0.92, respectively. The Individual Class Accuracy (ICA) is 191 for the healthy class and 179 for the unhealthy class, achieved using the Lightweight network shufflenet. The best result is marked in grey.
Performance evaluation metrics using the combined dataset
Several factors, such as the choice of pre-trained network, data pre-processing, fine-tuning of the pre-trained network with optimal hyper-parameters, validation data, and evaluation metrics, are taken into consideration while evaluating network performance for the series, DAG, and Lightweight architecture categories on the Combined Dataset.
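As an illustration of the data pre-processing and augmentation factors listed above, the following sketch shows one possible augmentation and normalization pipeline for the "after augmentation" experiments. It is an assumption based on common practice (PyTorch/torchvision) rather than the authors' exact configuration, and the specific transform parameters are illustrative.

```python
from torchvision import transforms

# Augmentation + normalization applied to training images (illustrative settings).
train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),                     # input size expected by most pre-trained nets
    transforms.RandomHorizontalFlip(p=0.5),            # mirror fundus images
    transforms.RandomRotation(degrees=15),             # small random rotations
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet channel statistics
                         std=[0.229, 0.224, 0.225]),
])

# Validation / test images are only resized and normalized, never augmented.
eval_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```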
Performance of series-based network architectures using combined dataset
The performance of series-based networks with and without augmentation, using original and pre-processed DR images of the Combined Dataset, is shown in Table 20.
Table 20.
The performance of series-based networks with and without augmentation using original and pre-processed DR images of the Combined Dataset
| (a) Using Original Combined Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | |||||
| AlexNet | 318 | 132 | 65.33 | 0.60 | 0.71 | 0.67 | 0.63 | 415 | 35 | 91 | 0.90 | 0.92 | 0.92 | 0.91 |
| 180 | 270 | 46 | 404 | |||||||||||
| vgg16 | 324 | 126 | 67.88 | 0.64 | 0.72 | 0.69 | 0.67 | 409 | 41 | 90 | 0.89 | 0.91 | 0.91 | 0.90 |
| 163 | 287 | 49 | 401 | |||||||||||
| vgg19 | 346 | 104 | 72.66 | 0.68 | 0.77 | 0.75 | 0.71 | 428 | 22 | 94.66 | 0.94 | 0.95 | 0.95 | 0.95 |
| 142 | 308 | 26 | 424 | |||||||||||
| darknet19 | 311 | 139 | 66.44 | 0.64 | 0.69 | 0.67 | 0.66 | 421 | 29 | 92 | 0.90 | 0.94 | 0.93 | 0.92 |
| 163 | 287 | 43 | 407 | |||||||||||
| darknet53 | 326 | 124 | 69.22 | 0.66 | 0.72 | 0.71 | 0.68 | 410 | 40 | 90.11 | 0.89 | 0.91 | 0.91 | 0.90 |
| 153 | 297 | 49 | 401 | |||||||||||
| (b) Using Pre-processed Combined Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | ||
| AlexNet | 330 | 120 | 68.77 | 0.64 | 0.73 | 0.71 | 0.67 | 419 | 31 | 92 | 0.91 | 0.93 | 0.93 | 0.92 |
| 161 | 289 | 41 | 409 | |||||||||||
| vgg16 | 341 | 109 | 70.55 | 0.65 | 0.76 | 0.73 | 0.69 | 413 | 37 | 91.11 | 0.90 | 0.92 | 0.92 | 0.91 |
| 156 | 294 | 43 | 407 | |||||||||||
| vgg19 | 350 | 100 | 73.88 | 0.70 | 0.78 | 0.76 | 0.73 | 435 | 15 | 96.22 | 0.96 | 0.97 | 0.97 | 0.96 |
| 135 | 315 | 19 | 431 | |||||||||||
| darknet19 | 327 | 123 | 68.88 | 0.65 | 0.73 | 0.70 | 0.68 | 425 | 25 | 93 | 0.92 | 0.94 | 0.94 | 0.93 |
| 157 | 293 | 38 | 412 | |||||||||||
| darknet53 | 345 | 105 | 70.66 | 0.65 | 0.77 | 0.73 | 0.69 | 421 | 29 | 91 | 0.88 | 0.94 | 0.93 | 0.91 |
| 159 | 291 | 52 | 398 | |||||||||||
Table 20 shows that the vgg19 network in the series category achieved the highest classification accuracy on the Combined Dataset. An accuracy of 96.22% has been achieved after augmentation of the Combined Dataset, with sensitivity, specificity, precision, and F1-score of 0.96, 0.97, 0.97, and 0.96, respectively. The Individual Class Accuracy (ICA) is 435 for the healthy class and 431 for the unhealthy class, achieved using the series network vgg19. The best result is marked in grey.
Performance of DAG-based network architectures using combined dataset
Table 21 shows the performance of DAG-based networks with and without augmentation using original and pre-processed DR images of the Combined Dataset.
Table 21.
The performance of DAG-based networks with and without augmentation using original and pre-processed DR images of the Combined Dataset
| (a) Using Original Combined Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 | |||||
| inceptionv3 | 325 | 125 | 69.2 | 0.66 | 0.72 | 0.70 | 0.68 | 415 | 35 | 89.55 | 0.87 | 0.92 | 0.92 | 0.89 |
| 152 | 298 | 59 | 391 | |||||||||||
| densenet201 | 339 | 111 | 69.6 | 0.64 | 0.75 | 0.72 | 0.68 | 414 | 36 | 89.11 | 0.86 | 0.92 | 0.92 | 0.89 |
| 162 | 288 | 62 | 388 | |||||||||||
| Resnet50 | 339 | 111 | 71.7 | 0.68 | 0.75 | 0.73 | 0.71 | 414 | 36 | 90.22 | 0.88 | 0.92 | 0.92 | 0.90 |
| 143 | 307 | 52 | 398 | |||||||||||
| Resnet101 | 348 | 102 | 74 | 0.71 | 0.77 | 0.76 | 0.73 | 426 | 24 | 93.55 | 0.92 | 0.95 | 0.95 | 0.93 |
| 132 | 318 | 34 | 416 | |||||||||||
| xception | 339 | 111 | 70.3 | 0.65 | 0.75 | 0.73 | 0.69 | 417 | 33 | 91.33 | 0.90 | 0.93 | 0.92 | 0.91 |
| 156 | 294 | 45 | 405 | |||||||||||
| inceptionresnetv2 | 328 | 122 | 67.8 | 0.63 | 0.73 | 0.70 | 0.66 | 406 | 44 | 88.33 | 0.86 | 0.90 | 0.90 | 0.88 |
| 167 | 283 | 61 | 389 | |||||||||||
| nasnetlarge | 330 | 120 | 69.3 | 0.65 | 0.73 | 0.71 | 0.68 | 416 | 34 | 90.11 | 0.88 | 0.92 | 0.92 | 0.90 |
| 156 | 294 | 55 | 395 | |||||||||||
| (b) Using Pre-processed Combined Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | ||
| inceptionv3 | 339 | 111 | 72 | 0.69 | 0.75 | 0.74 | 0.71 | 415 | 35 | 90.88 | 0.89 | 0.93 | 0.93 | 0.88 |
| 141 | 309 | 72 | 378 | |||||||||||
| densenet201 | 343 | 107 | 72.3 | 0.68 | 0.76 | 0.74 | 0.71 | 418 | 32 | 91 | 0.89 | 0.93 | 0.93 | 0.89 |
| 142 | 308 | 66 | 384 | |||||||||||
| Resnet50 | 352 | 98 | 73.7 | 0.69 | 0.78 | 0.76 | 0.73 | 420 | 30 | 91.44 | 0.90 | 0.93 | 0.93 | 0.91 |
| 138 | 312 | 50 | 400 | |||||||||||
| Resnet101 | 358 | 92 | 75.6 | 0.72 | 0.80 | 0.78 | 0.75 | 440 | 10 | 97.33 | 0.97 | 0.98 | 0.98 | 0.97 |
| 127 | 323 | 14 | 436 | |||||||||||
| xception | 340 | 110 | 71.3 | 0.67 | 0.76 | 0.73 | 0.70 | 414 | 36 | 92.44 | 0.91 | 0.94 | 0.93 | 0.88 |
| 148 | 302 | 66 | 384 | |||||||||||
| inceptionresnetv2 | 329 | 121 | 69 | 0.65 | 0.73 | 0.71 | 0.68 | 402 | 48 | 89.77 | 0.89 | 0.91 | 0.91 | 0.87 |
| 158 | 292 | 70 | 380 | |||||||||||
| nasnetlarge | 328 | 122 | 70.6 | 0.68 | 0.73 | 0.72 | 0.70 | 410 | 40 | 91.44 | 0.89 | 0.94 | 0.94 | 0.88 |
| 142 | 308 | 68 | 382 | |||||||||||
From Table 21, it is concluded that the ResNet101 network in the DAG category achieved the highest classification accuracy on the Combined Dataset. An accuracy of 97.33% has been achieved after augmentation of the Combined Dataset, with sensitivity, specificity, precision, and F1-score of 0.97, 0.98, 0.98, and 0.97, respectively. The Individual Class Accuracy (ICA) is 440 for the healthy class and 436 for the unhealthy class, achieved using the DAG network ResNet101. The best result is marked in grey.
Performance of Lightweight-based network architectures using combined dataset
Table 22 shows the performance of Lightweight networks with and without augmentation using original and pre-processed DR images of the Combined Dataset.
Table 22.
The performance of Lightweight networks with and without augmentation using original and pre-processed DR images of the Combined Dataset
| (a) Using Original Combined Dataset images | ||||||||||||||
| Network Name | Confusion Matrix | Without Augmentation | Confusion Matrix | After Augmentation | ||||||||||
| ACC % | Sen | Sp | Pr | F1 | ACC % | Sen | Sp | Pr | F1 |
| SqueezeNet | 326 | 124 | 69.7 | 0.67 | 0.72 | 0.71 | 0.69 | 402 | 48 | 85.44 | 0.82 | 0.89 | 0.88 | 0.85 |
| 148 | 302 | 83 | 367 | |||||||||||
| mobilenetv2 | 343 | 107 | 71.6 | 0.67 | 0.76 | 0.74 | 0.70 | 401 | 49 | 87.44 | 0.86 | 0.89 | 0.89 | 0.87 |
| 148 | 302 | 64 | 386 | |||||||||||
| shufflenet | 351 | 99 | 74 | 0.70 | 0.78 | 0.76 | 0.73 | 429 | 21 | 93.33 | 0.91 | 0.95 | 0.95 | 0.93 |
| 135 | 315 | 39 | 411 | |||||||||||
| nasnetmobile | 335 | 115 | 70.7 | 0.67 | 0.74 | 0.72 | 0.70 | 407 | 43 | 87.55 | 0.85 | 0.90 | 0.90 | 0.87 |
| 148 | 302 | 69 | 381 | |||||||||||
| efficientnetb0 | 313 | 137 | 66.3 | 0.63 | 0.70 | 0.67 | 0.65 | 387 | 63 | 84 | 0.82 | 0.86 | 0.85 | 0.84 |
| 166 | 284 | 81 | 369 | |||||||||||
| GoogleNet | 307 | 143 | 64.5 | 0.61 | 0.68 | 0.66 | 0.63 | 378 | 72 | 82.66 | 0.81 | 0.84 | 0.84 | 0.82 |
| 176 | 274 | 84 | 366 | |||||||||||
| googlenet-places365 | 330 | 120 | 69.6 | 0.66 | 0.73 | 0.71 | 0.69 | 402 | 48 | 86.22 | 0.83 | 0.89 | 0.89 | 0.86 |
| 153 | 297 | 76 | 374 | |||||||||||
| resnet18 | 316 | 134 | 66.8 | 0.64 | 0.70 | 0.68 | 0.66 | 394 | 56 | 86.88 | 0.86 | 0.88 | 0.87 | 0.87 |
| 164 | 286 | 62 | 388 | |||||||||||
| (b) Using Pre-processed Combined Dataset Images | ||||||||||||||
| Network Name | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 | Confusion Matrix | ACC % | Sen | Sp | Pr | F1 |
| SqueezeNet | 338 | 112 | 71.3 | 0.68 | 0.75 | 0.73 | 0.70 | 408 | 42 | 87.33 | 0.84 | 0.91 | 0.90 | 0.87 |
| 146 | 304 | 72 | 378 | |||||||||||
| mobilenetv2 | 351 | 99 | 73.3 | 0.69 | 0.78 | 0.76 | 0.72 | 411 | 39 | 90 | 0.89 | 0.91 | 0.91 | 0.90 |
| 141 | 309 | 51 | 399 | |||||||||||
| shufflenet | 358 | 92 | 75 | 0.70 | 0.80 | 0.78 | 0.74 | 439 | 11 | 96.66 | 0.96 | 0.98 | 0.98 | 0.97 |
| 133 | 317 | 19 | 431 | |||||||||||
| nasnetmobile | 343 | 107 | 72.3 | 0.68 | 0.76 | 0.74 | 0.71 | 415 | 35 | 89.66 | 0.87 | 0.92 | 0.92 | 0.89 |
| 142 | 308 | 58 | 392 | |||||||||||
| efficientnetb0 | 326 | 124 | 68.6 | 0.65 | 0.72 | 0.70 | 0.67 | 397 | 53 | 85.77 | 0.83 | 0.88 | 0.88 | 0.85 |
| 158 | 292 | 75 | 375 | |||||||||||
| GoogleNet | 317 | 133 | 67.1 | 0.64 | 0.70 | 0.68 | 0.66 | 395 | 55 | 85.44 | 0.83 | 0.88 | 0.87 | 0.85 |
| 163 | 287 | 76 | 374 | |||||||||||
| googlenet-places365 | 335 | 115 | 71.1 | 0.68 | 0.74 | 0.73 | 0.70 | 407 | 43 | 88.44 | 0.86 | 0.90 | 0.90 | 0.88 |
| 145 | 305 | 61 | 389 | |||||||||||
| resnet18 | 331 | 119 | 68.4 | 0.63 | 0.74 | 0.71 | 0.67 | 396 | 54 | 87.55 | 0.87 | 0.88 | 0.88 | 0.88 |
| 165 | 285 | 58 | 392 | |||||||||||
From Table 22, it is concluded that the shufflenet network in the Lightweight category achieved the highest classification accuracy on the Combined Dataset. An accuracy of 96.66% has been achieved after augmentation of the Combined Dataset, with sensitivity, specificity, precision, and F1-score of 0.96, 0.98, 0.98, and 0.97, respectively. The Individual Class Accuracy (ICA) is 439 for the healthy class and 431 for the unhealthy class, achieved using the Lightweight network shufflenet. The best result is marked in grey.
Results and discussion
When a model must be built with limited data and computing resources, CNN-based image analysis relies on transfer learning (TL), in which feature weights learned on large image datasets are reused for training on smaller datasets. This drastically reduces the number of images required in the target domain. Typically, the model is initialized with weights pre-trained on ImageNet, a large dataset of natural images, and is then used for feature extraction or fine-tuned on the smaller target dataset, depending on the target domain's size and similarity to the source. However, if the source and target domains differ substantially, the transferred model may not perform well. Several pre-trained models have previously been used to classify diabetic retinopathy images; they show promising but varying results and require less computational time than training from scratch. To address this, 20 different pre-trained models have been divided into three categories: series, DAG, and lightweight. Three benchmark datasets have been collected for this work, and a combined two-class dataset has been prepared. The best results have been achieved with weights transferred from ImageNet and fine-tuned on the prepared datasets. Our research shows that the deep learning method based on the ResNet101 network effectively distinguishes between normal retinal and DR images, and the ResNet101-based pre-trained network achieved the highest classification accuracy on the combined dataset. The amount of training data, the choice of hyperparameters, the pre-trained network category, and the specific dataset all influence the accuracy of the deep learning models compared here. The present work uses accuracy, sensitivity, specificity, precision, and F1-score to evaluate DR image classification performance, and the best-performing network has been selected for each category based on classification accuracy. Table 23 shows the comparison analysis of the different networks across categories. Figure 4 shows the ROC-AUC curves for the three categories using the combined dataset.
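A minimal sketch of this transfer-learning setup is given below, assuming a PyTorch/torchvision implementation (the original work does not specify its framework): ImageNet weights of ResNet101 are reused and only the final fully connected layer is replaced for the two-class healthy/unhealthy problem. The optimizer, learning rate, and helper function are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load ResNet101 with ImageNet weights (requires a recent torchvision release).
model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)   # new head: healthy vs. unhealthy

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_one_epoch(loader, device="cuda"):
    """Fine-tune the pre-trained network for one epoch on pre-processed fundus batches."""
    model.to(device).train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```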
Table 23.
The comparison analysis of different networks with different categories
| Category | Network Name | Dataset | Confusion Matrix | | ACC % | Sen | Sp | Pr | F1 |
|---|---|---|---|---|---|---|---|---|---|
| Series | Vgg19 | EyePACS Dataset | 192 | 8 | 92.5 | 0.89 | 0.96 | 0.96 | 0.92 |
| 22 | 178 | ||||||||
| IDRiD Dataset | 48 | 2 | 92 | 0.88 | 0.96 | 0.96 | 0.92 | ||
| 6 | 44 | ||||||||
| APTOS-2019 | 191 | 9 | 92.5 | 0.90 | 0.96 | 0.95 | 0.92 | ||
| 21 | 179 | ||||||||
| Combined Dataset | 435 | 15 | 96.22 | 0.96 | 0.97 | 0.97 | 0.96 | ||
| 19 | 431 | ||||||||
| DAG | Resnet101 | EyePACS Dataset | 194 | 6 | 93.5 | 0.90 | 0.97 | 0.97 | 0.93 |
| 20 | 180 | ||||||||
| IDRiD Dataset | 46 | 4 | 91 | 0.90 | 0.92 | 0.92 | 0.91 | ||
| 5 | 45 | ||||||||
| APTOS-2019 | 193 | 7 | 94 | 0.92 | 0.97 | 0.96 | 0.94 | ||
| 17 | 183 | ||||||||
| Combined Dataset | 440 | 10 | 97.33 | 0.97 | 0.98 | 0.98 | 0.97 | ||
| 14 | 436 | ||||||||
| Lightweight | shufflenet | EyePACS Dataset | 192 | 8 | 94.5 | 0.93 | 0.96 | 0.96 | 0.94 |
| 14 | 186 | ||||||||
| IDRiD Dataset | 46 | 4 | 91 | 0.90 | 0.92 | 0.92 | 0.91 | ||
| 5 | 45 | ||||||||
| APTOS-2019 | 191 | 9 | 92.5 | 0.90 | 0.96 | 0.95 | 0.92 | ||
| 21 | 179 | ||||||||
| Combined Dataset | 439 | 11 | 96.66 | 0.96 | 0.98 | 0.98 | 0.97 | ||
| 19 | 431 | ||||||||
Fig. 4.
The ROC-AUC curve using the combined dataset for three categories
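The short sketch below indicates how an ROC-AUC curve such as Fig. 4 can be generated from a trained network's predicted DR probabilities, assuming scikit-learn and matplotlib are available; y_true and y_score are placeholders for test labels and model outputs, not values from the study.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# Placeholder test labels (1 = unhealthy/DR, 0 = healthy) and predicted DR probabilities.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.10, 0.40, 0.85, 0.70, 0.95, 0.20, 0.60, 0.05]

fpr, tpr, _ = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

plt.plot(fpr, tpr, label=f"ResNet101 (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", color="grey")   # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate (sensitivity)")
plt.legend()
plt.show()
```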
The final classification accuracy of each pre-trained network on the four datasets is shown in Table 24.
Table 24.
Classification accuracy (%) of the 20 pre-trained networks on the four datasets
| S.No | Network Name | EyePACS Dataset | IDRiD Dataset | APTOS-2019 Dataset | Combined Dataset |
|---|---|---|---|---|---|
| 1 | AlexNet | 87 | 87 | 86.5 | 92 |
| 2 | vgg16 | 87.5 | 88 | 88.5 | 91.11 |
| 3 | vgg19 | 92.5 | 92 | 92.5 | 96.22 |
| 4 | darknet19 | 85 | 88 | 88 | 93 |
| 5 | darknet53 | 89.5 | 89 | 87.5 | 91 |
| 6 | inceptionv3 | 88.5 | 89 | 87.5 | 90.88 |
| 7 | densenet201 | 89 | 90 | 89 | 91 |
| 8 | Resnet50 | 91.5 | 90 | 91 | 91.44 |
| 9 | Resnet101 | 93.5 | 92 | 94 | 97.33 |
| 10 | xception | 89.5 | 88 | 88 | 92.44 |
| 11 | inceptionresnetv2 | 89 | 86 | 85 | 89.77 |
| 12 | nasnetlarge | 86.5 | 88 | 89.5 | 91.44 |
| 13 | SqueezeNet | 87.5 | 86 | 87.5 | 87.33 |
| 14 | mobilenetv2 | 90.5 | 90 | 89.5 | 90 |
| 15 | shufflenet | 94.5 | 91 | 92.5 | 96.66 |
| 16 | nasnetmobile | 90.5 | 89 | 89 | 89.66 |
| 17 | efficientnetb0 | 86.5 | 84 | 85.5 | 85.77 |
| 18 | GoogleNet | 86 | 85 | 85 | 85.44 |
| 19 | googlenet-places365 | 89.5 | 88 | 87.5 | 88.44 |
| 20 | resnet18 | 86.5 | 85 | 85.5 | 87.55 |
From Tables 23 and 24 and Fig. 4, it is concluded that the combined dataset achieved the highest accuracy in all three categories: series, DAG, and Lightweight. The vgg19, ResNet101, and shufflenet pre-trained networks achieved the highest accuracies of 96.22%, 97.33%, and 96.66% in the series, DAG, and Lightweight categories, respectively. It is also noted that ResNet101 achieved the highest accuracy among all categories and datasets. It is concluded that the ResNet101 pre-trained network in the DAG category is optimal for early-stage detection of diabetic retinopathy. The best result is marked in grey.
Conclusion
The exhaustive experiments show that the ResNet101-based pre-trained network in the DAG category achieved the highest accuracy on the combined dataset built from DRD-EyePACS, IDRiD, and APTOS-2019. ResNet101 strikes a balance between computing efficiency, depth, and accuracy. High accuracy, robustness, and efficiency have been achieved using ResNet101 in the DAG category, making it a powerful method for diagnosing and classifying diabetic retinopathy. It supports early diagnosis and treatment for patients and can be used in real-time clinical practice. It is also noted that implementing and training ResNet101 requires more processing than the other networks and requires high-quality labeled data. In this work, the data were balanced using augmentation, and the balanced data were used for robust model training. An accuracy of 97.33% has been achieved using ResNet101. The proposed method can be used in routine clinical practice.
Funding
This research received no external funding.
Data availability
The corresponding author is authorized to access the data, which will be shared upon reasonable request.
Declarations
Ethics approval and consent to participate
Not Applicable.
Consent for publication
Not applicable.
Conflicts of interest
There is no conflict of interest among the authors.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Gulshan V, Peng L, Coram M, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA. 2016;316(22):2402–10. 10.1001/jama.2016.17216. [DOI] [PubMed] [Google Scholar]
- 2.Chandrakumar T, Kathirvel R. Classifying diabetic retinopathy using deep learning architecture. Int J Eng Res. 2016;5:19–24. [Google Scholar]
- 3.Zhou L, Zhao Y, Yang J, Yu Q, Xu X. Deep multiple instance learning for automatic detection of diabetic retinopathy in retinal images. IET Image Proc. 2018;12(4):563–71. 10.1049/iet-ipr.2017.0636. [Google Scholar]
- 4.Dutta S, Manideep BCS, Basha SM, Caytiles RD, Iyengar NCSN. Classification of diabetic retinopathy images by using deep learning models. Int J Grid Distrib Comput. 2018;11(1):89–106. 10.14257/ijgdc.2018.11.1.09. [Google Scholar]
- 5.Junjun P, Zhifan Y, Dong S, Hong, Q. Diabetic Retinopathy Detection Based on Deep Convolutional Neural Networks for Localization of Discriminative Regions. Proceedings - 8th International Conference on Virtual Reality and Visualization, ICVRV 2018;46–52. 10.1109/ICVRV.2018.00016
- 6.Kassani SH, Kassani PH, Khazaeinezhad R, Wesolowski MJ, Schneider KA, Deters R. Diabetic retinopathy classification using a modified xception architecture. 2019 IEEE 19th International Symposium on Signal Processing and Information Technology, ISSPIT. 2019. 10.1109/ISSPIT47144.2019.9001846
- 7.Challa UK, Yellamraju P, Bhatt JS. A Multi-class Deep All-CNN for detection of diabetic retinopathy using retinal fundus images. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11941 LNCS. 2019;191–199. 10.1007/978-3-030-34869-4_21
- 8.Qummar S, Khan FG, Shah S, Khan A, Shamshirband S, Rehman ZU, Khan IA, Jadoon W. A Deep Learning ensemble approach for diabetic retinopathy detection. IEEE Access. 2019;7:150530–9. 10.1109/ACCESS.2019.2947484. [Google Scholar]
- 9.Bhardwaj C, Jain S, Sood M. Diabetic retinopathy severity grading employing quadrant-based Inception-V3 convolution neural network architecture. Int J Imaging Syst Technol. 2021;31(2):592–608. 10.1002/ima.22510. [Google Scholar]
- 10.Saxena G, Verma DK, Paraye A, Rajan A, Rawat A. Improved and robust deep learning agent for preliminary detection of diabetic retinopathy using public datasets. Intelligence-Based Med. 2020;3–4. 10.1016/j.ibmed.2020.100022
- 11.Katada Y, Ozawa N, Masayoshi K, Ofuji Y, Tsubota K, Kurihara T. Automatic screening for diabetic retinopathy in interracial fundus images using artificial intelligence. Intelligence-Based Med. 2020;3–4. 10.1016/j.ibmed.2020.100024
- 12.Usman, A., Muhammad, A., Martinez-Enriquez, A. M., & Muhammad, A. (2020). Classification of Diabetic Retinopathy and Retinal Vein Occlusion in Human Eye Fundus Images by Transfer Learning. In K. Arai, S. Kapoor, & R. Bhatia (Eds.), Advances in Information and Communication (pp. 642–653). FICC 2020. Adv Intell Syst Comput.2020;1130. Springer, Cham. 10.1007/978-3-030-39442-4_47.
- 13.Alyoubi WL, Abulkhair MF, Shalash WM. Diabetic retinopathy fundus image classification and lesions localization system using deep learning. Sensors. 2021;21(11). 10.3390/s21113704 [DOI] [PMC free article] [PubMed]
- 14.Bhardwaj C, Jain S, Sood M. Deep learning-based diabetic retinopathy severity grading system employing quadrant ensemble model. J Digit Imaging. 2021;34(2):440–57. 10.1007/s10278-021-00418-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Chen PN, Lee CC, Liang CM, Pao SI, Huang KH, Lin KF. General deep learning model for detecting diabetic retinopathy. BMC Bioinformatics. 2021;22. 10.1186/s12859-021-04005-x [DOI] [PMC free article] [PubMed]
- 16.Yi SL, Yang XL, Wang TW, She FR, Xiong X, He JF. Diabetic retinopathy diagnosis based on RA-efficientnet. Applied Sciences (Switzerland). 2021;11(22):11035. 10.3390/app112211035. [Google Scholar]
- 17.Khan Z, Khan FG, Khan A, Rehman ZU, Shah S, Qummar S, Ali F, Pack S. Diabetic retinopathy detection using vgg-nin a deep learning architecture. IEEE Access. 2021;9:61408–16. 10.1109/ACCESS.2021.3074422. [Google Scholar]
- 18.Das S, Kharbanda K, M S, Raman R, DED. Deep learning architecture based on segmented fundus image features for classification of diabetic retinopathy. Biomed Signal Process Control. 2021;68, 102600. 10.1016/j.bspc.2021.102600.
- 19.AbdelMaksoud E, Barakat S, Elmogy M. A computer-aided diagnosis system for detecting various diabetic retinopathy grades based on a hybrid deep learning technique. Med Biol Eng Compu. 2022;60(7):2015–38. 10.1007/s11517-022-02564-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Kobat SG, Baygin N, Yusufoglu E, Baygin M, Barua PD, Dogan S, Yaman O, Celiker U, Yildirim H, Tan RS, Tuncer T, Islam N, Acharya UR. Automated diabetic retinopathy detection using horizontal and vertical patch division-Based Pre-Trained DenseNET with digital fundus images. Diagnostics. 2022;12(8):1975. 10.3390/diagnostics12081975. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Mungloo-Dilmohamud Z, Khan MHM, Jhumka K, Beedassy BN, Mungloo NZ, Peña-Reyes C. Balancing data through data augmentation improves the generality of transfer learning for diabetic retinopathy classification. Appl Sci (Switzerland). 2022;12(11):5363. 10.3390/app12115363. [Google Scholar]
- 22.Asia AO, Zhu CZ, Althubiti SA, Al-Alimi D, Xiao YL, Ouyang PB, Al-Qaness MAA. Detection of diabetic retinopathy in retinal fundus images using CNN classification models. Electronics (Switzerland). 2022;11(17):2740. 10.3390/electronics11172740. [Google Scholar]
- 23.Mondal SS, Mandal N, Singh KK, Singh A, Izonin I. EDLDR: An ensemble deep learning technique for detection and classification of diabetic retinopathy. Diagnostics. 2023;13(1):124. 10.3390/diagnostics13010124. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Yasashvini R, Raja Sarobin VM, Panjanathan R, Graceline S, Anbarasi JL. Diabetic retinopathy classification using CNN and hybrid deep convolutional neural networks. Symmetry. 2022;14(9):1932. 10.3390/sym14091932. [Google Scholar]
- 25.Dayana AM, Emmanuel WRS. Deep learning enabled optimized feature selection and classification for grading diabetic retinopathy severity in the fundus image. Neural Comput Appl. 2022;34(21):18663–83. 10.1007/s00521-022-07471-3. [Google Scholar]
- 26.Oulhadj M, Riffi J, Chaimae K, Mahraz AM, Ahmed B, Yahyaouy A, Fouad C, Meriem A, Idriss BA, Tairi H. Diabetic retinopathy prediction based on deep learning and deformable registration. Multimedia Tools and Applications. 2022;81(20):28709–27. 10.1007/s11042-022-12968-z. [Google Scholar]
- 27.Jabbar MK, Yan J, Xu H, Rehman ZU, Jabbar A. Transfer learning-based model for diabetic retinopathy diagnosis using retinal images. Brain Sci. 2022;12(5):535. 10.3390/brainsci12050535. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Menaouer B, Dermane Z, el Houda Kebir N, Matta N. diabetic retinopathy classification using hybrid deep learning approach. SN Comp Sci. 2022;3(5). 10.1007/s42979-022-01240-8
- 29.Fayyaz AM, Sharif MI, Azam S, Karim A, El-Den J. Analysis of diabetic retinopathy (DR) based on the deep learning. Information (Switzerland). 2023;14(1):30. 10.3390/info14010030. [Google Scholar]
- 30.Das D, Biswas SK, Bandyopadhyay S. Detection of diabetic retinopathy using convolutional neural networks for feature extraction and classification (DRFEC). Multimedia Tools Appl. 2023;82(19):29943–30001. 10.1007/s11042-022-14165-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Mohanty C, Mahapatra S, Acharya B, Kokkoras F, Gerogiannis VC, Karamitsos I, Kanavos A. Using deep learning architectures for detection and classification of diabetic retinopathy. Sensors. 2023;23(12):5726. 10.3390/s23125726. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Jena PK, Khuntia B, Palai C, Nayak M, Mishra TK, Mohanty SN. A novel approach for diabetic retinopathy screening using asymmetric deep learning features. Big Data Cogn Comput. 2023;7(1):25. 10.3390/bdcc7010025. [Google Scholar]
- 33.Bhimavarapu U, Chintalapudi N, Battineni G. automatic detection and classification of diabetic retinopathy using the improved pooling function in the convolution neural network. Diagnostics. 2023;13(15):2606. 10.3390/diagnostics13152606. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Islam N, Jony MdMH, Hasan E, Sutradhar S, Rahman A, Islam MdM. Toward lightweight diabetic retinopathy classification: A knowledge distillation approach for resource-constrained settings. Appl Sci. 2023;13(22):12397. 10.3390/app132212397. [Google Scholar]
- 35.Sajid MZ, Hamid MF, Youssef A, Yasmin J, Perumal G, Qureshi I, Naqi SM, Abbas Q. DR-NASNet: automated system to detect and classify diabetic retinopathy severity using improved pretrained NASNet model. Diagnostics. 2023;13(16):2645. 10.3390/diagnostics13162645. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Alwakid G, Gouda W, Humayun M. Enhancement of diabetic retinopathy prognostication using deep learning, CLAHE, and ESRGAN. Diagnostics. 2023. 10.3390/diagnostics. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Vijayan M, Venkatakrishnan S. A regression-based approach to diabetic retinopathy diagnosis using efficientnet. Diagnostics. 2023;13(4):774. 10.3390/diagnostics13040774. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Alwakid G, Gouda W, Humayun M, Jhanjhi NZ. Deep learning-enhanced diabetic retinopathy image classification. Digital Health. 2023;9. 10.1177/20552076231194942 [DOI] [PMC free article] [PubMed]
- 39.Guefrachi S, Echtioui A, Hamam H. Automated diabetic retinopathy screening using deep learning. Multimedia Tools Appl. 2024. 10.1007/s11042-024-18149-4. [Google Scholar]
- 40.Sunkari S, Sangam A, P VS, Manikandan S, Raman R, Rajalakshmi R, S T. A refined ResNet18 architecture with Swish activation function for Diabetic Retinopathy classification. Biomedical Signal Processing and Control. 2024;88, 105630. 10.1016/j.bspc.2023.105630.
- 41.Macsik P, Pavlovicova J, Kajan S, Goga J, Kurilova V. Image preprocessing-based ensemble deep learning classification of diabetic retinopathy. IET Image Proc. 2024;18(3):807–28. 10.1049/ipr2.12987. [Google Scholar]
- 42.Shakibania H, Raoufi S, Pourafkham B, Khotanlou H, Mansoorizadeh M. Dual branch deep learning network for detection and stage grading of diabetic retinopathy. Biomedical Signal Processing and Control (preprint). 2024.
- 43.Yadav N, Dass R, Virmani J. Despeckling filters applied to thyroid ultrasound images : a comparative analysis. Multimedia Tools Appl. 2022. 10.1007/s11042-022-11965-6. [Google Scholar]
- 44.Yadav N, Dass R, Virmani J. Deep leaning-based CAD system design for thyroid tumor characterization using ultrasound images. Multimedia Tools Appl. 2023. 10.1007/s11042-023-17137-4. [Google Scholar]
- 45.Yadav N, Dass R, Virmani J. A systematic review of machine learning based thyroid tumor characterisation using ultrasonographic images. J Ultrasound. 2024. 10.1007/s40477-023-00850-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Dass R, Yadav N. Image quality assessment parameters for despeckling filters. Procedia Comput Sci. 2020;167(2019):2382–92. 10.1016/j.procs.2020.03.291.1. [Google Scholar]
- 47.Yadav N, Dass R, Virmani J. Machine learning based CAD system for thyroid tumor characterization using ultrasound images. Int J Med Eng Info. 2022. 10.1504/IJMEI.2022.10049164. [Google Scholar]
- 48.Yadav N, Dass R, Virmani J. Assessment of encoder-decoder based segmentation models for thyroid ultrasound images. Med Biol Eng Compu. 2023. 10.1007/s11517-023-02849-4. [DOI] [PubMed] [Google Scholar]
- 49.Yadav N, Dass R, Virmani J. Texture analysis of liver ultrasound images. emergent converging technol. Biomed Syst Lect Notes Electr Eng. 2022;841:575–85. 10.1007/978-981-168774-7_48. [Google Scholar]
- 50.Yadav N, Dass R, Virmani J. Objective assessment of segmentation models for thyroid ultrasound images. J Ultrasound. 2022. 10.1007/s40477-022-00726-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.https://www.kaggle.com/datasets/sachinkumar413/diabetic-retinopathy-dataset. Accessed on February 2024.
- 52.Porwal P, Pachade S, Kamble R, Kokare M, Deshmukh G, Sahasrabuddhe V, Meriaudeau F. Indian diabetic retinopathy image dataset (IDRiD): A database for diabetic retinopathy screening research. Data. 2018;3(25):1–8. 10.21227/H25W98. [Google Scholar]
- 53.https://www.kaggle.com/datasets/sovitrath/diabetic-retinopathy-224x224-2019-data?resource=download. Accessed on February 2024