Digital Health. 2025 Dec 10;11:20552076251404523. doi: 10.1177/20552076251404523

Explainable AI for skin disease classification using gradient-weighted class activation mapping and transfer learning in digital health to identify contours

S M Saiful Islam Badhon 1, Sharun Akter Khushbu 2, S M Shaqib 2, Md Aiyub Ali 2, Asif Hossain Anik 3, K S M Tozammel Hossain 1
PMCID: PMC12696321  PMID: 41393849

Abstract

Objective

This research evaluates the feasibility of addressing computer vision problems with limited resources, particularly in the context of medical data, where patient privacy concerns restrict data availability. The study focuses on diagnosing skin diseases using five distinct transfer learning models based on convolutional neural networks.

Methods

Two versions of the dataset were created, one imbalanced (4092 samples) and the other balanced (5184 samples), using simple data augmentation techniques. Preprocessing techniques, including image resizing, noise removal, and blur filtering, were employed to enhance the quality and utility of the data. The performance of each model was assessed on fresh data after preprocessing. We used VGG-19, VGG-16, GoogleNet, Xception, and Inception to compare training performance; the accuracy gains after augmentation indicate that our preprocessing refined image quality and texture, while the lower scores before augmentation reflect the poorer quality of the raw data.

Results

According to the research findings, the VGG-19 model achieved an accuracy of 95.00% on the imbalanced dataset. After applying augmentation on the balanced data, the best-performing model was VGG-16-Aug with an accuracy of 97.07%. These results suggest that low-resource approaches, coupled with preprocessing techniques, can effectively identify skin diseases, particularly when utilizing the VGG-16-Aug model with a balanced dataset.

Conclusion

The study addresses a range of skin disorders, including acne, vitiligo, hyperpigmentation, nail psoriasis, and SJS-TEN, focusing on aspects that remain underexplored in previous research. The findings highlight the potential of simple data augmentation techniques; moreover, the explainable AI method Grad-CAM interpreted the model outcomes by visually highlighting image contours, helping to identify uncommon skin conditions and to overcome the data scarcity challenge. These findings have significant implications for the development of machine learning-based diagnostic systems in the medical field. Further investigation is necessary to explore the generalizability of these findings to other medical datasets.

Keywords: Skin disease, transfer learning, VGG-16, CNN, explainable AI, Grad-CAM

Introduction

The human body is covered by skin, which protects against pathogens, supports immunity, prevents excessive water loss, and serves other vital functions.1 Skin disease is a common infectious condition worldwide.2 The World Health Organization (WHO) reported that by 2020, 49% of Asian women and men would have lost their lives to skin diseases,3 and mortality statistics record 384 such deaths in Bangladesh.4 Globally, this toll has grown to 9.6 million deaths.5 When the skin is affected, its texture can become sunken. The causes of skin disease include viruses, bacteria, allergies, and fungal infections;6 some conditions are also genetic. Most skin diseases occur in the thin outer layer of the skin, the epidermis; because this layer is visible to others, such conditions can cause mental depression as well as physical injury. Common skin disorders include acne, vitiligo, hyperpigmentation (HP), nail psoriasis (NP), and Stevens-Johnson syndrome with toxic epidermal necrolysis (SJS-TEN). These conditions differ in visual appearance and severity: some resolve over time, while others are permanent and may or may not be painful. Among them, SJS-TEN is the most severe. Skin problems can often be cured when detected at an early stage.

Typically, communication between medical specialists and individuals affected by skin diseases is poor. People are often unaware of their skin type, their symptoms, and the issues most relevant to early-stage prediction. In the biomedical sector, many diseases show no visible signs yet affect the body's inner organs, so rapid prediction with a clear diagnostic report is essential. In practice, there are numerous obstacles to diagnosing disease types and producing cost-effective skin forecasting results. The machine learning approach, specifically the deep learning convolutional neural network, is an effective architecture for image recognition, providing faster and more accurate solutions. A vast number of studies on skin disease detection and classification have been published over the last few decades.

Developing automatic computer-aided technology for biomedical skin-related images is a significant research area in deep learning. 7 A substantial body of high-quality research has explored various approaches to recognition and classification; nevertheless, gaps remain. Numerous studies have addressed multilevel disease,8,9 while others have analyzed single images,10,11 but these are inadequate for identifying multiscale classes. 12 Multiscale class assessment is particularly challenging when different skin diseases share similar visual patterns. According to recent research and comparative analyses, deep learning convolutional neural networks (CNNs) are a highly successful architecture for image detection, offering improved accuracy and the potential for faster automated diagnosis in dermatological applications.13,14,15,16 The key contributions of this study are outlined as follows:

  • Evaluation of deep learning models: The study comprehensively evaluated five deep learning models, including GoogleNet, Inception, VGG-16, VGG-19, and Xception, for the identification of skin diseases.

  • Dataset creation: In this study, a dataset consisting of 5184 images of skin diseases was created from different online sources.

  • Importance of data balancing and augmentation: The study highlighted the significance of data balancing techniques, specifically through augmentation, in improving the accuracy of skin disease identification models.

  • Explainability: We used Grad-CAM to explain what the model learns in its hidden layers and to localize diseased areas through deep color contrast in the resulting heatmaps.

Literature review

In recent research, most studies have used publicly available datasets to build skin disease detection systems. We took a different approach, using low-resource methods to collect medical images from various sources with the aim of enhancing model performance across five classes; three of the five classes had fewer than 1000 images each. Given these limited resources, we employed several advanced techniques to develop our models, the best of which identified images with approximately 97% accuracy. Additionally, we benchmarked our approach against the state of the art through comparative studies.

A CNN architecture with a SoftMax classifier was used to detect dermoscopic images; when evaluated against ML classifiers, it produced the highest accuracy and generated a diagnostic report as output, according to Nagarajaiah et al. 17 Liu et al. 18 performed a prominent study using 56,134 data points from 17 clinical instances, with Set A containing 16,539 images. Srinivasu et al. 19 used a Set B dataset containing 4145 images of 26 different skin conditions, achieving 94% accuracy. Shanthi et al. 20 implemented a classification approach with an accuracy of about 85% using MobileNet, LSTM, CNN, and VGG on the HAM10000 dataset. According to Chen et al., 21 CNNs are the best solution for image recognition, achieving 90% to 99% accuracy in identifying 60 different skin problems. Another study on skin diseases applied the VGG16, LeNet-5, and AlexNet models to 6144 training images covering five distinct problems, including low-resource images, as Goceri et al. 22 have shown. MobileNet, introduced by Esteva et al., 23 uses a hybrid loss function with a modified architecture, achieving an accuracy of 94.76%. Allugunti et al. 24 implemented a CNN that identified 2475 dermoscopic images with 88% accuracy, alongside other classifiers such as GBT, DT, and RF. A new modified loss function was proposed by Groh et al. 25 using DenseNet-201, achieving an accuracy of 95.24%. Soenksen et al. 26 reviewed 16,577 images across 114 classes labeled by Fitzpatrick skin type, yielding a strong DNN solution with improved accuracy. Burlina et al. 27 conducted research on skin lesion medical images, and advances in DCNNs were evaluated with a sensitivity of 90.3% and a specificity of 89.9%. Using 1834 images, a DL model proposed by Al-Masni et al. 28 was trained to achieve a 95% score, with a confidence-interval error margin of 86.53%, an ROC-AUC of 0.9510, and a Kappa of 0.7143, combining high sensitivity and specificity. A low-resource, faster solution was proposed by Janbi et al., 29 which trained on three types of medical skin problems with a CNN and a multiclass SVM, achieving 100% accuracy.

Another case-study approach using DL across 22 different skin types achieved higher accuracy, as reported by Abbas et al. 30 The global ISIC2017 dataset of melanoma skin disease was used in a study comparing Deeplabv3plus, Inception-ResNet-v2-uNet, MobileNetV2-uNet, ResNet50-uNet, and VGG19-uNet; rather than Deeplabv3plus, the model with the highest recall of 91% was chosen, and both preprocessing methods were applied to all five models. Weng et al. 31 used the HAM10000 dataset, comprising 10,015 images, trained with ResNet-50, DenseNet-121, and a seven-layer CNN architecture; these models achieved approximately 99% accuracy in feature extraction. Researchers have also examined 58,457 skin images, including 10,857 unlabeled samples, for multilevel classification, achieving an AUC of 97% and a high F1-macro score, which helped address the problem of imbalanced images. Gouda et al. 32 reported that after resizing, augmenting, and normalizing data from ISIC2018, CNN, ResNet50, InceptionV3, and a combined ResNet + InceptionV3 model all achieved more than 85.5%. Shetty et al. 33 compared ML classifiers with CNNs and found that a deep convolutional network offered the best-tuned architecture for predicting separate image cells. Another study by Jasti et al. 34 found that feature-extraction-based classification was capable of categorizing all same-level diseases; it experimented on the MIAS dataset using AlexNet alongside NB, KNN, and SVM. One prominent study by Almuayqil et al. 35 examined early-stage signs of skin disease using five different DL models and achieved a high accuracy of 99% on the HAM10000 dataset. Karthik et al. 36 experimented with EfficientNetV2 to address the limitations of image classification for acne, actinic keratosis (AK), melanoma, and psoriasis, achieving 87%. Foahom et al. 37 used the ISIC dataset, with a total of 8917 medical images trained on a CNN architecture; EfficientNetB5 achieved 86% accuracy in identifying pigmented lesions and improved the AUROC to 97%. Sreekala et al. 38 proposed SCM (Spectral Centroid Magnitude) combined with KNN, SVM, ECNN, and CNN to reach up to 83% on 3100 images from PH2 and ISIC. A combined MobileNetV2 and LSTM architecture proposed by Kshirsagar et al. 39 achieved 86% with high performance and a low error rate. A further study using both the ISIC and HAM10000 datasets, covering five distinct diseases, applied SVM, KNN, and DT to image preprocessing, segmentation, feature extraction, and classification, outperforming prior work.

Materials and methods

We proposed a methodology for identifying skin diseases using both augmented and non-augmented approaches. Each stage of the evaluation is described sequentially, starting with image processing, followed by image augmentation, feature extraction, and finally implementation using deep learning techniques. Figure 1 provides an overview of our working procedure.

Figure 1. A proposed diagram of the skin disease classification system, showing the dataset with class distribution, image data analysis, feature engineering, and a final result chart for transfer learning and supervised learning.

Dataset description

The biggest contribution, and the most time-consuming task, is preparing a self-collected dataset. It offers several benefits for biomedical research, such as using newly identified skin images that other researchers have not used in their methodologies. In contrast, the most challenging aspect is that self-collected images are not preprocessed beforehand, making model training quite difficult. For skin disease images, there are various internationally available collections of dermatological images, such as ISIC2019 40 and HAM10000, 41 which are the two largest datasets for melanoma skin disease. These datasets cover several skin conditions, including actinic keratosis, basal cell carcinoma, benign keratosis, dermatofibroma, melanoma, melanocytic nevus, squamous cell carcinoma, and vascular lesions. In contrast, we collected images of a few skin diseases that routinely occur in the human body, namely acne, vitiligo, hyperpigmentation, nail psoriasis, and SJS-TEN, as shown in Figure 2; the total dataset description is given in Table 1.

Figure 2. Sample images of the five skin disease classes: acne, hyperpigmentation, nail psoriasis, vitiligo, and SJS-TEN.

Table 1.

Dataset description.

Description Count
Total number of images 5184
Dimension 224 × 224
Image format JPG
Acne 984
Vitiligo 864
Hyperpigmentation 900
Nail psoriasis 1080
SJS-TEN 1356

Our dataset 42 consists of five categories of skin images, each with detailed depictions of skin sores and injuries in various forms. We collected a total of 5184 dermatoscopic images of patients from different countries, gathered from different hospitals and online resources; this diversity supports the use of transfer learning to detect these skin diseases across varied populations. Specifically, we worked on the categories of acne (984), vitiligo (864), hyperpigmentation (900), nail psoriasis (1080), and SJS-TEN (1356).

Dataset preparation

Self-collected data present challenges, including limited resources and imbalanced class occurrences. We have five categories: acne, vitiligo, hyperpigmentation, nail psoriasis, and SJS-TEN. The images collected from online sources were imbalanced: 492 images of acne, 864 of vitiligo, 300 of hyperpigmentation, 1080 of nail psoriasis, and 1356 of SJS-TEN. The acne and hyperpigmentation classes were severely underrepresented, which can bias accuracy toward the majority classes and cause the model to neglect the minority classes, raising their error rate and degrading predictions. The dataset used in this study has been published online as an open-access dataset. 42

Random oversampling

When one category is underrepresented, oversampling can increase its image count by duplicating images with different angles, rotations, and flips. In our dataset, two classes had fewer images than the others, so we applied oversampling only to the acne and hyperpigmentation classes. The resulting balanced dataset is shown in Figure 3, and a sketch of the oversampling step follows.
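The paper does not include its augmentation code; below is a minimal sketch of the random oversampling step, assuming one directory of JPG images per class (the paths, the rotation/flip choices, and the function name are our assumptions, not the authors'):

```python
import os
import random
from PIL import Image

def oversample_class(class_dir, target_count):
    """Duplicate randomly chosen images with random rotations and flips
    until the class directory holds target_count images (a sketch)."""
    files = [f for f in os.listdir(class_dir) if f.lower().endswith(".jpg")]
    for i in range(target_count - len(files)):
        src = random.choice(files)
        img = Image.open(os.path.join(class_dir, src))
        img = img.rotate(random.choice([90, 180, 270]))   # random rotation
        if random.random() < 0.5:
            img = img.transpose(Image.FLIP_LEFT_RIGHT)    # random horizontal flip
        img.save(os.path.join(class_dir, f"aug_{i}_{src}"))

# Example: raise acne from 492 images toward the balanced count of 984
# oversample_class("data/acne", 984)
```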

Figure 3. Distribution of data before (a) and after (b) random oversampling with augmentation.

Image processing

Before preprocessing, the sizes, shapes, and colors of the images were not in a usable format, so the raw images required preprocessing. Raw skin images contain countless cracks and distortions. Acquiring high-resolution, good-quality images can significantly improve skin disease detection; quality images also reduce complexity, lower the loss curve, and support high accuracy. This subsection discusses the preprocessing steps, beginning with image resizing.

Image resizing

Usually, raw collections of images come in different formats and shapes. Differing image sizes are resolved with resize operations that scale images up or down; standardizing the input size both improves performance and reduces computational complexity. In this study, all input images, whatever their original shape, were resized to 224 × 224.

Noise removal

Noise is an inherent disturbance that arises whenever an image is processed, flipped, or rendered, which makes noise removal necessary; noisy pictures lose their proper form and degrade in quality due to unbalanced color brightness. Before applying deep learning techniques, filtering is applied to the digital images to enhance image shape while preserving edges. More than three filters can be used to improve image edges, such as Gaussian blur, averaging blur, median blur, and bilateral blur filtering, which smooth the image and remove the relevant noise. 43

Augmentation

Image augmentation expands a limited set of images into additional, non-duplicate variants.44,45,46 Typically, augmentation reflects variations in texture, grayscale, low and high brightness, color contrast, and other image features; the generated samples, known as synthetic data, improve accuracy at the cost of some additional compilation time.

Blur techniques

Gaussian blur is the simplest way to blur an image. Using this technique, images can be modified to bring out the actual object as required, improving quality and, through augmentation, quantity. It applies a mathematical kernel over a slightly shifted matrix to create the blurred image. We also used averaging blur, median blur, and bilateral blur to enhance image quality and augment quantity. Figure 4 shows blurred samples illustrating the matrix deformation pattern, and the equation for each blur technique applied to our low-resource skin images is given below. Because our self-collected photos are color images, we also computed various features from the image matrix, including contrast, energy, entropy, correlation, and homogeneity; Table 2 lists these feature values for nine sample images. The Gaussian kernel is defined as:

$$GB(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

Figure 4. Sample images transformed into blurred versions, shown in the matrix form that the image-processing models receive as input.

Table 2.

Feature values for nine images.

Feature Image 1 Image 2 Image 3 Image 4 Image 5 Image 6 Image 7 Image 8 Image 9
Energy 0.049 0.038 0.029 0.029 0.033 0.028 0.035 0.064 0.039
Correlation 0.884 0.794 0.976 0.971 0.984 0.903 0.947 0.996 0.975
Contrast 1112.3901 546.9877 45.1985 64.6160 22.8364 559.1883 244.3821 1.8534 27.5155
Homogeneity 0.3357 0.2438 0.2451 0.3491 0.3242 0.2446 0.4573 0.5935 0.3337
Entropy 10.8958 11.5756 11.1457 10.9050 10.7034 11.2315 10.4978 8.3734 10.3353
Mean 0.3171 0.3760 0.5281 0.2792 0.4288 0.2504 0.3817 0.3769 0.5094
Variance 0.0735 0.0204 0.0146 0.0172 0.0109 0.0443 0.0358 0.0039 0.0086
SD 0.2711 0.1429 0.1209 0.1313 0.1046 0.2104 0.1892 0.0628 0.0927
RMS 0.4171 0.4023 0.5418 0.3085 0.4413 0.3270 0.4260 0.3821 0.5177

Averaging blur

The averaging blur technique removes high-frequency content, such as edges, from the image, producing a smoother appearance. It works by taking the average of pixel values within a specified neighborhood around each pixel; x[n] represents the pixel values, M is the window size, and k ranges from 0 to M − 1.

$$K(n) = \frac{1}{M} \sum_{k=0}^{M-1} x[n-k]$$

Median blur

The median blur technique replaces each pixel's intensity with the median intensity value from neighboring pixels. This helps in reducing noise while preserving edges in the image, where I(x, y, t) represents the intensity value of the pixel at coordinates (x, y) in frame t.

$$MB(x, y, t) = \operatorname{median}\{\, I(x, y, t - i) \mid i = 0, 1, \ldots, M-1 \,\}$$

Bilateral blur

Bilateral blur replaces the intensity of each pixel with a weighted average of intensity values from nearby pixels, considering both spatial distance and intensity differences when calculating the weights. This technique preserves edges while smoothing the image. Here BF[I](p) is the blurred intensity value at pixel p, S denotes the spatial neighborhood, I(p) and I(q) are the intensity values of pixels p and q, Gσs is the spatial Gaussian function, Gσr is the range Gaussian applied to intensity differences, and Wp is a normalization factor.

$$BF[I](p) = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}\!\left(\lVert p - q \rVert\right)\, G_{\sigma_r}\!\left(\lvert I(p) - I(q) \rvert\right)\, I(q)$$
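The paper reports OpenCV-style filtering (cf. reference 43) with a 5 × 5 Gaussian kernel; a minimal sketch of the resizing and the four blur filters described above, assuming OpenCV and a hypothetical file path:

```python
import cv2

img = cv2.imread("skin_sample.jpg")              # hypothetical input path
img = cv2.resize(img, (224, 224))                # standardize to 224 x 224

gaussian  = cv2.GaussianBlur(img, (5, 5), 0)     # 5 x 5 kernel; sigma derived from kernel size when 0
averaging = cv2.blur(img, (5, 5))                # mean of the 5 x 5 neighborhood
median    = cv2.medianBlur(img, 5)               # median of the 5 x 5 neighborhood
bilateral = cv2.bilateralFilter(img, 9, 75, 75)  # spatial and range Gaussians preserve edges
```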

Statistical features

In terms of image density, the statistical features are computed from the RGB pattern of the image. These features characterize the extracted color distributions using the mean, variance, standard deviation, and root mean square (RMS).

Energy

Energy is a measure of the sum of squared elements in the Gray-Level Co-occurrence Matrix (GLCM). It indicates the overall "energy" or magnitude of the pixel relationships in the GLCM. The formula for energy is the sum of squared elements in the GLCM, normalized to a value between 0 and 1.

Correlation

Correlation measures the overall strength of the relationship between a pixel and its neighboring pixels in the image. It quantifies the linear dependency between pixel values in the GLCM. The formula for correlation is the sum of the product of the normalized GLCM elements and the product of their respective deviations from the mean.

Contrast

Contrast provides a measure of how closely a pixel is connected to its neighbors in terms of intensity differences. It quantifies the local variations or differences in pixel values within the GLCM. The formula for contrast is the sum of the product of the squared difference between grey-level pairs and their respective frequencies in the GLCM.

Homogeneity

Homogeneity describes the closeness or uniformity of the distribution of elements in the GLCM. It measures how similar or homogeneous the grey-level pairs are in the GLCM. The formula for homogeneity is one divided by the sum of the squared differences between grey-level pairs in the GLCM.

Entropy

Entropy represents the degree of randomness or uncertainty in the distribution of grey-level pairs within the GLCM. It measures the level of information or disorder in the GLCM. The formula for entropy is the sum of the product of the GLCM elements and the logarithm (base 2) of the GLCM elements.

Figure 5 illustrates how source pixel values are transferred to target pixels during the blur transformation. The features above, Energy, Correlation, Contrast, Homogeneity, and Entropy, are accompanied by their respective formulas and descriptions. These GLCM features are commonly used in texture analysis to quantify the spatial relationship between pixel intensities, and they provide valuable information for image processing applications such as pattern recognition, image segmentation, and object detection.
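As an illustration, the GLCM features above can be computed with scikit-image; a minimal sketch, assuming a grayscale-converted skin image (the file path is hypothetical, and entropy is computed manually since graycoprops does not provide it):

```python
import numpy as np
from skimage import io, color, img_as_ubyte
from skimage.feature import graycomatrix, graycoprops

# Load a skin image and quantize to 8-bit grayscale (hypothetical path)
gray = img_as_ubyte(color.rgb2gray(io.imread("skin_sample.jpg")))

# Co-occurrence matrix for horizontally adjacent pixels, normalized
glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)

# Energy, correlation, contrast, and homogeneity come straight from graycoprops
features = {prop: graycoprops(glcm, prop)[0, 0]
            for prop in ("energy", "correlation", "contrast", "homogeneity")}

# Entropy: -sum(p * log2(p)) over the nonzero GLCM entries
p = glcm[:, :, 0, 0]
features["entropy"] = -np.sum(p[p > 0] * np.log2(p[p > 0]))
```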

Figure 5. Source pixel to transferred pixel, considering the ground pixel value.

Mean

Mean represents the average color value of the image. It is calculated by summing up all the pixel values in the image and dividing it by the total number of pixels, where N is the total number of pixels in the image and P is the pixel value.

$$\text{Mean} = \sum_{i,j=0}^{N-1} i\, P_{ij}$$

Variance

Variance measures the dispersion or spreading of the image values around the mean. It quantifies how much the pixel values deviate from the average, where μ is the mean value.

$$\text{Variance} = \sum_{i,j=0}^{N-1} P_{ij}\,(i - \mu)^2$$

Standard deviation

Standard deviation is a statistical measure that quantifies the amount of variability or dispersion within a set of data points. It provides insight into the spread of values around the mean.

$$SD = \sqrt{\sum_{i,j=0}^{N-1} P_{ij}\,(i - \mu)^2}$$

Root mean square (RMS)

Root mean square is the square root of the average of all squared intensity values in the image. It provides a measure of the overall intensity or energy of the image, where (i − e) is the difference between the ith row index and the pixel intensity e.

$$RMS = \sqrt{\sum_{i,j=0}^{N-1} P_{ij}\,(i - e)^2}$$
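A minimal sketch of the pixel-level statistical descriptors, assuming intensities scaled to [0, 1], which matches the value ranges reported in Table 2 (the function name is ours):

```python
import numpy as np

def statistical_features(gray):
    """Mean, variance, standard deviation, and RMS of a grayscale
    image with intensities scaled to [0, 1]."""
    x = gray.astype(np.float64) / 255.0
    mean = x.mean()                     # average intensity
    variance = x.var()                  # spread around the mean
    sd = x.std()                        # square root of the variance
    rms = np.sqrt(np.mean(x ** 2))      # overall intensity/energy
    return mean, variance, sd, rms
```

Note that under this reading RMS² = Mean² + Variance, a relation the values in Table 2 satisfy (e.g., for Image 1, 0.3171² + 0.0735 ≈ 0.4171²).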

Transfer learning-based models

The final step of our work is classification using transfer learning, which categorizes the skin disease data into different classes. Our deep learning approach infers the patterns of skin disease types from the extracted image features. For the application development process and our self-collected dataset, we employed five transfer learning models for classification: GoogleNet, Inception, VGG-16, VGG-19, and Xception. GoogleNet 47 is a deep convolutional architecture, used here for image classification, object detection, and feature extraction from skin disease images; it consists of 27 layers, including pooling layers. Inception 48 is an effective architecture for image classification that uses architectural tricks to enhance output accuracy: it handles a wide variety of images by applying three distinct filter sizes (1 × 1, 3 × 3, and 5 × 5) in conjunction with max pooling, with the outputs of the parallel filters concatenated and passed to the next layer. VGG-16 (Visual Geometry Group) 49 uses small 3 × 3 filters throughout; rather than spreading attention across the whole image at once, it focuses on local spatial features captured by the 3 × 3 receptive field. VGG-19 (Visual Geometry Group) 50 is trained on 19 layers, comprising five max-pooling layers and one softmax layer, and optimizes the output better than AlexNet for disease-related images; it handles our five skin disease classes well. Xception 51 is a 36-layer architecture for spatial features, in which the layers are organized into 14 modules with linear residual connections.
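The paper does not publish its training code; a minimal Keras sketch of the transfer learning setup, assuming a frozen ImageNet-pretrained base and a five-class softmax head (the head width and optimizer are our assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# VGG-16 convolutional base pretrained on ImageNet, without its classifier
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the base; only the new head is trained

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),   # assumed head width
    layers.Dense(5, activation="softmax"),  # five skin disease classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The same head can be attached to any of the other four backbones by swapping the `tf.keras.applications` constructor.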

Custom CNN architecture

The custom CNN developed in this work uses multiple convolutional layers for feature extraction, each followed by a max pooling layer and ReLU activation. Fully connected layers then process the extracted characteristics to produce the final classification output. This structure enforces the detection of the high-level features needed to identify skin diseases. The feature extraction in our approach depends heavily on this architecture, which was developed carefully to detect patterns in the skin images; these patterns are vital for disease identification and classification, as explained earlier in this document. The key components of the custom CNN architecture are detailed as follows:

Input layer: The network takes a resized skin image with dimensions 224 × 224 × 3, where 224 is the width and height in pixels and 3 is the number of RGB color channels. Regulating the input size in this way lets the network handle data consistently. The raw pixel data is fed to the network at the input layer, the first layer of the neural network.

Convolutional layer: To obtain the relevant characteristics for recognition, the input image is passed through a number of convolutional layers. Each of them develops feature maps that extract different features, such as edges, textures, and form, by convolving the input with learnable filters (kernels). Mathematically, the convolution operation is defined as:

$$f_k(i, j) = \sum_{m,n} x(i+m, j+n)\, w_k(m, n) + b_k$$

where x(i + m, j + n) are the input pixels covered by the filter, wk(m, n) are the filter weights, and bk is the bias for the kth filter. The filter is moved across the input image, and evaluating it at each position measures how strongly patterns matching the filter are present there.

Activation function: To introduce non-linearity into the model, the output of each convolutional layer is passed through the ReLU activation function, defined as:

$$\mathrm{ReLU}(x) = \max(0, x)$$

This function helps to keep positive values in the feature map while setting all the negative pixel values to zero. Due to this non-linear transformation, the neural network is able to model complicated characteristics by comprehensively learning the various patterns and correlated factors from the given data set.

Pooling layers: Between the convolutional layers are max-pooling layers, whose purpose is to decrease the dimensionality of the feature maps while preserving critical features. The max-pooling operation is defined as:

$$p_{ij} = \max_{a,b}\,\{\, x(i+a, j+b) \,\}$$

where pij is the pooled output at position (i, j), and a, b index positions within the pooling window. Max-pooling selects the maximum value within each window as it slides over the input feature map. This down-sampling reduces the spatial dimensions, emphasizing the most important features in each region while reducing computational load and overfitting.

Fully connected layers

The convolutional layers’ high-level features are interpreted by the fully connected (dense) layers of the CNN's final layers, which generate a fixed-size output. The following represents the fully connected layers:

$$y = \sigma(Wx + b)$$

where x is the input vector from the previous layer, y is the output vector, W is the weight matrix, b is the bias term, and σ is the activation function, typically softmax for classification tasks. These layers combine information from all previous layers, enabling the network to make final predictions based on the important traits discovered during the convolutional and pooling stages. Accurate classification is facilitated by the feature vector output by the last fully connected layer, which captures the most important features of the input image. The softmax activation function in the output layer converts logits into a probability distribution, giving a confidence score for every class.

First convolutional layer: 32 filters of size 3 × 3, followed by max pooling (2 × 2) and ReLU activation.

Second convolutional layer: 64 filters of size 3 × 3, followed by max pooling (2 × 2) and ReLU activation.

Third convolutional layer: 128 filters of size 3 × 3, followed by max pooling (2 × 2) and ReLU activation.
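Putting the pieces together, a minimal Keras sketch of the custom CNN as described above (the dense-layer width is our assumption, since the paper does not state it):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),              # 224 x 224 RGB input
    layers.Conv2D(32, (3, 3), activation="relu"),   # first conv block
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),   # second conv block
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),  # third conv block
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),           # assumed width
    layers.Dense(5, activation="softmax"),          # five disease classes
])
```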

Experimental setup

Our overall experiment was executed in Python, using Google Colab for extended RAM. We conducted the experiment in two ways, before and after augmentation: before augmentation we fed 4092 images to our five transfer learning models, and after augmentation we used 5184 images, which improved the experimental results. To ensure a balanced dataset, we completed the preprocessing steps, including resizing all collected images to a 224 × 224 matrix, which reduced compilation time and improved the models' results. We first applied the deep learning techniques to the imbalanced dataset, and then applied the augmentation process, using a Gaussian blur with a 5 × 5 kernel and an appropriately evaluated sigma value to remove noise and blur the images.

Result analysis and discussion

We experimented with our data both while it was imbalanced and after augmentation on the balanced dataset. Table 3 provides an overview of the performance metrics for the models GoogleNet, Inception, VGG-16, VGG-19, and Xception, including accuracy, precision, recall, and F1-score. The table shows notable variations in performance among the models.

Table 3.

Performance metrics of DL models for the imbalanced dataset.

Model Accuracy (%) Precision (%) Recall (%) F1 score (%)
VGG-16 93.97 92.81 99.30 97.87
VGG-19 95.00 97.72 98.30 99.14
GoogleNet 82.91 90.69 96.61 91.40
Xception 60.30 68.51 77.60 69.78
Inception 55.26 77.52 80.23 74.90

Starting with the best performers, VGG-19 stands out with a remarkable accuracy of 95.00%. This model demonstrates high precision (97.72%), indicating a low false positive rate, and impressive recall (98.30%), highlighting its ability to identify positive instances effectively. The F1-score (99.14%) further solidifies VGG-19 as a top-performing model, reflecting a harmonious balance between precision and recall. VGG-16 also performs exceptionally well, achieving a high accuracy of 93.97% and an impressive F1-score of 97.87%. On the other end of the spectrum, Inception exhibits the lowest accuracy among the models at 55.26%, indicating a significant disparity between the predicted and actual labels and a considerable number of misclassifications. While Inception reaches a precision of 77.52% and a recall of 80.23%, its F1-score of 74.90% is the lowest of the group, emphasizing the model's struggle to balance precision and recall.

Table 4 presents the performance metrics for the DL models when evaluated on the balanced dataset. These metrics include accuracy, precision, recall, and F1-score, and the table again shows significant variations among the models. Starting with the best performers, both VGG-16 and VGG-19 demonstrate exceptional results across the metrics. VGG-16 achieves an impressive accuracy of 97.07%, indicating its ability to correctly classify a high proportion of instances. The precision value for VGG-16 is also high at 98.52%, highlighting its low false positive rate. Additionally, the model shows excellent recall (98.34%), indicating its capability to identify positive instances effectively. The F1-score (99.16%) further solidifies VGG-16 as a top-performing model, indicating a well-balanced trade-off between precision and recall. VGG-19 performs similarly well, with a slightly lower accuracy of 96.29% but maintaining high precision (97.12%), recall (97.82%), and F1-score (98.37%). On the other end of the spectrum, Inception exhibits the lowest performance among the models across various metrics. It achieves an accuracy of 58.20%, suggesting a substantial number of misclassifications. Although Inception demonstrates a precision of 78.72%, indicating a relatively low false positive rate, its recall of 85.83% suggests a higher rate of false negatives. Consequently, the F1-score for Inception stands at 76.05%, reflecting its struggle to achieve a balanced performance between precision and recall.

Table 4.

Performance metrics of DL models for the balanced dataset.

Model Accuracy (%) Precision (%) Recall (%) F1 score (%)
VGG-16 97.07 98.52 98.34 99.16
VGG-19 96.29 97.12 97.82 98.37
GoogleNet 91.11 97.56 98.87 96.00
Xception 68.36 73.58 76.47 76.57
Inception 58.20 78.72 85.83 76.05

Discussion

Our aim is to determine the predictive accuracy of the model, highlight significant observations, and provide insight into the model's decision-making process through visual interpretation. On the basis of the forecasted output analysis and the Grad-CAM visualizations, this discussion examines the consistency, accuracy, and interpretability of the model in identifying the studied skin diseases. These outcomes are essential for judging how practical the model would be in actual clinical use.

Figure 6 showcases the predicted output of skin diseases based on images. The output provides a visual representation of the model's accuracy in classifying skin diseases, making it easier to identify and treat skin conditions accurately.

Figure 6. Predicted output of skin diseases: this figure illustrates the predicted classifications of skin diseases from images processed by the model.

Figure 7 shows the Grad-CAM (Gradient-weighted Class Activation Mapping) visualization for a skin disease as detected by a CNN model. Grad-CAM helps explain which parts of the image are most influential in the model's classification. The original image (leftmost panel) shows the disease without any overlay. The Grad-CAM heatmap (middle panel) marks the regions the CNN considers significant for identifying the disease: colors toward the red end of the spectrum (red, orange, yellow) indicate higher importance, while colors toward the blue end (blue, green) indicate lower importance, with the central red region marking the area the model relied on most. The overlay image (rightmost panel) superimposes the heatmap on the original image, visually merging the significance of each region with the actual lesion.
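For reference, Grad-CAM can be reproduced in a few lines; a minimal TensorFlow/Keras sketch, assuming a trained Keras model and the name of its last convolutional layer (the layer name in the usage comment is VGG-16's, and would differ for other backbones):

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    """Return a [0, 1] heatmap over the last conv layer's spatial grid."""
    grad_model = tf.keras.models.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        if class_index is None:
            class_index = tf.argmax(preds[0])       # top predicted class
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)     # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))  # global-average-pool the gradients
    cam = tf.reduce_sum(conv_out[0] * weights, axis=-1)
    cam = tf.nn.relu(cam)                            # keep positive influence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

# Usage sketch: heatmap = grad_cam(model, img, "block5_conv3")
```

The heatmap is then resized to the input resolution and alpha-blended over the original image to produce overlays like the rightmost panel of Figure 7.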

Figure 7. Explanation of skin diseases using Grad-CAM visualization: this figure illustrates the Grad-CAM visualization for a skin disease detected by the model.

Conclusion and future work

This study aimed to develop an early and predictive system for identifying patterns in skin diseases using advanced deep-learning models. Through multiple phases of experimentation and evaluation, we assessed the performance of five prominent models: GoogleNet, Inception, VGG-16, VGG-19, and Xception. Utilizing a self-collected dataset from various online sources, we focused on five skin disease classes, despite the inherent challenge of dataset imbalance. The study underscored the efficacy of data augmentation techniques in addressing class imbalance and enhancing model performance. Notably, VGG-19 achieved an accuracy of 95.00% on the imbalanced dataset, proving its robustness in identifying and classifying skin diseases, whereas Inception, with an accuracy of 55.26%, highlighted the limitations of certain models under imbalanced conditions. Upon applying data augmentation, VGG-16 emerged as the best performer with an accuracy of 97.07%, illustrating the critical role of balanced data in achieving reliable and accurate results. Comprehensive statistical evaluations, including the confusion matrix, sensitivity and specificity scores, and validation graphs, further validated the superior performance of our models over existing methods. These findings emphasize the potential of deep learning models in early skin disease diagnosis, paving the way for their practical application in medical fields.

Despite the promising results, this study has several limitations that open avenues for future research. The generalizability of our findings to other medical conditions and datasets remains uncertain, and future studies should explore the application of our methodologies to a broader spectrum of medical domains, encompassing diverse diseases and disorders. Further optimization of image preprocessing techniques, such as resizing, noise removal, and blur filtering, is also needed; future research can explore novel algorithms and approaches to enhance the quality and utility of medical images for diagnosis, thereby improving the overall effectiveness of predictive systems in medical applications. As potential future work, the effectiveness of the proposed classification approach can be evaluated for the classification of gliomas, cervical tumors, breast cancer types, and portal and hepatic vessels, and the proposed model can be updated to achieve precise segmentations of the kidneys and the liver; although various approaches have been applied to these tasks,52–59 the proposed technique may produce better results. Our current models were trained to differentiate between multiple skin disorders, but they are inherently capable of distinguishing healthy from diseased skin if provided with balanced data that include normal cases. Expanding the dataset to incorporate healthy skin images is therefore a key future direction, which would enhance the models' ability to screen broadly and reduce the risk of misclassification. Additionally, future work will focus on adapting the models for use in low-resource settings by optimizing them for lightweight devices and limited expertise.

Acknowledgments

We would like to express our sincere gratitude to the NLP and ML Research Lab at Daffodil International University for their invaluable technical support throughout this research. Additionally, we extend our heartfelt thanks to the University of North Texas for their generous funding support, which made this study possible.

Ethical approval: This study primarily utilized a self-collected dataset from publicly available online resources for skin disease images. As no direct human or animal subjects were involved, and the data were collected from sources where patient anonymity and data privacy were already maintained, formal ethical approval was not required for this specific research. All images used were either publicly accessible or licensed for research purposes.

Informed consent statement: We affirm that this manuscript is unique, unpublished, and not under consideration for publication elsewhere. We confirm that the manuscript has been read and approved by all named authors and that there are no other people who meet the criteria for authorship but are not listed. We further affirm that we have all approved the order of authors listed in the manuscript. We understand that the corresponding author is the sole contact for the editorial process. She is responsible for communicating with the other authors about progress, submissions of revisions, and final approval of proofs. Moreover, this study does not need any consent from patients, as we have collected from online resources.

Contributorship: S. M. Saiful Islam Badhon: conceptualization, methodology, data curation, writing—original draft, visualization, and funding acquisition. Sharun Akter Khushbu: conceptualization, review and editing, software, data curation, and visualization. S. M. Shaqib: conceptualization, coding, formal analysis, methodology, writing, and validation. Md. Aiyub Ali: resources, investigation, and writing—review and editing. Asif Hossain Anik: resources, investigation, and writing—review and editing. K. S. M. Tozammel Hossain: project administration and review and editing. All authors have read and agreed to the published version of the manuscript.

Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was made possible through the generous funding support from Daffodil International University and University of North Texas.

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement: Data will be made available on request.

Guarantor: S. M. Saiful Islam Badhon, Sharun Akter Khushbu, and S. M. Shaqib take responsibility for the integrity of the work as a whole, from inception to the published article, and have approved the final version.

References

  • 1.Martin G, Guérard S, Fortin MMR, et al. Pathological crosstalk in vitro between T lymphocytes and lesional keratinocytes in psoriasis: necessity of direct cell-to-cell contact. Lab Invest 2012; 92: 1058–1070.
  • 2.Hay RJ, Johns NE, Williams HC, et al. The global burden of skin disease in 2010: an analysis of the prevalence and impact of skin conditions. J Invest Dermatol 2014; 134: 1527–1534.
  • 3.Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021; 71: 209–249.
  • 4.Akter MF, Sathi SS, Mitu S, et al. Lifestyle and heritability effects on cancer in Bangladesh: an application of Cox proportional hazards model. Asian J Med Biol Res 2021; 7: 82–89.
  • 5.Khoury JD, Solary E, Abla O, et al. The 5th edition of the World Health Organization classification of haematolymphoid tumours: myeloid and histiocytic/dendritic neoplasms. Leukemia 2022; 36: 1703–1719.
  • 6.Roberts W. Air pollution and skin disorders. Int J Womens Dermatol 2021; 7: 91–97.
  • 7.Han SS, Kim MS, Lim W, et al. Classification of the clinical images for benign and malignant cutaneous tumors using a deep learning algorithm. J Invest Dermatol 2018; 138: 1529–1538.
  • 8.Tschandl P, Rosendahl C, Kittler H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data 2018; 5: 1–9.
  • 9.Jafari MH, Nasr-Esfahani E, Karimi N, et al. Extraction of skin lesions from non-dermoscopic images for surgical excision of melanoma. Int J Comput Assist Radiol Surg 2017; 12: 1021–1030.
  • 10.Zhang X, Wang S, Liu J, et al. Towards improving diagnosis of skin diseases by combining deep neural network and human knowledge. BMC Med Inform Decis Mak 2018; 18: 69–76.
  • 11.Harangi B, Baran A, Hajdu A. Classification of Stevens-Johnson syndrome and toxic epidermal necrolysis skin lesions using an ensemble of deep neural networks. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2018, pp.2575–2578: IEEE.
  • 12.Ge Z, Demyanov S, Chakravorty R, et al. Skin disease recognition using deep saliency features and multimodal learning of dermoscopy and clinical images. In: Medical Image Computing and Computer Assisted Intervention—MICCAI 2017: 20th International Conference, Quebec City, QC, Canada, September 11–13, 2017, Proceedings, Part III, pp.250–258: Springer, 2017.
  • 13.Goceri E, Karakas AA. Comparative evaluations of CNN based networks for skin lesion classification. In: Proceedings of the 14th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing (CGVCVIP), Zagreb, Croatia, 2020, pp.1–6.
  • 14.Goceri E. Impact of deep learning and smartphone technologies in dermatology: automated diagnosis. In: Proceedings of the 2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA), 2020, pp.1–6: IEEE.
  • 15.Goceri E. Convolutional neural network based desktop applications to classify dermatological diseases. In: Proceedings of the 2020 IEEE 4th International Conference on Image Processing, Applications and Systems (IPAS), 2020, pp.138–143: IEEE.
  • 16.Goceri E. Automated skin cancer detection: where we are and the way to the future. In: Proceedings of the 2021 44th International Conference on Telecommunications and Signal Processing (TSP), 2021, pp.48–51: IEEE.
  • 17.Nagarajaiah K. Developing an innovative machine learning–driven diagnosis system for classifying skin disease.
  • 18.Liu Y, Jain A, Eng C, et al. A deep learning system for differential diagnosis of skin diseases. Nat Med 2020; 26: 900–908.
  • 19.Srinivasu PN, SivaSai JG, Ijaz MF, et al. Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM. Sensors 2021; 21: 2852.
  • 20.Shanthi T, Sabeenian RS, Anand R. Automatic diagnosis of skin diseases using convolution neural network. Microprocess Microsyst 2020; 76: 103074.
  • 21.Chen M, Zhou P, Wu D, et al. AI-Skin: skin disease recognition based on self-learning and wide data collection through a closed-loop framework. Inf Fusion 2020; 54: 1–9.
  • 22.Goceri E. Deep learning based classification of facial dermatological disorders. Comput Biol Med 2021; 128: 104118.
  • 23.Esteva A, Chou K, Yeung S, et al. Deep learning-enabled medical computer vision. NPJ Digit Med 2021; 4: 5.
  • 24.Allugunti VR. A machine learning model for skin disease classification using convolutional neural network. Int J Comput Program Database Manag 2022; 3: 141–147.
  • 25.Groh M, Harris C, Soenksen L, et al. Evaluating deep neural networks trained on clinical images in dermatology with the Fitzpatrick 17k dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp.1820–1828.
  • 26.Soenksen LR, Kassis T, Conover ST, et al. Using deep learning for dermatologist-level detection of suspicious pigmented skin lesions from wide-field images. Sci Transl Med 2021; 13: eabb3652.
  • 27.Burlina PM, Joshi NJ, Ng E, et al. Automated detection of erythema migrans and other confounding skin lesions via deep learning. Comput Biol Med 2019; 105: 151–156.
  • 28.Al-Masni MA, Kim DH, Kim TS. Multiple skin lesions diagnostics via integrated deep convolutional networks for segmentation and classification. Comput Methods Programs Biomed 2020; 190: 105351.
  • 29.Janbi N, Mehmood R, Katib I, et al. Imtidad: a reference architecture and a case study on developing distributed AI services for skin disease diagnosis over cloud, fog and edge. Sensors 2022; 22: 1854.
  • 30.Abbas M, Imran M, Majid A, et al. Skin diseases diagnosis system based on machine learning. J Comput Biomed Inform 2022; 4: 37–53.
  • 31.Weng F, Ma Y, Sun J, et al. An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases. arXiv [preprint] 2022, https://arxiv.org/abs/2211.10858
  • 32.Gouda W, Sama NU, Al-Waakid G, et al. Detection of skin cancer based on skin lesion images using deep learning. Healthcare 2022; 10: 1183.
  • 33.Shetty B, Fernandes R, Rodrigues AP, et al. Skin lesion classification of dermoscopic images using machine learning and convolutional neural network. Sci Rep 2022; 12: 18134.
  • 34.Jasti VD, Zamani AS, Arumugam K, et al. Computational technique based on machine learning and image processing for medical image analysis of breast cancer diagnosis. Security Commun Netw 2022; 2022: 1918379.
  • 35.Almuayqil SN, Abd El-Ghany S, Elmogy M. Computer-aided diagnosis for early signs of skin diseases using multi types feature fusion based on a hybrid deep learning model. Electronics (Basel) 2022; 11: 4009.
  • 36.Karthik R, Vaichole TS, Kulkarni SK, et al. Eff2Net: an efficient channel attention-based convolutional neural network for skin disease classification. Biomed Signal Process Control 2022; 73: 103406.
  • 37.Foahom Gouabou AC, Collenne J, Monnier J, et al. Computer aided diagnosis of melanoma using deep neural networks and game theory: application on dermoscopic images of skin lesions. Int J Mol Sci 2022; 23: 13838.
  • 38.Sreekala K, Rajkumar N, Sugumar R, et al. Skin diseases classification using hybrid AI based localization approach. Comput Intell Neurosci 2022; 2022: 6138490.
  • 39.Kshirsagar PR, Manoharan H, Shitharth S, et al. Deep learning approaches for prognosis of automated skin disease. Life 2022; 12: 426.
  • 40.Cassidy B, Kendrick C, Brodzicki A, et al. Analysis of the ISIC image datasets: usage, benchmarks and recommendations. Med Image Anal 2022; 75: 102305.
  • 41.Alenezi F, Armghan A, Polat K. Wavelet transform based deep residual neural network and ReLU based extreme learning machine for skin lesion classification. Expert Syst Appl 2023; 213: 119064.
  • 42.Khushbu SA. Skin disease classification dataset [dataset]. 2024. doi: 10.17632/3hckgznc67.1
  • 43.Singh S, Verma R, Singh AK. Image filtration in Python using openCV. Turk J Comput Math Educ 2021; 12: 5136–5143.
  • 44.Goceri E. Image augmentation for deep learning based lesion classification from skin images. In: Proceedings of the 2020 IEEE 4th International Conference on Image Processing, Applications and Systems (IPAS), 2020, pp.144–148: IEEE.
  • 45.Goceri E. Comparison of the impacts of dermoscopy image augmentation methods on skin cancer classification and a new augmentation method with wavelet packets. Int J Imaging Syst Technol 2023; 33: 1727–1744.
  • 46.Goceri E. GAN based augmentation using a hybrid loss function for dermoscopy images. Artif Intell Rev 2024; 57: 234.
  • 47.Singla A, Yuan L, Ebrahimi T. Food/non-food image classification and food categorization using pre-trained GoogLeNet model. In: Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, 2016, pp.3–11.
  • 48.Mi Q, Keung J, Xiao Y, et al. An inception architecture-based model for improving code readability classification. In: Proceedings of the 22nd International Conference on Evaluation and Assessment in Software Engineering, 2018, pp.139–144.
  • 49.Younis A, Li Q, Nyatega CO, et al. Brain tumor analysis using deep learning and VGG-16 ensembling learning approaches. Appl Sci 2022; 12: 7282.
  • 50.Subetha T, Khilar R, Christo MS, et al. A comparative analysis on plant pathology classification using deep learning architecture–Resnet and VGG19. 2021.
  • 51.Chollet F. Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp.1251–1258.
  • 52.Idlahcen F, Idri A, Goceri E. Exploring data mining and machine learning in gynecologic oncology. Artif Intell Rev 2024; 57: 20.
  • 53.Nakach F-Z, Idri A, Goceri E. A comprehensive investigation of multimodal deep learning fusion strategies for breast cancer classification. Artif Intell Rev 2024; 57: 327.
  • 54.Kaya B, Goceri E, Becker A, et al. Automated fluorescent microscopic image analysis of PTBP1 expression in glioma. PLoS One 2017; 12: e0170991.
  • 55.Goceri E, Gürcan MN, Dicle O. Fully automated liver segmentation from SPIR image series. Comput Biol Med 2014; 53: 265–278.
  • 56.Göceri E. A comparative evaluation for liver segmentation from SPIR images and a novel level set method using signed pressure force function. Turkey: Izmir Institute of Technology, 2013.
  • 57.Goceri E. Automatic kidney segmentation using Gaussian mixture model on MRI sequences. In: Electrical Power Systems and Computers: Selected Papers from the 2011 International Conference on Electric and Electronics (EEIC 2011), Nanchang, China, 2011 Jun 20–22, vol. 3, pp.23–29: Springer, 2011.
  • 58.Domingo J, Dura E, Goceri E. Iteratively learning a liver segmentation using probabilistic atlases: preliminary results. In: Proceedings of the 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), Anaheim, CA, USA, 2016 Dec 13–15, pp.593–598: IEEE, 2016.
  • 59.Goceri E. Automatic labeling of portal and hepatic veins from MR images prior to liver transplantation. Int J Comput Assist Radiol Surg 2016; 11: 2153–2161.
