2022 Jul 13;12:893972. doi: 10.3389/fonc.2022.893972

Table 4.

Different methods for solving data imbalance and data limitation.

Ref.	Dataset	Highlights	Limitations	Performance
(84)	ISIC-2017	Coupled seven GANs to generate images of seven skin diseases, and improved model efficiency by sharing parameters across the initial layers of the GANs.	The model could not distinguish the lesion area well when it blended with the skin surface, and artifacts such as human hair could also affect the generation of new images.	Accuracy: 0.816 AUC: 0.88
(85)	ISIC-2018	Proposed a GAN architecture customized to the style of skin lesions. By adjusting the progressive growth structure of the generator and discriminator, it generates higher-resolution and more diverse skin disease images.	The GAN-generated synthetic dataset was less complex and less diverse than the original dataset.	Accuracy: 0.952 Sensitivity: 0.832 Specificity: 0.743
(86)	ISIC-2018	Utilized conditional generative adversarial networks (CGAN) to extract key information from all layers to generate skin lesion images with different textures and shapes while ensuring the stability of training.	The amount of data used for training was relatively limited.	Accuracy: 0.941 Precision: 0.915 Recall: 0.799
(87)	ISIC Archive	Explored four types of data augmentation methods and a multi-layer augmentation method for melanoma classification.	The data augmentation methods evaluated were limited in number and were not validated on a large number of datasets.	Accuracy: 0.829
(88)	HAM10000	Adopted a variational autoencoder network to obtain domain-dependent noise vectors, employed a student-like distribution to increase image diversity, and used an auxiliary classifier to generate images of specific classes.	Due to the specificity of medical images, different image generation models may generate skin disease images that do not belong to the same class.	Accuracy: 0.925
(89)	HAM10000	Combined an attention mechanism with PGGAN to capture global features of skin lesion images, and introduced the Two-Timescale Update Rule to generate fine-grained features while increasing the stability of the GAN.	Due to hardware limitations, this data augmentation method was evaluated only at a resolution of 256 × 256 rather than the original 600 × 450 resolution of the HAM10000 dataset.	AUC: 0.793
(90)	HAM10000	Proposed a class-weighted loss function and a focal loss to overcome the problem of data imbalance.	Artifacts were not removed from the images in the training dataset, which biases the model. The method also has relatively high computational complexity.	Accuracy: 0.93 Recall: 0.86
(91)	HAM10000	Combined a novel loss function with data-level balanced mini-batch logic to alleviate the imbalance problem of the dermatology dataset.	The classification accuracy for rare skin diseases with limited data needs further improvement.	Accuracy: 0.8997 Recall: 0.8613
	ISIC-2019			Accuracy: 0.8997 Recall: 0.8613
(92)	HAM10000	Proposed a two-stage technique for determining the appropriate augmentation procedure for mobile devices.	Given the particular characteristics of lightweight CNNs, more data augmentation methods and more data need to be considered to alleviate overfitting.	Accuracy: 0.853
(93)	PAD-UFES	Designed two algorithms based on an evolutionary algorithm, and applied a weighted loss function and oversampling to alleviate the problem of data imbalance.	A larger dataset is needed to improve performance further.	Accuracy: 0.92 Recall: 0.94
(94)	PH²	Proposed a novel data augmentation method based on an oversampling technique (SMOTE).	The proposed data augmentation method was not validated on deep learning architectures, and experiments on larger datasets are also required.	Accuracy: 0.922 Sensitivity: 0.808 Specificity: 0.951
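
Several entries in Table 4, such as (90) and (93), address class imbalance at the loss level with class weighting or a focal loss. The table does not give the formulations those studies used, so the following is only a minimal sketch: it assumes inverse-frequency class weights and the commonly used focal loss form, -w_y (1 - p_y)^γ log(p_y). The function names and the class counts are hypothetical and are not taken from any of the cited works.

```python
import math


def inverse_frequency_weights(class_counts):
    """Assumed weighting scheme: total / (n_classes * count).

    Rare classes receive larger weights; a perfectly balanced
    dataset gives every class a weight of 1.0.
    """
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * count) for count in class_counts]


def weighted_focal_loss(probs, label, class_weights, gamma=2.0):
    """Class-weighted focal loss for a single sample.

    probs         : predicted class probabilities (should sum to 1)
    label         : index of the true class
    class_weights : one weight per class
    gamma         : focusing parameter; gamma = 0 reduces the loss
                    to weighted cross-entropy
    """
    # Clamp to avoid log(0) for extremely confident wrong predictions.
    p_true = min(max(probs[label], 1e-12), 1.0)
    modulating = (1.0 - p_true) ** gamma
    return -class_weights[label] * modulating * math.log(p_true)


# Hypothetical, heavily imbalanced three-class dataset.
counts = [900, 90, 10]
weights = inverse_frequency_weights(counts)

# An easy majority-class sample and a hard minority-class sample.
easy = weighted_focal_loss([0.90, 0.05, 0.05], label=0, class_weights=weights)
hard = weighted_focal_loss([0.60, 0.30, 0.10], label=2, class_weights=weights)
print(weights, easy, hard)
```

In this sketch the weights raise the contribution of rare classes, while the (1 - p_y)^γ factor shrinks the loss of samples the model already classifies confidently, so training concentrates on hard, under-represented cases.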