2022 Jul 13;12:893972. doi: 10.3389/fonc.2022.893972

Table 4.

Different methods for solving data imbalance and data limitation.

| Ref. | Dataset | Highlights | Limitations | Performance |
| --- | --- | --- | --- | --- |
| (84) | ISIC-2017 | Coupled seven GANs to generate images of seven skin diseases, and improved model efficiency by having the initial layers of the GANs share the same parameters. | The model could not distinguish the lesion area well when it blended with the skin surface, and artifacts such as human hair also affected the generation of new images. | Accuracy: 0.816; AUC: 0.88 |
| (85) | ISIC-2018 | Proposed a GAN architecture customized to the style of skin lesions. By adjusting the progressive growing structure of the generator and discriminator, it generates higher-resolution and more diverse skin disease images. | The GAN-generated synthetic dataset was less complex and less diverse than the original dataset. | Accuracy: 0.952; Sensitivity: 0.832; Specificity: 0.743 |
| (86) | ISIC-2018 | Used a conditional generative adversarial network (CGAN) that extracts key information from all layers to generate skin lesion images with different textures and shapes while keeping training stable. | The amount of data used for training was relatively limited. | Accuracy: 0.941; Precision: 0.915; Recall: 0.799 |
| (87) | ISIC Archive | Explored four types of data augmentation and a multi-layer augmentation method for melanoma classification. | The set of augmentation methods evaluated was limited, and the methods were not validated on a large number of datasets. | Accuracy: 0.829 |
| (88) | HAM10000 | Adopted a variational autoencoder network to obtain domain-dependent noise vectors, employed a student-like distribution to increase image diversity, and used an auxiliary classifier to generate images of specific classes. | Owing to the specificity of medical images, different image generation models may generate skin disease images that do not belong to the same class. | Accuracy: 0.925 |
| (89) | HAM10000 | Combined an attention mechanism with PGGAN to capture global features of skin lesion images, and introduced the Two-Timescale Update Rule to generate fine-grained features while increasing the stability of the GAN. | Owing to hardware limitations, the augmentation method was evaluated only at a resolution of 256 × 256 rather than the original 600 × 450 resolution of the HAM10000 dataset. | AUC: 0.793 |
| (90) | HAM10000 | Proposed a class-weighted loss function and a focal loss to overcome data imbalance. | Artifacts were not removed from the training images, which biases the model. The method also has relatively high computational complexity. | Accuracy: 0.93; Recall: 0.86 |
| (91) | HAM10000, ISIC-2019 | Combined a novel loss function with data-level balanced mini-batch logic to alleviate the imbalance problem of dermatology datasets. | Classification accuracy for rare skin diseases with limited data needs further improvement. | Accuracy: 0.8997; Recall: 0.8613 |
| (92) | HAM10000 | Proposed a two-stage technique for determining the appropriate augmentation procedure for mobile devices. | Given the particularity of lightweight CNNs, more data augmentation methods and more data need to be considered to alleviate overfitting. | Accuracy: 0.853 |
| (93) | PAD-UFES | Designed two algorithms based on evolutionary algorithms, and applied a weighted loss function and oversampling to alleviate data imbalance. | A larger dataset is needed to improve performance further. | Accuracy: 0.92; Recall: 0.94 |
| (94) | PH2 | Proposed a novel data augmentation method based on an oversampling technique (SMOTE). | The method was not validated with deep learning architectures, and experiments on larger datasets are still required. | Accuracy: 0.922; Sensitivity: 0.808; Specificity: 0.951 |
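
As an illustration of the loss-level approach summarized for (90), the following is a minimal sketch of a class-weighted focal loss in NumPy. It is not the implementation from the cited work: the function names `inverse_frequency_weights` and `weighted_focal_loss`, the inverse-frequency weighting scheme, and the default `gamma=2.0` are assumptions chosen for illustration only.

```python
import numpy as np


def inverse_frequency_weights(labels, num_classes):
    """Hypothetical helper: weight each class by n / (num_classes * class_count)."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts[counts == 0] = 1.0  # avoid division by zero for classes with no samples
    return len(labels) / (num_classes * counts)


def weighted_focal_loss(probs, labels, class_weights, gamma=2.0):
    """Mean class-weighted focal loss.

    probs:         (n, c) array of predicted class probabilities
    labels:        (n,) array of integer class labels
    class_weights: (c,) array of per-class weights
    gamma:         focusing parameter; gamma=0 reduces to weighted cross-entropy
    """
    eps = 1e-12  # numerical guard for log(0)
    p_t = probs[np.arange(len(labels)), labels]  # probability of the true class
    alpha_t = class_weights[labels]              # weight of the true class
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)
    return loss.mean()


if __name__ == "__main__":
    # Toy imbalanced labels: three samples of class 0, one of class 1.
    y = np.array([0, 0, 0, 1])
    w = inverse_frequency_weights(y, num_classes=2)
    p = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]])
    print("class weights:", w)
    print("weighted focal loss:", weighted_focal_loss(p, y, w))
```

The class weights raise the contribution of rare classes, while the factor `(1 - p_t) ** gamma` down-weights samples the model already classifies confidently, so training focuses on hard and under-represented examples.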