Table 2.
List of Publicly Available Datasets
| Name of Dataset | Dataset Description | Access |
|---|---|---|
| ISIC challenge 2016 (melanoma) | Task 1: Lesion Segmentation. Training Data: 900 dermoscopic lesion images in JPEG format, with EXIF data stripped. Training Ground Truth: 900 binary mask images in PNG format. Test Data: 379 images in the same format as the training data. Task 2: Detection and Localization of Visual Dermoscopic Features/Patterns. Training Data: 807 lesion images in JPEG format and 807 corresponding superpixel masks in PNG format. Training Ground Truth: 807 dermoscopic feature files in JSON format. Test Data: 335 images in the same format as the training data. Task 3: Disease Classification. Training Data: 900 lesion images in JPEG format. Training Ground Truth: 900 entries of gold standard malignant status. Test Data: 379 images in the same format as the training data. Image Label Split: Task 3 (727 benign, 173 malignant) | https://challenge.isic-archive.com/data/ |
| ISIC challenge 2017 (melanoma) | For all three tasks: Training Dataset: 2,000 lesion images in JPEG format and 2,000 corresponding superpixel masks in PNG format, with EXIF data stripped. Training Ground Truth: 2,000 binary mask images in PNG format, 2,000 dermoscopic feature files in JSON format, and 2,000 entries of gold standard lesion diagnosis. Validation Dataset: 150 images. Test Dataset: 600 images. Image Label Split: Melanoma (374 images), seborrheic keratosis (254 images), and other (benign) (1,372 images) | https://challenge.isic-archive.com/data/ |
| ISIC challenge 2018 (melanoma) | For Tasks 1 and 2: Training Dataset: 2,594 images and 12,970 corresponding ground truth response masks (five for each image). Validation Dataset: 100 images. Test Dataset: 1,000 images. For Task 3 (HAM10000 Dataset): Training Dataset: 10,015 images and 1 ground truth response CSV file (containing one header row and 10,015 corresponding response rows). The 10,015 entries group each lesion by image and by diagnosis confirmation type. Training Ground Truth: 2,000 binary mask images in PNG format, 2,000 dermoscopic feature files in JSON format, and 2,000 entries of gold standard lesion diagnosis. Validation Dataset: 193 images. Test Dataset: 1,512 images. Image Label Split: Actinic keratoses and intraepithelial carcinoma/Bowen's disease (327 images), basal cell carcinoma (514 images), benign keratosis-like lesions (1,099 images), dermatofibroma (115 images), melanoma (1,113 images), melanocytic nevi (6,705 images), and vascular lesions (142 images) | ISIC Dataset: https://challenge.isic-archive.com/data/ HAM10000 Dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T |
| ISIC challenge 2019 (melanoma) | Training Set: 25,331 JPEG images of skin lesions and metadata entries of age, sex, and general anatomic site, with gold standard lesion diagnosis. Test Set: 8,238 JPEG images of skin lesions and metadata entries of age, sex, and general anatomic site. Image Label Split: Actinic keratoses and intraepithelial carcinoma/Bowen's disease (867 images), basal cell carcinoma (3,323 images), benign keratosis-like lesions (2,624 images), dermatofibroma (239 images), melanoma (4,522 images), melanocytic nevi (12,875 images), and vascular lesions (253 images) | https://challenge.isic-archive.com/data/ |
| ISIC challenge 2020 (melanoma) | Training Set: 33,126 DICOM images with embedded metadata and metadata entries of patient ID, sex, age, and general anatomic site, with gold standard lesion diagnoses. Test Set: 10,982 DICOM images with embedded metadata and metadata entries of patient ID, sex, age, and general anatomic site. Image Label Split: Benign keratosis-like lesions (37 images), lentigo (44 images), solar lentigo (7 images), melanoma (584 images), melanocytic nevi (5,193 images), seborrheic keratoses (135 images), and other/unknown (benign) (27,124 images) | https://challenge.isic-archive.com/data/ |
| MED-NODE Database | Image Label Split: 100 dermoscopic images (80 melanomas and 20 nevi) and 100 nondermoscopic images (80 melanomas and 20 nevi) | https://skinclass.de/mclass/ |
| PH2 Database | Image Label Split: 200 dermoscopic images of melanocytic lesions, including 80 common nevi, 80 atypical nevi, and 40 melanomas | https://www.fc.up.pt/addi/ph2%20database.html |
| DermIS Database | Image Label Split: 43 macroscopic photographs with lesions diagnosed as melanoma and 26 diagnosed as nonmelanoma | http://www.dermis.net |
| DermQuest | Image Label Split: 76 images of melanoma lesions and 61 images of nonmelanoma lesions | http://www.dermquest.com |
| Interactive Atlas of Dermoscopy | The dataset includes over 2,000 clinical and dermoscopy color images, along with corresponding structured metadata. Image Label Split: Basal cell carcinoma (42 images), nevi (575 images), dermatofibroma (20 images), lentigo (24 images), melanoma (268 images), miscellaneous (8 images), seborrheic keratoses (45 images), and vascular lesions (29 images). Only 1,011 labels are shown in the dataset | https://derm.cs.sfu.ca/ |
| DFUC 2020 Dataset | 4,000 images, with 2,000 used for the training set and 2,000 used for the testing set. An additional 200 images were used for sanity checking. The training set consists of DFU images only, and the testing set comprises images of DFU, of other foot/skin conditions, and of healthy feet. The dataset is heterogeneous: distance, angle, orientation, lighting, focus, and the presence of background objects all vary between photographs. Image Label Split: Not reported unless requested | https://dfu-challenge.github.io/dfuc2020.html |
| DFUC 2021 Dataset | 15,683 DFU patches: 5,955 for training, 5,734 for testing, and 3,994 unlabeled. Image Label Split: Training set (2,555 infection only, 227 ischemia only, 621 both infection and ischemia, and 2,552 with neither ischemia nor infection) | https://dfu-challenge.github.io/dfuc2021.html |
Abbreviations: DFU, diabetic foot ulcer; ISIC, International Skin Imaging Collaboration.
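
The label splits in Table 2 are strongly imbalanced; in the ISIC 2018 Task 3 (HAM10000) training set, for instance, melanocytic nevi make up about two thirds of the images. The following is a minimal Python sketch of how the reported counts can be turned into class fractions and inverse-frequency class weights. The counts are copied from the table; the function names and the weighting heuristic (total / (number of classes × class count)) are illustrative assumptions, not a method prescribed by the datasets or the challenges.

```python
# Class counts copied from the ISIC 2018 Task 3 (HAM10000) row of Table 2.
HAM10000_COUNTS = {
    "actinic keratoses / Bowen's disease": 327,
    "basal cell carcinoma": 514,
    "benign keratosis-like lesions": 1099,
    "dermatofibroma": 115,
    "melanoma": 1113,
    "melanocytic nevi": 6705,
    "vascular lesions": 142,
}


def class_fractions(counts):
    """Return each class's share of the total image count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count).

    Assumption: this is one common heuristic for imbalanced data,
    chosen here for illustration only.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


if __name__ == "__main__":
    fractions = class_fractions(HAM10000_COUNTS)
    weights = balanced_class_weights(HAM10000_COUNTS)
    # Print classes from most to least frequent.
    for name in sorted(HAM10000_COUNTS, key=HAM10000_COUNTS.get, reverse=True):
        print(f"{name:40s} {fractions[name]:6.1%}  weight={weights[name]:.2f}")
```

Under this heuristic, rare classes such as dermatofibroma receive weights well above 1 and the dominant nevus class a weight below 1, so a weighted loss would not be dominated by the majority class.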