TABLE 2.
Dataset name | Dataset scope | Major use |
---|---|---|
Medical image datasets | ||
TCIA | Over 1.8 million multi‐modal images from 35,000+ subjects across 170+ collections | Cancer detection, diagnosis, and treatment |
MURA | 14,000+ musculoskeletal x‐rays | Classification of normal and abnormal bone images |
ISIC | 23,000+ images of skin lesions | Skin lesion detection |
ChestX‐ray8 | 100,000+ chest x‐rays | Classification of eight common thoracic diseases |
BraTS | MRI images from 393 cases of glioma | Brain tumor segmentation and recognition |
COVID19‐CT | 1,000+ chest CT images of patients with confirmed COVID‐19 diagnosis | COVID‐19 detection and diagnosis |
Electronic health record (EHR) datasets | ||
MIMIC‐III | 40,000+ patients with demographic, clinical, and outcome data | Patient outcome prediction and disease risk assessment |
eICU | 200,000+ ICU patient records | Patient survival prediction |
UK Biobank | 500,000+ individuals with demographic, lifestyle, and health data | Develop methods for disease prevention, diagnosis, and treatment |
Omics datasets | ||
TCGA | 11,000+ patients with cancer across 33 different cancer types | Identification of potential targets for new therapies and development of predictive models for patient outcomes |
PDB | 170,000+ protein structures from organisms | Protein structure prediction and design of new drugs and therapeutic agents |
KEGG | 22,000+ human genes, 600+ diseases, and associated molecular pathways | Explore functional relationships between genes, proteins, and other molecules |
HMDB | 114,000+ metabolites with their structures, functions, and associated diseases | Identify potential biomarkers for diagnosis and treatment of various conditions |
Abbreviations: BraTS, Brain Tumor Segmentation Challenge; eICU, eICU Collaborative Research Database; HMDB, Human Metabolome Database; ISIC, International Skin Imaging Collaboration; KEGG, Kyoto Encyclopedia of Genes and Genomes; MIMIC‐III, Medical Information Mart for Intensive Care III; MURA, Stanford's Musculoskeletal Radiographs; PDB, Protein Data Bank; TCGA, The Cancer Genome Atlas; TCIA, The Cancer Imaging Archive; UK Biobank, UK Biobank Imaging Study.