Table 1.
Machine learning and artificial intelligence applications to autoimmune diseases.
Disease | Number of studies | Years | Most popular classification/prediction application(s) | Most popular machine learning method(s) | Median sample size (min, max) | Data types used |
---|---|---|---|---|---|---|
Multiple sclerosis | 41 [30,45,50,51,60,61,71,91–93,100,101,111,117–144] | 2008–2019 | Diagnosis, Prognosis, Disease Subtype | Type of Regression, Random Forest, Support Vector Machine | 99 (12, 12566) | Clinical, Survey, Genetic, MRI, Lipid Markers, SNPs, Gait Data, Immune repertoire, Gene Expression |
Rheumatoid arthritis | 32 [20–22,26,27,31,32,40–42,46–48,52,59,62–64,70,72,80–82,88,97,145–151] | 2003–2018 | Risk, Diagnosis, Early Diagnosis, Identify Patients | Support Vector Machine, Variations of Random Forest, Neural Network and Decision Tree | 338 (22, 922199) | Medical Database, Immunoassay, Metagenomic, Microbiome, GWAS/SNP, Clinical, Movement Data, Amino acid analytes, Transcriptomic, EMRs, Ultrasound images, Proteomic, Laser images |
Inflammatory bowel disease | 30 [33–36,43,57,69,73,79,83–86,94,95,98,152–165] | 2007–2018 | Diagnosis, Response to Treatment, Disease Risk, Disease Severity | Random Forest, Support Vector Machine | 273 (50, 53279) | Clinical, Colonoscopy Images, Metagenomic, Gene Expression, GWAS, Microbiota, miRNA Expression, EMRs, Exome, MRI |
Type 1 diabetes | 17 [37–39,67,68,102–104,166–174] | 2009–2018 | Disease Management | Novel Methods/Hybrid Models, Neural Network, Support Vector Regression | 23 (10, 10579) | Clinical, Red Blood Cell Images, VOCs, GWAS/SNPs |
Systemic lupus erythematosus | 14 [19,23,44,49,89,96,175–182] | 2009–2018 | Variations of prognosis, Diagnosis | Logistic Regression, Neural Network, Random Forest, Decision Tree | 318 (14, 17057) | Clinical, Electronic Health Records, Drug Treatment, SNPs, MRI, Exome, Gene Expression, Proteomic, Urine Biomarkers |
Psoriasis | 11 [53,74–77,99,112,183–186] | 2007–2018 | Diagnosis, Disease Severity | Support Vector Machine | 540 (80, 22181) | Digital Image, GWAS, Proteomic, RNA Biomarkers |
Coeliac disease | 7 [24,25,54,65,66,78,187] | 2011–2018 | Diagnosis | Random Forest, Logistic Regression, Bayesian Classifier, Support Vector Machine, Logistic Model, Natural Language Processing, Combined Fuzzy Cognitive Map and Possibilistic Fuzzy c-means clustering | 465 (47, 1498) | VOCs, Clinical, Peptide, EMRs |
Thyroid diseases | 6 [188–193] | 2008–2018 | Diagnosis | Hybrid Models | 215 (215, 7200) | Clinical |
Autoimmune liver diseases | 5 [58,87,90,194,195] | 2009–2018 | Prognosis | Variations on Random Forest | 288 (64, 787) | Clinical, Clinical Trial, Microbiome |
Systemic sclerosis | 4 [55,113,196,197] | 2016–2018 | Diagnosis, Treatment, Prognosis | Support Vector Machine, Random Forest | 119 (37, 991) | Gene Expression, Nailfold capillaroscopy images, Peripheral Blood Mononuclear cell data (flow cytometry, DNA, mRNA) |
For each autoimmune disease, the table gives the number of studies (with their reference numbers), the range of years they cover, the most popular applications and methods, and the data types used. Sample size is summarised by the median rather than the mean because a small number of very large cohorts, drawn from genome-wide association studies and electronic medical records, would otherwise skew the summary.
EMR electronic medical record, GWAS genome-wide association study, miRNA micro RNA, MRI magnetic resonance imaging, SNP single nucleotide polymorphism, VOC volatile organic compound.
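
The choice of median over mean in the table note can be illustrated with a minimal sketch. The sample sizes below are hypothetical values chosen only for illustration and are not taken from any of the reviewed studies; they mimic a handful of small clinical cohorts alongside one very large EMR-derived cohort.

```python
from statistics import mean, median

# Hypothetical per-study sample sizes (illustrative only, not from the review):
# four small clinical cohorts plus one large EMR-derived cohort.
sample_sizes = [40, 85, 120, 300, 900_000]

typical = median(sample_sizes)  # middle value, unaffected by the single large cohort
average = mean(sample_sizes)    # pulled far upward by the single large cohort

print(f"median = {typical}, mean = {average}")
```

With values like these, the mean lands several orders of magnitude above most of the individual studies, whereas the median stays close to a typical cohort size.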