GigaScience. 2019 Apr 26;8(5):giz042. doi: 10.1093/gigascience/giz042

Table 1: Microbiome datasets with available classification tasks in ML Repo

| Project name | V Region | Target size | No. samples | No. subjects | Area | Description | Sequencing technology | Study design |
|---|---|---|---|---|---|---|---|---|
| Cho 2012 | V3 | 177 | 95 | 47 | Antibiotics | Mouse fecal and cecal samples, control vs 4 kinds of antibiotics | 454 | Cross-sectional |
| Claesson 2012 | V4 | 221 | 168 | 168 | Age | Elderly and young adults | 454 | Cross-sectional |
| David 2014 | V4 | 282 | 235 | 11 | Diet | Plant-based vs animal-based diet, cross-over study | Illumina MiSeq | Longitudinal |
| Gevers 2014 | V4 | 173 | 1,321 | 668 | IBD | Biopsies from patients with IBD prior to treatment | Illumina MiSeq | Cross-sectional |
| HMP 2012 | V35 | 527 | 6,407 | 242 | Body habitat, sex | Up to 18 body sites across 242 healthy subjects at 1–2 time points | 454 | Cross-sectional |
| Kostic 2012 | V35 | 569 | 190 | 95 | Colorectal cancer | Adjacent healthy vs tumor colon biopsy tissues | 454 | Paired |
| Montassier 2016 | V56 | 280 | 28 | 28 | Bacteremia | Patients prior to chemotherapy who did or did not develop bacteremia | 454 | Cross-sectional |
| Morgan 2012 | V35 | 569 | 231 | 231 | IBD | Healthy controls, patients with Crohn's disease or ulcerative colitis | 454 | Cross-sectional |
| Turnbaugh 2009 | V2 | 230 | 281 | 154 | Obesity | Monozygotic or dizygotic twin pairs concordant for body mass index class, and their mothers | 454 | Cross-sectional |
| Wu 2011 | V12 | 244 | 95 | 10 | Diet | Controlled high-fat or low-fat feeding on 10 subjects over 10 days | 454 | Longitudinal |
| Yatsunenko 2012 | V4 | 282 | 531 | 531 | Geography, age, sex | Humans of varying ages from the USA, Malawi, and Venezuela | Illumina MiSeq | Cross-sectional |
| Ravel 2011 | V12 | 240 | 396 | 396 | Bacterial vaginosis | Vaginal samples from 4 ethnic groups; Nugent scores for bacterial vaginosis | 454 | Cross-sectional |
| Karlsson 2013 | NA | NA | 144 | 144 | Diabetes | Patients with normal, impaired, or type 2 diabetes glucose tolerance categories | Illumina HiSeq | Cross-sectional |
| Qin 2012 | NA | NA | 134 | 134 | Diabetes | Chinese healthy controls vs patients with type 2 diabetes | Illumina HiSeq | Cross-sectional |
| Qin 2014 | NA | NA | 130 | 130 | Cirrhosis | Healthy controls vs patients with cirrhosis | Illumina HiSeq | Cross-sectional |

ML Repo contains 33 classification and regression tasks from 15 publicly available human microbiome datasets shown here. IBD: inflammatory bowel disease; NA: not applicable.