Abstract
The bacterial composition of the human fecal microbiome is influenced by many lifestyle factors, notably diet. It is less clear, however, what role host genetics plays in dictating the composition of bacteria living in the gut. In this study, we examined the association of ~200K host genotypes with the relative abundance of fecal bacterial taxa in a founder population, the Hutterites, during two seasons (n = 91 summer, n = 93 winter, n = 57 individuals collected in both). These individuals live and eat communally, minimizing variation due to environmental exposures, including diet, which could potentially mask small genetic effects. Using a GWAS approach that takes into account the relatedness between subjects, we identified at least 8 bacterial taxa whose abundances were associated with single nucleotide polymorphisms in the host genome in each season (at genome-wide FDR of 20%). For example, we identified an association between a taxon known to affect obesity (genus Akkermansia) and a variant near PLD1, a gene previously associated with body mass index. Moreover, we replicate a previously reported association from a quantitative trait locus (QTL) mapping study of fecal microbiome abundance in mice (genus Lactococcus, rs3747113, P = 3.13 × 10⁻⁷). Finally, based on the significance distribution of the associated microbiome QTLs in our study with respect to chromatin accessibility profiles, we identified tissues in which host genetic variation may be acting to influence bacterial abundance in the gut.
Introduction
Humans have complex interactions with the bacteria that live in and on their bodies, referred to as the microbiota[1]. Alterations in the microbiota, particularly in the gut, have been linked to variation in risk for obesity[2–4], celiac disease[5], Crohn’s disease[6, 7], ulcerative colitis[8–11], gastroenteritis[12], asthma[13], and inflammatory bowel disease[14, 15]. Therefore, understanding the factors that determine and maintain gut microbiome composition has the potential to unlock therapies to improve human health. Several environmental factors have been shown to play a role in determining gut microbiome composition such as the method of delivery at birth[16], formula vs. breast feeding as an infant[17], and diet[18–21]. One major factor that has yet to be examined in detail is the role of host genetics.
Several groups have investigated the heritability of the gut microbiome in humans and model organisms; however, estimates of genetic contribution to bacterial abundance vary between studies. One early study used temperature gradient gel electrophoresis (TGGE) to examine the similarity of the bacterial 16S rRNA genes from gut bacteria between individuals with varying degrees of relatedness. Their findings showed TGGE profiles were increasingly similar as the relatedness between pairs of individuals also increased[22]. Although this is consistent with a heritable component to the microbiome, genetic effects are confounded by similarity in environment, as related individuals likely shared environments throughout their lives to a greater extent than unrelated individuals. Another study estimated microbiome heritability using 16S rRNA gene sequencing in twin pairs and parent-offspring trios by examining relatedness using a pairwise distance measurement of microbial composition[3]. Although microbiome composition was more similar between twins than between parent-offspring pairs or unrelated individuals, the microbiome of monozygotic twins was not more similar than that of dizygotic twins, arguing against a strong genetic component to microbiome composition. This study was, however, small (~20–30 twin pairs in each category) and only examined broad measures of the microbiome composition rather than individual bacterial abundances. More recently, a study on the heritability of common gut bacteria in >400 twin pairs suggested host genetics plays a role in determining gut microbiome composition in humans, with some bacterial taxa having heritability estimates as high as 0.39[23].
While evidence for host genetics influencing the gut microbiota is gaining traction, we still lack an understanding of which genes or genetic variants in the human genome might influence bacterial profiles. Many studies have focused on candidate genes, where either natural variation segregating in humans[24–26] or gene knockout models in mice[27–29] were associated with differences in the microbiome. However, only one study to date has performed a genome-wide scan for variants associated with bacterial abundance in the gut: Benson et al.[30] identified 18 quantitative trait loci (QTL) associated with various bacterial taxa in the gut using advanced intercrossed mouse lines. This study demonstrates the utility of genome-wide approaches; however, no such study has been reported in humans.
To address this gap, we examined the fecal microbiome from the Hutterites, a religious isolate living in North America. Importantly, members of this population live and eat on large communal farms, called colonies, limiting inter-individual variation in environmental exposures that might mask genetic effects on microbiome composition. In particular, meals are prepared and eaten in a communal kitchen and dining room, respectively. Previous work in this population examined temporal differences between winter and summer gut microbiomes[18]. Here, we examined the same individuals for sex, age, and genetic effects on microbiome composition using both the winter data (n = 93) and summer data (n = 91), separately. In addition, we considered a composite microbiome (“seasons combined”), where the relative abundances of bacterial taxa sampled in winter and summer are averaged for any individual whose stool was available from both seasons (n = 127 individuals in the combined sample; see Materials and Methods). While the abundance of only one bacterial taxon correlated with age (genus Bifidobacterium), at least four bacterial taxa in each season were differentially abundant by sex. In addition, we identified at least eight bacterial taxa in each season that are associated with at least one single nucleotide polymorphism (SNP) at a genome-wide significance level. Finally, we identified pathways and candidate tissues where host genetic variation may be acting to influence bacterial abundance in the gut.
Materials and Methods
Ethics statement
The protocol was approved by the University of Chicago IRB (protocol 10-416-B). Written informed consent was obtained from all adult participants and the parents of minors. In addition, written assent was obtained from minor participants.
Accession Numbers
The data for the 16S rRNA amplicon sequencing, genotypes, GWAS results, and metadata have been deposited in dbGaP under accession numbers phs000680 and phs000185.
Microbial data collection
Stool collection, DNA extraction, 16S rRNA gene sequencing, and classification were performed as described previously[18]. Briefly, stool samples were collected during two seasons (winter and the following summer). Sequences were classified using a Naïve-Bayesian classifier as implemented in mothur[31]. In our previous study of seasonal differences in microbiome composition, individuals were excluded if they took antibiotics within 6 months of either the summer or winter sampling date. For this study, in which seasons were considered individually, samples were excluded if the individual had taken antibiotics within the 6 months preceding the sampling date for each within-season analysis. Additional samples were excluded for individuals without genotyping data. After exclusions, samples for 93 individuals in winter (60 females and 33 males), 91 individuals in summer (57 females and 34 males), and 127 individuals in the combined sample (79 females and 48 males) remained. For all analyses, sequencing reads were randomly subsampled to a maximum depth of 2 million reads per technical replicate and technical replicates were combined, resulting in approximately 4 million reads per individual per season. Taxon abundances were standardized to the total number of reads subsampled to generate relative abundance measures.
Genotype data
Genotyping in 1415 Hutterite individuals (including the 127 individuals considered in this study) was performed using either the Affymetrix 500k Array Set, the Genome-Wide Human SNP Array 5.0, or the Genome-Wide Human SNP Array 6.0 as part of a long-term research program studying the genetic basis of complex phenotypes in the Hutterites[32–35]. Genotypes for the Affymetrix 500k array set were called using BRLMM (http://media.affymetrix.com/support/technical/whitepapers/brlmm_whitepaper.pdf) and genotypes for the Genome-Wide Human SNP Array 5.0 and 6.0 were called using Birdseed[36]. SNPs were initially quality filtered using the following criteria across 1415 individuals: minor allele frequency ≥ 5%, call rates ≥ 95%, Hardy-Weinberg equilibrium P-values ≥ 0.001, and fewer than 5 Mendelian errors across all individuals. 271,365 SNPs that were on all three platforms remained after quality control filtering and re-annotation to the human genome version 19 (hg19, dbSNP135).
Bacterial data pre-processing
Individual seasons
The following bacterial data processing was done separately for samples collected in the winter and in the summer. Because there would be limited power to detect associations in rare members of the microbiota, we first eliminated from subsequent analyses bacterial taxa that did not have at least one read in at least 75% of individuals (number of bacterial taxa remaining: winter = 7 phyla, 15 classes, 26 orders, 45 families, and 80 genera; summer = 7 phyla, 15 classes, 21 orders, 38 families, and 71 genera). Next, taxon data was fit to a standard normal distribution across individuals by quantile normalization using the R function qqnorm[37]. Finally, to limit the burden of multiple testing, we removed any taxon that was highly correlated (Pearson correlation ≥ 0.9) with either a taxon at a lower taxonomic level (the taxon at the higher level was removed) or at the same level (the taxon listed first alphabetically was removed). The pairs of highly correlated bacterial taxa are shown in S1 Table. The final dataset for winter consisted of 116 bacterial taxa: 3 phyla, 3 classes, 8 orders, 22 families, and 80 genera. The final dataset for summer consisted of 104 bacterial taxa: 3 phyla, 3 classes, 8 orders, 19 families, and 71 genera.
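As an illustration of the first two steps, the following Python sketch applies the 75% prevalence filter and a rank-based normal transformation. It is not the code used in the study, which relied on the R function qqnorm; the function names are hypothetical, and the plotting-position formula and handling of ties are assumed to mirror the R defaults.

```python
from statistics import NormalDist


def prevalence_filter(counts, min_fraction=0.75):
    """Keep taxa with at least one read in at least `min_fraction` of individuals.

    `counts` maps a taxon name to a list of read counts, one per individual.
    """
    kept = {}
    for taxon, reads in counts.items():
        detected = sum(1 for r in reads if r >= 1)
        if detected / len(reads) >= min_fraction:
            kept[taxon] = reads
    return kept


def rank_normalize(values):
    """Map values to standard-normal quantiles by rank (qqnorm-style sketch)."""
    n = len(values)
    a = 3 / 8 if n <= 10 else 1 / 2  # assumed plotting-position offset
    # Stable sort: tied values keep their input order and get distinct quantiles.
    order = sorted(range(n), key=lambda i: values[i])
    z = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p = (rank - a) / (n + 1 - 2 * a)
        z[idx] = NormalDist().inv_cdf(p)
    return z
```

Applying rank_normalize to each retained taxon across individuals yields the normalized abundances that enter the correlation-based pruning step.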
Combining seasons, with normalization
Our previous work revealed broad, temporal differences in gut bacterial abundance between winter and summer in this population[18]. However, having the largest possible sample increases power in genetic association studies, which is critical given the number of tests performed. To this end, we normalized bacterial taxon abundance in each season separately before combining data from the two seasons and included only common taxa that had at least one read in 75% of individuals in both seasons (7 phyla, 15 classes, 21 orders, 38 families, and 70 genera). Specifically, within each season, taxon data was fit to a standard normal distribution across individuals by quantile normalization using the R function qqnorm[37]. Normalized bacterial taxon abundances were then averaged for any individuals who were sampled in both seasons. To further limit the burden of multiple testing, we removed bacterial taxa that were highly correlated, as described above. The final dataset consisted of 102 bacterial taxa: 3 phyla, 3 classes, 7 orders, 19 families, and 70 genera. To ensure that seasonal effects were accounted for, principal component analysis was performed on all genus-level classifications in the final, normalized data set using prcomp in R; season was not correlated with any of the top 10 principal components in this final dataset (S1 Fig). We refer to this dataset as “seasons combined” from here on.
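The combination step can be sketched as follows in Python. The function name is hypothetical, and the sketch assumes that individuals sampled in only one season contribute that season's normalized value, which is our reading of the combined sample size (93 + 91 − 57 = 127) rather than a stated rule.

```python
def combine_seasons(winter, summer):
    """Average per-individual normalized abundances of one taxon across seasons.

    `winter` and `summer` map an individual ID to a normalized abundance.
    Assumption: individuals seen in a single season keep that season's value.
    """
    combined = {}
    for ind in set(winter) | set(summer):
        values = [season[ind] for season in (winter, summer) if ind in season]
        combined[ind] = sum(values) / len(values)
    return combined
```

With this rule, the combined dataset contains the union of the winter and summer individuals.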
Bacterial correlations with age and sex
Correlations of individual bacterial taxa to age and sex were tested using linear mixed models that corrected for the relatedness between individuals, as implemented in GEMMA (v0.94)[38]. First, relevant covariates were regressed out of normalized bacterial abundances (sex and colony/date of collection for examining taxa correlations with age; and age and colony/date of collection for examining taxa correlations with sex). Samples were collected from five colonies on separate days; therefore, colony and date of collection are confounded. With this understanding, we refer to ‘date of collection’ rather than ‘colony/date of collection’ throughout the manuscript. Linear models were run using GEMMA specifying the -notsnp option. Relatedness matrices were calculated using identity by descent from genotype data[39]. Significance was assessed using a likelihood ratio test and multiple testing corrections were done using q-values (S5 and S6 Tables)[40].
Diversity metrics
All bacterial taxa present at the genus level in each sample were used to calculate alpha diversity metrics (richness, Shannon diversity, and evenness), without quantile normalization applied to relative abundances and no taxa eliminated due to rarity. Diversity metrics were calculated using the vegan package in R[41].
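For reference, the three metrics can be written out in a few lines of Python. This is a sketch of the standard definitions, not the vegan code used here; it assumes the natural logarithm for Shannon diversity and Pielou's definition of evenness, J = H / ln(S).

```python
import math


def alpha_diversity(counts):
    """Return (richness S, Shannon diversity H, evenness J) for one sample.

    `counts` holds the read counts of each genus in the sample.
    """
    present = [c for c in counts if c > 0]
    total = sum(present)
    s = len(present)
    h = -sum((c / total) * math.log(c / total) for c in present)
    # Evenness is undefined when only one genus is present.
    j = h / math.log(s) if s > 1 else float("nan")
    return s, h, j
```

A sample with four equally abundant genera has S = 4, H = ln 4, and J = 1, the maximum evenness.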
Chip heritability of gut bacteria
Relevant covariates (age, sex, and date of collection) were regressed from each bacterial taxon within each seasonal analysis. “Chip heritability” (or percent variance explained, PVE, S2 Table) was calculated for the residuals of each taxon using GEMMA v0.94[38]. PVE is considered non-zero if the interval defined by the estimate plus or minus its standard error does not include zero.
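The non-zero criterion amounts to a one-line check, sketched below in Python with a hypothetical function name. The interval is taken to be one standard error on either side of the estimate, which is consistent with the estimates reported as non-zero in the Results (for example, 0.58 ± 0.32); the third value in the usage example is invented for illustration.

```python
def pve_is_nonzero(pve, se):
    """True when the interval pve ± se lies entirely above zero."""
    return pve - se > 0


# Two estimates reported in the Results, plus one hypothetical estimate.
examples = [(0.58, 0.32), (0.56, 0.22), (0.20, 0.25)]
flags = [pve_is_nonzero(pve, se) for pve, se in examples]
```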
Bacterial associations to host genetic variation
Prior to association testing, age, sex and date of collection were regressed from normalized bacterial taxon relative abundance data. GEMMA (v0.94) was used to perform GWAS on the residuals, including the relatedness matrices as described above to account for inter-individual relatedness. SNPs were eliminated if their minor allele frequency was lower than 10% in the individuals tested. A total of 211,319 SNPs in the winter analysis, 210,924 SNPs in the summer analysis, and 212,153 SNPs in the “seasons combined” analysis were examined. Both a conservative Bonferroni corrected P-value threshold and less conservative q-value thresholds (0.2 and 0.1) were considered to correct for multiple testing within each genome-wide association study[40].
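The minor allele frequency filter can be illustrated with a short Python sketch. It assumes genotypes coded as 0, 1, or 2 copies of one allele with None for missing calls; this coding and the function names are illustrative choices, not a description of the file formats used with GEMMA.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency from genotype calls coded 0/1/2 (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    allele_count = sum(called)      # copies of the counted allele
    total = 2 * len(called)         # two alleles per genotyped individual
    # Integer arithmetic before the division avoids rounding at the threshold.
    return min(allele_count, total - allele_count) / total


def passes_maf(genotypes, threshold=0.10):
    """Keep a SNP unless its MAF is below the threshold in the tested individuals."""
    return minor_allele_frequency(genotypes) >= threshold
```

Because the filter is applied to the individuals in each analysis, the number of SNPs retained differs slightly between winter, summer, and “seasons combined”.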
The number of bacterial taxa within each season that had both non-zero “chip heritability” and at least one genome-wide significant association were examined. To determine the significance of this overlap within each season, “chip heritability” estimates were permuted across bacterial taxa and the overlap of taxa that had both non-zero permuted “chip heritability” and at least one genome-wide significant SNP association was calculated, generating a null distribution of the number of overlaps expected by chance. An empirical P-value was calculated by dividing the number of overlaps equal to or greater than the observed number of overlaps for that season by the number of permutations (10,000) in the null distribution.
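A minimal Python sketch of this permutation scheme is given below. The function name, the boolean encoding of each taxon, and the fixed random seed are assumptions made for the example.

```python
import random


def overlap_permutation_p(heritable, has_hit, n_perm=10_000, seed=0):
    """Empirical P-value for the overlap of heritable taxa and taxa with a GWAS hit.

    `heritable` and `has_hit` are parallel lists of booleans, one entry per taxon.
    Returns the observed overlap and the fraction of permutations whose overlap
    is equal to or greater than the observed one.
    """
    rng = random.Random(seed)
    observed = sum(h and g for h, g in zip(heritable, has_hit))
    shuffled = list(heritable)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign heritability labels across taxa
        null_overlap = sum(h and g for h, g in zip(shuffled, has_hit))
        if null_overlap >= observed:
            at_least += 1
    return observed, at_least / n_perm
```

Shuffling the labels keeps the number of heritable taxa and the number of taxa with a significant association fixed, so only their pairing is randomized.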
Functional annotation enrichment
It is unclear in what tissues genetic variation might be acting in the human body to influence bacterial abundance in the gut. To investigate this, for each bacterial taxon that either had a genome-wide significant association through GWAS or non-zero “chip heritability”, we examined the enrichment of GWAS SNPs in DNase hypersensitivity (DHS) peaks for 16 tissues. In the human genome, DHS peaks mark areas of accessible chromatin, which is assumed to typically participate in gene regulatory functions. Many of these accessible regulatory regions of the human genome are cell-type or temporally specific[42]. Therefore, candidate cell-types where genetic variation may be acting to influence an ultimate phenotype can be determined by examining the enrichment of strongly associated GWAS SNPs in open chromatin across tissues. Maurano et al.[43] demonstrated the utility of this approach, identifying immune cells as the candidate cell types for Crohn’s Disease and multiple sclerosis and heart cells for QRS duration. In this study, we took this approach to identify candidate tissue types where host genetic variation may be acting to determine bacterial abundance in the gut.
DNase-I hypersensitivity data from Maurano et al.[43] was downloaded in bed format from http://www.uwencode.org/proj/Science_Maurano_Humbert_et_al/ on February 12th, 2014. The 349 available cell lines were clustered into 16 tissue classifications by calculating Euclidean distances for DNase peaks between all pairs of lines (S3 Table). DHS peaks for these tissues were determined by taking the intersection of all DHS peaks over all cell lines classified as that tissue in the previous step using intersectBed[44]. Overlaps between the genotyped Hutterite SNPs included in this study and each DHS peak for each tissue were determined using intersectBed.
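The overlap logic performed by intersectBed can be illustrated for a single chromosome with the Python sketch below. It is only an illustration of the interval test, not a replacement for the tool; it assumes BED-style coordinates (0-based, half-open intervals), 0-based SNP positions, and peaks that do not overlap one another.

```python
from bisect import bisect_right


def snps_in_peaks(snp_positions, peaks):
    """Return the SNP positions that fall inside any peak on one chromosome.

    `peaks` are (start, end) pairs, 0-based and half-open, assumed non-overlapping.
    """
    peaks = sorted(peaks)
    starts = [start for start, _ in peaks]
    hits = []
    for pos in snp_positions:
        i = bisect_right(starts, pos) - 1  # last peak starting at or before pos
        if i >= 0 and pos < peaks[i][1]:
            hits.append(pos)
    return hits
```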
We performed the following analysis for each of our bacterial abundance GWAS where either i) at least one SNP met suggestive genome-wide significance, or ii) there was non-zero “chip heritability” for that taxon (number of taxa examined in winter = 23, summer = 22, “seasons combined” = 18). For increasingly significant P-value thresholds, we examined the enrichment of SNPs identified in our GWAS (P-values at or below the given threshold) in DHS peaks from each of 16 tissues, compared to the genome-wide distribution of all tested SNPs located within DHS peaks of the same tissue. Enrichments of SNPs in DHS peaks per tissue (called “enrichments” in the remainder of this section) were calculated as described in Maurano et al.[43], with the exception that the following P-value thresholds were examined for each bacterial abundance GWAS only if at least 50 SNPs had P-values at or below the cutoff: 1.0, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005, 0.00001, 0.000005 (S1 File). Significance of enrichment was assessed for each tissue for the smallest P-value bin size per bacterial taxon abundance GWAS by comparing the actual tissue enrichment observed to the distribution of tissue enrichments observed in permuted GWAS data. Specifically, normalized bacterial abundances were randomly assigned to individuals 10,000 times for each seasonal analysis. GWAS was performed for each of these permutations and the enrichment of top GWAS associations in DHS peaks was calculated for each tissue in each permutation, generating a null distribution of enrichments per tissue. The number of SNPs falling into the most significant P-value bin with at least 50 SNPs varied by GWAS. For each comparison, the number of SNPs in the smallest bin was matched in the permutations to the number of top SNPs in the smallest P-value bin from the actual GWAS.
For example, if the actual GWAS for taxon “A” contained 63 SNPs in the smallest P-value bin, then the top 63 SNPs were chosen to calculate enrichment in each permutation. An empirical P-value was calculated by dividing the number of enrichments in the null distribution that were equal to or larger than the actual enrichment value by the total number of permutations (10,000) for each tissue (S4 Table). Note that reported P-values are not corrected for multiple testing.
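The threshold selection, enrichment, and empirical P-value steps can be sketched in Python as follows. The enrichment is written here as a simple ratio of fractions, which is an assumption made for illustration; the exact formulation of Maurano et al. is not restated in this section, and the function names are hypothetical.

```python
THRESHOLDS = [1.0, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001,
              0.0005, 0.0001, 0.00005, 0.00001, 0.000005]


def smallest_usable_threshold(pvalues, min_snps=50):
    """Most stringent threshold with at least `min_snps` SNPs at or below it."""
    usable = [t for t in THRESHOLDS if sum(p <= t for p in pvalues) >= min_snps]
    return min(usable) if usable else None


def fold_enrichment(pvalues, in_dhs, threshold):
    """Fraction of selected SNPs in DHS peaks over the fraction of all tested SNPs.

    `in_dhs` is a list of booleans parallel to `pvalues` for one tissue.
    """
    selected = [d for p, d in zip(pvalues, in_dhs) if p <= threshold]
    background = sum(in_dhs) / len(in_dhs)
    return (sum(selected) / len(selected)) / background


def empirical_p(observed, null_values):
    """Share of permuted enrichments equal to or larger than the observed one."""
    return sum(v >= observed for v in null_values) / len(null_values)
```

In the permutations, the same number of top-ranked SNPs as in the smallest bin of the actual GWAS would be passed to the enrichment calculation, so observed and null values are computed on equally sized SNP sets.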
Gene set enrichment analysis (GSEA)
GSEA was performed for any bacterial abundance GWAS with at least one significantly associated SNP using the R package postgwas[45]. Each tested SNP was assigned to the closest gene using Ensembl release 75 and any SNP further than 10kb from a gene was eliminated from analysis. An aggregate gene-wise P-value was calculated using the function gene2p (method = SpD) in postgwas by taking into account the dependency structure between SNPs. Gene ontology enrichment analyses were performed using the gwasGOenrich function with the following parameters: ontologies for cellular components, molecular function, and biological process were examined (ontology = “CC”, “MF”, or “BP”), pruneTermsBySize = 8, pkgname.GO = “org.Hs.eg.db”, and topGOalgorithm = “classic”. Correction for multiple testing was accomplished using q-values.
Yatsunenko et al. data comparison
16S rRNA amplicon data from Yatsunenko et al.[46] was downloaded from MG-RAST on 12/14/2012 and pre-processed as described previously (base quality trimming to classification) using the same procedures as for the Hutterite samples to ensure data comparability[18]. Similar normalization and filtering steps were used to process classified reads from the Yatsunenko populations: bacterial taxa that were detected in fewer than 75% of individuals per population were eliminated and the remaining bacterial taxa relative abundances were fit to a standard normal distribution across individuals using qqnorm in R. Sex effects were examined using a linear model (lm in R) for the USA and Venezuelan populations (the Malawi sample consisted of all females). Q-values were calculated within each population to control for multiple testing, with q ≤ 0.05 chosen as the significance threshold. Hypergeometric tests were performed to determine whether there was significant overlap in the taxa showing differential abundance by sex in each pair of populations.
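The overlap test in the last step can be written directly from the hypergeometric distribution, as in the Python sketch below. The function and argument names are hypothetical; the sketch assumes that the taxa compared between two populations are drawn from a shared set of n_total taxa.

```python
from math import comb


def hypergeom_overlap_p(n_total, n_a, n_b, n_overlap):
    """P(overlap >= n_overlap) for two sets of sizes n_a and n_b among n_total taxa."""
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    # Sum the upper tail of the hypergeometric probability mass function.
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_overlap, upper + 1))
    return tail / denom
```

For instance, if 3 of 10 shared taxa differ by sex in each of two populations, the probability that all 3 coincide by chance is 1/120.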
Results
Correlations of relative taxa abundance with age and sex
In a previous study of these individuals, we examined the relationship of microbiome alpha diversity to age and sex. While a significant association between diversity and sex was not observed in either winter or summer, diversity was significantly inversely correlated with age during the winter months (P < 0.01)[18]. Here, we sought to identify individual bacterial taxa whose relative abundances were correlated with sex or age in the three seasonal analyses (winter, summer, or “seasons combined”). At a q-value cutoff ≤ 0.05, the abundance of only one bacterial taxon was significantly inversely correlated with age in the winter samples (genus Bifidobacterium, Fig 1, Table 1, S5 Table). Members of Bifidobacterium have previously been shown to decrease in abundance with age in the gut[47, 48].
Table 1. Number of bacterial taxa that vary by sex or age in each season.
q-value cutoff | Age: winter | Age: summer | Age: combined | Sex: winter | Sex: summer | Sex: combined |
---|---|---|---|---|---|---|
0.001 | 0 | 0 | 0 | 0 | 0 | 3 |
0.01 | 1 | 0 | 0 | 2 | 1 | 8 |
0.05 a | 1 | 0 | 0 | 4 | 5 | 18 |
0.1 | 1 | 1 | 2 | 16 | 19 | 24 |
0.2 | 4 | 5 | 2 | 19 | 30 | 49 |
Total examined | 116 | 104 | 102 | 116 | 104 | 102 |
a Confidence threshold chosen for significance
In contrast, many bacterial taxa showed sex-specific abundance patterns (winter: 4/116, summer: 5/104, “seasons combined”: 18/102; Fig 1, Table 1, S6 Table). Moreover, bacterial taxa with sex-specific abundance patterns in multiple seasons tend to show the same direction of sex-specific effects, demonstrating that these are likely consistent sex differences over time (S2 Fig).
Estimates of chip heritability
To examine the role that common host genetic variation might play in determining microbial abundance in the gut, we examined “chip heritability”. This measurement, often referred to as the proportion of variance explained (PVE), is the maximum amount of the variance in a phenotype that can be explained by the genetic variation interrogated in a GWAS framework[49]. This method has proven successful in studies addressing questions of missing heritability for traits such as height[49], working memory performance[50], and liver enzyme levels[51].
We calculated PVE for each bacterial taxon using ~200k SNPs, as implemented in GEMMA (see Materials and Methods). Approximately 10–13% of bacterial taxa in each season have non-zero “chip heritability” estimates, suggesting that a portion of the microbiome is heritable (winter: 14/116 taxa, summer: 10/104 taxa, “seasons combined”: 13/102 taxa; Fig 2, S2 Table, S3 Fig). The standard errors of “chip heritability” estimates are very large, so we have not drawn conclusions about the actual heritability estimate or compared estimates between taxa due to the uncertainty in the measurement, but conservatively consider estimates whose error bars do not intersect zero as showing evidence of heritability.
In addition to bacterial relative abundance, we examined the “chip heritability” for three alpha diversity metrics. Diversity measures were calculated at the genus level (richness—S, Shannon diversity—H, and evenness—J) and included all classified bacterial genera (see Materials and Methods). There was no evidence of “chip heritability” for any diversity metric in summer; however, evenness in the “seasons combined” analysis (“chip heritability”: J = 0.58 ± 0.32) and both evenness and Shannon diversity in winter (“chip heritability”: J = 0.56 ± 0.22, H = 0.52 ± 0.22) have non-zero estimates (S2 Table).
Genome-wide association studies (GWAS)
To determine if specific variants in the human genome are associated with microbial abundance in the gut, we employed a classic GWAS approach for quantitative trait mapping. A large number of SNP-taxon associations were tested (~100 bacterial taxa + 3 diversity metrics per season × 3 seasons × ~200k SNPs per GWAS), which sets a study-wise Bonferroni corrected P-value threshold at 7.0 × 10⁻¹⁰. Given our sample size (~100 individuals per season), only associations with very large effect sizes would be expected to pass this significance threshold. Unsurprisingly, no variants passed this threshold in any of our analyses; therefore, we consider the individual SNP/taxon associations identified in these studies to be suggestive and requiring further replication.
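The study-wise threshold can be reproduced approximately from the numbers of taxa and SNPs tested in each analysis (see Materials and Methods). The Python sketch below assumes a significance level of 0.05, which is not stated explicitly in the text, and gives a value of about 7.1 × 10⁻¹⁰.

```python
# (taxa tested, SNPs tested) per analysis; 3 diversity metrics are added to each.
analyses = {
    "winter": (116, 211_319),
    "summer": (104, 210_924),
    "combined": (102, 212_153),
}


def study_wide_bonferroni(analyses, n_diversity=3, alpha=0.05):
    """Return the total number of tests and the Bonferroni-corrected threshold."""
    n_tests = sum((taxa + n_diversity) * snps for taxa, snps in analyses.values())
    return n_tests, alpha / n_tests


n_tests, threshold = study_wide_bonferroni(analyses)
```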
Given this caveat, we identified genome-wide significant associations after correcting for multiple testing within each bacterial abundance GWAS individually, either by Bonferroni correction or by q-value (considering significance thresholds at both q ≤ 0.1 and q ≤ 0.2 cutoffs). Limiting the correction for multiple testing to the number of tests done within an individual GWAS is similar to the typical correction applied to GWAS with human disease (where the correction factor is determined by the parameters of a GWAS for a specific disease, not across GWAS in studies that consider multiple phenotypes[52–55]).
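To make the within-GWAS false discovery rate correction concrete, the Python sketch below implements the Benjamini-Hochberg adjustment. This is a stand-in chosen for brevity, not the q-value method cited above[40]; the two approaches are related, but they are not guaranteed to give identical values.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest to the smallest P-value
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A taxon would then count as having a significant association at a given cutoff if any SNP in its GWAS has an adjusted value at or below 0.1 or 0.2.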
At least one bacterial taxon per season was significantly associated with at least one SNP (at a Bonferroni significance threshold), and the abundances of at least eight bacterial taxa are associated with at least one SNP at q-values less than or equal to 0.2 (Table 2, S7 Table, Fig 3). There were no significant associations with alpha diversity metrics in any of the analyses.
Table 2. Number of bacterial taxa with at least one SNP association reaching suggestive significance per season.
Season | Total bacterial taxa | Total SNPs | Bonferroni | q ≤ 0.1 | q ≤ 0.2 | Total significant |
---|---|---|---|---|---|---|
Winter | 116 | 211,319 | 2 | 8 | 15 | 15 |
Summer | 104 | 210,924 | 1 | 6 | 14 | 14 |
Combined | 102 | 212,153 | 1 | 2 | 8 | 8 |
We compared the results of our genome-wide association analyses to the “chip heritability” estimates for each bacterial taxon and identified bacterial taxa that have both significant “chip heritability” and at least one suggestive genome-wide significant association (winter: 6 taxa, summer: 2 taxa, “seasons combined”: 3 taxa, Fig 2, see Materials and Methods). Overall, taxa with non-zero “chip heritability” for a given season were significantly enriched for having at least one genome-wide significant SNP association using the winter microbiome data (permutation P < 0.01), were slightly enriched for significant associations in the analysis of data from both seasons (P = 0.07), but were not enriched for genome-wide associations in the summer microbiome data (P = 0.4, S4 Fig).
Gene set enrichment analysis (GSEA)
In order to gain insight into the type of host pathways and functions that might influence bacterial abundance in the gut, we performed gene set enrichment analysis (GSEA) on all GWAS with at least one genome-wide significant association in our study (S2 File, see Materials and Methods). For most bacterial GWAS, we do not observe enrichment of any gene ontology (GO) categories given the distribution of P-values of SNPs located in or near annotated genes in the genome (within 10kb). However, for several bacterial GWAS (winter: 4 taxa, summer: 10 taxa, “seasons combined”: 6 taxa), GSEA resulted in significant enrichments for biological processes one might expect to be interacting with the microbiome (q ≤ 0.2). For example, GSEA revealed enrichments of genes categorized under immune processes (summer: genus Sporacetigenium) and metabolic processes (winter: order Burkholderiales; summer: genus Sporacetigenium; “seasons combined”: genus Megasphaera), as well as in pathways that generate ribosomal components (summer: genus Anaerostipes; “seasons combined”: genus Anaerofilum) and act in multi-organism process and communication (“seasons combined”: genus Anaerostipes). Finally, enrichment in olfactory receptor pathways was observed for a number of taxa (winter: family Succinivibrionaceae; summer: genus Bifidobacterium, order Rhizobiales; “seasons combined”: genus Anaerofilum, genus Faecalibacterium). One caveat to note is that olfactory receptor genes tend to be clustered in the genome, which can lead to spurious enrichment in gene ontology tests.
Identification of candidate tissues
Our GWAS revealed a number of candidate variants that potentially influence gut microbiome composition; however, we lack an understanding of the relevant tissues in which these variants act. We sought to provide insight into this question by intersecting our GWAS results with DNase hypersensitivity (DHS) data, following the approach of Maurano et al.[43]. Cell types identified through their analysis were deemed “pathogenic cell types”; here we will refer to them simply as candidate tissues because of the unknown relationship of our GWAS results to disease.
We performed this candidate tissue identification analysis for each of our GWAS of bacterial abundance when either i) at least one SNP met suggestive genome-wide significance, or ii) there was non-zero “chip heritability” for that taxon (number of taxa examined in winter = 23, summer = 22, “seasons combined” = 18). When considering all bacterial taxa that met our inclusion criteria from either the winter or summer data sets, we do not observe more candidate tissues identified at nominal significance than we would expect by chance (S5 Fig). In contrast, for taxa meeting our inclusion criteria from the “seasons combined” analysis, we observed more candidate tissues of nominal significance than would be expected by chance; therefore we only considered results from the “seasons combined” analysis further.
For five bacterial taxa we are able to identify at least one candidate tissue (out of 18 total taxa that met inclusion criteria from “seasons combined” analyses). Endothelial tissues were identified for genus Akkermansia and muscle tissues for genus Dolosigranulum. Both intestine and stomach tissues were identified as potential candidates for genus Faecalibacterium. Family Neisseriaceae and family Pasteurellaceae both have a large number of tissues identified as candidates (Fig 4, S6 Fig).
Discussion
The role of host genetics in gut microbiome composition
In this study, we examined the role of host genetics in determining gut microbiome composition in an isolated, communally living population: the Hutterites. We first explored non-genetic effects of age and sex, and demonstrated that at least four bacterial taxa show sex-specific patterns of abundance in each season. We further demonstrated that at least 10 bacterial taxa show evidence of heritability in each season and ultimately identified at least eight bacterial taxa in each season whose abundances were significantly associated with genetic variation in the human genome. Finally, we examined host tissue types in which genetic variation may be acting to influence gut microbial composition.
Although no human replication cohorts are published, we replicated a previously reported association from a quantitative trait locus (QTL) mapping study of microbial abundance in the gut of advanced intercross mouse strains[30]. In that study, genus Lactococcus showed strong evidence of association (LOD score = 8) with markers on mouse chromosome 10. In our study, genus Lactococcus also showed evidence of genetic association in the summer (rs3747113, P = 3.13 × 10^-7). Interestingly, the associated variant in the Hutterites is located in the region syntenic to the linkage interval observed in the mouse QTL study, providing evidence of replication of this association across species. However, the linkage interval reported by Benson et al. spans tens of megabases of the mouse genome, and finer mapping is needed to confirm replication.
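The mouse result is reported as a LOD score and the Hutterite result as a P-value, so the two are not directly comparable. As a rough guide only, and assuming the standard asymptotic likelihood-ratio approximation (the mouse study may instead have set its thresholds by permutation), a LOD score relates to a chi-square statistic as follows, where L_1 and L_0 denote the maximized likelihoods with and without a QTL at the tested position:

```latex
\mathrm{LOD} = \log_{10}\frac{L_1}{L_0},
\qquad
\chi^2 = 2\ln\frac{L_1}{L_0} = 2\ln(10)\,\mathrm{LOD} \approx 4.605 \times \mathrm{LOD}
```

For LOD = 8 this gives a chi-square statistic of approximately 36.8. Converting that statistic to a P-value requires the degrees of freedom of the QTL model, which are not restated here.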
One of the more biologically interesting results of our study was the identification of SNPs that were associated with abundance levels of genus Akkermansia (Fig 3). These SNPs lie within intronic and untranslated regions (UTRs) of the gene PLD1, which is thought to play a role in signal transduction and subcellular trafficking[57–59]. This gene was implicated in a previous GWAS of body mass index (BMI) in African Americans, because a SNP associated with BMI is located near the gene[56]. In addition, increased abundance of Akkermansia muciniphila was recently shown to be protective against developing obesity in mice[60]. These correlations between a bacterial taxon, genetic variation, and BMI suggest a possible mechanism by which genetic variation in or near this gene acts to influence obesity. Further support for this mechanism was provided by the de novo candidate tissue identification, in which SNPs with low P-values from the genus Akkermansia GWAS were significantly enriched in DHS peaks of endothelial cell types (Fig 4). Endothelial barrier function is thought to be important in obesity. For example, in swine fed a high-fat diet to promote obesity, endothelial barriers became more permeable early during weight gain[61]. In addition, endothelial permeability is regulated by the immune system[62–64], and there are strong links between alterations in the immune system and obesity[65–67]. We did not observe an association between BMI and the relative abundance of genus Akkermansia in the Hutterite sample (P = 0.51); however, BMI was measured in these individuals 3–5 years prior to microbiome sampling. Further investigation is needed to test this intriguing hypothesis by confirming the relationship of Akkermansia to PLD1 and obesity, ideally within a set of samples where microbiome and obesity measures are collected concurrently.
Host cell types and pathways implicated from GWAS results
In addition to examining human variation that met statistical thresholds at a genome-wide significance level, gene-set enrichment analysis (GSEA) revealed a number of human cellular pathways that might be an important interface between the host and the microbiome. In particular, olfactory receptor activity was significant in GSEA for the GWAS of five taxa (winter: family Succinivibrionaceae; summer: genus Bifidobacterium, order Rhizobiales; combined: genus Anaerofilum, genus Faecalibacterium). It has already been demonstrated that olfactory receptors form an interface between the host and the gut microbiota. In mice, an olfactory receptor expressed in the kidneys responds to metabolites produced by gut bacteria, and this process aids in regulating blood pressure systemically via renin production[68]. It is possible that additional olfactory receptors in other tissues may also recognize compounds produced by the microbiota and act as a way for the host to regulate either host physiology or the microbiome in response to the gut environment. These results suggest that host genetic variation may exert control over microbial abundance through a variety of mechanisms, such as immune system interaction, metabolism, energy availability, and potentially olfactory receptor activity.
Genetic variation in identified pathways could be acting in a number of host tissues to influence bacterial composition in the gut, either directly (for example, hormones produced in the brain influence bacteria in the gut[69]) or indirectly (for example, microbial byproducts processed by the liver[70]). For the genus Faecalibacterium, both stomach and intestines are identified as candidate tissues where host genetic variation may be acting to influence bacterial abundance (Fig 4). F. prausnitzii, a species of Faecalibacterium, is one of the most common gut bacteria in adults and has been well characterized[71]. This bacterium lives along the mucus interface in the gut[72] and is known to affect expression of host mucus glycans[73]. High levels of F. prausnitzii are thought to be protective against ulcerative colitis[74], Crohn’s disease[75–77], and celiac disease[78]. Given the roles that members of this taxon play in gut health, it is interesting to observe that DHS peaks in stomach and intestinal tissues are significantly enriched for the most strongly associated GWAS variants. Considered together, these results provide insight into how host genetic variation may be acting to influence bacterial abundance in the gut.
The role of sex in gut microbiome composition
In addition to examining the role of host genetics in determining gut microbiome composition, we also identified bacterial taxa that are differentially abundant by sex. In the Hutterites, at least four bacterial taxa differ in abundance between the sexes each season, including genus Scardovia, genus Gordonibacter, genus Anaerotruncus, and phylum Proteobacteria. These abundance differences are directionally consistent across seasons, and there are a number of hypotheses that could explain these observations. One potential explanation is that inherent biological differences between the sexes (for instance, hormone levels) could drive the observed bacterial abundance differences. Alternatively, the division of labor between men and women in Hutterite society could drive the sex-specific differences[79]. For example, Hutterite men typically work in the income-generating jobs, which vary by colony. Younger men might work in the fields, barns, or machine shops, while older men take on positions of leadership in the colonies. In contrast, Hutterite women perform family, domestic, and food preparation jobs, including cooking, cleaning, gardening, and sewing. It is possible that men and women are exposed to different environmental microbes due to differences in their daily activities. A similar notion was suggested previously in a study of Hadza hunter-gatherers, where sex differences in the relative abundances of three taxa of the gut microbiome were observed[80]. The authors attributed those differences to the division of labor between men and women in that society (men tend to forage further from camp and for different food sources than women, who remain near camp to stay with the children).
To further investigate these two hypotheses, we examined whether sex is associated with bacterial abundances in additional populations, including individuals from the USA and Venezuela, using the data produced by Yatsunenko et al.[46] (see Materials and Methods). If underlying physiological differences between the sexes lead to varied bacterial compositions, we might expect to observe sexual dimorphism in the abundance of the same bacterial taxa across different geographies and cultures. While 14 bacterial taxa showed significant differential abundance by sex in the Yatsunenko USA population, none were significantly differentially abundant after multiple testing corrections in the Venezuelan populations (S8 Table, S7 Fig). Additionally, there was no overlap between the taxa identified as differentially abundant by sex in the Hutterites and those identified in the Yatsunenko USA population (S9 Table). Given the differences in microbial exposures and in the uniformity of the environments within the Yatsunenko and Hutterite populations, it is unclear whether sex differences in the microbiome are due to biological or cultural factors, or a combination of both.
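The phrase "after multiple testing corrections" can be illustrated with a short sketch. This section does not restate which correction was applied to the sex comparisons, so the example below swaps in the Benjamini-Hochberg procedure as a simple, widely used stand-in. The function name, the P-values, and the 5% threshold are all hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-taxon P-values for a sex effect (not from the study).
taxon_p = [0.0001, 0.004, 0.02, 0.2, 0.6]
adjusted_p = benjamini_hochberg(taxon_p)
significant = [p <= 0.05 for p in adjusted_p]
print(list(zip(taxon_p, adjusted_p, significant)))
```

A taxon can be nominally significant yet fail to survive correction. This is one reason why the number of taxa called differentially abundant depends on how many taxa were tested in each population.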
In our previous study of seasonal effects on the gut microbiome in the Hutterites[18], we did not observe differences in alpha diversity metrics (Shannon diversity, species richness, or species evenness) between men and women in either winter or summer. Conversely, when we considered individual bacterial taxon abundances we observed many sex differences in this population. These results highlight an important phenomenon that should be considered in microbiome studies: trends observed in broad summary statistics, such as diversity, are not always observed at individual taxon levels and vice versa.
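The contrast between summary statistics and individual taxa can be shown with a small sketch. The counts below are hypothetical, and the formulas are the usual definitions of richness, Shannon diversity, and Pielou evenness. They are assumed here to match the metrics named above; they are not taken from the cited study's code.

```python
import math

def alpha_diversity(counts):
    """Return (richness, Shannon diversity, Pielou evenness) for one sample."""
    present = [c for c in counts if c > 0]
    total = sum(present)
    richness = len(present)
    # Shannon diversity: H = -sum(p_i * ln p_i) over taxa that are present.
    shannon = -sum((c / total) * math.log(c / total) for c in present)
    # Pielou evenness: H divided by its maximum, ln(richness).
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness

# Hypothetical read counts for the same four taxa in two samples.
sample_a = [50, 30, 15, 5]
sample_b = [5, 15, 30, 50]   # same counts, assigned to different taxa

print(alpha_diversity(sample_a))
print(alpha_diversity(sample_b))
```

Because these metrics ignore which taxon carries which count, the two samples have the same richness, diversity, and evenness even though the first taxon is ten times more abundant in one sample than in the other. Differences of this kind are visible only in a taxon-level analysis.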
Limitations and conclusions
Although the results of our study indicate that host genetics plays a role in determining gut microbiome composition, there are important limitations. The first is the relatively small size of our sample (n ≈ 100 individuals in each season). For many GWAS of common diseases, the sample sizes necessary to detect significant associations are thousands to tens of thousands of individuals. We proposed that the relatively uniform environment to which the Hutterites are exposed, in particular their communal diet, would reduce variation due to non-genetic factors and increase our power to detect genetic associations. While this may be true, it is clear that much larger sample sizes will be needed to provide sufficient power to detect associations given the high multiple testing burden, and to narrow the confidence intervals on heritability estimates. A second limitation is that published replication cohorts are not currently available for these traits in humans, so we are unable to confirm any of the associations we detected in independent human samples.
Despite these limitations, we did replicate a previously observed bacterial abundance QTL in mouse[30]. The candidate tissue identification analysis also lends support to our results, as we would not expect any kind of enrichment if our results were all spurious. Finally, several lines of evidence point towards a relationship between genus Akkermansia, in particular, and host genetics, including genome-wide significant GWAS hits (Fig 3), endothelial cell type enrichments for the top associations (Fig 4), and biological plausibility[56, 60]. Given the medical importance of this genus[60, 81], further work should be performed to confirm this relationship and to explore how host genetics might influence this taxon.
This study is one of the first to explore host genetic influences on gut microbiome composition on a genome-wide scale in humans. We identified bacterial taxa that show sex-specific patterns of abundance in the Hutterites, although it is unclear whether biological or cultural factors are driving these patterns. We identified at least 10 bacterial taxa in each season that appear to be heritable, by examining “chip heritability”. At least eight bacterial taxa in each season are associated with variation in the human genome at a genome-wide significance level when considering the number of SNPs tested within each GWAS. Gene set enrichment analysis suggested that the SNPs identified in these GWAS likely function through a variety of different mechanisms in the body, including immune function, metabolism, and energy regulation. Finally, we identified candidate tissues where host genetic variation might act to influence microbial abundance in the gut. This work offers a first glimpse into the role human genetics plays in maintaining gut microbiome composition.
Supporting Information
Acknowledgments
We thank the Hutterites for their willingness to participate in the study; Rebecca Anderson, Daniel Cook, and Marina Antillon for assistance with mailings and coordination of the study; Sally Cain for sample collection; Xiang Zhou, Matthew Stephens, and Dan Nicolae for statistical advice; Orna Mizrahi-Man for comments on the manuscript; and members of the Clark, Luca, Pique-Regi, Knights, Pop, and especially Gilad and Ober groups for helpful discussion and suggestions.
Data Availability
The data for the 16S rRNA amplicon sequencing, genotypes, GWAS results, and metadata have been deposited in dbGaP under accession numbers phs000680 and phs000185.
Funding Statement
This work was funded by the National Institutes of Health (NIH, http://www.nih.gov/) grants HG006123 to YG, HL085197 to CO, U19 AI095230 to CO, and the University of Chicago Digestive Disease Research Core Center (http://www.uchicagoddrcc.org/) grant P30-DK42086 (through a P&F grant to YG). DAC was partially supported by NIH grant T32 GM007197. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Savage DC. Microbial ecology of the gastrointestinal tract. Annual review of microbiology. 1977;31:107–33. doi:10.1146/annurev.mi.31.100177.000543
- 2. Backhed F, Ding H, Wang T, Hooper LV, Koh GY, Nagy A, et al. The gut microbiota as an environmental factor that regulates fat storage. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(44):15718–23.
- 3. Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, et al. A core gut microbiome in obese and lean twins. Nature. 2009;457(7228):480–4. doi:10.1038/nature07540
- 4. Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 2006;444(7122):1027–31. doi:10.1038/nature05414
- 5. Nadal I, Donat E, Ribes-Koninckx C, Calabuig M, Sanz Y. Imbalance in the composition of the duodenal microbiota of children with coeliac disease. Journal of medical microbiology. 2007;56(Pt 12):1669–74.
- 6. Dicksved J, Halfvarson J, Rosenquist M, Jarnerot G, Tysk C, Apajalahti J, et al. Molecular analysis of the gut microbiota of identical twins with Crohn's disease. The ISME journal. 2008;2(7):716–27. doi:10.1038/ismej.2008.37
- 7. Manichanh C, Rigottier-Gois L, Bonnaud E, Gloux K, Pelletier E, Frangeul L, et al. Reduced diversity of faecal microbiota in Crohn's disease revealed by a metagenomic approach. Gut. 2006;55(2):205–11.
- 8. Sokol H, Lepage P, Seksik P, Dore J, Marteau P. Temperature gradient gel electrophoresis of fecal 16S rRNA reveals active Escherichia coli in the microbiota of patients with ulcerative colitis. J Clin Microbiol. 2006;44(9):3172–7.
- 9. Bibiloni R, Mangold M, Madsen KL, Fedorak RN, Tannock GW. The bacteriology of biopsies differs between newly diagnosed, untreated, Crohn's disease and ulcerative colitis patients. Journal of medical microbiology. 2006;55(Pt 8):1141–9.
- 10. Macfarlane S, Furrie E, Kennedy A, Cummings JH, Macfarlane GT. Mucosal bacteria in ulcerative colitis. Br J Nutr. 2005;93 Suppl 1:S67–72.
- 11. Sokol H, Seksik P, Rigottier-Gois L, Lay C, Lepage P, Podglajen I, et al. Specificities of the fecal microbiota in inflammatory bowel disease. Inflammatory bowel diseases. 2006;12(2):106–11.
- 12. Barman M, Unold D, Shifley K, Amir E, Hung K, Bos N, et al. Enteric salmonellosis disrupts the microbial ecology of the murine gastrointestinal tract. Infection and immunity. 2008;76(3):907–15.
- 13. Russell SL, Gold MJ, Hartmann M, Willing BP, Thorson L, Wlodarska M, et al. Early life antibiotic-driven changes in microbiota enhance susceptibility to allergic asthma. EMBO reports. 2012;13(5):440–7. doi:10.1038/embor.2012.32
- 14. Kassinen A, Krogius-Kurikka L, Makivuokko H, Rinttila T, Paulin L, Corander J, et al. The fecal microbiota of irritable bowel syndrome patients differs significantly from that of healthy subjects. Gastroenterology. 2007;133(1):24–33.
- 15. Collins SM, Denou E, Verdu EF, Bercik P. The putative role of the intestinal microbiota in the irritable bowel syndrome. Dig Liver Dis. 2009;41(12):850–3. doi:10.1016/j.dld.2009.07.023
- 16. Dominguez-Bello MG, Costello EK, Contreras M, Magris M, Hidalgo G, Fierer N, et al. Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proceedings of the National Academy of Sciences. 2010:1–5.
- 17. Harmsen HJ, Wildeboer-Veloo AC, Raangs GC, Wagendorp AA, Klijn N, Bindels JG, et al. Analysis of intestinal flora development in breast-fed and formula-fed infants by using molecular identification and detection methods. Journal of pediatric gastroenterology and nutrition. 2000;30(1):61–7.
- 18. Davenport ER, Mizrahi-Man O, Michelini K, Barreiro LB, Ober C, Gilad Y. Seasonal variation in human gut microbiome composition. PloS one. 2014;9(3):e90731. doi:10.1371/journal.pone.0090731
- 19. Duncan SH, Belenguer A, Holtrop G, Johnstone AM, Flint HJ, Lobley GE. Reduced dietary intake of carbohydrates by obese subjects results in decreased concentrations of butyrate and butyrate-producing bacteria in feces. Applied and environmental microbiology. 2007;73(4):1073–8. doi:10.1128/AEM.02340-06
- 20. De Filippo C, Cavalieri D, Di Paola M, Ramazzotti M, Poullet JB, Massart S, et al. Impact of diet in shaping gut microbiota revealed by a comparative study in children from Europe and rural Africa. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(33):14691–6. doi:10.1073/pnas.1005963107
- 21. Hehemann JH, Correc G, Barbeyron T, Helbert W, Czjzek M, Michel G. Transfer of carbohydrate-active enzymes from marine bacteria to Japanese gut microbiota. Nature. 2010;464(7290):908–12. doi:10.1038/nature08937
- 22. Zoetendal EG, Akkermans ADL, Akkermans-van Vliet WM, de Visser JAGM, de Vos WM. The Host Genotype Affects the Bacterial Community in the Human Gastrointestinal Tract. Microbial Ecology in Health and Disease. 2001;13(3).
- 23. Goodrich JK, Waters JL, Poole AC, Sutter JL, Koren O, Blekhman R, et al. Human genetics shape the gut microbiome. Cell. 2014;159(4):789–99. doi:10.1016/j.cell.2014.09.053
- 24. Khachatryan ZA, Ktsoyan ZA, Manukyan GP, Kelly D, Ghazaryan KA, Aminov RI. Predominant role of host genetics in controlling the composition of gut microbiota. PloS one. 2008;3(8):e3064. doi:10.1371/journal.pone.0003064
- 25. De Palma G, Capilla A, Nadal I, Nova E, Pozo T, Varea V, et al. Interplay between human leukocyte antigen genes and the microbial colonization process of the newborn intestine. Current issues in molecular biology. 2010;12(1):1–10.
- 26. Knights D, Silverberg MS, Weersma RK, Gevers D, Dijkstra G, Huang H, et al. Complex host genetics influence the microbiome in inflammatory bowel disease. Genome medicine. 2014;6(12):107. doi:10.1186/s13073-014-0107-1
- 27. Zhang C, Zhang M, Wang S, Han R, Cao Y, Hua W, et al. Interactions between gut microbiota, host genetics and diet relevant to development of metabolic syndromes in mice. The ISME journal. 2010;4(2):232–41. doi:10.1038/ismej.2009.112
- 28. Petnicki-Ocwieja T, Hrncir T, Liu YJ, Biswas A, Hudcovic T, Tlaskalova-Hogenova H, et al. Nod2 is required for the regulation of commensal microbiota in the intestine. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(37):15813–8. doi:10.1073/pnas.0907722106
- 29. Suzuki K, Meek B, Doi Y, Muramatsu M, Chiba T, Honjo T, et al. Aberrant expansion of segmented filamentous bacteria in IgA-deficient gut. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(7):1981–6. doi:10.1073/pnas.0307317101
- 30. Benson AK, Kelly SA, Legge R, Ma F, Low SJ, Kim J, et al. Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(44):18933–8. doi:10.1073/pnas.1007028107
- 31. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and environmental microbiology. 2009;75(23):7537–41. doi:10.1128/AEM.01541-09
- 32. Ober C, Tan Z, Sun Y, Possick JD, Pan L, Nicolae R, et al. Effect of variation in CHI3L1 on serum YKL-40 level, risk of asthma, and lung function. The New England journal of medicine. 2008;358(16):1682–91. doi:10.1056/NEJMoa0708801
- 33. Ober C, Nord AS, Thompson EE, Pan L, Tan Z, Cusanovich D, et al. Genome-wide association study of plasma lipoprotein(a) levels identifies multiple genes on chromosome 6q. Journal of lipid research. 2009;50(5):798–806. doi:10.1194/jlr.M800515-JLR200
- 34. Cusanovich DA, Billstrand C, Zhou X, Chavarria C, De Leon S, Michelini K, et al. The combination of a genome-wide association study of lymphocyte count and analysis of gene expression data reveals novel asthma candidate genes. Human molecular genetics. 2012;21(9):2111–23. doi:10.1093/hmg/dds021
- 35. Yao TC, Du G, Han L, Sun Y, Hu D, Yang JJ, et al. Genome-wide association study of lung function phenotypes in a founder population. The Journal of allergy and clinical immunology. 2014;133(1):248–55 e1-10. doi:10.1016/j.jaci.2013.06.018
- 36. Korn JM, Kuruvilla FG, McCarroll SA, Wysoker A, Nemesh J, Cawley S, et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet. 2008;40(10):1253–60. doi:10.1038/ng.237
- 37. R: A language and environment for statistical computing [Internet]. 2011. Available: http://www.R-project.org/.
- 38. Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nature genetics. 2012;44(7):821–4. doi:10.1038/ng.2310
- 39. Han L, Abney M. Using identity by descent estimation with dense genotype data to detect positive selection. European journal of human genetics: EJHG. 2013;21(2):205–11. doi:10.1038/ejhg.2012.148
- 40. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(16):9440–5. doi:10.1073/pnas.1530509100
- 41. Oksanen J, Guillaume Blanchet F, Kindt R, Legendre P, Minchin P, O'Hara R, et al. vegan: Community Ecology Package. R package version 2.2-1. 2015. Available: http://CRAN.R-project.org/package=vegan.
- 42. Natarajan A, Yardimci GG, Sheffield NC, Crawford GE, Ohler U. Predicting cell-type-specific gene expression from regions of open chromatin. Genome research. 2012;22(9):1711–22. doi:10.1101/gr.135129.111
- 43. Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337(6099):1190–5. doi:10.1126/science.1222794
- 44. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. doi:10.1093/bioinformatics/btq033
- 45. Hiersche M, Ruhle F, Stoll M. Postgwas: advanced GWAS interpretation in R. PloS one. 2013;8(8):e71775. doi:10.1371/journal.pone.0071775
- 46. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, et al. Human gut microbiome viewed across age and geography. Nature. 2012;486(7402):222–7. doi:10.1038/nature11053
- 47. Ouwehand AC, Isolauri E, Kirjavainen PV, Salminen SJ. Adhesion of four Bifidobacterium strains to human intestinal mucus from subjects in different age groups. FEMS microbiology letters. 1999;172(1):61–4.
- 48. Hopkins MJ, Macfarlane GT. Changes in predominant bacterial populations in human faeces with age and with Clostridium difficile infection. Journal of medical microbiology. 2002;51(5):448–54.
- 49. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, et al. Common SNPs explain a large proportion of the heritability for human height. Nature genetics. 2010;42(7):565–9. doi:10.1038/ng.608
- 50. Vogler C, Gschwind L, Coynel D, Freytag V, Milnik A, Egli T, et al. Substantial SNP-based heritability estimates for working memory performance. Translational psychiatry. 2014;4:e438. doi:10.1038/tp.2014.81
- 51. van Beek JH, Lubke GH, de Moor MH, Willemsen G, de Geus EJ, Hottenga JJ, et al. Heritability of liver enzyme levels estimated from genome-wide SNP data. European journal of human genetics: EJHG. 2014. doi:10.1038/ejhg.2014.259
- 52. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78. doi:10.1038/nature05911
- 53. Levy D, Ehret GB, Rice K, Verwoert GC, Launer LJ, Dehghan A, et al. Genome-wide association study of blood pressure and hypertension. Nature genetics. 2009;41(6):677–87. doi:10.1038/ng.384
- 54. Cho YS, Go MJ, Kim YJ, Heo JY, Oh JH, Ban HJ, et al. A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nature genetics. 2009;41(5):527–34. doi:10.1038/ng.357
- 55. Wellcome Trust Case Control Consortium, Craddock N, Hurles ME, Cardin N, Pearson RD, Plagnol V, et al. Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls. Nature. 2010;464(7289):713–20. doi:10.1038/nature08979
- 56. Ng MC, Hester JM, Wing MR, Li J, Xu J, Hicks PJ, et al. Genome-wide association of BMI in African Americans. Obesity. 2012;20(3):622–7. doi:10.1038/oby.2011.154
- 57. Chen YG, Siddhanta A, Austin CD, Hammond SM, Sung TC, Frohman MA, et al. Phospholipase D stimulates release of nascent secretory vesicles from the trans-Golgi network. The Journal of cell biology. 1997;138(3):495–504.
- 58. Emoto M, Klarlund JK, Waters SB, Hu V, Buxton JM, Chawla A, et al. A role for phospholipase D in GLUT4 glucose transporter translocation. The Journal of biological chemistry. 2000;275(10):7144–51.
- 59. Gomez-Cambronero J, Keire P. Phospholipase D: a novel major player in signal transduction. Cellular signalling. 1998;10(6):387–97.
- 60. Everard A, Belzer C, Geurts L, Ouwerkerk JP, Druart C, Bindels LB, et al. Cross-talk between Akkermansia muciniphila and intestinal epithelium controls diet-induced obesity. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(22):9066–71. doi:10.1073/pnas.1219451110
- 61. Galili O, Versari D, Sattler KJ, Olson ML, Mannheim D, McConnell JP, et al. Early experimental obesity is associated with coronary endothelial dysfunction and oxidative stress. American journal of physiology Heart and circulatory physiology. 2007;292(2):H904–11. 10.1152/ajpheart.00628.2006 . [DOI] [PubMed] [Google Scholar]
- 62. Mantovani A, Bussolino F, Dejana E. Cytokine regulation of endothelial cell function. FASEB journal: official publication of the Federation of American Societies for Experimental Biology. 1992;6(8):2591–9. . [DOI] [PubMed] [Google Scholar]
- 63. DiStasi MR, Ley K. Opening the flood-gates: how neutrophil-endothelial interactions regulate permeability. Trends in immunology. 2009;30(11):547–56. 10.1016/j.it.2009.07.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64. Markiewski MM, Lambris JD. The role of complement in inflammatory diseases from behind the scenes into the spotlight. The American journal of pathology. 2007;171(3):715–27. 10.2353/ajpath.2007.070166 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65. Samartín S, Chandra RK. Obesity, overnutrition and the immune system. Nutrition Research. 21(1):243–62. 10.1016/S0271-5317(00)00255-4 [DOI] [Google Scholar]
- 66. Marti A, Marcos A, Martinez JA. Obesity and immune function relationships. Obesity reviews: an official journal of the International Association for the Study of Obesity. 2001;2(2):131–40. . [DOI] [PubMed] [Google Scholar]
- 67. Gregor MF, Hotamisligil GS. Inflammatory mechanisms in obesity. Annual review of immunology. 2011;29:415–45. 10.1146/annurev-immunol-031210-101322 . [DOI] [PubMed] [Google Scholar]
- 68. Pluznick JL, Protzko RJ, Gevorgyan H, Peterlin Z, Sipos A, Han J, et al. Olfactory receptor responding to gut microbiota-derived signals plays a role in renin secretion and blood pressure regulation. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(11):4410–5. 10.1073/pnas.1215927110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69. Sudo N, Chida Y, Aiba Y, Sonoda J, Oyama N, Yu XN, et al. Postnatal microbial colonization programs the hypothalamic-pituitary-adrenal system for stress response in mice. The Journal of physiology. 2004;558(Pt 1):263–75. 10.1113/jphysiol.2004.063388 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70. Bergman EN. Energy contributions of volatile fatty acids from the gastrointestinal tract in various species. Physiological reviews. 1990;70(2):567–90. . [DOI] [PubMed] [Google Scholar]
- 71. Miquel S, Martin R, Rossi O, Bermudez-Humaran LG, Chatel JM, Sokol H, et al. Faecalibacterium prausnitzii and human intestinal health. Current opinion in microbiology. 2013;16(3):255–61. 10.1016/j.mib.2013.06.003 . [DOI] [PubMed] [Google Scholar]
- 72. Ouwerkerk JP, de Vos WM, Belzer C. Glycobiome: bacteria and mucus at the epithelial interface. Best practice & research Clinical gastroenterology. 2013;27(1):25–38. 10.1016/j.bpg.2013.03.001 . [DOI] [PubMed] [Google Scholar]
- 73. Wrzosek L, Miquel S, Noordine ML, Bouet S, Joncquel Chevalier-Curt M, Robert V, et al. Bacteroides thetaiotaomicron and Faecalibacterium prausnitzii influence the production of mucus glycans and the development of goblet cells in the colonic epithelium of a gnotobiotic model rodent. BMC biology. 2013;11:61 10.1186/1741-7007-11-61 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74. Wang M, Molin G, Ahrne S, Adawi D, Jeppsson B. High proportions of proinflammatory bacteria on the colonic mucosa in a young patient with ulcerative colitis as revealed by cloning and sequencing of 16S rRNA genes. Digestive diseases and sciences. 2007;52(3):620–7. 10.1007/s10620-006-9461-1
- 75. Hansen R, Russell RK, Reiff C, Louis P, McIntosh F, Berry SH, et al. Microbiota of de-novo pediatric IBD: increased Faecalibacterium prausnitzii and reduced bacterial diversity in Crohn's but not in ulcerative colitis. The American journal of gastroenterology. 2012;107(12):1913–22. 10.1038/ajg.2012.335
- 76. Mondot S, Kang S, Furet JP, Aguirre de Carcer D, McSweeney C, Morrison M, et al. Highlighting new phylogenetic specificities of Crohn's disease microbiota. Inflammatory bowel diseases. 2011;17(1):185–92. 10.1002/ibd.21436
- 77. Sokol H, Pigneur B, Watterlot L, Lakhdari O, Bermudez-Humaran LG, Gratadoux JJ, et al. Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut microbiota analysis of Crohn disease patients. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(43):16731–6. 10.1073/pnas.0804812105
- 78. De Palma G, Nadal I, Medina M, Donat E, Ribes-Koninckx C, Calabuig M, et al. Intestinal dysbiosis and reduced immunoglobulin-coated bacteria associated with coeliac disease in children. BMC microbiology. 2010;10:63. 10.1186/1471-2180-10-63
- 79. Hostetler JA. Hutterite society. Baltimore: Johns Hopkins University Press; 1974.
- 80. Schnorr SL, Candela M, Rampelli S, Centanni M, Consolandi C, Basaglia G, et al. Gut microbiome of the Hadza hunter-gatherers. Nature communications. 2014;5:3654. 10.1038/ncomms4654
- 81. Shin NR, Lee JC, Lee HY, Kim MS, Whon TW, Lee MS, et al. An increase in the Akkermansia spp. population induced by metformin treatment improves glucose homeostasis in diet-induced obese mice. Gut. 2014;63(5):727–35. 10.1136/gutjnl-2012-303839
Data Availability Statement
The data for the 16S rRNA amplicon sequencing, genotypes, GWAS results, and metadata have been deposited in dbGaP under accession numbers phs000680 and phs000185.