Abstract
The gut microbiome is an ecosystem that involves complex interactions. Currently, our knowledge about the role of the gut microbiome in health and disease relies mainly on differential microbial abundance, and little is known about the role of microbial interactions in the context of human disease. Here, we construct and compare microbial co-abundance networks using 2,379 metagenomes from four human cohorts: an inflammatory bowel disease (IBD) cohort, an obese cohort and two population-based cohorts. We find that the strengths of 38.6% of species co-abundances and 64.3% of pathway co-abundances vary significantly between cohorts, with 113 species and 1,050 pathway co-abundances showing IBD-specific effects and 281 pathway co-abundances showing obesity-specific effects. We can also replicate these IBD microbial co-abundances in longitudinal data from the IBD cohort of the integrative human microbiome (iHMP-IBD) project. Our study identifies several key species and pathways in IBD and obesity and provides evidence that altered microbial abundances in disease can influence their co-abundance relationship, which expands our current knowledge regarding microbial dysbiosis in disease.
Subject terms: Metabolic disorders, Inflammatory bowel disease, Gastrointestinal system
Gut microbiome alterations have been linked to inflammatory bowel disease (IBD) and obesity. Here, the authors characterize the metagenomes of four large human cohorts and perform co-abundance network analysis, showing that dysbiosis in disease is marked by altered co-abundance relationships and suggesting that pathway co-abundance networks are more heterogeneous than species networks.
Introduction
The human gut harbours a diverse community of microorganisms that interact closely with both the host and each other. Gut microorganisms are involved in digestion and degradation of nutrients, maintenance of digestive tract integrity, stimulation of the host immune system and modulation of the host metabolism1–5. In recent years, associations have been identified between gut microbiome composition and the development of certain human diseases, including diabetes6,7, cardiovascular disorders8,9, obesity10,11 and chronic gastrointestinal disorders like inflammatory bowel disease (IBD)12–14. Most associations to human diseases have been linked to lower microbial diversity, altered microbial composition and differing abundances of certain microorganisms and pathways3,8,15–19. However, the gut microbiome is an ecosystem in which microbes can exchange or compete for nutrients, signalling molecules or immune-evasion mechanisms through complicated ecological interactions that are far from fully understood20–22. Enthusiasm has thus been rising to decipher these microbial interactions in order to detect key microbes in health and disease23,24. One way of doing this is to create co-abundance networks based on correlations, a method that has the potential to study interactions between microbes and thereby generate hypotheses for experimental validation at a later stage23,24.
Various network inference tools have been developed25–29 and applied to infer microbial taxonomic networks in healthy individuals and in individuals with extreme longevity, gestational diabetes, Crohn’s disease and colorectal cancer30–35. These studies have identified microbial genera that are potentially key in health and disease, e.g. Porphyromonas and Bacteroides in gestational diabetes33. However, these previous studies were either based on 16S rRNA sequencing data, which yields limited information on microbial species and pathways, or carried out in small cohorts30–34. A further limitation of 16S sequencing is that it can only identify microbial networks up to genus level. As different bacterial species can have very different functional properties, analysis at genus level cannot fully capture the biochemical interactions between microbes. In consequence, the importance of metabolic network construction from metagenomics data has recently been highlighted24,36,37.
Here we present a metagenomics-based network analysis for bacterial species and metabolic pathways in 2379 individuals from 4 cohorts from the Netherlands (Supplementary Fig. 1): an IBD cohort (n = 496), an obesity cohort (300 Obesity (300OB); n = 298), and 2 population-based cohorts (Lifelines-DEEP (LLD; n = 1135) and 500 Functional Genomics (500FG; n = 450)). We compare the microbial taxonomic and functional networks under different host health conditions and identify potential key species and pathways that shape host-associated microbial networks (Fig. 1). We find that the microbial species and pathway co-abundances vary significantly between cohorts and report IBD- and obesity-specific co-abundance networks, which expand our current knowledge regarding microbial dysbiosis in disease. The network key species and pathways identified in IBD and obesity highlight their potential roles in regulating the microbial ecosystem in disease.
Results
Construction of gut microbial co-abundance networks
Metagenomic data from the 2379 participants in the four cohorts were processed using the same pipeline. Principal coordinate analysis showed that microbial composition and functional profiles largely overlapped between cohorts, although we observed a significant shift in species composition in the IBD cohort (Supplementary Fig. 2). A total of 134 bacterial species and 343 microbial pathways that were present in >20% of the samples in at least one cohort were included for microbial network inference. We established microbial co-abundance relationships by combining the SparCC29 and SpiecEasi38 methods. For species networks, we identified 2604 co-abundances in the LLD cohort, 1591 in the 500FG cohort, 1107 in the 300OB cohort and 2554 in the IBD cohort, yielding 3454 unique species co-abundances in total (false discovery rate (FDR) < 0.05, Fig. 2a, Supplementary Data 1). Notably, 82.1% of the species co-abundances also exhibited co-occurrence (Supplementary Fig. 3, Supplementary Data 1 and 2). For pathways, the numbers of co-abundances ranged from 37,279 in 500FG to 40,699 in LLD, yielding a total of 43,355 unique pathway co-abundances (FDR < 0.05, Fig. 2b, Supplementary Data 3). Since the absence rate of pathways is much lower than that of species, only 29.6% of pathway co-abundances showed co-occurrence (Supplementary Fig. 3, Supplementary Data 3 and 4). The co-occurrence results are summarized and further discussed in Supplementary Note 1.
Microbial co-abundance strength varies between cohorts
We hypothesized that co-abundance strengths could be different depending on host physiological status. We thus assessed to what extent the correlation coefficients were variable across cohorts and characterized variable co-abundance relationships for 38.6% of the species co-abundances and 64.3% of the pathway co-abundances (Cochran-Q test, FDR < 0.05, Supplementary Data 1 and 3).
Differential microbial co-abundances are reflected in abundance levels
When zooming in on the 100 species and 304 pathways that were involved in variable co-abundances, 76% of these species and 84% of these pathways also showed significant differences in their abundance levels among cohorts (analysis of variance test FDR < 0.05, Supplementary Data 5 and 6). This implies that the variable co-abundance relationship is largely reflected by differential microbial abundance. We summarized the number of differential co-abundances between species from the same genus or from different genera (Fig. 3a). The genus with the most heterogeneous co-abundances was Streptococcus, and a large number of variable co-abundances were observed not only between different Streptococcus species but also between Streptococcus species and species from other genera such as Eubacterium and Veillonella (Fig. 3a). In particular, Streptococcus species were higher in the IBD cohort, consistent with the results of previous studies14,39. A similar observation was found for the pathway co-abundances, particularly for amino acid biosynthesis pathways, which showed variability not only within themselves but also with respect to various pathways related to nucleoside and nucleotide biosynthesis (Fig. 3b).
Specific microbial co-abundances are enriched in disease cohorts
Next, we analysed whether the variable co-abundance relationships were driven by a particular cohort, i.e. whether the co-abundance strength in one cohort was very different from those in the other three cohorts. After correcting for the age and sex differences between cohorts, 120 species co-abundances (Supplementary Fig. 4) and 1448 pathway co-abundances (Supplementary Fig. 5) still showed cohort specificity with an FDR of 7.6%, as estimated by permutation (Supplementary Data 1 and 3). Interestingly, cohort-specific co-abundances were significantly enriched in the disease cohorts compared to the population-based cohorts: 113 (94%) species co-abundances and 1050 (72%) pathway co-abundances were specifically related to the IBD cohort (Fisher’s test P = 1.2 × 10^−56 and P < 10^−260, respectively, Fig. 3c, d) and 281 (19.4%) pathway co-abundances were specifically related to the 300OB cohort (Fisher’s test P = 2.9 × 10^−29), as compared to only 3 species and 117 pathway co-abundance relationships specific to the population-based cohorts LLD and 500FG (Fig. 3c, d). Our results highlight that microbial co-abundances are dependent on host health and disease status. Below we discuss the microbial co-abundance networks in IBD and 300OB in more detail, further replicate our findings in independent cohorts and assess the relevance of disease subtypes, disease characteristics and medication usage.
The microbial co-abundance network in IBD
Replication of the IBD network in the integrative Human Microbiome Project (iHMP-IBD) cohort: Of the 2554 species and 37,699 pathway co-abundances established in our IBD cohort, we were able to assess 2090 species co-abundances and 37,106 pathway co-abundances in 77 IBD individuals from the iHMP-IBD39. In the baseline samples of the iHMP-IBD cohort, 531 (25.4%) species co-abundances and 21,882 (59.0%) pathway co-abundances could be replicated at P < 0.05 (Supplementary Data 7 and 8)39. The relatively low replication rate in species co-abundances is largely a power issue, as we also observed that 1705 (81.6%) species co-abundances and 24,165 (65.1%) pathway co-abundances showed no significant difference in their co-abundance strengths between our IBD cohort and the iHMP-IBD cohort (Cochran-Q test, P > 0.05, Supplementary Fig. 6, Supplementary Data 7 and 8). We then compared the IBD networks between the first and last time points of the iHMP-IBD cohort (~1 year apart) and replicated 90.6% of species co-abundances and 99.6% of pathway co-abundances (Cochran-Q test, P > 0.05, Supplementary Fig. 6, Supplementary Data 7 and 8). This suggests that our estimation of co-abundance strengths in IBD was largely replicable in a different cohort and was stable across time.
Microbial networks of IBD in relation to disease characteristics: Previous studies have shown that observed microbial abundance differences could be explained by certain disease characteristics of IBD14. We therefore hypothesized that this could also be the case for co-abundance relationships. We assessed whether IBD co-abundances (including IBD co-abundances at FDR < 0.05 and IBD-specific co-abundances) could be related to the disease subtypes [ulcerative colitis (UC, n = 189) vs. Crohn’s disease (CD, n = 276)], disease location [ileum (n = 212) vs. colon (n = 286)] and disease activity [inflammation (n = 121) vs. no inflammation (n = 377)] (Supplementary Table 1). Most of the co-abundance relationships were comparable between disease characteristics, and only a few showed significant differences at FDR < 0.05 (Supplementary Fig. 7, Supplementary Data 9 and 10), namely 16 species co-abundances related to disease subtypes and 8 species co-abundances related to location. For the pathway co-abundances, 91 were related to disease subtypes, 24 to location and 3 to activity (Cochran-Q test FDR < 0.05, Supplementary Fig. 7). Out of these, five co-abundance relationships were related to an important butyrate producer, Faecalibacterium prausnitzii, which showed stronger co-abundance relationships in UC compared to CD. One example here was the negative co-abundance relationship of F. prausnitzii with Haemophilus parainfluenzae, a species known to have pathogenic properties40.
Microbial networks of IBD in relation to medication: We further tested whether drug usage can affect microbial co-abundance, as usage of antibiotics (20.0%) and proton pump inhibitors (PPIs; 26.5%) was higher in patients with IBD than in the general population cohorts (1.1% and 8.4%) (Supplementary Table 1). Here we detected no significant difference in species co-abundances between antibiotic users and non-users (Cochran-Q test FDR > 0.05, Supplementary Fig. 7), while 1049 out of 37,959 (3.7%) pathway co-abundance relationships showed statistically significant differences between PPI users and non-users, in particular related to the isoprene biosynthesis and methylerythritol phosphate pathways (Cochran-Q test FDR < 0.05, Supplementary Fig. 7, Supplementary Data 10).
Key species and pathways in IBD: Comparing microbial co-abundances in IBD with those in the other 3 cohorts, we identified 113 species co-abundances and 1050 pathway co-abundances with significantly different effects in IBD. We then assessed whether these IBD-specific co-abundances were highly connected to a specific pathway or species that may be disease relevant24, and our analysis identified three key species and four key pathways for IBD (Fig. 4).
Key species included Escherichia coli, Oxalobacter formigenes and Actinomyces graevenitzii. E. coli and O. formigenes have previously been associated with IBD14,41–45 (Fig. 4a, Supplementary Data 5). Interestingly, E. coli shows positive co-abundance relationships with species with pro-inflammatory properties, like Streptococcus mutans, and negative co-abundance relationships with species with anti-inflammatory properties, like F. prausnitzii (Supplementary Data 1). The third key species we identified for IBD, A. graevenitzii, is a microbe that is most often identified in the oral cavity or respiratory tract46.
Key IBD pathways included a C1 compound utilization and assimilation pathway (P23-PWY: reductive tricarboxylic acid (TCA) cycle I), two vitamin biosynthesis pathways (FOLSYN-PWY: superpathway of tetrahydrofolate biosynthesis and salvage and PWY-6612: superpathway of tetrahydrofolate biosynthesis) and an amino acid biosynthesis pathway (PWY-5505: L-glutamate and L-glutamine biosynthesis) (Fig. 4b, Supplementary Data 6). The top key functional pathway in IBD was the reductive TCA cycle pathway (P23-PWY), which had 76 IBD-specific co-abundances, and 94.7% of these were replicated in the iHMP-IBD cohort (Supplementary Data 3 and 6). The reductive TCA cycle is a carbon dioxide fixation pathway that has been recognized as a primordial pathway for the production of organic molecules for the biosynthesis of sugars, lipids, amino acids, pyrimidines and menaquinone (Fig. 5a)47. For instance, one IBD-specific co-abundance relationship was related to the biosynthesis of menaquinone (PWY-5837), which is also known as vitamin K2. The co-abundance relationship for this pathway in IBD (r = 0.1) was weaker than in other cohorts (r = 0.3) (Fig. 5b), despite the higher abundance of this pathway in IBD (FDR < 0.05, Fig. 5c, Supplementary Data 6). E. coli is known to be an important species for the biosynthesis of menaquinone, a growth-promoting factor for a variety of microorganisms in the gut microbiota48. In line with this, we found that 18.8% of the menaquinone biosynthesis pathway in IBD patients was contributed by E. coli, two times higher than in the two population-based cohorts (Wilcoxon test P < 3.0 × 10^−11; Supplementary Data 11). This finding suggests E. coli as an important contributor to menaquinone biosynthesis in IBD that may promote the growth of other microorganisms. Indeed, our study also revealed E. coli as a key IBD species, exerting IBD-specific co-abundance relationships with 15 species (Supplementary Data 5). Of these, strong positive correlation was observed for inflammation-related Streptococcus species, including S. mutans49, Streptococcus vestibularis41 and Streptococcus infantis42 (Fig. 5d). Accordingly, higher correlations were observed between menaquinone biosynthesis and Streptococcus species in IBD than in the other cohorts (Fig. 5e, Supplementary Data 12).
The microbial co-abundance network in 300OB
Replication of 300OB network in LLD obese individuals: 1107 species and 37,886 pathway co-abundances were detected in the 300OB cohort (Fig. 2). These estimated co-abundance strengths were largely replicable in 134 obese individuals with matched age and body mass index (BMI) from the LLD cohort, with 991 (89.5%) species co-abundances and 32,963 (87.0%) pathway co-abundances showing no difference (Cochran-Q test P > 0.05, Supplementary Fig. 8, Supplementary Data 13 and 14).
Microbial networks in relation to obesity-related diseases: The 300OB cohort was set up to study cardiovascular disease in obese individuals, including 139 patients with atherosclerotic plaque and 159 obese controls (Supplementary Table 1). In addition, 35 300OB participants had diabetes. Here we observed only three species co-abundances related to cardiovascular disease, with all 3 showing stronger co-abundances in patients with plaque than in patients without (Cochran-Q test FDR < 0.05, Supplementary Fig. 9, Supplementary Data 13 and 14). These were positive co-abundances between Dorea longicatena and Dorea formicigenerans and negative co-abundances of Lachnospiraceae bacterium 9.1.43BFAA with Coprococcus comes and D. longicatena.
Key pathways in obesity: When we compared microbial co-abundances in the 300OB to the other 3 cohorts, we identified 281 pathway co-abundances that showed a significantly different effect, i.e. obesity-specific co-abundances. One key pathway in obesity was degradation of allantoin (PWY0-41, Fig. 4b, Supplementary Data 6), which showed obesity-specific co-abundance relationships with 85 pathways. Allantoin is one of the active principles in various plants, e.g. yams, and is found to enhance insulin secretion and lower plasma glucose43,44. Its degradation product, oxamate, plays an inhibitory role in oxaloacetate/aspartate amino acids45. In line with this, we found that the allantoin degradation pathway showed stronger negative correlations with the biosynthesis pathways of oxaloacetate/aspartate amino acids (including lysine, homoserine, methionine, threonine and isoleucine) and the biosynthesis pathway of aspartate (PWY0-781, Fig. 6), which were both positively associated with fasting glucose level and negatively associated with fasting insulin level (P < 0.05, Supplementary Table 2).
Discussion
This study is a microbial co-abundance network analysis based on metagenomics data, involving 2379 participants from two population-based cohorts (LLD and 500FG) and two disease cohorts (IBD and 300OB). We report 3454 species and 43,355 pathway co-abundance relationships that were significant in at least one cohort. Among them, the effect sizes of 38.6% of species co-abundances and 64.3% of pathway co-abundances were significantly different between cohorts. In particular, 113 species co-abundances and 1050 pathway co-abundances showed IBD cohort-specific effects and 281 pathway co-abundances had specific effects in the 300OB cohort. Our study provides evidence that microbial dysbiosis can be reflected in alterations in microbial co-abundance.
Our study yielded several findings. We identified three species and four pathways in IBD and one pathway in 300OB that served as key players in disease-specific co-abundance networks. Key IBD-associated species included E. coli and O. formigenes14,50,51. A higher abundance of the pathogenic species E. coli has previously been associated with IBD50,51, likely due to an increased release of oxidized haemoglobin into the intestinal lumen as a result of chronic inflammation of the gastrointestinal walls52,53. Consistent with this, we replicated high abundances of E. coli and low abundances of anaerobic metabolism pathways in IBD. E. coli also showed strong positive co-abundance with other inflammation-inducing species in IBD, including Streptococcus species such as S. mutans49, S. vestibularis41 and S. infantis42. In contrast, these co-abundances were either weak or negative in our population-based and obesity cohorts. We further identified A. graevenitzii as a key species in IBD. Although no evidence supports a direct role for Actinomyces in the pathogenesis of IBD, A. graevenitzii has been associated with coeliac disease in children54 and can induce actinomycosis55,56, with both conditions sharing similar abdominal pathologies with IBD57. Two case reports have also suggested that Actinomyces may aggravate the intestinal injuries caused by inflammation58,59.
The top key functional pathway in IBD was the reductive TCA cycle pathway (P23-PWY), which had 76 IBD-specific co-abundances. Interestingly, the key IBD species E. coli is known to be an important species for the biosynthesis of menaquinone, a growth-promoting factor for a variety of microorganisms in the gut microbiota48. In line with this, we found that 18.8% of the menaquinone biosynthesis pathway in IBD patients was attributed to E. coli, which is two times higher than in the two population-based cohorts (Wilcoxon test P < 3.0 × 10^−11). Another notable IBD key pathway is the tetrahydrofolate pathway, which is responsible for the biosynthesis of folic acid derivatives. Folic acid supplementation has been shown to reduce the risk of colorectal cancer in IBD patients60. Interestingly, previous research has shown that oral intake of L-glutamine attenuates the colitis induced by dextran sulfate sodium in mice61. We identified a negative co-abundance between L-glutamine biosynthesis and the biosynthesis of other amino acids like L-isoleucine and L-methionine. Previous research showed that both of these amino acids play an important role in the immune system62,63. L-glutamine has been tested as a supplement in patients with IBD but did not show improvements in clinical outcomes like disease activity scores64. Our results show large numbers of connections for L-glutamine with other pathways such as the biosynthesis of other amino acids. These pathways might also be of interest when exploring L-glutamine as an intervention for IBD.
In obesity, we identified the allantoin degradation pathway as a key pathway, showing obesity-specific co-abundance relationships with 85 pathways, mainly negative correlations with biosynthesis of oxaloacetate/aspartate amino acids. These pathways are related to insulin secretion and glucose metabolism. However, their co-abundance relationships did not show significant differences between diabetic patients and non-diabetic individuals, which is likely due to a power issue as there were only 35 diabetic patients in 300OB. Instead, we found three species co-abundances related to the presence of atherosclerotic plaque, involving D. longicatena, D. formicigenerans, L. bacterium 9.1.43BFAA and C. comes. Notably, D. longicatena and Lachnospiraceae species have been linked to atherosclerotic cardiovascular disease9.
Altogether, our analyses show that microbial dysbiosis in disease may not be driven solely by differences in abundance level; it may also reflect shifts in microbial interactions that are mirrored in co-abundance analyses. Particularly when applied to metagenomics sequence data, pathway-based co-abundance networks provide further insights into functional dysbiosis in IBD and obesity. However, we also acknowledge several limitations of our study. This is an in silico network analysis based on correlation in bacterial abundance levels. Even with the large sample size, our study is still underpowered relative to the number of interactions assessed. In recent years, many different network tools have been developed to tackle the statistical challenges in inferring networks for compositional data. In this study, we applied two independent methods, SparCC and SpiecEasi, to establish microbial co-abundance networks based on MetaPhlan and HUMAnN2 annotation. Our analysis can thus be biased due to these annotation tools. Other annotation tools, e.g. mcSEED65, may yield different pictures of microbial community and functional profile, thereby identifying different co-abundance networks. Thus, such in silico-based network inferences require further functional validation. Although bacterial genes are believed to be expressed uniformly66, previous studies have also shown that meta-transcription can exert dynamic changes in response to environmental perturbations that cannot be detected at the metagenome level67,68. Thus, in order to understand the microbial ecosystem in terms of functional interaction in diseases, we need complementary approaches like meta-proteomics and meta-metabolomics that provide a more direct readout of the functional properties of the gut microbiome. Furthermore, the cross-sectional design of this study makes it hard to assess the stability of our findings over time. However, we did observe similar findings for the iHMP-IBD cohort for 98.2% of species co-abundances and 99.4% of pathway co-abundances between two time points spanning 1 year (Cochran-Q test P > 0.05). This implies that co-abundance relationships are largely consistent over time.
In addition, due to our study design, we cannot disentangle cause from consequence. Longitudinal studies are therefore warranted and should be combined with functional validation. Moreover, especially in the context of IBD, which is a heterogeneous disease, we had limited ability to pinpoint co-abundance networks to specific disease characteristics like the subtypes CD and UC. This is probably due to the lack of power to detect this by subgrouping our cohorts. Larger cohorts with well-documented disease characteristics are needed in the future.
This study presents a microbial network analysis that examines both microbial species and functional pathways based on metagenomics sequencing. Our data show that dysbiosis of the gut microbial ecosystem in disease can be assessed not only by altered abundance levels but also at the level of microbial interaction, at least in terms of co-abundances. We have also identified IBD- and obesity-specific species and pathways that potentially play important roles in regulating the microbial ecosystem in disease, and these disease-specific microbial interactions extend our current knowledge about the role of the microbiome in disease.
Methods
Study cohorts
All four cohorts used in this study have been described before3,14,69,70. In short, the LLD cohort is a large prospective cohort study from the north of the Netherlands71. LLD contains 58.20% females and 41.80% males, the mean age (SD) of participants is 45.04 (13.60) years and their mean BMI is 25.26 (4.18) (Supplementary Fig. 1). In this study, we included 1135 LLD individuals for whom metagenomics and phenotype data are available3.
The 500FG cohort consists of 534 healthy adult volunteers from the Netherlands69,70. In 500FG, 56.50% are women and 43.50% are men, the mean age of participants is 27.43 (12.35) years and their mean BMI is 22.70 (2.72) (Supplementary Fig. 1). In this study, we included 450 500FG participants for whom metagenomics data are available.
The 300OB cohort is part of the Functional Genomics project and consists of 302 individuals from the Netherlands with a BMI >27 (refs. 70,72,73). These individuals have been clinically screened for obesity-related comorbidities. Around half of participants are clinically diagnosed with metabolic syndrome. 300OB is 55.70% male and 44.30% female, the mean age of participants is 67.07 (5.39) years and their mean BMI is 30.73 (3.48) (Supplementary Fig. 1). In this study, we included 298 participants from 300OB for whom metagenomics data are available.
The 1000 Inflammatory Bowel Disease (1000IBD) cohort consists of patients with IBD recruited at the specialized IBD outpatient clinic of the University Medical Center Groningen in the Netherlands14,74. IBD diagnosis was made based on accepted radiological, endoscopic and histopathological evaluation. The 1000IBD cohort is 60.70% female and 39.30% male, the mean age of participants is 43.45 (14.52) years and their mean BMI is 25.55 (5.17) (Supplementary Fig. 1). In this study, we included 496 IBD participants for whom metagenomics data are available.
Ethical approval
All participants signed an informed consent form prior to sample collection. Institutional ethics review board (IRB) approval was available for all four cohorts: the LLD (ref. M12.113965) and the IBD (IRB-number 2008.338) cohorts were approved by the UMCG IRB and the 500FG study (NL42561.091.12, 2012/550) and 300OB (NL46846.091.13) cohorts were approved by the Ethical Committee of Radboud University Nijmegen.
Metagenomic data generation and pre-processing
All participants from the four cohorts were asked to collect faecal samples at home and to place them in their home freezer (−20 °C) within 15 min after production. Subsequently, a nurse visited the participant to pick up the faecal samples on dry ice and transfer them to the laboratory. Aliquots were then made and stored at −80 °C until further processing. The same protocol for faecal DNA isolation and metagenomics sequencing was used for all four cohorts. Faecal DNA isolation was performed using the AllPrep DNA/RNA Mini Kit (Qiagen, cat. 80204). After DNA extraction, faecal DNA was sent to the Broad Institute of Harvard and MIT in Cambridge, MA, USA, where library preparation and whole-genome shotgun sequencing were performed on the Illumina HiSeq platform. From the raw metagenomic sequencing data, low-quality reads were discarded by the sequencing facility and reads belonging to the human genome were removed by mapping the data to the human reference genome (version NCBI37) with Bowtie2 (v2.1.0)75.
The relative abundance of gut microbial taxonomic units was determined using MetaPhlan (v2.7.2)76, and the relative abundances of metabolic pathways were determined using the HUMAnN2 pipeline (v0.10.0)77, which maps DNA/RNA reads to a customized database of functionally annotated pan-genomes. HUMAnN2 reported the abundances of gene families from the UniProt Reference Clusters78 (UniRef90), which were further mapped to microbial pathways from the MetaCyc metabolic pathway database79,80. In total, we detected 698 species and 489 pathways present in at least 1 of the 4 cohorts. To deal with sparse microbial data in the network analysis, we focused on species/pathways present in at least 20% of samples in at least one cohort. This provided a confined list of 134 species and 343 pathways for use in the network analysis. Together these accounted for, on average, 86.9% and 99.9% of taxonomic and functional compositions, respectively.
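The prevalence filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names, the nested-dictionary layout and the toy abundances are all assumptions, and presence is assumed to mean a relative abundance greater than zero.

```python
from typing import Dict, List

def prevalence(values: List[float]) -> float:
    """Fraction of samples in which a feature is present (abundance > 0)."""
    return sum(1 for v in values if v > 0) / len(values)

def select_features(cohorts: Dict[str, Dict[str, List[float]]],
                    min_prevalence: float = 0.20) -> List[str]:
    """Keep features present in at least `min_prevalence` of the samples
    in at least one cohort.

    `cohorts` maps cohort name -> feature name -> per-sample abundances
    (hypothetical layout chosen for this sketch).
    """
    all_features = set()
    for table in cohorts.values():
        all_features.update(table)
    kept = []
    for feature in sorted(all_features):
        if any(feature in table and prevalence(table[feature]) >= min_prevalence
               for table in cohorts.values()):
            kept.append(feature)
    return kept

# Toy data (illustrative only): sp_A is present in 2/5 samples of one cohort,
# sp_B in at most 1/10 samples of any cohort.
toy = {
    "cohort_1": {"sp_A": [0.1, 0.2, 0.0, 0.0, 0.0], "sp_B": [0.0, 0.0, 0.0, 0.0, 0.0]},
    "cohort_2": {"sp_A": [0.0] * 10, "sp_B": [0.3] + [0.0] * 9},
}
selected = select_features(toy)
```

A feature that is rare overall is still retained if it is common in a single cohort, which is what allows disease-enriched species to enter the network analysis.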
Statistical analysis
Co-abundance network inference: co-abundance analysis on compositional data is challenging because such data are likely to exhibit spurious correlations due to the dependency of fractions (i.e. relative abundances sum to 1)29,81–84. In particular, the problem can be more serious in a microbial community with low compositionality85. We therefore first assessed the inverse Simpson index of microbial composition for the effective number of species (neff). Our analysis showed high compositionality in both functional pathway composition (2.09, 2.10, 2.11 and 2.08 in LLD, 500FG, 300OB and IBD, respectively) and species composition (10.74, 11.87, 12.30 and 8.80 in LLD, 500FG, 300OB and IBD, respectively). Following the suggestion of Weiss et al.85, based on their assessment of the performance of eight different methods (Bray–Curtis, Pearson, Spearman, CoNet, LSA, MIC, RMT and SparCC), we decided to use the SparCC method because it has been shown to infer linear relationships with high precision for high diversity compositions with neff < 13. Species composition data from MetaPhlan were converted to predicted read counts by multiplying relative abundances by the total sequence counts and then subjected to a Python-based SparCC tool29. For pathway analysis, the read counts from HUMAnN2 were directly used for SparCC. The significance of co-abundances was controlled at the FDR 0.05 level using 100 permutations. In each permutation, the abundance of each microbial factor was randomly shuffled across samples. To reduce indirect associations, we further applied SpiecEasi (v1.0.6), which infers the graphical model underlying the microbial network using the concept of conditional independence38. In this way, we obtained 3454 species and 43,355 pathway co-abundances that were detectable by both methods (Fig. 1).
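Three of the preprocessing steps in this paragraph (the effective number of species, the conversion of relative abundances to predicted read counts, and the per-feature shuffling used for permutations) can be illustrated with a short Python sketch. It is not the authors' code and does not run SparCC or SpiecEasi; the function names, the per-sample input and the rounding of predicted counts are assumptions made for illustration.

```python
import random
from typing import Dict, List

def inverse_simpson(rel_abund: List[float]) -> float:
    """Effective number of species (neff) as the inverse Simpson index, 1 / sum(p_i^2)."""
    total = sum(rel_abund)
    if total <= 0:
        raise ValueError("sample has no abundance")
    props = [x / total for x in rel_abund if x > 0]
    return 1.0 / sum(p * p for p in props)

def predicted_counts(rel_abund: List[float], total_reads: int) -> List[int]:
    """Convert relative abundances to predicted read counts.

    Rounding to integers is an assumption of this sketch.
    """
    return [round(p * total_reads) for p in rel_abund]

def shuffle_within_features(table: Dict[str, List[float]], seed: int) -> Dict[str, List[float]]:
    """One permutation: shuffle each feature independently across samples,
    which breaks between-feature correlations but keeps each feature's
    abundance distribution."""
    rng = random.Random(seed)
    shuffled = {}
    for feature, values in table.items():
        permuted = list(values)  # copy so the input table is left untouched
        rng.shuffle(permuted)
        shuffled[feature] = permuted
    return shuffled

# Toy sample with four equally abundant species: neff is 4.
even_sample = [0.25, 0.25, 0.25, 0.25]
neff_even = inverse_simpson(even_sample)
counts = predicted_counts([0.5, 0.3, 0.2], total_reads=1000)
```

In a permutation scheme of this kind, the correlation statistic would be recomputed on each shuffled table (100 times in the study) to obtain the null distribution against which the observed co-abundances are compared.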
Co-occurrence network inference: presence and absence of each bacterial species and metabolic pathway were treated as binary traits. The pair-wise co-occurrence relationship between two microbial factors (species or pathway) in each cohort was assessed using Pearson’s chi-squared test. If the number of co-occurrence pairs was greater than the number of co-exclusion pairs, the two microbial factors were considered to be a co-occurrence. If the number of co-occurrence pairs was less than the number of co-exclusion pairs, the two factors were considered to be a co-exclusion. Permutation (100×) was conducted to determine significance at an FDR < 0.05. In each permutation, the presence and absence of each microbial factor were randomly shuffled across samples. At the species level, we detected 6015 co-occurrence relationships that were found in at least one cohort, with 3423 found in LLD, 1845 in 500FG, 1199 in 300OB and 4701 in IBD (Supplementary Data 2). At the pathway level, we detected 19,903 co-occurrence relationships that appeared in at least one cohort, with 13,501 found in LLD, 7581 in 500FG, 7580 in 300OB and 16,596 in IBD (Supplementary Data 4).
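The pair-wise test described above can be sketched with the standard library alone. This is an illustrative example, not the authors' code: the function name is hypothetical, no continuity correction is applied, and the direction rule assumes that "co-occurrence pairs" are samples in which both factors share the same state and "co-exclusion pairs" are samples in which only one factor is present.

```python
import math


def cooccurrence_test(present_a, present_b):
    """Pearson chi-squared test on the 2x2 presence/absence table of two factors."""
    n = len(present_a)
    pairs = list(zip(present_a, present_b))
    both = sum(1 for a, b in pairs if a and b)
    only_a = sum(1 for a, b in pairs if a and not b)
    only_b = sum(1 for a, b in pairs if b and not a)
    neither = n - both - only_a - only_b
    margins = (both + only_a, only_b + neither, both + only_b, only_a + neither)
    if 0 in margins:
        raise ValueError("a factor is always present or always absent")
    chi2 = n * (both * neither - only_a * only_b) ** 2 / math.prod(margins)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-squared survival function, 1 d.f.
    concordant, discordant = both + neither, only_a + only_b
    if concordant > discordant:
        direction = "co-occurrence"
    elif concordant < discordant:
        direction = "co-exclusion"
    else:
        direction = "undetermined"
    return chi2, p_value, direction


# Toy presence/absence vectors for ten samples (made-up data).
a = [True, True, True, True, False, False, False, False, True, False]
b = [True, True, True, False, False, False, False, True, True, False]
chi2, p_value, direction = cooccurrence_test(a, b)
```

In the procedure described in the text, significance is then assessed by permutation rather than by the nominal P value, so the `p_value` returned here is only the per-pair statistic that would enter that step.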
Network heterogeneity analysis: To assess the variability of networks among the four cohorts, we conducted Cochran-Q tests to assess the heterogeneity of effect sizes and directions across the four cohorts for each co-abundance (correlation coefficient generated by SparCC) and co-occurrence (odds ratio). Here we treated each cohort as one study and conducted the Cochran-Q test using the metagen() function from the package meta (v4.9.5) in R, which calculates the squared difference between individual study effects and the pooled effect using inverse variance weighting86. For each co-abundance, the P values from the Cochran-Q test were recorded, and co-abundances with significant heterogeneity were controlled at the FDR 0.05 level determined by permutation (100×). In this case, all samples from the four cohorts were randomly shuffled across cohorts, i.e. shuffling the cohort labels but keeping the correlation structure of species and pathways intact. Co-abundances with a Cochran-Q test FDR < 0.05 were considered heterogeneous, while co-abundances with Cochran-Q test P > 0.05 were considered stable. We also summarized species co-abundances based on microbial genus and pathway co-abundances based on metabolic category.
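To make the heterogeneity statistic concrete, the sketch below computes Cochran's Q with inverse variance weighting for one co-abundance measured in several cohorts. It is a stand-alone illustration and not a reimplementation of metagen(): as an assumption of this example, each correlation is Fisher z-transformed with variance 1/(n − 3), and the cohort sizes and correlations are made up.

```python
import math


def chi2_sf(x, df):
    """Survival function of the chi-squared distribution for integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        term = math.exp(-y)
        total = term
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return total
    total = math.erfc(math.sqrt(y))
    term = 2.0 * math.sqrt(y / math.pi) * math.exp(-y)
    for j in range(1, (df + 1) // 2):
        total += term
        term *= y / (j + 0.5)
    return total


def cochran_q(correlations, sample_sizes):
    """Cochran's Q and its P value for correlations from independent cohorts."""
    z = [math.atanh(r) for r in correlations]  # Fisher z-transform (assumption)
    w = [n - 3 for n in sample_sizes]          # inverse of var(z) = 1 / (n - 3)
    pooled = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - pooled) ** 2 for wi, zi in zip(w, z))
    return q, chi2_sf(q, len(z) - 1)


sizes = [1000, 400, 300, 500]  # hypothetical cohort sizes
q_same, p_same = cochran_q([0.30, 0.30, 0.30, 0.30], sizes)
q_diff, p_diff = cochran_q([0.30, 0.28, 0.31, -0.20], sizes)
```

Identical correlations give Q = 0, while a single cohort with an opposite-signed correlation produces a large Q; in the analysis described above, the resulting P values are further calibrated by permuting cohort labels.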
Cohort-specific co-abundance selection
For heterogeneous co-abundances and co-occurrences (Cochran-Q test FDR < 0.05), we further assessed whether these relationships showed cohort specificity, i.e. whether the effect size of co-abundance/co-occurrence in one cohort was very different from that in the other three. In this analysis, the effect size was the SparCC correlation coefficient for co-abundance and the odds ratio for co-occurrence. We adopted an outlier detection method based on interquartile ranges (IQRs) (Supplementary Fig. 10)87. We ranked the effect sizes from low to high, say b1, b2, b3, b4, and then calculated the corresponding 25%, 50% and 75% quartile values (Q1, Q2 and Q3, respectively). The IQR was then calculated, and we assessed whether the smallest or largest effect size fell outside of Q1 − 2 × IQR or Q3 + 2 × IQR. If only one met the condition, we called this co-abundance specific and assigned it to the corresponding cohort (Supplementary Fig. 10). To assess whether cohort-specific co-abundances were enriched for a specific cohort, we conducted Fisher’s exact test. We also calculated the average FDR of cohort-specific co-abundances using 100× permutations as described above for the heterogeneity analysis.
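The outlier rule above can be illustrated with a short sketch. The function name and the effect sizes are hypothetical, and the quartile definition (linear interpolation via `statistics.quantiles` with `method="inclusive"`) is an assumption, since the text does not state which quantile estimator was used.

```python
import statistics


def cohort_specific(effects, k=2.0):
    """Return the cohort whose effect size is the single IQR outlier, else None.

    effects: {cohort_name: effect size}
    """
    q1, _, q3 = statistics.quantiles(list(effects.values()), n=4, method="inclusive")
    iqr = q3 - q1
    low_name = min(effects, key=effects.get)
    high_name = max(effects, key=effects.get)
    low_out = effects[low_name] < q1 - k * iqr
    high_out = effects[high_name] > q3 + k * iqr
    if low_out != high_out:  # exactly one extreme lies outside the fences
        return low_name if low_out else high_name
    return None


# Made-up SparCC coefficients for one co-abundance in the four cohorts.
specific = cohort_specific({"LLD": 0.10, "500FG": 0.12, "300OB": 0.15, "IBD": 0.90})
not_specific = cohort_specific({"LLD": 0.10, "500FG": 0.12, "300OB": 0.15, "IBD": 0.20})
```

With only four values, the quartiles depend noticeably on the interpolation rule, so a different estimator could move borderline cases across the fence.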
Key microbial species and pathway detection
To assess to what extent cohort-specific microbial relationships were linked to a specific species or pathway, we calculated the number of cohort-specific microbial relationships per species/pathway. To define the key species/pathway, we took the maximum number of false cohort-specific relationships per species/pathway from each permutation and determined the key species/pathway cut-off as the upper bound of the 95% confidence interval based on 100× permutations. At this cut-off, there is a 5% probability that a false enrichment could occur by chance. In this way, a species with at least 13 cohort-specific co-abundances or a pathway with at least 70 cohort-specific co-abundances was recognized as a key species or pathway. For co-occurrence networks, these numbers were 10 for key species and 45 for key pathways. In total, we detected 192 cohort-specific species co-abundances and 2235 cohort-specific pathway co-abundances.
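One way to read the cut-off procedure above is as an empirical percentile of the per-permutation maxima. The sketch below follows that reading, which is an assumption: the function names, the nearest-rank percentile and all counts are illustrative, and only the threshold of 13 is taken from the text.

```python
def key_feature_cutoff(max_counts_per_permutation, percent=95):
    """Nearest-rank percentile of the maximum false count seen in each permutation."""
    ordered = sorted(max_counts_per_permutation)
    rank = (percent * len(ordered) + 99) // 100  # integer ceiling, 1-based rank
    return ordered[rank - 1]


def key_features(counts_per_feature, cutoff):
    """Features with at least `cutoff` cohort-specific relationships."""
    return sorted(name for name, count in counts_per_feature.items() if count >= cutoff)


# Made-up maxima from 100 permutations: the values 1..100, one per permutation.
toy_cutoff = key_feature_cutoff(list(range(1, 101)))

# Made-up observed counts, filtered with the species threshold reported in the text.
observed = {"species_X": 14, "species_Y": 13, "species_Z": 3}
keys = key_features(observed, cutoff=13)
```

Using the maximum per permutation, rather than all per-feature counts, makes the threshold conservative because it controls the chance that any feature reaches the cut-off under the null.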
Assessing impact of confounding factors
The age and sex distributions were different between cohorts (Supplementary Fig. 1). To assess the impact of age and sex, we conducted partial correlation analysis (Supplementary Fig. 11). For example, to assess the co-abundance between species A and B, we first assessed the Pearson correlation of A and of B with each covariate, say C. Then, a pairwise correlation matrix of A, B and C was subjected to partial correlation (Supplementary Fig. 11) using the partial correlation function cor2pcor from the R package corpcor (version 1.6.9). This ensured that the partial correlation determined between A and B was independent of the covariate C. To assess the impact, we compared the SparCC correlation coefficients with the partial correlation coefficients for all co-abundances and found comparable effect sizes (Supplementary Fig. 12). After regressing out the confounding effects of age and sex on cohort-specific co-abundances, 120 out of 192 (62.5%) species and 1448 out of 2235 (64.8%) pathway co-abundances remained cohort specific.
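For a single covariate, the partial correlation obtained from a 3 × 3 correlation matrix reduces to the first-order formula r_AB·C = (r_AB − r_AC r_BC) / sqrt((1 − r_AC²)(1 − r_BC²)). The sketch below applies that formula in plain Python; it is an illustration with made-up vectors and hypothetical function names, not a call to cor2pcor.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(a, b, c):
    """Correlation of a and b after removing the linear effect of covariate c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Toy data: A and B both depend on covariate C plus independent components.
c = [1, 1, 1, 1, -1, -1, -1, -1]
a = [3, 3, 1, 1, -1, -1, -3, -3]  # 2*c + e, with e orthogonal to c
b = [3, 1, 3, 1, -1, -3, -1, -3]  # 2*c + f, with f orthogonal to c and e
raw = pearson(a, b)
adjusted = partial_correlation(a, b, c)
```

Here the raw correlation of 0.8 is driven entirely by the shared covariate and drops to zero after adjustment, which is the kind of confounded relationship the comparison in the text is designed to flag.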
Replication of microbial networks
To replicate microbial networks in IBD, we used data from 77 IBD patients from the iHMP-IBD as a replication cohort88. Given the iHMP-IBD’s longitudinal study design, we could examine metagenomics data from the first and the last sample collection for each individual. In all, 91% of the species (123 out of 134) and 99% of the pathways (340 out of 343) found in our IBD cohort were also detected in the first sample collection in iHMP-IBD. The differences in co-abundance strength between the IBD cohort and the iHMP-IBD cohort were assessed using the Cochran-Q test. Co-abundances with a non-significant heterogeneity test (P > 0.05) were defined as replicable. We also investigated the stability of microbial networks in iHMP-IBD by comparing the microbial co-abundances in the first and last sample collection from the same participants using the same approach.
To replicate microbial networks in 300OB, we selected 134 obese individuals from the LLD cohort with matched age and BMI. Here we considered a co-abundance to be replicable if the Cochran-Q test heterogeneity between the discovery and replication cohorts was not significant (P > 0.05).
Assessing the relevance of microbial co-abundances to sub-phenotypes
Patients in the IBD and 300OB cohorts had different disease subtypes, and both cohorts had higher proportions of drug users than our population cohorts. In particular, the IBD cohort contained 276 patients with CD and 189 with UC. Within the IBD cohort, 126 patients took PPIs and 97 took antibiotics. In the 300OB cohort, 53.4% (159 out of 298) had an atherosclerotic plaque detected by ultrasound72 and 35 were diabetic. To assess the co-abundances related to disease sub-phenotypes, we split the cohorts based on disease subtypes or medication use and inferred microbial co-abundance using SparCC. The Cochran-Q test was applied to assess the differential microbial co-abundances at FDR < 0.05.
Species contributions to pathways and species–pathway associations
Since the pathway abundances reported by HUMAnN2 are computed at both community and individual species level77, we further looked into the contribution of species to each pathway and reported the top contributor (species). To show the functional relationship between species and pathways (e.g. whether a given pathway has the potential to promote the growth of a species through its metabolic products), we also checked the correlation (Spearman) between microbial species and pathway abundance after adjusting for age, sex and read depth using a linear regression model89. FDR was further calculated based on 100× permutation.
Network visualization
Cohort-specific networks based on cohort-specific co-abundances were visualized using a circle plot or heatmap with hierarchical clustering analysis (ward.D clustering based on Minkowski distance). Both species and pathway networks were visualized using the package igraph (v1.2.4.1)90 in R. For species networks, species belonging to the same genus were clustered together. For pathway networks, pathways from the same metabolic category were presented in a sub-circle, and categories with a limited number of pathways (<4) were grouped into an "other" category. Classification of pathways was based on the MetaCyc metabolic pathway database79,80.
Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.
Acknowledgements
We thank the participants and staff of LifeLines-DEEP, 500FG, 300OB and the IBD cohort for their collaboration and the support of the LifeLines Cohort study and the Human Functional Genomics Project. We thank J. Dekens, M. Platteel, A. Maatman and J. Arends for management and technical support and K. Mc Intyre for English editing. This project was funded by the Netherlands Heart Foundation (IN-CONTROL CVON grant 2012-03 and 2018-27 to L.A.B.J., N.P.R., M.G.N., F.K., A.Z. and J.F.); the Top Institute Food and Nutrition, Wageningen, the Netherlands (TiFN GH001 to C.W.); the Netherlands Organization for Scientific Research (NWO) (NWO-VIDI 864.13.013 to J.F., NWO-VIDI 016.178.056 to A.Z., NWO Spinoza Prize SPI 94-212 to M.G.N., NWO Spinoza Prize SPI 92-266 to C.W. and NWO Gravitation Netherlands Organ-on-Chip Initiative (024.003.001) to C.W.); the European Research Council (ERC) (FP7/2007-2013/ERC Advanced Grant Agreement 2012-322698 to C.W., ERC Consolidator Grant 310372 to M.G.N. and ERC Starting Grant 715772 to A.Z.); the Stiftelsen Kristian Gerhard Jebsen Foundation (Norway) to C.W.; L.A.B.J. was supported by a Competitiveness Operational Programme grant of the Romanian Ministry of European Funds (P_37_762, MySMIS 103587); the RuG Investment Agenda Grant Personalized Health to C.W. and the Foundation De Cock-Hadders grant (20:20-13) to L.C. A.Z. holds a Rosalind Franklin Fellowship from the University of Groningen. L.C. is supported by a joint fellowship from the University Medical Center Groningen and China Scholarship Council (CSC201708320268). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author contributions
C.W., A.Z., M.G.N., R.K.W. and J.F. conceptualized and managed the study. L.C., V.C., M.J., I.C.L.v.d.M., A.V.V., A.K., R.G., T.S., M.O., L.A.B.J., J.H.W.R. and N.P.R. collected the samples and generated the data. L.C. analysed the data. L.C., V.C. and J.F. drafted the manuscript. All the authors reviewed and edited the manuscript.
Data availability
All relevant data supporting the key findings of this study are available within the article and its Supplementary Information files. Data underlying Fig. 5c and Supplementary Fig. 2 are provided as a Source data file. Data underlying all the other figures are provided in Supplementary Data and data repositories: LifeLines-Deep cohort [https://www.ebi.ac.uk/ega/datasets/EGAD00001001991], 1000 IBD cohort [https://www.ebi.ac.uk/ega/datasets/EGAD00001004194], 300OB cohort [https://ega-archive.org/dacs/EGAC00001001143], and 500FG cohort [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA319574]. The iHMP data are available via https://ibdmdb.org/tunnel/public/summary.html. Due to informed consent regulation, the data sets of the Lifelines-DEEP, IBD, 300OB and 500FG cohorts are available upon request to the University Medical Center of Groningen (UMCG), Lifelines and Radboud University Medical Center, respectively. This includes the submission of a letter of intention to the corresponding data access committee [the Lifelines Data Access Committee for the LifeLines-DEEP data (Jackie Dekens, e-mail: j.a.m.dekens@umcg.nl), 1000 IBD Data access Committee UMCG for the IBD data (Melinde E. Wijers, e-mail: m.e.wijers@umcg.nl) and the Human Functional Genomics Data Access Committee for 500FG and 300OB data (Martin Jaeger, e-mail: Martin.Jaeger@radboudumc.nl)]. Data sets can be made available under a data transfer agreement and the data usage access is subject to local rules and regulations. Source data are provided with this paper.
Code availability
For this study, the following software was used: kneadData (v0.4.6.1), Bowtie2 (v2.1.0), MetaPhlAn2 (v2.7.2), HUMAnN2 (v0.10.0), SparCC Python package, R (v3.5.2), SpiecEasi R package (v1.0.6), and meta R package (v4.9.5). Code used for generating the microbial abundance profiles is publicly available at https://github.com/GRONINGEN-MICROBIOME-CENTRE/Groningen-Microbiome/blob/master/Scripts/Metagenomics_pipeline_v1.md. Code used for the statistical analyses is publicly available at https://github.com/GRONINGEN-MICROBIOME-CENTRE/Groningen-Microbiome/tree/master/Projects/Microbial%20co-abundance%20network. Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Valerie Collij, Martin Jaeger
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-020-17840-y.
References
- 1. Chen LM, Garmaeva S, Zhernakova A, Fu JY, Wijmenga C. A system biology perspective on environment-host-microbe interactions. Hum. Mol. Genet. 2018;27:R187–R194. doi: 10.1093/hmg/ddy137.
- 2. Falony G, et al. Population-level analysis of gut microbiome variation. Science. 2016;352:560–564. doi: 10.1126/science.aad3503.
- 3. Zhernakova A, et al. Population-based metagenomics analysis reveals markers for gut microbiome composition and diversity. Science. 2016;352:565–569. doi: 10.1126/science.aad3369.
- 4. Lloyd-Price J, et al. Strains, functions and dynamics in the expanded Human Microbiome Project. Nature. 2017;550:61–66. doi: 10.1038/nature23889.
- 5. Kurilshikov A, Wijmenga C, Fu J, Zhernakova A. Host genetics and gut microbiome: challenges and perspectives. Trends Immunol. 2017;38:633–647. doi: 10.1016/j.it.2017.06.003.
- 6. Wu H, et al. Metformin alters the gut microbiome of individuals with treatment-naive type 2 diabetes, contributing to the therapeutic effects of the drug. Nat. Med. 2017;23:850–858. doi: 10.1038/nm.4345.
- 7. Grasset E, et al. A specific gut microbiota dysbiosis of type 2 diabetic mice induces GLP-1 resistance through an enteric NO-dependent and gut-brain axis mechanism. Cell Metab. 2017;26:278. doi: 10.1016/j.cmet.2017.06.003.
- 8. Fu J, et al. The gut microbiome contributes to a substantial proportion of the variation in blood lipids. Circ. Res. 2015;117:817–824. doi: 10.1161/CIRCRESAHA.115.306807.
- 9. Jie Z, et al. The gut microbiome in atherosclerotic cardiovascular disease. Nat. Commun. 2017;8:845. doi: 10.1038/s41467-017-00900-1.
- 10. Liu R, et al. Gut microbiome and serum metabolome alterations in obesity and after weight-loss intervention. Nat. Med. 2017;23:859–868. doi: 10.1038/nm.4358.
- 11. Bouter KE, van Raalte DH, Groen AK, Nieuwdorp M. Role of the gut microbiome in the pathogenesis of obesity and obesity-related metabolic dysfunction. Gastroenterology. 2017;152:1671–1678. doi: 10.1053/j.gastro.2016.12.048.
- 12. Halfvarson J, et al. Dynamics of the human gut microbiome in inflammatory bowel disease. Nat. Microbiol. 2017;2:17004. doi: 10.1038/nmicrobiol.2017.4.
- 13. Imhann F, et al. Interplay of host genetics and gut microbiota underlying the onset and clinical presentation of inflammatory bowel disease. Gut. 2018;67:108–119. doi: 10.1136/gutjnl-2016-312135.
- 14. Vich Vila A, et al. Gut microbiota composition and functional changes in inflammatory bowel disease and irritable bowel syndrome. Sci. Transl. Med. 2018;10:eaap8914.
- 15. Imhann F, et al. Proton pump inhibitors affect the gut microbiome. Gut. 2016;65:740–748. doi: 10.1136/gutjnl-2015-310376.
- 16. Mangalam A, et al. Human gut-derived commensal bacteria suppress CNS inflammatory and demyelinating disease. Cell Rep. 2017;20:1269–1277. doi: 10.1016/j.celrep.2017.07.031.
- 17. Petriz BA, Franco OL. Metaproteomics as a complementary approach to gut microbiota in health and disease. Front. Chem. 2017;5:4. doi: 10.3389/fchem.2017.00004.
- 18. Rosshart SP, et al. Wild mouse gut microbiota promotes host fitness and improves disease resistance. Cell. 2017;171:1015.E13–1028.E13. doi: 10.1016/j.cell.2017.09.016.
- 19. Surana NK, Kasper DL. Moving beyond microbiome-wide associations to causal microbe identification. Nature. 2017;552:244–247. doi: 10.1038/nature25019.
- 20. Baumler AJ, Sperandio V. Interactions between the microbiota and pathogenic bacteria in the gut. Nature. 2016;535:85–93. doi: 10.1038/nature18849.
- 21. Eickhoff MJ, Bassler BL. SnapShot: bacterial quorum sensing. Cell. 2018;174:1328. doi: 10.1016/j.cell.2018.08.003.
- 22. Schirmer M, et al. Linking the human gut microbiome to inflammatory cytokine production capacity. Cell. 2016;167:1897. doi: 10.1016/j.cell.2016.11.046.
- 23. Berry D, Widder S. Deciphering microbial interactions and detecting keystone species with co-occurrence networks. Front. Microbiol. 2014;5:219. doi: 10.3389/fmicb.2014.00219.
- 24. Banerjee S, Schlaeppi K, van der Heijden MGA. Keystone taxa as drivers of microbiome structure and functioning. Nat. Rev. Microbiol. 2018;16:567–576. doi: 10.1038/s41579-018-0024-1.
- 25. Faust K, et al. Microbial co-occurrence relationships in the human microbiome. PLoS Comput. Biol. 2012;8:e1002606. doi: 10.1371/journal.pcbi.1002606.
- 26. Xia LC, Ai D, Cram J, Fuhrman JA, Sun F. Efficient statistical significance approximation for local similarity analysis of high-throughput time series data. Bioinformatics. 2013;29:230–237. doi: 10.1093/bioinformatics/bts668.
- 27. Reshef DN, et al. Detecting associations in large data sets. Science. 2011;334:1518–1524. doi: 10.1126/science.1205438.
- 28. Deng Y, et al. Molecular ecological network analyses. BMC Bioinformatics. 2012;13:113. doi: 10.1186/1471-2105-13-113.
- 29. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 2012;8:e1002687. doi: 10.1371/journal.pcbi.1002687.
- 30. Faust K, et al. Microbial co-occurrence relationships in the human microbiome. PLoS Comput. Biol. 2012;8:e1002606. doi: 10.1371/journal.pcbi.1002606.
- 31. Biagi E, et al. Gut microbiota and extreme longevity. Curr. Biol. 2016;26:1480–1485. doi: 10.1016/j.cub.2016.04.016.
- 32. Flemer B, et al. Tumour-associated and non-tumour-associated microbiota in colorectal cancer. Gut. 2017;66:633–643. doi: 10.1136/gutjnl-2015-309595.
- 33. Wang J, et al. Dysbiosis of maternal and neonatal microbiota associated with gestational diabetes mellitus. Gut. 2018;67:1614–1625. doi: 10.1136/gutjnl-2018-315988.
- 34. Gevers D, et al. The treatment-naive microbiome in new-onset Crohn’s disease. Cell Host Microbe. 2014;15:382–392. doi: 10.1016/j.chom.2014.02.005.
- 35. Yilmaz B, et al. Microbial network disturbances in relapsing refractory Crohn’s disease. Nat. Med. 2019;25:323–336. doi: 10.1038/s41591-018-0308-z.
- 36. Rottjers L, Faust K. Can we predict keystones? Nat. Rev. Microbiol. 2019;17:193–193. doi: 10.1038/s41579-018-0132-y.
- 37. Banerjee S, Schlaeppi K, van der Heijden MGA. Reply to ‘Can we predict microbial keystones?’. Nat. Rev. Microbiol. 2019;17:194–194. doi: 10.1038/s41579-018-0133-x.
- 38. Kurtz ZD, et al. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput. Biol. 2015;11:e1004226. doi: 10.1371/journal.pcbi.1004226.
- 39. Schirmer M, Garner A, Vlamakis H, Xavier RJ. Microbial genes and pathways in inflammatory bowel disease. Nat. Rev. Microbiol. 2019;17:497–511. doi: 10.1038/s41579-019-0213-6.
- 40. Musher DM. in Medical Microbiology 4th edn (ed. Baron S) Ch. 30 (The University of Texas Medical Branch at Galveston, Galveston, TX, 1996).
- 41. Doyuk E, Ormerod OJ, Bowler ICJW. Native valve endocarditis due to Streptococcus vestibularis and Streptococcus oralis. J. Infect. 2002;45:39–41. doi: 10.1053/jinf.2002.1004.
- 42. Pimenta F, et al. Streptococcus infantis, Streptococcus mitis, and Streptococcus oralis strains with highly similar cps5 loci and antigenic relatedness to serotype 5 pneumococci. Front. Microbiol. 2018;9:3199. doi: 10.3389/fmicb.2018.03199.
- 43. Niu CS, et al. Decrease of plasma glucose by allantoin, an active principle of yam (Dioscorea spp.), in streptozotocin-induced diabetic rats. J. Agric. Food Chem. 2010;58:12031–12035. doi: 10.1021/jf103234d.
- 44. Tsai CC, et al. Allantoin activates imidazoline I-3 receptors to enhance insulin secretion in pancreatic beta-cells. Nutr. Metab. 2014;11:41.
- 45. Marlier JF, Cleland WW, Zeczycki TN. Oxamate is an alternative substrate for pyruvate carboxylase from Rhizobium etli. Biochemistry. 2013;52:2888–2894. doi: 10.1021/bi400075t.
- 46. Hall V. Actinomyces–gathering evidence of human colonization and infection. Anaerobe. 2008;14:1–7. doi: 10.1016/j.anaerobe.2007.12.001.
- 47. Smith E, Morowitz HJ. Universality in intermediary metabolism. Proc. Natl Acad. Sci. USA. 2004;101:13168–13173. doi: 10.1073/pnas.0404922101.
- 48. Fenn K, et al. Quinones are growth factors for the human gut microbiota. Microbiome. 2017;5:161. doi: 10.1186/s40168-017-0380-5.
- 49. Kojima A, et al. Infection of specific strains of Streptococcus mutans, oral bacteria, confers a risk of ulcerative colitis. Sci. Rep. 2012;2:332. doi: 10.1038/srep00332.
- 50. Kotlowski R, Bernstein CN, Sepehri S, Krause DO. High prevalence of Escherichia coli belonging to the B2+D phylogenetic group in inflammatory bowel disease. Gut. 2007;56:669–675. doi: 10.1136/gut.2006.099796.
- 51. Mirsepasi-Lauridsen HC, et al. Extraintestinal pathogenic Escherichia coli are associated with intestinal inflammation in patients with ulcerative colitis. Sci. Rep. 2016;6:31152. doi: 10.1038/srep31152.
- 52. Zhu H, Li YR. Oxidative stress and redox signaling mechanisms of inflammatory bowel disease: updated experimental and clinical evidence. Exp. Biol. Med. 2012;237:474–480. doi: 10.1258/ebm.2011.011358.
- 53. Albenberg L, et al. Correlation between intraluminal oxygen gradient and radial partitioning of intestinal microbiota. Gastroenterology. 2014;147:1055–1063 e1058. doi: 10.1053/j.gastro.2014.07.020.
- 54. Ou G, et al. Proximal small intestinal microbiota and identification of rod-shaped bacteria associated with childhood celiac disease. Am. J. Gastroenterol. 2009;104:3058–3067. doi: 10.1038/ajg.2009.524.
- 55. Nagaoka K, et al. Multiple lung abscesses caused by Actinomyces graevenitzii mimicking acute pulmonary coccidioidomycosis. J. Clin. Microbiol. 2012;50:3125–3128. doi: 10.1128/JCM.00761-12.
- 56. Kononen E, Wade WG. Actinomyces and related organisms in human infections. Clin. Microbiol. Rev. 2015;28:419–442. doi: 10.1128/CMR.00100-14.
- 57. Wong VK, Turmezei TD, Weston VC. Actinomycosis. BMJ. 2011;343:d6099. doi: 10.1136/bmj.d6099.
- 58. Lin K, et al. A rare thermophilic bug in complicated diverticular abscess. Case Rep. Gastroenterol. 2017;11:569–575. doi: 10.1159/000480072.
- 59. Nahum A, Filice G, Malhotra A. A complicated thread: abdominal actinomycosis in a young woman with Crohn disease. Case Rep. Gastroenterol. 2017;11:377–381. doi: 10.1159/000475917.
- 60. Burr NE, Hull MA, Subramanian V. Folic acid supplementation may reduce colorectal cancer risk in patients with inflammatory bowel disease. J. Clin. Gastroenterol. 2017;51:247–253. doi: 10.1097/MCG.0000000000000498.
- 61.Jeong SY, Im YN, Youm JY, Lee HK, Im SY. l-glutamine attenuates DSS-induced colitis via induction of MAPK phosphatase-1. Nutrients. 2018;10:E288. doi: 10.3390/nu10030288. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Gu C, Mao X, Chen D, Yu B, Yang Q. Isoleucine plays an important role for maintaining immune function. Curr. Protein Pept. Sci. 2019;20:644–651. doi: 10.2174/1389203720666190305163135. [DOI] [PubMed] [Google Scholar]
- 63.Martinez Y, et al. The role of methionine on metabolism, oxidative stress, and diseases. Amino Acids. 2017;49:2091–2098. doi: 10.1007/s00726-017-2494-2. [DOI] [PubMed] [Google Scholar]
- 64.Limketkai BN, Wolf A, Parian AM. Nutritional interventions in the patient with inflammatory bowel disease. Gastroenterol. Clin. North Am. 2018;47:155–177. doi: 10.1016/j.gtc.2017.09.007. [DOI] [PubMed] [Google Scholar]
- 65.Overbeek R, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST) Nucleic Acids Res. 2014;42:D206–D214. doi: 10.1093/nar/gkt1226. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Franzosa EA, et al. Relating the metatranscriptome and metagenome of the human gut. Proc. Natl Acad. Sci. USA. 2014;111:E2329–E2338. doi: 10.1073/pnas.1319284111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Mehta RS, et al. Stability of the human faecal microbiome in a cohort of adult men. Nat. Microbiol. 2018;3:347–355. doi: 10.1038/s41564-017-0096-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Schirmer M, et al. Dynamics of metatranscription in the inflammatory bowel disease gut microbiome. Nat. Microbiol. 2018;3:337–346. doi: 10.1038/s41564-017-0089-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Schirmer M, et al. Linking the human gut microbiome to inflammatory cytokine production capacity. Cell. 2016;167:1897–1897. doi: 10.1016/j.cell.2016.11.046. [DOI] [PubMed] [Google Scholar]
- 70.ter Horst R, et al. Host and environmental factors influencing individual human cytokine responses. Cell. 2016;167:1111–1124. doi: 10.1016/j.cell.2016.10.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Tigchelaar EF, et al. Cohort profile: LifeLines DEEP, a prospective, general population cohort study in the northern Netherlands: study design and baseline characteristics. BMJ Open. 2015;5:e006772. doi: 10.1136/bmjopen-2014-006772. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Kurilshikov A, et al. Gut microbial associations to plasma metabolites linked to cardiovascular phenotypes and risk: a cross-sectional study. Circ. Res. 2019;124:1808–1820. doi: 10.1161/CIRCRESAHA.118.314642. [DOI] [PubMed] [Google Scholar]
- 73.ter Horst, R. et al. Sex-Specific Regulation of Inflammation and Metabolic Syndrome in Obesity. Arterioscler Thromb Vasc Biol.40, 1787–1800 (2020). [DOI] [PMC free article] [PubMed]
- 74.Imhann F, et al. The 1000IBD project: multi-omics data of 1000 inflammatory bowel disease patients; data release 1. BMC Gastroenterol. 2019;19:5. doi: 10.1186/s12876-018-0917-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- 76. Truong DT, et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat. Methods. 2015;12:902–903. doi: 10.1038/nmeth.3589.
- 77. Franzosa EA, et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat. Methods. 2018;15:962–968. doi: 10.1038/s41592-018-0176-y.
- 78. Bateman A, et al. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43:D204–D212. doi: 10.1093/nar/gku989.
- 79. Caspi R, et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 2016;44:D471–D480. doi: 10.1093/nar/gkv1164.
- 80. Caspi R, et al. The MetaCyc database of metabolic pathways and enzymes. Nucleic Acids Res. 2018;46:D633–D639. doi: 10.1093/nar/gkx935.
- 81. Lark RM. Compositional data analysis in the geosciences: from theory to practice. J. R. Stat. Soc. Ser. A. 2008;171:313–314.
- 82. Filzmoser P, Hron K. Correlation analysis for compositional data. Math. Geosci. 2009;41:905–919.
- 83. Gloor GB, Wu JR, Pawlowsky-Glahn V, Egozcue JJ. It’s all relative: analyzing microbiome data as compositions. Ann. Epidemiol. 2016;26:322–329. doi: 10.1016/j.annepidem.2016.03.003.
- 84. van den Boogaart KG, Tolosana-Delgado R. “compositions”: a unified R package to analyze compositional data. Comput. Geosci. 2008;34:320–338.
- 85. Weiss S, et al. Correlation detection strategies in microbial data sets vary widely in sensitivity and precision. ISME J. 2016;10:1669–1681. doi: 10.1038/ismej.2015.235.
- 86. Schwarzer G. meta: an R package for meta-analysis. R News. 2007;7:6.
- 87. Barbato G, Barini EM, Genta G, Levi R. Features and performance of some outlier detection methods. J. Appl. Stat. 2011;38:2133–2149.
- 88. Proctor LM, Network IHIR. The Integrative Human Microbiome Project: dynamic analysis of microbiome-host omics profiles during periods of human health and disease. Cell Host Microbe. 2014;16:276–289. doi: 10.1016/j.chom.2014.08.014.
- 89. Zhernakova DV, et al. Individual variations in cardiovascular-disease-related protein levels are driven by genetics and gut microbiome. Nat. Genet. 2018;50:1524–1532. doi: 10.1038/s41588-018-0224-7.
- 90. Csardi G, Nepusz T. The igraph software package for complex network research. InterJ. Complex Syst. 2006;1695:1–9.
Data Availability Statement
All relevant data supporting the key findings of this study are available within the article and its Supplementary Information files. Data underlying Fig. 5c and Supplementary Fig. 2 are provided as a Source data file. Data underlying all other figures are provided in Supplementary Data and in the following data repositories: LifeLines-DEEP cohort [https://www.ebi.ac.uk/ega/datasets/EGAD00001001991], 1000 IBD cohort [https://www.ebi.ac.uk/ega/datasets/EGAD00001004194], 300OB cohort [https://ega-archive.org/dacs/EGAC00001001143] and 500FG cohort [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA319574]. The iHMP data are available via https://ibdmdb.org/tunnel/public/summary.html. Due to informed consent regulations, the data sets of the LifeLines-DEEP, IBD, 300OB and 500FG cohorts are available upon request to the University Medical Center of Groningen (UMCG), Lifelines and Radboud University Medical Center, respectively. A request includes submitting a letter of intention to the corresponding data access committee [the Lifelines Data Access Committee for the LifeLines-DEEP data (Jackie Dekens, e-mail: j.a.m.dekens@umcg.nl), the 1000 IBD Data Access Committee UMCG for the IBD data (Melinde E. Wijers, e-mail: m.e.wijers@umcg.nl) and the Human Functional Genomics Data Access Committee for the 500FG and 300OB data (Martin Jaeger, e-mail: Martin.Jaeger@radboudumc.nl)]. Data sets can be made available under a data transfer agreement, and data usage is subject to local rules and regulations. Source data are provided with this paper.
For this study, the following software was used: kneadData (v0.4.6.1), Bowtie2 (v2.1.0), MetaPhlAn2 (v2.7.2), HUMAnN2 (v0.10.0), SparCC Python package, R (v3.5.2), SpiecEasi R package (v1.0.6), and meta R package (v4.9.5). Code used for generating the microbial abundance profiles is publicly available at https://github.com/GRONINGEN-MICROBIOME-CENTRE/Groningen-Microbiome/blob/master/Scripts/Metagenomics_pipeline_v1.md. Code used for the statistical analyses is publicly available at https://github.com/GRONINGEN-MICROBIOME-CENTRE/Groningen-Microbiome/tree/master/Projects/Microbial%20co-abundance%20network.