eLife. 2023 May 9;12:e83152. doi: 10.7554/eLife.83152

Universal gut microbial relationships in the gut microbiome of wild baboons

Kimberly E Roche 1, Johannes R Bjork 2,3,4, Mauna R Dasari 4, Laura Grieneisen 5, David Jansen 4, Trevor J Gould 6, Laurence R Gesquiere 7, Luis B Barreiro 8,9,10, Susan C Alberts 7,11,12, Ran Blekhman 9, Jack A Gilbert 13, Jenny Tung 7,11,12,14, Sayan Mukherjee 1,15,16,17, Elizabeth A Archie 4
Editors: Dario Riccardo Valenzano 18, Wendy S Garrett 19
PMCID: PMC10292843  PMID: 37158607

Abstract

Ecological relationships between bacteria mediate the services that gut microbiomes provide to their hosts. Knowing the overall direction and strength of these relationships is essential to learn how ecology scales up to affect microbiome assembly, dynamics, and host health. However, whether bacterial relationships are generalizable across hosts or personalized to individual hosts is debated. Here, we apply a robust, multinomial logistic-normal modeling framework to extensive time series data (5534 samples from 56 baboon hosts over 13 years) to infer thousands of correlations in bacterial abundance in individual baboons and test the degree to which bacterial abundance correlations are ‘universal’. We also compare these patterns to two human data sets. We find that most bacterial correlations are weak, negative, and universal across hosts, such that shared correlation patterns dominate over host-specific correlations by almost twofold. Further, taxon pairs that had inconsistent correlation signs (either positive or negative) in different hosts always had weak correlations within hosts. From the host perspective, host pairs with the most similar bacterial correlation patterns also had similar microbiome taxonomic compositions and tended to be genetic relatives. Compared to humans, universality in baboons was similar to that in human infants, and stronger than one data set from human adults. Bacterial families that showed universal correlations in human infants were often universal in baboons. Together, our work contributes new tools for analyzing the universality of bacterial associations across hosts, with implications for microbiome personalization, community assembly, and stability, and for designing microbiome interventions to improve host health.

Research organism: P. cynocephalus

eLife digest

Communities of bacteria living in the guts of humans and other animals perform essential services for their hosts such as digesting food, degrading toxins, or fighting viruses and other bacteria that cause disease. These services emerge from so-called ‘ecological’ relationships between different species of bacteria.

One species, for example, may break down a molecule in human food into another compound that is, in turn, digested by another species into a small molecule that the human gut can absorb and use. The bacteria involved in such a process may become more or less common together in their host. In other situations, some bacteria may have opposing roles to each other, meaning that if one species becomes more abundant it may reduce the level of the other.

However, it is not known if relationships between different bacteria are consistent across hosts (i.e., universal) or unique to each host (personalized). In other words, if a pair of bacteria increase and decrease in abundance together in one host, do they do the same in other hosts? Microbes often swap genes with each other to gain new traits; as each host harbors a distinctive set of gut microbes, it may be possible for microbial relationships to change depending on the bacterial species present in a specific environment.

To investigate, Roche et al. studied the bacteria in thousands of samples of feces collected from 56 baboons over a 13-year period. These samples came from a long-term research project in Amboseli, Kenya which has been studying a population of wild baboons continuously since 1971.

Roche et al. measured the abundance of hundreds of gut bacteria in the feces to understand the relationships between pairs. This revealed that connections between species were largely universal rather than personalized to each baboon. Furthermore, the pairs of bacteria with the strongest positive or negative associations had the most consistent relationships across the baboons. Microbial relationships that have strong effects on the microbiome’s composition might therefore be especially universal.

Further analyses measuring gut bacteria in human babies also found that relationships between pairs of bacteria were largely universal. Hence, individual species of bacteria may fill similar ecological roles in each host across humans and other primates, and perhaps also in other mammals. These findings suggest that it may be possible to leverage the ecological relationships between bacteria to develop universal therapies for human diseases associated with gut bacteria, such as inflammatory bowel disease or Clostridium difficile infection.

Introduction

Mammalian gut microbiomes are highly diverse, dynamic communities whose members exhibit the full spectrum of ecological relationships, from strong mutualisms like syntrophy and cross-feeding, to competition, parasitism, and predation (Faust and Raes, 2012; Foster and Bell, 2012; Dolinšek et al., 2016; Seth and Taga, 2014). These relationships mediate a variety of biological processes that have powerful effects on host health and fitness, including the metabolism of complex carbohydrates and toxins, and the synthesis of physiologically important compounds, like short-chain fatty acids, neurotransmitters, and vitamins (Faust and Raes, 2012; Foster and Bell, 2012; Dolinšek et al., 2016; Seth and Taga, 2014; Bäckhed et al., 2005; Gould et al., 2018; Pontrelli et al., 2022; Degnan et al., 2014). Despite their importance, major gaps remain in our understanding of microbial relationships in the gut (Faust and Raes, 2012; Loftus et al., 2021; Bashan et al., 2016). We typically do not know if the abundance of one microbe consistently predicts the abundance of other microbes in the same host community, nor do we understand whether these correlative relationships are consistent in strength or direction across hosts (Bashan et al., 2016; Widder et al., 2016; Cao et al., 2017; Faust and Raes, 2016).

Knowing the overall direction and strength of these correlative relationships is important to understanding the ecological relationships that mediate gut microbial processes and shape gut microbiome assembly, stability, and productivity (Coyte et al., 2015; Palmer and Foster, 2022; Hu et al., 2022). For instance, sets of microbes that exhibit strong, positive relationships within hosts may represent networks of cooperating taxa that promote each other’s growth (Bäckhed et al., 2005; Loftus et al., 2021; Wu et al., 2021). In turn, these strong, mutualistic interdependencies can create an ecological house of cards where microbes rise and fall together, hampering community assembly and stability (Coyte et al., 2015; Coyte et al., 2021). Further, understanding the degree to which correlative relationships between microbes are the same or different in different hosts can shed light on whether hosts share similar, underlying microbial ecologies (Loftus et al., 2021; Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018). Microbial ecologies that are similar across hosts may make it possible to manipulate the microbiome’s emergent properties to improve host health (Loftus et al., 2021; Bashan et al., 2016; Cao et al., 2017; Coyte et al., 2015; Coyte et al., 2021; Gonze et al., 2018).

To date, there are several reasons to think that correlative relationships in the gut microbiome will not be consistent across hosts and will instead be individualized to each host. For instance, several common community and evolutionary processes—such as horizontal gene transfer and priority effects—can lead microbiome taxa to fill different ecological roles in different hosts (Dolinšek et al., 2016; Franzosa et al., 2015; Faith et al., 2013; Bik et al., 2016; Caporaso et al., 2011; Costello et al., 2009). Further, genotype by environment interactions and plasticity could lead some microbes to adopt context-dependent metabolisms and ecological roles depending on their microbial neighbors or other aspects of the environment (Louca et al., 2018; Rainey and Quistad, 2020; Martiny et al., 2015; Debray et al., 2022). Finally, the common observation that gut microbial community compositions (i.e. the presence and abundance of taxa) are highly individualized is sometimes attributed to host-specific microbial ecologies and relationships (Franzosa et al., 2015; Faith et al., 2013; Bik et al., 2016; Caporaso et al., 2011; Costello et al., 2009; Risely et al., 2021; Kolodny et al., 2019; Flores et al., 2014; Johnson et al., 2019; Pruss et al., 2021).

However, to date, the handful of studies that have tested the generalizability of gut microbial relationships across hosts suggest that microbiome community ecology is not highly individualized and is instead largely consistent (i.e. ‘universal’) across hosts (Figure 1A; Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018; Kalyuzhny et al., 2017). For instance, Bashan et al., 2016, inferred ‘universal’ gut microbial relationships in the human gut microbiome by applying dissimilarity overlap analysis (DOA) to cross-sectional samples from several human data sets. DOA infers universal microbial relationships by testing whether pairs of hosts who share many of the same microbiome taxa also have similar abundances of those taxa (Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018; Kalyuzhny et al., 2017). This approach relies on the assumption that, when two communities share many of the same species and have similar abundances of those species, they do so because of a shared, underlying set of between-species bacterial abundance relationships (Bashan et al., 2016; Kalyuzhny et al., 2017). While many studies using this approach find evidence that microbial relationships are ‘universal’ (Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018), DOA’s assumptions have been questioned because environmental gradients, stochastic processes, and the presence of many non-interactive species can lead to the spurious detection of universality (Bashan et al., 2016; Kalyuzhny et al., 2017; Marsland et al., 2020).

Figure 1. Testing the generalizability of gut microbial correlations across hosts.

(A) Schematic illustrating our approach for testing the degree to which gut microbial abundance correlations are consistent (i.e. ‘universal’; Bashan et al., 2016) across different baboon hosts. The left-hand set of images shows our expectations for consistent correlation patterns; the right-hand images show our expectations for individualized correlation patterns. Colored circles next to each baboon represent microbes found in at least 50% of samples overall (and at least 20% of samples within each host). We also excluded putative duplicate 16S gene copies (see Methods). In each host, we inferred centered log-ratio (CLR) abundance trajectories for these taxa using a multinomial logistic-normal modeling approach implemented in the R package ‘fido’ (Silverman et al., 2022). Cartoons of two such trajectories for the orange and blue taxa are below each baboon. We used these trajectories to infer covariances between each pair of taxa in all baboons (represented by covariance matrices; we only analyzed microbial pairs whose joint zero abundance was less than 5% of all samples across hosts). We then converted these covariances to Pearson’s correlations and compared bacterial correlation patterns across all hosts, shown as heat maps (red cells are positively correlated taxa; blue cells reflect negatively correlated taxa). (B) Irregular time series of fecal samples used to infer microbial CLR abundance trajectories in 56 baboon hosts (n=5534 total samples; 75–181 samples per baboon across 6–13.3 years). Each point represents a fecal sample collected from a known individual baboon (y-axis) on a given date (x-axis). Samples from the same baboon were collected a median of 20 days apart (range = 0–723 days; 25th percentile = 7 days, 75th percentile = 49 days). (C) Relative abundances of the eight most prevalent gut bacterial orders and families over time (x-axis) for all 56 hosts (samples from females are labeled with an F; male samples with an M). Microbiota composition was somewhat individualized to each host (Figure 1—figure supplement 2; Björk et al., 2022; Grieneisen et al., 2021).

Figure 1—figure supplement 1. Host ranging patterns.

Ranging patterns for the baboon social groups included in this study during the period of microbiome sampling (May 19, 2000, to September 19, 2013). See Björk et al., 2022, for an in-depth analysis of environmental drivers of microbiome dynamics in this host population.

Figure 1—figure supplement 2. Variation across hosts.

Each baboon exhibits a somewhat distinctive gut microbial community (p<0.0001; ANOVA), summarized here by the first two principal components of centered log-ratio (CLR)-transformed amplicon sequence variant (ASV) abundances during six representative years of sample collection (see Figure 1B). Colored dots represent one of five especially well-sampled hosts, and colors indicate samples from the same host. Gray dots represent all other samples collected in the same year.

Figure 1—figure supplement 3. An increase in frequency of joint zeros is associated with positive ASV-ASV correlations across four methods.

ASV-ASV pairs with a high frequency of joint absences (zero abundance observations) are more likely to be estimated as positively correlated by four methods: (A) basset, reporting Pearson’s correlation; (B) basset, reporting proportionality (rho; see Quinn et al., 2017); (C) COAT (see Cao et al., 2019); or (D) SparCC (see Friedman and Alm, 2012). The vertical dashed line in all panels shows a 0.05 cutoff for the proportion of joint zeros, the cutoff used in the main text. ASV, amplicon sequence variant.

An obvious alternative is to measure microbial correlations directly from microbiome time series from several hosts (Loftus et al., 2021; Faust et al., 2015). Unlike DOA, this approach should be able to pinpoint which microbiome taxa exhibit the most and least consistent relationships across hosts. However, measuring microbial correlations from longitudinal, multi-host microbiome time series has its own challenges: time series with adequately dense sampling are rare, and most such data sets exhibit temporal autocorrelation and irregular sampling (Faust et al., 2015). Further, the most common, and still most feasible, way to collect microbiome community data—via high-throughput sequencing—generates noisy count data that usually can only be interpreted in terms of relative (not absolute) abundances (Gloor et al., 2017; Quinn et al., 2017). Finally, correlation cannot be used to infer causality, and in the absence of experiments, we cannot differentiate whether microbial correlation patterns arise from ecological interactions (e.g. competition, predation, facilitation) or shared responses to the environment.

To address several of these challenges, here we combine extensive time series data on the stool-associated microbiota with a multinomial logistic-normal modeling framework (Figure 1; n=5534 samples from 56 baboons; 75–181 samples per baboon across 6–13.3 years, between 2000 and 2013; Alberts and Altmann, 2012; Björk et al., 2022; Grieneisen et al., 2021). This framework uses 16S rRNA sequencing count data to learn a smoothly evolving Gaussian process. The baboons were the subject of long-term research on individually recognized animals by the Amboseli Baboon Research Project in Kenya, which has been studying baboon ecology and behavior in the Amboseli ecosystem since 1971 (Alberts and Altmann, 2012). The baboons range over the same habitat and experience similar diets and sources of microbial colonization, facilitating inference about the consistency of microbial correlations across hosts (Figure 1—figure supplement 1; Björk et al., 2022; Grieneisen et al., 2021). To partly account for environmental drivers of microbial dynamics, our modeling approach controls for variation attributable to seasonal changes in the animals’ diets, proportionality in the count data, and irregularity in sampling to produce per-individual, per-taxon trajectories of log-ratio abundances that we used to estimate pairwise microbial correlations within each host.
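As a rough illustration of the final step described above (log-ratio abundances followed by pairwise correlation), the sketch below applies a plain CLR transform to a toy count table and computes one Pearson's correlation. It is not the multinomial logistic-normal model used in this study: the pseudocount, the toy counts, and the function names `clr` and `pearson` are assumptions made only for illustration.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an illustrative assumption for handling zeros;
    the study instead infers CLR trajectories with a model-based approach.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]

def pearson(x, y):
    """Pearson's correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy count table for one hypothetical host: rows are samples, columns are taxa.
samples = [
    [120, 30, 50],
    [200, 20, 80],
    [90, 60, 40],
    [150, 25, 70],
]
clr_rows = [clr(row) for row in samples]

def taxon_series(j):
    """CLR abundance of taxon j across all samples."""
    return [row[j] for row in clr_rows]

r_01 = pearson(taxon_series(0), taxon_series(1))
```

Because each CLR-transformed sample sums to zero, the transform expresses every taxon relative to the sample's geometric mean, which is why the resulting correlations describe relative rather than absolute abundances.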

We pursued five main objectives. First, we characterized the overall sign and strength of pairwise correlations in bacterial abundance within each host. Second, we tested the degree to which these correlation patterns are systematically consistent across hosts or individualized by host (Figure 1A). Third, we identified phylogenetic and host-related predictors of the direction and universality of bacterial correlations, including phylogenetic relationships between microbes, host age, and host genetic relatedness. Fourth, we tested whether the microbial correlations we observed were driven by shared responses to host diets or seasonality, or by synchronized microbial dynamics across hosts. Fifth, we tested the generalizability of our findings by comparing the patterns of universality in our data set to two microbiome time series from humans (Johnson et al., 2019; Vatanen et al., 2016).

Our predictions for these analyses were influenced by ideas from community and microbial ecology. First, because strong interdependencies can hamper community assembly and destabilize community dynamics (Coyte et al., 2015; Palmer and Foster, 2022; Hu et al., 2022; Coyte et al., 2021), we expected that most microbial correlations would be weak with few strong positive relationships. Second, consistent with studies that used DOA, we expected that microbial relationships would be more consistent across hosts than individualized (see Figure 1A for a visualization of this prediction). This result would suggest that personalized microbiota—their compositions and dynamics—do not arise from host-specific microbiome ecologies (Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018). Third, we expected to observe positive correlations between taxa that are close phylogenetic relatives. This is because related bacteria may have similar functional properties and hence similar ecological relationships with other members of the community. They may also have dynamics that are driven by similar selective forces imposed by the host or host’s environment. Alternatively, competitive exclusion may lead closely related taxa to exhibit neutral or negative relationships. Fourth, because the environments experienced by baboons in Amboseli are far more uniform than those experienced by typical human study subjects (Björk et al., 2022; Grieneisen et al., 2021), we expected that the signature of ‘universality’ in baboons would be stronger than that observed in humans. We discuss the implications of these patterns for individual microbiome community assembly and dynamics, and for understanding how microbiome communities are structured across hosts—a key requirement for successful intervention to improve host health (Bashan et al., 2016; Widder et al., 2016; Cullen et al., 2020).

Results

Most bacterial correlations within individuals are weak and negative

We began by characterizing the overall sign, strength, and significance of pairwise correlations in bacterial abundance within each host. To do so, we applied the approach outlined in Figure 1A to stool-associated time series from 56 baboons (Figure 1B) and calculated Pearson’s correlations between pairs of bacterial taxa. To avoid biases created by zero inflation (see Methods), we restricted our analysis to pairs where each member was present in at least 50% of samples across all hosts and at least 20% of samples within each host (Supplementary file 1a-1c). Further, we required the joint zero abundance of a given bacterial pair to be less than 5% of all samples across hosts. After filtering, the resulting data set included (1) 1878 pairs of centered log-ratio (CLR)-transformed amplicon sequence variants (ASVs; Figure 2A); (2) 57 pairs of bacterial phyla (Figure 2—figure supplement 1); and (3) 473 pairs of taxa agglomerated to the most granular possible family, order, or class (Figure 2—figure supplement 1). To generate an expectation of the strength of bacterial correlations possible by chance, we used a permutation procedure that randomly shuffled the taxonomic identities within each sample of the bacterial count table 10 times for each of the 56 hosts (560 total permutations). We then estimated correlations for these permuted pairs to generate an empirical null distribution of randomly generated taxon-taxon correlations. Observed correlations were judged against this reference (Figure 2B). We also confirmed that the resulting correlation patterns were robust to several modeling choices and were not primarily driven by seasonal shifts in microbiome composition (see results below).
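The permutation procedure described above can be sketched as follows, under simplifying assumptions. Taxon labels are shuffled within each sample, correlations are re-estimated on the shuffled table, and the pooled values form an empirical null. The pseudocount-based CLR, the toy table, and the 2.5%/97.5% quantile cutoffs are illustrative assumptions, not the model or the FDR procedure used in the study.

```python
import math
import random

def clr(counts, pseudocount=0.5):
    # Simplified CLR with a pseudocount (assumption; not the study's model).
    logs = [math.log(c + pseudocount) for c in counts]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def shuffle_within_samples(table, rng):
    """Shuffle taxonomic identities independently within each sample (row)."""
    shuffled = []
    for row in table:
        permuted = list(row)
        rng.shuffle(permuted)
        shuffled.append(permuted)
    return shuffled

def null_correlations(table, n_perm, rng):
    """Pool taxon-taxon correlations over n_perm shuffled copies of the table."""
    null = []
    for _ in range(n_perm):
        clr_rows = [clr(row) for row in shuffle_within_samples(table, rng)]
        columns = list(zip(*clr_rows))
        for i in range(len(columns)):
            for j in range(i + 1, len(columns)):
                null.append(pearson(columns[i], columns[j]))
    return null

# Hypothetical host with 40 samples and 6 taxa; 10 permutations as in the text.
rng = random.Random(0)
toy_table = [[rng.randint(1, 500) for _ in range(6)] for _ in range(40)]
null = sorted(null_correlations(toy_table, n_perm=10, rng=rng))

# Illustrative two-sided cutoffs taken from the empirical null distribution.
lower = null[int(0.025 * len(null))]
upper = null[int(0.975 * len(null)) - 1]
```

Shuffling within samples preserves each sample's total count and its set of observed values while breaking any taxon-specific signal over time, so correlations that fall outside the null range are unlikely to reflect sample-level artifacts alone.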

Figure 2. Bacterial correlation patterns across hosts.

The heat map in panel (A) shows Pearson’s correlation coefficients of centered log-ratio (CLR) abundances between all pairs of amplicon sequence variants (ASVs) (x-axis) in each of the 56 baboon hosts (y-axis). Each pair of ASVs is represented on the x-axis, including all pairwise combinations of 107 ASVs with sufficient co-occurrence, resulting in 1878 ASV-ASV correlations measured per host (105,168 total correlations across all 56 hosts). Columns are ordered by the mean correlation coefficient between ASV-ASV pairs, from negative (blue) to positive (red). (B) Pairwise correlations generated from random permutations of the data. Taxonomic identities were shuffled within samples and pairwise ASV-ASV correlations were estimated to produce a null model of ASV-ASV correlation patterns within and between hosts. Column order is the same as in panel A. Panels (C) and (D) show example trajectories of CLR abundances for two pairs of ASVs in the same five hosts. Panel (C) shows a strongly negatively correlated pair (median r across all hosts = −0.461; two ASVs in order Clostridiales: ASV36 (orange) and ASV59 (blue); Supplementary file 1a and d) and panel (D) shows a strongly positively correlated pair (median r across all hosts = 0.508; two ASVs in genus Bifidobacterium; ASV1 (orange) and ASV4 (blue); Supplementary file 1a and d).

Figure 2—figure supplement 1. Heat maps of correlation of centered log-ratio (CLR) taxon pairs at taxonomic levels higher than amplicon sequence variant (ASV).

Each heat map shows the Pearson’s correlation coefficient of CLR abundances between pairs of taxa that met our filtering criteria (x-axis) in each of the 56 baboons (y-axis). Panel (A) gives 473 pairs of taxa agglomerated to the most granular possible family, order, or class. Panel (B) gives 57 phylum-phylum pairs. See Figure 2A for a heat map of ASV-ASV correlation patterns across hosts.
Figure 2—figure supplement 2. Heat map of correlation of amplicon sequence variant (ASV)-level centered log-ratio (CLR) taxon pairs where pairs not exceeding confidence cutoffs have been omitted.

This heat map is identical to the one shown in Figure 2A in the main text, but black cells represent host-level ASV pairs whose estimated CLR correlation did not exceed the confidence cutoffs described in Figure 2—figure supplement 3, that is a CLR ASV-ASV correlation of less than –0.342 or greater than 0.328. The orientation of rows and columns is the same as in Figure 2A.
Figure 2—figure supplement 3. Confidence cutoffs for correlations and universality scores.

The three plots in panel (A) show density distributions of the observed correlations between taxa (blue distributions) compared to random expectation (gray distributions) for three taxonomic partitions of the data: amplicon sequence variants (ASVs), family/order/class-level assignments, and phyla. The random distributions were generated by permuting the data 10 times per-host, shuffling taxonomic identities within individual microbiome samples and re-calculating taxon-taxon correlations in these permuted data sets. Mean correlations for the observed distributions are shown as black points below each distribution. The observed correlations for ASVs range from –0.767 to 0.904 (mean = –0.009; median = –0.010); for family/order/class designations range from –0.719 to 0.803 (mean = –0.028; median = –0.029); and for phyla from –0.645 to 0.687 (mean = –0.088; median = –0.092). Panel (B) shows density distributions of the observed universality scores in blue compared to random expectations in gray, for pairs of taxa at the ASVs, family/order/class-level, and phyla. Small, dashed black lines underneath each density indicate the mean observed universality score at that taxonomic level.
Figure 2—figure supplement 4. Our universality score identifies the bacterial pairs that exhibit the most consistent relationships across hosts.

The score multiplies the absolute value of the pair’s median correlation coefficient across hosts by its correlation consistency across hosts (i.e. the proportion of hosts in which the correlation has the majority sign, either positive or negative). The resulting scores range from 0 to 1, where a score of 1 equates to perfect ‘universality’ (i.e. all hosts have a correlation coefficient of 1 or all hosts have a correlation coefficient of –1). The second panel illustrates the calculation of the score for four hypothetical taxon pairs across five hosts.
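A minimal sketch of this score for a single taxon pair is given below. It takes one correlation coefficient per host and returns the absolute median correlation multiplied by the proportion of hosts sharing the majority sign. The function names are hypothetical, and two choices are assumptions: the absolute value is taken after the median, and correlations of exactly zero count toward neither sign.

```python
from statistics import median

def sign_consistency(correlations):
    """Proportion of hosts whose correlation has the majority sign."""
    n_pos = sum(1 for r in correlations if r > 0)
    n_neg = sum(1 for r in correlations if r < 0)
    return max(n_pos, n_neg) / len(correlations)

def universality_score(correlations):
    """Absolute median correlation across hosts times sign consistency."""
    return abs(median(correlations)) * sign_consistency(correlations)

# Hypothetical per-host correlations for three taxon pairs across five hosts.
perfect_positive = [1.0, 1.0, 1.0, 1.0, 1.0]
perfect_negative = [-1.0, -1.0, -1.0, -1.0, -1.0]
mixed_signs = [0.5, 0.5, 0.5, -0.5, -0.5]

scores = [universality_score(p) for p in (perfect_positive, perfect_negative, mixed_signs)]
```

In this toy example the two perfectly consistent pairs score 1, while the mixed-sign pair combines a median of 0.5 with a consistency of 3/5 for a score of 0.3.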

Consistent with the expectation that most bacterial correlations in the gut microbiome are weak (Coyte et al., 2015; Coyte et al., 2021), only 19% of ASV-ASV correlations in the heat map in Figure 2A were stronger than expected by chance (FDR≤0.05; Figure 2—figure supplement 3; 19% of phylum-phylum; 22% of family/order/class correlations; Figure 2—figure supplement 1; Figure 2—figure supplement 2). The strongest negatively correlated pair in Figure 2A included an ASV in the family Kiritimatiellae and another in family Lachnospiraceae, which had a median correlation of –0.520 (±0.132 s.d.) across all baboon hosts (Figure 2C; ASV19 and ASV23; Supplementary file 1a and d). The strongest positively correlated pair of ASVs included two members of the genus Prevotella that had a median correlation of 0.801 (±0.053 s.d.) across all baboons (Figure 2D; ASV2 and ASV3; Supplementary file 1a and d). While these two ASVs were assigned to the same genus, their V4 16S DNA sequence identity was 97.6%, indicating they are probably not simply duplicate 16S gene copies encoded in the genome of a single species (Vetrovský and Baldrian, 2013; Supplementary file 1d).

In support of the idea that strong, positive bacterial interdependencies are rare (Coyte et al., 2015; Palmer and Foster, 2022; Coyte et al., 2021), only 3.8% of ASV pairs were significantly positively correlated, and the overall bacterial correlation patterns were slightly skewed toward negative relationships. For instance, at the ASV level, the median correlation coefficient in Figure 2A was –0.072, and 60% of these correlations were negative (binomial test p<0.0001). For family/order/class-level taxa, 58% of all correlations were negative (Figure 2—figure supplements 1A and 3A; median family/order/class-level correlation = −0.049; binomial test p<0.0001). Correlations between phyla exhibited the strongest negative skew, with 64% of phyla-phyla correlations having a negative sign (Figure 2—figure supplements 1B and 3A; median phyla-level correlation = −0.100; binomial test p<0.0001). This bias toward negative relationships is consistent with the expectation that neutral or negative relationships between ASVs are more common than mutualisms (Coyte et al., 2015; Palmer and Foster, 2022; Coyte et al., 2021) and that more distantly related taxa (e.g. phyla) respond to distinct environmental drivers due to differences in metabolic requirements and lifestyles.

Within-host bacterial correlation patterns are largely consistent across baboons

Next, we tested the degree to which within-host ASV-ASV correlations were consistent across hosts. We began by plotting the absolute value of each ASV pair’s median Pearson’s correlation coefficient as a function of the consistency of their correlation sign (positive or negative) across the 56 hosts (Figure 3A and B). These plots provide two main insights into the consistency of bacterial associations. First, in support of the idea that ASVs do not exhibit vastly different correlative relationships in different hosts, no taxon pairs were strongly and inconsistently correlated across hosts (Figure 3A and B; Figure 3—figure supplement 1A). Instead, the ASV pairs that had inconsistent correlation signs across hosts always had weak and often non-significant median absolute correlation coefficients within hosts (Figure 3A and B).

Figure 3. None of the amplicon sequence variant (ASV) pairs were strongly and inconsistently correlated across hosts, and the strongest and most consistently correlated ASVs are typically positively correlated.

Plots in (A) and (B) show the median correlation strength for each ASV-ASV pair across all 56 hosts as a function of the consistency in direction of that pair’s correlation across hosts, measured as the proportion of hosts that shared the majority correlation sign (positive or negative; ASV pairs that were positively correlated in half of the 56 hosts have a consistency of 0.5; ASV pairs that were positively [or negatively] correlated in all hosts have a consistency of 1.0). Panel (A) presents this relationship for consensus positively correlated features and panel (B) shows consensus negatively correlated features. The Spearman’s correlation between median correlation strength and the proportion of shared sign for all correlated features is 0.844 (p<0.0001). Multiplying the two axes in either panel (A) or (B) creates a ‘universality score’ (Figure 2—figure supplement 4), whose distribution is shown in panel (C). This score reflects the strength and consistency of pairwise microbial correlations across hosts and ranges from 0 to 1, where a score of 1 indicates ASV-ASV pairs with perfect correlations of the same sign in all hosts. A vertical line indicates the minimum significant universality score. (D) Correlation networks for the top 5% most strongly and consistently correlated ASV pairs across hosts (i.e. the top 5% highest universality scores; pairs with rank 1–93 in Supplementary file 1d). Network edges are colored by the consensus sign of the correlation between that pair (black for pairs where most hosts had a positive correlation; gray for pairs where most hosts had a negative correlation). Node labels indicate the ASV identity in Supplementary file 1a and colors represent bacterial families. (E) Significantly enriched bacterial families in the network in panel D (Fisher’s exact test p<0.01 for all, FDR≤0.05; see Supplementary file 1e for enrichment statistics for all families).

Figure 3—figure supplement 1. Universality at taxonomic levels higher than amplicon sequence variant (ASV).

The plot in (A) shows the median absolute value of each family/order/class pair’s correlation coefficient across hosts as a function of correlation consistency, measured as the proportion of hosts that shared the majority correlation sign (positive or negative; taxon pairs that were positively correlated in half of the 56 hosts have a consistency of 0.5; pairs that were positively [or negatively] correlated in all hosts have a consistency of 1.0). Panel (B) plots the same estimates for phylum-phylum pairs. Multiplying the two axes in panels (A) and (B) creates a universality score reflecting the strength and consistency of pairwise microbial correlations across hosts. Universality score distributions for family/order/class pairs are shown in panel (C) and scores for phylum pairs are shown in panel (D).
Figure 3—figure supplement 2. Quantifying the relative strength of universal versus individualized dynamics.


Panel (A) shows observed (y), population mean (m), and host residual dynamics (e = y – m) for a single hypothetical host with three amplicon sequence variants (ASVs). The fourth matrix shows a simulated case where host-level and population-level mean dynamics are added in equal proportions (0.5m + 0.5e) to approximate y. We estimate the proportion of population- versus host-level signal as the combination of mean and residual which best approximates y. Panel (B) shows the results of this procedure applied to the Amboseli data, giving a per-host estimate of population-level contribution to the observed centered log-ratio (CLR) ASV-ASV correlations.

Second, the pairs with the most consistent sign agreement across hosts also exhibited the largest median absolute correlation coefficients across hosts (Figure 3A and B; Spearman’s r=0.844, p<0.0001). Hence, pairs of ASVs that have the strongest relationships, and are therefore likely to play the most important roles in structuring gut microbiome dynamics, also tend to have the most consistent relationships in different hosts. Indeed, for the sets of positively or negatively correlated ASV pairs that showed universal agreement in the sign of their correlation across all hosts (i.e. where x=1 in Figure 3A and B), the median absolute correlation coefficient is 0.359, compared to 0.116 for those with no sign consistency (x=0.5 in Figure 3A and B). Note that the correlation for a given pair of ASVs was only weakly predicted by bacterial abundance (r=0.129 and r=0.223 for the more and less abundant partner in a pair, respectively; p<0.0001 both). While this effect was statistically significant, it explained only 6% of the variance in median correlation.

Visual inspection of the patterns in Figures 2A, 3A and B indicates that ASV-ASV correlations are largely consistent across baboons, as opposed to individualized to each baboon. To explicitly quantify the relative strength of shared versus individualized signatures in the heat map in Figure 2A, we calculated the population mean pattern for the ASV-ASV correlation matrix, m. For each host, we then estimated the residual difference, e, between that individual’s observed ASV-ASV correlation matrix, y, and the population mean matrix: y – m (see Figure 3—figure supplement 2A for a cartoon example). We reasoned that the observed correlation matrix for each host can be approximated by a mixture of contributions from the population mean matrix m and the host-specific residual matrix e. To identify the optimal mixture for each host (i.e. the mixture of consistent vs. individualized correlation patterns that best explained the observed data), we titrated the contribution (i.e. weight) of e from 0% to 100% (and correspondingly, the contribution of m from 100% to 0%) and identified the value that minimized the Frobenius distance between the simulated combination and the observed correlation matrix, y.
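The titration described above can be illustrated with a minimal sketch. It assumes each host's correlation matrix is a square list of rows and that the weight on e is searched on an evenly spaced grid; the function names, the grid resolution (`steps`), and the data layout are illustrative assumptions rather than the implementation used in this study.

```python
import math


def frobenius_distance(a, b):
    """Frobenius distance between two equally sized matrices (lists of rows)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def population_mean(host_matrices):
    """Element-wise mean of the per-host correlation matrices (the matrix m)."""
    n_hosts = len(host_matrices)
    size = len(host_matrices[0])
    return [[sum(h[i][j] for h in host_matrices) / n_hosts for j in range(size)]
            for i in range(size)]


def optimal_host_weight(y, m, steps=100):
    """Grid-search the weight w on the residual e = y - m that minimizes the
    Frobenius distance between the mixture (1 - w) * m + w * e and y.
    The grid resolution is an assumption made for this sketch."""
    size = len(y)
    e = [[y[i][j] - m[i][j] for j in range(size)] for i in range(size)]
    best_w, best_d = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        mix = [[(1 - w) * m[i][j] + w * e[i][j] for j in range(size)]
               for i in range(size)]
        d = frobenius_distance(mix, y)
        if d < best_d:
            best_w, best_d = w, d
    return best_w
```

Under these assumptions, `optimal_host_weight(y, m)` returns the host-level weight for one host, and one minus that value is the population-level weight.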

In support of prior observations of ‘universality’ (Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018), we found that, across hosts, the optimal mixture involved contributions from the shared correlation structure (i.e. m) of between 60% and 80% (median 70%) and a host-level contribution (i.e. from e) of between 40% and 20% (median 30%). Hence, population-level signatures contributed roughly twice the weight of host-level signatures (a median population:host ratio of 2.33:1; Figure 3—figure supplement 2B). As a result, ASV-level relationships tend to be more consistent across hosts than host-specific.

The most consistent ASV-level correlations are between phylogenetically related taxa

One advantage of our approach, compared to DOA (Bashan et al., 2016), is that we can identify the bacterial pairs that exhibit the most consistent relationships across hosts. Hence, we next conducted several analyses to understand why some taxon pairs are more consistent than others. To do so, we created a ‘universality’ score (Figure 2—figure supplement 4) that could be calculated for each ASV pair. The score multiplies the pair’s median absolute correlation coefficient across hosts (y-axis of Figure 3A and B) with its correlation consistency across hosts (i.e. proportion of shared sign; x-axis of Figure 3A and B). The resulting scores range from 0 to 1, where a score of 1 equates to perfect ‘universality’ (i.e. all hosts have a correlation coefficient of 1 or all hosts have a correlation coefficient of –1). Applying this score to all pairs of ASVs reveals a right-skewed distribution, reflecting the fact that most bacterial correlations are weak, with inconsistent sign directions across hosts (Figure 3C; Figure 2—figure supplement 3B). However, 48% of these scores were higher than expected by chance (permutation test; FDR≤0.05; Figure 3C; Figure 2—figure supplement 3B), reflecting a signal of universality in our data. Despite the bias toward negative ASV-ASV correlations in the overall set of bacterial correlations (Figure 2—figure supplement 3A), we observed no such bias in the most universal pairs. For instance, in the top 5% most universal ASV pairs (n=94 pairs), 46 pairs exhibited net positive correlations and 48 pairs, net negative correlations, suggesting no particular bias in the direction of the strongest and most consistent associations.
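As a worked illustration of the score defined above, the following minimal sketch computes it for one taxon pair from that pair's per-host correlation coefficients. The function name and the handling of correlations that are exactly zero (counted toward neither sign) are assumptions made for illustration; the permutation test used to assess significance is not shown.

```python
from statistics import median


def universality_score(correlations):
    """Universality score for one taxon pair.

    consistency: proportion of hosts sharing the majority correlation sign
    strength:    median absolute correlation coefficient across hosts
    Exact zeros count toward neither sign (an assumption of this sketch).
    """
    n_pos = sum(1 for r in correlations if r > 0)
    n_neg = sum(1 for r in correlations if r < 0)
    consistency = max(n_pos, n_neg) / len(correlations)
    strength = median(abs(r) for r in correlations)
    return consistency * strength
```

For example, a pair with a correlation of 0.5 in half of the hosts and –0.5 in the other half has a consistency of 0.5 and a median strength of 0.5, giving a score of 0.25.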

To visualize these highly consistent correlations, we plotted bacterial co-abundance networks connecting the top 5% most universal ASV pairs (Figure 3D). A handful of ASVs were highly connected within this network, especially ASV1 (genus Bifidobacterium; Supplementary file 1a and d), which was connected to 14 other ASVs. Three other ASVs were connected to at least 10 other ASVs, including members of families Atopobiaceae (ASV8), Eggerthellaceae (ASV28), and Muribaculaceae (ASV86) (Supplementary file 1e). These families were also enriched in this network, relative to the rest of the data, as was the family Bifidobacteriaceae (Figure 3E). Pairings between members of the same family were enriched by >3-fold in this network (p<0.0001), making up 32% of pairs in the most universal set versus only 9.8% of pairs outside that set. Almost two-thirds of these were Prevotellaceae-Prevotellaceae pairs (10 of 16 same-family pairs).

We next asked: does the phylogenetic distance between a pair predict the nature of their relationship? In support of the idea that homology leads closely related ASVs to respond similarly to the environment, or perhaps facilitate each other’s growth (Meehan and Beiko, 2014; Vacca et al., 2020), we found that, for positively associated ASV pairs, closely related taxa had higher universality scores than more distantly related taxa (Pearson’s r for positively correlated pairs = −0.232; p<0.0001; Figure 4A). In contrast, when ASV pairs were negatively correlated, there was a weak positive relationship between phylogenetic distance and universality (Pearson’s r=0.106; p=0.004; Figure 4B). In other words, the strongest and most consistently negatively correlated taxa tend to be distantly related, whereas the strongest and most consistently positively correlated taxa were often closely related, especially members of the family Atopobiaceae (Supplementary file 1f).

Figure 4. The most consistent amplicon sequence variant (ASV)-level correlations are positive and often between close evolutionary relatives.

Pairwise universality scores are plotted as a function of phylogenetic distance between the ASV-ASV pair for consensus positively correlated pairs in red (A) and negatively correlated pairs in blue (B). Phylogenetic distance (x-axis) is binned into 0.1 increments; each point represents a given ASV pair, and box plots represent the median and interquartile ranges for a given interval of phylogenetic distance. Phylogenetic distance is negatively correlated with universality score in positive pairs (Pearson’s correlation for positively associated ASV pairs = −0.232, p-value <0.0001), and positively correlated with universality score in negatively associated pairs (Pearson’s correlation for negatively associated ASV pairs = 0.106, p=0.004). Full estimates are given in Supplementary file 1f.


Figure 4—figure supplement 1. Estimates of centered log-ratio (CLR) ASV-ASV correlation are similar after removing seasonal trend.


Panel (A) shows sample-sample autocorrelation estimates for the CLR-transformed data from 56 baboon hosts. Panel (B) shows sample-sample autocorrelation for the same data after the removal of a seasonal trend by arima() in R. Estimates of CLR ASV-ASV correlation derived from the original (basset) model and this per-ASV linear autoregressive model are highly similar, as shown in panel (C) where r=0.982. ASV, amplicon sequence variant.
Figure 4—figure supplement 2. ASV-ASV pairs with at least one ‘seasonally varying’ partner have slightly higher median correlations across hosts.


‘Seasonally varying’ ASVs were those identified as having seasonally differential centered log-ratio (CLR) abundance in Björk et al., 2022. ASV-ASV pairs with at least one ‘seasonal’ member had significantly higher median correlation across hosts than did pairs where neither partner was seasonal (difference of 0.018, Wilcoxon rank sum test p = 0.000326). ASV, amplicon sequence variant.
Figure 4—figure supplement 3. Measuring synchronized dynamics in the same amplicon sequence variant (ASV) across many hosts.


The cartoon in panel (A) shows the sample selection procedure. We selected pairs of samples from different hosts collected within 24 hr of each other, producing 20,000 ‘same-day’ pairs (e.g. purple and mustard-colored pairs on days 2, 3, 6, and 9). We subset these pairs to one same-day pair per host pair, producing 1540 same-day pairs from these thinned host series and extracted the correlated ASV abundance from these pairs. Synchrony for a given taxon is estimated as the correlation of the centered log-ratio abundance of that taxon across Series 1 and 2. Panel (B) shows the correlation between log-ratio abundance estimates in paired host samples for the most synchronous taxon, ASV #21 (Clostridiaceae 1; r = 0.480). Panel (C) shows the time-aligned model-estimated centered log-ratio abundances of ASV #23 in five well-sampled hosts that each lived in a different baboon social group (hosts from top to bottom are F09, F31, F35, F27, F20). Panel (D) shows the distribution of observed synchrony estimates for all 125 ASVs (light blue) compared to synchrony estimates from a permuted/null distribution (gray); 118 of 125 ASVs had significantly higher synchrony than expected (FDR≤ 0.05; permutation scheme).
Figure 4—figure supplement 4. Seasonality is not associated with synchrony.


Synchrony estimates for all amplicon sequence variants (ASVs) are given as bar plots. ASVs that were identified as having seasonally differential centered log-ratio (CLR) abundance in Björk et al., 2022, are labeled by the season in which they were more abundant: dry or wet. Seasonally variable labels do not predict synchrony (ANOVA, p=0.358).
Figure 4—figure supplement 5. Synchrony weakly predicts universality.


Mean ASV-ASV pair synchrony estimates are plotted against universality scores for all 1878 high-confidence ASV pairs. Synchrony weakly predicts universality (r=0.116, p<0.0001). ASV, amplicon sequence variant.

Genetic relatives, hosts with similar microbiome compositions, and age mates have more similar bacterial correlation patterns

We next asked whether host attributes, including host sex, social group membership, genetic relatedness, age, or baseline gut microbiome composition, predict host differences in patterns of bacterial correlation. Consistent with studies that use DOA to infer universality (Bashan et al., 2016; Gao et al., 2020; Vila et al., 2020; San-Juan-Vergara et al., 2018; Kalyuzhny et al., 2017), the strongest predictor of distance in bacterial correlation patterns was distance in terms of baseline microbiome composition (a core assumption of DOA). Indeed, a Mantel test correlating compositional distance of average microbial profiles (as Aitchison distances between the per-host mean of CLR-transformed samples) with distance in microbial correlation patterns between hosts revealed that 34% of the variation in correlation patterns was explained by baseline microbiome community composition (Mantel: r2=0.336; p=0.001; Figure 5A; Supplementary file 1g).
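The comparison described above can be sketched as follows, assuming strictly positive abundances (e.g. after adding a pseudocount) and host-by-host distance matrices stored as nested lists. The function names, the number of permutations, and the one-sided p-value are illustrative assumptions, not the exact procedure used in this study. The sketch returns the Mantel r, which would be squared to express variance explained.

```python
import math
import random


def clr(composition):
    """Centered log-ratio transform of strictly positive abundances."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def host_mean_clr(samples):
    """Per-host mean of CLR-transformed samples (one abundance list per sample)."""
    transformed = [clr(s) for s in samples]
    return [sum(col) / len(col) for col in zip(*transformed)]


def aitchison_distance(samples_a, samples_b):
    """Euclidean distance between two hosts' mean CLR profiles."""
    return math.dist(host_mean_clr(samples_a), host_mean_clr(samples_b))


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(d, order):
    """Upper-triangle entries of distance matrix d with hosts taken in `order`."""
    n = len(order)
    return [d[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1, d2, n_perm=999, seed=0):
    """Mantel correlation between two distance matrices, with a one-sided
    permutation p-value obtained by shuffling the host labels of d2."""
    rng = random.Random(seed)
    hosts = list(range(len(d1)))
    x = upper_triangle(d1, hosts)
    r_obs = pearson(x, upper_triangle(d2, hosts))
    hits = 0
    for _ in range(n_perm):
        perm = hosts[:]
        rng.shuffle(perm)
        if pearson(x, upper_triangle(d2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

Here `d1` would hold Aitchison distances between hosts and `d2` the Frobenius distances between their correlation matrices.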

Figure 5. Baboons with more similar bacterial correlation patterns are more likely to have more similar baseline microbiome compositions and are slightly more likely to be genetic relatives.

In panel (A) each point is a pair of hosts; the y-axis shows the similarity of these hosts’ bacterial correlation patterns (via Frobenius distance) as a function of their microbiome compositional similarity (via Aitchison distance; Mantel: r2=0.336; p=0.001). Colors show samples from pairs of baboons living in the same social group and gray dots are pairs of animals living in different social groups. There is no detectable effect of social group on correlation pattern similarity. Panel (B) shows the same distances as a function of host genetic dissimilarity (1 – the coefficient of genetic relatedness between hosts; r2=0.009; p-value Mantel test 0.002). Colors reflect pairs of hosts living in the same social group, as in panel A.


Figure 5—figure supplement 1. Within- and between-age bin ASV-ASV correlations are similar.


For a subset of hosts with sufficient sampling across age ‘bins’ (at least 35 samples within two of the following age ranges: juvenile = 0–6 years; prime age adult = 6–13 years; older adult = 13+ years), ASV-ASV correlation is visualized as a heat map. Panel (A) shows ‘juvenile’ ASV-ASV correlations for hosts F32, F06, F27, F36, F10, F01, F07, F35, M09, F05, M10, F22, F26 (top row to bottom) and (B) the distribution of those correlations. Panels (C) and (D) give ‘prime age’ correlations for the same hosts. Panel (E) shows prime age ASV-ASV correlations for the hosts F28, F17, F16, F31, M03, M08, F04, F12, F03, F33, M02, F14, F08 (top row to bottom) and (F) the distribution of those correlations. Panels (G) and (H) give ‘older adult’ correlations for the same hosts. ASV, amplicon sequence variant.
Figure 5—figure supplement 2. Quantifying the relative strength of universal versus individualized dynamics across age ranges.


Histograms show estimates of the population-level contribution to observed centered log-ratio (CLR) ASV-ASV correlations in each 13-host age bin. (Hosts are matched between age bins where noted.) Variation in the population-level contribution to observed CLR ASV-ASV correlations is slightly, but not significantly, larger in the prime age and older adult bins than in the juvenile bin. Compare these estimates to those obtained for overall hosts series in Figure 3—figure supplement 2. ASV, amplicon sequence variant.
Figure 5—figure supplement 3. PCA plot of ASV-ASV correlations within age bins for a select set of hosts.


For a subset of hosts with sufficient sampling across age ‘bins’ (at least 35 samples within two of the following age ranges: juvenile = 0–6 years; prime age adult = 6–13 years; older adult = 13+ years), ASV-ASV correlation is visualized along principal components (PCs). Circles indicate the set of correlation estimates for a particular host within a particular age bin. Circles in panel (A) refer to ‘juvenile’ and ‘prime age’ hosts in orange and green; panel (B) refers to ‘prime age’ and older adults in light and dark green. Estimates across age bins (e.g. juvenile vs. prime age) for the same host are linked by a gray line. Diamonds indicate the centroid for each age bin. For the juvenile vs. prime age comparison in panel A, the ASV pair contributing most strongly to variation along PC1 was a genus Libanicoccus-genus Bifidobacterium pair (ASV16-ASV58). The pair contributing most strongly to placement along PC2 was a genus Lactobacillus-order Clostridiales pair (ASV22-ASV36). In the prime age adult vs. older adult embedding in panel B, the ASV pair which contributed the most to variation along PC1 was a genus Prevotella 9-unknown bacterium pair (ASV2-ASV40). The pair with the greatest contribution to PC2 was another taxonomically unresolved bacterium (ASV41) in combination with another member of family Prevotellaceae (ASV64). ASV pairs with the largest loadings in PCs 1 and 2 for both age ranges are given in Table S9. ASV, amplicon sequence variant.

Consistent with prior research in our population, which finds widespread heritability for gut microbiome taxon abundances (Grieneisen et al., 2021), we also found a weak but significant relationship between host genetic distance and the distance in microbial correlation patterns between hosts, after controlling for similarity of baseline composition across hosts. Hosts who were more closely related based on a multigenerational pedigree have slightly more similar ASV-level correlation matrices (Figure 5B; Supplementary file 1g; r2=0.009; partial Mantel controlling for baseline similarity: p-value = 0.002). We found no evidence that members of the same social group or sex exhibit more similar microbial correlation patterns (social group: F=2.146; p=0.089; sex: F=2.026; p=0.160; Supplementary file 1g).

Host age may also predict the overall strength of microbial relationships, and some studies find that gut microbial compositions become more individualized with age (Risely et al., 2022; Wilmanski et al., 2021). This observation suggests that host age may also be linked to individualized microbial relationships. To test these ideas, we divided our hosts into three classes: juveniles (0–6 years), prime age adults (6–13 years), and older adults (13+ years) and compared ASV correlation patterns between (1) the juvenile and prime age class and (2) between prime age and older adults. Hosts were only included in these analyses if they had >35 samples in either the juvenile and prime age class or prime age and older adult class (no host had >35 samples in all three age classes; 13 hosts were included in the juvenile and matched prime age adult groups; a separate set of 13 hosts were included in the older adult and matched prime age adult groups).

We found no evidence that microbial correlations get stronger or weaker with age (Figure 5—figure supplement 1). Further, using the same methods described above to estimate the relative host- and population-level contributions to ASV correlation patterns, we found no strong differences in the degree of ‘personalized’ correlation patterns across age groups (Figure 5—figure supplement 2). However, ASV correlation patterns were slightly more similar within versus between adjacent age classes (juvenile vs. prime age, ANOVA p=0.00175, 1.3% variance explained; prime age vs. older adult, ANOVA p=0.0112, 0.9% variance explained). Principal components analysis on the microbial correlation patterns between juvenile and prime age hosts, and between prime age and older adult hosts, revealed some age effects, particularly on the second principal component (Figure 5—figure supplement 3). See the legend of Figure 5—figure supplement 3 and Supplementary file 1i for information on which ASV pairs differed across age categories.

Universality in Amboseli is not well explained by microbes’ shared responses to diet, season, or synchronized dynamics

Without experiments, we cannot disentangle whether our observed bacterial correlations are due to ecological interactions between bacterial species or to shared responses to environmental gradients, either inside or outside the host. While we were unable to control for many aspects of the host environment (e.g. gut pH, hormones, or immune profiles), we were able to include measures of dietary variation in our models of microbial abundances. Seasonal changes in host diet are therefore unlikely to account for universality in microbial relationships across hosts.

To account for additional unexplained seasonal variation, we next removed the oscillating seasonal trend from the log-ratio abundances for each ASV (modeled as a sine wave) and re-estimated the ASV-ASV correlation matrix (Figure 4—figure supplement 1). Removing the seasonal trend had little effect on ASV-ASV correlations, as the variance explained by seasonal oscillation was small for all ASVs (median 1.1%, minimum = 0%, maximum = 6%). Consequently, the between-ASV correlation estimates were almost identical to those derived from our original model (Pearson’s r=0.982, p<0.0001; Figure 4—figure supplement 1C). Further, ASV pairs where one or more members were from a family that showed strong seasonal changes in a prior analysis of these data (Björk et al., 2022), henceforth ‘seasonal’ families, had only slightly higher universality scores than taxon pairs where neither partner showed strong seasonal changes in abundance (difference of 0.018; p<0.001; Figure 4—figure supplement 2).

Because the high level of universality we observed was not well explained by season, we also tested whether universality was explained by synchronized dynamics. We reasoned that if one member of an ASV pair shows highly synchronized dynamics across different hosts, and the other member is also strongly synchronized across hosts, then universality could be an inevitable outcome of strong, but independent synchrony in both members of the pair. We quantified synchrony as the degree to which the observed dynamics of a single, focal ASV are consistent across hosts, such that high synchrony (near 1) implies that the timing and direction of shifts in log-ratio ASV abundance are identical across hosts in the population (see Methods; Figure 4—figure supplement 3). Estimates of synchrony ranged from 0.033 to 0.474 (median=0.187). Interestingly, ASVs in the 13 ‘seasonal’ families are not more likely to have high synchrony than other families (ANOVA, p=0.434; Figure 4—figure supplement 4; Supplementary file 1h). The average synchrony of an ASV-ASV pair had a statistically significant but weak relationship with that pair’s universality score (r=0.116, p<0.0001; Figure 4—figure supplement 5).
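The synchrony estimate for a single ASV can be illustrated with a minimal sketch, following the sample selection procedure outlined in Figure 4—figure supplement 3. It assumes each host's series is a list of (day, CLR abundance) tuples and that the first qualifying sample pair is retained for each host pair; the selection rule, the function names, and the data layout are assumptions made for illustration, and the permutation-based null distribution is not shown.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def synchrony(series_by_host, max_gap_days=1.0):
    """Synchrony of one taxon across hosts.

    series_by_host maps host ID -> [(day, clr_abundance), ...].
    For every pair of hosts, keep one pair of samples collected within
    max_gap_days of each other (here the first one found, an assumption),
    then correlate the two resulting abundance series.
    """
    first, second = [], []
    for host_a, host_b in combinations(sorted(series_by_host), 2):
        match = next(((va, vb)
                      for day_a, va in series_by_host[host_a]
                      for day_b, vb in series_by_host[host_b]
                      if abs(day_a - day_b) <= max_gap_days), None)
        if match is not None:
            first.append(match[0])
            second.append(match[1])
    return pearson(first, second)
```

Values near 1 indicate that the taxon rises and falls together in samples taken from different hosts at about the same time.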

Baboon microbiomes are not substantially more ‘universal’ than human microbiomes

Finally, to investigate parallels between baboon and human microbial communities, we turned to two publicly available gut microbial time series data sets: daily samples from 34 human adults over a 17-day span (483 total samples; hereafter Johnson et al., 2019), and the DIABIMMUNE cohort that consists of 285 samples, collected monthly over 3 years, from 15 infants and toddlers living in Russian Karelia (Vatanen et al., 2016; at the time of writing, these cohorts were the only publicly available data sets we could find that included large numbers of repeated samples from the same subjects). Because baboons in Amboseli experience less heterogeneity in their environments and diets than humans (Björk et al., 2022; Grieneisen et al., 2021), we expected they would exhibit greater consistency in microbial correlations than either human cohort. Here, we compared each host cohort’s universality at the level of correlations between families, orders, and classes because these taxonomic levels offered the greatest comparative power (10.1% of families/orders/classes overlap between the cohorts compared to just 3.1% of genera and no ASVs).

Contrary to our expectations, we find comparable evidence of universality in baboons and the DIABIMMUNE infant/toddler cohort, but weaker evidence for universality in Johnson et al. (Figure 6A–D). Bacterial families in the DIABIMMUNE cohort yielded universality scores slightly higher than those observed in Amboseli (25th percentile=0.142, median=0.216, 75th percentile=0.321 for DIABIMMUNE; 25th percentile=0.088, median=0.150, 75th percentile=0.234 for Amboseli), driven by correlations between families that were stronger on average than those estimated for baboons (median DIABIMMUNE family-family correlation strength=0.253; median Amboseli family-family correlation strength=0.170). The high level of consistency between both human infants/toddlers and wild baboons is surprising and may be due to the similar sampling intervals for these cohorts. Both cohorts were sampled approximately monthly, while Johnson et al.’s subjects were sampled daily (Coyte et al., 2021; Guittar et al., 2019). Median correlation strengths and universality scores for the Johnson et al., 2019, cohort were substantially lower (median correlation=0.090; 25th percentile universality=0.050, median=0.086, 75th percentile=0.113) than the DIABIMMUNE cohort or the baboons.

Figure 6. Patterns of universality in baboons are recapitulated in the DIABIMMUNE study.


Following Figure 2A, panels (A), (B), and (C) show the Pearson’s correlation coefficients of centered log-ratio (CLR) abundances between all pairs of families (x-axis) in three time series data sets: (A) the Amboseli baboons, (B) the DIABIMMUNE cohort, consisting of 15 infants/toddlers sampled monthly over 3 years in Russian Karelia (Vatanen et al., 2016), and (C) the diet study of Johnson et al., 2019, including 34 adults sampled daily over 17 days. Following Figure 3A and B, panel (D) shows the median correlation strength of each family pair’s correlation coefficient across hosts as a function of the consistency in direction of that pair’s correlation across hosts (i.e. the proportion of hosts that shared the majority correlation sign, positive or negative). Median correlation strength is low overall in Johnson et al. (median = 0.090), whereas the Amboseli baboon and DIABIMMUNE infant/toddler cohorts show similar relationships between median correlation strength and the proportion shared correlation sign across hosts (Spearman’s r in Amboseli = 0.844; Spearman’s r in DIABIMMUNE = 0.690). (E) Universality scores for overlapping family pairs from the infant/toddler subjects of the DIABIMMUNE study and baboons in the Amboseli study are significantly correlated (r=0.562, p=0.00161). Black outlined points are family pairs that overlapped between the Amboseli baboon and DIABIMMUNE infant/toddler data sets; gray outlined points are family pairs that overlapped between the Amboseli and Johnson et al. data sets. Color represents the taxonomic identities of the family pairs.

Despite considerable differences in the hosts, time scales, and designs of these studies, all three data sets exhibited a positive correlation between correlation strength and sign consistency for family pairs (Figure 6D). This trend was strongest in the Amboseli baboons (exponent of power regression b=1.72; p<0.0001); weaker in the DIABIMMUNE cohort (b=1.51; p<0.0001) and weakest in Johnson et al., 2019 (b=1.19; p<0.0001). Further, the most universal family-family associations skewed positive in both the baboons and the infant data set. All of the top 5% most universal family pairs (30 of 30 pairs) are positively associated in the DIABIMMUNE cohort, compared to 70% (23 of 33 pairs) in the Amboseli baboons.
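The power regression exponent b reported above can be illustrated with a minimal sketch that fits y = a·x^b by ordinary least squares on log-transformed values. Both the fitting approach and the assignment of sign consistency to x and median correlation strength to y are assumptions made for illustration; the source does not specify how the regression was fit.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on log(x) and log(y).

    Requires strictly positive x and y. Returns (a, b), where b is the
    exponent of the power regression.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = math.exp(my - b * mx)
    return a, b
```

Under this parameterization, a larger exponent means that correlation strength rises more steeply as sign consistency approaches 1.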

Finally, we examined the relationship between universality scores for family pairs that overlapped between Amboseli and DIABIMMUNE (n=29 pairs), and between Amboseli and Johnson et al., 2019 (Figure 6E; n=21 pairs; only 10 family pairs overlapped between all three data sets). For these overlapping pairs, scores in the Amboseli data predicted scores for the same family-family pair in the DIABIMMUNE data set (r=0.562, p=0.001). The association between scores in the Amboseli data and the Johnson et al. data was negative, but not statistically significant (r=−0.402, p=0.071).

Discussion

Do different hosts have different microbiome ‘ecologies’? Ecological and evolutionary processes like horizontal gene flow, genotype by environment interactions, and priority effects have been predicted to lead bacterial species to occupy different niches (with different ecological interactions) in different communities (Dolinšek et al., 2016; Franzosa et al., 2015; Faith et al., 2013; Bik et al., 2016; Caporaso et al., 2011; Costello et al., 2009; Louca et al., 2018; Rainey and Quistad, 2020; Martiny et al., 2015). Yet contrary to these expectations, here we find that hosts in the same population exhibit pairwise bacterial correlation patterns that are predominantly shared across hosts, rather than idiosyncratic to individual hosts. If these shared correlation patterns arise from shared microbiome ecologies, this discovery has consequences for understanding the basic eco-evolutionary drivers of microbiome dynamics and for human and animal health. For instance, shared ecologies would mean that designing widely applicable microbiome interventions is a more attainable goal than personalized microbiome compositions would suggest. Shared microbiome ecologies may also enable researchers to develop microbiome interventions that leverage these interactions to manipulate the microbiome’s emergent community properties to improve host health.

By measuring bacterial correlations in multiple hosts, we were also able, for the first time, to pinpoint which pairs of bacterial taxa exhibit the most consistent relationships across hosts. We found that most bacterial abundance correlations—from ASV-ASV to phyla-phyla relationships—were weak and negative. Positive bacterial interactions have been the subject of recent discussion in the literature (Loftus et al., 2021; Palmer and Foster, 2022; Kehe et al., 2021). Ecological theory predicts that strong positive interactions should be rare in natural communities because species interdependencies can hamper community assembly and stability (Coyte et al., 2015; Coyte et al., 2021). This theory is supported by experiments that directly measure the effects of one bacterial species on another’s growth (Weiss et al., 2022; Ortiz et al., 2021; Carlström et al., 2019; Venturelli et al., 2018) (but see Kehe et al., 2021). Our results suggest that strong, positive bacterial correlations are indeed uncommon in intact, unmanipulated microbiomes: significant positive relationships made up just 3.8% of all the pairwise correlations we observed. Hence, strong mutualisms, while key to microbiome function and dynamics, are probably rare in gut communities.

While mutualisms and universal dynamics are important, the correlation patterns we observe likely arise from a combination of ecological interactions between bacteria and shared responses to the environment (i.e. pairs of bacteria that prefer the same or different environments). In support of the idea that at least some of the correlations we observed are due to between-species interactions, our signature of universality was essentially unchanged after accounting for some of the strongest known drivers of microbiome composition and change in our population—host diet and season (Björk et al., 2022; Grieneisen et al., 2021)—as well as microbial synchrony between hosts. However, our approach did not account for important environmental gradients within the gut, such as host immune profiles and intestinal pH. These factors also shape microbiome composition (e.g. Reese et al., 2018; Firrman et al., 2022; de Vos et al., 2022), and can lead to shared abundance correlations between hosts even if hosts themselves differ. Ecological selection via within-host environments may explain our finding that genetic relatives share somewhat similar bacterial correlation patterns. Ecological selection is also consistent with our observation that the most consistent ASV-level correlations are between phylogenetically related taxa, and these patterns were strongest for positively associated taxon pairs. In support, phylogenetically related species have been shown to have similar environmental preferences (Tamames et al., 2016). We note that none of the correlations we observed can be mapped directly to standard categories of pairwise ecological interactions, such as mutualism, commensalism, amensalism, exploitation, or competition. 
Experimental approaches that directly measure the effects of one species on another’s growth in vitro are better suited to characterizing these relationships (Kehe et al., 2021; Weiss et al., 2022; Ortiz et al., 2021; Carlström et al., 2019; Venturelli et al., 2018).

The strong signal of universality we observed in bacterial abundance correlations stands in contrast to the common observation that microbiome taxonomic composition (i.e. the presence and abundance of bacterial species) is almost always highly personalized (Franzosa et al., 2015; Faith et al., 2013; Bik et al., 2016; Caporaso et al., 2011; Costello et al., 2009; Risely et al., 2021; Kolodny et al., 2019; Flores et al., 2014; Johnson et al., 2019; Pruss et al., 2021). Our own prior analyses of these data found that each baboon exhibited personalized microbiome compositions and asynchronous single-taxon dynamics (Björk et al., 2022). These contrasting patterns—personalized compositions but shared abundance correlations—are important because personalized microbiota have been proposed to arise, at least in part, from personalized microbiome ecologies (Franzosa et al., 2015; Faith et al., 2013; Bik et al., 2016; Caporaso et al., 2011; Costello et al., 2009; Risely et al., 2021; Kolodny et al., 2019; Flores et al., 2014; Johnson et al., 2019; Pruss et al., 2021). We can think of at least three explanations that reconcile these observations. First, consistent with ideas discussed above, if environments in the gut shape bacterial abundances but these environments are not synchronized across hosts, this would lead to shared abundance correlations over time, but individualized microbiome compositions at any single point in time. Second, the effects of horizontal gene transfer and gene by environment interactions on microbial phenotypes may not be strong enough to substantially alter pairwise microbial associations in the gut. This may be especially true for our main unit of analysis, bacterial ASVs. 
Because ASVs encompass multiple species and strains, each with somewhat different functional capacities, their dynamics may be buffered against idiosyncrasies driven by horizontal gene transfer and functional redundancy, which affect single strains more strongly than whole species or genera. We would expect strain-level correlation patterns to be more individualized than those between ASVs. Third, personalized gut microbial compositions may emerge from at least two other phenomena: personalized assembly processes and interactions driven by rare, host-specific strains (which were necessarily excluded from our analyses) (Costello et al., 2012; Walter and Ley, 2011). In general, a logical next step would be to confirm the microbial correlation patterns we observed using culture-based approaches, which will help reveal (in vitro) whether they can be attributed to direct effects of one microbe on another’s growth.

The observation that bacterial correlation patterns are largely shared across hosts was also apparent in one human data set, despite between-study differences in study design, host age, and time scale. Specifically, both the Amboseli baboons and the DIABIMMUNE infant/toddler cohort from Russia (Vatanen et al., 2016) exhibit comparable levels of universality of correlation patterns. This outcome surprised us: because the baboons all live in the same environment and are presumably colonized by similar bacterial strains from that environment, we expected that ecological selection and shared strain functionality should lead to stronger universality in bacteria correlation patterns compared to human infants sampled from different households and who were probably colonized by different strains. We also found that the most universal correlations between bacterial families in baboons tended to be highly universal in human infants/toddlers. Hence, some bacterial families may exhibit consistent microbial relationships within hosts, across host populations, and across host species. Finally, a recent, independent study also identified consistent bacterial correlation patterns across four different populations of human hosts (Loftus et al., 2021). While this study lacked resolution at the level of individual hosts, it did identify a conserved network of positively associated and closely related microbes similar to those we identify in Figure 3. The authors speculate that these conserved associations may indicate strong partner fidelity or obligate partnerships.

We did, however, fail to detect universality in a second human data set reported in Johnson et al., 2019, in which subjects were sampled daily, rather than weekly or monthly. The lack of universality in Johnson et al., 2019, may be due to this difference in sampling time scale, especially if daily abundances and correlations are noisier than covariances modeled over the longer time scales in our study. In support, far fewer of the microbial correlations were stronger than expected by chance in Johnson et al., 2019, than in the baboons or the children in the DIABIMMUNE cohort. However, because the Johnson et al., 2019, data set spans only 17 days and cannot be subsampled to monthly scales, it is impossible to test this prediction. The subjects in Johnson et al., 2019, also consumed substantially different diets from each other, perhaps more so than the children in the DIABIMMUNE cohort, and this inter-host difference in diet may reduce the universality of microbial correlations.

In sum, our study indicates that microbiome personalization may not extend to microbiome community ecology. However, more work is needed to understand how relationships between microbiome taxa are explained by shared internal and external environments, direct and indirect ecological interactions, taxonomic levels (e.g. strains to phyla) and time scales (days to months and years). Future studies should also consider how pairwise bacterial interactions scale up to affect the emergent properties of the community (Levine et al., 2017; Letten and Stouffer, 2019; Friedman et al., 2017). We hope that our longitudinal data set and the new methods we developed as part of this study (e.g. the model of log-ratio dynamics, the assessment of covariation from time-ordered abundance trajectories, and the universality score) will be useful in this enterprise.

Methods

Study population and microbiome profiles

The baboon hosts in this study were members of the Amboseli baboon population, which has been studied by the Amboseli Baboon Research Project since 1971 (Alberts and Altmann, 2012). The microbiome compositional profiles are derived from PCR amplification of an ~390-bp-long fragment that encompassed the V4 region of the 16S rRNA gene using primers 515F-806R. These microbiome profiles were previously analyzed in Björk et al., 2022 and Grieneisen et al., 2021. Our analyses use 5534 of these profiles from 56 especially well-sampled baboons, collected over a 13.3-year span between 2000 and 2013 (Figure 1B). Each baboon host in this data set was sampled at least 75 times (mean number of samples=99; range=75–181 samples; median number of days between samples within hosts=20 days; 25th percentile=7 days, 75th percentile=49 days). DNA was extracted from each sample using the MoBio and QIAGEN PowerSoil kit with a bead-beating step. All samples were sequenced on an Illumina HiSeq 2500, with a median read count of 48,827 reads per sample across all 5534 samples (range=982–459,315 reads per sample). Following recommended statistical practices (Gloor et al., 2017), samples were not rarefied, but counts were agglomerated and transformed to additive log-ratios (ALR). Variation in sampling depth and relative abundance was modeled by the method described in the subsequent section. Further details of sample collection, DNA extraction, and sequencing can be found in Björk et al., 2022 and Grieneisen et al., 2021.

Modeling log-ratio dynamics

Data sets of per-sample taxonomic counts were produced at each of three taxonomic levels, from finest to coarsest: ASV; taxonomic assignments finer than phylum but above the genus level (e.g. class, order, family); and phylum. At the intermediate and coarsest levels, taxa were agglomerated using phyloseq’s tax_glom() function (McMurdie and Holmes, 2013) such that all sequence variants sharing taxonomic identity at that level were collapsed into a single taxon (e.g. family Bifidobacteriaceae). To reduce sparsity in the data set, remove 16S sequences that could represent gene duplications, and focus only on taxa that were prevalent in all 56 hosts, we further filtered as follows: (1) in each of the three taxonomically defined data sets (i.e. ASV, taxa assigned to family/order/class, and phylum), we identified taxa present (i.e. having non-zero abundance) in at least 20% of samples from each host; (2) if a given ASV was >99% genetically similar to another ASV, we removed the less abundant of the pair to minimize the risk of including duplicate 16S rRNA gene copies from the same taxon (Vetrovský and Baldrian, 2013); and (3) counts associated with all other taxa were combined into a dummy category, hereafter referred to as ‘other.’ The ‘other’ category therefore includes a combination of rare and host-specific gut microbes. This category was retained in the data set (although not analyzed directly) because ‘other’ counts still inform the precision of the observed relative abundances in our model. See the sub-section titled ‘Filtering out taxon pairs with frequent joint absence’ below for further filtering that was required to avoid biases in estimating correlations between taxon pairs. Characteristics of the filtered data at each taxonomic level are provided in Supplementary file 1a-1c. At the ASV level, these filtering steps eliminate a majority of ASVs and ASV pairs from consideration: from 22,097 unique ASVs before filtering to 107 after filtering. This filtering also retained 12 phyla and 35 taxa at the class/order/family level (Supplementary file 1a-1c).

Our modeling approach is similar to several published methods for modeling microbial time series data. There are three key features from our perspective: the use of log-ratios (discussed above), the use of a state space model, and the Gaussian process component. State space models are useful for modeling a dynamic process that is observed only after the introduction of some measurement noise (e.g. Joseph et al., 2020). The Gaussian process component helps contend with irregularity in the sampling of our data: rather than treating log-ratio abundances as evolving in discrete jumps from one time point to the next, it allowed us to model the change in microbial log-ratio abundances as flowing smoothly through interruptions in observation. Other authors have made similar choices (Äijö et al., 2018). Specifically, estimates of taxon-taxon covariance were obtained from the basset model of the ‘fido’ package in R (Silverman et al., 2022). Data for each host took the form of a D×N count matrix, where D gives the number of taxa and N the number of samples collected for a given host. The following model was fit to each host’s count matrix (Y), where $Y_i$ represents the counts associated with a single sample:

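$$
\begin{aligned}
Y_i &\sim \text{Multinomial}(\pi_i) \\
\pi_i &= \text{ALR}^{-1}(\eta_i) \\
\eta &\sim \text{Normal}(\Lambda, \Sigma, I) \\
\Lambda &\sim \text{GP}(\Theta[X], \Sigma, \Gamma[X]) \\
\Sigma &\sim \text{inv-Wishart}(\Xi, \nu)
\end{aligned}
$$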

The observed relative abundances are considered to be drawn from a multinomial distribution parameterized by a set of proportions (π) which have an analogous representation in the ALR. The dynamics of these log-ratio abundances (η) are described by what amounts to a state space model in the third and fourth lines of the specification above, where a Gaussian process models the evolution of a ‘latent’ state. The matrix Σ captures covariation in log-ratio abundances (the D rows of the observed count matrix). Sample-sample covariation arising from nearness in time (autocorrelation) is modeled by the kernel matrix Γ. Both the kernel matrix and the expected baseline log-ratio abundances (Θ) are parameterized by a set of time-varying covariates X, which included the day of sampling (where the date of the first sample is defined as zero) and the first three principal components of diet composition, calculated following Björk et al., 2022 and Grieneisen et al., 2021, as the diet of all juveniles and females living in the host’s social group in the 30 days prior to sample collection. All group members consume highly similar diets as they travel together across the habitat, encountering the same resources at the same time (Björk et al., 2022; Grieneisen et al., 2021). These diet data come from random-order behavioral observations conducted two to four times per week on adult females and juveniles in each social group.
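To make the link between proportions (π) and log-ratio abundances (η) concrete, the following is a minimal Python sketch of the ALR transform and its inverse. It is an illustration only, not the implementation in the ‘fido’ package; the function names `alr` and `alr_inverse` are hypothetical, and using the last taxon as the reference is an assumption.

```python
import math

def alr(proportions):
    """Additive log-ratio: log of each part relative to the last (reference) part."""
    ref = proportions[-1]  # assumption: the last taxon is the ALR reference
    return [math.log(p / ref) for p in proportions[:-1]]

def alr_inverse(eta):
    """Map D-1 log-ratios back to D proportions that sum to 1."""
    expd = [math.exp(e) for e in eta] + [1.0]  # the reference has log-ratio 0
    total = sum(expd)
    return [v / total for v in expd]

pi = [0.5, 0.3, 0.2]        # toy proportions for three taxa
eta = alr(pi)               # two log-ratios
pi_back = alr_inverse(eta)  # recovers the original proportions
```

The round trip shows why D taxa are represented by D−1 log-ratio coordinates: the reference taxon is fixed at zero on the log-ratio scale.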

The kernel matrix Γ was composed of two component squared exponential kernels. The first, intended to manage sample-sample autocorrelation, was selected to have a bandwidth such that this autocorrelation decayed to a minimum at 90 days. This mirrored the behavior of estimates of sample-sample autocorrelation in the raw data. The second component kernel modeled sample-sample covariance driven by similarity in composition of diet. The relative weight of these two effects (autocorrelation and diet) on sample-sample variation was set at 3:1. To test the sensitivity of our results to these parameter settings, we fit four alternative versions of our models, varying the bandwidth of the squared exponential kernel in such a way as to give minimum sample-sample autocorrelation at either 30 or 90 days. We also varied the proportion of sample-sample covariance driven by diet from 0% to 25% to 50%, and we varied the log scale of total sample-sample variance between 1 and 2. In all cases, estimates of correlation between CLR ASVs were similar, with minimum and maximum r² of estimates between ‘canonical’ and alternatively parameterized models of 0.993 and 0.996, respectively. This suggests our findings are reasonably robust to a range of hyperparameter settings.
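As a rough illustration of a two-component kernel of this kind, the Python sketch below combines a squared exponential kernel over sampling day with one over diet principal components at a 3:1 weighting. It is not the authors' code: the function names, the bandwidth values (`rho_time=30`, `rho_diet=1`), and the choice of a bandwidth for which autocorrelation is close to zero by 90 days are all assumptions made for the example.

```python
import math

def sq_exp(squared_distance, bandwidth):
    """Squared exponential kernel evaluated on a squared distance."""
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))

def composite_kernel(days, diet_pcs, rho_time=30.0, rho_diet=1.0,
                     w_time=0.75, w_diet=0.25, total_var=1.0):
    """Weighted sum of a time kernel and a diet kernel (weights 3:1 by default)."""
    n = len(days)
    gamma = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            dt2 = (days[a] - days[b]) ** 2
            dd2 = sum((u - v) ** 2 for u, v in zip(diet_pcs[a], diet_pcs[b]))
            gamma[a][b] = total_var * (w_time * sq_exp(dt2, rho_time)
                                       + w_diet * sq_exp(dd2, rho_diet))
    return gamma

days = [0, 10, 100]                    # toy sampling days
diet_pcs = [[0.1, 0.0, 0.2]] * 3       # identical toy diets (three PCs per sample)
gamma = composite_kernel(days, diet_pcs)
```

With identical diets, samples 10 days apart covary more strongly than samples 100 days apart, and the diet component sets a floor of one quarter of the total variance.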

Posterior inference on this model is performed as described in Silverman et al., 2022, and yields estimates of the distributions of parameters necessary to reconstruct trajectories for all log-ratio taxa across sampling time. In particular, we extract the posterior estimates of one such parameter, Σ, the covariance of ALR taxa, from the fitted models for each host. We convert these covariance matrices over ALR taxa to the CLR form (a simple linear transformation of the matrix). We then normalize estimated CLR covariance matrices to Pearson’s correlation matrices in R using the built-in cov2cor() function.
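The conversion from ALR covariance to CLR covariance and then to a correlation matrix can be sketched as follows. This is a plain-Python illustration under the assumption that the last taxon is the ALR reference; the function names are hypothetical, and `cov_to_cor` only mimics what R's cov2cor() does.

```python
def alr_cov_to_clr_cov(sigma_alr):
    """Return G * Sigma_ALR * G^T, where G maps ALR to CLR coordinates."""
    d = len(sigma_alr) + 1  # number of taxa, including the ALR reference
    # G[i][j] = delta_ij - 1/D: pad a zero for the reference, then center.
    g = [[(1.0 if i == j else 0.0) - 1.0 / d for j in range(d - 1)]
         for i in range(d)]
    gs = [[sum(g[i][k] * sigma_alr[k][j] for k in range(d - 1))
           for j in range(d - 1)] for i in range(d)]
    return [[sum(gs[i][k] * g[j][k] for k in range(d - 1))
             for j in range(d)] for i in range(d)]

def cov_to_cor(cov):
    """Scale a covariance matrix to a correlation matrix."""
    sd = [cov[i][i] ** 0.5 for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]

sigma_alr = [[2.0, 1.0], [1.0, 2.0]]       # toy ALR covariance for D = 3 taxa
sigma_clr = alr_cov_to_clr_cov(sigma_alr)  # 3 x 3 CLR covariance
clr_cor = cov_to_cor(sigma_clr)            # 3 x 3 CLR correlation
```

Each row of the CLR covariance sums to zero, a consequence of the centering step, which is one reason CLR correlations tend to be negative on average.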

Filtering out taxon pairs with frequent joint absence

The ASV-level relative abundance data were sparse, even after filtering low abundance taxa (i.e. those present in <20% of samples in each host). Zeros comprised 29.7% of the ASV-level count matrix, 16.1% of the class/order/family count matrix, and 9.8% of the phylum level count matrix. This abundance of ‘missing’ observations at the ASV level led to a bias in estimates of taxon-taxon correlation: a high frequency of joint zero abundances for a pair of taxa increased the apparent positive correlation of those taxa (Figure 1—figure supplement 3). This is because abundances below some minimum level of sensitivity in sampling will be ‘flattened’ to zero, reducing the observed variation for those pairs, over those samples, to zero. This loss of variation leads to a tendency to overestimate the (positive) correlation of these pairs. This trend was observed for our basset model (Silverman et al., 2022) when estimating either Pearson’s correlation or proportionality (Quinn et al., 2017). It was also observed in ASV-ASV CLR correlations output by COAT (Cao et al., 2019), which estimates CLR correlation through a sparsity-inducing procedure intended to yield more conservative estimates, and by SparCC (Friedman and Alm, 2012, see Figure 1—figure supplement 3).

To avoid this bias, we restricted our analyses to taxon pairs with no more than a 5% frequency of joint absence (i.e. joint zero abundance observations in at most 5% of all samples across hosts) and no more than a 50% frequency of absence in either taxon individually across all samples. We illustrate these filtering criteria with a cartoon example involving four taxa, jointly sampled 20 times, indicating presence as an ‘X’ and absence (zero abundance) with a dash:

Taxon A (70% present): X---XXXXXX-XXX-X-XXX

Taxon B (20% present): X---------X--X----X-

Taxon C (50% present): -XXX-----XX-XXXX--X-

Taxon D (60% present): XX-----XX-XX-XX-XXXX

In this example, taxon B would be excluded from analyses by the requirement that taxa be ‘present’ or non-zero in at least half of all samples. Its associations with other taxa (A x B, B x C, or B x D) would be omitted from analyses, leaving pairs A x C, A x D, and C x D. Of these, the rate of joint absence is 5%, 10%, and 15%, respectively, meaning only A x C would pass the filtering criterion of no more than 5% joint absence.
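The cartoon example can be checked with a short Python sketch. The thresholds follow the example as worked above (a taxon must be present in at least half of samples, and a pair may be jointly absent in no more than 5% of samples); the variable and function names are hypothetical and this is not the code used in the study.

```python
from itertools import combinations

taxa = {
    "A": "X---XXXXXX-XXX-X-XXX",
    "B": "X---------X--X----X-",
    "C": "-XXX-----XX-XXXX--X-",
    "D": "XX-----XX-XX-XX-XXXX",
}

def is_prevalent(series):
    """True if the taxon is present ('X') in at least half of samples."""
    return series.count("X") * 2 >= len(series)

def joint_absence_count(s1, s2):
    """Number of samples in which both taxa are absent ('-')."""
    return sum(1 for a, b in zip(s1, s2) if a == "-" and b == "-")

kept_taxa = [name for name, series in taxa.items() if is_prevalent(series)]

rates = {}
passing = []
for a, b in combinations(kept_taxa, 2):
    count = joint_absence_count(taxa[a], taxa[b])
    n_samples = len(taxa[a])
    rates[(a, b)] = count / n_samples
    if count * 100 <= 5 * n_samples:  # joint absence of at most 5%
        passing.append((a, b))
```

Integer comparisons are used for the thresholds so that pairs sitting exactly on a cutoff are not affected by floating-point rounding.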

After filtering, 1878 of the original 7750 ASV-ASV pairs remained in this ‘high-confidence’ set. It is this set that we present in all figures and results. This procedure was also carried out at the phylum and class/order/family levels. However, as the frequency of absence was generally small at these higher taxonomic levels, a higher proportion of possible taxon-taxon pairs were included in the high-confidence set: 86.4% (57 of 66 pairs) at the phylum level and 71.0% (473 of 666 pairs) at the class/order/family level.

Calculating universality scores for taxon-taxon pairs

We devised a universality score for each pair of taxa intended to capture the strength and consistency of taxon-taxon correlations across hosts (Figure 2—figure supplement 4). This score first identifies the sign of the taxon-taxon correlation (positive or negative) that is most common across the 56 hosts (i.e. occurs in >50% of the 56 hosts in the data set). The direction of this sign is the ‘majority correlation sign’: it is positive if positive correlations occur in more than half of hosts, and negative otherwise.

For a pair of taxa i, let $n_i^{\mathrm{maj}}$ be the number of hosts whose CLR correlation for pair i has the majority correlation sign for that pair, and let n be the total number of hosts. Let R be the subset of estimated CLR correlations for pair i across hosts with the majority sign. The universality score $u_i$ for that taxon-taxon pair is then given by

$$u_i = \frac{n_i^{\mathrm{maj}}}{n} \times \left|\operatorname{median}(R)\right|$$

This score is the product of the proportion of hosts with the majority correlation sign and the magnitude of the median CLR correlation among those hosts, and is bounded between 0 and 1. Scores near 1 indicate strong universality and near-zero scores indicate weak universality. Strong universality can only be achieved by taxon-taxon correlations that are both large in magnitude and highly concordant across hosts.
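A minimal Python sketch of the score for a single taxon pair is given below. It assumes the magnitude (absolute value) of the median is used, so that the score falls between 0 and 1 as stated, and it treats the majority sign as negative whenever positive correlations do not exceed half of hosts. The function name is hypothetical.

```python
from statistics import median

def universality_score(correlations):
    """Universality score for one taxon pair from per-host CLR correlations."""
    n = len(correlations)
    n_positive = sum(1 for r in correlations if r > 0)
    majority_positive = n_positive > n / 2  # negative otherwise
    majority = [r for r in correlations if (r > 0) == majority_positive]
    return (len(majority) / n) * abs(median(majority))

# Toy example with four hosts: three agree on a positive sign.
score_mixed = universality_score([0.8, 0.6, 0.7, -0.1])
# Toy example in which all hosts share a moderate negative correlation.
score_negative = universality_score([-0.5, -0.5, -0.5, -0.5])
```

In the first toy example, three of four hosts share the majority sign and their median correlation is 0.7, so the score is 0.75 × 0.7.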

Defining a cutoff for significant bacterial correlations and universality scores

We identified correlations stronger than expected by chance using permutations of the data set to define empirical null distributions (Figure 2—figure supplement 3A). Specifically, we permuted the microbial count tables by randomly shuffling taxon identity within each sample, 10 times for each of the 56 hosts. This procedure maintained relative abundance patterns within a sample but scrambled the covariance patterns of relative abundances. The correlations estimated from these permuted data were pooled into a single reference distribution. The distributions of ASV-level CLR correlations in the original and permuted data are shown in Figure 2—figure supplement 3A. We identified ‘significant’ correlations as those with FDR ≤0.05 (Benjamini-Hochberg), testing against the permuted data.
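The two ingredients of this test, shuffling taxon identity within each sample and a Benjamini-Hochberg adjustment, can be sketched in Python as follows. This is an illustration rather than the authors' pipeline: the function names are hypothetical, and the two-sided empirical p-value with a +1 correction is an assumption about how observed correlations might be compared with the pooled null.

```python
import random

def shuffle_taxa_within_samples(counts, rng):
    """counts: list of samples, each a list of per-taxon counts."""
    permuted = []
    for sample in counts:
        shuffled = sample[:]
        rng.shuffle(shuffled)  # reassign counts to taxa at random
        permuted.append(shuffled)
    return permuted

def empirical_p_value(observed, null_values):
    """Two-sided empirical p-value against a pooled null (assumed form)."""
    extreme = sum(1 for v in null_values if abs(v) >= abs(observed))
    return (extreme + 1) / (len(null_values) + 1)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

counts = [[10, 0, 5, 3], [8, 2, 5, 1]]  # toy samples x taxa
permuted = shuffle_taxa_within_samples(counts, random.Random(0))
p_example = empirical_p_value(0.9, [0.1, -0.2, 0.3, 0.95])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```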

We applied an analogous permutation test to derive a null distribution for taxon-taxon universality scores. In a single iteration of this permutation procedure, rows and columns of the observed taxon-taxon correlation matrix for each host were shuffled, maintaining the distribution over observed correlations at the host level but randomizing the identity of taxon pairs across hosts. This procedure was repeated 100 times and universality scores were calculated from each of these shuffled data sets to give a single pseudo-null distribution of universality scores. The observed and null distributions of universality scores at the ASV level are shown in Figure 2—figure supplement 3B. We used this empirical null distribution to identify universality scores significantly greater than expected (FDR≤0.05).
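One way to shuffle a host's correlation matrix while keeping it a valid, symmetric matrix is to apply the same random permutation to rows and columns. The Python sketch below takes that approach as an assumption; the function name is hypothetical and the example matrix is a toy.

```python
import random

def permute_taxon_labels(corr, rng):
    """Apply one random permutation to both rows and columns of a matrix."""
    d = len(corr)
    perm = list(range(d))
    rng.shuffle(perm)
    return [[corr[perm[i]][perm[j]] for j in range(d)] for i in range(d)]

corr = [
    [1.0, 0.2, -0.4],
    [0.2, 1.0, 0.6],
    [-0.4, 0.6, 1.0],
]
shuffled = permute_taxon_labels(corr, random.Random(1))
```

The host's set of correlation values is unchanged, but the pair each value belongs to is randomized, so universality scores computed across independently shuffled hosts reflect chance agreement only.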

Estimating the ratio of population-level to host-level contributions to observed taxon-taxon correlation patterns

We used simulations to estimate the degree of shared ‘signal’ between hosts in terms of taxon-taxon correlations. Each host’s ‘observed correlations’ were defined as the basset maximum a posteriori (MAP) estimates of CLR ASV correlations for that host. We computed the mean correlations across the population using the function estcov() from the shapes package in R (Chen, 2020) and estimated a host-specific contribution to the observed correlations as the residual difference between the per-host observed correlations and these mean correlations. That is,

$$\text{observed host correlations} = \text{mean population correlations} + \text{host residual}$$

For each host, we simulated a hypothetical set of composite taxon-taxon correlations as a convex combination of mean and host residual:

$$\text{composite correlations} = (1-\alpha) \times \text{mean population correlations} + \alpha \times \text{host residual}$$

A cartoon example of this procedure is given in Figure 3—figure supplement 2. For example, one such simulated set of taxon-taxon correlations might constitute a mixture of 90% host contribution and 10% shared population-level ‘signal’ (α=0.9). Alternatively, a small host-level contribution might have α=0.1.

For each host, we iterated over increasing proportions of host-level contribution (from 0% to 100%), generating simulated composite correlation matrices according to the formula above. We compared these simulated patterns to those observed for the same host, reasoning that simulated correlation matrices that minimize the distance between the observed correlation matrices and the simulated mixtures provide the best description of the underlying true mixture.
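A literal reading of this grid search can be sketched in Python as below. The sketch works on vectorized correlations, uses Euclidean distance, and steps the host-level proportion in increments of 1%; all three choices, and the function name, are assumptions rather than details given in the text.

```python
def best_host_fraction(observed, mean, steps=100):
    """Grid-search the host-level proportion (alpha) that best matches a host."""
    residual = [o - m for o, m in zip(observed, mean)]
    best_alpha, best_dist = None, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        composite = [(1 - alpha) * m + alpha * r
                     for m, r in zip(mean, residual)]
        dist = sum((o - c) ** 2 for o, c in zip(observed, composite)) ** 0.5
        if dist < best_dist:
            best_alpha, best_dist = alpha, dist
    return best_alpha

# Toy example with two taxon pairs: the population mean explains most of
# the observed pattern and the host residual is comparatively small.
mean_correlations = [0.8, 0.0]
observed_correlations = [0.8, 0.4]
alpha_hat = best_host_fraction(observed_correlations, mean_correlations)
```

In this toy case the best-fitting mixture assigns 20% to the host-level contribution, reflecting a residual that is small relative to the shared population signal.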

De-trending for season

Seasonally de-trended data was obtained in the following way. The observed ASV count matrices were CLR-transformed and linear autoregressive models were fit to each CLR-transformed ASV’s series. In these models, wet-dry season oscillation was modeled as a sine wave with a period of 365 days. The magnitude of this component was estimated during model fitting after an offset (in weeks) was estimated in a first step, in order to best align the oscillating seasonal component with the data. Per-ASV models were fit using the following syntax:

arima(x = x, xreg = model.matrix(~ factor(host) + sin_f(offset, days))[, -1], order = c(1, 0, 0))

where x gives CLR counts for that ASV, the ‘order’ argument of arima enables a single autoregressive component, and the ‘xreg’ argument specifies a covariate matrix. That covariate matrix contains a per-host label giving host-specific offsets for log-ratio abundance and an oscillating seasonal trend through ‘sin_f,’ a function that samples values corresponding to day indices (through ‘days’) from a sine wave with weekly offset (‘offset’) and a period of 365 days. The residuals from these per-ASV model fits were extracted and used as the seasonally ‘de-trended’ data (see Figure 4—figure supplement 1). Correlations across CLR ASV-ASV pairs were estimated from these residual series with the cov() function in R.
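The seasonal component of this procedure can be sketched in Python as follows. This is a simplified stand-in, not the arima() fit described above: it omits the autoregressive term and the host-specific offsets and fits only an intercept and a sine amplitude by least squares. The exact form of `sin_f` here is a guess based on the description (weekly offset, 365-day period), and the helper names are hypothetical.

```python
import math

def sin_f(offset_weeks, days, period=365.0):
    """Sine wave sampled at day indices, shifted by a weekly offset."""
    return [math.sin(2.0 * math.pi * (d + 7.0 * offset_weeks) / period)
            for d in days]

def fit_seasonal(y, days, offset_weeks):
    """Least-squares fit of y = intercept + amplitude * sin_f; returns residuals."""
    s = sin_f(offset_weeks, days)
    n = len(y)
    mean_s, mean_y = sum(s) / n, sum(y) / n
    amplitude = (sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, y))
                 / sum((si - mean_s) ** 2 for si in s))
    intercept = mean_y - amplitude * mean_s
    residuals = [yi - (intercept + amplitude * si) for si, yi in zip(s, y)]
    return amplitude, residuals

def rss(y, days, offset_weeks):
    """Residual sum of squares for a given weekly offset."""
    return sum(r * r for r in fit_seasonal(y, days, offset_weeks)[1])

days = list(range(0, 730, 5))                 # toy sampling days over two years
y = [1.5 + 0.8 * v for v in sin_f(3, days)]   # toy CLR series, pure seasonality

# First step: choose the weekly offset that best aligns the sine wave.
best_offset = min(range(52), key=lambda w: rss(y, days, w))
# Second step: estimate the amplitude and keep the residuals.
amplitude, detrended = fit_seasonal(y, days, best_offset)
```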

Estimating synchrony

‘Synchrony’ was estimated by sampling aligned microbiome compositional profiles across hosts. We identified all samples collected from pairs of hosts within 1 calendar day. For instance, a sample collected from host F01 on March 14, 2011, could pair with a sample from M04 on March 15, 2011. For all possible pairs of hosts, we selected one such aligned pair of samples, yielding 1540 joint observations of gut microbiome composition. For each such paired sample, one host was arbitrarily designated as host A and the other as host B. The ‘synchrony’ of a given taxon was estimated as the correlation of a taxon’s model-inferred log-ratio abundance across the set of samples from hosts labeled A and the set of samples from hosts labeled B. The cartoon in Figure 4—figure supplement 3 illustrates this sample pairing.
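The sample-pairing step can be sketched in Python as below. The text does not say how one aligned pair was chosen when several were available, so taking the first match is an assumption; the function name and the toy sampling days are hypothetical.

```python
from itertools import combinations

def aligned_sample_pairs(sample_days, max_gap=1):
    """Pick one pair of samples per host pair collected within max_gap days.

    sample_days: dict mapping host ID to a sorted list of sampling days.
    """
    pairs = {}
    for host_a, host_b in combinations(sorted(sample_days), 2):
        for day_a in sample_days[host_a]:
            match = next((day_b for day_b in sample_days[host_b]
                          if abs(day_a - day_b) <= max_gap), None)
            if match is not None:
                pairs[(host_a, host_b)] = (day_a, match)  # first match only
                break
    return pairs

sample_days = {
    "F01": [0, 40, 100],
    "M04": [20, 41, 90],
    "F02": [200],
}
pairs = aligned_sample_pairs(sample_days)
```

Host pairs with no samples close enough in time, such as those involving the toy host F02, simply contribute no joint observation.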

Enrichment analyses

We performed enrichment analyses for bacterial families and family pairs in several settings. In each case we computed the frequency of ASVs belonging to a given family, or of pairs belonging to a family pair, on a subset of the data. These were compared to the overall frequencies of ASVs belonging to those families or pairs.

To determine the enrichment of families and family pairs in the most universal ASV pairs (Figure 3E; Supplementary file 1e), we calculated the frequencies of ASV families and pairs in the top 5% of pairs by universality scores. Significant enrichment of families or pairs was identified using a one-sided Fisher’s exact test. Multiple test correction was applied as a Benjamini-Hochberg adjustment to observed p-values.
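For reference, the one-sided Fisher’s exact test used for enrichment reduces to a hypergeometric tail probability, which can be sketched in Python with the standard library alone. The function name and the layout of the 2×2 table are assumptions for the example, and the counts are toy values.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided (enrichment) Fisher's exact test for the table [[a, b], [c, d]].

    a: in the top subset and in the family    b: in the top subset, other family
    c: outside the subset and in the family   d: outside the subset, other family
    Returns P(X >= a) under the hypergeometric null.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(total - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(total, row1)

# Toy example: 3 of 4 top pairs belong to the family, versus 1 of 4 others.
p_enrichment = fisher_exact_greater(3, 1, 1, 3)
```

The resulting p-values would then be adjusted across families or family pairs with the Benjamini-Hochberg procedure.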

Phylogenetic distances between ASV sequences were calculated with the dist.ml function in the ‘phangorn’ package in R (Schliep, 2011) using default settings for nucleotide substitution rates. In Supplementary file 1f, low phylogenetic distance/high median correlation strength pairs were identified as those with phylogenetic distances of less than 0.2 and median correlation strengths of greater than 0.5. Again, significance of these was evaluated against overall frequencies of the same families and pairs.

Evaluating explanatory factors

Variation in taxon-taxon correlation patterns explained by kinship and baseline composition

To evaluate a possible explanatory effect of distances in terms of kinship or baseline gut bacterial composition on distances in terms of taxon-taxon correlation patterns, we applied Mantel tests. However, because population structure can lead to anticonservative p-values (Guillot et al., 2013), we also developed a second, permutation-based procedure of our own design for evaluating the significance of baseline composition. First, baseline composition for each host was estimated by transforming all of a given host’s samples to the CLR representation after adding a small pseudocount (0.5) to remove zeros. The vector of per-taxon averages of these CLR values was used as that host’s ‘baseline’ CLR composition. The Euclidean distances between hosts in terms of these per-host baselines were compared against distances in terms of correlation patterns to give an r² value.

In the case of the customized permutation test, this observed result was evaluated against a pseudo-null distribution computed in the following way. The identity of each taxon in the baseline composition was shuffled for each host independently. Euclidean distances across these shuffled baselines were computed and an r² value calculated for these distances against the observed distances computed from taxon-taxon correlation patterns. This procedure was repeated 1000 times to give a distribution of ‘random’ r² values we used as an empirical null.
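The building blocks of this procedure (a per-host baseline CLR composition, Euclidean distances between hosts, and shuffling of taxon identity within each host's baseline) can be sketched in Python as follows. The function names and toy counts are hypothetical, and the sketch stops short of the final r² comparison against distances in correlation patterns.

```python
import math
import random

def clr(counts, pseudocount=0.5):
    """Centered log-ratio of one sample after adding a pseudocount."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def baseline_composition(samples):
    """Per-taxon average of CLR values across a host's samples."""
    clr_samples = [clr(s) for s in samples]
    n = len(clr_samples)
    return [sum(s[t] for s in clr_samples) / n for t in range(len(samples[0]))]

def euclidean(u, v):
    """Euclidean distance between two baseline vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def shuffle_baselines(baselines, rng):
    """Shuffle taxon identity within each host's baseline independently."""
    shuffled = []
    for b in baselines:
        copy = b[:]
        rng.shuffle(copy)
        shuffled.append(copy)
    return shuffled

host_a = [[10, 0, 5], [8, 2, 5]]   # toy counts: samples x taxa
host_b = [[1, 9, 5], [0, 12, 3]]
baselines = [baseline_composition(host_a), baseline_composition(host_b)]
observed_distance = euclidean(baselines[0], baselines[1])
null_baselines = shuffle_baselines(baselines, random.Random(0))
```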

Variation in taxon-taxon correlation patterns explained by sex and social group

To test whether host sex or social group membership predicted similarity in terms of correlation patterns, we used an ANOVA-like strategy. We calculated the F-statistic, a ratio of between- to within-group variation, on the observed correlation patterns (strictly, the vectorized CLR taxon-taxon correlation matrices; Z in the equation below) and segmented samples into groups defined by either sex or social group. The F-statistic was calculated as

$$F = \frac{\text{between-group variation}}{\text{within-group variation}} = \frac{\sum_{i=1}^{K} n_i \left(\bar{Z}_i - \bar{Z}\right)^2 / (K-1)}{\sum_{i=1}^{K} \sum_{j=1}^{n_i} \left(Z_{ij} - \bar{Z}_i\right)^2 / (N-K)}$$

and significance was evaluated via an F-distribution parameterized by the appropriate degrees of freedom. Here, K represents the number of groups (e.g. two, in the case of sex) and N, the total number of hosts. The matrix $\bar{Z}_i$ consists of the mean taxon-taxon correlations for group i and $\bar{Z}$, the population mean correlations.
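The F-statistic above can be computed directly from vectorized correlation matrices, as in the Python sketch below. Interpreting the squared terms as squared Euclidean norms of vector differences is an assumption, as is the function name; the example uses one-dimensional toy values so the arithmetic is easy to follow.

```python
def f_statistic(groups):
    """Ratio of between- to within-group variation.

    groups: list of groups; each group is a list of hosts; each host is a
    vector (e.g. a vectorized CLR taxon-taxon correlation matrix).
    """
    all_hosts = [z for group in groups for z in group]
    n_hosts, n_groups = len(all_hosts), len(groups)
    dim = len(all_hosts[0])
    grand_mean = [sum(z[t] for z in all_hosts) / n_hosts for t in range(dim)]
    between = 0.0
    within = 0.0
    for group in groups:
        group_mean = [sum(z[t] for z in group) / len(group) for t in range(dim)]
        between += len(group) * sum((m - g) ** 2
                                    for m, g in zip(group_mean, grand_mean))
        within += sum(sum((z[t] - group_mean[t]) ** 2 for t in range(dim))
                      for z in group)
    return (between / (n_groups - 1)) / (within / (n_hosts - n_groups))

# Toy example: two groups of three hosts, one value per host.
groups = [[[1.0], [2.0], [3.0]], [[2.0], [3.0], [4.0]]]
f_value = f_statistic(groups)
```

Here the between-group term is 1.5 on one degree of freedom and the within-group term is 4 on four degrees of freedom, giving F = 1.5.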

Comparison to microbiome time series from human populations

We compared our findings to those generated from two human data sets: the DIABIMMUNE project’s infant/toddler cohort from Russian Karelia (Vatanen et al., 2016) and the adult diet-microbiome association study of Johnson et al., 2019. In both cases, count tables were obtained from the project’s public website and subject identity and sampling schedules were available in the associated metadata. We compared each host cohort’s universality at the family/order/class level because this taxonomic level offered the greatest comparative power (10.1% of families/orders/classes overlap between the cohorts compared to just 3.1% of genera and no ASVs). The basset model from the ‘fido’ R package (Silverman et al., 2022) was fit to each subject’s data set using model settings analogous to those employed on the Amboseli baboon series: first, only taxa with non-zero counts in at least 20% of all subjects’ series were retained; second, Gaussian process kernel bandwidth settings were chosen in such a way as to encode an expectation of minimum autocorrelation between samples at a distance in time of 90 days. We extracted CLR estimates of taxa at the family level in the same manner as described previously for the Amboseli data set.

Acknowledgements

We thank Jeanne Altmann for her essential role in stewarding the Amboseli Baboon Project, and in collecting and maintaining the fecal samples used in this manuscript. We thank the Kenya Wildlife Service, the National Council for Science, Technology, and Innovation, and the National Environment Management Authority for permission to conduct research and collect biological samples in Kenya. We also thank the University of Nairobi, Institute of Primate Research, National Museums of Kenya, the Amboseli-Longido pastoralist communities, the Enduimet Wildlife Management Area, Ker & Downey Safaris, Air Kenya, and Safarilink for their cooperation and assistance in the field. We thank Karl Pinc for managing and designing the database. We also thank Tawni Voyles, Anne Dumaine, Yingying Zhang, Meghana Rao, Tauras Vilgalys, Amanda Lea, Noah Snyder-Mackler, Paul Durst, Jay Zussman, Garrett Chavez, and Reena Debray for contributing to fecal sample processing. Complete acknowledgments for the ABRP can be found online at https://amboselibaboons.nd.edu/acknowledgements/.

Funding: This work was supported by the National Science Foundation and the National Institutes of Health, especially NSF Rules of Life Award DEB 1840223 (EAA, JAG), the National Institute on Aging for R01 AG071684 (EAA), R21 AG055777 (EAA, RB), NIH R01 AG053330 (EAA), and NIH R35 GM128716 (RB). We also thank the Duke University Population Research Institute P2C-HD065563 (pilot to JT), the University of Notre Dame’s Eck Institute for Global Health (EAA), and the Notre Dame Environmental Change Initiative (EAA). Since 2000, long-term data collection in Amboseli has been supported by NSF and NIH, including IOS 1456832 (SCA), IOS 1053461 (EAA), DEB 1405308 (JT), IOS 0919200 (SCA), DEB 0846286 (SCA), DEB 0846532 (SCA), IBN 0322781 (SCA), IBN 0322613 (SCA), BCS 0323553 (SCA), BCS 0323596 (SCA), P01AG031719 (SCA), R21AG049936 (JT, SCA), R03AG045459 (JT, SCA), R01AG034513 (SCA), R01HD088558 (JT), and P30AG024361 (SCA). We also thank Duke University, Princeton University, the University of Notre Dame, the Chicago Zoological Society, the Max Planck Institute for Demographic Research, the L.S.B. Leakey Foundation and the National Geographic Society for support at various times over the years.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Elizabeth A Archie, Email: earchie@nd.edu.

Dario Riccardo Valenzano, Leibniz Institute on Aging, Germany.

Wendy S Garrett, Harvard T.H. Chan School of Public Health, United States.

Funding Information

This paper was supported by the following grants:

  • National Science Foundation DEB1840223 to Jack A Gilbert, Elizabeth A Archie.

  • National Institute on Aging R01AG071684 to Elizabeth A Archie.

  • National Institute on Aging R21AG055777 to Ran Blekhman, Elizabeth A Archie.

  • National Institute on Aging R01AG053330 to Elizabeth A Archie.

  • National Institute of General Medical Sciences R35GM128716 to Ran Blekhman.

  • Duke University P2C-HD065563 to Jenny Tung.

  • University of Notre Dame’s Eck Institute for Global Health to Elizabeth A Archie.

  • Notre Dame Environmental Change Initiative to Elizabeth A Archie.

  • National Science Foundation IOS 1456832 to Susan C Alberts.

  • National Science Foundation IOS 1053461 to Elizabeth A Archie.

  • National Science Foundation DEB 1405308 to Susan C Alberts.

  • National Science Foundation DEB 0846532 to Susan C Alberts.

  • National Science Foundation IBN 0322781 to Susan C Alberts.

  • National Science Foundation IBN 0322613 to Susan C Alberts.

  • National Science Foundation BCS 0323553 to Susan C Alberts.

  • National Science Foundation BCS 0323596 to Susan C Alberts.

  • National Institutes of Health P01AG031719 to Susan C Alberts.

  • National Institutes of Health R21AG049936 to Jenny Tung, Susan C Alberts.

  • National Institutes of Health R03AG045459 to Jenny Tung, Susan C Alberts.

  • National Institutes of Health R01HD088558 to Jenny Tung.

  • National Institutes of Health P30AG024361 to Susan C Alberts.

Additional information

Competing interests

No competing interests declared.

Reviewing editor, eLife.

Author contributions

Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Conceptualization, Data curation, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Writing – original draft, Writing – review and editing.

Investigation, Writing – review and editing.

Data curation, Investigation, Writing – review and editing.

Data curation, Validation, Investigation, Writing – review and editing.

Validation, Investigation, Writing – review and editing.

Funding acquisition, Investigation, Writing – review and editing.

Funding acquisition, Project administration, Writing – review and editing.

Supervision, Funding acquisition, Investigation, Project administration, Writing – review and editing.

Supervision, Funding acquisition, Investigation, Writing – review and editing.

Supervision, Funding acquisition, Investigation, Project administration, Writing – review and editing.

Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Data curation, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Ethics

All data collection procedures were non-invasive, adhered to the laws and guidelines of Kenya (Research Permit NACOSTI/P/22/22097), and were approved by the Animal Care and Use Committee at the University of Notre Dame (IACUC 05-7259_2023).

Additional files

Supplementary file 1. Supplementary tables.
MDAR checklist

Data availability

16S rRNA gene sequences are available on EBI-ENA (project ERP119849) and Qiita (study 12949). Analyzed data and code are available on GitHub at: https://github.com/kimberlyroche/rulesoflife (copy archived at Roche, 2023).

The following datasets were generated:

University of California San Diego Microbiome Initiative 2021. 16S rRNA gene sequencing data from baboon gut microbiomes collected between 2000 and 2014. European Nucleotide Archive. ERP119849

Grieneisen L, Dasari M, Gould TJ, Björk JR, Grenier J, Yotova V, Jansen D, Gottel N, Gordon JB, Learn NH, Gesquiere LR, Wango TL, Mututua RS, Warutere JK, Siodi L, Gilbert JA, Barreiro LB, Alberts SC, Tung J, Archie EA, Blekhman R. 2021. Gut microbiome heritability is nearly universal but environmentally contingent. Qiita. 12949

The following previously published datasets were used:

Vatanen T, Kostic A, d'Hennezel E. 2016. DIABIMMUNE three country cohort. NCBI BioProject. PRJNA290380

Johnson AJ. 2019. Johnson et al. dietary cohort. European Nucleotide Archive. PRJEB29065

References

  1. Äijö T, Müller CL, Bonneau R. Temporal probabilistic modeling of bacterial compositions derived from 16S rRNA sequencing. Bioinformatics. 2018;34:372–380. doi: 10.1093/bioinformatics/btx549.
  2. Alberts SC, Altmann J. In: Long-Term Field Studies of Primates. Kappeler P, Watts DP, editors. Springer Verlag; 2012. The Amboseli Baboon research project: 40 years of continuity and change; pp. 261–287.
  3. Bäckhed F, Ley RE, Sonnenburg JL, Peterson DA, Gordon JI. Host-bacterial mutualism in the human intestine. Science. 2005;307:1915–1920. doi: 10.1126/science.1104816.
  4. Bashan A, Gibson TE, Friedman J, Carey VJ, Weiss ST, Hohmann EL, Liu Y-Y. Universality of human microbial dynamics. Nature. 2016;534:259–262. doi: 10.1038/nature18301.
  5. Bik EM, Costello EK, Switzer AD, Callahan BJ, Holmes SP, Wells RS, Carlin KP, Jensen ED, Venn-Watson S, Relman DA. Marine mammals harbor unique microbiotas shaped by and yet distinct from the sea. Nature Communications. 2016;7:10516. doi: 10.1038/ncomms10516.
  6. Björk JR, Dasari MR, Roche K, Grieneisen L, Gould TJ, Grenier J-C, Yotova V, Gottel N, Jansen D, Gesquiere LR, Gordon JB, Learn NH, Wango TL, Mututua RS, Kinyua Warutere J, Siodi L, Mukherjee S, Barreiro LB, Alberts SC, Gilbert JA, Tung J, Blekhman R, Archie EA. Synchrony and Idiosyncrasy in the gut microbiome of wild primates. Nature Ecology & Evolution. 2022;6:955–964. doi: 10.1038/s41559-022-01773-4.
  7. Cao HT, Gibson TE, Bashan A, Liu YY. Inferring human microbial dynamics from temporal metagenomics data: Pitfalls and lessons. BioEssays. 2017;39:1600188. doi: 10.1002/bies.201600188.
  8. Cao Y, Lin W, Li H. Large covariance estimation for compositional data via composition-adjusted thresholding. Journal of the American Statistical Association. 2019;114:759–772. doi: 10.1080/01621459.2018.1442340.
  9. Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, Knights D, Gajer P, Ravel J, Fierer N, Gordon JI, Knight R. Moving pictures of the human microbiome. Genome Biology. 2011;12:R50. doi: 10.1186/gb-2011-12-5-r50.
  10. Carlström CI, Field CM, Bortfeld-Miller M, Müller B, Sunagawa S, Vorholt JA. Synthetic microbiota reveal priority effects and keystone strains in the Arabidopsis phyllosphere. Nature Ecology & Evolution. 2019;3:1445–1454. doi: 10.1038/s41559-019-0994-z.
  11. Chen Y. Frechet: Statistical analysis for random objects and non-Euclidean data. CRAN. 2020 https://cran.r-project.org/web/packages/frechet/
  12. Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body Habitats across space and time. Science. 2009;326:1694–1697. doi: 10.1126/science.1177486.
  13. Costello EK, Stagaman K, Dethlefsen L, Bohannan BJM, Relman DA. The application of ecological theory toward an understanding of the human microbiome. Science. 2012;336:1255–1262. doi: 10.1126/science.1224203.
  14. Coyte KZ, Schluter J, Foster KR. The ecology of the microbiome: Networks, competition, and stability. Science. 2015;350:663–666. doi: 10.1126/science.aad2602.
  15. Coyte KZ, Rao C, Rakoff-Nahoum S, Foster KR. Ecological rules for the assembly of microbiome communities. PLOS Biology. 2021;19:e3001116. doi: 10.1371/journal.pbio.3001116.
  16. Cullen CM, Aneja KK, Beyhan S, Cho CE, Woloszynek S, Convertino M, McCoy SJ, Zhang Y, Anderson MZ, Alvarez-Ponce D, Smirnova E, Karstens L, Dorrestein PC, Li H, Sen Gupta A, Cheung K, Powers JG, Zhao Z, Rosen GL. Emerging priorities for microbiome research. Frontiers in Microbiology. 2020;11:136. doi: 10.3389/fmicb.2020.00136.
  17. de Vos WM, Tilg H, Van Hul M, Cani PD. Gut microbiome and health: Mechanistic insights. Gut. 2022;71:1020–1032. doi: 10.1136/gutjnl-2021-326789.
  18. Debray R, Herbert RA, Jaffe AL, Crits-Christoph A, Power ME, Koskella B. Priority effects in microbiome assembly. Nature Reviews. Microbiology. 2022;20:109–121. doi: 10.1038/s41579-021-00604-w.
  19. Degnan PH, Taga ME, Goodman AL. Vitamin B12 as a modulator of gut microbial ecology. Cell Metabolism. 2014;20:769–778. doi: 10.1016/j.cmet.2014.10.002.
  20. Dolinšek J, Goldschmidt F, Johnson DR. Synthetic microbial ecology and the dynamic interplay between microbial Genotypes. FEMS Microbiology Reviews. 2016;40:961–979. doi: 10.1093/femsre/fuw024.
  21. Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, Clemente JC, Knight R, Heath AC, Leibel RL, Rosenbaum M, Gordon JI. The long-term stability of the human gut microbiota. Science. 2013;341:6141. doi: 10.1126/science.1237439.
  22. Faust K, Raes J. Microbial interactions: From networks to models. Nature Reviews. Microbiology. 2012;10:538–550. doi: 10.1038/nrmicro2832.
  23. Faust K, Lahti L, Gonze D, de Vos WM, Raes J. Metagenomics meets time series analysis: Unraveling microbial community dynamics. Current Opinion in Microbiology. 2015;25:56–66. doi: 10.1016/j.mib.2015.04.004.
  24. Faust K, Raes J. Host-microbe interaction: Rules of the game for microbiota. Nature. 2016;534:182–183. doi: 10.1038/534182a.
  25. Firrman J, Liu L, Mahalak K, Tanes C, Bittinger K, Tu V, Bobokalonov J, Mattei L, Zhang H, Van den Abbeele P. The impact of environmental pH on the gut microbiota community structure and short chain fatty acid production. FEMS Microbiology Ecology. 2022;98:fiac038. doi: 10.1093/femsec/fiac038.
  26. Flores GE, Caporaso JG, Henley JB, Rideout JR, Domogala D, Chase J, Leff JW, Vázquez-Baeza Y, Gonzalez A, Knight R, Dunn RR, Fierer N. Temporal variability is a personalized feature of the human microbiome. Genome Biology. 2014;15:12. doi: 10.1186/s13059-014-0531-y.
  27. Foster KR, Bell T. Competition, not cooperation, dominates interactions among culturable microbial species. Current Biology. 2012;22:1845–1850. doi: 10.1016/j.cub.2012.08.005.
  28. Franzosa EA, Huang K, Meadow JF, Gevers D, Lemon KP, Bohannan BJM, Huttenhower C. Identifying personal microbiomes using metagenomic codes. PNAS. 2015;112:E2930–E2938. doi: 10.1073/pnas.1423854112.
  29. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLOS Computational Biology. 2012;8:e1002687. doi: 10.1371/journal.pcbi.1002687.
  30. Friedman J, Higgins LM, Gore J. Community structure follows simple assembly rules in microbial microcosms. Nature Ecology & Evolution. 2017;1:109. doi: 10.1038/s41559-017-0109.
  31. Gao C, Montoya L, Xu L, Madera M, Hollingsworth J, Purdom E, Singan V, Vogel J, Hutmacher RB, Dahlberg JA, Coleman-Derr D, Lemaux PG, Taylor JW. Fungal community assembly in drought-stressed sorghum shows stochasticity, selection, and universal ecological dynamics. Nature Communications. 2020;11:34. doi: 10.1038/s41467-019-13913-9.
  32. Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: And this is not optional. Frontiers in Microbiology. 2017;8:2224. doi: 10.3389/fmicb.2017.02224.
  33. Gonze D, Coyte KZ, Lahti L, Faust K. Microbial communities as dynamical systems. Current Opinion in Microbiology. 2018;44:41–49. doi: 10.1016/j.mib.2018.07.004.
  34. Gould AL, Zhang V, Lamberti L, Jones EW, Obadia B, Korasidis N, Gavryushkin A, Carlson JM, Beerenwinkel N, Ludington WB. Microbiome interactions shape host fitness. PNAS. 2018;115:E11951–E11960. doi: 10.1073/pnas.1809349115.
  35. Grieneisen L, Dasari M, Gould TJ, Björk JR, Grenier J-C, Yotova V, Jansen D, Gottel N, Gordon JB, Learn NH, Gesquiere LR, Wango TL, Mututua RS, Warutere JK, Siodi L, Gilbert JA, Barreiro LB, Alberts SC, Tung J, Archie EA, Blekhman R. Gut microbiome heritability is near-universal but environmentally contingent. Science. 2021;373:181–186. doi: 10.1126/science.aba5483.
  36. Guillot G, Rousset F, Harmon L. Dismantling the mantel tests. Methods in Ecology and Evolution. 2013;4:336–344. doi: 10.1111/2041-210x.12018.
  37. Guittar J, Shade A, Litchman E. Trait-based community assembly and succession of the infant gut microbiome. Nature Communications. 2019;10:8377. doi: 10.1038/s41467-019-08377-w.
  38. Hu J, Amor DR, Barbier M, Bunin G, Gore J. Emergent phases of ecological diversity and dynamics mapped in microcosms. Science. 2022;378:85–89. doi: 10.1126/science.abm7841.
  39. Johnson AJ, Vangay P, Al-Ghalith GA, Hillmann BM, Ward TL, Shields-Cutler RR, Kim AD, Shmagel AK, Syed AN, Walter J, Menon R, Koecher K, Knights D. Daily sampling reveals personalized diet-microbiome associations in humans. Cell Host & Microbe. 2019;25:789–802. doi: 10.1016/j.chom.2019.05.005.
  40. Joseph TA, Pasarkar AP, Pe’er I. Efficient and accurate inference of mixed microbial population trajectories from longitudinal count data. Cell Systems. 2020;10:463–469. doi: 10.1016/j.cels.2020.05.006.
  41. Kalyuzhny M, Shnerb NM, Kembel S. Dissimilarity-overlap analysis of community dynamics: Opportunities and pitfalls. Methods in Ecology and Evolution. 2017;8:1764–1773. doi: 10.1111/2041-210X.12809.
  42. Kehe J, Ortiz A, Kulesa A, Gore J, Blainey PC, Friedman J. Positive interactions are common among culturable bacteria. Science Advances. 2021;7:eabi7159. doi: 10.1126/sciadv.abi7159.
  43. Kolodny O, Weinberg M, Reshef L, Harten L, Hefetz A, Gophna U, Feldman MW, Yovel Y. Coordinated change at the colony level in fruit bat fur microbiomes through time. Nature Ecology & Evolution. 2019;3:116–124. doi: 10.1038/s41559-018-0731-z.
  44. Letten AD, Stouffer DB. The mechanistic basis for higher-order interactions and non-additivity in competitive communities. Ecology Letters. 2019;22:423–436. doi: 10.1111/ele.13211.
  45. Levine JM, Bascompte J, Adler PB, Allesina S. Beyond pairwise mechanisms of species coexistence in complex communities. Nature. 2017;546:56–64. doi: 10.1038/nature22898.
  46. Loftus M, Hassouneh SA-D, Yooseph S. Bacterial associations in the healthy human gut microbiome across populations. Scientific Reports. 2021;11:2828. doi: 10.1038/s41598-021-82449-0.
  47. Louca S, Polz MF, Mazel F, Albright MBN, Huber JA, O’Connor MI, Ackermann M, Hahn AS, Srivastava DS, Crowe SA, Doebeli M, Parfrey LW. Function and functional redundancy in microbial systems. Nature Ecology & Evolution. 2018;2:936–943. doi: 10.1038/s41559-018-0519-1.
  48. Marsland R, Cui W, Mehta P. A minimal model for microbial biodiversity can reproduce experimentally observed ecological patterns. Scientific Reports. 2020;10:3308. doi: 10.1038/s41598-020-60130-2.
  49. Martiny JBH, Jones SE, Lennon JT, Martiny AC. Microbiomes in light of traits: A Phylogenetic perspective. Science. 2015;350:aac9323. doi: 10.1126/science.aac9323.
  50. McMurdie PJ, Holmes S. Phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLOS ONE. 2013;8:e61217. doi: 10.1371/journal.pone.0061217.
  51. Meehan CJ, Beiko RG. A Phylogenomic view of ecological specialization in the Lachnospiraceae, a family of digestive tract-associated bacteria. Genome Biology and Evolution. 2014;6:703–713. doi: 10.1093/gbe/evu050.
  52. Ortiz A, Vega NM, Ratzke C, Gore J. Interspecies bacterial competition regulates community assembly in the C. elegans intestine. The ISME Journal. 2021;15:2131–2145. doi: 10.1038/s41396-021-00910-4.
  53. Palmer JD, Foster KR. Bacterial species rarely work together. Science. 2022;376:581–582. doi: 10.1126/science.abn5093.
  54. Pontrelli S, Szabo R, Pollak S, Schwartzman J, Ledezma-Tejeida D, Cordero OX, Sauer U. Metabolic cross-feeding structures the assembly of polysaccharide degrading communities. Science Advances. 2022;8:eabk3076. doi: 10.1126/sciadv.abk3076.
  55. Pruss KM, Marcobal A, Southwick AM, Dahan D, Smits SA, Ferreyra JA, Higginbottom SK, Sonnenburg ED, Kashyap PC, Choudhury B, Bode L, Sonnenburg JL. Mucin-derived O-glycans supplemented to diet mitigate diverse microbiota perturbations. The ISME Journal. 2021;15:577–591. doi: 10.1038/s41396-020-00798-6.
  56. Quinn TP, Richardson MF, Lovell D, Crowley TM. Propr: An R-package for identifying proportionally abundant features using compositional data analysis. Scientific Reports. 2017;7:16252. doi: 10.1038/s41598-017-16520-0.
  57. Rainey PB, Quistad SD. Toward a dynamical understanding of microbial communities. Philosophical Transactions of the Royal Society B. 2020;375:20190248. doi: 10.1098/rstb.2019.0248.
  58. Reese AT, Pereira FC, Schintlmeister A, Berry D, Wagner M, Hale LP, Wu A, Jiang S, Durand HK, Zhou X, Premont RT, Diehl AM, O’Connell TM, Alberts SC, Kartzinel TR, Pringle RM, Dunn RR, Wright JP, David LA. Microbial nitrogen limitation in the mammalian large intestine. Nature Microbiology. 2018;3:1441–1450. doi: 10.1038/s41564-018-0267-7.
  59. Risely A, Wilhelm K, Clutton-Brock T, Manser MB, Sommer S. Diurnal oscillations in gut bacterial load and composition eclipse seasonal and lifetime dynamics in wild meerkats. Nature Communications. 2021;12:6017. doi: 10.1038/s41467-021-26298-5.
  60. Risely A, Schmid DW, Müller-Klein N, Wilhelm K, Clutton-Brock TH, Manser MB, Sommer S. Gut microbiota individuality is contingent on temporal scale and age in wild meerkats. Proceedings. Biological Sciences. 2022;289:20220609. doi: 10.1098/rspb.2022.0609.
  61. Roche KE. Rulesoflife. swh:1:rev:2173f2404e22c7fd6c1bf0fdf94e56905503f41d. Software Heritage. 2023 https://archive.softwareheritage.org/swh:1:dir:9f06adddff3d1003c405a77a5fb57b9153c9ff61;origin=https://github.com/kimberlyroche/rulesoflife;visit=swh:1:snp:c7372572b74a964db2fb585824bf6e1c23f7793a;anchor=swh:1:rev:2173f2404e22c7fd6c1bf0fdf94e56905503f41d
  62. San-Juan-Vergara H, Zurek E, Ajami NJ, Mogollon C, Peña M, Portnoy I, Vélez JI, Cadena-Cruz C, Diaz-Olmos Y, Hurtado-Gómez L, Sanchez-Sit S, Hernández D, Urruchurtu I, Di-Ruggiero P, Guardo-García E, Torres N, Vidal-Orjuela O, Viasus D, Petrosino JF, Cervantes-Acosta G. A Lachnospiraceae-dominated bacterial signature in the fecal microbiota of HIV-infected individuals from Colombia, South America. Scientific Reports. 2018;8:4479. doi: 10.1038/s41598-018-22629-7.
  63. Schliep KP. Phangorn: Phylogenetic analysis in R. Bioinformatics. 2011;27:592–593. doi: 10.1093/bioinformatics/btq706.
  64. Seth EC, Taga ME. Nutrient cross-feeding in the microbial world. Frontiers in Microbiology. 2014;5:350. doi: 10.3389/fmicb.2014.00350.
  65. Silverman JD, Roche K, Holmes ZC, David LA, Mukherjee S. Bayesian multinomial logistic normal models through marginally latent matrix-T processes. Journal of Machine Learning Research. 2022;23:1–42.
  66. Tamames J, Sánchez PD, Nikel PI, Pedrós-Alió C. Quantifying the relative importance of phylogeny and environmental preferences as drivers of gene content in prokaryotic microorganisms. Frontiers in Microbiology. 2016;7:433. doi: 10.3389/fmicb.2016.00433.
  67. Vacca M, Celano G, Calabrese FM, Portincasa P, Gobbetti M, De Angelis M. The controversial role of human gut Lachnospiraceae. Microorganisms. 2020;8:573. doi: 10.3390/microorganisms8040573.
  68. Vatanen T, Kostic AD, d’Hennezel E, Siljander H, Franzosa EA, Yassour M, Kolde R, Vlamakis H, Arthur TD, Hämäläinen A-M, Peet A, Tillmann V, Uibo R, Mokurov S, Dorshakova N, Ilonen J, Virtanen SM, Szabo SJ, Porter JA, Lähdesmäki H, Huttenhower C, Gevers D, Cullen TW, Knip M, DIABIMMUNE Study Group, Xavier RJ. Variation in microbiome LPS Immunogenicity contributes to autoimmunity in humans. Cell. 2016;165:1551. doi: 10.1016/j.cell.2016.05.056.
  69. Venturelli OS, Carr AC, Fisher G, Hsu RH, Lau R, Bowen BP, Hromada S, Northen T, Arkin AP. Deciphering microbial interactions in synthetic human gut microbiome communities. Molecular Systems Biology. 2018;14:e8157. doi: 10.15252/msb.20178157.
  70. Vetrovský T, Baldrian P. The variability of the 16S rRNA gene in bacterial genomes and its consequences for bacterial community analyses. PLOS ONE. 2013;8:e57923. doi: 10.1371/journal.pone.0057923.
  71. Vila JCC, Liu YY, Sanchez A. Dissimilarity-overlap analysis of replicate enrichment communities. The ISME Journal. 2020;14:2505–2513. doi: 10.1038/s41396-020-0702-7.
  72. Walter J, Ley R. The human gut microbiome: Ecology and recent evolutionary changes. Annual Review of Microbiology. 2011;65:411–429. doi: 10.1146/annurev-micro-090110-102830.
  73. Weiss AS, Burrichter AG, Durai Raj AC, von Strempel A, Meng C, Kleigrewe K, Münch PC, Rössler L, Huber C, Eisenreich W, Jochum LM, Göing S, Jung K, Lincetto C, Hübner J, Marinos G, Zimmermann J, Kaleta C, Sanchez A, Stecher B. In vitro interaction network of a synthetic gut bacterial community. The ISME Journal. 2022;16:1095–1109. doi: 10.1038/s41396-021-01153-z.
  74. Widder S, Allen RJ, Pfeiffer T, Curtis TP, Wiuf C, Sloan WT, Cordero OX, Brown SP, Momeni B, Shou W, Kettle H, Flint HJ, Haas AF, Laroche B, Kreft J-U, Rainey PB, Freilich S, Schuster S, Milferstedt K, van der Meer JR, Großkopf T, Huisman J, Free A, Picioreanu C, Quince C, Klapper I, Labarthe S, Smets BF, Wang H, Soyer OS, Isaac Newton Institute Fellows. Challenges in microbial ecology: Building predictive understanding of community function and dynamics. The ISME Journal. 2016;10:2557–2568. doi: 10.1038/ismej.2016.45.
  75. Wilmanski T, Diener C, Rappaport N, Patwardhan S, Wiedrick J, Lapidus J, Earls JC, Zimmer A, Glusman G, Robinson M, Yurkovich JT, Kado DM, Cauley JA, Zmuda J, Lane NE, Magis AT, Lovejoy JC, Hood L, Gibbons SM, Orwoll ES, Price ND. Gut microbiome pattern reflects healthy ageing and predicts survival in humans. Nature Metabolism. 2021;3:274–286. doi: 10.1038/s42255-021-00348-0.
  76. Wu G, Zhao N, Zhang C, Lam YY, Zhao L. Guild-based analysis for understanding gut microbiome in human health and diseases. Genome Medicine. 2021;13:22. doi: 10.1186/s13073-021-00840-y.

Editor's evaluation

Dario Riccardo Valenzano

In this work, Roche et al. study a 13-year-long time series of microbiome samples from wild baboons from Kenya. These data allow disentangling ecological dynamics within and across individuals in a way that has never been done before. The authors show that the ecological relationships among baboon gut bacteria, measured through a correlation based on covariation, are largely universal (similar within and across host individuals) and that the most universally covarying taxa are almost always positively associated with each other. This work is foundational in its compelling effort to generate a rigorous method to evaluate co-abundance dynamics in longitudinal microbiome data. The approach taken will likely inspire developments that will sharpen the capacity to extract co-varying microbial features, taking into account seasonality, diet, age, relatedness, and more.

Decision letter

Editor: Dario Riccardo Valenzano
Reviewed by: Aura Raulo, Oren Kolodny

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Universal gut microbial relationships in the gut microbiome of wild baboons" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Dario Riccardo Valenzano as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Wendy Garrett as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: Aura Raulo (Reviewer #2); Oren Kolodny (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1. Is the covariation data zero-inflated?

2. Did the authors find or analyze the age-dependency in microbial dynamics, i.e. whether baboon age is characterized by specific microbial associations that are not equally maintained across all age groups? More extensively: are there taxonomic covariations that are (i) lost or (ii) acquired during aging?

3. The pairwise species correlations can be explained in (at least) two ways: the species have positive relations in some way (e.g. one is providing something necessary for the other), or the two simply like to be in the same kind of habitat. The "same kind of habitat" may refer to both a similar broad environment of the host (including diet, soil type, etc.) OR a similar within-host environment, i.e. host physiology, gut pH, immune status, etc. I would suggest having a discussion of these alternative explanations (and perhaps others) early on, and referring to this discussion in later interpretations of findings, throughout the Results and Discussion sections.

4. A graphical summary that would explain the consensus model for the temporal dynamics of microbial pair associations would help clarify the take-home message to a broader audience.
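
As one illustration of the zero-inflation question in point 1, the sketch below reports the fraction of zeros in a count table and compares a correlation computed with and without samples in which both taxa are absent. It is a toy check on raw counts using a plain Pearson correlation, not the authors' multinomial logistic-normal model; the function names and the example counts are hypothetical.

```python
from math import sqrt
from statistics import mean


def zero_fraction(counts):
    """Fraction of entries in a count table (list of rows) that are exactly zero."""
    total = sum(len(row) for row in counts)
    zeros = sum(1 for row in counts for value in row if value == 0)
    return zeros / total if total else 0.0


def pearson(x, y):
    """Plain Pearson correlation; returns NaN if either series is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def correlation_with_and_without_double_zeros(x, y):
    """Return (r using all samples, r after dropping samples where both taxa are 0)."""
    r_all = pearson(x, y)
    kept = [(a, b) for a, b in zip(x, y) if not (a == 0 and b == 0)]
    if len(kept) < 3:
        return r_all, float("nan")
    r_kept = pearson([a for a, _ in kept], [b for _, b in kept])
    return r_all, r_kept


# Hypothetical counts for two taxa across eight samples (assumption, not study data).
taxon_a = [0, 0, 0, 0, 5, 1, 3, 2]
taxon_b = [0, 0, 0, 0, 1, 4, 2, 3]

r_all, r_kept = correlation_with_and_without_double_zeros(taxon_a, taxon_b)
print("zero fraction:", zero_fraction([taxon_a, taxon_b]))
print("r with double zeros:", round(r_all, 3))
print("r without double zeros:", round(r_kept, 3))
```

In this toy example the shared absences alone turn a negative association among the non-zero samples into a positive one overall, which is the kind of artifact the question is probing.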

Reviewer #1 (Recommendations for the authors):

I very much liked this work and I congratulate the authors for their contribution to the field of microbiome ecology.

I would suggest better clarifying the novelty compared to previous analyses performed by the authors on this dataset.

It was not clear to me whether the authors found or analyzed the age-dependency in microbial dynamics, i.e. whether baboon age is characterized by specific microbial associations that are not equally maintained across all age groups. More extensively: are there taxonomic covariations that are (i) lost or (ii) acquired during aging?

A graphical summary that would explain the consensus model for the temporal dynamics of microbial pair associations would help clarify the take-home message to a broader audience.

I would like to kindly ask the authors to explain their chosen criteria for authorship. In particular, the authors should clarify whether the contribution of any of the scientific collaborators in Kenya could be worthy of inclusion in the authors' list. To date, the support that goes into field work by local scientists and trainees is not sufficiently acknowledged by foreign researchers, and a more inclusive and less exploitative authorship system can make a difference in developing countries, promoting long-term scientific excellence.

Reviewer #2 (Recommendations for the authors):

• Regarding my worries over the effect of 0-0 links on the positive correlation assessment: if your covariation data is zero-inflated, I suggest you consider whether a correlation measure based on the SparCC method (See: ), such as SpiecEASI (ref), might be a more robust way of estimating covariation through sparse inverse covariance. If your covariation data is magically not zero-inflated, I would suggest either making it into a bigger thing in the text or considering using the SparCC methods anyway, as they would allow you to have more of the rare taxa in the data. Alternatively, you could just show how much of your positive and negative correlation patterns respectively were influenced by whether or not you consider double zeros or any zeros in the data. You could do this either with separate models or within one zero-inflated hurdle model. If you can show that the pattern prevails even when you only compare non-zero abundances, that would make your correlation method that much more convincing.

• 10 permutations to address the significance of the correlations sounds like quite a low number to me. Would you have the computing power to do 100? I do not really understand how you get to p < 0.05 with just 10 permutations.

• You could add a sentence to the abstract to elaborate on why we would expect ecological relationships to be individualized in the first place. I was a bit confused reading the abstract about why this is a matter worth such detailed exploration, but your introduction really convinced me. If you could add something from lines 82-91 into the abstract, it would perhaps make it more intriguing.

• You show that population-level signatures contributed almost twice the weight as host-level signatures on correlation patterns. I think this is convincing. But I do think there seems to still be surprisingly much individual variation in ecological associations. I would have expected them to be even more universal, to be honest. I think it would be interesting to add also a discussion on why some taxa are strongly but inconsistently correlated – do these taxa have something special about them? Are they more generalist? Or do they have more positive links (can depend on many others rather than fully dependent on one other taxon)?

• Your universality score takes continuous correlation strength within individual and proportion of hosts with a majority sign as input. I like it, but wonder if you could capture even more of the variation in your data by also using a continuous measure of cross-sectional correlation consistency? Like additive correlation strength in the majority sign relative to additive correlation strength in the non-majority sign. Just a thought though.

• Lines 143-146, you could emphasize that if taxa covariation is driven by selection imposed by the host/environmental, then we would expect phylogenetically or phenotypically similar taxa to be positively covarying. If, on the other hand, covariation patterns were more driven by ecological interactions between taxa, we might expect positive covariation to be not more common in phylogenetically close taxa or less common based on competitive exclusion. Or is there some evidence that phylogenetically close taxa cross-feed more with each other or such?

• Lines 254-255, you write "Note, that the correlation strength for a given pair of ASVs was only weakly predicted by bacterial abundance " – Does this mean it was mostly driven by co-occurrence or that the covariation in abundances was sensitive to overall abundance? I guess the latter. More clarity would be good.

• Line 406, you write " Universality in Amboseli is not solely explained by seasonality or synchrony " – I think this is a bit of a manipulative title. There is quite a bit of evidence there for seasonality and synchrony and other evidence for environmental or host physiology-related selection driving covariation patterns (such as the fact that positive covariation is more common in phylogenetically close pairs). I feel like someone else could have formulated these results by downplaying the ecological relationships notion and emphasizing the selective effects notion. There is a bit of a tone here like you would prefer the ecological network effect over the environmentally driven covariation. I suggest rewording this to be a bit more neutral, such as "Universality is partially explained by seasonality and synchrony". And also mention that there may be other selective effects (like those related to individual variation in host physiology?) that you did not test but might feed into the selective effects driving covariation.

• Lines 465-467: I am not entirely convinced that the lack of similar patterns in the Johnson data set is likely explained by the different sampling frequencies. Was there much less temporal variation in the Johnson data set? To back up the statement that higher sampling frequency would be the reason the Johnson data set has dissimilar covariation between taxa compared to yours, perhaps you could show that the temporal variation in this data set was different from the baboon one and show that these covariation patterns were sensitive to timescale by subsampling either data set to create mock data sets with different sampling frequencies and see how this would change the inference of ecological associations. In general, I would tone down the conclusions about generalizability to humans a bit, since only one of your data sets showed this, and it is in infants, who have an ecologically more unstable microbiome than adult humans.

• Lines 540-554. Can you clarify why exactly environmental variation should decrease the universality of ecological associations? I would imagine that environmental variation can expand the space of microbial covariation and if universality is driven by covariation due to environmental selection, then this should be maximal when there is broader space for environmental variation to exist. You mentioned in the intro that "genotype by environment interactions, and priority effects-can lead microbiome taxa to fill different ecological roles in different hosts", could you explain a bit more somewhere how this translates to more environmental variation leading to less clear covariation between taxa?

• Lines 575-576 What about individual variation in host physiology?

• Line 633 How much was the sparsity reduced?

• Line 643 Seems very cool but I cannot fully critically evaluate the statistical robustness of this modeling framework

Reviewer #3 (Recommendations for the authors):

• Good abstract, presentation, and introduction.

• Figure 2: perhaps mark in panel A what the threshold for significant positive/negative correlations was.

• Positive correlation – as you note in several places – can be explained in (at least) two ways: the species have positive relations in some way (e.g. one is providing something necessary for the other), or the two simply like to be in the same kind of habitat, so when it is good for one it's also good for the other. You are aware of this, as both possibilities are mentioned in several places, but it seems that sometimes you choose to offer one and sometimes the other, with no clear reason (e.g. you propose that correlations at the phylum level are due to environmental preferences – lines 217-219 – but this explanation is in contrast to the strong emphasis on microbe-microbe interactions that is found throughout).

• I would suggest having a discussion of these alternative explanations (and perhaps others) early on, and reference to this discussion in later interpretations of findings, throughout the results and Discussion sections.

(you are clearly aware of this, e.g. in line 407; I suggest discussing this topic in the introduction and referring to it throughout. This would help readers who aren't aware of the extensive research/discussion/debate about these questions in microbial ecology, landscape ecology, and elsewhere).

• A brief mention/clarification (at least) of causality vs. correlation would be a good idea in this context. Even if clear correlations are found between taxa, this doesn't imply causation, of course. Perhaps discuss in future directions the importance of intervention/manipulation studies to test for causation.

• There's quite a large literature in ecology, particularly microbial ecology, that deals with the link between pairwise interactions between bacteria within a larger consortium of species, and whether inferences can be made from pairwise interactions to more complex scenarios; consider referring to some of this literature and perhaps offering a discussion of your results in light of the insights proposed there. Some such studies (I'm not from the field, there may be better ones) are:

https://www.nature.com/articles/nature22898

https://onlinelibrary.wiley.com/doi/full/10.1111/ele.13211

https://www.nature.com/articles/s41559-017-0109

Also, have a look at one or two possibly relevant studies by Andrew Letten.

• A possible interpretation of the finding that correlations, when they exist, tend to be positive: if the driver of significant correlations is the environment, and not positive species' interactions, then this observation might be expected: pairs of species that share environmental preferences will be positively correlated, and pairs of species that prefer different environments would be uncorrelated (and not negatively correlated).

In other words: there is only one way in which environmental preferences can be similar, but many ways in which two environmental preferences can differ (and also an environment is similar to itself in all dimensions, but there are many dimensions in which two environments can differ). "All happy families are alike, but every unhappy family is unhappy in its own way (Leo Tolstoy, Anna Karenina, 1878)".

In a sense, this observation should thus perhaps be viewed as support of the hypothesis that the driver of the positive correlations you find is shared environmental preferences and not species-species interactions. I think. Consider.

• 545-555: If it is true that the positive correlations are due to shared environmental preferences, it perhaps makes sense that the children dataset, in which children differ quite a bit (more than pairs of baboons), shows a strong signal: the fact that children are different should create high diversity in the overall dataset, and when two children happen to be similar in the conditions they create in their guts – this (and the respective positive correlations between pairs of species that like these specific conditions) would stand out particularly significantly above all this noise. Maybe. This requires some deeper thought, so consider. (This may be analogous to assessing heritability of traits – heritability seems to decrease – sometimes to the point of being non-significant/below detection level – in a homogeneous population, and heritability estimates are higher when the population is diverse.)

• 572 – 576 (starting with "We surmise that most") – I would be more cautious about this statement.

I tend to think that the driver of the correlation universality in your data is shared environmental preferences, and – apart from the point I made above – I think this is also particularly likely in light of the phylogenetic signal that you found (it makes sense that phylogenetically related species have similar environmental preferences, stemming from homology; this seems to me more parsimonious compared to the possibility that related species tend to be more supportive of one another for some reason, even though I can come up with some handwaving explanations that could support this if I really had to).

The "environment" in question is the one in the gut. Thus, controlling for diet or seasonal drivers is good, but far from ruling out that there are shared environments that are driving the signal; for that, you'd need to control for the extent to which pairs of host individuals tended to have more similar pH, hormonal status, immune activation (and its profile) and so on.

• 589: There seems to be a problem with this sentence. Look at the "the fact that…" – seems like something is missing.

• Methods: I'd elaborate a bit further about the sequencing, e.g. whether you rarefied samples or accounted for uneven read counts in another way, and which 16S regions were amplified (and/or what their length was – amplifying just V3, for example, would lead to a very different ASV resolution from amplifying V3+V4).

eLife. 2023 May 9;12:e83152. doi: 10.7554/eLife.83152.sa2

Author response


Essential revisions:

1. Is the covariation data zero-inflated?

We very much appreciate the suggestion to check for zero-inflation and R2’s detailed comments on this topic below. We are embarrassed that we didn’t consider how zero inflation might affect the correlation patterns in our original analyses. Indeed, zero inflation biased our correlations such that taxon pairs with a high frequency of joint zero observations (i.e., where both members of the pair often had very low or zero abundances) tended to be positively correlated (Figure 1 —figure supplement 3, shown below). This is because, as R2 suggests, zero inflation in the data lends more weight to positive links than negative links.

To address this problem in the revised manuscript, we now restrict our analyses to taxon pairs with strictly less than a 5% frequency of joint absence (i.e., joint zero-abundance observations in less than 5% of all samples across hosts, to the left of the dashed line in Figure 1 —figure supplement 3). We further restricted to pairs with less than a 50% frequency of absence in either taxon individually across all samples. We explain the rationale behind our filtering criteria in lines 735-764. After filtering, 1,878 of the original 7,750 ASV-ASV pairs were retained in our analyses (86.4% of the original pairs at the phylum level; 71.0% at the class/order/family level).
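The two filtering thresholds described above can be illustrated with a minimal sketch. It assumes a simple count table held as a dictionary of per-sample counts; the function name `filter_taxon_pairs`, its parameters, and the toy data are hypothetical and are not taken from the authors' code.

```python
from itertools import combinations


def filter_taxon_pairs(counts, max_joint_absence=0.05, max_single_absence=0.50):
    """Return taxon pairs that pass both absence filters.

    counts: dict mapping taxon name -> list of counts, one per sample
            (all lists the same length). Hypothetical input layout.
    """
    taxa = sorted(counts)
    n_samples = len(counts[taxa[0]])
    # Fraction of samples in which each taxon has a zero count.
    absent_frac = {t: sum(c == 0 for c in counts[t]) / n_samples for t in taxa}
    kept = []
    for a, b in combinations(taxa, 2):
        # Drop the pair if either taxon is absent in 50% or more of samples.
        if absent_frac[a] >= max_single_absence or absent_frac[b] >= max_single_absence:
            continue
        # Fraction of samples in which both taxa are zero at the same time.
        joint = sum(x == 0 and y == 0 for x, y in zip(counts[a], counts[b])) / n_samples
        # Keep only pairs with strictly less than 5% joint absence.
        if joint < max_joint_absence:
            kept.append((a, b))
    return kept


# Toy data: 20 samples per taxon (illustrative values only).
example_counts = {
    "asv_a": [4] * 20,             # never absent
    "asv_b": [0, 0] + [3] * 18,    # absent in 10% of samples
    "asv_c": [0, 0] + [7] * 18,    # absent in the same 10% as asv_b
    "asv_d": [0] * 12 + [5] * 8,   # absent in 60% of samples
}
retained_pairs = filter_taxon_pairs(example_counts)
```

In this toy table, the pair of `asv_b` and `asv_c` is dropped because the two taxa are jointly absent in 10% of samples, and every pair involving `asv_d` is dropped because that taxon is absent in more than half of the samples.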

Nearly all of our results are robust to these changes. The only two results that differ from the original submission are (i) consistent with the 2nd reviewer’s suspicions, we no longer see a bias towards positive correlations in our most universal taxa pairs, and (ii) we no longer see enrichment for members of the same bacterial family in the most-universal pairs. However, our other main results are the same: most bacterial correlations are weak and negative; each baboon reflects a mixture of idiosyncratic and shared correlation patterns, but shared patterns dominate by almost 2-fold; host pairs with the most similar bacterial correlation patterns also had similar microbiome taxonomic compositions and tended to be genetic relatives.

2. Did the authors find or analyze the age-dependency in microbial dynamics, i.e. whether baboon age is characterized by specific microbial associations that are not equally maintained across all age groups? More extensively: are there taxonomic covariations that are (i) lost or (ii) acquired during aging?

We agree that the effects of host age on microbial dynamics are an interesting topic. Host age may predict the overall strength of microbial relationships, and prior studies suggest that microbial relationships may become more individualized with age [5, 6]. To test these ideas, we added two new paragraphs to the results (lines 395-415) and three new supplementary figures (Figure 5 —figure supplements 1, 2 and 3). Briefly, we found no evidence that microbial correlations get stronger or weaker with age (Figure 5 —figure supplement 1).

Further, we found no strong differences in the degree of “personalized” correlation patterns across age groups (Figure 5 —figure supplement 2).

3. The pairwise species correlations can be explained in (at least) two ways: the species have positive relations in some way (e.g. one is providing something necessary for the other), or the two simply like to be in the same kind of habitat. The "same kind of habitat" may refer to both a similar broad environment of the host (including diet, soil type, etc) OR a similar within-host environment, i.e. host physiology, gut pH, immune status, etc. I would suggest having a discussion of these alternative explanations (and perhaps others) early on, and reference to this discussion in later interpretations of findings, throughout the results and Discussion sections.

Thank you for this suggestion. We now discuss this topic in the Discussion section (lines 552 to 567 and lines 580 to 583). These edits clarify that our correlations can arise from two non-exclusive processes: ecological interactions between species or correlated responses to environmental gradients. While our approach corrects for some of these factors, namely diet, season, and synchronized dynamics between hosts, we did not account for key environmental gradients within hosts—especially immune profiles, intestinal pH, and hormones. We now discuss how these differences could explain some of our observations, especially the finding that close genetic relatives have similar dynamics and that the most consistent ASV-level correlations are between phylogenetically related taxa. We have also made small updates to clarify these ideas in the Introduction (lines 129 to 130; 139 to 141) and the Results (lines 230 to 233; 351 to 354; text starting at line 430).

4. A graphical summary that would explain the consensus model for the temporal dynamics of microbial pair associations would help clarify the take-home message to a broader audience.

We are not confident we understood what you meant by the “consensus model for temporal dynamics of microbial pair associations”, but we think you mean our universality score. In response, we have added a new Figure 2 —figure supplement 4 which illustrates this score.

Reviewer #1 (Recommendations for the authors):

I very much liked this work and I congratulate the authors for their contribution to the field of microbiome ecology.

Thank you for your supportive comments.

I would suggest better clarifying the novelty compared to previous analyses performed by the authors on this dataset.

We agree.

It was not clear to me whether the authors found or analyzed the age-dependency in microbial dynamics, i.e. whether baboon age is characterized by specific microbial associations that are not equally maintained across all age groups. More extensively: are there taxonomic covariations that are (i) lost or (ii) acquired during aging?

We have added an analysis of age-dependency to the revised text (lines 395-415). Please see our response to Essential revision 2 above, as well as new Figure 5 —figure supplements 1, 2, and 3. Briefly, we found no evidence that microbial correlations get stronger or weaker with age (Figure 5 —figure supplement 1) and no strong differences in the degree of “personalized” correlation patterns across age groups (Figure 5 —figure supplement 2). We did, however, find small differences in the strength of some microbial correlations between age groups (Figure 5 —figure supplement 3).

A graphical summary that would explain the consensus model for the temporal dynamics of microbial pair associations would help clarify the take-home message to a broader audience.

Please see our response above.

I would like to kindly ask the authors to explain their chosen criteria for authorship. In particular, the authors should clarify whether the contribution of any of the scientific collaborators in Kenya could be worthy of inclusion in the authors' list. To date, the support that goes into field work by local scientists and trainees is not sufficiently acknowledged by foreign researchers, and a more inclusive and less exploitative authorship system can make a difference in developing countries, promoting long-term scientific excellence.

This is a good question and one we have discussed much among ourselves. Several of our Kenyan research team (Raphael Mututua, Kinyua Warutere, Long’ida Siodi, and Tim Wango) played an essential role in collecting and processing the fecal samples used to generate the microbiome compositional profiles used in this analysis. Raphael Mututua, Kinyua Warutere, and Long’ida Siodi also contributed to collecting the demographic, behavioral, and ecological covariates we analyzed. We therefore included these four authors on the two original publications arising from this data set (citations below). However, because this current paper is a re-analysis of these previously published data, and because these authors did not contribute to designing or implementing these analyses or to writing or revising the paper, we felt it would be inappropriate to include them on this manuscript.

Citations for the first papers to publish analyses of this data set (references 43 and 44 in the manuscript):

Grieneisen L., Dasari M., Gould T.J., Björk J.R., Grenier J., Yotova V., Jansen D., Gottel N., Gordon J.B., Learn N.H., Gesquiere L.R., Wango T.L., Mututua R.S., Warutere J.K., Siodi L., Gilbert J.A., Barreiro L.B., Alberts S.C., Tung J., Archie E.A., Blekhman R. 2021. Gut microbiome heritability is nearly universal but environmentally contingent. Science 373:181-186

Björk J.R., Dasari M., Roche K., Grieneisen L., Gould T.J., Grenier J.C., Yotova V., Gottel N., Jansen D., Gesquiere L.R., Gordon J.B., Learn N.H., Wango T.L., Mututua R.S., Warutere J.K., Siodi L., Mukherjee S., Barreiro L.B., Alberts S.C., Gilbert J.A., Tung J., Blekhman R., Archie E.A. 2022. Synchrony and idiosyncrasy in the gut microbiome of wild baboons. Nature Ecology and Evolution 6:955–964

Reviewer #2 (Recommendations for the authors):

• Regarding my worries over the effect of 0-0 links on the positive correlation assessment, if your covariation data is zero-inflated, I suggest you would consider whether a correlation measure based on SparCC-method (See: ), such as SpiecEASI (ref) might be a more robust way of estimating covariation through sparse inverse covariance. If your covariation data is magically not zero-inflated, I would suggest either making it into a bigger thing in the text or considering using the SparCC methods anyway, as they would allow you to have more of the rare taxa in the data. Alternatively, you could just show how much of your positive and negative correlation patterns respectively were influenced by whether or not you consider double zeros or any zeros in the data. You could do this either with separate models or within one zero-inflated hurdle model. If you can show that the pattern prevails even when you only compare non-zero abundances, that would make your correlation method that much more convincing.

Thank you. In response to this comment, we substantially revised our filtering criteria and re-analyzed the data.

• 10 permutations to address the significance of the correlations sounds like quite a low number to me. Would you have the computing power to do 100? I do not really understand how you get to p < 0.05 with just 10 permutations.

We have added more detail to the text starting at line 202 to clarify our permutation approach. Specifically, to generate an expectation of the strength of bacterial correlations possible by chance, we used a permutation procedure that randomly shuffled the taxonomic identities within each sample of the bacterial count table 10 times for each of the 56 hosts (560 total permutations). We then estimated correlations for these permuted pairs to generate an empirical null distribution of randomly generated taxon-taxon correlations. Hence, the “significance” of individual taxon-taxon correlations was evaluated against a very large, pooled distribution of randomly generated correlations.
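The pooling logic in this response can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it substitutes a plain Pearson correlation for the model-based correlation estimates, and the names `permuted_null` and `empirical_p`, the input layout, and the two-sided p-value with a +1 correction are all hypothetical choices.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def permuted_null(host_tables, n_perm=10, seed=0):
    """Pool taxon-taxon correlations from permuted count tables.

    host_tables: dict host -> list of samples, each sample a list of
                 counts per taxon (hypothetical layout).
    """
    rng = random.Random(seed)
    null = []
    for samples in host_tables.values():
        for _ in range(n_perm):
            # Shuffle taxonomic identities independently within each sample.
            shuffled = [rng.sample(sample, len(sample)) for sample in samples]
            columns = list(zip(*shuffled))  # one column per (scrambled) taxon
            for i, j in combinations(range(len(columns)), 2):
                null.append(pearson(columns[i], columns[j]))
    return null


def empirical_p(observed, null):
    """Two-sided empirical p-value against the pooled null (+1 correction)."""
    extreme = sum(abs(v) >= abs(observed) for v in null)
    return (extreme + 1) / (len(null) + 1)


# Toy input: 2 hosts, 5 samples each, 4 taxa (illustrative values only).
toy_tables = {
    "host_1": [[5, 1, 9, 3], [2, 8, 4, 7], [6, 6, 1, 2], [3, 9, 5, 1], [7, 2, 8, 4]],
    "host_2": [[1, 4, 2, 9], [8, 3, 7, 5], [2, 2, 6, 1], [9, 5, 3, 8], [4, 7, 1, 6]],
}
null_distribution = permuted_null(toy_tables, n_perm=10)
```

Because every permuted table contributes one correlation per taxon pair, the pooled null grows as hosts × permutations × pairs. With 2 hosts, 10 permutations, and 6 pairs, the toy example already yields 120 null values, which is why a small number of permutations per host can still support fine-grained empirical p-values.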

• You could add a sentence to the abstract to elaborate on why we would expect ecological relationships to be individualized in the first place. I was a bit confused reading the abstract about why this is a matter worth such detailed exploration, but your introduction really convinced me. If you could add something from lines 82-91 into the abstract, it would perhaps make it more intriguing.

We have added two sentences to the abstract (starting on line 42), which read, “However, whether bacterial relationships are generalizable across hosts or personalized to individual hosts is debated. Several eco-evolutionary processes could personalize microbiome community ecology, but the few studies that have tested this idea find that bacterial interactions are largely consistent (i.e., “universal”) across hosts”

• You show that population-level signatures contributed almost twice the weight as host-level signatures on correlation patterns. I think this is convincing. But I do think there seems to still be surprisingly much individual variation in ecological associations. I would have expected them to be even more universal, to be honest. I think it would be interesting to add also a discussion on why some taxa are strongly but inconsistently correlated – do these taxa have something special about them? Are they more generalist? Or do they have more positive links (can depend on many others rather than fully dependent on one other taxon)?

Actually, we did not find any pairs that were strongly and inconsistently correlated. For instance, in Figures 3A and 3B, the taxa with inconsistent correlation signs (far left on the x-axis) have only weak median correlations within hosts. To clarify this result, we now mention it in the abstract (lines 52-54): “taxon pairs that had inconsistent correlation signs (either positive or negative) in different hosts always had weak correlations within hosts.” In addition, we have revised the text starting at line 259 to clarify that we do not observe any pairs of taxa that are strongly and inconsistently correlated. This text reads, “First, in support of the idea that ASVs do not exhibit vastly different correlative relationships in different hosts, no taxon pairs were strongly and inconsistently correlated across hosts (Figures 3A and 3B; Figure 3 —figure supplement 1A). Instead, the ASV pairs that had inconsistent correlation signs across hosts always had weak and often non-significant median absolute correlation coefficients within hosts (Figures 3A and 3B)”.

• Your universality score takes continuous correlation strength within individual and proportion of hosts with a majority sign as input. I like it, but wonder if you could capture even more of the variation in your data by also using a continuous measure of cross-sectional correlation consistency? Like additive correlation strength in the majority sign relative to additive correlation strength in the non-majority sign. Just a thought though.

We think this is an interesting idea, but we were concerned that our initial universality score was already a little challenging for readers to understand. We suspect (but did not confirm) that your revised score might show qualitatively similar patterns to the original score, and in the end, we did not include this suggestion in the revised paper.
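To make the two candidate scores concrete, here is a hedged sketch. It assumes the original universality score is the product of the proportion of hosts sharing the majority sign and the median absolute correlation across hosts; this exchange names the two inputs but not how they are combined, so the product is an assumption. The second function is one possible reading of the reviewer's "additive correlation strength" idea. Both function names are hypothetical.

```python
from statistics import median


def universality_score(correlations):
    """Assumed form: (fraction of hosts with the majority sign) x (median |r|).

    correlations: one correlation estimate per host for a single taxon pair.
    """
    n_pos = sum(r > 0 for r in correlations)
    n_neg = sum(r < 0 for r in correlations)
    majority_fraction = max(n_pos, n_neg) / len(correlations)
    median_strength = median(abs(r) for r in correlations)
    return majority_fraction * median_strength


def weighted_sign_consistency(correlations):
    """One reading of the reviewer's suggestion: summed correlation strength
    in the majority direction as a share of total summed strength."""
    pos = sum(r for r in correlations if r > 0)
    neg = sum(-r for r in correlations if r < 0)
    total = pos + neg
    return max(pos, neg) / total if total > 0 else 0.0


# Toy example: four hosts, three positive correlations and one negative.
toy_correlations = [0.5, 0.5, 0.5, -0.5]
score = universality_score(toy_correlations)
consistency = weighted_sign_consistency(toy_correlations)
```

The difference between the two measures is that the first counts hosts by sign regardless of strength, whereas the second lets a few hosts with strong correlations in the minority direction pull the consistency measure down more than many hosts with near-zero correlations would.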

• Lines 143-146, you could emphasize that if taxa covariation is driven by selection imposed by the host/environmental, then we would expect phylogenetically or phenotypically similar taxa to be positively covarying. If, on the other hand, covariation patterns were more driven by ecological interactions between taxa, we might expect positive covariation to be not more common in phylogenetically close taxa or less common based on competitive exclusion. Or is there some evidence that phylogenetically close taxa cross-feed more with each other or such?

We have updated this prediction (line 151) to read: “Third, we expected to observe positive correlations between taxa that are close phylogenetic relatives. This is because related bacteria may have similar functional properties and hence similar ecological relationships with other members of the community. They may also have dynamics that are driven by similar selective forces imposed by the host or host’s environment. Alternatively, competitive exclusion may lead closely related taxa to exhibit neutral or negative relationships.”

• Lines 254-255, you write "Note, that the correlation strength for a given pair of ASVs was only weakly predicted by bacterial abundance " – Does this mean it was mostly driven by co-occurrence or that the covariation in abundances was sensitive to overall abundance? I guess the latter. More clarity would be good.

Your interpretation is correct. We meant that covariation is sensitive to overall abundance but the effect is weak. We clarified the text starting at line 271 to read, “Note, that the correlation for a given pair of ASVs was only weakly predicted by bacterial abundance (r=0.129 and r=0.223 for the more and less abundant partner in a pair respectively; p < 0.0001 both). While this effect was statistically significant, it explained only 6% of the variance in median correlation.”

• Line 406, you write " Universality in Amboseli is not solely explained by seasonality or synchrony " – I think this is a bit of a manipulative title. There is quite a bit of evidence there for seasonality and synchrony and other evidence for environmental or host physiology-related selection driving covariation patterns (such as the fact that positive covariation is more common in phylogenetically close pairs). I feel like someone else could have formulated these results by downplaying the ecological relationships notion and emphasizing the selective effects notion. There is a bit of a tone here like you would prefer the ecological network effect over the environmentally driven covariation. I suggest rewording this to be a bit more neutral, such as "Universality is partially explained by seasonality and synchrony". And also mention that there may be other selective effects (like those related to individual variation in host physiology?) that you did not test but might feed into the selective effects driving covariation.

These comments prompted us to give a more even hand to both explanations for our data. We have revised the sub-title for this section, which now reads, “Universality in Amboseli is not well explained by microbes’ shared responses to diet, season, or synchronized dynamics”. We also clarify in line 430 that “without experiments, we cannot disentangle whether our observed bacterial correlations are due to ecological interactions between bacterial species or to shared responses to environmental gradients, either inside or outside the host.” We agree that host environments could play a big role in the patterns we see. This is noted in line 433 and discussed in a new paragraph in the discussion, which starts at line 552. In this section, we state that environmental differences between hosts in the host gut are likely to explain, at least in part, the observations that close genetic relatives have similar dynamics and that the most consistent ASV-level correlations were between phylogenetically related taxa.

• Lines 465-467: I am not entirely convinced that the lack of similar patterns in the Johnson data set is likely explained by the different sampling frequencies. Was there much less temporal variation in the Johnson data set? To back up the statement that higher sampling frequency would be the reason the Johnson data set has dissimilar covariation between taxa compared to yours, perhaps you could show that the temporal variation in this data set was different from the baboon one and show that these covariation patterns were sensitive to timescale by subsampling either data set to create mock data sets with different sampling frequencies and see how this would change the inference of ecological associations. In general, I would tone down the conclusions about generalizability to humans a bit, since only one of your data sets showed this, and it is in infants, who have an ecologically more unstable microbiome than adult humans.

Briefly, in the Discussion in line 619, we now clarify that it is not possible to subsample Johnson et al. [7] to monthly scales because the data set is only 17 days long.

• Lines 540-554. Can you clarify why exactly environmental variation should decrease the universality of ecological associations? I would imagine that environmental variation can expand the space of microbial covariation and if universality is driven by covariation due to environmental selection, then this should be maximal when there is broader space for environmental variation to exist. You mentioned in the intro that "genotype by environment interactions, and priority effects-can lead microbiome taxa to fill different ecological roles in different hosts", could you explain a bit more somewhere how this translates to more environmental variation leading to less clear covariation between taxa?

This paragraph of the discussion has been edited. The text starting at line 599 now states, “This outcome surprised us: because the baboons all live in the same environment and are presumably colonized by similar bacterial strains from that environment, we expected that ecological selection and shared strain functionality should lead to stronger universality in bacteria correlation patterns compared to human infants sampled from different households and who were probably colonized by different strains.”

We have also updated the text in the discussion that mentions the role of genotype by environment interactions and how they might lead to personalized covariation between taxa. The text starting at line 87 states, “For instance, several common community and evolutionary processes—such as horizontal gene transfer and priority effects—can lead microbiome taxa to fill different ecological roles in different hosts [8-13]. Further, genotype by environment interactions and plasticity could lead some microbes to adopt context-dependent metabolisms and ecological roles depending on their microbial neighbors or other aspects of the environment [14-17].”

• Lines 575-576 What about individual variation in host physiology?

This sentence is no longer in the paper.

• Line 633 How much was the sparsity reduced?

Please see our response to Essential revision 1 above

• Line 643 Seems very cool but I cannot fully critically evaluate the statistical robustness of this modeling framework

We have added a few lines starting at line 677 to place our approach in context. Essentially, our model resembles several published methods for modeling microbial time series data. There are three key features from our perspective: the use of log-ratios, the use of a state space model, and the Gaussian process component. Log-ratios have occasionally been used to model compositional data in recent years [18]. State space models are useful for modeling a dynamic process that is observed only after the introduction of some measurement [e.g., 19]. Finally, we use the Gaussian process in our state space framework to help contend with irregularity in the sampling of our data. Rather than evolving in discrete jumps from one time point to the next, the Gaussian process allowed us to model the change in microbial log-ratio abundances as flowing smoothly through interruptions in observation. Other authors have made essentially the same choice, as in Äijö et al.’s TGP-CODA model [20]. In all, we suspect the results in this paper are robust to a variety of modeling decisions. A simple centered log-ratio ARIMA model, for example, yielded very similar estimates of ASV-ASV correlations (Figure 4 —figure supplement 1C).
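For readers unfamiliar with log-ratios, the centered log-ratio (CLR) transform mentioned above can be sketched in a few lines. This is a generic illustration rather than the authors' implementation; the function name `clr` and the pseudocount used to handle zero counts are assumptions.

```python
from math import log


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's counts.

    Each value is the log of the (pseudocount-adjusted) count minus the
    mean log across taxa, i.e. the log of the count over the geometric mean.
    The pseudocount is an illustrative way to avoid log(0).
    """
    logs = [log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Toy sample with one zero count (illustrative values only).
sample_counts = [120, 30, 0, 850]
clr_values = clr(sample_counts)
```

Two properties make the transform useful for compositional data: the transformed values sum to zero within each sample, and, when no pseudocount is needed, multiplying every count by the same sequencing-depth factor leaves the result unchanged.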

Reviewer #3 (Recommendations for the authors):

• Good abstract, presentation, and introduction.

• Figure 2: perhaps mark in panel A what the threshold for significant positive/negative correlations was.

Because there is no threshold above which the correlations are all statistically significant, there is no easy way to represent significance on the heat map in Figure 2A. However, we have added a new supplementary figure (Figure 2—figure supplement 2) that shows the same data as the heat map in Figure 2A, but with correlations that were not significant at FDR < 0.05 blacked out.
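The masking described above can be sketched as follows. The response does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, as are the function names and the example values. The sketch also shows why no single correlation cutoff exists: significance is decided per pair from its adjusted p-value, not from the size of the correlation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def mask_nonsignificant(correlations, p_values, alpha=0.05):
    """Replace correlations whose adjusted p-value is >= alpha with None."""
    adjusted = benjamini_hochberg(p_values)
    return [r if q < alpha else None for r, q in zip(correlations, adjusted)]

# Hypothetical correlations and unadjusted p-values for four taxon pairs.
corr = [0.62, -0.05, -0.41, 0.10]
pvals = [0.001, 0.60, 0.012, 0.20]
masked = mask_nonsignificant(corr, pvals)
```

Entries set to None correspond to the cells that would be blacked out in the heat map.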

• Positive correlation – as you note in several places – can be explained in (at least) two ways: the species have positive relations in some way (e.g. one is providing something necessary for the other), or the two simply like to be in the same kind of habitat, so when it is good for one it's also good for the other. You are aware of this, as both possibilities are mentioned in several places, but it seems that sometimes you choose to offer one and sometimes the other, with no clear reason (e.g. you propose that correlations at the phylum level are due to environmental preferences – lines 217-219 – but this explanation is in contrast to the strong emphasis on microbe-microbe interactions that is found throughout).

We have revised the text in several places to make sure we provide both explanations where relevant (see lines 118, 230, 351, etc.). For instance, the text you mention, which was originally at lines 217-219 and is now at line 230, states, “This bias towards negative relationships is consistent with the expectation that neutral or negative relationships between ASVs are more common than mutualisms [21-23] and that more distantly related taxa (e.g., phyla) respond to distinct environmental drivers due to differences in metabolic requirements and lifestyles”. We also have added a much more detailed discussion of how we think these patterns contribute to our data starting at line 552.

• I would suggest having a discussion of these alternative explanations (and perhaps others) early on, and reference to this discussion in later interpretations of findings, throughout the results and Discussion sections.

(you are clearly aware of this, e.g. in line 407; I suggest discussing this topic in the introduction and referring to it throughout. This would help readers who aren't aware of the extensive research/discussion/debate about these questions in microbial ecology, landscape ecology, and elsewhere).

We added a more thorough discussion of these alternatives in the Introduction (line 118 and line 230) and the Discussion (lines 552 to 567).

• A brief mention/clarification (at least) of causality vs. correlation would be a good idea in this context. Even if clear correlations are found between taxa, this doesn't imply causation, of course. Perhaps discuss in future directions the importance of intervention/manipulation studies to test for causation.

We added this idea to line 117, which now states, “…correlation cannot be used to infer causality, and in the absence of experiments, we cannot differentiate whether microbial correlation patterns arise from ecological interactions (e.g., competition, predation, facilitation) or shared responses to the environment.” Line 430 now also reads, “Without experiments, we cannot disentangle whether our observed bacterial correlations are due to ecological interactions between bacterial species or to shared responses to environmental gradients, either inside or outside the host.”

• There's quite a large literature in ecology, particularly microbial ecology, that deals with the link between pairwise interactions between bacteria within a larger consortium of species, and whether inferences can be made from pairwise interactions to more complex scenarios; consider referring to some of this literature and perhaps offering a discussion of your results in light of the insights proposed there. Some such studies (I'm not from the field, there may be better ones) are:

https://www.nature.com/articles/nature22898

https://onlinelibrary.wiley.com/doi/full/10.1111/ele.13211

https://www.nature.com/articles/s41559-017-0109

Also, have a look at one or two possibly relevant studies by Andrew Letten.

Thank you for pointing out the literature about this topic, and we agree with the value of citing it. We now cite several of these studies and briefly discuss the importance of future work that connects pairwise bacterial interactions to the emergent properties of the microbial community (see line 628).

• A possible interpretation of the finding that correlations, when they exist, tend to be positive: if the driver of significant correlations is the environment, and not positive species' interactions, then this observation might be expected: pairs of species that share environmental preferences will be positively correlated, and pairs of species that prefer different environments would be uncorrelated (and not negatively correlated).

In other words: there is only one way in which environmental preferences can be similar, but many ways in which two environmental preferences can differ (and also an environment is similar to itself in all dimensions, but there are many dimensions in which two environments can differ). "All happy families are alike, but every unhappy family is unhappy in its own way (Leo Tolstoy, Anna Karenina, 1878)".

In a sense, this observation should thus perhaps be viewed as support of the hypothesis that the driver of the positive correlations you find is shared environmental preferences and not species-species interactions. I think. Consider.

Thank you for this interesting set of ideas! As mentioned above, we have added new emphasis to the idea that environmental preferences could be playing an important role in the patterns we observe, as an alternative to species-species interactions (see new discussion starting in line 351)—while also cautioning that our data set cannot unambiguously differentiate these possibilities (e.g., text starting at line 430). However, given our revised filtering criteria, we also no longer find a bias towards positive correlations in the most universal taxa (we still find that, overall, most correlations are negative).

• 545-555: If true, the positive correlations are due to shared preferences of environment, it perhaps makes sense that the children dataset, in which children differ quite a bit (more than pairs of baboons), shows a strong signal: the fact that children are different should create high diversity in the overall dataset, and when two children happen to be similar in the conditions they create in their guts – this (and the respective positive correlations between pairs of species that like these specific conditions) would stand out particularly significantly above all this noise. Maybe. This requires some deeper thought, so consider. ((this may be analogous to assessing heritability of traits – heritability seems to decrease – sometimes to the point of being non-significant/below detection level – in a homogenic population, and heritability estimates are higher when the population is diverse))

After correcting for the zero inflation in our data, we no longer see the bias towards positive relationships in the most universal correlation patterns. For this reason, we have not included the suggested ideas, but we agree that if correlated relationships are caused by environmental gradients, and if infants experience more environmental differences than baboons, it might be easier to detect significant correlations in an infant data set. However, the correlation patterns do not seem substantially different between baboons and infants, so we have chosen not to go there.

• 572 – 576 (starting with "We surmise that most") – I would be more cautious about this statement.

I tend to think that the driver of the correlation universality in your data is shared environmental preferences, and – apart from the point I made above – I think this is also particularly likely in light of the phylogenetic signal that you found (it makes sense that phylogenetically related species have similar environmental preferences, stemming from homology; this seems to me more parsimonious compared to the possibility that related species tend to be more supportive of one another for some reason, even though I can come up with some handwaving explanations that could support this if I really had to).

The "environment" in question is the one in the gut. Thus, controlling for diet or seasonal drivers is good, but far from ruling out that there are shared environments that are driving the signal; for that, you'd need to control for the extent to which pairs of host individuals tended to have more similar pH, hormonal status, immune activation (and its profile) and so on.

We agree. These ideas are discussed in other responses above, but briefly, we now state in the discussion, starting at line 559, that “our approach did not account for important environmental gradients within the gut, such as host immune profiles and intestinal pH. These factors also shape microbiome composition [e.g., 24, 25, 26], and can lead to shared abundance correlations between hosts even if hosts themselves differ. Ecological selection via within-host environments may explain our finding that genetic relatives share somewhat similar bacterial correlation patterns. Ecological selection is also consistent with our observation that the most consistent ASV-level correlations are between phylogenetically related taxa, and these patterns were strongest for positively associated taxon pairs. In support, phylogenetically related species have been shown to have similar environmental preferences [27].”

• 589: There seems to be a problem with this sentence. Look at the "the fact that…" – seems like something is missing.

This sentence has been revised to read, “Following recommended statistical practices [28], samples were not rarefied, but counts were agglomerated and transformed to additive log-ratios (ALR). Variation in sampling depth and relative abundance were modeled by the method described in a subsequent section.”

• Methods: I'd elaborate a bit further about the sequencing, e.g. whether you rarefied samples or accounted for uneven read counts in another way, and which 16s regions were amplified (and/or what their length was – amplifying just V3, for example, would lead to a very different ASV resolution from amplifying V3+V4).

We have expanded these sections of the methods. In line 648, we state “Following recommended statistical practices [28], samples were not rarefied, but counts were agglomerated and transformed to additive log-ratios (ALR). Variation in sampling depth and relative abundance were modeled by the method described in a subsequent section.”
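For readers unfamiliar with the additive log-ratio, a minimal sketch follows. It is not the authors' pipeline: the function name, the choice of the last taxon as the reference, the pseudocount used to avoid taking the log of zero, and the example counts are all assumptions made for illustration.

```python
import math

def additive_log_ratio(counts, reference_index=-1, pseudocount=0.5):
    """Return the ALR coordinates of one sample's taxon counts.

    Each non-reference taxon becomes log(count / reference count).
    The pseudocount is an illustrative way to avoid log(0), not the
    paper's method for handling zeros.
    """
    shifted = [c + pseudocount for c in counts]
    ref = reference_index % len(shifted)
    log_ref = math.log(shifted[ref])
    return [math.log(x) - log_ref for i, x in enumerate(shifted) if i != ref]

# Hypothetical counts for four taxa in one sample; the last is the reference.
sample = [120, 30, 0, 60]
alr = additive_log_ratio(sample)
```

With no pseudocount, multiplying every count in a sample by the same factor leaves the ALR values unchanged, which is one reason log-ratios are used in place of rarefaction when sequencing depth varies between samples.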

To clarify our sequencing approach, we state in line 638 that, “The microbiome compositional profiles are derived from PCR amplification of a ~390 bp-long fragment that encompassed the V4 region of the 16S rRNA gene using primers 515F – 806R [29].”

References

  • Silverman JD, Roche K, Holmes ZC, David LA, Mukherjee S. Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes. Journal of Machine Learning Research. 2022;23:1-42.

  • Quinn TP, Richardson MF, Lovell D, Crowley TM. propr: An R-package for Identifying Proportionally Abundant Features Using Compositional Data Analysis. Scientific Reports. 2017;7:16252.

  • Cao Y, Lin W, Li H. Large covariance estimation for compositional data via composition-adjusted thresholding. J Am Stat Assoc. 2019:759-72.

  • Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687. doi: 10.1371/journal.pcbi.1002687. PubMed PMID: 23028285; PubMed Central PMCID: PMCPMC3447976.

  • Risely A, Schmid DW, Muller-Klein N, Wilhelm K, Clutton-Brock TH, Manser MB, et al. Gut microbiota individuality is contingent on temporal scale and age in wild meerkats. Proc Biol Sci. 2022;289(1981):20220609. Epub 20220817. doi: 10.1098/rspb.2022.0609. PubMed PMID: 35975437; PubMed Central PMCID: PMCPMC9382201.

  • Wilmanski T, Diener C, Rappaport N, Patwardhan S, Wiedrick J, Lapidus J, et al. Gut microbiome pattern reflects healthy ageing and predicts survival in humans. Nat Metab. 2021;3(2):274-86. Epub 20210218. doi: 10.1038/s42255-021-00348-0. PubMed PMID: 33619379; PubMed Central PMCID: PMCPMC8169080.

  • Johnson AJ, Vangay P, Al-Ghalith GA, Hillmann BM, Ward TL, Shields-Cutler RR, et al. Daily Sampling Reveals Personalized Diet-Microbiome Associations in Humans. Cell Host and Microbe. 2019;25(6):789-802. Epub 2019/06/14. doi: 10.1016/j.chom.2019.05.005. PubMed PMID: 31194939.

  • Franzosa EA, Huang K, Meadow JF, Gevers D, Lemon KP, Bohannan BJM, et al. Identifying personal microbiomes using metagenomic codes. Proceedings of the National Academy of Sciences. 2015;112(22):E2930-E8. doi: 10.1073/pnas.1423854112. PubMed PMID: WOS:000355832200014.

  • Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term stability of the human gut microbiota. Science. 2013;341(6141):1237439. Epub 2013/07/06. doi: 10.1126/science.1237439. PubMed PMID: 23828941; PubMed Central PMCID: PMC3791589.

  • Bik EM, Costello EK, Switzer AD, Callahan BJ, Holmes SP, Wells RS, et al. Marine mammals harbor unique microbiotas shaped by and yet distinct from the sea. Nat Commun. 2016;7:10516. Epub 20160203. doi: 10.1038/ncomms10516. PubMed PMID: 26839246; PubMed Central PMCID: PMCPMC4742810.

  • Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, et al. Moving pictures of the human microbiome. Genome Biology. 2011;12(5):R50. doi: 10.1186/gb-2011-12-5-r50. PubMed PMID: ISI:000295732700014.

  • Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009;326(5960):1694-7. doi: 10.1126/science.1177486. PubMed PMID: ISI:000272839000053.

  • Dolinsek J, Goldschmidt F, Johnson DR. Synthetic microbial ecology and the dynamic interplay between microbial genotypes. Fems Microbiology Reviews. 2016;40(6):961-79. doi: 10.1093/femsre/fuw024. PubMed PMID: WOS:000387995000010.

  • Louca S, Polz MF, Mazel F, Albright MBN, Huber JA, O'Connor MI, et al. Function and functional redundancy in microbial systems. Nat Ecol Evol. 2018;2(6):936-43. Epub 2018/04/18. doi: 10.1038/s41559-018-0519-1. PubMed PMID: 29662222.

  • Rainey PB, Quistad SD. Toward a dynamical understanding of microbial communities. Philos Trans R Soc Lond B Biol Sci. 2020;375(1798):20190248. Epub 2020/03/24. doi: 10.1098/rstb.2019.0248. PubMed PMID: 32200735; PubMed Central PMCID: PMCPMC7133524.

  • Martiny JB, Jones SE, Lennon JT, Martiny AC. Microbiomes in light of traits: A phylogenetic perspective. Science. 2015;350(6261):aac9323. doi: 10.1126/science.aac9323. PubMed PMID: 26542581.

  • Debray R, Herbert RA, Jaffe AL, Crits-Christoph A, Power ME, Koskella B. Priority effects in microbiome assembly. Nat Rev Microbiol. 2022;20(2):109-21. Epub 20210827. doi: 10.1038/s41579-021-00604-w. PubMed PMID: 34453137.

  • Gloor GB, Reid G. Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data. Can J Microbiol. 2016;62(8):692-703. Epub 2016/06/18. doi: 10.1139/cjm-2015-0821. PubMed PMID: 27314511.

  • Joseph TA, Pasarkar AP, Pe'er I. Efficient and Accurate Inference of Mixed Microbial Population Trajectories from Longitudinal Count Data. Cell Syst. 2020;10(6):463-9 e6. Epub 20200624. doi: 10.1016/j.cels.2020.05.006. PubMed PMID: 32684275.

  • Aijo T, Muller CL, Bonneau R. Temporal probabilistic modeling of bacterial compositions derived from 16S rRNA sequencing. Bioinformatics. 2018;34(3):372-80. doi: 10.1093/bioinformatics/btx549. PubMed PMID: 28968799; PubMed Central PMCID: PMCPMC5860357.

  • Coyte KZ, Rao C, Rakoff-Nahoum S, Foster KR. Ecological rules for the assembly of microbiome communities. PLoS Biol. 2021;19(2):e3001116. Epub 20210219. doi: 10.1371/journal.pbio.3001116. PubMed PMID: 33606675; PubMed Central PMCID: PMCPMC7946185.

  • Coyte KZ, Schluter J, Foster KR. The ecology of the microbiome: Networks, competition, and stability. Science. 2015;350(6261):663-6. doi: 10.1126/science.aad2602. PubMed PMID: 26542567.

  • Palmer JD, Foster KR. Bacterial species rarely work together. Science. 2022;376(6593):581-2. Epub 20220505. doi: 10.1126/science.abn5093. PubMed PMID: 35511986.

  • Reese AT, Pereira FC, Schintlmeister A, Berry D, Wagner M, Hale LP, et al. Microbial nitrogen limitation in the mammalian large intestine. Nat Microbiol. 2018. Epub 2018/10/31. doi: 10.1038/s41564-018-0267-7. PubMed PMID: 30374168.

  • Firrman J, Liu L, Mahalak K, Tanes C, Bittinger K, Tu V, et al. The impact of environmental pH on the gut microbiota community structure and short chain fatty acid production. FEMS Microbiol Ecol. 2022;98(5). doi: 10.1093/femsec/fiac038. PubMed PMID: 35383853.

  • de Vos WM, Tilg H, Van Hul M, Cani PD. Gut microbiome and health: mechanistic insights. Gut. 2022;71(5):1020-32. Epub 20220201. doi: 10.1136/gutjnl-2021-326789. PubMed PMID: 35105664; PubMed Central PMCID: PMCPMC8995832.

  • Tamames J, Sanchez PD, Nikel PI, Pedros-Alio C. Quantifying the Relative Importance of Phylogeny and Environmental Preferences As Drivers of Gene Content in Prokaryotic Microorganisms. Front Microbiol. 2016;7:433. Epub 20160331. doi: 10.3389/fmicb.2016.00433. PubMed PMID: 27065987; PubMed Central PMCID: PMCPMC4814473.

  • Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: and this is not optional. Front Microbiol. 2017;8:2224. Epub 2017/12/01. doi: 10.3389/fmicb.2017.02224. PubMed PMID: 29187837; PubMed Central PMCID: PMCPMC5695134.

  • Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences. 2011;108:4516-22. doi: 10.1073/pnas.1000080107. PubMed PMID: ISI:000288451300002.

Associated Data

    Data Citations

    1. University of California San Diego Microbiome Initiative 2021. 16S rRNA gene sequencing data from baboon gut microbiomes collected between 2000 and 2014. European Nucleotide Archive. ERP119849
    2. Grieneisen L, Dasari M, Gould TJ, Björk JR, Grenier J, Yotova V, Jansen D, Gottel N, Gordon JB, Learn NH, Gesquiere LR, Wango TL, Mututua RS, Warutere JK, Siodi L, Gilbert JA, Barreiro LB, Alberts SC, Tung J, Archie EA, Blekhman R. 2021. Gut microbiome heritability is nearly universal but environmentally contingent. Qiita. 12949
    3. Vatanen T, Kostic A, d'Hennezel E. 2016. DIABIMMUNE three country cohort. NCBI BioProject. PRJNA290380
    4. Johnson AJ. 2019. Johnson et al. dietary cohort. European Nucleotide Archive. PRJEB29065

    Supplementary Materials

    Supplementary file 1. Supplementary tables.
    elife-83152-supp1.xlsx (65.6KB, xlsx)
    MDAR checklist

    Data Availability Statement

    16S rRNA gene sequences are available on EBI-ENA (project ERP119849) and Qiita (study 12949). Analyzed data and code are available on GitHub at: https://github.com/kimberlyroche/rulesoflife (copy archived at Roche, 2023).

    The following datasets were generated:

    University of California San Diego Microbiome Initiative 2021. 16S rRNA gene sequencing data from baboon gut microbiomes collected between 2000 and 2014. European Nucleotide Archive. ERP119849

    Grieneisen L, Dasari M, Gould TJ, Björk JR, Grenier J, Yotova V, Jansen D, Gottel N, Gordon JB, Learn NH, Gesquiere LR, Wango TL, Mututua RS, Warutere JK, Siodi L, Gilbert JA, Barreiro LB, Alberts SC, Tung J, Archie EA, Blekhman R. 2021. Gut microbiome heritability is nearly universal but environmentally contingent. Qiita. 12949

    The following previously published datasets were used:

    Vatanen T, Kostic A, d'Hennezel E. 2016. DIABIMMUNE three country cohort. NCBI BioProject. PRJNA290380

    Johnson AJ. 2019. Johnson et al. dietary cohort. European Nucleotide Archive. PRJEB29065


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
