Author manuscript; available in PMC: 2024 Mar 20.
Published in final edited form as: Cell Rep. 2023 Oct 28;42(11):113233. doi: 10.1016/j.celrep.2023.113233

Hadza Prevotella require diet-derived microbiota-accessible carbohydrates to persist in mice

Rebecca H Gellman 1, Matthew R Olm 1, Nicolas Terrapon 2, Fatima Enam 1, Steven K Higginbottom 1, Justin L Sonnenburg 1,3,4,*, Erica D Sonnenburg 1,4,5,*
PMCID: PMC10954246  NIHMSID: NIHMS1948171  PMID: 38510311

SUMMARY

Industrialization has transformed the gut microbiota, reducing the prevalence of Prevotella relative to Bacteroides. Here, we isolate Bacteroides and Prevotella strains from the microbiota of Hadza hunter-gatherers in Tanzania, a population with high levels of Prevotella. We demonstrate that plant-derived microbiota-accessible carbohydrates (MACs) are required for persistence of Prevotella copri but not Bacteroides thetaiotaomicron in vivo. Differences in carbohydrate metabolism gene content, expression, and in vitro growth reveal that Hadza Prevotella strains specialize in degrading plant carbohydrates, while Hadza Bacteroides isolates use both plant and host-derived carbohydrates, a difference mirrored in Bacteroides from non-Hadza populations. When competing directly, P. copri requires plant-derived MACs to maintain colonization in the presence of B. thetaiotaomicron, as a no-MAC diet eliminates P. copri colonization. Prevotella’s reliance on plant-derived MACs and Bacteroides’ ability to use host mucus carbohydrates could explain the reduced prevalence of Prevotella in populations consuming a low-MAC, industrialized diet.

In brief

Gellman et al. present a set of Bacteroides and Prevotella isolates from the Hadza microbiota. The results of whole-genome sequencing, gnotobiotic mouse models, and RNA sequencing show that P. copri relies on the presence of dietary microbiota-accessible carbohydrates (MACs) to persist in the gut microbiota.

INTRODUCTION

The industrialized lifestyle is defined by the consumption of highly processed foods, high rates of antibiotic administration, cesarean section births, sanitation of the living environment, and reduced contact with animals and soil, all of which can affect the human gut microbiota.1 Certain taxa are influenced by industrialization; i.e., they are prevalent and abundant in non-industrialized populations and diminished or absent in industrialized populations, or vice versa.2–8 The microbiota of 1,000- to 2,000-year-old North American paleofeces is more similar to the modern non-industrialized than industrialized gut.9 The industrialized microbiota appears to be a product of both microbial extinction, as once-dominant taxa disappear, and expansion of less-dominant or new taxa.10

The industrialized diet differs drastically from non-industrialized diets, including a reduced amount of microbiota-accessible carbohydrates (MACs), a major metabolic input for microbes in the distal gastrointestinal tract.10–12 Some gut-resident microbes use host mucin, which is heavily glycosylated, as a carbon source, depending on the availability of dietary MACs.13–17 Shifts in dietary MACs alter microbial relative abundances and may increase inflammation and susceptibility to intestinal pathogens.14,18,19 Taxa are lost due to a lack of dietary MACs over generations in a mouse model20 and in humans as they immigrate to the US.7

As human populations adopt an industrialized lifestyle, the prevalence of Prevotella decreases and that of Bacteroides increases.2,3,21 These genera are both members of the Bacteroidota phylum, are known to colonize mammalian hosts, and make up a significant fraction of the human gut microbiome.22–24 While Bacteroides are well studied, Prevotella species remain understudied with few tools available for mechanistic investigation.25–28 Both genera harbor well-documented carbohydrate utilization capabilities, encoded in carbohydrate active enzymes (CAZymes), often organized into polysaccharide utilization loci (PULs).29–31 Characterization of intestinal Prevotella species has been limited by challenges with colonization, particularly mono-colonization of germ-free mice. Here, we overcome these barriers to establish a causal link between diet and Prevotella copri abundance in a gnotobiotic mouse model.

The decreased prevalence of Prevotella in industrial populations is likely linked to a decline in relative abundance within individual microbiotas.32 Decreased abundance of bacterial taxa in individuals reduces the likelihood of transmission from mother to infant.1,5 When compounded over generations, decreased abundance can result in population-level decline in prevalence and eventually taxa loss or extinction.7 The factors driving the decline in Prevotella and the increase in Bacteroides during industrialization remain elusive. The abundance and prevalence of specific strains of P. copri, the dominant Prevotella species in the human gut, vary among populations based on host lifestyle, particularly diet.33,34 Here, we use gnotobiotic mice to investigate the role of diet in sustaining Prevotella and Bacteroides colonization; we demonstrate that dietary MACs play a key role in controlling the abundances of Bacteroides and Prevotella.

RESULTS

Bacteroides and Prevotella genomes from the Hadza microbiota vary in prevalence across lifestyle

To compare Prevotella and Bacteroides from non-industrialized lifestyle populations, we isolated and sequenced six Bacteroides strains and seven Prevotella strains from stool samples collected from 13 Hadza individuals. Single-isolate genomes were assembled using both MiSeq-generated short reads (146 bp) and nanopore-generated long reads (10–100 kb) (Table 1).

Table 1. Bacterial strains used

Strain name | Genus | Species | Strain | Origin | Specific origin | Genome size (bp) | Genome size (Mb) | Number of genes | Number of contigs
Bt H-2209 | Bacteroides | thetaiotaomicron | H-2209 | Hadza | human feces | 6,119,319 | 6.119319 | 4,722 | 1
Bt H-2622 | Bacteroides | thetaiotaomicron | H-2622 | Hadza | human feces | 6,037,034 | 6.037034 | 4,686 | 1
Bc H-1617 | Bacteroides | caccae | H-1617 | Hadza | human feces | 5,112,756 | 5.112756 | 4,360 | 1
Bf H-2631 | Bacteroides | fragilis | H-2631 | Hadza | human feces | 4,877,774 | 4.877774 | 4,622 | 3
Bo H-1813 | Bacteroides | ovatus | H-1813 | Hadza | human feces | 6,715,646 | 6.715646 | 5,536 | 4
Bo H-2495 | Bacteroides | ovatus | H-2495 | Hadza | human feces | 6,789,892 | 6.789892 | 5,086 | 1
Bt VPI 5482 | Bacteroides | thetaiotaomicron | VPI 5482 | reference | human feces, unknown nationality | 6,260,000 | 6.26 | 5,108 | 1
Bc ATCC 43185 | Bacteroides | caccae | ATCC 43185 | reference | human feces, Texas | 5,280,000 | 5.28 | 4,399 | 1
Bf NCTC 9343 | Bacteroides | fragilis | NCTC 9343 | reference | human appendix abscess, UK | 5,190,000 | 5.19 | 4,194 | 2
Bo ATCC 8483 | Bacteroides | ovatus | ATCC 8483 | reference | human feces, unknown nationality | 6,470,000 | 6.47 | 4,896 | 1
Pc DSM 18205 | Prevotella | copri | DSM 18205 | reference | human feces, Japan | 3,510,000 | 3.51 | 2,968 | 1
Pc H-2379 | Prevotella | copri | H-2379 | Hadza | human feces | 4,350,632 | 4.350632 | 3,611 | 1
Pc H-2383 | Prevotella | copri | H-2383 | Hadza | human feces | 4,506,031 | 4.506031 | 3,836 | 1
Pc H-2446 | Prevotella | copri | H-2446 | Hadza | human feces | 4,057,255 | 4.057255 | 3,383 | 3
Pc H-2477 | Prevotella | copri | H-2477 | Hadza | human feces | 4,111,062 | 4.111062 | 3,405 | 1
Pc H-2489 | Prevotella | copri | H-2489 | Hadza | human feces | 4,115,122 | 4.115122 | 3,563 | 1
Pc H-2497 | Prevotella | copri | H-2497 | Hadza | human feces | 4,081,238 | 4.081238 | 3,548 | 1
Pc H-2632 | Prevotella | copri | H-2632 | Hadza | human feces | 3,850,424 | 3.850424 | 3,251 | 4
Pc N-01 | Prevotella | copri | N-01 | non-Hadza | human feces, USA | 4,057,390 | 4.05739 | 3,880 | 1
Pc YF2 | Prevotella | copri | YF2 | reference | unknown | 3,860,000 | 3.86 | 3,060 | 2

The taxonomy of these newly isolated strains was evaluated using the Genome Taxonomy Database (GTDB) version r207 (Figure 1A). All Bacteroides isolates belong to known species: Bacteroides ovatus, Bacteroides thetaiotaomicron, Bacteroides caccae, and Bacteroides fragilis. Three of our Prevotella isolates belong to named species Prevotella sp015074785 and Prevotella sp900551275, while the remaining five isolates are novel species according to GTDB. To verify this finding, we created a phylogenetic tree with our Prevotella isolates, representatives of the most closely related representative species in GTDB, and all P. copri representative genomes in GTDB (Figure S1A; Table S1). We observe that our isolated genomes have approximately the same phylogenetic distance to the closest representative genomes as the representative genomes have to one another, supporting their characterization as novel species. Apart from GTDB, there have been other efforts to characterize the extensive genomic diversity of the Prevotella genus.34 Of the four proposed P. copri subgroups possessing >10% inter-clade genetic divergence, all eight Hadza Prevotella strains recovered in this study belong to clade A (Figure S1B).

Figure 1. Hadza Bacteroides and Prevotella strains are related to previously sequenced isolates and vary in prevalence across populations.

(A) Phylogenetic tree of Prevotella and Bacteroides genomes. Isolates from this study (red) and genomes from GenBank (black), strains used later in this study (bold).

(B) Prevotella and Bacteroides subspecies prevalence across foragers (Hadza and Chepang), agriculturalists (Rau, Raj, Tharu), or industrialized (California). Population size for each group is denoted as n. Prevalence defined as percentage of gut metagenomes from a population (column) in which a particular strain (row) is detected. Gray triangles indicate whole genomes isolated in this study, aligned with (A).

To understand the prevalence of these genomes across human populations, we compared Prevotella and Bacteroides prevalence among Hadza adults and infants, four populations from Nepal living on a lifestyle gradient including foraging (Chepang), recent agriculturalist (Raute, Raji), longer term agriculturalist (Tharu), and industrial lifestyle populations (California) (Figure 1B; Table S2). We chose these groups due to their varied lifestyles and the exceptional metagenomic sequencing depth achieved, averaging 23 Gbp per sample.3,4 Prevotella genomes are rare in or absent from the industrialized populations, while they are more prevalent and abundant in the Hadza and Nepali samples. Conversely, nearly all Bacteroides genomes, including those isolated from the Hadza, are more prevalent in the California samples. The clear lifestyle shift associated with Bacteroides and Prevotella prevalence leads to the question of what aspects of the industrial lifestyle have driven these changes.
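The prevalence measure used in Figure 1B (the percentage of gut metagenomes from a population in which a given genome is detected) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the sample IDs, genome labels, and the `prevalence` helper are hypothetical, and the detection step itself (deciding whether a genome is present in a metagenome) is assumed to have been done already.

```python
from typing import Dict, Set


def prevalence(detections: Dict[str, Set[str]], genome: str) -> float:
    """Percentage of samples in which `genome` is detected."""
    if not detections:
        raise ValueError("no samples provided")
    hits = sum(1 for found in detections.values() if genome in found)
    return 100.0 * hits / len(detections)


# Hypothetical toy population: sample ID -> genomes detected in that metagenome.
toy_population = {
    "sample_1": {"Pc_A", "Bt_A"},
    "sample_2": {"Pc_A"},
    "sample_3": {"Bt_A"},
    "sample_4": {"Pc_A"},
}

print(prevalence(toy_population, "Pc_A"))  # 3 of 4 samples
print(prevalence(toy_population, "Bt_A"))  # 2 of 4 samples
```

Computing this value per population (column) and per genome (row) would give a matrix of the kind summarized in Figure 1B.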

Dietary MACs are necessary for P. copri persistence

While many factors differentiate the industrial and non-industrial lifestyles, diet serves as the top candidate for driving microbiota alterations.10 The Hadza diet is rich in dietary MACs from foraged tubers, berries, and baobab.35 In contrast, the industrialized diet is typified by high caloric intake and foods rich in fat and low in MACs.36 We wondered whether diet alone could affect the ability of Hadza Bacteroides and Prevotella to colonize mice. Germ-free (GF) mice were colonized with either Hadza B. thetaiotaomicron (Bt) H-2622 or Hadza P. copri (Pc) H-2477. Mice were maintained on a high-MAC diet for 7 days and then, for a further 7 days, either switched to a diet devoid of MACs (no MAC), switched to a high-fat/low-MAC diet (Western), or maintained on the high-MAC diet (Figure 2A). Bt H-2622 colonization density (10^9 colony-forming units [CFU]/mL in feces) at baseline on the high-MAC diet was maintained in all three diet conditions (Figure 2B). Pc H-2477 colonized to a lower degree on the high-MAC diet (10^7 CFU/mL on day 0) and declined drastically following the change to the Western or no-MAC diet, with no fecal CFUs detectable 7 days post diet switch (Figure 2C). The lack of detectable Pc H-2477 in the absence of MACs was particularly striking given the absence of competition from other microbes in this mono-associated state. To our knowledge, this is the first example of a strain’s apparent eradication in a mono-associated state due to a diet change. Two other P. copri strains (Hadza Pc H-2497 and Pc N-01, a non-Hadza strain isolated from an individual of African origin) are also lost in vivo in the absence of dietary MACs (Figures S2A and S2B), indicating that survival of P. copri in vivo depends on the presence of dietary MACs.

Figure 2. Bt and Pc colonization differ in diet-dependent manner.

(A) Schematic of gnotobiotic experiments.

(B and C) Fecal density of Bt H-2622 (B), and Pc H-2477 (C) in monocolonized mice (n = 4/group for high-MAC and Western diets, n = 5/group for no-MAC diet) fed different diets (multiple Mann-Whitney tests, *p < 0.05, **p < 0.01). Representative experiments, repeated twice. Dashed line denotes limit of detection (LOD = 500 CFU/mL). Error bars indicate standard error of the mean (SEM).

(D) Proportion of genes upregulated in vivo in cecal contents of monocolonized mice on high-MAC diet on day seven of the experiment shown in (A), compared to gene expression in culture with PYG in late exponential phase. Genes organized by functional categories (Rapid Annotation using Subsystem Technology, RAST).

(E) Proportion of predicted substrate categories of upregulated CAZymes in cecal contents of monocolonized under high-MAC conditions on day seven of the experiment shown in (A).

To measure the gene expression employed by Hadza Pc and Bt in vivo, we analyzed transcriptional profiling data from cecal contents of mice monocolonized with either Pc H-2477 or Bt H-2622 fed a high-MAC diet relative to in vitro growth in peptone yeast glucose broth (PYG). Both Bt H-2622 and Pc H-2477 upregulate a large number of genes in vivo under high-MAC diet conditions. Despite the fact that 18% and 13% of genes in the Bt H-2622 and Pc H-2477 genomes, respectively, encode for predicted carbohydrate utilization proteins, 86% (in Bt H-2622) and 65% (in Pc H-2477) of genes upregulated in vivo relative to in vitro encode for carbohydrate utilization (p < 4e–12 for Bt, p < 5e–13 for Pc, Fisher’s exact test), indicating that carbohydrate utilization is the major metabolic function of these organisms in vivo (Figure 2D).
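The enrichment comparison above (share of carbohydrate utilization genes in the genome versus among upregulated genes) is the kind of question a Fisher's exact test addresses. The sketch below computes a one-sided p value from the hypergeometric tail using only the standard library. It is an illustration under stated assumptions: the text does not say whether the test was one- or two-sided, and the number of upregulated genes is not given here, so `n_upregulated` and `n_upregulated_carb` are hypothetical. Only the total gene count (Table 1) and the 18% and 86% proportions come from the text, so the result will not reproduce the reported p values.

```python
from math import comb


def fisher_enrichment_p(total: int, category: int, selected: int, overlap: int) -> float:
    """One-sided Fisher's exact p value: probability of seeing at least
    `overlap` category genes among `selected` genes drawn from `total`
    genes, of which `category` belong to the category."""
    upper = min(category, selected)
    numerator = sum(
        comb(category, k) * comb(total - category, selected - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / comb(total, selected)


total_genes = 4686                      # Bt H-2622 gene count (Table 1)
carb_genes = round(0.18 * total_genes)  # 18% of the genome, per the text
n_upregulated = 100                     # hypothetical, for illustration only
n_upregulated_carb = 86                 # hypothetical: 86% of the set above

p_value = fisher_enrichment_p(total_genes, carb_genes, n_upregulated, n_upregulated_carb)
print(f"one-sided p = {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for very small tail probabilities, at the cost of working with large integers.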

A comparison of glycosidic linkage-breaking CAZymes, glycoside hydrolases (GHs) and polysaccharide lyases (PLs), reveals that both Bt H-2622 and Pc H-2477 upregulate more CAZymes in vivo on the high-MAC diet than in vitro (Pc, 71/6 CAZymes significantly expressed in vivo/in vitro; Bt, 244/55 CAZymes significantly expressed in vivo/in vitro) (Figures S2C and S2D). However, Bt H-2622 upregulates a higher proportion of GHs and PLs devoted to animal-derived carbohydrate utilization relative to Pc H-2477 (Figure 2E). Specifically, in vivo under high-MAC diet conditions, Bt H-2622 upregulates eight of 22 encoded mucus-targeted GHs (three out of 10 GH18; five out of 12 GH20), whereas Pc H-2477 encodes no GH18s and only one GH20, which is not upregulated in the high-MAC diet condition (Figures 2E, S2C, and S2D). In addition to targeting mucus carbohydrates, Bt H-2622 also upregulates 40 of its 97 plant-targeting GHs and PLs, whereas Pc H-2477 upregulates all 38 of its plant-targeting GHs and PLs in the high-MAC diet (Figure 2E).

On the no-MAC diet relative to the in vitro condition, Bt H-2622 upregulates two additional GH20s (along with the eight other mucin CAZymes upregulated on the high-MAC diet) as well as 27 plant-targeting GHs and PLs (Figures 2E and S2E). In other words, under high-MAC diet conditions, Bt upregulates CAZymes associated with plant, animal, and other carbohydrates equivalently (48 animal, 45 other, 46 plant), whereas, under the no-MAC diet condition, Bt upregulates a larger number and proportion of CAZymes associated with the utilization of animal associated carbohydrates (51 animal, 27 other, 35 plant). Since Pc H-2477 does not colonize mice fed the no-MAC diet, this condition was not profiled. When comparing the no-MAC diet to the high-MAC diet, Bt H-2622 differentially upregulates only three GHs, two of which degrade mucin (GH18) (Figure S2F).17
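The shift described above can be made concrete by converting the reported counts of upregulated Bt CAZymes into proportions per substrate category. The counts below are the ones given in parentheses in the text; the `shares` helper and the dictionary names are illustrative only.

```python
from typing import Dict


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-category counts into fractions of the total."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Upregulated Bt CAZymes by predicted substrate, as reported in the text.
high_mac = {"animal": 48, "other": 45, "plant": 46}
no_mac = {"animal": 51, "other": 27, "plant": 35}

for label, counts in (("high-MAC", high_mac), ("no-MAC", no_mac)):
    formatted = ", ".join(f"{cat}: {frac:.1%}" for cat, frac in shares(counts).items())
    print(f"{label} ({sum(counts.values())} CAZymes): {formatted}")
```

On these numbers, the animal-associated share rises from roughly a third of upregulated CAZymes on the high-MAC diet to roughly 45% on the no-MAC diet.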

Taken together, these data indicate that, in vivo, Bt H-2622 relies on both mucus and plant-derived carbohydrates. When plant carbohydrates are eliminated from the diet, Bt H-2622 further upregulates mucus-degrading machinery, whereas Pc H-2477’s minimal mucus-degrading capacity renders it incapable of sustaining colonization in the absence of MACs.

Carbohydrate degradation capacity differences between Hadza Bacteroides and Prevotella mirror those of industrialized strains

Hadza Pc and Bacteroides isolates have a similar number and predicted function of GHs and PLs to reference strains of the corresponding species (Table S3; Figure 3). Unsupervised clustering of GHs and PLs reveals that the Hadza strains cluster with their type strain counterparts, in keeping with the genetic similarity between genomes of the same species; the sets of CAZymes in each genome are most similar to the CAZymes found in genomes assigned to the same species (Figure 3A). When comparing the total number of GHs and PLs encoded within the Hadza strains to non-Hadza strains, we found similar total numbers of these genes and distribution of substrate specificity between strains of the same species (Figures 3B and 3C). Comparisons of Hadza Pc CAZymes are constrained by the small number of annotated Pc genomes in the CAZy database (only two existed at the time of this publication). We have now added seven more Pc genomes, and, as more Pc genomes are published, more variation in CAZyme repertoire may be uncovered.

Figure 3. Hadza Bacteroides and Prevotella differ in distribution of GHs and PLs.

(A) Number of GHs and PLs per genome indicated by CAZy family (rows). CAZymes shown appear at least once in any of the genomes analyzed. Hierarchical clustering via complete-linkage clustering method.

(B) Number of GHs and PLs normalized to genome size (Mb), colored by predicted substrate.

(C) Proportion of GHs and PLs in each genome colored by predicted substrate.

(D) Number of mucin-degrading GH18 and GH20 genes per genome.

While Hadza Bacteroides and Prevotella strains mirror the carbohydrate-degrading capacity of their non-Hadza counterparts, large differences exist between the Bacteroides and Prevotella strains. The Bacteroides encode more GHs and PLs than Prevotella strains even when corrected for genome size (251/21 average GH/PL in Bacteroides; 101/5 in Prevotella; Welch two-sample t test, p = 0.0056) (Table S3; Figure 3B). The proportion of Bacteroides GHs and PLs that are predicted to target plant carbohydrates or animal carbohydrates are equivalent (average 34% and 37%, respectively), whereas the Prevotella-encoded carbohydrate degradation is biased toward plant over animal carbohydrates (average 44% and 19%, respectively) (Figure 3C). The Bacteroides also encode a greater breadth of GH and PL families (averaging 68 CAZyme families per genome), while Pc isolates average 40 CAZy families per genome (Figure 3A), consistent with previously reported distributions for industrial-lifestyle-derived Bacteroides and Prevotella strains.31 The two genera also differ in their predicted mucin-degradation capacity (Figure S3; Wilcoxon test, p = 3e–4). CAZyme families GH18 and GH20 target carbohydrates found within the intestinal mucus lining.37 All Hadza Bacteroides isolates harbor 11–14 GH20 and 1–13 GH18 CAZymes; however, the Hadza Prevotella isolates contain only one or two GH20s and only one isolate, Pc H-2497, contains a single GH18 (Figure 3D; Wilcoxon test, p = 4e–4).
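The genome-size correction mentioned above can be roughly illustrated by dividing the genus-average GH and PL counts by the mean genome size of the Hadza isolates in Table 1. This is only a back-of-the-envelope sketch: the per-genome counts live in Table S3, the reported averages may have been computed over a different set of strains, and `cazymes_per_mb` is a hypothetical helper rather than the authors' normalization.

```python
from statistics import mean

# Genome sizes (Mb) of the Hadza isolates, from Table 1.
bacteroides_mb = [6.119319, 6.037034, 5.112756, 4.877774, 6.715646, 6.789892]
prevotella_mb = [4.350632, 4.506031, 4.057255, 4.111062, 4.115122, 4.081238, 3.850424]


def cazymes_per_mb(gh: int, pl: int, genome_sizes_mb: list) -> float:
    """Average GH + PL count divided by the mean genome size in Mb."""
    return (gh + pl) / mean(genome_sizes_mb)


# Genus-average GH/PL counts as reported in the text (251/21 and 101/5).
bacteroides_density = cazymes_per_mb(251, 21, bacteroides_mb)
prevotella_density = cazymes_per_mb(101, 5, prevotella_mb)

print(f"Bacteroides: {bacteroides_density:.1f} GHs + PLs per Mb")
print(f"Prevotella:  {prevotella_density:.1f} GHs + PLs per Mb")
```

Even under this crude normalization the Bacteroides value remains well above the Prevotella value, in line with the statement that the difference persists after correcting for genome size.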

The CAZyme contents of Hadza Bacteroides and Prevotella isolates are similar to their non-Hadza counterparts. Hadza Bacteroides isolates contain both more GHs and PLs overall as well as broader substrate-degrading capabilities that include both plant- and animal-derived carbohydrates relative to the Hadza Prevotella isolates. This difference between the Hadza Bacteroides and Prevotella strains is similar to that seen in non-Hadza strains and industrial lifestyle microbiotas, suggesting that the Prevotella niche is more reliant upon plant carbohydrates compared to Bacteroides.38,39

Dietary MACs are sufficient to maintain Pc colonization in the presence of Bt

To test whether Hadza Bacteroides and Prevotella isolates differ in their ability to use plant- and mucus-derived carbohydrates, we cultured Hadza and type strain Bacteroides and Pc isolates in media containing the plant carbohydrate inulin, porcine gastric mucin glycans, porcine intestinal heparin, or fructose as the sole carbon source. There is a range of ability to utilize inulin across the strains, consistent with previous work (Figure 4A).40 Growth in the presence of mucin, however, is divided by genera; most Bacteroides isolates grow well on mucin, but the P. copri isolates do not (Figure 4A). These data are consistent with the lack of mucin-degrading capacity within the Pc genomes and the loss of Pc colonization in vivo when the host is the major carbohydrate source.

Figure 4. Reintroduction of MACs is sufficient to maintain P. copri colonization.

(A) Normalized maximum optical density 600 (OD600) of Bacteroides and Prevotella isolates grown in Yeast Casitone Fatty Acids broth (YCFA) with a single added carbohydrate for 24 h.

(B) Fecal CFUs of Pc H-2477 in monocolonized mice fed a no-MAC-g or inulin-g diet (mean + SEM, n = 5 mice per group, multiple Mann-Whitney tests, *p ≤ 0.05, **p < 0.01). Dashed line denotes LOD = 500 CFU/mL. Error bars indicate SEM. Representative experiment, repeated three times.

(C) Schematic of bicolonization with Pc H-2477 and Bt H-2622.

(D and E) qPCR index of DNA of Pc H-2477 (D) and Bt H-2622 (E) quantified from fecal samples from bicolonized mice (mean + SEM, n = 5 mice per group; multiple t tests, *p ≤ 0.05, **p < 0.01). Dashed line denotes LOD = 10^-7 index. Error bars indicate SEM. Shaded bars indicate administration of high-MAC diet. Representative experiment, repeated twice.

To determine whether the lack of diet-derived MACs is responsible for the loss of Pc H-2477 colonization we observed in the high-fat/low-MAC Western diet and no MAC diet (Figure 2C), we fed mice monocolonized with Pc H-2477 a high-MAC diet and then switched to either a custom diet containing 34% inulin by weight as the sole fermentable carbohydrate to match MAC content of the high-MAC diet (custom diets use gelatin as a binding agent and are noted by a “-g”; inulin-g) or a no-MAC diet (no-MAC-g).41 The no-MAC-g diet did not sustain Pc H-2477 colonization, with the strain becoming undetectable within 3 days (Figure 4B). However, Pc H-2477 maintained colonization in the presence of the inulin-g diet to levels similar to those observed in the high-MAC diet (Figures 2C and 4B), consistent with the requirement of MACs for Pc H-2477 colonization in vivo.

We were curious how dietary MACs affect the relative abundance of Pc and Bt in mice when colonized together. GF mice were co-colonized with Pc H-2477 and Bt H-2622 and fed a high-MAC diet for 7 days and then either maintained on the high-MAC diet, switched to the no-MAC-g diet, or switched to the inulin-g diet for 2 weeks, followed by a 1-week period in which all mice consumed the high-MAC diet (Figure 4C). Prior to the diet switch (day 0), mice harbored both Pc H-2477 and Bt H-2622 with Pc abundance significantly lower than Bt (Pc index = 1.9e–4; Bt index = 9e–4; unpaired t test, p = 0.002; n = 15) (Figures 4D and 4E). However, 7 days after the switch to the no-MAC-g diet, Pc H-2477 was no longer detectable, whereas Bt H-2622 colonization remained the same. The switch to the inulin-g diet resulted in a less dramatic decrease, with Pc still detectable after 7 days but not after 14 days, indicating that inulin provided support to Pc beyond the no-MAC diet (Figure 4D). Bt colonization remained stable on the inulin-g diet, with a small but significant increase in abundance on day 14 relative to the high-MAC condition (Figure 4E). When mice were returned to the high-MAC diet on day 14, those fed the inulin-g diet regained relative abundance of Pc H-2477 equivalent to that of baseline and to mice fed the high-MAC diet throughout the experiment. However, in mice switched to the high-MAC diet from the no-MAC-g diet, Pc H-2477 DNA remained undetectable. Bt levels stayed constant in the no-MAC diet condition, with a small decrease on day 21 relative to high-MAC-fed mice (Figure 4E). These data are consistent with the requirement of dietary MACs for Pc colonization in the presence of Bt and indicate that the variety of carbohydrates in the high-MAC diet (derived from wheat, corn, oats, and alfalfa) better supports Pc colonization relative to a single-MAC diet (inulin). 
Given that Bt’s consumption of inulin is minimal in vivo (Figure 4A), it was unexpected that Pc colonization decreased in the inulin-g diet condition (Figure 4D). It is possible that Pc may face competition from Bt for the host-sourced carbohydrates it can access (Figure 2E) or that Bt colonization may alter the environment such that Pc abundance is affected in the inulin-g diet but not the high-MAC diet (Figure 2E). Our data further indicate that prolonged absence of MACs restricts the ability of Pc to regain abundance when dietary MACs are reintroduced.

DISCUSSION

The tradeoff between a microbiota dominated by Bacteroides or Prevotella based on host lifestyle has been well described, but its basis is not well understood.8,42

Here, we demonstrate that Hadza isolates of Bacteroides and Prevotella do not differ dramatically from their non-Hadza counterparts in terms of genome-wide average nucleotide identity and carbohydrate utilization, suggesting that differences in their relative abundance and prevalence across lifestyles are not due to an inherent property of the population-specific strains themselves but to differences in their environments. Furthermore, we demonstrate that MACs are crucial for Prevotella to maintain colonization: even as the sole microbe, Prevotella is eradicated when dietary MACs are removed. Bacteroides species, however, can maintain colonization in the absence of dietary MACs due to their ability to use both plant- and host-derived carbohydrates, enabling continued colonization in low-MAC industrialized diets. Our data demonstrate that, in the presence of dietary MACs in gnotobiotic models, Hadza Bacteroides and Prevotella can coexist, as is seen in the Hadza microbiota. However, removal of dietary MACs results in a precipitous decline in Prevotella, which does not recover when MACs are reintroduced. The presence of a single MAC, inulin, in the diet was sufficient to maintain an intermediate level of colonization that then rebounded when a more complete palette of MACs was available. These data are reminiscent of the seasonal pattern of Prevotella abundance in the Hadza, which cycles in abundance with the seasonality of their diet.

Altogether, these data are consistent with the model that, prior to industrialization, human microbiotas harbored both Bacteroides and Prevotella species. As diets shifted from high-MAC foraged foods to low-MAC industrially produced foods, abundance and prevalence of Prevotella diminished to the point of extinction in some individuals.4 Prevotella has been associated with beneficial health states, including improved glucose metabolism and increased resistance to malnutrition.43,44 However, Prevotella species have also been linked to negative outcomes, including rheumatoid arthritis and insulin sensitivity.45,46 Strain-level variation, differences in other members of the microbiota (i.e., context-specific effects), and differences in host immune status could account for these contradictions. While a high-MAC diet is broadly beneficial to overall health, whether the presence of Prevotella affects these benefits is unknown.47–50 Additionally, how the loss of Prevotella and increased abundance of Bacteroides within the industrialized microbiota affects human physiology remains an important question.

Limitations of the study

Prior to this study, very few human-derived isolates of Pc were available, which limits comparisons of the strains isolated here to existing strains. As more isolates of Pc become available, it will be important to update the comparisons performed in this study. Additionally, while we demonstrate that Pc colonization decreases in response to a no-MAC diet, this colonization decline occurs so rapidly that we were not able to capture transcriptional data under this diet condition. This dataset would have been a useful comparison to Pc-colonized mice on the high-MAC diet and Bt-colonized mice on the no-MAC diet. Furthermore, our study measured the effect of diet on Pc colonization, which was profound, but it is possible that there are other factors outside of diet that are important regulators of Pc colonization in vivo.

STAR★METHODS

RESOURCE AVAILABILITY

Lead contact

All information and requests for further resources should be directed to and will be fulfilled by the Lead Contact, Erica Sonnenburg, erica.sonnenburg@stanford.edu.

Materials availability

This study did not generate new unique reagents.

Data and code availability

Raw data files for WGS and RNAseq can be found at Zenodo: https://doi.org/10.5281/zenodo.7651179. Code used to generate the figures and additional data can be found at Zenodo: https://doi.org/10.5281/zenodo.8339517. Isolate genomes will be available at NCBI: PRJNA1015720 upon publication. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS

Bacterial culture

Bacteria not isolated in this study were purchased from DSMZ (P. copri DSM 18205), or ATCC (all other reference strains). Glycerol stocks were struck out on Brain Heart Infusion agar with 10% defibrinated horse blood (BHIBA) and incubated anaerobically for 24–48 h at 37°C. All growth and culturing of Bacteroides and Prevotella strains were performed anaerobically in a Coy anaerobic chamber containing 87% N2, 10% CO2, and 3% H2.

Mouse husbandry

All mouse experiments were performed in accordance with the Stanford Institutional Animal Care and Use Committee. Mice were maintained on a 12-h light/dark cycle at 20.5 °C at ambient humidity, fed ad libitum, and maintained in flexible film gnotobiotic isolators for the duration of all experiments (Class Biologically Clean). Swiss-Webster mice were used for gnotobiotic experiments and the sterility of germ-free mice was verified by 16S PCR amplification and anaerobic culture of feces. Sample sizes were chosen on the basis of litter numbers and controlled for sex and age within experiments. Researchers were unblinded during sample collection.79

Statement on work with indigenous communities

In order to acquire scientific knowledge that accurately represents all human populations, rather than only reflecting and benefiting those in industrialized nations, it is necessary to involve indigenous populations in research in a legal, ethical, and non-exploitative manner.25,80 Here, we isolated live bacterial strains from anonymized fecal samples collected from Hadza hunter-gatherers in 2013/2014.4,6,81 Samples were collected with permission from the Tanzanian government, National Institute of Medical Research (MR/53i 100/83, NIMR/HQ/R.8a/Vol.IX/1542), the Tanzania Commission for Science and Technology, and with aid from Tanzanian scientists. A material transfer agreement with the National Institute for Medical Research in Tanzania specifies that collected samples are solely to be used for academic purposes. For more information on the consent practices followed, and our ongoing work to communicate the results of these projects to the Hadza, please see Carter et al.4 and Olm et al.5

METHOD DETAILS

Strain isolation from fecal samples

Samples for strain isolation were chosen from the samples reported previously based on the 16S abundance of either Bacteroides or Prevotella genera.6 All isolations were performed under anaerobic conditions on YCFA agar with 5% glucose or baobab powder. 1μL of frozen feces was struck onto a single agar plate. Visible colonies from the initial plates were identified via colony PCR using bacterial 16S primers, and re-plated onto BBE and LKV plates (Anaerobe Systems). PCR products were purified using the QIAquick PCR Purification Kit (Qiagen), and sequenced via Sanger sequencing at Elim Biopharm. The resulting sequences were identified using nucleotideBLAST.82 Colonies that were predicted to share >95% identity with a Bacteroides or Prevotella species were re-struck two additional times on either BBE or LKV plates, respectively, to ensure a pure culture. Glycerol stocks were made by growing a liquid culture of a single colony overnight in PYG, and then mixing at a 1:1 ratio with a 50% glycerol, 50% PBS solution.

Whole genome sequencing

Genomic DNA was extracted from single-isolate cultures grown for 24 h using a MasterPure Gram Positive DNA Purification Kit. Long-read sequencing was performed using a Nanopore MinION (flow cell FLO-MIN106, Ligation Sequencing Kit SQK-LSK109; Barcoding Kit EXP-NBD104) and short-read sequencing was performed using an Illumina MiSeq. Nanopore basecalling was performed with Guppy version 3.4.2m using the command “guppy_basecaller -r -i raw_fast5/ --flowcell $flowcell --kit $kit -x auto --compress_fastq --gpu_runners_per_device 8 -q 0 --chunks_per_runner 4096”. Short-read sequence quality was assessed using FastQC with the command “fastqc --nogroup -q”, and adapters were trimmed with BBTools using the command “bbduk.sh -Xmx2g -eoom ref=adapters,phix threads=8 ktrim=r k=23 mink=11 edist=2 entropy=0.05 tpe tbo qtrim=rl minlength=100 trimq=30 pigz=t unpigz=t samplerate=0.25”. If there was more than 100x coverage of the genome, reads were normalized using the command “bbnorm.sh target=100 min=2”. Hybrid assembly of the short and long reads was performed using SPAdes with the command “spades.py --careful --cov-cutoff auto -k 21,33,55,77,99,127”.83 RagOUT was used for chromosome-level scaffolding using either the matched reference genome of the same species for Bacteroides (Table 1), or Pc H-2477 for Prevotella.52 Assembly quality was assessed with Quast.84 Gene annotation was performed using RASTtk.53

Clustering genomes into subspecies

All public Bacteroides and Prevotella genomes of “Scaffold” quality or better were downloaded from NCBI GenBank on 5/15/2023 using the program ncbi-genome-download (https://github.com/kblin/ncbi-genome-download). The commands used were “ncbi-genome-download --genera Bacteroides --section GenBank --formats fasta --assembly-levels complete,chromosome,scaffold bacteria” and “ncbi-genome-download --genera Prevotella --section GenBank --formats fasta --assembly-levels complete,chromosome,scaffold bacteria”. This resulted in a total of 888 and 1894 genomes of Prevotella and Bacteroides, respectively.

Public genomes were clustered along with the isolate genomes recovered in the study using dRep v3.2.154 using the command “dRep dereplicate --S_algorithm fastANI -sa 0.98”. The 98% ANI threshold was chosen manually based on a histogram of reported ANI values (Figure S4A). Representative genomes were chosen using dRep’s default scoring system with the following adjustments: isolates sequenced in this study were given an additional 100 points, isolate genomes used in this study were given an additional 80 points, public genomes marked as “representative genome” in Refseq were given an additional 60 points, and public genomes of “Complete Genome” and “Chromosome” quality were given an additional 40 and 20 points, respectively.
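The score adjustments described above can be illustrated with a minimal Python sketch. It is not dRep's implementation: the `base_score` field stands in for dRep's default score, which is not reproduced here, and the class and function names are hypothetical. The sketch also assumes that the bonuses are additive when a genome falls into more than one category.

```python
from dataclasses import dataclass


@dataclass
class Genome:
    name: str
    base_score: float                    # placeholder for dRep's default score
    sequenced_in_study: bool = False     # isolate sequenced in this study
    used_in_study: bool = False          # isolate genome used in this study
    refseq_representative: bool = False  # marked "representative genome" in RefSeq
    assembly_level: str = "Scaffold"     # "Complete Genome", "Chromosome", or "Scaffold"


def adjusted_score(g: Genome) -> float:
    """Add the bonus points listed in the text to the base score (assumed additive)."""
    bonus = 0.0
    if g.sequenced_in_study:
        bonus += 100
    if g.used_in_study:
        bonus += 80
    if g.refseq_representative:
        bonus += 60
    if g.assembly_level == "Complete Genome":
        bonus += 40
    elif g.assembly_level == "Chromosome":
        bonus += 20
    return g.base_score + bonus


def pick_representative(cluster):
    """Return the genome with the highest adjusted score in one cluster."""
    return max(cluster, key=adjusted_score)
```

With these bonuses, an isolate sequenced in the study is chosen over a public genome of a higher assembly level unless the public genome's base score is much higher.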

Evaluating subspecies prevalence and phylogenetic analysis

All metagenomic reads were downloaded from Carter et al.4 Metagenomic reads were mapped to Prevotella and Bacteroides subspecies representative genomes using Bowtie2 with default settings55 (command “bowtie2 -x $index -1 $r1 -2 $r2 | samtools sort -o $output.bam”), and the resulting .bam files were profiled using coverM as implemented through inStrain with default settings56 (command “inStrain profile $bam $fasta -s $stb --coverm”). Genomes detected with ≥65% genome breadth were considered “present” in a metagenome. This threshold was chosen based on manual inspection of a genome breadth histogram (Figure S4B).

The prevalence of each genome in each population was calculated as the percentage of metagenomes in which the genome was detected. Phylogenetic trees were made for Bacteroides and Prevotella subspecies representative genomes detected in at least one metagenome using GToTree v1.5.36 with the command “GToTree -H Bacteria -T IQ-TREE”. One outgroup from a different genus was included in each tree. Tree leaves were labeled based on GTDB taxonomy release r207.59 Trees were visualized using iTol.60 Figure S1A includes GTDB representative genomes for i) all species with “copri” in their species name, ii) all Prevotella species of isolates recovered in this study, and iii) the closest representative genome (according to GTDB) for all isolate genomes recovered in this study. For Figure S1B, 10 genomes from each clade were randomly chosen to include in the tree. Prevotella stercorea was included as an outgroup.
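The presence call (≥65% genome breadth) and the prevalence calculation described in this section can be sketched in Python as follows. The function name and the input layout (a mapping from metagenome identifier to breadth, expressed as a fraction between 0 and 1) are assumptions made for illustration; they are not taken from inStrain or coverM output formats.

```python
def prevalence(breadth_by_sample, threshold=0.65):
    """Percentage of metagenomes in which a genome is called present.

    breadth_by_sample: dict mapping a metagenome ID to the genome breadth
    (fraction of the genome covered) observed in that metagenome.
    """
    if not breadth_by_sample:
        raise ValueError("at least one metagenome is required")
    # A genome is "present" when its breadth meets or exceeds the threshold.
    present = sum(1 for b in breadth_by_sample.values() if b >= threshold)
    return 100.0 * present / len(breadth_by_sample)


# Hypothetical breadth values for one genome across four metagenomes.
example = {"s1": 0.90, "s2": 0.65, "s3": 0.10, "s4": 0.64}
print(prevalence(example))  # 50.0
```

In practice the calculation would be repeated per genome and per population, using only the metagenomes from that population.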

CAZyme annotation

CAZyme annotations were performed for each isolate. For comparative purposes, an additional 20 strains of Prevotella copri available at NCBI, with variable assembly levels, were annotated alongside the isolates and two model strains. All amino acid sequences were first compared to the full-length sequences stored in the CAZy database (Sept. 2021)61 using BlastP (version 2.3.0+).62 Queries obtaining 100% coverage, >50% sequence identity and E-value ≤10^−6 were automatically annotated with the same domain composition as the closest reference homolog. All remaining sequences were subject to human curation to verify the presence of each putative module. During this process, the curator could rely on (i) bioinformatics tools, including BLAST against libraries on either full-length protein, modules only or characterized modules only, and HMMER version 3.163 against in-house built models for each CAZy (sub)family; (ii) human expertise on the appropriate coverage, sequence identity and E-value thresholds which vary across (sub)families, and ultimately on the verification of the catalytic amino acid conservation. Hierarchical clustering of isolates’ CAZyme repertoires was performed using ComplexHeatmap.75 Predicted substrate assignment was compiled from previously published works.6,14
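The automatic annotation rule stated above (100% coverage, >50% identity, E-value ≤10^−6) amounts to a simple filter on each best BlastP hit. The sketch below is illustrative only: the `BlastHit` class and its field names are hypothetical, and it assumes that "coverage" refers to the fraction of the query sequence that is aligned.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    query_coverage: float  # percent of the query aligned (assumed meaning of "coverage")
    identity: float        # percent sequence identity
    evalue: float          # BlastP E-value


def passes_auto_annotation(hit: BlastHit) -> bool:
    """True if the hit meets all three thresholds for automatic annotation."""
    return (
        hit.query_coverage >= 100.0
        and hit.identity > 50.0
        and hit.evalue <= 1e-6
    )


def triage(hits):
    """Split hits into automatically annotated and manually curated sets."""
    auto = [h for h in hits if passes_auto_annotation(h)]
    manual = [h for h in hits if not passes_auto_annotation(h)]
    return auto, manual
```

Sequences that fail any one of the three thresholds go to the manual curation step described in the text.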

In vitro polysaccharide growth assays

Glycerol stocks were struck out on Brain Heart Infusion plates with 10% defibrinated horse blood and incubated anaerobically for 24 h at 37°C. Isolates were passaged overnight in BHI-S (Bacteroides) or YCFA-G (Prevotella). After 16 h, cultures were diluted 1:50 for Bacteroides and 1:10 for Prevotella into 200 μL of culture media in a clear, flat-bottomed 96-well plate. Growth media was composed of a YCFA background plus 0.5% carbohydrate, with the exception of inulin, which was added at a 1.5% concentration. OD600 was measured every 15 min for 48 h using a BioTek Epoch2 plate reader, with 30 s of shaking prior to each reading. Normalized OD was calculated for each carbohydrate condition by subtracting the average blank OD600 from the raw OD600 for each isolate grown in the corresponding polysaccharide. Maximum OD was calculated as the highest normalized OD in the first 24 h period.
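The normalization and maximum-OD calculation described above can be sketched in Python. The function names are hypothetical, and the sketch assumes that the "average blank OD600" is a single mean over the blank readings for the corresponding carbohydrate condition; a per-timepoint blank would require a small change.

```python
def normalize_od(raw_od, blank_od):
    """Subtract the mean blank OD600 from each raw OD600 reading."""
    blank_mean = sum(blank_od) / len(blank_od)
    return [od - blank_mean for od in raw_od]


def max_od_first_24h(times_h, normalized_od, window_h=24.0):
    """Highest normalized OD among readings taken at or before window_h hours."""
    in_window = [od for t, od in zip(times_h, normalized_od) if t <= window_h]
    if not in_window:
        raise ValueError("no readings within the time window")
    return max(in_window)


# Hypothetical readings (hours, raw OD600) for one isolate and its blank wells.
times = [0.0, 12.0, 24.0, 36.0]
raw = [0.10, 0.50, 0.90, 1.20]
blanks = [0.08, 0.12]
norm = normalize_od(raw, blanks)
print(round(max_od_first_24h(times, norm), 3))  # 0.8
```

The reading at 36 h is higher but falls outside the 24 h window, so it does not contribute to the maximum OD.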

Colonization and enumeration of gnotobiotic mice

For colonization with B. thetaiotaomicron H-2622, mice were gavaged with 300 μL of a 3 mL liquid culture grown for 16 h in BHI-S. For colonization with P. copri, mice were gavaged with 300 μL of a 3 mL liquid culture grown for 16 h in YCFAC, in which 10–15 lawns (~1 per mouse) of P. copri grown on BHIBA for 48 h had been suspended. For Prevotella colonization, food was removed from mouse cages and bedding was changed 12 h before gavage. Before the gavage of Prevotella, mice were gavaged with 300 μL of 10% sodium bicarbonate in water. Food was returned 2 h post-gavage. For bicolonization experiments, mice were first colonized with Pc H-2477, then gavaged with Bt H-2622 7 days later. Bicolonization was allowed to stabilize for 5–7 days before the diet switch.

To measure bacterial density, feces were collected from individual mice. Two biological replicates of 1 μL feces were resuspended in 200 μL sterile PBS, serially diluted 1:10 in sterile PBS using a 96-well tissue culture plate, and 3 technical replicates of 2μL of each dilution were plated on BHIBA. CFUs were counted after 36h anaerobic growth at 37 °C.
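The text does not state the formula used to convert colony counts into bacterial density, so the following Python sketch is only one plausible back-calculation. It assumes that the initial 200 μL resuspension counts as dilution step 0, that each serial step is a 10-fold dilution, that technical replicates are averaged, and that the 1 μL of feces adds negligible volume. The function name and parameters are hypothetical.

```python
def cfu_per_ul_feces(colony_counts, dilution_step,
                     feces_ul=1.0, resuspension_ul=200.0,
                     plated_ul=2.0, fold=10.0):
    """Estimate CFU per microliter of feces from replicate colony counts.

    colony_counts: counts from the technical replicates at one dilution.
    dilution_step: number of serial dilutions applied after resuspension
                   (0 means the undiluted resuspension was plated).
    """
    mean_count = sum(colony_counts) / len(colony_counts)
    # CFU per uL in the plated dilution.
    conc_plated = mean_count / plated_ul
    # Undo the serial dilution to get CFU per uL of the resuspension.
    conc_resuspension = conc_plated * fold ** dilution_step
    # Total CFU in the resuspension, divided by the feces volume it came from.
    return conc_resuspension * resuspension_ul / feces_ul


# Three hypothetical technical replicates counted at the third 10-fold dilution.
print(cfu_per_ul_feces([24, 25, 26], dilution_step=3))  # 2500000.0
```

The two biological replicates per mouse would each be processed this way and then averaged.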

In vivo competition assays

Feces were collected from individual mice. Genomic DNA was extracted from 2 biological replicates of fecal pellets using the DNeasy PowerLyzer PowerSoil kit (Qiagen). Concentration of Pc and Bt DNA was assessed using species-specific qPCR primers (Key Resources Table). qPCR was performed using the Brilliant III, Ultra Fast SYBR Green QPCR Master Mix and a Bio Rad CFX thermocycler. Genomic DNA from Bt H-2622 and Pc H-2477 was used to generate a standard curve for each primer pair. The standard curves were used to calculate the absolute quantity of Bt or Pc DNA in the sample. The efficiency value (E) for each primer pair was calculated as E = 10^(1/−slope), using the slope of log10(DNA input) against Ct value. The qPCR index was calculated for each primer pair as E^Ct.
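The standard-curve calculation can be sketched in Python. This is a generic least-squares treatment rather than the authors' exact procedure: it assumes that Ct is regressed on log10(DNA input), so that the slope is negative and E = 10^(−1/slope) equals 2 for perfect doubling. All function names and the example values are hypothetical.

```python
import math


def fit_standard_curve(log10_dna, ct):
    """Least-squares fit of Ct = slope * log10(DNA input) + intercept."""
    n = len(ct)
    mean_x = sum(log10_dna) / n
    mean_y = sum(ct) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dna)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_dna, ct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency(slope):
    """Amplification efficiency E = 10^(-1/slope)."""
    return 10 ** (-1.0 / slope)


def quantity_from_ct(ct_value, slope, intercept):
    """Absolute DNA quantity read off the standard curve for a sample Ct."""
    return 10 ** ((ct_value - intercept) / slope)


# Hypothetical standard curve with perfect doubling per cycle.
ideal_slope = -1.0 / math.log10(2.0)
log_inputs = [0.0, 1.0, 2.0, 3.0]
cts = [30.0 + ideal_slope * x for x in log_inputs]
slope, intercept = fit_standard_curve(log_inputs, cts)
print(round(efficiency(slope), 3))  # 2.0
```

A separate curve is fitted for each primer pair, since the text reports an efficiency value per primer pair.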

KEY RESOURCES TABLE.
REAGENT or RESOURCE SOURCE IDENTIFIER

Biological samples

Human fecal samples from Hadza people Smits et al., 20176 N/A

Critical commercial assays

MasterPure Gram Positive DNA Purification Kit LGC Biosearch Technologies Cat#NC9197506
QIAquick PCR Purification Kit Qiagen Cat#28106
MinION Flowcell FLO-MIN106 Nanopore Kit: SQK-LSK109; Barcode kit: EXP-NBD104
MiSeq Illumina https://www.illumina.com/systems/sequencing-platforms/miseq.html
Epoch2 Microplate Reader BioTek https://www.biotek.com/products/detection-microplate-readers/epoch-2-microplate-spectrophotometer/
DNeasy PowerLyzer PowerSoil Qiagen Cat#12855–50
Brilliant III, Ultra Fast SYBR Green QPCR Master Mix Agilent Cat#600883
CFX Connect Real-Time PCR Detection System Bio-Rad Cat#1855201
RNeasy PowerMicrobiome Kit Qiagen Cat#26000–50
RiboMinus Transcriptome Isolation Kit, bacteria Invitrogen Cat#K155004
TruSeq® Stranded Total RNA Library Prep Human/Mouse/Rat Illumina Cat#20020597
NovaSeq SP Flow Cell Illumina NovaSeq 6000 System

Deposited data

Hadza 16S sequencing Smits et al., 20176 N/A
Metagenomic reads: Hadza, Nepal, and California populations Carter et al., 20234 N/A
Whole genome sequences: Bacteroides and Prevotella isolates this study NCBI BioProject PRJNA1015720: http://www.ncbi.nlm.nih.gov/bioproject/1015720
RNAseq data this study Zenodo: https://doi.org/10.5281/zenodo.7651179
Bacteroides and Prevotella reference genomes GenBank NCBI

Experimental models: Organisms/strains

P. copri isolates this study N/A
Bacteroides sp. isolates this study N/A
Prevotella copri DSM 18205 DSMZ Cat#DSM 18205
Bacteroides ovatus ATCC 8483 ATCC Cat#8483
Bacteroides thetaiotaomicron VPI-5482 ATCC Cat#29148
Bacteroides fragilis NCTC 9343 ATCC Cat#25285
Bacteroides caccae ATCC 43185 ATCC Cat#43185
Mouse: Swiss Webster, germ free Taconic Cat#SW-F GF

Oligonucleotides

P. copri forward primer: hpc_gyrB_03_F1 this study CACCCACACCATGTAAACCGCCAG
P. copri reverse primer: hpc_gyrB_03_R this study TGTACCGACATCGAAGTTACCATCAACGAAG
B. thetaiotaomicron forward primer: HBT05_03F this study GCAGGCACGGGCAGTATCAGTATCG
B. thetaiotaomicron reverse primer: HBT05_03R this study CGCCACGGATAGGCAGACATTTGTCA
Bacterial 16S forward primer: 16S rRNA 515F Parada et al., 199851 5’-GTGYCAGCMGCCGCGGTAA-3’
Bacterial 16S reverse primer: 16S rRNA 806R Parada et al., 199851 5’-GGACTACHVGGGTWTCTAAT-3’

Software and algorithms

FastQC Babraham Bioinformatics https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
BBTools Joint Genome Institute https://jgi.doe.gov/data-and-tools/software-tools/bbtools/
SPAdes Center for Algorithmic Biotechnology https://cab.spbu.ru/software/spades/
RagOUT Kolmogorov et al., 201852 https://github.com/fenderglass/Ragout
Quast Center for Algorithmic Biotechnology https://quast.sourceforge.net/index.html
RASTtk Brettin et al., 201553 https://rast.nmpdr.org/
dRep(v3.2.1) Olm et al., 201754 https://github.com/MrOlm/drep
Bowtie2 Langmead and Salzberg, 201255 https://bowtie-bio.sourceforge.net/bowtie2/index.shtml
inStrain Olm et al., 202156 https://github.com/MrOlm/inStrain
scipy.cluster.hierarchy Virtanen et al., 202057 https://docs.scipy.org/doc/scipy/reference/cluster.hierarchy.html
GToTree (version 1.5.36) Lee, MD, 201958 https://github.com/AstrobioMike/GToTree/tree/V1.5.36
GTDB Chaumeil et al., 202059 https://gtdb.ecogenomic.org/
iTol Letunic and Bork, 202160 https://itol.embl.de/
CAZy Drula et al., 202261 http://www.cazy.org/
BlastP (version 2.3.0+) Camacho et al., 200962 https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download
HMMER (version 3.1) Mistry et al., 201363 http://hmmer.org/download.html
Multiqc (version 1.14) Ewels et al., 201664 https://multiqc.info/
Trimmomatic (version 0.39) Bolger et al., 201465 http://www.usadellab.org/cms/?page=trimmomatic
HiSAT2 (version 2.2.0) Kim et al., 201966 http://daehwankimlab.github.io/hisat2/
SAMtools (version 1.16.1) Danecek et al., 202167 http://www.htslib.org/
StringTie (version 2.1.3) Shumate et al., 202268 http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual
DESeq2 (version 1.38.3) Love et al., 201469 https://bioconductor.org/packages/release/bioc/html/DESeq2.html
R (version 4.2.2) R Core Team https://www.r-project.org/
tidyverse (version 1.3.2) Wickham et al., 201970 https://www.tidyverse.org/
RStudio (version 1.4) R Core Team https://www.rstudio.com/
stringr (version 1.5.0) Wickham, 202271 https://stringr.tidyverse.org
MetBrewer (version 0.2.0) Blake Mills https://github.com/BlakeRMills/MetBrewer
RColorBrewer (version 1.1–3) Erich Neuwirth https://cran.r-project.org/web/packages/RColorBrewer/index.html
cowplot (version 1.1.1) Claus Wilke https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html
readxl (version 1.4.1) Wickham and Bryan, 202372 https://readxl.tidyverse.org/
svglite (version 2.1.1) Wickham, 202373 https://svglite.r-lib.org/
circlize (version 0.4.15) Gu et al., 201474 https://jokergoo.github.io/circlize_book/book/
ComplexHeatmap (version 2.14.0) Gu, 202275 https://jokergoo.github.io/ComplexHeatmap-reference/book/
dendsort version 0.3.4 Sakai et al., 201476 https://cran.rstudio.com/web/packages/dendsort/index.html
dendextend version 1.16.0 Tal Galili https://www.rdocumentation.org/packages/dendextend/versions/1.16.0#how-to-cite-the-dendextend-package
ggplot2 version 3.4.0 Wickham, 201677 https://cran.r-project.org/web/packages/ggplot2/index.html
seriation version 1.4.1 Hahsler et al., 200878 https://www.rdocumentation.org/packages/seriation/versions/1.4.1
Prism 9 for macOS GraphPad Software https://www.graphpad.com/guides/prism/latest/user-guide/citing_graphpad_prism.htm

Other

Heparin sodium salt (from porcine intestinal mucosa) Sigma H3393–50KU
D-Glucose, Anhydrous Alfa-aesar aaa16828–0e
D-Fructose 99% Alfa-aesar AAA17718–30
Inulin (chicory) Beneo Orafti®HP
KAIBAE Premium Baobab Fruit Powder Amazon N/A
Mucin from porcine stomach, Type III, bound sialic acid 0.5–1.5%, partially purified powder Sigma M1778–100G
Bacteroides Bile Esculin (BBE) Agar Plates Anaerobe Systems AS-144
Laked Brucella Blood Agar w/Kanamycin and Vancomycin (LKV) Plates Anaerobe Systems AS-142
Brain Heart Infusion Agar (BHI) BD Diagnostics DF0418177
Defibrinated horse blood Hemostat DHB500
96-Well, Cell Culture-Treated, Flat-Bottom Microplate Falcon Cat#353072
Yeast Casitone Fatty Acids Broth with Carbohydrates - YCFAC Broth Anaerobe Systems AS-680
No MAC diet: Teklad custom diet, Glucose Only Carb (93G, Irrad) Envigo TD.150689
Western diet: Teklad custom diet, adjusted fat diet Envigo TD.96132
High MAC chow: LabDiet® JL Rat and Mouse/ Auto 6F LabDiet 5K67
AIN-93G Basal Mix (CHO, Cellulose Free) Envigo TD.200788
Gelatin, bovine Sigma G9391

Mouse diets

The Inulin-g and No MAC-g diets were created using 32% AIN-93G Basal Mix (CHO, Cellulose Free) and 68% carbohydrates, to match the carbohydrate content of the No MAC diet (TD.150689). The Basal Mix and carbohydrate components were suspended in a mixture of water (1100 mL per 250 g package of Basal Mix) and 5% bovine gelatin as a binder. The carbohydrates (100% glucose, No MAC-g; 50% glucose and 50% inulin, Inulin-g) and gelatin were dissolved separately in MilliQ water and autoclaved. The gelatin mix and AIN-93G Basal Mix (CHO, Cellulose Free) (TD.200788) were added to the carbohydrate solution in a tissue culture hood, and the mix was allowed to solidify at 4°C. Diets are listed in the key resources table. One week post-colonization, standard chow was removed and replaced with the desired test diet, and the bedding was changed. Gelatin chow was replaced every 3 days as the chow dried out.
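As a worked example of the proportions above, the following Python sketch computes the carbohydrate masses and water volume for a batch. It assumes that the 32% and 68% figures are mass fractions of the dry ingredients (Basal Mix plus carbohydrates) and that water scales linearly with the amount of Basal Mix. Gelatin is left out because the text does not specify what the 5% is relative to. The function name is hypothetical.

```python
def diet_batch(basal_mix_g=250.0, inulin_fraction_of_carb=0.0):
    """Masses (g) and water volume (mL) for one batch of gelatin-based diet.

    inulin_fraction_of_carb: 0.0 for No MAC-g (glucose only),
                             0.5 for Inulin-g (half glucose, half inulin).
    """
    total_dry_g = basal_mix_g / 0.32          # Basal Mix is 32% of dry mass
    carb_g = total_dry_g * 0.68               # carbohydrates are 68% of dry mass
    inulin_g = carb_g * inulin_fraction_of_carb
    glucose_g = carb_g - inulin_g
    water_ml = 1100.0 * basal_mix_g / 250.0   # 1100 mL per 250 g package
    return {
        "basal_mix_g": basal_mix_g,
        "glucose_g": glucose_g,
        "inulin_g": inulin_g,
        "water_ml": water_ml,
    }


print(diet_batch(250.0, 0.5))
```

Under these assumptions, one 250 g package of Basal Mix is combined with 531.25 g of carbohydrate in total, split evenly between glucose and inulin for the Inulin-g diet.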

RNAseq

RNA was extracted from mouse cecal contents and in vitro cultures using the RNeasy PowerMicrobiome Kit (Qiagen). Ribosomal RNA depletion was performed using the RiboMinus Transcriptome Isolation Kit (Invitrogen). A cDNA library was constructed using the TruSeq Stranded Total RNA Library Prep Human/Mouse/Rat kit. Sequencing was performed on a NovaSeq SP flow cell. Quality of raw reads was assessed with Multiqc using the command “multiqc”.64 Adapters were trimmed using Trimmomatic and the command “trimmomatic PE ILLUMINACLIP -PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36”.65 Reads were aligned to the Pc H-2477 and Bt H-2622 genomes using the HiSAT2 commands “hisat2-build” to generate indexes, and “hisat2 -p 8 --dta -x” to align reads to the indexes.66 SAMtools was used to generate .bam files with the commands “samtools sort -@ 8 -o” and “samtools index”.67 Transcripts were assembled using the StringTie commands “stringtie”, “stringtie-merge”, and “stringtie -e -B -p 11 -G”.68 Differential expression was analyzed using DESeq2.69

QUANTIFICATION AND STATISTICAL ANALYSIS

Quantification and statistical analyses were performed using R version 4.2.2 or GraphPad Prism version 9 (key resources table). The ComplexHeatmap and ggplot2 packages were used to create heatmap and barplot visualizations. Prism was used to generate graphs of fecal CFU and qPCR data, to calculate the mean and standard error values for these data, and to perform statistical analyses for these experiments. The number of samples per group (n) for each experiment is indicated either in the figure legend or within the figure itself. Two-tailed Mann-Whitney tests were used to compare the distributions of two unmatched groups without the assumption of normal distribution. Paired or unpaired t-tests were used to compare two normally distributed groups of paired or unpaired samples, respectively. The Welch two-sample t test was used to compare two normally distributed groups with different standard deviations. A Fisher’s exact test was used to determine the presence of nonrandom associations between two sets of categorical variables with a small sample size. The Wilcoxon test was used to determine the distinctness of the means of two groups of independent samples. Further details of statistical analyses for experiments can be found in the results section and figure legends.

Highlights.

  • Bacteroides and Prevotella sp. are isolated, sequenced from Hadza fecal samples

  • Bacteroides sp. encode more mucus degrading capacity than Prevotella sp.

  • Unlike Bacteroides, Prevotella colonization requires dietary plant fiber

  • Bacteroides outcompetes Prevotella in vivo on low plant fiber diet

ACKNOWLEDGMENTS

We are indebted to the participants that provided samples used in this study. We acknowledge the numerous people and organizations who provided logistical support and conducted sample collection in the USA, Tanzania, and Nepal, including Dorobo Safaris, the Human Food Project, John Changalucha, Alphaxard Manjurano, Maria Gloria Dominguez-Bello, Allison Weakley, Samuel Smits, Gabriela Fragiadakis, Hannah Wastyk, Yoshina Gautam, Dinesh Bhandari, Sarmila Tandukar, Katharine Ng, Guru Prasad Gautam, Jeevan B. Sherchand, and members of the Gardner lab at Stanford. We thank Bryan Merrill, Madeline Topf, and Michelle St. Onge for technical support; Gabriela Gall Rosa for help with analysis; Samuel Lancaster, Brittany Sison, and Audrey Zhang for assistance processing RNA sequencing data; Michael Fischbach and Niokhor Dione for providing P. copri N-01; Brian Yu and Rose Yan for sequencing support; David Schneider for advice and tissue culture hood use; and Bernard Henrissat for advice on CAZyme annotation.

R.H.G. was supported by an NIH/NHGRI T32 training grant and the Blavatnik Family Foundation. This work was supported by grants from the NIH to J.L.S. (R01-DK085025, DP1-AT009892). J.L.S. is a Chan Zuckerberg Biohub investigator.

INCLUSION AND DIVERSITY

We support inclusive, diverse, and equitable conduct of research.

Footnotes

DECLARATION OF INTERESTS

The authors declare no competing interests.

SUPPLEMENTAL INFORMATION

Supplemental information can be found online at https://doi.org/10.1016/j.celrep.2023.113233.

REFERENCES

  • 1.Sonnenburg JL, and Sonnenburg ED (2019). Vulnerability of the industrialized microbiota. Science 366, eaaw9255. 10.1126/science.aaw9255. [DOI] [PubMed] [Google Scholar]
  • 2.De Filippo C, Cavalieri D, Di Paola M, Ramazzotti M, Poullet JB, Massart S, Collini S, Pieraccini G, and Lionetti P (2010). Impact of diet in shaping gut microbiota revealed by a comparative study in children from Europe and rural Africa. Proc. Natl. Acad. Sci. USA 107, 14691–14696. 10.1073/pnas.1005963107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Jha AR, Davenport ER, Gautam Y, Bhandari D, Tandukar S, Ng KM, Fragiadakis GK, Holmes S, Gautam GP, Leach J, et al. (2018). Gut microbiome transition across a lifestyle gradient in Himalaya. PLoS Biol. 16, e2005396. 10.1371/journal.pbio.2005396. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Carter MM, Olm MR, Merrill BD, Dahan D, Tripathi S, Spencer SP, Yu FB, Jain S, Neff N, Jha AR, et al. (2023). Ultra-deep sequencing of Hadza hunter-gatherers recovers vanishing gut microbes. Cell 186, 3111–3124.e13. 10.1016/j.cell.2023.05.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Olm MR, Dahan D, Carter MM, Merrill BD, Yu FB, Jain S, Meng X, Tripathi S, Wastyk H, Neff N, et al. (2022). Robust variation in infant gut microbiome assembly across a spectrum of lifestyles. Science 376, 1220–1223. 10.1126/science.abj2972. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Smits SA, Leach J, Sonnenburg ED, Gonzalez CG, Lichtman JS, Reid G, Knight R, Manjurano A, Changalucha J, Elias JE, et al. (2017). Seasonal Cycling in the Gut Microbiome of the Hadza Hunter-Gatherers of Tanzania Authors. Science 357, 802–806. 10.1126/science.aan4834. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Vangay P, Johnson AJ, Ward TL, Kashyap PC, Culhane-Pera KA, and Knights Correspondence D (2018). US Immigration Westernizes the Human Gut Microbiome. Cell 175, 962–972. 10.1016/j.cell.2018.10.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, Magris M, Hidalgo G, Baldassano RN, Anokhin AP, et al. (2012). Human gut microbiome viewed across age and geography. Nature 486, 222–227. 10.1038/nature11053. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Wibowo MC, Yang Z, Borry M, Hübner A, Huang KD, Tierney BT, Zimmerman S, Barajas-Olmos F, Contreras-Cubas C, García-Ortiz H, et al. (2021). Reconstruction of ancient microbial genomes from the human gut. Nature 594, 234–239. 10.1038/s41586-021-03532-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Sonnenburg ED, and Sonnenburg JL (2014). Starving our microbial self: The deleterious consequences of a diet deficient in microbiota-accessible carbohydrates. Cell Metab. 20, 779–786. 10.1016/j.cmet.2014.07.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Cordain L, Eaton SB, Sebastian A, Mann N, Lindeberg S, Watkins BA, O’Keefe JH, and Brand-Miller J (2005). Origins and evolution of the Western diet: health implications for the 21st century. Am. J. Clin. Nutr. 81, 341–354. 10.1093/ajcn.81.2.341. [DOI] [PubMed] [Google Scholar]
  • 12.Flint HJ, Scott KP, Duncan SH, Louis P, and Forano E (2012). Microbial degradation of complex carbohydrates in the gut. Gut Microb. 3, 289–306. 10.4161/GMIC.19897. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Bell A, and Juge N (2021). Mucosal glycan degradation of the host by the gut microbiota. Glycobiology 31, 691–696. 10.1093/GLYCOB/CWAA097. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Desai MS, Seekatz AM, Koropatkin NM, Kamada N, Hickey CA, Wolter M, Pudlo NA, Kitamoto S, Terrapon N, Muller A, et al. (2016). A Dietary Fiber-Deprived Gut Microbiota Degrades the Colonic Mucus Barrier and Enhances Pathogen Susceptibility. Cell 167, 1339–1353.e21. 10.1016/J.CELL.2016.10.043. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Pudlo NA, Urs K, Crawford R, Pirani A, Atherly T, Jimenez R, Terrapon N, Henrissat B, Peterson D, Ziemer C, et al. (2022). Phenotypic and Genomic Diversification in Complex Carbohydrate-Degrading Human Gut Bacteria. mSystems 7, e0094721. 10.1128/MSYS-TEMS.00947-21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Salyers AA, West SE, Vercellotti JR, and Wilkins TD (1977). Fermentation of mucins and plant polysaccharides by anaerobic bacteria from the human colon. Appl. Environ. Microbiol. 34, 529–533. 10.1128/aem.34.5.529-533.1977. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Sonnenburg JL, Xu J, Leip DD, Chen CH, Westover BP, Weatherford J, Buhler JD, and Gordon JI (2005). Glycan foraging in vivo by an intestine-adapted bacterial symbiont. Science 307, 1955–1959. 10.1126/SCIENCE.1109051/SUPPL_FILE/SONNENBURG.SOM.PDF. [DOI] [PubMed] [Google Scholar]
  • 18.Earle KA, Billings G, Sigal M, Lichtman JS, Hansson GC, Elias JE, Amieva MR, Huang KC, and Sonnenburg JL (2015). Quantitative Imaging of Gut Microbiota Spatial Organization. Cell Host Microbe 18, 478–488. 10.1016/j.chom.2015.09.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Martens EC, Neumann M, and Desai MS (2018). Interactions of commensal and pathogenic microorganisms with the intestinal mucosal barrier. Nat. Rev. Microbiol. 16, 457–470. 10.1038/s41579-018-0036-x. [DOI] [PubMed] [Google Scholar]
  • 20.Sonnenburg ED, Smits SA, Tikhonov M, Higginbottom SK, Wingreen NS, and Sonnenburg JL (2016). Diet-induced extinctions in the gut microbiota compound over generations. Nature 529, 212–215. 10.1038/nature16504. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Kaplan RC, Wang Z, Usyk M, Sotres-Alvarez D, Daviglus ML, Schneiderman N, Talavera GA, Gellman MD, Thyagarajan B, Moon J-Y, et al. (2019). Gut microbiome composition in the Hispanic Community Health Study/Study of Latinos is shaped by geographic relocation, environmental factors, and obesity. Genome Biol. 20, 219. 10.1186/s13059-019-1831-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Zafar H, and Saier MH (2021). Gut Bacteroides species in health and disease. Gut Microb. 13, 1–20. 10.1080/19490976.2020.1848158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Tett A, Pasolli E, Masetti G, Ercolini D, and Segata N (2021). Prevotella diversity, niches and interactions with the human host. Nat. Rev. Microbiol. 19, 585–599. 10.1038/s41579-021-00559-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Li H, Meier-Kolthoff JP, Hu C, Wang Z, Zhu J, Zheng W, Tian Y, and Guo F (2022). Panoramic Insights into Microevolution and Macroevolution of A Prevotella copri-containing Lineage in Primate Guts. Dev. Reprod. Biol. 20, 334–349. 10.1016/j.gpb.2021.10.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Abdill RJ, Adamowicz EM, and Blekhman R (2022). Public human microbiome data are dominated by highly developed countries. PLoS Biol. 20, e3001536. 10.1371/journal.pbio.3001536. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Accetto T, and Avguštin G (2015). Polysaccharide utilization locus and CAZYme genome repertoires reveal diverse ecological adaptation of Prevotella species. Syst. Appl. Microbiol. 38, 453–461. 10.1016/J.SYAPM.2015.07.007. [DOI] [PubMed] [Google Scholar]
  • 27.Li J, Gálvez EJC, Amend L, Almási É, Iljazovic A, Lesker TR, Bielecka AA, Schorr E-M, and Strowig T (2021). A versatile genetic toolbox for Prevotella copri enables studying polysaccharide utilization systems. EMBO J. 40, e108287. 10.15252/embj.2021108287. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Xu J, Bjursell MK, Himrod J, Deng S, Carmichael LK, Chiang HC, Hooper LV, and Gordon JI (2003). A Genomic View of the Human-Bacteroides thetaiotaomicron Symbiosis. Science 299, 2074–2076. 10.1126/science.1080029. [DOI] [PubMed] [Google Scholar]
  • 29.Bjursell MK, Martens EC, and Gordon JI (2006). Functional Genomic and Metabolic Studies of the Adaptations of a Prominent Adult Human Gut Symbiont, Bacteroides thetaiotaomicron, to the Suckling Period * □ S Downloaded from. J. Biol. Chem. 281, 36269–36279. 10.1074/jbc.M606509200. [DOI] [PubMed] [Google Scholar]
  • 30.Dodd D, Moon Y-H, Swaminathan K, Mackie RI, and Cann IKO (2010). Transcriptomic analyses of xylan degradation by Prevotella bryantii and insights into energy acquisition by xylanolytic bacteroidetes. J. Biol. Chem. 285, 30261–30273. 10.1074/jbc.M110.141788. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Fehlner-Peach H, Magnabosco C, Raghavan V, Scher JU, Tett A, Cox LM, Gottsegen C, Watters A, Wiltshire-Gordon JD, Segata N, et al. (2019). Distinct Polysaccharide Utilization Profiles of Human Intestinal Prevotella copri Isolates. Cell Host Microbe 26, 680–690.e5. 10.1016/j.chom.2019.10.013.
  • 32. Sprockett DD, Martin M, Costello EK, Burns AR, Holmes SP, Gurven MD, and Relman DA (2020). Microbiota assembly, structure, and dynamics among Tsimane horticulturalists of the Bolivian Amazon. Nat. Commun. 11, 3772. 10.1038/s41467-020-17541-6.
  • 33. De Filippis F, Pasolli E, Tett A, Tarallo S, Naccarati A, De Angelis M, Neviani E, Cocolin L, Gobbetti M, Segata N, and Ercolini D (2019). Distinct Genetic and Functional Traits of Human Intestinal Prevotella copri Strains Are Associated with Different Habitual Diets. Cell Host Microbe 25, 444–453.e3. 10.1016/j.chom.2019.01.004.
  • 34. Tett A, Huang KD, Asnicar F, Fehlner-Peach H, Pasolli E, Karcher N, Armanini F, Manghi P, Bonham K, Zolfo M, et al. (2019). The Prevotella copri Complex Comprises Four Distinct Clades Underrepresented in Westernized Populations. Cell Host Microbe 26, 666–679.e7. 10.1016/j.chom.2019.08.018.
  • 35. Marlowe FW, and Berbesque JC (2009). Tubers as fallback foods and their impact on Hadza hunter-gatherers. Am. J. Phys. Anthropol. 140, 751–758. 10.1002/ajpa.21040.
  • 36. Monteiro CA, Moubarac J-C, Cannon G, Ng SW, and Popkin B (2013). Ultra-processed products are becoming dominant in the global food system. Obes. Rev. 14, 21–28. 10.1111/obr.12107.
  • 37. Luis AS, Jin C, Pereira GV, Glowacki RWP, Gugel SR, Singh S, Byrne DP, Pudlo NA, London JA, Baslé A, et al. (2021). A single sulfatase is required to access colonic mucin by a gut bacterium. Nature 598, 332–337. 10.1038/s41586-021-03967-5.
  • 38. Gálvez EJC, Iljazovic A, Amend L, Lesker TR, Renault T, Thiemann S, Hao L, Roy U, Gronow A, Charpentier E, et al. (2020). Distinct Polysaccharide Utilization Determines Interspecies Competition between Intestinal Prevotella spp. Cell Host Microbe 28, 838–852. 10.1016/j.chom.2020.09.012.
  • 39. Aakko J, Pietilä S, Toivonen R, Rokka A, Mokkala K, Laitinen K, Elo L, and Hänninen A (2020). A carbohydrate-active enzyme (CAZy) profile links successful metabolic specialization of Prevotella to its abundance in gut microbiota. Sci. Rep. 10, 12411. 10.1038/s41598-020-69241-2.
  • 40. Sonnenburg ED, Zheng H, Joglekar P, Higginbottom SK, Firbank SJ, Bolam DN, and Sonnenburg JL (2010). Specificity of Polysaccharide Use in Intestinal Bacteroides Species Determines Diet-Induced Microbiota Alterations. Cell 141, 1241–1252. 10.1016/j.cell.2010.05.005.
  • 41. Dubos RJ, and Pierce C (1948). The effect of diet on experimental tuberculosis of mice. Am. Rev. Tuberc. 57, 287–293.
  • 42. Gorvitovskaia A, Holmes SP, and Huse SM (2016). Interpreting Prevotella and Bacteroides as biomarkers of diet and lifestyle. Microbiome 4, 15. 10.1186/s40168-016-0160-7.
  • 43. Kovatcheva-Datchary P, Nilsson A, Akrami R, Lee YS, De Vadder F, Arora T, Hallen A, Martens E, Björck I, and Bäckhed F (2015). Dietary Fiber-Induced Improvement in Glucose Metabolism Is Associated with Increased Abundance of Prevotella. Cell Metab. 22, 971–982. 10.1016/j.cmet.2015.10.001.
  • 44. Raman AS, Gehrig JL, Venkatesh S, Chang H-W, Hibberd MC, Subramanian S, Kang G, Bessong PO, Lima AAM, Kosek MN, et al. (2019). A sparse covarying unit that describes healthy and impaired human gut microbiota development. Science 365, eaau4735. 10.1126/science.aau4735.
  • 45. Scher JU, Sczesnak A, Longman RS, Segata N, Ubeda C, Bielski C, Rostron T, Cerundolo V, Pamer EG, Abramson SB, et al. (2013). Expansion of intestinal Prevotella copri correlates with enhanced susceptibility to arthritis. eLife 2, e01202. 10.7554/eLife.01202.001.
  • 46. Pedersen HK, Gudmundsdottir V, Nielsen HB, Hyotylainen T, Nielsen T, Jensen BAH, Forslund K, Hildebrand F, Prifti E, Falony G, et al. (2016). Human gut microbes impact host serum metabolome and insulin sensitivity. Nature 535, 376–381. 10.1038/nature18646.
  • 47. Lancaster SM, Lee-McMullen B, Abbott CW, Quijada JV, Hornburg D, Park H, Perelman D, Peterson DJ, Tang M, Robinson A, et al. (2022). Global, distinctive, and personal changes in molecular and microbial profiles by specific fibers in humans. Cell Host Microbe 30, 848–862.e7. 10.1016/j.chom.2022.03.036.
  • 48. Ng KM, Aranda-Díaz A, Tropini C, Frankel MR, Van Treuren W, O’Loughlin CT, Merrill BD, Yu FB, Pruss KM, Oliveira RA, et al. (2019). Recovery of the Gut Microbiota after Antibiotics Depends on Host Diet, Community Context, and Environmental Reservoirs. Cell Host Microbe 26, 650–665.e4. 10.1016/j.chom.2019.10.011.
  • 49. Yang Y, Zhao L-G, Wu Q-J, Ma X, and Xiang Y-B (2015). Association between dietary fiber and lower risk of all-cause mortality: a meta-analysis of cohort studies. Am. J. Epidemiol. 181, 83–91. 10.1093/aje/kwu257.
  • 50. Kim Y, and Je Y (2014). Dietary fiber intake and total mortality: a meta-analysis of prospective cohort studies. Am. J. Epidemiol. 180, 565–573. 10.1093/aje/kwu174.
  • 51. Parada AE, Needham DM, and Fuhrman JA (2016). Every base matters: assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples. Environ. Microbiol. 18, 1403–1414. 10.1111/1462-2920.13023.
  • 52. Kolmogorov M, Armstrong J, Raney BJ, Streeter I, Dunn M, Yang F, Odom D, Flicek P, Keane TM, Thybert D, et al. (2018). Chromosome assembly of large and complex genomes using multiple references. Genome Res. 28, 1720–1732. 10.1101/gr.236273.118.
  • 53. Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, Olson R, Overbeek R, Parrello B, Pusch GD, et al. (2015). RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci. Rep. 5, 8365. 10.1038/srep08365.
  • 54. Olm MR, Brown CT, Brooks B, and Banfield JF (2017). dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 11, 2864–2868. 10.1038/ismej.2017.126.
  • 55. Langmead B, and Salzberg SL (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359. 10.1038/nmeth.1923.
  • 56. Olm MR, Crits-Christoph A, Bouma-Gregson K, Firek BA, Morowitz MJ, and Banfield JF (2021). inStrain profiles population microdiversity from metagenomic data and sensitively detects shared microbial strains. Nat. Biotechnol. 39, 727–736. 10.1038/s41587-020-00797-0.
  • 57. Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, et al. (2020). SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272. 10.1038/s41592-019-0686-2.
  • 58. Lee MD (2019). GToTree: a user-friendly workflow for phylogenomics. Bioinformatics 35, 4162–4164. 10.1093/bioinformatics/btz188.
  • 59. Chaumeil P-A, Mussig AJ, Hugenholtz P, and Parks DH (2019). GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925–1927. 10.1093/bioinformatics/btz848.
  • 60. Letunic I, and Bork P (2021). Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296. 10.1093/nar/gkab301.
  • 61. Drula E, Garron M-L, Dogan S, Lombard V, Henrissat B, and Terrapon N (2022). The carbohydrate-active enzyme database: functions and literature. Nucleic Acids Res. 50, D571–D577. 10.1093/nar/gkab1045.
  • 62. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, and Madden TL (2009). BLAST+: architecture and applications. BMC Bioinf. 10, 421. 10.1186/1471-2105-10-421.
  • 63. Mistry J, Finn RD, Eddy SR, Bateman A, and Punta M (2013). Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 41, e121. 10.1093/nar/gkt263.
  • 64. Ewels P, Magnusson M, Lundin S, and Käller M (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048. 10.1093/bioinformatics/btw354.
  • 65. Bolger AM, Lohse M, and Usadel B (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120. 10.1093/bioinformatics/btu170.
  • 66. Kim D, Paggi JM, Park C, Bennett C, and Salzberg SL (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915. 10.1038/s41587-019-0201-4.
  • 67. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, and Li H (2021). Twelve years of SAMtools and BCFtools. GigaScience 10, giab008. 10.1093/gigascience/giab008.
  • 68. Pertea M, Kim D, Pertea GM, Leek JT, and Salzberg SL (2016). Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat. Protoc. 11, 1650–1667. 10.1038/nprot.2016.095.
  • 69. Love MI, Huber W, and Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550. 10.1186/s13059-014-0550-8.
  • 70. Wickham H, Averick M, Bryan J, Chang W, McGowan L, François R, Grolemund G, Hayes A, Henry L, Hester J, et al. (2019). Welcome to the Tidyverse. J. Open Source Softw. 4, 1686. 10.21105/joss.01686.
  • 71. Wickham H (2022). stringr: Simple, Consistent Wrappers for Common String Operations.
  • 72. Wickham H, and Bryan J (2023). readxl: Read Excel Files.
  • 73. Wickham H (2023). svglite: An SVG Graphics Device. https://svglite.r-lib.org/.
  • 74. Gu Z, Gu L, Eils R, Schlesner M, and Brors B (2014). circlize implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812. 10.1093/bioinformatics/btu393.
  • 75. Gu Z (2022). Complex heatmap visualization. iMeta 1, e43. 10.1002/imt2.43.
  • 76. Sakai R, Winand R, Verbeiren T, Moere AV, and Aerts J (2014). dendsort: modular leaf ordering methods for dendrogram representations in R. F1000Research 3. 10.12688/f1000research.4784.1.
  • 77. Wickham H (2016). Programming with ggplot2. In ggplot2: Elegant Graphics for Data Analysis, Use R!, Wickham H, ed. (Springer International Publishing), pp. 241–253. 10.1007/978-3-319-24277-4_12.
  • 78. Hahsler M, Hornik K, and Buchta C (2008). Getting Things in Order: An Introduction to the R Package seriation. J. Stat. Softw. 25, 1–34. 10.18637/jss.v025.i03.
  • 79. Pruss KM, and Sonnenburg JL (2021). C. difficile exploits a host metabolite produced during toxin-mediated disease. Nature 593, 261–265. 10.1038/s41586-021-03502-6.
  • 80. Green ED, Gunter C, Biesecker LG, Di Francesco V, Easter CL, Feingold EA, Felsenfeld AL, Kaufman DJ, Ostrander EA, Pavan WJ, et al. (2020). Strategic vision for improving human health at The Forefront of Genomics. Nature 586, 683–692. 10.1038/s41586-020-2817-4.
  • 81. Fragiadakis GK, Smits SA, Sonnenburg ED, Van Treuren W, Reid G, Knight R, Manjurano A, Changalucha J, Dominguez-Bello MG, Leach J, and Sonnenburg JL (2019). Links between environment, diet, and the hunter-gatherer microbiome. Gut Microb. 10, 216–227. 10.1080/19490976.2018.1494103.
  • 82. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, and Madden TL (2008). NCBI BLAST: a better web interface. Nucleic Acids Res. 36, W5–W9. 10.1093/nar/gkn201.
  • 83. Prjibelski A, Antipov D, Meleshko D, Lapidus A, and Korobeynikov A (2020). Using SPAdes De Novo Assembler. Curr. Protoc. Bioinforma. 70, e102. 10.1002/cpbi.102.
  • 84. Mikheenko A, Prjibelski A, Saveliev V, Antipov D, and Gurevich A (2018). Versatile genome assembly evaluation with QUAST-LG. Bioinformatics 34, i142–i150. 10.1093/bioinformatics/bty266.

Associated Data

Data Availability Statement

Raw data files for WGS and RNAseq can be found at Zenodo: https://doi.org/10.5281/zenodo.7651179. Code used to generate the figures and additional data can be found at Zenodo: https://doi.org/10.5281/zenodo.8339517. Isolate genomes will be available at NCBI: PRJNA1015720 upon publication. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

RESOURCES