Journal of Crohn's & Colitis. 2024 Jun 5;18(11):1819–1831. doi: 10.1093/ecco-jcc/jjae084

Gut Microbial Species and Endotypes Associate with Remission in Ulcerative Colitis Patients Treated with Anti-TNF or Anti-integrin Therapy

Fiona B Tamburini 1, Anupriya Tripathi 2, Maxwell P Gold 3, Julianne C Yang 4, Tommaso Biancalani 5, Jacqueline M McBride 6, Mary E Keir 7, GARDENIA Study Group 8
PMCID: PMC11532613  PMID: 38836628

Abstract

Background and Aims

The gut microbiota contributes to aberrant inflammation in inflammatory bowel disease, but the bacterial factors causing or exacerbating inflammation are not fully understood. Further, the predictive or prognostic value of gut microbial biomarkers for remission in response to biologic therapy is unclear.

Methods

We perform whole metagenomic sequencing of 550 stool samples from 287 ulcerative colitis patients from a large, phase 3, head-to-head study of infliximab and etrolizumab.

Results

We identify several bacterial species in baseline and/or post-treatment samples that associate with clinical remission. These include previously described associations [Faecalibacterium prausnitzii_F] as well as new associations with remission to biologic therapy [Flavonifractor plautii]. We build multivariate models and find that gut microbial species are better predictors for remission than clinical variables alone. Finally, we describe patient groups that differ in microbiome composition and remission rate after induction therapy, suggesting the potential utility of microbiome-based endotyping.

Conclusions

In this large study of ulcerative colitis patients, we show that few individual species associate strongly with clinical remission, but multivariate models including microbiome can predict clinical remission and have better predictive power compared with clinical data alone.

Keywords: Microbiome, inflammatory bowel disease, biologic therapy

1. Introduction

Inflammatory bowel disease [IBD] is characterised by chronic relapsing and remitting gastrointestinal inflammation. The causes of IBD are complex and multifactorial, and they involve an inappropriate immune response to the gastrointestinal microbiota in genetically susceptible individuals.1 Multiple studies have profiled gut microbiota of ulcerative colitis [UC] or Crohn’s disease [CD] patients relative to non-IBD controls, revealing taxonomic differences including decreased α-diversity and increased relative abundance of pathobionts such as Enterobacteriaceae and Ruminococcus gnavus in IBD.2–5

Treatment options for IBD have increased over the past decade and many patients with moderately-to-severely active disease are prescribed anti-tumour necrosis factor alpha [anti-TNFα] or anti-integrin therapy as a first-line treatment, posing the question of how predictive the microbiome is for response to therapy and what changes treatment induces. Studies that have described baseline microbiome associations with biologic treatment-induced IBD remission have suggested that baseline α-diversity and Faecalibacterium prausnitzii abundance positively associate with remission after biologic treatment.6–9 Longitudinal microbiome alterations have been observed after biologic treatment, including decreased Enterobacteriaceae6 and increased microbial diversity and abundance of butyrate and butyrate producers in responders.7,9–12 However, existing gut microbiome studies in IBD patients treated with biologics are limited by small sample size, often focus on CD patients or paediatric populations, or lack the random allocation of patients to study arms and comprehensive measurements of disease activity used in the clinical trial setting.6,10,11,13–15 To date, the predictive value of the microbiome and specific effects of first-line treatments on the microbiota in adults with UC have not been well-described.

The GARDENIA trial randomised 397 adults with moderate-to-severe UC to etrolizumab [anti-β7 integrin] or infliximab [anti-TNF] treatment and measured Mayo Clinic Score [MCS] clinical response and clinical remission at Week 10 and Week 54.16 Although GARDENIA did not show statistical superiority of etrolizumab [ETRO] over infliximab [INFL] for primary endpoint measures, 21% of patients in ETRO and 33% in INFL achieved the secondary endpoint of clinical remission at Week 10.16 This large, well-controlled study offers the opportunity to evaluate the microbiome as both a prognostic biomarker for treatment response to anti-TNF and anti-integrin therapy and to measure microbiome changes following treatment.

Stool samples for microbiome analysis were collected from patients enrolled in GARDENIA at baseline, Week 10, and Week 54 along with disease measurements, patient characteristics, and serum and stool biomarkers. We performed whole metagenomic sequencing [WMS] of stool for 287 patients [n = 140 ETRO, n = 147 INFL]. We identify bacterial species that associate with predefined clinical remission criteria in cross-sectional analysis at both baseline and post-treatment timepoints. We demonstrate that including microbiome data in predictive models improves performance over clinical data alone. Finally, we group patients by microbiome composition and describe a patient cluster with a low remission rate and low α-diversity that is enriched for pathobiont species.

2. Materials and Methods

2.1. Study design

GARDENIA [NCT02136069] was a randomised, double-blinded, double-dummy, parallel-group, phase 3 study that enrolled patients with moderate-to-severe ulcerative colitis [MCS of 6–12] with an established UC diagnosis for at least 3 months who were naive to anti-TNF drugs.16 Co-primary endpoints were clinical response at Week 10 [≥ 3-point decrease and ≥ 30% reduction in MCS from baseline, and ≥ 1-point decrease in rectal bleeding subscore or rectal bleeding score of 0 or 1] and clinical remission at Week 54 [MCS ≤ 2 with subscores ≤ 1]. Secondary outcome measures included clinical remission at Week 10 [MCS ≤ 2 with subscores ≤ 1].

2.2. Outcome derivation

Some patients have entirely or partially missing MCS data at Week 10. Patients were included in analysis wherever sufficient data were present to determine clinical remission status at Week 10. Patients with entirely missing MCS data were excluded from analysis. For patients with partially missing data, if the available data were sufficient to determine outcome, they were retained in the analysis. For example, a patient with a non-missing rectal bleeding score of > 1 would be considered a clinical non-remitter irrespective of missing data because the rectal bleeding score alone fails the criteria for clinical remission.
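The derivation rule for partially missing data can be illustrated with a short sketch. This is a minimal example, not the study's code: the subscore key names are hypothetical, missing values are assumed to be stored as `None`, and the early exit on the running total is our own assumption that follows from the MCS ≤ 2 criterion.

```python
from typing import Optional

# Hypothetical key names for the four Mayo Clinic Score subscores.
SUBSCORES = ("stool_frequency", "rectal_bleeding", "endoscopy", "physician_global")


def remission_status(scores: dict) -> Optional[bool]:
    """Return True (remitter), False (non-remitter), or None (undetermined).

    Remission is defined as total MCS <= 2 with every subscore <= 1.
    Missing subscores are absent from `scores` or set to None.
    """
    observed = [scores[k] for k in SUBSCORES if scores.get(k) is not None]
    if not observed:
        return None  # entirely missing MCS data: excluded from analysis
    if any(s > 1 for s in observed):
        return False  # one failing subscore is enough to rule out remission
    if sum(observed) > 2:
        return False  # assumption: partial total already exceeds the cutoff
    if len(observed) == len(SUBSCORES):
        return True  # complete data and all criteria met
    return None  # partial data that cannot decide the outcome
```

For instance, `remission_status({"rectal_bleeding": 2})` returns `False` even though the other three subscores are missing, which mirrors the rectal bleeding example given in the text.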

2.3. Stool sample collection

Stool samples were self-collected at home by patients prior to bowel preparation at screening [baseline], Week 10, and Week 54, and were kept cool and transported to the clinical site within 24 h of collection. Stool samples were aliquoted into plastic tubes by nurses at the study site, frozen at -20°C, shipped to the central laboratory within 2 days and stored at -80°C.

2.4. Metagenomic sequence data generation

Metagenomic DNA was extracted from frozen stool using the MoBio PowerMag Microbiome RNA/DNA Isolation Kit [QIAGEN] as per manufacturer’s instructions, including the recommended incubation step at 70°C for 10 min prior to mechanical lysis. Samples were shipped to Diversigen [New Brighton, MN] for metagenomic library preparation and sequencing. Sequence libraries were generated using the Illumina Nextera DNA Flex kit and 2 × 150 base pair sequencing was performed on an Illumina NovaSeq 6000.

2.5. Metagenomic sequence data analysis

2.5.1. Data processing and taxonomy classification

Metagenomic sequences were de-replicated using PrinSeq v0.20.432 and trimmed using trimmomatic v0.39.33 Processed reads were mapped to a database of bacterial sequences flagged as human contamination34 with bowtie235 and unmapped reads were retained for taxonomy classification. Reads were classified using kraken v2.1.136 with a confidence parameter of 0.2 [“--confidence 0.2”] and bracken v2.537 with a read threshold of 250 [“-t 250”] using a custom reference database containing 112 320 genomes from 64 468 species [61 220 bacterial, 3248 archaeal] from the Genome Taxonomy Database [GTDB] release 207.17,18 Database construction is described in detail in Byrd et al., 2020.22 Briefly, five genomes per species from RefSeq and/or Genbank were randomly chosen per GTDB species to avoid biasing taxonomy profiles to species with a large number of genomes available. Then, a kraken2 database was built according to the GTDB taxonomy tree.

2.5.2. Plotting and statistical analysis

Statistical analysis and figure generation were performed in R v4.3.138 using packages DESeq2 v1.42.0,39 DirichletMultinomial v1.44.0,40 Maaslin2 v1.16.0,41 RColorBrewer v1.1.3,42 cowplot v1.1.3,43 genefilter v1.84.0,44 ggalluvial v0.12.5,45,46 ggbeeswarm v0.7.2,47 ggplot2 v3.4.4,48 ggpubr v0.6.0,49 ggrepel v0.9.5,50 ggupset v0.3.0,51 ggvenn v0.1.10,52 here v1.0.1,53 reshape2 v1.4.4,54 table1 v1.4.3,55 tidyverse v2.0.0,56 and vegan v2.6.4.57

2.5.3. Species filtering by horizontal coverage

Horizontal genome coverage was calculated for each species in each metagenome as previously described.58 Briefly, sequence reads were aligned to reference genomes from the GTDB for each species detected by kraken2. Alignments were processed with samtools v1.1159 and custom scripts. To eliminate spurious species classifications resulting from reference genome contamination, counts for species with less than 1% horizontal coverage of the genome were discarded and a zero value was substituted in the taxonomy count table. We subsequently refer to this as coverage-filtered data.
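The filtering step can be sketched as follows. This is an illustrative example only: the nested-dictionary layout and the function name are assumptions, and horizontal coverage is taken here to be a precomputed fraction between 0 and 1 for each species in each sample.

```python
def coverage_filter(counts, coverage, min_coverage=0.01):
    """Zero out counts for species whose horizontal coverage is below the cutoff.

    counts:   {sample: {species: read count}}
    coverage: {sample: {species: fraction of reference genome covered, 0..1}}
    Species with no coverage value are treated as uncovered (assumption).
    """
    filtered = {}
    for sample, species_counts in counts.items():
        sample_cov = coverage.get(sample, {})
        filtered[sample] = {
            species: (n if sample_cov.get(species, 0.0) >= min_coverage else 0)
            for species, n in species_counts.items()
        }
    return filtered
```

Substituting zero rather than dropping the species keeps the count table rectangular, so every sample retains the same set of columns for downstream analysis.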

2.5.4. Alpha and beta diversity

Alpha diversity metrics were calculated from coverage-filtered data using the diversity function from vegan v2.6.4. To control for sequence depth, species-level counts were rarefied to the number representing the lowest count total of all samples prior to alpha diversity calculation. PERMANOVA analyses were performed using the adonis2 function from vegan v2.6.4 using parameter ‘na.action=na.omit’, with a separate model run for each covariate unless otherwise specified; p-values were adjusted using the Benjamini–Hochberg [BH] procedure to control the false-discovery rate.
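To make these steps concrete, the sketch below shows rarefaction to a common depth, Shannon diversity, and the BH adjustment in plain Python. It is not the analysis code, which used vegan in R. The function names are hypothetical, subsampling without replacement is an assumption about how rarefaction was done, and the natural logarithm matches vegan's documented default for Shannon diversity.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from {species: count}."""
    rng = random.Random(seed)
    # Expanding to one entry per read is simple but memory-hungry;
    # acceptable for a sketch, not for millions of reads.
    pool = [species for species, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rarefied = dict.fromkeys(counts, 0)
    for species in rng.sample(pool, depth):
        rarefied[species] += 1
    return rarefied


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over species with non-zero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `depth` would be set to the smallest per-sample count total, so that every sample is compared at the same sequencing depth before `shannon` is applied.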

2.5.5. Differential abundance and prevalence

Differential abundance testing to identify species associated with clinical remission was performed with DESeq2 v1.42.0 on coverage-filtered data. DESeq2 was selected due to results from benchmarking studies demonstrating that DESeq2 is able to control the false-positive rate without being overly conservative.60,61 Prior to DESeq2 analysis, a prevalence filter was applied to the coverage-filtered, species-level, taxonomy table to retain only species present in >10% of samples.60 Prevalence filtering was performed separately for each visit. Size factor estimation was performed using “poscounts” [argument ‘type=”poscounts”’ to function “estimateSizeFactors”] instead of the default median ratio method [“ratio”]. The “DESeq” function was run using parameters ‘fitType=”local”’ and ‘test=”Wald”’. DESeq2 was run with the formula “~clinical remission + treatment arm” to control for the effect of treatment, or with the formula “~clinical remission + treatment arm + clinical remission:treatment arm” to assess the interaction between remission and treatment. The “results” function was used to obtain log2 fold change values, and nominal and BH-adjusted p-values. Results were inspected to confirm model convergence.
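As an illustration of the prevalence filter applied before DESeq2, the following sketch keeps only species detected in more than 10% of samples. The table layout and function name are assumptions made for the example; in the study the filter was applied separately to the samples from each visit.

```python
def prevalence_filter(table, min_prevalence=0.10):
    """Keep species present (count > 0) in more than `min_prevalence` of samples.

    table: {species: [count in sample 1, count in sample 2, ...]}
    """
    kept = {}
    for species, counts in table.items():
        prevalence = sum(1 for c in counts if c > 0) / len(counts)
        if prevalence > min_prevalence:  # strictly greater than the cutoff
            kept[species] = counts
    return kept
```

Because prevalence is recomputed on whichever samples are passed in, calling the function once per visit can retain a different set of species at baseline than at Week 10.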

DESeq2 was used to identify species changing in abundance over time by treatment arm and by remission status, as described in the DESeq2 vignette [https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#group-specific-condition-effects-individuals-nested-within-groups]. A term was added to the model to represent individual-specific effects: using Week 10 as an example, the formula “~treatment + treatment:individual + treatment:visit” was used to assess changes over time with respect to treatment and the formula “~clinical remission + clinical remission:individual + clinical remission:visit” was used to assess changes with respect to clinical remission.

Logistic regression to determine differential prevalence was performed using “glm” in R with the parameter “family=binomial” with the formula “taxon presence ~ outcome + treatment arm”.

2.5.6. Cluster identification

Dirichlet multinomial mixture [DMM] groups were identified using DirichletMultinomial v1.44.0. To allow comparison of clusters across visits, baseline and post-induction samples were clustered simultaneously. Prior to clustering, species observed in fewer than two samples were removed. DMM models were fit to one through 10 clusters, and three was selected as the ideal number of clusters based on the lowest Laplace approximation score. Testing for association between clinical data and DMM groups was performed using one-versus-all logistic regression using the “glm” function in R with parameter “family=binomial” and formula “DMM group ~ baseline age + baseline BMI + sex + race + treatment arm + region + tobacco use history + disease extent + disease duration + Week 10 CRP + Week 10 faecal calprotectin”.

2.6. Multivariate models

2.6.1. Classifiers

We considered two classifiers from the scikit-learn package62 [version 1.1.1], LogisticRegression and GradientBoostingClassifier. In all cases, the random_state parameter was fixed [set to 2022]. The input predictors for these models were either microbiome count data for 490 high-quality baseline microbes [coverage-filtered data], 36 clinical variables, or both. The microbiome data were first centred log-ratio [CLR] transformed and then both the microbe and clinical data underwent z-score normalisation. Each classifier was trained to predict clinical remission status.
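The two transformations can be sketched in a few lines. This is a minimal example under stated assumptions: the text does not say how zero counts were handled before taking logarithms, so the pseudocount of 1 is our own choice, and the z-score uses the population standard deviation, as scikit-learn's StandardScaler does. The function names are hypothetical.

```python
import math
import statistics


def clr(counts, pseudocount=1.0):
    """Centred log-ratio transform of one sample's species counts.

    The pseudocount (an assumption) avoids log(0) for undetected species.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def zscore_columns(rows):
    """Standardise each feature (column) to mean 0 and unit variance."""
    scaled_columns = []
    for column in zip(*rows):
        mean = statistics.fmean(column)
        sd = statistics.pstdev(column)
        # Constant features carry no information; map them to 0.
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]
```

CLR is applied per sample (across species), whereas z-scoring is applied per feature (across samples), so the order of the two steps matters.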

2.6.2. Feature selection methods

A small number of missing values were mean-imputed. Models that instead excluded these samples yielded similar results. Categorical features [eg, sex, country] were one-hot encoded. When training models with feature selection, t tests were performed between the classes on the training data and N features were optimally chosen based on cross-validation performance by considering the top N/2 features associated with remitters and top N/2 features associated with non-remitters.
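The balanced selection of N/2 features per class can be sketched as below. This is an illustration rather than the study's implementation: the text does not specify the t test variant or the ranking criterion, so the use of Welch's statistic and ranking by the signed t value are assumptions, and the function names are hypothetical.

```python
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples (each needs at least two values)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return 0.0 if se == 0 else (statistics.fmean(a) - statistics.fmean(b)) / se


def select_features(X, y, n_features):
    """Return column indices: top N/2 higher in remitters, top N/2 higher in non-remitters.

    X: list of rows (training samples only); y: 1 = remitter, 0 = non-remitter.
    """
    half = n_features // 2
    scored = []
    for j in range(len(X[0])):
        remitters = [row[j] for row, label in zip(X, y) if label == 1]
        non_remitters = [row[j] for row, label in zip(X, y) if label == 0]
        scored.append((welch_t(remitters, non_remitters), j))
    ranked = sorted(scored)  # most negative t first
    toward_non_remitters = [j for t, j in ranked[:half] if t < 0]
    toward_remitters = [j for t, j in ranked[::-1][:half] if t > 0]
    return sorted(toward_remitters + toward_non_remitters)
```

Passing only training rows to `select_features` is what keeps information from held-out samples from leaking into the choice of features.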

2.6.3. Classifier runs and parameter sweeps

For each combination of dataset and model type, feature selection was performed. When only considering microbial variables or both microbial and clinical variables, feature selection was run to select the top N features from the set of [10,20,30,40,50,60,70,80,90,100]. When only considering clinical variables, feature selection was performed for the set of [10,20]. In each case, a parameter sweep was performed using 100 randomised rounds of 5-fold cross-validation [test size of 20%] of the sample and the optimal number of features was the one with the highest average PR-AUC across the 100 rounds. During each round of cross-validation, feature selection was performed on the training data only, and those features were used to train a model and predict results on the left-out samples. The optimal number of features for each parameter set is shown in Supplementary Table 7. Once the optimal number of features was chosen, each set of parameters was run through cross-validation for 500 rounds and a test size of 20% to attain the final reported results, which are included in Supplementary Table 7.
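One way to picture the repeated cross-validation is as a generator of train and test index lists. The sketch below is an assumption-laden illustration: the text does not state whether splits were stratified or which scikit-learn splitter was used, so plain shuffled 5-fold splitting is shown, with the seed borrowed from the random_state value reported above. The function name is hypothetical.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_rounds=100, seed=2022):
    """Yield (train, test) index lists for `n_rounds` shuffled k-fold rounds.

    With n_splits=5 each test fold holds about 20% of the samples.
    """
    rng = random.Random(seed)
    for _ in range(n_rounds):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[k::n_splits] for k in range(n_splits)]
        for k, test in enumerate(folds):
            train = [i for f, fold in enumerate(folds) if f != k for i in fold]
            yield train, test
```

Inside the loop over these splits, any feature selection and model fitting would use the `train` indices only, with PR-AUC computed on the `test` indices and then averaged across rounds to choose the number of features.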

3. Results

3.1. Baseline clinical characteristics are well matched whereas phylum-level microbiome differs between treatment arms

Patients were randomised to ETRO or INFL treatment arms, stratified by baseline concomitant steroid treatment [yes vs no], immunosuppressant treatment [yes vs no], and disease activity [MCS ≤ 9 vs ≥ 10].16 Patients with available microbiome data are well matched between treatments in baseline patient characteristics including age, sex, body mass index, disease duration, and biomarkers faecal calprotectin and C-reactive protein [Figure 1A–F, Table 1].

Figure 1.

Etrolizumab and infliximab treatment arms are comparable in clinical characteristics and differ in microbiome at baseline. [A–F] Patients [n = 140 etrolizumab, n = 147 infliximab] are matched in age [A], body mass index [BMI] [B], disease duration [C], faecal calprotectin [D], C-reactive protein [E] [Wilcoxon rank sum test, p > 0.05], and sex [F] [Fisher’s exact test, p > 0.05]. [G] Upset plot summarising the number of patients with microbiome data across each combination of study visits. [H] No statistically significant difference in α-diversity between etrolizumab and infliximab treatment arms at baseline [Wilcoxon rank sum test, p > 0.05]. [I] Geographical region and race explain a small but statistically significant proportion of microbiome variation in Bray–Curtis dissimilarity at baseline in univariate PERMANOVA [FDR q < 0.1]. [J] Phylum-level relative abundance by treatment arm at baseline. Phyla with a mean relative abundance of > 0.1% are shown. Etrolizumab and infliximab treatment arms differ in relative abundance of Firmicutes_A, Actinobacteriota, and Desulfobacterota_I at baseline [Wilcoxon rank sum test, p < 0.05]. Box plot hinges represent the first and third quartiles, whiskers represent the highest and lowest values within 1.5 times the interquartile range, horizontal line represents the median. [K–M] Multidimensional scaling coloured by geographical region [K], treatment arm [L], and Firmicutes_A/Actinobacteriota log ratio [M]. Statistical significance: [*] p < 0.05, [**] p < 0.01, [***] p < 0.001, [****] p < 0.0001, [ns] not significant. FDR, false-discovery rate.

Table 1.

Baseline patient characteristics.

Characteristic                  Etrolizumab [N = 140]    Infliximab [N = 147]
Age [years]
  Median [Min, Max]             38.0 [19.0, 74.0]        39.0 [18.0, 70.0]
Sex
  Female                        61 [43.6%]               51 [34.7%]
  Male                          79 [56.4%]               96 [65.3%]
Body mass index [kg/m²]
  Mean [SD]                     25.0 [4.64]              24.4 [4.53]
Disease duration [years]
  Median [Min, Max]             3.21 [0.279, 49.0]       4.24 [0.277, 33.7]
Mayo Clinic score
  Mean [SD]                     8.46 [1.58]              8.62 [1.56]
Mayo endoscopic subscore
  Mean [SD]                     2.56 [0.499]             2.61 [0.503]
Faecal calprotectin [μg/g]
  Median [Q1, Q3]               1270 [466, 2510]         1470 [464, 3610]
C-reactive protein [mg/L]
  Median [Q1, Q3]               3.22 [1.22, 8.45]        4.40 [1.59, 8.74]
Disease location
  Left-sided colitis            95 [67.9%]               95 [64.6%]
  Extensive colitis             19 [13.6%]               13 [8.8%]
  Pancolitis                    26 [18.6%]               39 [26.5%]

Min, Max, minimum to maximum; SD, standard deviation.

Stool samples were collected at baseline, Week 10, and Week 54. Not all patients had a sample available at all visits [Figure 1G]. Stool samples were subjected to DNA extraction and WMS to a mean depth of 9.3M ± 5.3M reads post-processing. Sequence depth was comparable between treatment arms [Wilcoxon rank sum test, p > 0.05]. Taxonomy profiling was performed using the Genome Taxonomy Database [GTDB, release 207], a robust bacterial and archaeal taxonomy based on whole-genome average nucleotide identity.17,18

At baseline, α-diversity metrics [species richness, Shannon diversity] do not differ between treatment arms [Figure 1H]. Permutational ANOVA [PERMANOVA] was used to identify patient characteristics associated with microbiome variation at baseline, as measured by Bray–Curtis dissimilarity. Although race and geographical region are statistically significant in univariate analysis [Benjamini–Hochberg false-discovery rate [FDR] q < 0.1] [Figure 1I, K], the amount of variation explained by these variables is very small [R2 = 0.042 and 0.03] compared with the high degree of inter-individual variation. Across all visits, patient identifier explains 78.6% [R2 = 0.786] of variation in Bray–Curtis dissimilarity [PERMANOVA p = 0.001]. Treatment arm is nominally significantly associated with microbiome variation at baseline [p = 0.032, Figure 1I, L] but does not meet a 10% FDR cutoff.

Firmicutes_A and Actinobacteriota are the most abundant phyla at baseline, and differ slightly in relative abundance between treatment arms: Firmicutes_A is nominally significantly more abundant in INFL [two-sided Wilcoxon rank sum test, p = 0.0497, FDR q = 0.12] and Actinobacteriota is more abundant in ETRO [FDR q = 0.068] [Figure 1J]. Desulfobacterota_I, a low-relative abundance phylum, is more abundant in INFL [FDR q = 0.068]. Controlling for treatment arm, Firmicutes_A does not associate with any clinical characteristics [Figure S1B]. Desulfobacterota_I associates with geographical region [Figure S1B, C] and positively correlates with baseline neutrophils [Figure S1B, D], whereas Actinobacteriota negatively associates with faecal calprotectin [Figure S1B, E] and positively associates with baseline serum albumin [Figure S1B, F]. There is an observable gradient between Firmicutes_A and Actinobacteriota relative abundance across patients at baseline [Figure 1M]. Overall, we observe a high degree of inter-individual microbiome variation in this cohort.

3.2. Heterogeneous baseline microbiome associations with Week 10 outcomes

To assess the prognostic value of baseline microbiome data, we sought to determine whether baseline microbiome associates with clinical remission after 10 weeks of induction therapy. Among all 397 patients in GARDENIA [n = 199 ETRO, n = 198 INFL], clinical remission rates at Week 10 were 21% in ETRO and 33% in INFL.16 Among the subset of 287 patients with microbiome data [n = 140 ETRO, n = 147 INFL], clinical remission rates were 26% for ETRO and 35% for INFL. At the community level, clinical remission at Week 10 does not significantly associate with variation in baseline gut microbiome [Figure 1I, PERMANOVA q > 0.1], or with α-diversity in either treatment arm [two-sided Wilcoxon test, p > 0.05] [Figure 2A].

Figure 2.

Baseline microbiome associations with Week 10 clinical remission. [A] Baseline α-diversity is not associated with Week 10 clinical remission in ETRO or INFL [Wilcoxon rank sum test, p > 0.05]. [B] Number of baseline bacterial species associated with Week 10 clinical remission in ETRO and INFL [DESeq2, FDR q < 0.1], coloured by phylum. [C] Heatmap showing associations between bacterial species and clinical remission in ETRO and INFL. Associations with species > 10% prevalence in both arms [ETRO and INFL] or > 20% prevalence in either arm [ETRO, INFL] are shown. Left column indicates phylum, middle columns indicate log2 fold difference of normalised counts between remitters and non-remitters in ETRO and INFL, asterisk indicates FDR q < 0.1, grey colour indicates q > 0.1. Right panel indicates prevalence of that species in ETRO and INFL, dashed line at 20%. [D] Venn diagram showing the number of associations overlapping in species and directionality between ETRO and INFL. R, remitter; NR, non-remitter; FDR, false-discovery rate; ETRO, etrolizumab; INFL, infliximab.

Noting that overall microbiome composition is not strongly associated with clinical remission at Week 10, we asked whether the baseline abundance of any individual species associates with clinical remission. For 170 patients [n = 85 ETRO, n = 85 INFL] with baseline data and non-missing clinical remission status [see Methods], we used DESeq2 to evaluate the relationship between species abundance, treatment arm, and clinical remission. We evaluated only species with at least 10% prevalence among baseline samples. Controlling for remission status [species ~ arm + remission], four species differ between ETRO and INFL at baseline. The most prevalent and abundant is Bifidobacterium breve, consistent with observed differences in Actinobacteriota between arms at baseline [Supplementary Figure S2A, B, Table S1].

Recognising the differing mechanisms of action of etrolizumab and infliximab as well as baseline microbiome differences between trial arms, we tested the hypothesis that species associate with outcome in a treatment-specific manner. To assess the relationship between treatment arm and outcome, we included an interaction term in our model [species ~ arm + remission + arm:remission] and identified 95 species with a significant interaction [FDR q < 0.1] [Supplementary Table S2]. Inspection of the normalised counts for each species with a significant interaction term reveals that many of these associations are driven by non-zero counts in all but one of the four groups [Supplementary Figure S3]. Examining the main effect of clinical remission with respect to each arm, we find 63 remission-associated species in ETRO [Figure 2B, Supplementary Figure S4, Table S3] and 64 in INFL [Figure 2B, Supplementary Figure S5, Table S4]. The majority of these associations are in the phyla Actinobacteriota and Firmicutes_A [Figure 2B], and most are low-prevalence species [Figure 2C]: one of 63 remission-associated species in ETRO [1.6%] and seven of 64 in INFL [11%] are more than 20% prevalent. Of these higher-prevalence species, Faecalibacterium prausnitzii_F is more abundant in remitters in ETRO and Faecalibacterium sp009758465 is more abundant in remitters in INFL. Conversely, Faecalibacterium prausnitzii is more abundant in non-remitters in INFL [Figure 2C], in contrast with observations that Faecalibacterium species generally associate with health.19 Other higher-prevalence species associated with remission include Flavonifractor plautii and Eggerthellaceae species Rubneribacter badeniensis and Eggerthella sp014287365, which are associated with non-remission in INFL [Figure 2C].

Eighteen associations overlap between ETRO and INFL [Figure 2D], including Proteobacteria species [Escherichia albertii, Enterobacter hormaechei_A, Burkholderiales species Parasutterella gallistercoris, and Mesosutterella massiliensis] and several Firmicutes_A species including Enterocloster aldenensis. In the model controlling for treatment arm without the interaction term, 21 species associate with remission, including Bacteroides, Collinsella, and Enterocloster species, though all of these species fall below 20% prevalence [Supplementary Figure S6, Table S1]. Examining the normalised counts for significant species, we observe a high degree of heterogeneity and many zero values [Supplementary Figures S3–6]. Thus, many significant associations are driven by high counts in a small number of individuals, and the resulting log2 fold change estimates and p-values are extreme, due to the presence of several non-zero counts in one group and entirely zero counts in another.

We compared these results with a publicly available dataset of baseline metagenomic measurements from IBD patients treated with biologics in the PRISM cohort from Lee et al.20 GARDENIA measured disease activity using the MCS, and PRISM used the Simple Clinical Colitis Activity Index [SCCAI] which lacks endoscopy. We re-processed baseline metagenomic data from UC patients in Lee et al. treated with anti-TNF [n = 11] or the anti-integrin vedolizumab [n = 37], using our computational workflows. PRISM metagenomic samples were sequenced less deeply than GARDENIA [Supplementary Figure S7A] and have similar species richness but higher Shannon diversity [Supplementary Figure S7B]. We observed evidence of a batch effect or bias between datasets [Supplementary Figure S7C] driven by the enrichment of Blautia_A species and Phocaeicola vulgatus in PRISM and Faecalibacterium and Bifidobacterium species in GARDENIA [Supplementary Figure S7D]. Due to the relatively small sample size, we did not assess treatment-specific associations in PRISM and instead sought to identify species associated with Week 14 clinical remission [SCCAI ≤ 2] controlling for biologic treatment; 36 species associated with Week 14 clinical remission [Supplementary Table S5], six of which have > 20% prevalence in PRISM [Supplementary Figure S7E]. Zero baseline associations with remission are shared between PRISM and GARDENIA, likely due to differences in DNA extraction methodology, outcome measures, and cohort characteristics.

3.3. Multivariate clinical and microbiome models predict clinical remission

Having observed statistically significant associations between predominantly low-prevalence individual species and clinical remission, we sought to evaluate the predictive power of 490 baseline microbiome species and 36 clinical variables to predict clinical remission in GARDENIA, using multivariate modelling. Clinical variables included demographic and anthropometric measurements, disease activity, and faecal and serum laboratory measurements [Supplementary Table S6].

We evaluated many models on INFL and ETRO data separately as well as on the combined dataset. Modelling was performed on: 1] clinical data only; 2] microbiome data only; and 3] microbiome plus clinical data. In each case, we considered two families of supervised classifiers: Ridge [L2 regularised] logistic regression [which captures linear relationships between the predictors and clinical remission] and gradient-boosted machine [or GBM, which captures non-linear relationships]. Additionally, we evaluated models with and without feature selection. For models using feature selection, we employed a t test on the training data during cross-validation. We built 36 models in total and evaluated their performance using area under the precision recall curve [AUPRC] and area under the receiver operating characteristic curve [AUROC]. These combinations of model, feature selection, and datasets are tabulated in Supplementary Table S7.

We used AUPRC to identify the best-performing models, as this measure of classifier performance is more suitable for imbalanced datasets. The best-performing model from ETRO is the logistic regression model with t test feature selection, with an AUROC of 0.784 and AUPRC of 0.612 [baseline AUPRC = 0.235, fold change AUPRC = 2.60] [Figure 3A, Supplementary Figure S8, Table S7]. This model includes 91 microbial features and 10 clinical features [Figure 3E] and outperforms all models containing only clinical data, including the model with the same classifier [logistic regression] and feature selection strategy [t test] that only used clinical features [Figure 3B]. In INFL, the best model performance was achieved using logistic regression with t test feature selection and microbial features only [311 microbes], with an AUROC of 0.653 and AUPRC of 0.473 [baseline AUPRC = 0.294, fold change AUPRC = 1.608] [Figure 3C, E, Supplementary Figure S8, Table S7]. Similarly, this model outperforms all clinical data-only models for INFL [Figure 3D]. Despite the smaller number of remitters in ETRO (20/85 [24%] in ETRO vs 25/85 [29%] in INFL), the best models in ETRO and INFL have similar performance, and the fold change in AUPRC over baseline is higher for ETRO.
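The baseline AUPRC and fold change figures can be checked with simple arithmetic. The sketch below relies on the standard result that a classifier with no skill has an expected AUPRC equal to the fraction of positive samples, which agrees with the remitter counts reported in this paragraph; the function names are hypothetical.

```python
def baseline_auprc(n_remitters, n_patients):
    """Expected AUPRC of a no-skill classifier: the prevalence of remitters."""
    return n_remitters / n_patients


def fold_change(auprc, baseline):
    """How many times better than the no-skill baseline a model performs."""
    return auprc / baseline


etro_baseline = baseline_auprc(20, 85)  # about 0.235
infl_baseline = baseline_auprc(25, 85)  # about 0.294

etro_fold = fold_change(0.612, etro_baseline)  # about 2.60
infl_fold = fold_change(0.473, infl_baseline)  # about 1.608
```

Reporting the fold change alongside the raw AUPRC is what makes the two arms comparable, since the arm with fewer remitters starts from a lower baseline.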

Figure 3.

Microbiome features improve prognostic models over clinical features alone. [A–B] Area under the precision recall curve [AUPRC] for the best-performing model in ETRO [clinical and microbial features, logistic regression, t test feature selection] [A] and the corresponding logistic regression model with t test feature selection that used only clinical data [B]. The model including microbial features [A] outperforms the clinical data-only model [B]. [C–D] AUPRC for the best-performing model in INFL [only microbial features, logistic regression, no feature selection] [C], and the corresponding logistic regression model with no feature selection that used only clinical data [D]. Again, the model containing microbial features [C] outperforms the clinical data-only model [D]. [E] Number of clinical and microbial features selected by the best-performing models in ETRO and INFL. [F–G] Top 15 features by magnitude of mean feature importance for ETRO [F] and INFL [G]. [H-I] Venn diagrams for overlapping microbial features with the same directionality of association between predictive models and significant remission-associated bacteria [FDR q < 0.1] from DESeq2 analysis in ETRO [H] and INFL [I]. FDR, false-discovery rate; ETRO, etrolizumab; INFL, infliximab.

We found that the best-performing models from ETRO and INFL separately each outperformed the best model on the combined datasets [Supplementary Figure S8, Table S7]. The best combined model also used logistic regression with t test feature selection and microbial and clinical predictors, achieving an AUROC of 0.643 and an AUPRC of 0.401 [baseline AUPRC = 0.265, fold change AUPRC = 1.516]. This is consistent with the observation that individual species associations varied between ETRO and INFL, and may reflect the baseline differences in gut microbiome composition or the differing mechanism of action of the biologics. Notably, models including microbiome predictors generally outperform their counterpart models that include only clinical predictors, demonstrating the added predictive value of microbiome measurements [Supplementary Figure S8, Table S7].

Top predictive features in the best ETRO model include Ruminococcus_E bromii_B, Gemmiger spp., Phocaeicola vulgatus, and baseline total MCS and endoscopy score [Figure 3F]. In INFL, top predictive features include Christensenella massiliensis and Phocaeicola vulgatus [Figure 3G]. Top predictive features from the best combined model overlapped with the top predictive features in the best INFL-only and ETRO-only models, including Bacteroides uniformis, Phocaeicola vulgatus, and baseline MCS.

Fifteen microbial species overlap between DESeq2 associations and the best predictive model in ETRO [Figure 3H], and 28 species overlap in INFL [Figure 3I]. The imperfect overlap is likely due to the differences in modelling approaches. In summary, models using microbial and optionally clinical predictors have moderate predictive power and perform better than models using only clinical data.
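The overlap described here counts a species only when it appears in both analyses and its association points in the same direction. A minimal sketch of that comparison is given below; the species names and effect values are hypothetical placeholders rather than study results, and the convention that a positive value means enrichment in remitters is an assumption made for illustration.

```python
def concordant_overlap(model_effects, deseq2_effects):
    """Return features found in both analyses whose effect signs agree.

    Both arguments map feature name -> signed effect (for example a model
    coefficient and a log2 fold change). Zero effects are never concordant.
    """
    shared = model_effects.keys() & deseq2_effects.keys()
    return {
        name
        for name in shared
        if model_effects[name] * deseq2_effects[name] > 0
    }


# Hypothetical placeholder values; positive = enriched in remitters (assumed).
model_coefs = {"species_A": 0.8, "species_B": -0.3, "species_C": 0.5, "species_D": -0.1}
deseq2_lfc = {"species_A": 1.2, "species_B": 0.9, "species_C": 2.0, "species_E": -1.5}

overlap = concordant_overlap(model_coefs, deseq2_lfc)
print(sorted(overlap))
```

In this toy input, species_B is shared but discordant in sign, and species_D and species_E each appear in only one analysis, so none of the three is counted.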

3.4. Microbiome associations with remission post-induction therapy

In addition to providing predictive or prognostic biomarkers, microbiome data may hold value for better understanding disease pathobiology. As host-microbe interactions may enter a more homeostatic state after treatment, microbes associated with IBD outcome after treatment have the potential to inform experimental hypotheses regarding host-microbiome interactions in IBD. Thus, we evaluated the association between microbiome composition at Week 10 and clinical remission in 213 patients with Week 10 microbiome data [n = 104 ETRO, n = 109 INFL]. Similar to findings at baseline, geographical region and race explain a significant amount of variation in Bray–Curtis dissimilarity in the Week 10 microbiome [univariate PERMANOVA, FDR q < 0.1, Supplementary Figure S9A]. Additionally, baseline age, disease extent, body mass index [BMI], disease duration, Mayo Clinic Score, and clinical remission are significantly associated with microbiome variation at Week 10 [Supplementary Figure S9A], though each of these variables explains only a small amount of variation [R² 0.008–0.03], consistent with observations at baseline. We observe no significant differences in α-diversity in the post-treatment microbiome between remitters and non-remitters at Week 10 [Supplementary Figure S9B].

We find 50 microbial associations with outcome in ETRO and 12 in INFL at Week 10 [Figure S9C, D, Table S8-10]. Among species present at > 20% prevalence, Faecalibacterium prausnitzii_J is more abundant in remitters in INFL, and Bifidobacterium pseudocatenulatum is more abundant in non-remitters in ETRO [Figure S9D]. Three associations overlap in directionality between ETRO and INFL [Collinsella sp900754275, Collinsella sp003458415, and Pauljensenia sp000758755] and are more abundant in non-remitters [Supplementary Figure S9D]. Concordance of associations with clinical remission is low across treatment arms and timepoints: only two species [Collinsella sp003458415, Pauljensenia sp000758755] consistently associate with non-remission across all analyses [Supplementary Figure S9E]. Controlling for treatment, six species are more abundant in non-remitters, one of which [Gemmiger formicilis_A] is more than 20% prevalent [Supplementary Table S11].

3.5. The gut microbiome is unstable during biologic treatment in UC

The gut microbiome can change in composition over time as a result of perturbations, including drug treatment or changes in diet. Longitudinal stability of the gut microbiome varies between healthy humans,21,22 and IBD patients have been shown to have decreased gut microbiome stability compared with healthy individuals.3 We assessed longitudinal changes in the gut microbiome of INFL- and ETRO-treated patients from baseline through visits at Week 10 and Week 54, and find no clear trajectory in overall microbiome composition through treatment [Supplementary Figure S10A]. We observe that longitudinal samples from the same patient are more similar on average than samples from distinct patients [Supplementary Figure S10B], suggesting that the microbiome is not entirely remodeled in composition throughout treatment.

For each patient, we calculated Bray–Curtis dissimilarity at Week 10 and Week 54 relative to baseline, and find that there is no significant difference in divergence from baseline between treatment arms at Week 10, but by Week 54 patients in INFL are on average more dissimilar from baseline relative to patients in ETRO [Supplementary Figure S10C]. No significant differences in dissimilarity from baseline are observed between remitters and non-remitters at either Week 10 or Week 54 [Supplementary Figure S10D]. Log2 fold change in α-diversity compared with baseline is not significantly different between treatment arms [Supplementary Figure S10E] or by clinical remission status at Week 10, but by Week 54, log2 fold change in Shannon diversity and species richness relative to baseline is higher in remitters [p < 0.05] [Supplementary Figure S10F].
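The per-patient quantities used in this paragraph (Bray–Curtis dissimilarity from baseline, and log2 fold change in Shannon diversity and species richness) can be sketched as follows. This is a plain-Python illustration on hypothetical species counts, not the study's pipeline, and it assumes the two samples have already been brought to a comparable sequencing depth.

```python
import math


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def shannon(counts):
    """Shannon diversity (natural log), ignoring absent species."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def richness(counts):
    """Number of species observed with non-zero abundance."""
    return sum(1 for c in counts if c > 0)


# Hypothetical counts for one patient; positions correspond to the same species.
baseline = [40, 30, 20, 10, 0]
week10 = [25, 25, 25, 15, 10]

divergence = bray_curtis(baseline, week10)
log2fc_shannon = math.log2(shannon(week10) / shannon(baseline))
log2fc_richness = math.log2(richness(week10) / richness(baseline))

print(f"Bray-Curtis from baseline: {divergence:.3f}")
print(f"log2 FC Shannon: {log2fc_shannon:.3f}, log2 FC richness: {log2fc_richness:.3f}")
```

A positive log2 fold change indicates that diversity increased relative to baseline, which is the direction reported for remitters at Week 54.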

We tested the hypothesis that individual bacterial species change in abundance from baseline to Week 10 or Week 54 across treatment arms or between remitters and non-remitters. For all patients with baseline and Week 10 and/or Week 54 data available, we used DESeq2 controlling for individual variation [see Methods] to determine the relationship between species abundance and: [1] treatment arm; and [2] clinical remission. From baseline to Week 10, Bifidobacterium breve decreases in both ETRO and INFL [FDR q < 0.1] [Supplementary Figure S10G], and additionally Bifidobacterium kashiwanohense decreases in INFL [FDR q < 0.1] [Supplementary Figure S10H]. Zero species consistently change in abundance from baseline to Week 54 in either treatment arm. With respect to clinical remission, zero species differ significantly in abundance from baseline to Week 10 or Week 54 in either remitters or non-remitters [using the remission status at the corresponding post-baseline visit]. Taken together, these results suggest that the microbiome fluctuates during biologic treatment and that treatment does not appear to be causing consistent shifts in microbiome composition.
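The FDR q < 0.1 threshold used for these species-level tests relies on a multiple-testing correction. The sketch below shows the Benjamini–Hochberg procedure, which is the default adjustment in DESeq2; treating it as the exact procedure behind every q-value in this section is an assumption, and the p-values are hypothetical placeholders.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical per-species p-values (not study results).
species = ["species_A", "species_B", "species_C", "species_D", "species_E"]
p_values = [0.001, 0.02, 0.04, 0.30, 0.75]

q_by_species = dict(zip(species, benjamini_hochberg(p_values)))
significant = [name for name in species if q_by_species[name] < 0.1]
print(significant)
```

Because the adjustment scales with the number of species tested, a species with a small raw p-value can still fail the q < 0.1 threshold when many species are tested at once.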

3.6. Microbiome clustering revealed a low-diversity, low-remission rate group

The high degree of heterogeneity in microbiome composition between individuals, including those with IBD, may limit our ability to find universal gut bacterial signals associated with remission or response to biologic therapy. We hypothesised that unsupervised clustering of patients by microbiome composition may identify groups of patients with similar microbiome composition and patient characteristics. Previously, Dirichlet multinomial mixture [DMM] models23 identified microbial communities in IBD patients that differ in response rates to biologics.20 We identified DMM groups across pooled baseline and Week 10 data, resulting in three patient clusters [DMM 1, 2, 3] [Figure 4A, Supplementary Figure S11A]. At baseline, DMM 2 is lower in α-diversity than DMM 1 and 3 [Supplementary Figure S11B] [two-sided Wilcoxon test, p < 0.05]. However, baseline MCS does not significantly differ between groups [Supplementary Figure S11C, two-sided Wilcoxon tests, p > 0.05], nor does clinical remission rate [Supplementary Figure S11D] [Fisher’s exact test, p > 0.05].

Figure 4.

DMM groups at Week 10 associate with clinical remission. [A] Dirichlet multinomial mixture [DMM] modelling identifies three microbiome clusters in GARDENIA; MDS plot of Week 10 data shown. [B] At Week 10, clinical remission rate is lower in DMM 2 relative to DMM 1 and 3 [Fisher’s exact test, p < 0.05]. [C] Week 10 Mayo Clinic Score [MCS] is higher in DMM 2 relative to DMM 1 [two-tailed Wilcoxon rank-sum test, p < 0.05] but not significantly different from DMM 3 [two-tailed Wilcoxon rank-sum test, p > 0.05]. [D] Shannon diversity and species richness are lowest in DMM 2, intermediate in DMM 3, and highest in DMM 1 [two-tailed Wilcoxon rank-sum test, p < 0.0001 for all comparisons]. [E] Species enriched in DMM 2 versus 1 and 3 [DESeq2, FDR q < 0.1]; LFD = log2 fold difference. Statistical significance: [*] p < 0.05, [**] p < 0.01, [***] p < 0.001, [****] p < 0.0001, [ns] not significant. FDR, false-discovery rate.

By Week 10, however, patients in DMM 2 have lower rates of clinical remission across ETRO and INFL [Fisher’s exact test on combined INFL and ETRO data, p = 0.017] [Figure 4B]. In addition, patients in DMM 2 have higher MCS compared with patients in DMM 1 [Figure 4C] [two-sided Wilcoxon test p < 0.05] and lower alpha diversity [Figure 4D] [two-sided Wilcoxon test p < 0.05] compared with patients in DMM 1 and 3. For patients with available data at both baseline and Week 10, the majority of patients [69% ETRO, 85% INFL] remain in the same DMM group [Figure S11E], and INFL-treated patients are more likely to remain in the same DMM group than ETRO-treated [Fisher’s exact test, p = 0.046]. Patients in DMM 2 at Week 10 are enriched in relative abundance of pathobionts including Enterocloster bolteae and aldenensis, Ruminococcus_B gnavus, Bacteroides fragilis, and Escherichia coli [Figure 4E].
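Comparisons of remission rate between DMM groups rest on Fisher's exact test. The sketch below implements the two-sided test for a 2×2 table using only the Python standard library. Collapsing the comparison to DMM 2 versus the other groups is an assumption made for illustration, and the counts are hypothetical, since per-group counts are not given in this section.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more probable than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical counts: rows = DMM 2 vs other groups,
# columns = remission vs no remission.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```

An exact test is a natural choice here because some cells, such as remitters within the smallest cluster, may contain few patients, where a chi-squared approximation would be unreliable.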

DMM cluster membership at Week 10 is not significantly associated with treatment [Fisher’s exact test, p = 0.14] but does correlate with age and disease duration. Patients in DMM 1 are older and have longer disease duration [Figure S11F, G], and these two variables are correlated [linear regression, p = 1.61e-7].

Applying DMM modelling to the re-processed PRISM UC data, we identify two microbiome clusters [DMM 1 and 2]. Concordant with published findings,20 DMM 2 is significantly lower in α-diversity [Supplementary Figure S7F] compared with DMM 1. In our analysis, clinical remission rate is equal between DMM groups for patients in PRISM treated with anti-TNF [50% in each DMM group], and numerically lower in vedolizumab-treated patients in DMM 2 [19%] versus DMM 1 [43%] [Supplementary Figure S7G] although this difference is not statistically significant [Fisher’s exact test, p = 0.077]. DMM 2 in the PRISM data mirrors DMM 2 in GARDENIA: both clusters have low α-diversity and low remission rate compared with other patient groups. DMM 2 in PRISM also has higher relative abundance of pathobionts including Ruminococcus_B gnavus, Escherichia coli, and Enterocloster clostridioformis [Supplementary Figure S7I]. Microbiome measurements at Week 14 are not available to evaluate whether the microbiome more strongly associates with remission at the time of assessment relative to baseline in PRISM.

4. Discussion

The gut microbiome associates to some extent with IBD diagnosis and response to biologic therapy,3,20,24 and is thought of as a potential source of predictive and prognostic biomarkers. In a large phase 3 study of UC patients with moderate to severe disease randomised to treatment with infliximab or etrolizumab, we identify several bacterial species that associate with response and remission. We find that some Faecalibacterium species are enriched in patients with favourable clinical outcomes, whereas species from taxa including Proteobacteria and Eggerthellaceae are enriched in patients who do not achieve remission after induction therapy. Most of these species are low prevalence in the study population [10–20%] and no individual species has sufficient predictive value to constitute a robust prognostic biomarker.

This reflects the high degree of inter-individual variation between IBD patients. Gut microbiome composition varies between healthy individuals and can change over time due to factors including diet, lifestyle, exposures, or illnesses,25,26 and the microbiome of IBD patients is even less stable longitudinally compared with controls.3 Consistent with observations from the IBD literature, we observe marked gut microbiome instability in GARDENIA and few consistent longitudinal changes during treatment. Even the two species that showed a statistically significant change in abundance from baseline to Week 10 during treatment [Bifidobacterium breve and Bifidobacterium kashiwanohense] did not change [or were not present] in all patients.

Predictive models can include multiple microbial species and clinical characteristics to improve predictive power. We observe that multivariate predictive models using baseline microbiome data not only hold promising signal to predict clinical remission at Week 10, but uniformly outperform models containing only clinical predictors. This is consistent with published findings: predictive models derived from a cohort of patients treated with the anti-integrin vedolizumab had better performance to predict remission at Week 14 compared with models using clinical data only.8,27 Intriguingly, models using data from ETRO and INFL separately each outperform the combined model, even though the combined model includes twice the number of samples. This observation is consistent with differences in microbiome composition observed between treatment arms at baseline, and could also signal that different microbes are predictive for clinical remission across classes of biologic.

Although microbiome-based models have value for predicting clinical remission, these models are likely not yet sufficient for use in a clinical setting. We note that these models were trained on microbiome data from individuals with moderate-to-severe UC meeting a strict set of enrolment criteria,16 and therefore may not generalise to all patients with UC. As this is one of the first studies to analyse the predictive value of WMS data in a large UC clinical study, few validation datasets are available. Indeed, evaluation of a second UC dataset [PRISM], which used different outcome measures, enrolment criteria, and sample collection and processing strategies, shows no reproducible signals in individual microbial associations with clinical remission in comparison with our analysis of GARDENIA, highlighting the need for consistent validation cohorts to test the value of microbiome-based findings. Additional data, particularly from diverse populations globally, could increase model performance and facilitate validation.

Given the tremendous heterogeneity observed in the microbiome of IBD patients, we assessed whether dimensionality reduction could meaningfully stratify patients by microbiome. We identify three microbiome groups using DMM models and find that DMM groups have clear differences in α-diversity and pathobiont relative abundance, consistent with published literature.20 Although baseline DMM groups in GARDENIA and re-analysed PRISM data do not strongly associate with clinical remission, by Week 10 DMM group membership does associate with clinical remission. Examining the clinical characteristics associated with DMM groups, it is intriguing to note that whereas age and disease duration are associated with DMM group, they are highest in DMM 1 and not DMM 2, the cluster with the lowest response and remission rates, suggesting that DMM groups may meaningfully associate with heterogeneous patient subsets. We note that the existence of enterotypes, or compositionally distinct clusters of microbiome samples,28 has been debated: there is an increasing appreciation that enterotypes are better viewed as a tool to reduce complexity rather than discrete, non-overlapping clusters.29 Indeed, although we observe statistically significant associations between DMM groups and clinical remission, DMM groups do not perfectly predict clinical remission at Week 10. Future work is required to evaluate the reproducibility of these DMM groups and determine whether, with additional refinement, microbiome-based clustering could be used as a prognostic tool.

A limitation of this study is that the findings presented are associational. It is unclear whether any remission-associated species play a causal role in gastrointestinal inflammation as opposed to merely colonising an open niche in the inflamed, and potentially aerobic, gut environment.30 Further, microbiome sequence data are inherently compositional, and observations based on relative abundance may not perfectly reflect differences in absolute abundance of species. Detailed experimentation is required to test the hypothesis that remission-associated species identified in this study directly contribute to IBD pathobiology. Thus, a major benefit of continued measurement of the microbiome in IBD may lie in identification of candidate bacterial factors associated with key outcomes for the purpose of better understanding disease pathobiology through experimentation. Species that are important predictors in multivariate models or that associate with the low-remission rate DMM group may provide a starting point for hypothesis generation and testing.

Taken together, our results suggest that deriving robust metagenomic biomarkers from the gut microbiome with sufficient predictive value for clinical use will be challenging. Although single species are likely insufficient, multivariate models encompassing multiple microbiome measurements show promise. Future work is required to validate these models in other UC patient populations and evaluate whether DMM groups align with other molecular observations in IBD. For instance, inflammatory cellular modules have been identified in single-cell RNA sequencing of tissue samples from Crohn’s disease31: future work associating microbiome composition or DMM groups with such signatures may reveal candidate host-microbiome interactions. More broadly, additional work is needed to investigate whether other data modalities, such as metatranscriptomics, metaproteomics, or metabolomics, may reveal stronger associations between the microbiome and key IBD outcomes, yield more robust biomarkers, or further enhance predictive or prognostic models. In summary, we advocate for a measured stance on the value of microbiome data in IBD, recognising the promise but also limitations of heterogeneous metagenomic data.

Supplementary Data

Supplementary data are available at ECCO-JCC online.

jjae084_suppl_Supplementary_Figures
jjae084_suppl_Supplementary_Tables

Acknowledgements

We thank the Genentech Pharma Biosample Services staff for receiving, accessioning, and preparing stool samples for this manuscript. Contributors: Allyson Byrd, Jordan Mar, and Oleg Mayba provided helpful feedback on the manuscript. Oleg also provided guidance on statistical analysis. Roche Pharma Biosample Services performed sample accessioning and preparation.

Contributor Information

Fiona B Tamburini, Human Pathobiology & OMNI Reverse Translation, Genentech, South San Francisco, CA, USA.

Anupriya Tripathi, Prescient Design, Genentech, South San Francisco, CA, USA.

Maxwell P Gold, Biological Research & AI Development, Genentech, South San Francisco, CA, USA.

Julianne C Yang, Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.

Tommaso Biancalani, Biological Research & AI Development, Genentech, South San Francisco, CA, USA.

Jacqueline M McBride, Translational Medicine OMNI-Biomarker Development, Genentech, South San Francisco, CA, USA.

Mary E Keir, Human Pathobiology & OMNI Reverse Translation, Genentech, South San Francisco, CA, USA.

GARDENIA Study Group, Human Pathobiology & OMNI Reverse Translation, Genentech, South San Francisco, CA, USA.

Funding

This work was supported by Genentech, South San Francisco, CA, USA.

Conflict of Interest

FBT, AT, MPG, TB, JMM, MEK are employees and stockholders of Genentech/Roche. JCY declares no competing interests.

Author Contributions

Authors: FBT, MEK, JMM planned the study. FBT, AT, MPG, JCY performed data analysis. FBT, MEK, AT, MPG wrote the manuscript. FBT, AT, MPG, TB, JMM, MEK edited the manuscript. The GARDENIA Study Group conceived, planned, and executed the original clinical trial.

Data Availability

Individual participant data will not be shared.

References

  • 1. Glassner KL, Abraham BP, Quigley EMM. The microbiome and inflammatory bowel disease. J Allergy Clin Immun 2020;145:16–27.
  • 2. Gevers D, Kugathasan S, Denson LA, et al. The treatment-naive microbiome in new-onset Crohn’s disease. Cell Host Microbe 2014;15:382–92.
  • 3. Lloyd-Price J, Arze C, Ananthakrishnan AN, et al.; IBDMDB Investigators. Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases. Nature 2019;569:655–62.
  • 4. Morgan XC, Tickle TL, Sokol H, et al. Dysfunction of the intestinal microbiome in inflammatory bowel disease and treatment. Genome Biol 2012;13:R79.
  • 5. Shan Y, Lee M, Chang EB. The gut microbiome and inflammatory bowel diseases. Annu Rev Med 2022;74:455–68.
  • 6. Lewis JD, Chen EZ, Baldassano RN, et al. Inflammation, antibiotics, and diet as environmental stressors of the gut microbiome in pediatric Crohn’s disease. Cell Host Microbe 2015;18:489–500.
  • 7. Magnusson MK, Strid H, Sapnara M, et al. Anti-TNF therapy response in patients with ulcerative colitis is associated with colonic antimicrobial peptide expression and microbiota composition. J Crohns Colitis 2016;10:943–52.
  • 8. Ananthakrishnan AN, Luo C, Yajnik V, et al. Gut microbiome function predicts response to anti-integrin biologic therapy in inflammatory bowel diseases. Cell Host Microbe 2017;21:603–10.e3.
  • 9. Doherty MK, Ding T, Koumpouras C, et al. Fecal microbiota signatures are associated with response to ustekinumab therapy among Crohn’s disease patients. MBio 2018;9:e02120–17.
  • 10. Wang Y, Gao X, Ghozlane A, et al. Characteristics of faecal microbiota in paediatric Crohn’s disease and their dynamic changes during infliximab therapy. J Crohns Colitis 2017;12:337–46.
  • 11. Zhou Y, Xu ZZ, He Y, et al. Gut microbiota offers universal biomarkers across ethnicity in inflammatory bowel disease diagnosis and infliximab response prediction. MSystems 2018;3:e00188–17.
  • 12. Liu J, Fang H, Hong N, et al. Gut microbiome and metabonomic profile predict early remission to anti-integrin therapy in patients with moderate to severe ulcerative colitis. Microbiol Spectr 2023;11:e0145723.
  • 13. Kowalska-Duplaga K, Kapusta P, Gosiewski T, et al. Changes in the intestinal microbiota are seen following treatment with infliximab in children with Crohn’s disease. J Clin Med 2020;9:687.
  • 14. Busquets D, Mas-de-Xaxars T, López-Siles M, et al. Anti-tumour necrosis factor treatment with adalimumab induces changes in the microbiota of Crohn’s disease. J Crohns Colitis 2015;9:899–906.
  • 15. Schirmer M, Stražar M, Avila-Pacheco J, et al. Linking microbial genes to plasma and stool metabolites uncovers host-microbial interactions underlying ulcerative colitis disease course. Cell Host Microbe 2024;32:209–26.e7.
  • 16. Danese S, Colombel J-F, Lukas M, et al.; GARDENIA Study Group. Etrolizumab versus infliximab for the treatment of moderately to severely active ulcerative colitis [GARDENIA]: a randomised, double-blind, double-dummy, phase 3 study. Lancet Gastroenterol Hepatol 2022;7:118–27.
  • 17. Parks DH, Chuvochina M, Chaumeil P-A, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol 2020;38:1079–86.
  • 18. Parks DH, Chuvochina M, Waite DW, et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 2018;36:996–1004.
  • 19. Lopez-Siles M, Duncan SH, Garcia-Gil LJ, Martinez-Medina M. Faecalibacterium prausnitzii: from microbiology to diagnostics and prognostics. ISME J 2017;11:841–52.
  • 20. Lee JWJ, Plichta D, Hogstrom L, et al. Multi-omics reveal microbial determinants impacting responses to biologic therapies in inflammatory bowel disease. Cell Host Microbe 2021;29:1294–304.e4.
  • 21. Fassarella M, Blaak EE, Penders J, Nauta A, Smidt H, Zoetendal EG. Gut microbiome stability and resilience: elucidating the response to perturbations in order to modulate gut health. Gut 2021;70:595–605.
  • 22. Byrd AL, Liu M, Fujimura KE, et al.; the Milieu Intérieur Consortium. Gut microbiome stability and dynamics in healthy donors and patients with non-gastrointestinal cancers. J Exp Med 2020;218:e20200606.
  • 23. Holmes I, Harris K, Quince C. Dirichlet multinomial mixtures: generative models for microbial metagenomics. PLoS One 2012;7:e30126.
  • 24. Estevinho MM, Rocha C, Correia L, et al.; GEDII [Portuguese IBD Group]. Features of fecal and colon microbiomes associate with responses to biologic therapies for inflammatory bowel diseases: a systematic review. Clin Gastroenterol Hepatol 2020;18:1054–69.
  • 25. Olsson LM, Boulund F, Nilsson S, et al. Dynamics of the normal gut microbiota: a longitudinal one-year population study in Sweden. Cell Host Microbe 2022;30:726–39.e3.
  • 26. Bäckhed F, Fraser CM, Ringel Y, et al. Defining a healthy human gut microbiome: current concepts, future directions, and clinical applications. Cell Host Microbe 2012;12:611–22.
  • 27. Caenepeel C, Falony G, Machiels K, et al. Dysbiosis and associated stool features improve prediction of response to biological therapy in inflammatory bowel disease. Gastroenterology 2024;166:483–95.
  • 28. Arumugam M, Raes J, Pelletier E, et al. Enterotypes of the human gut microbiome. Nature 2011;473:174–80.
  • 29. Costea PI, Hildebrand F, Arumugam M, et al. Enterotypes in the landscape of gut microbial community composition. Nat Microbiol 2018;3:8–16.
  • 30. Watson AR, Füssel J, Veseli I, et al. Metabolic independence drives gut microbial colonization and resilience in health and disease. Genome Biol 2023;24:78.
  • 31. Martin JC, Chang C, Boschetti G, et al. Single-cell analysis of Crohn’s disease lesions identifies a pathogenic cellular module associated with resistance to anti-TNF therapy. Cell 2019;178:1493–508.e20.
  • 32. Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 2011;27:863–4.
  • 33. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 2014;30:2114–20.
  • 34. Breitwieser FP, Pertea M, Zimin AV, Salzberg SL. Human contamination in bacterial genomes has created thousands of spurious proteins. Genome Res 2019;29:954–60.
  • 35. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods 2012;9:357–9.
  • 36. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol 2019;20:257.
  • 37. Lu J, Breitwieser FP, Thielen P, Salzberg SL. Bracken: estimating species abundance in metagenomics data. PeerJ Comput Sci 2017;3:e104.
  • 38. R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2022.
  • 39. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15:550.
  • 40. Morgan M. DirichletMultinomial: Dirichlet-Multinomial Mixture Model Machine Learning for Microbiome Data. R package version 1.44.0. 2023. https://bioconductor.org/packages/DirichletMultinomial. Accessed April 18, 2024.
  • 41. Mallick H, Rahnavard A, McIver LJ, et al. Multivariable association discovery in population-scale meta-omics studies. PLoS Comput Biol 2021;17:e1009442.
  • 42. Neuwirth E. RColorBrewer: ColorBrewer Palettes. R package version 1.1.3. 2022. https://CRAN.R-project.org/package=RColorBrewer. Accessed April 18, 2024.
  • 43. Wilke CO. cowplot: Streamlined Plot Theme and Plot Annotations for ggplot2. R package version 1.1.3. 2024. https://CRAN.R-project.org/package=cowplot. Accessed April 18, 2024.
  • 44. Gentleman R, Carey VJ, Huber W, Hahne F. genefilter: methods for Filtering Genes from High-throughput Experiments. R package version 1.84.0. 2023. https://bioconductor.org/packages/genefilter. Accessed April 18, 2024.
  • 45. Brunson JC, Read QD. ggalluvial: Alluvial Plots in ggplot2. R package version 0.12.5. 2023. http://corybrunson.github.io/ggalluvial/. Accessed April 18, 2024.
  • 46. Brunson JC. ggalluvial: layered grammar for alluvial plots. Journal of open source software 2020;5:2017.
  • 47. Clarke E, Sherrill-Mix S. ggbeeswarm: Categorical Scatter [Violin Point] Plots. R package version 0.7.2. 2023. https://CRAN.R-project.org/package=ggbeeswarm. Accessed April 18, 2024.
  • 48. Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York, NY: Springer; 2016.
  • 49. Kassambara A. ggpubr: ggplot2 Based Publication Ready Plots. R package version 0.6.0. 2023. https://CRAN.R-project.org/package=ggpubr. Accessed April 18, 2024.
  • 50. Slowikowski K. ggrepel: Automatically Position Non-Overlapping Text Labels with ggplot2. R package version 0.9.5. 2024. https://CRAN.R-project.org/package=ggrepel. Accessed March 18, 2024.
  • 51. Ahlmann-Eltze C. ggupset: Combination Matrix Axis for ggplot2 to Create UpSet Plots. R package version 0.3.0. 2020. https://CRAN.R-project.org/package=ggupset. Accessed April 18, 2024.
  • 52. Yan L. ggvenn: Draw Venn Diagram by ggplot2. R package version 0.1.10. 2023. https://CRAN.R-project.org/package=ggvenn. Accessed April 18, 2024.
  • 53. Müller K. here: A Simpler Way to Find Your Files. R package version 1.0.1. 2020. https://CRAN.R-project.org/package=here. Accessed April 18, 2024.
  • 54. Wickham H. reshape2: Flexibly Reshape Data: A Reboot of the Reshape Package. R package version 1.4.4. 2020. https://cran.r-project.org/package=reshape2. Accessed April 18, 2024.
  • 55. Rich B. table1: Tables of Descriptive Statistics in HTML. R package version 1.4.3. 2023. https://CRAN.R-project.org/package=table1. Accessed April 18, 2024.
  • 56. Wickham H, Averick M, Bryan J, et al. Welcome to the tidyverse. JOSS 2019;4:1686.
  • 57. Oksanen J, Simpson GL, Blanchet FG, et al. vegan: Community Ecology Package. R package version 2.6.4. 2022. https://CRAN.R-project.org/package=vegan. Accessed April 18, 2024.
  • 58. Younginger BS, Mayba O, Reeder J, et al. Enrichment of oral-derived bacteria in inflamed colorectal tumors and distinct associations of Fusobacterium in the mesenchymal subtype. Cell Rep Med 2023;4:100920.
  • 59. Danecek P, Bonfield JK, Liddle J, et al. Twelve years of SAMtools and BCFtools. GigaScience 2021;10:giab008.
  • 60. Nearing JT, Douglas GM, Hayes MG, et al. Microbiome differential abundance methods produce different results across 38 datasets. Nat Commun 2022;13:342.
  • 61. Calgaro M, Romualdi C, Waldron L, Risso D, Vitulo N. Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data. Genome Biol 2020;21:191.
  • 62. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825–30.

Articles from Journal of Crohn's & Colitis are provided here courtesy of Oxford University Press
