Summary
Ulcerative colitis (UC) prevalence is rising globally, yet fewer than 50% of patients achieve mucosal healing (MH) with first-line 5-aminosalicylic acid (5-ASA) therapy. We aimed to identify microbial signatures that could predict the treatment efficacy of 5-ASA. Active UC patients on standardized 5-ASA treatment were prospectively enrolled. Shotgun metagenomic sequencing was performed to identify the taxonomic and functional profiles before and after treatment. Six species were enriched in the effective group and 3 species in the ineffective group at baseline. Faecalibacterium prausnitzii, Blautia massiliensis, and Phascolarctobacterium faecium were consistently depleted in the ineffective group at both time points. A random forest model based on these three species predicted ineffective 5-ASA treatment with area under the curve (AUC) = 0.80 (validation in the Inflammatory Bowel Disease Multi’omics Database [IBDMDB]: AUC = 0.82, specificity = 0.88, negative predictive value [NPV] = 0.70, and positive predictive value [PPV] = 0.80). Gut microbiome signatures have potential to serve as non-invasive predictors for ineffective 5-ASA treatment in UC.
Subject areas: Health sciences, Gastroenterology, Microbiology, Microbiome
Highlights
- Finding 1: Six species were enriched in the effective group and three in the ineffective group at baseline
- Finding 2: F. prausnitzii, B. massiliensis, and P. faecium were depleted in the ineffective group before and after treatment
- Finding 3: A three-species RF model predicted 5-ASA failure with an AUC of 0.82 in external validation
Introduction
Ulcerative colitis (UC) is a chronic idiopathic inflammatory disease characterized by inflammation of the colonic mucosa and submucosa, and its incidence and prevalence are rising continuously worldwide.1 Mesalazine, also known as 5-aminosalicylic acid (5-ASA), is widely used as the first-line treatment for mild to moderate UC. Mucosal healing (MH) is currently the therapeutic goal for UC because of its association with superior disease outcomes.2 However, fewer than 50% of patients on 5-ASA achieve MH.3 Identifying patients with 5-ASA treatment failure at an early stage of disease could enable earlier therapy escalation, which may lead to faster MH and improved long-term prognosis. A higher concentration of 5-ASA in the colonic mucosa is associated with treatment success, but measuring it requires an invasive test, and its use for efficacy monitoring lacks widespread validation.4 In addition, clinical parameters are insufficient to predict 5-ASA treatment outcomes. Therefore, there is an unmet need for non-invasive and accurate tests to predict treatment efficacy.
The intestinal microbiome, central to the pathogenesis of UC, has been found to interfere with pharmacokinetics and pharmacodynamics.5 A study in mice with dextran sulfate sodium (DSS)-induced colitis revealed that eliminating the gut microbiota with antibiotics markedly diminished the therapeutic efficacy of 5-ASA, emphasizing the pivotal role of the gut microbiota in 5-ASA efficacy.6 Huttenhower et al. identified 12 previously unknown microbial acetyltransferases responsible for 5-ASA inactivation, and cross-validation showed that three of these thiolases and one acyl-CoA N-acyltransferase were associated with an increased risk of treatment failure among 5-ASA users.7 However, these findings could not be confirmed in the Dutch 1000 inflammatory bowel disease (IBD) cohort.8 Therefore, further research is required to explore microbial markers for predicting 5-ASA efficacy in UC.
This study aimed to investigate the fecal microbiome composition predictive for 5-ASA efficacy in UC patients based on a prospective IBD registry.
Results
Patient characteristics and 5-ASA efficacy
A total of 51 eligible UC patients were included. Of these, 25 achieved MH during follow-up. A total of 75 fecal samples were collected, including 51 at baseline and 24 after treatment (14 from the effective group and 10 from the ineffective group) (Figure 1). Baseline characteristics of the included patients are presented in Table 1. No significant differences were observed between the 5-ASA effective and ineffective groups in terms of age, gender, extent of disease, severity of inflammation, or inflammatory markers. This suggests that these clinical characteristics cannot serve as predictive markers for 5-ASA treatment efficacy.
Figure 1.
Workflow of the study
Table 1.
Baseline characteristics of UC patients enrolled in this study
| | Overall | Effective N = 25 (49%)a | Ineffective N = 26 (51%)a | p valueb |
|---|---|---|---|---|
| Age | 38.00 [33.00, 59.00] | 39.00 [34.00, 56.00] | 36.00 [32.25, 63.75] | 0.977 |
| Gender | | | | 0.492 |
| Female | 22 (43.14%) | 12 (48.00%) | 10 (38.46%) | |
| Male | 29 (56.86%) | 13 (52.00%) | 16 (61.54%) | |
| Extent | | | | 0.081 |
| E1 | 9 (17.65%) | 7 (28.00%) | 2 (7.69%) | |
| E2 | 19 (37.25%) | 10 (40.00%) | 9 (34.62%) | |
| E3 | 23 (45.10%) | 8 (32.00%) | 15 (57.69%) | |
| MES | | | | 0.828 |
| 2 | 36 (70.59%) | 18 (72.00%) | 18 (69.23%) | |
| 3 | 15 (29.41%) | 7 (28.00%) | 8 (30.77%) | |
| WBC (×10^9) | 6.37 [5.61, 7.69] | 6.23 [5.44, 7.05] | 6.48 [5.62, 8.33] | 0.337 |
| HB (g/L) | 132.00 [117.50, 143.50] | 133.00 ± 16.33 | 125.2 ± 29.05 | 0.253 |
| CRP (mg/L) | 2.815 [1.113, 7.035] | 2.270 [0.790, 3.660] | 3.600 [1.750, 8.975] | 0.093 |
| ESR (mm/1h) | 16.500 [7.000, 25.500] | 16.000 [7.000, 28.000] | 17.000 [7.000, 25.000] | 0.768 |
| ALB (mmol/L) | 40.700 [37.330, 42.850] | 41.400 [38.700, 44.500] | 40.200 [36.700, 42.350] | 0.274 |
| Duration of 5-ASA treatment (days) | 321 [253, 352] | 333 [253, 368] | 298 [230, 331] | 0.112 |
MES, Mayo endoscopic score; HB, hemoglobin; CRP, C-reactive protein; ESR, erythrocyte sedimentation rate; ALB, albumin; 5-ASA, 5-aminosalicylic acid.
a Mean (SD); median [IQR]; n (%).
b Welch two sample t test; Wilcoxon rank-sum test; Pearson’s Chi-squared test; Fisher’s exact test.
Taxonomic and functional features between the two groups at baseline
At the phylum level, patients who failed 5-ASA treatment exhibited an elevated abundance of Proteobacteria at baseline (p = 0.035) (Figures 2A and 2B). In the overall composition of the gut microbiome, no significant differences in α diversity were observed between the two groups (Figure 2C). A trend toward separation was observed in the β diversity analysis (Aitchison distance), with a marginal difference (p = 0.054) (Figure 2D). We examined the influence of clinical factors (e.g., age, sex, MES, and extent) on microbiome variation, but none of these factors showed significant explanatory power for the variance (Table S1).
Figure 2.
Major differences in the gut microbiome profiles between effective and ineffective groups at baseline
(A) Stacked bar plots depicting phylum-level differences in gut microbiome composition between the two groups.
(B) Box plot of Proteobacteria abundance in the two groups. Data are represented as median ± interquartile range (IQR).
(C) The α diversity of the microbiome. Data are represented as median ± IQR.
(D) Beta diversity analysis at the species level using Aitchison distances (PCoA plot).
(E) Baseline differences in species between the two groups.
(F) Beta diversity analysis of functional profiles using Aitchison distances (PCoA plot).
(G) Baseline differences in pathways between the two groups.
(H) Spearman correlation between species and function. ∗p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.
MaAsLin2 was used to identify differences in taxonomic species between the two groups (Figure 2E). Faecalibacterium prausnitzii (F. prausnitzii), Blautia massiliensis (B. massiliensis), Phascolarctobacterium faecium (P. faecium), Blautia SGB4815, Coprococcus comes (C. comes), and Peptostreptococcus stomatis (P. stomatis) were significantly depleted, whereas Klebsiella pneumoniae (K. pneumoniae), Eggerthella sinensis (E. sinensis), and GGB80090 SGB1690 were enriched in the baseline stool samples of patients who failed 5-ASA treatment.
For the functional potential of the gut microbial communities, no significant differences were observed in β diversity (Figure 2F). Pathways critical for short-chain fatty acid synthesis, such as pyruvate fermentation to butanoate and the superpathway of Clostridium acetobutylicum acidogenic fermentation, were significantly enriched in the baseline stool samples of patients in the effective group (Figure 2G). Strikingly, baseline gondoate biosynthesis was negatively associated with several species enriched in the effective group, including F. prausnitzii, C. comes, Blautia SGB4815, and P. stomatis (Figure 2H).
Longitudinal trajectory of the microbiome after treatment
After treatment, 12 species, including F. prausnitzii, B. massiliensis, P. faecium, Ruminococcus sp AF13_28, Blautia sp Marseille P3087, Anaerostipes hadrus, Lacrimispora celerecrescens, Bifidobacterium longum, Actinomyces massiliensis, Enterococcus SGB6173, Actinomyces SGB17163, and Lachnospiraceae bacterium, were depleted in the ineffective group, whereas 6 species, including Escherichia coli (E. coli), Enterococcus avium, Enterococcus faecalis, GGB3746 SGB5089, Limosilactobacillus mucosae, and Lacticaseibacillus paracasei, were enriched (Figure 3A). Three species, F. prausnitzii, B. massiliensis, and P. faecium, were consistently depleted at both baseline and follow-up in the stool samples of patients with 5-ASA treatment failure (Figure 3B). In patients who failed 5-ASA therapy, the abundance of F. prausnitzii after treatment was even lower than at baseline (Figure 3C).
Figure 3.
Longitudinal changes in species and pathways between effective and ineffective groups
(A) Significantly differentially abundant species between the two groups at follow-up.
(B) Relative abundance of 3 bacterial species consistently depleted in ineffective patients at both baseline and follow-up. Data are represented as median ± IQR.
(C) Log2 fold change (FC) of F. prausnitzii after treatment in comparison with baseline sample. Data are represented as median ± IQR.
(D) Differentially abundant pathways between the two groups at follow-up. NONMEVIPP-PWY, methylerythritol phosphate pathway I; PWY-7663, gondoate biosynthesis (anaerobic); PWY-6215, 4-chlorobenzoate degradation; PWY4FS-7, phosphatidylglycerol biosynthesis I (plastidic); PWY4FS-8, phosphatidylglycerol biosynthesis II (non-plastidic); PWY-5265, peptidoglycan biosynthesis II (staphylococci).
(E) Spearman correlation analysis between species and function.
(F) Contribution of different bacterial species related to gondoate biosynthesis through HUMAnN analysis. ∗p < 0.05, ∗∗p < 0.01, and ∗∗∗p < 0.001.
At both baseline and follow-up, the methylerythritol phosphate pathway I was the only pathway depleted in the ineffective group. In contrast, 5 pathways showed enrichment: gondoate biosynthesis (anaerobic), 4-chlorobenzoate degradation, phosphatidylglycerol biosynthesis I (plastidic), phosphatidylglycerol biosynthesis II (non-plastidic), and peptidoglycan biosynthesis II (staphylococci) (Figure 3D). Notably, F. prausnitzii showed a negative association with gondoate biosynthesis (baseline: r = −0.388, p = 0.005; follow-up: r = −0.481, p = 0.017) and 4-chlorobenzoate degradation (baseline: r = −0.353, p = 0.011; follow-up: r = −0.444, p = 0.029) at both time points (Figure 3E). We further investigated the microbial contribution to gondoate biosynthesis and found that both E. coli and K. pneumoniae were involved (Figure 3F).
Model construction and validation for predicting 5-ASA treatment efficacy
Given that F. prausnitzii, B. massiliensis, and P. faecium were consistently depleted in the ineffective group both at baseline and after treatment, the baseline abundances of these three species were used to construct the predictive model for 5-ASA treatment failure. A random forest (RF) classification model was constructed, which achieved an average area under the curve (AUC) of 0.80 for predicting ineffective 5-ASA treatment (Figure 4A).
Figure 4.
ROC curves for the test and validation cohorts based on the RF classifier constructed with baseline microbial variables
(A) The ROC curve for the test cohort.
(B) The ROC curve for the validation cohort (PPV, positive predictive value; NPV, negative predictive value).
To further test the general applicability of the model, eligible patients from the Inflammatory Bowel Disease Multi’omics Database (IBDMDB) cohort (n = 15) were utilized as an independent dataset (Table 2) (Figure S1). The model achieved an AUC of 0.82 in predicting 5-ASA treatment failure, with specificity, negative predictive value (NPV), and positive predictive value (PPV) of 0.88, 0.70, and 0.80, respectively (Figure 4B).
Table 2.
Baseline characteristics of the validation cohort
| | Overall | Effective N = 8 (53%)a | Ineffective N = 7 (47%)a | p valueb |
|---|---|---|---|---|
| Age | 17.00 [15.00, 29.00] | 16.50 [14.50, 44.00] | 17.00 [15.50, 23.50] | 0.908 |
| SCCAI | 3 [1, 3] | 1.5 [0.25, 3] | 3 [2, 4] | 0.236 |
| ESR (mm/h) | 10.00 [6.00, 22.00] | 10.00 [7.00, 23.00] | 7.00 [5.00, 20.00] | 0.600 |
| Gender | | | | 0.619 |
| Female | 8 (53.33%) | 5 (62.50%) | 3 (42.86%) | |
| Male | 7 (46.67%) | 3 (37.50%) | 4 (57.14%) | |
| Extent | | | | 0.396 |
| E1 | 2 (13.33%) | 1 (12.50%) | 1 (14.29%) | |
| E2 | 4 (26.67%) | 2 (25.00%) | 2 (28.57%) | |
| E3 | 6 (40.00%) | 4 (50.00%) | 2 (28.57%) | |
| Unknown | 3 (20.00%) | 1 (12.50%) | 2 (28.57%) | |
a Mean (SD); median [IQR]; n (%).
b Wilcoxon rank-sum test; Fisher’s exact test.
Network analysis among the efficacy-associated species
To investigate the interplay among efficacy-associated species in the two groups, correlations were assessed with SparCC to generate a co-abundance network based on the species that differed at baseline. A more intricate correlation network was observed in the effective group than in the ineffective group, with a greater number of nodes (397 vs. 393) and edges (517 vs. 488) (Tables S2 and S3) (Figures 5A and 5B).
Figure 5.
Baseline microbiome co-occurrence networks for the ineffective and effective groups
(A) Ineffective group co-occurrence network.
(B) Effective group co-occurrence network. Central nodes of clusters represent efficacy-associated species, with node size indicating the number of connections to other nodes. Green and red edges denote positive and negative microbial correlations, respectively.
Additionally, we noted that F. prausnitzii was negatively correlated with K. pneumoniae in both groups (effective group: r = −0.264, p = 0.04; ineffective group: r = −0.403, p = 0.02).
Discussion
In this prospective cohort study, we have identified the key bacterial species that could predict 5-ASA treatment failure in UC patients. The RF model based on F. prausnitzii, B. massiliensis, and P. faecium was developed and externally validated, demonstrating a favorable AUC of 0.82. Additionally, a negative correlation between F. prausnitzii and K. pneumoniae was observed in co-occurrence networks, suggesting a potential role of these microbial interactions in modulating the efficacy of 5-ASA treatment.
We identified F. prausnitzii, B. massiliensis, and P. faecium as specific bacterial signatures, all of which were consistently depleted in ineffective patients at both baseline and follow-up. The combination of these 3 species demonstrated an AUC of 0.82 for predicting 5-ASA treatment failure. Previous studies have shown that patients with higher baseline levels of F. prausnitzii tend to respond better to a range of IBD therapies, including biologic therapy and fecal microbiota transplantation.9,10 In addition, mucosal F. prausnitzii abundance increased with higher 5-ASA mucosal concentration.11 Blautia, a functional genus with potential probiotic properties, has also been shown to be positively associated with mucosal 5-ASA concentrations.12 Consistent with this, we found that B. massiliensis was significantly enriched in the effective group. Increased baseline levels of P. faecium have also been linked to early clinical remission in patients on anti-cytokine therapy.9
Microbial functional changes were also explored to further understand how the microbiome influences 5-ASA efficacy. Gondoate biosynthesis (anaerobic) and 4-chlorobenzoate degradation were enriched in the ineffective treatment group. A strong association has been reported between 4-chlorobenzoate degradation and IL23R, which contributes to IBD pathogenesis by influencing both innate and adaptive immunity.13 Such disruptions in immune responses may contribute to the variable efficacy of 5-ASA treatment. Higher gondoate biosynthesis has been observed in Crohn’s disease patients before symptoms worsened.14 A study on the therapeutic effects of Bifidobacterium bifidum B1628 in DSS colitis found that this strain may eliminate colitis by inhibiting harmful bacteria and downregulating the gondoate biosynthesis pathway.15 This pathway may therefore promote disease onset and progression and thereby affect 5-ASA efficacy. F. prausnitzii showed a negative association with these two pro-inflammatory pathways, suggesting a potential role in mitigating inflammation and enhancing 5-ASA efficacy.
Our study found that a higher level of F. prausnitzii was associated with positive 5-ASA treatment outcomes, consistent with previous findings that increased baseline abundance of F. prausnitzii is a promising biomarker for predicting good response to biologics such as infliximab and ustekinumab.10 Mechanistically, F. prausnitzii is a versatile commensal bacterium with several important functions. Bacterial cell wall antigen-presenting molecules can trigger immune cells to produce anti-inflammatory cytokines like IL-10, effectively preventing colitis in a TNBS-induced model.16 In DNBS-induced colitis, F. prausnitzii can restore impaired apical junction proteins.17 In addition, F. prausnitzii can secrete beneficial compounds such as butyrate, salicylic acid, and microbial anti-inflammatory molecules, which have been shown to inhibit NF-κB activation both in vitro and in vivo.18,19 An in vitro study found that short-chain fatty acids (SCFAs) significantly inhibit the growth of K. pneumoniae.20 F. prausnitzii, a recognized SCFA producer, may similarly suppress the overgrowth of K. pneumoniae through its production of SCFAs. Our study revealed a negative correlation between F. prausnitzii and K. pneumoniae, suggesting that F. prausnitzii may enhance the efficacy of 5-ASA by suppressing K. pneumoniae.
This study has several important strengths. Firstly, it demonstrates the feasibility of predicting the efficacy of 5-ASA treatment in UC patients using microbial profiles, which could enable the pre-selection of patients requiring treatment escalation at an early stage of disease and improve patient outcomes. Secondly, we used our prospective IBD cohort, which avoided recall bias in the assessment of drug compliance and enabled comprehensive analysis of longitudinal clinical data and samples. Thirdly, we only included patients with good adherence to standardized 5-ASA treatment, so that patients whose treatment failed because of suboptimal dosing or poor adherence were not misclassified as having failed 5-ASA treatment. Moreover, we defined ineffective treatment as the failure to achieve MH, which is the current therapeutic goal for UC. In previous studies, clinical relapse was the most frequently used indicator of treatment failure.21 However, there are inconsistencies between symptoms and endoscopic findings among UC patients.22 Fourthly, the two groups in our study had comparable baseline clinical features, avoiding the influence of disease activity on the prognostic markers. Finally, we validated our predictive model in the IBDMDB, suggesting its generalizability, reliability, and robustness among different populations.
In summary, we described associations between gut microbial taxonomic composition and 5-ASA efficacy in UC patients. Ineffective 5-ASA treatment could be predicted by the combination of F. prausnitzii, B. massiliensis, and P. faecium, highlighting its potential as a novel tool for the early identification of patients who may require therapy escalation. Supplementation with probiotics such as F. prausnitzii may serve as an adjunct strategy to enhance the therapeutic efficacy of 5-ASA.
Limitations of the study
Some limitations should be acknowledged. First, we included a relatively small number of UC patients because of the stringent inclusion criteria, such as a baseline MES ≥2, adherence to standard 5-ASA treatment, and a follow-up colonoscopy at 6–12 months. These criteria were chosen to minimize classification bias and confounding factors, which could enhance the reliability of our results. Second, the follow-up sample size was limited. However, this may not affect the accuracy of our predictive model, which was derived from the baseline samples. Third, the validation sample size was limited. The majority of existing studies were cross-sectional, focusing on comparisons between UC patients and healthy controls or between patients with active and inactive disease. Although some longitudinal studies have reported the predictive roles of microbial markers for the effects of biologics,23,24,25 only one study has investigated the influence of the gut microbiota on 5-ASA treatment effects in IBD.7 Furthermore, dietary data were not collected in this study. However, consistent predictive performance was observed when using data from a cohort of different ethnicity in the United States (IBDMDB), where dietary habits differ significantly. This suggests that the bacterial biomarkers identified in our study are less likely to be susceptible to dietary influences. We acknowledge that more robust validation in larger cohorts is essential before application to clinical practice.
Resource availability
Lead contact
Further information and requests for resources should be directed to and will be fulfilled by the lead contact Shutian Zhang (zhangshutian@ccmu.edu.cn).
Materials availability
This study did not generate new unique reagents or materials.
Data and code availability
- All sequence files are available from the National Center for Biotechnology Information BioProject database (PRJNA1240691). Please contact the corresponding author about the details of the metagenomics data.
- The code for species and functional annotation from raw sequencing reads is available at https://github.com/biobakery/MetaPhlAn and https://github.com/biobakery/humann. The code for multivariate analysis by linear models is available at https://github.com/biobakery/Maaslin2. The code for the construction of the random forest model is available at https://github.com/scikit-learn/scikit-learn. Other code is publicly available, and accession numbers are listed in the key resources table.
- Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
- IBDMDB cohort data are available at https://ibdmdb.org/.
Acknowledgments
We thank the members of the Inflammatory Bowel Disease Multiomics Database Project.
Author contributions
Y.D., study concept and design, analysis and interpretation of data, and manuscript drafting. X.X., data analysis and figure generation, manuscript drafting, critical revision of the manuscript, and final submission approval. J.M., microbiome sequencing data generation and final submission approval. M.Z., patient visits, sample collection, and final submission approval. C.X., patient visits, sample collection, and final submission approval. X.H., patient visits, data collection, and final submission approval. F.X., patient visits, data collection, and final submission approval. Z.W., patient visits, data collection, and final submission approval. H.S., study concept and design, interpretation of data, critical revision of the manuscript, funding acquisition, study supervision, and final submission approval. S.Z., study concept and design, critical revision of the manuscript, funding acquisition, study supervision, and final submission approval. Y.D. and X.X. contributed equally to this work.
Declaration of interests
The authors declare no competing interests.
STAR★Methods
Key resources table
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Chemicals, peptides, and recombinant proteins | ||
| 5-ASA (Salofalk) | Dr. Falk Pharma GmbH, Freiburg, Germany | H20171358 |
| Biological samples | ||
| Stool Samples | This paper | N/A |
| Chemicals, peptides, and recombinant proteins | ||
| OMEGA Mag-Bind Soil DNA Kit | OMEGA | M5635-02 |
| Illumina TruSeq Nano DNA LT Library Preparation Kit | Illumina | 20060059 |
| Deposited data | ||
| FASTQ files from metagenomic sequencing | This paper | PRJNA1240691 |
| FASTQ files from metagenomic sequencing | IBDMDB | https://ibdmdb.org/ |
| Software and algorithms | ||
| Trimmomatic (v0.39) | Bolger et al.26 | http://www.usadellab.org/cms/?page=trimmomatic |
| Bowtie2 | Langmead and Salzberg27 | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml |
| KneadData (v0.12.0) | The Huttenhower Lab | https://huttenhower.sph.harvard.edu/kneaddata/ |
| MetaPhlAn (v4.0.6) | Blanco-Miguez et al.28 | https://huttenhower.sph.harvard.edu/metaphlan/ |
| HUMAnN (v3.7) | Beghini et al.29 | https://huttenhower.sph.harvard.edu/humann/ |
| MaAsLin2 (v1.18.0) | Mallick et al.30 | https://huttenhower.sph.harvard.edu/maaslin/ |
| SparCC | Friedman et al.31 | N/A |
| scikit-learn (v1.5.2) | Scikit-learn | https://scikit-learn.org/stable/ |
| R (v4.2.0) | Open Source | https://www.r-project.org/ |
| Other code | This paper | https://doi.org/10.5281/zenodo.15259652 |
Experimental model and study participant details
Study design
This study was nested within a longitudinal prospective IBD registry at Beijing Friendship Hospital. Briefly, this registry includes all adult IBD patients seeking care at Beijing Friendship Hospital, Capital Medical University. All patients involved in this study had a confirmed diagnosis of UC according to the Lennard-Jones criteria.32 Follow-up evaluations were conducted every 3 months. Detailed clinical information was collected at enrollment and at each visit, including demographic details, medical history of IBD, clinical manifestations, medication profiles, laboratory examination results, and endoscopic findings. Fecal samples were collected prior to bowel preparation or 1 month after endoscopy.
This nested study was a prospective observational cohort study that enrolled patients who had not previously received standard 5-ASA treatment. The enrollment period was from August 2020 to July 2023. Patients were included according to the following criteria: (1) aged over 18 years; (2) diagnosed with active UC at the time of enrollment (Mayo Endoscopic Score [MES] ≥2); (3) received standardized 5-ASA therapy; (4) had good adherence to the 5-ASA regimen; (5) underwent colonoscopy for UC assessment 6-12 months after 5-ASA treatment. Exclusion criteria were as follows: (1) a history of cancer or gastrointestinal surgery; (2) antibiotic use within 3 months before fecal sample collection; (3) infectious colitis confirmed by microbiological examination within 30 days before fecal sampling; (4) previous exposure to corticosteroids, immunosuppressants, or biologic therapy. A standardized treatment regimen was defined as an adequate 5-ASA dose for remission induction (oral 5-ASA ≥3 g/day and/or local treatment ≥1 g/day) or for maintenance (oral 5-ASA ≥2 g/day and/or local treatment 2-3 g/week). Good medication adherence was defined as a score of 6-8 on the eight-item Morisky Medication Adherence Scale (MMAS).33
Enrolled patients were prospectively followed up. Achievement of MH (MES = 0 or 1) after 6-12 months of treatment was regarded as effective 5-ASA therapy. Patients who did not achieve MH were allocated to the ineffective group.
Sample collection
All patients self-collected stool samples at home using a stool collection tube provided in advance by the investigator and shipped them to the laboratory within 24 hours. Stool samples were stored at -80°C prior to DNA extraction and metagenomic sequencing.
Method details
Fecal DNA extraction and shotgun metagenomic sequencing
The OMEGA Mag-Bind Soil DNA Kit (M5635-02) (Omega Bio-Tek, Norcross, GA, USA) was used to extract gut microbial genomic DNA. Extracted samples were stored at -20°C for further evaluation. DNA was quantified using a Qubit 4 Fluorometer (Invitrogen, USA), and its quality was assessed by agarose gel electrophoresis. DNA libraries were prepared using the Illumina TruSeq Nano DNA LT Library Preparation Kit (400-bp insert size) and sequenced on an Illumina NovaSeq platform (Illumina, USA), generating 6 Gb of data per sample with 2 × 150 bp paired-end reads.
Validation cohort
External validation of our predictive model was performed in the Inflammatory Bowel Disease Multi’omics Database (IBDMDB) (https://ibdmdb.org/). Participants exposed to corticosteroids, immunosuppressants, or biologic therapy were considered to be in the ineffective group, whereas those who did not undergo treatment escalation were assigned to the effective group.
Sequence data preprocessing and microbiome profiling
The raw data were processed using Trimmomatic (v0.39) to remove adapters, perform quality control, and filter the reads.26 Subsequently, reads mapping to the human reference genome were filtered out using Kneaddata (v0.12.0). MetaPhlAn (v4.0.6) was employed to taxonomically profile each sample.28 Pathway relative abundance of each sample was quantified by HUMAnN (v3.7) using DIAMOND with package-shipped ChocoPhlAn and EC-filtered UniRef90 databases.29 Taxonomic and functional features prevalent in at least 10% of patients were kept for analysis. For subsequent analysis, read counts were transformed into relative abundances by normalization to the total number of reads per sample.
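The prevalence filter and the normalization to relative abundance described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy counts, the choice of "detected" as a value greater than zero, and the order of the two steps (normalize first, then filter) are all assumptions made for illustration.

```python
from typing import Dict, List

Table = Dict[str, List[float]]  # feature name -> one value per sample


def to_relative_abundance(counts: Table) -> Table:
    """Divide every count by the total of its sample (column)."""
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        feature: [row[j] / totals[j] if totals[j] > 0 else 0.0 for j in range(n_samples)]
        for feature, row in counts.items()
    }


def filter_by_prevalence(table: Table, min_prevalence: float = 0.10) -> Table:
    """Keep features detected (value > 0) in at least `min_prevalence` of samples."""
    kept: Table = {}
    for feature, row in table.items():
        prevalence = sum(1 for value in row if value > 0) / len(row)
        if prevalence >= min_prevalence:
            kept[feature] = row
    return kept


# Toy counts for 20 hypothetical samples (illustrative only, not study data).
toy_counts: Table = {
    "sp_common": [5.0] * 20,                    # detected in 20/20 samples
    "sp_borderline": [2.0, 2.0] + [0.0] * 18,   # detected in 2/20 = 10%
    "sp_rare": [3.0] + [0.0] * 19,              # detected in 1/20 = 5%
}
relative = to_relative_abundance(toy_counts)
kept = filter_by_prevalence(relative)
print(sorted(kept))  # ['sp_borderline', 'sp_common']
```

In this toy table the feature seen in only 1 of 20 samples falls below the 10% threshold and is dropped, while the feature seen in exactly 10% of samples is retained.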
Data analyses of the fecal microbiome were performed using R (v4.2.0). Alpha diversity was assessed using the Shannon index and richness. Beta diversity of community structure and functional variation among samples was visualized through principal coordinate analysis (PCoA) based on Aitchison distance.34 Permutational multivariate analysis of variance (PERMANOVA, 999 permutations) was used to evaluate the effect of confounding factors on Aitchison distance. Differentially abundant taxonomic and functional features were identified using multivariate analysis by linear models (MaAsLin2, v1.18.0), implemented in the Huttenhower Lab Galaxy instance (http://huttenhower.sph.harvard.edu/galaxy/), with significance set at p < 0.05.
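For readers unfamiliar with these measures, the sketch below shows how the Shannon index, richness, and Aitchison distance are commonly computed. The study performed these analyses in R; this Python version is only an illustration. The natural logarithm and the pseudocount used to handle zeros before the centered log-ratio (CLR) transform are assumptions, since the text does not state how zeros were treated.

```python
import math
from typing import List, Sequence


def shannon_index(abundances: Sequence[float]) -> float:
    """Shannon diversity H = -sum(p * ln p) over features with p > 0."""
    total = sum(abundances)
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)


def richness(abundances: Sequence[float]) -> int:
    """Number of features detected in the sample."""
    return sum(1 for a in abundances if a > 0)


def clr(abundances: Sequence[float], pseudocount: float = 1e-6) -> List[float]:
    """Centered log-ratio transform; the pseudocount is an assumed zero-handling choice."""
    logs = [math.log(a + pseudocount) for a in abundances]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x: Sequence[float], y: Sequence[float], pseudocount: float = 1e-6) -> float:
    """Euclidean distance between the CLR-transformed profiles of two samples."""
    cx, cy = clr(x, pseudocount), clr(y, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))


# Toy profiles (illustrative only).
even_sample = [0.25, 0.25, 0.25, 0.25]
skewed_sample = [0.70, 0.20, 0.10, 0.00]
print(shannon_index(even_sample), richness(skewed_sample))
print(aitchison_distance(even_sample, skewed_sample))
```

Because the CLR transform works on log-ratios, the Aitchison distance between a profile and a rescaled copy of itself is zero, which is why it suits compositional data such as relative abundances.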
Sparse Correlations for Compositional data (SparCC) was used to calculate microbial correlations using species relative abundance, with 100 bootstrap resamples to assess significance.31 The Louvain algorithm identified species clusters based on correlations with |r| > 0.2 and P < 0.05. These significant correlations and clusters were imported into Gephi for network visualization, where the Force Atlas 2 algorithm represented species as nodes, correlations as edges, and correlation strength by edge weight.
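The edge-selection rule (|r| > 0.2 and P < 0.05) can be sketched as follows. This does not implement SparCC, the Louvain algorithm, or the Gephi layout; it only shows how a correlation matrix and a matching p value matrix, assumed to have been computed beforehand, could be reduced to an edge list. The function names and the toy matrices are hypothetical.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[str, str, float]  # (species A, species B, correlation)


def significant_edges(
    species: Sequence[str],
    corr: Sequence[Sequence[float]],
    pvals: Sequence[Sequence[float]],
    min_abs_r: float = 0.2,
    alpha: float = 0.05,
) -> List[Edge]:
    """Return one edge per species pair with |r| > min_abs_r and p < alpha."""
    edges: List[Edge] = []
    for i in range(len(species)):
        for j in range(i + 1, len(species)):  # upper triangle only, no self-loops
            r, p = corr[i][j], pvals[i][j]
            if abs(r) > min_abs_r and p < alpha:
                edges.append((species[i], species[j], r))
    return edges


def network_size(edges: Sequence[Edge]) -> Tuple[int, int]:
    """Count distinct nodes and edges in the filtered network."""
    nodes = {name for a, b, _ in edges for name in (a, b)}
    return len(nodes), len(edges)


# Toy matrices for three hypothetical species (illustrative only).
names = ["sp_A", "sp_B", "sp_C"]
toy_corr = [[1.0, -0.4, 0.1], [-0.4, 1.0, 0.3], [0.1, 0.3, 1.0]]
toy_pvals = [[0.0, 0.02, 0.5], [0.02, 0.0, 0.2], [0.5, 0.2, 0.0]]
edges = significant_edges(names, toy_corr, toy_pvals)
print(edges, network_size(edges))
```

The sign of each retained correlation can then be used to color edges as positive or negative, as in the network figure.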
Construction of random forest-based diagnostic models
A random forest (RF) binary classifier was implemented using the scikit-learn (v1.5.2) package, as this algorithm has demonstrated superior performance compared with other machine learning methods for microbiota data in prior studies. Relative abundance tables were used to train the model. Cross-validation was performed by repeating a 70:30 train/test split of the training data 10 times and retraining the RF model on each split. The predict_proba() function was used to estimate the class probabilities for each sample. Optimal thresholds were determined using Youden’s index on the training set. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were computed to evaluate model performance. Predictive performance was further summarized by sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV).
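A minimal sketch of this workflow with scikit-learn is given below. It is not the authors' script: the synthetic three-feature data, the number of trees, the random seeds, the use of stratified splits, and the helper names are assumptions made for illustration. Class 1 stands for ineffective treatment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import StratifiedShuffleSplit


def repeated_holdout_auc(X, y, n_repeats=10, test_size=0.30, seed=0):
    """Mean test AUC over repeated stratified 70:30 train/test splits."""
    splitter = StratifiedShuffleSplit(n_splits=n_repeats, test_size=test_size, random_state=seed)
    aucs = []
    for train_idx, test_idx in splitter.split(X, y):
        model = RandomForestClassifier(n_estimators=200, random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        # Column 1 of predict_proba is the estimated probability of class 1.
        proba = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], proba))
    return float(np.mean(aucs))


def youden_threshold(y_true, scores):
    """Threshold that maximizes Youden's J = sensitivity + specificity - 1."""
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    return float(thresholds[np.argmax(tpr - fpr)])


def binary_metrics(y_true, scores, threshold):
    """Sensitivity, specificity, accuracy, PPV, and NPV at a given threshold."""
    y_pred = (np.asarray(scores) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

    def ratio(num, den):
        return float(num) / float(den) if den > 0 else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Synthetic data: 3 features shifted downward in class 1 (illustrative only).
rng = np.random.default_rng(0)
y = np.array([0] * 30 + [1] * 30)
X = rng.normal(size=(60, 3))
X[y == 1] -= 1.5

mean_auc = repeated_holdout_auc(X, y)
final_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
train_scores = final_model.predict_proba(X)[:, 1]
threshold = youden_threshold(y, train_scores)  # chosen on the training set
metrics = binary_metrics(y, train_scores, threshold)
print(round(mean_auc, 2), round(threshold, 2), metrics)
```

In practice the threshold chosen on the training set would be applied unchanged to the external cohort, so that specificity, PPV, and NPV there are not tuned on the validation data.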
Quantification and statistical analysis
Normality of the data was assessed using the one-sample Kolmogorov-Smirnov test. Continuous variables are presented as means ± standard deviations, and categorical variables as frequencies or percentages. Normally distributed data were compared between groups using the independent-samples t test, and non-normally distributed data using the Mann-Whitney-Wilcoxon test. Fisher’s exact or χ2 tests were applied to categorical variables. Correlations between continuous variables were assessed using Spearman correlation. All statistical tests were two-tailed, and P < 0.05 was considered statistically significant.
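The test-selection logic described above can be sketched with SciPy as follows. This is an illustrative example on simulated data rather than the software used in this study. Two details are assumptions made here: the Kolmogorov-Smirnov test is run against a normal distribution with the sample mean and standard deviation, and Fisher's exact test is chosen when any expected cell count is below 5.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05

def is_normal(x):
    """One-sample Kolmogorov-Smirnov test against a fitted normal distribution."""
    x = np.asarray(x, dtype=float)
    _, p = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))
    return p >= ALPHA

def compare_continuous(a, b):
    """t test if both groups look normal, otherwise Mann-Whitney (two-sided)."""
    if is_normal(a) and is_normal(b):
        return "t-test", float(stats.ttest_ind(a, b).pvalue)
    res = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney", float(res.pvalue)

def compare_categorical(table):
    """Fisher's exact test for sparse 2x2 tables, chi-square test otherwise."""
    table = np.asarray(table)
    _, p_chi2, _, expected = stats.chi2_contingency(table)
    if table.shape == (2, 2) and expected.min() < 5:
        _, p_fisher = stats.fisher_exact(table)
        return "Fisher", float(p_fisher)
    return "chi-square", float(p_chi2)

# Simulated example: a continuous variable in two hypothetical groups,
# and a 2x2 table of a categorical variable by group.
rng = np.random.default_rng(7)
group_1 = rng.normal(5.0, 1.0, size=40)
group_2 = rng.normal(7.0, 1.0, size=40)

cont_test, cont_p = compare_continuous(group_1, group_2)
cat_test, cat_p = compare_categorical([[8, 2], [1, 9]])
rho, rho_p = stats.spearmanr(group_1, group_2)

print(cont_test, cont_p)
print(cat_test, cat_p)
```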
Additional resources
• This study was approved by the Ethics Committee of Beijing Friendship Hospital, Capital Medical University (2017-P2-094-01), and written informed consent was obtained from all participants. The study was registered in the Chinese Clinical Trial Registry (ChiCTR-POC-17013285).
Published: May 2, 2025
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.isci.2025.112568.
Contributor Information
Haiyun Shi, Email: shihaiyun1016@gmail.com.
Shutian Zhang, Email: zhangshutian@ccmu.edu.cn.
References
1. Ng S.C., Shi H.Y., Hamidi N., Underwood F.E., Tang W., Benchimol E.I., Panaccione R., Ghosh S., Wu J.C.Y., Chan F.K.L., et al. Worldwide incidence and prevalence of inflammatory bowel disease in the 21st century: a systematic review of population-based studies. Lancet. 2017;390:2769–2778. doi: 10.1016/s0140-6736(17)32448-0.
2. Leung C.M., Tang W., Kyaw M., Niamul G., Aniwan S., Limsrivilai J., Wang Y.F., Ouyang Q., Simadibrata M., Abdullah M., et al. Endoscopic and Histological Mucosal Healing in Ulcerative Colitis in the First Year of Diagnosis: Results from a Population-based Inception Cohort from Six Countries in Asia. J. Crohns Colitis. 2017;11:1440–1448. doi: 10.1093/ecco-jcc/jjx103.
3. Römkens T.E.H., Kampschreur M.T., Drenth J.P.H., van Oijen M.G.H., de Jong D.J. High mucosal healing rates in 5-ASA-treated ulcerative colitis patients: results of a meta-analysis of clinical trials. Inflamm. Bowel Dis. 2012;18:2190–2198. doi: 10.1002/ibd.22939.
4. Safdi A.V., Cohen R.D. Review article: increasing the dose of oral mesalazine therapy for active ulcerative colitis does not improve remission rates. Aliment. Pharmacol. Ther. 2007;26:1179–1186. doi: 10.1111/j.1365-2036.2007.03471.x.
5. Zhao Q., Chen Y., Huang W., Zhou H., Zhang W. Drug-microbiota interactions: an emerging priority for precision medicine. Signal Transduct. Target. Ther. 2023;8:386. doi: 10.1038/s41392-023-01619-w.
6. Huang L., Zheng J., Sun G., Yang H., Sun X., Yao X., Lin A., Liu H. 5-Aminosalicylic acid ameliorates dextran sulfate sodium-induced colitis in mice by modulating gut microbiota and bile acid metabolism. Cell. Mol. Life Sci. 2022;79:460. doi: 10.1007/s00018-022-04471-3.
7. Mehta R.S., Mayers J.R., Zhang Y., Bhosle A., Glasser N.R., Nguyen L.H., Ma W., Bae S., Branck T., Song K., et al. Gut microbial metabolism of 5-ASA diminishes its clinical efficacy in inflammatory bowel disease. Nat. Med. 2023;29:700–709. doi: 10.1038/s41591-023-02217-7.
8. Karmi N., Sun S., Festen E.A.M., Vich Vila A., Gacesa R., Weersma R.K. Gut microbial metabolism of 5-aminosalicylic acid in inflammatory bowel disease. Gut. 2024;73 doi: 10.1136/gutjnl-2024-332205.
9. Lee J.W.J., Plichta D., Hogstrom L., Borren N.Z., Lau H., Gregory S.M., Tan W., Khalili H., Clish C., Vlamakis H., et al. Multi-omics reveal microbial determinants impacting responses to biologic therapies in inflammatory bowel disease. Cell Host Microbe. 2021;29:1294–1304. doi: 10.1016/j.chom.2021.06.019.
10. Wang C., Gu Y., Chu Q., Wang X., Ding Y., Qin X., Liu T., Wang S., Liu X., Wang B., Cao H. Gut microbiota and metabolites as predictors of biologics response in inflammatory bowel disease: A comprehensive systematic review. Microbiol. Res. 2024;282 doi: 10.1016/j.micres.2024.127660.
11. Radhakrishnan S.T., Alexander J.L., Mullish B.H., Gallagher K.I., Powell N., Hicks L.C., Hart A.L., Li J.V., Marchesi J.R., Williams H.R.T. Systematic review: the association between the gut microbiota and medical therapies in inflammatory bowel disease. Aliment. Pharmacol. Ther. 2022;55:26–48. doi: 10.1111/apt.16656.
12. Olaisen M., Spigset O., Flatberg A., Granlund A.V.B., Brede W.R., Albrektsen G., Røyset E.S., Gilde B., Sandvik A.K., Martinsen T.C., Fossmark R. Mucosal 5-aminosalicylic acid concentration, drug formulation and mucosal microbiome in patients with quiescent ulcerative colitis. Aliment. Pharmacol. Ther. 2019;49:1301–1313. doi: 10.1111/apt.15227.
13. Bonder M.J., Kurilshikov A., Tigchelaar E.F., Mujagic Z., Imhann F., Vila A.V., Deelen P., Vatanen T., Schirmer M., Smeekens S.P., et al. The effect of host genetics on the gut microbiome. Nat. Genet. 2016;48:1407–1412. doi: 10.1038/ng.3663.
14. Klaassen M.A.Y., Imhann F., Collij V., Fu J., Wijmenga C., Zhernakova A., Dijkstra G., Festen E.A.M., Gacesa R., Vich Vila A., Weersma R.K. Anti-inflammatory Gut Microbial Pathways Are Decreased During Crohn's Disease Exacerbations. J. Crohns Colitis. 2019;13:1439–1449. doi: 10.1093/ecco-jcc/jjz077.
15. Feng C., Zhang W., Zhang T., He Q., Kwok L.Y., Tan Y., Zhang H. Heat-Killed Bifidobacterium bifidum B1628 May Alleviate Dextran Sulfate Sodium-Induced Colitis in Mice, and the Anti-Inflammatory Effect Is Associated with Gut Microbiota Modulation. Nutrients. 2022;14 doi: 10.3390/nu14245233.
16. Rossi O., van Berkel L.A., Chain F., Tanweer Khan M., Taverne N., Sokol H., Duncan S.H., Flint H.J., Harmsen H.J.M., Langella P., et al. Faecalibacterium prausnitzii A2-165 has a high capacity to induce IL-10 in human and murine dendritic cells and modulates T cell responses. Sci. Rep. 2016;6 doi: 10.1038/srep18507.
17. Martín R., Miquel S., Chain F., Natividad J.M., Jury J., Lu J., Sokol H., Theodorou V., Bercik P., Verdu E.F., et al. Faecalibacterium prausnitzii prevents physiological damages in a chronic low-grade inflammation murine model. BMC Microbiol. 2015;15:67. doi: 10.1186/s12866-015-0400-1.
18. Quévrain E., Maubert M.A., Michon C., Chain F., Marquant R., Tailhades J., Miquel S., Carlier L., Bermúdez-Humarán L.G., Pigneur B., et al. Identification of an anti-inflammatory protein from Faecalibacterium prausnitzii, a commensal bacterium deficient in Crohn's disease. Gut. 2016;65:415–425. doi: 10.1136/gutjnl-2014-307649.
19. Miquel S., Leclerc M., Martin R., Chain F., Lenoir M., Raguideau S., Hudault S., Bridonneau C., Northen T., Bowen B., et al. Identification of metabolic signatures linked to anti-inflammatory effects of Faecalibacterium prausnitzii. mBio. 2015;6:e00300-15. doi: 10.1128/mBio.00300-15.
20. Chang K.C., Nagarajan N., Gan Y.H. Short-chain fatty acids of various lengths differentially inhibit Klebsiella pneumoniae and Enterobacteriaceae species. mSphere. 2024;9 doi: 10.1128/msphere.00781-23.
21. Martí-Aguado D., Ballester M.P., Mínguez M. Risk factors and management strategies associated with non-response to aminosalicylates as a maintenance treatment in ulcerative colitis. Rev. Esp. Enferm. Dig. 2021;113:447–453. doi: 10.17235/reed.2021.7797/2021.
22. Sedano R., Jairath V., Ma C., IBD Trial Design Group. Design of Clinical Trials for Mild to Moderate Ulcerative Colitis. Gastroenterology. 2022;162:1005–1018. doi: 10.1053/j.gastro.2021.12.284.
23. Zhou Y., Xu Z.Z., He Y., Yang Y., Liu L., Lin Q., Nie Y., Li M., Zhi F., Liu S., et al. Gut Microbiota Offers Universal Biomarkers across Ethnicity in Inflammatory Bowel Disease Diagnosis and Infliximab Response Prediction. mSystems. 2018;3:e00188-17. doi: 10.1128/mSystems.00188-17.
24. Ananthakrishnan A.N., Luo C., Yajnik V., Khalili H., Garber J.J., Stevens B.W., Cleland T., Xavier R.J. Gut Microbiome Function Predicts Response to Anti-integrin Biologic Therapy in Inflammatory Bowel Diseases. Cell Host Microbe. 2017;21:603–610. doi: 10.1016/j.chom.2017.04.010.
25. Doherty M.K., Ding T., Koumpouras C., Telesco S.E., Monast C., Das A., Brodmerkel C., Schloss P.D. Fecal Microbiota Signatures Are Associated with Response to Ustekinumab Therapy among Crohn's Disease Patients. mBio. 2018;9:e02120-17. doi: 10.1128/mBio.02120-17.
26. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
27. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
28. Blanco-Míguez A., Beghini F., Cumbo F., McIver L.J., Thompson K.N., Zolfo M., Manghi P., Dubois L., Huang K.D., Thomas A.M., et al. Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nat. Biotechnol. 2023;41:1633–1644. doi: 10.1038/s41587-023-01688-w.
29. Beghini F., McIver L.J., Blanco-Míguez A., Dubois L., Asnicar F., Maharjan S., Mailyan A., Manghi P., Scholz M., Thomas A.M., et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. Elife. 2021;10 doi: 10.7554/eLife.65088.
30. Mallick H., Rahnavard A., McIver L.J., Ma S., Zhang Y., Nguyen L.H., Tickle T.L., Weingart G., Ren B., Schwager E.H., et al. Multivariable association discovery in population-scale meta-omics studies. PLoS Comput. Biol. 2021;17 doi: 10.1371/journal.pcbi.1009442.
31. Friedman J., Alm E.J. Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 2012;8 doi: 10.1371/journal.pcbi.1002687.
32. Lennard-Jones J.E. Classification of inflammatory bowel disease. Scand. J. Gastroenterol. Suppl. 1989;170:2–19. doi: 10.3109/00365528909091339.
33. Krousel-Wood M.A., Muntner P., Islam T., Morisky D.E., Webber L.S. Barriers to and determinants of medication adherence in hypertension management: perspective of the cohort study of medication adherence among older adults. Med. Clin. North Am. 2009;93:753–769. doi: 10.1016/j.mcna.2009.02.007.
34. Wallen Z.D., Demirkan A., Twa G., Cohen G., Dean M.N., Standaert D.G., Sampson T.R., Payami H. Metagenomics of Parkinson's disease implicates the gut microbiome in multiple disease mechanisms. Nat. Commun. 2022;13:6958. doi: 10.1038/s41467-022-34667-x.
Associated Data
Data Availability Statement
• All sequence files are available from the National Center for Biotechnology Information BioProject database (PRJNA1240691). Please contact the corresponding author for details of the metagenomics data.
• The code for species and functional annotation from raw sequencing reads is available at https://github.com/biobakery/MetaPhlAn and https://github.com/biobakery/humann. The code for multivariate analysis by linear models is available at https://github.com/biobakery/Maaslin2. The code for construction of the random forest model is available at https://github.com/scikit-learn/scikit-learn. Other code is publicly available, and accession numbers are listed in the key resources table.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
• IBDMDB cohort data are available at https://ibdmdb.org/.





