Abstract
Background
The gut microbiome is extensively involved in induction of remission in pediatric Crohn’s disease (CD) patients by exclusive enteral nutrition (EEN). In this follow-up study of pediatric CD patients undergoing treatment with EEN, we employ machine learning models trained on baseline gut microbiome data to distinguish patients who achieved and sustained remission (SR) from those who did not achieve remission or who relapsed (non-SR) by 24 weeks.
Methods
A total of 139 fecal samples were obtained from 22 patients (8–15 years of age) for up to 96 weeks. Gut microbiome taxonomy was assessed by 16S rRNA gene sequencing, and functional capacity was assessed by metagenomic sequencing. We used standard metrics of diversity and taxonomy to quantify differences between SR and non-SR patients and to associate gut microbial shifts with fecal calprotectin (FCP) and disease severity as defined by the weighted Pediatric Crohn’s Disease Activity Index. We used microbial data sets in addition to clinical metadata in random forest (RF) models to classify treatment response and predict FCP levels.
Results
Microbial diversity did not change after EEN, but species richness was lower in low-FCP samples (<250 µg/g). An RF model using microbial abundances, species richness, and Paris disease classification was the best at classifying treatment response (area under the curve [AUC] = 0.9). KEGG Pathways also significantly classified treatment response with the addition of the same clinical data (AUC = 0.8). Top features of the RF model are consistent with previously identified IBD taxa, such as Ruminococcaceae and Ruminococcus gnavus.
Conclusions
Our machine learning approach is able to distinguish SR and non-SR samples using baseline microbiome and clinical data.
Keywords: pediatric Crohn’s disease, exclusive enteral nutrition, gut microbiome, nutrition in pediatrics
Here, authors present a follow-up on a cohort of 22 pediatric CD patients undergoing EEN treatment to induce remission. They leverage machine learning techniques to show that the baseline microbiome with clinical metadata can predict sustained remission.
INTRODUCTION
Crohn’s disease (CD) is a major form of inflammatory bowel disease (IBD) that can affect any part of the gastrointestinal (GI) tract, causing chronic relapsing inflammation.1 Crohn’s disease is characterized by abdominal pain, diarrhea, fatigue, and many other symptoms detrimental to quality of life. Incidence of pediatric-onset IBD is on the rise globally2 and in Canada, especially in Nova Scotia.3 Inflammatory bowel disease is thought to manifest in genetically susceptible individuals who exhibit an exaggerated immune response to intestinal microbes when exposed to environmental triggers,4 but an exact etiology remains elusive.5 The rising incidence and increasingly early onset of IBD substantiate a need to further investigate the etiology of the disease.
Exclusive enteral nutrition (EEN) involves the administration of a liquid-formula diet while excluding all other oral intake for a period of 4 to 12 weeks and has been shown to induce clinical remission of pediatric CD, with rates in excess of 75%.6–8 Exclusive enteral nutrition is as effective as corticosteroids in inducing remission in pediatric CD, but those receiving EEN are more likely to achieve mucosal healing, lower disease activity, and improve weight gain, without the harsh side effects of corticosteroid treatment.9 The European Crohn’s & Colitis Organisation (ECCO) and European Society of Pediatric Gastroenterology, Hepatology & Nutrition (ESPGHAN) guidelines recommend EEN as a first-line therapy for pediatric CD to induce remission or after a flare of symptoms.10
The mechanism of action of EEN is not fully characterized, but it is known to involve bacteria in the GI tract, collectively called the gut microbiome.11, 12 Altered gut microbiome composition (dysbiosis) in CD patients has been well documented and is generally characterized by decreased bacterial diversity and differential abundance of select taxa relative to healthy controls.13–15 Specific features of dysbiosis in CD patients include reduced Faecalibacterium prausnitzii and an increase of taxa from the Proteobacteria phylum, namely Escherichia coli.13, 15, 16
Therapy with EEN has been shown to modulate the gut microbiome and reduce disease activity.11, 17, 18 Studies of the effect of EEN on the gut microbiome have reported mixed results, but EEN generally further decreases bacterial diversity after treatment.19 Importantly, EEN leads to mucosal healing of the ileum and colon,20 which is a vital early end point in the management of CD. Though mucosal healing is best assessed directly by endoscopy,10 the invasive nature of this procedure necessitates more frequent use of surrogate markers including C-reactive protein (CRP), fecal calprotectin (FCP), and proxies such as the Pediatric Crohn’s Disease Activity Index (PCDAI) to routinely monitor response to treatment.21 A better understanding of the microbial factors associated with disease activity and response to EEN could provide additional surrogate markers to better stratify patients and monitor responses to EEN.
In this study, we present a follow-up on a cohort of 22 pediatric CD patients undergoing EEN to induce remission.22 We investigated the impact of EEN on gut microbiome composition and employed machine learning using microbiome data to distinguish patients that maintained remission compared with those that did not achieve remission or relapsed upon reintroduction to normal diet. Our aim was to develop a predictive model to identify the features of the CD gut microbiome most predictive of a patient’s response to EEN treatment. We found that a model using input from microbiome data and Paris disease classification was the most informative in distinguishing response to EEN.
MATERIALS AND METHODS
Subjects and Study Design
Twenty-two pediatric CD patients receiving gastroenterology specialist care at the IWK Health Centre (Halifax, NS, Canada) were prospectively recruited to the MAREEN (a metagenomic approach to diagnosis, induction, and maintenance of deep remission after exclusive enteral nutrition in pediatric Crohn’s disease) study. Diagnosis of CD was based on standard histological, endoscopic, and radiological criteria, as previously described.22 All patients were treated with EEN via nasogastric/gastric tubing for at least 12 weeks. Clinical data were collected at regularly scheduled follow-up visits at the GI clinic. Clinical metadata including disease activity and serology results were collected from patient medical records. These study data were managed using the REDCap electronic data capture tool hosted at the IWK Health Centre.23, 24 Clinical remission was defined by weighted Pediatric Crohn’s Disease Activity Index (wPCDAI) score ≤12.5.25 Disease severity was classified by recommended wPCDAI cut-offs from Turner et al: remission (≤12.5), mild (>12.5 to <40), moderate (≥40 to <57.5), and severe (≥57.5).25 Sustained remission (SR) was defined as establishing and maintaining remission through week 24 of treatment. Patients who experienced a relapse or flare of disease before week 12 or before week 24 (after reintroduction of normal diet but often with ongoing partial enteral nutrition with formula) were classified as non-SR. Additional medications, Paris classification of disease,26 sex, and age of patients are shown in Table 1.
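The wPCDAI cut-offs described above can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of the study's analysis code; the function name `classify_wpcdai` is hypothetical, and it assumes the moderate band runs from 40 up to (but not including) 57.5, so that the four categories do not overlap.

```python
def classify_wpcdai(score: float) -> str:
    """Map a wPCDAI score to a disease-activity category.

    Bands follow the cut-offs quoted in the text (Turner et al):
    remission <=12.5, mild >12.5 to <40, moderate >=40 to <57.5, severe >=57.5.
    """
    if score < 0:
        raise ValueError("wPCDAI scores are non-negative")
    if score <= 12.5:
        return "remission"
    if score < 40:
        return "mild"
    if score < 57.5:
        return "moderate"
    return "severe"


# Baseline wPCDAI values for four patients, taken from Table 1.
baseline_scores = {"CD1": 0, "CD2": 35, "CD4": 50, "CD3": 75}
for patient, score in baseline_scores.items():
    print(patient, classify_wpcdai(score))
```

Boundary values matter here: a score of exactly 12.5 counts as remission, while 57.5 counts as severe.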
TABLE 1.
MAREEN Study Patients and Demographics
| ID | Age at Baseline | Sex | Paris Classification (Location-Behavior) | Other Medications, B/w12a | Fecal Calprotectin (µg/g stool), B/w12 | wPCDAI, B/w12 | Remission Status at Week 24 |
|---|---|---|---|---|---|---|---|
| CD1 | 13 | M | L3+L4a–B1 | 6MP + AZA + antibiotics / MTX (sc) | 2018 / 211 | 0 / 5 | SR |
| CD2 | 15 | M | L3+L4b–B1 | none / AZA | 6555 / 5221 | 35 / 7.5 | SR |
| CD3 | 12 | M | L3+L4a–B1p | none / AZA + antibiotics | 6000 / 58 | 75 / 0 | SR |
| CD4 | 14 | M | L3+L4ab–B1p | none / MTX (sc) + AZA + Adalimumab | 3781 / 6976 | 50 / 35 | Non-SR |
| CD5 | 10 | M | L3+L4a–B1 | none / MTX (sc) | 4407 / 704 | 10 / 0 | SR |
| CD6 | 14 | M | L3–B1p | none / AZA + antibiotics | 6000 / 3904 | 80 / 0 | Non-SR |
| CD7 | 11 | M | L2+L4a–B2 | none / AZA | 2292 / 4560 | 40 / 7.5 | SR |
| CD8 | 14 | M | L3+L4a–B1p | antibiotics / antibiotics | 745 / na | 50 / 0 | SR |
| CD9 | 10 | M | L3+L4ab–B1 | none / AZA | 2745 / 4707 | 55 / 17.5 | Non-SR |
| CD10 | 10 | M | L2–B1 | 5-ASA / prednisone | 3906 / 2693 | 27.5 / 27.5 | Non-SR |
| CD11 | 12 | F | L3+L4a–B1 | none / AZA | na / na | 80 / 7.5 | Non-SR |
| CD12 | 8 | M | L2–B1p | none / none | 2427 / na | 35 / na | SR |
| CD13 | 11 | M | L3–B1p | none / MTX (sc) | 3844 / 2720 | 37.5 / 0 | SR |
| CD14 | 13 | M | L3–B1 | none / na | 2874 / na | 15 / na | NA b |
| CD15 | 10 | F | L1–B1 | none / none | na / 3574 | 45 / 7.5 | SR |
| CD16 | 10 | F | L3+L4b–B1p | none / MTX + Adalimumab | 2691 / 173 | 62.5 / 0 | SR |
| CD17 | 14 | F | L3+L4a–B1 | none / na | 4663 / na | 72.5 / na | NA b |
| CD18 | 12 | M | L1+L4a–B1p | none / none | 2012 / 178 | 30 / 0 | Non-SR |
| CD19 | 8 | M | L3+L4a–B1 | none / AZA | 746 / 1357 | 30 / 7.5 | SR |
| CD20 | 10 | F | L3+L4a–B1 | none / MTX (po) | na / 4755 | 42.5 / 0 | SR |
| CD21 | 12 | M | L3+L4ab–B1 | none / none | 5801 / 1280 | 57.5 / na | NA b |
| CD22 | 14 | M | L3+L4ab–B1 | none / MTX (sc) | 5496 / 2115 | 32.5 / 0 | SR |
Abbreviations: 5-ASA, 5-aminosalicylic acid; 6-MP, 6-mercaptopurine; AZA, azathioprine; B, baseline; MTX, methotrexate.
aB is baseline collection timepoint and w12 is 12 weeks into EEN treatment.
bNo week 24 follow-up wPCDAI was available.
Fecal Calprotectin
Fecal calprotectin was quantified for 132 stool samples that were stored at −80°C before processing. Fecal calprotectin levels were quantified using the EliA Calprotectin 2 automated immunoassay on the Phadia 250 platform (Thermo Fisher Scientific, Uppsala, Sweden) following the manufacturer’s protocol.
DNA Extraction
Stool samples were stored at −20°C following collection and subsequently stored at −80°C until analysis. DNA was isolated from 139 stool samples in total for bacterial identification using the Stool DNA Isolation Kit according to manufacturer instructions (NORGEN Biotek, Thorold, ON, Canada).
16S rRNA Gene Sequencing
Variable regions V4-V5 of the bacterial 16S ribosomal RNA gene were amplified from extracted DNA using PCR conditions and custom primers as described in the Microbiome Helper protocol.27 The forward (515FB = GTGYCAGCMGCCGCGGTAA) and reverse (926R = CCGYCAATTYMTTTRAGTTT) primers used Nextera Illumina index tags and sequencing adapters fused to the 16S sequences. Each sample was amplified with a different combination of index barcodes to allow for sample identification after multiplex sequencing. After amplification, paired-end 300 + 300 bp v3 sequencing was performed for all samples on the Illumina MiSeq at the Integrated Microbiome Resource (http://imr.bio/) of Dalhousie University.
16S rRNA Gene Annotation
Analysis of 16S sequencing data was carried out on a dedicated server using the Microbiome Helper workflow27 (https://github.com/LangilleLab/microbiome_helper/wiki/Amplicon-SOP-v2). First, primer sequences were removed from sequencing reads using cutadapt (v 1.14),28 and primer-trimmed files were imported into QIIME2 (v. 2019.4.0).29 Forward and reverse paired-end reads were joined using VSEARCH (v 2.9.0)30 and input into Deblur31 to correct reads and obtain amplicon sequence variants (ASVs). In total, 21,757 unique ASVs were observed, with a mean read depth of 15,305 sequences across 139 samples. Amplicon sequence variants that had a frequency of less than 0.1% of the mean sample depth were removed from downstream analysis (<15 counts). After filtering of rare ASVs, there were 1053 unique ASVs that remained. MAFFT (v 7.407)32 was used to build a multiple-sequence alignment of ASVs, which was input into FastTree33 to construct a phylogenetic tree. Taxonomy was assigned to ASVs using the SILVA rRNA gene database34 and the “feature-classifier” option in QIIME2. Estimates of alpha-diversity (Observed ASVs, Shannon Diversity), beta-diversity (weighted UniFrac), and relative abundance of ASVs were obtained using QIIME2 (v 2019.4). Sequencing results were rarefied to 3000 reads per sample to acquire diversity metrics and compare taxonomy between samples, omitting 6 samples from further 16S analysis. To determine whether levels of individual bacterial taxa were significantly different between clinical characteristics, the graphical software package STAMP was used.35 To test for differences in individual taxa between groups of samples, the Welch 2-sided t test with Benjamini-Hochberg false discovery rate (FDR) correction was employed. For testing 3 or more groups, STAMP uses the ANOVA test with FDR correction.
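For readers unfamiliar with the Benjamini-Hochberg correction applied to the per-taxon tests above, the procedure can be sketched in a few lines of Python. This is a generic illustration of the step-up adjustment, not the STAMP implementation; the function name `benjamini_hochberg` and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P values for four taxa (not study data).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A taxon would then be reported as differentially abundant when its adjusted value falls below the chosen FDR threshold.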
Metagenomics Sequencing and Bioinformatic Pipeline
At least 1 nanogram of each purified DNA sample was subjected to Nextera XT (Illumina, San Diego, CA) library preparation per the manufacturer’s instructions, but clean-up and normalization were completed using the Charm Just-a-Plate Purification and Normalization Kit (Charm Biotech, Cape Girardeau, MO). Complete libraries were then pooled and sequenced in a portion of a 150 + 150 bp PE NextSeq run (Illumina Hi-Output v2 300 cycle kit), with a mean read depth of 5,304,747 reads per sample. The Microbiome Helper27 metagenomic standard operating procedure was followed (https://github.com/LangilleLab/microbiome_helper/wiki/Metagenomics-standard-operating-procedure-v2). Forward and reverse sequencing reads were first concatenated into a single FASTQ file per sample and run through the kneaddata pipeline for sequence preprocessing. Kneaddata uses Trimmomatic36 to remove low-quality sequences and Bowtie237 to screen out human and PhiX174 contaminant sequences. Trimmomatic removes reads smaller than 50 base pairs, and those with low-quality scores (PHRED <Q20). Humann238 (v.0.11.2) was used to functionally annotate the metagenomic sequencing reads into MetaCyc metabolic pathways and KEGG orthologs (KOs). Amplicon sequencing data for each sample are available at the European Nucleotide Archive (accession PRJEB33603).
Random Forest Classification
Random forest (RF) models were run on baseline samples to assess the predictive accuracy of 16S rRNA data sets, MetaCyc pathways, and KOs in classifying SR and non-SR samples. Each data set was preprocessed so only features with >20% of samples having non-zero values were retained. Features were standardized by sample by subtracting each sample’s mean and dividing by the sample’s standard deviation (Z-score standardization). We ran RF models using the randomForest39 R package (v 4.6.12) with default mtry values. All models were run with 1001 trees, and model significance was determined by the permutation test from the rfUtilities40 R package (v 2.0-0). Leave-one-out cross-validation was also run on each data set to output an accuracy metric for each model using the R package caret41 (v 6.0–73). Abundances of taxonomic and functional features of each sample were also regressed against their respective FCP levels to assess the importance of each feature in explaining variance in FCP.
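The two preprocessing steps described above (prevalence filtering and per-sample Z-score standardization) can be illustrated with a minimal Python sketch. The study itself used R; the function names, the toy abundance table, and the use of the sample standard deviation (n − 1 denominator) are assumptions made for illustration.

```python
from statistics import mean, stdev


def filter_by_prevalence(table, min_fraction=0.20):
    """Keep features that are non-zero in more than `min_fraction` of samples.

    `table` maps a feature name to a list of abundances, one per sample.
    """
    kept = {}
    for feature, values in table.items():
        prevalence = sum(1 for v in values if v != 0) / len(values)
        if prevalence > min_fraction:
            kept[feature] = values
    return kept


def zscore_by_sample(table):
    """Standardize each sample (column) to mean 0 and unit standard deviation."""
    features = list(table)
    n_samples = len(next(iter(table.values())))
    scaled = {f: [0.0] * n_samples for f in features}
    for j in range(n_samples):
        column = [table[f][j] for f in features]
        mu, sd = mean(column), stdev(column)
        for f in features:
            # A constant column carries no information; map it to zeros.
            scaled[f][j] = (table[f][j] - mu) / sd if sd > 0 else 0.0
    return scaled


# Toy abundance table: 3 features x 5 samples (hypothetical values).
toy = {
    "asv_a": [10, 0, 3, 7, 2],
    "asv_b": [0, 0, 0, 0, 4],   # non-zero in exactly 20% of samples, so dropped
    "asv_c": [1, 5, 0, 2, 9],
}
filtered = filter_by_prevalence(toy)
scaled = zscore_by_sample(filtered)
print(sorted(filtered))
```

Note that standardization here runs across the features within one sample, as stated in the text, rather than across samples for one feature.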
ETHICAL CONSIDERATIONS
Informed consent was obtained from all study participants or their parent/guardian. The MAREEN study was approved by the Research Ethics Board of the IWK Health Centre.
RESULTS
Patient and Sample Cohort
Demographics and clinical characteristics of the cohort are presented in Table 1. All 22 patients enrolled in the study provided baseline (week 0) stool samples before starting EEN. Two patients withdrew before week 12 of EEN therapy, and the remaining 20 patients provided an additional 1 to 9 samples during follow-up to a maximum of 96 weeks (Supplementary Table 1) for a total of 139 samples.
Differential Microbial Diversity and Abundant Taxa Associated With Disease Activity
The 16S rRNA gene was directly sequenced from all 139 stool samples obtained from the cohort. We examined the association of microbial diversity and taxonomic composition in each sample with widely used parameters for disease activity—FCP concentration (µg/g) and wPCDAI score.21 FCP levels were detected in 132 samples (median 1591 [10.0 – 6976] µg/g). Disease activity at the time of sampling was determined by calculating wPCDAI score. Fecal calprotectin levels and wPCDAI scores themselves were weakly correlated (Pearson correlation = 0.49; P = 2.41E-06; Supplementary Fig. 1). However, FCP levels were highest in samples associated with severe disease activity (wPCDAI ≥ 57.5, n = 6), reaching significance when compared with samples associated with remission (wPCDAI ≤12.5, n = 70; 4325 vs 1737 µg/g; P = 0.0013; Fig. 1A). We examined the relationship between species richness and FCP, and using the accepted FCP clinical cut-off value of 250 µg/g,42 we found that samples with low FCP (<250 µg/g; n = 27) had significantly lower species richness than samples with high FCP (≥250 µg/g; n = 105; Fig. 1B). Samples associated with severe disease activity (“severe” samples, wPCDAI ≥ 57.5) had the lowest species richness, trending toward significance when compared with samples associated with remission (“remission” samples, wPCDAI ≤ 12.5; Fig. 1C).
FIGURE 1.
A, In samples with an FCP level above 250 µg/g, alpha diversity was higher (P = 0.039). B, FCP is significantly higher in severe CD compared with remission (t test, P = 0.0306) and mild CD (t test, P = 0.0013). All disease groupings have significantly different FCP levels (P = 0.027, Kruskal-Wallis). Disease severity was classified by recommended wPCDAI cut-offs from Turner et al: remission (≤12.5), mild (>12.5 to <40), moderate (≥40 to <57.5), and severe (≥57.5).25 C, The difference in alpha-diversity between severe CD samples (n = 6) and remission samples (n = 70) trended toward significance (P = 0.053). D, Four genera found by ANOVA to be differentially abundant between disease severity groupings. Statistical tests shown are Wilcoxon tests between severe CD and remission, mild CD, and moderate CD samples (***<0.001, **<0.01, *<0.05).
There were 29 taxa in total that were differentially abundant among the 4 sample subgroups defined by disease activity (eg, remission, mild, moderate, severe), 27 of which had the highest abundance in severe samples. Of these 27 taxa, 4 bacterial genera were significantly increased including Peptostreptococcus, Fusobacterium, Hungatella, and Eikenella (Fig. 1D). Fusobacterium showed the greatest difference, with a 3.05% greater average relative abundance in severe CD samples. The bacterial families Carnobacteriaceae, Campylobacteraceae, and Neisseriaceae, and Dialister sp. were also increased in abundance in severe CD. Twenty-seven taxa in total had an increased relative abundance in remission samples compared with severe samples (Supplementary Table 2). We also compared taxonomic differences between high and low FCP samples and found 10 taxa present in greater abundance in high FCP samples (Supplementary Table 2). We also analyzed patients who exhibited perianal features at baseline endoscopy (n = 12) compared with those without (n = 10) and found no differentially abundant taxa.
Differential Microbial Diversity and Abundant Taxa Associated With Response to EEN
Of the 22 patients enrolled in the study, 19 patients had complete clinical follow-up data available to assess response to EEN (Table 1). Of these, 15 (79%) achieved clinical remission as defined by wPCDAI scores ≤12.5 by week 12. By week 24, 13 (68%) remained in remission and were classified as sustained remission (SR). The 6 patients that did not achieve or remain in clinical remission were classified as non-SR.
To investigate microbial differences associated with response to EEN, we examined 16S rRNA sequences from week 0, 12, and 24 stool samples from SR and non-SR patients. Species richness (alpha diversity) was estimated by observed ASVs. In contrast to previous reports,19 our paired analysis of species richness at week 0 vs week 12 showed that microbial diversity did not decrease in response to EEN, with or without stratification by treatment response (Kruskal-Wallis, P = 0.390 and P = 0.859, respectively; Fig. 2A). This trend persisted even after 3 months, with no differences in species richness at week 24.
FIGURE 2.
A, Observed ASV levels did not differ between baseline and after week 12 of EEN, even when stratified by treatment response. B, Percentage of sequences belonging to bacterial phyla. Verrucomicrobia is significantly elevated in non-SR patients at week 12 (asterisk) compared with week 0 non-SR and SR patients (ANOVA, eta squared = 0.367, q-value = 1.90E-02).
At the phylum level, taxonomic profiles of SR patient samples at week 0 vs week 12 were comparable, whereas non-SR patients exhibited a significantly increased presence of Verrucomicrobia at week 12 compared with week 0 samples from both SR and non-SR patients (Fig. 2B). Interestingly, no taxa within the Verrucomicrobia phylum were differentially abundant between SR and non-SR patients at week 0 or 12, but our analysis of samples associated with disease activity subgroups showed that the Akkermansiaceae family within Verrucomicrobia was 1.6% more abundant in remission samples compared with severe samples (P = 0.040). At week 0, Proteobacteria appeared to be more prevalent in non-SR patient samples than in SR samples, but this trend was not significant. Proteobacteria decreased in relative abundance by week 12 in both SR and non-SR samples compared with week 0, but this change did not differ significantly. Other taxa that were significantly increased in non-SR week 12 samples were the orders Actinomycetales, Corynebacteriales, and Rhizobiales, an uncultured taxon of the Christensenellaceae family, and Ruminiclostridium sp.
To define microbial communities associated with “deep sustained remission” (DSR), we classified samples with both an FCP <250 µg/g and wPCDAI <12.5 as being in DSR. At week 24, 2 patients fulfilled these criteria. We carried out principal coordinates analysis (PCoA) using weighted UniFrac43 in QIIME. At baseline, DSR samples did not differ from non-DSR. At week 24, the beta-diversity as measured by UniFrac was significantly different in DSR samples (n = 2) compared with non-DSR samples (n = 18; Supplementary Fig. 2; R = 0.63; P = 0.023, adonis test). Bacteroides sp. was found to be significantly more abundant in non-DSR samples compared with DSR samples at week 24 after FDR correction (P = 0.0051). On a PCoA plot, the DSR samples were visible outliers.
Microbial Functional Pathway Associations With Disease Activity and Response to EEN
To determine the relationship between microbial functions and disease activity and response to EEN in our cohort, we looked at the presence of metagenomic sequencing (MGS)–determined microbial functions (MetaCyc pathways, KEGG orthologs, KEGG pathways, and KEGG modules) among 4 different sample groupings based on clinical metadata: high vs low FCP, remission vs severe disease activity, all 4 categories of disease activity, and the subgroup of SR vs non-SR in week 0 and week 12 samples. There were 174 differentially abundant functions across various clinical metadata groupings (Supplementary Table 3). In samples associated with remission, there were 111 functions in higher relative abundance when compared with samples associated with severe disease activity.
ASVs and Clinical Metadata Significantly Model Response to EEN
We next investigated how well taxonomic and functional data sets classify SR and non-SR samples using baseline microbiome data before initiation of EEN therapy. Taxonomic data from 16S sequencing and functional profiles from MGS were used as input for random forest machine learning models to distinguish SR and non-SR samples at baseline. Microbial sequence data from all samples were used in the RF model except for samples from patient CD1 (due to concomitant medications at baseline) and patients CD14, CD17, and CD21 (due to incomplete follow-up data, Table 1). The 16S data sets used as model classifiers were ASV, species, genus, and family taxonomic data. Functional data sets from MGS included KEGG ortholog, pathway, and module counts, in addition to MetaCyc pathways. A total of 8 data sets were used as classifiers for treatment response, each mean-centered and scaled by the standard deviation in each sample. We ran independent RF models to determine the accuracy of each data set in classifying SR and non-SR samples. The only significant taxonomic data set predicting treatment response was ASVs (P = 0.047). None of the functional data sets were significant classifiers on their own.
Having a robust set of clinical data for each patient, we next investigated how these RF models could be improved by the addition of clinical features to microbial data sets. We sequentially added clinical metadata to the ASV RF model to find the best combination of predictors for treatment response. Amplicon sequence variants on their own resulted in an area under the curve (AUC) of 0.743, which was raised to 0.829 with the addition of species richness. We found that adding species richness and disease location and behavior, as defined by the Paris classification scheme,26 resulted in the best model for predicting EEN treatment response, with an AUC of 0.9 (Fig. 3). Although KEGG pathways were not significant predictors on their own, with the addition of species richness, disease location, and disease behavior, the RF model was significantly predictive (P = 0.048), with an AUC of 0.8.
FIGURE 3.
Receiver operating characteristic (ROC) curves for significant RF models for predicting EEN treatment response.
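As a reminder of what the AUC values reported for these models mean, the AUC equals the probability that a randomly chosen SR sample receives a higher predicted score than a randomly chosen non-SR sample. The Python sketch below computes it directly from that definition; the function name `roc_auc`, the labels, and the scores are hypothetical and are not outputs of the study's models.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    `labels` uses 1 for the positive class (eg, SR) and 0 for the negative class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of SR for five baseline samples.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
print(round(roc_auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.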
Random forest models output variable importance metrics for each feature used in the model. The most informative taxa in the top predictive model were Ruminococcaceae UCG-002, Lachnospiraceae NK4A136, Bacteroides, and Parabacteroides (Fig. 4A). Although the Paris classification of disease location and behavior improved the model’s AUC, neither metric showed up in the top 30 most important features for classification. In the top MGS model, the 3 top KEGG pathways were (1) ko04910: insulin signaling, (2) ko03013: RNA transport, and (3) ko02010: ABC transporters. Interestingly, disease behavior was the second most informative feature of the MGS model (Fig. 4B).
FIGURE 4.
Most informative features from RF model for treatment response classification.
ASVs and MGS-determined Functions Significantly Model FCP Levels
Amplicon sequence variant abundance data were also input into an RF model that regressed against FCP levels to determine predictive power of individual taxa. The model was significant in regression against FCP levels (P = 0.001) and explained 17% of the variance in FCP levels. The top 20 taxa important in predicting FCP levels are shown in Figure 5A. The majority of informative taxa belonged to the phylum Firmicutes (12), followed by Proteobacteria (4), Bacteroidetes (2), and 1 taxon each from Fusobacteria and Actinobacteria. Nine taxa from the Lachnospiraceae family were in the top 30 most informative taxa, including the 2 most informative features (Lachnoclostridium sp. and Ruminococcus gnavus group). Six taxa from the Ruminococcaceae family were also in the top 30 features of the model. This FCP RF model and the treatment response model shared 11 of their 30 top ASVs (Table 2).
FIGURE 5.
Most informative features from RF model for regression of fecal calprotectin levels.
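The "variance explained" statistic reported for the FCP regression models can be read as a pseudo-R²: one minus the ratio of the mean squared prediction error to the variance of the observed values. The Python sketch below shows that calculation under the assumption that the reported figures follow this definition (as regression forests in the randomForest package typically report, based on out-of-bag predictions); the function name and the example values are hypothetical.

```python
def percent_variance_explained(observed, predicted):
    """Return 100 * (1 - MSE / variance of observed values)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    variance = sum((o - mean_obs) ** 2 for o in observed) / n
    if variance == 0:
        raise ValueError("observed values have zero variance")
    return 100.0 * (1.0 - mse / variance)


# Hypothetical FCP values (µg/g) and model predictions, not study data.
observed = [200.0, 1500.0, 3200.0, 5000.0]
predicted = [900.0, 1800.0, 2600.0, 4100.0]
print(round(percent_variance_explained(observed, predicted), 1))
```

Under this definition, a model that always predicts the mean FCP scores 0%, and a model that predicts worse than the mean yields a negative value.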
TABLE 2.
Top Features Shared Between FCP Regression and Treatment Response RF Models
| Taxa | Functions |
|---|---|
| g__Haemophilus;s__uncultured bacterium | RNA degradation (ko03018) |
| g__Bacteroides | Peptidoglycan biosynthesis (ko00550) |
| g__Bacteroides | Alanine, aspartate and glutamate metabolism (ko00250) |
| g__Faecalibacterium | Bacterial chemotaxis (ko02030) |
| g__Haemophilus;s__uncultured bacterium | Biosynthesis of vancomycin group antibiotics (ko01055) |
| g__Lachnospiraceae NK4A136 group | Thiamine metabolism (ko00730) |
| g__Lachnospiraceae NK4A136 group;s__uncultured organism | Porphyrin and chlorophyll metabolism (ko00860) |
| g__Roseburia | Protein export (ko03060) |
| g__Ruminococcaceae UCG-002;s__uncultured bacterium | Phosphonate and phosphinate metabolism (ko00440) |
| g__Ruminococcaceae UCG-002;s__uncultured organism | |
| g__Streptococcus | |
When KEGG pathway relative abundances were used in a regression model, they explained considerably less of the variance in FCP levels but resulted in a significant predictive model (P = 0.003; 4.54% variance explained). The FCP regression and treatment response RF models using MGS functions shared 9 of the top 30 features (Table 2), including the top feature in the FCP regression model (ko00440: phosphonate and phosphinate metabolism).
DISCUSSION
In this study, we highlight gut microbiome changes in response to EEN treatment and use microbial data sets as features in machine learning models to predict treatment response and FCP levels. The patients in this study show similar remission rates (79% at week 12) to previous cohorts undergoing treatment with EEN.6–8 The majority of patients who undergo EEN treatment are reported to exhibit decreased microbial alpha diversity (richness) after the completion of treatment,19 but no changes in microbial diversity were observed in our cohort after EEN (Fig. 2A). Similarly, there were no differentially abundant taxa between samples at baseline, week 12, and week 24.
Our analysis showed that samples with lower levels of FCP have significantly lower bacterial species richness compared with those with higher FCP (Fig. 1B). Our previous analysis of a subgroup of this cohort had shown that there was a paradoxical rise in diversity (from very low to low) in patients who did not achieve or sustain remission after EEN-treatment.22 In our previous OTU-based analysis, patients who achieved SR had consistently greater diversity (dropping with EEN, but still higher than those in non-SR patients); due to different time points studied and different analytical methods, we did not find this in the present ASV-analysis.
Higher FCP levels suggest histologically active disease in IBD patients, and the cut-off of FCP <250 µg/g has been shown to be predictive of endoscopic remission.42 Especially at higher FCP levels, there can be considerable intraindividual and day-to-day variation for FCP.44 A recent study of 313 IBD patients and 582 controls found no significant association with bacterial abundance and FCP levels,16 but deep MGS of 1135 Dutch study participants found that microbial species and metabolic pathways were robustly associated with calprotectin levels.45 The finding of lower bacterial richness in low FCP samples is in contrast to the general finding that high microbial diversity is associated with health.46 Although remission status as measured by wPCDAI was not associated with a decrease in microbial diversity in this cohort, we show here that a lower level of a gut inflammation marker is associated with lower microbial diversity. This is in line with previous EEN studies that have shown a decrease in diversity in patients that achieved remission22,47 and, therefore, may have had reduced levels of FCP. Additionally, we focused on the gut microbiome response to EEN at week 12, whereas others that observed a reduction in bacterial diversity assessed the microbiota at weeks 4 to 8 after starting EEN.18,48 These early changes in gut microbiome were not captured by the current study and may explain interstudy differences comparing gut microbial diversity following EEN. Furthermore, the majority of samples in the low FCP group came from patients at week 36 or later (77.8%), when most children in this study were back on a regular oral diet (sometimes with additional supplemental enteral nutrition). The lower FCP in these individuals may therefore be attributed to long-term immunomodulator and/or biologic therapy and not EEN. 
By categorizing the cohort into disease severity groups based on recommended wPCDAI cut-offs,25 we find that severe disease trends toward a lower species richness (Fig. 1C).
Fecal calprotectin and wPCDAI levels were weakly correlated in our cohort, and FCP was shown to be significantly higher in severe CD (Fig. 1A, Supplementary Fig. 1). Although these clinical markers do shed some light on disease activity in inflammation, the wPCDAI does not correlate well with endoscopic disease activity,49 and FCP has been shown not to correlate with symptom scores in CD.42 In a recent analysis of 132 adult patients, FCP and the Harvey-Bradshaw Index, another measure of CD activity, showed no significant correlation.50 In addition to intra-individual variability, a further problem complicating FCP usage as a marker of gut inflammation and disease activity is the variability between different assays used.51 The gold standard of disease monitoring and assessing mucosal healing is endoscopic examination, but this requires general anesthesia in children and is not standard-of-care in most settings, including the single center where the current study was performed. Though FCP and wPCDAI are noninvasive parameters for assessing CD activity, our findings highlight their limitations for translational microbiome research.
We found that Verrucomicrobia taxa were increased in non-SR patients at week 12 of EEN (Fig. 2B). The only cultivated gut microbial representative of the Verrucomicrobia phylum is Akkermansia muciniphila, which was not associated with SR after EEN in our cohort but was 1.6% higher in remission samples compared with severe CD at the genus level. Akkermansia muciniphila is routinely observed to be depleted in IBD patients compared with healthy controls and is suggested to play a role in maintenance of the epithelial gut lining as a mucin-degrading bacterium.52 Akkermansia muciniphila was also recently shown to be the most informative RF feature for classifying CD vs healthy controls in an analysis of biopsy samples from a Scottish pediatric cohort.53 Our finding that Verrucomicrobia was increased in non-SR patients is in contrast to previous findings showing a 32.9% greater relative abundance in adult CD remission samples compared with those with active disease.54 This increase of Verrucomicrobia in non-SR patients at week 12 and an increase of A. muciniphila in remission patients overall is paradoxical but may be due to the presence of another yet-to-be-identified Verrucomicrobia taxon.
Twenty-nine taxa were differentially abundant between wPCDAI-defined groupings of disease severity, and the genera that differed are highlighted in Figure 1D. Fusobacterium,13,15,55,56 Peptostreptococcus,55,56 and Eikenella13 have previously been implicated in CD, but this is the first time to our knowledge that Hungatella has been associated with disease activity in patients with CD. Fusobacterium nucleatum is abundant in the healthy oral microbiome57 but has been shown to increase in CD stool samples.13 A recent analysis of human full thickness colon samples found enriched levels of Fusobacterium and Peptostreptococcus in diseased segments of colon compared with adjacent healthy tissues.56 The authors suggest that these putative oral pathogens may migrate to the colon to mediate IBD pathology. The aforementioned genera were also found to be in greater abundance in Saudi Arabian children with CD,55 highlighting cross-population similarities in the pediatric CD microbiome that we confirm here.
Using a definition of deep sustained remission (DSR), where patients have both an FCP <250 µg/g and wPCDAI <12.5, we were able to highlight differences in bacterial community composition between DSR and non-DSR groups (Supplementary Fig. 2). This trend may be driven by the presence of Bacteroides sp. in the DSR samples compared with non-DSR, in keeping with our previous analysis of a subgroup of this cohort where certain species of Bacteroides (eg, B. fragilis and B. ovatus) were prevalent in SR samples, whereas other Bacteroides species were more prevalent in non-SR samples (eg, B. plebeius).22 Deep remission is defined as clinical remission and mucosal healing,58 but because we did not have follow-up endoscopy data here, we combined FCP and wPCDAI to define DSR in this small cohort. These results should be interpreted with caution, however, given the recent finding that the simple endoscopic score for pediatric CD (SES-CD) is only weakly correlated with the wPCDAI in children with newly diagnosed CD.49 It should also be noted that nonparametric permutation tests such as adonis can be biased when sample sizes are small, which is a further reason for caution.
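The DSR criterion described above can be written as a simple rule. The following is a minimal sketch that labels a single sample using the two thresholds given in the text (FCP <250 µg/g and wPCDAI <12.5). How repeated time points per patient were combined is not specified here, so the function name and the per-sample labeling are assumptions made for illustration.

```python
FCP_CUTOFF_UG_G = 250.0          # from the text: FCP < 250 µg/g
WPCDAI_REMISSION_CUTOFF = 12.5   # from the text: wPCDAI < 12.5


def meets_dsr_criteria(fcp_ug_g, wpcdai):
    """Return True when both thresholds are met for one sample.

    Hypothetical helper: it applies the two cut-offs to a single
    measurement pair and does not model follow-up over time.
    """
    return fcp_ug_g < FCP_CUTOFF_UG_G and wpcdai < WPCDAI_REMISSION_CUTOFF


# Hypothetical values for illustration only
print(meets_dsr_criteria(fcp_ug_g=100.0, wpcdai=5.0))    # both criteria met
print(meets_dsr_criteria(fcp_ug_g=300.0, wpcdai=5.0))    # FCP too high
print(meets_dsr_criteria(fcp_ug_g=100.0, wpcdai=12.5))   # wPCDAI not below cut-off
```

Both comparisons are strict, so a value exactly at a cut-off does not count as remission in this sketch.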
Using machine learning RF models, we found that the addition of clinical metadata improves SR vs non-SR predictive value of microbiome data. Specifically, the addition of Paris disease behavior and location to ASV abundances and alpha-diversity values improved the AUC of the RF model to 0.9. Although Paris classification improved the predictive value of the model, it was not included in the top 30 important features while alpha-diversity was (Fig. 4A). Despite this, Paris disease behavior was the second most informative feature for SR classification when used in a model with KEGG pathways from MGS data (Fig. 4B). Here, we show that using clinical observations together with gut microbial information improves prediction of response to EEN. Validation of RF models in larger cohorts undergoing treatment may solidify their usage as a baseline diagnostic tool in pediatric CD.
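To make the reported AUC values easier to interpret, the sketch below computes an AUC directly from class labels and model scores. It uses the rank-based definition: the AUC is the probability that a randomly chosen positive sample (here, SR) receives a higher score than a randomly chosen negative sample (non-SR), with ties counted as one half. An AUC of 0.9 therefore corresponds to the correct ordering of about 90% of such pairs. The function name, labels, and scores are hypothetical and do not reproduce the study's RF output.

```python
def roc_auc(labels, scores):
    """Compute the AUC as the fraction of positive/negative pairs ranked correctly.

    labels: iterable of 1 (positive, e.g. SR) or 0 (negative, e.g. non-SR)
    scores: model scores, where a higher score means "more likely positive"
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    correctly_ranked = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correctly_ranked += 1.0
            elif pos_score == neg_score:
                correctly_ranked += 0.5  # ties count as half
    return correctly_ranked / (len(positives) * len(negatives))


# Hypothetical toy example: 3 SR and 2 non-SR samples
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2]
auc = roc_auc(toy_labels, toy_scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a cohort of this size; for an RF model, the scores would ideally be out-of-bag or cross-validated class probabilities rather than predictions on the training data.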
Although there were limited taxonomic differences between high and low FCP samples, RF modeling with ASV abundances explained 17% of the variance in FCP levels (Fig. 5). The second most important taxonomic feature in the RF model was the relative abundance of Ruminococcus gnavus. Although R. gnavus is among the 57 most common species of the gut microbiome, which are present in >90% of people,59 an increased abundance of this bacterium has been reported in several cohorts of CD patients.60,61 A twin study of IBD also showed the disappearance of Faecalibacterium and Roseburia and increased abundance of Enterobacteriaceae and R. gnavus, which are all in the top 20 most informative taxa of our RF model.60 Faecalibacterium prausnitzii is widely accepted to be a healthy member of the gut microbiome, and its reduction has been associated with IBD. Oral administration of F. prausnitzii has been shown to attenuate the severity of murine colitis, partly attributed to its metabolites blocking NF-κB activation and downstream pro-inflammatory cytokine production.62
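The "percentage of variance explained" by a regression model can be illustrated with a short sketch. It assumes the usual definition, 100 × (1 − residual sum of squares / total sum of squares); for RF regression this quantity is typically computed from out-of-bag predictions, but the exact computation used for Figure 5 is not restated here. The function name and the toy FCP values are hypothetical.

```python
def percent_variance_explained(observed, predicted):
    """Return 100 * (1 - SS_residual / SS_total) for paired observations.

    The value can be negative when predictions are worse than always
    predicting the mean of the observed values.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_observed = sum(observed) / n
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values have zero variance")

    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 100.0 * (1.0 - ss_residual / ss_total)


# Hypothetical FCP values (µg/g) and model predictions, for illustration only
toy_observed = [100.0, 200.0, 300.0, 400.0]
toy_predicted = [150.0, 200.0, 250.0, 400.0]
explained = percent_variance_explained(toy_observed, toy_predicted)
print(explained)
```

Under this definition, a value of 17% indicates that most of the variation in FCP is not captured by ASV abundances alone.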
This study is a follow-up to a preliminary report on a subset of the MAREEN cohort22 that showed reduced diversity after EEN and association of SR status with Akkermansia muciniphila, Bacteroides sp., Lachnospiraceae, and Ruminococcaceae. We found Bacteroides, Lachnospiraceae, and Ruminococcaceae to be top features in RF models for treatment response and FCP prediction (Table 2).
In addition to the limitations described previously with regard to the relatively poor correlation of real-life clinical course with single-timepoint disease activity, FCP measurements, and endoscopy, the current analysis differs in a number of crucial aspects from our preliminary report, which makes direct comparison of the results difficult. Herein, we used amplicon sequence variants, a technique that has improved resolution and does not require a reference database,63 rather than the previous technique of operational taxonomic unit (OTU) construction. We also employed sequencing of the V4-V5 variable region, whereas the preliminary MAREEN analysis used the V6-V8 region. The choice of variable region to target for sequencing can affect taxonomic resolution and results, and should be considered when comparing microbiome results across studies.64
CONCLUSIONS
In conclusion, we leveraged machine learning and standard microbiome analysis techniques to characterize the gut microbiome in children with CD undergoing treatment with EEN. We found reduced overall microbial diversity in samples with lower fecal calprotectin levels. In contrast to several reports of decreased microbial alpha-diversity in pediatric CD patients after EEN,19 we observed that bacterial alpha-diversity based on ASVs is not altered by EEN, irrespective of treatment response. Several taxa were increased in non-SR patients at week 12, notably those of the phylum Verrucomicrobia. Using wPCDAI scores to stratify patients by disease severity, we found several taxa to be differentially abundant, especially between remission and severe CD samples. Several of these taxa have been previously associated with IBD, whereas some were novel associations, such as Hungatella. Using baseline microbial data and clinical observations, we then built machine learning models to accurately predict remission with EEN treatment. We found that microbial taxonomic data, alpha-diversity, and Paris classifications of disease behavior and location produced the most informative predictive model. Functions obtained from metagenomic sequencing were not significantly informative on their own but made a significant contribution to the model in conjunction with the same additional metadata. These results suggest that using patient metadata, including disease classification and clinical biomarkers, in addition to gut microbial information may improve future diagnostics and should be validated in larger cohorts to reveal a more robust interplay between longitudinal disease activity and gut microbial composition. Overall, we demonstrated that machine learning models can be employed to predict treatment response to EEN.
Further studies are needed to validate the use of machine learning to identify the subgroup of patients most likely to respond to dietary intervention and to optimize nutritional interventions for maintenance of remission.
Supplementary Material
ACKNOWLEDGMENTS
The authors would like to thank the team of pediatric GI nurses, dietitians (Jennifer Haskett, Cynthia King-Moore, and Lisa Parkinson-McGraw), and all participating families for their help in completing the MAREEN study. Many thanks go to other members of the GI research team who contributed to patient recruitment, sample collection, and chart reviews (Brad MacIntyre, Amy Postmaa, and Amber MacLellan). Thank you to Scott Whitehouse for processing a portion of the stool samples for 16S sequencing.
Supported by: JVL was supported by a Canadian Institutes of Health Research (CIHR)-Canadian Association of Gastroenterology-Crohn’s Colitis Canada New Investigator Award (2015–2019), a Canada Research Chair Tier 2 in Translational Microbiomics (2018–2019), and a Canadian Foundation of Innovation John R. Evans Leadership fund (awards #35235 and #36764), a Nova Scotia Health Research Foundation (NSHRF) establishment award (2015–2019), an IWK Health Centre Research Associateship, a Future Leaders in IBD project grant, a donation from the MacLeod family, a CIHR-SPOR-Chronic Diseases grant (Inflammation, Microbiome, and Alimentation: Gastro-Intestinal and Neuropsychiatric Effects: the IMAGINE-SPOR chronic disease network), an American Gastroenterology Association Pfizer Young Investigator Pilot Research Award in Inflammatory Bowel Disease (2018), and a Weston Foundation Postbiotics grant (2019–2020). The MAREEN study was made possible through a NASPGHAN/CCFA Young Investigator award 2013–2015 for JVL.
Conflicts of interest: JVL reports consulting, travel, and/or speaker fees and research support from AbbVie, Janssen, Nestlé Health Science, Novalac, Pfizer, Merck, P&G, GSK, Illumina, and Otsuka.
REFERENCES
- 1. Van Limbergen J, Russell RK, Drummond HE, et al. Definition of phenotypic characteristics of childhood-onset inflammatory bowel disease. Gastroenterology. 2008;135:1114–1122.
- 2. Sýkora J, Pomahačová R, Kreslová M, et al. Current global trends in the incidence of pediatric-onset inflammatory bowel disease. World J Gastroenterol. 2018;24:2741–2763.
- 3. Benchimol EI, Bernstein CN, Bitton A, et al. Trends in epidemiology of pediatric inflammatory bowel disease in Canada: distributed network analysis of multiple population-based provincial health administrative databases. Am J Gastroenterol. 2017;112:1120–1134.
- 4. Sartor RB, Wu GD. Roles for intestinal bacteria, viruses, and fungi in pathogenesis of inflammatory bowel diseases and therapeutic approaches. Gastroenterology. 2017;152:327–339.e4.
- 5. Ramos GP, Papadakis KA. Mechanisms of disease: inflammatory bowel diseases. Mayo Clin Proc. 2019;94:155–165.
- 6. Buchanan E, Gaunt WW, Cardigan T, et al. The use of exclusive enteral nutrition for induction of remission in children with Crohn’s disease demonstrates that disease phenotype does not influence clinical remission. Aliment Pharmacol Ther. 2009;30:501–507.
- 7. Connors J, Basseri S, Grant A, et al. Exclusive enteral nutrition therapy in pediatric Crohn’s disease results in long-term avoidance of corticosteroids: results of a propensity-score matched cohort analysis. J Crohns Colitis. 2017;11:1063–1070.
- 8. Grover Z, Muir R, Lewindon P. Exclusive enteral nutrition induces early clinical, mucosal and transmural remission in paediatric Crohn’s disease. J Gastroenterol. 2014;49:638–645.
- 9. Yu YR, Rodriguez JR. Clinical presentation of Crohn’s, ulcerative colitis, and indeterminate colitis: symptoms, extraintestinal manifestations, and disease phenotypes. Semin Pediatr Surg. 2017;26:349–355.
- 10. Ruemmele FM, Veres G, Kolho KL, et al.; European Crohn’s and Colitis Organisation; European Society of Pediatric Gastroenterology, Hepatology and Nutrition. Consensus guidelines of ECCO/ESPGHAN on the medical management of pediatric Crohn’s disease. J Crohns Colitis. 2014;8:1179–1207.
- 11. Levine A, Wine E, Assa A, et al. Crohn’s disease exclusion diet plus partial enteral nutrition induces sustained remission in a randomized controlled trial. Gastroenterology. 2019;157:440–450.e8.
- 12. Sabino J, Lewis JD, Colombel JF. Treating inflammatory bowel disease with diet: a taste test. Gastroenterology. 2019;157:295–297.
- 13. Gevers D, Kugathasan S, Denson LA, et al. The treatment-naive microbiome in new-onset Crohn’s disease. Cell Host Microbe. 2014;15:382–392.
- 14. Hansen R, Russell RK, Reiff C, et al. Microbiota of de-novo pediatric IBD: increased Faecalibacterium prausnitzii and reduced bacterial diversity in Crohn’s but not in ulcerative colitis. Am J Gastroenterol. 2012;107:1913–1922.
- 15. Pascal V, Pozuelo M, Borruel N, et al. A microbial signature for Crohn’s disease. Gut. 2017;66:813–822.
- 16. Imhann F, Vich Vila A, Bonder MJ, et al. Interplay of host genetics and gut microbiota underlying the onset and clinical presentation of inflammatory bowel disease. Gut. 2018;67:108–119.
- 17. Svolos V, Hansen R, Nichols B, et al. Treatment of active Crohn’s disease with an ordinary food-based diet that replicates exclusive enteral nutrition. Gastroenterology. 2019;156:1354–1367.e6.
- 18. Quince C, Ijaz UZ, Loman N, et al. Extensive modulation of the fecal metagenome in children with Crohn’s disease during exclusive enteral nutrition. Am J Gastroenterol. 2015;110:1718–1729; quiz 1730.
- 19. MacLellan A, Connors J, Grant S, et al. The impact of exclusive enteral nutrition (EEN) on the gut microbiome in Crohn’s disease: a review. Nutrients. 2017;9:0447.
- 20. Fell JM, Paintin M, Arnaud-Battandier F, et al. Mucosal healing and a fall in mucosal pro-inflammatory cytokine mRNA induced by a specific oral polymeric diet in paediatric Crohn’s disease. Aliment Pharmacol Ther. 2000;14:281–289.
- 21. Zubin G, Peter L. Predicting endoscopic Crohn’s disease activity before and after induction therapy in children: a comprehensive assessment of PCDAI, CRP, and fecal calprotectin. Inflamm Bowel Dis. 2015;21:1386–1391.
- 22. Dunn KA, Moore-Connors J, MacIntyre B, et al. Early changes in microbial community structure are associated with sustained remission after nutritional treatment of pediatric Crohn’s disease. Inflamm Bowel Dis. 2016;22:2853–2862.
- 23. Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)-A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42:377–381.
- 24. Harris PA, Taylor R, Minor BL, et al.; REDCap Consortium. The REDCap consortium: building an international community of software platform partners. J Biomed Inform. 2019;95:103208.
- 25. Turner D, Griffiths AM, Walters TD, et al. Mathematical weighting of the pediatric Crohn’s disease activity index (PCDAI) and comparison with its other short versions. Inflamm Bowel Dis. 2012;18:55–62.
- 26. Levine A, Griffiths A, Markowitz J, et al. Pediatric modification of the Montreal classification for inflammatory bowel disease: the Paris classification. Inflamm Bowel Dis. 2011;17:1314–1321.
- 27. Comeau AM, Douglas GM, Langille MGI. Microbiome helper: a custom and streamlined workflow for microbiome research. mSystems. 2017;2:e00127-16.
- 28. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.J. 2011;17:10.
- 29. Bolyen E, Rideout JR, Dillon MR, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol. 2019;37:852–857.
- 30. Rognes T, Flouri T, Nichols B, et al. VSEARCH: a versatile open source tool for metagenomics. PeerJ. 2016;4:e2584.
- 31. Amir A, McDonald D, Navas-Molina JA, et al. Deblur rapidly resolves single-nucleotide community sequence patterns. mSystems. 2017;2:e00191-16.
- 32. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–780.
- 33. Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;26:1641–1650.
- 34. Quast C, Pruesse E, Yilmaz P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:D590–D596.
- 35. Parks DH, Tyson GW, Hugenholtz P, et al. STAMP: statistical analysis of taxonomic and functional profiles. Bioinformatics. 2014;30:3123–3124.
- 36. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120.
- 37. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359.
- 38. Franzosa EA, McIver LJ, Rahnavard G, et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat Methods. 2018;15:962–968.
- 39. Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2:18–22.
- 40. Murphy MA, Evans JS, Storfer A. Quantifying Bufo boreas connectivity in Yellowstone National Park with landscape genetics. Ecology. 2010;91:252–261.
- 41. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28:1–26.
- 42. D’Haens G, Ferrante M, Vermeire S, et al. Fecal calprotectin is a surrogate marker for endoscopic lesions in inflammatory bowel disease. Inflamm Bowel Dis. 2012;18:2218–2224.
- 43. Lozupone C, Lladser ME, Knights D, et al. UniFrac: an effective distance metric for microbial community comparison. ISME J. 2011;5:169–172.
- 44. Cremer A, Ku J, Amininejad L, et al. Variability of faecal calprotectin in inflammatory bowel disease patients: an observational case-control study. J Crohns Colitis. 2019;13:1372–1379.
- 45. Zhernakova A, Kurilshikov A, Bonder MJ, et al.; LifeLines cohort study. Population-based metagenomics analysis reveals markers for gut microbiome composition and diversity. Science. 2016;352:565–569.
- 46. Lozupone CA, Stombaugh JI, Gordon JI, et al. Diversity, stability and resilience of the human gut microbiota. Nature. 2012;489:220–230.
- 47. Kaakoush NO, Day AS, Leach ST, et al. Effect of exclusive enteral nutrition on the microbiota of children with newly diagnosed Crohn’s disease. Clin Transl Gastroenterol. 2015;6:e71.
- 48. Grover Z, Kang A, Morrison M, et al. 633 The relative abundances of Dorea and Faecalibacterium spp. in the mucosa associated microbiome of newly diagnosed children with Crohn’s disease are differentially affected by exclusive enteral nutrition. Gastroenterology. 2016;150:S132–S133.
- 49. Carman N, Tomalty D, Church PC, et al.; Canadian Children Inflammatory Bowel Disease Network: A Joint Partnership of Canadian Institutes of Health Research and the Children with Intestinal and Liver Disorders Foundation. Clinical disease activity and endoscopic severity correlate poorly in children newly diagnosed with Crohn’s disease. Gastrointest Endosc. 2019;89:364–372.
- 50. Lloyd-Price J, Arze C, Ananthakrishnan AN, et al.; IBDMDB Investigators. Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases. Nature. 2019;569:655–662.
- 51. Labaere D, Smismans A, Van Olmen A, et al. Comparison of six different calprotectin assays for the assessment of inflammatory bowel disease. United European Gastroenterol J. 2014;2:30–37.
- 52. Derrien M, Belzer C, de Vos WM. Akkermansia muciniphila and its role in regulating host functions. Microb Pathog. 2017;106:171–181.
- 53. Douglas GM, Hansen R, Jones CMA, et al. Multi-omics differentially classify disease state and treatment outcome in pediatric Crohn’s disease. Microbiome. 2018;6:13.
- 54. Tedjo DI, Smolinska A, Savelkoul PH, et al. The fecal microbiota as a biomarker for disease activity in Crohn’s disease. Sci Rep. 2016;6:35216.
- 55. El Mouzan MI, Winter HS, Assiri AA, et al. Microbiota profile in new-onset pediatric Crohn’s disease: data from a non-Western population. Gut Pathog. 2018;10:1–10.
- 56. Dinakaran V, Mandape SN, Shuba K, et al. Identification of specific oral and gut pathogens in full thickness colon of colitis patients: implications for colon motility. Front Microbiol. 2019;10:1–23.
- 57. Segata N, Haake SK, Mannon P, et al. Composition of the adult digestive tract bacterial microbiome based on seven mouth surfaces, tonsils, throat and stool samples. Genome Biol. 2012;13:R42.
- 58. Mack DR, Benchimol EI, Critch J, et al. Canadian Association of Gastroenterology clinical practice guideline for the medical management of pediatric luminal Crohn’s disease. J Can Assoc Gastroenterol. 2019;2:e35–e63.
- 59. Qin J, Li R, Raes J, et al.; MetaHIT Consortium. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464:59–65.
- 60. Willing BP, Dicksved J, Halfvarson J, et al. A pyrosequencing study in twins shows that gastrointestinal microbial profiles vary with inflammatory bowel disease phenotypes. Gastroenterology. 2010;139:1844–1854.e1.
- 61. Joossens M, Huys G, Cnockaert M, et al. Dysbiosis of the faecal microbiota in patients with Crohn’s disease and their unaffected relatives. Gut. 2011;60:631–637.
- 62. Sokol H, Pigneur B, Watterlot L, et al. Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut microbiota analysis of Crohn disease patients. Proc Natl Acad Sci U S A. 2008;105:16731–16736.
- 63. Callahan BJ, McMurdie PJ, Holmes SP. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 2017;11:2639–2643.
- 64. Chakravorty S, Helb D, Burday M, et al. A detailed analysis of 16S ribosomal RNA gene segments for the diagnosis of pathogenic bacteria. J Microbiol Methods. 2007;69:330–339.