Abstract
Background and Aims:
Irritable bowel syndrome (IBS), inflammatory bowel disease (IBD), and celiac disease (CeD) present with similar gastrointestinal (GI) symptoms. DNA methylation-based biomarkers have not been investigated as diagnostic biomarkers to classify these disorders. We aimed to study DNA methylation profiles of IBS, IBD, CeD and healthy controls (HC), develop machine learning-based classifiers, and identify associated gene ontology (GO) terms.
Methods:
Genome-wide DNA methylation of peripheral blood mononuclear cells from 315 patients with IBS, IBD, CeD, and HC was measured using Illumina’s 450K or EPIC arrays. A methylation dataset on 304 IBD and HC samples was used for external validation. Differential methylation was measured using general linear models. Classifiers were developed using penalized generalized linear models using double cross-validation controlling for confounders. Functional enrichment was assessed using GO.
Results:
A total of 315 participants (148 IBS, 47 IBD, 34 CeD, and 86 HC) had DNA methylation data. IBS-IBD and IBD-CeD showed the highest number of differentially methylated CpG sites followed by IBD-HC, CeD-HC and IBS-HC. IBS-associated genes were enriched in cell-adhesion and neuronal pathways, while IBD and CeD-associated markers were enriched in inflammation and MHC class II pathways, respectively (p<0.05). Classification performance assessed using area under the receiver operating characteristic curves (AUC) for IBS-IBD, IBS-CeD, and IBD-CeD was 0.80 (95%CI=0.7-0.87, p=6.75E-10), 0.78 (95%CI=0.68-0.86, p=4.57E-10) and 0.73 (95%CI=0.62-0.83, p=0.03), respectively. The performance of IBD-HC was successfully validated using external data [AUC=0.74 (95%CI=0.68-0.80, p<0.001)].
Conclusions:
Blood-based DNA methylation biomarkers can potentially distinguish GI disorders. GO suggested functional significance of the classifiers in disease-specific pathology.
Graphical Abstract

Comprehensive genome-wide DNA methylation profiling of blood cells in irritable bowel syndrome, inflammatory bowel disease, celiac disease, and healthy controls identifies potential biomarkers associated with disease-specific pathways, and shows promise as a biomarker for differential diagnosis of gastrointestinal disorders.
1. INTRODUCTION
Irritable bowel syndrome (IBS) is a highly prevalent stress-sensitive, chronic gastrointestinal (GI) disorder characterized by chronic abdominal pain associated with diarrhea and/or constipation. IBS has a worldwide prevalence of 4.5-11%1 and is associated with a significant healthcare and economic burden2. Currently, no valid test can reliably diagnose IBS, so symptom-based criteria are used. Most IBS patients have seen at least three physicians and undergone multiple expensive and invasive tests before a diagnosis of IBS, as IBS is often erroneously considered a diagnosis of exclusion3, 4. Other GI conditions, such as inflammatory bowel diseases (IBD), celiac disease (CeD) and even GI malignancy, can present with similar GI symptoms but are diagnosed by confirmatory tissue histopathology and treated differently.
IBD are chronic, immune-mediated conditions manifesting as intestinal mucosal inflammation. IBD affect approximately 3 million Americans, evenly divided between Crohn’s disease and ulcerative colitis (UC).5, 6 In general, the diagnosis of IBD is made by gastroenterologists and requires endoscopic and histological confirmation. The time between the onset of symptoms and referral to a specialist can vary substantially. One of the major sources of delay in diagnosis is treatment of the patient under the assumption that they have IBS rather than IBD.7 About 10% of IBD patients are misdiagnosed as IBS.7 Delays in diagnosis are associated with worse outcomes, including increased incidence of complications and the need for early surgery8–10.
CeD is a gluten-induced, immune-mediated enteropathy, with an estimated worldwide prevalence of approximately 1%11–14. CeD can present with variable manifestations and may also be misdiagnosed as IBS15. GI society guidelines recommend serological screening for CeD in patients with IBS, chronic diarrhea, and anemia, for instance. However, up to 5% of patients are seronegative16. Additionally, some patients self-initiate a gluten-free diet (GFD), thus affecting the yield of CeD serology testing17, 18. Substantial delays in diagnosing CeD even up to 13 years can occur in clinical practice13, 19, 20. Delays in diagnosis and treatment of CeD can lead to long-term complications such as osteoporosis, infertility, anemia, and intestinal lymphoma and increased health care utilization and pharmacotherapy21, 22.
Thus, there is a significant unmet need for an objective biomarker-based test that can distinguish IBS, IBD and CeD, allow an earlier diagnosis and appropriate treatment plan, and reduce unnecessary medical costs and tests (e.g., computed tomography [CT] scans, abdominal ultrasounds, stool studies, breath tests, repeat endoscopies). The importance of a diagnostic test for patients was supported by a recent large-scale survey of individuals with IBS who reported that a fast and accurate diagnostic test for IBS is one of the top ten research priorities23. Such a discovery has the transformative potential of shifting the paradigm of diagnosing IBS, IBD and CeD.
Epigenetic modifications including DNA methylation are mechanisms that regulate gene expression and higher-order DNA structure. DNA methylation has emerged as a leading mechanism linking gene-environment interactions to long-term behavioral development, particularly in complex disorders24, 25. In normal mammalian somatic genomes, DNA methylation mainly occurs at cytosines in a CpG dinucleotide context. CpG methylation is generally absent from short stretches of CpG-rich sequences known as CpG islands (CGIs), which typically occur at or near the transcription start site of genes26. Hypermethylation of CGI promoters is tightly linked with transcriptional repression of the affected gene and therefore has been viewed as an epimutation causing the silencing of a gene. In contrast, recent studies show that gene body methylation is positively correlated with gene expression and can be a potential therapeutic target27.
Aberrant DNA methylation has been associated with a variety of cancers and non-cancerous disorders, including psychiatric and neurodegenerative disorders. Our previous study on DNA methylation and targeted bisulphite sequencing in peripheral blood mononuclear cells (PBMCs) identified changes in DNA methylation in IBS patients compared to healthy controls (HC)28. This study highlighted a role for neuronal and oxidative stress pathways in the pathophysiology of IBS. DNA methylation changes in blood have been associated with inflammation in patients with IBD29; however, there is a lack of blood-based DNA methylation studies in CeD. Additionally, there are no studies investigating DNA methylation as a biomarker for the differential diagnosis of IBS from other diseases that mimic IBS.
Therefore, we hypothesized that DNA methylation marks in PBMCs can serve as a biomarker for diagnosing IBS and for distinguishing IBS from IBD and CeD. Similarly, DNA methylation-based biomarkers may be used for non-invasive diagnoses of IBD and CeD. The aims of this study were to 1) study the differences in genome-wide methylation profiles of IBS, IBD, CeD, and HC; 2) develop DNA methylation-based classifiers to discriminate IBS, IBD, CeD, and HC; and 3) investigate gene ontology (GO) terms and pathways associated with the classifiers.
2. MATERIALS AND METHODS
2.1. Study population
Male and female participants with IBS, IBD, and CeD, and HC ages 18-55 were recruited by community advertisement or at GI clinics at the University of California, Los Angeles (UCLA). IBS and HC samples were collected between 2009 and 2020 (analyzed on 450K or EPIC platforms). IBD and CeD samples were collected between 2012 and 2020. In addition to the UCLA cohort, we included banked PBMC samples from patients with IBD obtained from the Crohn’s and Colitis Foundation (CCF) Study of a Prospective Adult Research Cohort with IBD (SPARC IBD) with identical inclusion/exclusion criteria. The SPARC IBD cohort has been previously described30. The diagnosis of IBS and bowel habit subtypes were based on Rome III31 or IV32 criteria depending on the time of recruitment and confirmed by a clinician with expertise in IBS. HC had no personal or family history of IBS or other chronic pain conditions. Additional exclusion criteria for IBS and HC included infectious or inflammatory disorders, active psychiatric illness over the past six months assessed by structured clinical interview for the DSM-IV (MINI)30, use of corticosteroids in the past six months, use of narcotics in the past two months, and alcohol abuse. Questionnaires administered to the participants are described in detail in the Supplementary Methods. The Bowel Symptom Questionnaire (BSQ), Hospital Anxiety and Depression Scale (HAD)33, and Adverse Childhood Experiences (ACE)34 questionnaires were administered to all UCLA patients in this study including IBS, IBD and CeD. Additional questionnaires administered to IBS patients included Irritable Bowel Syndrome Severity Scoring System (IBS-SSS)35.
The diagnosis of IBD was confirmed by endoscopy with pathologic tissue confirmation. Since treatments such as inflammation-reducing drugs used in IBD have been associated with changes in DNA methylation36, we recruited patients who were treatment naïve or currently on no IBD treatment including biologic agents (including ustekinumab, risankizumab or vedolizumab) or other agents (tofacitinib, upadacitinib, ozanimod, 6-mercaptopurine, azathioprine, methotrexate, or 5-aminosalicylic acid [5-ASA]). Additional exclusion criteria included UC or Crohn’s disease treated surgically without evidence of subsequent disease and history of coexistent IBS or CeD. Patients with IBD reported active GI symptoms (e.g., abdominal pain, bloating, diarrhea, and/or blood in stool). Disease activity was assessed with the following instruments at the time of sample collection: Simple Endoscopic Score for Crohn’s Disease37 (SES-CD, UCLA and SPARC IBD cohorts), Crohn’s Disease Activity Index38 (CDAI, UCLA cohort), Short-CDAI39 (SCDAI, SPARC IBD cohort), Simple Clinical Colitis Activity Index40 (SCCAI, UCLA cohort), and Ulcerative Colitis Disease Activity Index41 (UCDAI, SPARC IBD cohort) as described in the Supplementary Methods.
Patients with CeD reported active GI symptoms (e.g., abdominal pain, bloating, diarrhea) and had their diagnosis confirmed by the presence of Marsh II-III lesions on duodenal biopsies and positive serology [anti-tissue transglutaminase (tTG) and/or anti-endomysial (EMA) antibodies]. Patients who had been adherent to a GFD for > 2 weeks, or who had a history of co-existent IBD, IBS, other causes of malabsorption, or any other medical condition that could explain their GI symptoms, were excluded. Questionnaires administered to CeD patients included the validated Celiac Symptom Index (CSI)42 and Celiac Dietary Adherence Test (CDAT)43.
All study participants who had a current or past history of > 1/2 pack per day of cigarettes were excluded. A small percentage of participants who were former smokers (one Crohn’s disease (4%) and three CeD patients (9%)) were included in the study. We recorded the use of medications including selective serotonin reuptake inhibitors (SSRIs) and serotonin-norepinephrine reuptake inhibitors (SNRIs), tricyclic anti-depressants (TCAs) or benzodiazepines, statins, and beta blockers (Supplementary Table 1), which may affect DNA methylation44. None of the HC used these medications. Participants recruited at UCLA were compensated. The study was approved by the UCLA Institutional Review Board, and all subjects signed a written informed consent prior to the study.
2.2. Statistical and bioinformatic analyses
2.2.1. Clinical and demographic data
Group differences in demographic characteristics including age, sex, body mass index (BMI), race, ethnicity and smoking status were assessed using t-tests, analysis of variance (ANOVA) or Fisher’s tests. Summary statistics were created for disease activity scores for IBS, IBD and CeD.
2.2.2. DNA methylation data processing
The methods used to process DNA methylation data and implement machine learning algorithms are presented in detail in Supplementary Methods. In short, raw Illumina DNA methylation array data (IDAT) files generated by the Illumina iScan scanner were processed using the ENmix DNA methylation analysis pipeline45.
DNA methylation data pre-processing:
Quality control (QC) metrics were generated and samples that did not pass the QC threshold were eliminated. The sample numbers reported throughout the study are those that passed the QC threshold. Background correction and dye bias correction were applied to the data and the resulting signal intensities were normalized using the ‘quantile normalization’ method. Cell types were estimated as described previously46. QC information-based filtering was implemented to filter unwanted probes out of the 855,790 CpGs on the array.
Additionally, CpGs on X and Y chromosomes, probes overlapping single nucleotide polymorphisms (SNPs) and repeats, non-specific or cross-reactive probes, and probes showing low variability and extreme methylation values were removed. The remaining ~200,000 CpG sites were used as input for differential methylation analysis and classifier development. All analyses were performed using the R programming language (https://cran.r-project.org/). Additional details of the analyses are presented in Supplementary Methods.
Differential Methylation:
Group differences in methylation at CpG sites, also known as differentially methylated positions (DMPs), between IBS, IBD, CeD, and HC were analyzed with general linear models using the ‘limma’ package in R47, with age as a covariate. In a separate analysis that included additional covariates, we tested differential methylation between disease and healthy control groups in models controlling for age, array batch, sex, and proportions of various cell types including CD4 T-cells, CD8 T-cells, neutrophils, monocytes, and natural killer cells. P-values were adjusted for multiple tests using false discovery rate (FDR). DMPs were visualized using Manhattan plots, generated using the ‘gap’ package in R48. Differentially methylated regions (DMRs) were identified using the ‘DMRcate’ R package49. An FDR<0.05 was considered significant for differences in DMPs between diagnostic groups. However, in cases where we did not find any significant CpGs, trends for association (p<0.05) were considered. DMRs were defined as regions having at least three DMPs within a 500 bp window50. Significance was based on the harmonic mean of the individual component FDRs (HMFDR) < 0.05, and trends (p<0.05) were reported for comparisons with HMFDR >0.05.
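As an illustration of the multiple-testing step, the following minimal sketch adjusts a handful of p-values and summarizes a region by the harmonic mean of its FDRs. It assumes the Benjamini-Hochberg procedure, which the text does not name explicitly, and the p-values are hypothetical rather than study data.

```python
from statistics import harmonic_mean

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five CpG sites (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in fdr]

# Harmonic mean of the FDRs of three sites in a hypothetical region (HMFDR).
hmfdr = harmonic_mean(fdr[:3])
```

In this toy example only the first two sites pass FDR<0.05, although four of the five raw p-values are below 0.05.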
Correlation between methylation levels at DMPs and disease symptom severity scores:
We tested potential relationships between methylation levels of disease-associated CpGs and severity scores in IBS, IBD, and CeD patients using general linear models. Within UC patients, we tested group differences in methylation levels of IBD vs healthy control-associated DMPs between high and low UCDAI disease severity scores. Within CD patients, we tested an association between the methylation levels of DMPs and SCDAI score. Within CeD patients, we tested associations between CeD vs healthy control-associated sites and severity index scores (CSI). FDR p<0.05 was considered significant.
Gene ontology (GO) analysis:
Enrichment of GO terms and/or pathways associated with DMPs and ML-based classifiers for each comparison was assessed using the ‘missMethyl’ package in R51. Using missMethyl, we mapped the significant/differentially methylated CpG sites to Entrez Gene IDs and tested for GO term (including “BP” - biological process, “CC” - cellular component, and “MF” - molecular function) or KEGG pathway enrichment using Wallenius’ noncentral hypergeometric test. This method takes into account the number of CpG sites per gene on the 450K/EPIC array and multi-gene annotated CpGs51. An FDR<0.05 was considered significant; otherwise, trends (p<0.05) were reported. For analyses of classifiers, we constructed a model with all samples and analyzed the associated GO terms.
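The idea behind the enrichment test can be sketched with an ordinary (central) hypergeometric tail probability. This is a deliberate simplification: it ignores the correction for the number of CpG sites per gene that the noncentral test described above provides, and all counts below are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_in_term, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without
    replacement from n_universe genes, n_in_term of which carry the term."""
    upper = min(n_in_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_universe - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 1000 annotated genes, 50 in the GO term,
# 40 genes mapped from DMPs, 8 of which carry the term.
p_enrich = hypergeom_enrichment_p(1000, 50, 40, 8)
```

With an expected overlap of only two genes under the null, an observed overlap of eight yields a small p-value. A real analysis would additionally weight genes by their probe counts.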
Classifier development and evaluation:
Using the genome-wide DNA methylation data, we developed machine learning (ML)-based classifiers to test their performance as diagnostic biomarkers in classifying the GI diseases and HC. Generalized linear models (GLM) were fit via penalized maximum likelihood using the ‘glmnet’ package, considering diagnosis as outcome, normalized and filtered DNA methylation probes (CpG sites) as predictors, and age, cell-type proportions, and technical batch as fixed covariates. These covariates were selected on the basis of their correlation with the principal components derived from the methylation data, as is standard in analyses of methylation44. Of the variables tested, age, cell-type proportions, and the methylation batch variables were found to be associated with the methylation-based principal components (Supplementary Figure 1). Race/ethnicity was not used as a covariate due to missing race/ethnicity data on some participants, specifically in the IBD and CeD groups (Table 1). We used a double cross-validation method52 to evaluate and test machine learning models, which is a preferred method since the models are trained and tested on independent datasets. The double cross-validation process includes two nested cross-validation loops referred to as outer (10% held-out set) and inner loops (90% training data). The training dataset was further sub-divided into tuning/calibration (90% data) and validation sets (10%), which form the inner loop. The calibration set was used for model building (hyperparameter tuning) and the validation set was used to estimate the errors. The model with the lowest prediction error within the inner loop was selected as the best model. This model was then applied to the held-out test dataset and the class labels were predicted. Multiple splits (k=10) of inner and outer datasets were run in order to avoid the bias with respect to variable selection resulting from use of a single training set (Supplementary Figure 2).
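The nested structure of double cross-validation can be sketched as follows. This is an illustrative outline only: a toy one-dimensional k-nearest-neighbour classifier stands in for the penalized regression model, the number of neighbours stands in for the tuned hyperparameter, the selected setting is refit on the full training portion before predicting the held-out fold, and the data are simulated. All function names are hypothetical.

```python
import random

def k_folds(indices, k, rng):
    """Shuffle indices and deal them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Toy 1-D k-nearest-neighbour classifier (stand-in for the real model)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def error_rate(train, test, k):
    """Fraction of test samples misclassified by the toy model."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(test)

def double_cross_validation(data, grid, n_outer=10, n_inner=10, seed=0):
    """Return one (true_label, predicted_label) pair per sample."""
    rng = random.Random(seed)
    all_idx = list(range(len(data)))
    predictions = []
    for held_out in k_folds(all_idx, n_outer, rng):
        held = set(held_out)
        train_idx = [i for i in all_idx if i not in held]
        inner_folds = k_folds(train_idx, n_inner, rng)

        def inner_error(k):
            # Inner loop: mean validation error over calibration/validation splits.
            errs = []
            for val_idx in inner_folds:
                val = set(val_idx)
                calib = [data[i] for i in train_idx if i not in val]
                errs.append(error_rate(calib, [data[i] for i in val_idx], k))
            return sum(errs) / len(errs)

        best_k = min(grid, key=inner_error)
        # Outer loop: predict the held-out fold, which played no part in tuning.
        train = [data[i] for i in train_idx]
        for i in held_out:
            predictions.append((data[i][1], knn_predict(train, data[i][0], best_k)))
    return predictions

# Simulated data: 30 "controls" centred at 0 and 30 "cases" centred at 2.
sim = random.Random(1)
toy = [(sim.gauss(0.0, 1.0), 0) for _ in range(30)]
toy += [(sim.gauss(2.0, 1.0), 1) for _ in range(30)]
results = double_cross_validation(toy, grid=[1, 3, 5])
accuracy = sum(t == p for t, p in results) / len(results)
```

Because every sample is predicted exactly once by a model that never saw it during tuning or fitting, the pooled predictions give an estimate of performance that is not inflated by hyperparameter selection.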
Lasso, ridge or elastic net regression models (penalty term representing shrinkage α=0 to 1) were fitted and the model associated with an α resulting in the best cross-validated performance was chosen as the final model. The prediction score, which is the weighted sum of CpGs, was calculated for each classifier. These scores are termed methylation risk scores (MRS) and have been used as an extension of polygenic risk scores (PRS), which capture multi-factorial information leveraging high-dimensional data to aid in the prediction of clinical phenotype44. Optimal cutoffs were chosen so as to have a minimum difference between the sensitivity and specificity and the values close to the area under the receiver operating characteristic curve (AUC)53. Performance metrics including accuracy (mean and 95% confidence interval [CI]), sensitivity, specificity, F1 score, and p values were generated based on the 2X2 confusion matrix constructed using the predicted and true test sample labels.
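A minimal sketch of the scoring and cutoff steps is given below. It computes a methylation risk score as a weighted sum of CpG values, picks the cutoff that minimizes the difference between sensitivity and specificity, and derives metrics from the resulting 2X2 confusion matrix. The CpG names, weights and methylation values are hypothetical, and the additional criterion of staying close to the AUC is not implemented here.

```python
def methylation_risk_score(betas, weights, intercept=0.0):
    """Weighted sum of CpG methylation values (the linear predictor)."""
    return intercept + sum(weights[cpg] * betas[cpg] for cpg in weights)

def confusion_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, accuracy and F1 at a given cutoff."""
    pairs = list(zip(scores, labels))
    tp = sum(s >= cutoff and y == 1 for s, y in pairs)
    fn = sum(s < cutoff and y == 1 for s, y in pairs)
    tn = sum(s < cutoff and y == 0 for s, y in pairs)
    fp = sum(s >= cutoff and y == 0 for s, y in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
    }

def balanced_cutoff(scores, labels):
    """Observed score that minimises |sensitivity - specificity|."""
    def gap(cutoff):
        m = confusion_metrics(scores, labels, cutoff)
        return abs(m["sensitivity"] - m["specificity"])
    return min(sorted(set(scores)), key=gap)

# Hypothetical classifier weights and samples (1 = case, 0 = control).
weights = {"cg_a": 2.0, "cg_b": -1.0}
samples = [
    ({"cg_a": 0.9, "cg_b": 0.2}, 1),
    ({"cg_a": 0.8, "cg_b": 0.3}, 1),
    ({"cg_a": 0.5, "cg_b": 0.4}, 1),
    ({"cg_a": 0.6, "cg_b": 0.1}, 0),
    ({"cg_a": 0.3, "cg_b": 0.5}, 0),
    ({"cg_a": 0.2, "cg_b": 0.6}, 0),
]
scores = [methylation_risk_score(b, weights) for b, _ in samples]
labels = [y for _, y in samples]
cutoff = balanced_cutoff(scores, labels)
metrics = confusion_metrics(scores, labels, cutoff)
```

In this toy set the chosen cutoff coincides with the score of the fourth sample, where sensitivity and specificity are both 2/3.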
Table 1.
Demographic Characteristics of the Study Population
| | IBS (N=148) | UC (N=22) | Crohn’s disease (N=25) | CeD (N=35) | HC (N=86) | P value |
|---|---|---|---|---|---|---|
| Age; Mean (SD) | 30.9 (10.9) | 45.8 (14.5) | 36.6 (14.7) | 36.1 (13.4) | 30.5 (10.2) | 1.22e-07# |
| Sex (% Female) | 96 (65%) | 12 (55%) | 12 (48%) | 23 (66%) | 47 (55%) | 0.31$ |
| BMI; Mean (SD) | 25.8 (5.6) | 25.0 (5.9) | 24.0 (5.2) | 24.9 (4.5) | 26.6 (4.2) | 0.31# |
| Race | | | | | | 0.0005$ |
| Caucasian | 77 (52%) | 14 (64%) | 20 (80%) | 25 (71%) | 35 (41%) | |
| Black | 14 (9%) | 1 (5%) | 0 | 0 | 10 (12%) | |
| Asian | 27 (18%) | 1 (5%) | 1 (4%) | 2 (6%) | 22 (26%) | |
| Multiracial | 22 (15%) | 1 (5%) | 1 (4%) | 0 | 10 (12%) | |
| American Indian | 3 (2%) | 1 (5%) | 0 | 1 (3%) | 6 (7%) | |
| Unknown | 5 (3%) | 3 (14%) | 3 (12%) | 5 (14%) | 3 (3%) | |
| Ethnicity (%) | | | | | | 0.001$ |
| Hispanic | 34 (23%) | 3 (14%) | 1 (4%) | 5 (14%) | 23 (27%) | |
| Non-Hispanic | 109 (74%) | 15 (68%) | 21 (84%) | 25 (74%) | 63 (73%) | |
| Unknown | 5 (3%) | 4 (18%) | 3 (12%) | 4 (11%) | 0 (0%) | |
#, ANOVA p value;
$, Fisher test p value;
IBS, irritable bowel syndrome; UC, ulcerative colitis; CeD, celiac disease; BMI, body mass index; SD, standard deviation
Sensitivity analyses:
We performed a sensitivity analysis to assess the performance of the classifiers excluding patients who consumed medications that can potentially alter DNA methylation44 including SSRIs, benzodiazepines, statins or beta blockers (Supplementary Table 1). Since IBD PBMC samples were derived from two sites (UCLA and CCF), we repeated IBS vs IBD comparison and included the “site” variable as an additional batch covariate.
Assessment of Classifier Performance
External validation set:
We downloaded whole blood HM450K DNA methylation data on IBD vs HC from the Gene Expression Omnibus (GEO) database (accession# GSE87648) to validate the performance of our IBD vs HC model. The processed dataset includes 460,398 probes on 304 samples (204 IBD and 100 HC). This dataset consists of whole blood DNA methylation data on newly diagnosed Crohn’s disease patients (n=103; 50 women and 53 men, mean[sd] age=38.7[16.3]) and UC patients (n=101; 45 women and 56 men, mean[sd] age=37.1[14.0]), and 100 HC (49 women and 51 men, mean[sd] age=34.3[10.3]). We filtered the external data to include probes that overlapped with the probes on the EPIC array to match the internal IBD and HC datasets. We trained our algorithm on our internal IBD and HC samples using ‘glmnet’ with 10-fold cross-validation as described in the previous section, and used this trained model to predict the disease status of external samples. The results were evaluated using AUC and 95% CI.
Permutation testing:
Permutation testing is a statistical method that helps determine whether the results of an experiment are meaningful or arose by chance. Permutation-based testing to assess classifier performance has been used extensively in classification problems in computational biology to test if there is a real class structure in the data54. To create null distributions, we ran 100 random permutations of the class labels and computed the AUC of the classifier trained on each set of shuffled labels. We compared the IBS vs IBD, IBS vs CeD, IBD vs CeD, and IBS, IBD, and CeD vs HC models generated on the actual class labels to the corresponding null distributions and calculated the p-values for differences in the distributions.
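The logic of a label-permutation test can be sketched as follows. This is a simplified illustration: the labels are shuffled against a fixed set of prediction scores instead of retraining a classifier on each shuffled set, the p-value is taken as the add-one-corrected fraction of null AUCs at least as large as the observed AUC (one common choice, assumed here), and the scores and labels are hypothetical.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > h) + 0.5 * (c == h) for c in cases for h in controls)
    return wins / (len(cases) * len(controls))

def permutation_p_value(scores, labels, n_perm=100, seed=0):
    """Return the observed AUC and its permutation p-value."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = labels[:]
    null_aucs = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between score and label
        null_aucs.append(auc(scores, shuffled))
    exceed = sum(a >= observed for a in null_aucs)
    # The add-one correction avoids reporting an impossible p-value of zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Hypothetical prediction scores and true labels (1 = case, 0 = control).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.45]
toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
observed_auc, p_value = permutation_p_value(toy_scores, toy_labels)
```

With 100 permutations the smallest attainable p-value under this convention is 1/101, so more permutations are needed to resolve very small p-values.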
3. RESULTS
We analyzed DNA methylation profiles of 315 participants including IBS (N=148; 65% women, 45 constipation-predominant IBS [IBS-C], 54 diarrhea-predominant IBS [IBS-D], 49 mixed or unsubtyped IBS [IBS-M or IBS-U]), CeD (N=34; 68% women), IBD (N=47; 22 UC and 25 Crohn’s disease, 49% women), and HC (N=86; 55% women). Table 1 shows the demographic characteristics of study participants. There were significant overall group differences in mean age, race, and ethnicity (p=5.97e-06, 5.0e-4, 1.0e-3, respectively). There were no significant group differences in the percentage of women or BMI (Fisher p = .31 and ANOVA p=.26, respectively).
About 35% of IBS, 28% of IBD and 31% of CeD patients used medications such as antidepressants, statins or NSAIDs (Supplementary Table 1). Severity of IBS symptoms was moderate, with a mean (standard deviation [SD]) overall severity score of 9.45 (4.18; range 0-20) and IBS-SSS of 237 (89.70; range 0-500), which represents moderately severe symptoms (Supplementary Table 2A). UC disease activity assessed by UCDAI and SCCAI (mean [SD] = 4.29 [1.25] and 6.17 [4.84], respectively) indicated mild to moderate disease. Crohn’s disease activity assessed by CDAI, SCDAI, and SES-CD indicated mild to moderate disease (mean [SD] = 146.36 [78.10], 243.82 [73.77] and 4.40 [5.30], respectively; Supplementary Table 2B). CeD activity assessed by CSI scores indicated moderate disease (mean [SD] = 38 [10.98]) and CDAT scores (mean [SD] = 14.94 [3.88]) indicated poor adherence to a GFD by the CeD patients in this study (>13 indicates non-adherence, Supplementary Table 2C).
Based on the clinical data, a history of anxiety and/or depression was the most common co-morbidity (28%) associated with IBD, followed by thyroid disease (8%) and gastroesophageal reflux disease (GERD, 4%) in the UCLA cohort, and IBD-associated arthropathy amongst SPARC IBD patients (5%). Patients with CeD also reported the diagnosis of thyroid disease (14%), anemia and GERD (9%) and anxiety or depression (6%).
3.1. DMPs and DMRs Associated with Disease Groups
When adjusted for age, there were significant differences (FDR<0.05) in CpG methylation between IBS vs IBD (N of DMPs=248), IBS vs CeD (N of DMPs=6), IBD vs CeD (N of DMPs=655), IBD vs HC (N of DMPs=4130), and CeD vs HC (N of DMPs=311). Between IBS and HC, 98 CpGs showed a trend for differential methylation at p<0.001, but none at FDR<0.05 (Supplementary Figure 3A – 3F, Supplementary Table 3).
Analysis of locations of DMPs in the regulatory regions of the genes within each comparison suggested that a majority of DMPs in IBS vs IBD were in the promoter region. Of these, a majority of CpGs showed decreased methylation in IBS compared to IBD, suggesting potential epigenetic silencing in IBD patients. Only a minority of probes were promoter-associated in IBD vs CeD, IBD vs HC, and CeD vs HC. A large number of promoter-associated probes were hypermethylated in IBD compared to HC (Supplementary Table 4).
We then tested potential relationships between methylation levels of disease-associated CpGs and severity scores in IBS, IBD, and CeD patients. No significant association was seen between IBS-SSS and methylation of CpG sites (p<0.05) within IBS patients. Within UC patients, we tested 4130 CpG sites (differentially methylated between IBD vs HC at FDR<0.05) for the difference between high and low UCDAI disease severity scores. Of these, 44 CpG sites were associated with UCDAI scores (p<0.05, FDR>0.05, Supplementary Table 5). The top GO terms associated with the 44 genes included “leukocyte migration involved in inflammatory response”. Within CD patients, 90 of the 4130 CpGs tested were associated with CD severity as measured by the SCDAI score (p<0.05, FDR>0.05, Supplementary Table 5). GO terms associated with CD disease severity included inflammation-related terms such as “positive regulation of T cell activation”. A lack of association at FDR<0.05 may be due to multiple factors, including a smaller sample size of UC and CD compared to a combined IBD cohort (disease severity measures were different for UC and CD and therefore were analyzed separately). However, since the CpG sites were preselected based on an FDR threshold of adjusted p<0.05 and tested based on specific a priori hypotheses, a trend for association may be important.
Similarly, for CeD, we tested 312 CpG sites (FDR<0.05 between CeD and HC) and identified 66 CpG sites associated with CeD severity index scores (CSI, p<0.05). DNA methylation levels of an intergenic CpG site cg04132186, ADORA2A, and BAHCC1 correlated with CSI scores (FDR=0.08, Supplementary Figure 4, Supplementary Table 5). The GO terms associated with the 66 CpG sites included, “MHC class II protein complex assembly”.
We also compared genome-wide methylation profiles of IBS bowel habit subtypes. Between diarrhea-predominant IBS (IBS-D, N=54) and constipation-predominant IBS (IBS-C, N=45), there was one CpG site in the FAM71E2 gene that was hypomethylated in IBS-D (FDR adjusted p=0.03) and one CpG site in the BCAR1 gene that showed a trend for hypomethylation in IBS-D compared to IBS-C (FDR adjusted p=0.08). Not much is known about the function of the FAM71E2 gene. Studies have reported that BCAR1 protein contributes to the regulation of a variety of signaling pathways including cell adhesion, migration, invasion, apoptosis, hypoxia, and mechanical forces55. Due to BCAR1’s role in regulating integrin-dependent and cell-cell adhesion, it can influence the integrity of the epithelial barrier in the gut, potentially contributing to the pathophysiology of IBS-D55.
Analysis of consecutive differentially methylated CpG sites (FDR<0.05) within a region of the gene suggested that there were 49 and 1726 DMRs associated with the IBS vs IBD and IBD vs CeD comparisons, respectively (FDR<0.05). IBS vs CeD resulted in one DMR with a p<0.05, but none at FDR<0.05. CeD vs HC and IBD vs HC were associated with 58 and 2667 DMRs, respectively (FDR<0.05). Although not significant at HMFDR<0.05, we found trends for associations with 20 DMRs between IBS vs HC (Supplementary Table 6).
Our additional analyses including a comprehensive list of covariates, such as age, array, sex, race and proportions of various cell types within the disease groups including CD4 T-cells, CD8 T-cells, Neutrophils, Monocytes, and natural killer cells suggested a smaller list of CpG sites associated with various disease pairs (Supplementary Table 7). IBS vs IBD and IBS vs CeD were associated with one and three DMPs, respectively. When comparing diseases with HC, IBD vs HC, and CeD vs HC were associated with nine and three DMPs, respectively. However, no DMPs were associated with IBD vs CeD or IBS vs HC. This may be expected given the collinearity/confounding effect of immune cells with the presence of the disease since inflammation/immune cell types are closely related to IBD and celiac disease pathophysiology56, 57.
To predict the function of differentially methylated sites and regions, we performed GO analysis. Genes associated with DMPs in the IBS vs IBD comparison showed enrichment of GO terms such as “immunoglobulin mediated immune response” and “negative regulation of immune response”, suggesting a differential regulation of inflammation-related genes between IBS and IBD. IBS vs CeD was associated with “MHC class II protein complex binding” and “peptide antigen assembly with MHC protein complex”, suggesting a differential activation of major histocompatibility complex (MHC) class of protein and immune response genes between IBS and CeD. DMPs associated with IBS vs HC showed an enrichment in “ion transport activity” and “neuron fate determination” terms. DMPs associated with IBD vs CeD were associated with GO terms related to immune system pathways such as “adaptive immune response” and “cell-cell adhesion”, and IBD vs HC comparisons showed enrichment of “immune response” and “adaptive immune response” pathway genes. CeD vs HC was associated with enrichment of terms such as “immune system process”, “regulation of immune system process” and “leukocyte cell-cell adhesion”. All the abovementioned GO terms were significant at FDR<0.05 except for the IBS vs IBD and IBS vs HC comparisons, for which trends for GO terms (p<0.05) have been reported. The representative GO terms enriched in IBS, IBD and celiac disease associated DMPs are shown in Supplementary Figures 5A–5C. The comprehensive list of terms is provided in Supplementary Table 8.
3.2. DNA methylation-based classifiers to discriminate IBS, IBD, CeD and HC
To identify disease-associated classifiers, penalized regression models were trained independently on pairs of diagnoses, including IBS, IBD, CeD, and HC (Supplementary Methods Table 1), and tested on a holdout test dataset resulting in disease-specific classifiers.
3.2.1. DNA methylation-based classifiers for IBS, IBD and CeD
The selected classifiers showed high accuracy for discrimination between various patient groups. Table 2 shows the performance metrics for IBS compared to HC, IBD and CeD. The AUC and accuracy for IBS vs IBD classifier were 0.85 and 0.80, respectively (p=6.75E-10) and those for IBS vs CeD were 0.82 and 0.78, respectively (p=4.57E-10). Figure 1 shows the ROC curves for these comparisons. For IBD vs CeD, AUC and accuracy were 0.78 and 0.73 (p=0.002), respectively. When comparing IBS vs Crohn’s disease and UC separately, the classifiers for IBS performed slightly better in discriminating against Crohn’s disease compared to UC (Crohn’s: Accuracy [95% CI] = 0.80 [0.66 – 0.9]; UC: Accuracy [95% CI] = 0.75 [0.60 – 0.87]).
Table 2.
Performance Metrics for IBS, IBD and CeD Associated Classifiers
| | IBS vs IBD# | IBS vs CeD | IBD vs CeD | IBS vs HC* | IBD vs HC# | CeD vs HC |
|---|---|---|---|---|---|---|
| Number of markers | 136 | 36 | 181 | 866 | 550 | 202 |
| AUC | 0.85 (0.77-0.93) | 0.82 (0.72-0.91) | 0.78 (0.67-0.88) | 0.69 (0.61-0.77) | 0.92 (0.88-0.97) | 0.85 (0.77-0.93) |
| Accuracy (95% CI) | 0.80 (0.70-0.87) | 0.78 (0.68-0.86) | 0.73 (0.62-0.83) | 0.69 (0.61-0.75) | 0.82 (0.73-0.89) | 0.80 (0.70-0.88) |
| Sensitivity | 0.79 | 0.81 | 0.73 | 0.76 | 0.89 | 0.76 |
| Specificity | 0.80 | 0.74 | 0.74 | 0.62 | 0.77 | 0.82 |
| F1 | 0.80 | 0.81 | 0.76 | 0.71 | 0.82 | 0.74 |
| Accuracy P-Value | 4.57E-10 | 2.21E-07 | 1.84E-073 | 3.06E-05 | 3.41E-16 | 7.11E-09 |
*For the IBS vs HC comparison, DNA methylation data from both 450K and EPIC arrays were used; for all other comparisons, only subjects with EPIC array data were used.
#Covariates included age, cell-type proportions, and technical batch effects for all columns; the IBD vs HC comparison additionally included the site of IBD sample collection as a covariate. F1 denotes the harmonic mean of precision and recall.
AUC, area under the receiver operating characteristic (ROC) curve; IBS, irritable bowel syndrome; IBD, inflammatory bowel disease; CeD, celiac disease.
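For readers less familiar with the metrics in Table 2, the following minimal Python sketch shows how sensitivity, specificity, accuracy and F1 follow from the four cells of a confusion matrix. The function name and the counts are hypothetical and are not the study's confusion matrices; F1 is computed as the harmonic mean of precision and recall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive Table 2 style metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall for the positive class
    specificity = tn / (tn + fp)   # recall for the negative class
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall (sensitivity).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a 100-sample test set (illustration only).
metrics = classification_metrics(tp=45, fp=20, tn=30, fn=5)
```

With these made-up counts, sensitivity is 0.90 while specificity is 0.60, which shows how a classifier can rule a condition in or out with different reliability even when overall accuracy (0.75) looks moderate.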
Figure 1:

The figure shows the receiver operating characteristic (ROC) curves and the area under the ROC curves (AUC) for the IBS vs IBD, IBS vs CeD and IBD vs CeD comparisons. The x-axis represents the sensitivity and the y-axis represents the specificity, and each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. The AUC values for the IBS vs IBD, IBS vs CeD and IBD vs CeD classifiers were 0.85, 0.82 and 0.78, respectively.
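How an AUC value arises from threshold-specific sensitivity/specificity pairs can be sketched in Python as follows. The sketch uses the conventional plotting orientation (1 − specificity on the x-axis, sensitivity on the y-axis), which is an assumption about layout and does not change the AUC. The labels and scores are hypothetical, and the function names are introduced here only for illustration.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_points(labels, scores):
    """(1 - specificity, sensitivity) pairs, one per decision threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predicted probabilities for eight samples (illustration only).
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

auc_rank = auc_from_scores(labels, scores)
auc_curve = trapezoid_area(roc_points(labels, scores))
```

Both routes give the same number for this toy input (15 of the 16 positive-negative pairs are ranked correctly), which is why the AUC can be read either as an area or as a ranking probability.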
When comparing GI diseases to HC, the AUC and accuracy for IBS vs HC were 0.69 and 0.69, respectively (p=3.06E-05). The classification accuracies for the bowel habit subtypes within IBS (IBS-D, IBS-C and IBS-M) vs HC were similar to that of the overall IBS group (Supplementary Table 9). AUC and accuracy of IBD vs HC were 0.92 and 0.82 (p=1.33E-08), and those of CeD vs HC were 0.85 and 0.80 (p=7.11E-09). The markers associated with each classifier are included in Supplementary Table 10.
3.2.2. Assessment of Classifier Performance
External validation set
Classifiers developed from our internal IBD vs HC data were used to predict the classes of samples from an independent IBD vs HC DNA methylation dataset as described in the Methods section. The classifier successfully classified a majority of participants from the independent external dataset into correct disease categories (Supplementary Figure 6, AUC=0.74 (0.68-0.80), p<0.0001).
Permutation testing
Permutation testing showed that the AUC values obtained with the true labels for IBD vs HC, IBS vs HC, IBS vs IBD, and IBS vs CeD were significantly better than those obtained with permuted labels (P<0.05, Supplementary Figure 7). In other words, the performance of our models was significantly better than would be expected by chance.
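The logic of a label-permutation test can be sketched in Python as below. This is a simplified illustration, not the procedure used in the study: a full permutation test of a classifier would refit the model on every permuted label set, whereas here the scores are held fixed and only the labels are shuffled to keep the example short. The data, the number of permutations and the function names are hypothetical.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_perm=1000, seed=0):
    """Compare the observed AUC with AUCs obtained under shuffled labels."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between labels and scores
        if auc(shuffled, scores) >= observed:
            at_least_as_good += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (at_least_as_good + 1) / (n_perm + 1)


# Hypothetical scores for 10 cases and 10 controls (illustration only).
labels = [1] * 10 + [0] * 10
scores = [0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.45, 0.35,
          0.5, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05, 0.02, 0.01]

observed_auc, p_value = permutation_p_value(labels, scores)
```

The p-value is the share of shuffled label sets whose AUC matches or exceeds the observed one, so a small value indicates that the observed discrimination is unlikely to arise by chance alone.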
3.2.3. GO terms and pathways associated with classifiers
The GO terms associated with the classifiers were similar to those associated with DMPs, and suggested that IBS was associated with epigenetic changes in neuronal and immune system pathways. The IBD classifiers were associated with epigenetic changes in inflammation and immune response pathways, and the CeD classifiers were enriched in MHC pathway-related terms (p<0.05, Supplementary Table 11).
3.3. Sensitivity analyses
We repeated IBS vs. IBD and IBS vs. CeD comparisons, excluding patients who consumed medications, including antidepressants, NSAIDs, and statins, which are reported to affect DNA methylation,44 and recalculated the performance metrics. The AUC for the new model was comparable to that of the original model in both comparisons (Supplementary Table 12), suggesting that the models chosen were robust to the effects of these confounders.
DISCUSSION
We report a comprehensive analysis of genome-wide DNA methylation data in the most common GI disorders, including IBS, IBD and CeD, and in HC. We identify differentially methylated CpG sites between IBS, IBD, and CeD that can serve as potential blood-based diagnostic biomarkers. The genes associated with the differentially methylated CpG sites were consistent with the pathophysiologic mechanisms of each disease, and the biomarkers we report showed significantly predictive AUC values with potential for differentiating IBS, IBD, and CeD. Following replication and validation in a larger cohort, these biomarkers could potentially allow an earlier diagnosis and appropriate treatment plan and reduce unnecessary medical costs.
Our prediction model was trained and validated on EPIC DNA methylation data from samples recruited at UCLA and the CCF, and independently evaluated on an external 450K dataset of whole blood samples. Our results and their replication across cohorts and platforms suggest that our model may be robust across populations and settings, but this requires further study. Additional strengths of our study include accounting for confounders by including covariates such as batch effects, cell counts, and age in our models, applying strict exclusion criteria such as smoking and medication intake, and performing additional sensitivity analyses such as excluding patients taking antidepressants and accounting for differences by collection site.
DNA methylation has been studied in the intestinal mucosa and whole blood samples of IBD58, 59 and in the intestinal epithelium and saliva of CeD60, 61 patients. Our study replicated some of the findings reported by these studies. For example, differential methylation of the human leukocyte antigen (HLA) region has been reported in CeD patients61, which we also report here. In the study reporting methylation differences between IBD and HC59, RPS6KA2, a gene associated with inflammation, was reported as hypermethylated in IBD patients; it was also hypermethylated in IBD vs HC in our cohort (FDR=0.002). However, our study included methylation profiling of blood samples in IBS and in the GI disorders that mimic IBS, which makes it easier to compare across disease profiles. In addition to univariate tests, which can have limited power to detect smaller linear and non-linear changes, we utilized a machine learning framework that enabled us not only to develop predictive models but also to select multiple relevant CpG sites without conducting repetitive association tests and the consequent stringent adjustment for multiple hypothesis testing. Moreover, our thorough preprocessing steps (including filtering specific sites and individuals, and accounting for confounding) as well as the use of a double cross-validation methodology lay the framework for a reduced risk of overfitting and therefore improved reproducibility (and out-of-sample prediction). Finally, our stringent inclusion criteria of symptomatic untreated IBD and CeD patients diagnosed with standard diagnostic criteria but not on anti-IBD medications or a GFD, respectively, likely helped make our model more amenable to replicability and applicability to the target population of undiagnosed patients with chronic bowel symptoms.
The goal of this study was primarily to identify diagnostic biomarkers for the selected GI disorders, and blood samples provide insights into systemic changes but may not provide much information on pathophysiological changes. However, the methylation changes we observed in the PBMC samples were also reported to be associated with changes in colonic mucosal gene expression. For example, differential expression of inflammation-related genes in the colon of IBD patients has been previously reported by several studies62–65. In CeD patients, studies have reported a key role for immune system genes involved in innate and adaptive immune response, in particular the Th1 pathway66, 67. The role of HLA-DQ genes in CeD is well-known and HLA-DQ2 and HLA-DQ8 are reported to be the most important genes for the predisposition to this disorder68.
An association between IBS and the expression of barrier function and neuronal pathway genes has been reported in multiple studies from our group as well as by others. For example, in a study on colonic mucosal gene expression in IBS patients compared to HC including cohorts from UCLA, and publicly available data from the Mayo Clinic and the University of Nottingham, we reported differential expression of neuronal genes in IBS-C patients compared to HC. Additionally, another study on the expression of microRNA and mRNA in colonic mucosal biopsies and epithelial cells suggested a role for cell-adhesion and barrier function-related genes in IBS. Our group also found that gene expression in colonic mucosal biopsies of 105 patients and 60 HC (a subset of the participants with DNA methylation data), measured on the QuantSeq 3′ RNA sequencing platform, suggested an association of IBS with GO terms such as “leukocyte cell-cell adhesion” and “immune response” (unpublished results). These studies further strengthen and support the disease relevance of the biomarkers identified here. However, as the results derived from gene-set enrichment analyses were not significant after adjustment for multiple comparisons, these findings warrant replication in an independent cohort.
Although we found a set of biomarkers that are potentially capable of capturing disease status, their causal role and contribution to disease pathogenesis are unclear. The systemic changes measured in blood may result from several factors, including the disease itself, medication, and other factors. The GO analyses, which were aimed at understanding the functional relevance of the biomarkers, suggested an association of disease-specific classifiers with pathways that are relevant to the pathophysiology of the corresponding diseases. For example, IBS-associated DMPs were associated with cell adhesion, neuronal signaling and pain pathways, which are the most widely studied pathways in IBS69–72. Similarly, inflammatory and immune pathways and related terms were associated with the IBD classifiers, further supporting the functional significance of the associated classifiers73. Additionally, the CeD classifier was associated with MHC class II receptor activity. MHC class II encoded by HLA is a chief genetic determinant of CeD, and certain HLA-DQ allotypes (DQ2.2, DQ2.5, and/or DQ8), which are allelic variants within the constant region of HLA genes, are known to predispose to the disease by presenting posttranslationally modified gluten peptides to CD4+ T cells74. However, the systemic epigenetic changes identified in this study are predictive biomarkers and hypothesis generating; their contribution to disease pathogenesis is not clear.
There is currently no reliable diagnostic test for either IBS or its bowel habit subtypes with the test characteristics needed for widespread clinical use. Available commercial blood tests (for IBS-D) lack the diagnostic accuracy needed to discriminate IBS from organic GI disorders, including IBD.75 A non-invasive blood test to diagnose IBS, IBD, and CeD would be of significant value in clinical practice. When validated and refined in additional studies, the diagnostic potential of our blood-based DNA methylation test could be improved by developing algorithms that incorporate other non-invasive tests that aid in distinguishing gut inflammation and CeD (e.g., fecal calprotectin, C-reactive protein (CRP) and celiac serologies17, 76).
This study has limitations. Although machine learning is a powerful tool that learns latent patterns in the data to make predictions, it is susceptible to overfitting by learning patterns that arise from known or unknown confounding variables77, 78. To avoid overfitting, we nested the hyperparameter optimization using double cross-validation, also called nested cross-validation (Supplementary Figure 2). Since this method involves testing on a dataset that is independent of the training dataset, it provides almost unbiased estimates of the true errors52. Only one suitable external genome-wide methylation dataset comparing blood samples from patients with IBD and HC was found. Although the accuracy of our model was slightly better on the internal dataset than on the external data, it is known that models generally do not perform as well in external validation as in the development data79. In addition, there were several methodological differences between the external IBD dataset and our IBD dataset which may have affected the performance. For example, we used the EPIC array, whereas the external data used the 450K array, which contains only about half the number of probes of the EPIC array. While the external data were from whole blood samples, which include all white blood cells and platelets, we analyzed PBMCs, which include only a subset of white blood cells, i.e., lymphocytes and monocytes.
We generally observed a relatively lower performance of the IBS vs HC models, which is expected given that IBS is not considered a structural disease. Moreover, a diagnostic test discriminating IBS from HC is not needed in clinical practice since HC lack GI symptoms. Nevertheless, a positive test for IBS vs HC is likely to diagnose IBS with 70% sensitivity, suggesting that it could have promise as a rule-in test, particularly in the setting of negative noninvasive tests (e.g., negative CeD serologies, normal fecal calprotectin). Additionally, our GO analysis suggested that the biomarkers associated with the IBS vs HC comparison were associated with pathways relevant to IBS pathophysiology. We could not locate similar publicly available blood DNA methylation data from IBS and CeD patients to further validate our findings. Although we tried to account for some confounders, there might be limitations to this study due to heterogeneity across IBS patients, unavailability of race and ethnicity information on some patients, and other confounders. Therefore, although we found promising biomarkers which can potentially be used for the differential diagnosis of gastrointestinal diseases, these results warrant replication in larger cohorts and longitudinal datasets to understand the functional role of epigenetic changes in disease pathogenesis.
In conclusion, using comprehensive data on DNA methylation in various GI conditions and HC, our study shows that blood-based DNA methylation changes show promise as a non-invasive biomarker to distinguish IBS, IBD, and CeD, leading to an earlier accurate diagnosis and a rule-in test for IBS. GO analysis supports the functional significance of the classifiers in disease-specific pathology. Future studies should test these markers further and assess their utility in predicting treatment response and identifying novel therapeutic targets.
Supplementary Material
Acknowledgements:
The authors would like to acknowledge the Crohn’s and Colitis Foundation (CCF) for providing samples collected as part of the IBD Plexus Research Initiatives.
Funding:
NIH R21 DK104078 (LC), NIDDK P50 DK64539 (EAM, LC), UCLA TDG Innovation Grant (LC, SJ).
Disclosures:
Dr. Sauk has consulted for CorEvitas, Prometheus, and Abbvie. Dr. Weiss has consulted for Guidepoint, Regeneron, and EverlyHealth. Dr. Chang has served as a member of the scientific advisory board or consultant for Alfasigma, Ardelyx, Arena, Atmo, Bausch Health, Food Marble, GlaxoSmithKline, Ironwood, and Trellus Health. She has received research support from the National Institute of Health, Arena, AnX Robotica, and Ironwood Pharmaceuticals. She has stock options with Food Marble, ModifyHealth and Trellus Health. Drs. Chang and Mahurkar-Joshi have two related patents.
Abbreviations:
- IBS
Irritable bowel syndrome
- IBD
inflammatory bowel disease
- CeD
celiac disease
- UC
Ulcerative Colitis
- HC
healthy controls
- AUC
Area Under the Receiver Operating Characteristic Curve
- CGIs
CpG islands
- DMPs
Differentially Methylated Positions
- DMRs
Differentially Methylated Regions
- PBMCs
Peripheral blood mononuclear cells
Footnotes
Supplementary materials: Supplementary Methods, Supplementary Results (Tables 1–12, Figure 1–7)
REFERENCES
- 1. Lovell RM, Ford AC. Global prevalence of and risk factors for irritable bowel syndrome: a meta-analysis. Clin Gastroenterol Hepatol 2012;10:712–721 e4.
- 2. Cash B, Sullivan S, Barghout V. Total costs of IBS: employer and managed care perspective. Am J Manag Care 2005;11:S7–16.
- 3. Ladabaum U, Boyd E, Zhao WK, et al. Diagnosis, comorbidities, and management of irritable bowel syndrome in patients in a large health maintenance organization. Clin Gastroenterol Hepatol 2012;10:37–45.
- 4. Spiegel BM, Farid M, Esrailian E, et al. Is irritable bowel syndrome a diagnosis of exclusion?: a survey of primary care providers, gastroenterologists, and IBS experts. Am J Gastroenterol 2010;105:848–58.
- 5. Dahlhamer JM, Zammitti EP, Ward BW, et al. Prevalence of Inflammatory Bowel Disease Among Adults Aged ≥18 Years - United States, 2015. MMWR Morb Mortal Wkly Rep 2016;65:1166–1169.
- 6. Kappelman MD, Rifas-Shiman SL, Kleinman K, et al. The prevalence and geographic distribution of Crohn’s disease and ulcerative colitis in the United States. Clin Gastroenterol Hepatol 2007;5:1424–9.
- 7. Card TR, Siffledeen J, Fleming KM. Are IBD patients more likely to have a prior diagnosis of irritable bowel syndrome? Report of a case-control study in the General Practice Research Database. United European Gastroenterol J 2014;2:505–12.
- 8. Nahon S, Lahmek P, Paupard T, et al. Diagnostic Delay Is Associated with a Greater Risk of Early Surgery in a French Cohort of Crohn’s Disease Patients. Dig Dis Sci 2016;61:3278–3284.
- 9. Nguyen VQ, Jiang D, Hoffman SN, et al. Impact of Diagnostic Delay and Associated Factors on Clinical Outcomes in a U.S. Inflammatory Bowel Disease Cohort. Inflamm Bowel Dis 2017;23:1825–1831.
- 10. Kang HS, Koo JS, Lee KM, et al. Two-year delay in ulcerative colitis diagnosis is associated with anti-tumor necrosis factor alpha use. World J Gastroenterol 2019;25:989–1001.
- 11. Mustalahti K, Catassi C, Reunanen A, et al. The prevalence of celiac disease in Europe: results of a centralized, international mass screening project. Ann Med 2010;42:587–95.
- 12. Lebwohl B, Rubio-Tapia A. Epidemiology, Presentation, and Diagnosis of Celiac Disease. Gastroenterology 2021;160:63–75.
- 13. Therrien A, Kelly CP, Silvester JA. Celiac Disease: Extraintestinal Manifestations and Associated Conditions. J Clin Gastroenterol 2020;54:8–21.
- 14. Diagnosis and Management of Gluten-Associated Disorders: Springer Cham, 2021.
- 15. Irvine AJ, Chey WD, Ford AC. Screening for Celiac Disease in Irritable Bowel Syndrome: An Updated Systematic Review and Meta-analysis. Am J Gastroenterol 2017;112:65–76.
- 16. Weiss GA. Diagnosis and Management of Gluten-Associated Disorders, 2021.
- 17. Lacy BE, Pimentel M, Brenner DM, et al. ACG Clinical Guideline: Management of Irritable Bowel Syndrome. Am J Gastroenterol 2021;116:17–44.
- 18. Smalley W, Falck-Ytter C, Carrasco-Labra A, et al. AGA Clinical Practice Guidelines on the Laboratory Evaluation of Functional Diarrhea and Diarrhea-Predominant Irritable Bowel Syndrome in Adults (IBS-D). Gastroenterology 2019;157:851–854.
- 19. Fuchs V, Kurppa K, Huhtala H, et al. Delayed celiac disease diagnosis predisposes to reduced quality of life and incremental use of health care services and medicines: A prospective nationwide study. United European Gastroenterol J 2018;6:567–575.
- 20. Rubio-Tapia A, Ludvigsson JF, Brantner TL, et al. The prevalence of celiac disease in the United States. Am J Gastroenterol 2012;107:1538–44; quiz 1537, 1545.
- 21. Ukkola A, Kurppa K, Collin P, et al. Use of health care services and pharmaceutical agents in coeliac disease: a prospective nationwide study. BMC Gastroenterol 2012;12:136.
- 22. Card TR, Siffledeen J, West J, et al. An excess of prior irritable bowel syndrome diagnoses or treatments in Celiac disease: evidence of diagnostic delay. Scand J Gastroenterol 2013;48:801–7.
- 23. Black CJ, McKenzie YA, Scofield-Marlowe M, et al. Top 10 research priorities for irritable bowel syndrome: results of a James Lind Alliance priority setting partnership. Lancet Gastroenterol Hepatol 2023;8:499–501.
- 24. Roth TL. Epigenetic mechanisms in the development of behavior: advances, challenges, and future promises of a new field. Dev Psychopathol 2013;25:1279–91.
- 25. Bjornsson HT, Fallin MD, Feinberg AP. An integrated epigenetic and genetic approach to common human disease. Trends Genet 2004;20:350–8.
- 26. Deaton AM, Bird A. CpG islands and the regulation of transcription. Genes Dev 2011;25:1010–22.
- 27. Yang X, Han H, De Carvalho DD, et al. Gene body methylation can alter gene expression and is a therapeutic target in cancer. Cancer Cell 2014;26:577–90.
- 28. Mahurkar S, Polytarchou C, Iliopoulos D, et al. Genome-wide DNA methylation profiling of peripheral blood mononuclear cells in irritable bowel syndrome. Neurogastroenterol Motil 2016;28:410–22.
- 29. Kalla R, Adams AT, Nowak JK, et al. Analysis of systemic epigenetic alterations in inflammatory bowel disease: defining geographical, genetic, and immune-inflammatory influences on the circulating methylome. J Crohns Colitis 2022.
- 30. Raffals LE, Saha S, Bewtra M, et al. The Development and Initial Findings of A Study of a Prospective Adult Research Cohort with Inflammatory Bowel Disease (SPARC IBD). Inflamm Bowel Dis 2022;28:192–199.
- 31. Longstreth GF, Thompson WG, Chey WD, et al. Functional bowel disorders. Gastroenterology 2006;130:1480–91.
- 32. Mearin F, Lacy BE, Chang L, et al. Bowel Disorders. Gastroenterology 2016;150:1393–1407.
- 33. Zigmond AS, Snaith RP. The hospital anxiety and depression scale. Acta Psychiatr Scand 1983;67:361–70.
- 34. Park SH, Videlock EJ, Shih W, et al. Adverse childhood experiences are associated with irritable bowel syndrome and gastrointestinal symptom severity. Neurogastroenterol Motil 2016;28:1252–60.
- 35. Francis CY, Morris J, Whorwell PJ. The irritable bowel severity scoring system: a simple method of monitoring irritable bowel syndrome and its progress. Aliment Pharmacol Ther 1997;11:395–402.
- 36. Lin S, Hannon E, Reppell M, et al. Whole blood DNA methylation changes are associated with anti-TNF drug concentration in patients with Crohn’s disease. J Crohns Colitis 2023.
- 37. Daperno M, D’Haens G, Van Assche G, et al. Development and validation of a new, simplified endoscopic activity score for Crohn’s disease: the SES-CD. Gastrointest Endosc 2004;60:505–12.
- 38. Best WR, Becktel JM, Singleton JW, et al. Development of a Crohn’s disease activity index. National Cooperative Crohn’s Disease Study. Gastroenterology 1976;70:439–44.
- 39. Thia K, Faubion WA Jr., Loftus EV Jr., et al. Short CDAI: development and validation of a shortened and simplified Crohn’s disease activity index. Inflamm Bowel Dis 2011;17:105–11.
- 40. Walmsley RS, Ayres RC, Pounder RE, et al. A simple clinical colitis activity index. Gut 1998;43:29–32.
- 41. Sutherland LR, Martin F, Greer S, et al. 5-Aminosalicylic acid enema in the treatment of distal ulcerative colitis, proctosigmoiditis, and proctitis. Gastroenterology 1987;92:1894–8.
- 42. Leffler DA, Dennis M, Edwards George JB, et al. A simple validated gluten-free diet adherence survey for adults with celiac disease. Clin Gastroenterol Hepatol 2009;7:530-6–536 e1-2.
- 43. Leffler DA, Dennis M, Edwards George J, et al. A validated disease-specific symptom index for adults with celiac disease. Clin Gastroenterol Hepatol 2009;7:1328-34–1334 e1-3.
- 44. Thompson M, Hill BL, Rakocz N, et al. Methylation risk scores are associated with a collection of phenotypes within electronic health record systems. NPJ Genom Med 2022;7:50.
- 45. Xu Z, Niu L, Li L, et al. ENmix: a novel background correction method for Illumina HumanMethylation450 BeadChip. Nucleic Acids Res 2016;44:e20.
- 46. Salas LA, Koestler DC. Illumina EPIC data on immunomagnetic sorted peripheral adult blood cells, 2020.
- 47. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2020.
- 48. Zhao JH. gap: Genetic Analysis Package. Journal of Statistical Software 2007;23:1–18.
- 49. Peters TJ, Buckley MJ, Statham AL, et al. De novo identification of differentially methylated regions in the human genome. Epigenetics Chromatin 2015;8:6.
- 50. Campagna MP, Xavier A, Lechner-Scott J, et al. Epigenome-wide association studies: current knowledge, strategies and recommendations. Clin Epigenetics 2021;13:214.
- 51. Phipson B, Maksimovic J, Oshlack A. missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform. Bioinformatics 2016;32:286–8.
- 52. Varma S, Simon R. Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics 2006;7:91.
- 53. Unal I. Defining an Optimal Cut-Point Value in ROC Analysis: An Alternative Approach. Comput Math Methods Med 2017;2017:3762651.
- 54. Ojala M, Garriga GC. Permutation Tests for Studying Classifier Performance. 2009 Ninth IEEE International Conference on Data Mining, Miami Beach, FL, USA, 2009, pp. 908–913.
- 55. Del Pilar Camacho Leal M, Costamagna A, Tassone B, et al. Correction to: Conditional ablation of p130Cas/BCAR1 adaptor protein impairs epidermal homeostasis by altering cell adhesion and differentiation. Cell Commun Signal 2018;16:90.
- 56. Chen H, Li Q, Gao T, et al. Causal role of immune cells in inflammatory bowel disease: A Mendelian randomization study. Medicine (Baltimore) 2024;103:e37537.
- 57. Hisamatsu T, Erben U, Kuhl AA. The Role of T-Cell Subsets in Chronic Inflammation in Celiac Disease and Inflammatory Bowel Disease Patients: More Common Mechanisms or More Differences? Inflamm Intest Dis 2016;1:52–62.
- 58. Agliata I, Fernandez-Jimenez N, Goldsmith C, et al. The DNA methylome of inflammatory bowel disease (IBD) reflects intrinsic and extrinsic factors in intestinal mucosal cells. Epigenetics 2020;15:1068–1082.
- 59. Ventham NT, Kennedy NA, Adams AT, et al. Integrative epigenome-wide analysis demonstrates that DNA methylation may mediate genetic risk in inflammatory bowel disease. Nat Commun 2016;7:13507.
- 60. Fernandez-Jimenez N, Garcia-Etxebarria K, Plaza-Izurieta L, et al. The methylome of the celiac intestinal epithelium harbours genotype-independent alterations in the HLA region. Sci Rep 2019;9:1298.
- 61. Hearn NL, Chiu CL, Lind JM. Comparison of DNA methylation profiles from saliva in Coeliac disease and non-coeliac disease individuals. BMC Med Genomics 2020;13:16.
- 62. Hu S, Uniken Venema WT, Westra HJ, et al. Inflammation status modulates the effect of host genetic variation on intestinal gene expression in inflammatory bowel disease. Nat Commun 2021;12:1122.
- 63. Saeterstad S, Ostvik AE, Royset ES, et al. Profound gene expression changes in the epithelial monolayer of active ulcerative colitis and Crohn’s disease. PLoS One 2022;17:e0265189.
- 64. Alarfaj SJ, Mostafa SA, Negm WA, et al. Mucosal Genes Expression in Inflammatory Bowel Disease Patients: New Insights. Pharmaceuticals (Basel) 2023;16.
- 65. Vatn SS, Lindstrom JC, Moen AEF, et al. Mucosal Gene Transcript Signatures in Treatment Naive Inflammatory Bowel Disease: A Comparative Analysis of Disease to Symptomatic and Healthy Controls in the European IBD-Character Cohort. Clin Exp Gastroenterol 2022;15:5–25.
- 66. Sallese M, Efthymakis K, Marchioni M, et al. Gene Expression Profiling in Coeliac Disease Confirmed the Key Role of the Immune System and Revealed a Molecular Overlap with Non-Celiac Gluten Sensitivity. Int J Mol Sci 2023;24.
- 67. Plaza-Izurieta L, Fernandez-Jimenez N, Irastorza I, et al. Expression analysis in intestinal mucosa reveals complex relations among genes under the association peaks in celiac disease. Eur J Hum Genet 2015;23:1100–5.
- 68. Aboulaghras S, Piancatelli D, Taghzouti K, et al. Meta-Analysis and Systematic Review of HLA DQ2/DQ8 in Adults with Celiac Disease. Int J Mol Sci 2023;24.
- 69. Verne GN, Himes NC, Robinson ME, et al. Central representation of visceral and cutaneous hypersensitivity in the irritable bowel syndrome. Pain 2003;103:99–110.
- 70. Videlock EJ, Mahurkar-Joshi S, Hoffman JM, et al. Sigmoid colon mucosal gene expression supports alterations of neuronal signaling in irritable bowel syndrome with constipation. Am J Physiol Gastrointest Liver Physiol 2018;315:G140–G157.
- 71. Moloney RD, O’Mahony SM, Dinan TG, et al. Stress-induced visceral pain: toward animal models of irritable-bowel syndrome and associated comorbidities. Front Psychiatry 2015;6:15.
- 72. Mahurkar-Joshi S, Chang L. Epigenetic Mechanisms in Irritable Bowel Syndrome. Front Psychiatry 2020;11:805.
- 73. Chen ML, Sundrud MS. Cytokine Networks and T-Cell Subsets in Inflammatory Bowel Diseases. Inflamm Bowel Dis 2016;22:1157–67.
- 74. Sollid LM. The roles of MHC class II genes and post-translational modification in celiac disease. Immunogenetics 2017;69:605–616.
- 75. Talley NJ, Holtmann G, Walker MM, et al. Circulating Anti-cytolethal Distending Toxin B and Anti-vinculin Antibodies as Biomarkers in Community and Healthcare Populations With Functional Dyspepsia and Irritable Bowel Syndrome. Clin Transl Gastroenterol 2019;10:e00064.
- 76. American Gastroenterological Association. AGA Clinical Practice Guidelines on the Laboratory Evaluation of Functional Diarrhea and Diarrhea-Predominant Irritable Bowel Syndrome in Adults (IBS-D): Patient Summary. Gastroenterology 2019;157:856–857.
- 77. Maksimovic J, Gagnon-Bartsch JA, Speed TP, et al. Removing unwanted variation in a differential methylation analysis of Illumina HumanMethylation450 array data. Nucleic Acids Res 2015;43:e106.
- 78. Thompson M, Chen ZJ, Rahmani E, et al. CONFINED: distinguishing biological from technical sources of variation by leveraging multiple methylation datasets. Genome Biol 2019;20:138.
- 79. Ramspek CL, Jager KJ, Dekker FW, et al. External validation of prognostic models: what, why, how, when and where? Clin Kidney J 2021;14:49–58.