Abstract
Ulcerative colitis (UC) is an immune-related inflammatory bowel disease whose underlying mechanisms are a central focus of clinical research. O-GlcNAcylation plays a critical role in regulating immune responses and in the occurrence of inflammatory diseases and tumors, yet the mechanism of O-GlcNAc-associated colitis remains to be elucidated. To this end, the transcriptional and clinical data of GSE75214 and GSE92415 from the GEO database were examined, and four genes, MUC1, ADAMTS1, GXYLT2, and SEMA5A, were identified as O-GlcNAcylation-related hub genes using machine learning methods. Based on these four hub genes, two UC subtypes were defined. Notably, subtype B might be prone to developing colitis-associated colorectal cancer (CAC). This study examined the role of intestinal glycosylation changes, especially O-GlcNAcylation, and lays a foundation for further research on the occurrence and development of UC. Overall, understanding the role of O-GlcNAcylation in UC could have significant implications for diagnosis and treatment, offering valuable insights into the disease’s progression.
Introduction
Ulcerative colitis (UC), an inflammatory bowel disease, has been a persistent challenge for patients over decades. Elucidating the deeper and more precise mechanisms behind UC has been a key focus in clinical research. Immune response holds considerable significance in the occurrence and development of UC [1]. The pathogenesis of UC includes various components of immunoinflammatory pathways related to the intestine, including antigen recognition, immune response, epithelial barrier, and intestinal microbiota [1–3]. In addition, various types of immune cells, such as antigen-presenting cells (dendritic cells and macrophages), T helper cells, regulatory T cells, and natural killer T cells, play vital roles in the pathogenesis of UC by regulating, inhibiting, and maintaining inflammation [4–6]. Addressing immune-related issues is now a critical component of the fundamental research regarding ulcerative colitis. Previous studies have demonstrated a strong link between glycosylation and both colon inflammation and the colonic immune response [7–9].
Glycosylation is a reversible post-translational modification that involves the enzymatic covalent attachment of monosaccharides or glycans to proteins [10]. As an essential protein modification, glycosylation mainly includes N-glycans, O-glycans, and other types [10, 11]. O-GlcNAcylation, a form of O-glycosylation, plays an essential role in regulating innate immune cell function, cell metabolism, and the occurrence of inflammatory diseases and tumors [12]. Research has revealed that the levels of O-GlcNAcylation on proteins change when innate immune cells are stimulated during inflammatory states [13–16]. Consequently, disruption of the O-GlcNAcylation balance in the body can lead to a range of diseases, encompassing intestinal inflammatory disorders, diabetes, neurodegeneration, and even tumors [17–19]. The relationship between intestinal O-GlcNAcylation and ulcerative colitis has attracted increasing attention [9, 12, 20–22].
Intestinal mucosal injury is the most direct manifestation of ulcerative colitis. Most intestinal glycans are mucin-type O-glycans, making up 80% of the mass of human MUC2, the most prevalent intestinal mucin [23]. Intestinal epithelial O-glycans can directly regulate microorganism interactions by providing ligands for bacterial adhesins and nutrients for bacterial metabolism [22]. Various evidence has supported the strong connection between O-glycosylation and ulcerative colitis. Identifying glycosylation biomarkers and their expression changes is essential for diagnosing ulcerative colitis, predicting its progression, and assessing potential complications.
In this study, bioinformatics methods were employed to explore the role of O-GlcNAcylation in the development of ulcerative colitis. Furthermore, multiple machine learning methods were used to identify key genes, on the basis of which UC was classified into two subtypes, with the aim of guiding treatment choice and prognostic judgment in UC.
Materials and methods
Datasets and sample selection
The Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo/) was searched for UC microarray data using the following inclusion criteria: a) expression profiles of the two groups generated on the same platform; b) inclusion of human test samples only; and c) a minimum of ten samples per group. Finally, two datasets, namely GSE75214 (provided by the GPL6244 platform) and GSE92415 (provided by the GPL13158 platform), were incorporated. The GSE75214 dataset contains intestinal mucosal biopsies obtained endoscopically from UC patients (n = 97) and healthy controls (n = 11), followed by microarray analysis to assess gene expression. The GSE92415 dataset enrolled 21 healthy subjects and 162 UC patients, including baseline samples before treatment (n = 87) and post-treatment samples (n = 75), to evaluate the effect of golimumab (GLM) during induction treatment in moderate to severe UC. In this study, only the 87 untreated UC samples and 21 healthy control samples from GSE92415 were selected.
Merge and deduplication of datasets
The genes from both the UC patients and healthy individuals in the GSE75214 dataset were merged with those from GSE92415 to form a combined dataset. The batch effect was then removed to minimize discrepancies between the two datasets, using the R packages "limma" and "sva" [24, 25]. The combined dataset comprised 216 samples and 16467 genes (S1 File). S1A and S1B Fig present the data before and after the merger, respectively. To limit the adverse effects of outlying samples, the dataset was normalized using the R package "preprocessCore". S1C and S1D Fig illustrate the data before and after normalization, respectively.
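The text does not state which preprocessCore routine was applied. As an illustration only, the sketch below assumes quantile normalization, a common choice for microarray data, and implements it in plain Python rather than R. The function name and the toy values are hypothetical, and ties are broken by position rather than averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists of expression values.

    Assumption: this mirrors the general idea of quantile normalization,
    not necessarily the exact routine used in the study.
    """
    n_genes = len(columns[0])
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns) for i in range(n_genes)
    ]
    normalized = []
    for col in columns:
        # Gene indices in ascending order of expression (ties broken by index).
        order = sorted(range(n_genes), key=lambda i: col[i])
        out = [0.0] * n_genes
        for rank, gene_idx in enumerate(order):
            out[gene_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples measured over three genes.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(samples)
print(normalized)
```

After normalization every sample shares the same set of values, so differences between samples reflect rank changes rather than overall intensity shifts.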
Identification of differentially expressed genes
Genes with an adjusted P-value < 0.05 and |log fold change (FC)| above the threshold were considered differentially expressed genes (DEGs) using the "limma" package. The threshold was 1.0 for the comparison between UC and healthy controls and 0.5 for the comparison between the two UC subtypes. The DEGs were visualized with a volcano plot and a heatmap using the "ggplot2" package and the "pheatmap" package, respectively.
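The screening rule above can be sketched as a small filter. This is a minimal Python illustration, not the study's limma code; the function name, gene labels, and numeric values are hypothetical, and the adjusted P-values are assumed to have been computed already.

```python
def classify_gene(log_fc, adj_p, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as up-regulated, down-regulated, or not significant (ns)."""
    if adj_p < p_cutoff and log_fc > lfc_cutoff:
        return "up"
    if adj_p < p_cutoff and log_fc < -lfc_cutoff:
        return "down"
    return "ns"


# Hypothetical (gene, logFC, adjusted P-value) rows.
rows = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.4, 0.02),
    ("GENE_C", 0.7, 0.01),
    ("GENE_D", 1.8, 0.20),
]
calls = {gene: classify_gene(lfc, p) for gene, lfc, p in rows}
print(calls)
```

With the looser cutoff of 0.5, `classify_gene(0.7, 0.01, lfc_cutoff=0.5)` labels GENE_C as up-regulated, which shows how the choice of threshold changes the size of the DEG list.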
Biological function and pathway enrichment analyses
Using the R package "clusterProfiler", Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed to identify the potential functions and signaling pathways associated with the DEGs. The GO analysis included the biological process (BP), cellular component (CC), and molecular function (MF) categories.
Identification and functional enrichment analysis of O-GlcNAcylation differential genes
The O-GlcNAcylation gene set was downloaded from the MSigDB database (https://www.gsea-msigdb.org/). Subsequently, the DEGs between the UC and healthy control groups were extracted and intersected with the O-GlcNAcylation gene set to identify O-GlcNAcylation-associated differential genes. GO and KEGG enrichment analyses were then carried out with the R package "clusterProfiler" [26], and the enrichment results were visualized.
PPI
The STRING database (https://string-db.org/) was used to construct the protein-protein interaction (PPI) network of the proteins encoded by the 7 DEGs and to represent the relationships among them [27].
Machine learning
LASSO is a regression method that performs variable selection to enhance the predictive accuracy and interpretability of statistical models. Random Forest (RF) is a versatile computational method capable of predicting continuous variables. It is adaptable to various conditions and is known for its high accuracy and sensitivity [28]. Support vector machine (SVM) is a supervised machine learning (ML) method capable of learning from data and making decisions [29].
In this study, three machine learning methods, namely LASSO regression (R package glmnet), Random Forest (R package randomForest), and SVM (R package kernel), were used to screen essential differential genes from the seven candidate genes.
Core genes predicting the disease onset
Receiver operating characteristic (ROC) curves were drawn using the R package pROC to evaluate the sensitivity and specificity of the four core genes in predicting disease occurrence, with the X-axis indicating "specificity" and the Y-axis representing "sensitivity" [30, 31]. Each gene yields its own ROC curve, and the area under the curve (AUC) reflects the strength of that gene in predicting disease occurrence.
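To make the meaning of the AUC concrete, the sketch below computes it directly as the probability that a randomly chosen UC sample has a higher expression value than a randomly chosen control. This is a plain-Python illustration of the concept, not the pROC implementation, and the expression values are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values of one gene in UC and control samples.
uc = [7.9, 8.4, 6.8, 9.1]
ctrl = [6.5, 7.0, 6.9]
gene_auc = auc(uc, ctrl)
print(round(gene_auc, 3))
```

Under this convention a gene that is lower in UC than in controls gives a value below 0.5, so its discriminative strength is read as one minus that value.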
Immune landscape of dataset
CIBERSORT (http://cibersort.stanford.edu/) was used to estimate immune cell infiltration in the GSE75214 and GSE92415 samples. Spearman’s method was then employed to assess the correlation between the expression of the four pivotal genes and the immune cell levels within the dataset samples.
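Spearman's correlation is the Pearson correlation computed on ranks. The following sketch shows that calculation in plain Python; the helper names and the paired values (gene expression against an immune cell score) are hypothetical and stand in for the study's data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements across five samples.
gene_expr = [5.1, 6.3, 7.8, 6.9, 9.0]
cell_score = [0.10, 0.25, 0.60, 0.40, 0.55]
rho = spearman(gene_expr, cell_score)
print(round(rho, 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of the expression values and to monotone transformations such as log scaling.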
GSEA analysis of single genes
To characterize the potential functions of the four hub genes, the R clusterProfiler package was used to display the top 20 results of four single-gene GSEA analyses of Reactome [32]. The listed values denoted enrichment scores, with scores above zero indicating a positive correlation between the gene and the pathway, and scores below zero suggesting a negative correlation. The results were then ranked in descending order according to the absolute value of the normalized enrichment score (NES).
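The ranking step described above amounts to sorting pathways by the absolute value of the NES and keeping the leading entries. A minimal Python sketch follows; the pathway labels and NES values are hypothetical placeholders.

```python
def top_by_abs_nes(results, n=20):
    """Return the n (pathway, NES) pairs with the largest |NES|, largest first."""
    return sorted(results, key=lambda r: abs(r[1]), reverse=True)[:n]


# Hypothetical GSEA output as (pathway, NES) pairs.
results = [
    ("Pathway A", 1.8),
    ("Pathway B", -2.4),
    ("Pathway C", 0.9),
    ("Pathway D", -1.1),
]
top = top_by_abs_nes(results, n=3)
print(top)
```

Sorting on the absolute value keeps strongly negative pathways alongside strongly positive ones, while the sign of each retained score still indicates the direction of the association.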
Unsupervised clustering of genes
Based on the four core genes, the R package "ConsensusClusterPlus" was used for unsupervised consensus cluster analysis, which identified two subtypes as optimal. The differential expression of the core genes between the subtypes was then examined, with a p-value less than 0.05 considered significant. The R package pheatmap was used to draw a heatmap showing the expression differences of the four genes among the subtypes.
GSVA analysis of different types of pathways
The KEGG and Reactome pathway gene sets were downloaded from the MSigDB database. The R package GSVA was used to score the pathways and to compare the pathway differences between the two subtypes [33]. Subsequently, the R package “pheatmap” was used to draw a heatmap comparing the two groups.
Biological function and pathway enrichment analysis of two subtypes
A PCA plot was used to show the distribution of the UC samples and the relationship between the two distinct subtypes. Differential analysis was then performed between the subtypes, and GO and KEGG enrichment analyses were conducted on the resulting differential genes. The R package clusterProfiler was used to visualize the enrichment results.
Prediction of miRNAs and transcription factors upstream of genes
The RegNetwork database (https://regnetworkweb.org/) was used to predict miRNAs and transcription factors (TFs) upstream of the core genes. Cytoscape software was then used to construct the network, in which red nodes indicate the core genes [34].
The R language codes related to bioinformatics methods involved in this study have been uploaded as supplementary information (S2 File).
Results
Identification and functional enrichment analysis of DEGs
Upon the merging of the two datasets, namely GSE75214 (provided by the GPL6244 platform) and GSE92415 (provided by the GPL13158 platform), 184 UC samples and 32 healthy controls were ultimately obtained (Table 1).
Table 1. Sample numbers.
The two datasets from the GEO database underwent merging and subsequent normalization (S1 Fig). Subsequently, the limma package in R was used for differential analysis between UC and control samples, and the differentially expressed genes were screened according to the criteria of |logFC| > 1 and adj.P.Val < 0.05. The results showed that 449 genes were up-regulated and 233 were down-regulated (Fig 1A). Heatmap analysis showed significant gene expression differences between the UC and healthy control groups. For example, members of the REG and MMP families were highly expressed in UC and lowly expressed in healthy controls (Fig 1B).
Fig 1. Identification and functional enrichment analysis of DEGs.
(A) The volcano plot showed DEGs between UC and healthy controls in the two GEO datasets. (B) The heatmap showed the differential genes between UC and healthy controls. The screening criteria were set to |logFC| > 1 and adj.P.Val < 0.05. (C-E) The enrichment analysis results of GO, including BP, CC, and MF, revealed the underlying functions of DEGs. (F) KEGG revealed the top twenty pathways of differential gene enrichment.
GO enrichment analysis involving BP, CC, and MF showed that the differential genes between UC and healthy controls were mainly enriched in leukocyte migration, neutrophil and granulocyte chemotaxis, and regulation of immune processes (Fig 1C–1E). In the KEGG analysis, the significantly enriched pathways included the TNF signaling pathway, IL-17 signaling pathway, rheumatoid arthritis, NF-κB signaling pathway, and B-cell receptor signaling pathway (Fig 1F). Overall, the DEGs between UC and healthy controls were primarily enriched in immune and inflammatory pathways.
Screening and functional enrichment of O-GlcNAcylation-associated differential genes
The upregulated UC-associated differential genes and O-GlcNAcylation gene sets were extracted and overlapped. Upon overlapping analysis, four common differential genes were identified, including ADAMTS1, MUC1, ST3GAL1, and THBS2 (Fig 2A). The downregulated differential genes and O-GlcNAcylation gene sets between the UC group and the healthy control group were extracted for intersection analysis, and three common differential genes were identified, including SEMA5A, GXYLT2, and GALNT12, after overlapping analysis (Fig 2B).
Fig 2. Screening and functional enrichment of O-GlcNAcylation-associated differential genes.
(A) Venn diagram of upregulated differential genes and the O-GlcNAcylation gene set. (B) Venn diagram of downregulated differential genes and the O-GlcNAcylation gene set. (C) The enrichment analysis results of GO, including BP, CC, and MF. (D) The KEGG enrichment of DEGs. (E) Mapping between the top 5 KEGG pathways and three differential genes, with different colored lines corresponding to different KEGG pathways.
GO enrichment analysis uncovered that the differential genes were mainly enriched in protein glycosylation and in the biosynthesis and metabolism of glycoproteins (Fig 2C). In the KEGG analysis, the genes were significantly enriched in other types of O-glycan biosynthesis, mucin-type O-glycan biosynthesis, and the PI3K-Akt signaling pathway (Fig 2D). Thus, the seven differential genes were mainly involved in glycosylation biosynthesis and metabolism pathways. Moreover, three critical genes, GXYLT2, GALNT12, and ST3GAL1, were enriched in the top 5 KEGG pathways (Table 2), and their corresponding relationships were labeled (Fig 2E).
Table 2. The KEGG enrichment analysis table of DEGs.
| ID | Description | GeneRatio | BgRatio | pvalue | p.adjust | geneID |
|---|---|---|---|---|---|---|
| hsa00512 | Mucin type O-glycan biosynthesis | 2/5 | 36/9180 | 0.000148427 | 0.00152442 | ST3GAL1/GALNT12 |
| hsa00514 | Other types of O-glycan biosynthesis | 2/5 | 47/9180 | 0.00025407 | 0.00152442 | GXYLT2/GALNT12 |
| hsa00533 | Glycosaminoglycan biosynthesis ‐ keratan sulfate | 1/5 | 14/9180 | 0.007603702 | 0.019548115 | ST3GAL1 |
| hsa00603 | Glycosphingolipid biosynthesis ‐ globo and isoglobo series | 1/5 | 15/9180 | 0.008145048 | 0.019548115 | ST3GAL1 |
| hsa00604 | Glycosphingolipid biosynthesis ‐ ganglio series | 1/5 | 15/9180 | 0.008145048 | 0.019548115 | ST3GAL1 |
Table 2 lists the top 5 KEGG pathways in ascending order of p-value, with the following information for each: ID, Description, GeneRatio, BgRatio, pvalue, p.adjust, and geneID.
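The pvalue column of Table 2 can be related to GeneRatio and BgRatio through an over-representation test. The sketch below assumes the p-value is the upper tail of a hypergeometric distribution, which is the usual choice for this kind of analysis; it is a plain-Python illustration with a hypothetical function name, not the code used in the study.

```python
from math import comb


def enrichment_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N background genes, K in pathway, n drawn).

    k: query genes in the pathway (GeneRatio numerator)
    n: query genes with annotation (GeneRatio denominator)
    K: pathway size (BgRatio numerator)
    N: background size (BgRatio denominator)
    """
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# GeneRatio 2/5 and BgRatio 36/9180 (hsa00512, Table 2).
p_mucin = enrichment_pvalue(k=2, n=5, K=36, N=9180)
# GeneRatio 2/5 and BgRatio 47/9180 (hsa00514, Table 2).
p_other = enrichment_pvalue(k=2, n=5, K=47, N=9180)
print(p_mucin, p_other)
```

Under this assumption the two results agree with the tabulated values of 0.000148427 and 0.00025407 to the precision shown, which illustrates why a pathway with only two hits can still be significant when the pathway itself is small relative to the background.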
Expression and correlation of the hub DEGs
The expression levels of the seven differentially expressed genes identified through intersection analysis were compared between the UC and healthy control groups and visualized using a volcano plot (Fig 3A). The expression of ADAMTS1, MUC1, ST3GAL1, and THBS2 was significantly upregulated in UC, while that of SEMA5A, GXYLT2, and GALNT12 was considerably downregulated in UC (Fig 3B). Using the STRING database (https://string-db.org/), the PPI network of the proteins encoded by the 7 DEGs was constructed. The resulting interaction network, comprising 7 nodes and 18 edges, was visualized using Cytoscape software (Fig 3C).
Fig 3. Expression and correlation of the hub DEGs.
(A) The volcano plot of the seven differential genes presented separately. (B) Expression analysis of the 7 differential genes in UC and healthy controls (plotted with the ggplot2 package). ns p>0.05, *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001. (C) Interaction network of the proteins encoded by the 7 DEGs. The network nodes represent proteins, while the lines indicate predicted relationships: light blue represents curated database evidence, purple represents experimental evidence, yellow represents text mining evidence, green represents gene neighborhood, red represents gene fusion, blue represents gene co-occurrence, black represents gene co-expression, and gray represents protein homology.
Machine learning screening for key differential genes
To further screen out the hub genes with the most guiding value from the 7 DEGs, the most important features were selected using three machine learning algorithms. The LASSO logistic regression algorithm, RF analysis, and SVM algorithm were carried out successively, and four key genes were selected based on the results of the three algorithms.
LASSO analysis identified 5 tag genes, namely ADAMTS1, MUC1, ST3GAL1, SEMA5A, and GXYLT2 (Fig 4A). In the RF analysis, 5 tag genes were selected in order of relative importance, namely SEMA5A, ADAMTS1, MUC1, GXYLT2, and THBS2 (Fig 4B). Moreover, 6 tag genes, namely ADAMTS1, SEMA5A, MUC1, GXYLT2, THBS2, and GALNT12, were identified using SVM (Fig 4C). Four core genes were finally recognized through the intersection of these three algorithms, namely GXYLT2, MUC1, ADAMTS1, and SEMA5A (Fig 4D). Subsequently, the correlations among the 4 core genes screened by machine learning were evaluated, with red representing positive and green indicating negative correlations (Fig 4E).
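The intersection step can be checked directly from the three gene lists reported above. The following Python sketch uses only the gene names given in the text; the variable names are illustrative.

```python
# Tag genes reported for each algorithm in the text above.
lasso = {"ADAMTS1", "MUC1", "ST3GAL1", "SEMA5A", "GXYLT2"}
rf = {"SEMA5A", "ADAMTS1", "MUC1", "GXYLT2", "THBS2"}
svm = {"ADAMTS1", "SEMA5A", "MUC1", "GXYLT2", "THBS2", "GALNT12"}

# Genes selected by all three methods.
core_genes = sorted(lasso & rf & svm)
# Genes selected by at least one method but not by all three.
dropped = sorted((lasso | rf | svm) - set(core_genes))

print(core_genes)
print(dropped)
```

The intersection returns ADAMTS1, GXYLT2, MUC1, and SEMA5A, while GALNT12, ST3GAL1, and THBS2 are each missed by at least one method. Requiring agreement across all three algorithms is a conservative rule that trades sensitivity for robustness of the selected features.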
Fig 4. Machine learning screening for key differential genes.
(A) LASSO regression screening of 5 genes. (B) RF selected 5 genes in order of importance. (C) SVM screened 6 genes. (D) Intersection obtained 4 core genes. (E) The correlation between the 4 core genes, with red representing a positive correlation and green indicating a negative correlation. (F) ROC curve of 4 genes predicting disease occurrence.
A significant correlation between SEMA5A, GXYLT2, and MUC1 expression was observed, with SEMA5A showing a strong positive correlation with GXYLT2 expression and a strong negative correlation with MUC1 expression. ADAMTS1 was negatively correlated with the expression of GXYLT2 but exhibited no significant correlation with the expression of MUC1. In addition, there was a significant negative correlation between MUC1 and GXYLT2 expression. As indicated by the ROC curves, the four hub genes demonstrated a strong predictive power for UC (GXYLT2 AUC = 0.923, MUC1 AUC = 0.898, ADAMTS1 AUC = 0.955, and SEMA5A AUC = 0.944) (Fig 4F).
Evaluation of the degree of immune cell infiltration
In this study, the relationships among immune cells in UC were also investigated. Most immune cell types, including activated B cells, activated CD4 and CD8 T cells, and natural killer cells, were positively correlated with each other, whereas Type 17 helper cells were negatively correlated with activated B cells and activated CD8 T cells (Fig 5A). CIBERSORT was employed to further demonstrate the difference in immune cell infiltration between the UC and healthy control groups. The results showed that the levels of activated B cells, activated CD4 and CD8 T cells, natural killer cells, and other immune cells in UC patients were significantly higher than those in the healthy control group. In contrast, no significant difference in the levels of Type 17 helper cells and CD56dim natural killer cells was identified between the UC and healthy control groups (Fig 5B).
Fig 5. Evaluation of the degree of immune cell infiltration.
(A) Correlation analysis between immune cells. (B) Differences in immune cell infiltration between UC and healthy control group, ns p>0.05, *p<0.05, **p<0.01, ***p<0.001. (C) Correlation analysis between 4 core genes and immune cells.
Furthermore, the connection between the 4 core genes and immune cells was examined. The findings revealed a significant correlation between the expression of these core genes and the levels of activated CD4 T cells, natural killer cells, Type 17 helper cells, and CD56dim natural killer cells. Among them, ADAMTS1 was significantly positively correlated with natural killer T cells and activated CD4 T cells while being significantly negatively correlated with Type 17 helper cells and CD56dim natural killer cells. Meanwhile, GXYLT2 and SEMA5A were significantly negatively correlated with activated dendritic cells, activated CD4 T cells, natural killer cells, Type 17 helper cells, and CD56dim natural killer cells. Moreover, there was a significant positive correlation between the expression of MUC1 and activated dendritic cells, natural killer cells, activated CD4 T cells, and Type 17 helper cells (Fig 5C).
Single gene enrichment analysis
Given the significant role of the four hub genes, the genes whose expression correlated with that of ADAMTS1, GXYLT2, MUC1, and SEMA5A were analyzed. The heatmaps revealed the top 50 genes positively co-expressed with each of the four core genes (S2A–S2D Fig).
Single-gene GSEA was performed to characterize the potential function of the four hub genes. The ridgeline plots displayed only the top 20 results. Details are shown in Fig 6, where the values represent enrichment scores: a value above 0 indicates a positive correlation between a gene and a pathway, while a value below 0 indicates a negative correlation.
Fig 6. Single gene enrichment analysis.
(A) GSEA analysis for ADAMTS1. (B) GSEA analysis for GXYLT2. (C) GSEA analysis for MUC1. (D) GSEA analysis for SEMA5A.
Almost all pathways identified were related to immunity and inflammation, including antigen processing-cross presentation, signaling by interleukins, integrin cell surface interactions, and interferon signaling. Meanwhile, MUC1 showed a negative correlation with both asparagine N-linked glycosylation and O-linked glycosylation of mucins. Additionally, GXYLT2 and SEMA5A exhibited a negative correlation with collagen formation, whereas ADAMTS1 displayed a positive correlation with the same process (Fig 6A–6D).
Unsupervised consensus clustering analysis of gene expression profiles revealed two subtypes of UC
An unsupervised consensus clustering analysis was conducted based on the four hub genes, with all UC samples initially divided into k (k = 2–9) clusters. The cumulative distribution function (CDF) curves of the consensus score matrix indicated that the optimal number of clusters was k = 2. Consequently, two distinct subtypes of UC were identified (Fig 7A), with 106 samples in subtype A and 78 in subtype B. The four genes exhibited remarkable differences in expression between the two subtypes (p<0.05). The expression of all other genes in subtype A was higher than that in subtype B, except for MUC1 (Fig 7B). Furthermore, a heatmap was drawn with the R package “pheatmap” to display more intuitively the expression differences of the four genes across the 184 samples of the two subtypes. GXYLT2, ADAMTS1, and SEMA5A were significantly upregulated in subtype B, while MUC1 was upregulated considerably in subtype A, further validating the presence of diverse subtypes in UC (Fig 7C).
Fig 7. Identification and validation of ulcerative colitis subtypes.
(A) Heatmap of sample clustering at consensus k = 2. (B) The expression status of four hub genes in the two subtypes, ***p<0.001. (C) Heatmap of four hub genes in the two subtypes.
GSVA of biological pathways between two subtypes
GSVA enrichment was performed to explore the biological behavior and pathway differences of the two clusters. The GSVA enrichment analysis showed that the two subtypes differed significantly in the metabolism of various substances. The pathways were ordered by ascending P value, and the top 20 were selected for display in a heatmap and for further analysis.
The results of the KEGG analysis showed that subtype A was enriched in base excision repair and in metabolic pathways, including galactose, fructose and mannose, amino sugar and nucleotide sugar, and glycerolipid metabolism. In contrast, subtype B was frequently involved in cancer-related pathways, such as non-small cell lung cancer, colorectal cancer, chronic myeloid leukemia, and endometrial cancer. It could therefore be speculated that subtype B of UC might be more likely to develop into colitis-associated colorectal cancer (CAC) (Fig 8A).
Fig 8. The diversity of the underlying biological function characteristics between the two subtypes.
(A) The differences in KEGG pathway enrichment score between subtypes A and B. (B) The differences in Reactome pathway enrichment score between subtypes A and B.
Furthermore, the results of the Reactome analysis indicated that subtype A was enriched in nucleotide catabolism and purine catabolism pathways. In contrast, subtype B was enriched in the pathways of ESR-mediated signaling, signaling by nuclear receptors, IGF1R signaling, RUNX2 regulates osteoblast differentiation, RUNX2 regulates bone development, and glutamate and glutamine metabolism (Fig 8B).
Differential genes and enrichment analysis of the two subtypes
The principal component analysis (PCA) demonstrated that UC patients were well distributed into two clusters (Fig 9A). The PCA offered a holistic and clear visual representation, mapping all samples and highlighting the separation between groups. The substantial distance between subtypes A and B indicated pronounced distinctions between them.
Fig 9. Differential genes and enrichment analysis of the two subtypes.
(A) PCA demonstrating a distinctive difference between the two clusters. (B) Volcano plot of the 229 DEGs. The threshold for the volcano plot was |logFC| > 0.5 and adj.P.Val < 0.05. (C) GO enrichment analysis showing the BP, CC, and MF parts. (D) The bubble plot depicting the KEGG pathway enrichment analysis of DEGs. (E) The correspondence between the top five KEGG pathways and genes.
Differential expression analysis between the two subtypes yielded 229 DEGs, of which 105 were markedly upregulated and 124 were significantly downregulated (Fig 9B). Following that, GO and KEGG analyses of the DEGs were performed to further interpret the clustering results from the perspective of fundamental biological processes. The top ten results of the GO enrichment analyses were displayed for BP, CC, and MF (Fig 9C). For BP, the DEGs were enriched in the regulation of peptidase activity and the response to peptide hormone. For CC, the DEGs were primarily associated with the collagen-containing extracellular matrix, apical part of the cell, and apical plasma membrane. For MF, the DEGs were mainly enriched in extracellular matrix structural constituent, receptor ligand activity, and signaling receptor activator activity. Additionally, KEGG analysis showed that the DEGs were primarily involved in inflammation, immunity, and infectious diseases (Fig 9D).
According to the KEGG enrichment analysis, the top 5 significant pathways of the DEGs and their related genes were identified, namely IL-17 signaling pathway, Viral protein interaction with cytokine and cytokine receptor, Amoebiasis, Pertussis, and Cytokine-cytokine receptor interaction (Fig 9E, Table 3).
Table 3. The KEGG enrichment analysis table of the DEGs of the two subtypes.
| ID | Description | GeneRatio | BgRatio | pvalue | p.adjust | geneID |
|---|---|---|---|---|---|---|
| hsa04657 | IL-17 signaling pathway | 9/110 | 94/9180 | 1.67E-06 | 0.000330056 | LCN2/MUC5B/S100A7/CCL2/CCL20/PTGS2/FOS/IL6/CXCL1 |
| hsa04061 | Viral protein interaction with cytokine and cytokine receptor | 8/110 | 100/9180 | 2.45E-05 | 0.001604571 | CXCL12/IL22RA1/CCL28/CCL2/CXCR4/CCL20/IL6/CXCL1 |
| hsa05146 | Amoebiasis | 8/110 | 102/9180 | 2.84E-05 | 0.001604571 | NOS2/FN1/IL1R2/LAMC2/SERPINB4/IL6/CXCL1/SERPINB3 |
| hsa05133 | Pertussis | 7/110 | 76/9180 | 3.24E-05 | 0.001604571 | C4BPA/NOS2/CASP1/C4BPB/C1S/FOS/IL6 |
| hsa04060 | Cytokine-cytokine receptor interaction | 13/110 | 295/9180 | 4.84E-05 | 0.00183806 | CXCL12/LIFR/IL1R2/CXCL17/IL22RA1/CCL28/GHR/CCL2/CXCR4/CCL20/TNFRSF11B/IL6/CXCL1 |
Table 3 lists the top 5 KEGG pathways in ascending order of p-value, with the following information for each: ID, Description, GeneRatio, BgRatio, pvalue, p.adjust, and geneID.
Prediction of miRNAs and transcription factors
To determine the upstream TFs and miRNAs of the hub genes, 56 TFs and 49 miRNAs were obtained from the RegNetwork repository (https://regnetworkweb.org/), and a co-regulatory network was constructed and visualized using Cytoscape (Fig 10). MUC1, in the core position of the network, was regulated by 33 TFs and 27 miRNAs. SEMA5A and MUC1 shared the TFs SP1 and MEF2A and the miRNA hsa-miR-519e. MUC1 and ADAMTS1 had four common TFs, namely TFAP2A, STAT1, STAT3, and CTCF. GXYLT2 had only one upstream miRNA, hsa-miR-37, and two TFs, HNF4A and NR2F1.
Fig 10. TF–miRNA co-regulatory network analysis, with red nodes representing hub genes, and blue nodes indicating TFs and miRNAs.
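Shared regulators in such a network can be found by intersecting the regulator sets of each pair of hub genes. The Python sketch below includes only the regulators named explicitly in the text, not the full set of 56 TFs and 49 miRNAs, so the dictionary is a partial and illustrative stand-in for the RegNetwork output.

```python
from itertools import combinations

# Partial regulator sets: only the TFs and miRNAs named in the text.
regulators = {
    "MUC1": {"SP1", "MEF2A", "hsa-miR-519e", "TFAP2A", "STAT1", "STAT3", "CTCF"},
    "SEMA5A": {"SP1", "MEF2A", "hsa-miR-519e"},
    "ADAMTS1": {"TFAP2A", "STAT1", "STAT3", "CTCF"},
    "GXYLT2": {"HNF4A", "NR2F1"},
}

# For every pair of hub genes, keep the regulators they have in common.
shared = {
    (a, b): sorted(regulators[a] & regulators[b])
    for a, b in combinations(sorted(regulators), 2)
    if regulators[a] & regulators[b]
}

for pair, common in shared.items():
    print(pair, common)
```

On this partial input, only the pairs involving MUC1 share regulators, which is consistent with its central position in the network described above.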
Discussion
Glycosylation is the process of attaching various sugars to proteins through glycosidic bonds, representing the most prevalent post-translational modification across all cellular organisms. Glycosylation enhances the stability of proteins, primarily involving N-glycans and O-glycans, with enzymes overseeing the entire process [10]. O-GlcNAc transferase (OGT) adds the O-linked β-N-acetylglucosamine (O-GlcNAc) monosaccharides to the serine or threonine residues of nuclear or cytoplasmic protein [35]. O-GlcNAcase (OGA) can then remove the monosaccharide reversibly [36]. Most protein glycosylation occurs in the endoplasmic reticulum (ER) and Golgi. O-glycosylation, in particular, regulates immune cells’ development, homeostasis, and functions [37, 38].
As one type of IBD, UC is characterized by an abnormal immune response to the gut microbiota. The prevalence of UC is escalating year by year [39]. Mucosal lesions usually originate in the rectum and may spread to the entire colon as the disease progresses [3]. The extra-intestinal manifestations also influence the quality of life and even cause disability, including anemia, arthropathy, metabolic bone disease, and hepatobiliary disease [40]. Immune cells play an essential role in the occurrence and development of UC. Antigen-presenting cells (APCs), such as macrophages and dendritic cells (DCs), can recognize antigens and initiate the immune response by releasing cytokines like IL-12. L-12 is instrumental in driving the differentiation of Th1 cells, which in turn secrete the pro-inflammatory cytokines TNF-α, IFN-γ, and IL-2 [41, 42].
Gut microbiota influences intestinal physiology and emphasizes the potential of bacterial OGAs as a promising therapeutic strategy in colonic inflammation by hydrolyzing O-GlcNAcylated proteins [43]. Moreover, Qian-Hui Sun et al. identified increased O-GlcNAc level in the gut epithelium of AIEC LF82-infected mice and CD patients, linking the change to intestinal inflammation [17]. In addition, in dextran sodium sulfate (DSS)-induced colitis and azoxymethane (AOM)/DSS-induced CAC mice models, the O-GlcNAcylation of colonic tissues was also elevated. Compared to normal colonic tissues, human CAC tissues’ O-GlcNAcylation was increased [44]. Many studies have implicated O-GlcNAcylation as a contributing factor in the promotion of chronic colonic inflammation.
However, the relationship between O-GlcNAcylation and UC has not been well studied, making it necessary to explore the specific molecular mechanism of O-GlcNAcylation in UC. Herein, efforts were made to determine the possible role of O-glycosylation in UC through bioinformatic analysis. Specifically, the GSE75214 and GSE92415 datasets downloaded from GEO were analyzed to identify DEGs in UC patients. The GO and KEGG analyses revealed that the DEGs were enriched in immunity, inflammation, and cytokine signaling pathways.
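As a rough illustration of the DEG screening step described above, the sketch below computes a per-gene log2 fold change between UC and control samples and keeps genes whose absolute change exceeds a cutoff. The gene names, expression values, cutoff of 1.0, and function names are all hypothetical. The study cites the limma R package [24], whose moderated statistics and multiple-testing correction are not reproduced here.

```python
from statistics import mean

def log_fold_changes(expr, groups):
    """Return {gene: mean(UC) - mean(control)} for log2-scale expression."""
    result = {}
    for gene, values in expr.items():
        uc = [v for v, g in zip(values, groups) if g == "UC"]
        ctrl = [v for v, g in zip(values, groups) if g == "control"]
        result[gene] = mean(uc) - mean(ctrl)
    return result

def select_degs(expr, groups, cutoff=1.0):
    """Keep genes whose absolute log2 fold change exceeds the cutoff.

    Simplified stand-in: no p-values or multiple-testing correction.
    """
    lfc = log_fold_changes(expr, groups)
    return sorted(gene for gene, delta in lfc.items() if abs(delta) > cutoff)

# Toy data (hypothetical): three control and three UC samples per gene.
groups = ["control", "control", "control", "UC", "UC", "UC"]
expr = {
    "GENE_A": [5.0, 5.2, 4.8, 7.1, 7.3, 6.9],  # higher in UC
    "GENE_B": [6.0, 6.1, 5.9, 6.0, 6.2, 5.8],  # essentially unchanged
    "GENE_C": [8.0, 8.2, 7.8, 6.0, 6.2, 5.8],  # lower in UC
}

degs = select_degs(expr, groups)
print(degs)
```

In a real analysis the fold-change filter would be combined with an adjusted p-value threshold, which is the part this sketch deliberately leaves out.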
To explore the relationship between O-GlcNAcylation and UC, the DEGs in UC were intersected with 151 O-GlcNAcylation-related genes, and a total of seven overlapping DEGs were detected. Furthermore, GO and KEGG analyses were conducted for the seven DEGs. Through the LASSO, SVM-RFE, and RF algorithms, four O-glycosylation-related hub DEGs in UC were screened, including MUC1, ADAMTS1, GXYLT2, and SEMA5A. The AUC values of the four hub genes were high, suggesting that these four hub genes are potential target genes for treating UC through O-glycosylation. This offered a new direction for exploring the role of O-glycosylation in UC. Given the importance of immunity in UC, an immune infiltration analysis was further performed. The results revealed a significant difference in immune cell infiltration between the normal group and UC patients. UC patients had higher levels of DCs, Th1 cells, Tregs, B cells, CD4+ T cells, CD8+ T cells, macrophages, and neutrophils compared to their normal counterparts. The results were highly consistent with those of previous studies, underscoring the importance of immune cells in the pathogenesis of UC.
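The hub-gene screening and AUC evaluation above can be sketched as follows. This section does not state how the outputs of the three algorithms were combined, so taking their intersection is an assumption, as are the placeholder genes GENE_X and GENE_Y and the toy expression values. The AUC is computed with the Mann-Whitney formulation, that is, the fraction of UC/control sample pairs in which the UC sample has the higher expression.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive sample scores higher than a
    negative one (Mann-Whitney formulation); ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical feature sets returned by each algorithm. GENE_X and GENE_Y
# are placeholders; only the four hub genes are named in the text.
lasso = {"MUC1", "ADAMTS1", "GXYLT2", "SEMA5A", "GENE_X"}
svm_rfe = {"MUC1", "ADAMTS1", "GXYLT2", "SEMA5A", "GENE_Y"}
random_forest = {"MUC1", "ADAMTS1", "GXYLT2", "SEMA5A"}

# Assumption: hub genes are those selected by all three algorithms.
hub_genes = sorted(lasso & svm_rfe & random_forest)
print(hub_genes)

# Toy expression values (hypothetical) of one gene in UC and control samples.
uc_expr = [7.1, 6.8, 7.5, 6.2]
control_expr = [5.0, 6.5, 5.4, 4.9]

# 15 of the 16 UC/control pairs are ordered correctly, so the AUC is 15/16.
gene_auc = auc(uc_expr, control_expr)
print(gene_auc)
```

An AUC near 1 means the gene's expression separates the two groups almost perfectly, while 0.5 corresponds to no discrimination.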
MUC1, a member of the mucin family and a membrane-bound protein, is secreted by goblet and absorptive cells of the intestinal epithelium and plays a role in the mucus layer [45]. It is highly expressed in the epithelial mucosa of the gastrointestinal tract. Mucins are O-glycosylated proteins that can form protective mucous barriers [46]. The membrane shift and overexpression of MUC1 affect the prognosis of related malignant tumors, such as colon cancer [47, 48]. In addition, MUC1 holds considerable significance in intracellular signaling and immune regulation, especially in colonic inflammation [49–51]. Increased MUC1 expression is often associated with a decrease in the beneficial gut microbiota [52]. The breakdown of the mucus barrier and dysregulation of the intestinal microflora can promote the onset and progression of UC. Yet, the precise mechanism of MUC1’s involvement in UC requires additional research.
ADAMTS1 (a disintegrin-like and metalloprotease with thrombospondin type 1 motif 1) is a protein-coding gene whose related pathways include diseases associated with O-glycosylation. It plays a vital role in inflammatory processes and the development of cancer [53, 54] and exhibits angiogenesis-inhibiting activity [55]. Compared to the control group, the ADAMTS1 level in UC patients was hereby found to be higher and correlated with IL-17. IL-17 may damage the intestinal wall by promoting the expression of ADAMTS1 [56]. However, the mechanisms by which ADAMTS1 operates in UC are not fully understood and warrant additional research.
GXYLT2 (glucoside xylosyltransferase 2) encodes a xylosyltransferase that catalyzes the addition of xylose to the O-glucose-modified residues of EGF repeats of Notch proteins [57]. Compared to quiescent UC, the expression of GXYLT2 in active UC is elevated, which may help in assessing disease activity [58]. Meanwhile, Barnicle et al. found that the methylation of GXYLT2 differed between inflamed tissues and their non-inflamed counterparts in UC patients, and that the Wnt signaling pathway was involved [59]. The dysregulation of the Wnt and Notch signaling pathways, which are associated with the proliferation and differentiation of intestinal stem cells (ISCs), induces cell overgrowth and malignant transformation. In UC, the inhibition of Wnt and overexpression of Notch reduce the number of Paneth cells, thereby leading to intestinal barrier damage [60].
The last of the four hub genes is SEMA5A, on which only a limited number of studies are currently available. Using differentially expressed lncRNAs to predict target genes, Benhai Xiong et al. investigated the effects of extracellular vesicles (EVs) on the expression of the Sema5a gene in DSS-treated mice. The results showed a considerable down-regulation of Sema5a expression [61]. In addition, the axon guidance cue Sema5a may induce the expression of pro-inflammatory genes (TNF-α and IL-8) [62].
Currently, UC classification is primarily based on the severity of the disease, categorized as mild, moderate, and severe. Classification at the genetic level, however, remains understudied. Therefore, UC was hereby further grouped into two subtypes by unsupervised consensus clustering, based on the expression of the four hub genes identified through machine learning. The expression of the four hub genes varied significantly between the two subtypes. The expression trends of the four hub genes in subtype B were consistent with those reported in previous research. Furthermore, GSVA enrichment analysis showed that subtype A was enriched in various metabolic pathways, such as glucose metabolism, lipid metabolism, and amino acid metabolism. In contrast, subtype B was significantly enriched in cancer-related pathways, such as colorectal cancer. This finding suggested that subtype B of UC could potentially progress to CAC. Moreover, GO and KEGG enrichment analyses of the DEGs between subtypes A and B were conducted to further identify the differences between the two subtypes. The results, which were enriched in cytokine-related signaling pathways and immune-related diseases, underscore the significance of distinguishing between the two subtypes and call for further research.
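The idea behind consensus clustering can be sketched as follows: samples are repeatedly subsampled and clustered, and for every pair of samples the fraction of runs in which they land in the same cluster is recorded. The base clustering algorithm and resampling settings used in the study are not stated in this section, so the 2-means clusterer, the 80% subsampling, the 50 resamples, the fixed seed, and the toy expression matrix below are all assumptions made for illustration.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, iters=20):
    """Tiny 2-means with farthest-point initialisation; returns 0/1 labels."""
    c0 = points[0]
    c1 = max(points, key=lambda p: dist2(p, c0))
    centers = [c0, c1]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
                  for p in points]
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus_matrix(data, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each co-sampled pair shares a cluster."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(2, int(frac * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = two_means([data[i] for i in idx])
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Toy matrix (hypothetical): 6 samples x 4 hub-gene expression values.
data = [
    [2.0, 1.0, 1.2, 3.0],
    [2.1, 1.1, 1.0, 3.2],
    [1.9, 0.9, 1.1, 2.9],
    [6.0, 5.0, 4.8, 0.5],
    [6.2, 5.1, 5.0, 0.4],
    [5.9, 4.9, 5.1, 0.6],
]
cm = consensus_matrix(data)

# Label samples by their consensus with sample 0; "A"/"B" are arbitrary names.
subtypes = ["A" if cm[0][j] > 0.5 else "B" for j in range(len(data))]
print(subtypes)
```

Consensus values close to 0 or 1 indicate stable cluster assignments, whereas intermediate values flag samples whose subtype depends on which other samples were drawn.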
In conclusion, two subtypes of UC were hereby identified. Each possessed distinct molecular features, biological behavior, and clinical characteristics. Overall, the classification provides a basis for further studies on the therapy and prognosis of UC. However, this study is still subject to several limitations. Firstly, a comparative analysis of survival curves for the two UC subtypes was not conducted. Secondly, the study relied solely on bioinformatic methods, and its findings have yet to be experimentally validated. Thirdly, the sampling method did not exclude the effects of age, gender, disease severity, complications, and therapeutic approaches. Further efforts could be made to compare hub gene expression levels between DSS-treated mice and control mice, and between UC patients and healthy individuals, and to explore the prognosis of the different UC subtypes and their effects on CAC.
In summary, further in-depth research is warranted to clarify the mechanisms of O-GlcNAcylation, the expression of the hub genes, and the clinical significance of the two subtypes in UC.
Supporting information
(A-B) GSE75214 and GSE92415 were combined, and the R packages "limma" and "sva" were used to remove batch effects, resulting in 16,467 genes and 216 samples. A: before merging; B: after merging. (C-D) The R package preprocessCore was used to normalize the dataset. C: before normalization; D: after normalization.
(TIF)
The positive correlation between the top 50 genes and 4 hub genes was displayed using heatmaps.
(TIF)
The genes from both the UC patients and healthy individuals in the GSE75214 dataset were merged with those from GSE92415 to form a comprehensive dataset. The batch effect was then eliminated to minimize discrepancies between the different datasets. There were a total of 16,467 genes after combination.
(CSV)
The R code used for the bioinformatic analyses in this study.
(DOCX)
Acknowledgments
The authors would like to thank Shuyuan Zhang of the First Affiliated Hospital of Harbin Medical University for helpful discussions on topics related to this work. The authors also thank Dr. Yonghui Wu, whose support, stimulating discussions, and creative ideas were essential to completing this work.
Data Availability
The data underlying the results presented in the study are available from the GEO database (www.ncbi.nlm.nih.gov/geo/) at the following accession numbers: Accession Number GSE75214 - https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75214 Accession Number GSE92415 - https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92415
Funding Statement
This study was funded by the Natural Science Foundation of Heilongjiang Province, LH2020H037, Hongyu Xu.
References
- 1. Zhang SZ, Zhao XH, Zhang DC. Cellular and molecular immunopathogenesis of ulcerative colitis. Cell Mol Immunol. 2006; 3:35–40.
- 2. Lavelle A, Sokol H. Gut microbiota-derived metabolites as key actors in inflammatory bowel disease. Nat Rev Gastroenterol Hepatol. 2020; 17:223–37. doi: 10.1038/s41575-019-0258-z
- 3. Le Berre C, Honap S, Peyrin-Biroulet L. Ulcerative colitis. Lancet. 2023; 402:571–84. doi: 10.1016/S0140-6736(23)00966-2
- 4. Na YR, Stakenborg M, Seok SH, Matteoli G. Macrophages in intestinal inflammation and resolution: a potential therapeutic target in IBD. Nat Rev Gastroenterol Hepatol. 2019; 16:531–43. doi: 10.1038/s41575-019-0172-4
- 5. Mudter J, Neurath MF. Il-6 signaling in inflammatory bowel disease: pathophysiological role and clinical relevance. Inflamm Bowel Dis. 2007; 13:1016–23. doi: 10.1002/ibd.20148
- 6. Mitsialis V, Wall S, Liu P, Ordovas-Montanes J, Parmet T, Vukovic M, et al. Single-Cell Analyses of Colon and Blood Reveal Distinct Immune Cell Signatures of Ulcerative Colitis and Crohn’s Disease. Gastroenterology. 2020; 159:591–608.e10. doi: 10.1053/j.gastro.2020.04.074
- 7. Brazil JC, Parkos CA. Finding the sweet spot: glycosylation mediated regulation of intestinal inflammation. Mucosal Immunol. 2022; 15:211–22. doi: 10.1038/s41385-021-00466-8
- 8. Biermann MH, Griffante G, Podolska MJ, Boeltz S, Stürmer J, Muñoz LE, et al. Sweet but dangerous ‐ the role of immunoglobulin G glycosylation in autoimmunity and inflammation. Lupus. 2016; 25:934–42. doi: 10.1177/0961203316640368
- 9. Theodoratou E, Campbell H, Ventham NT, Kolarich D, Pučić-Baković M, Zoldoš V, et al. The role of glycosylation in IBD. Nat Rev Gastroenterol Hepatol. 2014; 11:588–600. doi: 10.1038/nrgastro.2014.78
- 10. Eichler J. Protein glycosylation. Curr Biol. 2019; 29(7):R229–R231. doi: 10.1016/j.cub.2019.01.003
- 11. Torres CR, Hart GW. Topography and polypeptide distribution of terminal N-acetylglucosamine residues on the surfaces of intact lymphocytes. Evidence for O-linked GlcNAc. J Biol Chem. 1984; 259:3308–17.
- 12. Magalhães A, Duarte HO, Reis CA. The role of O-glycosylation in human disease. Mol Aspects Med. 2021; 79:100964. doi: 10.1016/j.mam.2021.100964
- 13. Kearse KP, Hart GW. Lymphocyte activation induces rapid changes in nuclear and cytoplasmic glycoproteins. Proc Natl Acad Sci U S A. 1991; 88:1701–5. doi: 10.1073/pnas.88.5.1701
- 14. Li T, Li X, Attri KS, Liu C, Li L, Herring LE, et al. O-GlcNAc Transferase Links Glucose Metabolism to MAVS-Mediated Antiviral Innate Immunity. Cell Host Microbe. 2018; 24:791–803.e6. doi: 10.1016/j.chom.2018.11.001
- 15. Li X, Gong W, Wang H, Li T, Attri KS, Lewis RE, et al. O-GlcNAc Transferase Suppresses Inflammation and Necroptosis by Targeting Receptor-Interacting Serine/Threonine-Protein Kinase 3. Immunity. 2019; 50:576–90.e6. doi: 10.1016/j.immuni.2019.01.007
- 16. Lund PJ, Elias JE, Davis MM. Global Analysis of O-GlcNAc Glycoproteins in Activated Human T Cells. J Immunol. 2016; 197:3086–98. doi: 10.4049/jimmunol.1502031
- 17. Sun QH, Wang YS, Liu G, Zhou HL, Jian YP, Liu MD, et al. Enhanced O-linked Glcnacylation in Crohn’s disease promotes intestinal inflammation. EBioMedicine. 2020; 53:102693. doi: 10.1016/j.ebiom.2020.102693
- 18. Li X, Zhang Z, Li L, Gong W, Lazenby AJ, Swanson BJ, et al. Myeloid-derived cullin 3 promotes STAT3 phosphorylation by inhibiting OGT expression and protects against intestinal inflammation. J Exp Med. 2017; 214:1093–109. doi: 10.1084/jem.20161105
- 19. Liu F, Iqbal K, Grundke-Iqbal I, Hart GW, Gong CX. O-GlcNAcylation regulates phosphorylation of tau: a mechanism involved in Alzheimer’s disease. Proc Natl Acad Sci U S A. 2004; 101:10804–9. doi: 10.1073/pnas.0400348101
- 20. Wei J, Chen C, Feng J, Zhou S, Feng X, Yang Z, et al. Muc2 mucin O-glycosylation interacts with enteropathogenic Escherichia coli to influence the development of ulcerative colitis based on the NF-kB signaling pathway. J Transl Med. 2023; 21:793. doi: 10.1186/s12967-023-04687-2
- 21. Fu J, Wei B, Wen T, Johansson ME, Liu X, Bradford E, et al. Loss of intestinal core 1-derived O-glycans causes spontaneous colitis in mice. J Clin Invest. 2011; 121:1657–66. doi: 10.1172/JCI45538
- 22. Kudelka MR, Stowell SR, Cummings RD, Neish AS. Intestinal epithelial glycosylation in homeostasis and gut microbiota interactions in IBD. Nat Rev Gastroenterol Hepatol. 2020; 17(10):597–617. doi: 10.1038/s41575-020-0331-7
- 23. Johansson ME, Larsson JM, Hansson GC. The two mucus layers of colon are organized by the MUC2 mucin, whereas the outer layer is a legislator of host-microbial interactions. Proc Natl Acad Sci U S A. 2011; 108 Suppl 1(Suppl 1):4659–65. doi: 10.1073/pnas.1006451107
- 24. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015; 43:e47. doi: 10.1093/nar/gkv007
- 25. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012; 28:882–3. doi: 10.1093/bioinformatics/bts034
- 26. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012; 16:284–7. doi: 10.1089/omi.2011.0118
- 27. Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017; 45:D362–D368. doi: 10.1093/nar/gkw937
- 28. Ellis K, Kerr J, Godbole S, Lanckriet G, Wing D, Marshall S. A random forest classifier for the prediction of energy expenditure and type of physical activity from wrist and hip accelerometers. Physiol Meas. 2014; 35:2191–203. doi: 10.1088/0967-3334/35/11/2191
- 29. Valkenborg D, Rousseau AJ, Geubbelmans M, Burzykowski T. Support vector machines. Am J Orthod Dentofacial Orthop. 2023; 164:754–7. doi: 10.1016/j.ajodo.2023.08.003
- 30. Obuchowski NA, Bullen JA. Receiver operating characteristic (ROC) curves: review of methods with applications in diagnostic medicine. Phys Med Biol. 2018; 63:07TR01. doi: 10.1088/1361-6560/aab4b1
- 31. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011; 12:77. doi: 10.1186/1471-2105-12-77
- 32. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005; 102:15545–50. doi: 10.1073/pnas.0506580102
- 33. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013; 14:7. doi: 10.1186/1471-2105-14-7
- 34. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003; 13:2498–504. doi: 10.1101/gr.1239303
- 35. Hart GW, Slawson C, Ramirez-Correa G, Lagerlof O. Cross talk between O-GlcNAcylation and phosphorylation: roles in signaling, transcription, and chronic disease. Annu Rev Biochem. 2011; 80:825–58. doi: 10.1146/annurev-biochem-060608-102511
- 36. Wani WY, Chatham JC, Darley-Usmar V, McMahon LL, Zhang J. O-GlcNAcylation and neurodegeneration. Brain Res Bull. 2017; 133:80–7. doi: 10.1016/j.brainresbull.2016.08.002
- 37. Chang YH, Weng CL, Lin KI. O-GlcNAcylation and its role in the immune system. J Biomed Sci. 2020; 27:57. doi: 10.1186/s12929-020-00648-9
- 38. Qiang A, Slawson C, Fields PE. The Role of O-GlcNAcylation in Immune Cell Activation. Front Endocrinol (Lausanne). 2021; 12:596617. doi: 10.3389/fendo.2021.596617
- 39. Du L, Ha C. Epidemiology and Pathogenesis of Ulcerative Colitis. Gastroenterol Clin North Am. 2020; 49:643–54. doi: 10.1016/j.gtc.2020.07.005
- 40. Magro F, Gionchetti P, Eliakim R, Ardizzone S, Armuzzi A, Barreiro-de Acosta M, et al. Third European Evidence-based Consensus on Diagnosis and Management of Ulcerative Colitis. Part 1: Definitions, Diagnosis, Extra-intestinal Manifestations, Pregnancy, Cancer Surveillance, Surgery, and Ileo-anal Pouch Disorders. J Crohns Colitis. 2017; 11:649–70. doi: 10.1093/ecco-jcc/jjx008
- 41. Geremia A, Biancheri P, Allan P, Corazza GR, Di Sabatino A. Innate and adaptive immunity in inflammatory bowel disease. Autoimmun Rev. 2014; 13:3–10. doi: 10.1016/j.autrev.2013.06.004
- 42. Saez A, Gomez-Bris R, Herrero-Fernandez B, Mingorance C, Rius C, Gonzalez-Granado JM. Innate Lymphoid Cells in Intestinal Homeostasis and Inflammatory Bowel Disease. Int J Mol Sci. 2021; 22. doi: 10.3390/ijms22147618
- 43. He X, Gao J, Peng L, Hu T, Wan Y, Zhou M, et al. Bacterial O-GlcNAcase genes abundance decreases in ulcerative colitis patients and its administration ameliorates colitis in mice. Gut. 2021; 70:1872–83. doi: 10.1136/gutjnl-2020-322468
- 44. Yang YR, Kim DH, Seo YK, Park D, Jang HJ, Choi SY, et al. Elevated O-GlcNAcylation promotes colonic inflammation and tumorigenesis by modulating NF-κB signaling. Oncotarget. 2015; 6:12529–42. doi: 10.18632/oncotarget.3725
- 45. Vancamelbeke M, Vanuytsel T, Farré R, Verstockt S, Ferrante M, Van Assche G, et al. Genetic and Transcriptomic Bases of Intestinal Epithelial Barrier Dysfunction in Inflammatory Bowel Disease. Inflamm Bowel Dis. 2017; 23:1718–29. doi: 10.1097/MIB.0000000000001246
- 46. Hattrup CL, Gendler SJ. Structure and function of the cell surface (tethered) mucins. Annu Rev Physiol. 2008; 70:431–57. doi: 10.1146/annurev.physiol.70.113006.100659
- 47. Chen W, Zhang Z, Zhang S, Zhu P, Ko JK, Yung KK. MUC1: Structure, Function, and Clinic Application in Epithelial Cancers. Int J Mol Sci. 2021; 22. doi: 10.3390/ijms22126567
- 48. Sun Y, Fan L, Mian W, Zhang F, Liu X, Tang Y, et al. Modified apple polysaccharide influences MUC-1 expression to prevent ICR mice from colitis-associated carcinogenesis. Int J Biol Macromol. 2018; 120:1387–95. doi: 10.1016/j.ijbiomac.2018.09.142
- 49. Murwanti R, Denda-Nagai K, Sugiura D, Mogushi K, Gendler SJ, Irimura T. Prevention of Inflammation-Driven Colon Carcinogenesis in Human MUC1 Transgenic Mice by Vaccination with MUC1 DNA and Dendritic Cells. Cancers (Basel). 2023; 15. doi: 10.3390/cancers15061920
- 50. Long L, Huang X, Yu S, Fan J, Li X, Xu R, et al. The research status and prospects of MUC1 in immunology. Hum Vaccin Immunother. 2023; 19:2172278. doi: 10.1080/21645515.2023.2172278
- 51. Nishida A, Lau CW, Zhang M, Andoh A, Shi HN, Mizoguchi E, et al. The membrane-bound mucin Muc1 regulates T helper 17-cell responses and colitis in mice. Gastroenterology. 2012; 142:865–74.e2. doi: 10.1053/j.gastro.2011.12.036
- 52. Xu S, Li X, Zhang S, Qi C, Zhang Z, Ma R, et al. Oxidative stress gene expression, DNA methylation, and gut microbiota interaction trigger Crohn’s disease: a multi-omics Mendelian randomization study. BMC Med. 2023; 21:179. doi: 10.1186/s12916-023-02878-8
- 53. Kuno K, Kanada N, Nakashima E, Fujiki F, Ichimura F, Matsushima K. Molecular cloning of a gene encoding a new type of metalloproteinase-disintegrin family protein with thrombospondin motifs as an inflammation associated gene. J Biol Chem. 1997; 272:556–62. doi: 10.1074/jbc.272.1.556
- 54. Tan Ide A, Ricciardelli C, Russell DL. The metalloproteinase ADAMTS1: a comprehensive review of its role in tumorigenic and metastatic pathways. Int J Cancer. 2013; 133:2263–76. doi: 10.1002/ijc.28127
- 55. Schrimpf C, Xin C, Campanholle G, Gill SE, Stallcup W, Lin SL, et al. Pericyte TIMP3 and ADAMTS1 modulate vascular stability after kidney injury. J Am Soc Nephrol. 2012; 23:868–83. doi: 10.1681/ASN.2011080851
- 56. Buran T, Batır MB, Çam FS, Kasap E, Çöllü F, Çelebi H, et al. Molecular analyses of ADAMTS-1, -4, -5, and IL-17 a cytokine relationship in patients with ulcerative colitis. BMC Gastroenterol. 2023; 23:345. doi: 10.1186/s12876-023-02985-z
- 57. Sethi MK, Buettner FF, Krylov VB, Takeuchi H, Nifantiev NE, Haltiwanger RS, et al. Identification of glycosyltransferase 8 family members as xylosyltransferases acting on O-glucosylated notch epidermal growth factor repeats. J Biol Chem. 2010; 285:1582–6. doi: 10.1074/jbc.C109.065409
- 58. Zeng Z, Mukherjee A, Zhang H. From Genetics to Epigenetics, Roles of Epigenetics in Inflammatory Bowel Disease. Front Genet. 2019; 10:1017. doi: 10.3389/fgene.2019.01017
- 59. Barnicle A, Seoighe C, Greally JM, Golden A, Egan LJ. Inflammation-associated DNA methylation patterns in epithelium of ulcerative colitis. Epigenetics. 2017; 12:591–606. doi: 10.1080/15592294.2017.1334023
- 60. Hou Q, Huang J, Ayansola H, Masatoshi H, Zhang B. Intestinal Stem Cells and Immune Cell Relationships: Potential Therapeutic Targets for Inflammatory Bowel Diseases. Front Immunol. 2020; 11:623691. doi: 10.3389/fimmu.2020.623691
- 61. Du C, Wang K, Zhao Y, Nan X, Chen R, Quan S, et al. Supplementation with Milk-Derived Extracellular Vesicles Shapes the Gut Microbiota and Regulates the Transcriptomic Landscape in Experimental Colitis. Nutrients. 2022; 14. doi: 10.3390/nu14091808
- 62. Sugimoto M, Fujikawa A, Womack JE, Sugimoto Y. Evidence that bovine forebrain embryonic zinc finger-like gene influences immune response associated with mastitis resistance. Proc Natl Acad Sci U S A. 2006; 103:6454–9. doi: 10.1073/pnas.0601015103










