Abstract
Various omics-based biomarkers related to the occurrence, progression, and prognosis of colorectal cancer (CRC) have been identified. In this study, we attempted to identify gut microbiome-based biomarkers and detect their association with host gene expression in the initiation and progression of CRC through an integrated analysis of the gut mucosal metagenome, RNA sequencing, and sociomedical factors. We performed metagenome and RNA sequencing on colonic mucosa samples from 13 patients with advanced CRC (ACRC), 10 patients with high-risk adenoma (HRA), and 7 normal control (NC) individuals. All participants completed a questionnaire on sociomedical factors. The interaction and correlation between changes in the microbiome and gene expression were assessed using bioinformatic analysis. When comparing HRA and NC samples, which can be considered to represent the process of tumor initiation, 28 genes and five microbiome species were analyzed with correlation plots. When comparing ACRC and HRA samples, which can be considered to represent the progression of CRC, seven bacterial species and 21 genes were analyzed. When comparing ACRC and NC samples, 16 genes and five bacterial species were analyzed, and four correlation plots were generated. A network visualizing the relationship between bacterial and host gene expression in the initiation and progression of CRC indicated that Clostridium spiroforme and Tyzzerella nexilis were hub bacteria in the development and progression of CRC. Our study revealed the interactions of and correlation between the colonic mucosal microbiome and host gene expression to identify potential roles of the microbiome in the initiation and progression of CRC. Our results provide gut microbiome-based biomarkers that may be potential diagnostic markers and therapeutic targets in patients with CRC.
Subject terms: Computational biology and bioinformatics, Genetics, Microbiology, Gastroenterology
Introduction
Colorectal cancer (CRC) is one of the most common carcinomas worldwide1,2. Despite worldwide CRC screening programs, including fecal immunochemical tests and colonoscopy, CRC still has a high incidence and mortality3,4. Multiple studies have shown that the gut microbiome is a crucial environmental factor that can regulate human health, and genomic changes in the gut microbiota can contribute to a variety of human diseases, including malignant disease, chronic inflammatory diseases, and metabolic disease5–10. The gut microbiota plays an important role in the regulation of gut homeostasis. It can metabolize the indigestible components of food, synthesize nutrients for epithelial regeneration, and modulate the immune response to maintain mucosal integrity by protecting against harmful environmental and endogenous toxic stimuli11–14. The initiation and progression of CRC are related to complex biological pathways involving multiple genetic and epigenetic alterations15–17. Many reports have shown that dysbiosis is closely associated with the initiation and progression of CRC and that the gut microbiome can be a candidate marker for early detection of CRC18–25. Therefore, modulation of the gut microbiome has been attempted as an adjunctive therapeutic strategy for CRC, such as increasing the sensitivity of immune checkpoint inhibitors in advanced and metastatic CRC26–28. With the development of bioinformatics analysis, accumulated omics data have been widely used to investigate the pathogenesis of CRC29,30. Multi-omics data are also rapidly expanding knowledge of the metagenome and host gene expression in health and disease31.
Some bacterial taxonomic groups have been found to be significantly correlated with the methylation or demethylation of host genes. However, the role of gut microbes as environmental factors in the initiation and progression of CRC and the interaction between microbes and host genes during CRC tumorigenesis remain unclear and need to be elucidated. Previously, we assessed the differentially expressed genes (DEGs) among high-risk adenoma (HRA), advanced CRC (ACRC), and normal control (NC) samples using RNA sequencing (RNA-seq) to identify candidate genes that play a role in CRC progression32. In this study, we integrated the results of simultaneous metagenomic sequencing and RNA-seq on the same colonic mucosa in HRA, ACRC, and NC samples to analyze the correlation between bacterial species and host gene expression and define the role of gut microbiota-host gene crosstalk in CRC development and progression. To validate and exclude the interference of environmental factors on metagenomics and gene expression, sociomedical factors such as dietary patterns, socioeconomic status, medical/family history, and psychiatric factors were included in this integrated analysis33,34.
Results
Integrated patterns of HRA, ACRC, and NC samples
We retrieved gene expression data, microbiome data, and survey results from RNA-seq, metagenome analysis, and a sociomedical questionnaire, respectively. The RNA-seq data included 46,851 features, and 763 features were selected based on significant differences between the three groups (analysis of variance [ANOVA]; p-value < 1 × 10⁻⁶). The metagenome analysis yielded 529 features, each of which corresponded to a microbe at the species level from 16S rRNA sequencing data. Ninety-three continuous variables were selected from the survey results. Each of the three datasets was normalized to a value between 0 and 1, and the three datasets were merged into one table (30 samples with 1385 features).
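The scaling and merging step described above can be sketched as follows. This is a minimal Python illustration rather than the R code used in the study; the feature names and values are invented, and mapping a constant feature to 0 is an assumption, since the text does not say how such features were handled.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no signal, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def merge_scaled(*blocks):
    """Scale every feature in every block and merge them into one table.

    Each block maps a feature name to its per-sample values, with the
    samples listed in the same order in all blocks.
    """
    merged = {}
    for block in blocks:
        for name, values in block.items():
            merged[name] = min_max_scale(values)
    return merged


# Hypothetical toy data: three samples, one feature per data type.
genes = {"gene_a": [2.0, 8.0, 5.0]}
microbes = {"species_x": [0.0, 10.0, 40.0]}
survey = {"survey_item_1": [1.0, 3.0, 2.0]}

table = merge_scaled(genes, microbes, survey)
print(table["gene_a"])     # [0.0, 1.0, 0.5]
print(table["species_x"])  # [0.0, 0.25, 1.0]
print(len(table))          # 3 features here; the study merged 763 + 529 + 93 = 1385
```

Scaling each feature separately keeps read counts, expression values, and questionnaire scores on a comparable range before they enter the same matrix.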
We used PCA as a dimension reduction tool to observe the merged datasets of gene expression data, microbiome data, and sociomedical patterns of the survey results. The matrix separated the ACRC, HRA, and NC groups. The merged multi-omics data consisted of 30 samples (Fig. S1 and Table S1), and a total of 1385 features were reduced to 10 principal components (PCs). The first and second PCs are shown in the PCA plot (Fig. 1A). In the PCA plot, the two dimensions represent two components, reduced from the 1385 merged and normalized features. Each point indicates a sample, and the three groups are plotted in different colors (Fig. 1A). The variance of each PC was graphed as a scree plot. The top four components accounted for 78.14% of the variance, and the top two components accounted for 71.13% of the variance in the total dataset. In the scree plot, the x-axis indicates the top 10 PCs, and the y-axis represents the variance of each PC captured from a total of 1385 features (Fig. 1B). By merging the PCA plot and vectors from the loadings of variables, a biplot was created (Fig. 1C). The ternary plot shows the relative abundance of the 1385 features. The three sides of the triangle represent the relative abundance of the three groups. As the two sides approach the vertex where they meet, the relative abundance between the two pairs is different (Fig. 1D). These results show a clear clustering of metagenomic, RNA-seq, and sociomedical factors associated with the tumors and the NC samples, except for three HRA samples.
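The percentages quoted for the scree plot are shares of the total variance. The short Python sketch below shows how such shares and their running totals are obtained from per-component variances (in R, the squared `sdev` values returned by `prcomp`). The input numbers are hypothetical and are not the variances from this study.

```python
def explained_variance(pc_variances):
    """Return per-component and cumulative shares of the total variance."""
    total = sum(pc_variances)
    shares = [v / total for v in pc_variances]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative


# Hypothetical variances of four principal components (not the study's values).
shares, cumulative = explained_variance([5.0, 3.0, 1.0, 1.0])
print([round(100 * s, 1) for s in shares])      # [50.0, 30.0, 10.0, 10.0]
print([round(100 * c, 1) for c in cumulative])  # [50.0, 80.0, 90.0, 100.0]
```

A statement such as "the top two components accounted for 71.13% of the variance" corresponds to the second entry of the cumulative list.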
Abundance and diversity of the colonic mucosal microbiome in HRA and ACRC samples
We tried to identify the interactions between the microbiome and host cells in the colonic mucosa of the three groups (Fig. 2A). We estimated the diversity and richness of the microbial communities in mucosal biopsies. Microbiome diversity and richness indices were calculated at the operational taxonomic unit (OTU) level. Species diversity was determined using the Shannon and Simpson indices. Species richness was defined as the observed number of species assigned to the OTUs detected in each sample and was additionally estimated from the observed number of species using Chao1. On average, 75.36, 86.50, and 76.92 OTUs were detected in the ACRC, HRA, and NC samples, respectively. The average Chao1 values were 76.49 in ACRC samples, 87.33 in HRA samples, and 77.12 in NC samples. One individual each in the HRA and NC groups had an abnormally high OTU count, and no statistically significant difference in diversity was found between the groups.
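For readers unfamiliar with these indices, the following Python sketch computes them from a vector of OTU counts for one sample. It uses the textbook definitions (Shannon with natural logarithm, Simpson as one minus the sum of squared proportions, classic Chao1 with the bias-corrected form when no doubletons are present, and Good's coverage as one minus the singleton fraction). Whether the vegan and phyloseq functions used in the study apply exactly these variants is an assumption, and the counts are invented.

```python
import math


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over observed OTUs."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity 1 - sum(p^2) over observed OTUs."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness from singleton (f1) and doubleton (f2) OTU counts."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return observed + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return observed + f1 * (f1 - 1) / 2.0


def goods_coverage(counts):
    """Good's coverage 1 - (singletons / total reads)."""
    f1 = sum(1 for c in counts if c == 1)
    return 1.0 - f1 / sum(counts)


# Hypothetical OTU counts for a single sample.
sample = [10, 10, 1, 1, 2, 0]
print(round(shannon(sample), 3))
print(round(simpson(sample), 3))
print(chao1(sample))                     # 7.0 (5 observed + 2*2 / (2*1))
print(round(goods_coverage(sample), 3))  # 0.917
```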
Decreasing patterns in the three indices (Shannon, Inverse Simpson, and Good’s coverage) were observed for disease samples compared to NC samples, but the differences were generally not statistically significant. In the diversity analysis, different patterns were observed between the three groups; however, only Good’s coverage was significantly different. We further compared the microbiome abundances between the three groups at six taxonomic levels (phylum, class, order, family, genus, and species) (Fig. 2B). Taken together, there was no overall difference in the abundance and diversity of microbiota taxa between HRA, ACRC, and NC samples.
Functional analysis and detection of bacterial species in HRA and ACRC samples
We attempted to detect and identify the colonic mucosa bacteria associated with CRC development and progression. Eight significantly associated species were selected based on the results of the ANOVA comparing the three groups (Fig. 2C and Table S2; p < 0.05). Each row in the heatmap indicates a species identified by the ANOVA results, and each column indicates a sample. The column annotation bar indicates the three classes of samples. Rows and columns were clustered using k-means clustering. The columns were split into three groups, and the NC group was clustered. Three species of Bacteroides were detected at higher levels in the NC samples. At each classification level, the top three taxa were compared among the three groups. Taxa belonging to the same lower level were included for frequently detected taxa. Therefore, values showing a similar pattern were observed at each classification level. Although no statistically significant differences in abundance were found between the three groups at any classification level, we could retrieve eight species correlated with the initiation and progression of CRC. Bacteroides fragilis had significantly different abundances in HRA and NC samples, and Bacteroides vulgatus had significantly different abundances in ACRC and NC samples and in HRA and NC samples. The Kruskal–Wallis test results between the three groups were statistically significant (p = 0.0037) (Fig. 2D).
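The Kruskal–Wallis comparison mentioned above can be illustrated with a short Python sketch. It computes the H statistic from pooled mid-ranks and, because three groups give two degrees of freedom, converts H to an approximate p-value with the closed-form chi-square tail exp(−H/2). The sketch omits the tie correction and uses invented abundances, so it is an illustration of the test and not a reproduction of the reported p = 0.0037.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def p_value_three_groups(h):
    """Chi-square upper tail with 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2.0)


# Hypothetical relative abundances of one species in three groups.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h = kruskal_wallis_h(groups)
print(round(h, 3))                        # 7.2
print(round(p_value_three_groups(h), 4))  # 0.0273
```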
Predicted function of bacterial species correlated with CRC-associated gene expression
To investigate the role of mucosal bacteria in CRC initiation and progression, we assessed the correlation between gene expression and microbial distribution in the three groups. The initiation and progression of CRC are related to complex biological pathways involving multiple genetic and epigenetic alterations16,19. HRA is known to be the precursor of CRC, and the adenoma-carcinoma sequence is the classic mechanism of the development of ACRC. We selected the DEGs in three pairings: ACRC versus HRA samples to represent CRC progression (Fig. 3A,B), HRA versus NC samples to represent CRC initiation (Fig. 3C,D), and ACRC versus NC samples (Fig. 3E,F). Each bacterium and gene was selected by fold change (FC) and p-value (PV) in the t-test. We visualized all correlations between species abundance and host gene expression (Fig. 3). In all three pairings, each correlation was provided as a correlation plot and scatter plots. In the correlation plot, relationships between species abundance and host gene expression with PV < 0.05 are shown as markers, and a marker is red if the correlation is negative.
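Selection by FC and PV amounts to a two-condition filter. The Python sketch below assumes that the fold change and t-test p-value of each feature have already been computed; the feature names, values, and cut-offs are hypothetical, and the text does not state whether FC is expressed on a log scale, so it is treated here simply as a signed number.

```python
def select_features(stats, max_pv, min_abs_fc):
    """Keep features whose p-value and absolute fold change pass both cut-offs."""
    return sorted(
        name
        for name, (fc, pv) in stats.items()
        if pv < max_pv and abs(fc) > min_abs_fc
    )


# Hypothetical (fold change, p-value) pairs for four genes.
stats = {
    "gene_a": (1.20, 0.0002),
    "gene_b": (-0.90, 0.0008),
    "gene_c": (0.50, 0.0004),
    "gene_d": (1.50, 0.0200),
}

print(select_features(stats, max_pv=0.001, min_abs_fc=0.75))  # ['gene_a', 'gene_b']
print(select_features(stats, max_pv=0.001, min_abs_fc=0.4))   # ['gene_a', 'gene_b', 'gene_c']
```

Loosening the fold-change cut-off admits additional features, which is why the number of genes retained can differ between pairings even at the same p-value threshold.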
In the ACRC-HRA pairing, 21 genes (PV < 0.001 and |FC| > 0.4) and seven bacterial species were examined (Fig. 3A). The top two positively and negatively correlated genes and microbiome species are provided as correlation plots (Fig. 3B).
In the ACRC-NC pairing, 16 genes (PV < 0.001 and |FC| > 0.75) and five microbiome species were analyzed, and four correlation plots were created (Fig. 3C,D). All 16 genes were significantly correlated with Clostridium spiroforme. C. spiroforme had low abundance. RMRP and RNR1 were more highly expressed in ACRC samples than in NC samples, and NBPF13P was expressed at lower levels in ACRC samples. The correlation coefficients of C. spiroforme with RMRP, RNR1, and NBPF13P were −0.644, −0.636, and 0.631, respectively.
In the HRA-NC pairing, 28 genes (PV < 0.001 and |FC| > 0.75) and five microbiome species were analyzed using correlation plots (Fig. 3E,F). Twenty-two genes were significantly correlated with T. nexilis; 19 and 3 genes showed negative and positive correlation patterns, respectively. The correlation plots showed four genes that were correlated with T. nexilis: SLC26A3, FAM72B, REG1B, and REG3A. We selected four species-gene pairs using the two top and bottom correlation coefficients. The correlation coefficients were −0.605, −0.490, 0.589, and 0.646 for the GSTO2 and Acetivibrio ethanolgignens pair, the MGAT4A and T. nexilis pair, the REG3A and Clostridium leptum pair, and the REG3A and T. nexilis pair, respectively.
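Choosing the two highest and two lowest correlation coefficients among all species-gene pairs can be sketched as below. The example computes Pearson's r, which is the default method of R's `cor.test`, for every pair and sorts the results. All names and values are hypothetical, and the sketch assumes that no feature is constant across samples.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def extreme_pairs(species, genes, k=2):
    """Return the k most negative and k most positive species-gene pairs."""
    scored = [
        (pearson_r(s_values, g_values), s_name, g_name)
        for s_name, s_values in species.items()
        for g_name, g_values in genes.items()
    ]
    scored.sort()
    return scored[:k], scored[-k:]


# Hypothetical normalized abundances and expression values in four samples.
species = {"sp1": [0.0, 1.0, 2.0, 3.0], "sp2": [3.0, 1.0, 2.0, 0.0]}
genes = {
    "g1": [0.0, 2.0, 4.0, 6.0],
    "g2": [5.0, 4.0, 2.0, 1.0],
    "g3": [1.0, 3.0, 2.0, 2.0],
}

bottom, top = extreme_pairs(species, genes)
for r, s_name, g_name in bottom + top:
    print(s_name, g_name, round(r, 3))
# sp1 g2 -0.99
# sp2 g1 -0.8
# sp2 g2 0.707
# sp1 g1 1.0
```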
Network and visualization of multi-omics data
We then built a network to visualize the relationship between the bacteria and host gene expression during CRC initiation and progression. In the network analysis, we compared NC samples and disease samples (HRA + ACRC) (Fig. 4A), as well as the three original pairings (Fig. 4B–D). For genes, overexpression during CRC progression (NC to HRA to ACRC) is indicated in red, and downregulation is indicated in blue. For microbiome species, increased abundance during progression is indicated in red and decreased abundance in blue. Then, we provided the results of network analysis as four correlation plots (Fig. S2). We identified seven, four, and four species from network analysis in the ACRC-HRA, ACRC-NC, and HRA-NC pairings, respectively (Fig. S2A, S2B, and S2C), and 8, 16, and 26 genes, respectively. In the ACRC-HRA network (Fig. 4B), only eight genes were connected to seven species. Therefore, the correlation patterns of the eight genes were not detected. C. spiroforme was connected to 16 genes in the ACRC-NC network (Fig. 4C,D), and T. nexilis had 20 gene connections in the HRA-NC network. Most genes were positively correlated; RNR1 and RMRP were negatively correlated in the ACRC-NC network (Fig. S2D), and RNR1 was negatively correlated in the HRA-NC network (Fig. S2E). Therefore, our results suggest that C. spiroforme and T. nexilis are hub bacteria in the development and progression of CRC, and these bacteria can be candidates for the detection of CRC, including precancerous lesions.
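The notion of a hub can be made concrete by counting connections. The Python sketch below builds a species-to-gene network from a list of correlation results, keeps only edges below a p-value cut-off, and reports the species with the most gene connections. The edge list, the 0.05 cut-off, and the use of degree as the hub criterion are assumptions for illustration; the network in the study was drawn in Cytoscape.

```python
from collections import defaultdict


def build_network(correlations, max_pv=0.05):
    """Map each species to the set of genes it is significantly correlated with."""
    network = defaultdict(set)
    for species, gene, r, pv in correlations:
        if pv < max_pv:
            network[species].add(gene)
    return dict(network)


def hub_species(network):
    """Return the species with the most connected genes (ties: alphabetical)."""
    return max(sorted(network), key=lambda s: len(network[s]))


# Hypothetical (species, gene, r, p-value) records.
correlations = [
    ("species_x", "gene_1", 0.63, 0.001),
    ("species_x", "gene_2", -0.64, 0.001),
    ("species_x", "gene_3", 0.55, 0.004),
    ("species_y", "gene_1", 0.41, 0.030),
    ("species_y", "gene_4", 0.20, 0.300),  # not significant, so no edge
]

network = build_network(correlations)
print(hub_species(network))              # species_x
print(sorted(network["species_x"]))      # ['gene_1', 'gene_2', 'gene_3']
print(len(network["species_y"]))         # 1
```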
Analysis of correlation between omics data and sociomedical factors
The gut microbiome is a crucial environmental factor in the development of CRC, but its composition can also be affected by external factors such as dietary patterns and psychiatric factors35. In order to exclude such external interference and investigate the effect of external factors on metagenomic and RNA-seq results, sociomedical factors were also included in the integrated analysis. The 88 questionnaire-based sociomedical factors and the methods of analysis are summarized in Table S3. We obtained two sets of correlation results: between gene expression and sociomedical factors and between microbial distribution and sociomedical factors. The correlation coefficients were visualized as a heatmap, and clustering analysis was performed for each variable. From the 88 survey questions, six clusters associated with gene expression and five clusters associated with microbial distribution were visualized as two heatmaps (Fig. S3 and Table S4). In both analyses, six clustered variables are listed in the heatmap. “Vomit” and “MtSor” were commonly clustered in both analyses, and positively and negatively correlated genes and microbes are listed. Psychiatric factors were closely located in the cluster. In the gene analysis, DEF (defensin alpha) and REG (regenerating family member) genes had higher correlation coefficients with diet. In the microbiome analysis, E. fergusonii showed higher correlation coefficients. However, the sociomedical factors did not affect the mucosal microbiome or gene expression during CRC initiation and progression.
Discussion
Abnormal gene expression in the intestinal mucosa along with an imbalance in the intestinal microflora is one of the main causes of colorectal disease, and several mechanisms by which intestinal microbes and abnormal gene expression affect the development of colonic tumors have been suggested27. In this study, we integrated the results of RNA-seq, metagenomics, and sociomedical pattern analysis of ACRC, HRA, and NC samples to determine significant differences. We identified the diversity of the microbiome, showed a correlation between gene expression and the microbiome, and performed network analysis. We separated the signatures between the three groups and visualized distinct patterns. Our results provide a basis for manipulating the microbiome in treatment strategies for colorectal diseases.
Dysbiosis due to environmental factors such as dietary pattern or genetic variations can disrupt the immune system and may promote colorectal neoplasm36–38. Changes in the gut microbiota can alter the efficacy of CRC treatment by increasing sensitivity to chemotherapeutic agents, radiotherapy, and immune checkpoint inhibitors and by reducing the toxicity of these treatment modalities39. Recent scientific evidence suggests that colorectal microbiota modification can inhibit ACRC progression and improve the treatment effect in ACRC40. A literature survey revealed that changing the colorectal microbiota composition by probiotics, prebiotics, and diet protects ACRC patients from treatment-associated adverse effects18,40–42. This study provides insights into the association between colorectal microbiota and colorectal diseases (including ACRC and HRA) to provide innovative strategies for enhancing the safety and efficacy of ACRC and HRA therapy.
Many studies have examined specific gut bacterial species associated with colorectal diseases. A typical example is sulfidogenic bacteria. Hydrogen sulfide-producing bacteria such as Fusobacterium, Desulfovibrio, and B. wadsworthia are known to be involved in ACRC development through the production of DNA-damaging hydrogen sulfide43–45. In addition, patients with beneficial gut microbiota, such as B. longum, Ruminococcaceae spp., E. faecium, Faecalibacterium spp., and C. aerofaciens, have superior systemic and antitumor immunity compared to patients with low strain diversity and relatively high abundance of Bacteroidetes41,46. This phenomenon suggests that intestinal microbiota can modulate immune function in the intestine and increase tumor immunity.
Most existing studies on gut microbiota in CRC have analyzed gut microbiota in feces. Examination of the feces is non-invasive and may be appropriate as a screening test, but there may be variables that affect the metagenomic results, such as the collection process, dietary pattern, and antibiotic administration10,12. In this study, the microbiome genome was profiled in the colonic mucosa using samples removed during colonoscopy. We were able to identify species with high relative abundance in ACRC- and HRA-derived mucosa (three Bacteroides and two Clostridium species) and extracted genes that were highly correlated with these bacterial species.
Understanding the molecular mechanisms of CRC development and progression is key to early diagnosis and the development of personalized medicines. Several previous studies have clarified the importance of the interaction between host cells and the microbiome in the pathogenesis of CRC24,31,47,48. To understand the role of these interactions in the adenoma-carcinoma sequence, we correlated host gene expression and mucosal microbiome genomic composition data using microbiota and RNA-seq data in HRA, ACRC, and NC samples. In the correlation and network analyses of mucosal-derived microorganisms and gene expression, T. nexilis was a hub species related to DEGs in HRA and NC samples and in ACRC and HRA samples. In addition, C. spiroforme was identified as a hub species related to differential gene expression when comparing ACRC and NC samples. C. spiroforme had a strong positive relationship with NBPF13P and a strong negative correlation with RMRP.
REG3A was found to be elevated in ACRC samples compared to NC samples. High REG3A levels are correlated with larger tumor size, poorer tumor differentiation, higher tumor stage, and lower survival rate49. REG3A has been shown to have pro-tumorigenic effects, including promotion of cell proliferation, inhibition of cell apoptosis, and regulation of cancer cell migration by activating AKT and ERK1/2 pathways in gastric cancer cells50. REG3A has also been considered to play a key role in inflammation-linked pancreatic carcinogenesis51,52. Therefore, REG3A may serve as a promising therapeutic target in ACRC. We are the first to identify a relationship between ACRC and NBPF13P. We revealed a relationship between the microbiome and NBPF13P, which could provide a new pathway for targeting in colorectal diseases.
The colon has the highest load of gut microbiota, with over 10¹¹ bacteria per milliliter. Colonic symbionts can be classified according to their anatomical distribution as (1) luminal-resident bacteria, (2) mucous-resident bacteria, (3) epithelial-resident bacteria, and (4) lymphoid tissue-resident symbionts. Intestinal epithelial cells (IECs) play an important role in innate immunity by forming a physical barrier against environmental stimuli, including gut genetic toxins, and maintaining a balance between commensal bacteria and host cells53,54. Although this barrier is sterile, invasive bacteria, including adherent-invasive Escherichia coli, segmented filamentous bacilli, Enterococcus faecalis, Bacteroides fragilis, and Clostridium spp., can reside in and attach to IECs55,56. This can lead to chronic inflammation of the mucous membrane, which is one of the critical pathogeneses of inflammatory bowel disease and CRC and correlates with disease severity. Since our study analyzed the microbiota of the colonic mucosa, it is difficult to exclude the possibility that a large portion of the microbiota present in IECs might be included in the metagenomic analysis. Tyzzerella and Clostridium, which are correlated with CRC progression and differential gene expression, are known to reside in IECs.
Clostridium spp., representative epithelium-resident bacteria, form endospores and have strong dissemination power, survival, and resistance to antibiotics57. Spore-forming bacteria have the following characteristics: resistance to antibiotic treatment, strong binding properties, high permeability, and harmful spores58. The role of sporobiota in the pathogenesis and progression of CRC remains unclear and needs to be elucidated. Our results suggest that gut sporobiota may be important in the pathogenesis of CRC. Understanding the mechanism of CRC pathogenesis is useful not only for the development of targeted therapeutics, which could potentially define markers and guide precision medicine, but also for the early detection and prevention of CRC. Identification of the exact role of sporobiota in colorectal tumorigenesis will help us understand the current limitations of gut microbiota modulations, such as antibiotic administration, diet modification, and probiotic administration, for CRC prevention and treatment and can suggest new target therapies for CRC.
In this study, we also investigated the association between the composition of the intestinal microflora and dietary patterns and other environmental factors. No definitive difference was observed in the gut mucosal microbiota diversity between HRA, ACRC, and NC samples. This result is not in line with those of previous studies using fecal microbiota analysis. This suggests that environmental factors, including dietary patterns and socioeconomic, psychiatric, and clinical factors may have less influence on the gut mucosal microbiota diversity compared to that of feces. Future large-scale studies are needed to clarify this.
Our study had several limitations. First, whether mucosal microbiota analysis reflects the effect of microbiota on the development and progression of CRC may be controversial. Since our colonic tissue was obtained during colonoscopy, it is possible that the results of the metagenomic analysis may have been affected by the bowel preparation process. Second, because of the small number of samples, differences in the mucosal microbiome and gene expression according to clinical characteristics of ACRC and HRA, such as tumor stage, were not fully analyzed. Third, we used 16S rRNA sequencing for the mucosal microbiome analysis. 16S rRNA analysis has the limitation that its taxonomic resolution at the species level is less accurate than that of full-length sequencing (shotgun metagenome). Although similar patterns were detected between the two methods59, relatively low resolution with biases and errors was predicted in 16S rRNA analysis for taxonomic classification (78% at the species level vs 98% at the genus level)60. Taxonomic assignment at the species level is more important than that at the genus level. Therefore, additional studies, including new omics techniques and culturomics, should be performed to confirm and validate our results.
Conclusions
In summary, we provided a set of candidate correlations and interactions between the gut microbiota and host genes in ACRC and HRA samples that are distinct from those of NC samples. This demonstrates the correlation between the microbiome and gene expression in the colonic mucosa during disease progression from NC to HRA to ACRC. Our results may provide clinicians and researchers with a basis for diagnosis and targeted treatment using gut mucosal microbiota, suggesting the relevance of sporobiota in CRC progression.
Methods
Study design and participants
This study was approved by the Institutional Review Board of Korea University Guro Hospital (2019GR0341). Colonic mucosal tissues were obtained during colonoscopies after bowel cleansing with 2-L polyethylene glycol (PEG)-based laxatives at the Korea University Guro Hospital. None of the enrolled patients had an acute infection within the 3 months before the procedure. ACRC and advanced HRA tissues were obtained from the core lesion, as depicted in Fig. S1A. NC samples were obtained from the sigmoid colons of patients with normal colonoscopy findings who underwent routine colonoscopic CRC screening. Two pinch biopsies (~3 × 3 mm) from the lesions or sigmoid colons were obtained using colonoscopic biopsy forceps, one for RNA-seq and one for metagenomic sequencing. All tissues were placed into RNA stabilization solution (Thermo Fisher Scientific, Waltham, MA, USA) and stored for 24 h at 4 °C prior to freezing at −80 °C to prevent anaerobic bacteria from being exposed to oxygen and to avoid bacterial overgrowth before DNA extraction. RNA-seq and 16S metagenomics sequencing were performed on 13 ACRC samples, 10 HRA samples, and 7 NC samples. The demographic and basal characteristics of the enrolled patients are summarized in Fig. S1B.
Assessment of sociomedical factors
We considered social lifestyle factors, family history of cancer using a family tree, medical histories, gastrointestinal symptoms, including the Bristol stool form scale, and psychosocial factors using CES. Diet patterns were assessed using the Korean standard nutrition questionnaire (Fourth version; 2007–2009), which is a 100-item questionnaire. Detailed information on the questionnaire is provided in Table S3.
DNA extraction and 16S rRNA sequencing
DNA was extracted from the colonic mucosa samples using the DNeasy Blood and Tissue kit (Qiagen, Germany). The V3–V4 region of the bacterial 16S rRNA gene was amplified by PCR. The primers used were 338F (5′-ACTCCTACGGGAGGCAGCA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′). The PCR program consisted of initial denaturation at 95 °C for 5 min; 28 cycles of denaturation at 95 °C for 15 s, annealing at 55 °C for 30 s, and extension at 72 °C for 30 s; and a final extension at 72 °C for 10 min. Amplicons of the V3–V4 region were maintained in equal amounts, and paired-end 2 × 300 bp sequencing was performed on the Illumina MiSeq platform with the MiSeq Reagent Kit v3. The raw paired-end amplicon sequence reads were retrieved.
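The 806R primer contains the degenerate IUPAC codes H, V, and W, so it represents a small family of sequences rather than a single one. The Python sketch below expands IUPAC codes into a regular expression and counts the number of distinct sequences a primer stands for. It is an illustration of the notation only and does not describe any software used in the study; the test sequences are invented.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(
        IUPAC[base] if len(IUPAC[base]) == 1 else "[" + IUPAC[base] + "]"
        for base in primer
    )


def degeneracy(primer):
    """Number of distinct sequences represented by the primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


primer_806r = "GGACTACHVGGGTWTCTAAT"
pattern = primer_regex(primer_806r)
print(pattern)                  # GGACTAC[ACT][ACG]GGGT[AT]TCTAAT
print(degeneracy(primer_806r))  # 18 (3 x 3 x 2)
print(bool(re.fullmatch(pattern, "GGACTACAAGGGTATCTAAT")))  # True
print(bool(re.fullmatch(pattern, "GGACTACGAGGGTATCTAAT")))  # False, G is not allowed at H
```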
16S rRNA analysis and diversity analysis
We processed the FASTQ files using FastQC61 to perform quality control of the raw sequences. The raw 16S amplicon sequences were processed by QIIME2 v1.8.0 with default parameters. We then used SHI762 for trimming Nextera adapters and stitching paired-end reads and performed quality trimming at both ends of the stitched reads until a minimum Phred score of 32 was reached. These merged and filtered reads were used for closed-reference operational taxonomic unit (OTU) picking, and the OTUs were determined by de novo clustering of the sequences with a 97% sequence identity cut-off by QIIME. We performed alpha- and beta-diversity analyses in R using the vegan63 and phyloseq64 packages. Based on the OTU table, we calculated the average richness estimate for each alpha-diversity metric (Chao1, observed OTUs, and Shannon) (Table S5).
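Quality trimming at both ends of a read can be illustrated as follows. The Python sketch decodes a FASTQ quality string and removes bases from each end until a base meeting the minimum Phred score is reached. The exact rule applied by the trimming tool used in the study is not described in the text, so the simple end-trimming logic and the Phred+33 encoding are assumptions, and the read shown is invented.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_ends(seq, quality, min_score=32, offset=33):
    """Trim low-quality bases from both ends of a read.

    Bases are removed from each end until one with a score of at least
    min_score is found. Interior low-quality bases are left untouched.
    """
    scores = phred_scores(quality, offset)
    start, end = 0, len(scores)
    while start < end and scores[start] < min_score:
        start += 1
    while end > start and scores[end - 1] < min_score:
        end -= 1
    return seq[start:end], quality[start:end]


# Hypothetical read: '#' = Q2, '5' = Q20, 'A' = Q32, 'I' = Q40 (Phred+33).
seq = "ACGTACGT"
quality = "#5IIAI5#"
print(phred_scores(quality))    # [2, 20, 40, 40, 32, 40, 20, 2]
print(trim_ends(seq, quality))  # ('GTAC', 'IIAI')
```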
Bioinformatics and visualization
We used the RNA-seq data from our previous study32. From 46,851 features, 763 were selected using the “anova” function in R (p < 1 × 10⁻⁶). The final dataset included 529 species from metagenomics (Table S6), 763 genes from RNA-seq, and 93 variables from the survey results from the 30 samples (ACRC, n = 13, HRA, n = 10, and NC, n = 7). A total of 1385 features were normalized to values between 0 and 1. Principal component analysis (PCA) was performed by the “prcomp” function in R. To display the ternary plot, we used the “triax.plot” function of the “plotrix” package in R.
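The quantity behind ANOVA-based feature selection is the F statistic, the ratio of between-group to within-group variance. A minimal Python sketch for one feature measured in several groups is given below. The values are hypothetical, and converting F to a p-value (as needed for a cut-off such as p < 1 × 10⁻⁶) requires the F distribution, which is omitted here.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of measurements."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values of one gene in three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value = anova_f(groups)
print(round(f_value, 3))  # 13.0
```

Features whose group means differ strongly relative to their within-group spread receive large F values and correspondingly small p-values.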
Integrated analysis of interaction between microbiome and host gene expression
The normalized 1385 features were used for the correlation analysis and visualized with correlation plots and heatmaps. The correlation analysis was performed with the “cor.test” function in R using default settings and visualized with the “corrplot” function of the “corrplot” package in R. Scatter plots were drawn with the “ggplot” function of the “ggplot2” package in R. The network was visualized using Cytoscape, and each feature was used as a color key.
Ethics approval and consent to participate
All participants were over 18 years of age, and informed consent was obtained from all participants. All methods were carried out in accordance with relevant guidelines and regulations (Declaration of Helsinki). This study was approved by the Institutional Review Board at Korea University Guro Hospital (2019GR0341).
Abbreviations
- ACRC: Advanced colorectal cancer
- ANOVA: Analysis of variance
- AUC: Area under the curve
- CRC: Colorectal cancer
- DEG: Differentially expressed gene
- HRA: High-risk adenoma
- IEC: Intestinal epithelial cell
- NBPF13P: Neuroblastoma breakpoint family member 13, pseudogene
- NC: Normal control
- OTU: Operational taxonomic unit
- PC: Principal component
- PCA: Principal component analysis
- PV: P-value
- REG3A: Regenerating family member 3 alpha
- RMRP: RNA component of mitochondrial RNA processing endoribonuclease
- RNA-seq: RNA sequencing
- RNR1: RNA, ribosomal 45S cluster 1
- FC: Fold change
Author contributions
Conceptualization: B.J.L., N.K., J.G.; Formal analysis: N.K., B.J.L., J.G.; Project administration: B.J.L.; Case collection and management: B.I.C., S.H.K., M.K.J., J.J.P., S.H.K., and H.S.Y.; Pathologic review: C.K.; Supervision: B.J.L.; Visualization: N.K., J.G.; Writing-original draft: N.K., G.J., and B.J.L.
Funding
This study was supported by a grant from the National Research Foundation of Korea, No. KEIT 20003699 and Korea University Guro Hospital (O1700521).
Data availability
The datasets generated during the current study are not publicly available due to the Personal Information Protection Act of the Republic of Korea and the IRB recommendation of Korea University Guro Hospital but are available from the corresponding author on reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Namjoo Kim and Jeong-An Gim.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-022-17823-7.
References
- 1. Sung H, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660.
- 2. Wong MC, Ding H, Wang J, Chan PS, Huang J. Prevalence and risk factors of colorectal cancer in Asia. Intest. Res. 2019;17:317–329. doi: 10.5217/ir.2019.00021.
- 3. Haghighat S, Sussman DA, Deshpande A. US Preventive Services Task Force recommendation statement on screening for colorectal cancer. JAMA. 2021;326:1328. doi: 10.1001/jama.2021.13466.
- 4. Randel KR, et al. Colorectal cancer screening with repeated fecal immunochemical test versus sigmoidoscopy: Baseline results from a randomized trial. Gastroenterology. 2021;160:1085–1096 e1085. doi: 10.1053/j.gastro.2020.11.037.
- 5. Fan Y, Pedersen O. Gut microbiota in human metabolic health and disease. Nat. Rev. Microbiol. 2021;19:55–71. doi: 10.1038/s41579-020-0433-9.
- 6. Gao R, et al. Dysbiosis signature of mycobiota in colon polyp and colorectal cancer. Eur. J. Clin. Microbiol. Infect. Dis. 2017;36:2457–2468. doi: 10.1007/s10096-017-3085-6.
- 7. Hong BY, et al. Characterization of mucosal dysbiosis of early colonic neoplasia. NPJ Precis. Oncol. 2019;3:29. doi: 10.1038/s41698-019-0101-6.
- 8. Wroblewski LE, Peek RM, Jr, Coburn LA. The role of the microbiome in gastrointestinal cancer. Gastroenterol. Clin. North Am. 2016;45:543–556. doi: 10.1016/j.gtc.2016.04.010.
- 9. Chen F, et al. Integrated analysis of the faecal metagenome and serum metabolome reveals the role of gut microbiome-associated metabolites in the detection of colorectal cancer and adenoma. Gut. 2021. doi: 10.1136/gutjnl-2020-323476.
- 10. Song M, Chan AT, Sun J. Influence of the gut microbiome, diet, and environment on risk of colorectal cancer. Gastroenterology. 2020;158:322–340. doi: 10.1053/j.gastro.2019.06.048.
- 11. Gagnaire A, Nadel B, Raoult D, Neefjes J, Gorvel JP. Collateral damage: Insights into bacterial mechanisms that predispose host cells to cancer. Nat. Rev. Microbiol. 2017;15:109–128. doi: 10.1038/nrmicro.2016.171.
- 12. Kau AL, Ahern PP, Griffin NW, Goodman AL, Gordon JI. Human nutrition, the gut microbiome and the immune system. Nature. 2011;474:327–336. doi: 10.1038/nature10213.
- 13. Nicholson JK, et al. Host-gut microbiota metabolic interactions. Science. 2012;336:1262–1267. doi: 10.1126/science.1223813.
- 14. Gensollen T, Iyer SS, Kasper DL, Blumberg RS. How colonization by microbiota in early life shapes the immune system. Science. 2016;352:539–544. doi: 10.1126/science.aad9378.
- 15. Schmitt M, Greten FR. The inflammatory pathogenesis of colorectal cancer. Nat. Rev. Immunol. 2021;21:653–667. doi: 10.1038/s41577-021-00534-x.
- 16. Nguyen LH, Goel A, Chung DC. Pathways of colorectal carcinogenesis. Gastroenterology. 2020;158:291–302. doi: 10.1053/j.gastro.2019.08.059.
- 17. Tapizadeh E, et al. Molecular pathways, screening and follow-up of colorectal carcinogenesis: An overview. Curr. Cancer Ther. Rev. 2020;16:88–96. doi: 10.2174/1573394715666190730111946.
- 18. Kim SH, Lim YJ. The role of microbiome in colorectal carcinogenesis and its clinical potential as a target for cancer treatment. Intest. Res. 2021. doi: 10.5217/ir.2021.00034.
- 19. Oh HH, Joo YE. Novel biomarkers for the diagnosis and prognosis of colorectal cancer. Intest. Res. 2020;18:168–183. doi: 10.5217/ir.2019.00080.
- 20. Yu J, et al. Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer. Gut. 2017;66:70. doi: 10.1136/gutjnl-2015-309800.
- 21. Keshinro A, et al. Do differences in the microbiome explain early onset in colon cancer? J. Clin. Oncol. 2020;38:15. doi: 10.1200/JCO.2020.38.15_suppl.e16070.
- 22. Bandera B, et al. The first demonstration of a link between the microbiome and recurrence in colon cancer: Results from a prospective, multicenter nodal ultrastaging trial. Ann. Surg. Oncol. 2017;24:S7.
- 23. Avril M, DePaolo RW. “Driver-passenger” bacteria and their metabolites in the pathogenesis of colorectal cancer. Gut Microbes. 2021;13:1941710. doi: 10.1080/19490976.2021.1941710.
- 24. Feng Q, et al. Gut microbiome development along the colorectal adenoma–carcinoma sequence. Nat. Commun. 2015;6:6528. doi: 10.1038/ncomms7528.
- 25. Leung PHM, et al. Characterization of mucosa-associated microbiota in matched cancer and non-neoplastic mucosa from patients with colorectal cancer. Front. Microbiol. 2019;10:1317. doi: 10.3389/fmicb.2019.01317.
- 26. Helmink BA, Khan MAW, Hermann A, Gopalakrishnan V, Wargo JA. The microbiome, cancer, and cancer therapy. Nat. Med. 2019;25:377–388. doi: 10.1038/s41591-019-0377-7.
- 27. Allen J, Sears CL. Impact of the gut microbiome on the genome and epigenome of colon epithelial cells: Contributions to colorectal cancer development. Genome Med. 2019;11:11. doi: 10.1186/s13073-019-0621-2.
- 28. Roberti MP, et al. Chemotherapy-induced ileal crypt apoptosis and the ileal microbiome shape immunosurveillance and prognosis of proximal colon cancer. Nat. Med. 2020;26:919. doi: 10.1038/s41591-020-0882-8.
- 29. Zhu M, et al. Comprehensive RNA sequencing in adenoma-cancer transition identified predictive biomarkers and therapeutic targets of human CRC. Mol. Ther. Nucleic Acids. 2020;20:25–33. doi: 10.1016/j.omtn.2020.01.031.
- 30. Komor MA, et al. Molecular characterization of colorectal adenomas reveals POFUT1 as a candidate driver of tumor progression. Int. J. Cancer. 2020;146:1979–1992. doi: 10.1002/ijc.32627.
- 31. Dayama G, Priya S, Niccum DE, Khoruts A, Blekhman R. Interactions between the gut microbiome and host gene regulation in cystic fibrosis. Genome Med. 2020;12:12. doi: 10.1186/s13073-020-0710-2.
- 32. Kim N, et al. RNA-sequencing identification and validation of genes differentially expressed in high-risk adenoma, advanced colorectal cancer, and normal controls. Funct. Integr. Genomics. 2021;21:513–521. doi: 10.1007/s10142-021-00795-8.
- 33. Wastyk HC, et al. Gut-microbiota-targeted diets modulate human immune status. Cell. 2021;184:4137. doi: 10.1016/j.cell.2021.06.019.
- 34. Taylor BC, et al. Consumption of fermented foods is associated with systematic differences in the gut microbiome and metabolome. Msystems. 2020;5:e00901-19. doi: 10.1128/mSystems.00901-19.
- 35. Milani C, et al. Multi-omics approaches to decipher the impact of diet and host physiology on the mammalian gut microbiome. Appl. Environ. Microbiol. 2020;86:e01864-20. doi: 10.1128/AEM.01864-20.
- 36. Arthur JC, et al. Microbial genomic analysis reveals the essential role of inflammation in bacteria-induced colorectal cancer. Nat. Commun. 2014;5:4724. doi: 10.1038/ncomms5724.
- 37. Hale VL, et al. Distinct microbes, metabolites, and ecologies define the microbiome in deficient and proficient mismatch repair colorectal cancers. Genome Med. 2018;10:78. doi: 10.1186/s13073-018-0586-6.
- 38. Purcell RV, Visnovska M, Biggs PJ, Schmeier S, Frizelle FA. Distinct gut microbiome patterns associate with consensus molecular subtypes of colorectal cancer. Sci. Rep. 2017;7:11590. doi: 10.1038/s41598-017-11237-6.
- 39. Sehgal K, Khanna S. Gut microbiome and checkpoint inhibitor colitis. Intest. Res. 2021;19:360–364. doi: 10.5217/ir.2020.00116.
- 40. Fong W, Li Q, Yu J. Gut microbiota modulation: A novel strategy for prevention and treatment of colorectal cancer. Oncogene. 2020;39:4925–4943. doi: 10.1038/s41388-020-1341-1.
- 41. Routy B, et al. The gut microbiota influences anticancer immunosurveillance and general health. Nat. Rev. Clin. Oncol. 2018;15:382–396. doi: 10.1038/s41571-018-0006-2.
- 42. Taghinezhad-S S, Mohseni AH, Fu XS. Intervention on gut microbiota may change the strategy for management of colorectal cancer. J. Gastroenterol. Hepatol. 2021;36:1508–1517. doi: 10.1111/jgh.15369.
- 43. Dahmus JD, Kotler DL, Kastenberg DM, Kistler CA. The gut microbiome and colorectal cancer: A review of bacterial pathogenesis. J. Gastrointest. Oncol. 2018;9:769–777. doi: 10.21037/jgo.2018.04.07.
- 44. Suehiro Y, et al. Highly sensitive stool DNA testing of Fusobacterium nucleatum as a marker for detection of colorectal tumours in a Japanese population. Ann. Clin. Biochem. 2017;54:86–91. doi: 10.1177/0004563216643970.
- 45. Attene-Ramos MS, Wagner ED, Plewa MJ, Gaskins HR. Evidence that hydrogen sulfide is a genotoxic agent. Mol. Cancer Res. 2006;4:9–14. doi: 10.1158/1541-7786.Mcr-05-0126.
- 46. Boleij A, et al. The Bacteroides fragilis toxin gene is prevalent in the colon mucosa of colorectal cancer patients. Clin. Infect. Dis. 2015;60:208–215. doi: 10.1093/cid/ciu787.
- 47. Wang Q, et al. Multi-omic profiling reveals associations between the gut mucosal microbiome, the metabolome, and host DNA methylation associated gene expression in patients with colorectal cancer. BMC Microbiol. 2020;20:83. doi: 10.1186/s12866-020-01762-2.
- 48. Bisht V, et al. Integration of the microbiome, metabolome and transcriptomics data identified novel metabolic pathway regulation in colorectal cancer. Int. J. Mol. Sci. 2021;22:5763. doi: 10.3390/ijms22115763.
- 49. Ye Y, et al. Up-regulation of REG3A in colorectal cancer cells confers proliferation and correlates with colorectal cancer risk. Oncotarget. 2016;7:3921–3933. doi: 10.18632/oncotarget.6473.
- 50. Qiu YS, Liao GJ, Jiang NN. REG3A overexpression suppresses gastric cancer cell invasion, proliferation and promotes apoptosis through PI3K/Akt signaling pathway. Int. J. Mol. Med. 2018;41:3167–3174. doi: 10.3892/ijmm.2018.3520.
- 51. Zhang MY, Wang J, Guo J. Role of regenerating islet-derived protein 3A in gastrointestinal cancer. Front. Oncol. 2019;9:1449. doi: 10.3389/fonc.2019.01449.
- 52. Guo J, Liao MF, Hu XM, Wang J. Tumour-derived Reg3A educates dendritic cells to promote pancreatic cancer progression. Mol. Cells. 2021;44:647–657. doi: 10.14348/molcells.2021.0145.
- 53. Fung TC, Artis D, Sonnenberg GF. Anatomical localization of commensal bacteria in immune cell homeostasis and disease. Immunol. Rev. 2014;260:35–49. doi: 10.1111/imr.12186.
- 54. Zhao LY, Zhang X, Zuo T, Yu J. The composition of colonic commensal bacteria according to anatomical localization in colorectal cancer. Engineering. 2017;3:90–97. doi: 10.1016/J.Eng.2017.01.012.
- 55. Cossart P, Sansonetti PJ. Bacterial invasion: The paradigms of enteroinvasive pathogens. Science. 2004;304:242–248. doi: 10.1126/science.1090124.
- 56. Bonnet M, et al. Colonization of the human gut by E. coli and colorectal cancer risk. Clin. Cancer Res. 2014;20:859–867. doi: 10.1158/1078-0432.Ccr-13-1343.
- 57. Paredes-Sabja D, Shen A, Sorg JA. Clostridium difficile spore biology: Sporulation, germination, and spore structural proteins. Trends Microbiol. 2014;22:406–416. doi: 10.1016/j.tim.2014.04.003.
- 58. Tetz G, Tetz V. Introducing the sporobiota and sporobiome. Gut Pathog. 2017;9:38. doi: 10.1186/s13099-017-0187-8.
- 59. Peterson D, et al. Comparative analysis of 16S rRNA gene and metagenome sequencing in pediatric gut microbiomes. Front. Microbiol. 2021;12:670336. doi: 10.3389/fmicb.2021.670336.
- 60. Bharti R, Grimm DG. Current challenges and best-practice protocols for microbiome analysis. Brief. Bioinform. 2019;22:178–193. doi: 10.1093/bib/bbz155.
- 61. Andrews, S. S. FastQC: A quality control tool for high throughput sequence data. 10.12688/f1000research.21142.2 (2010).
- 62. Al-Ghalith GA, Hillmann B, Ang K, Shields-Cutler R, Knights D. SHI7 is a self-learning pipeline for multipurpose short-read DNA quality control. Msystems. 2018;3:e00202–00217. doi: 10.1128/mSystems.00202-17.
- 63. Dixon P. VEGAN, a package of R functions for community ecology. J. Veg. Sci. 2003;14:927–930. doi: 10.1111/j.1654-1103.2003.tb02228.x.
- 64. McMurdie PJ, Holmes S. phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One. 2013;8:e61217. doi: 10.1371/journal.pone.0061217.