Abstract
Background
In recent years, a growing body of research has revealed that long noncoding RNAs (lncRNAs) participate in regulating genomic instability.
Materials and Methods
We obtained RNA expression profiles, somatic mutation profiles, clinical information, and pathological features of colorectal cancer (CRC) from The Cancer Genome Atlas project. We divided the cohort into two groups based on mutation frequency and identified genomic instability-related lncRNAs (GI-lncRNAs) using R software. We further analyzed the function of identified GI-lncRNAs and established a prognostic model through Cox regression. Using the established prognostic model, we divided the cohort into the high- and low-risk groups and further verified the prognostic differences between the two groups as well as the predictive power of prognosis-related lncRNAs in the genomic instability of CRC.
Results
We identified a total of 143 GI-lncRNAs that were differentially expressed between the higher mutation frequency group and the lower mutation frequency group. According to Kyoto Encyclopedia of Genes and Genomes pathway and Gene Ontology analyses, a series of cancer-associated terms were enriched. We further constructed a prognostic model that included five GI-lncRNAs (lncRNA PTPRD-AS1, lncRNA AC009237.14, lncRNA LINC00543, lncRNA AP003555.1, and lncRNA AL109615.3). We confirmed that the expression of the five GI-lncRNAs was associated with prognosis and the mutation of critical genes in the CRC patient cohort.
Conclusions
The present research further confirmed the vital function of GI-lncRNAs in the genomic instability of CRC. The five GI-lncRNAs identified in our study are potential biomarkers and need to be studied in more depth.
1. Introduction
Colorectal cancer (CRC) is the most common malignant neoplasm of the digestive system. According to “Cancer Statistics, 2020” produced by the American Cancer Society, an estimated 147,950 new cases of CRC and 53,200 CRC deaths were expected in the United States in 2020 [1]. Accumulating studies indicate that CRC results from a combination of environmental, dietary, lifestyle, and genetic factors [2]. A series of treatments including surgery, chemotherapy, targeted therapy, radiotherapy, and immunotherapy can improve patient outcomes; however, the prognosis of patients with advanced-stage CRC is still poor [3]. Hence, there is an urgent need to elucidate the molecular mechanisms of CRC and to identify novel effective biomarkers such as noncoding RNAs or circulating tumor DNA [4].
Genomic instability, including chromosomal instability and microsatellite instability (MSI), is an immensely complex molecular phenotype and mechanism [5]. Genomic instability has been verified as a “facilitating characteristic” that promotes cellular oncogenesis and metastasis [6]. Emerging evidence has indicated that inactivation of the mismatch repair system and the base excision repair system, as well as genetic mutations, are critical mechanisms in the tumorigenesis of CRC [7]. Long noncoding RNAs (lncRNAs) are transcripts of more than 200 nucleotides that do not encode proteins [8]. Research into lncRNAs has recently attracted growing interest. Accumulating evidence indicates that lncRNAs are involved in driving genomic instability at transcriptional and translational levels [9]. For instance, Lee et al. indicated that the lncRNA NORAD maintained ploidy and genomic stability by acting as a molecular decoy for PUMILIO proteins [10]. Chen et al. revealed that lncRNA CCAT2 drives chromosomal instability and carcinogenesis of CRC by upregulating the expression of the ribosomal biogenesis factor BOP1 [11].
With the development and utilization of next-generation sequencing platforms, an increasing number of novel genes and mutations have been identified and verified to play critical roles in tumor progression. For example, exome sequencing was utilized to identify CRC somatic mutations, and paternally expressed gene 3 (PEG3) was verified as a high-frequency mutated gene [12]. Imperial et al. performed comparative somatic and proteomic analyses of right-sided colon cancer, left-sided colon cancer, and rectal cancers. The results indicated that a nonsense mutation of adenomatous polyposis coli (APC) was a biomarker of right-sided colon cancer, and hub proteins in protein-protein interaction networks have critical roles in left- or right-sided colon cancer [13].
Recently, a growing number of bioinformatic studies have focused on the identification of potential biomarkers associated with tumor progression. For instance, seven identified miRNAs (miR-139-5p, miR-146a-5p, miR-185-5p, miR-195-5p, miR-340-5p, miR-331-3p, and miR-484) were found to be related to development, lifestyles, and overall survival of breast cancer patients [14]. In addition, four differentially expressed miRNAs (miR-21-5p, miR-183-5p, miR-195-5p, and miR-497-5p) were related to CRC through multiple signaling pathways based on the GEO datasets [15]. Lu et al. established a metabolism-related lncRNA prognostic model to predict the clinical outcome of CRC patients [16]. To the best of our knowledge, however, few studies have addressed lncRNAs and genomic instability in CRC using bioinformatic analysis. In the present research, we obtained data from The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov) project [17] and performed bioinformatic analysis to identify genomic instability-related lncRNAs (GI-lncRNAs). Furthermore, we constructed a prognostic model to predict the overall survival and critical genomic mutations of patients.
2. Materials and Methods
2.1. Data Collection and GI-lncRNA Identification
We downloaded transcriptome profiles (RNA-Seq), simple nucleotide variation (masked somatic mutation), and clinical data from TCGA database (https://portal.gdc.cancer.gov). The inclusion criteria were as follows: the samples (1) were pathologically diagnosed as colon adenocarcinoma or rectal adenocarcinoma, (2) had complete clinical and pathological data, and (3) had follow-up time and survival status. As a result, a total of 568 CRC and 44 normal tissues were included in this research. According to the documents and guidelines published by TCGA Authority (https://cancergenome.nih.gov/publications/publicationguidelines), the present research does not require ethical review.
We extracted the lncRNA expression matrix and genetic mutation information from TCGA samples using Perl scripts. The samples were then ranked according to the cumulative number of mutated genes. We then selected the top 25% and bottom 25% of the samples to compare differences in lncRNA expression. GI-lncRNAs were identified using the “edgeR” package in R software [18] with thresholds of |log2 fold change| > 1.0 and P < 0.01. Furthermore, a heatmap of the 20 top upregulated and 20 top downregulated GI-lncRNAs was drawn using the “heatmap” package in R software. According to the expression of GI-lncRNAs, we regrouped the 568 samples into a genomically stable-like group (GS-like group) and a genomically unstable-like group (GU-like group) through clustering analysis. Genetic mutations and gene expression were also compared between the two groups.
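The ranking and thresholding steps described above can be sketched in a few lines. The following is a minimal illustration in Python, not the authors' Perl and edgeR pipeline: the function names, the pseudocount, and the toy sample IDs are assumptions, and the fold change here is a plain ratio of group means rather than edgeR's negative binomial estimate.

```python
import math


def split_by_mutation_count(mutation_counts, fraction=0.25):
    """Rank samples by cumulative mutated-gene count and return
    (bottom, top) lists of sample IDs, each holding `fraction` of the cohort."""
    ranked = sorted(mutation_counts, key=mutation_counts.get)
    k = int(len(ranked) * fraction)
    if k == 0:
        return [], []
    return ranked[:k], ranked[-k:]


def log2_fold_change(high_values, low_values, pseudocount=1.0):
    """log2 ratio of mean expression (high-mutation vs. low-mutation group).
    The pseudocount (an assumption) avoids division by zero."""
    mean_high = sum(high_values) / len(high_values)
    mean_low = sum(low_values) / len(low_values)
    return math.log2((mean_high + pseudocount) / (mean_low + pseudocount))


def passes_threshold(lfc, p_value, lfc_cutoff=1.0, p_cutoff=0.01):
    """Apply the |log2 fold change| > 1.0 and P < 0.01 filter from the text."""
    return abs(lfc) > lfc_cutoff and p_value < p_cutoff


# Hypothetical toy cohort: sample ID -> number of mutated genes.
counts = {"s1": 5, "s2": 50, "s3": 12, "s4": 300,
          "s5": 8, "s6": 90, "s7": 20, "s8": 150}
low_group, high_group = split_by_mutation_count(counts)
```

In this toy cohort of eight samples, each group contains the two samples at the corresponding extreme of the ranking. The P value passed to `passes_threshold` would come from a separate statistical test, which this sketch does not implement.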
2.2. GI-lncRNA Enrichment Analyses
We performed Spearman's correlation analysis to identify the mRNAs that were coexpressed with GI-lncRNAs. For each lncRNA, the 10 mRNAs with the highest coexpression coefficients were shown in a network diagram. Moreover, we performed Gene Ontology (GO) [19] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [20] pathway analyses using the “clusterProfiler,” “org.Hs.eg.db,” “enrichplot,” and “ggplot2” packages in R software. The results of GO and KEGG analyses were visualized in a bar chart, and P < 0.05 was regarded as statistically significant.
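As an illustration of the coexpression step, Spearman's coefficient can be computed as the Pearson correlation of average ranks. The sketch below uses only the Python standard library; the function names and the toy expression vectors are hypothetical, and ranking mRNAs by the signed coefficient (rather than its absolute value) is an assumption about how "highest" was applied.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def top_coexpressed(lncrna_expr, mrna_exprs, n=10):
    """Return the n mRNA names with the highest Spearman coefficient."""
    scores = {gene: spearman(lncrna_expr, expr)
              for gene, expr in mrna_exprs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical expression vectors across four samples.
lnc = [1, 2, 3, 4]
mrnas = {"A": [2, 4, 6, 8], "B": [4, 3, 2, 1], "C": [1, 3, 2, 4]}
best_two = top_coexpressed(lnc, mrnas, n=2)
```

With these toy vectors, gene A is perfectly concordant with the lncRNA, gene B is perfectly discordant, and gene C falls in between, so A and C are selected.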
2.3. Construction of a GI-lncRNA Prognostic Model
A total of 509 CRC patients, whose follow-up time was more than 30 days and for whom complete clinical data were available, were enrolled in this study. We randomly divided the samples into a “train” group and a “test” group, and then, we performed Cox proportional hazard regression analysis to identify the GI-lncRNA signature. We then identified GI-lncRNAs which were significantly related to survival in the “train” group and the “test” group (P < 0.05). Calculation of hazard ratio and 95% confidence interval and generation of the profile of the receiver operating characteristic (ROC) curve were performed by applying “survival,” “caret,” “glmnet,” “survminer,” and “timeROC” packages in R software.
We further identified the GI-lncRNA signature as an independent prognostic factor based on the coefficient in Cox multivariate analysis. The model that included the expression of the GI-lncRNA signature and coefficient was constructed as follows:
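$$\text{risk score} = \sum_{i=1}^{n} \text{coefficient}_i \times \text{expression}_i \qquad (1)$$

where $n$ is the number of GI-lncRNAs in the signature, $\text{coefficient}_i$ is the multivariate Cox regression coefficient of the $i$-th GI-lncRNA, and $\text{expression}_i$ is its expression level.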
According to the risk score, the samples were further divided into a high-risk group and a low-risk group. The Kaplan-Meier method was utilized to compare overall survival between the two groups. We calculated the hazard ratios and 95% confidence intervals of age, TNM stage, and risk score. Furthermore, we analyzed differences in the mutation of critical genes in CRC between the two groups.
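For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched as follows. This is a minimal standard-library Python illustration, not the “survival” package used in the study; the function name and the toy follow-up data are assumptions, and the sketch estimates a single curve without the between-group significance test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for five patients.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

In practice, one curve would be estimated for the high-risk group and one for the low-risk group, and the two would then be compared.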
3. Results
3.1. Identification of GI-lncRNAs in CRC
To identify the GI-lncRNAs in CRC, we compared lncRNA expression between the GS samples (n = 118) and the GU samples (n = 128). A total of 143 GI-lncRNAs were identified which were significantly differentially expressed between the two groups. Among the GI-lncRNAs, 67 GI-lncRNAs were upregulated and 76 GI-lncRNAs were downregulated. The list of GI-lncRNAs is shown in Supplementary Materials, and the top 20 upregulated and downregulated GI-lncRNAs are shown as a heatmap in Figure 1.
We performed cluster detection to regroup the 568 samples into a GS-like group and a GU-like group. As a result, 363 samples were classified into the GS-like group, and 203 samples were classified into the GU-like group. The differential expression of GI-lncRNAs between the two groups is shown as a heatmap in Supplementary Figure 1. The number of mutant genes in the GU-like group was significantly greater than that in the GS-like group (Figure 2(a), P < 2.22e-16). Moreover, we analyzed differences in the expression of mismatch repair genes and colorectal oncogenes between the two groups. The expression of caudal-related homeobox transcription factor 2 (CDX2) (Figure 2(b), P < 2.22e-16), mismatch repair gene mutL homolog 1 (MLH1) (Figure 2(c), P = 8.3e-10), and postmeiotic segregation increased 2 (PMS2) (Figure 2(d), P = 0.014) was significantly lower in the GU-like group, while expression of epidermal growth factor receptor (EGFR) (Figure 2(e), P = 1.2e-4) was higher in the GU-like group. However, there was no significant difference in the expression of mutS homolog 2 (MSH2) (Figure 2(f), P = 0.7) or mutS homolog 6 (MSH6) (Figure 2(g), P = 0.066) between the two groups.
3.2. Functional Enrichment Analyses of GI-lncRNAs
To further explore the potential function of the 143 GI-lncRNAs, we performed GO and KEGG pathway analyses using R software. As shown in Supplementary Materials, we analyzed the 10 mRNAs with the highest coexpression coefficient with GI-lncRNAs. Moreover, we constructed the coexpression network that included the GI-lncRNAs and mRNAs (Supplementary Figure 2). The results of GO analysis indicated that the biological process and molecular function terms were mainly associated with immune-modulatory mechanisms and chemokine function, respectively (Figure 3(a)). The results of KEGG pathway analysis further indicated the involvement of multiple cancer-related pathways including “NOD-like (nucleotide-binding oligomerization domain-like) receptor signaling pathway,” “IL-17 (interleukin-17) signaling pathway,” “Wnt signaling pathway,” and “TNF (tumor necrosis factor) signaling pathway.” These results suggested that GI-lncRNAs may play critical roles in the progression of CRC (Figure 3(b)).
3.3. Construction of Prognostic Model Based on GI-lncRNAs
To further analyze the function of the GI-lncRNAs in predicting overall survival, we divided 509 CRC samples into the “train” and “test” groups; the clinicopathological characteristics between the two groups were not significantly different (Table 1). Then, we performed Cox proportional hazard regression analysis to identify the GI-lncRNA signature using the “train” group. As shown in Figure 4(a), we identified five GI-lncRNAs (lncRNA PTPRD-AS1, lncRNA AC009237.14, lncRNA LINC00543, lncRNA AP003555.1, and lncRNA AL109615.3) as independent prognostic factors using “train” group data. As shown in Figure 4(b), the area under the curve (AUC) of the ROC curve was 0.739. We further conducted Cox regression analyses to evaluate the prognostic role of the five-GI-lncRNA signature. The results, shown in Table 2, identified the five GI-lncRNAs as independent prognostic factors. According to the coefficient in multivariate analysis and expression of the five-GI-lncRNA signature, we constructed a prognostic model: risk score = (0.469 × AP003555.1 expression) + (0.182 × AC009237.14 expression) + (0.072 × AL109615.3 expression) + (0.063 × LINC00543 expression) + (−0.447 × PTPRD-AS1 expression). Based on the prognostic model, we then calculated the risk score of each sample and plotted the heatmap in order of ascending risk score (Figure 4(c)). This provided preliminary evidence that PTPRD-AS1 is a protective factor and AC009237.14, LINC00543, AP003555.1, and AL109615.3 are risk factors. Furthermore, the low-risk samples exhibited better overall survival than the high-risk samples (Figure 4(d), P < 0.001). Moreover, the tumor somatic mutation count increased with the risk score, especially among samples ranked between 50 and 150 (Figure 4(e)).
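The risk score is a weighted sum of the five expression values, so it can be reproduced directly from the coefficients reported above. The following Python sketch is illustrative only: the function names are hypothetical, the expression units are assumed to match whatever normalization was used when the model was fitted, and placing samples exactly at the median in the low-risk group is an assumption.

```python
from statistics import median

# Multivariate Cox coefficients as reported in the text.
COEFFICIENTS = {
    "AP003555.1": 0.469,
    "AC009237.14": 0.182,
    "AL109615.3": 0.072,
    "LINC00543": 0.063,
    "PTPRD-AS1": -0.447,
}


def risk_score(expression):
    """Weighted sum of lncRNA expression values for one sample."""
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


def assign_risk_groups(scores):
    """Split samples at the median risk score.
    Samples exactly at the median are labeled 'low' (an assumption)."""
    cutoff = median(scores.values())
    return {sample: ("high" if score > cutoff else "low")
            for sample, score in scores.items()}


# Hypothetical expression profiles for three samples.
profiles = {
    "s1": {name: 1.0 for name in COEFFICIENTS},
    "s2": {**{name: 0.0 for name in COEFFICIENTS}, "PTPRD-AS1": 1.0},
    "s3": {**{name: 0.0 for name in COEFFICIENTS}, "AP003555.1": 2.0},
}
scores = {sample: risk_score(expr) for sample, expr in profiles.items()}
groups = assign_risk_groups(scores)
```

Because the PTPRD-AS1 coefficient is negative, higher expression of that lncRNA lowers the score, whereas higher expression of the other four raises it.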
Table 1.
| Clinicopathologic characteristics | Type | Train | Test | P value |
|---|---|---|---|---|
| Age (years) | ≤65 | 108 (42.19%) | 115 (45.45%) | 0.5135 |
| | >65 | 148 (57.81%) | 138 (54.55%) | |
| Gender | Female | 119 (46.48%) | 113 (44.66%) | 0.7465 |
| | Male | 137 (53.52%) | 140 (55.34%) | |
| Stage | Stage I-II | 134 (52.34%) | 145 (57.31%) | 0.3641 |
| | Stage III-IV | 113 (44.14%) | 102 (40.32%) | |
| | Unknown | 9 (3.52%) | 6 (2.37%) | |
| T | T1-2 | 51 (19.92%) | 53 (20.95%) | 0.8593 |
| | T3-4 | 205 (80.08%) | 200 (79.05%) | |
| M | M0 | 185 (72.27%) | 195 (77.08%) | 0.1698 |
| | M1 | 42 (16.41%) | 30 (11.86%) | |
| | Unknown | 29 (11.33%) | 28 (11.07%) | |
| N | N0 | 145 (56.64%) | 151 (59.68%) | 0.5095 |
| | N1-3 | 111 (43.36%) | 101 (39.92%) | |
| | Unknown | 0 (0%) | 1 (0.4%) | |
Table 2.
| Variable | Univariate HR (95% CI) | Univariate P value | Multivariate coefficient | Multivariate HR (95% CI) | Multivariate P value |
|---|---|---|---|---|---|
| AP003555.1 (high/low) | 1.343 (1.171-1.538) | <0.001∗∗ | 0.469 | 1.598 (1.267-2.014) | <0.001∗∗ |
| AC009237.14 (high/low) | 1.170 (1.077-1.271) | <0.001∗∗ | 0.182 | 1.199 (1.078-1.334) | <0.001∗∗ |
| AL109615.3 (high/low) | 1.070 (1.016-1.126) | 0.015∗ | 0.072 | 1.074 (1.022-1.130) | 0.005∗∗ |
| LINC00543 (high/low) | 1.046 (1.008-1.109) | 0.047∗ | 0.063 | 1.065 (1.005-1.129) | 0.033∗ |
| PTPRD-AS1 (high/low) | 0.645 (0.442-0.943) | 0.024∗ | -0.447 | 0.639 (0.436-0.937) | 0.022∗ |

∗P < 0.05, ∗∗P < 0.01.
We further used the “test” group samples and all TCGA set samples to verify the accuracy and reliability of the GI-lncRNA signature. The ROC curve analysis of the “test” set and TCGA set yielded AUCs of 0.658 and 0.704, respectively (Figures 5(a) and 5(b)). We also plotted the heatmap and ranked the samples by risk score, which was based on expression of the five GI-lncRNAs. Consistent with the results for the “train” group, PTPRD-AS1 was again verified as a protective factor, and AC009237.14, LINC00543, AP003555.1, and AL109615.3 were verified as risk factors (Figures 5(c) and 5(d)). We further analyzed the somatic mutation count in the “test” group and TCGA set, and the results indicated that samples in the quadrate range on each side of the median risk score had higher frequencies of mutation (Figures 5(e) and 5(f)). Kaplan-Meier analyses indicated that the low-risk group had better overall survival (Figure 5(g), P = 0.012, and Figure 5(h), P < 0.001). These results supported the consistency and robustness of our model.
We next calculated the hazard ratio and 95% confidence interval of age, TNM stage, and risk score in TCGA set. As shown in Table 3, age, pTNM stage, and risk score were significant factors in both univariate and multivariate analyses. Then, we performed Kaplan-Meier curve analysis by age, pTNM stage, and gender to determine whether the GI-lncRNA signature was consistent across different pathological characteristics. The CRC samples of TCGA set were first stratified into two sets using an age cutoff of 65 years. The results indicated that the high-risk group had a relatively poor prognosis among patients both above and below 65 years of age (Figure 6(a), P < 0.001, and Figure 6(b), P = 0.036). Similarly, the high-risk group had lower overall survival in both the female and the male sets (Figures 6(c) and 6(d), P = 0.008 and P < 0.001). Moreover, we stratified the samples into T1-2 and T3-4 sets, N0 and N1–3 sets, and M0 and M1 sets. Kaplan-Meier analyses indicated that the high-risk group had a poorer prognosis than the low-risk group in the T1-2 and T3-4 sets (Figures 6(e) and 6(f), P = 0.046 and P < 0.001). We also observed similar results in the N0, N1–3, and M0 sets (Figures 6(g)–6(i), P = 0.014, P = 0.003, and P = 0.002). However, there was no significant difference in the M1 set (Figure 6(j), P = 0.254).
Table 3.
| Variable | Univariate HR (95% CI) | Univariate P value | Multivariate HR (95% CI) | Multivariate P value |
|---|---|---|---|---|
| Age (years) | 1.028 (1.009-1.047) | 0.004∗∗ | 1.033 (1.012-1.049) | 0.003∗∗ |
| Gender (male/female) | 1.095 (0.728-1.646) | 0.664 | | |
| pT stage (T1/T2/T3/T4) | 2.927 (1.948-4.396) | <0.001∗∗ | 2.079 (1.339-3.226) | <0.001∗∗ |
| pN stage (N0/N1/N2) | 2.901 (1.898-4.434) | <0.001∗∗ | 1.692 (1.025-2.795) | 0.039∗ |
| pM stage (M0/M1) | 4.753 (3.107-7.269) | <0.001∗∗ | 2.989 (1.799-4.966) | <0.001∗∗ |
| Risk score (high/low) | 1.248 (1.131-1.378) | <0.001∗∗ | 1.056 (1.021-1.092) | <0.001∗∗ |

∗P < 0.05, ∗∗P < 0.01.
3.4. Comparison between GI-lncRNA Signatures and Previous lncRNA Signatures
To compare the present prognostic model with existing lncRNA-related signatures, we further performed ROC curve analysis using the same TCGA cohort. The previously published signatures were a five-lncRNA signature from Gu et al. [21], a six-lncRNA signature from Cheng et al. [22], and a nine-lncRNA signature from Zhang et al. [23]. The 3-year AUC of the GI-lncRNA signature was 0.704, which was higher than that of the Gu-lncRNA signature (AUC = 0.645), the Cheng-lncRNA signature (AUC = 0.675), and the Zhang-lncRNA signature (AUC = 0.623) (Figure 7). These results indicated that the GI-lncRNA signature performed better in survival prediction.
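To make the AUC values concrete, the sketch below computes a plain rank-based AUC for a binary outcome: the probability that a randomly chosen patient with the event has a higher risk score than a randomly chosen patient without it. This is a deliberate simplification and not the time-dependent ROC analysis performed with the “timeROC” package, because it ignores censoring; the function name and toy data are hypothetical.

```python
def rank_auc(scores, labels):
    """Area under the ROC curve for a binary outcome.

    scores : risk score per patient
    labels : 1 if the event occurred (e.g., death within 3 years), else 0
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n)
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and outcomes for four patients.
perfect = rank_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = rank_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to a score with no discriminative ability and 1.0 to perfect separation, which is the scale on which the signatures above are compared.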
3.5. GI-lncRNA Signature Predicts the Mutation Status of Genes
We further analyzed whether the GI-lncRNA signature could predict CRC genetic mutation. We divided all samples into the high-risk and low-risk groups based on the median risk score. As shown in Figure 8, the mutation frequencies of BRAF (v-Raf murine sarcoma viral oncogene homolog B) and TP53 (tumor protein P53) were significantly higher in the high-risk group (Figures 8(a) and 8(b), P = 0.008 and P = 0.014). In contrast, PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha) was mutated more frequently in the low-risk group (Figure 8(c), P = 0.006). The mutation frequency of KRAS showed no difference between the two groups (Figure 8(d), P = 0.708). To a certain extent, the results indicated that the GI-lncRNA signature was able to predict the mutation of BRAF, TP53, and PIK3CA.
4. Discussion
Noncoding RNAs, including circRNAs [24, 25], miRNAs [26], and lncRNAs [27], as well as circulating tumor DNA [28], have been verified as novel diagnostic biomarkers in malignant tumors. In the present study, we selected lncRNAs as the object for in-depth study. In recent years, targeted therapy and immunotherapy for genomic instability have been gradually replacing chemotherapy-based tumor therapy. Studies have indicated that plentiful mutations produce vast numbers of altered peptides, some of which are expressed and processed as new antigens, to which the immune system can produce antitumor reactions [29]. For CRC patients with microsatellite instability, anti-PD-1 (programmed cell death protein 1) therapy has proven superior to chemotherapy alone in terms of local remission and prognosis [30]. Recently, multiple lncRNAs including lncRNA NORAD [31], lncRNA GUARDIN [32], and BGL3 [33] have been verified to play important roles in genomic instability. Therefore, the construction of a GI-lncRNA signature related to genomic instability has profound implications for CRC diagnosis and treatment.
In the present study, we conducted bioinformatic analysis and identified a total of 143 GI-lncRNAs. However, there is still little research into the role of GI-lncRNAs in CRC. We performed cluster detection to regroup the samples into a GU-like group and a GS-like group. We found that CDX2, MLH1, and PMS2 were expressed at significantly lower levels in the GU-like group. CDX2 has been verified to be a critical biomarker of normal epithelium and prognosis in CRC patients [34, 35]. Furthermore, CDX2 has been indicated to be associated with BRAF mutation and MSI status [34]. MLH1 and PMS2 comprise an important and common mismatch repair protein heterodimer. Salem et al. demonstrated that MLH1/PMS2 loss in CRC has a higher tumor mutation burden than MLH1/PMS2 loss in endometrial cancer [36]. Interestingly, there was no difference in the expression of MSH2 or MSH6 between the GU-like group and the GS-like group. We speculated that the interaction and regulation between MSH2/MSH6 and GI-lncRNAs are relatively weak. Moreover, we found that EGFR was relatively upregulated in the GU-like group. EGFR is a key target in CRC treatment, and studies have indicated that lncRNA SLCO4A1-AS1 [37], lncRNA SCARNA2 [38], lncRNA EGFR-AS1 [39], lncRNA DNAJC3-AS1 [40], and lncRNA LOXL1-AS1 [41] all target EGFR in CRC. LOXL1-AS1 was identified as a GI-lncRNA in the present research.
We conducted GO and KEGG analyses to uncover the biological function of GI-lncRNAs. The results revealed multiple enriched terms related to immunoregulation, genomic instability, and chemokine activity. For instance, incubation of colon cells with IL-6 (interleukin-6) engendered migration of MSH3 from the nucleus to the cytosol and promoted MSI [42]. Wunderlich et al. indicated that IL-6 promoted lymphocyte recruitment through the CCL-20 (CC-chemokine-ligand-20)/CCR-6 (CC-chemokine-receptor-6) axis in the CRC microenvironment [43]. The response to retinoic acid [44] and interferon-gamma [45] was also related to MSI in CRC. Moreover, the enriched terms including leukocyte chemotaxis [46], lymphocyte chemotaxis [47], chemokine activity [48], NOD-like receptor signaling pathway [49], IL-17 signaling pathway [50], Wnt signaling pathway [51], and TNF signaling pathway [52] also play critical roles in CRC.
Furthermore, we performed Cox proportional hazard regression and identified a five-GI-lncRNA signature (PTPRD-AS1, AC009237.14, LINC00543, AP003555.1, and AL109615.3). PTPRD-AS1 has been identified as an immune-related biomarker which predicts overall survival and immunotherapeutic response in bladder cancer [53]. AC009237.14 [22] and AL109615.3 [54] were recently verified as biomarkers based on TCGA database in CRC and gastric cancer, respectively. However, little is known about the GI-lncRNA signature in the progression and genomic instability of CRC. According to the expression of the five-GI-lncRNA signature, we divided the samples into a high-risk and a low-risk group. The signature distinguished prognosis across the pathological subgroups examined, except in the M1 set. We compared the AUC value of the GI-lncRNA signature with those of previously published prognostic signatures [21–23]. The GI-lncRNA signature obtained the highest AUC value while using no more lncRNAs than any of the compared signatures.
A wide array of studies has demonstrated that genetic mutation influences the drug sensitivity and biological behavior of tumors [55]. In CRC, the mutation patterns of BRAF, KRAS, TP53, and PIK3CA are increasingly important in the selection of optimal treatment [56, 57]. In the present research, we found that the five-GI-lncRNA signature captured the mutation status of BRAF, TP53, and PIK3CA. Esposito et al. indicated that silencing of lncRNA COMET increased sensitivity to vemurafenib in BRAF-mutated papillary thyroid cancer [58]. Zhao et al. indicated that expression of lnc273-31 and lnc273-34 was elevated in CRC samples with the p53-R273H mutation [59]. However, to the best of our knowledge, there is still a lack of lncRNA biomarkers for predicting BRAF and PIK3CA mutation.
The present research is mainly based on bioinformatic analysis and still has some limitations. Firstly, chromosomal instability and MSI have been revealed to have a critical role in genomic instability, but the simple nucleotide variation data could only supply information on mutant genes. Secondly, more molecular biological experiments are needed to verify the identified biomarkers and mechanisms involved in the future.
5. Conclusions
Our study provides a bioinformatic strategy for identifying lncRNAs and potential mechanisms based on TCGA database. Moreover, we identified the five-GI-lncRNA signature as an independent prognostic marker in the “train” group, the “test” group, and the entire TCGA set. This GI-lncRNA signature is relevant to genomic instability in CRC and merits further research.
Acknowledgments
This work is supported by the Cultivation Fund of National Natural Science Foundation of Liaoning Cancer Hospital, No. 2021-ZLLH-12.
Data Availability
The raw data of this study are derived from TCGA database (https://portal.gdc.cancer.gov/), which is a publicly available database.
Conflicts of Interest
All authors declare no conflicts of interest in this paper.
Authors' Contributions
Meng QK conceived and designed the study. Liang Y and Sun HX collected and collated the data. Liang Y and Ma B performed the analyses of TCGA data. All authors read and approved the final manuscript.
Supplementary Materials
References
- 1. Siegel R. L., Miller K. D., Jemal A. Cancer statistics, 2020. CA: a Cancer Journal for Clinicians. 2020;70(1):7–30. doi: 10.3322/caac.21590.
- 2. Fearon E. R. Molecular genetics of colorectal cancer. Annual Review of Pathology. 2011;6(1):479–507. doi: 10.1146/annurev-pathol-011110-130235.
- 3. El Bairi K., Tariq K., Himri I., et al. Decoding colorectal cancer epigenomics. Cancer Genetics. 2018;220:49–76. doi: 10.1016/j.cancergen.2017.11.001.
- 4. Hong S. N. Genetic and epigenetic alterations of colorectal cancer. Intestinal Research. 2018;16(3):327–337. doi: 10.5217/ir.2018.16.3.327.
- 5. Tubbs A., Nussenzweig A. Endogenous DNA damage as a source of genomic instability in cancer. Cell. 2017;168(4):644–656. doi: 10.1016/j.cell.2017.01.002.
- 6. Hanahan D., Weinberg R. A. Hallmarks of cancer: the next generation. Cell. 2011;144(5):646–674. doi: 10.1016/j.cell.2011.02.013.
- 7. Vodenkova S., Jiraskova K., Urbanova M., et al. Base excision repair capacity as a determinant of prognosis and therapy response in colon cancer patients. DNA Repair. 2018;72:77–85. doi: 10.1016/j.dnarep.2018.09.006.
- 8. Mattick J. S., Rinn J. L. Discovery and annotation of long noncoding RNAs. Nature Structural & Molecular Biology. 2015;22(1):5–7. doi: 10.1038/nsmb.2942.
- 9. Wang X., Arai S., Song X., et al. Induced ncRNAs allosterically modify RNA-binding proteins in cis to inhibit transcription. Nature. 2008;454(7200):126–130. doi: 10.1038/nature06992.
- 10. Lee S., Kopp F., Chang T. C., et al. Noncoding RNA NORAD regulates genomic stability by sequestering PUMILIO proteins. Cell. 2016;164(1-2):69–80. doi: 10.1016/j.cell.2015.12.017.
- 11. Chen B., Dragomir M. P., Fabris L., et al. The long noncoding RNA CCAT2 induces chromosomal instability through BOP1-AURKB signaling. Gastroenterology. 2020;159(6):2146–2162.e33. doi: 10.1053/j.gastro.2020.08.018.
- 12. Liu Z., Yang C., Li X., et al. The landscape of somatic mutation in sporadic Chinese colorectal cancer. Oncotarget. 2018;9(44):27412–27422. doi: 10.18632/oncotarget.25287.
- 13. Imperial R., Ahmed Z., Toor O. M., et al. Comparative proteogenomic analysis of right-sided colon cancer, left-sided colon cancer and rectal cancer reveals distinct mutational profiles. Molecular Cancer. 2018;17(1):177. doi: 10.1186/s12943-018-0923-9.
- 14. Falzone L., Grimaldi M., Celentano E., Augustin L. S. A., Libra M. Identification of modulated microRNAs associated with breast cancer, diet, and physical activity. Cancers. 2020;12(9):2555. doi: 10.3390/cancers12092555.
- 15. Falzone L., Scola L., Zanghi A., et al. Integrated analysis of colorectal cancer microRNA datasets: identification of microRNAs associated with tumor development. Aging. 2018;10(5):1000–1014. doi: 10.18632/aging.101444.
- 16. Lu Y., Wang W., Liu Z., Ma J., Zhou X., Fu W. Long non-coding RNA profile study identifies a metabolism-related signature for colorectal cancer. Molecular Medicine. 2021;27(1):83. doi: 10.1186/s10020-021-00343-x.
- 17. Wang Z., Jensen M. A., Zenklusen J. C. A practical guide to The Cancer Genome Atlas (TCGA). Methods in Molecular Biology. 2016;1418:111–141. doi: 10.1007/978-1-4939-3578-9_6.
- 18. Varet H., Brillet-Gueguen L., Coppee J. Y., Dillies M. A. SARTools: a DESeq2- and EdgeR-based R pipeline for comprehensive differential analysis of RNA-Seq data. PLoS One. 2016;11(6):e0157022. doi: 10.1371/journal.pone.0157022.
- 19. Gene Ontology Consortium. The Gene Ontology (GO) project in 2006. Nucleic Acids Research. 2006;34(90001):D322–D326. doi: 10.1093/nar/gkj021.
- 20. Draghici S., Khatri P., Tarca A. L., et al. A systems biology approach for pathway level analysis. Genome Research. 2007;17(10):1537–1545. doi: 10.1101/gr.6202607.
- 21. Gu L., Yu J., Wang Q., et al. Identification of a 5-lncRNA signature-based risk scoring system for survival prediction in colorectal cancer. Molecular Medicine Reports. 2018;18(1):279–291. doi: 10.3892/mmr.2018.8963.
- 22. Cheng L., Han T., Zhang Z., et al. Identification and validation of six autophagy-related long non-coding RNAs as prognostic signature in colorectal cancer. International Journal of Medical Sciences. 2021;18(1):88–98. doi: 10.7150/ijms.49449.
- 23. Zhang Z., Liu Q., Wang P., et al. Development and internal validation of a nine-lncRNA prognostic signature for prediction of overall survival in colorectal cancer patients. PeerJ. 2018;6:e6061. doi: 10.7717/peerj.6061.
- 24. Stella M., Falzone L., Caponnetto A., et al. Serum extracellular vesicle-derived circHIPK3 and circSMARCA5 are two novel diagnostic biomarkers for glioblastoma multiforme. Pharmaceuticals. 2021;14(7):618. doi: 10.3390/ph14070618.
- 25. Li R. D., Guan M., Zhou Z., Dong S. X., Liu Q. The role of circRNAs in the diagnosis of colorectal cancer: a meta-analysis. Frontiers of Medicine. 2021;8:766208. doi: 10.3389/fmed.2021.766208.
- 26. Chen B., Xia Z., Deng Y. N., et al. Emerging microRNA biomarkers for colorectal cancer diagnosis and prognosis. Open Biology. 2019;9(1):180212. doi: 10.1098/rsob.180212.
- 27. Ogunwobi O. O., Mahmood F., Akingboye A. Biomarkers in colorectal cancer: current research and future prospects. International Journal of Molecular Sciences. 2020;21(15):5311. doi: 10.3390/ijms21155311.
- 28. Marcuello M., Vymetalkova V., Neves R. P., et al. Circulating biomarkers for early detection and clinical management of colorectal cancer. Molecular Aspects of Medicine. 2019;69:107–122. doi: 10.1016/j.mam.2019.06.002.
- 29. Negrini S., Gorgoulis V. G., Halazonetis T. D. Genomic instability -- an evolving hallmark of cancer. Nature Reviews Molecular Cell Biology. 2010;11(3):220–228. doi: 10.1038/nrm2858.
- 30. André T., Shiu K. K., Kim T. W., et al. Pembrolizumab in microsatellite-instability-high advanced colorectal cancer. The New England Journal of Medicine. 2020;383(23):2207–2218. doi: 10.1056/NEJMoa2017699.
- 31. Munschauer M., Nguyen C. T., Sirokman K., et al. The NORAD lncRNA assembles a topoisomerase complex critical for genome stability. Nature. 2018;561(7721):132–136. doi: 10.1038/s41586-018-0453-z.
- 32. Hu W. L., Jin L., Xu A., et al. GUARDIN is a p53-responsive long non-coding RNA that is essential for genomic stability. Nature Cell Biology. 2018;20(4):492–502. doi: 10.1038/s41556-018-0066-7.
- 33.Hu Z., Mi S., Zhao T., et al. BGL3 lncRNA mediates retention of the BRCA1/BARD1 complex at DNA damage sites. The EMBO Journal . 2020;39(12, article e104133) doi: 10.15252/embj.2019104133. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Slik K., Turkki R., Carpen O., et al. CDX2 loss with microsatellite stable phenotype predicts poor clinical outcome in stage II colorectal carcinoma. The American Journal of Surgical Pathology . 2019;43(11):1473–1482. doi: 10.1097/PAS.0000000000001356. [DOI] [PubMed] [Google Scholar]
- 35.Dalerba P., Sahoo D., Paik S., et al. CDX2 as a prognostic biomarker in stage II and stage III colon cancer. The New England Journal of Medicine . 2016;374(3):211–222. doi: 10.1056/NEJMoa1506597. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Salem M. E., Bodor J. N., Puccini A., et al. Relationship between MLH1, PMS2, MSH2 and MSH6 gene-specific alterations and tumor mutational burden in 1057 microsatellite instability-high solid tumors. International Journal of Cancer . 2020;147(10):2948–2956. doi: 10.1002/ijc.33115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Tang R., Chen J., Tang M., et al. LncRNA SLCO4A1-AS1 predicts poor prognosis and promotes proliferation and metastasis via the EGFR/MAPK pathway in colorectal cancer. International Journal of Biological Sciences . 2019;15(13):2885–2896. doi: 10.7150/ijbs.38041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Zhang P. F., Wu J., Wu Y., et al. The lncRNA SCARNA2 mediates colorectal cancer chemoresistance through a conserved microRNA-342-3p target sequence. Journal of Cellular Physiology . 2019;234(7):10157–10165. doi: 10.1002/jcp.27684. [DOI] [PubMed] [Google Scholar]
- 39.Atef M. M., Amer A. I., Hafez Y. M., Elsebaey M. A., Saber S. A., Abd El-Khalik S. R. Long non-coding RNA EGFR-AS1 in colorectal cancer: potential role in tumorigenesis and survival via miRNA-133b sponge and EGFR/STAT3 axis regulation. British Journal of Biomedical Science . 2021;78(3):122–129. doi: 10.1080/09674845.2020.1853913. [DOI] [PubMed] [Google Scholar]
- 40.Tang Y., Tang R., Tang M., et al. LncRNA DNAJC3-AS1 regulates fatty acid synthase via the EGFR pathway to promote the progression of colorectal cancer. Frontiers in Oncology . 2021;10, article 604534 doi: 10.3389/fonc.2020.604534. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Wu X., Cui F., Chen Y., Zhu Y., Liu F. Long non-coding RNA LOXL1-AS1 enhances colorectal cancer proliferation, migration and invasion through miR-708-5p/CD44-EGFR axis. Oncotargets and Therapy . 2020;13:7615–7627. doi: 10.2147/OTT.S258935. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Tseng-Rogenski S. S., Hamaya Y., Choi D. Y., Carethers J. M. Interleukin 6 alters localization of hMSH3, leading to DNA mismatch repair defects in colorectal cancer cells. Gastroenterology . 2015;148(3):579–589. doi: 10.1053/j.gastro.2014.11.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Wunderlich C. M., Ackermann P. J., Ostermann A. L., et al. Obesity exacerbates colitis-associated cancer via IL-6-regulated macrophage polarisation and CCL-20/CCR-6-mediated lymphocyte recruitment. Nature Communications . 2018;9(1):p. 1646. doi: 10.1038/s41467-018-03773-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Ogino S., Kawasaki T., Ogawa A., Kirkner G. J., Loda M., Fuchs C. S. TGFBR2 mutation is correlated with CpG island methylator phenotype in microsatellite instability-high colorectal cancer. Human Pathology . 2007;38(4):614–620. doi: 10.1016/j.humpath.2006.10.005. [DOI] [PubMed] [Google Scholar]
- 45.Kikuchi T., Mimura K., Okayama H., et al. A subset of patients with MSS/MSI-low-colorectal cancer showed increased CD8(+) TILs together with up-regulated IFN-γ. Oncology Letters . 2019;18(6):5977–5985. doi: 10.3892/ol.2019.10953. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Bessa X., Elizalde J. I., Mitjans F., et al. Leukocyte recruitment in colon cancer: role of cell adhesion molecules, nitric oxide, and transforming growth factor β1. Gastroenterology . 2002;122(4):1122–1132. doi: 10.1053/gast.2002.32369. [DOI] [PubMed] [Google Scholar]
- 47.Giannini R., Zucchelli G., Giordano M., et al. Immune profiling of deficient mismatch repair colorectal cancer tumor microenvironment reveals different levels of immune system activation. The Journal of Molecular Diagnostics . 2020;22(5):685–698. doi: 10.1016/j.jmoldx.2020.02.008. [DOI] [PubMed] [Google Scholar]
- 48.Yu L., Yang X., Xu C., et al. Comprehensive analysis of the expression and prognostic value of CXC chemokines in colorectal cancer. International immunopharmacology . 2020;89(Part B, article 107077) doi: 10.1016/j.intimp.2020.107077. [DOI] [PubMed] [Google Scholar]
- 49.Liu S., Fan W., Gao X., et al. Estrogen receptor alpha regulates the Wnt/β-catenin signaling pathway in colon cancer by targeting the NOD-like receptors. Cellular Signalling . 2019;61:86–92. doi: 10.1016/j.cellsig.2019.05.009. [DOI] [PubMed] [Google Scholar]
- 50.Chen J., Ye X., Pitmon E., et al. IL-17 inhibits CXCL9/10-mediated recruitment of CD8(+) cytotoxic T cells and regulatory T cells to colorectal tumors. Journal for Immunotherapy of Cancer . 2019;7(1):p. 324. doi: 10.1186/s40425-019-0757-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Grinat J., Heuberger J., Vidal R. O., et al. The epigenetic regulator Mll1 is required for Wnt-driven intestinal tumorigenesis and cancer stemness. Nature Communications . 2020;11(1):p. 6422. doi: 10.1038/s41467-020-20222-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Alotaibi A. G., Li J. V., Gooderham N. J. Tumour necrosis factor-α (TNF-α) enhances dietary carcinogen-induced DNA damage in colorectal cancer epithelial cells through activation of JNK signaling pathway. Toxicology . 2021;457:p. 152806. doi: 10.1016/j.tox.2021.152806. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Wu Y., Zhang L., He S., et al. Identification of immune-related LncRNA for predicting prognosis and immunotherapeutic response in bladder cancer. Aging . 2020;12(22):23306–23325. doi: 10.18632/aging.104115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Cheng C., Wang Q., Zhu M., Liu K., Zhang Z. Integrated analysis reveals potential long non-coding RNA biomarkers and their potential biological functions for disease free survival in gastric cancer patients. Cancer Cell International . 2019;19(1):p. 123. doi: 10.1186/s12935-019-0846-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Martincorena I., Campbell P. J. Somatic mutation in cancer and normal cells. Science . 2015;349(6255):1483–1489. doi: 10.1126/science.aab4082. [DOI] [PubMed] [Google Scholar]
- 56.Müller M. F., Ibrahim A. E., Arends M. J. Molecular pathological classification of colorectal cancer. Virchows Archiv . 2016;469(2):125–134. doi: 10.1007/s00428-016-1956-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Afrăsânie V. A., Marinca M. V., Alexa-Stratulat T., et al. KRAS, NRAS, BRAF, HER2 and microsatellite instability in metastatic colorectal cancer - practical implications for the clinician. Radiology and Oncology . 2019;53(3):265–274. doi: 10.2478/raon-2019-0033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Esposito R., Esposito D., Pallante P., Fusco A., Ciccodicola A., Costa V. Oncogenic properties of the antisense lncRNACOMETinBRAF- andRET-driven papillary thyroid carcinomas. Cancer Research . 2019;79(9):2124–2135. doi: 10.1158/0008-5472.CAN-18-2520. [DOI] [PubMed] [Google Scholar]
- 59.Zhao Y., Li Y., Sheng J., et al. P53-R273H mutation enhances colorectal cancer stemness through regulating specific lncRNAs. Journal of Experimental & Clinical Cancer Research . 2019;38(1) doi: 10.1186/s13046-019-1375-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
The raw data used in this study were obtained from the TCGA database (https://portal.gdc.cancer.gov/), which is publicly available.