BioMed Research International. 2020 Nov 25;2020:3497810. doi: 10.1155/2020/3497810

The Development of Three-DNA Methylation Signature as a Novel Prognostic Biomarker in Patients with Colorectal Cancer

Shu Gong 1, Weijian Ye 2, Tiankai Liu 1, Shaofen Jian 3, Wenhua Liu 1
PMCID: PMC7714567  PMID: 33294438

Abstract

Aims

The prognosis of colorectal cancer (CRC) remains poor. This study aimed to develop and validate a DNA methylation-based signature model to predict the overall survival of CRC patients.

Methods

The methylation array data of CRC patients were retrieved from The Cancer Genome Atlas (TCGA) database. These patients were divided into training and validation datasets. A risk score model was established based on Kaplan-Meier and multivariate Cox regression analysis of the training cohort and tested in the validation cohort.

Results

Among a total of 14,626 candidate DNA methylation markers, we found that a three-DNA methylation signature (NR1H2, SCRIB, and UACA) was significantly associated with the overall survival of CRC patients. Subgroup analysis indicated that this signature could predict the overall survival of CRC patients regardless of age and gender.

Conclusions

We established a prognostic model consisting of three DNA methylation sites, which could be used as a potential biomarker to evaluate the prognosis of CRC patients.

1. Introduction

Colorectal cancer (CRC) is the third leading cause of cancer death worldwide [1]. Despite recent developments in early diagnosis and treatment techniques, the 5-year survival rate of CRC patients remains unsatisfactory. Current prognostic models for CRC based on characteristics such as age and gender are not precise. The ability to identify high-risk CRC patients may help clinical trials demonstrate clinical benefits [2]. Therefore, highly specific and sensitive prognostic biomarkers are urgently needed for accurate prediction of patient survival, which may provide important guidance for estimating treatment outcomes.

Recent evidence shows that epigenetic markers such as DNA methylation have potential as biomarkers for disease diagnosis and prognosis prediction [3–5]. Abnormal DNA methylation is common in tumors and can serve as one of the earliest distinguishing molecular characteristics of CRC. Therefore, novel DNA methylation prognostic signatures that distinguish high- and low-risk CRC patients are urgently needed and could help with treatment stratification and personalized therapy.

Up to now, the use of genome-wide methylation analysis for CRC has been limited by the size of DNA methylation datasets and the complexity of the statistical analysis. Reproducibility in the presence of other independent factors is also difficult to achieve. In this study, we analyzed colon adenocarcinoma (COAD) samples profiled on the 450k DNA methylation array from The Cancer Genome Atlas (TCGA) dataset to identify a prognostic panel for CRC. Next, we built a 3-DNA methylation biomarker model associated with patient survival, using the methylation levels of all methylation markers for CRC patients in TCGA. Finally, we applied the Kaplan-Meier method and ROC analysis to evaluate model performance. Our results showed that the 3-DNA methylation biomarkers could predict CRC patient survival with high accuracy and may serve as novel prognostic markers.

2. Materials and Methods

2.1. Data Source from TCGA Dataset

This is a bioinformatics analysis study, and no ethical statement was required. DNA methylation data and the related clinical information, including tumor stage, survival status, and survival time, were downloaded from the TCGA-COAD project (TCGA, https://cancergenome.nih.gov/). The TCGA-COAD methylation data were obtained using the Illumina Human Methylation 450k BeadChip (Illumina Inc., CA, USA). A total of 480 COAD tissue samples and 41 adjacent normal colon tissue samples were included in the TCGA-COAD cohort.

2.2. Data Analysis

Only samples with complete clinical data were selected for the DNA methylation correlation analysis. Samples with duplicated clinical information were removed. Ultimately, 457 samples, including cases and normal colon tissue, were included in this study, and the related clinical information for each sample was obtained from the database. We split the data into a training dataset (70% of the entire dataset) and a validation dataset (30% of the entire dataset). The training dataset was used to build the model and identify the prognostic biomarkers, and the validation dataset was used to check the accuracy of the model.
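The 70%/30% partition described above can be illustrated with a minimal Python sketch. The study itself used R and does not report how samples were assigned to each set, so the random shuffle, the fixed seed, the function name `split_train_validation`, and the placeholder sample identifiers are all assumptions made for illustration.

```python
import random

def split_train_validation(sample_ids, train_fraction=0.7, seed=0):
    """Shuffle sample IDs reproducibly and split them into two disjoint lists."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, used for reproducibility
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers; real TCGA barcodes are not reproduced here.
samples = [f"sample_{i:03d}" for i in range(100)]
train_ids, validation_ids = split_train_validation(samples)
print(len(train_ids), len(validation_ids))
```

Fixing the seed keeps the partition stable between runs, so any model fitted on the training set is always evaluated on the same held-out samples.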

2.3. Statistical Analysis

All statistical analyses were conducted using the R statistical package. Univariate Cox regression survival analysis was conducted in the training dataset to identify methylation markers significantly (p < 0.05) correlated with patient survival; these candidate markers were then entered into multivariate Cox regression analysis. Three markers were selected from the candidates to construct the final model. The AUC value was used to measure the predictive performance of the models; the higher the AUC value, the more reliable the model. The prognostic risk score for each patient was calculated as a weighted sum of the β values of the selected markers (the formula is given in Section 3.2), and the patients were separated into “low-risk” and “high-risk” groups using the median risk score as the cutoff point. Kaplan-Meier survival analysis was performed to estimate cumulative survival and to compare survival time between the two groups. ROC analysis of the methylation biomarkers was conducted with the “pROC” R package.
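As an illustration of the Kaplan-Meier step, the following Python sketch implements the standard product-limit estimator on toy data. It is not the authors' R code; the function name `kaplan_meier` and the example follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times  : follow-up time for each patient (for example, in days)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the fraction surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Toy data (hypothetical): six patients, two of them censored.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the survival estimate, which is what distinguishes this estimator from a simple proportion of survivors.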

3. Results

3.1. The Clinical Information of Samples

After filtering, we collected 457 samples from the TCGA-COAD project for this study. The median survival time of all samples was 2,821 days. About 54% of the samples were from male patients. The tumor histologic grade of all samples was assigned according to the World Health Organization criteria into stages I, II, III, and IV.

3.2. Identification of Signature Predicting CRC Prognosis

In the training dataset, univariate Cox regression was performed with the methylation level of each marker as the input variable, and 14,626 candidate DNA methylation markers were found to be significantly associated (p < 0.05) with overall survival (OS) of COAD patients. Next, multivariate Cox regression was applied to screen the candidate markers, and 3 methylation markers (cg14660573, cg09353563, and cg00110724) were found to form the optimum prognostic model for predicting OS of COAD patients. The risk scoring formula for these 3 methylation sites was as follows: risk score = (0.09 × β value of cg14660573) − (0.04 × β value of cg09353563) + (2.06 × β value of cg00110724).
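The risk scoring formula can be written out as a short Python sketch. The coefficients are those reported in the text; the function name `risk_score` and the patient's β values are hypothetical and serve only to show the arithmetic.

```python
# Coefficients as reported in the text for the three CpG sites.
COEFFICIENTS = {
    "cg14660573": 0.09,
    "cg09353563": -0.04,
    "cg00110724": 2.06,
}

def risk_score(beta_values):
    """Weighted sum of methylation beta values for the three CpG sites."""
    return sum(coef * beta_values[cpg] for cpg, coef in COEFFICIENTS.items())

# Hypothetical beta values for one patient (not taken from the study data).
patient = {"cg14660573": 0.50, "cg09353563": 0.80, "cg00110724": 0.10}
score = risk_score(patient)
print(round(score, 3))
```

For these illustrative values the three terms are 0.045, −0.032, and 0.206, so the SCRIB site (cg00110724) dominates the score because of its much larger coefficient.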

3.3. The Association between 3-DNA Methylation Signature Predicting Model and COAD Cohort in the Training and Validation Datasets

The risk score of the 3-DNA methylation markers was calculated for each sample; the distribution of risk scores is shown in Figure 1(a). Then, the median risk score was used as the cut-off value to classify the dataset into a high-risk group (N = 201) and a low-risk group (N = 109). The mortality rate in the high-risk group was higher than that in the low-risk group (Figure 1(b)). In the training dataset, Kaplan-Meier analysis showed that samples in the high-risk group had a significantly lower survival rate than those in the low-risk group (p = 0.038, Figure 2(a)). Using this model, the same tendency was seen in the validation dataset and in the entire dataset (Figures 2(b) and 2(c)). These results indicated that the 3-DNA methylation markers could distinguish high-risk patients from low-risk patients.
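A median cut-off of this kind can be sketched in Python as follows. The patient identifiers and scores are hypothetical, and the handling of scores exactly equal to the median (assigned here to the low-risk group) is an assumption, since the text does not specify it.

```python
from statistics import median

def classify_by_median(scores):
    """Label each patient high-risk or low-risk relative to the median score."""
    cutoff = median(scores.values())
    groups = {
        # Assumption: a score equal to the cutoff is treated as low-risk.
        pid: ("high-risk" if s > cutoff else "low-risk")
        for pid, s in scores.items()
    }
    return cutoff, groups

# Hypothetical risk scores for six patients.
scores = {"P1": 0.12, "P2": 0.31, "P3": 0.22, "P4": 0.45, "P5": 0.18, "P6": 0.27}
cutoff, groups = classify_by_median(scores)
print(round(cutoff, 3), groups)
```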

Figure 1.

Risk score analysis of the 3-methylation signature and survival distribution by risk scores. (a) Risk score distribution of the 3-methylation signature. (b) Survival status distribution by risk scores.

Figure 2.

Kaplan-Meier curve of OS for high-risk and low-risk groups based on 3-methylation signature in training dataset (a), validation dataset (b), and all cohort dataset (c).

In addition, the methylation levels of the three biomarkers were analyzed individually. The results showed that methylation levels of cg14660573 and cg00110724 were higher in the high-risk group than in the low-risk group, while methylation levels of cg09353563 were lower in the high-risk group than in the low-risk group (Figure 3). The three markers mapped to the genes nuclear receptor subfamily 1 group H member 2 (NR1H2; cg14660573), Scribble Planar Cell Polarity Protein (SCRIB; cg00110724), and Uveal Autoantigen With Coiled-Coil Domains And Ankyrin Repeats (UACA; cg09353563); detailed information is listed in Table 1.

Figure 3.

The differential methylation levels of cg14660573, cg09353563, and cg00110724 in high-risk and low-risk groups. Mann–Whitney U test was used to compare the differences between high-risk and low-risk groups.

Table 1.

Three DNA methylation markers related to CRC risk.

| ID | Chromosome location | Gene symbol | CGI coordinate | Feature type | p value (univariate) | p value (multivariate) |
|---|---|---|---|---|---|---|
| cg14660573 | chr19:50376162-50376163 | NR1H2 | chr19:50376349-50377026 | N_Shore | 5e-06 | 0.004 |
| cg00110724 | chr8:143803681-143803682 | SCRIB | chr8:143803099-143803933 | Island | 0.000208 | 0.013 |
| cg09353563 | chr15:70702096-70702097 | UACA | chr15:70762559-70763891 | None | 1.2e-05 | 3e-04 |

3.4. Validation of the 3-DNA Methylation Signature for Predicting CRC Prognosis

To examine the prediction performance of the 3-DNA methylation signature for CRC prognosis, ROC analysis was conducted to evaluate its sensitivity and specificity in the validation dataset. The AUC of the 3-DNA methylation model was 0.673 (Figure 4), which indicated that this model could discriminate between patient outcomes in terms of sensitivity (the true positive rate, TPR) and specificity (1 − FPR, where FPR is the false-positive rate).
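To make the relationship between AUC, TPR, and FPR concrete, here is a minimal Python sketch on toy data. The study used the pROC R package; the pairwise-comparison formulation below is a standard equivalent definition of AUC, and the function names `roc_auc` and `tpr_fpr`, the scores, and the labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def tpr_fpr(scores, labels, threshold):
    """True positive rate and false-positive rate when score > threshold is called positive."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    return tp / n_pos, fp / n_neg

# Toy data (hypothetical): label 1 = death observed, 0 = alive.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
tpr, fpr = tpr_fpr(scores, labels, threshold=0.5)
print(round(auc, 3), tpr, round(1 - fpr, 3))  # AUC, sensitivity, specificity
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for reading the AUC values reported in this section.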

Figure 4.

ROC curve showing the AUC of the 3-DNA methylation signature in predicting OS of CRC patients.

3.5. Prediction Performance of the 3-DNA Methylation Signature in CRC Patient Subgroups

To examine whether the 3-DNA methylation signature is applicable across different clinical cohorts, the clinical features age and gender were used to regroup the samples, and each subgroup was then further classified into high-risk and low-risk groups using the 3-DNA methylation biomarker model. First, patients were divided into two cohorts based on age at initial diagnosis: ≤73 (N = 212) and >73 (N = 98). Kaplan-Meier curves showed that patients in the low-risk group had significantly longer OS in the younger age group (Figure 5(a)), and ROC analysis showed that the AUC value was 0.593 in this age group (Figure 5(b)). Similarly, Kaplan-Meier curves showed that patients in the low-risk group had significantly longer OS in the older age group (Figure 5(c)), and ROC analysis showed that the AUC value was 0.847 in this age group (Figure 5(d)).

Figure 5.

Kaplan-Meier and ROC analyses of patients in different age cohorts based on the age at initial diagnosis: ≤73 (N = 212, 68.4%), >73 (N = 98, 31.6%). (a, c) Kaplan-Meier analysis was performed to estimate the differences in OS between the low-risk and high-risk patients. (b, d) ROC curves of the 3-DNA methylation signature were used to demonstrate the sensitivity and specificity in predicting the OS of patients.

Next, patients were divided into two cohorts based on gender. Kaplan-Meier curves showed that patients in the low-risk group had significantly longer OS in the female group (Figure 6(a)), and ROC analysis showed that the AUC value was 0.796 in the female group (Figure 6(b)). Similarly, Kaplan-Meier curves showed that patients in the low-risk group had significantly longer OS in the male group (Figure 6(c)), and ROC analysis showed that the AUC value was 0.678 in the male group (Figure 6(d)). Taken together, these data demonstrated that the 3-DNA methylation signature model could predict the survival status of COAD patients regardless of age and gender.
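The regrouping used in this subgroup analysis can be sketched in Python as a simple stratification step, after which the risk classification and survival analysis would be run within each subgroup. The patient records, field names, and helper functions `stratify` and `age_group` are hypothetical; only the age cutoff of 73 comes from the text.

```python
from collections import defaultdict

def stratify(patients, key):
    """Group patient records by the value returned by key(patient)."""
    groups = defaultdict(list)
    for p in patients:
        groups[key(p)].append(p)
    return dict(groups)

def age_group(patient, cutoff=73):
    """Assign a patient to the younger or older cohort using the cutoff from the text."""
    return "<=73" if patient["age"] <= cutoff else ">73"

# Hypothetical patient records (not taken from the study data).
patients = [
    {"id": "P1", "age": 60, "gender": "female"},
    {"id": "P2", "age": 75, "gender": "male"},
    {"id": "P3", "age": 73, "gender": "male"},
    {"id": "P4", "age": 80, "gender": "female"},
    {"id": "P5", "age": 55, "gender": "male"},
]
by_age = stratify(patients, age_group)
by_gender = stratify(patients, lambda p: p["gender"])
print({k: len(v) for k, v in by_age.items()})
print({k: len(v) for k, v in by_gender.items()})
```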

Figure 6.

Kaplan-Meier and ROC analyses of patients in different gender cohorts: female (N = 144, 46.5%) and male (N = 166, 53.5%). (a, c) Kaplan-Meier analysis was performed to estimate the differences in OS between the low-risk and high-risk patients. (b, d) ROC curves of the 3-DNA methylation signature were used to demonstrate the sensitivity and specificity in predicting the OS of patients.

4. Discussion

The absence of highly effective and specific biomarkers for predicting prognosis remains a challenge in the clinical management of CRC patients. The recent rapid development of omics technologies such as genomics, transcriptomics, and proteomics provides new hope for establishing valuable prognostic models for CRC [6–10]. However, large-scale measurements of the expression levels of mRNAs, microRNAs, and long noncoding RNAs, or of DNA polymorphisms, can vary with the platforms and techniques used in different studies [11, 12]. In contrast, epigenetic changes such as DNA methylation are reliable and specific cancer biomarkers that can be detected by PCR in blood and other body fluid samples, which are easily obtained through noninvasive approaches [13]. In particular, evidence from several studies indicates that combinations of several DNA methylation markers achieve higher sensitivity and specificity for cancer prognosis than individual DNA methylation markers [14–16]. Therefore, in this study, we aimed to develop a DNA methylation signature significantly associated with CRC prognosis, using univariate and multivariate Cox proportional hazards regression analysis and ROC analysis based on genome-wide DNA methylation data. Our results demonstrated that the three-DNA methylation signature can distinguish high- and low-risk groups well and that the risk score calculated from the three-DNA methylation signature could be a prognostic indicator for CRC patients.

Interestingly, the three-DNA methylation signature was related to methylation in three genes encoding NR1H2, SCRIB, and UACA, respectively. NR1H2 is a member of the nuclear receptor superfamily and is composed of a central DNA-binding domain and a C-terminal ligand-binding domain. NR1H2 can regulate glucose and cholesterol metabolism and is potentially involved in tumorigenesis [17]. A recent study reported that NR1H2 mRNA levels were lower in CRC tissues than in control tissues [18]. This is consistent with our result that methylation levels of cg14660573 (NR1H2 gene) were higher in the high-risk group than in the low-risk group, suggesting that methylation of the NR1H2 gene may contribute to CRC.

SCRIB is a membrane protein that plays a role in maintaining the apical-basal cell polarity of epithelial tissue and is implicated in tumorigenesis [19]. Downregulation of SCRIB could disrupt epithelial polarity and was strongly correlated with poor survival in prostate cancer patients [20]. Consistent with the potential tumor suppressor role of SCRIB, we found that methylation levels of cg00110724 (SCRIB gene) were higher in the high-risk group than in the low-risk group, suggesting that methylation of the SCRIB gene may lead to loss of tumor suppression and promote CRC.

UACA was first identified as an autoantigen in panuveitis patients, and later studies showed that UACA expression was higher in lung adenocarcinoma and squamous cell carcinoma, independent of tumor grade [21]. A recent study reported that the urine level of UACA was higher in prostate cancer patients and that UACA could be a biomarker for prostate cancer [22]. In this study, we found that methylation levels of cg09353563 (UACA gene) were lower in the high-risk group than in the low-risk group, indicating that upregulation of UACA may promote CRC.

To our knowledge, this is the first study of a 3-DNA methylation signature as a prognostic biomarker for predicting CRC patient survival. While our model based on the 3-DNA methylation signature showed high sensitivity and specificity in distinguishing high-risk and low-risk patients, it is still too early to conclude that our model is superior to traditional methods such as imaging for predicting CRC patient outcomes. In addition, while our model incorporated age and gender to predict the OS of CRC patients, we did not analyze other clinical characteristics because of the limited information in our study cohorts. In future studies, we need to measure the expression levels and predictive efficacy of the three methylation sites in CRC cells and animal models of CRC to explore the molecular mechanisms underlying CRC progression.

In conclusion, we established a prognostic model consisting of 3 DNA methylation sites and validated the high sensitivity and specificity of this model in training and validation cohorts. Further studies are needed to confirm that the 3-DNA methylation signature could be used as a potential biomarker to evaluate the prognosis of CRC patients.

Acknowledgments

This study was supported by funds from Guangdong Provincial Department of Education (No. 2017GkQNCX101), Guangdong Provincial Health and Family Planning Commission (No. A2018223), and PhD Start-up Fund of Zhaoqing University (611/221765).

Data Availability

All data are available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  • 1. Siegel R. L., Miller K. D., Goding Sauer A., et al. Colorectal cancer statistics, 2020. CA: a Cancer Journal for Clinicians. 2020;70(3):145–164. doi: 10.3322/caac.21601.
  • 2. Wang T., Maden S. K., Luebeck G. E., et al. Dysfunctional epigenetic aging of the normal colon and colorectal cancer risk. Clinical Epigenetics. 2020;12(1):5. doi: 10.1186/s13148-019-0801-3.
  • 3. Chen D., Wu H., He B., et al. Five hub genes can be the potential DNA methylation biomarkers for cholangiocarcinoma using bioinformatics analysis. Oncotargets and Therapy. 2019;12:8355–8365. doi: 10.2147/OTT.S203342.
  • 4. Wang L., Mao Q., Zhou S., Ji X. Hypermethylated KLF9 is an independent prognostic factor for favorable outcome in breast cancer. Oncotargets and Therapy. 2019;12:9915–9926. doi: 10.2147/OTT.S226121.
  • 5. Grazziotin-Soares D., Lotz J. P. Un sous-groupe de Cancer gastrique positif au virus d’Epstein-Barr (EBV) identifié pour sa sensibilité à l’immunothérapie. Oncologie. 2019;21(1-4):69–72. doi: 10.3166/onco-2019-0030.
  • 6. Di Z., Di M., Fu W., et al. Integrated analysis identifies a nine-microRNA signature biomarker for diagnosis and prognosis in colorectal cancer. Frontiers in Genetics. 2020;11:192. doi: 10.3389/fgene.2020.00192.
  • 7. Yang Z. D., Kang H. Exploring prognostic potential of long noncoding RNAs in colorectal cancer based on a competing endogenous RNA network. World Journal of Gastroenterology. 2020;26(12):1298–1316. doi: 10.3748/wjg.v26.i12.1298.
  • 8. Lin K., Huang J., Luo H., et al. Development of a prognostic index and screening of potential biomarkers based on immunogenomic landscape analysis of colorectal cancer. Aging. 2020;12(7):5832–5857. doi: 10.18632/aging.102979.
  • 9. Chen J., He Q., Wu P., et al. ZMYND8 expression combined with pN and pM classification as a novel prognostic prediction model for colorectal cancer: based on TCGA and GEO database analysis. Cancer Biomarkers. 2020;28(2):201–211. doi: 10.3233/CBM-191261.
  • 10. Li S., Chen S., Wang B., Zhang L., Su Y., Zhang X. A robust 6-lncRNA prognostic signature for predicting the prognosis of patients with colorectal cancer metastasis. Frontiers in Medicine. 2020;7. doi: 10.3389/fmed.2020.00056.
  • 11. Li H., Liao C., Weng W., Zhong H., Zhou T. Association of hypoxia-inducible factor-1α (HIF1α) 1772C/T gene polymorphism with susceptibility to renal cell carcinoma/prostate cancer. Biocell. 2020;44(2):257–262. doi: 10.32604/biocell.2020.08826.
  • 12. Dairong L., Zhuo X., Huang L., Xiaohui J., Wang D. XRCC1 Arg399Gln and Arg194Trp polymorphisms regulate XRCC1 expression and chemoresistance of non-small cell lung cancer cells. Biocell. 2019;43(3):139–144. doi: 10.32604/biocell.2019.06460.
  • 13. Pan Y., Liu G., Zhou F., Su B., Li Y. DNA methylation profiles in cancer diagnosis and therapeutics. Clinical and Experimental Medicine. 2018;18(1):1–14. doi: 10.1007/s10238-017-0467-0.
  • 14. Dai W., Teodoridis J. M., Zeller C., et al. Systematic CpG islands methylation profiling of genes in the wnt pathway in epithelial ovarian cancer identifies biomarkers of progression-free survival. Clinical Cancer Research. 2011;17(12):4052–4062. doi: 10.1158/1078-0432.CCR-10-3021.
  • 15. Guo W., Zhu L., Yu M., Zhu R., Chen Q., Wang Q. A five-DNA methylation signature act as a novel prognostic biomarker in patients with ovarian serous cystadenocarcinoma. Clinical Epigenetics. 2018;10(1):142. doi: 10.1186/s13148-018-0574-0.
  • 16. Li C., Zheng Y., Pu K., Da Zhao Y. W., Guan Q., Zhou Y. A four-DNA methylation signature as a novel prognostic biomarker for survival of patients with gastric cancer. Cancer Cell International. 2020;20(1). doi: 10.1186/s12935-020-1156-8.
  • 17. Cariello M., Ducheix S., Maqdasy S., Baron S., Moschetta A., Lobaccaro J.-M. A. LXRs, SHP, and FXR in prostate cancer: enemies or Ménage à Quatre with AR? Nuclear Receptor Signaling. 2018;15:155076291880107. doi: 10.1177/1550762918801070.
  • 18. Sharma B., Gupta V., Dahiya D., Kumar H., Vaiphei K., Agnihotri N. Clinical relevance of cholesterol homeostasis genes in colorectal cancer. Biochimica et Biophysica Acta - Molecular and Cell Biology of Lipids. 2019;1864(10):1314–1327. doi: 10.1016/j.bbalip.2019.06.008.
  • 19. Saito Y., Desai R. R., Muthuswamy S. K. Reinterpreting polarity and cancer: the changing landscape from tumor suppression to tumor promotion. Biochimica et Biophysica Acta - Reviews on Cancer. 2018;1869(2):103–116. doi: 10.1016/j.bbcan.2017.12.001.
  • 20. Pearson H. B., Perez-Mancera P. A., Dow L. E., et al. SCRIB expression is deregulated in human prostate cancer, and its deficiency in mice promotes prostate neoplasia. The Journal of Clinical Investigation. 2011;121(11):4257–4267. doi: 10.1172/JCI58509.
  • 21. Burikhanov R., Shrestha-Bhattarai T., Qiu S., et al. Novel mechanism of apoptosis resistance in cancer mediated by extracellular PAR-4. Cancer Research. 2013;73(2):1011–1019. doi: 10.1158/0008-5472.CAN-12-3212.
  • 22. Solé C., Goicoechea I., Goñi A., et al. The urinary transcriptome as a source of biomarkers for prostate cancer. Cancers. 2020;12(2):513. doi: 10.3390/cancers12020513.
