Abstract
Background: There are still no absolute parameters that predict the progression of adenoma to cancer. The present study aimed to characterize functional differences across the multistep carcinogenic process of the adenoma-carcinoma sequence. Methods: Tissue samples were collected and mRNA expression profiling was performed using Agilent high-throughput microarray (gene-chip) technology. The mRNA expression profiles of the adenoma-carcinoma sequence were then characterized with bioinformatics software, and the relationship between these profiles and the clinical prognosis of colorectal cancer was analyzed. Results: The mRNA expression profiles differed significantly between the high-grade intraepithelial neoplasia group and the adenocarcinoma group. Gene ontology enrichment analysis (biological process) of the genes differentially expressed between these two groups showed enrichment in extracellular structure organization, skeletal system development, biological adhesion and its regulation, and growth regulation, with P values of less than 0.05 after FDR correction. In addition, the IPR-related proteins were mainly insulin-like growth factor binding proteins. Conclusion: Changes in gene expression along the adenoma-carcinoma sequence were mainly concentrated in the transition between high-grade intraepithelial neoplasia and adenocarcinoma. The genes differentially expressed between the high-grade intraepithelial neoplasia group and the adenocarcinoma group were significantly associated with the survival of colorectal cancer patients. Bioinformatics analysis is an effective way to study gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool for extending colorectal cancer research strategies to colorectal adenoma and advanced adenoma.
Keywords: Colorectal cancer, adenoma-carcinoma sequence, advanced colorectal adenoma, mRNA expression profile, prognosis
Introduction
Colorectal cancer (CRC) is the third most common cancer, with an estimated 1.2 million cases and 608,700 deaths worldwide in 2008 [1]. While the overall effects and interactions of environmental and lifestyle factors [2] and the inherited and acquired genetic and epigenetic alterations [3-5] on CRC development are still incompletely understood, knowledge has been improved in recent years.
Colorectal adenomas are precursor lesions of CRC. Fearon and Vogelstein [6] proposed the concept of the adenoma-carcinoma sequence, postulating that the histopathological transition from adenoma to carcinoma in patients with CRC is associated with an accumulation of genetic events that confer a significant growth advantage on a clonal population of cells. This multistep genetic model has identified a number of key regulatory oncogenes and tumor suppressor genes that acquire either activating or loss-of-function mutations, driving the progression from normal colonic epithelium to cancer cells [3,7,8]. It has been suggested that at least seven distinct genetic changes may be required for the progression from adenoma to carcinoma, although it is the accumulation of changes, rather than their specific nature or temporal order, that appears to matter [4]. Despite Vogelstein’s model [9] and the concept of high-risk or ‘advanced’ adenoma [10,11], there are still no absolute criteria that can predict the progression of adenoma to cancer. It has been reported that approximately 40% of the Western population will develop adenomas [12,13]. However, only 5% of these people will suffer from CRC [14]. This raises the question of whether it is necessary to remove all endoscopically detected adenomas.
Colorectal carcinogenesis is a multistep process involving the gradual accumulation of genetic and epigenetic alterations. These changes promote the malignant transformation of precancerous lesions of the colorectal mucosa [15], a process reflected by progressively severe cellular dysplasia and an increase in lesion size. At least two-thirds of all CRCs develop from precancerous lesions with adenomatous features [16]. The molecular events of CRC have been intensively investigated with high-throughput, array-based tools, which furnish quantitative, genome-wide descriptions of the individual gene expression associated with different cell phenotypes (e.g., adenoma cells vs. normal epithelial cells) [17-20]. More recently, other methods of analyzing gene expression data have been developed to gain additional insight into the mechanisms driving the phenotypic changes. Several methods have been used for the quantitative analysis of gene expression profiles [21-23], including gene set enrichment analysis (GSEA) [24], the R project software (R) [25], and gene list analysis with prediction accuracy [23].
With the recent advent of microarray technology, risk assessment for CRC has been improved by gene expression profiling. The present study investigated the subtypes of the adenoma-carcinoma sequence, aiming to better characterize their functional differences in the multistep process of carcinogenesis. In contrast to previous studies, a new iterative clustering method was employed, which allows the detection of expression patterns of varying strength. The adenoma-adenocarcinoma sequence served as a template: differences in mRNA expression profiles were determined with high-throughput gene chips in normal mucosa, low-grade adenoma, high-grade adenoma, and colorectal adenocarcinoma tissues. The gene expression profiles were then characterized, their relationship with the prognosis of CRC patients was evaluated, and prognostic molecular markers were screened. Our findings may provide a theoretical basis for in-depth study of the molecular pathogenesis, drug discovery, diagnosis and personalized treatment of CRC.
Materials and methods
Specimens
Colorectal low-grade intraepithelial neoplasia (LIN), high-grade intraepithelial neoplasia (HIN) and adenocarcinoma tissues were obtained from patients undergoing colorectal surgery in the Department of Colorectal Surgery, Beijing Shijitan Hospital, Capital Medical University between 2010 and 2012. Biopsies were also performed to collect colorectal LINs, HINs and adenocarcinomas from patients undergoing colonoscopy in the Department of Endoscopy, General Hospital of the People’s Liberation Army (PLA) and the Cancer Hospital of the Chinese Academy of Medical Sciences between 2009 and 2011. Patients with a history of familial adenomatous polyposis, hereditary non-polyposis CRC, or inflammatory bowel disease were excluded from the present study. LIN is defined as low-grade adenoma/dysplasia; HIN includes high-grade adenoma/dysplasia, carcinoma in situ, suspicious invasive carcinoma and intramucosal carcinoma; adenocarcinoma is defined as a tumor with submucosal invasion. According to the ASGE guideline, biopsy specimens of colorectal neoplasia were taken from 4 to 6 different areas, including the edges and the center of the lesions. Normal colorectal mucosal samples were taken from the surgical specimens of patients with hemorrhoids undergoing surgical excision in the Department of Surgical Oncology of Beijing Shijitan Hospital between 2009 and 2010. Tissue samples were snap-frozen in liquid nitrogen immediately after biopsy or surgery and stored at -80°C. A portion of each tissue was subjected to pathological examination. Histopathological assessment of all samples was performed by two independent and experienced pathologists blinded to the study. Samples meeting the diagnostic criteria for normal mucosa and neoplasia (neoplastic cells > 70%) were enrolled. If more than one biopsy tissue from the same patient was enrolled, these samples were pooled.
This study was approved by the Ethics Committees of the Peking Union Medical College and the Beijing Shijitan Hospital, Capital Medical University, and informed consent was obtained from all patients.
RNA isolation
Total RNA was extracted from frozen tissues using the TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s instructions. RNA integrity was determined using a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). If the RNA integrity number (RIN) was greater than or equal to five, the total RNA was purified using the RNeasy Mini Kit (Cat No. 74106, Qiagen, Germany). The RNA concentration was determined using a NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA).
Microarray expression profiling
After histopathological examination and RNA integrity assessment, 12 normal colorectal mucosa, 85 LIN, 42 HIN, and 66 adenocarcinoma samples were subjected to mRNA microarray assay. Purified RNA samples were labeled and hybridized to Agilent 4 × 44K Whole Human Genome Oligo Microarrays (G4112F) according to the manufacturer’s instructions.
Statistical analysis of data from microarray assay
Normalization of microarray data
The raw data from the mRNA microarray assay were median-normalized with the GeneSpring GX software (version 12.0; Silicon Genetics, Redwood City, CA, USA). A total of 19,476 unique probes were obtained based on GeneID, retaining for each gene the probe most frequently flagged P (present). Quantile normalization of the raw microarray data was performed using the R project software (R Foundation for Statistical Computing, Vienna, Austria).
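As an illustration of the quantile normalization step, the following is a minimal Python sketch rather than the R code used in the study, which is not given in the text. The function name `quantile_normalize` and the toy intensities are hypothetical, and ties are broken by original order, which is a simplification of what established packages do.

```python
from statistics import mean

def quantile_normalize(columns):
    """Quantile-normalize a list of equally long sample columns.

    Each column holds the probe intensities of one array. Ties are broken
    by original order (a simplification, assumed for this sketch).
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all samples must have the same number of probes")

    # Sort every sample independently, then average across samples rank by rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n_rows)]

    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity in this sample.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized

# Hypothetical intensities for three arrays and four probes.
samples = [[5.0, 2.0, 3.0, 4.0],
           [4.0, 1.0, 4.5, 2.0],
           [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(samples)
```

After this transformation every array has the same set of values, so differences between arrays reflect only the rank of each probe within its array.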
Differentially expressed mRNAs and GO enrichment analysis
Differentially expressed mRNAs between samples at different stages of colorectal carcinogenesis were assessed using a generalized linear model (Bonferroni-corrected P < 0.05) in the R project software. Gene Ontology (GO) enrichment analysis was performed using the DAVID software (false discovery rate [FDR]-corrected P < 0.05). The genes of interest were subjected to pathway enrichment analysis using an R-based software package; a hypergeometric distribution test was used to calculate the P value, which was adjusted using the FDR method.
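The hypergeometric test used for enrichment can be sketched as follows. This is an illustrative Python implementation under stated assumptions, not the R package used in the study, which is not named. The function name and the choice of background (all genes on the array) are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric P value, P(X >= n_overlap).

    n_background: genes in the background (assumed: all genes on the array)
    n_annotated:  background genes annotated to the term or pathway
    n_selected:   genes in the list of interest
    n_overlap:    genes of interest that carry the annotation
    """
    max_overlap = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total

# Toy case: 10 background genes, 5 annotated, 5 selected, all 5 overlapping.
p_toy = hypergeom_enrichment_p(10, 5, 5, 5)
```

One P value is obtained per term; the resulting set of P values would then be adjusted for multiple testing, for example with the FDR method named in the text.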
Kaplan-Meier survival analysis and Cox regression analysis
The mRNA expression profiles of CRC with prognostic information were retrieved from the Gene Expression Omnibus under accession numbers GSE17536, GSE14333, and GSE17537. The survival data were extracted from the original publications. In addition, a further survival data set, named ZJU-CRC, was collected from Zhejiang University. Gene-related number (GRN) analysis was conducted using the genes of interest in each data set. The first gene-related number (GRN1) captures the greatest amount of total variance in the profiles, and the patients were divided into two groups of equal size based on the rank order of GRN across their tumor profiles. Kaplan-Meier survival analysis was performed to evaluate the association between GRN-assigned groups and survival time. A log-rank test was applied for comparisons. A Cox proportional hazards regression model was used to evaluate the prognostic factors in a stepwise manner using SPSS 15.0 (SPSS Inc., Chicago, USA). A value of P < 0.05 was regarded as statistically significant.
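The grouping and comparison described above can be sketched in Python as a median split followed by a two-group log-rank test. This is a minimal illustration, not the authors' code. The function names, the 'H'/'L' labels, and the use of a strict median cut (which gives equal groups only when there are no ties) are assumptions.

```python
from math import erfc, sqrt
from statistics import median

def median_split(scores):
    """Label each patient 'H' (score above the median) or 'L' (otherwise)."""
    cut = median(scores)
    return ["H" if s > cut else "L" for s in scores]

def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    groups: 'H' or 'L' per patient
    Returns (chi-square statistic, P value on 1 degree of freedom).
    """
    observed_h = expected_h = variance = 0.0
    # Iterate over the distinct times at which at least one death occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e == 1}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        deaths = [g for ti, e, g in zip(times, events, groups) if ti == t and e == 1]
        n, n_h = len(at_risk), at_risk.count("H")
        d, d_h = len(deaths), deaths.count("H")
        observed_h += d_h
        expected_h += d * n_h / n
        if n > 1:
            variance += d * (n_h / n) * (1 - n_h / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_h - expected_h) ** 2 / variance
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical scores, follow-up times and event indicators.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
groups = median_split(scores)
chi2, p_value = logrank_test(times, events, groups)
```

In practice a dedicated survival package would also produce the Kaplan-Meier curves themselves; the sketch only covers the group assignment and the test statistic.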
Results
Gene expression profiles of precancerous lesions and malignant CRC
To clarify the genomic changes occurring early during colorectal tumorigenesis, a gene microarray assay was performed using Agilent 4 × 44K Whole Human Genome Oligo Microarrays in normal colonic tissues (n = 12), precancerous lesions (85 LIN, 42 HIN), and CRCs (n = 66). Hierarchical clustering analysis identified 19,476 probes that were frequently differentially expressed in tumor tissues. Subsequent K-means clustering analysis using these probe sets revealed that the samples could be clearly categorized into three subclasses based on their expression levels.
To further characterize the genes whose expression changed progressively during colorectal tumorigenesis, bioinformatics analysis was then performed in a series of precancerous lesions in which LIN was present along with protruding HIN components within the same lesions. Based on the results summarized above, a series of marker genes was selected to characterize the differential gene expression profiles of precancerous and malignant lesions. The results showed that these genes were involved in biological responses to stimuli, developmental processes, signal transduction, biological adhesion, cellular processes, the immune system, etc.
Expression of CRC carcinogenesis related genes and clinical prognosis of patients with CRC
The microarray data and clinical information of 574 CRC patients (Table 1) were then downloaded from public databases and one unpublished data set. Table 1 summarizes the clinical characteristics of the patients used to relate gene expression in the adenoma-carcinoma sequence model to clinical prognosis. According to the data source, the 4 groups were named GSE14333, GSE17537, GSE17536 and ZJU_CRC, respectively.
Table 1.
Clinical characteristics of CRC patients from 3 independent public data sets and 1 unpublished data set
Clinical characteristics | GSE14333 (N = 290) | GSE17537 (N = 55) | GSE17536 (N = 177) | ZJU_CRC (N = 52) |
---|---|---|---|---|
Age (yr) | ||||
Median | 67 | - | - | 61 |
Range | 26-92 | - | - | 26-93 |
Average, SD | - | 62.3 ± 14.1 | 65.5 ± 13.1 | 60.2 ± 13.5 |
Follow-up Time | ||||
Median | 37.2 | 50.2 | 48.1 | 49.5 |
Range | 0.9-118.6 | 0.4-111.3 | 0.92-142.6 | 4-100 |
Terminal Event = Death | ||||
n (%) | 48 (16.5) | 20 (36.3) | 73 (41.2) | 22 (42.3) |
Deficiency | 60 (20.7) | - | - | 1 (1.9) |
Sex (%) | ||||
Male | - | 30 (54.5) | 96 (54.2) | 29 (55.8) |
Female | - | 25 (45.5) | 81 (45.8) | 23 (44.2) |
TNM Stage (%) | ||||
I | 44 (15.2) | 4 (7.3) | 24 (13.6) | 0 (0) |
II | 94 (32.4) | 15 (27.3) | 57 (32.2) | 17 (32.7) |
III | 91 (31.4) | 19 (34.5) | 57 (32.2) | 35 (67.3) |
IV | 61 (21) | 17 (30.9) | 39 (22) | 0 (0) |
Lymph Node Metastasis (%) | ||||
No | 138 (47.6) | 19 (34.5) | 81 (45.8) | 17 (32.7) |
Yes | 91 (31.4) | 19 (34.5) | 57 (32.2) | 35 (67.3) |
N/A | 61 (21) | 17 (30.9) | 39 (22) | 0 (0) |
Correlations between overall survival of CRC and adenoma-carcinoma sequence
First, the overall expression levels of the adenoma-carcinoma sequence genes were screened by unpaired t test; P values were corrected by the Benjamini-Hochberg FDR method, and the significance level was α = 0.05. A total of 93 probes showed significant differences, of which 58 were up-regulated and 35 were down-regulated (Figure 1).
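The Benjamini-Hochberg correction applied to the per-probe P values can be illustrated with a short Python sketch. The function name and the raw P values are hypothetical. The t tests that produce the raw values are not shown.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.008, 0.039, 0.041, 0.20]  # hypothetical raw P values
adj = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adj]  # significance level alpha = 0.05
```

With these toy inputs only the first two probes remain significant after correction, although four of the five raw P values are below 0.05.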
Figure 1.
The differentially expressed genes between groups of adenoma-carcinoma sequence model. Green: normal mucosa (N) and the low-level adenomas (grade L) group; red: low-level adenomas (grade L) and high-level adenomas (grade H) group; blue: high-level adenoma (grade H) and adenocarcinoma (T) group. Circle: differentially expressed probes between two or three groups.
Second, overall survival time was analyzed using R software, with death as the end event, in patients followed up in the 4 independent data sets. The results showed no significant correlation between the overall gene expression profiles and the prognosis of CRC patients (Figure 2).
Figure 2.
Kaplan-Meier overall survival and overall gene expression profiles. The overall survival K-M curves of GRN were delineated in low to high grade CRC data sets. Solid and dashed lines: K-M curves of the low- and high-GRN patient groups; short vertical lines: censored values. The curve labeled “H” represents high expression. (P values from left to right and top to bottom: 0.99, 0.15, 0.21 and 0.43).
Differentially expressed genes significantly related to the overall survival time between high grade adenoma group and adenocarcinoma groups from two independent samples of CRC patients
General characteristics of gene expression profiles between high grade adenoma group and adenocarcinoma group
Differentially expressed genes between the high-grade adenoma group and the adenocarcinoma group were identified using the unpaired t-test with Benjamini-Hochberg FDR correction of P values for multiple testing (α = 0.05). Principal component analysis (PCA; Figure 3) showed that the total mRNA expression profiles of the two groups were similar but separated by a clear boundary.
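To make the PCA step concrete, the sketch below computes each sample's score on the first principal component using power iteration in plain Python. This is an illustrative substitute for whatever software produced Figure 3, which the text does not specify. The function name, the starting vector and the toy data are assumptions, and power iteration can fail if the starting vector happens to be orthogonal to the first component.

```python
from math import sqrt

def first_principal_component(samples, n_iter=200):
    """Scores of each sample on PC1, found by power iteration.

    samples: list of per-sample expression vectors of equal length.
    """
    n, p = len(samples), len(samples[0])
    # Center each gene (column) at zero mean across samples.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]

    # Deterministic, non-uniform starting direction (an arbitrary choice).
    start = [float(j + 1) for j in range(p)]
    start_norm = sqrt(sum(v * v for v in start))
    axis = [v / start_norm for v in start]

    # Power iteration on X^T X without forming the p x p matrix.
    for _ in range(n_iter):
        scores = [sum(x * a for x, a in zip(row, axis)) for row in centered]
        new_axis = [sum(s * row[j] for s, row in zip(scores, centered))
                    for j in range(p)]
        norm = sqrt(sum(v * v for v in new_axis))
        if norm == 0:
            break
        axis = [v / norm for v in new_axis]
    return [sum(x * a for x, a in zip(row, axis)) for row in centered]

# Hypothetical data: two samples per group, three genes (the third is constant).
toy = [[1.0, 5.0, 2.0], [2.0, 4.0, 2.0], [8.0, 1.0, 2.0], [9.0, 0.0, 2.0]]
pc1 = first_principal_component(toy)
```

Samples from the same toy group receive PC1 scores of the same sign, which is the kind of separation a PCA plot displays. The sign of the component itself is arbitrary.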
Figure 3.
Principal component analysis (PCA) of total mRNA expression profiles in the high-grade adenoma group and the adenocarcinoma group. Each point represents a sample and the color indicates the histological type: high-grade adenoma in blue and adenocarcinoma in red.
Up-regulated and down-regulated genes among those differentially expressed between the high-grade adenoma group and the adenocarcinoma group
To further investigate whether gene expression levels were associated with the clinical outcome of CRC patients, differentially expressed genes were compared between the high-grade adenoma group and the adenocarcinoma group by the unpaired t-test, with Benjamini-Hochberg FDR correction of P values for multiple testing (α = 1E-5, fold change ≥ 2.0). Of the 235 differentially expressed genes identified (Figure 4), 160 were up-regulated and 75 were down-regulated. The top up-regulated and down-regulated genes are listed in Table 2 with their fold change (FC) values.
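The combined significance and fold-change filter can be sketched as follows. The negative FC values reported for down-regulated genes suggest a signed fold-change convention (ratio when expression increases, negative inverse ratio when it decreases). This convention is an assumption here, as are the function names and the toy records.

```python
def signed_fold_change(mean_carcinoma, mean_hin):
    """Signed fold change on the linear scale (assumed convention).

    Returns the ratio when expression is higher in carcinoma and the
    negative inverse ratio when it is lower, so |FC| >= 1 in both cases.
    """
    if mean_carcinoma <= 0 or mean_hin <= 0:
        raise ValueError("linear-scale intensities must be positive")
    ratio = mean_carcinoma / mean_hin
    return ratio if ratio >= 1 else -1.0 / ratio

def select_deg(records, alpha=1e-5, min_fc=2.0):
    """Keep genes with adjusted P < alpha and |FC| >= min_fc."""
    selected = []
    for gene, mean_ca, mean_h, adj_p in records:
        fc = signed_fold_change(mean_ca, mean_h)
        if adj_p < alpha and abs(fc) >= min_fc:
            selected.append((gene, fc))
    return selected

# Hypothetical mean intensities and adjusted P values, for illustration only.
records = [
    ("GENE_A", 800.0, 100.0, 1e-8),  # 8-fold up
    ("GENE_B", 50.0, 400.0, 1e-7),   # 8-fold down
    ("GENE_C", 150.0, 100.0, 1e-9),  # significant but only 1.5-fold
    ("GENE_D", 900.0, 100.0, 1e-3),  # large change, not significant
]
deg = select_deg(records)
```

Only genes passing both thresholds are retained, which is why a gene with a large fold change but a weak P value, or the reverse, would not appear in the final list.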
Figure 4.
Up-regulated genes and down-regulated genes of differentially expressed genes (Fold change ≥ 2.0) between high-grade adenoma group and adenocarcinoma group. In high grade adenoma group, down-regulated genes were blue, and up-regulated genes were red.
Table 2.
Top 20 up-regulated and down-regulated genes among differentially expressed genes between the high-grade adenoma group and the adenocarcinoma group (fold change [FC] ≥ 2.0)
ProbeName (up-regulated) | FC ([Ca] vs [H]) | GeneSymbol (up-regulated) | ProbeName (down-regulated) | FC ([Ca] vs [H]) | GeneSymbol (down-regulated) |
---|---|---|---|---|---|
A_23_P69030 | 10.403421 | COL8A1 | A_23_P51217 | -23.434 | CLCA1 |
A_23_P41344 | 10.351251 | EREG | A_23_P65307 | -8.04707 | SLITRK6 |
A_23_P56746 | 9.9119005 | FAP | A_32_P234184 | -6.84716 | HES5 |
A_24_P52697 | 9.705877 | H19 | A_23_P24543 | -6.23096 | FAM55A |
A_24_P277367 | 9.336863 | CXCL5 | A_23_P21495 | -6.18959 | FCGBP |
A_23_P7313 | 7.8016624 | SPP1 | A_32_P38093 | -5.63463 | ATOH8 |
A_24_P605612 | 7.485817 | THBS2 | A_23_P402765 | -5.14133 | NRAP |
A_23_P393620 | 6.9419 | TFPI2 | A_23_P358714 | -4.70154 | KIAA1324 |
A_23_P71037 | 6.7830887 | IL6 | A_23_P24507 | -4.515 | LOC643733 |
A_23_P143981 | 6.4896092 | FBLN2 | A_23_P18282 | -4.02727 | DLEC1 |
A_23_P207520 | 6.486388 | COL1A1 | A_23_P407112 | -3.80492 | SPATA18 |
A_23_P165624 | 5.9969482 | TNFAIP6 | A_23_P303149 | -3.69165 | FGFR2 |
A_24_P77432 | 5.7383146 | ROBO1 | A_23_P346900 | -3.54367 | CACNA2D2 |
A_23_P395438 | 5.6759515 | HTRA3 | A_24_P655849 | -3.45881 | SMAD9 |
A_24_P317762 | 5.635429 | LY6E | A_23_P332246 | -3.37031 | ATOH1 |
A_23_P104252 | 5.224433 | ITIH5 | A_23_P2814 | -3.33079 | SMAD9 |
A_23_P383009 | 5.219054 | IGFBP5 | A_24_P270424 | -3.31483 | DPF3 |
A_23_P43164 | 4.560891 | SULF1 | A_24_P166397 | -3.28529 | KIAA0319 |
A_23_P89431 | 4.5253115 | CCL2 | A_23_P106933 | -3.25231 | ACSM1 |
Differentially expressed genes were significantly associated with clinical survival of CRC patients
An important goal of this study was to assess the correlation between the differentially expressed genes and the prognosis of CRC patients, and R software was used for this purpose. As shown in Figure 5, the clinical relevance of this classification was assessed by a prognostic analysis of overall survival (OS) restricted to the genes differentially expressed between the high-grade adenoma and adenocarcinoma groups; a significant correlation was observed in one of the four data sets.
Figure 5.
Kaplan-Meier analysis of overall survival based on the genes differentially expressed between the high-grade adenoma group and the adenocarcinoma group. Overall survival (OS) was significantly associated with the grouping in the GSE17537 data set (P = 0.014), but no significant correlation was observed in the other three data sets (P = 0.47, 0.39 and 0.14, respectively). “H”: HIN gene expression group. Short vertical lines: censored data.
In particular, significant prognostic value of these up-regulated and down-regulated genes was found in patients with CRC as previously described. As shown in Figure 6, the up-regulated genes were more closely related to the prognosis of CRC patients than the down-regulated genes.
Figure 6.
Kaplan-Meier analysis of overall survival based on the genes up-regulated between the high-grade adenoma and adenocarcinoma groups. Overall survival (OS) was significantly associated with the grouping in the GSE17537 and GSE17536 data sets (P = 0.039 and 0.017, respectively), but there was no significant correlation in the other two data sets (P = 0.63 and 0.42, respectively). “H”: high-grade adenoma gene expression group; short vertical lines: censored data.
However, the down-regulated genes were not prognostic in any of the four data sets (Figure 7), and the prognostic value of this signature was not statistically significant in the validation and overall survival data sets, independently of survival status.
Figure 7.
Kaplan-Meier analysis of overall survival based on the genes down-regulated between the high-grade adenoma and adenocarcinoma groups. Overall survival (OS) showed no significant correlation in any of the 4 separate data sets (P = 0.16, 0.7, 0.83 and 0.43, respectively). “H”: high-grade adenoma gene expression group; short vertical lines: censored data.
Gene ontology enrichment analysis for prediction of survival in CRC patients
We manually collected a set of 522 cancer and normal microarray samples from 3 different studies in the Gene Expression Omnibus (GEO). Gene ontology enrichment analysis was performed on the anti-profile genes, and the results showed that genes involved in development, organ morphogenesis and differentiation were enriched for hyper-variability. Using these data, we developed an anti-profile to predict cancer status regardless of tumor or tissue type.
The functional enrichment of the differentially expressed genes was further analyzed with the DAVID online database. As shown in Figure 8A and 8B, in the GSE17536 and GSE17537 data sets the GO biological process classification of these genes could be used to predict the prognosis of patients.
Figure 8.
Histogram of enriched gene ontology biological processes (BP). A: Differentially expressed genes associated with the prognosis of CRC in the public data set GSE17536. Left: classification code and GO annotation of biological function; right: number of genes in each category. Categories are sorted in ascending order of corrected P value (0.0000 to 0.0413, from top to bottom). B: Differentially expressed genes associated with the prognosis of CRC in the public data set GSE17537. Left: classification code and GO annotation of biological function; right: number of genes in each category. Categories are sorted in ascending order of corrected P value (0.0000 to 0.0486, from top to bottom).
The same pattern was observed in an independent set of samples not used to define hyper-variable genes. We confirmed that hyper-variable genes in cancer coincided with tissue-specific genes. Specifically, the set of tissue-specific genes was enriched for universally hyper-variable genes. As shown in Figure 8A and 8B, the categories shared across the two classifications were used to predict prognosis and played an important role in the biological processes of adenoma, which may more clearly explain the differences in gene expression annotation between the high-grade adenoma group and the carcinoma group (Table 3). GO analysis identified ten enriched categories related to the following biological processes: biological adhesion (GO: 0022610), cell adhesion (GO: 0007155), extracellular structure organization (GO: 0043062), skeletal system development (GO: 0001501), regulation of cell adhesion (GO: 0030155), regulation of growth (GO: 0040008), regulation of cell-substrate adhesion (GO: 0010810), positive regulation of cell-substrate adhesion (GO: 0010811), regulation of cell growth (GO: 0001558), and positive regulation of cell adhesion (GO: 0045785).
Table 3.
Gene ontology enrichment analysis
GO Classification | Genes | P | FDR |
---|---|---|---|
GO: 0022610 biological adhesion | COL18A1, GP1BB, IGFBP7, COL3A1, COL12A1, VCAN, NTM, CYR61 | 4.35E-06 | 0.0001 |
GO: 0007155 cell adhesion | COL18A1, GP1BB, IGFBP7, COL3A1, COL12A1, VCAN, NTM, CYR61 | 4.31E-06 | 0.0001 |
GO: 0043062 extracellular structure organization | COL18A1, COL3A1, MAP1B, COL12A1, CYR61 | 6.51E-07 | 0.0413 |
GO: 0001501 skeletal system development | INHBA, COL3A1, COL12A1, SPARC, WWTR1, IGFBP5, SPP1 | 9.77E-04 | 0.0145 |
GO: 0030155 regulation of cell adhesion | FBLN2, TGM2, COL8A1, CYR61, SPP1, RELL2 | 1.32E-04 | 0.0025 |
GO: 0040008 regulation of growth | INHBA, WISP1, IGFBP7, MAP1B, SEMA3A, ESM1, CYR61, IGFBP5, SPP1 | 2.75E-05 | 0.0004 |
GO: 0010810 regulation of cell-substrate adhesion | FBLN2, COL8A1, CYR61, SPP1, RELL2 | 2.22E-05 | 0.0003 |
GO: 0010811 positive regulation of cell-substrate adhesion | FBLN2, COL8A1, CYR61, SPP1, RELL2 | 2.92E-06 | 0.0000 |
GO: 0001558 regulation of cell growth | INHBA, WISP1, IGFBP7, MAP1B, SEMA3A, ESM1, CYR61, IGFBP5, SPP1 | 2.38E-06 | 0.0000 |
GO: 0045785 positive regulation of cell adhesion | FBLN2, TGM2, COL8A1, CYR61, SPP1, RELL2 | 4.22E-07 | 0.0000 |
Discussion
This comprehensive integrative analysis of colorectal tumor/adenoma pairs provides a number of insights into the biology of adenoma-carcinoma progression and identifies potential prognostic targets [26-28]. Integrated genetic and epigenetic analyses have been performed in many colorectal adenocarcinomas, including normal and precancerous lesions. In this study, mRNA expression in normal colorectal mucosa, LIN, HIN, and adenocarcinoma was profiled. Substantial changes were found in the transition from HIN to adenocarcinoma, with 160 up-regulated genes and 75 down-regulated genes. However, in the transition from HIN to adenocarcinoma, only a few of these differentially expressed genes varied in different phases of colorectal carcinogenesis. The mRNA expression profiles of the adenoma-carcinoma sequence were significantly different between the high-grade adenoma group and the adenocarcinoma group. K-M curves were analyzed for four CRC data sets with prognostic information. Importantly, the up-regulated genes were related to the prognosis of patients with CRC in two data sets (GSE17537 and GSE17536; P = 0.039 and 0.017, respectively), whereas the down-regulated genes were not. Gene ontology (GO) functional enrichment analysis of the biological processes (BPs) related to the genes differentially expressed between the high-grade adenoma group and the carcinoma group was performed with the DAVID online analysis software. The results showed enrichment in extracellular structure organization, skeletal system development, biological adhesion and its regulation (biological adhesion, cell adhesion, regulation of cell adhesion, regulation of cell-substrate adhesion, positive regulation of cell-substrate adhesion, and positive regulation of cell adhesion), and growth regulation.
Moreover, the IPR-related proteins were mainly encoded by insulin-like growth factor binding protein genes, including WISP1, IGFBP7, ESM1, CYR61 and IGFBP5. These results raise the questions of whether the molecular events occurring in the initiation stage are necessary and sufficient for the progression of colorectal tumorigenesis, and whether the HIN stage is a critical pathological state in which a potentially malignancy-related prognostic molecular event already exists.
Some adenomas have a significantly higher potential for malignant progression. As the direct precursors of CRC, they are recognized as advanced adenomas (AA). Advanced adenomas are defined as lesions at least 10 mm in diameter, or with high-grade neoplasia, or with villous or tubulo-villous morphology, or any combination of these features [29]. Colorectal carcinomas and their direct precursors, advanced adenomas, require urgent management and are defined as advanced neoplastic lesions according to their histological and clinical features [30,31]. Luo et al. [32] reported that there are subclasses of adenomas recognized by their epigenotype and KRAS mutation status, which raises the possibility that one of these subclasses, adenoma-H polyps, might be the precursor of CRCs with a low/intermediate methylation level. In addition, the epigenetic state of adenomas might influence their propensity to undergo malignant transformation and portend the epigenotype of the resulting CRC.
Despite the current understanding of genetic alterations associated with the progression of colon cancer, the specific etiology of CRC has yet to be elucidated. Epidemiological studies have indicated that environmental factors and host immunological characteristics may contribute to the initiation and progression of colon cancer. Multiple genetic and epigenetic alterations are involved in colorectal carcinogenesis. The genomic events occurring in precancerous and malignant colorectal tumors are both considerably abundant, implying that genomic instability and the resulting gene alterations are key molecular steps occurring early in CRC development. In the present study, the overall expression level was analyzed on the basis of the evolution model of the adenoma-carcinoma sequence. Patients were classified into high and low expression groups on the basis of gene expression levels. In subgroup analysis, the genes differentially expressed between the HIN group and the carcinoma group were significantly associated with the clinical survival of patients with CRC. To further investigate the biological function of these genes, 235 prognosis-related genes were selected. Biological adhesion, cell adhesion and extracellular structure organization were among the important prognosis-related biological processes in the transition from HIN to cancer (FDR-adjusted P < 0.05), and the major genes included COL18A1, GP1BB, IGFBP7, COL3A1, COL12A1, VCAN, NTM, CYR61 and MAP1B. In addition, the protein classes associated with clinical outcome were mainly insulin-like growth factor-binding protein (IGFBP), von Willebrand factor, and fibrillar collagen, which mainly included WISP1, IGFBP7, ESM1, CYR61, IGFBP5, THBS2, COL3A1, COL1A2, COL1A1 and COL5A2. This suggests that these are key proteins in colorectal carcinogenesis. It also provides experimental evidence for future investigation of prognosis in the adenoma-carcinoma sequence.
In addition, our findings also highlighted the molecular drivers of the colorectal adenoma-carcinoma sequence. Early detection and excision of intraepithelial lesions may reduce CRC morbidity and mortality. For example, as early as the 1950s, MacDonald proposed the biological predeterminism of human cancers based on a clinical survey, which suggests that the clinical outcome can be defined by the intrinsic or destined natural history of cancer [33]. This theory challenges the widely accepted sequential model of carcinogenesis. Invasive behavior has been identified in pancreatic intraepithelial neoplasia through in vivo lineage tracing, indicating that cancer dissemination precedes pancreatic tumor formation. These observations imply that certain capabilities of cancer cells may be fully developed in intraepithelial neoplasms and that the risk of intraepithelial neoplasm progression is predictable.
While gene expression profiling with microarray technologies has been widely applied to CRC for diagnosis, classification and prognosis based on molecular patterns of expression, its application to response prediction in advanced adenoma remains limited because few studies are currently available. One of the main limitations of our study is the absence of an independent data set for validation. Over 11 other genomic studies of CRC that investigated gene expression profiles in early versus advanced stage disease were also considered. Of note, all of these studies used antiquated gene expression microarrays, and a number of features seen in a recent large-scale study by the Cancer Genome Atlas (TCGA) were not observed. In addition, these studies did not perform comprehensive multiple-platform analysis or achieve the high genomic resolution of the TCGA studies. For this analysis, the ability to integrate heterogeneous genomic features is critical, and the limitations of other data sets made them less useful for our integrated analysis. As a validation analysis with a single platform, an independent set of 204 samples with both clinical data and copy number variation data was drawn from the expanded TCGA CRC data.
The generally assumed model of CRC development implies a sequence of events from adenoma formation to carcinoma that is caused and accompanied by genetic and epigenetic changes [3]. Different molecular phenotypes have been used to define CRC subtypes [27,34], such as microsatellite instability (MSI) [35], epigenetic alterations (methylation state of CpG islands) [36], location of the colorectal tumor, and gene mutations (KRAS or BRAF). Key pathways that have been implicated in CRC include the Wnt/β-catenin, TGF-β, MAPK, and PI3K signaling pathways [3].
Intense studies have been directed at the discovery of biomarkers that are predictive of disease progression or treatment response, albeit with limited success. Analysis of precancerous colorectal lesions of different sizes may provide important information on the steps involved in their malignant transformation. Vonlanthen et al [37] provided novel information on the association between transcript levels of transcription factor (TF) genes and adenomatous transformation of the colorectal epithelium, and identified 261 TF genes that appear to play roles in colorectal tumorigenesis. They pinpointed the TF genes whose expression is significantly altered in colorectal adenomas and characterized the extent and direction of these changes [37]. These efforts highlight the importance of gaining a better understanding of the molecular differences between CRC subtypes at the pathway level. Since clinical response data on targeted treatments are very limited, experiments on cells have become an increasingly important tool for investigating the molecular basis of different cancers and may link molecular features to phenotypes such as drug response [38,39].
In conclusion, whole-genome microarray assays using routine biopsy samples may be suitable for identifying discriminative signatures for prognostic purposes. Our findings have important implications for translating the molecular basis of carcinogenesis into clinical benefit. The detection of high-risk precancerous lesions is essential for preventing CRC, and pit pattern observation by endoscopy enables the detection of neoplastic lesions with malignant potential. Our results provide a basis for new gene expression pattern-based prognostic methods. As a strong correlation with protein expression was present, simultaneous analysis of protein marker sets is warranted. Gene expression profiles covering a wide range of functional gene classes can now be obtained from tissue sections, which suggests that prognosis-specific markers could be identified in a simple test for future prognostic use.
Acknowledgements
This work was supported by the National High Technology Research and Development Program of China (2012AA02A506). We gratefully thank Professor Shu Zheng from the Zhejiang University School of Medicine for providing the survival data of colorectal cancer patients in this study.
Disclosure of conflict of interest
None.
References
- 1. Schlicker A, Beran G, Chresta CM, McWalter G, Pritchard A, Weston S, Runswick S, Davenport S, Heathcote K, Castro DA. Subtypes of primary colorectal tumors correlate with response to targeted treatment in colorectal cell lines. BMC Med Genomics. 2012;5:66. doi: 10.1186/1755-8794-5-66.
- 2. Tomeo C, Colditz GA, Willett WC, Platz E, Rockhill B, Dart H, Hunter D. Harvard Report on Cancer Prevention. Vol 3: Prevention of colon cancer in the United States. Cancer Causes Control. 1999;10:167–180. doi: 10.1023/a:1017117109568.
- 3. Fearon ER. Molecular genetics of colorectal cancer. Annu Rev Pathol. 2011;6:479–507. doi: 10.1146/annurev-pathol-011110-130235.
- 4. Saif MW, Chu E. Biology of colorectal cancer. Cancer J. 2010;16:196–201. doi: 10.1097/PPO.0b013e3181e076af.
- 5. Issa JP. Colon cancer: it’s CIN or CIMP. Clin Cancer Res. 2008;14:5939–5940. doi: 10.1158/1078-0432.CCR-08-1596.
- 6. Fearon ER, Vogelstein B. A genetic model for colorectal tumorigenesis. Cell. 1990;61:759–767. doi: 10.1016/0092-8674(90)90186-i.
- 7. Vogelstein B, Kinzler KW. Cancer genes and the pathways they control. Nat Med. 2004;10:789–799. doi: 10.1038/nm1087.
- 8. Wood LD, Parsons DW, Jones S, Lin J, Sjöblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J. The genomic landscapes of human breast and colorectal cancers. Science. 2007;318:1108–1113. doi: 10.1126/science.1145720.
- 9. Bhalla A, Zulfiqar M, Weindel M, Shidham VB. Molecular diagnostics in colorectal carcinoma. Clin Lab Med. 2013;33:835–859. doi: 10.1016/j.cll.2013.10.001.
- 10. Chung SJ, Kim YS, Yang SY, Song JH, Kim D, Park MJ, Kim SG, Song IS, Kim JS. Five-year risk for advanced colorectal neoplasia after initial colonoscopy according to the baseline risk stratification: a prospective study in 2452 asymptomatic Koreans. Gut. 2011;60:1537–1543. doi: 10.1136/gut.2010.232876.
- 11. Nusko G, Hahn EG, Mansmann U. Risk of advanced metachronous colorectal adenoma during long-term follow-up. Int J Colorectal Dis. 2008;23:1065–1071. doi: 10.1007/s00384-008-0508-y.
- 12. Lieberman DA, Weiss DG, Bond JH, Ahnen DJ, Garewal H, Harford WV, Provenzale D, Sontag S, Schnell T, Durbin TE. Use of colonoscopy to screen asymptomatic adults for colorectal cancer. N Engl J Med. 2000;343:162–168. doi: 10.1056/NEJM200007203430301.
- 13. Imperiale TF, Wagner DR, Lin CY, Larkin GN, Rogge JD, Ransohoff DF. Results of screening colonoscopy among persons 40 to 49 years of age. N Engl J Med. 2002;346:1781–1785. doi: 10.1056/NEJM200206063462304.
- 14. Sillars-Hardebol AH, Carvalho B, de Wit M, Postma C, Delis-van Diemen PM, Mongera S, Ylstra B, van de Wiel MA, Meijer GA, Fijneman RJ. Identification of key genes for carcinogenic pathways associated with colorectal adenoma-to-carcinoma progression. Tumour Biol. 2010;31:89–96. doi: 10.1007/s13277-009-0012-1.
- 15. Flora M, Piana S, Bassano C, Bisagni A, De Marco L, Ciarrocchi A, Tagliavini E, Gardini G, Tamagnini I, Banzi C. Epidermal growth factor receptor (EGFR) gene copy number in colorectal adenoma-carcinoma progression. Cancer Genet. 2012;205:630–635. doi: 10.1016/j.cancergen.2012.10.005.
- 16. Peipins LA, Sandler RS. Epidemiology of colorectal adenomas. Epidemiol Rev. 1994;16:273–297. doi: 10.1093/oxfordjournals.epirev.a036154.
- 17. Gaiser T, Camps J, Meinhardt S, Wangsa D, Nguyen QT, Varma S, Dittfeld C, Kunz-Schughart LA, Kemmerling R, Becker MR. Genome and transcriptome profiles of CD133-positive colorectal cancer cells. Am J Pathol. 2011;178:1478–1488. doi: 10.1016/j.ajpath.2010.12.036.
- 18. Habermann JK, Paulsen U, Roblick UJ, Upender MB, McShane LM, Korn EL, Wangsa D, Krüger S, Duchrow M, Bruch HP, Auer G, Ried T. Stage-specific alterations of the genome, transcriptome, and proteome during colorectal carcinogenesis. Genes Chromosomes Cancer. 2007;46:10–26. doi: 10.1002/gcc.20382.
- 19. Kleivi K, Lind GE, Diep CB, Meling GI, Brandal LT, Nesland JM, Myklebost O, Rognum TO, Giercksky KE, Skotheim RI. Gene expression profiles of primary colorectal carcinomas, liver metastases, and carcinomatoses. Mol Cancer. 2007;6:2. doi: 10.1186/1476-4598-6-2.
- 20. Sabates-Bellver J, Van der Flier LG, de Palo M, Cattaneo E, Maake C, Rehrauer H, Laczko E, Kurowski MA, Bujnicki JM, Menigatti M. Transcriptome profile of human colorectal adenomas. Mol Cancer Res. 2007;5:1263–1275. doi: 10.1158/1541-7786.MCR-07-0267.
- 21. Maglietta R, Distaso A, Piepoli A, Palumbo O, Carella M, D’Addabbo A, Mukherjee S, Ancona N. On the reproducibility of results of pathway analysis in genome-wide expression studies of colorectal cancers. J Biomed Inform. 2010;43:397–406. doi: 10.1016/j.jbi.2009.09.005.
- 22. Abatangelo L, Maglietta R, Distaso A, D’Addabbo A, Creanza TM, Mukherjee S, Ancona N. Comparative study of gene set enrichment methods. BMC Bioinformatics. 2009;10:275. doi: 10.1186/1471-2105-10-275.
- 23. Maglietta R, Piepoli A, Catalano D, Licciulli F, Carella M, Liuni S, Pesole G, Perri F, Ancona N. Statistical assessment of functional categories of genes deregulated in pathological conditions by using microarray data. Bioinformatics. 2007;23:2063–2072. doi: 10.1093/bioinformatics/btm289.
- 24. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
- 25. Shi X, Zhang Y, Cao B, Lu N, Feng L, Di X, Han N, Luo C, Wang G, Cheng S. Genes involved in the transition from normal epithelium to intraepithelial neoplasia are associated with colorectal cancer patient survival. Biochem Biophys Res Commun. 2013;435:282–288. doi: 10.1016/j.bbrc.2013.04.063.
- 26. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487:330–337. doi: 10.1038/nature11252.
- 27. Markowitz SD, Bertagnolli MM. Molecular basis of colorectal cancer. N Engl J Med. 2009;361:2449–2460. doi: 10.1056/NEJMra0804588.
- 28. Lugli A, Jass JR. Types of colorectal adenoma. Verh Dtsch Ges Pathol. 2005;90:18–24.
- 29. Konishi F, Morson BC. Pathology of colorectal adenomas: a colonoscopic survey. J Clin Pathol. 1982;35:830–841. doi: 10.1136/jcp.35.8.830.
- 30. Kiedrowski M, Mroz A, Kaminski MF, Kraszewska E, Orlowska J, Regula J. Predictive factors of proximal advanced neoplasia in the large bowel. Arch Med Sci. 2014;10:484–489. doi: 10.5114/aoms.2013.38394.
- 31. Świątkowski M, Meder A, Sobczyński L, Koza J, Szamocka M, Brudny J, Korenkiewicz J. Adenomas detected during screening colonoscopies in the years 2000-2009. Prz Gastroenterol. 2012;7:299–305.
- 32. Luo Y, Wong CJ, Kaz AM, Dzieciatkowski S, Carter KT, Morris SM, Wang J, Willis JE, Makar KW, Ulrich CM. Differences in DNA Methylation Signatures Reveal Multiple Pathways of Progression from Adenoma to Colorectal Cancer. Gastroenterology. 2014:418–429. doi: 10.1053/j.gastro.2014.04.039.
- 33. MacDonald I. Biological predeterminism in human cancer. Surg Gynecol Obstet. 1951;92:443–452.
- 34. Sanchez JA, Krumroy L, Plummer S, Aung P, Merkulova A, Skacel M, DeJulius KL, Manilich E, Church JM, Casey G. Genetic and epigenetic classifications define clinical phenotypes and determine patient outcomes in colorectal cancer. Br J Surg. 2009;96:1196–1204. doi: 10.1002/bjs.6683.
- 35. Iacopetta B, Grieu F, Amanuel B. Microsatellite instability in colorectal cancer. Asia Pac J Clin Oncol. 2010;6:260–269. doi: 10.1111/j.1743-7563.2010.01335.x.
- 36. van Engeland M, Derks S, Smits KM, Meijer GA, Herman JG. Colorectal cancer epigenetics: complex simplicity. J Clin Oncol. 2011;29:1382–1391. doi: 10.1200/JCO.2010.28.2319.
- 37. Vonlanthen J, Okoniewski MJ, Menigatti M, Cattaneo E, Pellegrini-Ochsner D, Haider R, Jiricny J, Staiano T, Buffoli F, Marra G. A comprehensive look at transcription factor gene expression changes in colorectal adenomas. BMC Cancer. 2014;14:46. doi: 10.1186/1471-2407-14-46.
- 38. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, Greninger P, Thompson IR, Luo X, Soares J. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature. 2012;483:570–575. doi: 10.1038/nature11005.
- 39. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003.