Abstract
Locally advanced rectal cancer (LARC) is treated with chemoradiation prior to surgical excision, leaving residual tumors altered or completely absent. Integrating layers of genomic profiling might identify regulatory pathways relevant to rectal tumorigenesis and inform therapeutic decisions and further research. We utilized formalin-fixed, paraffin-embedded pre-treatment LARC biopsies (n=138) and compared copy number, mRNA, and miRNA expression with matched normal rectal mucosa. An integrative model was used to predict regulatory interactions to explain gene expression changes. These predictions were evaluated in vitro using multiple colorectal cancer cell lines. The Cancer Genome Atlas (TCGA) was also used as an external cohort to validate our genomic profiling and predictions. We found differentially expressed mRNAs and miRNAs that characterize LARC. Our integrative model predicted the upregulation of miR-92a, miR-182, and miR-221 expression to be associated with downregulation of their target genes after adjusting for the effect of copy number alterations. Cell line studies using miR-92a mimics and inhibitors demonstrate that miR-92a expression regulates IQGAP2 expression. We show that endogenous miR-92a expression is inversely associated with endogenous KLF4 expression in multiple cell lines, and that this relationship is also present in rectal cancers of TCGA. Our integrative model predicted regulators of gene expression change in LARC using pre-treatment FFPE tissues. Our methodology implicated multiple regulatory interactions, some of which are corroborated by independent lines of study while others indicate new opportunities for investigation.
Keywords: Gastrointestinal Cancers, Gene Expression Profiling, Molecular Modeling, Cancer Genome Anatomy, Posttranscriptional and Translational Control
INTRODUCTION
The integration of multiple layers of genomic profiling is a promising approach to provide insights into molecular pathways that contribute to tumorigenesis. Developing a comprehensive molecular characterization for locally advanced rectal cancer (LARC) has been hampered by several key challenges. First, these tumors are treated with neoadjuvant chemoradiation prior to surgical resection. Any residual tumor is therefore markedly altered and may not be representative of the presenting disease (Sfakianos et al., 2013). The only pre-treatment tissue available usually comes in the form of small endoscopic biopsies that are formalin-fixed, paraffin-embedded (FFPE). Second, a large number of specimens are required to derive a comprehensive molecular characterization of LARC. Ideally, these specimens would be obtained from a large clinical trial with strict inclusion criteria to frame the proper disease entity and to offer the potential for correlation with clinical outcomes. Few such trials exist, which is why most attempts at genomic profiling of LARC have come from small single-institution trials or from large cohorts with broad categorizations that might not accurately represent LARC.
In this study, we set out to use pre-treatment FFPE specimens from a large multi-institutional prospective clinical trial to provide a comprehensive characterization of the miRNA and mRNA expression changes in LARC (Garcia-Aguilar et al., 2011). We compared our cohort's mRNA, miRNA, and copy number alteration profiles with the rectal cancer specimens available in The Cancer Genome Atlas (TCGA) (Cancer Genome Atlas Network, 2012). The TCGA cohort hosts fresh-frozen surgical specimens that came from patients who did not undergo neoadjuvant chemoradiation given the anatomic position of their tumors (e.g., distal descending or sigmoid colon). As such, they represent the closest fresh-frozen representation of rectal cancers available to form a comparison. We then used an established integrative model that leverages both regulatory motif sequence information and copy number information to identify regulators of gene expression changes observed in LARC (Setty et al., 2012). Finally, we performed in vitro validations of several of the predicted regulatory programs.
MATERIALS AND METHODS
Patients
Patients enrolled in the Timing of Rectal Cancer Response to Chemoradiation study were included (ClinicalTrials.gov Identifier: NCT00335816, referred to as the “Timing trial” in this manuscript) (Garcia-Aguilar et al., 2011). All patients in this prospective study had clinical stage II/III invasive rectal adenocarcinoma within 12 cm of the anal verge and had diagnostic biopsies obtained by proctoscopy. All patients were subsequently treated with neoadjuvant therapy and total mesorectal excision of the rectal cancer. The trial was approved by a central Institutional Review Board (IRB) and the IRB of each participating institution.
Tissue Collection and Preparation
Pre-treatment biopsies were obtained from each patient at the time of diagnosis, and matched normal rectal mucosa was taken from the proximal resection margin of the post-treatment surgical specimen. All the specimens were FFPE, and were sectioned 10 micrometers thick onto at least 10 slides. A pathologist reviewed the sections to confirm the boundaries of the malignant or normal epithelia. The marked areas were microdissected under microscopy. Complete profiling of copy number, gene expression, and miRNA expression was not achieved for all patients, most commonly because of insufficient biopsy tissue. Overall, 80 patient biopsies were processed using microarray for mRNA expression, 89 were processed using RNA-seq for miRNA, and 138 underwent array comparative genomic hybridization (aCGH). 69 patients were profiled completely with mRNA, miRNA, and aCGH for tumor biopsy and matched normal rectal tissue (Figure 1).
Figure 1.
Sample allocation and genomic profiling of locally advanced rectal cancers from pre-treatment biopsies and normal rectal mucosa from surgical resection. The genomic profiles are then evaluated together, pooling the normal samples, using a regression model to integrate the profiles. CNA – copy number alteration.
Molecular Analysis
Genomic DNA and RNA were extracted using AllPrep DNA/RNA FFPE kits (Qiagen, Valencia, CA). Extracted DNA was amplified using the GenomePlex Complete Whole Genome Amplification (Sigma Corp., Cream Ridge, NJ), and adequate samples then underwent oligonucleotide aCGH using the Agilent microarray platform (Human Genomic CGH 244A) using previously published protocols (Chen et al., 2012). Total RNA was amplified to generate cDNA libraries using the Ovation FFPE WTA System (NuGEN Technologies, San Carlos, CA) and sent for Affymetrix U133 Plus 2.0 Array (Affymetrix, Santa Clara, CA). Total RNA was also used to generate libraries for miRNA deep sequencing using an adapted version of the Illumina v1.5 protocol optimizing for reaction volume (Supplementary Methods), and sequenced using the Illumina HiSeq 2000 Platform (Illumina, San Diego, CA).
Computational Analysis
Computational analysis was performed in the R statistical environment. The following packages were utilized in the analysis: limma, affy, DESeq2, gplots, RColorBrewer, calibrate, BSgenome, GenomicFeatures, RCircos, org.Hs.eg.db (Gautier et al., 2004; Gentleman et al., 2004; Smyth, 2005; Neuwirth, 2011; Zhang et al., 2013; Warnes et al., 2013; Love et al., 2014).
mRNA Expression
mRNA expression data were normalized using robust multi-array average (RMA), with quantile normalization of the perfect match probes. Probes that were called absent by the Affymetrix MAS5 algorithm in 85% or more of all samples were excluded.
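The absent-call filter can be illustrated with a short sketch. It is written in Python for illustration only and is not the analysis code used in this study, which was run in R. The input layout (a dictionary of per-sample detection calls, with "A" meaning absent) and the probe names are assumptions.

```python
def filter_absent_probes(calls, max_absent_fraction=0.85):
    """Keep probes whose fraction of absent ('A') calls is below the cutoff.

    calls: dict mapping a probe ID to a list of per-sample detection calls,
    e.g. 'P' (present), 'M' (marginal), 'A' (absent). This layout is assumed.
    """
    kept = {}
    for probe, probe_calls in calls.items():
        absent_fraction = sum(1 for c in probe_calls if c == "A") / len(probe_calls)
        # Probes absent in 85% or more of samples are excluded.
        if absent_fraction < max_absent_fraction:
            kept[probe] = probe_calls
    return kept


# Hypothetical probes measured in 20 samples each.
example_calls = {
    "probe_mostly_absent": ["A"] * 17 + ["P"] * 3,   # 85% absent, excluded
    "probe_often_present": ["A"] * 16 + ["P"] * 4,   # 80% absent, kept
}
print(sorted(filter_absent_probes(example_calls)))
```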
Differential mRNA expression analysis between tumor biopsies and normal rectum was conducted using the limma package in R. Top-ranking probes were selected using p-values corrected for multiple hypothesis testing and a threshold for log2 fold change (FDR < 0.01, |log2 fold change| > 2). Comparisons with TCGA data were performed on level 3 standard RNA-seq data for the READ (Rectal Adenocarcinoma) study to evaluate similarities and differences using enrichment analysis and hypergeometric tests. The most differentially expressed mRNAs in our cohort were evaluated for the significance of their differential expression in TCGA, using sequential groups of mRNAs to assess enrichment between the comparison results. Additionally, we evaluated whether genes that were identified as upregulated/downregulated in our cohort were also upregulated/downregulated in the TCGA cohort.
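For readers unfamiliar with the hypergeometric enrichment test, the following Python sketch shows the upper-tail calculation. It is an illustration under stated assumptions, not the code used in this study: the function name and the toy numbers are hypothetical, and the choice of how many top-ranked TCGA genes to compare against is not specified in the text.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: number of genes shared by both platforms
    successes:  number of shared genes in the external top-ranked list
    draws:      number of signature genes present on both platforms
    k:          observed overlap between the signature and the top-ranked list
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return favourable / total


# Toy example: 10 shared genes, 4 in the external top list,
# a 3-gene signature, and 2 of those 3 found in the top list.
print(hypergeom_upper_tail(2, population=10, successes=4, draws=3))
```

A small p-value indicates that the overlap is larger than expected if the signature genes were drawn at random from the shared genes.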
miRNA Expression
Raw miRNA FASTQ files were mapped to a transcriptome comprising all known mature human miRNAs (miRBase 20) (Kozomara et al., 2014) using the Burrows-Wheeler Aligner (Li et al., 2009), allowing for zero mismatches and mapping across 2578 miRNAs. Normalization and differential expression analysis were performed using the R package DESeq2, using p-values corrected for multiple hypothesis testing and thresholds for log2 fold change and base mean expression (FDR < 0.01, |log2 fold change| > 1, and a base mean > 30). When evaluating differential expression for miRNAs with reads mapped to both the 5’ and 3’ arms, the arm with the higher number of mapped reads was selected as representative of the expression of that miRNA. Comparisons were again made with available level 3 standard TCGA miRNA-seq data for the READ study in the same manner as for gene expression.
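The arm selection rule and the signature thresholds described above can be sketched as follows. This is an illustrative Python sketch, not the DESeq2 workflow used in this study. The data layout and the read counts in the example are hypothetical.

```python
def select_dominant_arm(read_counts):
    """Pick, for each miRNA, the arm (5p or 3p) with the most mapped reads.

    read_counts: dict mapping (mirna_name, arm) to total mapped reads.
    Returns a dict mapping mirna_name to the selected arm.
    """
    best = {}
    for (mirna, arm), reads in read_counts.items():
        if mirna not in best or reads > best[mirna][1]:
            best[mirna] = (arm, reads)
    return {mirna: arm for mirna, (arm, _) in best.items()}


def in_signature(fdr, log2_fold_change, base_mean):
    """Apply the thresholds stated in the text to one miRNA."""
    return fdr < 0.01 and abs(log2_fold_change) > 1 and base_mean > 30


# Hypothetical read counts, for illustration only.
counts = {
    ("miR-92a", "3p"): 5000,
    ("miR-92a", "5p"): 120,
    ("miR-182", "5p"): 900,
    ("miR-182", "3p"): 40,
}
print(select_dominant_arm(counts))
print(in_signature(fdr=0.001, log2_fold_change=1.8, base_mean=250.0))
```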
Copy Number Alterations (CNAs)
Array CGH intensity measurements were processed using circular binary segmentation (Olshen et al., 2004). Segmented CNA data were evaluated in the Integrative Genomics Viewer (IGV) (Thorvaldsdóttir et al., 2013) to compare the CNAs in the “Timing trial” with those in the publicly available dataset of TCGA Affymetrix SNP 6.0 array level 3 data for the READ study (Cancer Genome Atlas Network, 2012).
CNAs were called using GISTIC 2.0 (Mermel et al., 2011) and mapped to the hg19 reference genome (Genome Reference Consortium GRCh37) for the purpose of training our model. Visual representations of CNAs utilized an algorithm similar to that used in the Integrative Genomics Viewer, assessing the proportion of patients with segmented amplifications and deletions across bins of 200 kb throughout each chromosome (Robinson et al., 2011).
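The binned summary used for visualization can be sketched as below. This is a minimal Python illustration and not the plotting code used in this study. The segment layout, the half-open 0-based coordinates, and the log-ratio cutoff of 0.2 for calling a gain or loss are all assumptions; the text specifies only the 200 kb bin size.

```python
def cna_frequency(segments, chrom_length, n_patients,
                  bin_size=200_000, threshold=0.2):
    """Proportion of patients with an amplified or deleted segment per bin.

    segments: iterable of (patient_id, start, end, log_ratio) on one
              chromosome, with half-open coordinates [start, end).
    threshold: assumed absolute log-ratio cutoff for calling a gain or loss.
    Returns (amplified_fraction_per_bin, deleted_fraction_per_bin).
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    amplified = [set() for _ in range(n_bins)]
    deleted = [set() for _ in range(n_bins)]
    for patient, start, end, log_ratio in segments:
        if abs(log_ratio) < threshold:
            continue  # treated as copy-neutral
        target = amplified if log_ratio > 0 else deleted
        first_bin = start // bin_size
        last_bin = min((end - 1) // bin_size, n_bins - 1)
        for b in range(first_bin, last_bin + 1):
            target[b].add(patient)  # each patient counted once per bin
    return ([len(s) / n_patients for s in amplified],
            [len(s) / n_patients for s in deleted])


# Hypothetical segments for two patients on a 1 Mb region.
example = [
    ("P1", 0, 400_000, 0.5),
    ("P2", 200_000, 600_000, 0.3),
    ("P1", 800_000, 1_000_000, -0.6),
    ("P2", 0, 200_000, 0.05),
]
amp, dele = cna_frequency(example, chrom_length=1_000_000, n_patients=2)
print(amp)
print(dele)
```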
Integrative Model to Find miRNAs Leading to Observed Gene Expression Change
An integrative sparse regression model (Setty et al., 2012) was used to identify regulatory programs that are drivers of gene expression changes in LARC. This model explains differential gene expression between tumor and normal tissue as a linear combination of copy number alterations for a particular gene, and the miRNAs that have a complementary binding site to the 3’UTR of the gene. Specifically, the model identifies potential miRNA regulators of gene expression, adjusting for copy number alterations.
CNAs for genes were identified using circular binary segmentation calls. miRNA mapping was based on complementary binding to regulatory regions in the 3’UTR of target genes that were conserved across 5 mammalian species (hg19 - human, gorGor1 - gorilla, rn4 - rat, mm9 - mouse, canFam2 - dog). Targets of miRNAs were determined based on the presence of 7-mer seed matches (sequences complementary to positions 2-8 of the miRNA) in the 3’UTRs of mRNAs (Blanchette et al., 2004).
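The 7-mer seed matching step can be illustrated with a short Python sketch. It is not the pipeline used in this study and it omits the cross-species conservation filter. The function names and the example 3’UTR are hypothetical, and the miR-92a-3p sequence used in the example should be verified against miRBase before any real use.

```python
# Complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna_sequence):
    """Return the 7-mer mRNA site complementary to miRNA positions 2-8."""
    rna = mirna_sequence.upper().replace("T", "U")
    seed = rna[1:8]  # positions 2-8, 1-based
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_seed_matches(mirna_sequence, utr_sequence):
    """Return 0-based start positions of 7-mer seed matches in a 3'UTR."""
    site = seed_site(mirna_sequence)
    utr = utr_sequence.upper().replace("T", "U")
    return [i for i in range(len(utr) - 6) if utr[i:i + 7] == site]


mir_92a_3p = "UAUUGCACUUGUCCCGGCCUGU"    # illustrative; check against miRBase
toy_utr = "CCGTGCAATGGAAAGTGCAATCC"      # hypothetical 3'UTR fragment
print(seed_site(mir_92a_3p))             # GUGCAAU
print(find_seed_matches(mir_92a_3p, toy_utr))
```

In a conservation-aware version, a site would be retained only if the same match is found at the aligned position in each of the species listed above.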
Following the formulation in Setty et al. (2012), we try to explain consistent changes in mRNA levels of each gene between normal and tumor across a cohort of patients by looking at the corresponding copy number levels in the tumor and the changes in levels of the miRNAs that are likely to target each of the genes. For simplicity and without loss of generality, let us consider a single patient. Let the differential gene expression of a gene i be denoted by De(i), the copy number level obtained by GISTIC for gene i be CN(i) (real value with the appropriate sign for amplification or deletion), and the differential expression of the J microRNAs that may target gene i be Dm(i,1)...Dm(i,J). The microRNAs j that may target gene i are determined by matching conserved regions in the 3’UTR of gene i to the seed sequence of microRNA j. The final model uses the following linear model in a regularized regression setting:
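Written out with the symbols defined above, a form consistent with this description is given here. Only c(j) is named in the text, so the copy number coefficient a, the residual term ε(i), and the L1 form of the penalty with weight λ are assumed notation rather than the published formula.

```latex
\[
De(i) \;=\; a \cdot CN(i) \;+\; \sum_{j=1}^{J} c(j)\, Dm(i,j) \;+\; \varepsilon(i),
\qquad
\min_{a,\,c}\; \sum_{i} \varepsilon(i)^{2} \;+\; \lambda \sum_{j=1}^{J} \lvert c(j) \rvert
\]
```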
After fitting the model, we test the coefficients c(j) to assess significant correlations between a microRNA j and its potential target genes, while correcting for the copy number information. For example, if a gene is differentially expressed and that change in expression is completely explained by the CN coefficient, we would not find any significant c parameters. On the other hand, if there are no CN changes and a significant c(j) is found, that would indicate that the variation in the expression level of the gene is beyond the expected variation due to copy number changes.
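The behavior described above can be illustrated with a small sparse regression fitted by coordinate descent. This Python sketch is not the implementation of Setty et al. (2012) or the code used in this study. It assumes an L1 (lasso) penalty applied to all coefficients, uses no intercept, and does not include the significance testing of c(j). The toy data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/2n) * ||y - X b||^2 + penalty * ||b||_1 by coordinate descent.

    X: list of rows, one per gene; column 0 is CN(i), columns 1..J are Dm(i,j)
       (set to 0 when miRNA j has no seed match in gene i).
    y: list of differential expression values De(i).
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for k in range(p):
            rho = 0.0
            norm = 0.0
            for i in range(n):
                # Prediction for gene i using every feature except k.
                partial = sum(X[i][m] * coef[m] for m in range(p) if m != k)
                rho += X[i][k] * (y[i] - partial)
                norm += X[i][k] ** 2
            coef[k] = soft_threshold(rho / n, penalty) / (norm / n) if norm > 0 else 0.0
    return coef


# Hypothetical data: columns are [CN, Dm for miRNA 1, Dm for miRNA 2].
X = [
    [1.0, 0.0, 0.0],   # amplified gene, no miRNA seed match
    [0.0, 2.0, 0.0],   # target of upregulated miRNA 1
    [0.0, 2.0, 0.0],   # target of upregulated miRNA 1
    [0.0, 0.0, 1.5],   # target of miRNA 2
]
y = [1.0, -2.0, -2.0, 0.0]
print(lasso_coordinate_descent(X, y, penalty=0.05))
```

In this toy example the copy number coefficient is positive, the coefficient for miRNA 1 is negative (its targets are downregulated), and the coefficient for miRNA 2 is shrunk to exactly zero because its target shows no expression change.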
Cell Culture and Transfection
Ten colorectal cancer cell lines were grown to measure and compare endogenous miRNA levels (DLD-1, HCT-116, SW-48, LOVO, DIFI, HCT-15, CACO-2, SW-480, HT-29, SW-620). Predicted regulatory miRNAs were further explored using transfection of miRNA mimics and inhibitors using the mirVana system (Life Technologies, Carlsbad, CA).
RESULTS
mRNA Expression is Markedly Dysregulated in LARC Compared with Normal Rectal Mucosa with Differences that are Seen in TCGA
Comparing pre-treatment LARC biopsies and normal rectal mucosa, we identified a signature of 217 differentially expressed (DE) genes (45 upregulated, 172 downregulated; Supplementary Table ST1). Unsupervised two-way hierarchical clustering of our samples using this signature was able to distinguish 80 LARC samples from 46 normal rectal mucosa samples with 100% accuracy (Figure 2a).
Figure 2.
A – Clustering analysis of Rectal Cancer vs. Normal Rectum mRNA expression across all “Timing trial” samples using signature of 217 genes (FDR < 0.01, |logFC| > 2); B – Successive ranked subsets of our DE genes and their corresponding median in a ranked list of DE genes from TCGA; our most significant DE genes are more significantly differentially expressed in TCGA, and as we move down our list, we similarly move down in the list in TCGA; C – Correlation of log fold change for mRNAs differentially expressed in both “Timing trial” and TCGA; D – Hypergeometric test of our signature across the 13,630 genes shared by “Timing trial” and TCGA. DE – differentially expressed, TCGA – The Cancer Genome Atlas. * our DE mRNA list includes 217 genes, but only 138 overlap with genes present in TCGA platform.
We tested our signature for enrichment against the 90 TCGA rectosigmoid cancer samples with available RNA-seq mRNA expression data (85 tumors, 5 normal rectal mucosa samples). There were 13,630 shared genes across the “Timing trial” array and TCGA sequencing data. Sequential groups of DE genes from our cohort ordered by their level of significance correlated with the level of significance in TCGA samples (Figure 2b and Supplementary Table ST4). Furthermore, of the 102 genes that were differentially expressed in both the “Timing trial” and TCGA rectal cancer samples, 69/69 (100%) of upregulated genes from our cohort were upregulated in TCGA, and 33/33 (100%) of downregulated genes from our cohort were downregulated in TCGA (Figure 2c). Our gene signature was significantly enriched in the most significant DE genes in TCGA (hypergeometric p-value 1.6 × 10^−31, Figure 2d).
miRNA Expression is Markedly Dysregulated in LARC Compared with Normal Rectal Mucosa with Differences that are Seen in TCGA
A signature of 55 differentially expressed miRNAs (39 upregulated, 16 downregulated; Supplementary Table ST2) was identified when comparing LARC biopsies and normal rectal mucosa using small RNA sequencing. Unsupervised two-way hierarchical clustering of samples using this signature was able to discriminate 89 LARC samples from 58 normal rectal mucosa samples with 99.3% accuracy (Figure 3a).
Figure 3.
A – Clustering analysis of Rectal Cancer vs. Normal Rectum miRNA expression across all “Timing trial” samples using signature of 55 DE miRNAs (FDR < 0.01, |logFC| > 1, base mean > 30); B – Successive ranked bins of our DE miRNAs and their corresponding median in a ranked list of DE miRNAs from TCGA; our selected signature of DE miRNAs is more significantly differentially expressed in TCGA, and as we move down our list, successive bins have a lower rank with respect to differential expression in TCGA; C – Correlation of log fold change for miRNAs differentially expressed in both “Timing trial” and TCGA; D – Hypergeometric test of our signature across the 464 miRNAs shared by “Timing trial” and TCGA. DE – differentially expressed, TCGA – The Cancer Genome Atlas. * our DE miRNA list includes 55 miRNAs, but only 45 overlap with miRNAs present in TCGA platform.
We tested our signature for enrichment against 94 TCGA rectosigmoid cancer samples with available miRNA sequencing data (91 tumors, 3 normal rectal mucosa samples). There were 464 shared miRNAs across the “Timing trial” and TCGA sequencing data. Sequential groups of DE miRNAs from our cohort ordered by their level of significance correlated with the level of significance in TCGA samples (Figure 3b and Supplementary Table ST5).
Additionally, of the 35 miRNAs that were differentially expressed in both the “Timing trial” and TCGA rectal cancer samples, 25 of 28 upregulated miRNAs from our cohort were upregulated in TCGA, although only 3 of 7 downregulated miRNAs from our cohort were downregulated in TCGA (Figure 3c). Nevertheless, our miRNA signature was significantly enriched among the most significant DE miRNAs in TCGA (hypergeometric p-value = 0.0001, Figure 3d), similar to the mRNA expression signature.
Copy Number Alterations in our LARC Cohort are Closely Associated with the Profile from Rectosigmoid Specimens in The Cancer Genome Atlas (TCGA)
We found that our LARC and normal rectal mucosa samples with aCGH data revealed copy number alterations matching those which have previously been reported in colorectal cancer across multiple studies, including deletions in 8p, 18p, 18q, and amplifications in 8q, 13q, and 20q (Nakao et al., 2004; Martin et al., 2007; Cancer Genome Atlas Network, 2012).
The segmented copy number data were then compared with the CNAs identified in the TCGA READ cohort relative to normal rectal mucosa. The CNA profile shown in IGV reveals marked similarities for both amplifications and deletions, with a correlation that was statistically significant (Figure 4 and Supplementary Figure S2).
Figure 4.
Comparison of Copy Number Alterations as measured by array CGH across all “Timing trial” samples and in TCGA samples. The correlation of copy number alterations between cohorts is statistically significant (p < 0.001, Supplementary Figure S2).
Integrative Model Identifies miR-92a, miR-182, and miR-221 as Regulators of Gene Expression Change in LARC
We then trained an integrative regression model to explain tumor vs. normal rectal mucosa mRNA expression changes in LARC in terms of miRNA-mediated regulation by differentially expressed miRNAs. Our model identified miR-92a, miR-182, and miR-221 as regulators of the gene expression changes observed in LARC (Supplementary Figure S1). They are each significantly upregulated, and their target genes are broadly downregulated in LARC (Figure 5).
Figure 5.
Circos plot showing copy number variations in LARC samples compared with normal rectal mucosa (outer track), upregulated miRNAs in LARC that are predicted to regulate gene expression changes (middle track), and downregulated mRNAs in LARC that can be bound by those upregulated miRNAs (inner track). LARC – locally advanced rectal cancer.
Next, we selected miR-92a to investigate further, given other published findings that suggest that it is upregulated in colorectal cancer and has potential roles in tumorigenesis (Lanza et al., 2007; Slattery et al., 2011; Li et al., 2012). A review of the literature led us to test two of the predicted regulatory interactions to provide proof of principle for the use of our integrative model: the predicted miR-92a regulation of IQGAP2 and KLF4. IQ motif containing GTPase activating protein 2 (IQGAP2) is a cell membrane scaffolding protein implicated in cellular migration and downregulation of the canonical Wnt pathway. It has been recognized to serve tumor suppressive functions and has been noted to be downregulated in hepatocellular carcinoma and CRC (Smith et al., 2015). This protein harbors an N-terminal calponin homology motif which functions as an F-actin binding domain in members of the spectrin, filamin, and fimbrin families and may interact with calmodulin and Rho family GTPases (Brill et al., 1996). Krüppel-like factor 4 (KLF4) has long been recognized in human CRC to act as a tumor and metastasis suppressor (Li et al., 2011). Early studies revealed that KLF4 overexpression can induce cell cycle arrest and inhibit cell proliferation (Dang et al., 2003).
Manipulation of miR-92a Expression Demonstrates Regulation of IQGAP2 Levels in SW620 and DLD-1 Cell Lines
Transfection of a miR-92a mimic into DLD-1 cells resulted in significant downregulation of IQGAP2 expression (p < 0.01, Figure 6). Conversely, transfection of a miR-92a inhibitor into SW620 cells resulted in significant upregulation of IQGAP2 expression (p < 0.01, Figure 6). These findings support our integrative model's prediction that miR-92a regulates IQGAP2 expression. Interestingly, a miR-92a inhibitor in the cell line with low endogenous miR-92a expression (DLD-1) did not further upregulate IQGAP2, and a miR-92a mimic in the cell line with high endogenous miR-92a expression (SW620) did not further downregulate IQGAP2 expression.
Figure 6.
RT/qPCR expression of miR-92a and IQGAP2 in (left) DLD-1 cells (low endogenous miR-92a expression) and (right) SW620 cells (high endogenous miR-92a expression) with either a mimic or inhibitor. **, p<0.01. Representative of three separate experiments.
An Underlying miR-92a Regulation of KLF4 is Implicated Across Multiple Cell Lines and in TCGA, While Modulation of miR-92a Expression Does Not Alter KLF4 Expression in SW620 and DLD-1 Cells
Evaluating endogenous miR-92a and KLF4 expression across multiple colorectal cancer cell lines demonstrated that the cell lines with the highest expression of KLF4 were those with lower expression of miR-92a at baseline, and the cell lines with the lowest expression of KLF4 were those with higher expression of miR-92a (Figure 7). Furthermore, evaluation of miR-92a and KLF4 expression across the samples in TCGA demonstrated that high miR-92a levels were significantly associated with lower KLF4 expression (Supplementary Figure S3). However, transfection of miR-92a mimics and inhibitors in SW620 and DLD-1 did not demonstrate any significant change in KLF4 expression, so a direct regulatory relationship is not supported. A number of other regulatory interactions were predicted by our model, some that have been described in the literature and others that have not yet been reported (Supplementary Table ST3). These interactions are still being explored and are beyond the scope of the current work.
Figure 7.
RT/qPCR expression of miR-92a and KLF4 across 6 cell lines. These 6 cell lines were selected from 10 colorectal cancer cell lines based on having the 3 highest and lowest endogenous levels of KLF4 expression. Representative of 3 separate experiments.
DISCUSSION
Our results show that genomic profiling of mRNA, miRNA, and copy number alterations from FFPE pre-treatment tissues can be used to characterize tumors and to investigate gene regulatory programs. We have shown that our profiling of LARC compared with normal rectal mucosa can be validated with an external cohort using the fresh-frozen samples available in the TCGA rectal adenocarcinoma data set. We leveraged an integrative model to identify potential regulators of gene expression and performed in vitro validations for several of these predictions. Our results demonstrate the power of integrating multiple layers of genomic information to characterize tumors and their gene regulatory programs. Most analyses to date have examined layers of genomic information in isolation using unsupervised clustering algorithms to define cancer subtypes. By focusing on the relationships between copy number, miRNA expression and mRNA expression, our approach is suited to evaluate the multi-dimensional regulatory interactions that are critical for tumorigenesis. Our integrative regression model, which was initially established and validated in glioblastoma multiforme (Setty et al., 2012), takes advantage of regulatory motif sequence information to highlight biologically plausible miRNA-mRNA interactions while accounting for the impact of copy number alterations on gene expression. Additionally, the model simultaneously evaluates each differentially expressed miRNA against all of its potential targets in order to increase statistical confidence that a selected miRNA is a biologically relevant driver of gene expression change. This is important because it reduces the potential for false discovery that would otherwise be faced if performing anticorrelation tests between a miRNA and each of its potential targets separately.
In our cohort of LARC, a frequent amplification of chromosome arm 13q was observed, which corroborates well-established observations in colorectal cancer (Diosdado et al., 2009). Additionally, the expression of miR-92a – which maps to 13q – was strongly upregulated in our cohort. Our integrative model predicted miR-92a regulation of both IQGAP2 and KLF4, and associations of miR-92a with both genes were demonstrated using in vitro cell line experiments in this study. IQGAP2 expression could be manipulated by miR-92a mimics and inhibitors in two cell lines, and endogenous KLF4 expression was inversely associated with miR-92a expression across 6 colorectal cancer cell lines. The regulatory interaction between miR-92a and IQGAP2 has not been described in rectal cancer prior to this study. For the miR-92a-KLF4 interaction, direct regulatory roles have been established in other organ systems (Fang et al., 2012; Loyer et al., 2014), but it has not been studied in rectal cancer. Our in vitro findings further support that there is a relationship between miR-92a and KLF4 expression at baseline across multiple cell lines, though our transfection experiments in SW620 and DLD-1 suggest that it is unlikely to be a direct regulatory mechanism in rectal cancer. These explorations into the interactions between miR-92a and IQGAP2 and KLF4 serve as a proof of principle: this integrative model can help guide further investigation and identify regulatory pathways that may be of particular relevance.
IQGAP2 is a scaffolding protein that has been implicated in colorectal cancer and plays roles in cell adhesion and tumor suppression (Schmidt et al., 2008; Smith et al., 2015). Our results suggest that downregulation of IQGAP2 is likely to be mediated through the observed copy number amplification and upregulation of miR-92a in LARC. The determination of the exact mechanism by which this occurs is beyond the scope of the current study, and the integrative regression model cannot distinguish between a direct and indirect mechanism of regulation. Further experimentation by mutagenesis of the seed targeted by miR-92a would substantiate direct regulatory control of IQGAP2, and luciferase reporter assays could provide additional proof of the functional regulatory interaction that is suggested by our studies.
KLF4 has long been recognized in human CRC to act as a tumor and metastasis suppressor (Li et al., 2011). Early studies revealed that KLF4 overexpression can induce cell cycle arrest and inhibit cell proliferation (Dang et al., 2003), and it is consistently reported to be downregulated in CRC (Xu et al., 2008). Interestingly, the 3 cell lines with low miR-92a expression and correspondingly high KLF4 expression are all from microsatellite unstable tumors, while the 3 cell lines with high miR-92a and low KLF4 expression are derived from tumors with chromosomal instability. This observation was directly facilitated by the integrative model's regulatory predictions.
The findings of this study also support the use of small diagnostic FFPE biopsies as reliable sources from which we can identify molecular markers. Because small diagnostic specimens represent the only pre-treatment source of tissue in an increasing number of solid tumors, our ability to utilize these as a source for molecular biomarkers is increasingly important in order to carry useful prediction models into clinical settings.
This current work establishes a framework and proof of principle study for integrating genomic profiles from clinically derived samples. We have shown that meaningful genomic signatures can be obtained from small FFPE biopsy samples that can be leveraged to explore and test biologically relevant pathways. Furthermore, we have utilized an integrative model that is able to effectively predict regulators of gene expression changes observed in LARC. Our next steps will be to utilize these same approaches to identify markers that predict LARC response to therapies. The identification of predictive biomarkers of response will have broad implications on our ability to personalize therapy for LARC patients.
ACKNOWLEDGEMENTS
The results published here are in part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/. We acknowledge with appreciation the specimen donors and research groups, and their contributions to that work.
The authors thank Jenifer Levin, editor in the Colorectal Surgery Service at Memorial Sloan Kettering Cancer Center, for her assistance in editing this manuscript.
This study was supported by the National Institutes of Health (NIH), National Cancer Institute (NCI) R01 Grant CA090559 (JGA) and U24 Grant CA143840 (CSL). ClinicalTrials.gov Identifier: NCT00335816. The study was funded in part by the cancer center core grant P30 CA008748. The core grant provides funding to institutional cores, such as Biostatistics and Pathology, which were used in this study.
REFERENCES
- Sfakianos GP, Iversen ES, Whitaker R, Akushevich L, Schildkraut JM, Murphy SK, Marks JR, Berchuck A. Validation of ovarian cancer gene expression signatures for survival and subtype in formalin fixed paraffin embedded tissues. Gynecol Oncol. 2013;129:159–164. doi: 10.1016/j.ygyno.2012.12.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Garcia-Aguilar J, Smith DD, Avila K, Bergsland EK, Chu P, Krieg RM. Optimal timing of surgery after chemoradiation for advanced rectal cancer: preliminary results of a multicenter, nonrandomized phase II prospective trial. Ann Surg. 2011;254:97–102. doi: 10.1097/SLA.0b013e3182196e1f. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cancer Genome Atlas Network Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487:330–337. doi: 10.1038/nature11252. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Setty M, Helmy K, Khan AA, Silber J, Arvey A, Neezen F, Agius P, Huse JT, Holland EC, Leslie CS. Inferring transcriptional and microRNA-mediated regulatory programs in glioblastoma. Mol Syst Biol. 2012;8:605. doi: 10.1038/msb.2012.37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen Z, Liu Z, Deng X, Warden C, Li W, Garcia-Aguilar J. Chromosomal copy number alterations are associated with persistent lymph node metastasis after chemoradiation in locally advanced rectal cancer. Dis Colon Rectum. 2012;55:677–685. doi: 10.1097/DCR.0b013e31824f873f. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. bioRxiv. 2014. doi: 10.1186/s13059-014-0550-8.
- Smyth GK. Limma: linear models for microarray data. In: Gentleman R, Carey V, Dudoit S, et al., editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer; New York: 2005. pp. 397–420.
- Neuwirth E. RColorBrewer: ColorBrewer palettes. 2011.
- Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechler M, Magnusson A, Moeller S, Schwartz M, Venables B. gplots: Various R programming tools for plotting data. 2013.
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JYH, Zhang J. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. doi: 10.1186/gb-2004-5-10-r80.
- Zhang H, Meltzer P, Davis S. RCircos: an R package for Circos 2D track plots. BMC Bioinformatics. 2013;14:244. doi: 10.1186/1471-2105-14-244.
- Gautier L, Cope L, Bolstad BM, Irizarry RA. affy–analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20:307–315. doi: 10.1093/bioinformatics/btg405.
- Kozomara A, Griffiths-Jones S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 2014;42:D68–D73. doi: 10.1093/nar/gkt1181.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004;5:557–572. doi: 10.1093/biostatistics/kxh008.
- Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14:178–192. doi: 10.1093/bib/bbs017.
- Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12:R41. doi: 10.1186/gb-2011-12-4-r41.
- Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nat Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754.
- Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AFA, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, Haussler D, Miller W. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004;14:708–715. doi: 10.1101/gr.1933104.
- Martin ES, Tonon G, Sinha R, Xiao Y, Feng B, Kimmelman AC, Protopopov A, Ivanova E, Brennan C, Montgomery K, Kucherlapati R, Bailey G, Redston M, Chin L, DePinho RA. Common and distinct genomic events in sporadic colorectal cancer and diverse cancer types. Cancer Res. 2007;67:10736–10743. doi: 10.1158/0008-5472.CAN-07-2742.
- Nakao K, Mehta KR, Fridlyand J, Moore DH, Jain AN, Lafuente A, Wiencke JW, Terdiman JP, Waldman FM. High-resolution analysis of DNA copy number alterations in colorectal cancer by array-based comparative genomic hybridization. Carcinogenesis. 2004;25:1345–1357. doi: 10.1093/carcin/bgh134.
- Lanza G, Ferracin M, Gafà R, Veronese A, Spizzo R, Pichiorri F, Liu C, Calin GA, Croce CM, Negrini M. mRNA/microRNA gene expression profile in microsatellite unstable colorectal cancer. Mol Cancer. 2007;6:54. doi: 10.1186/1476-4598-6-54.
- Slattery ML, Wolff E, Hoffman MD, Pellatt DF, Milash B, Wolff RK. MicroRNAs and colon and rectal cancer: differential expression by tumor location and subtype. Genes Chromosomes Cancer. 2011;50:196–206. doi: 10.1002/gcc.20844.
- Li X, Zhang G, Luo F, Ruan J, Huang D, Feng D, Xiao D, Zeng Z, Chen X, Wu W. Identification of aberrantly expressed miRNAs in rectal cancer. Oncol Rep. 2012;28:77–84. doi: 10.3892/or.2012.1769.
- Smith JM, Hedman AC, Sacks DB. IQGAPs choreograph cellular signaling from the membrane to the nucleus. Trends Cell Biol. 2015;25:171–184. doi: 10.1016/j.tcb.2014.12.005.
- Brill S, Li S, Lyman CW, Church DM, Wasmuth JJ, Weissbach L, Bernards A, Snijders AJ. The Ras GTPase-activating-protein-related human protein IQGAP2 harbors a potential actin binding domain and interacts with calmodulin and Rho family GTPases. Mol Cell Biol. 1996;16:4869–4878. doi: 10.1128/mcb.16.9.4869.
- Li D, Peng Z, Tang H, Wei P, Kong X, Yan D, Huang F, Li Q, Le X, Li Q, Xie K. KLF4-mediated negative regulation of IFITM3 expression plays a critical role in colon cancer pathogenesis. Clin Cancer Res. 2011;17:3558–3568. doi: 10.1158/1078-0432.CCR-10-2729.
- Dang DT, Chen X, Feng J, Torbenson M, Dang LH, Yang VW. Overexpression of Krüppel-like factor 4 in the human colon cancer cell line RKO leads to reduced tumorigenecity. Oncogene. 2003;22:3424–3430. doi: 10.1038/sj.onc.1206413.
- Diosdado B, van de Wiel MA, Terhaar Sive Droste JS, Mongera S, Postma C, Meijerink WJHJ, Carvalho B, Meijer GA. MiR-17-92 cluster is associated with 13q gain and c-myc expression during colorectal adenoma to adenocarcinoma progression. Br J Cancer. 2009;101:707–714. doi: 10.1038/sj.bjc.6605037.
- Fang Y, Davies PF. Site-specific microRNA-92a regulation of Kruppel-like factors 4 and 2 in atherosusceptible endothelium. Arterioscler Thromb Vasc Biol. 2012;32:979–987. doi: 10.1161/ATVBAHA.111.244053.
- Loyer X, Potteaux S, Vion A-C, Guérin CL, Boulkroun S, Rautou P-E, Ramkhelawon B, Esposito B, Dalloz M, Paul J-L, Julia P, Maccario J, Boulanger CM, Mallat Z, Tedgui A. Inhibition of microRNA-92a prevents endothelial dysfunction and atherosclerosis in mice. Circ Res. 2014;114:434–443. doi: 10.1161/CIRCRESAHA.114.302213.
- Schmidt VA, Chiariello CS, Capilla E, Miller F, Bahou WF. Development of hepatocellular carcinoma in Iqgap2-deficient mice is IQGAP1 dependent. Mol Cell Biol. 2008;28:1489–1502. doi: 10.1128/MCB.01090-07.
- Xu J, Lü B, Xu F, Gu H, Fang Y, Huang Q, Lai M. Dynamic down-regulation of Krüppel-like factor 4 in colorectal adenoma-carcinoma sequence. J Cancer Res Clin Oncol. 2008;134:891–898. doi: 10.1007/s00432-008-0353-y.
- Hörkkö TT, Tuppurainen K, George SM, Jernvall P, Karttunen TJ, Mäkinen MJ. Thyroid hormone receptor beta1 in normal colon and colorectal cancer-association with differentiation, polypoid growth type and K-ras mutations. Int J Cancer. 2006;118:1653–1659. doi: 10.1002/ijc.21556.