Author manuscript; available in PMC: 2023 Feb 17.
Published in final edited form as: Hum Pathol. 2021 Oct 14;119:1–14. doi: 10.1016/j.humpath.2021.10.002

Validation of genetic classifiers derived from mouse and human tumors to identify molecular subtypes of colorectal cancer

Santina M Snow a,*, Kristina A Matkowskyj b,c,d,*, Morgan Maresh a, Linda Clipson a, Tien N Vo e,f,m, Katherine A Johnson a, Dustin A Deming a,b,g, Michael A Newton e,f, William M Grady h,i,j, Perry J Pickhardt k, Richard B Halberg a,l,§
PMCID: PMC9936405  NIHMSID: NIHMS1765873  PMID: 34655611

Abstract

Colorectal cancer (CRC) is a leading cause of cancer death in the United States. Standard treatment for advanced stage CRC for decades has included 5-fluorouracil based chemotherapy. More recently, targeted therapies for metastatic CRC are being used based on the individual cancer’s molecular profile. In the past few years, several different molecular subtype schemes for human CRC have been developed. The molecular subtypes can be distinguished by gene expression signatures and have the potential to be used to guide treatment decisions. However, many subtyping classification methods were developed using mRNA expression levels of hundreds to thousands of genes, making them impractical for clinical use. In this study, we assessed whether an immunohistochemical approach could be used for molecular subtyping of colorectal cancers. We validated two previously published, independent sets of immunohistochemistry classifiers and modified the published methods to improve the accuracy of the scoring methods. In addition, we evaluated whether protein and genetic signatures identified originally in the mouse were linked to clinical outcomes of patients with CRC. We found that low DDAH1 or low GAL3ST2 protein levels in human CRCs correlate with poor patient outcome. The results of this study have the potential to impact methods for determining the prognosis and therapy selection for CRC patients.

Keywords: colorectal cancer, consensus molecular subtypes, immunohistochemistry, novel biomarkers, patient outcome

1. INTRODUCTION

Colorectal cancer (CRC) is a leading cause of cancer death in the United States. The standard treatment for high risk Stage II and Stage III CRC is surgery followed by 5-fluorouracil (5-FU) based chemotherapy [1]. Treatment decisions for CRC are made primarily based on the stage of the disease. The emergence of targeted therapies has greatly increased treatment options for patients with metastatic CRC. For example, metastatic colorectal cancers that have deficient DNA mismatch repair (dMMR), resulting in a strong mutator phenotype, known as microsatellite instability, are often quite responsive to immune checkpoint inhibitors [1]. Presumably, the instability creates neoantigens that can be recognized by the immune system [1]. Thus, treatment of patients with metastatic CRC is no longer “one size fits all.” The optimal treatment strategy for individual patients is directed by the molecular features of the cancer, leading to the use of targeted therapies, biologic agents, immunotherapies, or cytotoxic chemotherapy.

In the past few years, a number of research teams have identified different subtypes of human CRC that can be distinguished by gene expression signatures and somatic gene mutations. Early studies of these subtyping schemes have indicated that they have the potential to facilitate treatment decisions for CRC patients [2–8]. The molecular signatures of the subtypes were primarily based on genetic mutations and changes in gene expression profiles (i.e., the transcriptome). Using this approach, Sadanandam and colleagues found seven genes (CFTR, FLNA, MUC2, RARRES3, SFRP2, TFF3, ZEB1) that were differentially expressed among six subtypes of human CRC. They found concurrent changes in protein levels of these seven genes using immunohistochemistry (IHC) [6]. Each CRC subtype shared similarities with the different cell types that are present in the normal colonic epithelium. Patients with CRCs of three of the subtypes (Goblet-like, Transit amplifying (TA) – cetuximab sensitive, Transit amplifying (TA) – cetuximab resistant) had remarkably better disease-free survival (DFS) following surgery than patients with the other three subtypes (Enterocyte, Inflammatory, Stem-like), indicating the former might be spared chemotherapy and all of its toxic side effects [6]. Other research teams performed similar studies but reached different conclusions with respect to the genes that were differentially expressed and the number of CRC subtypes [2–5,7]. Because of the lack of reproducibility of the subtyping schemes, an international consortium of CRC investigators conducted a study to develop a reproducible and robust CRC subtyping scheme based on mRNA-based gene expression signatures. These investigators analyzed six independent classification systems and proposed four consensus molecular subtypes (CMS1, MSI Immune; CMS2, Canonical; CMS3, Metabolic; CMS4, Mesenchymal) [8].

Notably, a major limitation of the original CMS subtyping system with regard to its use in clinical care is the substantial confounding effect of preanalytical variables that cause inconsistent mRNA quality in formalin-fixed paraffin-embedded clinical samples. This limitation has led to attempts to develop IHC-based marker panels that can be used to identify the CMS subtype of CRC. Trinh and colleagues determined that five markers (CDX2, FRMD6, HTR2B, ZEB1, KER) and the expression status of DNA mismatch repair genes (MLH1, MSH2, MSH6, PMS2) can distinguish the four CMS subtypes by IHC with reasonable accuracy [5,9,10]. Mismatch repair (MMR) protein IHC is first performed to determine whether the patient’s tumor is MMR deficient (dMMR), and hence, classified as CMS1. If the tumor is MMR proficient (pMMR), it is then further classified into the epithelial (CMS2/CMS3) or mesenchymal (CMS4) subtypes. The epithelial subtype has a high level of CDX2 expression, whereas the mesenchymal subtype has loss of CDX2 expression, a high level of HTR2B, and the presence of FRMD6 expression in goblet cells. ZEB1 is a marker of the epithelial-to-mesenchymal transition (EMT). Patients with the mesenchymal (CMS4) subtype have the worst prognosis.
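The stepwise logic described above can be sketched in Python. This is only an illustration under stated assumptions, not the published classifier: the function name, the reduction of each marker to a boolean flag, and the strict all-or-nothing handling of the mesenchymal markers are hypothetical simplifications introduced here.

```python
from typing import Optional


def classify_cms(mmr_deficient: bool, cdx2_high: bool, htr2b_high: bool,
                 frmd6_high: bool, zeb1_present: bool) -> Optional[str]:
    """Return a simplified IHC-based CMS call, or None for mixed patterns.

    All arguments are hypothetical boolean summaries of the IHC read-out.
    """
    # Step 1: MMR-deficient tumors are assigned to CMS1.
    if mmr_deficient:
        return "CMS1"

    mesenchymal_markers = [htr2b_high, frmd6_high, zeb1_present]

    # Step 2: MMR-proficient tumors with high CDX2 and no mesenchymal
    # markers are called epithelial (CMS2/3).
    if cdx2_high and not any(mesenchymal_markers):
        return "CMS2/3"

    # MMR-proficient tumors with CDX2 loss and all mesenchymal markers
    # are called mesenchymal (CMS4).
    if not cdx2_high and all(mesenchymal_markers):
        return "CMS4"

    # Assumption: any other combination is left unclassified.
    return None
```

A tumor with intact MMR, high CDX2, and no HTR2B, FRMD6, or ZEB1 signal would be returned as "CMS2/3" by this sketch; mixed marker patterns are deliberately left unclassified rather than forced into a subtype.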

Polyps, like cancers, appear to belong to different CMS subtypes. Chang and colleagues analyzed adenomatous and serrated polyps from patients with hereditary and sporadic forms of CRC [11]. Adenomatous polyps were predominantly CMS2 with WNT and MYC activation, but a few were CMS4 exhibiting enrichment for stromal signatures [11]. Serrated polyps were predominantly CMS1 exhibiting immune activation [11]. Komor and colleagues also analyzed adenomatous polyps [12]. They found that the majority (87%; 54/62) of adenomatous polyps were CMS3 [12]. These conflicting results indicate that the field would benefit from a robust and rapid test to determine the molecular subtype and additional work to more deeply understand the significance of molecular subtypes at different points in the disease process.

Earlier studies from our laboratory and others have indicated that some polyps might be “born bad” instead of “born good and become bad”, in both a preclinical mouse model and in humans. We identified a transcriptional signature that predicted whether a colorectal polyp would remain a polyp or progress to invasive cancer in a mouse model of colorectal cancer [13]. Biopsies were collected from colon tumors over time and molecularly profiled [13]. The expression of 68 genes differed between polyps and invasive cancers [13]. Surprisingly, this 68-gene signature was evident even in the first biopsies, when all lesions appeared to be benign polyps [13]. Six of the genes identified, including Muc2 and Tff3, had already been linked to human CRC [13]. Additionally, we found that early human colorectal polyps carried mutations that were associated with the later stages of tumorigenesis [14]. Human colorectal polyps were followed over time by CT colonography. Some polyps grew, some remained static, and some regressed [14]. Targeted sequencing revealed that growing polyps tended to have more mutations than those that regressed [14]. The mutations were often in driver genes that have been associated with the later stages of tumorigenesis in the colon and rectum [14]. Thus, molecular features of early polyps may predict growth fate and cancer risk and guide treatments.

In this study, we set out to determine whether an IHC-based approach could classify colorectal tumors into different molecular subtypes. In addition, we evaluated whether genetic and protein signatures that were associated with malignancy in the mouse were linked to clinical outcomes of CRC patients. The results of this study have future implications for CRC screening, surveillance, and treatment.

2. MATERIALS AND METHODS

2.1. Colorectal cancer tissue microarray (CRC TMA)

A formalin-fixed paraffin-embedded (FFPE) CRC TMA was developed by the University of Wisconsin Carbone Cancer Center Translational Science Biocore (TSB) under an IRB-approved protocol (2016–0934) as previously described [15]. See Table 1 for patient characteristics.

Table 1.

Characteristics of patients on the CRC TMA.

Characteristic            All            Female         Male
All cases, n              122            46             76
Age, years, mean ± SD     61.7 ± 14.1    63.2 ± 15.7    60.8 ± 13.2
Race, n (%)
  White                   111 (91%)      44 (96%)       67 (88%)
  Black                   8 (7%)         1 (2%)         7 (9%)
  Chinese                 1 (1%)         1 (2%)         0 (0%)
  American Indian         2 (2%)         0 (0%)         2 (3%)
Tumor location, n (%)
  Right                   40 (33%)       17 (37%)       23 (30%)
  Left                    44 (36%)       16 (35%)       28 (37%)
  Rectum                  38 (31%)       13 (28%)       25 (33%)
Clinical staging, n (%)
  I                       30 (25%)       9 (20%)        21 (28%)
  II                      29 (24%)       8 (17%)        21 (28%)
  III                     35 (29%)       17 (37%)       18 (24%)
  IV                      28 (23%)       12 (26%)       16 (21%)

2.2. Immunohistochemistry

The CRC TMA was sectioned at 4–5 μm thickness, deparaffinized and rehydrated using standard methods. Antigen retrieval was performed by boiling samples for 30 minutes in citrate buffer (pH 6.0). Peroxidase activity was blocked using Peroxidazed 1 (Biocare Medical) for 5 minutes at room temperature for all antibodies except anti-DNASE1L3, which was incubated for 10 minutes. Background Sniper (Biocare Medical) was used to block for 5 minutes and to dilute primary antibodies in 1x Phosphate-Buffered Saline/Tween (PBST) for anti-FRMD6 (Sigma [HPA001297]; 1:500), anti-HTR2B (Sigma [HPA012867]; 1:500), anti-GAL3ST2 (ThermoFisher Scientific [PA5-64472]; 1:20) and anti-MUC2 (Santa Cruz Biotechnology [SC-15334]; 1:500). Goat serum (5%) (Vector Laboratories [S-1000]) in 1x PBST was used to block for 1 hour and to dilute primary antibodies for anti-DNASE1L3 (Abcam [ab203669]; 1:600), anti-ZEB1 (Sigma [HPA027524]; 1:500) and anti-TFF3 (ThermoFisher Scientific [MA5-27468]; 1:1000). Goat serum (5%) in 5% skim milk in 1x PBST was used to block for 1 hour and to dilute primary antibodies for anti-DDAH1 (Sigma-Aldrich [HPA006308]; 1:1000), anti-CFTR (R&D Systems [MAB25031]; 1:500), and anti-SEMA3E (ThermoFisher Scientific [PA5-56140]; 1:750). Anti-mouse secondary antibody (Vector Laboratories [MP7452]) was used for CFTR and TFF3. Anti-rabbit secondary antibody (Biocare [RHRP520L]) was used for DDAH1, DNase1L3, FRMD6, GAL3ST2, HTR2B, MUC2, SEMA3E and ZEB1. All slides, except GAL3ST2, were developed with diaminobenzidine (DAB; Cell Signaling Technology [11725S]) prepared per the manufacturer protocol for 3 minutes; GAL3ST2 was developed with DAB for 8 minutes. Counterstaining was conducted with 1:2 CAT hematoxylin diluted in water (Biocare Medical, [CATHE-M]) for 30 seconds. Staining for CDX2 (Cell Marque [EPR2764Y]; pre-diluted ready-to-use) and cytokeratin AE1/AE3 (Dako; 1:100) was performed using the Ventana Benchmark Ultra Staining platform in the clinical laboratory. 
Mismatch repair (MMR) analysis for MLH1, MSH6, MSH2, and PMS2 was performed in the clinical laboratory as previously described [15].

2.3. Immunohistochemistry Scoring

The CRC tissue microarray (TMA) was used to determine the immunohistochemical profile of MUC2, TFF3, CFTR, ZEB1, CDX2, FRMD6, HTR2B, cytokeratin AE1/AE3, DNase1L3, DDAH1, GAL3ST2, SEMA3E, and the MMR proteins (MLH1, MSH2, MSH6, PMS2). Tumor cores containing any amount of neoplastic epithelium, and control cores containing normal epithelium, were included in the analysis. Cores that did not contain epithelium, were folded, or were absent following IHC techniques were excluded from analysis. IHC of the epithelium was scored by a board-certified pathologist (K.A.M.) blinded to clinical data using an Olympus BX43 microscope. Immunostaining of the epithelium was qualitatively scored using a tiered system for intensity (0, no staining; 1, low intensity; 2, moderate intensity; and 3, high intensity) for all proteins of interest, except ZEB1 and previously published data for MMR proteins (MLH1, MSH6, MSH2, and PMS2) [16]. For proteins scored using the tiered system, over 75% of the neoplastic epithelium within the core was positive for the protein, so only staining intensity was assessed. For ZEB1, perinuclear staining was noted that did not appear to differ in intensity. This staining was likewise present in greater than 75% of the neoplastic cells and was therefore scored as either present or absent. With respect to MUC2, expression was dependent on the presence or absence of goblet cells. Thus for MUC2, goblet cell mucin was scored as the percentage of tumor cells containing MUC2 staining (0, 0%; 1, 1–25%; 2, 26–50%; and 3, >50%).
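The scoring tiers above translate directly into a small lookup. The following Python sketch is illustrative only; the names `INTENSITY_TIERS`, `muc2_tier`, and `zeb1_call` are hypothetical, and the treatment of fractional percentages (anything above 0% and up to 25% maps to tier 1, and so on) is an assumption, since the text defines the tiers in whole percentages.

```python
# Intensity tiers used for all tier-scored proteins, as listed in the text.
INTENSITY_TIERS = {
    0: "no staining",
    1: "low intensity",
    2: "moderate intensity",
    3: "high intensity",
}


def muc2_tier(percent_positive: float) -> int:
    """Map the percentage of MUC2-positive tumor cells to a 0-3 tier."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0          # 0%
    if percent_positive <= 25:
        return 1          # 1-25%
    if percent_positive <= 50:
        return 2          # 26-50%
    return 3              # >50%


def zeb1_call(staining_present: bool) -> str:
    """ZEB1 is scored only as present or absent, not by intensity."""
    return "present" if staining_present else "absent"
```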

2.4. Gene Expression, Survival Curves, and Other Statistical Analyses

The list of 68 differentially expressed genes in mouse polyps was previously described [13]. The Cancer Genome Atlas (TCGA) gene expression and clinical information for 41 cases of normal colorectal tissue and 285 colorectal adenocarcinomas were downloaded using the R package TCGAbiolinks [16]. Cox regression with elastic net regularization, computed with the R package glmnet, was the semiparametric method used to select the gene subset from our list of 68 that best predicts human survival, accounting for effects of age, gender, and tumor stage [17,18]. Cross-validation was used to choose tuning parameters (alpha = 0.95) and thus to identify a best subset of predictors. Further, stability selection was used to rank genes for predictive importance [19]. The UALCAN web portal was also used to access human colon adenocarcinoma gene expression data from the TCGA and mass-spectrometry-based proteomics from the CPTAC database [20,21]. Kaplan-Meier survival plots in Figure 3 were obtained from UALCAN.
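For orientation, Cox regression with an elastic net penalty can be written in the following generic form. This is the standard textbook formulation, given here as a sketch rather than as the exact objective or scaling used in the analysis; the notation ($\beta$, $\lambda$, $\alpha$, $\delta_i$, $R_i$, $x_i$) is introduced here, and tied event times are ignored for simplicity.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}
\left\{ -\frac{1}{n}\,\ell(\beta)
  \;+\; \lambda \left[ \alpha \,\lVert \beta \rVert_1
  \;+\; \frac{1-\alpha}{2}\, \lVert \beta \rVert_2^2 \right] \right\},
\qquad
\ell(\beta) \;=\; \sum_{i \,:\, \delta_i = 1}
\left[ x_i^{\top}\beta \;-\; \log \sum_{j \in R_i} \exp\!\left( x_j^{\top}\beta \right) \right]
```

Here $\ell(\beta)$ is the Cox log partial likelihood, $\delta_i = 1$ marks an observed event, $R_i$ is the set of patients still at risk at the event time of patient $i$, and $x_i$ collects the covariates (gene expression plus age, gender, and stage). With the mixing parameter $\alpha = 0.95$ the penalty is close to a pure lasso, which drives most coefficients to exactly zero and so performs gene selection; the penalty strength $\lambda$ is the kind of tuning parameter usually chosen by cross-validation.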

Figure 3. Mouse-derived classifiers translated to human colorectal carcinomas.

(A) Semiparametric elastic net analysis determined whether the 68-gene signature identified in mice predicted human CRC patient survival in the publicly available TCGA database. Zero (0) represents no predictive ability and 1 represents perfect prediction. The top 11 predictors are shown amongst known factors predictive of patient survival such as age, gender and cancer stage at diagnosis. (B) DNASE1L3, DDAH1, GAL3ST2 and SEMA3E had changes in relative mRNA levels within colon adenocarcinomas (Primary tumor, n=286) as compared to normal colon epithelium (Normal, n=41). Transcriptomics were mined from the TCGA database using the UALCAN online portal (*p < 0.05, ***p < 0.0001, Student’s two-tailed t-test). (C) Kaplan-Meier survival curves of human CRC patients in the TCGA database based on high (red line) or low/medium (blue line) mRNA level of DNASE1L3, DDAH1, GAL3ST2 or SEMA3E, mined from the UALCAN online portal. p-values are as indicated from the log-rank test. (D) Mass-spectrometry-based proteomics from the CPTAC Confirmatory/Discovery cohort of colon adenocarcinoma was mined through UALCAN for DNASE1L3 and GAL3ST2. DDAH1 and SEMA3E proteomic data were not available. Both DNASE1L3 and GAL3ST2 protein levels were significantly different in primary colon adenocarcinoma (n=97) as compared to normal colon epithelium (n=100; ***p < 0.0001, Student’s two-tailed t-test).

For other statistical comparisons, Fisher’s exact test, chi-square test or Jonckheere-Terpstra test were used where noted using MSTAT software. Student’s two-tailed t-test was used for differential gene expression from the UALCAN web portal. Kaplan-Meier survival plots were analyzed using log-rank (Mantel-Cox) test in GraphPad Prism8 software. A p value < 0.05 was considered statistically significant for all tests.
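The Kaplan-Meier curves referred to above rest on the product-limit estimator, which can be sketched in a few lines of Python. This is a generic illustration, not the GraphPad Prism or UALCAN implementation; the function name and the input layout (parallel lists of follow-up times and event indicators) are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times:  follow-up time for each patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)   # events and censorings both leave the risk set

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = events_at.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the curve themselves, which is why the estimate only steps down at observed event times.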

2.5. Targeted Sequencing

Targeted sequencing for 105 patients on the CRC TMA was performed. DNA was isolated from FFPE whole tumors on a Maxwell 16 AS2000 (Promega) using the Maxwell 16 FFPE+LEV DNA Purification Kit (Promega AS1135) according to manufacturer instructions. Isolated DNA was sequenced using the QiaSeq Comprehensive Cancer Panel Kit (Qiagen #333515) on an Illumina HiSeq 2500. Sequencing analysis through variant calling was performed at the UW Biotechnology Center. Sequence reads were adapter and quality trimmed using Skewer [22], aligned to Homo sapiens build 1k_v37 using BWA-MEM [23], and deduplicated using Picard (http://picard.sourceforge.net) and Je [24]. Base quality scores were recalibrated using GATK [25], and mutations were called using Strelka v-2.8.4 [26] without matched controls and annotated using SNPEff [27]. The resulting variant call format (VCF) files were uploaded to the public Galaxy web platform at usegalaxy.org [28] and cross-referenced to ClinVar’s publicly available VCF (accessed 4/29/2019) for annotation of predicted clinical response. Mutations were deemed pathogenic if the ClinVar database labeled them either “Pathogenic” or “Likely Pathogenic.” Additionally, alterations in APC, KRAS, BRAF, PIK3CA, and TP53 were evaluated for potentially pathogenic mutations not yet curated in ClinVar.
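The pathogenicity rule in the last step can be expressed as a simple filter. The Python sketch below is an assumption-laden illustration rather than the pipeline actually run on Galaxy: the function name is hypothetical, and the input is assumed to be a clinical-significance string in which words may be joined by underscores and multiple labels may be separated by "/", "|" or ",".

```python
# Labels treated as pathogenic, following the rule stated in the text.
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic"}


def is_pathogenic(clinical_significance: str) -> bool:
    """Return True if any label is Pathogenic or Likely Pathogenic."""
    # Normalize underscores and case so "Likely_pathogenic" matches.
    normalized = clinical_significance.replace("_", " ").lower()
    # Assumption: several labels may be packed into one field.
    for separator in ("|", ","):
        normalized = normalized.replace(separator, "/")
    labels = {part.strip() for part in normalized.split("/")}
    return bool(labels & PATHOGENIC_LABELS)
```

Exact label matching is used on purpose: a substring test for "pathogenic" would wrongly accept labels that merely mention pathogenicity, such as a conflicting-interpretations label.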

3. RESULTS

3.1. Colorectal Cancer Tissue Microarray (CRC TMA)

The CRC TMA used in this study consists of 122 adenocarcinomas distributed as 40 right-sided colon cancers, 44 left-sided colon cancers and 38 rectal cancers; each cancer was represented by two core punches (0.6 mm) from different areas of the tumor at the invasive front and one core (0.6 mm) of normal colon from the resection margin. Clinicopathological data were available for each cancer, and cases were distributed approximately evenly across Stages I through IV for each location (Table 1).

A correlation study was performed in a previous publication [15]. To validate that the TMA cores were representative of the tumors from which they were derived, full tumor slides from eight samples with various CD8+ T cell levels observed on the core sections were selected and stained for CD8 as done for the TMA. A total of 5 high power fields were analyzed. This analysis demonstrated that the core samples are quite representative of the full tumor for CD8 staining. As expected, tumors with low amounts of tumor-infiltrating lymphocytes (TILs) were highly concordant with the TMA cores. In cancers with increased levels of CD8+ TILs, the absolute number of TILs can vary across the slide and within the TMA cores. In all instances, an increased level of TILs was observed for both the whole sections and the TMA cores. There were no instances in which the cores demonstrated stark differences in classification of tumors as having high or low levels of infiltrating CD8+ T cells.

3.2. Early Classification Method

Using the IHC staining and scoring classification methods developed by Sadanandam and colleagues, which differentiate CRC into one of six subtypes [6], we categorized our patient cohort on the CRC TMA. Epithelial staining intensities of MUC2, CFTR, TFF3 and the presence of ZEB1 in epithelial cells were assessed (Figure 1A). Cancers could be classified by IHC staining in only a subset of patients (Figure 1B): 12% (15/122) Enterocyte, 10% (12/122) Goblet-like, 2% (3/122) Stem-like, and 11% (13/122) Transit amplifying (TA) subtype. Two percent of cases (2/122) were classified into two categories based on the IHC staining of two cores from the same tumor; one patient was both Enterocyte and TA subtype and another patient was both Enterocyte and Goblet-like. Unfortunately, a large proportion of cases (63%; 77/122) were either not scorable or had a combination of IHC stains that did not correspond to a defined subtype. Our results are consistent with earlier studies indicating this IHC-based classification is not robust.

Figure 1. Early immunohistochemistry (IHC)-based classifiers of colorectal carcinomas (CRC).

(A) CRC tissue microarray (CRC TMA) was stained for four previously established IHC classifiers (MUC2, TFF3, CFTR, and ZEB1). Qualitative scoring of epithelial cells was performed using a tiered system for MUC2, TFF3 and CFTR (0, no staining; 1, low intensity; 2, moderate intensity; and 3, high intensity). ZEB1 was scored based on the presence or absence of epithelial cell staining. Representative images of each score are shown on the right, unless the score was not observed (NA, not applicable). On the left, graphs show the percent of cancers for each score tier for MUC2, TFF3, CFTR, and for the presence/absence of ZEB1. For whole core images, scale bar represents 300 μm. For magnified images, scale bar represents 100 μm. (B) The classification of the cancers based on IHC classifiers in (A).

3.3. Consensus Classification Method

Using the IHC classification methods developed by Trinh and colleagues that differentiate CRCs into consensus molecular subtypes [5,9,10], our patient cohort on the CRC TMA was classified into CMS1 (dMMR/Immune infiltration), CMS2/3 (Epithelial-like) or CMS4 (Mesenchymal-like) subtypes. Epithelial staining intensity of CDX2, FRMD6, HTR2B (Figure 2A) and the presence of ZEB1 in epithelial cells (Figure 1A) were assessed for tumor classification. Additionally, cytokeratin staining was used to assess the presence of epithelial cells for each cancer core. Each core that had a definitive neoplastic epithelial component was scored. Cancers in most patients (86%; 105/122) could be classified into a single subtype: 16% (20/122) of patients were CMS1, 67% (82/122) of patients were CMS2/3, and 2% (3/122) were CMS4 (Figure 2B). One patient had an ambiguous classification, with one tumor core classified as CMS2/3 and the other core as CMS4. A small percentage of cancers (13%; 16/122) could not be classified, either because the staining did not match any of the consensus molecular subtypes or because the cores did not contain epithelium, were folded, or were absent following IHC techniques and thus were excluded from analysis. Disease-free survival (DFS; Figure 2C, top panel) and overall survival (OS; Figure 2C, bottom panel) were plotted based on the CMS classification; no statistically significant difference in survival based on CMS classification was identified, although our sample size of CMS4 cancers is too small to detect an effect. As expected, CMS2/3 patients had a decrease in OS and DFS with an increase in cancer stage (Figure 2D, right two panels). For CMS1 cases, the OS and DFS did not differ between cancer stages (Figure 2D, left two panels). Since only four patients in the CRC TMA cohort were classified as CMS4, further survival analysis by cancer stage could not be performed because of the small sample size.
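Because each tumor was represented by two cores, a patient-level call has to reconcile the per-core results. The Python sketch below shows one way this could be done and re-derives the reported percentages from the counts given in the text. The function names are hypothetical, and treating a single scorable core as sufficient for a call is an assumption made here.

```python
def patient_call(core_calls):
    """Combine per-core subtype calls into one patient-level call.

    core_calls: subtype label per core, with None for an unscorable core.
    """
    scored = {call for call in core_calls if call is not None}
    if not scored:
        return "unclassified"       # no scorable core
    if len(scored) == 1:
        return next(iter(scored))   # cores agree (or only one was scorable)
    return "ambiguous"              # cores disagree


# Patient-level counts reported in the text for the 122-patient cohort.
REPORTED_COUNTS = {
    "CMS1": 20,
    "CMS2/3": 82,
    "CMS4": 3,
    "ambiguous": 1,
    "unclassified": 16,
}


def percentages(counts):
    """Percentage of the cohort in each category, rounded to integers."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}
```

The patient described above, with one CMS2/3 core and one CMS4 core, corresponds to `patient_call(["CMS2/3", "CMS4"])`, which returns "ambiguous".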

Figure 2. Consensus Molecular Subtypes (CMS) as determined by immunohistochemistry (IHC).

(A) Colorectal cancer tissue microarray (CRC TMA) was stained for previously established IHC classifiers (CDX2, FRMD6, HTR2B, cytokeratin, and ZEB1) to distinguish CMS2/3 from CMS4. CMS1 was determined solely by mismatch repair deficiency (dMMR). Qualitative scoring of epithelial cells was performed using a tiered system for CDX2, FRMD6, HTR2B, and cytokeratin (0, no staining; 1, low intensity; 2, moderate intensity; and 3, high intensity, with a score of 0 or 1 being classified as low and a score of 2 or 3 being classified as high). ZEB1 is shown in Figure 1A. Representative images of each scoring tier are shown on the right for CDX2, FRMD6, HTR2B and cytokeratin, unless the score was not observed (NA, not applicable). On the left, the percent of cancers for each score tier is shown. CMS2/3 is expected to have high CDX2, low FRMD6, low HTR2B and the absence of ZEB1 in epithelial cells. CMS4 is expected to have low CDX2, high FRMD6, high HTR2B and the presence of ZEB1 in epithelial cells. For whole core images, scale bar represents 300 μm. For magnified images, scale bar represents 100 μm. (B) The distribution of cancers on our CRC TMA amongst the CMS classes based on IHC classifiers in Figure 2A. (C) There was no difference in disease-free survival (DFS) or overall survival (OS) based on the CMS classification of our CRC TMA patient cohort. (D) When assessed by cancer stage, there was no difference in DFS or OS for CMS1 cancers (left). CMS2/3 cancers (right) showed the expected trend of a decrease in survival with an increase in cancer stage for both OS and DFS.

As previously established by Guinney and colleagues while developing the CMS classifications, different known cancer driver mutations are associated with the individual CMS subtypes [8]. They observed that CMS1 cancers were enriched for BRAF mutations, CMS2 cancers were enriched for APC and TP53 mutations, and CMS3 cancers were enriched for KRAS mutations, though no mutations were exclusive to any subtype [8]. We conducted targeted sequencing on a subset of 105 cases from the CRC TMA to determine whether there was an enrichment for certain mutations in the IHC-based CMS classification. Of the 105 sequenced cancers, 16 were CMS1, 74 were CMS2/3, 2 were CMS4, and 13 could not be subtyped. Similar to the results of Guinney and colleagues, CMS1 cancers within our patient cohort were associated with an enrichment of BRAF mutations (56%; 9/16) compared to CMS2/3 (3%; 2/74). CMS2/3 cancers were associated with an enrichment of APC (73%; 54/74), KRAS (46%; 34/74), and TP53 (66%; 49/74) mutations. Only two CMS4 cancers were sequenced, which prevented analysis for enrichment of mutations. The enrichment of mutations on our subtyped CRC TMA for CMS1 and CMS2/3 aligned with the original CMS classification results by Guinney and colleagues. This agreement for CMS1 and CMS2/3 further supports the use of the IHC-based method for CMS classification.
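The BRAF comparison above (9 of 16 CMS1 cancers versus 2 of 74 CMS2/3 cancers) is the kind of 2×2 table that Fisher’s exact test addresses. The following Python sketch implements a two-sided version of the test from the hypergeometric distribution; it is a generic illustration with a hypothetical function name, not the MSTAT routine used in the study, and no p-value for this particular comparison is reported in the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)

    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# BRAF-mutant vs. non-mutant counts taken from the text:
# CMS1: 9 mutant, 7 non-mutant; CMS2/3: 2 mutant, 72 non-mutant.
braf_p = fisher_exact_two_sided(9, 7, 2, 72)
```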

3.4. Novel Biomarkers Derived from Animal Studies

Previous research from our group using the ApcMin mouse model of CRC demonstrated that 68 genes were differentially expressed in polyps that grew to adenocarcinomas versus polyps that remained static [13]. In light of our prior results, we asked whether the genes associated with polyp progression might also affect CRC behavior. We conducted a semiparametric regression analysis using survival and tumor gene expression data from 285 colorectal cancer patients in the TCGA database to compare the 68 murine-identified genes to known predictors of human cancer risk and patient survival such as age, gender, and tumor stage, and thereby to assess whether any of the 68 genes were linked to CRC patient survival. The 68 genes were then ranked based on statistical stability, which is the frequency of occurrence of each gene in the best selected predictors identified in random subsets of patients (see Materials and Methods). The top 11 predictors of poor survival of CRC patients based upon Cox regression with elastic net regularization (see Materials and Methods) are shown in Figure 3A amongst known factors predictive of patient survival such as age, gender and cancer stage at diagnosis. The top candidate genes most predictive of CRC patient survival were DNASE1L3, DDAH1, GAL3ST2 and SEMA3E. We then used the UALCAN online portal [20] to conduct a parametric analysis to independently assess the prognostic significance of DNASE1L3, DDAH1, GAL3ST2 and SEMA3E. All four gene candidates were differentially expressed in CRC tissue as compared to normal colon (Figure 3B). UALCAN defines low/medium mRNA expression to be below the third quartile of patient expression and high mRNA expression to be above the third quartile of patient expression. Low/medium mRNA levels of DNase1L3 or GAL3ST2 in CRC were linked to poor survival as compared to high mRNA levels (Figure 3C).
Mass-spectrometry-based proteomic data from the CPTAC Confirmatory/Discovery cohorts of colon adenocarcinomas were mined for our four potential candidates; however, only DNase1L3 and GAL3ST2 proteomic data were available [21]. Both DNase1L3 and GAL3ST2 protein levels were significantly different in tumor tissue as compared to normal colon, which includes the entire wall (Figure 3D).
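The third-quartile split described above for grouping patients by mRNA level can be sketched as follows. This Python example is not UALCAN’s code: the function name, the dictionary input, the choice of the "inclusive" quantile method, and the placement of values exactly equal to the third quartile in the low/medium group are all assumptions made for illustration.

```python
from statistics import quantiles


def split_by_third_quartile(expression):
    """Split samples into high vs. low/medium expression groups.

    expression: dict mapping sample ID to an expression value.
    Returns (q3, high_ids, low_medium_ids).
    """
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the third quartile.
    q3 = quantiles(expression.values(), n=4, method="inclusive")[2]
    high = [sample for sample, value in expression.items() if value > q3]
    low_medium = [sample for sample, value in expression.items() if value <= q3]
    return q3, high, low_medium
```

By construction roughly one quarter of patients fall in the high group and three quarters in the low/medium group, so the two arms of the resulting survival comparison are unequal in size.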

The four candidate genes derived originally from murine intestinal adenomas in the ApcMin mouse, DNASE1L3, DDAH1, GAL3ST2 and SEMA3E, were assessed in our human CRC TMA using IHC (Figure 4). A representative core image for each staining intensity score is shown for each stain (Figure 4A). Low stain intensity was considered a score of 0 or 1. High stain intensity was a score of 2 or 3. DNase1L3 protein level was high in all normal colon epithelium and high in CRC except for one patient who had low DNase1L3 tumor staining intensity (Figure 4A). No patient had complete loss of DNase1L3. DDAH1 staining intensity was high for all normal colon epithelium and was low in 11 CRCs (Figure 4A). No CRC had complete loss of DDAH1 expression. GAL3ST2 was highly expressed in 94% of cases in the normal colonic epithelium and in 50% of CRC cases (Figure 4A). Representative images of GAL3ST2 staining show granular expression and subcellular perinuclear localization suggestive of Golgi apparatus. High SEMA3E staining intensity was seen in 13% of normal colonic epithelium and 94% of CRC cases (Figure 4A). SEMA3E staining was cytoplasmic as shown in the representative scoring images.

Figure 4. Protein expression of mouse-derived classifiers in human CRC and patient outcomes.

(A) IHC staining using the CRC TMA was conducted for each mouse-derived classifier, DNASE1L3, DDAH1, GAL3ST2 and SEMA3E, and qualitatively scored on the 0–3 tiered system described in Figure 1A. Here, a score of 0–1 is defined as low stain intensity, while a score of 2–3 is high intensity. The percentage of cancers and normal colon epithelial tissue with high stain intensity is shown on the left for each classifier (*p < 0.05, ***p < 0.0001, Fisher’s Exact Test). Representative images of each are shown on the right, unless the score was not observed (NA, not applicable). For whole core images, scale bar represents 300 μm. For magnified images, scale bar represents 100 μm. (B) Low DDAH1 protein level was predictive of poor overall survival (OS) and disease-free survival (DFS) as compared to high DDAH1 protein level. (C) Low GAL3ST2 protein level was predictive of poor DFS. Adjuvant chemotherapy did not increase DFS for patients with low GAL3ST2 cancers (p = 0.81).

We tested whether the protein levels of DDAH1 and GAL3ST2 differed with cancer stage. Low DDAH1 protein expression was associated with Stage III (16%; Fisher’s exact test, p < 0.01) and Stage IV (14%; Fisher’s exact test, p < 0.05) cancers as compared to normal colonic epithelium (0%). The percent of cancers with low GAL3ST2 was positively associated with cancer stage (Stage I, 35%; Stage II, 52%; Stage III, 53%; Stage IV, 63%) as compared to normal colonic epithelium (6%; Jonckheere-Terpstra test, p < 0.0001). No difference in low protein levels of DDAH1 was seen regarding gender (females (13%), males (9%); Fisher’s exact test, p = 0.74), tumor location (right colon (5%), left colon (14%) or rectal (13%); Chi-square test, p = 0.41), age at diagnosis (< 50 years (19%) or ≥ 50 years (8%); Fisher’s exact test, p = 0.14) or aspirin usage (yes (8%), no (17%); Fisher’s exact test, p = 0.32). No difference in low protein levels of GAL3ST2 was seen regarding gender (females (51%), males (49%); Fisher’s exact test, p = 1), tumor location (right colon (48%), left colon (55%) or rectal (47%); Chi-square test, p = 0.72), age at diagnosis (< 50 years (50%) or ≥ 50 years (46%); Fisher’s exact test, p = 0.81) or aspirin usage (yes (48%), no (60%); Fisher’s exact test, p = 0.29).

Survival analysis was conducted based on IHC staining intensity of the mouse-derived genes DDAH1, GAL3ST2 and SEMA3E. Because only one patient had low expression of DNASE1L3, survival analysis could not be conducted and DNASE1L3 was no longer pursued as a potential classifier. Low DDAH1 protein correlated with poor overall survival (OS; log-rank test, p < 0.05) and poor disease-free survival (DFS; log-rank test, p < 0.0001) as compared to high levels of DDAH1 protein expression (Figure 4B). Similarly, low levels of GAL3ST2 expression correlated with poor DFS (log-rank test, p < 0.001) as compared to high levels of GAL3ST2 expression (Figure 4C). High SEMA3E protein level was predictive of neither OS (log-rank test, p = 0.63) nor DFS (log-rank test, p = 0.58) when compared to low SEMA3E protein level. Therefore, SEMA3E was no longer pursued as a potential classifier.
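The survival comparisons above rely on the log-rank test. The following is a minimal sketch of a two-group log-rank test, assuming right-censored follow-up data; it is not the authors' analysis code. The function name `logrank_test` and the follow-up times are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*  : follow-up times
    events_* : 1 if the event (death or relapse) was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk just before time t in each group
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events at time t in group A and overall
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d = sum(1 for ti, e, _ in data if ti == t and e)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    statistic = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom
    return statistic, erfc(sqrt(statistic / 2))


# Hypothetical follow-up times in months (illustration only, not study data)
low_times, low_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
high_times, high_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
statistic, p_value = logrank_test(low_times, low_events, high_times, high_events)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

The implementation favors readability over speed; for cohorts of any realistic size, an established survival analysis package would be the more sensible choice.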

To further characterize the mouse-derived genes, we determined whether they were enriched in the IHC-based CRC subtypes determined by Sadanandam and colleagues and in the CMS subtypes, as previously described. DDAH1 protein level did not differ between any of the early Sadanandam classifications, but high-expressing GAL3ST2 cancers were over-represented in Goblet-like tumors (84.6% Goblet-like vs 50% average from total cohort; Fisher’s exact test, p < 0.01). To validate the association between goblet cells and GAL3ST2, the percentage of goblet cells was quantified by a board-certified pathologist (K.A.M.). A majority of tumors with low GAL3ST2 expression (70%) had complete loss of goblet cells. Additionally, we assessed whether cancers with low DDAH1 protein or low GAL3ST2 protein were enriched within the CMS classifications. Cancers with low DDAH1 protein were enriched in neither CMS1 cancers (10%; Chi-square test, p = 0.97) nor CMS2/3 cancers (9%; Chi-square test, p = 0.71) as compared to the total patient population (10%). Cancers with low GAL3ST2 protein were enriched in neither CMS1 cancers (68%; Chi-square test, p = 0.14) nor CMS2/3 cancers (44%; Chi-square test, p = 0.38) as compared to the total patient population (50%). Distribution into CMS4 classification could not be assessed based on low DDAH1 or GAL3ST2 protein because of the low number of CMS4 cancers in our cohort.

We next determined whether a low DDAH1 or GAL3ST2 protein expression level was associated with known cancer-associated mutations using the Qiagen Comprehensive Cancer Panel. Low DDAH1 expression levels were not significantly associated with more or fewer mutations in APC, BRAF, TP53, KRAS, PIK3CA, SMAD4 or CTNNB1 as compared to high DDAH1 cancers. When cancers with low GAL3ST2 were compared with high GAL3ST2 expressing cancers, cancers with low GAL3ST2 protein had more BRAF mutations (19% low GAL3ST2 vs 4% high GAL3ST2; Chi-square, p < 0.05), more TP53 mutations (71% low GAL3ST2 vs 45% high GAL3ST2; Chi-square, p = 0.01), fewer KRAS mutations (29% low GAL3ST2 vs 51% high GAL3ST2; Chi-square, p < 0.05) and no CTNNB1 mutations.

4. DISCUSSION

In this study, we assessed whether an immunohistochemical approach could be used for molecular subtyping of colorectal cancers. We analyzed two previously published, independent sets of immunohistochemistry classifiers and modified the published methods of scoring to improve the accuracy of the classification system. In addition, we evaluated whether genetic and protein signatures were linked to clinical outcomes of patients with CRC.

4.1. Immunohistochemistry-based Classification of CRC

The original CMS classification is a transcriptome-based method which is not practical for everyday use in the clinical laboratory. This system has yet to be employed to make therapeutic decisions for CRC patients even though CMS4 patients appear to have a worse prognosis and may benefit from a more aggressive treatment regimen. One approach to making classification of CRC more feasible in the clinical setting is to do so using readily available methodologies like IHC. Two sets of CRC classifiers were previously developed [6,9]. The set that was developed by Sadanandam and colleagues was established prior to the CMS classifications; the other set, developed by Trinh and colleagues, was also established prior to CMS but then translated into use with the CMS classifications. We used both methods with our CRC TMA consisting of a patient cohort of 122 colorectal adenocarcinomas. Overall, 35% of the cancers on the CRC TMA could be classified using the early methods by Sadanandam and colleagues, and 87% of the same cohort of cancers were classified by the later set of classifiers proposed by Trinh and colleagues. Based on the later classification method, CMS1 cancers were enriched for BRAF mutations, whereas CMS2/3 cancers were enriched with APC, KRAS and TP53 mutations. The sequencing supported the use of IHC to subtype CRC into the CMS classifications without the need of transcriptomics. Thus, CMS classification using IHC staining could be readily implemented to facilitate therapeutic decision making.

One challenge in classifying a tumor is intratumoral heterogeneity which may lead to ambiguity for subtyping CRC in small samples. For each cancer case on the CRC TMA, two tumor cores were obtained. Both classification methods identified cases in which subtype was discordant (i.e. one tumor core was assigned a subtype while the other tumor core was assigned a different subtype). More commonly, one core was ambiguous or unable to be classified, while the other was able to be classified. Therefore, it is recommended that multiple samplings across the tumor be evaluated for classification when a larger cross-sectional area of tumor is unavailable.
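The two-core comparison described above can be sketched as a simple case-level summary. This is an illustrative sketch rather than the procedure used in the study; the function name `summarize_case`, the category labels, and the example calls are hypothetical, with `None` standing for a core that was ambiguous or could not be classified.

```python
from collections import Counter
from typing import Optional


def summarize_case(core1: Optional[str], core2: Optional[str]) -> str:
    """Combine the subtype calls from two TMA cores of the same cancer.

    None marks a core that was ambiguous or could not be classified.
    """
    if core1 is None and core2 is None:
        return "unclassified"
    if core1 is None or core2 is None:
        # Only one of the two cores yielded a subtype call
        return "single-core call"
    return "concordant" if core1 == core2 else "discordant"


# Hypothetical per-case calls (illustration only, not study data)
cases = [
    ("CMS1", "CMS1"),
    ("CMS2/3", None),
    ("CMS2/3", "CMS4"),
    (None, None),
]
tally = Counter(summarize_case(c1, c2) for c1, c2 in cases)
print(dict(tally))
```

Tallying cases this way makes it explicit how often a subtype rests on a single core, which is the situation in which additional sampling across the tumor is most informative.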

4.2. Potential Refinement of Classification Scoring

Other proposed IHC subtyping schemes for CRC are probably not robust in part because they fail to accurately account for the biology of the proteins used for subtype classification. For example, cell type specification and consideration for protein localization in the CRC is needed when scoring MUC2, TFF3, and ZEB1 [29,30]. We suggest refinements to IHC subtyping (Supplemental Information; Supplemental Figures 1 and 2).

The clinically available technique of IHC can be used to subtype CRC; however, as mentioned above, rigorous selection of IHC classifiers is critical for accurate and robust classification. By using the recently proposed set of classifiers by Trinh and colleagues, we were able to classify more CRC into the CMS classifications without the use of transcriptomics than with the earlier IHC-based methods. The IHC-based CMS classification was able to subtype the same proportion of CRC as the originally proposed transcriptomic-based method and retained the enrichment of commonly mutated genes in CRC. Because IHC is routinely used in the clinical laboratory, these IHC-based CMS classifiers can be readily translated into clinical practice. While our current study utilizes a small, retrospective patient cohort, this modified immunohistochemical panel can be used to prospectively study the CMS classification of new cases of CRC.

Other techniques are beginning to emerge for subtyping. Recently, Morris and colleagues narrowed the original list of 472 genes of interest for transcriptomic-based subtyping [8] down to a panel of 100 genes and optimized the panel for subtyping of CRC from formalin-fixed paraffin embedded tissue with Nanostring-based technology [31]. Though this assay has been developed and tested in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory, it is not ready for clinical use until clinical trials are performed.

The laboratory that developed the IHC-based CMS classification method recently published an extensive review on the prognosis and predictive response to therapy based on the CMS [32]. CMS1 cancers were predicted to benefit from the addition of bevacizumab and CMS1 cancers may be the best candidates for immunotherapy, based on the immune-infiltration phenotype [32]. CMS2 cancers wild-type for KRAS were also predicted to respond favorably to cetuximab and Stage II CMS2 cancers had an increase in relapse-free survival with the addition of oxaliplatin to 5-FU [32]. CMS2 and CMS3 cancers had an increase in progression-free and overall survival with the addition of bevacizumab to capecitabine therapy. CMS4 cancers were predicted to respond favorably to an irinotecan backbone with bevacizumab or cetuximab (if KRAS wild-type) and metastatic CMS4 cancers were predicted to have a better response to irinotecan-based chemotherapy as a first line of treatment [32]. Overall, ten Hoorn and colleagues concluded that primary cancers classified as CMS2 or CMS3 would be predicted to respond better to adjuvant chemotherapy, most likely due to their more epithelial-like characteristics as compared to CMS1 (MSI/immune) and CMS4 (mesenchymal-like). The differences in therapeutic response may be due to the underlying molecular differences of each subtype, indicating the benefit of adding CMS classification to the standard assessments of TNM staging, MMR and gene mutation status for treatment decisions. The prognostic and predictive value of the CMS were independent of the method used for subtyping, indicating the strength of the molecular characteristics within each subtype and that the IHC-based subtyping is not inferior to transcriptomic-based subtyping, nor vice versa [32].

4.3. Novel Biomarkers

Our studies show that genes with a possible role in the progression of adenoma to carcinoma in a mouse CRC model also associate with survival in CRC patients, indicating that they might provide further insight into biological differences between the four CMS subtypes of CRC. We demonstrated that the levels of DDAH1 and GAL3ST2, which originated from our study of a mouse model of human CRC [13], correlated with advanced disease. DDAH is a dimethylarginine dimethylaminohydrolase protein that degrades asymmetric dimethylarginine (ADMA), which, in turn, regulates the nitric oxide (NO) pathway [33]. Low DDAH1 protein expression has been associated with gastric cancers and DDAH1 protein has been considered to be a possible tumor suppressor in gastric cancer, independent of its role in the ADMA/NO pathway [33]. GAL3ST2 is a galactose-3-O-sulfotransferase located in the Golgi apparatus that is involved in protein modifications and has been associated with metastatic potential in several cancer cell lines [34]. Previous research also demonstrated a down-regulation of GAL3ST2 in association with non-mucinous adenocarcinomas [35]. The additional information gleaned from DDAH1 and GAL3ST2 protein expression could facilitate treatment decisions: because the level of these proteins appears to be a marker of advanced disease, a more aggressive treatment strategy could subsequently be employed.

Using our CRC TMA patient cohort, IHC staining was conducted for both DDAH1 and GAL3ST2. Low DDAH1 protein expression levels correlated with decreased OS and DFS in our CRC patient cohort (Figure 4). Reduced GAL3ST2 protein expression also predicted a decrease in DFS in our CRC patient cohort (Figure 4). Because it is uncertain if the addition of chemotherapy in the treatment regimen for CRC will enhance patient survival, we wanted to determine if chemotherapy increased survival of CRC patients with low DDAH1 or low GAL3ST2 protein levels. Because there were only nine patients with low DDAH1 protein expression with survival data available, further survival analysis based on chemotherapy was not possible. However, when GAL3ST2 expression was further stratified according to treatment with chemotherapy, there was no improvement in DFS in patients who received chemotherapy treatment in our CRC patient cohort (Figure 4C). The proportions of CMS1, CMS2/3 and CMS4 patients within the high and low DDAH1 categories were not different; however, patients with low DDAH1 had decreased overall survival and disease-free survival. The current CMS classification alone is insufficient to predict patient survival; the additional input regarding the expression of DDAH1 or GAL3ST2 protein level could enhance the ability to predict patient outcomes.

We recognize that our studies have limitations. The CRC TMA used in this study is a valuable resource; however, it has a limited number (122) of colorectal adenocarcinomas with minimal diversity, given that 92.5% of patients within our cohort are Caucasian. While the majority of staining was conducted manually in this study, automated staining would need to be employed for this approach to be widely accepted as part of a standard workflow in the clinical laboratory.

In conclusion, we have assessed a method of classification of CRC into the consensus molecular subtypes (CMS) using IHC assays for a panel of proteins and have translated classifiers predictive of outcome from our mouse model of CRC to human CRC. Our results suggest that both sets of classifiers can be used as CRC prognostic biomarkers and have the potential to be used for therapy selection in the future.

Supplementary Material

1

Highlights.

  • Molecular subtypes of colorectal cancer can be identified by immunohistochemistry.

  • Refinement of immunohistochemistry scoring enhances rigor.

  • Patient outcome and treatment response vary depending on molecular subtype.

  • Biomarkers that were identified in preclinical studies might enhance subtyping.

Funding Disclosure:

This work was supported by National Cancer Institute (R01CA194663, R01CA220004) as well as internal funding provided by Division of Gastroenterology and Hepatology, Department of Medicine, University of Wisconsin, The University of Wisconsin Carbone Cancer Center Precision Medicine Molecular Tumor Board and the Ride for Research. The Translational Science Biocore of Pathology and the University of Wisconsin Carbone Cancer Center are supported by Grant P30 CA014520.

Footnotes

Conflicts of interest statement: William M. Grady is an advisory board member for Freenome, Guardant Health, and SEngine and consults for DiaCarta, an investigator in a clinical trial sponsored by Janssen Pharmaceuticals, and receives research support from Tempus and from Pavmed technologies. Perry J. Pickhardt is an advisor for Bracco and Zebra and shareholder in SHINE, Elucent, and Cellectar.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

6. REFERENCES

  • 1. Xie Y-H, Chen Y-X, Fang J-Y. Comprehensive review of targeted therapy for colorectal cancer. Signal Transduction and Targeted Therapy. 2020;5: 1–30. doi: 10.1038/s41392-020-0116-z
  • 2. Budinska E, Popovici V, Tejpar S, D’Ario G, Lapique N, Sikora KO, et al. Gene expression patterns unveil a new level of molecular heterogeneity in colorectal cancer. J Pathol. 2013;231: 63–76. doi: 10.1002/path.4212
  • 3. Marisa L, de Reyniès A, Duval A, Selves J, Gaub MP, Vescovo L, et al. Gene Expression Classification of Colon Cancer into Molecular Subtypes: Characterization, Validation, and Prognostic Value. PLoS Med. 2013;10. doi: 10.1371/journal.pmed.1001453
  • 4. Roepman P, Schlicker A, Tabernero J, Majewski I, Tian S, Moreno V, et al. Colorectal cancer intrinsic subtypes predict chemotherapy benefit, deficient mismatch repair and epithelial-to-mesenchymal transition. Int J Cancer. 2014;134: 552–562. doi: 10.1002/ijc.28387
  • 5. Melo FDSE, Wang X, Jansen M, Fessler E, Trinh A, de Rooij LPMH, et al. Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nature Medicine. 2013;19: 614–618. doi: 10.1038/nm.3174
  • 6. Sadanandam A, Lyssiotis CA, Homicsko K, Collisson EA, Gibb WJ, Wullschleger S, et al. A colorectal cancer classification system that associates cellular phenotype and responses to therapy. Nature Medicine. 2013;19: 619–625. doi: 10.1038/nm.3175
  • 7. Schlicker A, Beran G, Chresta CM, McWalter G, Pritchard A, Weston S, et al. Subtypes of primary colorectal tumors correlate with response to targeted treatment in colorectal cell lines. BMC Med Genomics. 2012;5: 66. doi: 10.1186/1755-8794-5-66
  • 8. Guinney J, Dienstmann R, Wang X, de Reyniès A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nature Medicine. 2015;21: 1350–1356. doi: 10.1038/nm.3967
  • 9. Trinh A, Trumpi K, De Sousa E Melo F, Wang X, de Jong JH, Fessler E, et al. Practical and Robust Identification of Molecular Subtypes in Colorectal Cancer by Immunohistochemistry. Clin Cancer Res. 2017;23: 387–398. doi: 10.1158/1078-0432.CCR-16-0680
  • 10. Ten Hoorn S, Trinh A, de Jong J, Koens L, Vermeulen L. Classification of Colorectal Cancer in Molecular Subtypes by Immunohistochemistry. Methods Mol Biol. 2018;1765: 179–191. doi: 10.1007/978-1-4939-7765-9_11
  • 11. Chang K, Willis JA, Reumers J, Taggart MW, San Lucas FA, Thirumurthi S, et al. Colorectal premalignancy is associated with consensus molecular subtypes 1 and 2. Ann Oncol. 2018;29: 2061–2067. doi: 10.1093/annonc/mdy337
  • 12. Komor MA, Bosch LJ, Bounova G, Bolijn AS, Delis van Diemen PM, Rausch C, et al. Consensus molecular subtype classification of colorectal adenomas. J Pathol. 2018;246: 266–276. doi: 10.1002/path.5129
  • 13. Paul Olson TJ, Hadac JN, Sievers CK, Leystra AA, Deming DA, Zahm CD, et al. Dynamic tumor growth patterns in a novel murine model of colorectal cancer. Cancer Prev Res (Phila). 2014;7: 105–113. doi: 10.1158/1940-6207.CAPR-13-0163
  • 14. Sievers CK, Zou LS, Pickhardt PJ, Matkowskyj KA, Albrecht DM, Clipson L, et al. Subclonal diversity arises early even in small colorectal tumours and contributes to differential growth fates. Gut. 2017;66: 2132–2140. doi: 10.1136/gutjnl-2016-312232
  • 15. Hope C, Emmerich PB, Papadas A, Pagenkopf A, Matkowskyj KA, Van De Hey DR, et al. Versican-derived matrikines regulate Batf3-dendritic cell differentiation and promote T-cell infiltration in colorectal cancer. J Immunol. 2017;199: 1933–1941. doi: 10.4049/jimmunol.1700529
  • 16. Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016;44: e71–e71. doi: 10.1093/nar/gkv1507
  • 17. Friedman JH, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software. 2010;33: 1–22. doi: 10.18637/jss.v033.i01
  • 18. Simon N, Friedman JH, Hastie T, Tibshirani R. Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent. Journal of Statistical Software. 2011;39: 1–13. doi: 10.18637/jss.v039.i05
  • 19. Meinshausen N, Bühlmann P. Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2010;72: 417–473. doi: 10.1111/j.1467-9868.2010.00740.x
  • 20. Chandrashekar DS, Bashel B, Balasubramanya SAH, Creighton CJ, Ponce-Rodriguez I, Chakravarthi BVSK, et al. UALCAN: A Portal for Facilitating Tumor Subgroup Gene Expression and Survival Analyses. Neoplasia. 2017;19: 649–658. doi: 10.1016/j.neo.2017.05.002
  • 21. Chen F, Chandrashekar DS, Varambally S, Creighton CJ. Pan-cancer molecular subtypes revealed by mass-spectrometry-based proteomic characterization of more than 500 human cancers. Nature Communications. 2019;10: 1–15. doi: 10.1038/s41467-019-13528-0
  • 22. Jiang H, Lei R, Ding S-W, Zhu S. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics. 2014;15: 182. doi: 10.1186/1471-2105-15-182
  • 23. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio]. 2013. [cited 20 Apr 2021]. Available: http://arxiv.org/abs/1303.3997
  • 24. Girardot C, Scholtalbers J, Sauer S, Su S-Y, Furlong EEM. Je, a versatile suite to handle multiplexed NGS libraries with unique molecular identifiers. BMC Bioinformatics. 2016;17: 419. doi: 10.1186/s12859-016-1284-2
  • 25. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20: 1297–1303. doi: 10.1101/gr.107524.110
  • 26. Kim S, Scheffler K, Halpern AL, Bekritsky MA, Noh E, Källberg M, et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat Methods. 2018;15: 591–594. doi: 10.1038/s41592-018-0051-x
  • 27. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6: 80–92. doi: 10.4161/fly.19695
  • 28. Afgan E, Baker D, Batut B, van den Beek M, Bouvier D, Cech M, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 2018;46: W537–W544. doi: 10.1093/nar/gky379
  • 29. Aihara E, Engevik KA, Montrose MH. Trefoil Factor Peptides and Gastrointestinal Function. Annu Rev Physiol. 2017;79: 357–380. doi: 10.1146/annurev-physiol-021115-105447
  • 30. Lopes-Pacheco M. CFTR Modulators: Shedding Light on Precision Medicine for Cystic Fibrosis. Front Pharmacol. 2016;7. doi: 10.3389/fphar.2016.00275
  • 31. Morris JS, Luthra R, Liu Y, Duose DY, Lee W, Reddy NG, et al. Development and Validation of a Gene Signature Classifier for Consensus Molecular Subtyping of Colorectal Carcinoma in a CLIA-Certified Setting. Clin Cancer Res. 2021;27: 120–130.
  • 32. Ten Hoorn S, de Back TR, Sommeijer DW, Vermeulen L. Clinical Value of Consensus Molecular Subtypes in Colorectal Cancer: A Systematic Review and Meta-Analysis. J Natl Cancer Inst. 2021; djab106. doi: 10.1093/jnci/djab106
  • 33. Ye J, Xu J, Li Y, Huang Q, Huang J, Wang J, et al. DDAH1 mediates gastric cancer cell invasion and metastasis via Wnt/β-catenin signaling pathway. Mol Oncol. 2017;11: 1208–1224. doi: 10.1002/1878-0261.12089
  • 34. Shi B-Z, Hu P, Geng F, He P-J, Wu X-Z. Gal3ST-2 involved in tumor metastasis process by regulation of adhesion ability to selectins and expression of integrins. Biochem Biophys Res Commun. 2005;332: 934–940. doi: 10.1016/j.bbrc.2005.05.040
  • 35. Seko A, Nagata K, Yonezawa S, Yamashita K. Down-regulation of Gal 3-O-sulfotransferase-2 (Gal3ST-2) expression in human colonic non-mucinous adenocarcinoma. Jpn J Cancer Res. 2002;93: 507–515. doi: 10.1111/j.1349-7006.2002.tb01285.x
