Author manuscript; available in PMC: 2016 Jul 21.
Published in final edited form as: N Engl J Med. 2016 Jan 21;374(3):211–222. doi: 10.1056/NEJMoa1506597

CDX2 as a Prognostic Biomarker in Stage II and Stage III Colon Cancer

Piero Dalerba, Debashis Sahoo, Soonmyung Paik, Xiangqian Guo, Greg Yothers, Nan Song, Nate Wilcox-Fogel, Erna Forgó, Pradeep S Rajendran, Stephen P Miranda, Shigeo Hisamori, Jacqueline Hutchison, Tomer Kalisky, Dalong Qian, Norman Wolmark, George A Fisher, Matt van de Rijn, Michael F Clarke
PMCID: PMC4784450  NIHMSID: NIHMS763005  PMID: 26789870

Abstract

Background

The identification of high-risk stage II colon cancers is key to the selection of patients who require adjuvant treatment after surgery. Microarray-based multigene-expression signatures derived from stem cells and progenitor cells hold promise, but they are difficult to use in clinical practice.

Methods

We used a new bioinformatics approach to search for biomarkers of colon epithelial differentiation across gene-expression arrays and then ranked candidate genes according to the availability of clinical-grade diagnostic assays. With the use of subgroup analysis involving independent and retrospective cohorts of patients with stage II or stage III colon cancer, the top candidate gene was tested for its association with disease-free survival and a benefit from adjuvant chemotherapy.

Results

The transcription factor CDX2 ranked first in our screening test. A group of 87 of 2115 tumor samples (4.1%) lacked CDX2 expression. In the discovery data set, which included 466 patients, the rate of 5-year disease-free survival was lower among the 32 patients (6.9%) with CDX2-negative colon cancers than among the 434 (93.1%) with CDX2-positive colon cancers (hazard ratio for disease recurrence, 3.44; 95% confidence interval [CI], 1.60 to 7.38; P = 0.002). In the validation data set, which included 314 patients, the rate of 5-year disease-free survival was lower among the 38 patients (12.1%) with CDX2 protein–negative colon cancers than among the 276 (87.9%) with CDX2 protein–positive colon cancers (hazard ratio, 2.42; 95% CI, 1.36 to 4.29; P = 0.003). In both these groups, these findings were independent of the patient's age, sex, and tumor stage and grade. Among patients with stage II cancer, the difference in 5-year disease-free survival was significant both in the discovery data set (49% among 15 patients with CDX2-negative tumors vs. 87% among 191 patients with CDX2-positive tumors, P = 0.003) and in the validation data set (51% among 15 patients with CDX2-negative tumors vs. 80% among 106 patients with CDX2-positive tumors, P = 0.004). In a pooled database of all patient cohorts, the rate of 5-year disease-free survival was higher among 23 patients with stage II CDX2-negative tumors who were treated with adjuvant chemotherapy than among 25 who were not treated with adjuvant chemotherapy (91% vs. 56%, P = 0.006).

Conclusions

Lack of CDX2 expression identified a subgroup of patients with high-risk stage II colon cancer who appeared to benefit from adjuvant chemotherapy. (Funded by the National Comprehensive Cancer Network, the National Institutes of Health, and others.)


During the past decade, disease-free survival among patients with stage III colon cancer has increased significantly owing to the introduction of new adjuvant chemotherapy regimens.1-3 This therapeutic success, however, has not translated into longer disease-free survival among patients with earlier-stage (stage I or II) cancer.4 The lack of simple, reliable criteria for the identification of patients with early-stage disease who are at high risk for relapse has made it difficult to identify patients in whom the hazards of multiagent chemotherapy may be offset by benefits with respect to disease-specific survival.4-9

To address this problem, researchers have explored the possibility of stratifying patients with colon cancer according to the gene-expression profile of their tumor tissues, and they have developed multigene-expression signatures that can be used to identify high-risk colon cancers.10-15 Although gene-expression signatures hold promise, they are difficult to use in clinical practice16 and are often not predictive of benefit from adjuvant chemotherapy.17

Among the gene-expression signatures with the greatest promise are those derived from stem cells and progenitor cells.18,19 Therefore, we initiated a systematic search for a biomarker that could be used to identify undifferentiated tumors (i.e., tumors depleted of cells with a mature phenotype) by means of immunohistochemical analysis.

To perform this search, we adopted a bioinformatics approach using Boolean logic. This approach, which was designed to discover developmentally regulated genes,20,21 was used to identify genes with expression in colon cancer that was negatively linked to the activated leukocyte-cell adhesion molecule (ALCAM/CD166). This marker of immature colon epithelial cells is preferentially expressed at the bottom of colon crypts22,23 and on human colon-cancer cells with enriched tumorigenic capacity in mouse xenotransplantation models.24

This screening test led us to identify caudal-type homeobox transcription factor 2 (CDX2) as a candidate biomarker of mature colon epithelial tissues. Using subgroup analysis involving retrospective patient cohorts, we evaluated the association of this biomarker with 5-year disease-free survival and benefit from adjuvant chemotherapy among patients with colon cancer (Fig. 1).

Figure 1. Study Design.

A database containing 2329 human gene–expression arrays from both 214 normal colon tissue samples and 2115 colorectal-cancer tissue samples was mined to identify genes that fulfilled the “X-negative implies activated leukocyte-cell adhesion molecule (ALCAM)–positive” Boolean implication. The search yielded 16 candidate genes, of which only 1 (CDX2) encoded a clinically actionable biomarker. The association between CDX2 expression and disease-free survival was tested in two independent patient cohorts: a discovery data set (National Center for Biotechnology Information Gene Expression Omnibus [NCBI-GEO]) and a validation data set (Cancer Diagnosis Program of the National Cancer Institute [NCI-CDP]). The association between CDX2 expression and benefit from adjuvant chemotherapy was tested in a pooled database of 669 patients with stage II disease and 1228 patients with stage III disease from four independent data sets (NCBI-GEO, NCI-CDP, National Surgical Adjuvant Breast and Bowel Project [NSABP] C-07 trial [NSABP C-07], and the Stanford Tissue Microarray Database [TMAD]).

Methods

Bioinformatics Analysis of Gene-Expression Array Databases

We searched for genes that fulfilled the “X-negative implies ALCAM-positive” Boolean relationship in a collection of 2329 human colon gene-expression array experiments (Fig. S1 in Supplementary Appendix 1, available with the full text of this article at NEJM.org). This collection was downloaded from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) repository (www.ncbi.nlm.nih.gov/geo). The search was conducted with the use of BooleanNet software20 with a false discovery rate of less than 0.0001 as a cutoff point for positive results (Fig. S2 in Supplementary Appendix 1). Candidate genes were ranked according to the dynamic range of their expression levels (Fig. S3 in Supplementary Appendix 1).
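As an illustration of what an "X-negative implies ALCAM-positive" relationship means in practice, the following is a minimal sketch that tests whether the quadrant "X low and ALCAM low" is sparse in a set of samples. It is not the BooleanNet implementation: the function name, the sparseness statistic, and the two cutoffs are assumptions chosen for illustration, and the false-discovery-rate estimate used in the study is not reproduced.

```python
import math


def boolean_implication_low_high(x_high, alcam_high,
                                 stat_cutoff=3.0, err_cutoff=0.1):
    """Check "X low implies ALCAM high" on samples that are already thresholded.

    x_high, alcam_high: equal-length lists of booleans (True = high/positive).
    The implication is supported when the (X low, ALCAM low) quadrant holds
    far fewer samples than expected if the two genes were independent.
    The cutoffs are illustrative assumptions, not the study's settings.
    Returns (holds, statistic, error_rate).
    """
    n = len(x_high)
    # Samples that contradict the implication: X low AND ALCAM low.
    a00 = sum(1 for x, a in zip(x_high, alcam_high) if not x and not a)
    x_low = sum(1 for x in x_high if not x)
    alcam_low = sum(1 for a in alcam_high if not a)
    if x_low == 0 or alcam_low == 0:
        # No low samples for one gene: the relationship cannot be assessed.
        return False, 0.0, 0.0
    # Expected size of the quadrant under independence.
    expected = x_low * alcam_low / n
    statistic = (expected - a00) / math.sqrt(expected)
    # Average fraction of "low" samples that fall in the contradicting quadrant.
    error_rate = 0.5 * (a00 / x_low + a00 / alcam_low)
    holds = statistic > stat_cutoff and error_rate < err_cutoff
    return holds, statistic, error_rate
```

With 40 samples that are X-low and ALCAM-high, 80 that are high for both, 80 that are X-high and ALCAM-low, and none that are low for both, the contradicting quadrant is empty and the sketch reports that the implication holds. This mirrors the three gene-expression groups described for CDX2 and ALCAM.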

The relationship between CDX2 expression levels and other molecular features such as microsatellite instability and TP53 mutations was studied in ad hoc collections annotated with the respective information after tumor samples were stratified into CDX2-negative and CDX2-positive subgroups with the use of the StepMiner algorithm25 (Fig. S4 and S5 in Supplementary Appendix 1). The relationship between CDX2 messenger RNA (mRNA) expression levels or ALCAM mRNA expression levels and disease-free survival was tested in a discovery data set of 466 patients. We obtained this data set by pooling four NCBI-GEO data sets (GSE14333, GSE17538, GSE31595, and GSE37892) (Fig. S6 in Supplementary Appendix 1).12,13,26,27 Patients were stratified into negative-to-low (negative) and high (positive) subgroups with regard to CDX2 and ALCAM gene-expression levels with the use of the StepMiner algorithm, implemented within the Hegemon21 software (Fig. S7 through S10 in Supplementary Appendix 1).
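The idea of stratifying samples by fitting a step to sorted expression values can be sketched as follows. This is a simplified illustration in the spirit of a step-fitting approach, not the StepMiner or Hegemon code: the function names are hypothetical, and details of the published algorithm (such as its significance testing) are omitted.

```python
def step_threshold(values):
    """Fit a one-step function to sorted values and return a cut point.

    Every split of the sorted values into a low and a high segment is tried;
    the split with the smallest total squared error around the two segment
    means is kept, and the midpoint of those two means is returned.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    best_sse = None
    best_cut = None
    for i in range(1, n):
        low, high = xs[:i], xs[i:]
        m_low = sum(low) / len(low)
        m_high = sum(high) / len(high)
        sse = (sum((v - m_low) ** 2 for v in low)
               + sum((v - m_high) ** 2 for v in high))
        if best_sse is None or sse < best_sse:
            best_sse = sse
            best_cut = (m_low + m_high) / 2
    return best_cut


def stratify(values, threshold):
    """Label each value as 'positive' (above the cut) or 'negative'."""
    return ["positive" if v > threshold else "negative" for v in values]
```

For expression values that cluster near 1 and near 9, the fitted cut falls midway between the two clusters, and each sample is then labeled negative or positive relative to that cut.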

An in-depth description of all bioinformatics procedures used in this study is provided in Supplementary Appendix 1. Complete lists of all NCBI-GEO sample number identifiers of individual gene-expression array experiments that were used to perform the various tests are provided in Tables S1 through S5 in Supplementary Appendix 1, Supplementary Appendix 2, Supplementary Appendix 3, Supplementary Appendix 4, and Supplementary Appendix 5, respectively.

Immunohistochemical Testing

Formalin-fixed, paraffin-embedded tissue sections were stained with 4 μg per milliliter of a mouse antihuman CDX2 monoclonal antibody that was previously validated for diagnostic applications (clone CDX2-88, BioGenex).28,29 The staining protocol was based on recommendations from the Nordic Immunohistochemical Quality Control organization (www.nordiqc.org), which suggests heat-induced antigen retrieval with Tris buffer and EDTA (pH 9.0) (Epitope Retrieval Solution pH9, Leica).30 Tissue slides were stained on a Bond-Max automatic stainer (Leica), and antigen detection was visualized with the use of the Bond Polymer Refine Detection kit (Leica).

Analysis of Tissue Microarrays

Colon-cancer tissue microarrays, fully annotated with clinical and pathological information, were obtained from three independent sources: 367 patients in the Cancer Diagnosis Program of the National Cancer Institute (NCI-CDP), 1519 patients in the National Surgical Adjuvant Breast and Bowel Project (NSABP) C-07 trial (NSABP C-07), and 321 patients in the Stanford Tissue Microarray Database (Stanford TMAD). A detailed description of the patient cohorts represented in each tissue microarray and of the scoring system used to evaluate CDX2 expression is provided in Figures S11 through S14 in Supplementary Appendix 1.

All tissue microarrays were scored for CDX2 expression in a blinded fashion. In cases in which tissue microarrays contained two tissue cores for a patient (i.e., two samples from distinct areas of the same tumor), the two cores were scored independently and paired at the end. If scores for the two samples were discordant, the final score for the tumor was upgraded to the higher score. All tumors in which the malignant epithelial component showed widespread nuclear expression of CDX2, either in all or a majority of cancer cells, were scored as CDX2-positive. All tumors in which the malignant epithelial component either completely lacked CDX2 expression or showed faint nuclear expression in a minority of malignant epithelial cells were scored as CDX2-negative.
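The rule for combining two cores from the same tumor can be written as a short sketch. The numeric scale used here (0 = no expression, 1 = faint nuclear expression in a minority of cells, 2 = widespread nuclear expression) is a hypothetical encoding introduced only for illustration; the actual scoring system is the one described in the Supplementary Appendix.

```python
def final_tumor_score(core_scores):
    """Combine per-core CDX2 scores for one tumor.

    core_scores: list with one or two entries; each entry is an int on the
    hypothetical 0-2 scale, or None if that core could not be scored.
    Discordant cores resolve to the higher score.
    Returns None when no core could be scored.
    """
    scored = [s for s in core_scores if s is not None]
    return max(scored) if scored else None


def cdx2_status(score):
    """Map a final score to CDX2 status (hypothetical scale: 2 = positive)."""
    if score is None:
        return "not evaluable"
    return "positive" if score >= 2 else "negative"
```

Under this encoding, a tumor with one core scored 0 and the other scored 2 is classified as CDX2-positive, whereas a tumor with cores scored 1 and 0 is classified as CDX2-negative.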

The concordance between the scoring results obtained by two independent investigators was evaluated with the use of contingency tables and by calculation of Cohen's kappa indexes (Fig. S15 in Supplementary Appendix 1). The association between CDX2 expression and survival outcomes was tested by a third investigator who did not participate in the scoring process.
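For readers unfamiliar with Cohen's kappa, the following minimal sketch computes it from a contingency table of two raters' scores. The function name and the example counts are illustrative and are not taken from the study.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square contingency table given as a list of rows.

    table[i][j] is the number of tumors placed in category i by the first
    rater and in category j by the second rater.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: fraction of tumors on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)
```

With the made-up table `[[20, 5], [10, 15]]`, observed agreement is 0.70, chance agreement is 0.50, and kappa is 0.40. A table with all counts on the diagonal gives a kappa of 1.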

Statistical Analysis

Patient subgroups were compared with respect to survival outcomes with the use of Kaplan–Meier curves, log-rank tests, and multivariate analyses based on the Cox proportional-hazards method. Differences in the frequency of CDX2-negative cancers across different subgroups were compared with the use of Pearson's chi-square test and by computation of odds ratios together with their 95% confidence intervals. Interactions between the biomarker (CDX2 status) and adjuvant chemotherapy were evaluated with the use of the Cox proportional-hazards method in a 2-by-2 factorial design (i.e., by testing for the presence of an interaction factor between the hazard rates of the two variables).
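The Kaplan–Meier estimate that underlies the reported 5-year disease-free survival rates can be sketched in a few lines. This is a generic textbook version with hypothetical function names, not the software used in the study, and it omits confidence intervals and the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time for each patient.
    events: True if recurrence or death was observed at that time,
            False if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t from a curve."""
    prob = 1.0
    for time, s in curve:
        if time <= t:
            prob = s
        else:
            break
    return prob
```

For five made-up patients with follow-up times of 1 to 5 years, with events at years 1, 3, and 5 and censoring at years 2 and 4, the estimate drops to 0.80 after year 1 and to about 0.53 after year 3. Calling `survival_at(curve, 5)` on a real cohort would give the 5-year rate.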

Results

Identification of CDX2

The first aim of this study was to identify an actionable biomarker of poorly differentiated colon cancers (i.e., tumors depleted of mature colon epithelial cells). An actionable biomarker is one for which a clinical-grade diagnostic test has already been developed. Using a software algorithm designed for the discovery of genes with expression patterns that are linked by Boolean relationships (BooleanNet),20 we mined a database of 2329 human colon gene-expression array experiments, searching for genes that fulfilled the “X-negative implies ALCAM-positive” Boolean implication (i.e., genes with expression that was, at the same time, absent only in ALCAM-positive tumors and always present in ALCAM-negative tumors) (Fig. S2 in Supplementary Appendix 1).

The search led to the identification of 16 candidate genes (Fig. S3 in Supplementary Appendix 1). Of these genes, only 1 gene encoded a protein that could be studied by means of immunohistochemical analysis with the use of a clinical-grade diagnostic test: the homeobox transcription factor CDX2.28,29,31 CDX2 is a master regulator of intestinal development and oncogenesis,32,33 and its expression is highly specific to the intestinal epithelium.29 Colon cancers without CDX2 expression are often associated with an increased likelihood of aggressive features such as advanced stage, poor differentiation, vascular invasion, BRAF mutation, and the CpG island methylator phenotype (CIMP).34-39

A detailed analysis of the gene-expression relationship between CDX2 and ALCAM confirmed the existence of three gene-expression groups: CDX2-negative and ALCAM-positive, CDX2-positive and ALCAM-positive, and CDX2-positive and ALCAM-negative (Fig. S2 in Supplementary Appendix 1). Lack of CDX2 expression was restricted to a small subgroup of 87 of 2115 colorectal cancers (4.1%). This subgroup was characterized by high levels of ALCAM expression (Fig. S3 in Supplementary Appendix 1) and only partial overlap with tumors defined by microsatellite instability or TP53 mutations (Fig. S4 and S5 in Supplementary Appendix 1). We thus proceeded to evaluate the association between CDX2 expression and disease-free survival in two independent patient data sets: the NCBI-GEO discovery data set and the NCI-CDP validation data set.

CDX2 Expression and Disease-free Survival in the NCBI-GEO Discovery Data Set

To evaluate the association between CDX2 expression and disease-free survival among patients in the NCBI-GEO discovery data set, we used the StepMiner algorithm to stratify the population of 466 patients into CDX2-negative and CDX2-positive subgroups and then used Kaplan–Meier curves to compare the disease-free survival of the two subgroups (Fig. 2). The analysis showed that the rate of 5-year disease-free survival was lower among the 32 patients (6.9%) with CDX2-negative tumors than among the 434 (93.1%) with CDX2-positive tumors (41% vs. 74%, P<0.001). In a multivariate analysis that excluded age, sex, and tumor stage as confounding variables, the hazard ratio for disease recurrence among patients with CDX2-negative versus CDX2-positive tumors was 2.73 (95% confidence interval [CI], 1.58 to 4.72; P<0.001).

Figure 2. Relationship between CDX2 Expression and Disease-free Survival in the NCBI-GEO Discovery Data Set.

Analysis of CDX2 messenger RNA (mRNA) expression in the NCBI-GEO discovery data set revealed the presence of a minority subgroup of CDX2-negative colon cancers that were characterized by high ALCAM mRNA expression levels (Panel A) and that were associated with a lower rate of 5-year disease-free survival than CDX2-positive colon cancers (Panel B). In Panel A, each circle in the scatter plot represents one patient sample. The association between CDX2-negative cancers and a lower rate of disease-free survival remained significant in a multivariate analysis that excluded tumor stage, tumor grade, age, and sex as confounding variables (Panel C).

Within the NCBI-GEO discovery data set, data on only 216 patients were annotated with information on pathological grade (Table S1 in Supplementary Appendix 1). A multivariate analysis that was restricted to these 216 patients showed that CDX2-negative tumors were associated with a higher risk of recurrence than CDX2-positive ones (hazard ratio, 3.44; 95% CI, 1.60 to 7.38; P = 0.002); the hazard ratio associated with the CDX2 status was higher than that associated with increasing pathological grade (hazard ratio, 0.99; 95% CI, 0.56 to 1.74; P = 0.96).

High levels of ALCAM expression had previously been shown to be associated with worse clinical outcomes.23 Moreover, in the NCBI-GEO discovery data set, the rate of 5-year disease-free survival associated with ALCAM-positive tumors was moderately but significantly lower than that associated with ALCAM-negative ones (67% vs. 78%, P = 0.048) (Fig. S7 in Supplementary Appendix 1). Therefore, we evaluated whether the association between CDX2-negative tumors and a lower rate of disease-free survival could be explained by the fact that most CDX2-negative tumors were also ALCAM-positive. To this end, we used Hegemon software21 to stratify the NCBI-GEO population into three subgroups (CDX2-negative and ALCAM-positive, CDX2-positive and ALCAM-positive, and CDX2-positive and ALCAM-negative) and then compared their clinical outcomes (Fig. S8 and S9 in Supplementary Appendix 1).

The results showed that CDX2-negative and ALCAM-positive tumors were associated with a lower rate of 5-year disease-free survival than either of the other two subgroups (CDX2-positive and ALCAM-positive tumors, and CDX2-positive and ALCAM-negative tumors). A similar set of tests also indicated that when compared side by side with the use of multivariate analysis, the hazard ratios for disease recurrence associated with the CDX2 and ALCAM grouping system were higher than those associated with the “intestinal stem-cell” gene-expression signature19 (Fig. S10 in Supplementary Appendix 1).

CDX2 Expression and Disease-free Survival in the NCI-CDP Validation Data Set

To evaluate the robustness of our findings, we decided to test whether they could be reproduced in an independent data set,40 and we chose to analyze a human colon-cancer tissue microarray obtained from the NCI-CDP. This microarray was explicitly designed to contain a balanced distribution of patients with and without tumor recurrence, as well as a relatively homogeneous long-term follow-up, with the aim of maximizing the statistical power to find associations between biomarkers and clinical outcomes.

To evaluate CDX2 protein expression, we used immunohistochemical analysis with an anti-CDX2 monoclonal antibody that had previously been validated for diagnostic purposes.28,29 Analysis of stained sections confirmed the presence of a minority subgroup of cancers that lacked expression of CDX2 protein in malignant epithelial cells, as compared with the majority of samples that had intense nuclear staining (Fig. 3). On the basis of these results, we stratified the patient cohort into two subgroups: CDX2-negative (48 of 366 patients [13%]) and CDX2-positive (318 of 366 patients [87%]). A description of the scoring system and its performance in terms of interobserver agreement is provided in Figures S14 and S15 in Supplementary Appendix 1.

Figure 3. Relationship between CDX2 Protein Expression and Disease-free Survival in the NCI-CDP Validation Data Set.

Analysis of CDX2 protein expression in the NCI-CDP validation data set confirmed the existence of a minority subgroup of CDX2-negative cancers (Panel A) that lacked the distinctive CDX2 nuclear expression that is characteristic of epithelial cancer cells in the majority of colon cancers (Panel B). CDX2-negative cancers were associated with a lower rate of 5-year disease-free survival than CDX2-positive cancers (Panel C). The association between the absence of CDX2 expression and a lower rate of 5-year disease-free survival was confirmed by means of a multivariate analysis (based on the Cox proportional-hazards method) that excluded tumor stage, tumor grade, age, and sex as confounding variables (Panel D). CDX2-negative tumors were associated with a lower rate of survival independent of their subclassification with regard to low or intermediate (G1 or G2) or high (G3 or G4) pathological grade (Panel E).

CDX2-negative tumors were associated with a worse prognosis than were CDX2-positive tumors, with lower rates of 5-year disease-free survival (48% vs. 71%, P<0.001) (Fig. 3), overall survival (33% vs. 59%, P<0.001) (Fig. S16 in Supplementary Appendix 1), and disease-specific survival (45% vs. 72%, P<0.001) (Fig. S16 in Supplementary Appendix 1). The association remained significant in multivariate analyses that excluded age, sex, tumor stage, and tumor grade as confounding variables: in the analysis of disease-free survival, the hazard ratio for disease recurrence associated with CDX2-negative tumors as compared with CDX2-positive tumors was 2.42 (95% CI, 1.36 to 4.29; P = 0.003); in the analysis of overall survival, the hazard ratio for death was 1.79 (95% CI, 1.18 to 2.71; P = 0.006); and in the analysis of disease-specific survival, the hazard ratio for death was 2.09 (95% CI, 1.22 to 3.58; P = 0.007).

CDX2-negative status was more common among tumors with a high pathological grade (Fig. S17 in Supplementary Appendix 1). However, CDX2-negative tumors were associated with a lower rate of survival irrespective of their low or intermediate (G1 or G2) or high (G3 or G4) pathological grade — a finding consistent with the results of the multivariate analysis (Fig. 3, and Fig. S17 in Supplementary Appendix 1).

CDX2 Expression and Survival among Patients with Stage II Disease

To evaluate our findings with respect to the prognosis among patients with early-stage colon cancer, we decided to study the association between the CDX2-negative phenotype, assessed at either the mRNA or protein level, and disease-free survival among patients with stage II disease. Stage II CDX2-negative tumors were associated with a lower rate of 5-year disease-free survival than were stage II CDX2-positive tumors in both the NCBI-GEO discovery data set (49% vs. 87%, P = 0.003) (Fig. 4) and the NCI-CDP validation data set (51% vs. 80%, P = 0.004) (Fig. 4).

Figure 4. Relationship between CDX2 Expression and Disease-free Survival among Patients with Stage II Disease.

In both the NCBI-GEO discovery data set (Panel A) and the NCI-CDP validation data set (Panel B), CDX2-negative cancers were associated with a lower rate of 5-year disease-free survival than CDX2-positive cancers.

We found similar associations with respect to overall survival (40% among patients with CDX2-negative tumors vs. 70% among those with CDX2-positive tumors, P<0.001) (Fig. S18 in Supplementary Appendix 1) and disease-specific survival (66% vs. 89%, P = 0.005) (Fig. S18 in Supplementary Appendix 1). These associations were not confounded by risk factors that are known to affect survival rates among patients with stage II colon cancer, such as the depth of invasion of the primary tumor (T3 vs. T4) (Fig. S19 in Supplementary Appendix 1) and the number of lymph nodes resected at surgery (≥12 vs. <12) (Fig. S19 in Supplementary Appendix 1). However, in each of the two data sets, only 15 patients with stage II CDX2-negative disease were identified.

CDX2 Expression and Benefit from Adjuvant Chemotherapy

To evaluate whether patients with CDX2-negative tumors might benefit from adjuvant chemotherapy, we investigated the association between CDX2 status, assessed at either the mRNA or protein level, and disease-free survival among patients who either did or did not receive adjuvant chemotherapy. A preliminary test involving cohorts of patients with stage III disease in both the discovery and validation data sets suggested a strong association between the use of adjuvant chemotherapy and a higher rate of disease-free survival in the CDX2-negative subgroups (Fig. S20 in Supplementary Appendix 1).

We thus decided to validate this observation in an expanded database of 669 patients with stage II colon cancer and 1228 patients with stage III colon cancer. We obtained this database by pooling data from four independent patient cohorts (NCBI-GEO, NCI-CDP, NSABP C-07, and Stanford TMAD); these data were annotated with information about adjuvant chemotherapy (Fig. 1). A detailed description of all patient data sets used for this experiment is provided in Figure S6 and Figures S11, S12, and S13 in Supplementary Appendix 1.

The results confirmed that treatment with adjuvant chemotherapy was associated with a higher rate of disease-free survival in both the stage II subgroup (91% with chemotherapy vs. 56% with no chemotherapy, P = 0.006) and the stage III subgroup (74% with chemotherapy vs. 37% with no chemotherapy, P<0.001) of the CDX2-negative patient population (Fig. 5). A test for the interaction between the biomarker and the treatment revealed that the benefit observed in CDX2-negative cohorts was superior to that observed in CDX2-positive cohorts in both the stage II subgroup (P = 0.02 for the interaction) and the stage III subgroup (P = 0.005 for the interaction). The association between CDX2-negative status and benefit from adjuvant chemotherapy was not confounded by risk factors that are known to affect the survival rates among patients with stage II and stage III disease. These risk factors include the depth of invasion of the primary tumor (T3 vs. T4), the number of lymph nodes resected at surgery (≥12 vs. <12), and the number of metastatic lymph nodes (N1 vs. N2) (Figs. S21 through S24 in Supplementary Appendix 1).

Figure 5. Relationship between CDX2 Expression and Benefit from Adjuvant Chemotherapy.

The relationship between CDX2 expression and benefit from adjuvant chemotherapy was evaluated in a pooled database of 669 patients with stage II disease (Panel A) and 1228 patients with stage III disease (Panel B) from four independent data sets (NCBI-GEO, NCI-CDP, NSABP C-07, and Stanford TMAD). Among all patients with stage II disease in the entire database, treatment with adjuvant chemotherapy was not associated with a higher rate of 5-year disease-free survival. However, treatment with adjuvant chemotherapy was strongly associated with a higher rate of 5-year disease-free survival in the CDX2-negative subgroup, but it was not associated with a higher rate of 5-year disease-free survival in the CDX2-positive subgroup. Among patients with stage III disease, treatment with adjuvant chemotherapy was associated with a higher rate of 5-year disease-free survival in the entire database and in both the CDX2-negative and CDX2-positive subgroups. A test for an interaction between the biomarker and the treatment indicated that in both stage II and stage III disease, the benefit associated with adjuvant chemotherapy was greater among CDX2-negative patients than among CDX2-positive patients.

Discussion

Prognostic biomarkers are key to the risk stratification of patients with colon cancer and the decision to recommend adjuvant chemotherapy in patients with early-stage disease.6 Currently, tumor stage, tumor grade, and microsatellite instability remain the most important among a handful of prognostic variables that are considered in the development of algorithms for the treatment of patients with early-stage colon cancer.5,9 Prognostic variables such as lymphovascular invasion by cancer cells and perineural invasion by cancer cells, though very promising, have proved difficult to standardize because of technical problems inherent in the visual analysis and subjective definition of these features.6 Microarray-derived gene-expression signatures from stem cells and progenitor cells have also shown promise,19 but they are often difficult to translate into clinical tests.16 Overall, it has proved difficult to identify a prognostic biomarker that is also predictive of benefit from adjuvant chemotherapy.7,8,17

In this study, we applied a bioinformatics approach to the discovery of prognostic biomarkers in human cancer. We assembled a large database of gene-expression array experiments involving human colorectal cancers and searched for genes with differential expression, defined by a Boolean relationship with respect to a well-established differentiation marker, across the patient population. The concept behind this approach was that genes associated with differentiation processes (e.g., transcription factors involved in the regulation of stem-cell self-renewal, lineage commitment, or both) could be identified as single prognostic biomarkers that could be used to stratify tumors on the basis of a molecular definition of their differentiation status and to recapitulate the prognostic information contained in complex multigene-expression signatures obtained from populations of stem cells and progenitor cells.

Using this approach, we identified CDX2 as a biomarker with expression that has been found to be absent in a minority subgroup of colon cancers that are characterized by high levels of ALCAM, a molecule that is expressed at the highest levels at the bottom of human colonic crypts22,23 and on human colon-cancer cells with enriched tumorigenic capacity in mouse xenotransplantation models.24 We then performed a test to determine whether CDX2-negative cancers might be associated with a worse prognosis. The results revealed that without adjuvant chemotherapy, CDX2-negative tumors were associated with a lower rate of disease-free survival than CDX2-positive tumors across independent data sets. This effect was independent of many known risk factors, including pathological grade.

Previous studies had indicated that CDX2-negative tumors are often associated with several adverse prognostic variables (e.g., advanced stage, poor differentiation, vascular invasion, BRAF mutation, and CIMP-positive status).31,35-38 We hypothesize that the prognostic effect associated with an absence of CDX2 expression could be explained by its aggregate capacity to function as a single biomarker for multiple biologic risk factors, under the common theme of a highly immature progenitor-cell phenotype.

Finally, our results indicate that patients with stage II or stage III CDX2-negative colon cancer might benefit from adjuvant chemotherapy and that adjuvant chemotherapy might be a treatment option for patients with stage II CDX2-negative disease, who are commonly treated with surgery alone. Given the exploratory and retrospective design of our study, these results will need to be further validated. We advocate for these findings to be confirmed within the framework of randomized clinical trials, in conjunction with genomic DNA sequencing studies.

Supplementary Material

Supplement1
Supplement2
Supplement3
Supplement4
Supplement5

Acknowledgments

Supported by an NCCN 2012 Young Investigator Award (to Dr. Dalerba); National Institutes of Health (NIH) grants (U54-CA126524 and P01-CA139490, to Dr. Clarke, and R00-CA151673, to Dr. Sahoo); a grant from the Siebel Stem Cell Institute and the Thomas and Stacey Siebel Foundation (to Drs. Dalerba and Sahoo); a grant from the Virginia and D.K. Ludwig Fund for Cancer Research (to Dr. Clarke); a California Institute for Regenerative Medicine training grant (to Dr. Dalerba); a Department of Defense grant (W81XWH-10-1-0500, to Dr. Sahoo); a Bladder Cancer Advocacy Network 2013 Young Investigator Award (to Dr. Sahoo); and a BD Biosciences 2011 Stem Cell Research Grant (to Dr. Dalerba). The National Surgical Adjuvant Breast and Bowel Project was supported by NIH grants U24-CA114732, U10-CA37377, U10-CA180868, U10-CA180822, UG1-CA189867, and U24-CA196067. Some of the tissue microarrays used in this study were provided by the Cooperative Human Tissue Network and the Cancer Diagnosis Program, which are funded by the National Cancer Institute. Other investigators may have received slides from these same array blocks.

We thank Edward Gilbert, Chona Enrile, Ivy Mangonon, Marissa Palmor, Coralie Donkers, and Darius M. Johnston, all of Stanford University, for help and support during various phases of the study.

Footnotes

All opinions, findings, and conclusions expressed in this article are those of the authors and do not necessarily reflect those of the National Comprehensive Cancer Network (NCCN) or the NCCN Foundation.

Disclosure forms provided by the authors are available with the full text of this article at NEJM.org.

References

  • 1. André T, Boni C, Mounedji-Boudiaf L, et al. Oxaliplatin, fluorouracil, and leucovorin as adjuvant treatment for colon cancer. N Engl J Med. 2004;350:2343–51. doi: 10.1056/NEJMoa032709.
  • 2. Meyerhardt JA, Mayer RJ. Systemic therapy for colorectal cancer. N Engl J Med. 2005;352:476–87. doi: 10.1056/NEJMra040958.
  • 3. Saltz LB, Cox JV, Blanke C, et al. Irinotecan plus fluorouracil and leucovorin for metastatic colorectal cancer. N Engl J Med. 2000;343:905–14. doi: 10.1056/NEJM200009283431302.
  • 4. O'Connor ES, Greenblatt DY, LoConte NK, et al. Adjuvant chemotherapy for stage II colon cancer with poor prognostic features. J Clin Oncol. 2011;29:3381–8. doi: 10.1200/JCO.2010.34.3426.
  • 5. Bardia A, Loprinzi C, Grothey A, et al. Adjuvant chemotherapy for resected stage II and III colon cancer: comparison of two widely used prognostic calculators. Semin Oncol. 2010;37:39–46. doi: 10.1053/j.seminoncol.2009.12.005.
  • 6. Compton C, Fenoglio-Preiser CM, Pettigrew N, Fielding LP. American Joint Committee on Cancer Prognostic Factors Consensus Conference: Colorectal Working Group. Cancer. 2000;88:1739–57. doi: 10.1002/(sici)1097-0142(20000401)88:7<1739::aid-cncr30>3.0.co;2-t.
  • 7. Gill S, Loprinzi CL, Sargent DJ, et al. Pooled analysis of fluorouracil-based adjuvant therapy for stage II and III colon cancer: who benefits and by how much? J Clin Oncol. 2004;22:1797–806. doi: 10.1200/JCO.2004.09.059.
  • 8. Meropol NJ. Ongoing challenge of stage II colon cancer. J Clin Oncol. 2011;29:3346–8. doi: 10.1200/JCO.2011.35.4571.
  • 9. Tournigand C, de Gramont A. Chemotherapy: is adjuvant chemotherapy an option for stage II colon cancer? Nat Rev Clin Oncol. 2011;8:574–6. doi: 10.1038/nrclinonc.2011.139.
  • 10. Barrier A, Boelle PY, Roser F, et al. Stage II colon cancer prognosis prediction by tumor gene expression profiling. J Clin Oncol. 2006;24:4685–91. doi: 10.1200/JCO.2005.05.0229.
  • 11. Wang Y, Jatkoe T, Zhang Y, et al. Gene expression profiles and molecular markers to predict recurrence of Dukes' B colon cancer. J Clin Oncol. 2004;22:1564–71. doi: 10.1200/JCO.2004.08.186.
  • 12. Jorissen RN, Gibbs P, Christie M, et al. Metastasis-associated gene expression changes predict poor outcomes in patients with Dukes stage B and C colorectal cancer. Clin Cancer Res. 2009;15:7642–51. doi: 10.1158/1078-0432.CCR-09-1431.
  • 13. Smith JJ, Deane NG, Wu F, et al. Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology. 2010;138:958–68. doi: 10.1053/j.gastro.2009.11.005.
  • 14. Yothers G, O'Connell MJ, Lee M, et al. Validation of the 12-gene colon cancer recurrence score in NSABP C-07 as a predictor of recurrence in patients with stage II and III colon cancer treated with fluorouracil and leucovorin (FU/LV) and FU/LV plus oxaliplatin. J Clin Oncol. 2013;31:4512–9. doi: 10.1200/JCO.2012.47.3116.
  • 15. Fang SH, Efron JE, Berho ME, Wexner SD. Dilemma of stage II colon cancer and decision making for adjuvant chemotherapy. J Am Coll Surg. 2014;219:1056–69. doi: 10.1016/j.jamcollsurg.2014.09.010.
  • 16. Gröne J, Lenze D, Jurinovic V, et al. Molecular profiles and clinical outcome of stage UICC II colon cancer patients. Int J Colorectal Dis. 2011;26:847–58. doi: 10.1007/s00384-011-1176-x.
  • 17. National Comprehensive Cancer Network. Clinical practice guidelines in oncology — colon cancer, version 3. 2015 http://www.nccn.org.
  • 18. Liu R, Wang X, Chen GY, et al. The prognostic role of a gene signature from tumorigenic breast-cancer cells. N Engl J Med. 2007;356:217–26. doi: 10.1056/NEJMoa063994.
  • 19. Merlos-Suárez A, Barriga FM, Jung P, et al. The intestinal stem cell signature identifies colorectal cancer stem cells and predicts disease relapse. Cell Stem Cell. 2011;8:511–24. doi: 10.1016/j.stem.2011.02.020.
  • 20. Sahoo D, Dill DL, Gentles AJ, Tibshirani R, Plevritis SK. Boolean implication networks derived from large scale, whole genome microarray datasets. Genome Biol. 2008;9:R157. doi: 10.1186/gb-2008-9-10-r157.
  • 21. Dalerba P, Kalisky T, Sahoo D, et al. Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nat Biotechnol. 2011;29:1120–7. doi: 10.1038/nbt.2038.
  • 22. Levin TG, Powell AE, Davies PS, et al. Characterization of the intestinal cancer stem cell marker CD166 in the human and mouse gastrointestinal tract. Gastroenterology. 2010;139(6):2072–2082.e5. doi: 10.1053/j.gastro.2010.08.053.
  • 23. Weichert W, Knösel T, Bellach J, Dietel M, Kristiansen G. ALCAM/CD166 is over-expressed in colorectal carcinoma and correlates with shortened patient survival. J Clin Pathol. 2004;57:1160–4. doi: 10.1136/jcp.2004.016238.
  • 24. Dalerba P, Dylla SJ, Park IK, et al. Phenotypic characterization of human colorectal cancer stem cells. Proc Natl Acad Sci U S A. 2007;104:10158–63. doi: 10.1073/pnas.0703478104.
  • 25. Sahoo D, Dill DL, Tibshirani R, Plevritis SK. Extracting binary signals from microarray time-course data. Nucleic Acids Res. 2007;35:3705–12. doi: 10.1093/nar/gkm284.
  • 26. Thorsteinsson M, Kirkeby LT, Hansen R, et al. Gene expression profiles in stages II and III colon cancers: application of a 128-gene signature. Int J Colorectal Dis. 2012;27:1579–86. doi: 10.1007/s00384-012-1517-4.
  • 27. Laibe S, Lagarde A, Ferrari A, Monges G, Birnbaum D, Olschwang S. A seven-gene signature aggregates a subgroup of stage II colon cancers with stage III. OMICS. 2012;16:560–5. doi: 10.1089/omi.2012.0039.
  • 28. Li MK, Folpe AL. CDX-2, a new marker for adenocarcinoma of gastrointestinal origin. Adv Anat Pathol. 2004;11:101–5. doi: 10.1097/00125480-200403000-00004.
  • 29. Werling RW, Yaziji H, Bacchi CE, Gown AM. CDX2, a highly sensitive and specific marker of adenocarcinomas of intestinal origin: an immunohistochemical survey of 476 primary and metastatic carcinomas. Am J Surg Pathol. 2003;27:303–10. doi: 10.1097/00000478-200303000-00003.
  • 30. Borrisholt M, Nielsen S, Vyberg M. Demonstration of CDX2 is highly antibody dependant. Appl Immunohistochem Mol Morphol. 2013;21:64–72. doi: 10.1097/PAI.0b013e318257f8aa.
  • 31. Kaimaktchiev V, Terracciano L, Tornillo L, et al. The homeobox intestinal differentiation factor CDX2 is selectively expressed in gastrointestinal adenocarcinomas. Mod Pathol. 2004;17:1392–9. doi: 10.1038/modpathol.3800205.
  • 32. Beck F, Stringer EJ. The role of Cdx genes in the gut and in axial development. Biochem Soc Trans. 2010;38:353–7. doi: 10.1042/BST0380353.
  • 33. Chawengsaksophak K, James R, Hammond VE, Köntgen F, Beck F. Homeosis and intestinal tumours in Cdx2 mutant mice. Nature. 1997;386:84–7. doi: 10.1038/386084a0.
  • 34. Hinoi T, Tani M, Lucas PC, et al. Loss of CDX2 expression and microsatellite instability are prominent features of large cell minimally differentiated carcinomas of the colon. Am J Pathol. 2001;159:2239–48. doi: 10.1016/S0002-9440(10)63074-X.
  • 35. Lugli A, Tzankov A, Zlobec I, Terracciano LM. Differential diagnostic and functional role of the multi-marker phenotype CDX2/CK20/CK7 in colorectal cancer stratified by mismatch repair status. Mod Pathol. 2008;21:1403–12. doi: 10.1038/modpathol.2008.117.
  • 36. Baba Y, Nosho K, Shima K, et al. Relationship of CDX2 loss with molecular features and prognosis in colorectal cancer. Clin Cancer Res. 2009;15:4665–73. doi: 10.1158/1078-0432.CCR-09-0401.
  • 37. Zlobec I, Bihl MP, Schwarb H, Terracciano L, Lugli A. Clinicopathological and protein characterization of BRAF- and K-RAS-mutated colorectal cancer and implications for prognosis. Int J Cancer. 2010;127:367–80. doi: 10.1002/ijc.25042.
  • 38. Bae JM, Lee TH, Cho NY, Kim TY, Kang GH. Loss of CDX2 expression is associated with poor prognosis in colorectal cancer patients. World J Gastroenterol. 2015;21:1457–67. doi: 10.3748/wjg.v21.i5.1457.
  • 39. De Sousa E Melo F, Wang X, Jansen M, et al. Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nat Med. 2013;19:614–8. doi: 10.1038/nm.3174.
  • 40. Altman DG, McShane LM, Sauerbrei W, Taube SE. Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK): explanation and elaboration. PLoS Med. 2012;9(5):e1001216. doi: 10.1371/journal.pmed.1001216.
