Abstract
Introduction
Gene expression profiling has been extensively used to predict outcome in breast cancer patients. We have previously reported on biological hypothesis-driven analysis of gene expression profiling data and we wished to extend this approach through the combinations of various gene signatures to improve the prediction of outcome in breast cancer.
Methods
We have used gene expression data (25,000 gene probes) from a previously published study of tumours from 295 early stage breast cancer patients from the Netherlands Cancer Institute using updated follow-up. Tumours were assigned to three prognostic groups using the previously reported wound-response and hypoxia-response signatures, and the outcome in each of these subgroups was evaluated.
Results
We have assigned invasive breast carcinomas from 295 stage I and II breast cancer patients to three groups based on gene expression profiles subdivided by the wound-response signature (WS) and hypoxia-response signature (HS). These three groups are (1) quiescent WS/non-hypoxic HS; (2) activated WS/non-hypoxic HS or quiescent WS/hypoxic HS; and (3) activated WS/hypoxic HS. The overall survival at 15 years for patients with tumours in groups 1, 2 and 3 is 79%, 59% and 27%, respectively. In multivariate analysis, this signature is not only independent of clinical and pathological risk factors; it is also the strongest predictor of outcome. Compared to a previously identified 70-gene prognosis profile, obtained with supervised classification, the combination of signatures performs roughly equally well and might have additional value in the ER-negative subgroup. In the subgroup of lymph node positive patients, the combination signature outperforms the 70-gene signature in multivariate analysis. In addition, in multivariate analysis, the WS/HS combination is a stronger predictor of outcome than the recently reported invasiveness gene signature combined with the WS.
Conclusion
A combination of biological gene expression signatures can be used to identify a powerful and independent predictor for outcome in breast cancer patients.
Keywords: Microarray analysis, Breast cancer, Prognostic markers, Biological gene expression profiles
1. Introduction
In recent years, a number of groups have described different methods for the classification of invasive breast cancers based on gene expression profiling. The most commonly used statistical methods to arrive at gene expression classifiers are unsupervised analysis and data-driven or supervised analysis. Unsupervised methods analyse differences in gene expression between samples without using prior knowledge of clinical outcome.1 The supervised approaches use clinical data to build a predictive model for outcome (e.g. metastasis, death and therapy response).2–11 Supervised approaches can also be used to identify gene expression profiles associated with clinical or pathological/genetic parameters such as BRCA1 and BRCA2 mutation status, ER status, histological grade and proliferation.2,12–15 The unsupervised methods are primarily used to unravel biological differences between tumours, although they may also aid in predicting outcome, whereas the supervised approaches are applied to build a prognostic test for clinical application, e.g. (neo-)adjuvant treatment decision-making.
An additional method is to use hypothesis-driven gene expression profiling. We have previously classified breast cancers based on the wound-response signature (WS) (also known as the core serum response signature) and the hypoxia-response signature (HS).16–18 The basis for these previous analyses was to use a gene expression signature derived from a model developed to study cancer-associated processes in tissue culture cells. These in vitro generated gene expression profiles have subsequently been tested on human cancer samples. It has been shown that these profiles have prognostic value in different cancer types (lung, gastric, squamous cell (WS;16), breast (WS and HS;16–18) and ovarian (HS;17)), thus indicating the relevance of these processes to tumour clinical phenotypes. The WS is based on the theory that ‘tumours are wounds that do not heal’19 and the hypothesis that a wound-like phenotype would be advantageous to cancer cells in the process of metastasising. In gene expression profiling studies using fibroblasts, Iyer and colleagues noticed that serum stimulation of fibroblasts resulted in changes in gene expression that resembled processes involved in wound healing.20 Chang and colleagues have used a similar serum-stimulated fibroblast model to define a serum response gene expression signature.16 As addition of serum to fibroblasts also leads to a marked increase in proliferation, part of this gene expression signature consists of genes involved in proliferation. Chang and colleagues subsequently filtered out cell cycle dependent genes to arrive at the core serum response (CSR) gene expression signature. Gene expression profiles of tumours can be compared to this WS and subdivided into an ‘activated’ WS subset and a ‘quiescent’ WS subset.16
Hypoxia is another process that plays an important role in cancer (reviewed in21). To define a ‘hypoxia signature’, multiple primary epithelial cells were exposed to different levels of hypoxia. A common gene expression pattern associated with hypoxia across these cell lines was defined and called the ‘hypoxia-response signature’ (HS). To find hypoxia specific genes, only genes that are up-regulated by hypoxia were selected, as down-regulated genes included a large number of cell cycle-regulated/proliferation genes. The HS genes were used to classify tumours as hypoxic or non-hypoxic. Tumours showing a gene expression pattern resembling the in vitro model (‘activated’ for the WS and ‘hypoxic’ for the HS) have a significantly worse outcome for both metastasis free probability and overall survival. Modelling (supervising) the wound signature allows for optimising sensitivity18 or allows using the model for predicting a different outcome parameter, i.e. local recurrence after breast conserving therapy.22 Other gene expression signatures derived from a biological hypothesis are the p53-signature, the stromal cell signature and the invasiveness signature (‘stem cell signature’).23–25
One of the main statistical caveats with supervised classification is that of overfitting and, as a consequence, prognostic or predictive profiles should always be validated in large independent patient cohorts. Two examples of supervised analysis are the 70-gene prognosis profile2 and the 76-gene prognosis profile.11 These profiles have been trained on a cohort of lymph node negative patients and subsequently validated on a larger cohort from the same institute,10,10 and both profiles have been validated in the same independent cohort by the TRANSBIG consortium.26,26 Clinical use of these prognosis gene expression signatures could result in treating a smaller number of patients with adjuvant systemic chemotherapy.
Desmedt and colleagues note that both signatures have the highest hazard ratio during the first five years, but still outperform Adjuvant! Online® at 10 years.27 When corrected for clinical variables in multivariate analysis they perform equally well on the TRANSBIG series (hazard ratio (HR) 2.63 and 2.55 for the 70-gene and the 76-gene, respectively).26,26
As biologically derived profiles are not trained on a specific data set, they can be directly applied to multiple publicly available data sets. Furthermore, these signatures might also give more insight into the biology that defines prognostic subgroups of breast cancer patients and may suggest ways for intervention.28,28
As the previously published wound-response signature and hypoxia-response signature are independent in predicting outcome in breast cancer patients in multivariate analysis17 and only 6 genes overlap between the signatures, we hypothesised that combining these signatures might have both a better sensitivity and specificity in predicting outcome. Here, we have updated the follow-up of the previously described series of 295 breast cancer patients, and assessed the prognostic value of assigning tumours to subgroups based on the WS and HS.
2. Material and methods
2.1. Tumour samples – patients
We have previously reported on gene expression profiles of tumours from a series of 295 stages I and II breast cancer patients treated at the Netherlands Cancer Institute (NKI) between 1984 and 1995.10 A full clinical data sheet is available online at http://research.nki.nl/vandevijverlab/ under ‘publications’. The clinical data used for the earlier publications were updated until January 2001, resulting in a median follow-up of 6.7 years (range 0.05–18.3). For this study, all the patient charts were reviewed and clinical data were updated until 1st January 2005. The median follow-up is now 10.2 years for all patients and 12 years for patients who are alive (range 0.05–21.7).
Gene expression profiles and clinico-pathological data on 118 invasive breast carcinomas were published in Sorlie and colleagues.30 Data were downloaded from the Stanford micro-array database (http://genome-www5.stanford.edu). The median follow-up as provided for these 118 patients was 28.5 months (range 3–188).
2.2. Gene expression profiling data
RNA isolation, labelling of complementary RNA and hybridisation to 25,000 element oligonucleotide microarrays, and measurement of expression ratios were previously described.10
The two gene expression profiles used (wound-response signature (WS) or core serum response (CSR), and hypoxia-response signature (HS)) were derived in experiments using Stanford cDNA arrays. Genes on Stanford cDNA microarrays and Rosetta/NKI oligonucleotide microarrays were mapped across platforms using UniGene identifiers. For the wound signature, build 158 (release date 18th January 2003) was used here, as this was the original mapping carried out for the validation of the wound signature.18 The primary hypoxia analysis was done using build 172 (release date 17th July 2004).17 Probes on the NKI array that were mapped to the same UniGene cluster were averaged.
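The probe-averaging step described above can be illustrated with a minimal sketch. It assumes that expression values are held as a probe-to-value mapping for one tumour and that a plain arithmetic mean is taken; the scale on which the original averaging was done is not stated in the text, and the function name, probe names and cluster identifiers below are hypothetical.

```python
from collections import defaultdict


def average_by_cluster(probe_values, probe_to_cluster):
    """Average the values of all probes that map to the same cluster.

    probe_values: dict mapping probe name -> expression value for one tumour.
    probe_to_cluster: dict mapping probe name -> cluster identifier.
    Probes without a cluster assignment are skipped (an assumption).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for probe, value in probe_values.items():
        cluster = probe_to_cluster.get(probe)
        if cluster is None:
            continue  # unmapped probe: cannot be compared across platforms
        sums[cluster] += value
        counts[cluster] += 1
    return {cluster: sums[cluster] / counts[cluster] for cluster in sums}


# Hypothetical toy data: two probes share one cluster, one probe is unmapped.
values = {"probe_1": 0.2, "probe_2": 0.4, "probe_3": -1.0, "probe_4": 0.7}
mapping = {"probe_1": "cluster_A", "probe_2": "cluster_A", "probe_3": "cluster_B"}
per_cluster = average_by_cluster(values, mapping)
```

Collapsing probes to one value per cluster gives each gene equal weight when a signature is later scored, regardless of how many probes represent it on the array.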
The entire expression database for all 295 patients is available online (http://research.nki.nl/vandevijverlab/PublicationTable.html), as are the files containing the expression ratios for the specific experiments (wound signature: http://microarray-pubs.stanford.edu/wound_NKI/explore.html; hypoxia signature: http://microarray-pubs.stanford.edu/hypoxia/downloads.htm). The second data set (Sorlie and colleagues31) is derived from experiments using the same cDNA microarray as the one used to identify the WS and HS.16,16 The IMAGE clones were mapped and all probes could thus be identified. The full expression data are available at http://genome-www.stanford.edu/breast_cancer. Clinical data are available as a supplement on the website of the original publication (http://genome-www.stanford.edu/breast_cancer/robustness). There are 122 patients in this data set, and the WS and HS were previously assessed for the tumours from these patients17,17 by using two-dimensional unsupervised hierarchical clustering.
The information on the invasiveness gene signature (assignment to prognostic groups based on this signature) was provided by Dr. X. Wang.
2.3. Classification of samples based on gene expression signatures
We have previously assigned patients to the WS activated/quiescent and HS hypoxic/non-hypoxic groups based on two-dimensional unsupervised hierarchical clustering. In addition, assignment methods based on a Pearson correlation (WS) or average gene expression (HS) have also been used. Here, we have used the correlation coefficient to the WS18 and the average expression of all the genes for the HS (‘Hypoxia score’).17 For the WS, the average gene expression value for each of the 512 genes is calculated from the original in vitro model. For each patient sample, the Pearson correlation coefficient to this list of values results in a value between −1 and 1. A correlation coefficient >0 is deemed a positive correlation and thus an activated wound signature, and a coefficient <0 represents a negative correlation and thus a quiescent wound signature. For the hypoxia signature, we could not apply the same method because all the genes in this profile are up-regulated. Instead, we have calculated the average expression of all the genes for each patient and defined a value of >0 as positive and thus hypoxic, and a value of <0 as negative and thus non-hypoxic.18 We have not trained and supervised the optimal correlation to the WS and HS, but we have used a correlation coefficient of 0 as cutoff, as a positive correlation coefficient corresponds to the findings from the in vitro model defining cells as having an activated WS or a hypoxic HS, whereas a negative correlation coefficient would indicate a quiescent WS or non-hypoxic HS. After assigning the WS and HS for all patients, patients are categorised into three groups by combining the signatures. Group 1 consists of patients who have tumours with a quiescent WS and non-hypoxic HS; patients who have tumours with either an activated WS or a hypoxic HS are assigned to group 2; and group 3 represents patients who have tumours with an activated WS and hypoxic HS.
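The assignment rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: expression values are assumed to be centred log ratios, a value of exactly 0 (not covered by the text) is arbitrarily assigned to the quiescent or non-hypoxic class, and all function names and toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def wound_status(tumour_ws_values, in_vitro_ws_values):
    """'activated' if the correlation to the in vitro WS pattern is > 0."""
    r = pearson(tumour_ws_values, in_vitro_ws_values)
    return "activated" if r > 0 else "quiescent"  # r == 0: assumption


def hypoxia_status(tumour_hs_values):
    """'hypoxic' if the mean expression of the HS genes (hypoxia score) is > 0."""
    score = sum(tumour_hs_values) / len(tumour_hs_values)
    return "hypoxic" if score > 0 else "non-hypoxic"  # score == 0: assumption


def combination_group(ws, hs):
    """Group 1: neither adverse call; group 2: exactly one; group 3: both."""
    adverse_calls = (ws == "activated") + (hs == "hypoxic")
    return adverse_calls + 1


# Hypothetical toy data: four WS genes and three HS genes.
in_vitro = [1.0, -0.5, 0.8, -1.2]
group_a = combination_group(wound_status([0.9, -0.3, 0.5, -1.0], in_vitro),
                            hypoxia_status([0.4, 0.1, 0.3]))
group_b = combination_group(wound_status([-0.7, 0.4, -0.6, 0.9], in_vitro),
                            hypoxia_status([-0.2, -0.1, 0.05]))
```

In this toy example the first tumour falls into group 3 and the second into group 1. Because both cutoffs are fixed at 0 rather than fitted to outcome data, the rule can be applied to a new data set without retraining.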
2.4. Statistical analysis
The main end-point analysed was distant metastasis as the first event (distant metastasis free probability: DMFP). If a patient developed a local recurrence, axillary recurrence, contra-lateral breast cancer or a second primary cancer (except for non-melanoma skin cancer), she was censored at this time. This approach was taken to account for the possibility that locoregional recurrences and contralateral breast cancers can be a source for distant metastases. An isolated ipsilateral supraclavicular recurrence was the first relapse in five patients. In four of these patients, distant metastasis developed after a short time interval. An ipsilateral supraclavicular recurrence was thus considered to be a preceding event to distant metastasis and patients were not censored at this time. The time of a subsequent distant metastasis was taken as time of the first event and analysed accordingly. Overall survival was analysed by death from any cause and patients were censored at the last follow-up (details on breast cancer specific survival are provided as Supplementary information, file S1 clinical data sheet). DMFP and overall survival (OS) were calculated using the Kaplan–Meier method, and the log rank test was used for comparisons. For the multi-variable analysis, Cox regression was used. All statistical analyses were done using Winstat® for Excel (R. Fitch Software, Staufen, Germany), SPSS 13.0® (SPSS Inc, Chicago, IL) and Microsoft Excel® (Microsoft Corporation, Redmond, WA).
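To make the censoring rules concrete, the following is a minimal sketch of the Kaplan–Meier product-limit estimator in plain Python. It is not the software used in the study (which was Winstat, SPSS and Excel); the function name and the toy follow-up data are hypothetical. An event flag of 1 would mark a distant metastasis as first event (for DMFP) or a death (for OS), and 0 a censored observation.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time per patient (e.g. in years).
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical toy cohort of six patients.
toy_times = [2, 3, 3, 5, 8, 10]
toy_events = [1, 0, 1, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

For the toy cohort the estimate steps down to 5/6, 2/3 and 4/9 at times 2, 3 and 5; the censored patients at times 8 and 10 leave the curve unchanged.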
2.5. Comparing the 70-gene prognosis profile and the wound signature – hypoxia signature combination
In the original publication Van de Vijver and colleagues have included 61 lymph node negative patients from the original 78-patient training set in the validation series of the 70-gene signature.10 To acquire a consecutive series, these 61 of the 78 pN0 patients of the training series used for the construction of the 70-gene prognosis profile2 had to be included. Leaving out these patients would have resulted in a selection bias, since the first series contained a disproportionally large number of patients who developed distant metastases within 5 years. The 61 patients were included in the study to keep the consecutive series complete, but a statistical correction was applied before assigning these patients to the good or poor outcome group to correct for an overestimated performance in these patients.2 For the updated results on the 70-gene prognosis signature, all 295 patients were analysed. However, for a fair comparison with the 70-gene signature on this data set, the patients who were part of the training series cannot be used for the comparative analyses. Using a similar approach to the one used in previous analyses,18 we have analysed the subset of lymph node positive patients (who were not part of the training set of 78 tumours) for the comparative analyses with the 70-gene signature.
3. Results
3.1. Prognostic association for the Wound Signature
For 295 invasive breast carcinomas from patients with stages I and II breast cancer, we had previously assessed gene expression profiles for 25,000 genes. Using the wound signature (WS) genes, 142 tumours were classified as WS activated and 153 tumours as WS quiescent.
Distant metastasis free probability at 15 years is 72% and 47% for quiescent and activated tumours, respectively (log rank p < 10⁻⁵; HR 2.8 (95% confidence interval (CI) 1.8–4.3)); overall survival is 77% and 46%, respectively (log rank p < 10⁻⁸; HR 3.6 (95% CI 2.3–5.6)) (Kaplan–Meier curves are shown in Fig. 1a and b). As can be seen in Table 1, in multivariate analysis the WS is the strongest independent predictor for OS.
Fig. 1.
Kaplan–Meier Curves for distant metastasis free probability and overall survival. (a) The wound-response signature (activated versus quiescent) – distant metastasis free survival, (b) the wound-response signature (activated versus quiescent) – overall survival, (c) the hypoxia-response signature (hypoxic versus non-hypoxic) – distant metastasis free survival, (d) the hypoxia-response signature (hypoxic versus non-hypoxic) – overall survival, (e) The 70-gene prognosis signature (poor versus good) – distant metastasis free survival, (f) the 70-gene prognosis signature (poor versus good) – overall survival.
Table 1.
Multivariate analysis for overall survival showing the wound-response signature and the hypoxia-response signature as independent prognostic factors
Clinical – pathological variable | Significance | Hazard ratio for death | 95% CI lower | 95% CI upper |
---|---|---|---|---|
Diameter T2 (>2 cm) versus diameter T1 (≤2 cm) | 0.10 | 1.43 | 0.93 | 2.19 |
Lymph node positive versus lymph node negative | 0.54 | 1.22 | 0.66 | 2.25 |
Age above 40 versus age 40 and below | 0.009 | 0.55 | 0.35 | 0.86 |
ER-positive versus ER-negative | 0.061 | 0.64 | 0.40 | 1.02 |
Angio invasion + versus ±, versus − | 0.021 | 1.31 | 1.04 | 1.65 |
Grade 3 versus grade 2 versus grade 1 | 0.27 | 1.23 | 0.85 | 1.78 |
Chemo yes or no | 0.42 | 0.77 | 0.41 | 1.44 |
Hormonal therapy yes or no | 0.86 | 0.94 | 0.45 | 1.96 |
Wound-response Signature (activated versus quiescent) | <0.0001 | 2.77 | 1.70 | 4.50 |
Hypoxia-response signature (hypoxic versus non-hypoxic) | 0.006 | 1.83 | 1.19 | 2.81 |
3.2. Prognostic association for the hypoxia-response signature
When tumours are assigned based on average expression of the hypoxia-response genes, 218 are classified as non-hypoxic and 77 as hypoxic. The distant metastasis free probability at 15 years is 67% and 42% for non-hypoxic and hypoxic tumours, respectively (log rank p < 10⁻⁴; HR 2.3 (95% CI 1.5–3.5)), and the overall survival is 69% versus 44% (log rank p < 10⁻⁵; HR 2.6 (95% CI 1.7–3.8)) (Kaplan–Meier curves are shown in Fig. 1c and d).
3.3. Performance of a ‘combination signature’
To address the question whether the WS and the HS define different biological pathways and can provide additional prognostic information, we combined the two signatures to sub-classify the tumours (‘combination signature’). We subdivided the tumours into those that show both a quiescent WS and a non-hypoxic HS (group 1; n = 121); those that have either an activated WS or a hypoxic HS (group 2; n = 129); and those that have both an activated WS and a hypoxic HS (group 3; n = 45). The distant metastasis free probability at 15 years is 76%, 53% and 36% for groups 1, 2 and 3, respectively (log rank p < 10⁻⁸; HR 2.3 (95% CI 1.8–3.1)), and the overall survival is 79%, 59% and 27% (log rank p < 10⁻¹²; HR 2.8 (95% CI 2.1–3.8)) (Kaplan–Meier curves are shown in Fig. 2a and b). In multivariate analysis, the combination signature is the most powerful predictor in the presence of clinical and pathological risk factors (Table 2). Note that the hazard ratio for the combination signature (2.2) applies both to the comparison of the intermediate group with the good group and to the comparison of the poor group with the intermediate group.
Fig. 2.
Kaplan–Meier curves for distant metastasis free probability and overall survival for the combination signature. (a) All 295 patients – distant metastasis free survival, (b) all 295 patients – overall survival, (c) untreated patients (n = 165) – overall survival, (d) pT1 tumours (n = 155) – overall survival, (e) lymph node negative tumours (n = 151) – overall survival, (f) ER-negative tumours (n = 69) – overall survival.
Table 2.
Multivariate analysis for Overall Survival using the combination signature
Clinical – pathological variable | Significance | Hazard ratio for death | 95% CI lower | 95% CI upper |
---|---|---|---|---|
Diameter T2 (>2 cm) versus diameter T1 (≤2 cm) | 0.13 | 1.39 | 0.91 | 2.13 |
Lymph node positive versus lymph node negative | 0.46 | 1.26 | 0.68 | 2.34 |
Age above 40 versus age 40 and below | 0.012 | 0.56 | 0.35 | 0.88 |
ER-positive versus ER-negative | 0.075 | 0.66 | 0.41 | 1.04 |
Angio invasion + versus ±, versus − | 0.031 | 1.28 | 1.02 | 1.61 |
Grade 3 versus grade 2 versus grade 1 | 0.17 | 1.29 | 0.90 | 1.85 |
Chemo yes or no | 0.42 | 0.77 | 0.41 | 1.44 |
Hormonal therapy yes or no | 0.90 | 0.95 | 0.45 | 2.00 |
Hypoxia – wound signature combination (poor versus intermediate versus good) | <0.00001 | 2.20 | 1.59 | 3.03 |
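The statement in section 3.3 that the hazard ratio of 2.2 applies to each step between adjacent groups is what one would expect if the three-level combination signature enters the Cox model as a single ordinal covariate (good = 0, intermediate = 1, poor = 2). That coding is an assumption here, as it is not spelled out in the text. Under it, hazard ratios multiply across steps, as the following sketch with a hypothetical helper function shows.

```python
def hazard_ratio_between(hr_per_step, level_from, level_to):
    """Hazard ratio of level_to relative to level_from for an ordinal covariate.

    With a single coefficient beta per step, the hazard ratio across
    k steps is exp(beta * k), i.e. hr_per_step ** k.
    """
    return hr_per_step ** (level_to - level_from)


GOOD, INTERMEDIATE, POOR = 0, 1, 2
hr_step = 2.20  # per-step hazard ratio for death reported in Table 2

hr_poor_vs_good = hazard_ratio_between(hr_step, GOOD, POOR)
```

Under this assumed coding, the implied hazard ratio for the poor group relative to the good group is 2.20 squared, or about 4.8.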
3.4. Comparison with the 70-gene prognosis profile
We have previously identified a 70-gene prognosis profile using supervised classification2 and validated the prognostic profile in this series of 295 tumours.10 Using the extended follow-up data collected in this study, we have repeated the prognosis analysis using the 70-gene prognosis profile in these 295 tumours. The distant metastasis free probability at 15 years is 80% and 46% for tumours with a good and poor prognosis profile, respectively (log rank p < 10⁻⁷; HR 4.5 (95% CI 2.5–7.3)), and the overall survival is 83% and 48% (log rank p < 10⁻⁹; HR 5.3 (95% CI 3.0–9.4)). Kaplan–Meier curves are shown in Fig. 1e and f; Kaplan–Meier curves for the lymph node positive patients only (n = 144) are shown in Supplementary Fig. 1. The comparative analysis was done for the 144 lymph node positive patients only (see materials and methods). Table 3a shows the distribution of subgroups defined by the combination signature (good, intermediate and poor) versus group assignment according to the 70-gene prognosis profile. Clinical outcome in discordant patients (e.g. 70-gene good prognosis and combination signature poor, and vice versa) is shown in Table 3b. To investigate the additive prognostic value of the combination signature, we also performed a multivariate analysis including the 70-gene prognosis signature, the combination signature and the clinico-pathological variables (Table 3c). As can be seen, the 70-gene prognosis profile loses significance in the presence of the combination signature in the lymph node positive subgroup. The combination signature is the most powerful predictor of survival in this model with a hazard ratio of 1.84. Age is the only clinico-pathological variable that adds significantly to the model. A potential advantage of using the combination signature is a higher specificity or positive predictive value.
The negative predictive value for distant metastasis is almost the same for the 70-gene prognosis signature and the combination signature (84% and 83%, respectively). The positive predictive value is harder to compare because the combination signature has three classes. When the intermediate and poor groups are combined into a single ‘poor’ group, the signatures perform equally well (37% for both). Of note, the combination signature creates an intermediate group that potentially has to be addressed separately regarding treatment decisions. When only the good and poor groups are compared, the combination signature performs slightly better than the 70-gene signature, with a positive predictive value of 55% versus 37% and a specificity of 83% versus 45%. Table 3d shows all the values for the positive and negative predictive values, and sensitivity and specificity for both signatures.
Table 3a.
Subdividing lymph node positive patients by the ‘combination signature’ and the 70-gene prognosis signature. The combination signature is derived by combining the wound-response signature (WS) and the hypoxia-response signature (HS); group 1 comprises tumours with a quiescent WS/non-hypoxic HS; group 2, quiescent WS/hypoxic HS or activated WS/non-hypoxic HS; group 3, activated WS and hypoxic HS
70-Gene signature | Combination – group 1 | Combination – group 2 | Combination – group 3 |
---|---|---|---|
Good prognosis (55: 38%) | 36 | 17 | 2 |
Poor prognosis (89: 62%) | 18 | 54 | 17 |
Total (144) | 54 (37%) | 71 (49%) | 19 (14%) |
Table 3b.
Outcome in discordant patients
 | Distant metastasis as first event | Death |
---|---|---|
70-Genes good – combination group 3 (n = 2) | 0 | 0 |
70-Genes poor – combination group 1 (n = 18) | 3 | 4 |
Table 3c.
Multivariate analysis for overall survival for clinical and pathological risk factors in the presence of both the combination signature and the 70-gene prognosis profile for lymph node positive patients (n = 144)
Clinical – pathological variable | Significance | Hazard ratio for death | 95% CI lower | 95% CI upper |
---|---|---|---|---|
Diameter T2 (>2 cm) versus diameter T1 (≤2 cm) | 0.14 | 1.61 | 0.86 | 3.01 |
Grade 3 versus grade 2 versus grade 1 | 0.70 | 1.11 | 0.65 | 1.90 |
Age above 40 versus age 40 and below | 0.018 | 0.43 | 0.21 | 0.86 |
ER-positive versus ER-negative | 0.23 | 0.64 | 0.32 | 1.31 |
Angio invasion + versus ±, versus − | 0.059 | 1.38 | 0.99 | 1.94 |
Chemo yes or no | 0.16 | 0.61 | 0.31 | 1.21 |
Hormonal therapy yes or no | 0.50 | 0.72 | 0.28 | 1.85 |
Hypoxia – wound signature combination (poor versus intermediate versus good) | 0.022 | 1.84 | 1.09 | 3.10 |
70-Genes poor versus good | 0.23 | 1.67 | 0.73 | 3.82 |
Table 3d.
Comparison of the positive and negative predictive values and sensitivity and specificity for the 70-gene profile and the combination signature
 | True | False | |
---|---|---|---|
70-Gene prognosis profile | | | |
Positive (poor) | 33 | 56 | Positive predictive value: 37% |
Negative (good) | 46 | 9 | Negative predictive value: 84% |
Sensitivity: 79% | Specificity: 45% | | |
Combination signature good versus intermediate and poor | | | |
Positive (intermediate/poor) | 33 | 57 | Positive predictive value: 37% |
Negative (good) | 45 | 9 | Negative predictive value: 83% |
Sensitivity: 79% | Specificity: 44% | | |
Combination signature good versus poor | | | |
Positive (poor) | 11 | 9 | Positive predictive value: 55% |
Negative (good) | 45 | 9 | Negative predictive value: 83% |
Sensitivity: 55% | Specificity: 83% | | |
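The percentages in Table 3d follow directly from the true and false counts in each row. The short sketch below recomputes them; the function and variable names are hypothetical, and a ‘true positive’ is taken to be a patient classified as poor prognosis who developed the event.

```python
def diagnostic_metrics(true_pos, false_pos, true_neg, false_neg):
    """Predictive values, sensitivity and specificity from a 2x2 table."""
    return {
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
    }


# Counts taken from Table 3d (lymph node positive patients).
seventy_gene = diagnostic_metrics(true_pos=33, false_pos=56,
                                  true_neg=46, false_neg=9)
combo_good_vs_poor = diagnostic_metrics(true_pos=11, false_pos=9,
                                        true_neg=45, false_neg=9)

for name, metrics in [("70-gene", seventy_gene),
                      ("combination, good vs poor", combo_good_vs_poor)]:
    summary = ", ".join(f"{key} {round(100 * value)}%"
                        for key, value in metrics.items())
    print(f"{name}: {summary}")
```

Rounded to whole percentages, the counts reproduce the tabulated values: 37%, 84%, 79% and 45% for the 70-gene profile, and 55%, 83%, 55% and 83% for the combination signature restricted to the good and poor groups.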
3.5. Comparison with the Invasiveness gene signature – wound signature combination
Recently, another interesting biological gene expression signature was identified, termed the invasiveness gene signature (IGS) or ‘stem cell signature’, published by Liu and colleagues.23 This 186-gene signature is derived from gene expression studies on a highly tumourigenic subpopulation of cells from breast tumours characterised by high CD44 and low or undetectable CD24 expression, which are postulated to be breast cancer stem cells. This signature also predicts outcome in the NKI295 data set. The IGS is an independent predictor of outcome in multivariate analysis in the NKI295 data set, but with a HR of 1.2 (95% CI 1.1–1.4) it is not the strongest predictor in the presence of age and tumour grade. Liu and colleagues generated a combination signature in a similar way (IGS good and WS quiescent; either IGS poor or WS activated; and both IGS poor and WS activated: the IGS–WS combination signature) and showed a difference in distant metastasis free survival (80%, 69% and 47% at 10 years for the good, intermediate and poor groups, respectively).
We wanted to compare this combination signature with our combination signature on the same patient cohort in the presence of clinical and pathological variables. In the presence of the WS–HS combination, the IGS–WS combination loses its significance. The WS–HS combination remains significant and is the strongest predictor of survival (Tables 4a and 4b).
Table 4a.
Multivariate analysis for distant metastasis as the first event for clinical and pathological risk factors in the presence of both combination signatures
Clinical – pathological variable | Significance | Hazard ratio for metastasis | 95% CI lower | 95% CI upper |
---|---|---|---|---|
Diameter T2 (>2 cm) versus diameter T1 (≤2 cm) | 0.015 | 1.76 | 1.12 | 2.78 |
Lymph node positive versus lymph node negative | 0.75 | 1.11 | 0.58 | 2.12 |
Age above 40 versus age 40 and below | 0.078 | 0.65 | 0.40 | 1.05 |
ER-positive versus ER-negative | 0.96 | 1.01 | 0.59 | 1.73 |
Angio invasion + versus ±, versus − | 0.037 | 1.29 | 1.02 | 1.64 |
Grade 3 versus grade 2 versus grade 1 | 0.73 | 1.07 | 0.73 | 1.56 |
Chemotherapy yes or no | 0.21 | 0.66 | 0.34 | 1.27 |
Hormonal therapy yes or no | 0.37 | 0.70 | 0.32 | 1.53 |
Invasiveness gene signature – wound signature combination (poor versus intermediate versus good) | 0.093 | 1.41 | 0.94 | 2.11 |
Hypoxia – wound signature combination (poor versus intermediate versus good) | 0.002 | 1.75 | 1.23 | 2.50 |
Table 4b.
Multivariate analysis for overall survival for clinical and pathological risk factors in the presence of both combination signatures
Clinical – pathological variable | Significance | Hazard ratio for death | 95% CI lower | 95% CI upper |
---|---|---|---|---|
Diameter T2 (>2 cm) versus diameter T1 (≤2 cm) | 0.18 | 1.34 | 0.88 | 2.05 |
Lymph node positive versus lymph node negative | 0.62 | 1.17 | 0.63 | 2.18 |
Age above 40 versus age 40 and below | 0.015 | 0.57 | 0.36 | 0.90 |
ER-positive versus ER-negative | 0.20 | 0.73 | 0.45 | 1.18 |
Angio invasion + versus ±, versus − | 0.031 | 1.28 | 1.02 | 1.61 |
Grade 3 versus grade 2 versus grade 1 | 0.42 | 1.17 | 0.80 | 1.71 |
Chemotherapy yes or no | 0.44 | 0.78 | 0.42 | 1.46 |
Hormonal therapy yes or no | 0.92 | 0.96 | 0.46 | 2.02 |
Invasiveness gene signature – wound signature combination (poor versus intermediate versus good) | 0.11 | 1.39 | 0.93 | 2.09 |
Hypoxia – wound signature combination (poor versus intermediate versus good) | <0.0001 | 2.02 | 1.43 | 2.84 |
3.6. Prognostic value of the WS and HS in clinically defined subgroups
We wished to explore the prognostic value of the WS and HS in clinically defined subgroups. We have assessed the prognostic value of the WS and HS in the following subgroups of tumours: pT < 2 cm versus pT>2 cm, lymph node negative versus lymph node positive tumours, ER-negative versus ER-positive tumours and treated versus untreated patients (adjuvant chemotherapy and/or hormonal therapy).
The combination signature has comparable prognostic power in all subgroups. Most interestingly, the ER-negative subgroup can be subdivided into a good and a poor prognosis group (Kaplan–Meier curves for survival are shown in Fig. 2c–f).
3.7. Prognostic value of the WS and HS in locally advanced breast cancer
To test the performance of the combination signature in an additional data set, we used the breast cancer data set from Sorlie and colleagues.30 The WS and HS have previously been described for this data set.16,16 This data set encompassed 122 patients with a median age at diagnosis of 58 years (range 21–85); 25% had an ER-negative tumour; 77% of the patients had a T3- or T4-stage tumour and 82% of patients were lymph node positive. As can be seen in Fig. 3, the combination signature separates three subgroups and significantly distinguishes patients for both relapse free (p = 0.01) and overall survival (p = 0.002). However, groups 1 and 2 appear to be more similar to each other in outcome than to group 3, as was also observed in the ER-negative subgroup in the NKI data set.
Fig. 3.
Kaplan–Meier curves for relapse-free (a) and overall survival (b) for the combination signature on the Sorlie data set (locally advanced tumours). Note that three more patients are included in the overall survival analysis than in the relapse-free analysis, because only survival data were available for these patients.
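As an illustration of how survival curves of the kind shown in Fig. 3 are estimated, the following is a minimal Python sketch of the Kaplan–Meier product-limit estimator. It is not the software used in this study; the function name `kaplan_meier` and the toy follow-up times are assumptions made purely for illustration and are not patient data.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical toy data (years of follow-up, event indicator).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Running the estimator separately on each of the three signature groups would give one curve per group; comparing those curves (for example with a log-rank test) is a separate step not shown here.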
4. Discussion
Based on variations in gene expression profiles, various subgroups of breast cancer can be identified. Here, we present analyses that combine several gene expression-based classifiers, with the aim of improving the identification of prognostic subgroups of breast cancer and of better understanding the biology underlying these classifiers. We previously noted that the Hypoxia-response profile identified a different prognostic subgroup compared to that based on the Wound-response signature17 and hypothesised that these profiles might be complementary in predicting outcome. Here, we show that outcome prediction is improved by combining the two profiles. The combination of the two signatures not only significantly predicts outcome in breast cancer patients (both distant metastasis-free and overall survival), but also outperforms all clinical and pathological parameters in a multivariate model.
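The three-group rule used for the combination signature (group 1: quiescent WS and non-hypoxic HS; group 2: exactly one of the two signatures adverse; group 3: activated WS and hypoxic HS) can be written down compactly. The Python sketch below assumes that each tumour has already been given a binary WS call and a binary HS call; the function name `combination_group` and the example calls are hypothetical and serve only to illustrate the rule.

```python
from collections import Counter


def combination_group(ws_activated: bool, hs_hypoxic: bool) -> int:
    """Map binary WS and HS calls to prognostic group 1, 2 or 3.

    Group 1: quiescent WS and non-hypoxic HS
    Group 2: exactly one of the two signatures is adverse
    Group 3: activated WS and hypoxic HS
    """
    return 1 + int(ws_activated) + int(hs_hypoxic)


# Hypothetical (WS activated, HS hypoxic) calls, for illustration only.
example_calls = [(False, False), (True, False), (False, True), (True, True)]
example_groups = [combination_group(ws, hs) for ws, hs in example_calls]
group_counts = Counter(example_groups)
```

Counting the adverse calls keeps the rule symmetric in the two signatures, which matches the definition of group 2 as either activated WS/non-hypoxic HS or quiescent WS/hypoxic HS.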
Interestingly, the prognostic power of the combination signature holds up across all clinically defined subgroups. While patients with ER-negative breast cancer (69 patients) represent only a small proportion (4% in the series described here) of the patients having a good prognosis according to the 70-gene prognosis profile, the combination signature classifies 12 of them as good and 30 as intermediate risk; the remaining 27 are classified as poor prognosis patients. The Kaplan–Meier estimates for overall survival at 15 years are 75%, 66% and 13%, respectively, for the good, intermediate and poor prognosis patients in the ER-negative subgroup. For the entire patient cohort, the good subgroup contains a comparable number of patients and has comparable outcome to the good group as defined by supervised classification (combination signature versus 70-gene; 121 versus 115 patients in the good prognosis group and 79% versus 83% overall survival at 15 years). Furthermore, the combination signature defines a poor outcome group that has a worse outcome than patients assigned to the poor prognosis group by the 70-gene prognosis signature (27% versus 48% overall survival at 15 years). When we analysed the 70-gene prognosis profile and the combination signature in multivariate analysis together with clinico-pathological variables on the lymph node positive cohort only, the 70-gene prognosis profile lost its significant predictive value in the model for overall survival, whereas the combination signature was the strongest predictor of outcome.
Recently, the so-called invasiveness gene signature (IGS) was published by Liu and colleagues.23 This signature represents the gene expression pattern in a highly tumourigenic subpopulation of breast cancer cells and is a significant predictor of outcome, and even more so when combined with the WS. Here, we have shown that the combination of the HS and the WS results in a stronger prediction of outcome compared to the combination of the IGS and the WS. Furthermore, the number of patients assigned to the good prognosis group is considerably larger for the WS-HS combination (115 versus 61).
Gene expression profiling using microarray analysis of tumours results in a large number of (gene expression) data points for each individual sample. In order to analyse these data, various methods are widely applied, e.g. unsupervised hierarchical clustering, supervised analysis incorporating clinical data and ‘hypothesis driven’ analysis. One example of supervised analysis is the 70-gene prognosis profile, which has now also become commercially available as a diagnostic test for lymph node negative breast cancer (MammaPrint®) and is currently being tested in a prospective randomised clinical trial by the EORTC and TRANSBIG clinical study groups (MINDACT trial).31 An additional PCR-based 21-gene recurrence score has also been found to predict outcome in ER-positive breast cancer patients and is commercially available (OncotypeDX®); this test is also being prospectively tested (TAILORx trial, a cooperative study led by the Eastern Cooperative Oncology Group32). Other prognostic gene expression signatures have been described,11,15,23,24,30,33,34 but have not (yet) been carried forward to clinical studies.
Here, we have shown that by combining ‘biology driven’ gene expression signatures, outcome prediction in breast cancer can be done without applying supervised methods. By further integrating gene expression-based classification with genetic alterations in breast cancer, more insight into the biological processes will be obtained. As an example of such improved understanding, Adler and colleagues developed a method called SLAMS (stepwise linkage analysis of microarray signatures) to analyse gene signatures and unravel underlying mechanisms of poor prognosis.28 They identified two genes (MYC and CSN5) that drive the Wound-response signature; stably transducing MYC and CSN5 into MCF10A cells (breast epithelial cells) strongly increased the ability of these cells to invade through a three-dimensional basement membrane matrix. Such a model can be used to test drug sensitivity in relation to specific gene expression signatures.29
Another example showing the complexity of breast cancer biology is a study performed by Fan and colleagues. They analysed five different published gene expression signatures on one data set and showed a high concordance between the signatures in identifying the same poor prognosis patients. They hypothesised that multiple pathways can lead to poor prognosis and that the high complexity of the microarray approach, the large numbers of genes analysed and the utilisation of different platforms likely identify different sets of genes that are all associated with a poor prognosis.35 Sotiriou and colleagues have proposed that there is a common theme in different gene expression profiles, instead of multiple different pathways. They have hypothesised that the main ‘driving’ component of most prognostic signatures in breast cancer is the set of genes associated with proliferation. Preliminary results of a meta-analysis of published breast cancer data sets support this hypothesis.36
The predictive power of the combined approach described here is independent of clinical and pathological variables and holds up across different clinically and pathologically distinct subgroups of breast cancer, even in the ER-negative subgroup. Taking this test into clinical studies could be beneficial for patients and potentially improve clinical decision-making. It should be realised that this requires significant efforts and financial resources. First, a reproducible assay needs to be developed, and subsequently this assay needs to be validated, preferably prospectively. Testing different prognostic signatures and their ability to better stratify patients for adjuvant treatment decision-making should be a research priority. Large collaborative studies are needed, preferably run by an independent organisation. An example of such a project is the (retrospective) validation of the 70-gene prognosis profile by the TRANSBIG consortium where, in a second study, the 76-gene prognosis profile was validated on the same patient cohort.26,27
Hopefully, it will be possible to incorporate novel prognostic and/or predictive gene expression signatures, such as the ones described here, in prospective clinical studies. This should lead to improved predictive tests that can guide the treatment of breast cancer patients.
Acknowledgements
This work was supported by Dutch Cancer Society Grant NKB 2002-2575 (to D.S.A.N. and M.v.d.V.) and National Institutes of Health Grants R01CA118750 (to H.Y.C.) and R01CA125618 (to J.T.C.). H.Y.C. is the Kenneth G. and Elaine A. Langone Scholar of the Damon Runyon Cancer Research Foundation. We thank Dr. X. Wang for kindly providing the individual patient data for the NKI 295 patients on the IGS.
Footnotes
Conflicts of interest statement
None declared.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ejca.2008.07.015.
REFERENCES
1. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature. 2000;406(6797):747–752. doi: 10.1038/35021093.
2. van ’t Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415(6871):530–536. doi: 10.1038/415530a.
3. Ayers M, Symmans WF, Stec J, et al. Gene expression profiles predict complete pathologic response to neoadjuvant paclitaxel and fluorouracil; doxorubicin; and cyclophosphamide chemotherapy in breast cancer. J Clin Oncol. 2004;22(12):2284–2293. doi: 10.1200/JCO.2004.05.166.
4. Chang J, Wooten E, Tsimelzon A, et al. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet. 2003;362(9381):362–369. doi: 10.1016/S0140-6736(03)14023-8.
5. Foekens JA, Atkins D, Zhang Y, et al. Multicenter validation of a gene expression-based prognostic signature in lymph node-negative primary breast cancer. J Clin Oncol. 2006;(20) doi: 10.1200/JCO.2005.03.9115.
6. Hannemann J, Oosterkamp HM, Bosch CA, et al. Changes in gene expression associated with response to neoadjuvant chemotherapy in breast cancer. J Clin Oncol. 2005;23(15):3331–3342. doi: 10.1200/JCO.2005.09.077.
7. Jansen MP, Foekens JA, van Staveren IL, et al. Molecular classification of tamoxifen-resistant breast carcinomas by gene expression profiling. J Clin Oncol. 2005;23(4):732–740. doi: 10.1200/JCO.2005.05.145.
8. Paik S, Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. New Engl J Med. 2004;351(27):2817–2826. doi: 10.1056/NEJMoa041588.
9. Sotiriou C, Neo SY, McShane LM, et al. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci USA. 2003;100(18):10393–10398. doi: 10.1073/pnas.1732912100.
10. van de Vijver MJ, He YD, van ’t Veer LJ, et al. A gene-expression signature as a predictor of survival in breast cancer. New Engl J Med. 2002;347(25):1999–2009. doi: 10.1056/NEJMoa021967.
11. Wang Y, Klijn JG, Zhang Y, et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005;365(9460):671–679. doi: 10.1016/S0140-6736(05)17947-1.
12. Dai H, van ’t Veer L, Lamb J, et al. A cell proliferation signature is a marker of extremely poor outcome in a subpopulation of breast cancer patients. Cancer Res. 2005;65(10):4059–4066. doi: 10.1158/0008-5472.CAN-04-3953.
13. Hedenfalk I, Duggan D, Chen Y, et al. Gene-expression profiles in hereditary breast cancer. New Engl J Med. 2001;344(8):539–548. doi: 10.1056/NEJM200102223440801.
14. Oh DS, Troester MA, Usary J, et al. Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. J Clin Oncol. 2006;(27) doi: 10.1200/JCO.2005.03.2755.
15. Sotiriou C, Wirapati P, Loi S, et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006;98(4):262–272. doi: 10.1093/jnci/djj052.
16. Chang HY, Sneddon JB, Alizadeh AA, et al. Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumours and wounds. PLoS Biol. 2004;2(2):E7. doi: 10.1371/journal.pbio.0020007.
17. Chi JT, Wang Z, Nuyten DS, et al. Gene expression programs in response to hypoxia: cell type specificity and prognostic significance in human cancers. PLoS Med. 2006;3(3):e47. doi: 10.1371/journal.pmed.0030047.
18. Chang HY, Nuyten DS, Sneddon JB, et al. Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc Natl Acad Sci USA. 2005;102(10):3738–3743. doi: 10.1073/pnas.0409462102.
19. Dvorak HF. Tumours: wounds that do not heal. Similarities between tumour stroma generation and wound healing. New Engl J Med. 1986;315(26):1650–1659. doi: 10.1056/NEJM198612253152606.
20. Iyer VR, Eisen MB, Ross DT, et al. The transcriptional program in the response of human fibroblasts to serum. Science. 1999;283(5398):83–87. doi: 10.1126/science.283.5398.83.
21. Brown JM, Wilson WR. Exploiting tumour hypoxia in cancer treatment. Nat Rev Cancer. 2004;4(6):437–447. doi: 10.1038/nrc1367.
22. Nuyten D, Kreike B, Hart AA, et al. Predicting a local recurrence after breast-conserving therapy by gene expression profiling. Breast Cancer Res. 2006;8(5):R62. doi: 10.1186/bcr1614.
23. Liu R, Wang X, Chen GY, et al. The prognostic role of a gene signature from tumourigenic breast-cancer cells. New Engl J Med. 2007;356(3):217–226. doi: 10.1056/NEJMoa063994.
24. Miller LD, Smeds J, George J, et al. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci USA. 2005;102(38):13550–13555. doi: 10.1073/pnas.0506230102.
25. West RB, Nuyten DS, Subramanian S, et al. Determination of stromal signatures in breast carcinoma. PLoS Biol. 2005;3(6):e187. doi: 10.1371/journal.pbio.0030187.
26. Buyse M, Loi S, van ’t Veer L, et al. Validation and clinical utility of a 70-gene prognostic signature for women with node-negative breast cancer. J Natl Cancer Inst. 2006;98(17):1183–1192. doi: 10.1093/jnci/djj329.
27. Desmedt C, Piette F, Loi S, et al. Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin Cancer Res. 2007;13(11):3207–3214. doi: 10.1158/1078-0432.CCR-06-2765.
28. Adler AS, Lin M, Horlings H, et al. Genetic regulators of large-scale transcriptional signatures in cancer. Nat Genet. 2005;(5) doi: 10.1038/ng1752.
29. Wong DJ, Nuyten DS, Regev A, et al. Revealing targeted therapy for human cancer by gene module maps. Cancer Res. 2008;68(2):369–378. doi: 10.1158/0008-5472.CAN-07-0382.
30. Sorlie T, Tibshirani R, Parker J, et al. Repeated observation of breast tumour subtypes in independent gene expression data sets. Proc Natl Acad Sci USA. 2003;100(14):8418–8423. doi: 10.1073/pnas.0932692100.
31. MINDACT website. 2006. http://www.eortc.be/services/unit/mindact/default.asp
32. TailorX Trial website at NCI. 2008. http://www.cancer.gov/clinicaltrials/digestpage/TAILORx
33. Hu Z, Fan C, Oh DS, et al. The molecular portraits of breast tumours are conserved across microarray platforms. BMC Genom. 2006;7:96. doi: 10.1186/1471-2164-7-96.
34. Ma XJ, Wang Z, Ryan PD, et al. A two-gene expression ratio predicts clinical outcome in breast cancer patients treated with tamoxifen. Cancer Cell. 2004;5(6):607–616. doi: 10.1016/j.ccr.2004.05.015.
35. Fan C, Oh DS, Wessels L, et al. Concordance among gene-expression-based predictors for breast cancer. New Engl J Med. 2006;355(6):560–569. doi: 10.1056/NEJMoa052933.
36. Sotiriou C, Wirapati P, Loi S, et al. Comprehensive analysis integrating both clinicopathological and gene expression data. Proc Am Soc Clin Oncol. 24(18S)