Abstract
Introduction:
There are no validated molecular methods that prospectively identify patients with surgically resected lung squamous cell carcinoma (SCC) at high risk for recurrence. By focusing on the expression of genes with known functions in development of lung SCC and prognosis, we sought to develop a robust prognostic classifier of early-stage lung SCC.
Methods:
The expression of 253 genes selected by literature search was evaluated in microarrays from 107 stage I/II tumors. Associations with survival were evaluated by Cox regression and Kaplan-Meier survival analyses in two independent cohorts of 121 and 91 patients with SCC, respectively. A classifier score based on multivariable Cox regression was derived and examined in six additional publicly available data sets of stage I/II lung SCC expression profiles (n = 358). The prognostic value of this classifier was evaluated in meta-analysis of patients with stage I/II (n = 479) and stage I (n = 326) lung SCC.
Results:
Dual specificity phosphatase 6 gene (DUSP6) and actinin alpha 4 gene (ACTN4) were associated with prognostic outcome in two independent patient cohorts. Their expression values were used to develop a classifier that identified patients with stage I/II lung SCC at high risk for recurrence (hazard ratio [HR] = 4.7, p = 0.018) or cancer-specific mortality (HR = 3.5, p = 0.016). This classifier also identified patients at high risk for recurrence (HR = 2.7, p = 0.008) or death (HR = 2.2, p = 0.001) in publicly available data sets of stage I/II patients and in meta-analysis of stage I patients.
Conclusions:
We have established and validated a prognostic classifier to inform clinical management of patients with lung SCC after surgical resection.
Keywords: Lung squamous cell carcinoma, Prognostic classifier, Biomarker, Microarray, Gene expression
Introduction
Approximately 1.8 million new lung cancer cases are diagnosed annually worldwide. In the United States, lung cancer accounts for 13% of all cancers.1 Squamous cell carcinoma (SCC) is one of the most common histological subtypes of non-small cell lung cancer (NSCLC), accounting for up to 30% of all cases.2 SCC is commonly detected in heavy smokers, in whom the risk for development of lung cancer is closely correlated with tobacco consumption.3 Chronic obstructive pulmonary disease (COPD) is also a risk factor.4–6 The National Lung Screening Trial, which compared low-dose computed tomography with chest radiography, demonstrated a statistically significant mortality benefit of low-dose computed tomography screening.7 However, there was no benefit for patients with lung SCC despite an increase in the detection of early-stage tumors.8,9 Other studies examining volume doubling time during follow-up have found that volume doubling times were significantly shorter for SCC than for adenocarcinomas (ADCs), indicating faster growth of early SCC lesions.8 Surgical resection is the recommended treatment for stage I NSCLC. However, up to approximately 30% of patients will experience recurrence and die within 5 years of surgery.10 Patients with resected stage II and IIIA NSCLC are eligible for adjuvant chemotherapy; however, its efficacy for stage I patients is still ambiguous.11 In addition, there are currently no validated methods that prospectively identify the 30% of patients at high risk for recurrence after surgery. In view of the high rate of relapse and the lack of predictive biomarkers, it is critical to develop biomarkers that can identify high-risk patients with early-stage lung SCC who may benefit from adjuvant chemotherapy or immunotherapy.12
Previously, we established a four-gene signature that identified patients with stage I lung ADC at high risk for recurrence.13,14 Perhaps reflecting the different molecular and cell biology of ADC and SCC,2,15–20 this classifier was not predictive of outcome in SCC.13 This observation is consistent with recent reports suggesting that it is difficult to establish a universal gene signature in lung cancer.21,22 One reason could be the small number of SCC cases included in previous efforts. Others include heterogeneity across patient cohorts, different probe sets on arrays, different platforms, and indeed the diverse range of normalization and gene selection algorithms. Many prognostic factors for early-stage NSCLC have been reported. However, they mostly focus on ADC, and little is known about risk factors for early-stage SCC. To maximize the potential for developing a biologically and mechanistically relevant classifier, we focused our analysis on 253 genes with known mechanistic roles in lung SCC and histological discrimination related to smoking and/or associated with lung cancer prognosis. We established a gene classifier in 212 patients from three cohorts and validated its prognostic value in 358 patients from six published studies of SCC.
Material and Methods
Patients and Tissue Samples
We analyzed 212 tumor samples from three cohorts of patients with lung SCC from National Cancer Center Hospital in Tokyo, Japan (the Japan cohort [n = 121]), the metropolitan Baltimore, Maryland, area of the United States (the NCI-MD cohort [n = 73]), and the Haukeland University Hospital in Bergen, Norway (the Norway cohort [n = 18]). The Japan cohort was collected between 1997 and 2008. The NCI-MD cohort was collected between 1987 and 2009. The Norway cohort was collected between 1988 and 2003. Tumors were pathologically classified according to the seventh edition of the TNM classification. Eligibility criteria included not having received any neoadjuvant therapy and not having received a diagnosis of cancer in the 5 years before diagnosis of lung SCC. Patients completed questionnaires that included demographic and exposure variables, all of which were self-reported. Studies were approved by the institutional review boards for the National Institutes of Health, the Regional Committees for Medical and Health Research Ethics in Norway, and the National Cancer Center at Japan. Demographic and clinical characteristics of these patient cohorts are listed in Table 1. The NCI-MD and Norway cohorts showed similar 5-year survival rates (Norway, 61.1%; United States, 64.6%; p = 0.4), sex, and age at diagnosis. Thus, to increase the statistical power for all further analyses, they were combined. Our study follows the Reporting Recommendations for Tumor Marker Prognostic Studies and the guidelines set forth to evaluate prognostic lung cancer signatures.12,23
Table 1.
Characteristics of Study Populations in the Japan and NCI-MD/Norway Cohorts
| Characteristic | Japan Cohort (n = 121) | NCI-MD/Norway Cohort (n = 91) |
|---|---|---|
| Age, y | | |
| Mean ± SD | 64.3 ± 6.6 | 67.1 ± 8.1 |
| Range | 49–83 | 43–85 |
| Sex, n (%) | | |
| Male | 108 (89.2) | 64 (70.3) |
| Female | 13 (10.8) | 27 (29.7) |
| Race, n (%) | | |
| White | 0 (0.0) | 69 (75.8) |
| African American | 0 (0.0) | 22 (24.2) |
| Asian | 121 (100.0) | 0 (0.0) |
| Smoking history, n (%) | | |
| Never | 4 (3.3) | 1 (1.1) |
| Former | 51 (42.1) | 37 (40.7) |
| Current | 66 (54.6) | 45 (49.4) |
| Unknown | 0 (0.0) | 8 (8.8) |
| Smoking, pack-years | | |
| Mean ± SD | 46.4 ± 22.7 | 52.4 ± 33.2 |
| Range | 0–161 | 0–164 |
| Histological subtype, n (%) | | |
| Squamous cell carcinoma | 121 (100.0) | 91 (100.0) |
| Stage, n (%)ᵃ | | |
| I | 59 (48.8) | 50 (54.9) |
| IA | 26 (21.5) | 23 (25.3) |
| IB | 33 (27.3) | 27 (29.6) |
| II | 62 (51.2) | 41 (45.1) |
| IIA | 33 (27.3) | NA |
| IIB | 29 (24.0) | NA |

ᵃ Cases were restaged to the American Joint Committee on Cancer classification, seventh edition, on the basis of tumor size and/or pathology reports where possible.
NA, data not available.
RNA Isolation
Primary lung tumors were snap-frozen immediately after surgery and stored at –80°C. RNA was extracted from frozen tissue samples in the Japan cohort and in the NCI-MD/Norway cohort using TRIzol (Invitrogen, Carlsbad, CA). RNA quality was assessed using the Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA).
GeneChip Human Transcriptome Array 2.0 Analyses
All samples included in this study had RNA integrity numbers higher than 6.0 and were hybridized to the Human Transcriptome Array 2.0 (Affymetrix, Santa Clara, CA) according to the manufacturer’s recommendations. This platform detects 44,699 protein coding gene transcript clusters and 22,829 nonprotein coding gene transcript clusters. Data normalization and signal summarization were performed using Expression Console software (Affymetrix). Microarray data have been deposited at National Center for Biotechnology Information Gene Expression Omnibus and are accessible through Series accession number GSE74777.24
mRNA qRT-PCR
TaqMan Gene Expression Assays (Applied Biosystems, Foster City, CA) were loaded into 96.96 Dynamic Arrays (Fluidigm Corporation, San Francisco, CA) in triplicate, and quantitative real-time PCR (qRT-PCR) reactions were carried out using the BioMark Real-Time PCR system (Fluidigm Corporation) according to the manufacturer’s instructions.14 TaqMan probes for 20 genes (aldehyde dehydrogenase 1 family member A1 gene [ALDH1A1] Hs00946916_m1, fibrillin 2 gene [FBN2] Hs00417208_m1, CD99 molecule gene [CD99] Hs00908458_m1, actinin alpha 4 gene [ACTN4] Hs00245168_m1, DLG associated protein 5 gene [DLGAP5] Hs00207323_m1, dual specificity phosphatase 6 gene [DUSP6] Hs04329643_s1, ERCC excision repair 1, endonuclease non-catalytic subunit gene [ERCC1] Hs01012158_m1, survival motor neuron domain containing 1 gene [SMNDC1] Hs01090302, cullin 3 gene [CUL3] Hs00180183_m1, BRCA1 associated RING domain 1 gene [BARD1] Hs00184427_m1, phosphatase and tensin homolog gene [PTEN] Hs02621230_s1, ATP binding cassette subfamily C member 1 gene [ABCC1] Hs01561502_m1, hypoxia inducible factor 1 alpha subunit gene [HIF1A] Hs00153153_m1, regenerating family member 1 alpha gene [REG1A] Hs00984887_g1, SRY-box 2 gene [SOX2; per Human Genome Organization database, elsewhere sex-determining region Y-box 2] Hs01053049_s1, serpin family G member 1 gene [SERPING1] Hs00163781_m1, programmed cell death 1 ligand 2 gene [PDCD1LG2] Hs01057777_m1, C-C motif chemokine ligand 22 gene [CCL22] Hs01574247_m1, patched 1 gene [PTCH1] Hs00181117_m1, and prolyl 4-hydroxylase subunit alpha 1 gene [P4HA1] Hs00914594_m1) and 18S (identifier Hs03003631_g1; normalization control) were used.14 After 16 cycles of preamplification using a pool of 20 probes, a 35-cycle amplification step was performed. Delta cycle threshold (ΔCt) values were calculated. Signals over 30 cycles were deemed undetectable and treated as missing data.
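The ΔCt normalization and the 30-cycle detection cutoff described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and both averaging the triplicate wells and applying the cutoff to each well individually are assumptions, since the text does not specify how replicates were combined.

```python
from statistics import mean

CT_CUTOFF = 30.0  # signals over 30 cycles are treated as undetectable (from the text)


def mean_detectable_ct(replicates):
    """Average the replicate Ct values that fall at or below the cutoff.

    Returns None (missing data) when no replicate is detectable.
    Averaging triplicates is an assumption made for this sketch.
    """
    detected = [ct for ct in replicates if ct is not None and ct <= CT_CUTOFF]
    return mean(detected) if detected else None


def delta_ct(target_replicates, reference_replicates):
    """Delta Ct = Ct(target gene) - Ct(reference gene, here 18S)."""
    target = mean_detectable_ct(target_replicates)
    reference = mean_detectable_ct(reference_replicates)
    if target is None or reference is None:
        return None  # propagate missing data
    return target - reference


# Made-up Ct values for one sample, for illustration only.
example = delta_ct([22.0, 22.4, 21.8], [10.0, 10.2, 9.8])
```

A lower ΔCt corresponds to higher relative expression, so any downstream dichotomization has to account for the direction of the scale.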
Statistical Analysis and Gene Classifier Development
Associations between gene expression and survival were evaluated by using the log-rank test in GraphPad Prism v5.0 (GraphPad Software, La Jolla, CA) on patients dichotomized on the basis of the median expression value for each gene. Survival curves were drawn using the method of Kaplan and Meier. Hazard ratios (HRs) were estimated using Cox proportional hazards regression in IBM SPSS Statistics 21 (IBM, Inc., Armonk, NY). Coefficients from multivariable models that included continuous expression values for dual specificity phosphatase 6 gene (DUSP6) and ACTN4 from the Japan cohort were used to build the two-gene classifier score that was subsequently applied to all validation cohorts. Forest plot analyses were performed using Review Manager 5 (The Cochrane Information Management System [Cochrane Collaboration, London, United Kingdom]). A heterogeneity test for the combined HR was carried out using the I² statistic.25 Functional regulatory gene and protein interactions based on gene expression data were evaluated with Ingenuity Pathway Analysis. Hierarchical clustering analysis was performed using Genesis v.1.7.6 software (Institute for Genomics and Bioinformatics Graz, Graz, Austria) with Pearson correlation and complete linkage. All statistical associations were evaluated using univariable and multivariable models adjusting for clinically relevant risk factors such as age, smoking status, smoking pack-years, and stage. We present results and coefficients of our final model in sufficient detail to allow readers to easily test the prognostic classifier in additional patient populations.
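For readers who want to reproduce the first two steps outside of commercial software, the median dichotomization and the Kaplan-Meier product-limit estimate can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the handling of values equal to the median (assigned to the low group) is an assumption, and the analyses reported here were run in GraphPad Prism and SPSS rather than with this code.

```python
from statistics import median


def dichotomize_by_median(values):
    """Label each expression value 'high' or 'low' relative to the median.

    Values equal to the median are labeled 'low' (an assumption).
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimator.

    times  : follow-up time for each patient
    events : 1 if the event (e.g., relapse) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy data for illustration only (not patient data).
groups = dichotomize_by_median([5.1, 7.3, 6.2, 8.8])
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Running `kaplan_meier` separately on the high and low groups gives the two curves that a log-rank test would then compare.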
Publicly Available Gene Expression Data Sets
In June 2015 we searched Gene Expression Omnibus using the search terms lung cancer, non-small cell lung cancer, lung squamous carcinoma, and NSCLC and also searched ONCOMINE (Thermo Fisher Scientific, Ann Arbor, MI)26 to identify public microarray data sets of patients with lung SCC with clinical follow-up. Selection criteria for all publicly available data sets required that each data set include sufficient survival information for more than 40 patients with TNM stage I or II SCC and have expression data for DUSP6 and ACTN4. Six publicly available microarray data sets were analyzed and used for validation of the prognostic signature: Korea cohort (Oncomine: Affymetrix),27 U.S.-M. D. Anderson cohort (GSE41271: Illumina),28 France cohort (GSE30219: Affymetrix),29 Sweden cohort (GSE37745: Affymetrix),30 U.S.-Duke cohort (Oncomine: Affymetrix),31 and U.S.-Michigan cohort (GSE4573: Affymetrix).32 Demographic and clinical characteristics of these patient cohorts are found in Supplementary Table 1. In each published cohort, cases that received adjuvant therapy were excluded from the analysis. Normalized expression values were obtained from each data set and were not processed further. If more than one probe was selected for a gene, their values were averaged. To build the gene signature, Affymetrix probes (ACTN4: 200601_at; DUSP6: 208891_at, 208892_s_at, 208893_s_at) and Illumina probes (ACTN4: ILMN_1725534; DUSP6: ILMN_1677466, ILMN_2396020) were used. The two-gene classifier score was calculated for each sample in the publicly available data sets, and samples were categorized as classifier low, medium, or high within each cohort separately. The within-cohort categorization was performed to standardize risk scores across all cohorts and compensate for the fact that each study used different methodologies to measure the expression of each of the two genes.13
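The probe-averaging step can be sketched as follows. The probe identifiers are those listed above; the function name, the dictionary layout of a sample, and the example values are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean

# Probe identifiers as listed in the text.
AFFYMETRIX_PROBES = {
    "ACTN4": ["200601_at"],
    "DUSP6": ["208891_at", "208892_s_at", "208893_s_at"],
}
ILLUMINA_PROBES = {
    "ACTN4": ["ILMN_1725534"],
    "DUSP6": ["ILMN_1677466", "ILMN_2396020"],
}


def gene_expression(sample, probes):
    """Average the normalized values of all probes mapped to one gene.

    sample : dict mapping probe identifier -> normalized expression value
             (this layout is an assumption made for the sketch)
    probes : list of probe identifiers for the gene
    """
    return mean(sample[p] for p in probes)


# Made-up normalized values for one Affymetrix sample.
sample = {"200601_at": 11.2, "208891_at": 8.0, "208892_s_at": 9.0, "208893_s_at": 10.0}
dusp6 = gene_expression(sample, AFFYMETRIX_PROBES["DUSP6"])
actn4 = gene_expression(sample, AFFYMETRIX_PROBES["ACTN4"])
```

Because each platform reports expression on its own scale, the resulting values are only comparable within a cohort, which is why the categorization is done cohort by cohort.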
Results
Seven Genes Are Associated with RFS of Early-Stage (Stage I + II) Lung SCC in the Japan Cohort
Our strategy for establishing the coding gene classifier is represented in Supplementary Figure 1. In all, 253 genes were selected on the basis of a literature search for evidence of their association with lung SCC biology or patient prognosis (Supplementary Table 2). We analyzed Human Transcriptome Array 2.0 data from a subset of 107 patients in the Japan cohort with early-stage (stage I + II [American Joint Committee on Cancer, seventh edition]) lung SCC and examined associations of the 260 probes (corresponding to 253 genes) with relapse-free survival (RFS). In univariable Cox regression, the 20 genes most significantly associated with RFS were selected for technical validation by qRT-PCR in the same sample population (Supplementary Table 2). When TaqMan probes were used, qRT-PCR measurements significantly correlated with the microarray data (p < 0.05) for all of the 20 genes (Supplementary Fig. 2). Expression values for each gene obtained by qRT-PCR were dichotomized on the basis of median values. Seven of the 20 genes measured were significantly associated with RFS in Cox regression univariable models (Supplementary Table 3), thus validating our microarray results.
ACTN4 and DUSP6 Are Associated with Cancer-Specific Mortality in the NCI-MD/Norway Cohort
The expression of seven genes was measured by qRT-PCR in the combined NCI-MD/Norway cohort (stage I + II, n = 91). DUSP6 (HR = 2.64, 95% confidence interval [CI]: 1.19–5.85, p = 0.017) and ACTN4 (HR = 2.68, 95% CI: 1.21–5.92, p = 0.015) were each significantly associated with cancer-specific mortality in univariable Cox regression models (Supplementary Table 3). In addition, high expression of DUSP6 or ACTN4 identified patients with worse prognosis in the Kaplan-Meier analysis of the Japan and NCI-MD/Norway cohorts (Fig. 1A and B).
Figure 1.

Kaplan-Meier survival analysis of actinin alpha 4 gene (ACTN4) and dual specificity phosphatase 6 gene (DUSP6) expression in early-stage (stage I + II) lung squamous cell carcinoma in (A) Japan (n = 121, 5-year relapse-free survival [RFS]) and (B) National Cancer Institute-metropolitan Baltimore, Maryland, area/Norway (n = 91, 5-year cancer-specific mortality). Kaplan-Meier survival analysis of the two-gene classifier in early-stage (stage I + II) lung squamous cell carcinoma in (C) Japan (n = 121, 5-year RFS) and (D) National Cancer Institute-metropolitan Baltimore, Maryland, area/Norway (n = 91, 5-year cancer-specific mortality).
Development of a Combined DUSP6 and ACTN4 Classifier for Prognosis of SCC
To establish a robust prognostic classifier for patients with lung SCC, we conducted a multivariable Cox regression analysis based on linear expression values for ACTN4 and DUSP6 in the Japan cohort. The resulting coefficients were incorporated into a classifier score as follows: (0.590 × DUSP6) + (0.550 × ACTN4). Patients were categorized in tertiles (low, medium, and high) on the basis of tumor score values. The two-gene classifier was significantly associated with prognosis in stage I + II patients in a multivariable Cox regression model adjusted for stage, age, sex, pack-years of smoking, and smoking status (HR for high vs. low score = 4.66, 95% CI: 1.30–16.69, p = 0.018) (Table 2), and a high score identified high-risk patients by Kaplan-Meier analysis (Fig. 1C).
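The scoring and tertile steps described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the inputs are assumed to be on the same expression scale that was used to fit the Cox model, and the rank-based split (with any remainder assigned to the lower groups first) is an assumption chosen because it yields group sizes of 41/40/40 for n = 121.

```python
# Coefficients of the two-gene classifier score, as reported in the text.
COEF_DUSP6 = 0.590
COEF_ACTN4 = 0.550


def classifier_score(dusp6, actn4):
    """Two-gene score: (0.590 x DUSP6) + (0.550 x ACTN4)."""
    return COEF_DUSP6 * dusp6 + COEF_ACTN4 * actn4


def tertile_labels(scores):
    """Assign 'low', 'medium', or 'high' to each score within one cohort.

    Samples are ranked by score and split into three groups of near-equal
    size; leftover samples go to the lower groups first (an assumption).
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    base, extra = divmod(n, 3)
    sizes = [base + (1 if k < extra else 0) for k in range(3)]
    labels = [None] * n
    start = 0
    for name, size in zip(("low", "medium", "high"), sizes):
        for i in order[start:start + size]:
            labels[i] = name
        start += size
    return labels


# Made-up expression values for three tumors, for illustration only.
scores = [classifier_score(d, a) for d, a in [(7.9, 10.1), (9.4, 11.0), (8.6, 10.4)]]
labels = tertile_labels(scores)
```

Applying `tertile_labels` to each cohort separately mirrors the within-cohort categorization used for the public data sets.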
Table 2.
Univariable and Multivariable Cox Regression Analysis of the Two-Gene Classifier in Japan and NCI-MD/Norway Cohorts
| Variables | n | Univariable HR (95% CI) | Univariable p Value | Multivariableᵃ HR (95% CI) | Multivariableᵃ p Value |
|---|---|---|---|---|---|
| Japan cohort (outcome: RFS) | 121 | | | | |
| TNM stageᵇ | | | | | |
| I | 59 | Reference | NA | Reference | NA |
| II | 62 | 1.93 (0.82–4.54) | 0.134 | 2.04 (0.77–5.42) | 0.153 |
| Age, y | 121 | 1.02 (0.95–1.08) | 0.632 | 1.02 (0.95–1.09) | 0.644 |
| Sexᶜ | | | | | |
| Female | 13 | Reference | NA | | |
| Male | 108 | 24.44 (0.10–6046) | 0.256 | | |
| Smoking, pack-years | 121 | 1.00 (0.99–1.02) | 0.772 | 1.01 (0.99–1.03) | 0.439 |
| Smoking status | | | | | |
| Never/former | 55 | Reference | NA | Reference | NA |
| Current | 66 | 1.15 (0.51–2.63) | 0.738 | 1.17 (0.49–2.80) | 0.726 |
| 2-Gene classifierᵈ | | | | | |
| Low | 41 | Reference | NA | Reference | NA |
| Medium | 40 | 2.61 (0.68–10.11) | 0.164 | 2.56 (0.64–10.31) | 0.185 |
| High | 40 | 5.20 (1.48–18.25) | **0.010** | 4.66 (1.30–16.69) | **0.018** |
| Trend | | | **0.006** | | **0.012** |
| NCI-MD/Norway cohort (outcome: CSM) | 91 | | | | |
| TNM stageᵇ | | | | | |
| I | 50 | Reference | NA | Reference | NA |
| II | 41 | 1.39 (0.66–2.93) | 0.380 | 1.00 (0.46–2.18) | 0.992 |
| Age, y | 91 | 1.01 (0.96–1.06) | 0.736 | 1.00 (0.95–1.05) | 0.948 |
| Sex | | | | | |
| Female | 27 | Reference | NA | Reference | NA |
| Male | 64 | 1.08 (0.48–2.45) | 0.853 | 0.96 (0.41–2.22) | 0.918 |
| Smoking, pack-years | 91 | 1.01 (1.00–1.02) | 0.158 | 1.00 (0.99–1.02) | 0.411 |
| Smoking status | | | | | |
| Never/former | 38 | Reference | NA | Reference | NA |
| Current | 45 | 0.62 (0.28–1.39) | 0.249 | 0.69 (0.29–1.64) | 0.398 |
| 2-Gene classifierᵈ | | | | | |
| Low | 31 | Reference | NA | Reference | NA |
| Medium | 30 | 1.09 (0.35–3.38) | 0.881 | 1.12 (0.34–3.69) | 0.851 |
| High | 30 | 3.73 (1.45–9.58) | **0.006** | 3.49 (1.27–9.60) | **0.016** |
| Trend | | | **0.004** | | **0.008** |

Note: Bold indicates significant values (p < 0.05).
ᵃ Adjusted for stage, age, sex, smoking history, and cohort membership when appropriate.
ᵇ Stage is defined by the American Joint Committee on Cancer classification, seventh edition.
ᶜ Because there were no relapsed cases among females in the Japan cohort, sex was excluded from the multivariable analyses.
ᵈ The 2-gene classifier was categorized on the basis of tertiles.
HR, hazard ratio; CI, confidence interval; RFS, relapse-free survival; NA, not applicable; CSM, cancer-specific mortality.
The classifier was then applied to the NCI-MD/Norway cohort, in which it also was significantly associated with prognosis in a multivariable Cox regression model adjusted for stage, age, sex, smoking status, and pack-years of smoking (HR for high vs. low score = 3.49, 95% CI: 1.27–9.60, p = 0.016) (Table 2) and identified patients with early-stage SCC who were at high risk for cancer-specific death (Fig. 1D).
Prospective classification of patients at high risk for recurrence is important for identifying those who may benefit from adjuvant chemotherapy. Although the association did not reach statistical significance, a high score was also associated with worse outcome when the analysis was restricted to stage I patients from these two cohorts (Supplementary Table 4 and Supplementary Fig. 3). These results provide evidence that the two-gene classifier is robust and could lead to reproducible predictions in ethnically and geographically diverse populations.
Validation of Two-Gene Classifier in Multiple Independent Cohorts of Patients with Lung SCC
The goal of our study was to establish a prognostic gene classifier that is broadly applicable to patients with early-stage lung SCC. Thus, we analyzed two cohorts with reported RFS, the Korea cohort (n = 57) and the U.S.-M. D. Anderson cohort (n = 46), using publicly available microarray gene expression data. The two-gene classifier was associated with the RFS of patients with stage I or II lung SCC from the Korea cohort (HR for high vs. low score = 9.71, 95% CI: 1.57–59.92, p = 0.014) and the U.S.-M. D. Anderson cohort (HR for high vs. low score = 3.73, 95% CI: 1.00–13.88, p = 0.049) (Supplementary Tables 5 and 6 and Supplementary Fig. 4A and B). When these two cohorts were combined, the two-gene classifier was significantly associated with the RFS of stage I + II patients (HR for high vs. low score = 2.70, 95% CI: 1.29–5.66, p = 0.008) (Table 3) as well as with that of stage I patients (HR for high vs. low score = 3.22, 95% CI: 1.30–7.98, p = 0.012) (Supplementary Table 7). The two-gene classifier identified stage I + II as well as stage I patients at high risk for recurrence (Fig. 2A and Supplementary Fig. 5A). The three cohorts that reported RFS information (Japan, Korea, and U.S.-M. D. Anderson) were analyzed in a fixed effects meta-analysis model. There was low heterogeneity or inconsistency across these cohorts (stage I + II, I² = 0% [p = 0.379]; stage I, I² = 0% [p = 0.420]), suggesting that these data are not the result of selection bias. The two-gene classifier consistently identified stage I + II and stage I patients at high risk for recurrence in the three studies (Fig. 2B and C).
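To make the heterogeneity statistic concrete, the sketch below pools hazard ratios with a generic inverse-variance fixed effects model and derives Cochran's Q and I² from it. It is not the computation behind the forest plots, which were produced in Review Manager with the Mantel-Haenszel method, so the pooled value it returns is not a reported result. The function names are hypothetical, the standard errors are back-calculated from the published 95% confidence intervals (an approximation), and the three inputs are the high vs. low hazard ratios quoted in this section and in Table 2.

```python
import math


def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover log(HR) and its standard error from a 95% confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return math.log(hr), se


def fixed_effect_pool(studies):
    """Inverse-variance fixed effects pooling of hazard ratios.

    studies : list of (hr, ci_low, ci_high) tuples
    Returns (pooled HR, Cochran's Q, I-squared in percent).
    """
    estimates = [log_hr_and_se(*s) for s in studies]
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(studies) - 1
    # I-squared is truncated at 0 when Q does not exceed its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled), q, i2


# High vs. low hazard ratios with 95% CIs as quoted in the text.
rfs_studies = [
    (4.66, 1.30, 16.69),  # Japan
    (9.71, 1.57, 59.92),  # Korea
    (3.73, 1.00, 13.88),  # U.S.-M. D. Anderson
]
pooled_hr, q_stat, i_squared = fixed_effect_pool(rfs_studies)
```

I² expresses the share of between-study variation that exceeds what chance alone would produce, which is why wide, overlapping confidence intervals such as these tend to give low values.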
Table 3.
Univariable and Multivariable Cox Regression Analysis of the Two-Gene Classifier in Published Cohorts
| Variables | n | Univariable HR (95% CI) | Univariable p Value | Multivariableᵃ HR (95% CI) | Multivariableᵃ p Value |
|---|---|---|---|---|---|
| Korea/M. D. Anderson cohortᵇ (outcome: RFS) | 103 | | | | |
| TNM stageᶜ | | | | | |
| I | 75 | Reference | NA | Reference | NA |
| II | 28 | 1.33 (0.70–2.51) | 0.380 | 1.73 (0.89–3.35) | 0.105 |
| Age | | 1.03 (0.99–1.07) | 0.106 | 1.03 (0.99–1.07) | 0.183 |
| Sex | | | | | |
| Female | 24 | Reference | NA | Reference | NA |
| Male | 79 | 0.85 (0.43–1.68) | 0.641 | 0.86 (0.43–1.70) | 0.658 |
| 2-Gene classifierᵈ | | | | | |
| Low | 35 | Reference | NA | Reference | NA |
| Medium | 34 | 0.77 (0.33–1.78) | 0.543 | 0.82 (0.35–1.93) | 0.649 |
| High | 34 | 2.84 (1.39–5.78) | **0.004** | 2.70 (1.29–5.66) | **0.008** |
| Trend | | | **0.003** | | **0.003** |
| Combined four cohortsᵉ (outcome: OS) | 255 | | | | |
| TNM stageᶜ | | | | | |
| I | 192 | Reference | NA | Reference | NA |
| II | 63 | 1.21 (0.79–1.84) | 0.385 | 1.23 (0.80–1.88) | 0.349 |
| Age | 255 | 1.01 (0.99–1.03) | 0.195 | 1.02 (1.00–1.04) | 0.086 |
| Sex | | | | | |
| Female | 70 | Reference | NA | Reference | NA |
| Male | 185 | 0.87 (0.58–1.31) | 0.497 | 0.93 (0.62–1.41) | 0.736 |
| 2-Gene classifierᵈ | | | | | |
| Low | 86 | Reference | NA | Reference | NA |
| Medium | 86 | 1.16 (0.71–1.90) | 0.553 | 1.22 (0.75–2.01) | 0.426 |
| High | 83 | 2.17 (1.37–3.45) | **0.001** | 2.23 (1.40–3.56) | **0.001** |
| Trend | | | **0.001** | | **<0.001** |

Note: Bold indicates significant values (p < 0.05).
ᵃ Adjusted for stage, age, sex, smoking history, and cohort membership when appropriate.
ᵇ Cohort consists of Korea (n = 57, Oncomine) and U.S.-M. D. Anderson (n = 46, GSE41271) cohorts, which are publicly available microarray data sets of stage I + II lung squamous cell carcinoma with relapse-free survival information.
ᶜ Stage is defined by American Joint Committee on Cancer classification, sixth edition.
ᵈ The two-gene classifier was categorized on the basis of tertiles.
ᵉ Cohort consists of U.S.-Michigan (n = 107, GSE4573), France (n = 56, GSE30219), Sweden (n = 45, GSE37745), and U.S.-Duke (n = 44, Oncomine) cohorts, which are publicly available microarray data sets of stage I + II lung squamous cell carcinoma with overall survival information.
HR, hazard ratio; CI, confidence interval; RFS, relapse-free survival; NA, not applicable; OS, overall survival.
Figure 2.

Kaplan-Meier survival analysis of the two-gene classifier in early-stage (stage I + II) lung squamous cell carcinoma in (A) two published cohorts with relapse-free survival (RFS) (n = 103, 5-year RFS). Forest plot of the prognostic impact of the two-gene classifier in stage I + II (B) or stage I (C) lung squamous cell carcinoma in three independent cohorts with RFS outcome. M-H, Mantel-Haenszel; CI, confidence interval.
The two-gene classifier was also applied to the four cohorts that reported overall survival (OS) information (France, U.S.-Duke, Sweden, and U.S.-Michigan). In a combined analysis of these published cohorts, the two-gene classifier was significantly associated with OS in patients with stage I or II lung SCC in a multivariable Cox regression model (HR for high vs. low score = 2.23, 95% CI: 1.40–3.56, p = 0.001) (Table 3). This association remained significant when only stage I cases were considered (HR for high vs. low score = 2.17, 95% CI: 1.20–3.91, p = 0.010) (Supplementary Table 7). The two-gene classifier identified high-risk patients in Kaplan-Meier survival analysis of stage I + II as well as stage I cases (Fig. 3A and Supplementary Fig. 5B). A fixed effects meta-analysis of these four cohorts with OS information demonstrated no heterogeneity or inconsistency (stage I + II, I² = 0.0% [p = 0.838]; stage I, I² = 0.0% [p = 0.782]), again indicating a lack of selection bias. The two-gene classifier consistently identified stage I + II and stage I patients at high risk for death in the four cohorts (Fig. 3B and C).
Figure 3.

Kaplan-Meier survival analysis of the two-gene classifier in early-stage (stage I + II) lung squamous cell carcinoma in (A) four published cohorts with overall survival (OS) (n = 255, 5-year OS). Forest plot of the prognostic impact of the two-gene classifier in stage I + II (B) or stage I (C) lung squamous cell carcinoma in four independent cohorts with OS outcome. M-H, Mantel-Haenszel; CI, confidence interval.
Pathway Analysis
To understand the biological effects of ACTN4 and DUSP6 on the progression of lung SCC, we evaluated the gene expression profiles associated with high versus low expression of each or both of those genes and explored pathways based on these comparisons. We identified 1563 transcripts showing differential expression among tumors with high expression of both ACTN4 and DUSP6 (p < 0.001). Hierarchical clustering of all patients on the basis of these differentially expressed transcripts resulted in accurate discrimination of patient outcome, suggesting that the two-gene signature identifies molecular subsets of patients with clinical relevance (Supplementary Fig. 6). Characterization of associated canonical pathways by Ingenuity Pathway Analysis revealed 11 pathways linked to high expression of ACTN4, 13 pathways linked to high expression of DUSP6, and 10 pathways linked to high expression of both genes (p < 0.0001) (Supplementary Fig. 7 and Supplementary Tables 8–10). Canonical pathways connected with both ACTN4 and DUSP6 included integrin, actin cytoskeleton, paxillin, and ERK/MAPK signaling, all of which may lead to cancer cell migration, invasion, and proliferation. Pathways connected with DUSP6 included p21-activated kinase, neuregulin, and ErbB signaling and globally encompass proteins with signal transducing roles as serine/threonine protein kinases and receptor tyrosine kinases. Finally, pathways connected with ACTN4 included regulation of cellular mechanics by calpain protease, epithelial adherens junctions, and protein tyrosine kinase 2, which are involved in tissue remodeling and cell motility.
Discussion
Our objective was to establish a prognostic gene classifier for early-stage lung SCC to guide clinical decisions. Herein, we established and validated a prognostic gene classifier in a total of 570 patients with stage I + II lung SCC. The relationship of the two-gene classifier with prognosis was significant in resected patients across ethnically and geographically diverse populations, suggesting that this classifier has the potential to identify high-risk patients who may benefit from adjuvant chemotherapy. The current standard therapy for early-stage (stage I and II) NSCLC is lobectomy with mediastinal lymph node resection. Adjuvant chemotherapy in resected NSCLC has been recommended for stage II and IIIA patients, but its utility has not been proven for stage I patients,11 perhaps in part because adjuvant chemotherapy trials have been conducted in unselected patient populations.
We recently established a robust classifier consisting of four genes that was predictive of outcome in lung ADC, but not in SCC.13,14 Thus, we aimed to develop a classifier specific for the SCC histological subtype. We propose that a two-gene classifier comprising DUSP6 and ACTN4 can be used to guide therapeutic decisions for patients with early-stage lung SCC.
DUSP6 is one of five genes in a prognostic signature previously associated with outcome of NSCLC.33 In that study as in ours, DUSP6 expression was adversely associated with patient outcome. In addition, DUSP6 was recently identified as a biomarker of poor prognosis in hepatocellular carcinoma.34 Paradoxically, DUSP6 is a member of the mitogen-activated protein kinase phosphatase family that inactivates extracellular signal-regulated kinase, which is an activity consistent with tumor suppression. However, DUSP6 overexpression has been observed in several cancers and functionally correlated with aggressive tumor behavior and malignant phenotypes.35,36 In addition, silencing of DUSP6 resulted in increased sensitivity to cytotoxic drugs and reduced cancer cell proliferation.37,38 Thus, it has been proposed that DUSP6 modulates the DNA damage response and may have opposing roles depending on the cellular context.37 Gene expression profiles in tumors with high DUSP6 were associated with a family of serine/threonine protein kinases and receptor tyrosine kinases, including ErbB family and downstream signaling mediators, which may promote cancer cell invasion and oppose apoptotic programs.
Actinin-4 is predominantly expressed in the cellular protrusions that stimulate the invasive phenotype of cancer cells and is essential for formation of cellular protrusions such as filopodia and lamellipodia.39–42 ACTN4 amplification and high expression are frequently observed in patients with carcinomas of the pancreas, ovary, lung, and salivary gland, and patients with ACTN4 amplifications and high ACTN4 expression have worse outcomes than patients without amplification, which is consistent with our observations.41,43–45 Gene expression profiles in tumors with high ACTN4 were associated with epithelial adherens junction and protein tyrosine kinase 2, which may lead to cell migration and invasion through tissue remodeling and cell motility.
The two-gene classifier is composed of biologically and mechanistically relevant genes that are functionally important and each significantly associated with prognosis and recurrence in early-stage lung SCC. The classifier predicted both RFS and OS in multiple cohorts of stage I + II and stage I lung SCC. The results appear to be independent of race, gene expression platform, and clinical background. Furthermore, in meta-analysis the two-gene classifier correlated more strongly with RFS and cancer-specific mortality than with OS. One possible explanation is death from competing causes: COPD and cardiovascular disease contributed to 20% of the deaths in the NCI-MD/Norway cohort, and they are frequent comorbidities of patients with lung SCC.5,46,47 In particular, COPD status is one of the most important prognostic factors in patients with NSCLC according to the Surveillance, Epidemiology, and End Results program data.48
Future work will be needed to identify the optimal cutpoint for this classifier that can discriminate high-risk from low-risk patients in a clinical setting. Additionally, efforts will be focused on developing standardized assays that are applicable to formalin-fixed paraffin-embedded tissues, as these are available in routine clinical practice. Moreover, as our recent work showed that the combination of multiple molecularly and functionally distinct biomarkers (such as non-coding RNA, methylation status and genomic alterations) can improve the detection of high-risk patients, additional studies should explore this possibility in lung SCC.49
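The cutpoint question raised above can be made concrete with a minimal sketch. It assumes the classifier score takes the usual form of a Cox linear predictor (a weighted sum of the two expression values) and dichotomizes patients at a candidate cutpoint, here the cohort median. The coefficients, expression values, and function names are hypothetical placeholders chosen for illustration; they are not the values derived in this study.

```python
from statistics import median

# Hypothetical Cox coefficients (placeholders, NOT the published values).
COEFFICIENTS = {"DUSP6": 0.5, "ACTN4": 0.25}


def classifier_score(expression):
    """Linear predictor: sum of coefficient * expression over both genes."""
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)


def assign_risk_groups(scores, cutpoint):
    """Label each score 'high' if it exceeds the cutpoint, otherwise 'low'."""
    return ["high" if s > cutpoint else "low" for s in scores]


# Made-up log2 expression values for four illustrative tumors.
tumors = [
    {"DUSP6": 8.0, "ACTN4": 10.0},
    {"DUSP6": 6.0, "ACTN4": 8.0},
    {"DUSP6": 10.0, "ACTN4": 12.0},
    {"DUSP6": 4.0, "ACTN4": 8.0},
]

scores = [classifier_score(t) for t in tumors]

# One candidate cutpoint: the median score of the cohort.
median_cut = median(scores)
groups = assign_risk_groups(scores, median_cut)
```

A median split is only one candidate. For clinical use, the cutpoint would presumably need to be fixed in a training cohort and then evaluated unchanged in independent cohorts.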
In conclusion, we have developed and validated a two-gene classifier in multiple large-scale and geographically diverse cohorts of patients with early-stage lung SCC. This classifier could be used to identify the approximately 30% of patients with early-stage lung SCC who remain at high risk of recurrence and to guide their clinical management.
Acknowledgments
This research was supported by the Intramural Research Program of the National Cancer Institute, National Institutes of Health, the Department of Defense Congressionally Directed Medical Research Program (Grant PR093793), the Health Research Board [CPFP/2012/2] (NW), the Norwegian Cancer Society, and the National Cancer Center Research and Development Fund (26A-1: NCC Biobank). The authors thank Ms. Yoko Shimada, Koji Tsuta, and Shun-ichi Watanabe for sample preparation.
Footnotes
Disclosure: The authors declare no conflict of interest.
Supplementary Data
Note: To access the supplementary material accompanying this article, visit the online version of the Journal of Thoracic Oncology at www.jto.org and at http://dx.doi.org/10.1016/j.jtho.2016.08.141.
References
- 1. Torre LA, Bray F, Siegel RL, et al. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65:87–108.
- 2. Travis WD. Pathology of lung cancer. Clin Chest Med. 2011;32:669–692.
- 3. Pesch B, Kendzia B, Gustavsson P, et al. Cigarette smoking and lung cancer-relative risk estimates for the major histological types from a pooled analysis of case-control studies. Int J Cancer. 2012;131:1210–1219.
- 4. Young RP, Hopkins RJ, Christmas T, et al. COPD prevalence is increased in lung cancer, independent of age, sex and smoking history. Eur Respir J. 2009;34:380–386.
- 5. Tammemagi CM, Neslund-Dudas C, Simoff M, et al. Impact of comorbidity on lung cancer survival. Int J Cancer. 2003;103:792–802.
- 6. Papi A, Casoni G, Caramori G, et al. COPD increases the risk of squamous histological subtype in smokers who develop non-small cell lung carcinoma. Thorax. 2004;59:679–681.
- 7. National Lung Screening Trial Research Team, Aberle DR, Adams AM, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365:395–409.
- 8. Wilson DO, Ryan A, Fuhrman C, et al. Doubling times and CT screen-detected lung cancers in the Pittsburgh Lung Screening Study. Am J Respir Crit Care Med. 2012;185:85–89.
- 9. Pinsky PF, Church TR, Izmirlian G, et al. The National Lung Screening Trial: results stratified by demographics, smoking history, and lung cancer histology. Cancer. 2013;119:3976–3983.
- 10. Chansky K, Sculier JP, Crowley JJ, et al. The International Association for the Study of Lung Cancer Staging Project: prognostic factors and pathologic TNM stage in surgically managed non-small cell lung cancer. J Thorac Oncol. 2009;4:792–801.
- 11. Pignon JP, Tribodet H, Scagliotti GV, et al. Lung adjuvant cisplatin evaluation: a pooled analysis by the LACE Collaborative Group. J Clin Oncol. 2008;26:3552–3559.
- 12. Subramanian J, Simon R. Gene expression-based prognostic signatures in lung cancer: ready for clinical use? J Natl Cancer Inst. 2010;102:464–474.
- 13. Okayama H, Schetter AJ, Ishigame T, et al. The expression of four genes as a prognostic classifier for stage I lung adenocarcinoma in 12 independent cohorts. Cancer Epidemiol Biomarkers Prev. 2014;23:2884–2894.
- 14. Akagi I, Okayama H, Schetter AJ, et al. Combination of protein coding and noncoding gene expression as a robust prognostic classifier in stage I lung adenocarcinoma. Cancer Res. 2013;73:3821–3832.
- 15. Virtanen C, Ishikawa Y, Honjoh D, et al. Integrated classification of lung tumors and cell lines by expression profiling. Proc Natl Acad Sci USA. 2002;99:12357–12362.
- 16. Khuder SA, Mutgi AB. Effect of smoking cessation on major histologic types of lung cancer. Chest. 2001;120:1577–1583.
- 17. Herbst RS, Heymach JV, Lippman SM. Lung cancer. N Engl J Med. 2008;359:1367–1380.
- 18. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543–550.
- 19. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature. 2012;489:519–525.
- 20. Bozinovski S, Vlahos R, Anthony D, et al. COPD and squamous cell lung cancer: aberrant inflammation and immunity is the common link. Br J Pharmacol. 2015;173:635–648.
- 21. Lau SK, Boutros PC, Pintilie M, et al. Three-gene prognostic classifier for early-stage non small-cell lung cancer. J Clin Oncol. 2007;25:5562–5569.
- 22. Gandara DR, Hammerman PS, Sos ML, et al. Squamous cell lung cancer: from tumor genomics to cancer therapeutics. Clin Cancer Res. 2015;21:2236–2243.
- 23. Altman DG, McShane LM, Sauerbrei W, et al. Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration. BMC Med. 2012;10:51.
- 24. National Center for Biotechnology Information. Gene Expression Omnibus. http://www.ncbi.nlm.nih.gov/geo/. Accessed June 1, 2015.
- 25. Higgins JP, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–560.
- 26. Iontorrent by Thermo Fisher Scientific. ONCOMINE v.4.5: 729 Datasets and 91,866 Samples. http://www.oncomine.com. Accessed June 1, 2015.
- 27. Lee ES, Son DS, Kim SH, et al. Prediction of recurrence-free survival in postoperative non-small cell lung cancer patients by using an integrated model of clinical information and gene expression. Clin Cancer Res. 2008;14:7397–7404.
- 28. Sato M, Larsen JE, Lee W, et al. Human lung epithelial cells progressed to malignancy through specific oncogenic manipulations. Mol Cancer Res. 2013;11:638–650.
- 29. Rousseaux S, Debernardi A, Jacquiau B, et al. Ectopic activation of germline and placental genes identifies aggressive metastasis-prone lung cancers. Sci Transl Med. 2013;5:186ra66.
- 30. Botling J, Edlund K, Lohr M, et al. Biomarker discovery in non-small cell lung cancer: integrating gene expression profiling, meta-analysis, and tissue microarray validation. Clin Cancer Res. 2013;19:194–204.
- 31. Bild AH, Yao G, Chang JT, et al. Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature. 2006;439:353–357.
- 32. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res. 2006;66:7466–7472.
- 33. Chen HY, Yu SL, Chen CH, et al. A five-gene signature and clinical outcome in non-small-cell lung cancer. N Engl J Med. 2007;356:11–20.
- 34. Yang B, Tan Y, Sun H, et al. Higher intratumor than peritumor expression of DUSP6/MKP-3 is associated with recurrence after curative resection of hepatocellular carcinoma. Chin Med J (Engl). 2014;127:1211–1217.
- 35. Messina S, Frati L, Leonetti C, et al. Dual-specificity phosphatase DUSP6 has tumor-promoting properties in human glioblastomas. Oncogene. 2011;30:3813–3820.
- 36. Degl’Innocenti D, Romeo P, Tarantino E, et al. DUSP6/MKP3 is overexpressed in papillary and poorly differentiated thyroid carcinoma and contributes to neoplastic properties of thyroid cancer cells. Endocr Relat Cancer. 2013;20:23–37.
- 37. Bagnyukova TV, Restifo D, Beeharry N, et al. DUSP6 regulates drug sensitivity by modulating DNA damage response. Br J Cancer. 2013;109:1063–1071.
- 38. Song H, Wu C, Wei C, et al. Silencing of DUSP6 gene by RNAi-mediation inhibits proliferation and growth in MDA-MB-231 breast cancer cells: an in vitro study. Int J Clin Exp Med. 2015;8:10481–10490.
- 39. Wang MC, Chang YH, Wu CC, et al. Alpha-actinin 4 is associated with cancer cell motility and is a potential biomarker in non-small cell lung cancer. J Thorac Oncol. 2015;10:286–301.
- 40. Koizumi T, Nakatsuji H, Fukawa T, et al. The role of actinin-4 in bladder cancer invasion. Urology. 2010;75:357–364.
- 41. Honda K. The biological role of actinin-4 (ACTN4) in malignant phenotypes of cancer. Cell Biosci. 2015;5:41.
- 42. Gao Y, Li G, Sun L, et al. ACTN4 and the pathways associated with cell motility and adhesion contribute to the process of lung cancer metastasis to the brain. BMC Cancer. 2015;15:277.
- 43. Yamamoto S, Tsuda H, Honda K, et al. Actinin-4 gene amplification in ovarian cancer: a candidate oncogene associated with poor patient prognosis and tumor chemoresistance. Mod Pathol. 2009;22:499–507.
- 44. Yamagata N, Shyr Y, Yanagisawa K, et al. A training-testing approach to the molecular classification of resected non-small cell lung cancer. Clin Cancer Res. 2003;9:4695–4704.
- 45. Watanabe T, Ueno H, Watabe Y, et al. ACTN4 copy number increase as a predictive biomarker for chemoradiotherapy of locally advanced pancreatic cancer. Br J Cancer. 2015;112:704–713.
- 46. Zhai R, Yu X, Shafer A, et al. The impact of coexisting COPD on survival of patients with early-stage non-small cell lung cancer undergoing surgical resection. Chest. 2014;145:346–353.
- 47. Hanagiri T, Sugio K, Mizukami M, et al. Significance of smoking as a postoperative prognostic factor in patients with non-small cell lung cancer. J Thorac Oncol. 2008;3:1127–1132.
- 48. Putila J, Guo NL. Combining COPD with clinical, pathological and demographic information refines prognosis and treatment response prediction of non-small cell lung cancer. PLoS One. 2014;9:e100994.
- 49. Robles AI, Arai E, Mathe EA, et al. An integrated prognostic classifier for stage I lung adenocarcinoma based on mRNA, microRNA, and DNA methylation biomarkers. J Thorac Oncol. 2015;10:1037–1048.
