Author manuscript; available in PMC: 2009 Nov 9.
Published in final edited form as: J Thorac Cardiovasc Surg. 2008 Jan 18;135(3):627–634. doi: 10.1016/j.jtcvs.2007.10.058

A Simple Two-Gene Prognostic Model for Adenocarcinoma of the Lung

Carolyn E Reed a, Amanda Graham a, Rana S Hoda b, Andras Khoor d, Elizabeth Garrett-Mayer c, Michael B Wallace e, Michael Mitas a
PMCID: PMC2774741  NIHMSID: NIHMS146466  PMID: 18329483

Abstract

Objective

We hypothesized that clinical outcome of resected early stage adenocarcinoma of the lung can be predicted by the expression of a few critically important genes as measured by quantitative real-time reverse transcriptase polymerase chain reaction (RT-PCR) in formalin fixed paraffin-embedded (FFPE) primary tumors.

Methods

Twenty-two prognostic genes for the metastatic phenotype were identified through cDNA microarray analysis of four cancer cell lines and bioinformatics analysis. Expression levels of a subset of these genes (n=13) were measured by real-time RT-PCR in FFPE primary adenocarcinoma from patients who recurred within 2 years (n=9) and who did not recur (n=11). ROC curve analysis was performed to establish prognostic values of single genes. The most informative gene was combined with the remaining genes to determine if there was a particular pair(s) that yielded high diagnostic accuracy. A small validation study was performed.

Results

ROC curve analysis of the single genes revealed that high expression of CK19 was associated with non-recurrence (AUC=0.859, CI=0.651–0.970). The CK19/EpCAM2 gene ratio had the most reproducible prognostic accuracy, followed by the CK19/P-cadherin ratio. A Kaplan Meier survival analysis generated from the CK19/EpCAM2 ratio resulted in highly significant curves as a function of marker positivity (p=0.0007; HR=10.7). Significance declined but was maintained in the validation study.

Conclusions

This preliminary study provides evidence that the CK19/EpCAM2 and/or CK19/P-cadherin ratio(s) may be a simple and accurate prognostic indicator of clinical outcome in early stage adenocarcinoma of the lung. If further validation studies from large patient cohorts confirm the results, adjuvant therapy could be targeted to this high risk group.

Keywords: lung cancer, molecular markers


Despite surgical resection, patients with pathologic stage I non-small cell lung cancer (NSCLC) have an approximately 30% to 40% incidence of recurrence, and those with stage II have a 45% to 60% recurrence rate.1 At present the standard of care is to administer postoperative adjuvant chemotherapy to patients with stage II NSCLC.2,3 Although there was initial enthusiasm for administering adjuvant therapy to resected stage IB patients, recent data do not support this practice.3–5 However, subsets of stage I patients could potentially benefit from further treatment to prevent recurrence; likewise, a method to predict which stage II patients could avoid the unnecessary toxicity of chemotherapy would be helpful.

The development of metastatic disease is the most common cause of death among NSCLC patients and results from dissemination of malignant cells. It is now recognized that the ability of cells to gain metastatic potential is an intrinsic property of the primary tumor, which is substantiated by the high correlations between clinical outcome and gene expression profiles of a variety of primary tumors.6,7 The ability to predict clinical outcome based on analysis of primary tumors would allow cancer patients to be treated more effectively. However, the problem with many of these expression studies is that they require measurements of large sets of predictive genes using a platform (cDNA microarray analysis) that is not well suited to clinical application.

In this pilot study, we hypothesized that clinical outcome of resected early stage patients with adenocarcinoma of the lung could be predicted by the expression of relatively few, but critically important, genes measured by quantitative real-time reverse transcription polymerase chain reaction (RT-PCR) in formalin fixed paraffin-embedded (FFPE) primary tumors. Specifically, we hypothesized that there exists a “good gene” and a “bad gene” such that the ratio of the two is a strong prognostic indicator of clinical outcome.

MATERIALS AND METHODS

Identification of 15 highly expressed genes in NSCLC cell lines

Expression levels of 22,283 gene transcripts were determined on oligonucleotide microarrays using RNA prepared from four NSCLC cell lines [CRL 5807 (bronchoalveolar carcinoma), CRL 5876 (adenocarcinoma derived from metastatic lymph node), A549 (adenocarcinoma), and HTB 177 (large cell carcinoma)], as well as from a pool of 4 normal cervical lymph nodes. Eight μg of total RNA per sample was used. First- and second-strand cDNA synthesis, double-stranded cDNA cleanup, biotin-labeled cRNA synthesis, cleanup, and fragmentation were performed according to protocols in the Affymetrix GeneChip Expression Analysis technical manual (Affymetrix). Microarray analysis was performed by the DNA Microarray and Bioinformatics Core Facility at the Medical University of South Carolina using U133 A GeneChips (Affymetrix). Fluorescent images of hybridized microarrays were obtained by using a HP GeneArray scanner (Affymetrix). For normalization, the microarray office suite was used such that all fluorescence values were multiplied by a factor that resulted in a mean fluorescent score for all genes equal to 150. Data for normal lymph nodes were obtained from a previous study.8 All microarray results were imported into a single Microsoft Excel file. The first step in the selection of highly expressed genes was the elimination of genes that were expressed in normal lymph nodes [n=11,326; 50.8% of total (22,283)]. Of the remaining 10,957 genes, those that were detected in at least 2 NSCLC cell lines were selected next (n=1731; 7.7% of total). Following this round, genes whose mean fluorescence across all cell lines was > 500 were selected (n=91; 0.41% of total). The final group of 91 genes was sorted according to the ratio of mean cell line fluorescence to mean fluorescence of normal lymph nodes, and the 15 top genes were selected (Table 1).

Table 1.

Top 15 most highly overexpressed genes in lung cancer cell lines

Gene description              Affymetrix results (a)
Rank  Gene         Acc. #     1 A549  2 HTB177  3 CRL5807  4 CRL5876  Ratio (b)
1     AGR2         NM_006408  2124    2053      3082       38         960
2     S100P        NM_005980  242     2522      2673       4819       754
3     CK19         NM_002276  27      935       1995       810        589
4     NQO1         NM_000903  1375    1858      982        315        404
5     MET          NM_000245  1420    790       2429       378        348
6     MAGE-A6      NM_005363  73      37        3004       4475       311
7     XAGE-1       NM_020411  471     2         2322       3          250
8     KRTHB1       NM_002281  2822    31        221        3          208
9     MAGE-A3      NM_005362  116     29        4055       5107       178
10    MAP7         NM_003980  455     466       381        930        116
11    AKR1B10      NM_020299  11662   10603     17         75         101
12    CK7 related  NM_005556  537     21        1319       463        96
13    EpCAM2       NM_002353  2       3         8146       2342       94
14    EpCAM1       NM_002354  278     15        4430       3244       91
15    P-cadherin   NM_001793  2       3         1319       1274       87

(a) Normalized fluorescent values obtained from Affymetrix U133A array data for the indicated cell line.

(b) Ratio of mean NSCLC cell line data to mean of normal lymph node.
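The selection steps described before Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' spreadsheet procedure: the record layout (`cell_lines`, `detected`, `normal`, `normal_detected`), the function name, the detection calls, and the normal lymph node values in the toy data are all hypothetical. Only the cell line fluorescence values for AGR2 and CK19 are taken from Table 1.

```python
from statistics import mean


def select_overexpressed(genes, min_lines=2, min_mean=500.0, top_n=15):
    """Apply the three filters described in the text, then rank by tumor/normal ratio.

    Each entry of `genes` is a dict with hypothetical keys:
      name            gene symbol
      cell_lines      normalized fluorescence in each NSCLC cell line
      detected        per-cell-line detection calls (True = detected)
      normal          mean normalized fluorescence in normal lymph node
      normal_detected True if the gene was expressed in normal lymph node
    """
    ranked = []
    for g in genes:
        if g["normal_detected"]:
            continue  # filter 1: drop genes expressed in normal lymph node
        if sum(g["detected"]) < min_lines:
            continue  # filter 2: require detection in at least 2 cell lines
        line_mean = mean(g["cell_lines"])
        if line_mean <= min_mean:
            continue  # filter 3: mean cell line fluorescence must exceed 500
        # Assumption: floor the denominator so a zero normal value cannot divide by zero.
        ranked.append((line_mean / max(g["normal"], 1e-9), g["name"]))
    ranked.sort(reverse=True)  # highest tumor/normal ratio first
    return [name for _, name in ranked[:top_n]]


# Toy input: detection calls and normal values are made up for illustration.
toy_genes = [
    {"name": "AGR2", "cell_lines": [2124, 2053, 3082, 38],
     "detected": [True, True, True, False], "normal": 1.9, "normal_detected": False},
    {"name": "CK19", "cell_lines": [27, 935, 1995, 810],
     "detected": [False, True, True, True], "normal": 1.6, "normal_detected": False},
    {"name": "GENE_X", "cell_lines": [600, 700, 650, 620],
     "detected": [True, True, True, True], "normal": 400.0, "normal_detected": True},
    {"name": "GENE_Y", "cell_lines": [2900, 5, 4, 6],
     "detected": [True, False, False, False], "normal": 2.0, "normal_detected": False},
    {"name": "GENE_Z", "cell_lines": [300, 400, 350, 310],
     "detected": [True, True, True, True], "normal": 3.0, "normal_detected": False},
]

print(select_overexpressed(toy_genes))
```

In this toy input, GENE_X fails the first filter, GENE_Y the second, and GENE_Z the third, so only AGR2 and CK19 are ranked.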

Bioinformatics analysis to identify potentially prognostic genes in NSCLC

Of the 15 most highly expressed genes identified by microarray analysis, it was hypothesized that some were also expressed in other cancers, while others were specific for NSCLC. To identify genes that were highly expressed in other cancers, the on-line Cancer Genome Anatomy Project (CGAP) NCI 60 gene expression database (URL = http://cgap.nci.nih.gov) was queried using all 15 genes. The output of a given query consists of a list of 10 genes whose expression levels are most highly correlated with the query sequence. Using the output of each gene, a correlation map was constructed such that the appearance of a gene on the map required 1) direct contact with one of the 15 highly expressed genes, 2) contacts with at least two genes, 3) that the correlation coefficient of any two genes must have a p value < 8 × 10⁻⁶, 4) that the relevant gene must be overexpressed in the CGAP SAGE dataset in at least two cancers (with respect to normal tissue), and 5) that expression of the relevant gene must be at least 16–31 tags/200,000 sequenced tags in at least one cancer tissue. Genes identified from the first set of queries were used as queries in a further, iterative round of interrogation (data mining).

The correlation map obtained using this bioinformatics data mining approach contained a total of 22 genes (Figure 1). Seven of the 22 genes (AGR2, MAP7, S100P, CK19, EpCAM1, EpCAM2, P-cadherin) were derived from the list of 15 most highly expressed genes and are referred to as the Primary prognostic genes (underlined in Figure 1). The remaining 15 genes identified from this bioinformatics approach are referred to as the Secondary prognostic genes (italicized in Figure 1).

Figure 1. Correlation map of cancer-associated genes.

Correlation map of the genes was constructed as described in the text. Genes are positioned in a hypothetical cell to reflect intracellular, membrane-bound, or extracellular localization. The thickness of a solid line connecting a given gene pair is approximately proportional to the R² value of gene expression, which ranges from 0.91 (p<0.0001) for the Spint1/SNC19 pair to 0.55 (p<0.0001) for the TFF1/S100P pair.

Identification of genes of prognostic value in early stage NSCLC adenocarcinoma patients

To determine whether the genes described above had potential prognostic value, expression levels were measured by real-time RT-PCR in formalin-fixed paraffin-embedded (FFPE) primary tumors of adenocarcinoma patients who recurred within two years (poor outcome group A; n=9) and who survived disease-free longer than four years (good outcome group B; n=11). Group A patients included 2 with stage IA, 2 with stage IB, and 5 with stage IIB. Group B patients included 5 with stage IA, 3 with stage IB, and 3 with stage IIB. Genes analyzed included the seven primary prognostic genes, six secondary prognostic genes (SPINT2, Esx, CEA6, MAL2, GPX2, E-cadherin), and uPAR, a gene whose expression has previously been shown to be associated with multiple cancers. The laboratory investigators were initially blinded to the clinical outcome. The study was approved by the Medical University of South Carolina Institutional Review Board.

A small validation study was performed using paraffin sections from patients with early stage adenocarcinoma who underwent resection at the Mayo Clinic, Jacksonville, Florida, and who either recurred early (n=10) or survived longer than 2 years (n=12).

Real-time reverse transcription-PCR of formalin-fixed paraffin-embedded samples was performed following the method of Sprecht et al.9 A 50-μm section was cut from tissue blocks of primary tumor for mRNA extraction. For isolation of RNA, paraffin-embedded tissue sections were deparaffinized twice with 1 mL of xylene at 37°C or room temperature for 10 minutes. The pellet was subsequently washed with 1 mL each of 100%, 90%, and 70% ethanol and air-dried at room temperature for 2 hours. The pellet was resuspended in 200 μL of RNA lysis buffer [2% lauryl sulfate, 10 mmol/L Tris-HCl (pH 8.0), and 0.1 mmol/L EDTA] and 100 μg of proteinase K and incubated at 60°C for 16 hours. RNA was extracted using 1 mL of phenol/chloroform (5:1) solution (Sigma, St. Louis, MO). The aqueous layer containing RNA was transferred to a new 1.5 mL tube. Phenol/chloroform extraction was done a total of three times. RNA was precipitated with an equal volume of isopropanol, 0.1 volume of 3 mol/L sodium acetate, and 100 μg of glycogen at −20°C for 16 hours. After centrifugation at 12,000 rpm for 15 minutes (4°C), the RNA pellet was washed with 70% ethanol and air-dried at room temperature for 2 hours. Finally, the pellet was dissolved in 12 μL of DEPC water.

cDNA synthesis was performed using a panel of truncated gene-specific primers. Real-time RT-PCR was performed on a PE Biosystems Gene Amp® 7300 or 7500 Sequence Detection System (Foster City, CA). With the exception of the SYBR Green I master mix (purchased from Qiagen, Valencia, CA), all reaction components were purchased from PE Biosystems. Standard reaction volume was 10 μL and contained 1X SYBR RT-PCR buffer, 3 mM MgCl2, 0.2 mM each of dATP, dCTP, and dGTP, 0.4 mM dUTP, 0.1 U UngErase enzyme, 0.25 U AmpliTaq Gold, 0.35 μL cDNA template, and 50 nM of oligonucleotide primer. Initial steps of RT-PCR were 2 min at 50°C for UngErase activation, followed by a 10-min hold at 95°C. Cycles (n=40) consisted of a 15 sec melt at 95°C, followed by a 1 min annealing/extension at 60°C. The final step was a 60°C incubation for 1 min. All reactions were performed in triplicate. The threshold for cycle threshold (Ct) analysis of all samples was set at 0.5 relative fluorescence units.

Gene expression values were quantified as ΔCt values, which were obtained by subtracting the Ct value of an internal reference control gene (β2-microglobulin, B2M) from that of the gene of interest. Ct values are inversely related to gene expression levels and are on a log2 scale.
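As a minimal illustration of the ΔCt calculation, the sketch below averages hypothetical triplicate Ct readings and subtracts the B2M reference. The Ct numbers are invented, averaging the triplicates is an assumption (the text states only that reactions were run in triplicate), and the conversion to relative expression assumes that the amount of product doubles with every cycle.

```python
def mean_ct(replicates):
    """Average replicate Ct readings (assumed handling of the triplicates)."""
    return sum(replicates) / len(replicates)


def delta_ct(ct_gene, ct_reference):
    """Delta-Ct = Ct(gene of interest) - Ct(reference gene, here B2M)."""
    return ct_gene - ct_reference


def relative_expression(dct):
    """Expression relative to the reference, assuming product doubles each cycle."""
    return 2.0 ** (-dct)


# Hypothetical triplicate Ct readings for one tumor sample.
ck19_ct = mean_ct([29.9, 30.0, 30.1])
b2m_ct = mean_ct([19.9, 20.0, 20.1])

dct_ck19 = delta_ct(ck19_ct, b2m_ct)
print(round(dct_ck19, 3), relative_expression(dct_ck19))
```

A larger ΔCt therefore means lower expression: each additional unit corresponds to roughly half as much transcript relative to B2M under the doubling assumption.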

The results were internally validated by repeating the real-time RT-PCR process using a new section cut from tissue blocks of the primary tumor. Variability of tumor quantity on the sections was minimized by H&E comparison performed by a pathologist. A cross-validation procedure was used to determine whether the results were sensitive to the samples included. A leave-one-out procedure was used in which each sample was systematically removed and the data reanalyzed.
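The leave-one-out check can be sketched as follows: drop one sample, recompute the AUC, and repeat for every sample. This is an assumed reconstruction rather than the authors' procedure. The AUC is computed with the Mann-Whitney pairwise estimate, not with the MedCalc software named in the Statistical Analysis section, and the ΔCt values and outcome labels are invented.

```python
def auc(scores_pos, scores_neg):
    """Mann-Whitney AUC: chance that a positive scores above a negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def leave_one_out_aucs(scores, labels):
    """Recompute the AUC once per sample, with that sample removed."""
    results = []
    for left_out in range(len(scores)):
        pos = [s for i, (s, y) in enumerate(zip(scores, labels)) if i != left_out and y == 1]
        neg = [s for i, (s, y) in enumerate(zip(scores, labels)) if i != left_out and y == 0]
        results.append(auc(pos, neg))
    return results


# Hypothetical delta-Ct values; label 1 = recurrence, 0 = no recurrence.
# Higher delta-Ct means lower expression, so it is scored as pointing to recurrence.
toy_scores = [12.5, 12.0, 11.8, 10.9, 11.0, 10.5, 10.2, 11.9]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

full_auc = auc(toy_scores[:4], toy_scores[4:])
loo = leave_one_out_aucs(toy_scores, toy_labels)
print(full_auc, min(loo), max(loo))
```

Reporting the minimum and maximum of the leave-one-out AUCs alongside the full-sample AUC shows how strongly any single tumor drives the result.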

Statistical Analysis

To assess prognostic accuracy, ROC curve analysis was performed on the individual genes normalized to B2M (MedCalc software). Prognostic gene combinations were tested by subtracting ΔCt values generated by RT-PCR analysis. Because Ct values are on a log2 scale, subtraction of ΔCt values (ΔΔCt) is equivalent to taking the log of the ratio of expression values. In the text the ΔCt(gene A) − ΔCt(gene B) calculation is abbreviated as a gene expression ratio. The value of the two-gene prognostic assay was further assessed by Kaplan Meier survival analysis.
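The link between the subtraction and the ratio can be written out explicitly. The derivation below is a sketch under one assumption not stated in the text: amplification is ideal, so the expression level E of a gene is proportional to 2 raised to the power of minus its Ct. The symbols E_A and E_B are introduced here for illustration only.

```latex
\begin{align*}
\Delta Ct_{A} &= Ct_{A} - Ct_{\mathrm{B2M}}, \qquad
\Delta Ct_{B} = Ct_{B} - Ct_{\mathrm{B2M}} \\
\Delta\Delta Ct &= \Delta Ct_{A} - \Delta Ct_{B} = Ct_{A} - Ct_{B} \\
\frac{E_{A}}{E_{B}} &= \frac{2^{-Ct_{A}}}{2^{-Ct_{B}}} = 2^{-\Delta\Delta Ct}
\quad\Longrightarrow\quad
\Delta\Delta Ct = -\log_{2}\frac{E_{A}}{E_{B}}
\end{align*}
```

Two points follow. The B2M reference cancels in the difference, and the sign is inverted: a lower ΔΔCt corresponds to a higher gene A/gene B expression ratio.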

RESULTS

A primary tumor’s ability to metastasize requires many genetic events. In this study, we hypothesized that there are relatively few genes that may be critical to the metastatic phenotype, such that high expression of a gene that portends non-recurrence coupled with the low expression of a gene critical to metastasis would be useful to predict clinical outcome in adenocarcinoma of the lung.

The correlation map illustrated in Figure 1 resulted from a unique bioinformatics analysis that led to a set of genes that had specific structured connections based on a query of 15 genes over-expressed in four lung cancer cell lines. Of the 22 identified genes, seven were in the original query set and were labeled primary prognostic genes. These genes, combined with 6 of the most frequently expressed of the remaining 15 secondary genes, constituted the study’s test gene set in patients with adenocarcinoma of the lung. This approach is somewhat similar to the description of expression profiles in different tumors in terms of behavior modules, sets of genes that act in concert to carry out a specific function.10 In fact, many of the genes in this study's test set were contained in one of the modules (module 180) described by Segal and colleagues.10

AUC values for the primary and secondary genes are shown in Table 2. ROC curve analysis of the individual genes revealed that high expression of CK19 was associated with non-recurrence (≥4 years) (AUC = 0.859; 95% CI = 0.651–0.970), whereas high expression of EpCAM2 was associated with disease recurrence within two years (AUC = 0.606; 95% CI = 0.366–0.813).

Table 2.

Recurrence analysis of pilot study using single markers paired with the internal B2M reference control gene.

Recurrence analysis

Gene AUC 95% CI

CK19 0.859 0.631 to 0.970
EpCAM2 0.606 0.366 to 0.813
AGR2 0.596 0.357 to 0.805
Esx 0.566 0.329 to 0.782
GPX2 0.556 0.320 to 0.773
CEA6 0.545 0.312 to 0.765
E-cadherin 0.545 0.312 to 0.765
EpCAM1 0.535 0.303 to 0.757
SPINT2 0.530 0.298 to 0.753
S100P 0.525 0.294 to 0.749
MAL2 0.515 0.285 to 0.740
P-cadherin 0.510 0.281 to 0.736
Map7 0.500 0.272 to 0.728
UPAR 0.470 0.247 to 0.702

To determine whether the prognostic accuracy of CK19 could be improved by combining it with another gene whose overexpression might be necessary for the metastatic phenotype, and whose low expression would therefore be favorable, the mean ΔCt values of individual genes as determined by real-time RT-PCR analysis were subtracted from ΔCt(CK19). Among all potential CK19/gene X combinations, the ratio of CK19/EpCAM2 yielded the highest prognostic accuracy as determined by AUC measurements (Table 3). This observation provided evidence that EpCAM2 is a “bad” gene. The CK19/EpCAM2 expression ratio, which was derived from the mean of two experiments, also performed well when data were analyzed from individual experiments. In the first and second experiments, the prognostic accuracy of the CK19/EpCAM2 expression ratio as determined by AUC analysis was 0.91 (95% CI=0.69–0.99) and 0.84 (95% CI=0.56–0.97), respectively (data not shown). Of further note, among the 12 stage I adenocarcinoma patients, the prognostic accuracy of the CK19/EpCAM2 expression ratio was 92% (11/12).

Table 3.

Recurrence and survival analysis of pilot study based on CK19/geneX ratios

            Recurrence analysis (a)  Kaplan Meier survival analysis (b)
geneX       AUC    95% CI            P-value  HR
EpCAM2      0.879  0.656 to 0.978    0.0001   10.7
P-cadherin  0.874  0.650 to 0.976    0.0003   8.13
MAL2        0.869  0.643 to 0.974    0.0004   9.24
Esx         0.742  0.501 to 0.908    0.0008   6.62
Map7        0.889  0.668 to 0.981    0.0013   6.24
UPAR        0.843  0.613 to 0.963    0.0013   6.24
E-cadherin  0.818  0.584 to 0.951    0.0013   6.25
AGR2        0.859  0.631 to 0.970    0.0098   4.69
GPX2        0.722  0.480 to 0.895    0.0184   5.12
SPINT2      0.848  0.619 to 0.965    0.0207   7.78
EpCAM1      0.798  0.561 to 0.940    0.0207   7.78
S100P       0.732  0.490 to 0.901    0.0275   4.08
CEA6        0.732  0.490 to 0.901    0.0729   3.10

(a) AUC based on continuous representation of gene expression values.

(b) Hazard ratios based on binary analysis of best thresholds of outcome discrimination.

The cross-validation procedure found no qualitative differences in inferences. For CK19 alone, the range of AUCs found in the cross-validation analyses was (0.87, 0.92) whereas the AUC when all samples were included was 0.86. Analogous results were found when CK19 was combined with EpCAM2.

To further assess the value of CK19 unpaired and paired with EpCAM2, a Kaplan Meier survival analysis was performed using data generated from single marker and CK19/gene X analyses. For the single CK19 marker, a ΔCt cutoff of 11.4 was used, which separated the 20 patients into high (ΔCt < 11.4; n=13) and low (ΔCt > 11.4; n=7) expressing tumors. A log-rank test indicated that the two curves generated as a function of marker positivity were different at a p value of 0.0021 with a hazard ratio of 6.2 (Figure 2A). For the CK19/EpCAM2 ratio, a ΔΔCt cutoff of 7.2 was used, which separated the 20 patients into high (ΔΔCt ≤ 7.2; n=13) and low (ΔΔCt > 7.2; n=7) groups that correlated with survival. A log-rank test indicated that the two curves generated as a function of marker positivity were different at a p value of 0.0001 with an associated hazard ratio of 10.7 (Figure 2B). Kaplan Meier survival analyses of other CK19/gene X pairs are shown in Table 3. The gene pair that yielded the second most highly significant curves was CK19/P-cadherin, with an associated hazard ratio of 8.1.

Figure 2. Kaplan Meier survival analysis.

Data generated from single marker (Figure 2A) and CK19/EpCAM2 (Figure 2B) analyses.
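The dichotomize-then-compare step behind Figure 2 can be sketched as below: patients are split at a ΔΔCt cutoff and a Kaplan Meier curve is estimated for each group. This is a minimal sketch, not the authors' analysis. The patient values, follow-up times, and event flags are invented, the function names are hypothetical, and only the cutoff of 7.2 comes from the text. The log-rank test and hazard ratio are not implemented here.

```python
def split_by_cutoff(values, cutoff):
    """Return index lists for values <= cutoff and values > cutoff."""
    at_or_below = [i for i, v in enumerate(values) if v <= cutoff]
    above = [i for i, v in enumerate(values) if v > cutoff]
    return at_or_below, above


def kaplan_meier(times, events):
    """Kaplan Meier estimate as [(time, survival)] at each distinct event time.

    times:  follow-up in months
    events: 1 = event observed at that time, 0 = censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical patients: delta-delta-Ct, follow-up (months), event flag.
toy_ddct = [6.0, 6.5, 7.0, 7.2, 7.5, 8.0, 8.4, 9.0]
toy_months = [60, 55, 48, 50, 10, 14, 20, 30]
toy_event = [0, 0, 1, 0, 1, 1, 1, 0]

high_ratio, low_ratio = split_by_cutoff(toy_ddct, 7.2)
km_high = kaplan_meier([toy_months[i] for i in high_ratio], [toy_event[i] for i in high_ratio])
km_low = kaplan_meier([toy_months[i] for i in low_ratio], [toy_event[i] for i in low_ratio])
print(km_high)
print(km_low)
```

Because ΔΔCt is on an inverted log scale, the group at or below the cutoff is the one with the higher CK19/EpCAM2 expression ratio.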

To determine assay reliability, we applied the two-gene test to a set of patients (n=22) who were treated at the Mayo Clinic, Jacksonville, Florida. Twelve patients survived longer than two years, while 10 patients recurred within two years. All patients in this dataset died by 65 months. We observed that the hazard ratio of the CK19/EpCAM2 expression pair decreased to 4.5 but remained significant (p=0.007). The CK19/P-cadherin expression ratio also clearly identified patients with longer survival (HR=3.24; p=0.0029).

DISCUSSION

Unfortunately, a large number of patients with resected early stage lung cancer will experience recurrence within two to three years. The ability to identify patients at high risk for recurrence could help direct the addition of therapy to improve survival and, conversely, spare those at low risk from its toxicity. Many molecular markers that predict patient survival independent of TNM status have been reported.11 Tools used to predict recurrence have included immunohistochemical analysis,12 cDNA microarray profiling,13–17 real-time reverse transcription polymerase chain reaction (RT-PCR),6,18–20 and most recently, proteomics.21 Many of the methods have been costly, have not been readily available to the average surgeon, or have required frozen tissue specimens, and have therefore been difficult to translate from the research laboratory to the clinical arena.

In the present study, we measured the expression of 14 different test genes and one internal reference control gene in primary tumors resected from early stage NSCLC patients. Using the B2M gene as an internal reference, we observed that high expression of CK19 was correlated with good clinical outcome (no disease recurrence), while high expression of EpCAM2 was correlated with poor clinical outcome (disease recurrence within two years). Of all possible two-gene combinations (n=105), we further observed that the ratio of CK19/EpCAM2 had the highest accuracy for predicting disease recurrence. The concept of using a two-gene ratio was previously applied to NSCLC by Gordon et al,19,22 who identified S100P as one of seven prognostic markers. It should be noted that in the Mayo dataset, the marker combination of CK19/S100P yielded results similar to CK19/P-cadherin (data not shown). However, the current study is the first to analyze the expression of genes in paraffin samples. In colon cancer, high expression of EpCAM2 (also known as TROP2) has been shown to be associated with a higher frequency of liver metastasis (P=0.005) and more cancer-related death (P=0.046),23 a finding that further supports the concept that for early stage NSCLC, EpCAM2 is a “bad gene.”
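The count of two-gene combinations can be checked with a short enumeration. One reading consistent with n=105 is that pairs were drawn from all 15 measured genes, that is, the 14 test genes plus the B2M reference; the text does not spell this out, so treat it as an assumption. The gene names are taken from Table 2.

```python
from itertools import combinations
from math import comb

# The 14 test genes listed in Table 2, plus the B2M reference gene.
test_genes = [
    "CK19", "EpCAM2", "AGR2", "Esx", "GPX2", "CEA6", "E-cadherin",
    "EpCAM1", "SPINT2", "S100P", "MAL2", "P-cadherin", "Map7", "UPAR",
]
all_genes = test_genes + ["B2M"]

# Unordered pairs: a pair and its reverse give the same ratio with opposite sign.
pairs = list(combinations(all_genes, 2))
print(len(pairs), comb(len(test_genes), 2))
```

Restricting the pairs to the 14 test genes alone would give 91 combinations rather than 105.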

The gene pair with the second highest prognostic accuracy for disease recurrence was CK19/P-cadherin. Previous studies have shown that expression levels of P-cadherin in primary tumors correlate with tumor grade in ovarian cancer24 and metastases to the lung in thyroid cancer.25 Further, overexpression of P-cadherin in vitro results in increased cell motility in pancreatic cancer,26 a necessary requirement for establishment of distant metastases. Taken together, these results provide evidence that P-cadherin may also serve as a candidate “bad gene” in NSCLC. Regarding CK19, antibodies to the protein encoded by this gene (and/or a combination of other cytokeratin genes) have been used for the detection of circulating tumor cells in breast, lung, colon, and other cancers.27,28 In the current study, we suspect that CK19 expression levels serve as a reliable indicator of the epithelial content of the primary tumor.

Although there was a recent report of the use of real-time RT-PCR for prognosis of early stage NSCLC patients, it should be pointed out that the current study differs significantly from the approach taken by Chen et al.20 In the present report, patient prognosis was based on a simple calculation of a two-gene ratio, an approach that contains only one “decision node.” In the study of Chen et al, a five-gene marker panel was used that required a relatively high number of decision nodes (n=19). An algorithm that uses such a high number of decision nodes for a small number of genes is less likely to be clinically applicable because of its cumbersome nature. In contrast, the microarray study of Potti et al required only 5 decision nodes, even though 289 genes were involved.6

There are several advantages to the technique used in this preliminary study. It is a simple two-gene model and uses a technology that is relatively inexpensive and is quickly performed once RNA is extracted. Paraffin-embedded tumor tissue can be screened and an appropriate slide(s) could be sent to a reference laboratory. The technique is amenable to small tissue samples, which may be important if preoperative biopsy directs neoadjuvant therapy.

Several limitations of this pilot analysis need to be acknowledged. First, given the small numbers used for the preliminary study, external verification must be performed on a larger data set prior to making definitive statements concerning its application as a prognostic tool. Second, given the number of putative genes that could display either a direct or inverse relationship between expression and prognosis, it is possible that another gene ratio or a combination of two ratio sets will be more informative as patients are added. Correlative experiments looking at protein levels in tumor tissues should be a future goal.

In summary, a simple two-gene molecular model has been developed to predict recurrence in patients with resected early stage adenocarcinoma of the lung. The model will require further validation and refinement. It is hoped that in the future a relatively easy, cost-effective, clinically relevant molecular model will be used to individualize therapy in early stage NSCLC.

References

  • 1. Mountain CF. Revisions in the international system for staging lung cancer. Chest. 1997;111:1710–17. doi: 10.1378/chest.111.6.1710.
  • 2. The International Adjuvant Lung Cancer Trial Collaborative Group. Cisplatin-based adjuvant chemotherapy in patients with completely resected non-small-cell lung cancer. N Engl J Med. 2004;350:351–60. doi: 10.1056/NEJMoa031644.
  • 3. Winton T, Livingston R, Johnson D, Rigas J, Johnston M, Butts C, et al. Vinorelbine plus cisplatin vs. observation in resected non-small cell lung cancer. N Engl J Med. 2005;352:2589–97. doi: 10.1056/NEJMoa043623.
  • 4. Strauss G, Herndon J, Maddaus M, Johnstone DW, Johnson EA, Watson DM, et al. Adjuvant chemotherapy in stage IB non-small cell lung cancer (NSCLC): update of cancer and leukemia group B (CALGB) protocol 9633. Proc Am Soc Clin Oncol. 2006;24:365s.
  • 5. Pignon J, Tribodet H, Scagliotti G, Douillard JY, Shepherd FA, Stephens RJ, et al. Lung adjuvant cisplatin evaluation (LACE): a pooled analysis of five randomized clinical trials including 4,584 patients. Proc Am Soc Clin Oncol. 2006;24:366s. doi: 10.1200/JCO.2007.13.9030.
  • 6. Potti A, Mukherjee S, Petersen R, Dressman HK, Bild A, Koontz J, et al. A genomic strategy to refine prognosis in early-stage non-small cell lung cancer. N Engl J Med. 2006;355:570–80. doi: 10.1056/NEJMoa060467.
  • 7. Bertucci F, Finetti P, Cervera N, Maraninchi D, Viens P, Birnbaum D. Gene expression profiling and clinical outcome in breast cancer. J Integrative Biol. 2006;10:429–43. doi: 10.1089/omi.2006.10.429.
  • 8. Mikhitarian K, Gillanders WE, Almeida JS, Herbert-Martin R, Varela JC, Metcalf JS, et al. An innovative microarray strategy identifies informative molecular markers for the detection of micrometastatic breast cancer. Clin Cancer Res. 2005;11:C–704. doi: 10.1158/1078-0432.CCR-04-2164.
  • 9. Sprecht K, Richter T, Mueller U, Walch A, Werner M, Hofler H. Quantitative gene expression analysis in microdissected archival formalin-fixed and paraffin-embedded tumor tissue. Am J Pathol. 2001;158:419–29. doi: 10.1016/S0002-9440(10)63985-5.
  • 10. Segal E, Friedman N, Koller D, Regev A. A module map showing conditional activity of expression modules in cancer. Nature Genetics. 2004;36:1090–98. doi: 10.1038/ng1434.
  • 11. Brundage MD, Davies D, Mackillop WJ. Prognostic factors in non-small cell lung cancer: a decade of progress. Chest. 2002;122:1037–57. doi: 10.1378/chest.122.3.1037.
  • 12. D’Amico TA, Massey M, Herndon JE, Moore M-B, Harpole DH. A biologic risk model for stage I lung cancer: immunohistochemical analysis of 408 patients with the use of ten molecular markers. J Thorac Cardiovasc Surg. 1999;117:736–43. doi: 10.1016/s0022-5223(99)70294-1.
  • 13. Garber ME, Troyanskaya OG, Schluens K, Petersen S, Taesher Z, Pacyna-Gengelbach M, et al. Diversity of gene expression in adenocarcinoma of lung. Proc Natl Acad Sci USA. 2001;98:13784–89. doi: 10.1073/pnas.241500798.
  • 14. Chen C, Gharib TG, Huang C-C, Kuick R, Thomas DG, Shedden KA, Misek DE, et al. Protein profiles associated with survival in lung adenocarcinoma. Proc Natl Acad Sci USA. 2003;100:13537–42. doi: 10.1073/pnas.2233850100.
  • 15. Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, et al. Classification of human lung carcinomas by mRNA expression profiling reveal distinct adenocarcinoma subclasses. Proc Natl Acad Sci USA. 2001;98:13790–95. doi: 10.1073/pnas.191502998.
  • 16. Wigle DA, Jurisica I, Radulovich N, Pintilie M, Rossant J, Liu N, et al. Molecular profiling of non-small cell lung cancer and correlation with disease-free survival. Cancer Res. 2002;62:3005–8.
  • 17. Yanaihara N, Caplen N, Bowman E, Seike M, Kumamoto K, Yi M, et al. Unique microRNA molecular profiles in lung cancer diagnosis and prognosis. Cancer Cell. 2006;9:189–98. doi: 10.1016/j.ccr.2006.01.025.
  • 18. Endoh H, Tomida S, Yatabe Y, Konishi H, Osada H, Tajima K, et al. Prognostic model of pulmonary adenocarcinoma by expression profiling of eight genes as determined by quantitative real-time reverse transcriptase polymerase chain reaction. J Clin Oncol. 2004;22:811–19. doi: 10.1200/JCO.2004.04.109.
  • 19. Gordon GJ, Jensen RV, Hsiao L-L, Gullans SR, Blumenstock JE, Ramaswamy S, et al. Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. Cancer Res. 2002;62:4963–67.
  • 20. Chen HY, Yu SL, Chen CH, Chang GC, Chen CY, Yuan A, et al. A five gene signature and clinical outcome in non-small cell lung cancer. N Engl J Med. 2007;356:11–20. doi: 10.1056/NEJMoa060096.
  • 21. Kikuchi T, Carbone DP. Proteomic analysis in lung cancer: challenges and opportunities. Respirology. 2007;12:22–8. doi: 10.1111/j.1440-1843.2006.00957.x.
  • 22. Gordon GJ, Richards WG, Sugarbaker DJ, Jaklitsch MT, Bueno R. A prognostic test for adenocarcinoma of the lung from gene expression profiling data. Cancer Epi Biomarkers Prev. 2003;12:905–910.
  • 23. Ohmachi T, Taneka F, Mimori K, Inoue H, Yanaga K, Mori M. Clinical significance of TROP2 in colorectal cancer. Clin Cancer Res. 2006;12:3057–63. doi: 10.1158/1078-0432.CCR-05-1961.
  • 24. Patel IS, Madan P, Getsios S, Bertrand MM, MacCalman CO. Cadherin switching in ovarian cancer progression. Int J Cancer. 2003;106:172–77. doi: 10.1002/ijc.11086.
  • 25. Zou M, Famulski KS, Parhour RS, Baitei E, Al-Mohann FA, Farid NR, et al. Microarray analysis of metastasis-associated gene expression profiling in a murine model of thyroid carcinoma pulmonary metastasis: identification of S100A4 (Mts 1) gene over expression as a poor prognostic marker for thyroid carcinoma. J Clin Endocrinol Metab. 2004;89:61646–54. doi: 10.1210/jc.2004-0418.
  • 26. Taniuchi K, Nakagawa H, Hosokawa M, Nakamura T, Eguchi H, Ohigashi H, et al. Over expressed P-cadherin/CDH3 promotes motility of pancreatic cancer cells by interacting with p120ctn and activating rho-family GTPases. Cancer Res. 2005;65:3092–99. doi: 10.1158/0008-5472.CAN-04-3646.
  • 27. Allard WJ, Matera J, Miller MC, Repollet M, Connelly MC, Rao C, et al. Tumor cells circulate in the peripheral blood of all major carcinomas but not in healthy subjects or patients with nonmalignant diseases. Clin Cancer Res. 2004;10:6897–6904. doi: 10.1158/1078-0432.CCR-04-0378.
  • 28. Cristofanilli M, Budd GT, Ellis MJ, Stopeck A, Matera J, Miller MC, et al. Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N Engl J Med. 2004;351:781–91. doi: 10.1056/NEJMoa040766.