Genomic technologies enable interrogation of multiple aspects of cancer biology, from genetic insults, through transcriptome and proteome expression levels, to the tumor microenvironment. Integrating these rich data sources with clinical information has resulted in a plethora of prognostic and predictive assays. However, relatively few of these have been deployed routinely in a clinical setting. In particular, prognostic molecular signatures based on DNA microarrays have, with some exceptions, failed to live up to their early promise.1,2 In many cases this is the result of poor reproducibility in validation studies, or failure to add significant information to existing markers. This may be partially attributable to the tendency to derive complex signatures first and supply a biological interpretation afterward, rather than using knowledge of tumor biology to guide the design of robust assays. Moreover, while one might anticipate that a signature leveraging expression levels of many genes would be more robust than one based on a small number, this is not necessarily true.3
Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma. Although combination treatment regimens such as R-CHOP cure a significant proportion of patients, even those with advanced disease, 30–40% ultimately succumb to the lymphoma. Early microarray studies identified distinct subtypes of DLBCL that appear to reflect the cell of origin of the malignancy;4 namely germinal center B cells (GCB-like subtype) or activated B cells (ABC-like subtype). The ABC-like subtype carries a much worse prognosis, with overall survival rates at 5 years of around 25%, compared with ∼75% for GCB-like DLBCL.
Recently, using a novel approach, we proposed a new assay for prognostic classification of DLBCL based on the expression level of just two genes, LMO2 (LIM domain only 2) and TNFRSF9 (tumor necrosis factor receptor superfamily member 9, also known as CD137).5 We mined several large microarray data sets from the literature, using them for training and initial validation (Fig. 1). First, genes were identified that were differentially expressed between tumor and non-tumor cells in DLBCL. We then used survival analysis to search for pairs of genes whose expression profiles were synergistically prognostic in patients treated with the standard R-CHOP regimen.6 An important biological constraint was that each pair had to combine a gene highly expressed in tumor B cells with one highly expressed in non-tumor cells. Our rationale was that such a model would reflect the interaction of the tumor with its microenvironment, and could be more robust when applied to new patient samples. From this analysis a bivariate combination of LMO2 and TNFRSF9 emerged. While high expression of either gene alone was prognostic of better survival, their discriminative power was enhanced when combined into a two-gene score (TGS). Analysis in the training set confirmed that the gene expression score added information to the existing International Prognostic Index (IPI) for DLBCL. We therefore combined the TGS with IPI to form a composite prognostic score (TGS-IPI) integrating gene expression measurements with existing clinical factors. The TGS-IPI performed robustly in several independent microarray data sets. For validation, we employed PCR assays of LMO2 and TNFRSF9 in a completely new cohort of 147 patients. The TGS-IPI was again strongly prognostic in this validation cohort, for both overall and progression-free survival. Most compellingly, the composite model correctly classifies a much larger proportion of high-risk patients than IPI alone. Thus its application could immediately enable better risk-adapted therapeutic assignment in prospective trials.7
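To make the workflow above concrete, the following is a minimal sketch and not the published method. It assumes that a score is a simple weighted sum of expression values (plus IPI for the composite), and it uses Harrell's concordance index as a stand-in for the survival analysis, which the text does not specify. The cohort values, the weights, and the function names (`candidate_pairs`, `risk_score`, `concordance_index`) are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Toy cohort. Every number here is invented for illustration only.
# Fields: expression of each gene (arbitrary log-scale units), IPI score,
# follow-up in months, and whether death was observed (True) or the
# patient was censored at last follow-up (False).
PATIENTS = [
    {"LMO2": 9.0, "TNFRSF9": 7.0, "IPI": 1, "months": 60, "died": False},
    {"LMO2": 8.0, "TNFRSF9": 5.0, "IPI": 2, "months": 48, "died": False},
    {"LMO2": 6.0, "TNFRSF9": 6.5, "IPI": 2, "months": 30, "died": True},
    {"LMO2": 5.0, "TNFRSF9": 4.0, "IPI": 4, "months": 12, "died": True},
    {"LMO2": 7.0, "TNFRSF9": 3.0, "IPI": 3, "months": 20, "died": True},
]


def candidate_pairs(tumor_genes, microenvironment_genes):
    """Pair every tumor-cell gene with every microenvironment gene.

    This encodes the biological constraint: each pair has one gene
    from each compartment, never two from the same one.
    """
    return list(product(tumor_genes, microenvironment_genes))


def risk_score(patient, weights):
    """Weighted sum of the named fields; higher means higher predicted risk."""
    return sum(w * patient[name] for name, w in weights.items())


def concordance_index(patients, weights):
    """Harrell's C for a linear risk score on right-censored data.

    A pair (a, b) is comparable when a died before b's follow-up time.
    It is concordant when a also has the higher risk score; ties in
    the score count as half.
    """
    concordant, comparable = 0.0, 0
    for a in patients:
        if not a["died"]:
            continue  # a censored patient cannot anchor a comparable pair
        for b in patients:
            if a["months"] < b["months"]:
                comparable += 1
                ra, rb = risk_score(a, weights), risk_score(b, weights)
                if ra > rb:
                    concordant += 1.0
                elif ra == rb:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical weights: negative for the genes, because high expression of
# either gene was associated with better survival; positive for IPI.
LMO2_ONLY = {"LMO2": -1.0}
TNFRSF9_ONLY = {"TNFRSF9": -1.0}
TWO_GENE = {"LMO2": -1.0, "TNFRSF9": -1.0}
TWO_GENE_PLUS_IPI = {"LMO2": -1.0, "TNFRSF9": -1.0, "IPI": 0.5}

if __name__ == "__main__":
    print(candidate_pairs(["LMO2"], ["TNFRSF9"]))
    for label, weights in [
        ("LMO2 only", LMO2_ONLY),
        ("TNFRSF9 only", TNFRSF9_ONLY),
        ("two-gene score", TWO_GENE),
        ("two-gene score + IPI", TWO_GENE_PLUS_IPI),
    ]:
        print(label, round(concordance_index(PATIENTS, weights), 3))
```

On these made-up numbers the combined score ranks patients at least as well as either gene alone, but that is a property of the toy data and says nothing about real cohorts. In practice the weights would be estimated from training data, for example with a Cox proportional hazards model, and evaluated on independent samples.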
Figure 1.
Key criteria in developing a prognostic model for DLBCL were that it should: (a) reflect the biology of DLBCL, specifically integrating the role of the tumor microenvironment, (b and c) add to existing prognostic markers, (d) validate robustly in multiple independent patient cohorts, (e) use technologies routinely employed in clinical practice and (f) confer clinically actionable information.
LMO2 is a highly conserved gene that plays a key role in hematopoietic development, functioning as a bridging protein in transcriptional complexes. It gained notoriety as a potent oncogene in T-cell acute lymphoblastic leukemia when it was inadvertently activated by retroviral insertions in gene therapy trials for X-linked severe combined immunodeficiency.8 It is therefore ironic that LMO2 is a marker of the GCB-like subtype of DLBCL, and of better survival outcomes. Its functional role in DLBCL remains unclear, and its activity is clearly context- and tissue-dependent. By immunohistochemical staining, we found that CD137 protein expression is restricted to a subset of DLBCL-infiltrating T cells. Both CD4+ and CD8+ T cells expressed CD137, primarily in the memory phenotype (CD45RO+) compartment. Such infiltrating cells were absent from corresponding normal tissue samples, implying a specific interaction between DLBCL tumors and their microenvironment. CD137 expression was not a proxy for the number of infiltrating T cells, suggesting that its presence correlates with a specific activation program. Supporting this, resting T cells from peripheral blood express CD137 when brought into contact with DLBCL tumor cells, an effect enhanced by rituximab.9
Our results show how prudent incorporation of biological knowledge can assist in the construction of gene expression-based prognostic models from high-throughput data that validate robustly in independent test sets. At least in DLBCL, such tests can be based on assays of a small number of genes, and are therefore amenable to routine use in the clinic. We are currently developing robust, biologically motivated models for other malignancies, including acute myeloid leukemia.10 A future challenge will be to determine whether such strategies can be translated successfully to heterogeneous solid tumors.
Comment on: Alizadeh AA, et al. Blood. 2011;118:1350–1358. doi: 10.1182/blood-2011-03-345272.
References
- 1. Subramanian J, et al. J Natl Cancer Inst. 2010;102:464. doi: 10.1093/jnci/djq025.
- 2. Koscielny S. Sci Transl Med. 2010;2:14. doi: 10.1126/scitranslmed.3000313.
- 3. Haibe-Kains B, et al. Bioinformatics. 2008;24:2200. doi: 10.1093/bioinformatics/btn374.
- 4. Alizadeh AA, et al. Nature. 2000;403:503–511. doi: 10.1038/35000501.
- 5. Alizadeh AA, et al. Blood. 2011;118:1350–1358. doi: 10.1182/blood-2011-03-345272.
- 6. Lenz G, et al. N Engl J Med. 2008;359:2313–2323. doi: 10.1056/NEJMoa0802885.
- 7. de Jong D, et al. J Pathol. 2011;223:274–282. doi: 10.1002/path.2807.
- 8. Davé UP, et al. PLoS Genet. 2009;5:1000491. doi: 10.1371/journal.pgen.1000491.
- 9. Kohrt HE, et al. Blood. 2011;117:2423–2432. doi: 10.1182/blood-2010-08-301945. [Retracted]
- 10. Gentles AJ, et al. JAMA. 2010;304:2706–2715. doi: 10.1001/jama.2010.1862.

