Abstract
Massively parallel sequencing makes it possible to sequence whole genomes, exomes, and transcriptomes from many tumor samples. It is therefore now possible to comprehensively identify somatic mutations, including single base changes, deletions, insertions, and genomic rearrangements. Early results for hematopoietic tumors show great promise, but many questions remain to be answered.
Improved methods for sequencing the human genome at ever lower cost are now being applied to all kinds of tumors (Lander, 2011). A recent paper in Nature (Puente et al., 2011) offers a glimpse of the power of integrating genomic technologies, including global sequence analysis, in B-cell chronic lymphocytic leukemia (CLL). The genomic analysis began with an in-depth study of four patients representing the two major molecular subtypes of CLL: two with and two without somatic hypermutation of the immunoglobulin heavy chain variable region (IGHV). Paired advanced tumor and normal blood cells, isolated before and after treatment, respectively, were studied using multiple independent technologies: whole genome sequencing (WGS); mate pair sequencing of 2.5 kb DNA fragments for efficient detection of DNA rearrangements; and chip analyses for single nucleotide polymorphisms (SNPs), DNA copy number, and RNA expression. Cross-checking among these platforms allowed the authors to determine that sequencing identified 99.4% of the heterozygous SNPs. Importantly, 96% of a tested subset of the putative somatic mutations were validated by Sanger sequencing.
They found approximately 1000 somatic substitutions per CLL genome. The pattern of base changes and dinucleotide context differed between the IGHV-mutated and IGHV-unmutated tumors. The authors suggested that the higher frequency of A>T and C>G transversions in the IGHV-mutated cases was consistent with their introduction by the error-prone polymerase η during somatic hypermutation of immunoglobulin genes. Altogether, they identified changes in the protein-coding regions of 45 genes in the four tumors, including 41 non-synonymous single base substitutions and 5 insertions/deletions (indels). CLL typically has only a limited number of genomic rearrangements. The comprehensive mate pair analyses enabled the authors to identify and characterize ten large genomic alterations, six of which (including large 13q14 deletions in three tumors) have been reported previously in CLL.
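The principle behind using mate pairs to detect rearrangements can be illustrated with a minimal sketch: two ends sequenced from one 2.5 kb fragment should map about 2.5 kb apart on the same chromosome, so pairs that violate this expectation are candidate rearrangements. The record type, function name, and tolerance below are hypothetical choices for illustration and ignore read orientation; this is not the pipeline used by Puente et al.

```python
from typing import NamedTuple


class MatePair(NamedTuple):
    """Mapped positions of the two ends of one mate pair (hypothetical record)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def classify_pair(pair: MatePair, expected_insert: int = 2500, tolerance: int = 1000) -> str:
    """Label a mate pair as concordant or as a candidate rearrangement.

    expected_insert reflects the 2.5 kb fragments described in the text;
    tolerance is an assumed value, not one taken from the paper.
    """
    if pair.chrom1 != pair.chrom2:
        # Ends on different chromosomes suggest a translocation.
        return "interchromosomal"
    span = abs(pair.pos2 - pair.pos1)
    if span > expected_insert + tolerance:
        # Ends map too far apart: sequence between them may be deleted in the tumor.
        return "deletion_candidate"
    if span < expected_insert - tolerance:
        # Ends map too close together: the tumor may carry inserted sequence.
        return "insertion_candidate"
    return "concordant"


pairs = [
    MatePair("chr13", 1_000, "chr13", 3_400),    # span 2.4 kb, as expected
    MatePair("chr13", 1_000, "chr13", 901_000),  # span 900 kb
    MatePair("chr13", 1_000, "chr2", 5_000),     # different chromosomes
]
labels = [classify_pair(p) for p in pairs]
```

In practice a single discordant pair is weak evidence; a real caller would require several independent pairs clustering at the same breakpoints before reporting an alteration.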
Focusing on the 26 mutated, expressed genes, Puente et al. then extended these findings to a cohort of 169 CLL patients. Using a clever pooled-sequencing strategy, they found that four of these genes were also mutated in other CLL tumors, suggesting that they are driver genes: NOTCH1, 12.2%; MYD88, 2.9%; XPO1, 2.4%; and KLHL6, 1.8%. The non-synonymous to synonymous mutation ratios in the remaining 22 expressed genes and the 19 unexpressed genes were 2.83 and 2.71, respectively, consistent with the lack of selection expected if most are passenger genes (Chapman et al., 2011).
Of the NOTCH1 mutations, which occurred in 20% of IGHV-unmutated and 7% of IGHV-mutated CLL, 29/31 were the P2515Rfs*4 frameshift. This mutation had previously been identified in lymphoid malignancies, including T cell acute lymphoblastic leukemia and B-CLL. The P2515Rfs*4 mutation and the other two NOTCH1 mutations all generate premature stop codons predicted to result in truncated proteins lacking the destabilizing PEST domain. The authors confirmed that leukemias carrying the NOTCH1 P2515Rfs*4 mutation expressed higher levels of truncated NOTCH1, together with higher levels of NOTCH1 target genes. In addition, the NOTCH1 mutation is correlated with a more advanced clinical stage at diagnosis, shorter survival, and an increased frequency of transformation into diffuse large B-cell lymphoma (DLBCL). All of the MYD88 mutations, which were found in 0.8% of the IGHV-unmutated and 5.6% of the IGHV-mutated CLL, were the L265P substitution that has recently been reported to activate NFKB in some lymphomas (Ngo et al., 2011). All XPO1 mutations affected codon 571, suggesting that they are activating mutations, and all were found in patients with IGHV-unmutated CLL. In contrast, KLHL6 was found mutated only in 3 patients with IGHV-mutated CLL, with a pattern of mutation consistent with introduction during somatic hypermutation of immunoglobulin genes. Interestingly, KLHL6 had been reported to be mutated in the same region (between residues 49 and 90) in one patient with multiple myeloma (MM). If, as the authors suggest, KLHL6 is a target of the somatic hypermutation process, the presence of recurrent mutations in IGHV-mutated CLL and MM might indicate that this is a passenger and not a driver mutation. If these mutations are mediated by the somatic hypermutation process, one would expect synonymous mutations in this region of KLHL6 to be detected in normal or tumor post-germinal center B cells. In any case, additional experiments will be needed to determine whether KLHL6 mutation is a driver event.
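Because several of these frequencies rest on a handful of mutated cases, it helps to see how wide the statistical uncertainty is. The sketch below computes a Wilson score interval for NOTCH1, assuming 31 mutated cases (the denominator of the 29/31 figure above) among the 255 tumors listed as tested in Table 1. The function name and the choice of a Wilson interval are illustrative assumptions, not part of the analysis by Puente et al.

```python
from math import sqrt


def wilson_interval(mutated: int, tested: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a mutation frequency."""
    if tested <= 0 or not 0 <= mutated <= tested:
        raise ValueError("need 0 <= mutated <= tested and tested > 0")
    p_hat = mutated / tested
    denom = 1 + z**2 / tested
    center = (p_hat + z**2 / (2 * tested)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / tested + z**2 / (4 * tested**2)) / denom
    return center - half_width, center + half_width


# NOTCH1: 31 mutated of 255 tested (counts assumed as described in the lead-in).
notch1_low, notch1_high = wilson_interval(31, 255)
print(f"NOTCH1: {31 / 255:.1%} (95% CI {notch1_low:.1%} to {notch1_high:.1%})")
```

The same function applied to the rarer targets, which involve only a few mutated tumors each, gives proportionally much wider intervals, which is one reason larger cohorts are needed.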
It is of interest to compare this comprehensive genomic study of mutations in four CLL tumors with recent studies on four other kinds of hematopoietic tumors (Table 1).
Table 1. Global detection of coding mutations: five hematopoietic tumors
| Tumor¹ | # sequenced | Sequencing¹ | # AA² | Frequent targets³ | # tested | % | Coding changes | Comments |
|---|---|---|---|---|---|---|---|---|
| HCL | 1 | WE | 5 | BRAF | 48 | 100 | all V600E | |
| AML | 2 | WG | 10 | DNMT3 | 281 | 22 | | mostly AML-M5 and -M4 |
| | | | | NPM1 | 180 | 24 | | |
| | | | | IDH1 | 188 | 9 | | |
| | | | | N-RAS | 182 | 9 | | |
| | 9 | WE | 7 | DNMT3 | 112 | 21 | | mostly AML-M5 and -M4 |
| | | | | NPM1 | 112 | 26 | | |
| | | | | FLT3 | 112 | 19 | | |
| | | | | N-RAS | 112 | 11 | | |
| HL | 2 | RNA-seq | | CIITA fusion | 55 | 15 | | 4/131 (3%) DLBCL |
| PMBCL | - | | | CIITA fusion | 77 | 38 | | |
| CLL | 4 | WG | 10 | NOTCH1 | 255 | 12 | P2515Rfs*4 | 20% of unmutated CLL |
| | | | | MYD88 | 310 | 3 | all L265P | |
| | | | | XPO1 | 165 | 2 | E571K/G | All 4 unmutated CLL |
| | | | | KLHL6 | 160 | 2 | aa 49–90 | All 3 mutated CLL, also in 1/38 MM |
| MM | 23 | WG | 35 | NFKB pathway | 38 | 29 | mostly inactivation of neg. regulators | 13 genes/11 tumors |
| | | | | K-RAS | 38 | 26 | aa 13,19,61,63,64,146 | |
| | | | | N-RAS | 38 | 24 | aa 13,61 | |
| | | | | FAM46C | 38 | 13 | 1 apparently inactivating mut. I46fs | 4/5 are hyperdiploid |
| | 16 | WE | 28 | DIS3 | 38 | 11 | all missense | 3/4 are t(4;14) |
| | | | | BRAF | 199 | 4 | 2/4 V600E | |
| | | | | KLHL6 | 38 | 2 | F97L | |

¹ Abbreviations in text
² Average non-synonymous amino acid changes per tumor
³ Targets mutated in ≥ 4% of tumors surveyed except KLHL6 in MM
Whole exome sequencing (WES) of purified tumor cells from a single patient with hairy cell leukemia (HCL) identified 5 non-synonymous somatic mutations (Tiacci et al., 2011). One of these was the V600E mutation of BRAF, which is often present in melanoma and papillary thyroid cancers, and infrequently in MM. The authors then examined 47 additional HCL tumors and found the BRAF V600E mutation in all of them! Obviously, this result needs to be corroborated by others.
Two groups have performed WGS on two M1 subtype acute myeloid leukemia (AML-M1) tumors (Ley et al., 2010; Mardis et al., 2009) and WES on nine AML-M5 tumors (Yan et al., 2011), respectively. The first group identified eleven and ten genes with somatic mutations affecting protein coding in the two AML-M1 tumors. For each tumor, three of the genes were recurrently mutated in a larger panel of AML tumors. The second group identified 66 somatic mutations, including 58 single base changes and 8 indels, affecting 63 genes in the nine AML-M5 tumors. Fourteen of these mutations were present recurrently in a larger panel of AML-M5 tumors.
A study on MM reported 23 WGS and 16 WES results for 38 paired normal and MM samples (Chapman et al., 2011). By WGS, there was an average of 35 non-synonymous coding mutations, 0.6 indels, and 21 DNA rearrangements per tumor, whereas WES detected an average of only 28 amino acid-changing mutations. Although nearly 200 genes were mutated in more than one tumor, the authors estimated that only ten genes were mutated at statistically significant rates. However, they also attached significance to identical mutations occurring in different tumors, functionally related mutations in a pathway (e.g., NFKB activating mutations), clinically relevant mutations (e.g., BRAF, including some that were V600E), mutations in histone-modifying enzymes, and mutations that might impact microenvironment interactions.
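Why so few of the recurrently mutated genes reach significance can be seen from a simplified calculation: given a background mutation rate, how likely is it that a gene is mutated in at least k of n tumors by chance? The sketch below uses a plain binomial model as a stand-in for the more elaborate analysis in Chapman et al. (2011). The background rate, coding length, number of genes tested, and the count of five mutated tumors are hypothetical values chosen for illustration; only the cohort size of 38 comes from the text.

```python
from math import comb


def per_tumor_hit_probability(rate_per_base: float, coding_length: int) -> float:
    """Chance that a gene carries at least one background mutation in one tumor."""
    return 1 - (1 - rate_per_base) ** coding_length


def prob_at_least_k(n_tumors: int, k: int, p_per_tumor: float) -> float:
    """Binomial upper tail: P(gene is mutated in >= k of n_tumors by chance)."""
    return sum(
        comb(n_tumors, i) * p_per_tumor**i * (1 - p_per_tumor) ** (n_tumors - i)
        for i in range(k, n_tumors + 1)
    )


# Hypothetical inputs: 1 mutation per Mb background rate, 1.5 kb coding sequence.
p_hit = per_tumor_hit_probability(rate_per_base=1e-6, coding_length=1500)

# Gene observed mutated in 5 of 38 tumors (5 is an illustrative count).
p_value = prob_at_least_k(n_tumors=38, k=5, p_per_tumor=p_hit)

# Bonferroni correction for testing an assumed 20,000 genes.
corrected = min(1.0, p_value * 20_000)
```

Under the same assumptions, a gene mutated in only two tumors is far less striking once the correction for testing thousands of genes is applied, which is consistent with many doubly mutated genes failing to reach significance.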
Finally, a recent study used whole-transcriptome paired-end sequencing to identify fusion transcripts (Steidl et al., 2011). The authors initially identified 14 and 5 predicted fusion transcripts, respectively, in two Hodgkin's lymphoma (HL) cell lines. They then focused on fusions involving CIITA and identified recurrent translocations that fused the 5′ end of CIITA to multiple partner genes in 15% of HL and 38% of primary mediastinal B-cell lymphoma (PMBCL), which is phenotypically related to HL, but in only 3% of DLBCL. Additional studies showed that the hybrid protein made from the fusion transcripts had several potential effects on tumor-microenvironment interactions that favored survival of the tumor.
Comprehensive genome analysis is identifying important genetic abnormalities in hematologic malignancies. Nevertheless, distinguishing driver from passenger mutations remains a daunting task. In these early studies of hematopoietic tumors, most putative driver mutations are present in only a small fraction of tumors, suggesting great molecular diversity even for an apparently single clinical disease. Clearly, many additional samples will need to be sequenced to obtain a more complete picture. However, it will also be important to focus on changes other than non-synonymous coding changes, e.g., mutations in regulatory regions, structural variations with long-range effects on gene expression, and changes in the expression and forms of non-coding RNA. Ultimately, to make sense of all these findings it will be critical to perform multidimensional analysis of large cohorts of patients, ideally uniformly treated, with serial samples collected longitudinally at different disease stages and comprehensively analyzed in terms of DNA (including epigenetic modifications), RNA, and protein structure and function.
References
- Chapman MA, Lawrence MS, Keats JJ, Cibulskis K, Sougnez C, Schinzel AC, Harview CL, Brunet JP, Ahmann GJ, Adli M, et al. Initial genome sequencing and analysis of multiple myeloma. Nature. 2011;471:467–472. doi: 10.1038/nature09837.
- Lander ES. Initial impact of the sequencing of the human genome. Nature. 2011;470:187–197. doi: 10.1038/nature09792.
- Ley TJ, Ding L, Walter MJ, McLellan MD, Lamprecht T, Larson DE, Kandoth C, Payton JE, Baty J, Welch J, et al. DNMT3A mutations in acute myeloid leukemia. N Engl J Med. 2010;363:2424–2433. doi: 10.1056/NEJMoa1005143.
- Mardis ER, Ding L, Dooling DJ, Larson DE, McLellan MD, Chen K, Koboldt DC, Fulton RS, Delehaunty KD, McGrath SD, et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N Engl J Med. 2009;361:1058–1066. doi: 10.1056/NEJMoa0903840.
- Ngo VN, Young RM, Schmitz R, Jhavar S, Xiao W, Lim KH, Kohlhammer H, Xu W, Yang Y, Zhao H, et al. Oncogenically active MYD88 mutations in human lymphoma. Nature. 2011;470:115–119. doi: 10.1038/nature09671.
- Puente XS, Pinyol M, Quesada V, Conde L, Ordonez GR, Villamor N, Escaramis G, Jares P, Bea S, Gonzalez-Diaz Mv, et al. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature. 2011. doi: 10.1038/nature10113.
- Steidl C, Shah SP, Woolcock BW, Rui L, Kawahara M, Farinha P, Johnson NA, Zhao Y, Telenius A, Neriah SB, et al. MHC class II transactivator CIITA is a recurrent gene fusion partner in lymphoid cancers. Nature. 2011;471:377–381. doi: 10.1038/nature09754.
- Tiacci E, Trifonov V, Schiavoni G, Holmes A, Kern W, Martelli MP, Pucciarini A, Bigerna B, Pacini R, Wells VA, et al. BRAF mutations in hairy-cell leukemia. N Engl J Med. 2011;364:2305–2315. doi: 10.1056/NEJMoa1014209.
- Yan XJ, Xu J, Gu ZH, Pan CM, Lu G, Shen Y, Shi JY, Zhu YM, Tang L, Zhang XW, et al. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nat Genet. 2011;43:309–315. doi: 10.1038/ng.788.
