Abstract
The elucidation of the human and mouse genome sequence and developments in high-throughput genome analysis, and in computational tools, have made it possible to profile entire cancer genomes. In parallel with these advances mouse models of cancer have evolved into a powerful tool for cancer gene discovery. Here we discuss the approaches that may be used for cancer gene identification in both human and mouse and discuss how a cross-species ‘oncogenomics’ approach to cancer gene discovery represents a powerful strategy for finding genes that drive tumourigenesis.
Keywords: Cancer genome, Resequencing, Cross-species analysis, DNA copy number analysis, Mouse cancer models, Forward genetic screens
1. Genome-wide approaches for human cancer gene discovery
Tumours form when a cell gains a selective advantage over other cells and manages to evade the checkpoints that would normally suppress its growth. The acquisition of this behaviour is thought to occur as a result of the development of somatic mutations that deregulate gene function. Somatic mutations in humans may result from a multitude of genetic insults generating different types of genetic lesions. With the exception of point mutations, these lesions are rarely focal and often encompass many genes making the identification of the deregulated gene within the rearranged region problematic. In this review we will discuss the technological approaches that may be applied for finding cancer genes. An overview of these technologies is provided in Table 1.
Table 1.
Overview of genomic technologies for cancer gene discovery.
| Cancer gene discovery approach | Resolution | Pros | Cons |
|---|---|---|---|
| Gene resequencing | Nucleotide | Can be an accurate way of finding somatic mutations at the nucleotide level. With new array-based sequence enrichment technology the entire exome can be profiled. | PCR-based strategies are not readily scalable to genome-wide analysis and are expensive. Array-based sequence enrichment is still developmental and many protocols do not reproducibly capture the exome. |
| Expression analysis | Transcript | Expression data can be used for diagnostic and prognostic purposes. RNA-Seq based approaches can be used for profiling the transcriptome, including expression levels, splicing and fusion gene discovery. | Because expression profiling is a quantitative measure of gene activity, it reports gene expression changes that are both the cause and the effect of genetic and epigenetic changes at the DNA level, which often makes the output of these studies an ‘expression signature’ rather than a cancer gene. |
| Comparative Genomic Hybridization (CGH) | Megabase | Large complex rearrangements can be discovered using this technique. | Largely outdated by array-based approaches. Not readily scalable to high-throughput. Reveals large regions of rearrangement which may contain many genes, so finding causal rearrangements can be difficult. |
| Array-based Comparative Genomic Hybridization (aCGH) | 100s bp | High resolution. SNP-based platforms can report allele-specific changes. | Stromal contamination and immune cell infiltrates can influence the ability of these platforms to determine the copy number of the cancer. Like all genomics platforms, array-based CGH reports the copy number profile of a population of cells, so tumour heterogeneity can be an issue. |
| Sequencing entire cancer genomes | Nucleotide | Can report nucleotide level variation, copy number information and can also report neutral changes in the genome such as balanced translocations and inversions. | Extremely expensive. Not clear how to computationally resolve highly rearranged regions. |
| Epigenetic profiling | Nucleotide | Can detect epigenetically silenced genes that would be missed by other approaches. | It has proved difficult to develop technology to scale to genome-wide epigenetic profiling at the nucleotide level. |
1.1. Gene resequencing
Advances in DNA sequencing technology have enabled the identification of recurrent intragenic mutations across multiple cancer genomes. Davies et al. [1] screened the coding sequence and intron–exon junctions of BRAF for mutations in more than 900 human cancer cell lines and primary tumours, and found somatic missense mutations in 66% of malignant melanomas and in a smaller proportion of many other human cancers. 80% of BRAF-mutated melanomas were found to contain a V600E substitution, which is thought to constitutively activate the kinase by mimicking phosphorylation [1]. As the cost of sequencing has diminished, it has become possible to perform larger scale screens to look for mutations in multiple genes across multiple tumours. The first large-scale systematic mutational study of a complete gene family was performed by Bardelli et al. [2], who identified 7 candidate cancer genes in a screen of the tyrosine kinase gene family in 182 colorectal cancers. A further study of mutations in the tyrosine phosphatase gene family identified 6 putative tumour suppressor genes that were mutated in 26% of the colorectal cancers analysed [3]. Resequencing of the phosphatidylinositol 3-kinase (PI3K) gene family revealed one member, PIK3CA, which is frequently mutated in tumours of the colon, breast, brain and lung, with most mutations clustering in the catalytic domain [4]. Mutations in PIK3CA have since been identified in additional tumour types, such as hepatocellular carcinomas [5] and ovarian cancers [6,7]. A screen of serine/threonine kinases showed that 40% of colorectal tumours harbour a mutation in 1 of 8 PI3K-pathway genes [8]. The PI3K pathway regulates a wide range of cellular functions that are important in cancer, including growth, proliferation, survival, angiogenesis and migration [9].
Studies at the Wellcome Trust Sanger Institute have centred on the resequencing of the coding regions of all 518 protein kinase gene family members. A study of 25 breast cancers revealed diverse patterns of mutation, with a variation in the number of mutations and in the identity of mutated genes, such that no commonly point-mutated kinase gene was identified [10]. A study of 33 lung cancers reached similar conclusions [11]. While both studies showed an over-representation of nonsynonymous substitutions, which would be predicted for “driver” mutations that confer a selective growth advantage on the cancer cell, most of the mutations were thought to be “passenger” mutations that were unlikely to contribute to tumourigenesis. Protein kinase resequencing at the Sanger Institute has culminated in the identification of 921 base substitution somatic mutations in 210 diverse human cancer types [12]. Putative driver mutations were identified in 119 genes but 83% of mutations were predicted to be passengers. Cancers showed variation in mutation prevalence, with many of the cancer types with highest prevalence originating from high turnover, surface epithelia that are most exposed to mutagens [12]. Cancers also showed different “mutational signatures”, which often reflect differences in mutagenic exposure. For example, most lung cancers have a high proportion of C:G > A:T transversions, which are caused by exposure to tobacco carcinogens [11].
The first study to approach the scale of a genome-wide screen involved resequencing the coding regions of all (∼ 13,000) consensus coding sequence (CCDS) genes in 11 breast and 11 colorectal cancers [13]. Each cancer was found to harbour an average of 93 mutated genes, of which at least 11 (189 candidates in total) were thought to be driver mutations. Many of the functional groups and pathways enriched for candidate cancer genes were unique to one or other cancer type, suggesting differences in the tumourigenic process in breast and colorectal cancers [14]. There have been claims that the statistical analysis performed in this screen was flawed, in part because the authors used a different dataset to estimate background mutation rates, which can vary between and within cancer genomes, and because the sample size was small [15]. However, the findings of this study are in agreement with those of Greenman et al. [12] in suggesting that the genomic landscape of human cancers is more complex than previously thought [16]. The study has since been expanded to include all of the human RefSeq [17] genes and a larger number of breast and colorectal cancers [18]. A further 114 candidate cancer genes were identified, and most candidates were mutated in fewer than 5% of tumours, recapitulating the conclusions of previous studies [12,13]. Each tumour was predicted to contain an average of 15 potential driver mutations, suggesting that each mutation makes only a small contribution to tumourigenesis.
Although statistical methods can provide a prediction of the likely driver and passenger mutations within a cancer, there is a strong rationale for using functional assays to test these predictions. Frohling et al. [19] resequenced the coding exons and splice junctions of the receptor tyrosine kinase FLT3 in samples from patients with acute myeloid leukaemia (AML). They found that out of 9 mutants with candidate driver mutations, only 4 were able to transform cells in culture (for a review, see [20]).
The Sanger Institute Catalogue of Somatic Mutations in Cancer (COSMIC) collates and displays somatic mutation information relating to human cancers [21]. At the time of writing (December 2008, COSMIC release 40), the database contained mutation data for 4773 genes from 291,551 tumours. Gene resequencing is also a major component of the $50 million 3-year pilot phase of the Cancer Genome Atlas (http://cancergenome.nih.gov/), a large-scale collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI).
Gene resequencing studies have clearly proved to be a fruitful approach for candidate cancer gene discovery. Technological breakthroughs such as ‘exon-capture’ [22], where the entire exome can be captured by array hybridization and sequenced on a parallel sequencing platform, are likely to facilitate a dramatic increase in genome-wide mutation data and will move large-scale gene resequencing studies out of genome centres and place them within reach of most research laboratories. Already this technology can be used to capture the entire human exome and, with further improvements in the reproducibility of this technology, it is likely to become the method of choice for cancer mutation screens.
1.2. Gene expression profiling
Gene expression arrays can be used to analyse the transcription of thousands of genes, or the entire transcriptome, simultaneously. There are two main array-based gene expression platforms: cDNA arrays, where clones corresponding to the transcripts to be analysed are spotted onto a matrix, and oligonucleotide arrays, where oligonucleotides corresponding to the transcripts are synthesised onto a matrix along with mismatch control oligonucleotides. In two-colour microarray expression analysis, the sample of interest and a control sample are differentially labelled with fluorescent dyes and are hybridized onto the array, which is then scanned to determine the ratio of fluorescence intensities for each gene. The ratio represents the relative amounts of transcript in the sample. Unsupervised clustering of the expression data for multiple samples can be used to subcategorise cancers. For example, lung cancers cluster into known histological subtypes, which are predictive of patient survival [23–25]. Gene expression profiles may also provide an indication of the genes involved in oncogenesis in a given tumour. Lung cancers harbouring a mutation in KRAS have a characteristic expression profile that can be used in their identification [26]. However, analysis of gene expression rarely provides insights into the underlying genetic changes and it can be confounded by physiological variation, such as the degree of inflammatory response or hypoxia [27]. Nevertheless, it is important as a complementary approach to other methods of cancer profiling, such as mutational and copy number analysis. Integrative approaches involving gene expression and copy number analysis are discussed in the following section.
Recently, a new approach for transcriptome analysis, called RNA-Seq or transcript counting, was developed. This approach involves sequencing the transcriptome of a cell or tumour on a parallel sequencing platform. Gene expression levels are calculated by ‘transcript counting’. Because many millions of reads are generated from each lane of a parallel sequencing run, RNA-Seq has a large dynamic range. Early protocols for RNA-Seq involved generating short (25–30 bp) sequence tags of the transcriptome. More recent protocols involve the use of paired-end sequencing. Paired-end sequencing greatly facilitates the generation of physical coverage of the transcriptome and, as such, can be used to identify splice variants and fusion transcripts. Because sequence data can be used for nucleotide variant calling, RNA-Seq could potentially be used to profile a tumour's pattern of mutation. The other major advantage of RNA-Seq is that it is not limited by the probes that can be tiled on an array, and as such is a potent tool for transcript discovery. The RNA-Seq approach and its application are reviewed in Wang et al. [28].
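As a rough illustration of transcript counting, the sketch below converts raw read counts into reads per kilobase of transcript per million mapped reads (RPKM), one widely used unit that corrects for transcript length and sequencing depth. RPKM is assumed here purely as an example; the text above does not prescribe a particular normalisation, and the gene names, counts and lengths are invented.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical data: gene -> (reads mapped to the gene, transcript length in bp)
counts = {"GENE_A": (500, 2000), "GENE_B": (500, 1000)}
total_reads = 10_000_000  # assumed number of mapped reads in the lane

expression = {
    gene: rpkm(n_reads, length, total_reads)
    for gene, (n_reads, length) in counts.items()
}
```

Both hypothetical genes attract the same number of reads, but the shorter transcript receives twice the expression value because the count is scaled by length.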
1.3. Analysis of DNA copy number changes
Changes in DNA copy number result from chromosomal aberrations such as deletions and duplications, non-reciprocal translocations and gene amplifications. Copy number variants (CNVs) have been identified in all humans studied [29], and a genome-wide study of 270 apparently healthy individuals from four diverse populations identified almost 1500 germline copy number variable regions encompassing 12% of the human genome [30]. CNVs have been reported to account for ∼ 18% of the total detected variation in gene expression between individuals, suggesting that they make a considerable contribution to phenotypic variation [31]. In the context of cancer, genomic instability results in the acquisition of somatic copy number aberrations that may contribute to tumourigenesis through the amplification of oncogenes and/or loss of tumour suppressor genes.
Chromosomal instability, which may manifest as alterations in chromosome number (aneuploidy) or inter- or intra-chromosomal rearrangement, is thought to arise early in tumourigenesis but increases with tumour progression (for a review, see [32]). There are many mechanisms that underlie this transition and cells have evolved potent checkpoints to eliminate cells with unstable genomes. Fridlyand et al. [33] found that shorter or altered telomeres were associated with a greater number of genomic amplifications and that the frequency of low-level changes was associated with altered expression of genes involved in mitosis, cell cycle, DNA replication and repair. These included many genes that are direct targets of the transcription factor E2F [33]. This has led to the suggestion that the Rb pathway plays an important role in regulating chromosomal instability, as hypothesised by Hernando et al. [34,33]. Advanced tumours tend to reach a stable state, which, in the form of cancer cell lines, is stable over many generations and in different laboratories, suggesting that they have evolved to an optimal state [35]. It has been suggested that rather than a high level of chromosomal instability promoting tumourigenesis, highly unstable cells are selectively eliminated. This has led to a ‘just right’ model of chromosomal instability [36].
1.3.1. Using comparative genomic hybridization (CGH) to detect copy number changes
Large alterations in copy number were initially detected and quantified using metaphase spreads in a technique known as comparative genomic hybridization (CGH) [37]. In CGH, cancer and normal genomic DNA are differentially labelled with fluorochromes and are co-hybridized to normal metaphase chromosomes. Cot-1 DNA is added to suppress hybridization to repetitive elements in the genome. The ratio of fluorescence intensities at any chromosomal position is approximately proportional to the ratio of copy numbers of the cancer and normal DNA at that position (reviewed in [38]). CGH profiles can be viewed and compared using the NCBI Cancer Chromosomes database, which integrates three databases of chromosomal aberrations in cancer: the SKY/M-FISH and CGH Database, the Mitelman Database of Chromosome Aberrations in Cancer, and the Recurrent Chromosome Aberrations in Cancer Database [39]. Rearrangement breakpoints are linked to the underlying genome assembly. However, the tool is limited to cytogenetic resolution because CGH cannot detect changes of less than 20 Mb or distinguish changes that are close together, and cannot determine exact genomic coordinates [38].
Array CGH is a higher resolution, high-throughput version of conventional CGH, in which differentially labelled cancer and reference samples are hybridized to an array made from large genomic clones, e.g. bacterial artificial chromosomes (BACs), or cDNAs (for reviews, see [38,40,41]). The copy number is measured at each probe on the array, and can be mapped directly to the genome. A disadvantage of array CGH is that it cannot detect LOH, which has traditionally been identified using methods involving microsatellites and restriction fragment length polymorphisms (RFLPs) that are not suitable for large-scale analyses (see [42]).
Single nucleotide polymorphism (SNP) arrays are a more recent development in copy number analysis. SNPs, which have been reported to account for most of the genetic variation in the human genome [31], occur on average every 100–300 base pairs along the human genome sequence. The Affymetrix GeneChip Mapping Assay (http://www.affymetrix.com) is a commonly used procedure that combines a whole-genome sampling assay (WGSA) with high-density SNP arrays [43,44]. WGSA is used to reduce the complexity of the sample, and involves ligating a linker to restriction-digested DNA, which enables PCR amplification using a single primer that is complementary to the adaptor. The amplified DNA is then fragmented, labelled and hybridized to the array. SNPs within the amplified DNA are used as probes on the array, therefore maximising the amount of information that can be extracted from the experiment [45]. In the Affymetrix GeneChip Mapping 10K Assay (which uses an array containing 11,555 SNPs across the genome) WGSA involves a single restriction enzyme, XbaI [43]. Regions of the genome in which the XbaI site is rare will be under-represented in the array [45]. The higher resolution 100K SNP array therefore uses two restriction enzymes, XbaI and HindIII, which produce complementary SNP densities [44]. Each SNP in an Affymetrix array is represented by a “probe set” comprising multiple “probe quartets”. Each probe quartet consists of four 25mer oligonucleotides in the form of two “probe pairs” comprising a perfect match probe and a mismatch probe corresponding to each SNP allele. Probe quartets differ from one another in offset, i.e. the position of the polymorphic site relative to the centre of the oligonucleotide, and orientation (reviewed in [46]). Normal and tumour DNA are hybridized to different arrays, therefore avoiding the need for matched samples and allowing for a pool of normal samples to be used as a control. 
As in other forms of array CGH, the copy number at each probe can be inferred from the intensity of fluorescence of hybridized sample DNA [45,47].
Commercially available arrays now range in resolution from 10,000 to ∼ 1 million SNPs across the genome. SNP arrays therefore provide the potential for fine mapping of copy number changes, enabling the identification of small aberrations and accurate mapping of chromosomal breakpoints. Furthermore, the SNPs can be genotyped and compared to a normal sample to identify regions of loss of heterozygosity (LOH). This permits the identification of complex changes such as LOH without decrease in copy number and decrease in copy number without LOH [45,47,48]. Such changes are common, as demonstrated in pancreatic and cervical cancer cell lines, where the proportion of LOH associated with copy-reduction was found to be just 32% [49] and 25% [50], respectively. Allele-specific amplification can also be detected using SNP arrays.
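The combination of genotype and copy number described above can be sketched as a simple per-SNP classification. This is a minimal illustration and not the procedure of any cited package: the function name, the genotype strings and the `baseline` parameter (the copy number treated as "no change", for example 4 in a tetraploid line) are all assumptions made for the example.

```python
def classify_snp(normal_gt, tumour_gt, tumour_copy_number, baseline=2):
    """Classify one SNP from paired genotypes and an inferred copy number.

    Genotypes are allele strings such as "AB", "AA" or "AAB".
    Only SNPs that are heterozygous in the normal sample are informative.
    """
    if len(set(normal_gt)) != 2:
        return "uninformative"
    loh = len(set(tumour_gt)) == 1          # one allele has been lost
    reduced = tumour_copy_number < baseline  # fewer copies than baseline
    if loh and reduced:
        return "LOH with copy loss"
    if loh:
        return "LOH without copy loss"
    if reduced:
        return "copy loss without LOH"
    return "retained heterozygosity"


# Hypothetical calls at four SNPs
calls = [
    classify_snp("AB", "A", 1),                 # hemizygous deletion
    classify_snp("AB", "AA", 2),                # copy-neutral LOH
    classify_snp("AB", "AAB", 3, baseline=4),   # loss in a tetraploid genome
    classify_snp("AA", "AA", 2),                # homozygous in normal
]
```

Separating the two signals in this way is what allows LOH without a decrease in copy number, and a decrease in copy number without LOH, to be distinguished.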
CGH signal intensities must be normalised to account for technical bias while still retaining biologically relevant changes. Normalisation of array CGH data has generally involved the use of methods originally developed for normalising gene expression microarray data (for a review, see [51]). Cross-slide and within-slide normalisation are used to transform the data such that all arrays, and all the spots on each array, are comparable. In median normalisation, all values are multiplied by a constant factor so that all arrays have a median log2 ratio of 0. Lowess, or Loess, normalisation accounts for spot intensity biases and other dependencies such as the location of the spot on the array and the use of different print tips. The data are linearised by subtracting a Lowess regression curve. A number of additional methods for dealing with spatial effects in expression microarray data are reviewed in Neuvial et al. [52].
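Median normalisation, as described above, can be sketched in a few lines. The example assumes the input is a list of raw tumour/reference intensity ratios for one array; the values are made up, and Lowess correction is not shown.

```python
import math
from statistics import median


def median_normalise(ratios):
    """Return log2 ratios shifted so that the array median is 0."""
    log_ratios = [math.log2(r) for r in ratios]
    centre = median(log_ratios)
    # Subtracting the median in log2 space is equivalent to multiplying
    # every raw ratio by the constant factor 2 ** (-centre).
    return [lr - centre for lr in log_ratios]


raw = [0.5, 1.0, 2.0, 2.0, 4.0]  # made-up raw ratios for five probes
normalised = median_normalise(raw)
```

Applying the same function to every array makes their log2 ratios comparable, on the assumption that the median probe on each array represents an unchanged copy number.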
In general, array CGH must be more stringent than gene expression analysis because it is required to detect single copy changes and, while the copy number of a gene, unlike the expression level, is expected to be identical in two samples, this is often not the case due to tumour heterogeneity and the presence of contaminating stromal cells [53]. Khojasteh et al. [53] proposed a multi-step normalisation process specifically for dealing with array CGH data. A “spatial segmentation” algorithm has also been developed to account for array CGH-specific spatial effects designated “local spatial biases”, where clusters of spots show a shift in signal, and “continuous spatial gradient”, where there is a smooth gradient in signal across the array [52]. Staaf et al. [54] showed that copy number imbalances correlate with intensity in array CGH data and that normalisation of expression data erroneously corrects for biologically relevant gains in copy number. They have therefore developed a normalisation algorithm that prevents suppression of copy number ratios by stratifying the data into separate populations representing discrete copy number levels [54]. Array CGH data are also affected by a genome-wide technical artefact termed “spatial autocorrelation”, or “wave”, for which the peaks and troughs are aligned across samples but the amplitude, and for some samples, the direction, varies [55]. Removal of the wave using a Lowess curve led to an increase in the number of biologically relevant CNVs detected in array CGH data from normal individuals [55].
Affymetrix has developed a number of procedures for normalising SNP array CGH data. As described above, each SNP on an Affymetrix array is represented by a probe set comprising multiple probe pairs. Fluorescence on the mismatch probes represents non-specific hybridization, and the data can be corrected by subtracting the mismatch from the perfect match intensity for each probe pair. The corrected intensities are then averaged across the probe set. The data can be globally normalised by multiplying the average intensity of the experimental array, i.e. the array to which the cancer sample is hybridized, by a normalisation factor to make it numerically equivalent to the average intensity of the control array, to which a normal sample is hybridized. Intensity ratios are calculated by dividing the average intensity for each SNP in the experimental array by the equivalent value in the control array. Three software packages that are commonly used for processing copy number data on Affymetrix SNP arrays are Copy Number Analyser for GeneChip arrays (CNAG) [56], DNA-Chip Analyzer (dChip) [47] and Affymetrix GeneChip Chromosome Copy Number Analysis Tool (CNAT) [57]. These are compared and reviewed in Baross et al. [58], who concluded that the detection of all real CNVs from a 100K array necessitated the combined use of multiple procedures. The CNAG tool corrects for inter-experimental variation in PCR kinetics by compensating for differences in the length of PCR fragments and GC content, and uses the average of at least 5 “best-fit” normal samples that show the least variation between arrays as a control [56]. A recently published algorithm, ITALICS, addresses the problems of PCR kinetics plus additional sources of nonrelevant variation between the probe quartet intensities, such as systematic variation and spatial effects, and uses an iterative normalisation approach to estimate and remove the nonrelevant effects from the estimated biological signal [59].
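The mismatch correction, probe-set averaging, global normalisation and ratio steps described at the start of this paragraph can be sketched as follows. This is an illustrative reading of those steps, not code from CNAG, dChip or CNAT; the function names, SNP identifiers and intensities are hypothetical.

```python
from statistics import mean


def probe_set_signal(probe_pairs):
    """Average of (perfect match - mismatch) over the probe pairs of one SNP."""
    return mean([pm - mm for pm, mm in probe_pairs])


def intensity_ratios(tumour, normal):
    """Globally normalise the tumour array to the normal array, then
    return the tumour/normal intensity ratio for each SNP.

    Both arguments map a SNP identifier to a list of (PM, MM) intensities.
    """
    t = {snp: probe_set_signal(pairs) for snp, pairs in tumour.items()}
    n = {snp: probe_set_signal(pairs) for snp, pairs in normal.items()}
    # Factor that makes the mean tumour intensity equal the mean normal intensity
    factor = mean(list(n.values())) / mean(list(t.values()))
    return {snp: t[snp] * factor / n[snp] for snp in t}


# Hypothetical (PM, MM) intensities for two SNPs on each array
tumour = {"snp1": [(300.0, 100.0), (320.0, 120.0)],
          "snp2": [(500.0, 100.0), (520.0, 120.0)]}
normal = {"snp1": [(200.0, 100.0), (220.0, 120.0)],
          "snp2": [(210.0, 110.0), (190.0, 90.0)]}

ratios = intensity_ratios(tumour, normal)
```

In this toy input the ratio at `snp2` is twice that at `snp1`. Note that global scaling of this kind implicitly assumes that most of the genome is unchanged, which may not hold in highly rearranged tumours.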
ITALICS uses the GLAD algorithm [60] to estimate the biological signal by inferring the copy number across the genome [59]. GLAD is one of many methods that have been developed for segmenting the genome into regions of homogeneous copy number. Different approaches have been used, including change-point analysis, where the genome is segmented at points in the genome where the copy number changes significantly [61,62], Hidden Markov Models (HMMs) [56,63–67], hierarchical clustering along chromosomes [68] and smoothing methods [69,70]. There are also a number of web-based applications, such as ADaCGH [71] and CGHweb [72], for viewing and comparing outputs from multiple algorithms. Further methods have been developed to identify copy number changes specifically in SNP array CGH data, which has increased noise at the probe level compared with BAC array CGH [73], and a number of these infer allele-specific copy numbers [56,73–76].
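To make the idea of change-point segmentation concrete, the toy sketch below finds the single position that best splits a run of probe log2 ratios into two segments of homogeneous copy number. It is deliberately simplistic and is not GLAD or any of the cited algorithms, which handle multiple breakpoints, noise models and significance testing; the function name and data are invented.

```python
def best_change_point(log_ratios):
    """Return the index k that best splits the profile into [0:k] and [k:].

    The chosen split minimises the summed squared deviation of each
    segment from its own mean. Requires at least two probes.
    """
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    candidates = range(1, len(log_ratios))
    return min(candidates,
               key=lambda k: sse(log_ratios[:k]) + sse(log_ratios[k:]))


# Made-up probe log2 ratios ordered along a chromosome: a gain starts at probe 4
profile = [0.0, 0.1, -0.1, 0.0, 0.6, 0.5, 0.7, 0.6]
breakpoint_index = best_change_point(profile)
```

Applying such a split recursively, with a stopping rule, is one way in which a whole chromosome could be partitioned, although published methods differ considerably in how they decide whether a change is significant.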
Finally, having identified regions of copy number change, the statistical power can be increased by examining the region across many samples. Unlike for CNVs in normal samples, cross-sample analysis of copy number changes in cancer is hampered by the large size of many rearrangements, variation in the location of breakpoints between samples, and sample heterogeneity that prevents accurate estimation of the copy number [55]. A handful of methods have been developed to identify recurrent regions of copy number change in tumours: CMAR [77], STAC [78], H-HMM [79] and KC-SMART [80]. The last of these is the only algorithm that does not discretise the data into 3 states (1, 0, − 1), which can lead to undetected copy number changes in heterogeneous tumours, and it enables the analysis of both large and small aberrations [80].
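The effect of discretisation on heterogeneous tumours can be seen in a small sketch. The thresholds of ±0.3 are arbitrary values chosen for illustration, as are the function names and the three sample profiles; none is taken from the cited methods.

```python
def discretise(log_ratio, gain=0.3, loss=-0.3):
    """Map a log2 ratio to 1 (gain), -1 (loss) or 0 (no change)."""
    if log_ratio >= gain:
        return 1
    if log_ratio <= loss:
        return -1
    return 0


def gain_frequency(samples):
    """Fraction of samples called as gained at each probe position."""
    n = len(samples)
    return [sum(discretise(s[i]) == 1 for s in samples) / n
            for i in range(len(samples[0]))]


# Made-up log2 ratios at three probes in three tumours
samples = [
    [0.0, 0.8, 0.9],  # clear gain at probes 1 and 2
    [0.1, 0.7, 0.0],  # gain at probe 1 only
    [0.0, 0.2, 0.0],  # attenuated gain at probe 1, e.g. a heterogeneous tumour
]
freq = gain_frequency(samples)
```

The third sample carries a weak signal at probe 1 that falls below the threshold, so it does not contribute to the recurrence count; this is the kind of information that is lost when continuous ratios are reduced to three states.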
1.3.2. Analysis of copy number changes in cancer genomes
CGH can detect aneuploidy, gene amplifications and deletions, and non-reciprocal translocations in cancer genomes. Gene amplifications are gains in copy number of restricted regions of DNA [81] that contribute to tumourigenesis by increasing the transcript levels, and therefore the protein levels, of oncogenes [82]. Gene amplification is the major mechanism of oncogenesis for a number of cancer genes, including MYCN, which is amplified in ∼ 30% of advanced neuroblastomas [83]. Amplified genes represent a promising target for cancer therapy, as demonstrated in breast cancers harbouring an amplified HER2/ERBB2 receptor [84].
Deletions are an important mechanism for inactivating tumour suppressor genes, including PTEN [85] and CDKN2A (INK4A/ARF) [86]. A genome-wide analysis of homozygous deletions in over 600 cancer cell lines showed that deletions occur in regions with fewer genes and repeat elements but higher flexibility compared with the rest of the genome [87]. A significant proportion occur in regions that are prone to chromosome breakage, and some of the genes in these “fragile sites”, such as WWOX and FHIT, show similar mutational patterns to known tumour suppressor genes, so it is not clear whether or not these genes are causally implicated in cancer [88].
Like gene expression analysis, copy number profiling can be used to subcategorise cancers. It can distinguish three subtypes of glioblastoma [89], and separates leiomyosarcomas into a distinct cluster from gastrointestinal stromal tumours, which, until recently, were classified as the same tumour type [90]. It also provides predictive power in breast cancer prognosis, where a poor prognosis is indicated by high-level amplification [91], extensive chromosome instability [33] and/or the presence of multiple, closely spaced amplicons, or “firestorms”, on a single chromosome arm [92]. Copy number profiles can also help to stage a tumour, such as in cervical cancer, where gain of chromosome 3q is associated with the transition from severe dysplasia to invasive carcinoma [93]. Furthermore, studies in ovarian cancer have revealed an association between drug response and the presence of copy number changes associated with drug sensitivity or resistance [94,95]. The amplification of genes involved in drug metabolism or inactivation is commonly observed in cultured cells as a means of acquiring drug resistance [32].
While many cancer genomes have been analysed for copy number changes, there has been limited progress in identifying the functional significance of altered regions. One successful approach involves identifying recurrently altered regions that are specific to particular tumour types. This enables the identification of “lineage addiction” cancer genes, which may target essential lineage-specific survival functions and therefore represent promising therapeutic targets [96]. Two such genes are the melanoma-specific oncogene MITF, which is selectively amplified and overexpressed in 20% of melanomas [97], and NKX2-1, which lies in the minimal amplified region of a lung-cancer-specific 14q13.3 amplicon found in up to 20% of lung cancers [98,99]. Genes TTF1 and NKX2-8 are usually co-amplified with NKX2-1 in the 14q13.3 amplicon and all three genes have been shown to co-operate in lung tumourigenesis [99]. The co-occurrence and mutual exclusivity of copy number alterations at different loci may also reflect co-operating and complementary cancer genes, respectively. For example, gains of ERBB2 and CCNE1 frequently co-occur in bladder cancer, while CCND1 and E2F1, which function in the same pathway, never co-occur [100].
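One conventional way to ask whether two alterations co-occur, or avoid each other, more often than expected by chance is a one-sided Fisher's exact test on the 2 × 2 table of sample calls. The sketch below is an assumption about how such a test could be set up, not the analysis used in the cited bladder cancer study, and the counts in the example are invented.

```python
from math import comb


def cooccurrence_p_values(both, only_a, only_b, neither):
    """One-sided Fisher's exact tests on a 2x2 table of alteration calls.

    Returns (p_cooccur, p_exclusive): the probability of observing at
    least, or at most, `both` doubly altered samples if alterations at
    the two loci were independent.
    """
    n = both + only_a + only_b + neither
    a_total = both + only_a
    b_total = both + only_b

    def prob(k):
        # Hypergeometric probability of exactly k doubly altered samples
        return (comb(a_total, k) * comb(n - a_total, b_total - k)
                / comb(n, b_total))

    lowest = max(0, a_total + b_total - n)
    highest = min(a_total, b_total)
    p_cooccur = sum(prob(k) for k in range(both, highest + 1))
    p_exclusive = sum(prob(k) for k in range(lowest, both + 1))
    return p_cooccur, p_exclusive


# Hypothetical cohort: 5 tumours with gain of locus A only, 5 with locus B only
p_co, p_ex = cooccurrence_p_values(both=0, only_a=5, only_b=5, neither=0)
```

For this invented cohort the exclusivity p-value is 1/252, consistent with the two alterations being complementary. In practice, correction for multiple testing across many locus pairs would also be needed.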
The identification of cancer genes in regions of copy number change can be challenging because changes often span large regions of the genome that encompass many genes and may include many attractive candidates. Gains of more than one copy may have involved multiple evolutionary events and the critical gene may reside at the highest peak in copy number, as demonstrated for oncogenes CYP24 and ZNF217 in breast cancer [101]. Measurement of gene expression is also important for evaluating candidate cancer genes. SPANXB was identified as the putative critical gene in an Xq duplication in acute lymphoblastic leukaemias with an ETV6/RUNX1 translocation as a result of high and uniform overexpression across all samples [102]. While gene expression and gene dosage are rarely perfectly correlated, many studies, such as the comparison of array CGH and gene expression data in breast cancers, have shown good correlation [103,104]. However, genes that are amplified are not necessarily overexpressed, as demonstrated by Kloth et al. [50], who did not observe a genome-wide correlation between copy number and gene expression in cervical cancer cell lines. Gene expression is influenced by factors other than gene dosage, such as the availability of transcription and regulatory factors, DNA methylation and chromatin conformation, and the presence of miRNAs [50].
The integration of copy number analysis with gene resequencing also facilitates cancer gene identification. Mullighan et al. [105] performed a genome-wide analysis of genetic alterations in 242 paediatric acute lymphoblastic leukaemias (ALL) using 100K and 250K SNP arrays. They found mutations in genes that regulate late B lymphocyte development in 40% of B-progenitor ALL cases. PAX5 mutations, which included deletions, point mutations and translocations, were identified in 32% of cases [105]. ALL genomes are relatively stable, but genomes harbouring different translocations show variability in the number of copy number changes, which may reflect differences in the number of events required for tumourigenesis [105,106]. The integration of resequencing data, and epigenetic data, can facilitate the identification of tumour suppressor genes in regions of LOH, where the other allele may be inactivated by point mutation or epigenetic changes.
The identification of human cancer genes is aided by the integration of a number of complementary genome-wide analyses of human cancers, but the integration of cancer-associated mutation datasets from other species, particularly the mouse, provides an even more powerful approach for cancer gene discovery. Cross-species cancer gene analysis is discussed in Section 3.
1.3.3. Limitations of CGH and alternative strategies
Limitations of CGH-based approaches include the inability to determine the ploidy of the sample or to identify the location of rearranged sequences in the cancer genome. However, the ploidy and location of larger rearrangements (> 10 Mb) can be discerned by combining CGH with G-banding or Spectral Karyotyping (SKY) [107]. CGH may also struggle to detect low-level changes and changes in heterogeneous samples, e.g. primary cancers containing normal stromal cells, and it is affected by low-copy reiterated sequences, including gene paralogues (for a full review, see [108]).
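The effect of contaminating normal cells on the CGH signal can be illustrated with a simple mixing calculation. The sketch below assumes that the normal cells and the reference sample are diploid at the locus and that the measured ratio is a linear average over the two cell populations; it ignores the normalisation issues that arise when tumour ploidy is unknown. The function name is hypothetical.

```python
import math


def expected_log2_ratio(tumour_copies, tumour_fraction, normal_copies=2):
    """Expected log2 ratio at a locus for a mixture of tumour and normal cells.

    tumour_fraction is the proportion of cells in the sample that are tumour.
    Assumes the reference sample carries normal_copies at the locus.
    """
    if not 0.0 <= tumour_fraction <= 1.0:
        raise ValueError("tumour_fraction must lie between 0 and 1")
    mixed = (tumour_fraction * tumour_copies
             + (1.0 - tumour_fraction) * normal_copies)
    if mixed <= 0:
        raise ValueError("log2 ratio is undefined when no copies are present")
    return math.log2(mixed / normal_copies)


# A single-copy gain in a pure tumour sample versus one with 70% stromal cells.
print(round(expected_log2_ratio(3, 1.0), 3))
print(round(expected_log2_ratio(3, 0.3), 3))
```

Under these assumptions the signal from a single-copy gain shrinks markedly as the proportion of normal cells increases, which is why low-level changes in heterogeneous primary samples can fall below the detection threshold.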
A further limitation of CGH is that while it can detect non-reciprocal, or unbalanced, translocations, which result in the gain or loss of DNA and often cause the inactivation of tumour suppressor genes [109], it cannot detect reciprocal, or balanced, translocations. These result in fusion transcripts or transcriptional deregulation due to the positioning of an intact gene next to the promoter or enhancer elements of another gene. Until recently, it was thought that balanced translocations predominated in haematopoietic tumours, but an assessment of data in the Mitelman Database of Chromosome Aberrations in Cancer suggests that they also play an important role in epithelial tumourigenesis [109]. Furthermore, human solid tumours appear to contain large numbers of gene fusions [110] and a quarter of the breakpoints detected in 3 breast cancer cell lines were found to be balanced [111].
Balanced translocations are often initiating events in tumourigenesis that are essential for tumour development, and they therefore represent promising therapeutic targets. However, the high-throughput identification of balanced translocations has been hindered because translocation breakpoints cannot be amplified by PCR [111]. Genome-wide techniques for identifying translocations include array painting, in which chromosomes are sorted and DNA is amplified and hybridized to DNA microarrays [111], and informatics approaches, such as the algorithm developed by Tomlins et al. [112] that used RNA expression data to identify candidate gene fusions in prostate cancers. The EML4-ALK fusion was identified in non-small-cell lung cancers by paired-end sequencing [113].
End-sequence profiling (ESP) can be used to precisely map all types of genomic rearrangements, including balanced translocations [114] (Fig. 1). ESP involves constructing a BAC library from the cancer genome and sequencing the ends of clones to identify rearrangements, which map to locations in the reference genome that are of abnormal distance or orientation [114]. DNA may also be sheared and end-sequenced on a parallel sequencing platform [115]. The method can also identify fusion transcripts (tESP) and can be targeted to specific amplicons [110]. Complete sequencing of the BACs enables detailed analysis of the structure of genomic rearrangements, enabling elucidation of the mechanisms of rearrangement. ESP-based analysis of 4 cancer amplicons revealed evidence for sister chromatid break–fusion–bridge cycles, the excision and reintegration of double minutes (extrachromosomal DNA), and more complex architectures involving clusters of small genomic fragments [81]. Break–fusion–bridge cycles are initiated by a double-strand chromosomal break, which, following DNA synthesis, results in sister chromatids with identical free DNA ends that fuse to one another to prevent apoptosis. An anaphase bridge is formed during chromatid separation in mitosis, and this results in a new double-strand break and reinitiation of the cycle [116].
Fig. 1.
End-sequence profiling of tumour DNA. The tumour genome is fragmented and the ends of the fragmented DNA molecules are sequenced. These sequenced ends are then mapped to the reference genome. Ends that are an abnormal distance apart, or in an abnormal orientation, shown here as “invalid”, are indicative of rearrangements within the tumour genome. Redrawn with modifications from Raphael et al. [117].
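The classification of end pairs as valid or invalid can be sketched as follows. This is an illustrative outline rather than the published ESP pipeline: it assumes that each end has already been mapped to a chromosome, position and strand, that a valid pair has its leftmost end on the forward strand and its rightmost end on the reverse strand, and that the acceptable insert size range is supplied by the user. The class name, function name and labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EndPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair, min_insert, max_insert):
    """Label an end pair as 'valid' or by the way in which it is invalid."""
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Order the two ends by their position on the reference genome.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    # Assumed convention: ends of an unrearranged fragment point towards each other.
    if (left_strand, right_strand) != ("+", "-"):
        return "abnormal orientation"
    if not min_insert <= right_pos - left_pos <= max_insert:
        return "abnormal distance"
    return "valid"


# Hypothetical BAC-sized inserts of 100-250 kb.
pairs = [
    EndPair("chr1", 1_000_000, "+", "chr1", 1_150_000, "-"),
    EndPair("chr1", 1_000_000, "+", "chr1", 9_000_000, "-"),
    EndPair("chr1", 1_000_000, "+", "chr1", 1_150_000, "+"),
    EndPair("chr1", 1_000_000, "+", "chr8", 5_000_000, "-"),
]
for p in pairs:
    print(classify_pair(p, 100_000, 250_000))
```

In practice, clusters of invalid pairs that support the same rearrangement are required before a breakpoint is called, because single invalid pairs may result from chimeric clones or mapping errors.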
ESP analysis of 6 epithelial cancers, including primary tumours from brain, breast and ovary, plus a metastatic prostate tumour and 2 breast cancer cell lines, revealed extensive chromosomal rearrangements, some of which appeared to be recurrent [117]. However, ESP is not suitable for analysing large numbers of cancer genomes. A high-throughput approach, which involves massively parallel sequencing of the ends of randomly sheared DNA, has recently been applied to the genome-wide analysis of somatic and germline rearrangements in 2 lung cancers [115]. The analysis revealed a wide spectrum of rearrangements, as well as providing high-resolution copy number information. Despite the benefits of this strategy, sequencing large numbers of clones across many cancer genomes is costly and impractical. However, Bashir et al. [118] have derived a formula to maximise the probability of detecting fusion genes with the least amount of sequencing. The formula depends on the distribution of gene lengths and the parameters of the sequencing strategy used [118]. Paired-end sequencing is an attractive strategy for the complete characterisation of rearrangements in cancer. However, the recent discovery that cytogenetically balanced translocations are frequently associated with focal copy number alterations suggests that high-resolution array CGH may in fact be suitable for detecting most translocations in cancer [107].
1.4. Sequencing whole cancer genomes
With the advent of new sequencing technologies it is now possible to screen the cancer genome for rearrangements [115] and to sequence entire genomes [119,120]. This technology has rapidly been applied to the study of the cancer genome [121]. The first cancer genome to be sequenced was that of an acute myeloid leukaemia with a normal karyotype [121]. This study identified mutations in a number of previously known cancer genes and several mutations in novel genes. When the novel mutations were investigated in a larger set of acute myeloid leukaemias, many were found to be unique to the sequenced cancer. The short read technology used in these human genome sequencing projects is ideal for finding structural rearrangements and for nucleotide variant calling in the non-repetitive regions of the genome, but resolving entire genomes to the quality of the human genome assembly is not possible with this technology, owing to the repetitive nature of the human genome and the significant level of rearrangement normally found in human cancers. Completely sequencing an entire cancer genome will require the application of some of the new experimental long read sequencing technologies. The other confounding factor in the analysis of tumours is that they exhibit significant molecular heterogeneity, which adds to the complexity of deciphering the cancer genome and makes cancer genome sequencing a significantly more complex undertaking than sequencing genomes to assess germline variation. Despite these difficulties, sequencing whole cancer genomes is clearly a major advance from which we will learn a significant amount about the genes involved in cancer formation and how the cancer genome evolves. Recently, the International Cancer Genome Consortium was formed to co-ordinate the sequencing of human cancer genomes. This ambitious project will sequence the genomes of all of the major cancer types and will undoubtedly precipitate a revolution in how we think about cancer and the genes that drive its development.
1.5. Epigenetic profiling
Epigenetic changes are chemical modifications to the DNA or histones that change the structure of chromatin but do not alter the DNA sequence. If chromatin is in the condensed conformation, transcription factors cannot access the DNA and genes are therefore not expressed, whereas genes in open chromatin can be expressed as required. DNA methylation and changes in chromatin conformation have both been implicated in tumourigenesis. DNA methylation of CpG islands, which are located in promoter regions, can result in gene “silencing” by preventing transcription factor binding. It can also repress gene expression by recruiting methyl-binding domain proteins, which associate with histone deacetylases (HDACs). HDACs mediate chromatin condensation by deacetylating histones [122].
Aberrant DNA methylation of CDKN2A has been observed in a wide range of common cancer types [123,124], while VHL and BRCA1 are silenced by methylation in a significant proportion of kidney [125] and breast and ovarian cancers [126], respectively. VHL and BRCA1 are also frequently mutated in cancer, but for other tumour suppressor genes, such as RASSF1A, promoter hypermethylation appears to be the principal mechanism for inactivation (for a review, see [127]).
Methods involving high-density oligonucleotide arrays have been developed for genome-wide detection of epigenetic changes. Detection of DNA methylation relies on the ability to distinguish cytosine from 5-methylcytosine, while histone modifications can be detected using chromatin immunoprecipitation (ChIP). Large genomic regions, such as an entire chromosome arm, can show aberrant methylation in cancer [128], and there is evidence to suggest that some cancers show a CpG island methylator phenotype (CIMP). CIMP+ colorectal cancers have significantly more hypermethylation at CpG islands, including an increased incidence of CDKN2A and THBS1 methylation [129], and they are characterised by a methylated mismatch repair gene, MLH1, which gives rise to microsatellite instability [130]. Genes that are reversibly repressed by Polycomb proteins in embryonic stem cells are significantly over-represented amongst constitutively hypermethylated genes in colorectal cancers [131]. This provides support for the theory of a stem cell origin of cancer. A detailed discussion of the epigenomics of cancer is beyond the scope of this review, which focuses on changes in cancer that alter the DNA sequence. Epigenomics approaches are reviewed in Callinan and Feinberg [132] and, for a detailed review of epigenomics and its relevance to the cancer stem cell hypothesis, see Jones and Baylin [133].
1.6. Insertional mutagenesis and cancer gene discovery in human
In the following section of this review we will discuss the application of mouse models for cancer gene discovery, including how insertional mutagenesis has been deployed to find cancer genes. It is important, however, to note that insertional mutagenesis is not confined to experimental organisms, and is a mechanism of cancer initiation in humans.
Several oncogenic viruses afflict humans, including the human papilloma virus, the Human T-cell Lymphotropic Virus (HTLV1), the hepatitis family of viruses, and the human immunodeficiency virus (HIV), all of which have been implicated as insertional mutagens. In each case, it has been shown that the virus may integrate near cancer-related genes, although whether clonal insertion events occur is unclear [134–137]. Insertional mutagenesis in humans has been proven to occur in patients who have received retroviral gene therapy for SCID-X1. Some of these patients developed T-ALL after having received an autologous transplant of cells transduced with a retrovirus expressing a wildtype copy of the γc gene, which is mutated in SCID-X1. Several of these patients acquired clonal viral insertions upstream of LMO2, implicating this gene as an oncogene [136]. Recently it was suggested that transcriptional upregulation of LMO2 by retroviral insertions alone was insufficient for cancer to form, and that alterations in other genes such as NOTCH1 were required [138].
1.7. Using pathways to predict cancer genes and their function
In this review, we have largely focused on gene discovery by looking for mutated, silenced, or rearranged genes. An alternative way of discovering cancer genes is to build pathways around them and to examine how they are ‘connected’ to each other and how they participate in a biological process. This approach is called ‘network modelling’. By combining gene expression data, functional genomic data and proteomic data, components of the network can be linked. This approach was shown to link breast cancer susceptibility to centrosome dysfunction in tumours carrying BRCA1 mutations, and importantly identified the HMMR gene as a new breast cancer susceptibility gene [139]. An alternative approach is to develop ‘module maps’ which cluster genes together based on their behaviour or expression. Recently it was shown that genes may be clustered together to form an embryonic stem cell module map, and that expression of genes that form this signature of ‘stemness’ is predictive of decreased survival in both mouse and human cancers [140,141]. Similarly, genes may be implicated as possible cancer genes based on their functional or physical interaction with known cancer genes [142].
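The idea of implicating genes through their interactions with known cancer genes can be illustrated with a simple "guilt by association" ranking. The sketch below is not the method used in the cited studies: it assumes an undirected list of pairwise interactions and scores each gene by the number of known cancer genes among its direct partners. The function name and all gene names are hypothetical placeholders.

```python
def rank_candidates(interactions, known_cancer_genes):
    """Rank genes by how many known cancer genes they interact with directly."""
    neighbours = {}
    for gene_a, gene_b in interactions:
        neighbours.setdefault(gene_a, set()).add(gene_b)
        neighbours.setdefault(gene_b, set()).add(gene_a)
    scores = {
        gene: len(partners & known_cancer_genes)
        for gene, partners in neighbours.items()
        if gene not in known_cancer_genes
    }
    # Highest score first; ties are broken alphabetically for reproducibility.
    return sorted(
        ((gene, score) for gene, score in scores.items() if score > 0),
        key=lambda item: (-item[1], item[0]),
    )


# Placeholder network: KNOWN_1 and KNOWN_2 stand for established cancer genes.
interactions = [
    ("KNOWN_1", "GENE_A"),
    ("KNOWN_2", "GENE_A"),
    ("KNOWN_1", "GENE_B"),
    ("GENE_B", "GENE_C"),
]
print(rank_candidates(interactions, {"KNOWN_1", "KNOWN_2"}))
```

Published network models weight interactions by the strength of the supporting evidence and combine several data types; counting direct partners is only the simplest possible scoring scheme.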
2. Cancer gene discovery in the mouse
2.1. The mouse as a model for studying cancer
The mouse is a leading model system for cancer research because it has a rapid reproduction rate, breeds well in captivity, and, owing to its small size, can be maintained in large numbers in limited space. It is also genetically and physiologically similar to humans. Additionally, the mouse genome has been sequenced and annotated to a high standard, second only to that of human (see [143]).
The mouse was initially used for tumour transplantation within inbred strains, but following the discovery of the immunodeficient “nude” mouse and, later, the severe combined immunodeficient (SCID) mouse, it became possible to transplant human tumours into the mouse, creating xenograft models. Such models can be used to rapidly assess tumour tissue and cell lines in vivo but they do not fully recapitulate the behaviour of an endogenous tumour because many features of the tumour microenvironment, such as stromal cells, vasculature and immune cells, are missing. The tumour xenograft is also likely to be less heterogeneous than the endogenous tumour because cells in culture are under high selective pressure. These factors have contributed to the limited success of xenograft models in drug development (for a review, see [144]).
Many inbred strains that spontaneously develop cancer at high frequency have been established, and these, as well as mice that have been treated with a mutagen, are useful for studying the properties of endogenous cancers in vivo. They have been used to identify cancer genes and to assess the effects of carcinogens and therapeutic compounds. However, these models may be biased towards specific types of tumour that show variable penetrance and latency and that do not accurately reflect common human cancers [143].
2.2. Genetically engineered mouse models
Genetically engineered mouse models represent a major advance in cancer research that allows for the study of gene function in vivo and for the creation of models that more accurately recapitulate human cancers. Genetically engineered models can be classified as transgenic or endogenous [143].
2.2.1. Transgenic models
Transgenic mice can be created to study the effect of overexpressing an oncogene or a dominant-negative tumour suppressor gene, which encodes a mutant tumour suppressor that can inactivate the wildtype protein. Transgenic mice can be generated by pronuclear microinjection, in which a construct containing the gene of interest (transgene) is microinjected into the mouse oocyte after fertilisation and randomly integrates into the genome, usually in tandem copies. If the transgenic cells contribute to the germ line, the genetic change can be transmitted to the next generation, producing mice that are fully transgenic and establishing a strain. Many genes involved in cancer development are also essential for mouse development. Therefore, to prevent embryonic lethality and to restrict overexpression to specific tissues, the construct containing the gene of interest also contains promoter elements designed for spatial and temporal restriction of gene expression. For example, the Tet-On and Tet-Off systems [145] promote gene expression in the presence or absence of doxycycline, a non-toxic analogue of tetracycline, while fusing the gene of interest to a gene encoding the oestrogen receptor binding domain results in an inactive protein that is activated upon treatment with Tamoxifen [146].
Limitations of the microinjection method include the possibility that, because the transgene integrates randomly, it could disrupt other genes, resulting in a phenotype that does not reflect the function of the gene of interest (for a review, see [147]). In addition, the tendency of the transgene to integrate in multiple copies could result in excessive overexpression that is toxic to the animal [147]. However, transgenic mice have made a significant contribution to cancer research. In the earliest examples, mouse models were used to demonstrate the role of oncogenes in cancer. For example, tissue-specific overexpression of the Myc oncogene in mammary glands and B-cells resulted in the generation of mice prone to breast cancer [148] and lymphomas [149], respectively. Overexpression of dominant-negative mutant tumour suppressor genes has also proved effective, e.g. type II transforming growth factor beta (Tgfβ) receptor has been shown to accelerate chemically induced tumourigenesis in the mammary gland and lung [150].
2.2.2. Targeted/endogenous models
A knockout mouse can be created to study the effect of inactivating a tumour suppressor gene. In this method, a targeting vector is transfected into embryonic stem (ES) cells, which are harvested from the inner cell mass of mouse blastocysts. The vector must share homology with the region of the mouse gene that is being targeted, i.e. the tumour suppressor gene of interest, and must also contain genes for selection, such that only cells in which the vector DNA has replaced the endogenous DNA by homologous recombination will survive. The surviving ES cells are injected back into a blastocyst, and will contribute to all cell lineages, including the germ line [151]. The targeting vector can be engineered to knock out the whole gene or part of a gene, or small changes can be introduced into the gene sequence. Alternatively, the complete gene under the control of a strong promoter can be introduced to create a knockin mouse for overexpressing oncogenes. By targeting a single copy to the genome, this overcomes the problems associated with pronuclear microinjection (for a review, see [147]).
As with transgenic mice, mutations can be spatiotemporally regulated. Conditional mouse models frequently use the Cre–lox system from bacteriophage P1, in which Cre recombinase catalyses recombination between loxP sites [152], and the intervening DNA is deleted or inverted, depending on the orientation of the sites [153]. loxP sites can therefore be placed on either side of a gene region to remove that region in the presence of Cre. Large-scale chromosomal deletions and inversions can also be generated by placing loxP sites, either in the same orientation for deletions or the reverse orientation for inversions, further apart on the chromosome [154,155], while chromosomal translocations can be created by placing a loxP site at each breakpoint [156] on non-homologous chromosomes. Conditional oncogene expression can be achieved by inserting a stop cassette, which is flanked by loxP sites, between the promoter and the first exon such that Cre-mediated excision of the cassette results in expression of the gene [157,158].
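The dependence of the recombination outcome on loxP orientation can be made concrete with a toy model. The sketch below treats the chromosome as a string divided into a left flank, a region between two loxP sites, and a right flank; the loxP sites themselves are not modelled, and the function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def cre_recombine(left_flank, floxed, right_flank, same_orientation):
    """Toy outcome of Cre-mediated recombination between two loxP sites.

    same_orientation=True  -> the intervening DNA is deleted.
    same_orientation=False -> the intervening DNA is inverted.
    """
    if same_orientation:
        return left_flank + right_flank
    return left_flank + reverse_complement(floxed) + right_flank


# Conditional activation: a stop cassette between promoter and first exon
# is excised when the flanking loxP sites are in the same orientation.
promoter, stop_cassette, exon1 = "TATAAA", "TGATAG", "ATGGCC"
print(cre_recombine(promoter, stop_cassette, exon1, same_orientation=True))
print(cre_recombine(promoter, stop_cassette, exon1, same_orientation=False))
```

The first call reproduces the stop cassette strategy described above, in which excision joins the promoter directly to the first exon; the second shows that the same construct with inverted loxP sites would retain the cassette in the opposite orientation.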
Unlike the conditional expression systems in transgenic mice, once Cre recombinase has been expressed, the change is irreversible, and there is evidence to suggest that Cre can be cytotoxic, perhaps due to recombination at pseudo-loxP sites (see [159]). In addition, the Cre–lox system cannot generate conditional point mutations, and this represents a significant limitation since point mutations and deletions do not always produce the same phenotype [143]. However, the Cre–lox system has proved invaluable in creating models that would otherwise not arise or survive. For example, homozygous Brca1 and Brca2 knockouts die early in embryogenesis, and heterozygous mice are not tumour-prone, but mice harbouring a Cre-mediated deletion of Brca1 [160] or Brca2 and Trp53 [161] in the adult mammary gland do develop mammary tumours. Likewise, Trp53 mutations have been identified in many types of human cancer, but if Trp53 is mutated in all cells, the mouse is most likely to develop lymphomas and sarcomas. Conditional Trp53 mutations can be used to create models for human cancers that are driven by Trp53 mutation in other tissues [159]. The Flp/FRT system from Saccharomyces cerevisiae is an alternative to Cre–lox that works in a similar way (see [159]).
2.3. Mouse models in cancer gene discovery
The methods described earlier in the review can also be applied to the identification of candidate cancer genes in the mouse. For example, array CGH has been used to identify regions of copy number change in mouse models of malignant melanoma [162] and pancreatic islet carcinomas [163]. However, as with human cancers, by the time the cancer has presented, it is difficult to distinguish the important driver mutations from the background of passenger mutations.
The genetically engineered mouse models discussed thus far are useful for studying the function of a particular gene or for representing a specific human cancer, but the tumours in these models do not evolve naturally. In general, the initiating event, i.e. the engineered mutation, is present throughout a tissue, whereas in natural tumourigenesis, the tumour develops from one mutated cell. Likewise, in mouse models used to study the combined action of multiple genes in cancer, the genes of interest are usually simultaneously mutated, whereas “natural” tumours progress through a multi-step process, where mutations are gradually acquired. Finally, many mouse models are designed to show high penetrance and short latency to keep costs down, but as a result they may not possess many of the co-operating oncogenic events that would eventually be acquired by a naturally evolving tumour (for reviews, see [143,144]).
It is important that the mutations in mouse models used to identify novel cancer genes reflect the mutations found in human cancers, and this requires more accurate modelling of the natural evolution of tumours.
2.4. Forward genetic screens in the mouse
Forward genetic screens using somatic mutagens are a powerful approach for cancer gene discovery in the mouse. Insertional mutagens allow for relatively unbiased, genome-wide, identification of both novel cancer genes and collaborations between genes involved in cancer. Chemical mutagenesis, using agents such as N-ethyl-N-nitrosourea (ENU), is a highly efficient way of inducing tumours in mice but the causal mutations can be hard to identify. In contrast, insertional mutagenesis using retroviruses and transposons is an effective approach for inducing the stepwise progression of a cell to malignancy, and for the identification of the causal genetic lesions, because the mutagen acts as a molecular ‘tag’ allowing its location in the genome to be easily determined.
2.4.1. Retroviral insertional mutagenesis
2.4.1.1. Mechanisms of mutagenesis
The slow transforming retroviruses murine leukaemia virus (MuLV) and mouse mammary tumour virus (MMTV) have been widely used for insertional mutagenesis in the mouse. Unlike acute transforming retroviruses, which induce tumours by expression of a viral oncogene, slow transforming retroviruses do not carry an oncogene, and tumours are induced by mutations caused by insertion of the retrovirus into the host genome. Consequently, tumours develop with a longer latency of 3–12 months, compared with 2–3 months for acute transforming retroviruses [164]. MMTV was identified as a causative agent in several strains of mice that were prone to mammary tumours, while MuLV was identified as a causative agent in the lymphoma-prone AKR mice (see [165]).
Retroviruses infect host cells by binding of the viral envelope proteins to cell surface receptors. Once the retrovirus has inserted into the host genome, forming a provirus, it will produce viral envelope proteins that occupy the cell surface receptors and prevent reinfection of the same cell. However, recombination with endogenous viral sequences results in the production of envelope proteins that bind to other receptors. This, combined with the fact that many proviruses have defective envelope coding sequences, enables retroviruses to reinfect the same cell, resulting in the accumulation of mutations. Mutations that confer a growth advantage on the cell co-operate in tumour formation, and the process therefore recapitulates the multi-step progression of human tumours (for reviews, see [164,166]).
The MuLV provirus consists of viral genes flanked by two long terminal repeats (LTRs), which are composed of three parts: U3, R and U5 [164]. Elements within the LTRs drive expression of the viral genes but can also disrupt host genes. U3 contains enhancer and promoter sequences, while R contains transcription start and termination sites. High levels of viral transcription and, therefore, host gene disruption, will only occur in cells containing transcription factors that bind to U3. The propensity of MuLV to induce T- and B-cell lymphomas can be attributed to its dependence upon T- and B-cell-specific transcription factors, including Runx, Ets and Myb (see [167], Fig. 2). MMTV, and indeed other retroviruses, have a similar structure to MuLV.
Fig. 2.
Structure of insertional mutagens used for cancer gene discovery in the mouse. (A) The provirus contains two long terminal repeats (LTRs) flanking the genes required for viral assembly. Elements within the LTRs drive transcription of the viral genes but can also induce mutation of nearby cellular genes. Splicing of a viral splice donor (SD) or cryptic splice donor (not shown) to a splice acceptor or cryptic splice acceptor in the first intron or 5′ UTR of a cellular gene results in the formation of a chimeric transcript, in which the cellular gene is coupled to the viral promoter. Splicing of a splice donor or cryptic splice donor in a cellular gene to a viral splice acceptor (SA) or cryptic splice acceptor (not shown) can cause premature termination of gene transcription owing to the presence of polyadenylation signal (pA) and cryptic polyadenylation signals (not shown) in the LTR. Adapted from figure in Uren et al. (see [164,168]). Figure is not to scale. (B) Structure of the Sleeping Beauty transposon T2Onc [242]. The presence of splice acceptors (SA) and polyadenylation signals (pA) in both orientations enables premature termination of gene transcription from intragenic insertions in both orientations. The transposon also contains the murine stem cell virus (MSCV) 5′ LTR and a splice donor (SD) site that can induce promoter mutations in cellular genes. Elements for mutagenesis are flanked by 2 IR/DR elements, shown as arrows, which are required for transposon mobilisation.
Retroviruses can mutate host genes in a number of different ways (Fig. 3). The most common mechanism is enhancer mutation, where one of the U3 enhancers upregulates expression of host genes, which may be some distance away from the retroviral insertion. Most proviruses causing enhancer mutations are found upstream of the mutated gene in the antisense orientation or downstream in the sense orientation. Possible explanations for this are that upregulation of the host gene may be impeded if the viral promoter lies between the viral enhancer and the host gene, and that viral enhancers may only be functional if they are not transcribed (see [164,168]). Myc and Gfi1 are frequent targets of enhancer mutation in retroviral insertional mutagenesis [169–171]. Myc is mutated in many types of human cancer and encodes a transcription factor thought to regulate the expression of 15% of all genes, including those involved in cell division, cell growth and apoptosis (see [172]). Gfi1 is a zinc finger transcriptional repressor that is involved in cell fate determination and differentiation, including in T- and B-cells [173,174].
Fig. 3.
Mechanisms of mutagenesis by the murine leukaemia virus. The provirus is shown in blue; coding and non-coding exons are shown in red and white, respectively. (A) Enhancer mutation. An enhancer element in the 5′ LTR of MuLV can cause upregulation of nearby cellular genes. Oncogenic insertions of this type are most frequently found upstream and in the antisense orientation with respect to the cellular gene(s) that they are mutating. (B) Promoter mutation. Insertion of MuLV into the promoter region of a cellular gene results in chimeric transcripts that are produced at higher levels than the endogenous gene transcript. (C) Truncating mutation. Intragenic MuLV insertions can cause premature termination of gene transcription, resulting in either gene upregulation or gene inactivation. The figure shows an insertion within the 3′ UTR region, which may remove mRNA-destabilising motifs, thereby stabilising the gene transcript.
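The positional and orientational tendencies of the three mechanisms in Fig. 3 can be summarised as a simple decision rule. The sketch below is a heuristic based only on the tendencies described in the text, not a validated classifier; the location categories, the function name and the returned labels are hypothetical, and real insertions must be confirmed by expression or transcript analysis.

```python
def likely_mechanism(location, same_orientation_as_gene):
    """Suggest the most likely mutagenic mechanism for a MuLV insertion.

    location: 'upstream', 'promoter', 'intragenic' or 'downstream',
              relative to the affected cellular gene.
    same_orientation_as_gene: True if the provirus is in the sense orientation.
    """
    if location not in {"upstream", "promoter", "intragenic", "downstream"}:
        raise ValueError(f"unknown location: {location}")
    if location == "intragenic":
        # Intragenic insertions in either orientation can truncate the transcript.
        return "truncating mutation"
    if location == "promoter" and same_orientation_as_gene:
        return "promoter mutation"
    if location == "upstream" and not same_orientation_as_gene:
        return "enhancer mutation"
    if location == "downstream" and same_orientation_as_gene:
        return "enhancer mutation"
    return "unclassified"


print(likely_mechanism("upstream", same_orientation_as_gene=False))
print(likely_mechanism("promoter", same_orientation_as_gene=True))
print(likely_mechanism("intragenic", same_orientation_as_gene=False))
```

Insertions that fall outside these typical configurations are returned as unclassified rather than forced into a category, since the stated preferences are tendencies and not absolute rules.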
An alternative mechanism of mutagenesis is the promoter mutation, where the retrovirus inserts in the sense orientation into the promoter region of a host gene. This uncouples the host gene from its own promoter and places it under the control of the viral promoters, resulting in the production of elevated levels of the wildtype protein from chimeric transcripts comprising part of the viral sequence and the complete coding region of the host gene [175]. Promoter mutations led to identification of Evi1 as a potential oncogene [176–178]. EVI1 encodes a zinc finger transcription factor that is frequently overexpressed in myeloid malignancies. It is involved in several recurrent rearrangements, including 2 translocations that result in the fusion transcripts AML1/MDS1/EVI1 and ETV6/MDS1/EVI1, where MDS1 and EVI1 are also expressed as a readthrough transcript in normal tissues (for a review, see [179]).
Since the retrovirus contains a polyadenylation signal within the R region of the LTR and a cryptic polyadenylation signal in the antisense orientation, intragenic retroviral insertions in both orientations can cause premature termination of gene transcription. Insertions within the 3′ UTR that truncate a transcript such that mRNA-destabilising motifs are removed will give rise to a more stable transcript and, as a result, increased levels of the wildtype protein (see [164]). The oncogenes Pim1 and Mycn are frequently mutated in this way [180–182]. Pim1 encodes a serine/threonine kinase that is frequently overexpressed in human prostate cancer [183], while Mycn encodes a transcription factor related to Myc that is amplified in a variety of human tumours, most notably neuroblastomas [184,185].
Intragenic insertions can also activate a gene by causing C-terminal or N-terminal truncation of the encoded protein. Insertions in oncogenes Myb and Notch1 cause both N-terminal and C-terminal truncations [164,186]. C-terminally truncated Notch1 lacks the destabilising PEST domain and is therefore produced at increased levels, while N-terminal truncations remove the extracellular domain, resulting in a constitutively active intracellular domain expressed from the viral promoter or from a cryptic promoter in Notch1 [187]. Activating mutations within the extracellular and PEST domains of NOTCH1 have been observed in human T-cell acute lymphoblastic leukaemia [188], in which NOTCH1 plays an important role. Analysis of the distribution of insertions within an oncogene may therefore help to explain how the gene is mutated in human cancer.
Intragenic insertions may also cause gene inactivation, either through premature termination of transcription or by disrupting gene splicing (see [164]). It is therefore possible to identify tumour suppressor genes by retroviral insertional mutagenesis, although they are found much less frequently than oncogenes because both copies of the gene must be inactivated. Mutation at the Nf1 locus is observed in acute myeloid leukaemia in BXH2 mice [189], which contain MuLV insertions [190], while in an insertional mutagenesis screen of Blm-deficient mice, 11 genes met the criteria for tumour suppressor genes, including Rbl1 and Rbl2, which are paralogues of Rb1 [191]. Blm-deficient mice have a mutation in the RecQ protein-like-3 helicase gene [192] and show a predisposition to cancer due to increased frequencies of mitotic recombination [193]. There is an increased likelihood of finding tumour suppressor genes in these mice because they have a higher probability of a normal allele being lost so that only one insertion is required to inactivate the gene [193]. However, candidate tumour suppressor genes still only accounted for 5% of all genes identified in the screen by Suzuki et al. [191]. In theory, insertional mutagenesis screens should have a better chance of finding haploinsufficient tumour suppressor genes, but none have yet been unambiguously identified [164].
Insertional bias could also account for the paucity of tumour suppressor genes identified in retroviral screens. For example, MuLV shows a strong preference for integration near to the transcription start sites of actively transcribed genes [194] and is therefore less likely to disrupt a gene by intragenic insertion. However, it is possible that promoter mutations could also cause gene inactivation, as CpG islands in the retroviral LTRs are methylation targets, and DNA methylation could “spread” to CpG islands in the host gene, resulting in gene silencing (see [195]). Retroviruses prefer to insert into open chromatin [196,197], but different retroviruses show different target site preferences, suggesting that virus-specific interactions are involved [198]. DNA sequence does not seem to influence target site selection [199]. The tendency for MuLV to insert in the promoter region suggests that it interacts with cellular proteins bound near start sites [194,198].
2.4.1.2. Identifying candidate cancer genes
The retroviral insertions act as ‘tags’ for identifying the mouse genes that are mutated by insertional mutagenesis, and sequencing of the mouse genome and the development of high-throughput genomic techniques have made it possible to identify thousands of insertions in a single screen. Insertion sites were initially identified using methods that involve Southern blot analysis and genomic library screening, followed by genome walking to find the mutated gene (see [164,167]). However, these have been replaced by PCR-based methods, in which mouse genomic DNA flanking the insertion sites is amplified and then mapped back to the genome. One such method, known as viral insertion site amplification (VISA), involves using a PCR primer designed to bind to the MuLV LTR and a degenerate, restriction-site specific primer that enables amplification of the DNA between the insertion and a nearby restriction site [200,201]. In inverse PCR and linker-mediated PCR-based methods (Fig. 4) the genomic DNA is digested with a restriction enzyme prior to PCR amplification. In inverse PCR, the digested genomic DNA is allowed to ligate to itself to form a circular template. PCR primers bind to the retroviral DNA and point out towards the genomic sequence, resulting in amplification of genomic DNA directly flanking the retrovirus [202,203]. Only DNA fragments of a suitable length for efficient circularisation and for PCR amplification will be detected [164].
Fig. 4.
Isolation of retroviral insertion sites. (A) Inverse PCR. Tumour DNA is digested using restriction enzyme X and the restricted DNA is allowed to circularise. Genomic DNA flanking retroviral insertions is amplified using PCR primers that bind within the insertion and point out towards the genomic DNA. A second round of PCR is performed using nested primers. The amplified DNA is sequenced and mapped to the mouse reference genome. (B) Splinkerette PCR. As for inverse PCR, except that instead of circularising the digested DNA, a splinkerette adapter (shown in yellow) is ligated and genomic DNA flanking the retroviral insertions is amplified using PCR primers that bind to the adapter and the retroviral LTR.
In linker-mediated PCR, rather than the digested DNA ligating to itself, it is ligated to a linker, and this enables shorter insertions to be identified. One primer is designed to bind to the linker, and the other binds to the retroviral sequence. A number of methods have been developed, each with a different approach for avoiding amplification of DNA that has linkers at both ends but no retroviral DNA. Vectorette PCR involves the use of a double-stranded linker with a cohesive end for ligation to restricted DNA and a central region with a mismatch [204]. The primer is the same sequence as the mismatched part of the upper strand, and this prevents initiation of priming from the vectorette until the complementary strand has been synthesised by priming from within the retroviral insertion. However, this method suffers from non-specific annealing of the primers and ‘end-repair’ priming, in which the ends of unligated vectorettes initiate priming and enable PCR amplification without involving the retroviral-specific primer (see [205]). Any errors that cause amplification of DNA that is not flanking an insertion will lead to the false identification of insertion sites.
An improved method uses splinkerettes, which incorporate a hairpin structure on the bottom strand, rather than a mismatch sequence [205]. The primer has the same sequence as the upper strand and, as with vectorette PCR, cannot anneal until the complementary strand has been synthesised. The stable hairpin does not enable end-repair priming and only the upper strand can act as a non-specific primer. In all the PCR-based methods, insertions are only identified if target sites for the chosen restriction endonuclease are close enough to the insertion for the intervening region to be amplified. Coverage can be improved by using multiple restriction endonucleases [164].
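The dependence of insertion recovery on nearby restriction sites can be illustrated with a small Python sketch. It is not taken from any of the cited protocols: the two enzymes, the fragment-length limits (`min_len`, `max_len`) and the function names are assumptions chosen purely for illustration. Given the genomic sequence flanking an insertion junction, it reports which enzymes have a recognition site at an amplifiable distance.

```python
from typing import Dict, List, Optional

# Recognition sequences for two four-base cutters.
# The choice of enzymes and the length limits below are illustrative assumptions.
ENZYMES = {"Sau3AI": "GATC", "NlaIII": "CATG"}


def distance_to_site(flank: str, site: str) -> Optional[int]:
    """Distance (bp) from the insertion junction to the first recognition
    site in the flanking genomic sequence, or None if the site is absent."""
    idx = flank.upper().find(site)
    return None if idx == -1 else idx


def amplifiable_with(flank: str, enzymes: Dict[str, str] = ENZYMES,
                     min_len: int = 20, max_len: int = 1000) -> List[str]:
    """Names of enzymes whose nearest site gives a fragment in the
    (assumed) amplifiable size range."""
    usable = []
    for name, site in enzymes.items():
        d = distance_to_site(flank, site)
        if d is not None and min_len <= d <= max_len:
            usable.append(name)
    return usable


if __name__ == "__main__":
    # Two toy flanking sequences: each is recoverable with only one enzyme.
    flank_1 = "A" * 40 + "GATC" + "A" * 2000 + "CATG"
    flank_2 = "T" * 300 + "CATG" + "T" * 1500 + "GATC"
    for flank in (flank_1, flank_2):
        print(amplifiable_with(flank))
```

In this toy case each insertion is recovered by only one of the two enzymes, which is the rationale for combining digests to improve coverage.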
Once the insertion-flanking genomic DNA has been amplified, the PCR products must be separated for sequencing. In the past, products were separated using agarose or polyacrylamide gels, but rare insertions are likely to be missed, and gel extraction is painstaking and subjective. An alternative method is to subclone the PCR products directly into a vector. By shotgun cloning the total mixture, it is possible to maintain the relative proportions of insertions from the starting material. However, it also means that more sequencing will be required to capture the rare insertions (see [164]). The VISA approach sequences PCR products directly, without subcloning, which reduces the risk of sequencing contaminating products [201]. The latest method uses massively parallel sequencing technology from 454 Life Sciences (http://www.454.com) [206,207], in which fragmented genomic DNA is ligated to short adaptors that are used for purification, amplification and sequencing. The DNA is denatured and immobilised onto beads, where PCR amplification and sequencing occur. This approach is extremely high-throughput, does not rely on cloning and is capable of detecting rare insertions. However, it can encounter problems when dealing with repetitive regions and long runs of a single nucleotide.
The next step is to map the sequenced DNA to the genome using a DNA alignment algorithm. For large screens, it is an advantage to be able to find high quality alignments quickly [164]. The Sequence Search and Alignment by Hashing Algorithm (SSAHA) (SSAHA2, [208]) converts the genome into a hash table, which can then be rapidly searched for matches. Sequences in the database (the mouse genome) are preprocessed into consecutive k-tuples of k contiguous bases and the hash table stores the position of each occurrence of each k-tuple. The query sequence (sequenced DNA) is also split into k-tuples and the locations of all occurrences of these sequences in the database, i.e. the “hits”, are extracted from the hash table. The list of hits is sorted, and the algorithm searches for runs of hits in the database that match those in the query sequence. Having identified regions of high similarity, sequences are fully aligned using cross_match [209], which is based on the Smith–Waterman–Gotoh alignment algorithm [210,211]. Because the database is hashed, search time in SSAHA2 is independent of database size, provided k is not too small. SSAHA2 is therefore three to four orders of magnitude faster than the BLAST alignment algorithm [212], which scans the database and therefore performs at a speed that is directly related to database size [208].
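The hashing step described above can be sketched in a few lines of Python. This is a toy illustration of the idea, not the SSAHA2 implementation: it assumes that the database is indexed with non-overlapping k-tuples and the query is scanned with overlapping k-tuples, it omits the final cross_match alignment, and all function names are hypothetical. Hits are grouped by their offset between database and query position, so that a run of hits on one offset marks a candidate region.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_hash(db: str, k: int) -> Dict[str, List[int]]:
    """Index the database by consecutive, non-overlapping k-tuples."""
    table: Dict[str, List[int]] = defaultdict(list)
    for pos in range(0, len(db) - k + 1, k):
        table[db[pos:pos + k]].append(pos)
    return table


def find_hits(query: str, table: Dict[str, List[int]], k: int) -> List[Tuple[int, int]]:
    """Look up every overlapping k-tuple of the query.
    Each hit is (shift, db_pos), where shift = db_pos - query_pos."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for dpos in table.get(query[qpos:qpos + k], ()):
            hits.append((dpos - qpos, dpos))
    hits.sort()  # hits on the same shift become adjacent
    return hits


def best_shift(hits: List[Tuple[int, int]], min_hits: int = 2) -> Optional[Tuple[int, int]]:
    """Return (shift, number_of_hits) for the best-supported shift, if any."""
    counts: Dict[int, int] = defaultdict(int)
    for shift, _ in hits:
        counts[shift] += 1
    if not counts:
        return None
    shift, n = max(counts.items(), key=lambda item: item[1])
    return (shift, n) if n >= min_hits else None


if __name__ == "__main__":
    db = "ACGTTGCAAGGCTTAACCGGTTAGCATCGA"   # stand-in for the genome
    query = db[8:24]                        # stand-in for a sequenced read
    table = build_hash(db, k=4)
    print(best_shift(find_hits(query, table, k=4)))
```

Each lookup is a dictionary access, which is why the search cost depends mainly on the query rather than on the size of the hashed database.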
If the PCR mixture has been shotgun cloned and preferably sequenced to a high depth, there may be more than 1 read per insertion site. Reads from a single tumour that map to the same genomic region must therefore be clustered into single insertion sites. Like the mutations in human cancer, tumour DNA will contain both insertions that drive oncogenesis (oncogenic insertions) and insertions that are passengers (background insertions). In theory, most identified insertions should be oncogenic because these, and particularly the earliest events in tumourigenesis, should be present in most, if not all, tumour cells, whereas background insertions should be present in a smaller proportion of cells. However, background insertions that occur early in tumour development in a cell containing oncogenic insertions could also be highly represented in the final tumour (see [213]).
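A minimal sketch of the read-clustering step is given below. The maximum gap between reads (`max_gap`), the tuple layout and the function name are assumptions made for illustration, and strand and mapping quality are ignored. Reads from the same tumour that map close together on the same chromosome are merged into one insertion site, and the number of supporting reads is retained.

```python
from typing import Iterable, List, Tuple

Read = Tuple[str, str, int]                  # (tumour, chromosome, position)
Site = Tuple[str, str, int, int, int]        # (tumour, chromosome, start, end, n_reads)


def cluster_reads(reads: Iterable[Read], max_gap: int = 100) -> List[Site]:
    """Merge reads from one tumour that map within max_gap bp of the
    previous read on the same chromosome into a single insertion site."""
    clusters: List[list] = []
    for tumour, chrom, pos in sorted(reads):
        last = clusters[-1] if clusters else None
        if (last is not None and last[0] == tumour and last[1] == chrom
                and pos - last[3] <= max_gap):
            last[3] = pos       # extend the current site
            last[4] += 1        # count the supporting read
        else:
            clusters.append([tumour, chrom, pos, pos, 1])
    return [tuple(c) for c in clusters]


if __name__ == "__main__":
    toy_reads = [
        ("T1", "chr1", 1000), ("T1", "chr1", 1030),   # same insertion site
        ("T1", "chr1", 5000),                          # a second site
        ("T2", "chr1", 1010),                          # a different tumour
    ]
    for site in cluster_reads(toy_reads):
        print(site)
```

The read count per site is kept because, with shotgun cloning, it may loosely reflect how well represented an insertion is in the tumour, although this should be interpreted cautiously.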
Clustering of insertions from different tumours into common insertion sites (CISs) helps to distinguish oncogenic and background insertions. In theory, background insertions should be randomly distributed across the genome. Therefore, for small-scale screens, a gene in the vicinity of a cluster of insertion sites in different tumours is a strong candidate for a role in cancer. Methods for identifying statistically significant CISs, i.e. regions that are mutated by insertions in significantly more tumours than expected by chance, have involved generating a random distribution of insertions across the genome and obtaining an estimate of the number of false CISs in windows of fixed size using Monte Carlo simulation [214] or the Poisson distribution [175]. These methods can be used to define the maximum window size in which insertions must fall to be considered non-randomly distributed. However, for larger scale screens, the window must be decreased to a size that is smaller than the spread of insertions within a single CIS so that many CISs are missed [213]. In addition, the above methods assume that insertions are randomly distributed and take no account of insertional biases [215].
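The window-size problem can be made concrete with a simplified Poisson calculation. The sketch below is an approximation written for illustration rather than the procedure used in the cited screens: it assumes uniformly distributed insertions, treats the genome as a series of independent windows, and uses a round figure of 2.5 × 10^9 bp for the mouse genome. It estimates how many windows would contain at least `min_insertions` insertions by chance.

```python
import math


def poisson_tail(lam: float, n: int) -> float:
    """P(X >= n) for X ~ Poisson(lam)."""
    return 1.0 - sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(n))


def expected_false_cis(n_insertions: int, genome_size: float,
                       window: float, min_insertions: int) -> float:
    """Approximate number of windows expected to hold at least
    min_insertions insertions if insertions fall uniformly at random."""
    lam = n_insertions * window / genome_size      # mean insertions per window
    n_windows = genome_size / window
    return n_windows * poisson_tail(lam, min_insertions)


if __name__ == "__main__":
    GENOME = 2.5e9   # assumed round figure for the mouse genome (bp)
    for n in (1000, 10000):
        for window in (10_000, 30_000, 100_000):
            e = expected_false_cis(n, GENOME, window, min_insertions=2)
            print(f"{n:>6} insertions, {window:>7} bp window: "
                  f"{e:.1f} chance clusters expected")
```

Under these assumptions the expected number of chance clusters rises with both the window size and the number of insertions, which is why larger screens force the window to shrink.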
A more recent approach for CIS detection overcomes these problems by using a kernel convolution (KC) framework, which calculates a smoothed density distribution of inserts across the genome [213]. The scale (kernel size) can be varied so that CISs of varying widths can be identified. Decreasing the kernel size may identify separate CISs affecting the same gene, while increasing the kernel size will identify CISs where insertions are widely distributed in or around a gene. The method can be used for large-scale studies because it keeps control of the probability of detecting false CISs. The threshold for significant CISs is based on the alpha-level defined by the user and on a null-distribution of insertion densities obtained by performing random permutations. A background distribution, such as the location of transcription start sites, can be provided to correct for insertional biases [213].
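The kernel convolution idea can be sketched as follows. This is a deliberately simplified, one-chromosome illustration and not the published implementation [213]: it assumes a Gaussian kernel evaluated on a fixed grid, draws the null distribution from uniformly placed random insertions rather than from a bias-corrected background, and uses hypothetical function and parameter names throughout.

```python
import math
import random
from typing import List, Sequence


def density(positions: Sequence[float], x: float, scale: float) -> float:
    """Gaussian-smoothed insertion density at coordinate x."""
    return sum(math.exp(-0.5 * ((x - p) / scale) ** 2) for p in positions)


def null_threshold(n_insertions: int, length: int, grid: Sequence[int],
                   scale: float, alpha: float, n_perm: int, seed: int) -> float:
    """Peak density exceeded by roughly a fraction alpha of random
    (uniform) placements of the same number of insertions."""
    rng = random.Random(seed)
    peaks = []
    for _ in range(n_perm):
        shuffled = [rng.uniform(0, length) for _ in range(n_insertions)]
        peaks.append(max(density(shuffled, x, scale) for x in grid))
    peaks.sort()
    return peaks[min(n_perm - 1, int((1 - alpha) * n_perm))]


def find_cis(positions: Sequence[float], length: int, scale: float, step: int,
             alpha: float = 0.05, n_perm: int = 200, seed: int = 0) -> List[int]:
    """Grid coordinates whose smoothed density exceeds the null threshold."""
    grid = list(range(0, length + 1, step))
    threshold = null_threshold(len(positions), length, grid, scale, alpha, n_perm, seed)
    return [x for x in grid if density(positions, x, scale) > threshold]


if __name__ == "__main__":
    # Toy data: five insertions clustered near 5 kb plus three scattered ones.
    insertions = [5000, 5050, 5100, 5150, 5200, 20000, 60000, 90000]
    print(find_cis(insertions, length=100_000, scale=500, step=500))
```

Re-running `find_cis` with a larger or smaller `scale` changes the width of the clusters that are detected, mirroring the role of the kernel size described above.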
The final step is to identify the genes that are being mutated by insertions within CISs. This may be relatively straightforward for intragenic insertions, but for enhancer mutations, which may have long distance effects, it is often difficult to identify the mutated gene unequivocally. Measuring the expression and transcript size of candidate genes in insertion-containing tumours can shed some light, but animal models and analysis of the orthologues in human cancer data are required for more conclusive evidence [164].
A number of screens have been performed in recent years that have each identified hundreds of insertion sites [175,191,201,214,216–223]. The results of many screens have been collated and stored in the Retroviral Tagged Cancer Gene Database (RTCGD; http://rtcgd.abcc.ncifcrf.gov/) [171]. At the time of writing, the database contains CISs associated with 540 genes from 30 screens (database accessed December 2008). Users can search for individual genes of interest, or for CISs identified using particular mouse models and/or in particular tumour types. The most frequently tagged genes are Gfi1 and Myc, with 82 and 78 insertions across all screens, respectively.
2.4.1.3. Identifying co-operating cancer genes
Retroviral insertional mutagenesis is a powerful tool for identifying genes that collaborate in tumour development. Collaborations can be identified by analysing the co-occurrence of CISs in individual tumours. For example, proviral activation of Meis1 and Hoxa7 or Hoxa9 is strongly correlated in myeloid leukaemias from BXH2 mice [190,224]. Meis1 and Hoxa9 are targets of translocation in human pre-B leukaemia [225] and acute myeloid leukaemia (AML) [226], respectively, and they are frequently co-expressed in human AML [227]. Both genes encode homeodomain transcription factors that bind to Pbx, and Meis1–Pbx and Hox–Pbx complexes have been shown to co-occupy the promoters of leukaemia-associated genes, such as Flt3 [228].
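One simple way to ask whether two CISs co-occur more often than expected is a one-sided Fisher's exact test on the tumours carrying each insertion. This test is our choice for illustration and is not the method used in the studies cited above; the tumour identifiers in the example are invented toy data, and the oligoclonality of retroviral tumours is ignored.

```python
from math import comb
from typing import Set


def cooccurrence_pvalue(tumours_a: Set[str], tumours_b: Set[str], n_tumours: int) -> float:
    """One-sided Fisher's exact (hypergeometric) p-value for observing at
    least the seen number of tumours carrying insertions at both CISs."""
    a, b = len(tumours_a), len(tumours_b)
    both = len(tumours_a & tumours_b)
    total = comb(n_tumours, b)
    tail = sum(comb(a, k) * comb(n_tumours - a, b - k)
               for k in range(both, min(a, b) + 1))
    return tail / total


if __name__ == "__main__":
    # Invented example: 20 tumours, 5 with an insertion at CIS A,
    # 5 with an insertion at CIS B, and 4 carrying both.
    cis_a = {"t0", "t1", "t2", "t3", "t4"}
    cis_b = {"t0", "t1", "t2", "t3", "t10"}
    print(f"p = {cooccurrence_pvalue(cis_a, cis_b, n_tumours=20):.4f}")
```

In a real screen many CIS pairs would be tested, so some correction for multiple testing would also be needed.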
A two-dimensional Gaussian Kernel Convolution method has recently been developed for identifying co-operating mutations in insertional mutagenesis data [229]. It is based on the Kernel Convolution framework used for identifying CISs [213]. The method has been applied to the data in RTCGD and, as well as finding previously characterised interactions, such as Meis1 and Hoxa9/Hoxa7, it also finds novel interactions, such as Rasgrp1 and Cebpb, which are both known to play a role in Ras-induced oncogenesis [229]. However, as retroviral-induced tumours are oligoclonal, it is difficult to prove that tagged genes are in the same cell, and therefore that they collaborate [230]. In an alternative approach, retroviral screens are performed on transgenic mice overexpressing known oncogenes, and knockout mice harbouring inactivated tumour suppressor genes, to identify genes that collaborate with the overexpression of oncogenes, and loss of tumour suppressor genes, respectively. For example, 35% of B-cell lymphomas generated in MuLV-infected EμMyc transgenic mice, in which Myc is overexpressed in B-cell progenitors under the control of the immunoglobulin heavy chain enhancer, have an insertion in Pim1 or the polycomb group protein Bmi1 [231]. Bmi1 collaborates with Myc by inhibiting Ink4a/Arf, and therefore inhibiting Myc-induced apoptosis [232]. In concurrence with these findings, Myc insertions were identified in 20% of tumours from MuLV-infected Cdkn2a (Ink4a/Arf)-deficient mice, but none contained insertions in Bmi1 [218]. Insertional mutagenesis also identifies genes that can functionally complement one another in tumour development. For example, in MuLV-infected EμMyc mice, activation of Pim2 increases from 15% to 80% in compound mutant mice lacking Pim1 expression [233], while Pim3 is selectively activated in mice lacking Pim1 and Pim2 expression [175]. Pim1 is a coactivator of Myc that is required for expression of around 20% of all Myc target genes [234]. 
Pim kinases also appear to suppress Myc-induced apoptosis, but it is not clear whether this mechanism or Myc coactivation is responsible for the co-occurrence of Pim1 and Myc mutations observed in lymphomagenesis (for a review, see [235]). Pim1 also collaborates with Myc in human prostate cancers [236].
Retroviral screening of a mouse model for human myeloid leukaemia has identified 6 CIS genes, including Plag1 and Plagl2, which co-operate with the oncogenic fusion gene CBFB-MYH11 [237]. This screen used a replication-defective retrovirus, cloned amphotropic virus 4070A, to limit the number of mutations and therefore to show that mutation of only one or a few genes was sufficient to induce tumourigenesis. Other studies using replication-competent viruses report 3–6 insertions in a single tumour [175,214] but, as mentioned above, retroviral-induced tumours are oligoclonal and it is therefore difficult to make a reliable estimate of the number of insertions in a tumour clone (see [167]).
2.4.1.4. Generating tumours of different types
As discussed previously, the dependence of retroviruses on cell-type-specific transcription factors limits the range of tumours that they can induce. There have been some successful attempts to alter the propensity of MuLV for T-cell lymphomas by using an EμMyc transgenic mouse, which results in predominantly B-cell lymphomas [231], and by expressing platelet derived growth factor B-chain (PDGFβ) from an MuLV-based retrovirus to generate mice with glioblastomas, which require activation of PDGF receptors for tumourigenesis [219]. Mutations in the retroviral LTR may also lead to a change in tumour type, but manipulated viruses have a tendency to revert to wildtype [164]. In addition, MuLV and other retroviruses cannot infect nondividing cells, and infection is inefficient in slowly replicating cells and in tissues that have a basement membrane or mucin layer [238,239]. Transposon insertional mutagenesis is an alternative method that provides the possibility of generating a wider spectrum of tumours.
2.4.2. Transposon-mediated insertional mutagenesis
Like retroviruses, transposons are genetic elements that can mobilise within the genome. They are classified according to their mechanism of transposition. DNA transposons move by a “cut and paste” mechanism, in which they are excised from one site in the genome and integrated into another. Retrotransposons transpose via an RNA intermediate and are classified into LTR retrotransposons, which encode reverse transcriptase and transpose in a similar manner to retroviruses, and non-LTR retrotransposons, which are transcribed by host RNA polymerases and may or may not encode reverse transcriptase [240].
2.4.2.1. The Sleeping Beauty transposon system
While DNA transposons are actively mobile in plants and invertebrates, all of the elements that have been so far identified in vertebrates are non-functional [164]. However, they can be mobilised in the mouse by using an invertebrate DNA transposon or by reconstructing a degenerate vertebrate transposon. Sleeping Beauty (SB) is a synthetic transposon derived from dormant DNA transposons of the Tc1/Mariner family in the genomes of salmonid fish. An active transposon, named SB10, was synthesised by directed mutagenesis on the basis of a consensus sequence obtained by aligning 12 degenerate transposon sequences from 8 species [241]. SB consists of two inverted repeat/direct repeat (IR/DR) elements of ∼ 230 bp each, flanking a cargo sequence [242]. Transposition occurs via binding of a transposase enzyme to two sites in each IR/DR [243]. All four binding sites are required for transposition and, in general, the closer the IR/DRs, the higher the transposition efficiency [243]. Higher levels of transposition have been achieved by introducing point mutations into the transposase, producing, for example, the SB11 [244] and SB12 [245] transposases.
The utility of SB for oncogenic insertional mutagenesis was first demonstrated in two studies published in 2005 [246,247]. In both studies, transposons were introduced into mice by pronuclear injection of a linear plasmid containing one copy of the transposon, which forms a multicopy concatemer of variable length at a single site in the mouse genome. SB was mobilised by crossing these mice to mice expressing a transposase from a ubiquitous promoter. Collier et al. [246] used a transgene containing the SB10 transposase under the control of the CAGGS promoter to mobilise around 25 T2/Onc transposons, while Dupuy et al. [247] used the more active SB11 version knocked into the endogenous Rosa26 locus to mobilise 150–350 copies of the T2/Onc2 transposon. T2/Onc and T2/Onc2 were engineered to contain elements for mutagenesis much like those in retroviruses (Fig. 2). The cargo of both transposons contains the 5′ LTR of the murine stem cell virus (MSCV) followed by a splice donor, as well as splice acceptors followed by polyadenylation sites in both orientations. The transposons are therefore capable of disrupting genes by promoter mutation, N-terminal and C-terminal truncation and gene inactivation but, unlike retroviruses, they show low enhancer activity [247]. T2/Onc and T2/Onc2 are essentially the same, except that T2/Onc2 contains a larger fragment of the Engrailed splice acceptor and the IR/DRs have been optimised for transposase binding [247]. In the study by Dupuy et al. [247], there was a high rate of embryonic lethality and, of the 24 T2/Onc2;Rosa26SB11 mice that survived to weaning, all developed cancer, most commonly T-cell lymphomas but also other haematopoietic malignancies and, in a few cases, medulloblastomas and intestinal and pituitary neoplasias. Some mice had 2 or 3 types of cancer and all died within 17 weeks. In contrast, in the study by Collier et al. [246], mice on a wildtype background did not develop tumours, but those on an Arf-null background developed sarcomas at an accelerated rate. The difference between the two studies most likely reflects the differences in transposon copy number and in transposase expression and activity [248]. Transposase expression in CAGGS-SB10 mice has since been shown to be low and variegated in most tissues, probably due to epigenetic silencing of the transgene, while transposase expression is high in nearly all cell types in Rosa26SB11 mice [248]. However, transposase is expressed in the testes of CAGGS-SB10 mice, which show high rates of transposition in the male germline [248,249].
Transposons, like retroviruses, can be used to identify co-operating cancer genes. For example, Braf was frequently mutated in Arf-null mice, suggesting that these genes co-operate in tumour formation [246], while of the six T-cell tumours containing Notch1 mutations, three also contained insertions mutating Rasgrp1, and two of these contained Sox8 mutations, suggesting that these three genes also co-operate [247].
While a number of the genes identified in the haematopoietic malignancies of T2/Onc2;Rosa26SB11 mice had been previously identified in retroviral mutagenesis, other genes had not [247]. This indicates that transposon mutagenesis is a complementary approach for cancer gene discovery, and may reflect differences in insertional bias. While MuLV shows a strong preference for inserting near transcription start sites [194], SB shows a less pronounced preference and shows no preference for actively transcribed genes [250]. SB inserts at TA dinucleotides and therefore shows a bias towards AT-rich sites, particularly those with the consensus sequence ANNTANNT [251,252]. However, most significant is the strong tendency of SB to transpose to sites close to the concatemer. This phenomenon, known as “local hopping”, results in a non-random distribution of insertions that hampers CIS detection. Another potential hindrance to cancer gene identification is the ability of transposons to excise themselves and reinsert multiple times. SB leaves a small footprint upon excision, and it is possible that, at least in exons, this could continue to cause gene disruption that would not be identifiable [248]. Likewise, the excision in some cells of transposons that had been critical for tumour development could result in a more heterogeneous tumour in which cancer gene identification would be more complicated. However, it is possible that such an event would be deleterious and that the cell would be eliminated [248] and, as SB transposition efficiency is higher for methylated [253] and heterochromatic [254] transposons, excision of transposons involved in gene disruption may be relatively rare. 
A further drawback of SB, and possibly other DNA transposons, is that transposition may induce genomic rearrangements, including deletions and inversions near to the transposon concatemer, and tumourigenesis could therefore be initiated by genes disrupted by these rearrangements rather than by mobilised transposons [255].
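The sequence preference of SB described above can be explored with a short Python sketch. It simply lists the TA dinucleotides in a sequence and flags those that sit within the ANNTANNT consensus; the function names and the example sequence are made up for illustration, and no weighting for local hopping or chromatin state is attempted.

```python
import re
from typing import List

# Lookahead so that overlapping matches of the 8-bp consensus are all reported.
CONSENSUS = re.compile(r"(?=A..TA..T)")


def ta_sites(seq: str) -> List[int]:
    """0-based positions of the T in every TA dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "TA"]


def consensus_ta_sites(seq: str) -> List[int]:
    """Positions of TA dinucleotides that lie within an ANNTANNT motif."""
    # The TA starts 3 bases into the 8-bp motif.
    return [m.start() + 3 for m in CONSENSUS.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "GGATCTAGGTCCTAGG"   # invented sequence
    print("all TA sites:      ", ta_sites(toy))
    print("consensus TA sites:", consensus_ta_sites(toy))
```

Counting such sites per genomic window would give a crude background distribution of potential SB target sites, against which observed insertion densities could be compared.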
One of the key benefits of using a transposon such as SB for insertional mutagenesis is that the mutagenic elements can be modified to control the types of mutation that occur. For example, modifying the cargo to enable only truncating mutations could increase the likelihood of identifying tumour suppressor genes [248]. Tissue-specific promoters can be integrated as cargo, making transposons an attractive mutagen for cancer gene discovery in a range of cancer types [256]. Spatial and temporal transposition could also be achieved by introducing a lox–stop–lox cassette between the SB transposase promoter and cDNA, such that transposition is induced upon the addition of Cre [256].
Identification of cancer genes in SB mutagenesis follows much the same procedure as for retroviruses. Largaespada and Collier [257] have developed a technique that uses linker-mediated PCR but that enables PCR amplification of DNA flanking both sides of the transposon to maximise coverage. Primers were designed to bind to the IR/DR sites and to synthetic adapters. Unlike in retroviral mutagenesis, tumour cells contain a concatemer of non-transposed elements. To avoid repeated cloning of the junctions between these elements, “blocking” primers can be used that bind to the plasmid DNA flanking each transposon in the concatemer but that have blocked 3′ ends to prevent polymerase extension. Alternatively, after linker ligation, the DNA could be redigested with an endonuclease that cuts within the flanking plasmid DNA so that the primer binding sites are separated on different molecules (see [257]).
2.4.2.2. Alternative mutagens for transposon insertional mutagenesis
The active invertebrate transposons piggyBac and Minos are the only other DNA transposons that have so far been mobilised in the mouse [248]. The piggyBac transposon, isolated from the cabbage looper moth (Trichoplusia ni), mobilises in mouse somatic cells and in the germline, and it can carry a larger cargo than SB [258]. The coding sequence of the piggyBac transposase has been codon-optimised to enable higher levels of transposition in the mouse, and inducible versions have been generated by fusing the transposase to the ERt2 oestrogen receptor ligand-binding domain [259]. Unlike SB, piggyBac shows a strong preference for inserting into genes in the mouse [258] and in human cell lines [260]. The Minos transposon, from Drosophila hydei, has attracted interest because it shows a low insertional bias and high transposition efficiency in a range of animals (for a review, see [261]). However, it has so far shown only weak in vivo activity in the mouse [262,263].
Retrotransposons are also gaining attention as potential insertional mutagens. Long interspersed nuclear elements (LINEs) are non-LTR retrotransposons that are transcribed into mRNA by RNA polymerase II and encode two proteins that are essential for transposition [264]: a protein that binds to single-stranded RNA [265] and a protein with reverse transcriptase and endonuclease activity [266,267]. LINE-1 (L1) elements make up 17% of the human genome [268]. Transcription of endogenous L1 elements is generally inefficient but there are a small number of highly active “hot L1s”, which were used to generate a transgenic mouse model of L1 retrotransposition that showed a higher frequency of de novo somatic L1 insertions [269]. A 200-fold increase in transposition in the mouse germline has also been achieved by codon optimisation of the human L1 coding region [270]. L1 mobilises by a “copy and paste” mechanism. It is therefore an attractive mutagen for forward genetic screens because, unlike DNA transposons, it is capable of self-expansion and the original insertion remains intact, aiding identification of mutated genes [248,271]. In addition, it appears to show no [272], or only a slight [269], preference for inserting into genes, and there is no local hopping because the RNA intermediate must exit and re-enter the nucleus before inserting into the genome. However, most L1 insertions are truncated at the 5′ end [269], potentially resulting in the loss of promoters, splice acceptors and polyadenylation signals required for mutagenesis [248]. Controlled insertional mutagenesis using L1 derivatives has not yet been reported and Sleeping Beauty remains the preferred transposon for cancer gene discovery.
3. Cross-species comparative analysis for cancer gene discovery
Important biological sequences, such as gene coding regions and regulatory elements, are conserved in evolution. Cross-species comparative sequence analysis can therefore facilitate the characterisation of known cancer genes. For example, comparison of intronic sequences in human and mouse BRCA1 led to the identification of two evolutionarily conserved regulatory elements in the second intron that, when mutated, had opposite effects on gene expression [273]. Cross-species comparative analysis also provides an extremely powerful approach for identifying novel genes and gene collaborations involved in cancer formation. Many genes and pathways have been implicated in tumourigenesis, and most human cancers exhibit genomic instability, leading to the acquisition of genetic alterations that drive tumourigenesis but also many passenger mutations that do not contribute to the tumour phenotype. Distinguishing driver and passenger mutations is a major challenge. The underlying molecular mechanisms that govern important biological processes are, however, conserved through evolution and cancer-associated mutation data from other species can therefore be used as a filter for identifying genes that represent strong candidates for a role in human tumourigenesis.
Genome-wide expression data for human tumours can be difficult to interpret, and a number of studies have therefore used cross-species comparative analysis to identify conserved expression signatures that are important in tumourigenesis. Expression profiles of intestinal polyps from patients with a germline mutation in APC were compared to those from Apc-deficient mice and the conserved signature showed an over-representation of genes involved in cell proliferation and activation of the Wnt/β-catenin signalling pathway [274]. Likewise, comparison of expression profiles for human lung adenocarcinoma and a mouse model of Kras2-mediated lung cancer led to the identification of a KRAS2 expression signature that was not identified by analysing KRAS2-mutated human tumours alone [26]. More recently, a mutated Kras-specific signature that can be used to classify human and mouse lung tumours on the basis of their KRAS mutation status has been identified by comparing KRAS-mutated human cancer cells to mouse somatic cells containing knocked-in mutant Kras [275].
Mouse prostate cancers induced by human MYC have an expression signature that defines a set of “Myc-like” human prostate tumours and includes overexpression of the oncogene Pim1 [236]. Rat prostate tumours also have a similar expression profile to human prostate tumours, and have been used to identify conserved genes that are differentially expressed in both species in response to treatment with the chemopreventive agent selenium [276]. The mouse is therefore not the only cancer model that has been used for cross-species comparison. The greater the evolutionary distance between the species, the greater the likelihood that conserved changes in gene expression contribute to the cancer phenotype. An expression signature in zebrafish liver tumours is more consistently associated with human liver tumours than with other human tumour types and, since human and zebrafish are distantly related, genes in the conserved signature are strong candidates for a role in cancer development [277].
Another approach for cross-species analysis involves comparing the CGH profiles of human tumours to the CGH profiles of tumours generated from a mouse model of the corresponding human cancer. Such studies take advantage of the conserved synteny between the human and mouse genomes [278]. Comparison of CGH profiles for human neuroblastomas with profiles for tumours and cell lines from a MYCN transgenic mouse model of neuroblastoma has shown that many genetic aberrations are conserved between species [279,280]. Likewise, 80% of aberrations detected by array CGH in tumour cells of the mouse model for epithelial ovarian cancer are conserved in human epithelial ovarian cancer [281] and epithelial carcinomas in mice with telomere dysfunction show numerous copy number changes in regions syntenic to those in human cancers [282]. Zender et al. [283] used array CGH to identify regions of copy number change in the tumours of a mouse model for hepatocellular carcinoma. The CGH profiles were compared to array CGH data for human hepatocellular carcinomas to identify minimally conserved amplicons, and genes that showed increased expression in both species were chosen as candidate cancer genes. The authors identified 2 oncogenes, cIAP1 and Yap, that act synergistically in a focal amplicon on mouse chromosome 9qA1, which is syntenic to an 11q22 amplicon in human tumours. Kim et al. [284] used a comparable approach to identify Nedd9 as a candidate for a role in promoting metastasis of melanomas. A focal amplicon comprising 8 genes, including Nedd9, was identified on chromosome 13 in 2 metastatic cell lines derived from a Ras mouse model of nonmetastatic melanoma. In human metastatic melanomas, 36% contained a much larger amplicon in the syntenic region on chromosome 6p25-24, and 35–52% showed significant overexpression of NEDD9, with more advanced tumours showing higher levels.
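The logic of this kind of comparison (map mouse amplicons to human through synteny, intersect them with human amplicons, then keep genes overexpressed in both species) can be sketched as follows. This is an illustrative outline under simplifying assumptions, not the pipeline used in the cited studies: synteny is represented by whole blocks rather than base-level coordinate conversion, and every coordinate, chromosome label and gene name below is invented.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def map_to_human(mouse_region, synteny_blocks):
    """Return the human side of every synteny block touched by a mouse region.

    Coarse assumption: the whole human block is returned, not a lifted-over
    sub-interval.
    """
    chrom, start, end = mouse_region
    return [
        (h_chr, h_start, h_end)
        for m_chr, m_start, m_end, h_chr, h_start, h_end in synteny_blocks
        if m_chr == chrom and overlaps(start, end, m_start, m_end)
    ]


def conserved_amplicons(mouse_amplicons, human_amplicons, synteny_blocks):
    """Human intervals amplified in human tumours and syntenic to a mouse amplicon."""
    shared = []
    for region in mouse_amplicons:
        for h_chr, h_start, h_end in map_to_human(region, synteny_blocks):
            for a_chr, a_start, a_end in human_amplicons:
                if a_chr == h_chr and overlaps(h_start, h_end, a_start, a_end):
                    shared.append((h_chr, max(h_start, a_start), min(h_end, a_end)))
    return shared


def candidate_genes(shared, human_genes, up_human, up_mouse):
    """Genes inside a conserved amplicon that are overexpressed in both species."""
    found = set()
    for chrom, start, end in shared:
        for gene, (g_chr, g_start, g_end) in human_genes.items():
            inside = g_chr == chrom and overlaps(start, end, g_start, g_end)
            if inside and gene in up_human and gene in up_mouse:
                found.add(gene)
    return sorted(found)


# Toy data in arbitrary units (hypothetical, for illustration only).
synteny = [("m9", 0, 100, "h11", 500, 600)]
mouse_amplicons = [("m9", 20, 40)]
human_amplicons = [("h11", 550, 700), ("h3", 10, 20)]
human_genes = {
    "GENE_A": ("h11", 560, 570),
    "GENE_B": ("h11", 580, 590),
    "GENE_C": ("h11", 650, 660),
}
up_human = {"GENE_A", "GENE_C"}
up_mouse = {"GENE_A", "GENE_B"}  # mouse genes already renamed to human orthologues

shared = conserved_amplicons(mouse_amplicons, human_amplicons, synteny)
candidates = candidate_genes(shared, human_genes, up_human, up_mouse)
print(shared, candidates)
```

Requiring both a conserved copy number gain and increased expression in both species is what narrows a multi-gene amplicon down to a short candidate list; here only one of the three toy genes passes both filters.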
Comparison of human cancers with mouse models of cancer relies on the use of mouse models that accurately recapitulate the human cancer [285]. While cIAP1 and Yap overexpression was found to be important in p53−/−;Myc-induced hepatoblasts in the study by Zender et al. [283], neither gene contributed to tumourigenesis in p53−/−;Akt or Ras hepatoblasts. Likewise, Nedd9 did not contribute to melanoma metastasis in the absence of Ras or Raf activation [284]. Cross-species comparison of genomic profiles for a particular cancer may therefore require some prior knowledge of the genetic events that drive tumourigenesis in that cancer so that an appropriate mouse model can be generated. However, cross-species analysis can also facilitate the selection of a suitable mouse model. Lee et al. [286] used unsupervised hierarchical clustering of expression data from human and mouse hepatocellular carcinomas to identify the mouse models that provided the best fit for human cancers. Mouse and human tumours that clustered together due to similar expression profiles also shared phenotypic characteristics, such as proliferation rate and prognosis [286]. Most genetically engineered mouse models do not show the high levels of chromosome instability associated with human cancers. Mice that are engineered with telomere dysfunction, or defects in DNA damage checkpoints or DNA repair, may therefore represent better models for comparative oncogenomics [287]. Comparative analysis of copy number alterations in chromosomally unstable murine T-cell lymphomas and human solid tumours identified recurrent aberrations in the mouse that are conserved in human T-ALL but also in other human tumour types [287].
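Selecting a mouse model by unsupervised hierarchical clustering can be illustrated with a small, dependency-free sketch. It assumes that human and mouse profiles have already been reduced to a shared set of orthologous genes and normalised onto a comparable scale. The choice of correlation distance with average linkage is an assumption made for this example and is not necessarily what was used in [286]. The sample names and values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance = 1 - Pearson correlation.

    profiles: dict mapping sample name -> list of expression values measured
              over the same ordered set of orthologous genes.
    Returns a list of clusters, each a sorted list of sample names.
    """
    clusters = [[name] for name in profiles]

    def cluster_distance(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        total = sum(1.0 - pearson(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy profiles over four shared genes (hypothetical values).
profiles = {
    "human_1": [2.0, 1.5, -1.0, -2.0],
    "human_2": [-1.8, -1.0, 1.2, 2.1],
    "mouse_A": [1.8, 1.2, -0.8, -1.9],
    "mouse_B": [-2.0, -1.1, 0.9, 1.7],
}
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)
```

In this toy example each mouse model groups with the human tumour whose profile it most resembles, which is the sense in which clustering can indicate the best-fitting model for a given human tumour subclass.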
Candidate cancer genes can also be identified by comparing expression and CGH profiles for human tumours with mouse insertional mutagenesis screens. Genes in expression signatures associated with distinct subclasses of human acute myeloid leukaemia were significantly correlated with genes nearest to insertion sites in a Graffi 1.4 MuLV mouse model and with candidate leukaemia genes in BXH2 and AKXD mouse models [288]. There was little overlap between the candidates identified by Graffi 1.4 and BXH2/AKXD, demonstrating that retroviral screens involving multiple models and viruses may be required for a more effective cross-species comparison [195]. Amplified regions in human pancreatic cancer have also been shown to contain more CIS in retrovirus-induced murine lymphomas and leukaemias than expected by chance [289]. As discussed previously, insertional mutagenesis ‘tags’ the mutated gene, thereby facilitating cancer gene identification. In contrast, copy number alterations in human cancer can be very large, encompassing many genes, and no systematic approach currently exists for identifying the critical genes within these regions [290]. Thus comparative analysis of oncogenic insertions in mouse tumours and CGH data for human tumours is potentially a very powerful approach for narrowing down the candidates in regions of copy number change.
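One simple way to ask whether amplified regions contain more CIS than expected by chance is a one-sided hypergeometric test on gene counts. The sketch below assumes that genes are the unit of analysis and are equally likely to be tagged, which is a simplification; the statistical method actually used in [289] is not described here and may differ. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total_genes, amplified_genes, cis_genes, cis_in_amplified):
    """P(X >= cis_in_amplified) when cis_genes are drawn without replacement
    from total_genes, of which amplified_genes lie in amplified regions."""
    denominator = comb(total_genes, cis_genes)
    upper = min(amplified_genes, cis_genes)
    numerator = sum(
        comb(amplified_genes, i) * comb(total_genes - amplified_genes, cis_genes - i)
        for i in range(cis_in_amplified, upper + 1)
    )
    return numerator / denominator


# Toy counts, invented for illustration only.
total_genes = 20000      # genes with a human-mouse orthologue
amplified_genes = 500    # of these, genes inside recurrently amplified regions
cis_genes = 300          # genes tagged by a CIS in the mouse screen
cis_in_amplified = 20    # CIS genes whose orthologue lies in an amplified region

expected = cis_genes * amplified_genes / total_genes
p_value = hypergeom_upper_tail(total_genes, amplified_genes, cis_genes, cis_in_amplified)
print(f"expected by chance: {expected:.1f}, observed: {cis_in_amplified}, p = {p_value:.2e}")
```

A small p-value indicates that the overlap is larger than random placement of insertions would produce, which is the basis for using CIS to prioritise genes within large amplicons.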
4. Validating candidate cancer genes
In many cases, identifying candidate cancer genes using the methods described above is only the first step towards proving that they are involved in cancer development, and further functional validation is usually required.
4.1. Validating candidate gain of function mutations
Validating candidate gain of function mutations may be achieved in several ways and the approach applied is largely driven by the tissue or organ system in which the candidate oncogene is being studied. Classically, viruses have been used to overexpress candidate cancer genes either by infecting cells ex vivo and then transplanting them back into a host, or by injecting viruses directly into the tissue of interest [291]. The haematopoietic system and the mammary gland are particularly amenable to transplantation, while any organ may be injected with viruses. There are now vast collections of cDNAs in retroviral vectors that provide an ‘off the shelf’ resource for overexpressing and validating candidate cancer genes [292,293]. Transposons such as Sleeping Beauty have also been used to deliver a ‘payload’ containing an oncogene into tissues including the liver [294] and brain [295]. In this context, transposons essentially represent an alternative delivery tool to viruses. Where it is desirable to generate large numbers of animals for study, an alternative strategy has been proposed to generate arrays of transposons carrying oncogenic cDNAs in ES cells and to make lines of mice from which transposons may be mobilized somatically, resulting in expression of oncogenic cDNAs when the transposon lands near a suitable promoter [296]. The premise is that an oncogene requires the right level of expression to participate in transformation. Although this approach is novel, it is unclear how useful it will be, since construction of transposon arrays is cumbersome and the oncogenic cDNAs present in these arrays are not silent or inert, so developmental defects resulting from ectopic expression of candidate oncogenes may occur.
An alternative strategy to these overexpression approaches is to knock down expression of a candidate oncogene, since depletion of a gene that is important in driving oncogenesis may result in decreased growth of the tumour, or of a cell line derived from the tumour in culture. A very powerful, but low-throughput, approach is to knock cDNAs into a defined locus such as the Rosa locus in a Lox–Stop–Lox vector and to express them conditionally after expression of Cre recombinase [297,298]. Elegant methods have also been developed to validate fusion genes using ‘invertor alleles’ [299,300].
4.2. Validating candidate loss of function mutations
The most high-throughput approach for validating candidate tumour suppressor genes is to knock them down using shRNAs [301,302]. This approach is largely restricted to tissues in which viral delivery of the shRNA can be achieved, as discussed above. There are, however, constructs available to introduce shRNAs into defined locations in the genome such as the Rosa locus. The success of an shRNA depends on the ability of the transcript to be ‘knocked down’ and on the stability of the protein, so it may not be suitable for all genes. One particularly appealing shRNA-based system incorporates tet-regulatable elements, allowing expression of a gene to be switched off and then on again so that its role in tumour initiation and progression can be studied in detail [297].
Clearly the most powerful approach for validating loss of function mutations is to use conditional loss of function alleles in the mouse. Generating conditional alleles in mice is certainly not a high-throughput strategy, since it takes at least a year to generate an allele and to obtain the mice for study. There are, however, extensive programmes such as the Knockout Mouse Project (KOMP) and the European Conditional Mouse Mutagenesis (EUCOMM) Programme that are generating impressive collections of conditional alleles in ES cells and mice [303,304].
5. Concluding remarks
The study of human and mouse cancers has given us a window onto the genetic complexity of the cancer genome. As we go forward, whole cancer genome sequencing is likely to move to the fore as the primary approach applied to cancer genome analysis. While this approach will probably reveal a number of frequently mutated genes that have been missed by other techniques, it is also likely to uncover many genes that are only occasionally mutated. Additionally, many frequently rearranged regions containing numerous genes will probably be identified, further complicating the identification of those mutations that drive the tumourigenic process. Deconvoluting this complexity should be enabled by cross-species cancer gene analysis, which, as described above, has already been shown to be a potent approach for cancer gene identification and validation. The limiting factor of the cross-species cancer gene approach has been generating animal models that faithfully recapitulate the human disease, although considerable efforts, such as the Mouse Models of Human Cancers Consortium (MMHCC), are being made to redress this limitation. Clearly, mice, in addition to other animal models, will play a major role in furthering our understanding of the cancer genome. Large-scale oncogenomic approaches that incorporate data from both mouse and human and that apply in vivo techniques, such as shRNA knockdown and viral mediated overexpression, to validate candidate cancer genes are likely to become commonplace in the cancer research arena [305].
What is certainly clear is that when we look back on this era in cancer research we will realise how little we understood about the genes and pathways associated with cancer formation and the ingenuity of cancers to evolve and overcome all that we throw at them.
Acknowledgements
Work in the Adams laboratory is supported by Cancer Research-UK and the Wellcome Trust. Louise van der Weyden is supported by the Kay Kendall Leukaemia Fund.
References
- 1.Davies H., Bignell G.R., Cox C., Stephens P., Edkins S., Clegg S., Teague J., Woffendin H., Garnett M.J., Bottomley W., Davis N., Dicks E., Ewing R., Floyd Y., Gray K., Hall S., Hawes R., Hughes J., Kosmidou V., Menzies A., Mould C., Parker A., Stevens C., Watt S., Hooper S., Wilson R., Jayatilake H., Gusterson B.A., Cooper C., Shipley J., Hargrave D., Pritchard-Jones K., Maitland N., Chenevix-Trench G., Riggins G.J., Bigner D.D., Palmieri G., Cossu A., Flanagan A., Nicholson A., Ho J.W., Leung S.Y., Yuen S.T., Weber B.L., Seigler H.F., Darrow T.L., Paterson H., Marais R., Marshall C.J., Wooster R., Stratton M.R., Futreal P.A. Mutations of the BRAF gene in human cancer. Nature. 2002;417:949–954. doi: 10.1038/nature00766. [DOI] [PubMed] [Google Scholar]
- 2.Bardelli A., Parsons D.W., Silliman N., Ptak J., Szabo S., Saha S., Markowitz S., Willson J.K., Parmigiani G., Kinzler K.W., Vogelstein B., Velculescu V.E. Mutational analysis of the tyrosine kinome in colorectal cancers. Science. 2003;300:949. doi: 10.1126/science.1082596. [DOI] [PubMed] [Google Scholar]
- 3.Wang Z., Shen D., Parsons D.W., Bardelli A., Sager J., Szabo S., Ptak J., Silliman N., Peters B.A., van der Heijden M.S., Parmigiani G., Yan H., Wang T.L., Riggins G., Powell S.M., Willson J.K., Markowitz S., Kinzler K.W., Vogelstein B., Velculescu V.E. Mutational analysis of the tyrosine phosphatome in colorectal cancers. Science. 2004;304:1164–1166. doi: 10.1126/science.1096096. [DOI] [PubMed] [Google Scholar]
- 4.Samuels Y., Velculescu V.E. Oncogenic mutations of PIK3CA in human cancers. Cell Cycle. 2004;3:1221–1224. doi: 10.4161/cc.3.10.1164. [DOI] [PubMed] [Google Scholar]
- 5.Bachman K.E., Argani P., Samuels Y., Silliman N., Ptak J., Szabo S., Konishi H., Karakas B., Blair B.G., Lin C., Peters B.A., Velculescu V.E., Park B.H. The PIK3CA gene is mutated with high frequency in human breast cancers. Cancer Biol. Ther. 2004;3:772–775. doi: 10.4161/cbt.3.8.994. [DOI] [PubMed] [Google Scholar]
- 6.Campbell I.G., Russell S.E., Choong D.Y., Montgomery K.G., Ciavarella M.L., Hooi C.S., Cristiano B.E., Pearson R.B., Phillips W.A. Mutation of the PIK3CA gene in ovarian and breast cancer. Cancer Res. 2004;64:7678–7681. doi: 10.1158/0008-5472.CAN-04-2933. [DOI] [PubMed] [Google Scholar]
- 7.Levine D.A., Bogomolniy F., Yee C.J., Lash A., Barakat R.R., Borgen P.I., Boyd J. Frequent mutation of the PIK3CA gene in ovarian and breast cancers. Clin. Cancer Res. 2005;11:2875–2878. doi: 10.1158/1078-0432.CCR-04-2142. [DOI] [PubMed] [Google Scholar]
- 8.Parsons D.W., Wang T.L., Samuels Y., Bardelli A., Cummins J.M., DeLong L., Silliman N., Ptak J., Szabo S., Willson J.K., Markowitz S., Kinzler K.W., Vogelstein B., Lengauer C., Velculescu V.E. Colorectal cancer: mutations in a signalling pathway. Nature. 2005;436:792. doi: 10.1038/436792a. [DOI] [PubMed] [Google Scholar]
- 9.Brugge J., Hung M.C., Mills G.B. A new mutational AKTivation in the PI3K pathway. Cancer Cell. 2007;12:104–107. doi: 10.1016/j.ccr.2007.07.014. [DOI] [PubMed] [Google Scholar]
- 10.Stephens P., Edkins S., Davies H., Greenman C., Cox C., Hunter C., Bignell G., Teague J., Smith R., Stevens C., O'Meara S., Parker A., Tarpey P., Avis T., Barthorpe A., Brackenbury L., Buck G., Butler A., Clements J., Cole J., Dicks E., Edwards K., Forbes S., Gorton M., Gray K., Halliday K., Harrison R., Hills K., Hinton J., Jones D., Kosmidou V., Laman R., Lugg R., Menzies A., Perry J., Petty R., Raine K., Shepherd R., Small A., Solomon H., Stephens Y., Tofts C., Varian J., Webb A., West S., Widaa S., Yates A., Brasseur F., Cooper C.S., Flanagan A.M., Green A., Knowles M., Leung S.Y., Looijenga L.H., Malkowicz B., Pierotti M.A., Teh B., Yuen S.T., Nicholson A.G., Lakhani S., Easton D.F., Weber B.L., Stratton M.R., Futreal P.A., Wooster R. A screen of the complete protein kinase gene family identifies diverse patterns of somatic mutations in human breast cancer. Nat. Genet. 2005;37:590–592. doi: 10.1038/ng1571. [DOI] [PubMed] [Google Scholar]
- 11.Davies H., Hunter C., Smith R., Stephens P., Greenman C., Bignell G., Teague J., Butler A., Edkins S., Stevens C., Parker A., O'Meara S., Avis T., Barthorpe S., Brackenbury L., Buck G., Clements J., Cole J., Dicks E., Edwards K., Forbes S., Gorton M., Gray K., Halliday K., Harrison R., Hills K., Hinton J., Jones D., Kosmidou V., Laman R., Lugg R., Menzies A., Perry J., Petty R., Raine K., Shepherd R., Small A., Solomon H., Stephens Y., Tofts C., Varian J., Webb A., West S., Widaa S., Yates A., Brasseur F., Cooper C.S., Flanagan A.M., Green A., Knowles M., Leung S.Y., Looijenga L.H., Malkowicz B., Pierotti M.A., Teh B.T., Yuen S.T., Lakhani S.R., Easton D.F., Weber B.L., Goldstraw P., Nicholson A.G., Wooster R., Stratton M.R., Futreal P.A. Somatic mutations of the protein kinase gene family in human lung cancer. Cancer Res. 2005;65:7591–7595. doi: 10.1158/0008-5472.CAN-05-1855. [DOI] [PubMed] [Google Scholar]
- 12.Greenman C., Stephens P., Smith R., Dalgliesh G.L., Hunter C., Bignell G., Davies H., Teague J., Butler A., Stevens C., Edkins S., O'Meara S., Vastrik I., Schmidt E.E., Avis T., Barthorpe S., Bhamra G., Buck G., Choudhury B., Clements J., Cole J., Dicks E., Forbes S., Gray K., Halliday K., Harrison R., Hills K., Hinton J., Jenkinson A., Jones D., Menzies A., Mironenko T., Perry J., Raine K., Richardson D., Shepherd R., Small A., Tofts C., Varian J., Webb T., West S., Widaa S., Yates A., Cahill D.P., Louis D.N., Goldstraw P., Nicholson A.G., Brasseur F., Looijenga L., Weber B.L., Chiew Y.E., DeFazio A., Greaves M.F., Green A.R., Campbell P., Birney E., Easton D.F., Chenevix-Trench G., Tan M.H., Khoo S.K., Teh B.T., Yuen S.T., Leung S.Y., Wooster R., Futreal P.A., Stratton M.R. Patterns of somatic mutation in human cancer genomes. Nature. 2007;446:153–158. doi: 10.1038/nature05610. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Sjoblom T., Jones S., Wood L.D., Parsons D.W., Lin J., Barber T.D., Mandelker D., Leary R.J., Ptak J., Silliman N., Szabo S., Buckhaults P., Farrell C., Meeh P., Markowitz S.D., Willis J., Dawson D., Willson J.K., Gazdar A.F., Hartigan J., Wu L., Liu C., Parmigiani G., Park B.H., Bachman K.E., Papadopoulos N., Vogelstein B., Kinzler K.W., Velculescu V.E. The consensus coding sequences of human breast and colorectal cancers. Science. 2006;314:268–274. doi: 10.1126/science.1133427. [DOI] [PubMed] [Google Scholar]
- 14.Lin J., Gan C.M., Zhang X., Jones S., Sjoblom T., Wood L.D., Parsons D.W., Papadopoulos N., Kinzler K.W., Vogelstein B., Parmigiani G., Velculescu V.E. A multidimensional analysis of genes mutated in breast and colorectal cancers. Genome Res. 2007;17:1304–1318. doi: 10.1101/gr.6431107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Getz G., Hofling H., Mesirov J.P., Golub T.R., Meyerson M., Tibshirani R., Lander E.S. Comment on “The consensus coding sequences of human breast and colorectal cancers”. Science. 2007;317:1500. doi: 10.1126/science.1138764. [DOI] [PubMed] [Google Scholar]
- 16.Kaiser J. Cancer. First pass at cancer genome reveals complex landscape. Science. 2006;313:1370. doi: 10.1126/science.313.5792.1370. [DOI] [PubMed] [Google Scholar]
- 17.Pruitt K.D., Tatusova T., Maglott D.R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007;35:D61–65. doi: 10.1093/nar/gkl842. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Wood L.D., Parsons D.W., Jones S., Lin J., Sjoblom T., Leary R.J., Shen D., Boca S.M., Barber T., Ptak J., Silliman N., Szabo S., Dezso Z., Ustyanksky V., Nikolskaya T., Nikolsky Y., Karchin R., Wilson P.A., Kaminker J.S., Zhang Z., Croshaw R., Willis J., Dawson D., Shipitsin M., Willson J.K., Sukumar S., Polyak K., Park B.H., Pethiyagoda C.L., Pant P.V., Ballinger D.G., Sparks A.B., Hartigan J., Smith D.R., Suh E., Papadopoulos N., Buckhaults P., Markowitz S.D., Parmigiani G., Kinzler K.W., Velculescu V.E., Vogelstein B. The genomic landscapes of human breast and colorectal cancers. Science. 2007;318:1108–1113. doi: 10.1126/science.1145720. [DOI] [PubMed] [Google Scholar]
- 19.Frohling S., Scholl C., Levine R.L., Loriaux M., Boggon T.J., Bernard O.A., Berger R., Dohner H., Dohner K., Ebert B.L., Teckie S., Golub T.R., Jiang J., Schittenhelm M.M., Lee B.H., Griffin J.D., Stone R.M., Heinrich M.C., Deininger M.W., Druker B.J., Gilliland D.G. Identification of driver and passenger mutations of FLT3 by high-throughput DNA sequence analysis and functional assessment of candidate alleles. Cancer Cell. 2007;12:501–513. doi: 10.1016/j.ccr.2007.11.005. [DOI] [PubMed] [Google Scholar]
- 20.Futreal P.A. Backseat drivers take the wheel. Cancer Cell. 2007;12:493–494. doi: 10.1016/j.ccr.2007.11.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Forbes S., Clements J., Dawson E., Bamford S., Webb T., Dogan A., Flanagan A., Teague J., Wooster R., Futreal P.A., Stratton M.R. Cosmic 2005. Br. J. Cancer. 2006;94:318–322. doi: 10.1038/sj.bjc.6602928. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Hodges E., Xuan Z., Balija V., Kramer M., Molla M.N., Smith S.W., Middle C.M., Rodesch M.J., Albert T.J., Hannon G.J., McCombie W.R. Genome-wide in situ exon capture for selective resequencing. Nat. Genet. 2007;39:1522–1527. doi: 10.1038/ng.2007.42. [DOI] [PubMed] [Google Scholar]
- 23.Bhattacharjee A., Richards W.G., Staunton J., Li C., Monti S., Vasa P., Ladd C., Beheshti J., Bueno R., Gillette M., Loda M., Weber G., Mark E.J., Lander E.S., Wong W., Johnson B.E., Golub T.R., Sugarbaker D.J., Meyerson M. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. U. S. A. 2001;98:13790–13795. doi: 10.1073/pnas.191502998. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Garber M.E., Troyanskaya O.G., Schluens K., Petersen S., Thaesler Z., Pacyna-Gengelbach M., van de Rijn M., Rosen G.D., Perou C.M., Whyte R.I., Altman R.B., Brown P.O., Botstein D., Petersen I. Diversity of gene expression in adenocarcinoma of the lung. Proc. Natl. Acad. Sci. U. S. A. 2001;98:13784–13789. doi: 10.1073/pnas.241500798. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Beer D.G., Kardia S.L., Huang C.C., Giordano T.J., Levin A.M., Misek D.E., Lin L., Chen G., Gharib T.G., Thomas D.G., Lizyness M.L., Kuick R., Hayasaka S., Taylor J.M., Iannettoni M.D., Orringer M.B., Hanash S. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 2002;8:816–824. doi: 10.1038/nm733. [DOI] [PubMed] [Google Scholar]
- 26.Sweet-Cordero A., Mukherjee S., Subramanian A., You H., Roix J.J., Ladd-Acosta C., Mesirov J., Golub T.R., Jacks T. An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. Nat. Genet. 2005;37:48–55. doi: 10.1038/ng1490. [DOI] [PubMed] [Google Scholar]
- 27.Eden P., Ritz C., Rose C., Ferno M., Peterson C. “Good Old” clinical markers have similar power in breast cancer prognosis as microarray gene expression profilers. Eur. J. Cancer. 2004;40:1837–1841. doi: 10.1016/j.ejca.2004.02.025. [DOI] [PubMed] [Google Scholar]
- 28.Wang Z., Gerstein M., Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 2009;10:57–63. doi: 10.1038/nrg2484. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Feuk L., Carson A.R., Scherer S.W. Structural variation in the human genome. Nat. Rev. Genet. 2006;7:85–97. doi: 10.1038/nrg1767. [DOI] [PubMed] [Google Scholar]
- 30.Redon R., Ishikawa S., Fitch K.R., Feuk L., Perry G.H., Andrews T.D., Fiegler H., Shapero M.H., Carson A.R., Chen W., Cho E.K., Dallaire S., Freeman J.L., Gonzalez J.R., Gratacos M., Huang J., Kalaitzopoulos D., Komura D., MacDonald J.R., Marshall C.R., Mei R., Montgomery L., Nishimura K., Okamura K., Shen F., Somerville M.J., Tchinda J., Valsesia A., Woodwark C., Yang F., Zhang J., Zerjal T., Zhang J., Armengol L., Conrad D.F., Estivill X., Tyler-Smith C., Carter N.P., Aburatani H., Lee C., Jones K.W., Scherer S.W., Hurles M.E. Global variation in copy number in the human genome. Nature. 2006;444:444–454. doi: 10.1038/nature05329. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Stranger B.E., Forrest M.S., Dunning M., Ingle C.E., Beazley C., Thorne N., Redon R., Bird C.P., de Grassi A., Lee C., Tyler-Smith C., Carter N., Scherer S.W., Tavare S., Deloukas P., Hurles M.E., Dermitzakis E.T. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007;315:848–853. doi: 10.1126/science.1136678. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Lengauer C., Kinzler K.W., Vogelstein B. Genetic instabilities in human cancers. Nature. 1998;396:643–649. doi: 10.1038/25292. [DOI] [PubMed] [Google Scholar]
- 33.Fridlyand J., Snijders A.M., Ylstra B., Li H., Olshen A., Segraves R., Dairkee S., Tokuyasu T., Ljung B.M., Jain A.N., McLennan J., Ziegler J., Chin K., Devries S., Feiler H., Gray J.W., Waldman F., Pinkel D., Albertson D.G. Breast tumor copy number aberration phenotypes and genomic instability. BMC Cancer. 2006;6:96. doi: 10.1186/1471-2407-6-96. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Hernando E., Nahle Z., Juan G., Diaz-Rodriguez E., Alaminos M., Hemann M., Michel L., Mittal V., Gerald W., Benezra R., Lowe S.W., Cordon-Cardo C. Rb inactivation promotes genomic instability by uncoupling cell cycle progression from mitotic control. Nature. 2004;430:797–802. doi: 10.1038/nature02820. [DOI] [PubMed] [Google Scholar]
- 35.Albertson D.G., Collins C., McCormick F., Gray J.W. Chromosome aberrations in solid tumors. Nat. Genet. 2003;34:369–376. doi: 10.1038/ng1215. [DOI] [PubMed] [Google Scholar]
- 36.Li L., McCormack A.A., Nicholson J.M., Fabarius A., Hehlmann R., Sachs R.K., Duesberg P.H. Cancer-causing karyotypes: chromosomal equilibria between destabilizing aneuploidy and stabilizing selection for oncogenic function. Cancer Genet. Cytogenet. 2009;188:1–25. doi: 10.1016/j.cancergencyto.2008.08.016. [DOI] [PubMed] [Google Scholar]
- 37.Kallioniemi A., Kallioniemi O.P., Sudar D., Rutovitz D., Gray J.W., Waldman F., Pinkel D. Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science. 1992;258:818–821. doi: 10.1126/science.1359641. [DOI] [PubMed] [Google Scholar]
- 38.Pinkel D., Segraves R., Sudar D., Clark S., Poole I., Kowbel D., Collins C., Kuo W.L., Chen C., Zhai Y., Dairkee S.H., Ljung B.M., Gray J.W., Albertson D.G. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat. Genet. 1998;20:207–211. doi: 10.1038/2524. [DOI] [PubMed] [Google Scholar]
- 39.Knutsen T., Gobu V., Knaus R., Padilla-Nash H., Augustus M., Strausberg R.L., Kirsch I.R., Sirotkin K., Ried T. The interactive online SKY/M-FISH and CGH database and the Entrez cancer chromosomes search database: linkage of chromosomal aberrations with the genome sequence. Genes Chromosomes Cancer. 2005;44:52–64. doi: 10.1002/gcc.20224. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Pollack J.R., Perou C.M., Alizadeh A.A., Eisen M.B., Pergamenschikov A., Williams C.F., Jeffrey S.S., Botstein D., Brown P.O. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat. Genet. 1999;23:41–46. doi: 10.1038/12640. [DOI] [PubMed] [Google Scholar]
- 41.Albertson D.G., Pinkel D. Genomic microarrays in human genetic disease and cancer. Hum. Mol. Genet. 2003;12(Spec. No. 2):R145–152. doi: 10.1093/hmg/ddg261. [DOI] [PubMed] [Google Scholar]
- 42.Thomas R.K., Weir B., Meyerson M. Genomic approaches to lung cancer. Clin. Cancer Res. 2006;12:4384s–4391s. doi: 10.1158/1078-0432.CCR-06-0098. [DOI] [PubMed] [Google Scholar]
- 43.Kennedy G.C., Matsuzaki H., Dong S., Liu W.M., Huang J., Liu G., Su X., Cao M., Chen W., Zhang J., Liu W., Yang G., Di X., Ryder T., He Z., Surti U., Phillips M.S., Boyce-Jacino M.T., Fodor S.P., Jones K.W. Large-scale genotyping of complex DNA. Nat. Biotechnol. 2003;21:1233–1237. doi: 10.1038/nbt869. [DOI] [PubMed] [Google Scholar]
- 44.Matsuzaki H., Dong S., Loi H., Di X., Liu G., Hubbell E., Law J., Berntsen T., Chadha M., Hui H., Yang G., Kennedy G.C., Webster T.A., Cawley S., Walsh P.S., Jones K.W., Fodor S.P., Mei R. Genotyping over 100,000 SNPs on a pair of oligonucleotide arrays. Nat. Methods. 2004;1:109–111. doi: 10.1038/nmeth718. [DOI] [PubMed] [Google Scholar]
- 45.Bignell G.R., Huang J., Greshock J., Watt S., Butler A., West S., Grigorova M., Jones K.W., Wei W., Stratton M.R., Futreal P.A., Weber B., Shapero M.H., Wooster R. High-resolution analysis of DNA copy number using oligonucleotide microarrays. Genome Res. 2004;14:287–295. doi: 10.1101/gr.2012304. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Xiao Y., Segal M.R., Yang Y.H., Yeh R.F. A multi-array multi-SNP genotyping algorithm for Affymetrix SNP microarrays. Bioinformatics. 2007;23:1459–1467. doi: 10.1093/bioinformatics/btm131. [DOI] [PubMed] [Google Scholar]
- 47.Zhao X., Li C., Paez J.G., Chin K., Janne P.A., Chen T.H., Girard L., Minna J., Christiani D., Leo C., Gray J.W., Sellers W.R., Meyerson M. An integrated view of copy number and allelic alterations in the cancer genome using single nucleotide polymorphism arrays. Cancer Res. 2004;64:3060–3071. doi: 10.1158/0008-5472.can-03-3308. [DOI] [PubMed] [Google Scholar]
- 48.Raghavan M., Lillington D.M., Skoulakis S., Debernardi S., Chaplin T., Foot N.J., Lister T.A., Young B.D. Genome-wide single nucleotide polymorphism analysis reveals frequent partial uniparental disomy due to somatic recombination in acute myeloid leukemias. Cancer Res. 2005;65:375–378. [PubMed] [Google Scholar]
- 49.Calhoun E.S., Gallmeier E., Cunningham S.C., Eshleman J.R., Hruban R.H., Kern S.E. Copy-number methods dramatically underestimate loss of heterozygosity in cancer. Genes Chromosomes Cancer. 2006;45:1070–1071. doi: 10.1002/gcc.20365. [DOI] [PubMed] [Google Scholar]
- 50.Kloth J.N., Oosting J., van Wezel T., Szuhai K., Knijnenburg J., Gorter A., Kenter G.G., Fleuren G.J., Jordanova E.S. Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex genetic alterations in cervical cancer. BMC Genomics. 2007;8:53. doi: 10.1186/1471-2164-8-53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Quackenbush J. Microarray data normalization and transformation. Nat. Genet. 2002;32 Suppl.:496–501. doi: 10.1038/ng1032. [DOI] [PubMed] [Google Scholar]
- 52.Neuvial P., Hupe P., Brito I., Liva S., Manie E., Brennetot C., Radvanyi F., Aurias A., Barillot E. Spatial normalization of array-CGH data. BMC Bioinformatics. 2006;7:264. doi: 10.1186/1471-2105-7-264. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Khojasteh M., Lam W.L., Ward R.K., MacAulay C. A stepwise framework for the normalization of array CGH data. BMC Bioinformatics. 2005;6:274. doi: 10.1186/1471-2105-6-274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157.de Alboran I.M., O'Hagan R.C., Gartner F., Malynn B., Davidson L., Rickert R., Rajewsky K., DePinho R.A., Alt F.W. Analysis of C-MYC function in normal cells via conditional gene-targeted mutation. Immunity. 2001;14:45–55. doi: 10.1016/s1074-7613(01)00088-7. [DOI] [PubMed] [Google Scholar]
- 158.Jackson E.L., Willis N., Mercer K., Bronson R.T., Crowley D., Montoya R., Jacks T., Tuveson D.A. Analysis of lung tumor initiation and progression using conditional expression of oncogenic K-ras. Genes Dev. 2001;15:3243–3248. doi: 10.1101/gad.943001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 159.Jonkers J., Berns A. Conditional mouse models of sporadic cancer. Nat. Rev. Cancer. 2002;2:251–265. doi: 10.1038/nrc777. [DOI] [PubMed] [Google Scholar]
- 160.Xu X., Wagner K.U., Larson D., Weaver Z., Li C., Ried T., Hennighausen L., Wynshaw-Boris A., Deng C.X. Conditional mutation of Brca1 in mammary epithelial cells results in blunted ductal morphogenesis and tumour formation. Nat. Genet. 1999;22:37–43. doi: 10.1038/8743. [DOI] [PubMed] [Google Scholar]
- 161.Jonkers J., Meuwissen R., van der Gulden H., Peterse H., van der Valk M., Berns A. Synergistic tumor suppressor activity of BRCA2 and p53 in a conditional mouse model for breast cancer. Nat. Genet. 2001;29:418–425. doi: 10.1038/ng747. [DOI] [PubMed] [Google Scholar]
- 162.O'Hagan R.C., Brennan C.W., Strahs A., Zhang X., Kannan K., Donovan M., Cauwels C., Sharpless N.E., Wong W.H., Chin L. Array comparative genome hybridization for tumor classification and gene discovery in mouse models of malignant melanoma. Cancer Res. 2003;63:5352–5356. [PubMed] [Google Scholar]
- 163.Hodgson G., Hager J.H., Volik S., Hariono S., Wernick M., Moore D., Nowak N., Albertson D.G., Pinkel D., Collins C., Hanahan D., Gray J.W. Genome scanning with array CGH delineates regional alterations in mouse islet carcinomas. Nat. Genet. 2001;29:459–464. doi: 10.1038/ng771. [DOI] [PubMed] [Google Scholar]
- 164.Uren A.G., Kool J., Berns A., van Lohuizen M. Retroviral insertional mutagenesis: past, present and future. Oncogene. 2005;24:7656–7672. doi: 10.1038/sj.onc.1209043. [DOI] [PubMed] [Google Scholar]
- 165.Weiss R.A. The discovery of endogenous retroviruses. Retrovirology. 2006;3:67. doi: 10.1186/1742-4690-3-67. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 166.Mikkers H., Berns A. Retroviral insertional mutagenesis: tagging cancer pathways. Adv. Cancer Res. 2003;88:53–99. doi: 10.1016/s0065-230x(03)88304-5. [DOI] [PubMed] [Google Scholar]
- 167.Neil J.C., Cameron E.R. Retroviral insertion sites and cancer: fountain of all knowledge? Cancer Cell. 2002;2:253–255. doi: 10.1016/s1535-6108(02)00158-7. [DOI] [PubMed] [Google Scholar]
- 168.Clausse N., Baines D., Moore R., Brookes S., Dickson C., Peters G. Activation of both Wnt-1 and Fgf-3 by insertion of mouse mammary tumor virus downstream in the reverse orientation: a reappraisal of the enhancer insertion model. Virology. 1993;194:157–165. doi: 10.1006/viro.1993.1245. [DOI] [PubMed] [Google Scholar]
- 169.Selten G., Cuypers H.T., Zijlstra M., Melief C., Berns A. Involvement of c-myc in MuLV-induced T cell lymphomas in mice: frequency and mechanisms of activation. EMBO J. 1984;3:3215–3222. doi: 10.1002/j.1460-2075.1984.tb02281.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 170.Corcoran L.M., Adams J.M., Dunn A.R., Cory S. Murine T lymphomas in which the cellular myc oncogene has been activated by retroviral insertion. Cell. 1984;37:113–122. doi: 10.1016/0092-8674(84)90306-4. [DOI] [PubMed] [Google Scholar]
- 171.Akagi K., Suzuki T., Stephens R.M., Jenkins N.A., Copeland N.G. RTCGD: retroviral tagged cancer gene database. Nucleic Acids Res. 2004;32:D523–527. doi: 10.1093/nar/gkh013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 172.Gearhart J., Pashos E.E., Prasad M.K. Pluripotency redux-advances in stem-cell research. N. Engl. J. Med. 2007;357:1469–1472. doi: 10.1056/NEJMp078126. [DOI] [PubMed] [Google Scholar]
- 173.Yucel R., Karsunky H., Klein-Hitpass L., Moroy T. The transcriptional repressor Gfi1 affects development of early, uncommitted c-Kit + T cell progenitors and CD4/CD8 lineage decision in the thymus. J. Exp. Med. 2003;197:831–844. doi: 10.1084/jem.20021417. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 174.Rathinam C., Klein C. Transcriptional repressor Gfi1 integrates cytokine-receptor signals controlling B-cell differentiation. PLoS ONE. 2007;2:e306. doi: 10.1371/journal.pone.0000306. [DOI] [PMC free article] [PubMed] [Google Scholar] [Retracted]
- 175.Mikkers H., Allen J., Knipscheer P., Romeijn L., Hart A., Vink E., Berns A. High-throughput retroviral tagging to identify components of specific signaling pathways in cancer. Nat. Genet. 2002;32:153–159. doi: 10.1038/ng950. [DOI] [PubMed] [Google Scholar]
- 176.Copeland N.G., Jenkins N.A. Retroviral integration in murine myeloid tumors to identify Evi-1, a novel locus encoding a zinc-finger protein. Adv. Cancer Res. 1990;54:141–157. doi: 10.1016/s0065-230x(08)60810-6. [DOI] [PubMed] [Google Scholar]
- 177.Mucenski M.L., Taylor B.A., Copeland N.G., Jenkins N.A. Chromosomal location of Evi-1, a common site of ecotropic viral integration in AKXD murine myeloid tumors. Oncogene Res. 1988;2:219–233. [PubMed] [Google Scholar]
- 178.Mucenski M.L., Taylor B.A., Ihle J.N., Hartley J.W., Morse H.C., III, Jenkins N.A., Copeland N.G. Identification of a common ecotropic viral integration site, Evi-1, in the DNA of AKXD murine myeloid tumors. Mol. Cell Biol. 1988;8:301–308. doi: 10.1128/mcb.8.1.301. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 179.Wieser R. The oncogene and developmental regulator EVI1: expression, biochemical properties, and biological functions. Gene. 2007;396:346–357. doi: 10.1016/j.gene.2007.04.012. [DOI] [PubMed] [Google Scholar]
- 180.Cuypers H.T., Selten G., Quint W., Zijlstra M., Maandag E.R., Boelens W., van Wezenbeek P., Melief C., Berns A. Murine leukemia virus-induced T-cell lymphomagenesis: integration of proviruses in a distinct chromosomal region. Cell. 1984;37:141–150. doi: 10.1016/0092-8674(84)90309-x. [DOI] [PubMed] [Google Scholar]
- 181.Selten G., Cuypers H.T., Berns A. Proviral activation of the putative oncogene Pim-1 in MuLV induced T-cell lymphomas. Embo J. 1985;4:1793–1798. doi: 10.1002/j.1460-2075.1985.tb03852.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 182.van Lohuizen M., Verbeek S., Krimpenfort P., Domen J., Saris C., Radaszkiewicz T., Berns A. Predisposition to lymphomagenesis in pim-1 transgenic mice: cooperation with c-myc and N-myc in murine leukemia virus-induced tumors. Cell. 1989;56:673–682. doi: 10.1016/0092-8674(89)90589-8. [DOI] [PubMed] [Google Scholar]
- 183.Dhanasekaran S.M., Barrette T.R., Ghosh D., Shah R., Varambally S., Kurachi K., Pienta K.J., Rubin M.A., Chinnaiyan A.M. Delineation of prognostic biomarkers in prostate cancer. Nature. 2001;412:822–826. doi: 10.1038/35090585. [DOI] [PubMed] [Google Scholar]
- 184.Brodeur G.M., Seeger R.C., Schwab M., Varmus H.E., Bishop J.M. Amplification of N-myc in untreated human neuroblastomas correlates with advanced disease stage. Science. 1984;224:1121–1124. doi: 10.1126/science.6719137. [DOI] [PubMed] [Google Scholar]
- 185.Brodeur G.M., Seeger R.C., Schwab M., Varmus H.E., Bishop J.M. Amplification of N-myc sequences in primary human neuroblastomas: correlation with advanced disease stage. Prog. Clin. Biol. Res. 1985;175:105–113. [PubMed] [Google Scholar]
- 186.Rosson D., Dugan D., Reddy E.P. Aberrant splicing events that are induced by proviral integration: implications for myb oncogene activation. Proc. Natl. Acad. Sci. U. S. A. 1987;84:3171–3175. doi: 10.1073/pnas.84.10.3171. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 187.Hoemann C.D., Beaulieu N., Girard L., Rebai N., Jolicoeur P. Two distinct Notch1 mutant alleles are involved in the induction of T-cell leukemia in c-myc transgenic mice. Mol. Cell. Biol. 2000;20:3831–3842. doi: 10.1128/mcb.20.11.3831-3842.2000. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 188.Weng A.P., Ferrando A.A., Lee W., Morris J.P.t., Silverman L.B., Sanchez-Irizarry C., Blacklow S.C., Look A.T., Aster J.C. Activating mutations of NOTCH1 in human T cell acute lymphoblastic leukemia. Science. 2004;306:269–271. doi: 10.1126/science.1102160. [DOI] [PubMed] [Google Scholar]
- 189.Largaespada D.A., Brannan C.I., Shaughnessy J.D., Jenkins N.A., Copeland N.G. The neurofibromatosis type 1 (NF1) tumor suppressor gene and myeloid leukemia. Curr. Top. Microbiol. Immunol. 1996;211:233–239. doi: 10.1007/978-3-642-85232-9_23. [DOI] [PubMed] [Google Scholar]
- 190.Bedigian H.G., Johnson D.A., Jenkins N.A., Copeland N.G., Evans R. Spontaneous and induced leukemias of myeloid origin in recombinant inbred BXH mice. J. Virol. 1984;51:586–594. doi: 10.1128/jvi.51.3.586-594.1984. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 191.Suzuki T., Minehata K., Akagi K., Jenkins N.A., Copeland N.G. Tumor suppressor gene identification using retroviral insertional mutagenesis in Blm-deficient mice. EMBO J. 2006;25:3422–3431. doi: 10.1038/sj.emboj.7601215. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 192.Ellis N.A., Groden J., Ye T.Z., Straughen J., Lennon D.J., Ciocci S., Proytcheva M., German J. The Bloom's syndrome gene product is homologous to RecQ helicases. Cell. 1995;83:655–666. doi: 10.1016/0092-8674(95)90105-1. [DOI] [PubMed] [Google Scholar]
- 193.Luo G., Santoro I.M., McDaniel L.D., Nishijima I., Mills M., Youssoufian H., Vogel H., Schultz R.A., Bradley A. Cancer predisposition caused by elevated mitotic recombination in Bloom mice. Nat. Genet. 2000;26:424–429. doi: 10.1038/82548. [DOI] [PubMed] [Google Scholar]
- 194.Wu X., Li Y., Crise B., Burgess S.M. Transcription start regions in the human genome are favored targets for MLV integration. Science. 2003;300:1749–1751. doi: 10.1126/science.1083413. [DOI] [PubMed] [Google Scholar]
- 195.Touw I.P., Erkeland S.J. Retroviral insertion mutagenesis in mice as a comparative oncogenomics tool to identify disease genes in human leukemia. Mol. Ther. 2007;15:13–19. doi: 10.1038/sj.mt.6300040. [DOI] [PubMed] [Google Scholar]
- 196.Pryciak P.M., Varmus H.E. Nucleosomes, DNA-binding proteins, and DNA sequence modulate retroviral integration target site selection. Cell. 1992;69:769–780. doi: 10.1016/0092-8674(92)90289-o. [DOI] [PubMed] [Google Scholar]
- 197.Muller H.P., Varmus H.E. DNA bending creates favored sites for retroviral integration: an explanation for preferred insertion sites in nucleosomes. EMBO J. 1994;13:4704–4714. doi: 10.1002/j.1460-2075.1994.tb06794.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 198.Mitchell R.S., Beitzel B.F., Schroder A.R., Shinn P., Chen H., Berry C.C., Ecker J.R., Bushman F.D. Retroviral DNA integration: ASLV, HIV, and MLV show distinct target site preferences. PLoS. Biol. 2004;2:E234. doi: 10.1371/journal.pbio.0020234. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 199.Bushman F., Lewinski M., Ciuffi A., Barr S., Leipzig J., Hannenhalli S., Hoffmann C. Genome-wide analysis of retroviral DNA integration. Nat. Rev. Microbiol. 2005;3:848–858. doi: 10.1038/nrmicro1263. [DOI] [PubMed] [Google Scholar]
- 200.Hansen G.M., Skapura D., Justice M.J. Genetic profile of insertion mutations in mouse leukemias and lymphomas. Genome Res. 2000;10:237–243. doi: 10.1101/gr.10.2.237. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 201.Weiser K.C., Liu B., Hansen G.M., Skapura D., Hentges K.E., Yarlagadda S., Morse Iii H.C., Justice M.J. Retroviral insertions in the VISION database identify molecular pathways in mouse lymphoid leukemia and lymphoma. Mamm. Genome. 2007;18:709–722. doi: 10.1007/s00335-007-9060-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 202.Ochman H., Gerber A.S., Hartl D.L. Genetic applications of an inverse polymerase chain reaction. Genetics. 1988;120:621–623. doi: 10.1093/genetics/120.3.621. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 203.Triglia T., Peterson M.G., Kemp D.J. A procedure for in vitro amplification of DNA segments that lie outside the boundaries of known sequences. Nucleic Acids Res. 1988;16:8186. doi: 10.1093/nar/16.16.8186. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 204.Riley J., Butler R., Ogilvie D., Finniear R., Jenner D., Powell S., Anand R., Smith J.C., Markham A.F. A novel, rapid method for the isolation of terminal sequences from yeast artificial chromosome (YAC) clones. Nucleic Acids Res. 1990;18:2887–2890. doi: 10.1093/nar/18.10.2887. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 205.Devon R.S., Porteous D.J., Brookes A.J. Splinkerettes-improved vectorettes for greater efficiency in PCR walking. Nucleic Acids Res. 1995;23:1644–1645. doi: 10.1093/nar/23.9.1644. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 206.Rothberg J.M., Leamon J.H. The development and impact of 454 sequencing. Nat. Biotechnol. 2008;26:1117–1124. doi: 10.1038/nbt1485. [DOI] [PubMed] [Google Scholar]
- 207.Wheeler D.A., Srinivasan M., Egholm M., Shen Y., Chen L., McGuire A., He W., Chen Y.J., Makhijani V., Roth G.T., Gomes X., Tartaro K., Niazi F., Turcotte C.L., Irzyk G.P., Lupski J.R., Chinault C., Song X.Z., Liu Y., Yuan Y., Nazareth L., Qin X., Muzny D.M., Margulies M., Weinstock G.M., Gibbs R.A., Rothberg J.M. The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008;452:872–876. doi: 10.1038/nature06884. [DOI] [PubMed] [Google Scholar]
- 208.Ning Z., Cox A.J., Mullikin J.C. SSAHA: a fast search method for large DNA databases. Genome Res. 2001;11:1725–1729. doi: 10.1101/gr.194201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 209.P. Green, http://www.phrap.org/ (unpublished).
- 210.Smith T.F., Waterman M.S. Identification of common molecular subsequences. J. Mol. Biol. 1981;147:195–197. doi: 10.1016/0022-2836(81)90087-5. [DOI] [PubMed] [Google Scholar]
- 211.Gotoh O. An improved algorithm for matching biological sequences. J. Mol. Biol. 1982;162:705–708. doi: 10.1016/0022-2836(82)90398-9. [DOI] [PubMed] [Google Scholar]
- 212.Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
- 213.de Ridder J., Uren A., Kool J., Reinders M., Wessels L. Detecting statistically significant common insertion sites in retroviral insertional mutagenesis screens. PLoS Comput. Biol. 2006;2:e166. doi: 10.1371/journal.pcbi.0020166. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 214.Suzuki T., Shen H., Akagi K., Morse H.C., Malley J.D., Naiman D.Q., Jenkins N.A., Copeland N.G. New genes involved in cancer identified by retroviral tagging. Nat. Genet. 2002;32:166–174. doi: 10.1038/ng949. [DOI] [PubMed] [Google Scholar]
- 215.Wu X., Luke B.T., Burgess S.M. Redefining the common insertion site. Virology. 2006;344:292–295. doi: 10.1016/j.virol.2005.08.047. [DOI] [PubMed] [Google Scholar]
- 216.Li J., Shen H., Himmel K.L., Dupuy A.J., Largaespada D.A., Nakamura T., Shaughnessy J.D., Jr., Jenkins N.A., Copeland N.G. Leukaemia disease genes: large-scale cloning and pathway predictions. Nat. Genet. 1999;23:348–353. doi: 10.1038/15531. [DOI] [PubMed] [Google Scholar]
- 217.Hwang H.C., Martins C.P., Bronkhorst Y., Randel E., Berns A., Fero M., Clurman B.E. Identification of oncogenes collaborating with p27Kip1 loss by insertional mutagenesis and high-throughput insertion site analysis. Proc. Natl. Acad. Sci. U. S. A. 2002;99:11293–11298. doi: 10.1073/pnas.162356099. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 218.Lund A.H., Turner G., Trubetskoy A., Verhoeven E., Wientjens E., Hulsman D., Russell R., DePinho R.A., Lenz J., van Lohuizen M. Genome-wide retroviral insertional tagging of genes involved in cancer in Cdkn2a-deficient mice. Nat. Genet. 2002;32:160–165. doi: 10.1038/ng956. [DOI] [PubMed] [Google Scholar]
- 219.Johansson F.K., Brodd J., Eklof C., Ferletta M., Hesselager G., Tiger C.F., Uhrbom L., Westermark B. Identification of candidate cancer-causing genes in mouse brain tumors by retroviral tagging. Proc. Natl. Acad. Sci. U. S. A. 2004;101:11334–11337. doi: 10.1073/pnas.0402716101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 220.Theodorou V., Kimm M.A., Boer M., Wessels L., Theelen W., Jonkers J., Hilkens J. MMTV insertional mutagenesis identifies genes, gene families and pathways involved in mammary cancer. Nat. Genet. 2007;39:759–769. doi: 10.1038/ng2034. [DOI] [PubMed] [Google Scholar]
- 221.Stewart M., Mackay N., Hanlon L., Blyth K., Scobie L., Cameron E., Neil J.C. Insertional mutagenesis reveals progression genes and checkpoints in MYC/Runx2 lymphomas. Cancer Res. 2007;67:5126–5133. doi: 10.1158/0008-5472.CAN-07-0433. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 222.Slape C., Hartung H., Lin Y.W., Bies J., Wolff L., Aplan P.D. Retroviral insertional mutagenesis identifies genes that collaborate with NUP98-HOXD13 during leukemic transformation. Cancer Res. 2007;67:5148–5155. doi: 10.1158/0008-5472.CAN-07-0075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 223.Uren A.G., Kool J., Matentzoglu K., de Ridder J., Mattison J., van Uitert M., Lagcher W., Sie D., Tanger E., Cox T., Reinders M., Hubbard T.J., Rogers J., Jonkers J., Wessels L., Adams D.J., van Lohuizen M., Berns A. Large-scale mutagenesis in p19(ARF)- and p53-deficient mice identifies cancer genes and their collaborative networks. Cell. 2008;133:727–741. doi: 10.1016/j.cell.2008.03.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 224.Nakamura T., Largaespada D.A., Shaughnessy J.D., Jr., Jenkins N.A., Copeland N.G. Cooperative activation of Hoxa and Pbx1-related genes in murine myeloid leukaemias. Nat. Genet. 1996;12:149–153. doi: 10.1038/ng0296-149. [DOI] [PubMed] [Google Scholar]
- 225.Kamps M.P., Murre C., Sun X.H., Baltimore D. A new homeobox gene contributes the DNA binding domain of the t(1;19) translocation protein in pre-B ALL. Cell. 1990;60:547–555. doi: 10.1016/0092-8674(90)90658-2. [DOI] [PubMed] [Google Scholar]
- 226.Calvo K.R., Sykes D.B., Pasillas M.P., Kamps M.P. Nup98-HoxA9 immortalizes myeloid progenitors, enforces expression of Hoxa9, Hoxa7 and Meis1, and alters cytokine-specific responses in a manner similar to that induced by retroviral co-expression of Hoxa9 and Meis1. Oncogene. 2002;21:4247–4256. doi: 10.1038/sj.onc.1205516. [DOI] [PubMed] [Google Scholar]
- 227.Lawrence H.J., Rozenfeld S., Cruz C., Matsukuma K., Kwong A., Komuves L., Buchberg A.M., Largman C. Frequent co-expression of the HOXA9 and MEIS1 homeobox genes in human myeloid leukemias. Leukemia. 1999;13:1993–1999. doi: 10.1038/sj.leu.2401578. [DOI] [PubMed] [Google Scholar]
- 228.Wang G.G., Pasillas M.P., Kamps M.P. Persistent transactivation by meis1 replaces hox function in myeloid leukemogenesis models: evidence for co-occupancy of meis1-pbx and hox-pbx complexes on promoters of leukemia-associated genes. Mol. Cell. Biol. 2006;26:3902–3916. doi: 10.1128/MCB.26.10.3902-3916.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 229.de Ridder J., Kool J., Uren A., Bot J., Wessels L., Reinders M. Co-occurrence analysis of insertional mutagenesis data reveals cooperating oncogenes. Bioinformatics. 2007;23:i133–141. doi: 10.1093/bioinformatics/btm202. [DOI] [PubMed] [Google Scholar]
- 230.Largaespada D.A. Genetic heterogeneity in acute myeloid leukemia: maximizing information flow from MuLV mutagenesis studies. Leukemia. 2000;14:1174–1184. doi: 10.1038/sj.leu.2401852. [DOI] [PubMed] [Google Scholar]
- 231.van Lohuizen M., Verbeek S., Scheijen B., Wientjens E., van der Gulden H., Berns A. Identification of cooperating oncogenes in E mu-myc transgenic mice by provirus tagging. Cell. 1991;65:737–752. doi: 10.1016/0092-8674(91)90382-9. [DOI] [PubMed] [Google Scholar]
- 232.Jacobs J.J., Scheijen B., Voncken J.W., Kieboom K., Berns A., van Lohuizen M. Bmi-1 collaborates with c-Myc in tumorigenesis by inhibiting c-Myc-induced apoptosis via INK4a/ARF. Genes Dev. 1999;13:2678–2690. doi: 10.1101/gad.13.20.2678. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 233.van der Lugt N.M., Domen J., Verhoeven E., Linders K., van der Gulden H., Allen J., Berns A. Proviral tagging in E mu-myc transgenic mice lacking the Pim-1 proto-oncogene leads to compensatory activation of Pim-2. EMBO J. 1995;14:2536–2544. doi: 10.1002/j.1460-2075.1995.tb07251.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 234.Zippo A., De Robertis A., Serafini R., Oliviero S. PIM1-dependent phosphorylation of histone H3 at serine 10 is required for MYC-dependent transcriptional activation and oncogenic transformation. Nat. Cell. Biol. 2007;9:932–944. doi: 10.1038/ncb1618. [DOI] [PubMed] [Google Scholar]
- 235.Naud J.F., Eilers M. PIM1 and MYC: a changing relationship? Nat. Cell. Biol. 2007;9:873–875. doi: 10.1038/ncb0807-873. [DOI] [PubMed] [Google Scholar]
- 236.Ellwood-Yen K., Graeber T.G., Wongvipat J., Iruela-Arispe M.L., Zhang J., Matusik R., Thomas G.V., Sawyers C.L. Myc-driven murine prostate cancer shares molecular features with human prostate tumors. Cancer Cell. 2003;4:223–238. doi: 10.1016/s1535-6108(03)00197-1. [DOI] [PubMed] [Google Scholar]
- 237.Castilla L.H., Perrat P., Martinez N.J., Landrette S.F., Keys R., Oikemus S., Flanegan J., Heilman S., Garrett L., Dutra A., Anderson S., Pihan G.A., Wolff L., Liu P.P. Identification of genes that synergize with Cbfb-MYH11 in the pathogenesis of acute myeloid leukemia. Proc. Natl. Acad. Sci. U. S. A. 2004;101:4924–4929. doi: 10.1073/pnas.0400930101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 238.Yamashita M., Emerman M. Retroviral infection of non-dividing cells: old and new perspectives. Virology. 2006;344:88–93. doi: 10.1016/j.virol.2005.09.012. [DOI] [PubMed] [Google Scholar]
- 239.Wang G., Williams G., Xia H., Hickey M., Shao J., Davidson B.L., McCray P.B. Apical barriers to airway epithelial cell gene transfer with amphotropic retroviral vectors. Gene Ther. 2002;9:922–931. doi: 10.1038/sj.gt.3301714. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 240.Kapitonov V.V., Jurka J. A universal classification of eukaryotic transposable elements implemented in Repbase. Nat. Rev., Genet. 2008;9:411–412. doi: 10.1038/nrg2165-c1. author reply 414. [DOI] [PubMed] [Google Scholar]
- 241.Ivics Z., Hackett P.B., Plasterk R.H., Izsvak Z. Molecular reconstruction of Sleeping Beauty, a Tc1-like transposon from fish, and its transposition in human cells. Cell. 1997;91:501–510. doi: 10.1016/s0092-8674(00)80436-5. [DOI] [PubMed] [Google Scholar]
- 242.Collier L.S., Largaespada D.A. Hopping around the tumor genome: transposons for cancer gene discovery. Cancer Res. 2005;65:9607–9610. doi: 10.1158/0008-5472.CAN-05-3085. [DOI] [PubMed] [Google Scholar]
- 243.Izsvak Z., Ivics Z., Plasterk R.H. Sleeping Beauty, a wide host-range transposon vector for genetic transformation in vertebrates. J. Mol. Biol. 2000;302:93–102. doi: 10.1006/jmbi.2000.4047. [DOI] [PubMed] [Google Scholar]
- 244.Geurts A.M., Yang Y., Clark K.J., Liu G., Cui Z., Dupuy A.J., Bell J.B., Largaespada D.A., Hackett P.B. Gene transfer into genomes of human cells by the sleeping beauty transposon system. Mol. Ther. 2003;8:108–117. doi: 10.1016/s1525-0016(03)00099-6. [DOI] [PubMed] [Google Scholar]
- 245.Zayed H., Izsvak Z., Walisko O., Ivics Z. Development of hyperactive sleeping beauty transposon vectors by mutational analysis. Mol. Ther. 2004;9:292–304. doi: 10.1016/j.ymthe.2003.11.024. [DOI] [PubMed] [Google Scholar]
- 246.Collier L.S., Carlson C.M., Ravimohan S., Dupuy A.J., Largaespada D.A. Cancer gene discovery in solid tumours using transposon-based somatic mutagenesis in the mouse. Nature. 2005;436:272–276. doi: 10.1038/nature03681. [DOI] [PubMed] [Google Scholar]
- 247.Dupuy A.J., Akagi K., Largaespada D.A., Copeland N.G., Jenkins N.A. Mammalian mutagenesis using a highly mobile somatic Sleeping Beauty transposon system. Nature. 2005;436:221–226. doi: 10.1038/nature03691. [DOI] [PubMed] [Google Scholar]
- 248.Collier L.S., Largaespada D.A. Transposons for cancer gene discovery: Sleeping Beauty and beyond. Genome Biol. 2007;1(8 Suppl.):S15. doi: 10.1186/gb-2007-8-s1-s15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 249.Dupuy A.J., Fritz S., Largaespada D.A. Transposition and gene disruption in the male germline of the mouse. Genesis. 2001;30:82–88. doi: 10.1002/gene.1037. [DOI] [PubMed] [Google Scholar]
- 250.Yant S.R., Wu X., Huang Y., Garrison B., Burgess S.M., Kay M.A. High-resolution genome-wide mapping of transposon integration in mammals. Mol. Cell. Biol. 2005;25:2085–2094. doi: 10.1128/MCB.25.6.2085-2094.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 251.Carlson C.M., Dupuy A.J., Fritz S., Roberg-Perez K.J., Fletcher C.F., Largaespada D.A. Transposon mutagenesis of the mouse germline. Genetics. 2003;165:243–256. doi: 10.1093/genetics/165.1.243. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 252.Vigdal T.J., Kaufman C.D., Izsvak Z., Voytas D.F., Ivics Z. Common physical properties of DNA affecting target site selection of sleeping beauty and other Tc1/mariner transposable elements. J. Mol. Biol. 2002;323:441–452. doi: 10.1016/s0022-2836(02)00991-9. [DOI] [PubMed] [Google Scholar]
- 253.Yusa K., Takeda J., Horie K. Enhancement of Sleeping Beauty transposition by CpG methylation: possible role of heterochromatin formation. Mol. Cell. Biol. 2004;24:4004–4018. doi: 10.1128/MCB.24.9.4004-4018.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 254.Ikeda R., Kokubu C., Yusa K., Keng V.W., Horie K., Takeda J. Sleeping beauty transposase has an affinity for heterochromatin conformation. Mol. Cell. Biol. 2007;27:1665–1676. doi: 10.1128/MCB.01500-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 255.Geurts A.M., Collier L.S., Geurts J.L., Oseth L.L., Bell M.L., Mu D., Lucito R., Godbout S.A., Green L.E., Lowe S.W., Hirsch B.A., Leinwand L.A., Largaespada D.A. Gene mutations and genomic rearrangements in the mouse as a result of transposon mobilization from chromosomal concatemers. PLoS Genet. 2006;2:e156. doi: 10.1371/journal.pgen.0020156. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 256.Dupuy A.J., Jenkins N.A., Copeland N.G. Sleeping beauty: a novel cancer gene discovery tool. Hum. Mol. Genet. 2006;15(Spec. No. 1):R75–79. doi: 10.1093/hmg/ddl061. [DOI] [PubMed] [Google Scholar]
- 257.Largaespada D.A., Collier L.S. Transposon-mediated mutagenesis in somatic cells: identification of transposon–genomic DNA junctions. Methods Mol. Biol. 2008;435:95–108. doi: 10.1007/978-1-59745-232-8_7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 258.Ding S., Wu X., Li G., Han M., Zhuang Y., Xu T. Efficient transposition of the piggyBac (PB) transposon in mammalian cells and mice. Cell. 2005;122:473–483. doi: 10.1016/j.cell.2005.07.013. [DOI] [PubMed] [Google Scholar]
- 259.Cadinanos J., Bradley A. Generation of an inducible and optimized piggyBac transposon system. Nucleic Acids Res. 2007;35:e87. doi: 10.1093/nar/gkm446. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 260.Wilson M.H., Coates C.J., George A.L., Jr. PiggyBac transposon-mediated gene transfer in human cells. Mol. Ther. 2007;15:139–145. doi: 10.1038/sj.mt.6300028. [DOI] [PubMed] [Google Scholar]
- 261.Pavlopoulos A., Oehler S., Kapetanaki M.G., Savakis C. The DNA transposon Minos as a tool for transgenesis and functional genomic analysis in vertebrates and invertebrates. Genome. Biol. 2007;1(8 Suppl.):S2. doi: 10.1186/gb-2007-8-s1-s2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 262.Zagoraiou L., Drabek D., Alexaki S., Guy J.A., Klinakis A.G., Langeveld A., Skavdis G., Mamalaki C., Grosveld F., Savakis C. In vivo transposition of Minos, a Drosophila mobile element, in mammalian tissues. Proc. Natl. Acad. Sci. U. S. A. 2001;98:11474–11478. doi: 10.1073/pnas.201392398. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 263.Drabek D., Zagoraiou L., deWit T., Langeveld A., Roumpaki C., Mamalaki C., Savakis C., Grosveld F. Transposition of the Drosophila hydei Minos transposon in the mouse germ line. Genomics. 2003;81:108–111. doi: 10.1016/s0888-7543(02)00030-7. [DOI] [PubMed] [Google Scholar]
- 264.Moran J.V., Holmes S.E., Naas T.P., DeBerardinis R.J., Boeke J.D., Kazazian H.H., Jr. High frequency retrotransposition in cultured mammalian cells. Cell. 1996;87:917–927. doi: 10.1016/s0092-8674(00)81998-4. [DOI] [PubMed] [Google Scholar]
- 265.Hohjoh H., Singer M.F. Sequence-specific single-strand RNA binding protein encoded by the human LINE-1 retrotransposon. EMBO J. 1997;16:6034–6043. doi: 10.1093/emboj/16.19.6034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 266.Mathias S.L., Scott A.F., Kazazian H.H., Jr., Boeke J.D., Gabriel A. Reverse transcriptase encoded by a human transposable element. Science. 1991;254:1808–1810. doi: 10.1126/science.1722352. [DOI] [PubMed] [Google Scholar]
- 267.Feng Q., Moran J.V., Kazazian H.H., Jr., Boeke J.D. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87:905–916. doi: 10.1016/s0092-8674(00)81997-2. [DOI] [PubMed] [Google Scholar]
- 268.Lander E.S., Linton L.M., Birren B., Nusbaum C., Zody M.C., Baldwin J., Devon K., Dewar K., Doyle M., FitzHugh W., Funke R., Gage D., Harris K., Heaford A., Howland J., Kann L., Lehoczky J., LeVine R., McEwan P., McKernan K., Meldrim J., Mesirov J.P., Miranda C., Morris W., Naylor J., Raymond C., Rosetti M., Santos R., Sheridan A., Sougnez C., Stange-Thomann N., Stojanovic N., Subramanian A., Wyman D., Rogers J., Sulston J., Ainscough R., Beck S., Bentley D., Burton J., Clee C., Carter N., Coulson A., Deadman R., Deloukas P., Dunham A., Dunham I., Durbin R., French L., Grafham D., Gregory S., Hubbard T., Humphray S., Hunt A., Jones M., Lloyd C., McMurray A., Matthews L., Mercer S., Milne S., Mullikin J.C., Mungall A., Plumb R., Ross M., Shownkeen R., Sims S., Waterston R.H., Wilson R.K., Hillier L.W., McPherson J.D., Marra M.A., Mardis E.R., Fulton L.A., Chinwalla A.T., Pepin K.H., Gish W.R., Chissoe S.L., Wendl M.C., Delehaunty K.D., Miner T.L., Delehaunty A., Kramer J.B., Cook L.L., Fulton R.S., Johnson D.L., Minx P.J., Clifton S.W., Hawkins T., Branscomb E., Predki P., Richardson P., Wenning S., Slezak T., Doggett N., Cheng J.F., Olsen A., Lucas S., Elkin C., Uberbacher E., Frazier M., Gibbs R.A., Muzny D.M., Scherer S.E., Bouck J.B., Sodergren E.J., Worley K.C., Rives C.M., Gorrell J.H., Metzker M.L., Naylor S.L., Kucherlapati R.S., Nelson D.L., Weinstock G.M., Sakaki Y., Fujiyama A., Hattori M., Yada T., Toyoda A., Itoh T., Kawagoe C., Watanabe H., Totoki Y., Taylor T., Weissenbach J., Heilig R., Saurin W., Artiguenave F., Brottier P., Bruls T., Pelletier E., Robert C., Wincker P., Smith D.R., Doucette-Stamm L., Rubenfield M., Weinstock K., Lee H.M., Dubois J., Rosenthal A., Platzer M., Nyakatura G., Taudien S., Rump A., Yang H., Yu J., Wang J., Huang G., Gu J., Hood L., Rowen L., Madan A., Qin S., Davis R.W., Federspiel N.A., Abola A.P., Proctor M.J., Myers R.M., Schmutz J., Dickson M., Grimwood J., Cox D.R., Olson M.V., Kaul R., Raymond C., Shimizu N., Kawasaki K., Minoshima S., Evans G.A., Athanasiou M., Schultz R., Roe B.A., Chen F., Pan H., Ramser J., Lehrach H., Reinhardt R., McCombie W.R., de la Bastide M., Dedhia N., Blocker H., Hornischer K., Nordsiek G., Agarwala R., Aravind L., Bailey J.A., Bateman A., Batzoglou S., Birney E., Bork P., Brown D.G., Burge C.B., Cerutti L., Chen H.C., Church D., Clamp M., Copley R.R., Doerks T., Eddy S.R., Eichler E.E., Furey T.S., Galagan J., Gilbert J.G., Harmon C., Hayashizaki Y., Haussler D., Hermjakob H., Hokamp K., Jang W., Johnson L.S., Jones T.A., Kasif S., Kaspryzk A., Kennedy S., Kent W.J., Kitts P., Koonin E.V., Korf I., Kulp D., Lancet D., Lowe T.M., McLysaght A., Mikkelsen T., Moran J.V., Mulder N., Pollara V.J., Ponting C.P., Schuler G., Schultz J., Slater G., Smit A.F., Stupka E., Szustakowski J., Thierry-Mieg D., Thierry-Mieg J., Wagner L., Wallis J., Wheeler R., Williams A., Wolf Y.I., Wolfe K.H., Yang S.P., Yeh R.F., Collins F., Guyer M.S., Peterson J., Felsenfeld A., Wetterstrand K.A., Patrinos A., Morgan M.J., de Jong P., Catanese J.J., Osoegawa K., Shizuya H., Choi S., Chen Y.J. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]
- 269.Babushok D.V., Ostertag E.M., Courtney C.E., Choi J.M., Kazazian H.H., Jr. L1 integration in a transgenic mouse model. Genome Res. 2006;16:240–250. doi: 10.1101/gr.4571606. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 270.Han J.S., Boeke J.D. A highly active synthetic mammalian retrotransposon. Nature. 2004;429:314–318. doi: 10.1038/nature02535. [DOI] [PubMed] [Google Scholar]
- 271.Bestor T.H. Transposons reanimated in mice. Cell. 2005;122:322–325. doi: 10.1016/j.cell.2005.07.024. [DOI] [PubMed] [Google Scholar]
- 272.An W., Han J.S., Wheelan S.J., Davis E.S., Coombes C.E., Ye P., Triplett C., Boeke J.D. Active retrotransposition by a synthetic L1 element in mice. Proc. Natl. Acad. Sci. U. S. A. 2006;103:18662–18667. doi: 10.1073/pnas.0605300103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 273.Wardrop S.L., Brown M.A. Identification of two evolutionarily conserved and functional regulatory elements in intron 2 of the human BRCA1 gene. Genomics. 2005;86:316–328. doi: 10.1016/j.ygeno.2005.05.006. [DOI] [PubMed] [Google Scholar]
- 274.Gaspar C., Cardoso J., Franken P., Molenaar L., Morreau H., Moslein G., Sampson J., Boer J.M., de Menezes R.X., Fodde R. Cross-species comparison of human and mouse intestinal polyps reveals conserved mechanisms in adenomatous polyposis coli (APC)-driven tumorigenesis. Am. J. Pathol. 2008;172:1363–1380. doi: 10.2353/ajpath.2008.070851. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 275.Arena S., Isella C., Martini M., de Marco A., Medico E., Bardelli A. Knock-in of oncogenic Kras does not transform mouse somatic cells but triggers a transcriptional response that classifies human cancers. Cancer Res. 2007;67:8468–8476. doi: 10.1158/0008-5472.CAN-07-1126. [DOI] [PubMed] [Google Scholar]
- 276.Schlicht M., Matysiak B., Brodzeller T., Wen X., Liu H., Zhou G., Dhir R., Hessner M.J., Tonellato P., Suckow M., Pollard M., Datta M.W. Cross-species global and subset gene expression profiling identifies genes involved in prostate cancer response to selenium. BMC Genomics. 2004;5:58. doi: 10.1186/1471-2164-5-58. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 277.Lam S.H., Wu Y.L., Vega V.B., Miller L.D., Spitsbergen J., Tong Y., Zhan H., Govindarajan K.R., Lee S., Mathavan S., Murthy K.R., Buhler D.R., Liu E.T., Gong Z. Conservation of gene expression signatures between zebrafish and human liver tumors and tumor progression. Nat. Biotechnol. 2006;24:73–75. doi: 10.1038/nbt1169. [DOI] [PubMed] [Google Scholar]
- 278.Waterston R.H., Lindblad-Toh K., Birney E., Rogers J., Abril J.F., Agarwal P., Agarwala R., Ainscough R., Alexandersson M., An P., Antonarakis S.E., Attwood J., Baertsch R., Bailey J., Barlow K., Beck S., Berry E., Birren B., Bloom T., Bork P., Botcherby M., Bray N., Brent M.R., Brown D.G., Brown S.D., Bult C., Burton J., Butler J., Campbell R.D., Carninci P., Cawley S., Chiaromonte F., Chinwalla A.T., Church D.M., Clamp M., Clee C., Collins F.S., Cook L.L., Copley R.R., Coulson A., Couronne O., Cuff J., Curwen V., Cutts T., Daly M., David R., Davies J., Delehaunty K.D., Deri J., Dermitzakis E.T., Dewey C., Dickens N.J., Diekhans M., Dodge S., Dubchak I., Dunn D.M., Eddy S.R., Elnitski L., Emes R.D., Eswara P., Eyras E., Felsenfeld A., Fewell G.A., Flicek P., Foley K., Frankel W.N., Fulton L.A., Fulton R.S., Furey T.S., Gage D., Gibbs R.A., Glusman G., Gnerre S., Goldman N., Goodstadt L., Grafham D., Graves T.A., Green E.D., Gregory S., Guigo R., Guyer M., Hardison R.C., Haussler D., Hayashizaki Y., Hillier L.W., Hinrichs A., Hlavina W., Holzer T., Hsu F., Hua A., Hubbard T., Hunt A., Jackson I., Jaffe D.B., Johnson L.S., Jones M., Jones T.A., Joy A., Kamal M., Karlsson E.K., Karolchik D., Kasprzyk A., Kawai J., Keibler E., Kells C., Kent W.J., Kirby A., Kolbe D.L., Korf I., Kucherlapati R.S., Kulbokas E.J., Kulp D., Landers T., Leger J.P., Leonard S., Letunic I., Levine R., Li J., Li M., Lloyd C., Lucas S., Ma B., Maglott D.R., Mardis E.R., Matthews L., Mauceli E., Mayer J.H., McCarthy M., McCombie W.R., McLaren S., McLay K., McPherson J.D., Meldrim J., Meredith B., Mesirov J.P., Miller W., Miner T.L., Mongin E., Montgomery K.T., Morgan M., Mott R., Mullikin J.C., Muzny D.M., Nash W.E., Nelson J.O., Nhan M.N., Nicol R., Ning Z., Nusbaum C., O'Connor M.J., Okazaki Y., Oliver K., Overton-Larty E., Pachter L., Parra G., Pepin K.H., Peterson J., Pevzner P., Plumb R., Pohl C.S., Poliakov A., Ponce T.C., Ponting C.P., Potter S., Quail M., Reymond A., Roe B.A., Roskin K.M., Rubin E.M., Rust A.G., Santos R., Sapojnikov V., Schultz B., Schultz J., Schwartz M.S., Schwartz S., Scott C., Seaman S., Searle S., Sharpe T., Sheridan A., Shownkeen R., Sims S., Singer J.B., Slater G., Smit A., Smith D.R., Spencer B., Stabenau A., Stange-Thomann N., Sugnet C., Suyama M., Tesler G., Thompson J., Torrents D., Trevaskis E., Tromp J., Ucla C., Ureta-Vidal A., Vinson J.P., Von Niederhausern A.C., Wade C.M., Wall M., Weber R.J., Weiss R.B., Wendl M.C., West A.P., Wetterstrand K., Wheeler R., Whelan S., Wierzbowski J., Willey D., Williams S., Wilson R.K., Winter E., Worley K.C., Wyman D., Yang S., Yang S.P., Zdobnov E.M., Zody M.C., Lander E.S. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–562. doi: 10.1038/nature01262. [DOI] [PubMed] [Google Scholar]
- 279.Hackett C.S., Hodgson J.G., Law M.E., Fridlyand J., Osoegawa K., de Jong P.J., Nowak N.J., Pinkel D., Albertson D.G., Jain A., Jenkins R., Gray J.W., Weiss W.A. Genome-wide array CGH analysis of murine neuroblastoma reveals distinct genomic aberrations which parallel those in human tumors. Cancer Res. 2003;63:5266–5273. [PubMed] [Google Scholar]
- 280.Cheng A.J., Cheng N.C., Ford J., Smith J., Murray J.E., Flemming C., Lastowska M., Jackson M.S., Hackett C.S., Weiss W.A., Marshall G.M., Kees U.R., Norris M.D., Haber M. Cell lines from MYCN transgenic murine tumours reflect the molecular and biological characteristics of human neuroblastoma. Eur. J. Cancer. 2007;43:1467–1475. doi: 10.1016/j.ejca.2007.03.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 281.Urzua U., Frankenberger C., Gangi L., Mayer S., Burkett S., Munroe D.J. Microarray comparative genomic hybridization profile of a murine model for epithelial ovarian cancer reveals genomic imbalances resembling human ovarian carcinomas. Tumour Biol. 2005;26:236–244. doi: 10.1159/000087378. [DOI] [PubMed] [Google Scholar]
- 282.O'Hagan R.C., Chang S., Maser R.S., Mohan R., Artandi S.E., Chin L., DePinho R.A. Telomere dysfunction provokes regional amplification and deletion in cancer genomes. Cancer Cell. 2002;2:149–155. doi: 10.1016/s1535-6108(02)00094-6. [DOI] [PubMed] [Google Scholar]
- 283.Zender L., Spector M.S., Xue W., Flemming P., Cordon-Cardo C., Silke J., Fan S.T., Luk J.M., Wigler M., Hannon G.J., Mu D., Lucito R., Powers S., Lowe S.W. Identification and validation of oncogenes in liver cancer using an integrative oncogenomic approach. Cell. 2006;125:1253–1267. doi: 10.1016/j.cell.2006.05.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 284.Kim M., Gans J.D., Nogueira C., Wang A., Paik J.H., Feng B., Brennan C., Hahn W.C., Cordon-Cardo C., Wagner S.N., Flotte T.J., Duncan L.M., Granter S.R., Chin L. Comparative oncogenomics identifies NEDD9 as a melanoma metastasis gene. Cell. 2006;125:1269–1281. doi: 10.1016/j.cell.2006.06.008. [DOI] [PubMed] [Google Scholar]
- 285.Tomlins S.A., Chinnaiyan A.M. Of mice and men: cancer gene discovery using comparative oncogenomics. Cancer Cell. 2006;10:2–4. doi: 10.1016/j.ccr.2006.06.013. [DOI] [PubMed] [Google Scholar]
- 286.Lee J.S., Chu I.S., Mikaelyan A., Calvisi D.F., Heo J., Reddy J.K., Thorgeirsson S.S. Application of comparative functional genomics to identify best-fit mouse models to study human cancer. Nat. Genet. 2004;36:1306–1311. doi: 10.1038/ng1481. [DOI] [PubMed] [Google Scholar]
- 287.Maser R.S., Choudhury B., Campbell P.J., Feng B., Wong K.K., Protopopov A., O'Neil J., Gutierrez A., Ivanova E., Perna I., Lin E., Mani V., Jiang S., McNamara K., Zaghlul S., Edkins S., Stevens C., Brennan C., Martin E.S., Wiedemeyer R., Kabbarah O., Nogueira C., Histen G., Aster J., Mansour M., Duke V., Foroni L., Fielding A.K., Goldstone A.H., Rowe J.M., Wang Y.A., Look A.T., Stratton M.R., Chin L., Futreal P.A., DePinho R.A. Chromosomally unstable mouse tumours have genomic alterations similar to diverse human cancers. Nature. 2007;447:966–971. doi: 10.1038/nature05886. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 288.Erkeland S.J., Verhaak R.G., Valk P.J., Delwel R., Lowenberg B., Touw I.P. Significance of murine retroviral mutagenesis for identification of disease genes in human acute myeloid leukemia. Cancer Res. 2006;66:622–626. doi: 10.1158/0008-5472.CAN-05-2908. [DOI] [PubMed] [Google Scholar]
- 289.Aguirre A.J., Brennan C., Bailey G., Sinha R., Feng B., Leo C., Zhang Y., Zhang J., Gans J.D., Bardeesy N., Cauwels C., Cordon-Cardo C., Redston M.S., DePinho R.A., Chin L. High-resolution characterization of the pancreatic adenocarcinoma genome. Proc. Natl. Acad. Sci. U. S. A. 2004;101:9067–9072. doi: 10.1073/pnas.0402932101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 290.Degenhardt Y.Y., Wooster R., McCombie R.W., Lucito R., Powers S. High-content analysis of cancer genome DNA alterations. Curr. Opin. Genet. Dev. 2008;18:68–72. doi: 10.1016/j.gde.2008.01.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 291.Wendel H.G., Silva R.L., Malina A., Mills J.R., Zhu H., Ueda T., Watanabe-Fukunaga R., Fukunaga R., Teruya-Feldstein J., Pelletier J., Lowe S.W. Dissecting eIF4E action in tumorigenesis. Genes Dev. 2007;21:3232–3237. doi: 10.1101/gad.1604407. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 292.Shvarts A., Brummelkamp T.R., Scheeren F., Koh E., Daley G.Q., Spits H., Bernards R. A senescence rescue screen identifies BCL6 as an inhibitor of anti-proliferative p19(ARF)-p53 signaling. Genes Dev. 2002;16:681–686. doi: 10.1101/gad.929302. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 293.Bussow K., Quedenau C., Sievert V., Tischer J., Scheich C., Seitz H., Hieke B., Niesen F.H., Gotz F., Harttig U., Lehrach H. A catalog of human cDNA expression clones and its application to structural genomics. Genome Biol. 2004;5:R71. doi: 10.1186/gb-2004-5-9-r71. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 294.Keng V.W., Villanueva A., Chiang D.Y., Dupuy A.J., Ryan B.J., Matise I., Silverstein K.A., Sarver A., Starr T.K., Akagi K., Tessarollo L., Collier L.S., Powers S., Lowe S.W., Jenkins N.A., Copeland N.G., Llovet J.M., Largaespada D.A. A conditional transposon-based insertional mutagenesis screen for genes associated with mouse hepatocellular carcinoma. Nat. Biotechnol. 2009;27:264–274. doi: 10.1038/nbt.1526. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 295.Wiesner S.M., Decker S.A., Larson J.D., Ericson K., Forster C., Gallardo J.L., Long C., Demorest Z.L., Zamora E.A., Low W.C., SantaCruz K., Largaespada D.A., Ohlfest J.R. De novo induction of genetically engineered brain tumors in mice using plasmid DNA. Cancer Res. 2009;69:431–439. doi: 10.1158/0008-5472.CAN-08-1800. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 296.Su Q., Prosser H.M., Campos L.S., Ortiz M., Nakamura T., Warren M., Dupuy A.J., Jenkins N.A., Copeland N.G., Bradley A., Liu P. A DNA transposon-based approach to validate oncogenic mutations in the mouse. Proc. Natl. Acad. Sci. U. S. A. 2008;105:19904–19909. doi: 10.1073/pnas.0807785105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 297.Masui S., Shimosato D., Toyooka Y., Yagi R., Takahashi K., Niwa H. An efficient system to establish multiple embryonic stem cell lines carrying an inducible expression unit. Nucleic Acids Res. 2005;33:e43. doi: 10.1093/nar/gni043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 298.Steuber-Buchberger P., Wurst W., Kuhn R. Simultaneous Cre-mediated conditional knockdown of two genes in mice. Genesis. 2008;46:144–151. doi: 10.1002/dvg.20376. [DOI] [PubMed] [Google Scholar]
- 299.Forster A., Pannell R., Drynan L.F., Codrington R., Daser A., Metzler M., Lobato M.N., Rabbitts T.H. The invertor knock-in conditional chromosomal translocation mimic. Nat. Methods. 2005;2:27–30. doi: 10.1038/nmeth727. [DOI] [PubMed] [Google Scholar]
- 300.Metzler M., Forster A., Pannell R., Arends M.J., Daser A., Lobato M.N., Rabbitts T.H. A conditional model of MLL-AF4 B-cell tumourigenesis using invertor technology. Oncogene. 2006;25:3093–3103. doi: 10.1038/sj.onc.1209636. [DOI] [PubMed] [Google Scholar]
- 301.Moffat J., Grueneberg D.A., Yang X., Kim S.Y., Kloepfer A.M., Hinkle G., Piqani B., Eisenhaure T.M., Luo B., Grenier J.K., Carpenter A.E., Foo S.Y., Stewart S.A., Stockwell B.R., Hacohen N., Hahn W.C., Lander E.S., Sabatini D.M., Root D.E. A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell. 2006;124:1283–1298. doi: 10.1016/j.cell.2006.01.040. [DOI] [PubMed] [Google Scholar]
- 302.Root D.E., Hacohen N., Hahn W.C., Lander E.S., Sabatini D.M. Genome-scale loss-of-function screening with a lentiviral RNAi library. Nat. Methods. 2006;3:715–719. doi: 10.1038/nmeth924. [DOI] [PubMed] [Google Scholar]
- 303.Austin C.P., Battey J.F., Bradley A., Bucan M., Capecchi M., Collins F.S., Dove W.F., Duyk G., Dymecki S., Eppig J.T., Grieder F.B., Heintz N., Hicks G., Insel T.R., Joyner A., Koller B.H., Lloyd K.C., Magnuson T., Moore M.W., Nagy A., Pollock J.D., Roses A.D., Sands A.T., Seed B., Skarnes W.C., Snoddy J., Soriano P., Stewart D.J., Stewart F., Stillman B., Varmus H., Varticovski L., Verma I.M., Vogt T.F., von Melchner H., Witkowski J., Woychik R.P., Wurst W., Yancopoulos G.D., Young S.G., Zambrowicz B. The knockout mouse project. Nat. Genet. 2004;36:921–924. doi: 10.1038/ng0904-921. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 304.Auwerx J., Avner P., Baldock R., Ballabio A., Balling R., Barbacid M., Berns A., Bradley A., Brown S., Carmeliet P., Chambon P., Cox R., Davidson D., Davies K., Duboule D., Forejt J., Granucci F., Hastie N., de Angelis M.H., Jackson I., Kioussis D., Kollias G., Lathrop M., Lendahl U., Malumbres M., von Melchner H., Muller W., Partanen J., Ricciardi-Castagnoli P., Rigby P., Rosen B., Rosenthal N., Skarnes B., Stewart A.F., Thornton J., Tocchini-Valentini G., Wagner E., Wahli W., Wurst W. The European dimension for the mouse genome mutagenesis program. Nat. Genet. 2004;36:925–927. doi: 10.1038/ng0904-925. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 305.Zender L., Xue W., Zuber J., Semighini C.P., Krasnitz A., Ma B., Zender P., Kubicka S., Luk J.M., Schirmacher P., McCombie W.R., Wigler M., Hicks J., Hannon G.J., Powers S., Lowe S.W. An oncogenomics-based in vivo RNAi screen identifies tumor suppressors in liver cancer. Cell. 2008;135:852–864. doi: 10.1016/j.cell.2008.09.061. [DOI] [PMC free article] [PubMed] [Google Scholar]




