Abstract
Forward genetic analysis is an unbiased approach for identifying genes essential to defined biological phenomena. When applied to mice, it is one of the most powerful methods to facilitate understanding of the genetic basis of human biology and disease. The speed at which disease-causing mutations can be identified in mutagenized mice has been markedly increased by recent advances in DNA sequencing technology. Creating and analyzing mutant phenotypes may therefore become rate-limiting in forward genetic experimentation. We review the forward genetic approach and its future in the context of recent technological advances, in particular massively parallel DNA sequencing, induced pluripotent stem cells, and haploid embryonic stem cells.
Beginning with the mechanistic assertion that living organisms are biological machines, it is incumbent on biologists to understand the workings of these machines. Acquiring a list of parts (proteins) with determinative importance in any particular biological function is often the first step in such understanding. Genetics can provide that list, and it is left to other disciplines (biochemistry, cell biology, and structural biology) to determine how the parts are shaped, where they reside within cells, and how they fit together.
Genetic studies in a physiological setting generally require a model organism that can be experimentally manipulated in a controlled environment. The laboratory mouse serves as the premier model system for study of mammalian biology on a molecular level. Ninety-nine percent of human genes have homologues in mice (ie, 99% of human and mouse genes have a shared ancestry), and 80% have orthologs (ie, 80% of human and mouse genes with shared ancestry have remained intact and unduplicated since their last common ancestor); in addition, 90% of the mouse genome exists in segments in which the gene order has been conserved with that in the human genome.1,2 Thus, discoveries made using mice usually have corresponding implications in humans. Importantly, mouse geneticists have acquired many robust tools with which to probe the mouse genome. Mice have been inbred to produce hundreds of strains that are homozygous at all loci. Most mouse genes have been inactivated (at least in embryonic stem cells, with a smaller fraction also in mice) by gene targeting or gene trapping,3,4 and almost all genes will eventually be inactivated by these and other methods, notably chemical mutagenesis.5 A finished genome sequence has been established and annotated for one strain (C57BL/6J),1,2 and the genomes of other inbred strains have been sequenced and annotated in part.6–8 Finally, techniques for the manipulation of the mouse genome and/or mouse embryos to create chimeric, transgenic, knockout, knockin, or conditionally mutant mice with targeted gain- or loss-of-function mutations are well established and commonly available.9–11 More recently, induced pluripotent stem cells (iPSCs) and haploid embryonic stem cells (ESCs) have been described12–14; the utility of these new cell types to genetic research is likely to be great.
Since genetics was established as a science >100 years ago, geneticists have begun their work by finding exceptions to normal function (phenotypes). In recent decades, mouse geneticists have been able to identify the genetic changes responsible for individual phenotypes. At one time, this was a daunting task. However, the process has accelerated sharply during the past few years, empowered by advances in chemistry, engineering, and computational biology. In some cases, the molecular cause of a newly observed phenotype can now be established within days or weeks. The most important recent technological advance in mouse genetics has been the development of massively parallel DNA sequencing, which enables rapid and cost-effective interrogation of whole genomes or exomes. In this review, we discuss the forward genetic approach as applied to mice, discoveries it has delivered, and the future of forward genetics in the context of recent technological advances.
The Forward Genetic Approach
Reverse genetics begins with a known gene and experimentally investigates the effects of altering the sequence or expression of that gene.15 The reverse genetic approach can yield deep understanding of the function of individual genes but is limited by hypotheses about the phenotypic outcome of the targeted genetic alteration.16 In contrast, the forward genetic approach begins with a particular biological phenomenon or characteristic and asks, “Which genes are necessary to support this phenomenon?”17,18 Mice that show a variant phenotype are found or created using a random process, and the mutational cause is determined by mapping and then positional cloning. Genes implicated in this way can reveal the biological basis of the phenomenon. Gradations of phenotype that result from mutations in different genes may suggest which genes are essential and which have more peripheral roles in a given process. By far the most important advantage of the forward genetic approach is the unbiased nature of inquiry, which requires no hypotheses regarding the molecular basis of the phenotype in question. Because of this, forward genetics has led to many new and unexpected discoveries. In the field of immunology, Toll-like receptor (TLR)-4 was identified as the sensor of lipopolysaccharide by this approach.19 So too was Foxp3 revealed as a transcription factor essential for the development of regulatory T cells.20,21
A steady supply of phenotypic variation (phenovariance) is necessary for laboratories intending to perform forward genetic studies, and two main sources meet this need. Spontaneous mutations resulting in naturally occurring phenovariance among divergent mouse strains provide abundant material for study.6 Because distantly related mouse strains are distinguished by millions of nucleotide differences, they are particularly suited to the study of complex traits. Complex traits are those in which genetic differences at multiple loci produce a defined phenotype. Natural variation in quantitative traits is also commonly found among divergent strains of mice,22 which may serve as a useful model of quantitative traits in humans. However, because of the numerous differences between mouse strains [even within a small (approximately 100-kbp) genomic interval], causative mutations are more difficult to pinpoint accurately, which means that correct identification requires more refined mapping of phenotypes at greater expense in terms of time and resources. To overcome this challenge, the Collaborative Cross aims to capture the rich natural variation present in eight diverse inbred strains of mice by intercrossing and then inbreeding them to create >600 individual inbred strains that will be systematically genotyped and phenotyped.23
Spontaneous mutations within a strain also cause phenovariance between individuals; agouti (A)24 and obese (ob)25 were such mutations and were identified by mapping and positional cloning. Spontaneous mutations, however, arise at a rate insufficient for most forward genetic laboratories and cannot be relied on as a source of phenovariance.
Alternatively, random mutagenesis of the germline can be performed to create mutations in mice of uniform strain background, and mutant offspring carrying heterozygous or homozygous mutations can then be systematically screened for phenotypes of interest. Random mutagenesis and phenotypic screening have been applied extensively and with great success to Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Arabidopsis thaliana, validating the utility of this method. The method is equally applicable to mice, albeit more difficult because of the greater expense of their care and the relatively large size of the mouse genome. Indeed, we and others have successfully used this strategy in mice to uncover genes with previously unknown and essential functions in antibody responses to immunization (Zeb1, Ruvbl2, Nfkbid),26 B lymphopoiesis (Atp11c),27,28 T-cell signaling (Card11),29 TLR signaling (Cd36, Unc93b1, Ticam1),30–32 systemic autoimmunity (Rc3h1),33 and intestinal homeostasis (Yipf6),34 discoveries that might not have been made using a hypothesis-driven approach.
The phenomenon chosen for study should be important in the eyes of the investigator, and the phenotypic screen must be robust. A screen that produces many false-positive results will sap resources, whereas a screen that is insensitive will miss many of the key genes responsible for the phenomenon in question. If narrow in scope, a screen may yield few hits, but it is possible that these hits will be readily comprehensible. If broad in scope, many hits may result, but mechanistic interpretation may be difficult. Functional redundancy limits the power of forward genetics and at times can only be assessed by trial and error. One guide to feasibility is the presence of interstrain or interspecies phenovariance. If it exists, it is at least possible that single mutations may also cause phenovariance.
ENU Mutagenesis
Phenotypic variation can be generated using germline mutagenesis, induced with chemicals,35,36 radiation,37,38 or transposons.39 The chemical N-ethyl-N-nitrosourea (ENU) is a powerful mutagen for mouse spermatogonial cells40 and is currently the compound most widely used for mutagenesis of mice. The mutagenic action of ENU involves the transfer of the ethyl group of ENU to a nucleophilic nitrogen or oxygen in DNA.41,42 The resulting adducts cause mispairing and bp substitutions during replication.36 Most ENU-induced mutations are single-bp substitutions that cause missense errors, splice site errors, and nonsense mutations, in order of declining frequency.43–45 Interestingly, some mutations induced by ENU display asymmetry in their frequency of occurrence within the sense versus antisense strand.43,46 Substitutions involving T/A pairs (A/T to T/A and A/T to G/C) are the most common ENU-induced changes, but T to A occurs more frequently in the sense strand than in the antisense strand. A similar strand asymmetry is observed for T to C transitions. Neither instance of asymmetry can be explained by the frequency of the target nucleotide within sense versus antisense strand.
When administered to male C57BL/6J mice at the most common dosage (three weekly i.p. injections at 90 mg/kg), ENU creates an average of 60 coding changes per sperm,5 which would correspond to approximately one mutation per 700,000 bp of target DNA sequence (slightly more than previous estimates of one mutation per 1 to 2.7 Mbp of genomic sequence for the frequency of ENU-induced mutations).46–48 Given the types and frequency of ENU-induced mutations, observed phenotypes are almost always of monogenic origin. Moreover, when a phenotype is observed, it almost always results from structural changes in proteins rather than from effects on cis-acting regulatory elements that govern the amount or location of protein synthesis.43,49 Other mutagens, such as X-rays38 or chlorambucil,50,51 typically induce large-scale genomic alterations, including translocations, inversions, and deletions, and may be better suited for analysis of regulatory regions.
Methods for mutagenizing male mice and inbreeding them to recover homozygous mutations have been described previously.52–54 In our laboratory, we use a breeding scheme in which a mutagenized C57BL/6J male (G0) is crossed to a G0′ female derived from an independent ENU-mutagenized male (Figure 1). G1 mice are either crossed to C57BL/6J females or intercrossed to yield G2 mice, and G3 mice carrying homozygous mutations are derived from G2 intercrosses.54 Based on a mutation rate of 1.4 per Mbp of DNA, an estimated 90 mutations are expected to result in nonsynonymous coding change in each G1 mouse generated using this breeding scheme. On average, each G3 mouse contains a total of 45 mutations, of which about six are homozygous and approximately 39 are heterozygous.
Figure 1.
Inbreeding protocol for generating G3 mice with homozygous ENU-induced mutations. Mutagenized G0 males are bred to G0′ females, which carry germline mutations derived from other mutagenized males. G1 mice are either intercrossed as shown here or crossed to wild-type C57BL/6J females. Siblings in the G2 generation are intercrossed. G3 mice are subjected to screening. Small asterisks represent mutations derived from the G0 male (red) and G0′ female (blue); large asterisks indicate initial germline transmission of the mutation.
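The Mendelian bookkeeping behind the breeding scheme in Figure 1 can be sketched in a few lines of Python. This is a simplified illustration, not the authors' calculation: it assumes that each G1 mouse carries 90 unlinked heterozygous mutations, that the G1 mouse is crossed to a wild-type C57BL/6J female, and that two G2 siblings are then intercrossed. The function names are hypothetical.

```python
from fractions import Fraction


def g3_genotype_probabilities():
    """Per-mutation genotype probabilities in a G3 mouse.

    Assumption: the mutation is heterozygous in the G1 founder, the G1 is
    crossed to a wild-type female, and two G2 siblings are intercrossed.
    """
    half = Fraction(1, 2)
    # Each G2 sibling inherits the mutant allele from the G1 with p = 1/2.
    p_both_carriers = half * half
    p_one_carrier = 2 * half * half
    # carrier x carrier: 1/4 homozygous, 1/2 heterozygous offspring
    # carrier x noncarrier: 1/2 heterozygous offspring
    p_hom = p_both_carriers * Fraction(1, 4)
    p_het = p_both_carriers * half + p_one_carrier * half
    return p_hom, p_het


def expected_g3_counts(mutations_per_g1):
    """Expected numbers of homozygous and heterozygous mutations per G3 mouse."""
    p_hom, p_het = g3_genotype_probabilities()
    return float(mutations_per_g1 * p_hom), float(mutations_per_g1 * p_het)


p_hom, p_het = g3_genotype_probabilities()
hom, het = expected_g3_counts(90)
print(f"P(homozygous) = {p_hom}, P(heterozygous) = {p_het}")
print(f"expected homozygous = {hom}, expected heterozygous = {het}")
```

Under these assumptions each G1 mutation becomes homozygous in a given G3 mouse with probability 1/16, so 90 mutations give an expectation of roughly 5.6 homozygous mutations, in line with the figure of about six quoted above. The heterozygous expectation from this simplified model need not coincide with the quoted value, which depends on the exact breeding path followed.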
Mutation Mapping as It Once Was
Although simple in principle, the process of mapping and identifying a mutation responsible for a given phenotype was for many years the greatest challenge of forward genetics. The process began with genetic mapping, which entailed outcrossing affected mice to a different inbred strain (the mapping strain) and backcrossing or intercrossing the F1 hybrid progeny. When a recessive mutation was at issue, F2 mice were tested for phenotype and then genotyped individually at markers that were distributed across the genome and that distinguished the mutant and mapping strains. Concordance between the mutant phenotype and homozygosity for one or more markers of the mutant strain suggested linkage of a particular chromosomal region with the mutant phenotype. The genotyping of additional markers within the region of linkage made it possible to localize the mutation to a smaller chromosomal region. This process of fine mapping defined a proximal and a distal marker, each separated from the mutation by one or more crossover events, and thus delimited the proximal and distal boundaries of the critical region. In this way, a refined critical region, perhaps 1 to 3 Mb in length, could be established, and all coding exons within it would be PCR amplified and sequenced directly. On average, 1 Mb of genomic DNA contains 8.7 genes, and each gene contains 8.8 exons. However, because genes are not uniformly distributed but instead are clustered within the genome, many more genes (and coding exons) might reside within a 1-Mb critical region. The entire mapping process, but especially fine mapping, required extensive breeding. Sometimes thousands of meioses were used to narrow a gene-rich critical region to <1 Mb. In some cases, if a gene happened to reside within a region of the genome that was resistant to meiotic recombination, the critical region might remain much larger: 5 Mb or longer.
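To give a sense of the sequencing burden implied by these averages, the following Python sketch multiplies the quoted gene and exon densities by the size of the critical region. It is only a back-of-the-envelope estimate: it assumes the genome-wide averages hold within the region, which the text notes is often not the case for gene-rich intervals. The function name is hypothetical.

```python
GENES_PER_MB = 8.7     # average gene density quoted in the text
EXONS_PER_GENE = 8.8   # average exon count per gene quoted in the text


def expected_exons(region_mb, genes_per_mb=GENES_PER_MB,
                   exons_per_gene=EXONS_PER_GENE):
    """Expected number of coding exons to amplify in a critical region."""
    return region_mb * genes_per_mb * exons_per_gene


for size_mb in (1, 3, 5):
    print(f"{size_mb} Mb critical region: ~{round(expected_exons(size_mb))} exons")
```

At average density, a 1-Mb region would hold roughly 77 exons, each requiring at least one amplicon from both mutant and wild-type DNA; a gene-rich region could require several times as many.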
Before the publication of an annotated mouse genome sequence, the content of the genome, with respect to both genes and markers, remained largely mysterious. Accordingly, a physical mapping step was applied, in which all DNA within a critical region was cloned into bacterial artificial chromosomes or yeast artificial chromosomes, which were then surveyed for genes and polymorphic markers (usually simple sequence repeats). Several methods were used to identify genes within unannotated DNA sequences. Exon trapping was performed by cloning fragments of genomic DNA into vectors that would be expressed in mammalian cells, given the presence of functional splice junctions. cDNA cloning was used to retrieve and identify the expressed fragments, which were then sequenced, and designated as putative exons. Hybridization selection was a second technique, in which cDNA was allowed to hybridize with immobilized genomic DNA and then eluted, cloned, and sequenced to pinpoint expressed genes within the genome. Computational algorithms such as GRAIL were also used to predict the existence of genes within unannotated stretches of genomic DNA, assembled from shotgun sequencing of bacterial artificial chromosomes.
Expressed sequence tags, derived from the random sequencing of millions of cDNA molecules from many different tissues, were a breakthrough55 that permitted much more rapid and reliable identification of genes. Genomic DNA could be sequenced and then analyzed by BLAST (basic local alignment search tool) against an expressed sequence tag database to determine whether it contained elements with strong homology to expressed sequences. However, because expressed sequence tag libraries were not comprehensive, it was necessary to perform BLAST analyses recurrently (often at monthly intervals) to be sure that one did not overlook newly discoverable genes. Moreover, homologs and exact matches needed to be treated as candidate genes, notwithstanding the fact that most of them were eventually discarded as pseudogenes. Finally, for every authentic gene within the critical region, thorough sequencing was performed using both wild-type and mutant genomic DNA as templates to identify the causative mutation.
DNA sequencing to find and then examine genes in the critical region was often the rate-limiting step in positional cloning. Before the introduction of massively parallel sequencing technology, DNA sequencers used the Sanger sequencing method, which relied on DNA chain termination on incorporation of dideoxynucleotide analogs. Initially, radionuclide detection was used, necessitating manual reading of sequencing ladders on photographic film. Later, fluorescence detection supplanted radionuclide detection,56 but manual tracking of sequences generated within slab gels was still required, and only a few tens of thousands of bases could be read per day, even with several sequencing machines. Capillaries then supplanted slab gels as a separation medium,57 leading to greater automation, and as many as 1 million bp of sequence could be captured per machine per day as these machines reached their highest level of sophistication.
Altogether, identifying a mutation might occupy a large and competent laboratory for 5 to 8 years. The availability of a trusted reference sequence of the mouse genome usually eliminated the need to perform physical mapping and the need to identify candidate genes within a critical region. However, the time and resources needed to map a mutation to a chromosomal interval small enough for sequencing remained substantial. In early postgenomic times, identifying a mutation might typically take an entire year (although several mutations might be pursued simultaneously) (Figure 2).
Figure 2.
Workflow typical in 2005, 2008, and currently for identifying a mutation responsible for a variant phenotype. Estimated time requirements are indicated for each major step. Left panel: Around 2005, identifying a mutation causative for a particular phenotype by genetic mapping and capillary sequencing of critical region coding sequences commonly required approximately 1 year. Center panel: More efficient mapping by bulk segregation analysis (BSA) and massively parallel genome sequencing were implemented for mutation finding beginning around 2008 and reduced the time needed to find a causative mutation by approximately 55% (counting the time from confirmation of transmissibility). Right panel: Identification of causative mutations without genetic mapping has recently been demonstrated, made possible by the high accuracy of massively parallel sequencers and the use of multiple data filters to exclude false-positive mutation calls. This process results in the identification of only a few mutations per strain, which can be tested for linkage with the affected phenotype. Rapid mutation finding by BSA and/or massively parallel sequencing permits the simultaneous investigation of many more phenotypes than was previously possible. In every case, confirmation of causality depends on knowledge of the effect of a second mutant allele or on transgenic rescue of the mutant phenotype with the wild-type allele.
Seeking to automate the process of exploring critical regions,54 we wrote software to design primers flanking the coding region to be amplified within any part of the genome. A robot was then programmed to mix primers and DNA templates (from mutant and wild-type mice) for PCR amplification. Purification and mixing of PCR products with sequencing primers and reagents were also performed robotically in 96- or 384-well plate format. Sequencing of 96 samples could proceed in parallel, yielding 50 to 60 kbp of sequence per plate and about half a million bp of sequence per day. The resulting sequence trace files were analyzed with Phred version 0.020425.c (University of Washington, Seattle, WA) and Phrap version 0.990319 (University of Washington),58–60 programs that call bases from sequence chromatograms and compare and align multiple reads with a reference sequence, also giving each peak a score reflecting the quality of the read. Software was used to identify discrepancies between mutant and wild-type sequences or, if sequencing failed, to redesign new primers for a repeat of the process. Although these tools substantially accelerated the mutation identification process, massively parallel sequencing soon surpassed them in efficiency and utility.
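The per-base quality score mentioned above is conventionally expressed on the Phred scale, in which a score Q corresponds to an estimated error probability P through Q = -10 log10(P). The following Python sketch illustrates that conversion and a simple quality mask. It is not the Phred program itself; the function names and the threshold of 20 are illustrative assumptions.

```python
import math


def phred_to_error_probability(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score corresponding to an estimated error probability p."""
    return -10 * math.log10(p)


def low_quality_positions(qualities, threshold=20):
    """Indices of base calls whose score falls below the chosen threshold."""
    return [i for i, q in enumerate(qualities) if q < threshold]


for q in (10, 20, 30):
    print(f"Q{q}: error probability {phred_to_error_probability(q):.4f}")

# Hypothetical per-base scores for a four-base stretch of a read.
print(low_quality_positions([30, 12, 40, 8]))
```

A score of 20 thus corresponds to one expected error in 100 base calls and a score of 30 to one in 1000. Discrepancies between mutant and wild-type reads that fall at low-scoring positions are natural candidates for resequencing rather than for follow-up as mutations.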
Massively Parallel Sequencing
Among the first published massively parallel sequencing technologies was massively parallel signature sequencing, described in an article appearing in 2000,61 although conceptual development of early methodologies to support large-scale DNA sequencing was underway beginning in the early 1990s.62–65 The first massively parallel sequencing instrument introduced to the commercial market was the GS-20 produced by 454 Life Sciences (now 454 Life Sciences, a Roche company, Branford, CT) in 2005, which was followed shortly thereafter by the Solexa (now Illumina, Inc., San Diego, CA) Genome Analyzer in 2006 and the Applied Biosystems (now Life Technologies, Carlsbad, CA) SOLiD sequencer in 2007. These machines were capable of producing hundreds of megabase pairs of sequence per day, compared with hundreds of kilobase pairs of sequence per day in capillary (Sanger reaction–based) sequencers. Cost per bp of sequence was reduced to a fraction of the cost per bp in capillary sequencers, whereas sequencing accuracy was reported to be upward of 98.5% per bp and much enhanced by deep coverage of a specific target region.66
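The way deep coverage enhances accuracy can be illustrated with a simple binomial model. The Python sketch below is an assumption-laden illustration rather than a description of any vendor's base caller: it takes the per-base error rate as 1.5% (the complement of the 98.5% accuracy quoted above), treats reads as independent, and conservatively assumes that all erroneous reads agree on the same wrong base. The function name is hypothetical.

```python
from math import comb


def majority_error_probability(depth, per_read_error=0.015):
    """Probability that more than half of `depth` independent reads are
    wrong at one position, so that a majority vote calls the wrong base."""
    threshold = depth // 2 + 1
    return sum(
        comb(depth, k) * per_read_error ** k * (1 - per_read_error) ** (depth - k)
        for k in range(threshold, depth + 1)
    )


for depth in (1, 3, 5, 11, 21):
    print(f"depth {depth:2d}: consensus error ~ {majority_error_probability(depth):.3e}")
```

Even under these pessimistic assumptions, the chance of a wrong consensus call falls steeply with depth, which is why a modest per-read accuracy can still support reliable mutation calling when a target region is covered many times over. Systematic errors that recur across reads are not captured by this model.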
Massively parallel refers to the simultaneous sequencing of large numbers (millions to billions) of DNA fragments, and the new instruments accomplish this using chemistry distinct from the Sanger reaction. The older capillary sequencers worked by separating Sanger reaction mixtures by electrophoresis in capillary tubes and then detecting the ladder of fluorescent reaction products by their specific wavelength of emitted light. Massively parallel sequencers begin with a population of DNA fragments (a library), which are attached to a solid support, such as a glass bead or slide, and amplified by single-molecule PCR. Amplification is necessary to increase the signal to be detected by the CCD imaging system but can also introduce sequence and abundance errors that reduce accuracy. The resulting amplified fragments are sequenced base by base in a flow cell through which reagents are cycled and signal is detected. For each platform, sequencing chemistry is different.67 For example, complementary reversible dye terminator nucleotides added by a DNA polymerase are detected by fluorescent emission in Illumina-Solexa sequencers.68 The ABI/Life Technologies SOLiD system is based on ligation and fluorescent detection of complementary dinucleotides on the template; multiple cycles of ligation beginning at consecutive start positions effectively result in each nucleotide being sequenced twice.69 Pyrosequencing, in which light is emitted on addition of complementary nucleotides by DNA polymerase, is used in Roche/454 sequencers.70 With the Illumina or Life Technologies platforms, a mouse genome can be sequenced in about 1 week.
The abundance and type of data generated by massively parallel sequencers have necessitated computational resources to both store and process these data efficiently. In particular, whereas sequence data from capillary sequencers was typically in the form of approximately 500-bp reads, most massively parallel sequencers produce millions of short reads (approximately 50 to 150 bp) at a time. These must be properly aligned to a reference genome sequence and then analyzed for mismatches. Because of the integrated, kit-based functionality of the instruments, most platforms use their own unique software to process raw data and output sequence information. Because of the short length of the reads, accurate alignment may be challenging, especially in areas with repetitive sequences. So-called mate-pair library construction is a means to address this issue by incorporating a larger DNA fragment into a circular DNA, thereby resulting in reads that are separated from one another by a spacer of relatively constant length. Mate pairs permit one to jump over repetitive sequences that would otherwise thwart unambiguous alignments.
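The benefit of mate pairs can be shown with a toy example in Python. This is a deliberately simplified sketch: it uses exact string matching on a made-up reference, ignores strand orientation and sequencing errors, and treats the spacing between the two reads as known exactly. The sequences and function names are hypothetical.

```python
def find_all(reference, read):
    """Start positions of every exact occurrence of `read` in `reference`."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def place_mate_pair(reference, read1, read2, spacing, tolerance=0):
    """Placements (pos1, pos2) where read2 starts `spacing` bases after
    the start of read1, within the given tolerance."""
    hits2 = find_all(reference, read2)
    return [
        (p1, p2)
        for p1 in find_all(reference, read1)
        for p2 in hits2
        if abs((p2 - p1) - spacing) <= tolerance
    ]


repeat = "ACGGTCA"
reference = "TTGCAT" + repeat + "CCGTAGGC" + repeat + "GGATTC"

# A read lying entirely within the repeat aligns to two places.
print(find_all(reference, repeat))

# Its mate lies in unique sequence a known distance away, which
# leaves only one placement consistent with the expected spacing.
print(place_mate_pair(reference, repeat, "GGATTC", spacing=7))
```

The read taken alone matches both copies of the repeat, whereas the pair constrained by its spacing matches only one, which is the sense in which mate pairs allow alignment to jump over repetitive sequence.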
Even newer technology for massively parallel sequencing of individual, linear, unamplified DNA molecules is currently in development, with several machines already available. These machines include the HeliScope (Helicos BioSciences Corp., Cambridge, MA),71 the PacBio RS (Pacific Biosciences of California, Inc., Menlo Park, CA),72 and the Ion PGM Sequencer (Life Technologies).73 These instruments use various detection methods to register nucleotide incorporation by DNA polymerase for millions of different templates simultaneously.66,67 Single-molecule sequencing addresses several drawbacks of the second-generation technologies just described. First, the single-molecule sequencers require less starting DNA material, thereby eliminating the need for PCR amplification and the sequence/abundance errors it introduces. Second, long reads upward of approximately 2 kb produced by single-molecule sequencers have the capacity to span regions of repetitive sequences, facilitating unambiguous alignment to a reference genome and de novo assembly and alignment of previously unsequenced genomes. Currently, however, the sequencing accuracy of single-molecule sequencers is generally far below that of massively parallel sequencers that depend on PCR amplification. For example, the PacBio RS generated reads with an average accuracy of 82.1%74 or 84.4%.75 The use of short, high-accuracy sequences from second-generation PCR-based sequencers to correct long, single-molecule sequences has been demonstrated as an effective strategy to improve the accuracy of sequences from the PacBio RS single-molecule sequencer.76–78
A New Era of Mutation Finding
The development of massively parallel sequencing has shifted the focus in mutation identification from mapping to sequencing (Figure 2). No longer is mapping to high resolution necessary to narrow a critical region to a few megabase pairs because the sequence of the whole mouse genome or the sequence of many different mouse exomes can be determined with high accuracy using massively parallel sequencing in far less time (currently about 1 week) than intensive mapping would require. Quick mapping methods, such as bulk segregation analysis (BSA), suffice in locating a broad critical region.79 Like traditional mapping, BSA involves outcrossing mutants to a mapping strain. F2 offspring are phenotyped and genotyped for markers across the genome. However, instead of genotyping individual mice for strain-specific markers, allele frequency is determined, based on sequencing peak trace heights, at each informative locus in two pools of DNA from F2 mice grouped by phenotype, either normal or affected. For each marker, enrichment of the mutant strain allele in the affected DNA pool and depletion in the normal DNA pool are used to establish linkage. With only about 20 meioses, BSA can localize a mutation to a subchromosomal region, within which there may be only one mutation identified by whole genome sequencing.
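The logic of BSA can be sketched in Python as follows. This is an illustration under stated assumptions, not the published protocol: it assumes a fully penetrant recessive mutation, equal viability of all genotypes, an F2 intercross, and allele frequencies estimated directly from the ratio of two peak heights. The marker names, peak heights, and function names are hypothetical.

```python
def allele_frequency(mutant_peak, mapping_peak):
    """Mutant-strain allele frequency estimated from two trace peak heights."""
    total = mutant_peak + mapping_peak
    if total == 0:
        raise ValueError("no signal at this marker")
    return mutant_peak / total


def linkage_signal(affected_peaks, normal_peaks):
    """Enrichment in the affected pool minus frequency in the normal pool."""
    return allele_frequency(*affected_peaks) - allele_frequency(*normal_peaks)


def expected_pool_frequencies(r):
    """Expected mutant-strain allele frequency in the (affected, normal)
    pools for a marker at recombination fraction r from the mutation."""
    return 1.0 - r, (1.0 + r) / 3.0


# Hypothetical (mutant_peak, mapping_peak) heights per pool at each marker.
markers = {
    "chr1_marker": ((52, 48), (49, 51)),
    "chr7_marker": ((95, 5), (35, 65)),
    "chr12_marker": ((47, 53), (50, 50)),
}
best = max(markers, key=lambda name: linkage_signal(*markers[name]))
print(best, round(linkage_signal(*markers[best]), 2))
print(expected_pool_frequencies(0.0), expected_pool_frequencies(0.5))
```

Under these assumptions a marker completely linked to the mutation is expected to show a mutant-strain allele frequency of 1 in the affected pool and 1/3 in the normal pool, whereas an unlinked marker is expected at 0.5 in both pools. The difference between the two pools therefore peaks near the mutation.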
Because most ENU-induced phenotypes are caused by changes in coding sequence,43,49 exome sequencing (exonic DNA capture combined with massively parallel sequencing) is an alternative to whole genome sequencing that greatly reduces the amount of sequencing needed per mouse. Exome capture involves solution- or microarray-based hybridization of sheared genomic DNA to either RNA or DNA oligonucleotides complementary to nonrepetitive exonic sequences.80–83 Exome sequencing has been used to analyze human DNA for rare coding variations,84 to identify the basis of several human genetic diseases,85–88 and to examine tumor genomes.89–91 In the mouse, exome capture coupled with massively parallel sequencing has been validated as a robust approach for identification of putative mutations,5,49 including those with low phenotypic penetrance.92 Several commercial kits for exome capture have recently become available for the mouse, with capture efficiency (the fraction of captured DNA that is exonic) ranging from 40% to 55% for the Nimblegen (Roche NimbleGen, Inc., Madison, WI) and Agilent (Agilent Technologies, Inc., Santa Clara, CA) kits.5 A benefit of both genome and exome sequencing is the capture of many incidental mutations, which are not responsible for the phenotype in question but may be useful tools for the investigation of other phenotypes.
The possibility of mutation identification directly from massively parallel exome sequencing data has recently been demonstrated.5 A total of 12 ENU-mutagenized strains with either immune disorders or obesity were analyzed. Whole-exome capture was performed using solution-based hybridization to either biotinylated RNA or DNA oligonucleotides, followed by amplification and sequencing of the exome-enriched DNA. Despite an expected number of approximately 50 mutations per animal based on estimates of ENU mutation frequency, approximately 10,000 single nucleotide changes were identified in raw data for each mouse sequenced. The key to successful identification of the causative mutations was the application of several filters to the list of variants. These filters excluded intronic variants, synonymous nucleotide changes, and variants listed in the Single-Nucleotide Polymorphism database (http://www.ncbi.nlm.nih.gov/projects/SNP). Another important filter excluded variants that recurred in more than one unrelated mouse from the colony; this filter effectively removed systematic false-positive variants that resulted from the exome enrichment and sequencing processes. Finally, with the reasoning that ENU is unlikely to induce more than one mutation in any gene in a single mouse, a filter for genes with multiple variant calls further reduced the putative mutation list. With one exception, the combination of filters reduced the number of homozygous variants to <10 per mouse and 6 on average. Examination of segregation patterns for each mutation in the original pedigrees definitively established the causative mutation without further need for meiotic mapping. A similar filtering strategy was successfully applied to custom exome sequence data to identify ENU-induced mutations in four mutant lines, with only minimal mapping performed during stock maintenance.93
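The filtering strategy described above can be sketched in Python. This is a minimal illustration, not the published pipeline: the record layout, effect labels, gene names, and function name are hypothetical, every other mouse in the input is assumed to be unrelated to the mouse under study, and the multiple-call filter is applied here to the variants that survive the earlier filters.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    mouse: str       # identifier of the sequenced mouse
    gene: str        # gene containing the variant
    position: str    # genomic coordinate, e.g. "chr4:2"
    effect: str      # "missense", "nonsense", "splice", "synonymous", "intronic"
    in_dbsnp: bool   # True if the variant is a known polymorphism


def filter_candidates(variants, mouse):
    """Apply the filters described in the text to one mouse's variant calls."""
    # Record which mice carry a call at each position (recurrence filter).
    carriers = {}
    for v in variants:
        carriers.setdefault(v.position, set()).add(v.mouse)

    kept = [
        v for v in variants
        if v.mouse == mouse
        and v.effect not in ("intronic", "synonymous")  # noncoding or silent
        and not v.in_dbsnp                              # known polymorphism
        and len(carriers[v.position]) == 1              # recurs in other mice
    ]
    # Genes with multiple surviving calls in one mouse are treated as artifacts.
    calls_per_gene = Counter(v.gene for v in kept)
    return [v for v in kept if calls_per_gene[v.gene] == 1]


calls = [
    Variant("m1", "GeneX", "chrX:100", "missense", False),
    Variant("m1", "GeneA", "chr1:5", "intronic", False),
    Variant("m1", "GeneB", "chr2:9", "synonymous", False),
    Variant("m1", "GeneC", "chr3:7", "missense", True),
    Variant("m1", "GeneD", "chr4:2", "missense", False),
    Variant("m2", "GeneD", "chr4:2", "missense", False),
    Variant("m1", "GeneE", "chr5:1", "missense", False),
    Variant("m1", "GeneE", "chr5:8", "nonsense", False),
]
candidates = filter_candidates(calls, "m1")
print([v.gene for v in candidates])
```

In this toy data set only the GeneX call survives, mirroring how successive filters reduce thousands of raw calls to a handful of candidates that can then be tested for segregation with the phenotype.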
Although minimal mapping saves time at the front end of a project, it can result in a greater burden later on if it becomes necessary to prove causality. Unexpected genetic findings may require verification by quite refined mapping of a critical region, especially if other candidate genes lie close by. This issue may be addressed by sequencing the neighboring candidate genes directly to demonstrate the absence of mutations in animals with the mutant phenotype. If the mutation is dominant, transgenic mice carrying the mutant allele might also be generated to recapitulate the mutant phenotype; in the case of a recessive mutation, transgenic rescue of the mutant phenotype with the wild-type allele can be performed in support of causality. Phenotypic analysis of a knockout mouse, which may already exist in a repository such as the Knockout Mouse Project Repository (https://www.komp.org, last accessed March 11, 2013), might be easily performed to demonstrate a recapitulation of the mutant phenotype.
We note that mutation finding for complex phenotypes will also benefit from massively parallel sequencing technology. The mapping of complex phenotypes, typically among divergent mouse strains, follows the same principles as the mapping of monogenic traits, although the more loci involved, the more difficult it is to map and identify all causative loci. With massively parallel sequencing, it will be possible to have in hand the complete list of nucleotide changes within a group of animals expressing a complex phenotype. Together with coarse mapping, such as by BSA, and evaluation of the potential degree of damage from amino acid substitutions by prediction tools such as PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2),94 the list of candidate genes might be substantially narrowed before biological proof of causality is sought, for example, through transgenesis or creation of a knockin mutation.
Estimating Genome Saturation
In conducting a forward genetic screen, it is useful to know what fraction of the genome has been mutated to a state of detectable phenovariance. If this can be done, one may use the number of hits obtained in the screen to estimate the total number of genes with an essential function in the phenomenon of interest. Knowing the types of mutations caused by ENU and the frequency with which they each occur,43,44 one may simulate mutagenesis in silico, modeling the introduction of mutations into sperm and tracking their transmission to G1, G2, and G3 mice in keeping with pedigrees actually produced and screened. In this way, a plausible estimate of genetic damage surveyed among a given population of G3 mice can be made.
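A toy version of such a simulation is sketched below in Python. It is a simplified illustration, not the model used by the authors. It assumes a breeding scheme in which a G1 male heterozygous for every induced mutation is crossed to a wild-type female, and his G2 daughters are backcrossed to him to produce G3 mice. It also assumes that mutations segregate independently (no linkage) and that none is lethal. The function names and parameters are hypothetical.

```python
import random


def simulate_pedigree(n_mutations, n_g2_daughters, n_g3_per_daughter, rng):
    """Return the set of G1 mutations seen homozygous in at least one G3 mouse."""
    homozygous_seen = set()
    for _ in range(n_g2_daughters):
        # A G2 daughter inherits each heterozygous G1 mutation with probability 1/2.
        daughter_het = [m for m in range(n_mutations) if rng.random() < 0.5]
        for _ in range(n_g3_per_daughter):
            for m in daughter_het:
                # Het father x het daughter: each parent passes the mutant
                # allele with probability 1/2, so 1/4 of G3 are homozygous.
                if rng.random() < 0.5 and rng.random() < 0.5:
                    homozygous_seen.add(m)
    return homozygous_seen


def simulated_fraction(n_mutations, n_g2_daughters, n_g3_per_daughter,
                       n_trials=500, seed=0):
    """Mean fraction of G1 mutations brought to homozygosity, by simulation."""
    rng = random.Random(seed)
    total = sum(
        len(simulate_pedigree(n_mutations, n_g2_daughters, n_g3_per_daughter, rng))
        for _ in range(n_trials)
    )
    return total / (n_trials * n_mutations)


def expected_fraction(n_g2_daughters, n_g3_per_daughter):
    """Closed-form expectation under the same assumptions, as a cross-check."""
    # Probability that one daughter carries the mutation and yields >= 1 homozygote.
    p_daughter = 0.5 * (1.0 - 0.75 ** n_g3_per_daughter)
    return 1.0 - (1.0 - p_daughter) ** n_g2_daughters
```

Under these assumptions a single G3 mouse is homozygous for any given G1 mutation with probability 1/8, which is why several G2 daughters and several G3 offspring per daughter are needed before most mutations in a pedigree are examined in the homozygous state.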
Within such a population, which has actually been screened for phenotype, one may determine the number of simulated homozygous overt null alleles (mutations predicted with high confidence to inactivate proteins by causing premature truncation of translation products or splicing defects). On the basis of empirical determination of the frequency with which missense mutations cause phenotypically detectable effects,43 one may also estimate the number of simulated homozygous missense mutations that are likely to cause functional inactivation of gene products among the G3 mice screened. With knowledge of the number of recessive phenovariants actually detected in the population and knowledge of the fraction of genes that have been functionally inactivated, one may estimate the total number of genes that make an essential and nonredundant contribution to the biological phenomenon of interest.
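The arithmetic of this estimate can be written out as a short Python sketch. The formula below is one plausible reading of the paragraph rather than the authors' published calculation. It assumes that inactivating mutations fall on genes uniformly and independently (so that the fraction of genes hit at least once follows a Poisson approximation), that each detected phenovariant corresponds to a distinct gene, and that every inactivated target gene would have been detected in the screen. All function and parameter names are hypothetical.

```python
import math


def saturation_fraction(n_null, n_missense, p_missense_damaging, n_genes):
    """Estimated fraction of genes functionally inactivated in homozygous state.

    n_null: simulated homozygous overt null alleles among the G3 mice screened
    n_missense: simulated homozygous missense mutations among the same mice
    p_missense_damaging: empirical probability that a missense change inactivates
    n_genes: number of genes in the genome being surveyed
    """
    effective_hits = n_null + p_missense_damaging * n_missense
    # Poisson approximation: probability that a given gene is hit at least once.
    return 1.0 - math.exp(-effective_hits / n_genes)


def estimate_target_genes(n_genes_found, saturation):
    """Scale the number of genes found by the fraction of the genome surveyed."""
    if not 0.0 < saturation <= 1.0:
        raise ValueError("saturation must be in the interval (0, 1]")
    return n_genes_found / saturation
```

For example, if a screen were judged to have inactivated one quarter of all genes and had yielded ten distinct causative genes, `estimate_target_genes(10, 0.25)` would suggest roughly forty genes with an essential, nonredundant role in the phenomenon.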
The need for simulation will undoubtedly diminish as sequencing costs decline, permitting more direct assessments of the amount of genetic damage present in a given population of G3 mice. However, at present, it continues to offer a useful estimate of the size of the target gene population.
The Future of Forward Genetics
As mutation identification becomes more efficient and less expensive as a result of new sequencing technology, the creation and analysis of phenotype may replace mutation finding as the bottleneck in forward genetics. Future efforts will therefore likely focus on the rapid generation of phenotypes and their systematic analysis. One approach may be to use cell-based phenotypic screening to increase the rate at which mutant phenotypes are identified. An interesting possibility would be to mutagenize somatic cells (for example, fibroblasts), provided the phenotype of interest (for example, resistance to infection by a particular virus) can be detected in such cells (Figure 3). Surviving fibroblasts could then be expanded and converted to iPSCs,12 which are similar to ESCs in pluripotency.95 Conversion to iPSCs may be facilitated by derivation of the fibroblasts from a transgenic, reprogrammable mouse engineered to express the reprogramming genes Oct4, Sox2, Klf4, and c-Myc under a tetracycline-inducible promoter; fibroblasts could be converted to iPSCs by exposure to tetracycline. The iPSCs, bearing a heavy burden of mutations, could be injected into blastocysts to produce chimeric mice, with eventual germline transmission of mutations induced by ENU.96 At the same time, DNA from the iPSC clone could be sequenced to detect all mutations that were induced. Although dominant phenotypes resulting from heterozygous mutations might be detected in screening, particular attention might be accorded to compound heterozygous mutations at single loci, which might be indicative of recessive phenotypes. Either at the chimeric stage or later, in progeny derived from mutation-bearing germ cells, confirmation of phenovariance could be sought by examining resistance to viral infection. Positional cloning of the causative mutation or exome sequencing and segregation analysis could be undertaken provided germline transmission of mutations could be achieved.
Figure 3.
Somatic cell mutagenesis and recovery of mutations in iPSCs. Somatic cells, such as fibroblasts, derived from transgenic reprogrammable mice that inducibly express Oct4, Sox2, Klf4, and c-Myc (1) are mutagenized and then screened for phenotypes of interest, such as resistance to viral infection (2). Surviving cells are converted to iPSCs (3) and used to generate chimeric mice (4a). Resistance to viral infection is tested in chimeric mice or in mice fully derived from the germline-transmitted iPSC clone (5). iPSCs are also sequenced to identify all induced mutations (4b). The causative mutation can be identified by genotyping fully iPSC-derived mice at all mutation sites and examining segregation patterns for concordance of the phenotype with homozygosity for a particular mutation (6). For a recessive phenotype, any single gene that sustained compound heterozygous mutations in the iPSCs is a strong candidate for causation. Other genes may be prioritized as candidates for causation based on published information on function/phenotype.
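The two candidate-ranking steps named in the legend, segregation analysis and the search for genes with more than one heterozygous mutation, can be illustrated with a minimal Python sketch. The data layout and function names are hypothetical. The sketch assumes a fully penetrant recessive phenotype and error-free genotype calls. It also cannot tell whether two heterozygous mutations in one gene lie on different alleles, so genes it reports are only potential compound heterozygotes.

```python
from collections import Counter


def concordant_mutations(genotypes, affected):
    """Return mutations whose homozygosity matches the phenotype in every mouse.

    genotypes: {mutation_id: {mouse_id: "wt" | "het" | "hom"}}
    affected:  {mouse_id: True if the mouse shows the phenotype}
    Mice lacking a genotype call at a site are ignored for that site.
    """
    hits = []
    for mutation, calls in genotypes.items():
        if all((calls[mouse] == "hom") == is_affected
               for mouse, is_affected in affected.items() if mouse in calls):
            hits.append(mutation)
    return hits


def multi_hit_genes(het_mutations):
    """Return genes carrying two or more distinct heterozygous mutations.

    het_mutations: iterable of (gene, position) calls from the sequenced clone.
    Phase is unknown, so these are candidates for compound heterozygosity only.
    """
    counts = Counter(gene for gene, _pos in set(het_mutations))
    return sorted(gene for gene, n in counts.items() if n >= 2)
```

In practice, incomplete penetrance and genotyping errors would argue for a tolerance on the number of discordant mice rather than the strict all-or-nothing test used here.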
With such a method, millions of cells might be mutagenized and screened simultaneously in vitro, as opposed to screening a few hundred mutagenized mice at a time. Only those cells with interesting phenotypes would be reconstituted as live animals for further study of the mutation in a physiological setting. Equally important is the utility of this strategy in circumventing difficulties encountered in creating homozygous mutations in mice, now normally achieved by breeding G1 mice to produce G2 animals and finally G3 animals homozygous for ENU-induced mutations. A limitation of germline mutagenesis is that mutations affecting a given biological process (eg, TLR signal transduction) will not be detected if they also cause lethality related to developmental anomalies. If it is possible to screen mutagenized somatic cells or chimeric mice derived from iPSCs, genes with a dual role in development and immunity might be far easier to identify.
Haploid ESCs also promise to accelerate the creation and analysis of phenotype in mouse forward genetic studies. Derived from parthenogenetic induction of oocyte division, haploid ESCs are harvested from the inner cell mass of blastocysts and can be mutagenized and then grown clonally.13,14 With the caveat that haploid ESCs frequently convert to the diploid state (for one clonal population, at a frequency of 2% to 3% each day14) and so must be habitually sorted to maintain a pure haploid population, they can be differentiated into multiple diploid cell types carrying induced mutations in homozygous state. Alternatively, when injected into fresh blastocysts, haploid ESCs gain a diploid karyotype, giving rise to chimeric mice with cells homozygous for induced mutations. Ultimately, such mutations can be transmitted in the germline.97 Because there is no need for extensive breeding or involved procedures to generate homozygous mutant animals or cells, haploid ESCs should enable high-throughput gene inactivation that leads rapidly to phenotypic analysis of recessive traits (Figure 4). Moreover, mutagenesis of haploid ESCs may be applied to many different genetic backgrounds; there is no need to generate a lengthy and expensive pipeline of mutants spanning three generations. Screens for the suppression of phenotypes induced by other mutations may be feasible, for example, screens for suppression of complex disease phenotypes. Screening haploid ESCs or chimeras derived from them also bypasses lethality that may be caused by developmental defects in whole animals.
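The practical weight of the diploidization caveat can be seen with a short calculation, sketched here in Python. It assumes a constant daily conversion rate, no difference in growth rate between haploid and diploid cells, and no intervening sorting. These are simplifying assumptions introduced for illustration; the 2% to 3% figure was reported for one clonal population only.

```python
import math


def haploid_fraction(daily_rate, days):
    """Fraction of the culture still haploid after the given number of days."""
    return (1.0 - daily_rate) ** days


def days_until_fraction(daily_rate, threshold):
    """First whole day on which the haploid fraction is at or below threshold."""
    return math.ceil(math.log(threshold) / math.log(1.0 - daily_rate))
```

Under this simple model, `days_until_fraction(0.02, 0.5)` and `days_until_fraction(0.03, 0.5)` give the number of days after which half of an unsorted culture would have become diploid, which indicates how often sorting would be needed to keep a culture predominantly haploid.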
Figure 4.
Use of mouse haploid ESCs for forward genetics. The first four steps are for the derivation of haploid ESCs, which can be mutagenized (5), such as with ENU or gene trap vectors. Single cell clones of mutagenized haploid ESCs can be expanded (6) and then screened for phenotypes of interest (7a). They may also be differentiated into specialized cell types, such as macrophages (7b), or injected into recipient blastocysts to generate chimeric mice (7c); both of these processes result in diploidization of the haploid ESCs. The resulting differentiated cells and chimeric mice thus carry the induced mutations in homozygous state, permitting screening for recessive phenotypes (8b and 8c). Mutations may be identified by mapping and/or sequencing as appropriate.
Conclusion
The process of mutation finding has been markedly accelerated recently, and the use of mutagenesis to provide quick insight into biological puzzles has been facilitated accordingly, so much so that creation and systematic analysis of phenotype may become rate-limiting in mouse forward genetic studies. The development of reliable and streamlined protocols for high-throughput mutagenesis, screening, and generation of homozygous mice from iPSCs and haploid ESCs may help address a phenotypic shortage and permit multiple pipelines of mutagenesis where at present even a single pipeline is expensive to maintain. Forward genetic screens for mutations that cause suppression of disease phenotypes may help in deciphering the mechanism by which certain mutations produce phenotypes. Mutated genes identified as suppressors might be pursued as drug targets, even if the mechanism of disease suppression is not yet understood.
More critical, however, for the forward genetic approach is that mechanistic insight often lags far behind the discovery of new phenotypes and their mutational causes. Genetics creates an embarrassment of riches: embarrassing because we often see dramatic phenotypes and understand their primary causes but do not understand their secondary or tertiary causes well at all. Structural biology, cell biology, and biochemistry often provide insight into mechanism where genetics cannot and may be the most appropriate tools for exploiting genetic gains.
Acknowledgment
We thank Diantha La Vine for assistance in preparation of the illustrations.
Footnotes
Supported by NIH grants HHSN272200700038C and AI100627-01 (B.B.).
References
- 1. Waterston R.H., Lindblad-Toh K., Birney E., Rogers J., Abril J.F., Agarwal P., Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–562. doi: 10.1038/nature01262.
- 2. Church D.M., Goodstadt L., Hillier L.W., Zody M.C., Goldstein S., She X., Bult C.J., Agarwala R., Cherry J.L., DiCuccio M., Hlavina W., Kapustin Y., Meric P., Maglott D., Birtle Z., Marques A.C., Graves T., Zhou S., Teague B., Potamousis K., Churas C., Place M., Herschleb J., Runnheim R., Forrest D., Amos-Landgraf J., Schwartz D.C., Cheng Z., Lindblad-Toh K., Eichler E.E., Ponting C.P., Mouse Genome Sequencing Consortium. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol. 2009;7:e1000112. doi: 10.1371/journal.pbio.1000112.
- 3. Austin C.P., Battey J.F., Bradley A., Bucan M., Capecchi M., Collins F.S. The knockout mouse project. Nat Genet. 2004;36:921–924. doi: 10.1038/ng0904-921.
- 4. Skarnes W.C., von Melchner H., Wurst W., Hicks G., Nord A.S., Cox T., Young S.G., Ruiz P., Soriano P., Tessier-Lavigne M., Conklin B.R., Stanford W.L., Rossant J., International Gene Trap Consortium. A public gene trap resource for mouse functional genomics. Nat Genet. 2004;36:543–544. doi: 10.1038/ng0604-543.
- 5. Andrews T.D., Whittle B., Field M.A., Balakishnan B., Zhang Y., Shao Y., Cho V., Kirk M., Singh M., Xia Y., Hager J., Winslade S., Sjollema G., Beutler B., Enders A., Goodnow C.C. Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: an immediate source for thousands of new mouse models. Open Biol. 2012;2:120061. doi: 10.1098/rsob.120061.
- 6. Keane T.M., Goodstadt L., Danecek P., White M.A., Wong K., Yalcin B. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature. 2011;477:289–294. doi: 10.1038/nature10413.
- 7. Yalcin B., Wong K., Agam A., Goodson M., Keane T.M., Gan X., Nellaker C., Goodstadt L., Nicod J., Bhomra A., Hernandez-Pliego P., Whitley H., Cleak J., Dutton R., Janowitz D., Mott R., Adams D.J., Flint J. Sequence-based characterization of structural variation in the mouse genome. Nature. 2011;477:326–329. doi: 10.1038/nature10432.
- 8. Yalcin B., Adams D.J., Flint J., Keane T.M. Next-generation sequencing of experimental mouse strains. Mamm Genome. 2012;23:490–498. doi: 10.1007/s00335-012-9402-6.
- 9. van der Weyden L., White J.K., Adams D.J., Logan D.W. The mouse genetics toolkit: revealing function and mechanism. Genome Biol. 2011;12:224. doi: 10.1186/gb-2011-12-6-224.
- 10. Justice M.J., Siracusa L.D., Stewart A.F. Technical approaches for mouse models of human disease. Dis Model Mech. 2011;4:305–310. doi: 10.1242/dmm.000901.
- 11. Nguyen D., Xu T. The expanding role of mouse genetics for understanding human biology and disease. Dis Model Mech. 2008;1:56–66. doi: 10.1242/dmm.000232.
- 12. Takahashi K., Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006;126:663–676. doi: 10.1016/j.cell.2006.07.024.
- 13. Leeb M., Wutz A. Derivation of haploid embryonic stem cells from mouse embryos. Nature. 2011;479:131–134. doi: 10.1038/nature10448.
- 14. Elling U., Taubenschmid J., Wirnsberger G., O’Malley R., Demers S.P., Vanhaelen Q., Shukalyuk A.I., Schmauss G., Schramek D., Schnuetgen F., von Melchner H., Ecker J.R., Stanford W.L., Zuber J., Stark A., Penninger J.M. Forward and reverse genetics through derivation of haploid mouse embryonic stem cells. Cell Stem Cell. 2011;9:563–574. doi: 10.1016/j.stem.2011.10.012.
- 15. Hardy S., Legagneux V., Audic Y., Paillard L. Reverse genetics in eukaryotes. Biol Cell. 2010;102:561–580. doi: 10.1042/BC20100038.
- 16. Barbaric I., Miller G., Dear T.N. Appearances can be deceiving: phenotypes of knockout mice. Brief Funct Genomic Proteomic. 2007;6:91–103. doi: 10.1093/bfgp/elm008.
- 17. Beutler B., Jiang Z., Georgel P., Crozat K., Croker B., Rutschmann S., Du X., Hoebe K. Genetic analysis of host resistance: Toll-like receptor signaling and immunity at large. Annu Rev Immunol. 2006;24:353–389. doi: 10.1146/annurev.immunol.24.021605.090552.
- 18. Beutler B. Immunology, phenotype first: preface. Curr Top Microbiol Immunol. 2008;321:v–viii.
- 19. Poltorak A., He X., Smirnova I., Liu M., Van Huffel C., Du X., Birdwell D., Alejos E., Silva M., Galanos C., Freudenberg M.A., Ricciardi-Castagnoli P., Layton B., Beutler B. Defective LPS signaling in C3H/HeJ and C57BL/10ScCr mice: mutations in Tlr4 gene. Science. 1998;282:2085–2088. doi: 10.1126/science.282.5396.2085.
- 20. Brunkow M.E., Jeffery E.W., Hjerrild K.A., Paeper B., Clark L.B., Yasayko S.A., Wilkinson J.E., Galas D., Ziegler S.F., Ramsdell F. Disruption of a new forkhead/winged-helix protein, scurfin, results in the fatal lymphoproliferative disorder of the scurfy mouse. Nat Genet. 2001;27:68–73. doi: 10.1038/83784.
- 21. Khattri R., Cox T., Yasayko S.A., Ramsdell F. An essential role for Scurfin in CD4+CD25+ T regulatory cells. Nat Immunol. 2003;4:337–342. doi: 10.1038/ni909.
- 22. Hunter K.W., Crawford N.P. The future of mouse QTL mapping to diagnose disease in mice in the age of whole-genome association studies. Annu Rev Genet. 2008;42:131–141. doi: 10.1146/annurev.genet.42.110807.091659.
- 23. Chesler E.J., Miller D.R., Branstetter L.R., Galloway L.D., Jackson B.L., Philip V.M., Voy B.H., Culiat C.T., Threadgill D.W., Williams R.W., Churchill G.A., Johnson D.K., Manly K.F. The Collaborative Cross at Oak Ridge National Laboratory: developing a powerful resource for systems genetics. Mamm Genome. 2008;19:382–389. doi: 10.1007/s00335-008-9135-8.
- 24. Bultman S.J., Michaud E.J., Woychik R.P. Molecular characterization of the mouse agouti locus. Cell. 1992;71:1195–1204. doi: 10.1016/s0092-8674(05)80067-4.
- 25. Zhang Y., Proenca R., Maffei M., Barone M., Leopold L., Friedman J.M. Positional cloning of the mouse obese gene and its human homologue. Nature. 1994;372:425–432. doi: 10.1038/372425a0.
- 26. Arnold C.N., Pirie E., Dosenovic P., McInerney G.M., Xia Y., Wang N., Li X., Siggs O.M., Karlsson Hedestam G.B., Beutler B. A forward genetic screen reveals roles for Nfkbid, Zeb1, and Ruvbl2 in humoral immunity. Proc Natl Acad Sci U S A. 2012;109:12286–12293. doi: 10.1073/pnas.1209134109.
- 27. Siggs O.M., Arnold C.N., Huber C., Pirie E., Xia Y., Lin P., Nemazee D., Beutler B. The P4-type ATPase ATP11C is essential for B lymphopoiesis in adult bone marrow. Nat Immunol. 2011;12:434–440. doi: 10.1038/ni.2012.
- 28. Yabas M., Teh C.E., Frankenreiter S., Lal D., Roots C.M., Whittle B., Andrews D.T., Zhang Y., Teoh N.C., Sprent J., Tze L.E., Kucharska E.M., Kofler J., Farell G.C., Broer S., Goodnow C.C., Enders A. ATP11C is critical for the internalization of phosphatidylserine and differentiation of B lymphocytes. Nat Immunol. 2011;12:441–449. doi: 10.1038/ni.2011.
- 29. Barnes M.J., Krebs P., Harris N., Eidenschenk C., Gonzalez-Quintial R., Arnold C.N., Crozat K., Sovath S., Moresco E.M., Theofilopoulos A.N., Beutler B., Hoebe K. Commitment to the regulatory T cell lineage requires CARMA1 in the thymus but not in the periphery. PLoS Biol. 2009;7:e51. doi: 10.1371/journal.pbio.1000051.
- 30. Hoebe K., Georgel P., Rutschmann S., Du X., Mudd S., Crozat K., Sovath S., Shamel L., Hartung T., Zahringer U., Beutler B. CD36 is a sensor of diacylglycerides. Nature. 2005;433:523–527. doi: 10.1038/nature03253.
- 31. Tabeta K., Hoebe K., Janssen E.M., Du X., Georgel P., Crozat K., Mudd S., Mann N., Sovath S., Goode J., Shamel L., Herskovits A.A., Portnoy D.A., Cooke M., Tarantino L.M., Wiltshire T., Steinberg B.E., Grinstein S., Beutler B. The Unc93b1 mutation 3d disrupts exogenous antigen presentation and signaling via Toll-like receptors 3, 7 and 9. Nat Immunol. 2006;7:156–164. doi: 10.1038/ni1297.
- 32. Hoebe K., Du X., Georgel P., Janssen E., Tabeta K., Kim S.O., Goode J., Lin P., Mann N., Mudd S., Crozat K., Sovath S., Han J., Beutler B. Identification of Lps2 as a key transducer of MyD88-independent TIR signaling. Nature. 2003;424:743–748. doi: 10.1038/nature01889.
- 33. Vinuesa C.G., Cook M.C., Angelucci C., Athanasopoulos V., Rui L., Hill K.M., Yu D., Domaschenz H., Whittle B., Lambe T., Roberts I.S., Copley R.R., Bell J.I., Cornall R.J., Goodnow C.C. A RING-type ubiquitin ligase family member required to repress follicular helper T cells and autoimmunity. Nature. 2005;435:452–458. doi: 10.1038/nature03555.
- 34. Brandl K., Tomisato W., Li X., Neppl C., Pirie E., Falk W., Xia Y., Moresco E.M., Baccala R., Theofilopoulos A.N., Schnabl B., Beutler B. Yip1 domain family, member 6 (Yipf6) mutation induces spontaneous intestinal inflammation in mice. Proc Natl Acad Sci U S A. 2012;109:12650–12655. doi: 10.1073/pnas.1210366109.
- 35. Cordes S.P. N-ethyl-N-nitrosourea mutagenesis: boarding the mouse mutant express. Microbiol Mol Biol Rev. 2005;69:426–439. doi: 10.1128/MMBR.69.3.426-439.2005.
- 36. Justice M.J., Noveroske J.K., Weber J.S., Zheng B., Bradley A. Mouse ENU mutagenesis. Hum Mol Genet. 1999;8:1955–1963. doi: 10.1093/hmg/8.10.1955.
- 37. Russell W.L., Kelly E.M. Specific-locus mutation frequencies in mouse stem-cell spermatogonia at very low radiation dose rates. Proc Natl Acad Sci U S A. 1982;79:539–541. doi: 10.1073/pnas.79.2.539.
- 38. Russell W.L. X-ray-induced mutations in mice. Cold Spring Harb Symp Quant Biol. 1951;16:327–336. doi: 10.1101/sqb.1951.016.01.024.
- 39. Ivics Z., Li M.A., Mates L., Boeke J.D., Nagy A., Bradley A., Izsvak Z. Transposon-mediated genome manipulation in vertebrates. Nat Methods. 2009;6:415–422. doi: 10.1038/nmeth.1332.
- 40. Russell W.L., Kelly E.M., Hunsicker P.R., Bangham J.W., Maddux S.C., Phipps E.L. Specific-locus test shows ethylnitrosourea to be the most potent mutagen in the mouse. Proc Natl Acad Sci U S A. 1979;76:5818–5819. doi: 10.1073/pnas.76.11.5818.
- 41. Sun L., Singer B. The specificity of different classes of ethylating agents toward various sites of HeLa cell DNA in vitro and in vivo. Biochemistry. 1975;14:1795–1802. doi: 10.1021/bi00679a036.
- 42. Singer B. All oxygens in nucleic acids react with carcinogenic ethylating agents. Nature. 1976;264:333–339. doi: 10.1038/264333a0.
- 43. Arnold C.N., Barnes M.J., Berger M., Blasius A.L., Brandl K., Croker B., Crozat K., Du X., Eidenschenk C., Georgel P., Hoebe K., Huang H., Jiang Z., Krebs P., La Vine D., Li X., Lyon S., Moresco E.M., Murray A.R., Popkin D.L., Rutschmann S., Siggs O.M., Smart N.G., Sun L., Tabeta K., Webster V., Tomisato W., Won S., Xia Y., Xiao N., Beutler B. ENU-induced phenovariance in mice: inferences from 587 mutations. BMC Res Notes. 2012;5:577. doi: 10.1186/1756-0500-5-577.
- 44. Noveroske J.K., Weber J.S., Justice M.J. The mutagenic action of N-ethyl-N-nitrosourea in the mouse. Mamm Genome. 2000;11:478–483. doi: 10.1007/s003350010093.
- 45. Nguyen N., Judd L.M., Kalantzis A., Whittle B., Giraud A.S., van Driel I.R. Random mutagenesis of the mouse genome: a strategy for discovering gene function and the molecular basis of disease. Am J Physiol Gastrointest Liver Physiol. 2011;300:G1–G11. doi: 10.1152/ajpgi.00343.2010.
- 46. Takahasi K.R., Sakuraba Y., Gondo Y. Mutational pattern and frequency of induced nucleotide changes in mouse ENU mutagenesis. BMC Mol Biol. 2007;8:52. doi: 10.1186/1471-2199-8-52.
- 47. Quwailid M.M., Hugill A., Dear N., Vizor L., Wells S., Horner E., Fuller S., Weedon J., McMath H., Woodman P., Edwards D., Campbell D., Rodger S., Carey J., Roberts A., Glenister P., Lalanne Z., Parkinson N., Coghill E.L., McKeone R., Cox S., Willan J., Greenfield A., Keays D., Brady S., Spurr N., Gray I., Hunter J., Brown S.D., Cox R.D. A gene-driven ENU-based approach to generating an allelic series in any gene. Mamm Genome. 2004;15:585–591. doi: 10.1007/s00335-004-2379-z.
- 48. Boles M.K., Wilkinson B.M., Maxwell A., Lai L., Mills A.A., Nishijima I., Salinger A.P., Moskowitz I., Hirschi K.K., Liu B., Bradley A., Justice M.J. A mouse chromosome 4 balancer ENU-mutagenesis screen isolates eleven lethal lines. BMC Genet. 2009;10:12. doi: 10.1186/1471-2156-10-12.
- 49. Fairfield H., Gilbert G.J., Barter M., Corrigan R.R., Curtain M., Ding Y., D’Ascenzo M., Gerhardt D.J., He C., Huang W., Richmond T., Rowe L., Probst F.J., Bergstrom D.E., Murray S.A., Bult C., Richardson J., Kile B.T., Gut I., Hager J., Sigurdsson S., Mauceli E., Di Palma F., Lindblad-Toh K., Cunningham M.L., Cox T.C., Justice M.J., Spector M.S., Lowe S.W., Albert T., Donahue L.R., Jeddeloh J., Shendure J., Reinholdt L.G. Mutation discovery in mice by whole exome sequencing. Genome Biol. 2011;12:R86. doi: 10.1186/gb-2011-12-9-r86.
- 50. Rinchik E.M., Bangham J.W., Hunsicker P.R., Cacheiro N.L., Kwon B.S., Jackson I.J., Russell L.B. Genetic and molecular analysis of chlorambucil-induced germ-line mutations in the mouse. Proc Natl Acad Sci U S A. 1990;87:1416–1420. doi: 10.1073/pnas.87.4.1416.
- 51. Russell L.B., Hunsicker P.R., Cacheiro N.L., Bangham J.W., Russell W.L., Shelby M.D. Chlorambucil effectively induces deletion mutations in mouse germ cells. Proc Natl Acad Sci U S A. 1989;86:3704–3708. doi: 10.1073/pnas.86.10.3704.
- 52. Probst F.J., Justice M.J. Mouse mutagenesis with the chemical supermutagen ENU. Methods Enzymol. 2010;477:297–312. doi: 10.1016/S0076-6879(10)77015-4.
- 53. Georgel P., Du X., Hoebe K., Beutler B. ENU mutagenesis in mice. Methods Mol Biol. 2008;415:1–16. doi: 10.1007/978-1-59745-570-1_1.
- 54. Beutler B., Du X., Xia Y. Precis on forward genetics in mice. Nat Immunol. 2007;8:659–664. doi: 10.1038/ni0707-659.
- 55. Marra M., Hillier L., Kucaba T., Allen M., Barstead R., Beck C. An encyclopedia of mouse genes. Nat Genet. 1999;21:191–194. doi: 10.1038/5976.
- 56. Smith L.M., Fung S., Hunkapiller M.W., Hunkapiller T.J., Hood L.E. The synthesis of oligonucleotides containing an aliphatic amino group at the 5′ terminus: synthesis of fluorescent DNA primers for use in DNA sequence analysis. Nucleic Acids Res. 1985;13:2399–2412. doi: 10.1093/nar/13.7.2399.
- 57. Smith L.M., Sanders J.Z., Kaiser R.J., Hughes P., Dodd C., Connell C.R., Heiner C., Kent S.B., Hood L.E. Fluorescence detection in automated DNA sequence analysis. Nature. 1986;321:674–679. doi: 10.1038/321674a0.
- 58. de la Bastide M., McCombie W.R. Assembling genomic DNA sequences with PHRAP. Curr Protoc Bioinformatics. 2007;Chapter 11:Unit 11.4. doi: 10.1002/0471250953.bi1104s17.
- 59. Ewing B., Green P. Base-calling of automated sequencer traces using phred, II: error probabilities. Genome Res. 1998;8:186–194.
- 60. Ewing B., Hillier L., Wendl M.C., Green P. Base-calling of automated sequencer traces using phred, I: accuracy assessment. Genome Res. 1998;8:175–185. doi: 10.1101/gr.8.3.175.
- 61. Brenner S., Johnson M., Bridgham J., Golda G., Lloyd D.H., Johnson D., Luo S., McCurdy S., Foy M., Ewan M., Roth R., George D., Eletr S., Albrecht G., Vermaas E., Williams S.R., Moon K., Burcham T., Pallas M., DuBridge R.B., Kirchner J., Fearon K., Mao J., Corcoran K. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nature Biotechnol. 2000;18:630–634. doi: 10.1038/76469.
- 62. Beck S., Alderton R.P. A strategy for the amplification, purification, and selection of M13 templates for large-scale DNA sequencing. Anal Biochem. 1993;212:498–505. doi: 10.1006/abio.1993.1359.
- 63. Chetverin A.B., Kramer F.R. Oligonucleotide arrays: new concepts and possibilities. Biotechnology (N Y). 1994;12:1093–1099. doi: 10.1038/nbt1194-1093.
- 64. Hultman T., Uhlen M. Solid-phase cloning to create sublibraries suitable for DNA sequencing. J Biotechnol. 1994;35:229–238. doi: 10.1016/0168-1656(94)90038-8.
- 65. Jones D.H. An iterative and regenerative method for DNA sequencing. BioTechniques. 1997;22:938–946. doi: 10.2144/97225rr01.
- 66. Pareek C.S., Smoczynski R., Tretyn A. Sequencing technologies and genome sequencing. J Appl Genet. 2011;52:413–435. doi: 10.1007/s13353-011-0057-x.
- 67. Metzker M.L. Sequencing technologies: the next generation. Nat Rev Genet. 2010;11:31–46. doi: 10.1038/nrg2626.
- 68. Bentley D.R., Balasubramanian S., Swerdlow H.P., Smith G.P., Milton J., Brown C.G. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53–59. doi: 10.1038/nature07517.
- 69. Valouev A., Ichikawa J., Tonthat T., Stuart J., Ranade S., Peckham H., Zeng K., Malek J.A., Costa G., McKernan K., Sidow A., Fire A., Johnson S.M. A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning. Genome Res. 2008;18:1051–1063. doi: 10.1101/gr.076463.108.
- 70. Margulies M., Egholm M., Altman W.E., Attiya S., Bader J.S., Bemben L.A. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376–380. doi: 10.1038/nature03959.
- 71.Harris T.D., Buzby P.R., Babcock H., Beer E., Bowers J., Braslavsky I., Causey M., Colonell J., Dimeo J., Efcavitch J.W., Giladi E., Gill J., Healy J., Jarosz M., Lapen D., Moulton K., Quake S.R., Steinmann K., Thayer E., Tyurina A., Ward R., Weiss H., Xie Z. Single-molecule DNA sequencing of a viral genome. Science. 2008;320:106–109. doi: 10.1126/science.1150427. [DOI] [PubMed] [Google Scholar]
- 72.Eid J., Fehr A., Gray J., Luong K., Lyle J., Otto G. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–138. doi: 10.1126/science.1162986. [DOI] [PubMed] [Google Scholar]
- 73.Rothberg J.M., Hinz W., Rearick T.M., Schultz J., Mileski W., Davey M. An integrated semiconductor device enabling non-optical genome sequencing. Nature. 2011;475:348–352. doi: 10.1038/nature10242. [DOI] [PubMed] [Google Scholar]
- 74.Chin C.S., Sorenson J., Harris J.B., Robins W.P., Charles R.C., Jean-Charles R.R., Bullard J., Webster D.R., Kasarskis A., Peluso P., Paxinos E.E., Yamaichi Y., Calderwood S.B., Mekalanos J.J., Schadt E.E., Waldor M.K. The origin of the Haitian cholera outbreak strain. N Engl J Med. 2011;364:33–42. doi: 10.1056/NEJMoa1012928. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Rasko D.A., Webster D.R., Sahl J.W., Bashir A., Boisen N., Scheutz F., Paxinos E.E., Sebra R., Chin C.S., Iliopoulos D., Klammer A., Peluso P., Lee L., Kislyuk A.O., Bullard J., Kasarskis A., Wang S., Eid J., Rank D., Redman J.C., Steyert S.R., Frimodt-Moller J., Struve C., Petersen A.M., Krogfelt K.A., Nataro J.P., Schadt E.E., Waldor M.K. Origins of the E. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany. N Engl J Med. 2011;365:709–717. doi: 10.1056/NEJMoa1106920. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Koren S., Schatz M.C., Walenz B.P., Martin J., Howard J.T., Ganapathy G., Wang Z., Rasko D.A., McCombie W.R., Jarvis E.D., Phillippy A.M. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol. 2012;30:693–700. doi: 10.1038/nbt.2280.
- 77.Bashir A., Klammer A.A., Robins W.P., Chin C.S., Webster D., Paxinos E., Hsu D., Ashby M., Wang S., Peluso P., Sebra R., Sorenson J., Bullard J., Yen J., Valdovino M., Mollova E., Luong K., Lin S., LaMay B., Joshi A., Rowe L., Frace M., Tarr C.L., Turnsek M., Davis B.M., Kasarskis A., Mekalanos J.J., Waldor M.K., Schadt E.E. A hybrid approach for the automated finishing of bacterial genomes. Nat Biotechnol. 2012;30:701–707. doi: 10.1038/nbt.2288.
- 78.Au K.F., Underwood J.G., Lee L., Wong W.H. Improving PacBio long read accuracy by short read alignment. PLoS One. 2012;7:e46679. doi: 10.1371/journal.pone.0046679.
- 79.Xia Y., Won S., Du X., Lin P., Ross C., La Vine D., Wiltshire S., Leiva G., Vidal S.M., Whittle B., Goodnow C.C., Koziol J., Moresco E.M., Beutler B. Bulk segregation mapping of mutations in closely related strains of mice. Genetics. 2010;186:1139–1146. doi: 10.1534/genetics.110.121160.
- 80.Gnirke A., Melnikov A., Maguire J., Rogov P., LeProust E.M., Brockman W., Fennell T., Giannoukos G., Fisher S., Russ C., Gabriel S., Jaffe D.B., Lander E.S., Nusbaum C. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol. 2009;27:182–189. doi: 10.1038/nbt.1523.
- 81.Hodges E., Rooks M., Xuan Z., Bhattacharjee A., Gordon D.B., Brizuela L., McCombie W.R., Hannon G.J. Hybrid selection of discrete genomic intervals on custom-designed microarrays for massively parallel sequencing. Nat Protoc. 2009;4:960–974. doi: 10.1038/nprot.2009.68.
- 82.Hodges E., Xuan Z., Balija V., Kramer M., Molla M.N., Smith S.W., Middle C.M., Rodesch M.J., Albert T.J., Hannon G.J., McCombie W.R. Genome-wide in situ exon capture for selective resequencing. Nat Genet. 2007;39:1522–1527. doi: 10.1038/ng.2007.42.
- 83.Porreca G.J., Zhang K., Li J.B., Xie B., Austin D., Vassallo S.L., LeProust E.M., Peck B.J., Emig C.J., Dahl F., Gao Y., Church G.M., Shendure J. Multiplex amplification of large sets of human exons. Nat Methods. 2007;4:931–936. doi: 10.1038/nmeth1110.
- 84.Tennessen J.A., Bigham A.W., O’Connor T.D., Fu W., Kenny E.E., Gravel S., McGee S., Do R., Liu X., Jun G., Kang H.M., Jordan D., Leal S.M., Gabriel S., Rieder M.J., Abecasis G., Altshuler D., Nickerson D.A., Boerwinkle E., Sunyaev S., Bustamante C.D., Bamshad M.J., Akey J.M., Broad G.O., Seattle G.O., NHLBI Exome Sequencing Project. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science. 2012;337:64–69. doi: 10.1126/science.1219240.
- 85.Ng S.B., Buckingham K.J., Lee C., Bigham A.W., Tabor H.K., Dent K.M., Huff C.D., Shannon P.T., Jabs E.W., Nickerson D.A., Shendure J., Bamshad M.J. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet. 2010;42:30–35. doi: 10.1038/ng.499.
- 86.Ng S.B., Bigham A.W., Buckingham K.J., Hannibal M.C., McMillin M.J., Gildersleeve H.I., Beck A.E., Tabor H.K., Cooper G.M., Mefford H.C., Lee C., Turner E.H., Smith J.D., Rieder M.J., Yoshiura K., Matsumoto N., Ohta T., Niikawa N., Nickerson D.A., Bamshad M.J., Shendure J. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat Genet. 2010;42:790–793. doi: 10.1038/ng.646.
- 87.Lee H., Graham J.M., Jr., Rimoin D.L., Lachman R.S., Krejci P., Tompson S.W., Nelson S.F., Krakow D., Cohn D.H. Exome sequencing identifies PDE4D mutations in acrodysostosis. Am J Hum Genet. 2012;90:746–751. doi: 10.1016/j.ajhg.2012.03.004.
- 88.van Bon B.W., Gilissen C., Grange D.K., Hennekam R.C., Kayserili H., Engels H., Reutter H., Ostergaard J.R., Morava E., Tsiakas K., Isidor B., Le Merrer M., Eser M., Wieskamp N., de Vries P., Steehouwer M., Veltman J.A., Robertson S.P., Brunner H.G., de Vries B.B., Hoischen A. Cantu syndrome is caused by mutations in ABCC9. Am J Hum Genet. 2012;90:1094–1101. doi: 10.1016/j.ajhg.2012.04.014.
- 89.Banerji S., Cibulskis K., Rangel-Escareno C., Brown K.K., Carter S.L., Frederick A.M. Sequence analysis of mutations and translocations across breast cancer subtypes. Nature. 2012;486:405–409. doi: 10.1038/nature11154.
- 90.Barbieri C.E., Baca S.C., Lawrence M.S., Demichelis F., Blattner M., Theurillat J.P. Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet. 2012;44:685–689. doi: 10.1038/ng.2279.
- 91.Pugh T.J., Weeraratne S.D., Archer T.C., Pomeranz Krummel D.A., Auclair D., Bochicchio J. Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature. 2012;488:106–110. doi: 10.1038/nature11329.
- 92.Hilton J.M., Lewis M.A., Grati M., Ingham N., Pearson S., Laskowski R.A., Adams D.J., Steel K.P. Exome sequencing identifies a missense mutation in Isl1 associated with low penetrance otitis media in dearisch mice. Genome Biol. 2011;12:R90. doi: 10.1186/gb-2011-12-9-r90.
- 93.Sun M., Mondal K., Patel V., Horner V.L., Long A.B., Cutler D.J., Caspary T., Zwick M.E. Multiplex chromosomal exome sequencing accelerates identification of ENU-induced mutations in the mouse. G3 (Bethesda). 2012;2:143–150. doi: 10.1534/g3.111.001669.
- 94.Adzhubei I.A., Schmidt S., Peshkin L., Ramensky V.E., Gerasimova A., Bork P., Kondrashov A.S., Sunyaev S.R. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–249. doi: 10.1038/nmeth0410-248.
- 95.Kang L., Gao S. Pluripotency of induced pluripotent stem cells. J Anim Sci Biotechnol. 2012;3:5. doi: 10.1186/2049-1891-3-5.
- 96.Okita K., Ichisaka T., Yamanaka S. Generation of germline-competent induced pluripotent stem cells. Nature. 2007;448:313–317. doi: 10.1038/nature05934.
- 97.Leeb M., Walker R., Mansfield B., Nichols J., Smith A., Wutz A. Germline potential of parthenogenetic haploid mouse embryonic stem cells. Development. 2012;139:3301–3305. doi: 10.1242/dev.083675.




