Mamm Genome. 2008 May 21;19(5):306–308. doi: 10.1007/s00335-008-9110-4

Prospects for complex trait analysis in the mouse

Richard Mott, Jonathan Flint
PMCID: PMC2515547  PMID: 18493822

We want to discuss how recent dramatic progress in whole-genome association analysis (WGA) applied to human case-control studies will affect complex trait analysis in the mouse. At first sight it might now seem that one can identify the genetic determinants of human complex disease directly. However, the picture is less straightforward, for although a significant number of genes have been identified and replicated across different human WGA studies, in most diseases the genetic variation segregating at these genes explains only a small fraction of cases that should be accounted for by genetic causes. The question remains whether the missing genetic signal is from common variants in other genes but with very small phenotypic effect, or is caused by rare variants in the same genes, or a combination of both. In either case, to continue making progress directly with the human WGA methodology it will be necessary to increase sample sizes significantly (Zeggini et al. 2008).

Where does this leave mouse complex genetics? Of course, as the readers of Mammalian Genome will appreciate, working with mice does have certain advantages in that it is possible to design and control experiments and take detailed measurements in a way that is impossible with humans. Nevertheless, the mouse complex genetics community must address three key issues in order to contribute to our understanding of human biology and disease.

The first of these is mapping resolution. We need to establish suitable mouse populations in which high-resolution mapping is the norm. The small extent of linkage disequilibrium (LD) in most human populations means that a positive signal in a human WGA is localized to a few tens of kilobases, usually the span of a single gene. Assuming that the functional genetic variant acts on the nearest gene (which is not always the case), human WGA delivers single-gene resolution. In contrast, a detected QTL in an F2 intercross between two inbred laboratory mouse strains that explains 5% of the phenotypic variance will be mapped into a 95% confidence interval of approximately 30 Mb, containing approximately 300 genes. In some cases, combining information from multiple F2 crosses with the haplotype map of the mouse genome can refine QTL localization (Hitzemann et al. 2002; Li et al. 2005). Several other strategies have been proposed to solve this problem by using populations of mice with more recombinants and consequently steeper LD decay profiles. Heterogeneous stocks (HS) are descended from eight known inbred strains of mice that are outcrossed using a rotational breeding scheme for many generations until the genomes are relatively fine-grained mosaics of the founder haplotypes; mapping resolution of 2–3 Mb is obtainable (Mott et al. 2000; Talbot et al. 1999; Valdar et al. 2006). However, despite being much more precise than an F2 cross, this is still too crude for single-gene mapping, for which we require mapping resolution of about 100 kb.
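To make the scale of these resolutions concrete, the following minimal sketch converts each interval size quoted above into a rough gene count. It assumes a uniform gene density of about 10 genes per Mb, which is simply the ratio implied by the 30 Mb and 300 gene figures above; real gene density varies widely across the genome, and the function and variable names are illustrative only.

```python
# Gene density implied by the text: ~300 genes in a ~30 Mb interval.
# Assumption: density is uniform across the genome (it is not in reality).
GENES_PER_MB = 300 / 30  # about 10 genes per Mb


def expected_genes(interval_mb: float) -> float:
    """Rough number of genes inside a mapping interval of the given size."""
    return interval_mb * GENES_PER_MB


# Interval sizes quoted in the text, in Mb.
intervals_mb = {
    "F2 intercross (95% CI)": 30.0,
    "Heterogeneous stock (lower bound)": 2.0,
    "Heterogeneous stock (upper bound)": 3.0,
    "Single-gene target": 0.1,
}

for label, size in intervals_mb.items():
    print(f"{label}: {size:g} Mb, roughly {expected_genes(size):g} genes")
```

Under this assumption an HS interval of 2–3 Mb still spans some 20 to 30 genes, whereas an interval of about 100 kb spans roughly one, which is why the latter is taken as the threshold for single-gene mapping.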

Commercial mouse breeders maintain very large genetically heterogeneous outbred populations, some of which are suitable for complex trait analysis. As proof of principle, Yalcin et al. (2004) showed how one such population, MF1, could be used to fine-map a QTL for behavior down to the gene Rgs2. Our group is now in the process of evaluating the genetic variability and LD profile of several commercially available outbred populations. Preliminary data suggest that there are some populations that have suitable properties but that others are not useful, having been recently rederived from a small number of animals and therefore containing extensive LD. There is also one tantalizing study of outbred wild mice that suggests they have an LD structure similar to that of humans (Laurie et al. 2007).

The disadvantage of working with outbreds is that each animal is unique, so it is impossible to perform repeat experiments on the same genetic background. Such repeats may be necessary, for example, in a study to measure gene expression changes during development. Furthermore, the cost of the required high-density genotyping limits the size of experiments. In contrast, inbred strains of mice permit these types of studies and need only be genotyped once, and there is a synergy in accumulating data from different experiments on standardized genetic backgrounds. There has been considerable debate over the direct use of the standard laboratory inbred strains for WGA. The main point of contention is that the number of independent inbred strains available is limited (for example, the mouse phenome database http://www.phenome.jax.org/pub-cgi/phenome/mpdcgi uses fewer than 40 priority strains, and even these strains share haplotypes to a considerable degree). Although the method may work for major QTLs explaining over 50% of the phenotypic variance, results are mixed for complex traits (Payseur and Place 2007), where QTLs of small effect are lost among false-positive signals in a genome scan. On the other hand, the sharp LD decay profile of the inbred strains is very attractive, so that if one can be sure that a QTL is segregating in a region (for example, from an F2 cross), then an analysis of inbred strains across the region may identify the gene in some cases. In addition, there is a considerable saving in genotyping costs.

This discussion suggests there is a strong case for designing and constructing a large population of inbred lines that contains a high density of recombinants and where the lines are independent, in the sense that they do not share any recombination events. This was the motivation behind the Collaborative Cross (CC) (Churchill et al. 2004; Threadgill et al. 2002). Currently, over 400 CC recombinant inbred lines are being bred at Oak Ridge National Laboratory, USA, and Tel Aviv University, Israel, in a collaboration funded by the U.S. Department of Energy (DOE), The Ellison Foundation, National Institutes of Health (NIH), and The Wellcome Trust. The lines are descended from eight genetically diverse founder strains (including three wild-derived strains) and are currently between generation 6 and 10 of inbreeding. They will be fully inbred in about 4 years but are already sufficiently advanced that a pilot project is planned to assess their use for QTL mapping. The CC is almost ideal, except that the expected QTL mapping resolution is about 1 Mb (Valdar et al. 2005), which is not quite single gene.

The second issue is that the haplotype structure of the classical laboratory strains of mice is not ideal for complex trait analysis. To make best use of the mouse, we need complete genome sequences of the common laboratory strains. Already we know from partial resequencing of 16 strains that the so-called classical strains share only a fraction of the genetic variation segregating in wild-derived strains (Frazer et al. 2007; Yang et al. 2007) so there should be fewer QTLs in a study solely using classical strains. Moreover, their genomes are not independent; the pattern of haplotype sharing is not random across the genome, causing the problem of false-positive QTLs alluded to above. By contrast, the haplotype structure of the CC is necessarily random, very rare variants should not exist, and the founding strains contain more genetic variation than is present in some human populations (Roberts et al. 2007). We do not yet know much about the haplotype structure or origins of commercial outbreds.

The third issue is how we should relate discoveries made in the mouse to human disease. It should be reiterated that many studies in mice are not feasible in humans, such as the elucidation of gene networks from most tissues and developmental stages. As it is extremely unlikely that identical causative polymorphisms will be segregating in both species, we should not necessarily expect the same genes to be identified in WGA, although there are examples where this is the case, such as cancer modifiers common to mice and humans (Ruivenkamp et al. 2002). Nevertheless, we should expect the same pathways to be implicated (Emilsson et al. 2008). However, at present the functional annotation of both species is incomplete. Therefore, alongside the development of suitable mapping populations, we need comprehensive annotation of the gene networks in the mouse, and how they vary during development, between tissues and between genetic backgrounds. There is not space here to say more except that it will require a concerted international effort, integrating and extending existing online resources.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

  1. Churchill GA, Airey DC, Allayee H, Angel JM, Attie AD, et al. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat Genet. 2004;36:1133–1137. doi: 10.1038/ng1104-1133.
  2. Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, et al. Genetics of gene expression and its effect on disease. Nature. 2008;452:423–428. doi: 10.1038/nature06758.
  3. Frazer KA, Eskin E, Kang HM, Bogue MA, Hinds DA, et al. A sequence-based variation map of 8.27 million SNPs in inbred mouse strains. Nature. 2007;448:1050–1053. doi: 10.1038/nature06067.
  4. Hitzemann R, Malmanger B, Cooper S, Coulombe S, Reed C, et al. Multiple cross mapping (MCM) markedly improves the localization of a QTL for ethanol-induced activation. Genes Brain Behav. 2002;1:214–222. doi: 10.1034/j.1601-183X.2002.10403.x.
  5. Laurie CC, Nickerson DA, Anderson AD, Weir BS, Livingston RJ, et al. Linkage disequilibrium in wild mice. PLoS Genet. 2007;3:e144. doi: 10.1371/journal.pgen.0030144.
  6. Li R, Lyons MA, Wittenburg H, Paigen B, Churchill GA. Combining data from multiple inbred line crosses improves the power and resolution of quantitative trait loci mapping. Genetics. 2005;169:1699–1709. doi: 10.1534/genetics.104.033993.
  7. Mott R, Talbot CJ, Turri MG, Collins AC, Flint J. A method for fine mapping quantitative trait loci in outbred animal stocks. Proc Natl Acad Sci USA. 2000;97:12649–12654. doi: 10.1073/pnas.230304397.
  8. Payseur BA, Place M. Prospects for association mapping in classical inbred mouse strains. Genetics. 2007;175:1999–2008. doi: 10.1534/genetics.106.067868.
  9. Roberts A, Pardo-Manuel de Villena F, Wang W, McMillan L, Threadgill DW. The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics. Mamm Genome. 2007;18:473–481. doi: 10.1007/s00335-007-9045-1.
  10. Ruivenkamp CA, van Wezel T, Zanon C, Stassen AP, Vlcek C, et al. Ptprj is a candidate for the mouse colon-cancer susceptibility locus Scc1 and is frequently deleted in human cancers. Nat Genet. 2002;31:295–300. doi: 10.1038/ng903.
  11. Talbot CJ, Nicod A, Cherny SS, Fulker DW, Collins AC, et al. High-resolution mapping of quantitative trait loci in outbred mice. Nat Genet. 1999;21:305–308. doi: 10.1038/6825.
  12. Threadgill DW, Hunter KW, Williams RW. Genetic dissection of complex and quantitative traits: from fantasy to reality via a community effort. Mamm Genome. 2002;13:175–178. doi: 10.1007/s00335-001-4001-y.
  13. Valdar W, Flint J, Mott R. Simulating the collaborative cross: power of QTL detection and mapping resolution in large sets of recombinant inbred strains of mice. Genetics. 2005;172:1783–1797. doi: 10.1534/genetics.104.039313.
  14. Valdar W, Solberg LC, Gauguier D, Burnett S, Klenerman P, et al. Genome-wide genetic association of complex traits in heterogeneous stock mice. Nat Genet. 2006;38:879–887. doi: 10.1038/ng1840.
  15. Yalcin B, Willis-Owen SA, Fullerton J, Meesaq A, Deacon RM, et al. Genetic dissection of a behavioral quantitative trait locus shows that Rgs2 modulates anxiety in mice. Nat Genet. 2004;36:1197–1202. doi: 10.1038/ng1450.
  16. Yang H, Bell TA, Churchill GA, Pardo-Manuel de Villena F. On the subspecific origin of the laboratory mouse. Nat Genet. 2007;39:1100–1107. doi: 10.1038/ng2087.
  17. Zeggini E, Scott LJ, Saxena R, Voight BF, Marchini JL, et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat Genet. 2008;40:638–645. doi: 10.1038/ng.120.

Articles from Mammalian Genome are provided here courtesy of Springer
