Keywords: software, QTL, multiparent populations, MAGIC, Diversity Outbred mice, heterogeneous stock, Collaborative Cross, Multiparent Advanced Generation Inter-Cross (MAGIC), MPP
Abstract
R/qtl2 is an interactive software environment for mapping quantitative trait loci (QTL) in experimental populations. The R/qtl2 software expands the scope of the widely used R/qtl software package to include multiparent populations derived from more than two founder strains, such as the Collaborative Cross and Diversity Outbred mice, heterogeneous stocks, and MAGIC plant populations. R/qtl2 is designed to handle modern high-density genotyping data and high-dimensional molecular phenotypes, including gene expression and proteomics. R/qtl2 includes the ability to perform genome scans using a linear mixed model to account for population structure, and also includes features to impute SNPs based on founder strain genomes and to carry out association mapping. The R/qtl2 software provides all of the basic features needed for QTL mapping, including graphical displays and summary reports, and it can be extended through the creation of add-on packages. R/qtl2, which is free and open source software written in the R and C++ programming languages, comes with a test framework.
There has been a resurgence of interest in the mapping of quantitative trait loci (QTL) in experimental organisms, spurred in part by the use of gene expression phenotypes [eQTL mapping; see Albert and Kruglyak (2015)] to more rapidly identify the underlying genes, and by the development of multiparent populations (de Koning and McIntyre 2017), including heterogeneous stocks (Mott et al. 2000; Mott and Flint 2002), MAGIC lines (Cavanagh et al. 2008; Kover et al. 2009), the Collaborative Cross (Churchill et al. 2004), and Diversity Outbred mice (Churchill et al. 2012; Svenson et al. 2012).
Multiparent populations (MPPs) are genetically mixed populations derived from a small set of known founders that are typically, but not necessarily, inbred strains. The presence of multiple founder alleles imparts unique features to MPPs with significant advantages over traditional two-parent crosses. Allelic series of linked functional variants produce information-rich patterns of effects that can help identify causal variants and distinguish pleiotropy from chance colocalization of multiple QTL (King et al. 2012). MPPs provide high-resolution mapping, which results in fewer candidate genes and minimizes the confounding effects of linked loci. MPPs create new multi-locus allelic combinations by mixing founder genomes. The founder strain genomes of many MPPs have been, or will be, sequenced, and, using high-density genotyping, we can then accurately impute whole genomes of individuals (Oreper et al. 2017).
MPPs can be generated by many different breeding designs and have been developed in different model organisms including rats (Woods and Mott 2017), Drosophila (King et al. 2012), Caenorhabditis elegans (Noble et al. 2017), as well as a variety of plant species (Kover et al. 2009; Huang et al. 2012a; Bandillo et al. 2013; Dell’Acqua et al. 2015). Different breeding designs of MPPs give rise to different population structures and thus will require a flexible and general framework for analysis. The key challenges that arise in the analysis of MPP data include the reconstruction of the founder haplotype mosaic, imputation of whole-genome genetic variants, and analysis methods that can handle the multiple founder alleles and account for population structure.
There are numerous software packages for QTL mapping in classical two-parent experimental populations, including Mapmaker/QTL (Lincoln and Lander 1990), QTL Cartographer (Basten et al. 2002), R/qtl (Broman et al. 2003; Broman and Sen 2009), and MapQTL (Van Ooijen 2009). There are a smaller number of packages for QTL analysis in multiparent populations, including DOQTL (Gatti et al. 2014), HAPPY (Mott et al. 2000), and mpMap (Huang and George 2011). Our aim in developing R/qtl2 is to provide an open-source, extensible software environment for QTL mapping and associated data analysis tasks that applies to the full range of classical and MPP cross designs.
The original R/qtl (hereafter, R/qtl1) is widely used, and has a number of advantages compared to proprietary alternatives. R/qtl1 includes a comprehensive set of QTL mapping methods, including multiple-QTL exploration and model selection (Broman and Speed 2002; Manichaikul et al. 2009; Arends et al. 2010), as well as extensive visualization and data diagnostics tools (Broman and Sen 2009). Further, users and developers both benefit from its being an add-on package for the general statistical software, R (R Core Team 2018). A number of other R packages have been written to work in concert with R/qtl1, including ASMap (Taylor and Butler 2017), ctl (Arends et al. 2016), dlmap (Huang et al. 2012b), qtlcharts (Broman 2015), vqtl (Corty and Valdar 2018), and wgaim (Taylor and Verbyla 2011).
R/qtl1 has a number of limitations (see Broman 2014), the most critical of which is that the central data structure generally limits its use to biparental crosses. Also, R/qtl1 was designed at a time when a dataset with >100 genetic markers was considered large.
Rather than extend R/qtl1 for multiparent populations, we decided to start fresh. R/qtl2 is a completely redesigned R package for QTL analysis that can handle a variety of multiparent populations and is suited for high-dimensional genotype and phenotype data. To handle population structure, QTL analysis may be performed with a linear mixed model that includes a residual polygenic effect. The R/qtl2 software is available from its web site (https://kbroman.org/qtl2) as well as GitHub (https://github.com/rqtl/qtl2).
Features
QTL analysis in multiparent populations can be split into two parts: calculation of genotype probabilities using multipoint single nucleotide polymorphism (SNP) genotypes, and the genome scan to evaluate the association between genotype and phenotype, using those probabilities. We use a hidden Markov model [HMM; see Broman and Sen (2009), App. D] for the calculation of genotype probabilities. The HMM implemented in R/qtl2 is generalized from the implementation in R/qtl1 to accommodate the MPP founder haplotype structure. As the source of genotype information, R/qtl2 considers array-based SNP genotypes. At present, we focus solely on marker genotypes rather than array intensities, as in DOQTL, or allele counts/dosages from genotyping-by-sequencing (GBS) assays.
R/qtl2 includes implementations of many classical two-way crosses (backcross, intercross, doubled haploids, two-way recombinant inbred lines by selfing or sibling mating, and two-way advanced intercross populations), and many different types of multiparent populations [4- and 8-way recombinant inbred lines by sibling mating; 4-, 8-, and 16-way recombinant inbred lines by selfing; 3-way advanced intercross populations, Diversity Outbred mice, heterogeneous stocks, 19-way MAGIC lines like the Kover et al. (2009) Arabidopsis lines, and 6-way doubled haploids following a design of maize MAGIC lines being developed at the University of Wisconsin–Madison].
A key component of the HMM is the transition matrix (or “step” probabilities), which is specific to the cross design. Transitions represent locations where the ancestry of a chromosomal segment changes from one founder strain haplotype to another. The transition probabilities for multi-way recombinant inbred lines are taken from Broman (2005). The transition probabilities for heterogeneous stocks and Diversity Outbred mice are taken from Broman (2012b), which uses the results of Broman (2012a).
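To make the roles of the transition and emission probabilities concrete, the following minimal sketch computes genotype probabilities with the forward-backward algorithm for a single backcross individual, the simplest two-state case. It is written in Python purely for illustration and is not the R/qtl2 implementation; the function name genotype_probs, the error_prob parameter, and the example data are assumptions made for this sketch.

```python
def genotype_probs(obs, rec_fracs, error_prob=0.01):
    """Posterior genotype probabilities for one backcross individual.

    obs       : observed marker genotypes, 0 (AA), 1 (AB), or None (missing)
    rec_fracs : recombination fraction between each pair of adjacent markers
    Returns one [P(AA), P(AB)] pair per marker.
    """
    n = len(obs)
    states = (0, 1)

    def emit(o, g):
        # Emission: probability of the observed call given the true genotype.
        if o is None:
            return 1.0
        return 1.0 - error_prob if o == g else error_prob

    def step(g_from, g_to, r):
        # Transition ("step") probability between adjacent markers.
        return 1.0 - r if g_from == g_to else r

    # Forward pass: alpha[m][g] = P(obs[0..m], genotype at marker m is g)
    alpha = [[0.5 * emit(obs[0], g) for g in states]]
    for m in range(1, n):
        r = rec_fracs[m - 1]
        alpha.append([
            emit(obs[m], g) * sum(alpha[m - 1][h] * step(h, g, r) for h in states)
            for g in states
        ])

    # Backward pass: beta[m][g] = P(obs[m+1..] | genotype at marker m is g)
    beta = [[1.0, 1.0] for _ in range(n)]
    for m in range(n - 2, -1, -1):
        r = rec_fracs[m]
        beta[m] = [
            sum(step(g, h, r) * emit(obs[m + 1], h) * beta[m + 1][h] for h in states)
            for g in states
        ]

    # Combine and normalize to get posterior probabilities at each marker.
    probs = []
    for m in range(n):
        w = [alpha[m][g] * beta[m][g] for g in states]
        total = sum(w)
        probs.append([x / total for x in w])
    return probs


# Four markers; the second genotype is missing and is inferred from its neighbors.
probs = genotype_probs([0, None, 0, 1], [0.05, 0.05, 0.10])
```

For a multiparent cross, the two states would be replaced by the founder genotypes, and the step function by the design-specific transition probabilities described above.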
The output of the HMM is a list of three-dimensional arrays, one per chromosome, with dimensions corresponding to individuals × genotypes × marker loci. Array elements represent genotype probabilities that can reflect both the uncertainty of haplotype inference and the heterozygosity. The size and structure of the genotype dimension determine the form of the regression model that will be used in the genome scanning step. Thus, once the genotype probabilities are defined, there is no need to reference the breeding scheme that gave rise to the cross population. For breeding schemes that are not currently implemented in the R/qtl2 HMM, the user can precompute and import a custom genotype probability data structure.
At present, R/qtl2 assumes dense marker information and a low level of uncertainty in the haplotype reconstructions, so that we may rely on Haley-Knott regression (Haley and Knott 1992) for genome scans to establish genotype-phenotype association. This may be performed either with a simple linear model [as in Haley and Knott (1992)], or with a linear mixed model (Yu et al. 2006; Kang et al. 2008; Lippert et al. 2011) that includes a residual polygenic effect to account for population structure. The latter may also be performed using kinship matrices derived using the “leave-one-chromosome-out” (LOCO) method (see Yang et al. 2014).
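As a hedged illustration of Haley-Knott regression in its simplest form, the sketch below regresses a phenotype on the probability of one genotype at a single position (the two-genotype backcross case) and converts the drop in residual sum of squares into a LOD score, LOD = (n/2) log10(RSS0/RSS1). It is plain Python with no covariates and no polygenic effect; the function name hk_lod and the example values are assumptions for this sketch, not part of R/qtl2.

```python
import math


def hk_lod(pheno, prob_ab):
    """LOD score from regressing the phenotype on P(AB) at one position."""
    n = len(pheno)
    ybar = sum(pheno) / n
    xbar = sum(prob_ab) / n
    sxx = sum((x - xbar) ** 2 for x in prob_ab)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(prob_ab, pheno))
    rss0 = sum((y - ybar) ** 2 for y in pheno)   # null model: mean only
    rss1 = rss0 - sxy ** 2 / sxx                 # model with the genotype term
    return (n / 2) * math.log10(rss0 / rss1)


# Illustrative data: six individuals, phenotype higher when P(AB) is near 1.
pheno = [10.1, 9.8, 12.2, 11.9, 10.3, 12.1]
prob_ab = [0.02, 0.05, 0.97, 0.99, 0.10, 0.90]
lod = hk_lod(pheno, prob_ab)
```

With more than two genotypes, the single regressor becomes a matrix of genotype probabilities and the fit requires a general least-squares solver, but the LOD score is formed from the two residual sums of squares in the same way.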
To establish statistical significance of evidence for QTL, accounting for a genome scan, R/qtl2 facilitates the use of permutation tests (Churchill and Doerge 1994). For multiparent populations with analysis via a linear mixed model, we permute the rows of the haplotype reconstructions as considered in Cheng and Palmer (2013). R packages such as qvalue (Storey et al. 2018) can be used to implement multiple-test corrections for high-dimensional data analysis (Storey 2002, 2003) such as gene expression QTL (eQTL) mapping.
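The permutation procedure can be sketched as follows, under simplifying assumptions: a simple linear model rather than a linear mixed model, two genotypes per position, and unlinked simulated markers. Rows of the genotype probabilities are permuted together across all positions, the maximum LOD score is recorded for each permutation, and the threshold is taken as the 95th percentile of those maxima. All names (hk_lod, max_lod, permutation_threshold) and the simulated data are hypothetical and serve only to illustrate the idea.

```python
import math
import random


def hk_lod(pheno, prob):
    """LOD score from simple regression of the phenotype on one genotype probability."""
    n = len(pheno)
    ybar = sum(pheno) / n
    xbar = sum(prob) / n
    sxx = sum((x - xbar) ** 2 for x in prob)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(prob, pheno))
    rss0 = sum((y - ybar) ** 2 for y in pheno)
    rss1 = rss0 - sxy ** 2 / sxx
    return (n / 2) * math.log10(rss0 / rss1)


def max_lod(pheno, probs_by_marker):
    """Genome-wide maximum LOD score over all positions."""
    return max(hk_lod(pheno, prob) for prob in probs_by_marker)


def permutation_threshold(pheno, probs_by_marker, n_perm=200, alpha=0.05, seed=1):
    """Estimate the (1 - alpha) quantile of the maximum LOD under the null."""
    rng = random.Random(seed)
    order = list(range(len(pheno)))
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(order)
        # Apply the same permutation of individuals at every position.
        permuted = [[prob[i] for i in order] for prob in probs_by_marker]
        maxima.append(max_lod(pheno, permuted))
    maxima.sort()
    return maxima[math.ceil((1 - alpha) * n_perm) - 1]


# Simulated example: 60 individuals, 5 unlinked markers, QTL at the first marker.
rng = random.Random(42)
n_ind, n_mar = 60, 5
probs = [[float(rng.random() < 0.5) for _ in range(n_ind)] for _ in range(n_mar)]
pheno = [2.0 * probs[0][i] + rng.gauss(0, 0.5) for i in range(n_ind)]

threshold = permutation_threshold(pheno, probs)
observed = max_lod(pheno, probs)
```

Permuting whole rows keeps each individual's genotype probabilities intact across the genome, so only the pairing between genotypes and phenotypes is broken.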
R/qtl2 includes a variety of data diagnostic tools, which can be particularly helpful for data on multiparent populations where the SNP genotypes are incompletely informative (i.e., SNP genotypes do not fully define the corresponding founder haplotype). These include SNP genotyping error LOD scores (Lincoln and Lander 1992) and estimated crossover counts.
Examples
R/qtl2 reproduces the functionality of DOQTL (Gatti et al. 2014) but targets a broader set of multiparent populations, in addition to Diversity Outbred (DO) mice. (DOQTL will ultimately be deprecated and replaced with R/qtl2.) Figure 1 contains a reproduction, using R/qtl2, of Figure 5 from Gatti et al. (2014). This is a QTL analysis of constitutive neutrophil counts in 742 Diversity Outbred mice (from generations three to five) that were genotyped with the first generation Mouse Universal Genotyping Array (MUGA) (Morgan et al. 2016), which contained 7851 markers, of which we are using 6413.
The regression model that R/qtl2 applies in a genome scan is determined by the HMM output in the genotype probabilities data structure. For an eight-parent MPP such as the DO mice, there are 36 possible diplotypes (44 on the X chromosome) and the genome scan will be based on a regression model with 35 degrees of freedom. With so many degrees of freedom, the model typically lacks power to detect QTL. An alternative representation collapses the 36 states to eight founder “dosages” and uses a regression model with seven degrees of freedom, assuming that the founder effects are additive at any given locus. R/qtl2 has the ability to incorporate SNP (and other variant) data from founder strains and to impute biallelic genotypes for every SNP. The genome scan on imputed SNPs is equivalent to an association mapping scan, and can employ an additive (one degree of freedom) or general (two degrees of freedom) regression model.
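The counts above, and the collapse from 36 diplotype probabilities to eight founder dosages, can be checked with a short Python sketch. The founder labels A to H and the function name allele_dosages are illustrative assumptions; the sketch shows only the arithmetic, not the R/qtl2 code.

```python
from itertools import combinations_with_replacement

FOUNDERS = "ABCDEFGH"

# Unordered pairs of founder alleles: 8 homozygotes + 28 heterozygotes.
DIPLOTYPES = ["".join(pair) for pair in combinations_with_replacement(FOUNDERS, 2)]

n_autosomal = len(DIPLOTYPES)              # 36 diplotypes on an autosome
n_x = n_autosomal + len(FOUNDERS)          # plus 8 hemizygous male genotypes on the X


def allele_dosages(diplotype_probs):
    """Collapse diplotype probabilities to founder dosages that sum to 1.

    Each diplotype contributes half of its probability to each of its two alleles.
    """
    dosage = {f: 0.0 for f in FOUNDERS}
    for diplotype, p in diplotype_probs.items():
        for allele in diplotype:
            dosage[allele] += p / 2
    return dosage


# An individual that is probably AB, possibly BB, at some position.
dosages = allele_dosages({"AB": 0.8, "BB": 0.2})
```

In this example the dosages are 0.4 for founder A and 0.6 for founder B, with zero for the remaining six founders. Because the eight dosages sum to 1, a regression on them with an intercept has seven degrees of freedom.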
Figure 1A contains the LOD curves from a genome scan using a full model comparing all 36 possible genotypes with log neutrophil count as the phenotype, and with sex and log white blood cell count as covariates. The horizontal dashed line indicates the 5% genome-wide significance level, derived from a permutation test, with separate thresholds for the autosomes and the X chromosome, using the technique of Broman et al. (2006). Figure 1B contains the LOD curves from a genome scan using an additive allele model (corresponding to a test with seven degrees of freedom), and Figure 1C contains a SNP association scan, using a test with two degrees of freedom. All of these analyses use a linear mixed model with kinship matrices derived using the LOCO method.
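The leave-one-chromosome-out idea can be illustrated with a small sketch: when scanning a given chromosome, the kinship matrix is computed from the positions on all other chromosomes. Here kinship is taken to be the average, over positions, of the dot product of two individuals' founder allele probabilities. That choice of kinship measure, the data layout, and the function names are assumptions made for illustration and may differ from what R/qtl2 does.

```python
def kinship_from_positions(allele_probs):
    """Kinship as the average over positions of the dot product of allele probabilities.

    allele_probs[position][individual] is a list of founder allele probabilities.
    """
    n_pos = len(allele_probs)
    n_ind = len(allele_probs[0])
    k = [[0.0] * n_ind for _ in range(n_ind)]
    for position in allele_probs:
        for i in range(n_ind):
            for j in range(n_ind):
                k[i][j] += sum(a * b for a, b in zip(position[i], position[j]))
    return [[value / n_pos for value in row] for row in k]


def loco_kinship(probs_by_chr):
    """One kinship matrix per chromosome, each computed without that chromosome."""
    result = {}
    for chrom in probs_by_chr:
        other_positions = [
            position
            for c, positions in probs_by_chr.items()
            if c != chrom
            for position in positions
        ]
        result[chrom] = kinship_from_positions(other_positions)
    return result


# Two individuals, two founder alleles, one position on each of two chromosomes.
probs_by_chr = {
    "1": [[[1.0, 0.0], [1.0, 0.0]]],   # identical on chromosome 1
    "2": [[[1.0, 0.0], [0.0, 1.0]]],   # different on chromosome 2
}
kinship = loco_kinship(probs_by_chr)
```

The matrix used when scanning chromosome 1 is built only from chromosome 2, so the two individuals appear unrelated there, and the reverse holds when scanning chromosome 2. Excluding the scanned chromosome keeps the polygenic term from absorbing the effect of the QTL being tested.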
Figure 1D shows the estimated QTL effects, assuming a single QTL with additive allele effects on chromosome (chr) 1, and sliding the position of the QTL across the chromosome. This is analogous to the estimated effects in Figure 5D of Gatti et al. (2014), but here we present Best Linear Unbiased Predictors (BLUPs), taking the QTL effects to be random effects. This results in estimated effects that have been shrunk toward 0, which helps to clean up the figure and focus attention on the region of interest.
Figure 1E shows individual SNP association results, for the 6 Mbp region on chr 1 that contains the QTL. As with the DOQTL software, we use all available SNPs for which genotype data are available in the eight founder lines, and impute the SNP genotypes in the DO mice, using the individuals’ genotype probabilities along with the founder strains’ SNP genotypes.
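The imputation step can be sketched in a few lines: given an individual's diplotype probabilities at a position and the founders' alleles at a biallelic SNP, the probability of each SNP genotype is the total probability of the diplotypes that carry that number of alternate alleles. The function name snp_genotype_probs, the 0/1 allele coding, and the example values are assumptions for this illustration.

```python
def snp_genotype_probs(diplotype_probs, founder_alleles):
    """Probabilities of carrying 0, 1, or 2 copies of the alternate SNP allele.

    diplotype_probs : dict mapping a two-letter diplotype (e.g. "AB") to its probability
    founder_alleles : dict mapping each founder letter to 0 (reference) or 1 (alternate)
    """
    probs = [0.0, 0.0, 0.0]
    for diplotype, p in diplotype_probs.items():
        n_alt = founder_alleles[diplotype[0]] + founder_alleles[diplotype[1]]
        probs[n_alt] += p
    return probs


# Only founder B carries the alternate allele at this SNP.
founder_alleles = {f: 0 for f in "ABCDEFGH"}
founder_alleles["B"] = 1

snp_probs = snp_genotype_probs({"AB": 0.7, "BB": 0.3}, founder_alleles)
```

Here the individual is heterozygous at the SNP with probability 0.7 and homozygous for the alternate allele with probability 0.3. Founders that share the same allele at a SNP become indistinguishable, which is why the SNP scan uses at most two degrees of freedom.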
Figure 1 shows a number of differences from the results reported in Gatti et al. (2014), including that we see nearly significant loci on chr 5 and 17 in the full model (Figure 1A), and we see a second significant QTL on chr 7 with the additive allele model (Figure 1B). Also, in Figure 1E, we see associated SNPs not just at ∼128.6 Mbp near the Cxcr4 gene (as in Gatti et al. 2014), but also a group of associated SNPs at ∼132.4 Mbp, near Tmcc2. The differences between these results and those of Gatti et al. (2014) are due to differences in genotype probability calculations; R/qtl2 appears to be more tolerant of SNP genotyping errors (data not shown).
To further illustrate the broad applicability of R/qtl2, we reanalyzed the data of Gnan et al. (2014) on seed weight, seed number, and fruit length in 677 19-way Arabidopsis MAGIC lines from Kover et al. (2009). In Figure 2, we show LOD scores for three traits and effect estimates for a selected QTL for each trait, as derived from the log P-values provided by Gnan et al. (2014) and as calculated with R/qtl2.
The genome scan results are largely concordant except for an important difference in the LOD curve on chr 1 for seed weight (Figure 2A). There are also smaller differences on chr 3 for seed weight (Figure 2A) and chr 1 for number of seeds per fruit (Figure 2C). These differences are likely due to differences in the calculated genotype probabilities, and deserve further study.
The estimated effects at the selected QTL are largely concordant (Figure 2, D–F), but note that, for the seed weight trait (Figure 2D), R/qtl2’s estimate of the average seed weight for lines with the Po-0 allele is 39.9, well outside the plotted range. At this QTL, it appears that the 677 MAGIC lines all have small probabilities for carrying the Po-0 allele. The only other large difference is in Figure 2E for fruit length, where the value reported in Gnan et al. (2014) for the Edi-0 allele is much smaller than that obtained with R/qtl2. Finally, note that, throughout, the BLUPs are all shifted toward the mean, and that this shift is much larger for seed number (Figure 2F) vs. fruit length (Figure 2E).
Data and software availability
The data for Figure 1 are available at the Mouse Phenome Database (https://phenome.jax.org/projects/Gatti2). The data for Figure 2 are available as supplemental files for Gnan et al. (2014) (https://doi.org/10.1534/genetics.114.170746). R/qtl2 input files for both datasets are available at GitHub (https://github.com/rqtl/qtl2data).
The R/qtl2 software is available from its web site (https://kbroman.org/qtl2) as well as GitHub (https://github.com/rqtl/qtl2). The software is licensed under the GNU General Public License version 3.0.
The code to create Figure 1 and Figure 2 is available at GitHub at https://github.com/kbroman/Paper_Rqtl2.
Implementation
R/qtl2 is developed as a free and open source add-on package for the general statistical software, R (R Core Team 2018). Much of the code is written in R, but computationally intensive aspects are written in C++. (Computationally intensive aspects of R/qtl1 are in C.) We use Rcpp (Eddelbuettel and François 2011; Eddelbuettel 2013) for the interface between R and C++, to simplify code and reduce the need for copying data in memory. We use roxygen2 (Wickham et al. 2017) to develop the R package documentation.
Linear algebra calculations, such as matrix decomposition and linear regression, are a central part of QTL analysis. We use RcppEigen (Bates and Eddelbuettel 2013) and the Eigen C++ library (Guennebaud et al. 2010) for these calculations. For the fit of linear mixed models, to account for population structure with a residual polygenic effect, we closely followed code from PyLMM (Furlotte 2015). In particular, we use the basic technique described in Kang et al. (2008), of taking the eigen decomposition of the kinship matrix.
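The reason the eigen decomposition helps can be summarized in a short derivation. This is the standard argument for a linear mixed model with a single kinship-based random effect, written in notation chosen here for illustration (y, X, K, U, D, h^2 are not symbols defined in the text above), and it is a sketch rather than a description of the exact R/qtl2 code.

```latex
% Model: phenotype = fixed effects + polygenic effect + residual
y = X\beta + g + \varepsilon, \qquad
g \sim N(0, \sigma_g^2 K), \qquad
\varepsilon \sim N(0, \sigma_e^2 I)

% Eigen decomposition of the kinship matrix (U orthogonal, D diagonal)
K = U D U^{\top}

% Rotating by U^T makes the errors independent
U^{\top} y = U^{\top} X \beta + \eta, \qquad
\operatorname{Var}(\eta) = \sigma_g^2 D + \sigma_e^2 I
  = \sigma^2 \left[ h^2 D + (1 - h^2) I \right]

% with total variance and heritability
\sigma^2 = \sigma_g^2 + \sigma_e^2, \qquad
h^2 = \sigma_g^2 / \sigma^2

% so, for a given h^2, the fit is weighted least squares with weights
w_i = \frac{1}{h^2 d_i + (1 - h^2)}
```

The decomposition is computed once per kinship matrix. After rotation, each position in the genome scan requires only a weighted least-squares fit, which is what makes a scan with a linear mixed model computationally practical.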
In contrast to R/qtl1, which includes almost no formal software tests, R/qtl2 includes extensive unit tests to ensure correctness. We use the R package “testthat” (Wickham 2011) for this purpose. The use of unit tests helps us to catch bugs earlier, and revealed several bugs in R/qtl1.
Discussion
We have completed the core of the R/qtl2 software package, which is a reimplementation of the widely used software R/qtl, to better handle high-dimensional genotypes and phenotypes, and modern cross designs including MPPs. This software forms a key computational platform for QTL analysis in MPPs, and includes genotype reconstruction for a variety of MPP designs (including MAGIC lines, the Collaborative Cross, Diversity Outbreds, and heterogeneous stock), numerous facilities for quality-control assessments, QTL genome scans by Haley-Knott regression (Haley and Knott 1992) and linear mixed models to account for population structure, and BLUP-based estimates of QTL effects. Most procedures in R/qtl2 can make use of the multiple CPU cores on a given machine, to speed computations by parallel processing.
While the basic functionality of R/qtl2 is complete, there are a number of areas for further development. In particular, we would like to further expand the set of crosses that may be considered, including partially inbred recombinant inbred lines (so that we may deal with residual heterozygosity, which presently is ignored). We have so far focused on exact calculations for specific designs, but the mathematics involved can be tedious. We would like to have a more general approach for genotype reconstruction in multiparent populations, along the lines of RABBIT (Zheng et al. 2015) or STITCH (Davies et al. 2016). Plant researchers have been particularly creative in developing unusual sets of MAGIC populations, and, by our current approach, each design requires the development of design-specific code, which is difficult to sustain. In addition, we will provide facilities for importing data in more general formats, including genotype probabilities/reconstructions and kinship matrices that were derived from other software packages. This will further expand the scope for R/qtl2 by making its QTL analysis facilities usable beyond the set of MPP designs that can be handled by our genotype reconstruction code.
Another important area of development is the handling of GBS data. We are currently focusing solely on called genotypes. With low-coverage GBS data, it is difficult to get quality genotype calls at individual SNPs, and there will be considerable advantage to using the pairs of allele counts and combining information across SNPs. Extending the current HMM implementation in R/qtl2 to handle pairs of allele counts for GBS data appears straightforward.
At present, QTL analysis in R/qtl2 is solely by genome scans with single-QTL models. Consideration of multiple-QTL models will be particularly important for exploring the possibility of multiple causal SNPs in a QTL region, along the lines of the CAVIAR software (Hormozdiari et al. 2014).
So far, we have focused solely on Haley-Knott regression (Haley and Knott 1992) for QTL analysis. This has a big advantage in terms of computational speed, but it does not fully account for the uncertainty in genotype reconstructions. The QTL analysis literature, however, has a long history of methods for dealing with this genotype uncertainty, including interval mapping (Lander and Botstein 1989) and multiple imputation (Sen and Churchill 2001). While this has not been a high priority in the development of R/qtl2, ultimately we will include implementations of these sorts of approaches, to better handle regions with reduced genotype information.
We will continue to focus on lean implementations of fitting algorithms, such as simple linear mixed models with a single random effect for kinship, that will be widely used for genome-wide scans. But we will also seek to simplify the use of external packages, for genome scans with more complex models.
R/qtl2 is an important update to the popular R/qtl software, expanding the scope to include multiparent populations, providing improved handling of high-dimensional data, and enabling genome scans with a linear mixed model to account for population structure. R/qtl1 served as an important hub upon which other developers could build; we hope that R/qtl2 can serve a similar role for the genetic analysis of multiparent populations.
Acknowledgments
This work was supported in part by National Institutes of Health grants R01GM074244 (to K.W.B.), R01GM070683 (to K.W.B. and G.A.C.), and R01GM123489 (to Ś.S.). The authors thank Paula Kover for assistance with the data from Gnan et al. (2014).
Footnotes
Communicating editor: J. Holland
Literature Cited
- Albert F. W., Kruglyak L., 2015. The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16: 197–212. 10.1038/nrg3891 [DOI] [PubMed] [Google Scholar]
- Arends D., Prins P., Jansen R. C., Broman K. W., 2010. R/qtl: high-throughput multiple QTL mapping. Bioinformatics 26: 2990–2992. 10.1093/bioinformatics/btq565 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Arends D., Li Y., Brockmann G. A., Jansen R. C., Williams R. W., et al. , 2016. Correlation trait loci (CTL) mapping: phenotype network inference subject to genotype. J. Open Source Softw. 1: 87 10.21105/joss.00087 [DOI] [Google Scholar]
- Bandillo N., Raghavan C., Muyco P. A., Sevilla M. A., Lobina I. T., et al. , 2013. Multi-parent advanced generation inter-cross (MAGIC) populations in rice: progress and potential for genetics research and breeding. Rice (N. Y.) 6: 11 10.1186/1939-8433-6-11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Basten C. J., Weir B. S., Zeng Z.-B., 2002. QTL Cartographer, Version 1.16. Department of Statistics, North Carolina State University, Raleigh, NC. [Google Scholar]
- Bates D., Eddelbuettel D., 2013. Fast and elegant numerical linear algebra using the RcppEigen package. J. Stat. Softw. 52: 1–24. 10.18637/jss.v052.i0523761062 [DOI] [Google Scholar]
- Broman K. W., 2005. The genomes of recombinant inbred lines. Genetics 169: 1133–1146. 10.1534/genetics.104.035212 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., 2012a Genotype probabilities at intermediate generations in the construction of recombinant inbred lines. Genetics 190: 403–412. 10.1534/genetics.111.132647 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., 2012b Haplotype probabilities in advanced intercross populations. G3 (Bethesda) 2: 199–202. 10.1534/g3.111.001818 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., 2014. Fourteen years of R/qtl: just barely sustainable. J. Open Res. Softw. 2: e11 10.5334/jors.at [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., 2015. R/qtlcharts: interactive graphics for quantitative trait locus mapping. Genetics 199: 359–361. 10.1534/genetics.114.172742 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., Sen S., 2009. A Guide to QTL Mapping with R/qtl. Springer, New York: 10.1007/978-0-387-92125-9 [DOI] [Google Scholar]
- Broman K. W., Speed T. P., 2002. A model selection approach for the identification of quantitative trait loci in experimental crosses. J. R. Stat. Soc. B 64: 641–656. 10.1111/1467-9868.00354 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., Wu H., Sen S., Churchill G. A., 2003. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889–890. 10.1093/bioinformatics/btg112 [DOI] [PubMed] [Google Scholar]
- Broman K. W., Sen S., Owens S. E., Manichaikul A., Southard-Smith E., et al. , 2006. The X chromosome in quantitative trait locus mapping. Genetics 174: 2151–2158. 10.1534/genetics.106.061176 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cavanagh C., Morell M., Mackay I., Powell W., 2008. From mutations to MAGIC: resources for gene discovery, validation and delivery in crop plants. Curr. Opin. Plant Biol. 11: 215–221. 10.1016/j.pbi.2008.01.002 [DOI] [PubMed] [Google Scholar]
- Cheng R., Palmer A. A., 2013. A simulation study of permutation, bootstrap, and gene dropping for assessing statistical significance in the case of unequal relatedness. Genetics 193: 1015–1018. 10.1534/genetics.112.146332 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Churchill G. A., Doerge R. W., 1994. Empirical threshold values for quantitative trait mapping. Genetics 138: 963–971. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Churchill G. A., Airey D. C., Allayee H., Angel J. M., Attie A. D., et al. , 2004. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat. Genet. 36: 1133–1137. 10.1038/ng1104-1133 [DOI] [PubMed] [Google Scholar]
- Churchill G. A., Gatti D. M., Munger S. C., Svenson K. L., 2012. The diversity outbred mouse population. Mamm. Genome 23: 713–718. 10.1007/s00335-012-9414-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Corty R. W., Valdar W., 2018. vqtl: an R package for mean-variance QTL mapping. G3 (Bethesda) 8: 3757–3766. 10.1534/g3.118.200642 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Davies R. W., Flint J., Myers S., Mott R., 2016. Rapid genotype imputation from sequence without reference panels. Nat. Genet. 48: 965–969. 10.1038/ng.3594 [DOI] [PMC free article] [PubMed] [Google Scholar]
- de Koning D. J., McIntyre L. M., 2017. Back to the future: multiparent populations provide the key to unlocking the genetic basis of complex traits. G3 (Bethesda) 7: 1617–1618. 10.1534/g3.117.042846 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dell’Acqua M., Gatti D. M., Pea G., Cattonaro F., Coppens F., et al. , 2015. Genetic properties of the MAGIC maize population: a new platform for high definition qtl mapping in Zea mays. Genome Biol. 16: 167 10.1186/s13059-015-0716-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eddelbuettel D., 2013. Seamless R and C++ Integration with Rcpp. Springer, New York: 10.1007/978-1-4614-6868-4 [DOI] [Google Scholar]
- Eddelbuettel D., François R., 2011. Rcpp: seamless R and C++ integration. J. Stat. Softw. 40: 1–18. 10.18637/jss.v040.i08 [DOI] [Google Scholar]
- Furlotte N., 2015. Pylmm, a lightweight linear mixed-model solver. https://github.com/nickFurlotte/pylmm.
- Gatti D., Svenson K., Shabalin A., Wu L.-Y., Valdar W., et al. , 2014. Quantitative trait locus mapping methods for Diversity Outbred mice. G3 (Bethesda) 4: 1623–1633. 10.1534/g3.114.013748 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gnan S., Priest A., Kover P. X., 2014. The genetic basis of natural variation in seed size and seed number and their trade-off using Arabidopsis thaliana MAGIC lines. Genetics 198: 1751–1758. 10.1534/genetics.114.170746 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Guennebaud G., Jacob B., et al. , 2010. Eigen, version 3. http://eigen.tuxfamily.org.
- Haley C. S., Knott S. A., 1992. A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69: 315–324. 10.1038/hdy.1992.131 [DOI] [PubMed] [Google Scholar]
- Hormozdiari F., Kostem E., Kang E. Y., Pasaniuc B., Eskin E., 2014. Identifying causal variants at loci with multiple signals of association. Genetics 198: 497–508. 10.1534/genetics.114.167908 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huang B. E., George A. W., 2011. R/mpMap: a computational platform for the genetic analysis of multiparent recombinant inbred lines. Bioinformatics 27: 727–729. 10.1093/bioinformatics/btq719 [DOI] [PubMed] [Google Scholar]
- Huang B. E., George A. W., Forrest K. L., Kilian A., Hayden M. J., et al. , 2012a A multiparent advanced generation inter-cross population for genetic analysis in wheat. Plant Biotechnol. J. 10: 826–839. 10.1111/j.1467-7652.2012.00702.x [DOI] [PubMed] [Google Scholar]
- Huang B. E., Shah R., George A. W., 2012b dlmap: an R package for mixed model QTL and association analysis. J. Stat. Softw. 50: 1–22. 10.18637/jss.v050.i0625317082 [DOI] [Google Scholar]
- Kang H. M., Ye C., Eskin E., 2008. Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots. Genetics 180: 1909–1925. 10.1534/genetics.108.094201 [DOI] [PMC free article] [PubMed] [Google Scholar]
- King E. G., Merkes C. M., McNeil C. L., Hoofer S. R., Sen S., et al. , 2012. Genetic dissection of a model complex trait using the Drosophila synthetic population resource. Genome Res. 22: 1558–1566. 10.1101/gr.134031.111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kover P. X., Valdar W. V., Trakalo J., Scarcelli N., Ehrenreich I. M., et al. , 2009. A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet. 5: e1000551 10.1371/journal.pgen.1000551 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lander E. S., Botstein D., 1989. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lincoln, S. E., and E. S. Lander, 1990 Mapping genes for quantitative traits using MAPMAKER/QTL. A Whitehead Institute for Biomedical Research Technical Report. Whitehead Institute, Cambridge, MA. [Google Scholar]
- Lincoln S. E., Lander E. S., 1992. Systematic detection of errors in genetic linkage data. Genomics 14: 604–610. https://doi.org/10.1016/S0888-7543(05)80158-2
- Lippert C., Listgarten J., Liu Y., Kadie C. M., Davidson R. I., et al., 2011. FaST linear mixed models for genome-wide association studies. Nat. Methods 8: 833–835. https://doi.org/10.1038/nmeth.1681
- Manichaikul A., Moon J. Y., Sen S., Yandell B. S., Broman K. W., 2009. A model selection approach for the identification of quantitative trait loci in experimental crosses, allowing epistasis. Genetics 181: 1077–1086. https://doi.org/10.1534/genetics.108.094565
- Morgan A. P., Fu C. P., Kao C. Y., Welsh C. E., Didion J. P., et al., 2016. The mouse universal genotyping array: from substrains to subspecies. G3 (Bethesda) 6: 263–279. https://doi.org/10.1534/g3.115.022087
- Mott R., Flint J., 2002. Simultaneous detection and fine mapping of quantitative trait loci in mice using heterogeneous stocks. Genetics 160: 1609–1618.
- Mott R., Talbot C. J., Turri M. G., Collins A. C., Flint J., 2000. A method for fine mapping quantitative trait loci in outbred animal stocks. Proc. Natl. Acad. Sci. USA 97: 12649–12654. https://doi.org/10.1073/pnas.230304397
- Noble L. M., Chelo I., Guzella T., Afonso B., Riccardi D. D., et al., 2017. Polygenicity and epistasis underlie fitness-proximal traits in the Caenorhabditis elegans multiparental experimental evolution (CeMEE) panel. Genetics 207: 1663–1685. https://doi.org/10.1534/genetics.117.300406
- Oreper D., Cai Y., Tarantino L. M., de Villena F. P.-M., Valdar W., 2017. Inbred strain variant database (ISVdb): a repository for probabilistically informed sequence differences among the Collaborative Cross strains and their founders. G3 (Bethesda) 7: 1623–1630. https://doi.org/10.1534/g3.117.041491
- R Core Team, 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna.
- Sen Ś., Churchill G. A., 2001. A statistical framework for quantitative trait mapping. Genetics 159: 371–387.
- Storey J. D., 2002. A direct approach to false discovery rates. J. R. Stat. Soc. B 64: 479–498. https://doi.org/10.1111/1467-9868.00346
- Storey J. D., 2003. The positive false discovery rate: a Bayesian interpretation and the q-value. Ann. Stat. 31: 2013–2035. https://doi.org/10.1214/aos/1074290335
- Storey J. D., Bass A. J., Dabney A., Robinson D., 2018. qvalue: Q-value estimation for false discovery rate control. R package version 2.14.0. https://github.com/jdstorey/qvalue
- Svenson K. L., Gatti D. M., Valdar W., Welsh C. E., Cheng R., et al., 2012. High-resolution genetic mapping using the mouse diversity outbred population. Genetics 190: 437–447. https://doi.org/10.1534/genetics.111.132597
- Taylor J., Butler D., 2017. R package ASMap: efficient genetic linkage map construction and diagnosis. J. Stat. Softw. 79: 1–29. https://doi.org/10.18637/jss.v079.i06
- Taylor J., Verbyla A., 2011. R package wgaim: QTL analysis in bi-parental populations using linear mixed models. J. Stat. Softw. 40: 1–18. https://doi.org/10.18637/jss.v040.i07
- Van Ooijen J. W., 2009. MapQTL 6: Software for the Mapping of Quantitative Trait Loci in Experimental Populations of Diploid Species. Kyazma BV, Wageningen, The Netherlands.
- Wickham H., 2011. testthat: get started with testing. R J. 3: 5–10.
- Wickham H., Danenberg P., Eugster M., 2017. roxygen2: in-line documentation for R. R package version 6.0.1. https://CRAN.R-Project.org/package=roxygen2
- Woods L. C., Mott R., 2017. Heterogeneous stock populations for analysis of complex traits. Methods Mol. Biol. 1488: 31–44. https://doi.org/10.1007/978-1-4939-6427-7_2
- Yang J., Zaitlen N. A., Goddard M. E., Visscher P. M., Price A. L., 2014. Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 46: 100–106. https://doi.org/10.1038/ng.2876
- Yu J., Pressoir G., Briggs W. H., Bi I. V., Yamasaki M., et al., 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38: 203–208. https://doi.org/10.1038/ng1702
- Zheng C., Boer M. P., van Eeuwijk F. A., 2015. Reconstruction of genome ancestry blocks in multiparental populations. Genetics 200: 1073–1087. https://doi.org/10.1534/genetics.115.177873
Data Availability Statement
The data for Figure 1 are available at the Mouse Phenome Database (https://phenome.jax.org/projects/Gatti2). The data for Figure 2 are available as supplemental files for Gnan et al. (2014) (https://doi.org/10.1534/genetics.114.170746). R/qtl2 input files for both datasets are available at GitHub (https://github.com/rqtl/qtl2data).
The R/qtl2 software is available from its web site (https://kbroman.org/qtl2) as well as GitHub (https://github.com/rqtl/qtl2). The software is licensed under the GNU General Public License version 3.0.
The code to create Figures 1 and 2 is available at GitHub (https://github.com/kbroman/Paper_Rqtl2).