Abstract
Motivation: Although imprinted genes have been ubiquitously observed in nature, statistical methodology has not yet been systematically developed for jointly characterizing genomic imprinting effects and patterns. To detect imprinted genes influencing quantitative traits, least squares and maximum likelihood approaches for fitting a single quantitative trait locus (QTL) and Bayesian methods for simultaneously modeling multiple QTL have been adopted in various studies.
Results: For an F2 reciprocal mating population, a design widely used for mapping imprinted genes, we propose a genomic imprinting model that describes the additive, dominance and imprinting effects of multiple imprinted quantitative trait loci (iQTL) for traits of interest. Depending on the estimates of these genetic effects, we categorize imprinting patterns into seven types, which provides a complete classification scheme for describing imprinting patterns. Bayesian model selection was employed to identify iQTL along with many genetic parameters in a computationally efficient manner. To make statistical inference on the imprinting types of the iQTL detected, a set of Bayes factors was formulated using the posterior probabilities of the genetic effects being compared. We demonstrated the performance of the proposed method by computer simulation experiments and then applied the method to two real datasets. Our approach can be generally used to identify inheritance modes and determine the contribution of major genes to quantitative variation.
Contact: annie.lin@duke.edu; runqingyang@sjtu.edu.cn
1 INTRODUCTION
Genomic imprinting, a non-equivalent genetic phenomenon of allele expression that depends on parental origin, has been broadly identified in nature. It may be caused by DNA methylation, histone modification, non-coding RNAs (ncRNA), and even long-distance interchromosomal interactions (Allis et al., 2007; Kiefer, 2007; Pauler and Barlow, 2006; Wood and Oakey, 2006). An imprinted locus is evaluated by the phenotypic difference between its two reciprocal heterozygotes, which carry the same two alleles but differ in which allele was inherited from the father and which from the mother. Imprinting can be categorized as either complete imprinting or partial imprinting: the former signifies that only one parental allele is expressed, while the latter signifies that both parental alleles are expressed, but at different levels (Naumova and Croteau, 2004; Sandovici et al., 2003, 2005). Furthermore, when the maternal allele of a gene is expressed and the paternal allele is inactivated, it is referred to as paternal imprinting. The reverse circumstance is referred to as maternal imprinting. These classifications are all based on the additive effects of imprinted quantitative trait loci (iQTL). However, imprinting can also lead to altered interactions between alleles. A recent study (Cheverud et al., 2008) illustrated a scheme for characterizing the potential diversity of imprinting patterns, in which imprinting patterns were classified as either parental expression (paternal or maternal) or dominance (bipolar and polar) imprinting.
Recently, it has been shown that the epigenetic modification of an imprinted gene on quantitative traits can be detected through a genetic mapping approach (de Koning et al., 2000, 2002; Haghighi and Hodge, 2002; Hanson et al., 2001; Knott et al., 1998; Shete and Amos, 2002; Shete et al., 2003; Tuiskula-Haavisto et al., 2004). When the reciprocal heterozygotes along with the two homozygotes are fully informative or distinguishable in a source population, imprinting effects of quantitative trait loci (QTL) can be uniquely tested and estimated by means of the conditional probabilities of QTL genotypes given selected flanking marker genotypes. If the reciprocal heterozygotes are partially informative or semi-distinguishable, information about sex-specific differences in the recombination fraction (de Vicente and Tanksley, 1991; Dib et al., 1996; Dietrich et al., 1996; Groover et al., 1995; Haldane, 1922; Huxley, 1928; Knott et al., 1998; Neff et al., 1999) can be used to estimate the imprinting effect of the QTL. This allows us to consider how the QTL is inherited in a genetically designed population with only one heterozygote, such as the F2 population generated from the intercross of inbred strains.
Genomic scans with interval mapping of Mendelian QTL have been extended to detect iQTL. Imprinting effects can be estimated using either least squares (de Koning et al., 2000, 2002; Knott et al., 1998; Tuiskula-Haavisto et al., 2004) or maximum likelihood methods (Cui et al., 2006), and multi-step tests for contrast models have been proposed to identify the imprinting pattern (de Koning et al., 2002; Tuiskula-Haavisto et al., 2004). Under the single-QTL model, these mapping approaches can estimate and test only one locus at a time. Subsequently, Bayesian mapping, which is able to map multiple QTL simultaneously, has been introduced to detect iQTL and discriminate between Mendelian and imprinted expression of a QTL (Hayashi and Awata, 2008). Although the Bayesian method enhances statistical power to detect a QTL, sampling a variable number of QTL with a reversible-jump MCMC procedure may lead to slow convergence and poor mixing. In comparison, Bayesian model selection can analyze multiple-iQTL models more efficiently by presetting an upper bound on the number of QTL and selecting prior distributions for the QTL parameters. The objectives of this study were (i) to construct a mathematical model that describes various genetic effects of multiple imprinted loci on quantitative traits, and (ii) to develop a computationally efficient Bayesian mapping method based on Bayesian model selection that can not only estimate multiple genetic effects of iQTL, but also make statistical inference on all possible imprinting patterns.
2 THEORY AND METHODS
2.1 Genomic imprinting model
In the source population for gene mapping, assume that there are four distinguishable genotypes at each locus, denoted by QMQP, QMqP, qMQP and qMqP, where the subscripts represent the parental origins. Suppose that genotypes for a set of co-dominant molecular markers with a known linkage map and phenotypes (yk for k = 1, 2,…, n) for the trait of interest are measured on n individuals. Generally, the additive effect a is defined as half of the phenotypic difference between the two homozygotes, the dominance effect d is the difference between the joint mean of the two heterozygotes and the mean of the two homozygotes, and the imprinting effect i is the difference between the two heterozygotes (Falconer and Mackay, 1996; Knott et al., 1996). Following these definitions of the genetic parameters, we constructed a genomic imprinting model to describe the effects of multiple imprinted genes on quantitative traits, which can be written as
$$y_k = \mu + \sum_{j=1}^{m} \left( z_{kj} a_j + w_{kj} d_j + s_{kj} i_j \right) + e_k \tag{1}$$
where m is the number of imprinted genes, μ is the population mean and ek is a random environmental error distributed as N(0, σ²), with σ² being the residual variance. The variables zkj, wkj and skj are genotype indicator variables associated with the genetic effects aj, dj and ij, respectively, as defined in previous work (Mantey et al., 2005).
![]() |
(2) |
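As a minimal sketch, the following shows how a model of the form (1) turns the genotypes at m loci into a genotypic value. The indicator coding used here (z = 1, 0, 0, −1; w = 0, 1, 1, 0; s = 0, 1, −1, 0 for QMQP, QMqP, qMQP and qMqP) is an assumption chosen to match the verbal definitions of a, d and i given above; in particular, the sign convention for s may differ from that of Mantey et al. (2005). The names `GENOTYPE_CODES` and `genotypic_value` are hypothetical.

```python
import numpy as np

# Assumed coding (not taken from the source): (z, w, s) per genotype,
# written as maternal allele followed by paternal allele.
GENOTYPE_CODES = {
    "QmQp": (1, 0, 0),
    "Qmqp": (0, 1, 1),
    "qmQp": (0, 1, -1),
    "qmqp": (-1, 0, 0),
}


def genotypic_value(genotypes, a, d, i, mu=0.0):
    """Return mu + sum_j (z_j * a_j + w_j * d_j + s_j * i_j) for one individual.

    genotypes: one genotype label per locus; a, d, i: per-locus effects.
    """
    codes = np.array([GENOTYPE_CODES[g] for g in genotypes], dtype=float)
    effects = np.column_stack([a, d, i])  # shape (m, 3)
    return mu + float(np.sum(codes * effects))
```

Under this assumed coding, a single locus with a = i and d = 0 gives the same value for QMQP and QMqP and the same value for qMQP and qMqP, so the phenotype follows one parental allele only, which is the situation described as complete imprinting.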
Depending on the estimates of a, d and i, imprinting types can be classified as parental expression or dominance imprinting (Cheverud et al., 2008): a/i = ±1 and d = 0 represent parental expression, including the paternal (a/i = +1) and maternal (a/i = −1) imprinting subtypes, while dominance imprinting is characterized by a = 0 but i ≠ 0 and can be further classified as bipolar imprinting (d = 0 and i ≠ 0) and polar imprinting (d/i = ±1). In the above scheme, the condition for parental imprinting stems from the hypothesis of complete silencing of one parental allele, so the parental expression defined there describes only complete parental imprinting and does not take into account the two possible types of partial imprinting, i.e. paternal and maternal partial imprinting. Thus, we have classified imprinting patterns into seven types, listed in Table 1.
Table 1.
Imprinting patterns, hypotheses and corresponding statistical criteria for iQTL
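Imprinting type | Class | Subtype | Hypothesis | Statistical criterion |
---|---|---|---|---|
Additive imprinting | Complete | Paternal | d = 0 ∩ a = i | BFd < 3 ∩ BFa vs. i < 3 |
Additive imprinting | Complete | Maternal | d = 0 ∩ a = −i | BFd < 3 ∩ BFa vs. i < 3 |
Additive imprinting | Partial | Paternal | d = 0 ∩ a ≠ i | BFd < 3 ∩ BFa vs. i > 3 |
Additive imprinting | Partial | Maternal | d = 0 ∩ a ≠ −i | BFd < 3 ∩ BFa vs. i > 3 |
Dominance imprinting | Bipolar | | a = 0 ∩ d = 0 | BFa < 3 ∩ BFd < 3 |
Dominance imprinting | Polar | Over-dominance | a = 0 ∩ d = i | BFa < 3 ∩ BFd vs. i < 3 |
Dominance imprinting | Polar | Under-dominance | a = 0 ∩ d = −i | BFa < 3 ∩ BFd vs. i < 3 |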
2.2 Bayesian model selection for genetic parameters
After organizing all genetic effects into the vector β and all indicator (dummy) variables into xi, we simplify the multiple interacting QTL model to the following linear model:
$$y_i = \mu + x_i \beta + e_i \tag{3}$$
for i=1, 2,…, n. In fact, this is not a common linear model because the number of independent variables and the associated design matrix are all unknown due to the unknown number of QTLs.
In theory, all positions across the genome are possible QTLs, though their contributions to phenotypic variation are different in size and most of them can be negligible. Hence, we approximate positions for all possible QTLs using a partition of the entire genome with evenly spaced loci, which includes all observed markers and additional loci between flanking markers. The expected values for elements in the design matrix can be calculated based on the conditional probabilities of QTL genotypes given two flanking marker genotypes (Rao and Xu, 1998). In the supersaturated model, however, the number of genetic effects for these QTL is so large that it is almost impossible to estimate. Thus, we preset the upper bound on the number of QTL to be included in the model (Yi et al., 2005). It should be larger than the number of detectable QTL in the given dataset.
Given the upper bound on the number of QTL, these QTL can be drawn from densely spaced loci over the genome. To detect the existence of these effects, we introduced a random binary variable γ to indicate which genetic effects are included in (γ = 1) or excluded from the model (George and McCulloch, 1997; Kuo and Mallick, 1998). Let Γ = diag (γ), model (3) becomes
$$y_i = \mu + x_i \Gamma \beta + e_i \tag{4}$$
Within the framework of Bayesian model selection, MCMC methods, including both the Gibbs sampler and the Metropolis–Hastings algorithm, are adopted to explore the posterior distributions of the unknown parameters in model (4). The value of γ sampled in the matrix Γ at the current iteration determines which genetic effects and QTL positions are drawn or estimated at the next iteration. This allows Bayesian sampling of the QTL parameters to be conducted in a substantially reduced model space, greatly decreasing the computational demand.
Compared to Mendelian QTL, the iQTL involves more types of genetic effects under the same genetic design. This will result in the decrease of prior inclusion probability for each genetic effect and the increase of the computational demand. Based on the marginal posterior distribution of each parameter, we implemented the MCMC algorithm for Bayesian model selection by following a simplified and computationally efficient process:
- Calculate the expected values for the associated design matrix at all spaced loci over the genome from πQQ, πQq, πqQ and πqq, the conditional probabilities of the genotypes QMQP, QMqP, qMQP and qMqP given two flanking markers.
- Set the upper bound on the number of QTL, estimated by L = l0 + 3√l0, with l0 being the prior expected number of QTL, which is determined from initial investigations with traditional methods.
- Initialize all variables with some legal values or values sampled from their prior distributions.
- Update the population mean μ by sampling from a normal distribution with mean $\sum_{i=1}^{n}(y_i - x_i\beta)/n$ and variance σ²/n.
- Update the residual variance σ² by sampling from an inverted χ² distribution with parameters νe + n and $\sum_{i=1}^{n}(y_i - \mu - x_i\beta)^2 + (\nu_e + n)S_e$, where νe and Se are prior hyperparameters.
- Update the QTL position by drawing from all spaced loci over the genome when γ = 1. Note that the existence of a QTL depends on γ = 1 for either a main or an epistatic effect. Each locus is sampled from a variable interval whose boundaries are the positions of the adjoining QTL. The Metropolis–Hastings algorithm is used to decide whether each proposed (new) position should be accepted or not (Wang et al., 2005; Zhang and Xu, 2005).
- Repeat the updating steps above until the Markov chain reaches the desired length.
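The two Gibbs updates for μ and σ² listed above, together with the γ-masked linear predictor of model (4), can be sketched as follows. This is an illustrative transcription under stated assumptions, not the authors' implementation: the function names are hypothetical, the inverted χ² draw is realized as a scale divided by a χ² variate, and the scale term follows the expression as printed in the text.

```python
import numpy as np


def linear_predictor(X, beta, gamma):
    """x_i * Gamma * beta with Gamma = diag(gamma): excluded effects add nothing."""
    return X @ (gamma * beta)


def update_mu(y, xb, sigma2, rng):
    """Draw mu from a normal with mean sum(y - xb)/n and variance sigma2/n."""
    n = y.size
    return rng.normal(np.mean(y - xb), np.sqrt(sigma2 / n))


def update_sigma2(y, xb, mu, nu_e, s_e, rng):
    """Draw sigma2 as scale / chi-square(df), with df = nu_e + n.

    The scale uses the residual sum of squares plus (nu_e + n) * s_e,
    transcribed from the text as printed.
    """
    n = y.size
    df = nu_e + n
    scale = np.sum((y - mu - xb) ** 2) + (nu_e + n) * s_e
    return scale / rng.chisquare(df)
```

In a full sampler these two draws would alternate with updates of the genetic effects, the indicators γ and the QTL positions, which are not sketched here.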
Post-MCMC analysis includes monitoring the mixing behavior and convergence of the MCMC algorithm and assessing the characteristics of the imprinting genetic architecture. For the former, we can visually inspect trace plots of the sampled values of scalar quantities of interest or use the formal diagnostic methods provided in the R package coda (Plummer et al., 2004). For the latter, model averaging can be used to account for model uncertainty (Ball, 2001; Raftery et al., 1997; Sillanpaa and Corander, 2002) by averaging over possible models weighted by their posterior probabilities. The posterior inclusion probability for each locus is estimated from its frequency in the posterior samples. The Bayes factor (BF) is calculated as a criterion for assessing inclusion versus exclusion of each QTL (Kass and Raftery, 1995). Generally, the threshold of BF is empirically set to 3, or 2ln BF = 2.1, in order to declare statistical significance for each QTL.
2.3 Bayesian inference for imprinting pattern
Once a QTL is detected via the post-MCMC analysis, we can also adopt the idea of BF to evaluate statistical significance for its imprinting effect. The BF for the additive, dominant or imprinting effect can be formulated as
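$$\mathrm{BF}_{\theta} = \frac{p_{\theta}}{1 - p_{\theta}} \cdot \frac{1 - p_{\theta}^{*}}{p_{\theta}^{*}}$$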
where θ=a, d or i, corresponding to the additive, dominant or imprinting effect, respectively. The pθ* is the prior probability, and pθ is the posterior probability which is calculated as the proportion of samples with γ=1 in MCMC sampling iterations. If BFi>3 (or 2ln BFi>2.1) for imprinting effect i, then the detected QTL is considered to be an iQTL, otherwise the QTL is thought to be inherited in a Mendelian fashion.
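A small sketch of this calculation, assuming the usual form of the BF as posterior odds of inclusion divided by prior odds of inclusion (Kass and Raftery, 1995). The function names are hypothetical, and the posterior probability is taken as the proportion of saved MCMC samples with γ = 1.

```python
import math


def posterior_inclusion(gamma_samples):
    """Proportion of saved MCMC samples in which the effect is included."""
    return sum(gamma_samples) / len(gamma_samples)


def bayes_factor(posterior, prior):
    """Posterior odds of inclusion divided by prior odds of inclusion."""
    return (posterior / (1.0 - posterior)) / (prior / (1.0 - prior))


def two_ln_bf(posterior, prior):
    """The 2 ln BF scale used for the genome-wide profiles."""
    return 2.0 * math.log(bayes_factor(posterior, prior))
```

For example, an effect included in 75% of the samples under a prior inclusion probability of 0.5 has posterior odds of 3 and prior odds of 1, so its BF is 3.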
In Cheverud et al.'s scheme, the types of imprinting are classified according to the estimates of the imprinting, additive and dominance genotypic values relative to the origin of imprinting genetic effects at the locus of interest. In this article, we propose a new scheme that redefines imprinting patterns as either additive or dominance imprinting (Table 1). The additive imprinting is composed of four subtypes, that is, the complete or partial paternal additive imprinting, and the complete or partial maternal additive imprinting. The dominance imprinting is further classified into bipolar dominance, polar over-dominance and polar under-dominance imprinting.
To determine the imprinting pattern of a detected iQTL, that is, to decide statistically between the hypotheses a = ±i or d = ±i and the hypothesis a ≠ ±i, a new BF is formulated by comparing the posterior probability of the additive or dominance effect with that of the imprinting effect at the detected locus, which is denoted as
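$$\mathrm{BF}_{\theta\ \mathrm{vs.}\ i} = \frac{p_{\theta} / (1 - p_{\theta})}{p_{i} / (1 - p_{i})}, \qquad \theta = a \text{ or } d$$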
Note that prior probabilities for the various genetic effects at detected loci are the same and are therefore removed during the derivation. We summarized the statistical criteria for corresponding hypotheses on different imprinting patterns in Table 1.
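The decision rules of Table 1 can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name and the dictionary keys are hypothetical, the threshold of 3 is the empirical value used in this paper, and the use of the signs of a·i and d·i to separate paternal from maternal and over- from under-dominance is an assumption based on the hypotheses a/i = ±1 and d/i = ±1.

```python
def classify_imprinting(bf, a, d, i, threshold=3.0):
    """Classify a detected QTL following the criteria of Table 1.

    bf: dict with keys 'a', 'd', 'i', 'a_vs_i' and 'd_vs_i' holding Bayes factors.
    a, d, i: estimated additive, dominance and imprinting effects.
    """
    if bf["i"] <= threshold:
        return "Mendelian"  # no significant imprinting effect
    a_absent = bf["a"] < threshold
    d_absent = bf["d"] < threshold
    if a_absent and d_absent:
        return "bipolar dominance"
    if d_absent:
        degree = "complete" if bf["a_vs_i"] < threshold else "partial"
        # Assumption: a and i of the same sign indicate the paternal subtype.
        origin = "paternal" if a * i > 0 else "maternal"
        return f"{degree} {origin}"
    if a_absent and bf["d_vs_i"] < threshold:
        # Assumption: d and i of the same sign indicate over-dominance.
        return "polar over-dominance" if d * i > 0 else "polar under-dominance"
    return "unclassified"
```

Combinations of BF values that match none of the seven types are returned as "unclassified" rather than forced into a category.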
3 SIMULATIONS
We performed simulation experiments to investigate the statistical properties of the proposed imprinting model. Consider 61 equally spaced co-dominant markers on a single large chromosome of length 500 cM for an F2 mating population with sample sizes of 150 and 300. One QTL inherited in Mendelian fashion and three iQTL of different types were simulated. The total genetic variance by all QTL was 22.4, in which the proportion of phenotypic variance contributed by individual QTL ranged from 1.11% to 10.04%. The population mean and the environmental (residual) variance were set at μ = 5.0 and σ² = 4.0. The marker and QTL genotypes in the F2 family were generated by mimicking sex-specific recombination fractions.
Prior to Bayesian sampling, we set the prior expected number of main-effect QTL to l0 = 4, so that the upper bound on the number of QTL is L = l0 + 3√l0 = 10. The hyperparameters are set to νe = 0 and Se = 1. The initial values for all variables are sampled from their prior distributions. The MCMC is run for 6000 cycles as a burn-in period (discarded) and then for an additional 100 000 cycles after the burn-in. Note that the length of the burn-in is judged by visually inspecting the plots of some samples across rounds and is set to a sufficient number of cycles to ensure MCMC convergence. To reduce serial correlation, we save one observation in every 40 cycles and thereby obtain an approximately independent posterior sample of 2500 observations for the post-MCMC analysis. Fifty replicate simulations are carried out to evaluate the statistical power of QTL detection.
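The arithmetic behind these settings can be checked with a short sketch. It assumes the upper bound L = l0 + 3√l0 of Yi et al. (2005), which reproduces the bounds of 10 and 8 quoted in this paper for l0 = 4 and l0 = 3; rounding down to an integer is an assumption, and the function names are hypothetical.

```python
import math


def qtl_upper_bound(l0):
    """Upper bound on the number of QTL, l0 + 3 * sqrt(l0), rounded down."""
    return math.floor(l0 + 3.0 * math.sqrt(l0))


def thinned_sample_size(n_cycles, thin):
    """Number of saved observations when keeping one in every `thin` cycles."""
    return n_cycles // thin
```

With l0 = 4 the bound is 10, and saving one observation in every 40 of 100 000 post-burn-in cycles gives 2500 observations.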
The estimates for selected imprinting loci and effects are shown in Table 2, along with the relative statistical power of iQTL detection. Using the BF criteria, the imprinting type of the detected iQTL can be categorized with full accuracy. It can be seen from Table 2 that Bayesian mapping is capable of giving reliable estimates of the effects and positions of the genome-wide iQTL detected. As expected, the precision of the parameter estimates and the statistical power of iQTL detection improve as the sample size and/or the proportion of genetic contribution of the iQTL increases. In addition, Bayesian model selection is found to be sensitive to iQTL with a relatively small proportion of genetic contribution, as compared to Mendelian QTL (Yi et al., 2005, 2007).
Table 2.
Parameter mean estimates (SDs) and statistical powers of QTL detection obtained with Bayesian model selection for simulated data
Sample size | Parameter | QTL 1: a | QTL 1: d | QTL 2: a | QTL 2: i | QTL 3: a | QTL 3: i | QTL 4: d | QTL 4: i |
---|---|---|---|---|---|---|---|---|---|
True | Position (cM) | 23 | | 148 | | 391 | | 476 | |
True | Effect | 1.5 | 1.2 | 1.0 | 0.7 | 0.5 | 0.5 | 0.6 | 0.6 |
150 | Position (cM) | 21.2 (2.3) | 24.0 (2.1) | 145.3 (1.8) | 146.2 (2.9) | 387.6 (2.5) | 388.2 (2.8) | 479.7 (2.9) | 479.2 (3.1) |
150 | Effect | 1.43 (0.23) | 1.29 (0.35) | 0.88 (0.26) | 0.62 (0.34) | 0.58 (0.29) | 0.63 (0.17) | 0.66 (0.26) | 0.67 (0.19) |
150 | Power | 94% | | 76% | | 52% | | 68% | |
300 | Position (cM) | 21.8 (2.3) | 23.5 (2.1) | 146.7 (1.8) | 146.9 (2.9) | 388.3 (2.5) | 389.1 (2.8) | 478.4 (2.9) | 475.2 (3.1) |
300 | Effect | 1.46 (0.18) | 1.24 (0.35) | 0.95 (0.18) | 0.67 (0.24) | 0.47 (0.18) | 0.54 (0.13) | 0.58 (0.19) | 0.63 (0.15) |
300 | Power | 100% | | 88% | | 72% | | 80% | |
4 EXAMPLES
4.1 Differential body weights
These data represent body weight growth of mice in an F2 mating population derived from the Large (LG/J) and the Small (SM/J) inbred mouse strains (Cheverud et al., 1996). A total of 502 F2 mice were genotyped for 96 microsatellite markers located on 19 autosomal chromosomes. The linkage map (a total length of 1780 cM) has been constructed (Vaughn et al., 1999). Body mass was measured for each mouse at 10 successive weekly intervals starting at 7 days of age. The raw weights were adjusted for the effects of the covariates dam, litter size at birth, parity and sex (Vaughn et al., 1999).
The measures of body weight at the fifth time point were utilized as the mapping phenotype to illustrate our approach. The female-to-male recombination rate of 1.25 : 1.0 was introduced into mapping analysis due to the lack of distinguishable reciprocal heterozygotes for measured marker genotypes.
In Bayesian analysis, the expected number of iQTL was set at lm=3 according to interval mapping results. Thus, the upper bound of the number of iQTL, L, was equal to 8. The initial value assigned to each unknown parameter was the same as the one used for the simulation study. The MCMC was run for 200 000 iterations after the burn-in period of 5000 iterations.
The genome-wide 2logBFs profile obtained with Bayesian model selection for body weights in mice is depicted in Figure 1A. Five significant QTL were found on chromosomes 6, 7, 10, 13 and 15, respectively, as their corresponding peaks exceeded the empirical critical value of 2.1. It can be seen from Figure 1B that at every detected QTL, the imprinting effect is significant and the difference is non-significant between imprinting effect and additive or dominant effect, indicating complete or dominance imprinting types. The positions, effects and imprinting types of those iQTL are shown in Table 3. In comparison, the data were also analyzed using the maximum likelihood method. Only three of the five QTL detected with the Bayesian method were identified (Fig. 2) in this case. Using our model, additional analyses on body weight at all other time points consistently supported the same iQTL effects (data not shown).
Fig. 1.
The genome-wide 2logBFs profiles of scanning loci (A) and genetic effects (B) obtained with Bayesian model selection for body weights in mice. The horizontal reference line is the critical 2lnBF value of 2.1 for the significance test. The genome consists of 19 linkage groups that are separated by the vertical dotted lines. The 19 linkage groups are drawn to scale and proportional to their chromosomal lengths. Positions of the markers are indicated by the ticks on the horizontal axis. The thick solid, dashed and thin solid lines represent additive, dominance and imprinting effects, respectively.
Table 3.
Estimates for iQTL parameters and imprinting types obtained with Bayesian model selection for body weights in mice; 2log BFs for genetic effects are shown in parentheses
iQTL number | Chromosome; Position | 2Log BF | a | d | i | Imprinting type |
---|---|---|---|---|---|---|
1 | 6; 73.2 | 5.75 | 0.59 (3.96) | 0.48 (0.00) | 0.21 (4.52) | Complete paternal |
2 | 7; 63.1 | 6.65 | 0.63 (4.83) | 0.29 (0.00) | −0.41 (4.76) | Complete maternal |
3 | 10; 72.7 | 2.94 | 0.26 (0.00) | 0.51 (3.23) | 0.24 (4.12) | Polar over-dominance |
4 | 13; 26.3 | 4.53 | 0.38 (0.00) | 0.71 (3.61) | 0.57 (3.81) | Polar over-dominance |
5 | 15; 12.7 | 3.02 | 0.47 (2.96) | 0.14 (0.00) | −0.12 (3.19) | Complete maternal |
Fig. 2.
The profile of LR test statistics from maximum likelihood interval mapping for body weights in mice. The horizontal reference line is the empirical critical value at the significant level of 0.05 for the LR test statistic generated from permutation tests. Linkage groups are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.
4.2 Acute lung injury survival time
To investigate epigenetic properties of hyperoxic acute lung injury (HALI) survival, Prows et al. (2007a, b) established a mouse model system derived from a pair of polar-responding inbred strains, in which B (C57BL/6J) strain mice are sensitive and S (129X1/SvJ) strain mice are significantly more resistant to HALI mortality. The reciprocal F1 lines were first generated by mating B females to S males (B.S) and S females to B males (S.B). The difference in mean survival time between these reciprocal F1 mice provided strong evidence for the existence of a parent-of-origin effect (Prows et al., 2007a, b). The reciprocal F1 offspring were systematically bred through BS × BS, BS × SB, SB × BS and SB × SB (female F1 listed first) intercross mating schemes to generate ∼200 mice for each of the four possible F2 crosses; these crosses allowed genetic studies to assess any potential imprinting effects. A total of 840 F2 mice were phenotyped for survival time in hours and genotyped for 97 polymorphic microsatellite markers distributed throughout the genome, including the X chromosome. The raw survival times were adjusted for the systematic environmental effects of dam, sire and sex.
Prior to Bayesian mapping, the conditional probabilities of the four genotypes at all possible loci over the entire genome must be estimated. Since F2 mice from BS × BS or SB × SB have only one heterozygote, which is the same as their parents' genotype at each locus, we estimated the conditional probabilities for three possible QTL genotypes (one heterozygote and two homozygotes) given the flanking markers, as is customary in an F2 population. The conditional probability for the other reciprocal heterozygote was set to 0. F2 mice from BS × SB or SB × BS, however, may carry either of the reciprocal heterozygotes. Thus, a female-to-male recombination rate of 1.25 : 1.0 was assumed in order to estimate the conditional probabilities of the various genotypes, because the reciprocal heterozygotes cannot be distinguished from the measured marker genotypes.
In Bayesian model selection mapping analysis, the natural logarithm transformation was applied to raw survival times to resemble a normal distribution. The number of QTL was preset at 4 and then the upper bound of the number of QTL was equal to 10. The initial value of each unknown parameter for the MCMC sampling was assigned to be the same as the one used in the simulation studies.
Figure 3 shows the genomewide 2log BFs profile obtained with Bayesian model selection for HALI survival time in mice. Six peaks exceeded the empirical critical value of the test statistic, which correspond to seven significant QTL detected on six chromosomes 1, 2, 4, 9, 15 and 17, respectively. Four of the seven detected QTL were identified as iQTL and the other three were inherited in Mendelian fashion. The parameter estimates for the detected QTL and imprinting types of iQTL are listed in Table 4. Interestingly, the maximum likelihood method was only able to identify three of the four iQTL detected by the Bayesian method (Figure 4).
Fig. 3.
The genome-wide 2logBFs profile of scanning loci (above) and genetic effects (below) obtained with Bayesian model selection for HALI survival time in mice. The horizontal reference line is the critical 2lnBF value of 2.1 for the significance test. The genome consists of 20 linkage groups that are separated by the vertical dotted lines. The 20 linkage groups are drawn to scale and proportional to their chromosomal lengths. Positions of the markers are indicated by the ticks on the horizontal axis. The thick solid, dashed and thin solid lines represent additive, dominance and imprinting effects, respectively.
Table 4.
Estimates for QTL parameters and imprinting types obtained with Bayesian model selection for HALI survival in mice; 2logBFs for genetic effects are shown in parentheses
QTL number | Chromosome; Position | 2LogBF | a | d | i | Imprinting type |
---|---|---|---|---|---|---|
1 | 1; 24 | 7.37 | −0.26 (7.06) | −0.12 (0.00) | −0.02 (5.53) | Complete paternal |
2 | 1; 78 | 3.12 | 0.06 (0.00) | 0.11 (5.12) | 0.05 (0.00) | Mendelian dominance |
3 | 2; 64.2 | 2.59 | 0.11 (0.00) | 0.08 (3.78) | 0.03 (0.00) | Mendelian dominance |
4 | 4; 43.7 | 7.72 | 0.09 (5.88) | 0.07 (0.00) | 0.04 (3.19) | Partial paternal |
5 | 9; 48.2 | 5.28 | 0.13 (0.00) | 0.05 (3.45) | 0.03 (3.87) | Polar over-dominance |
6 | 15; 10.4 | 7.58 | 0.13 (6.74) | 0.06 (0.00) | 0.12 (5.53) | Complete paternal |
7 | 17; 30.4 | 2.56 | 0.14 (0.00) | 0.08 (2.91) | 0.04 (0.00) | Mendelian dominance |
Fig. 4.
The profile of LR test statistics obtained with maximum likelihood interval mapping for HALI survival time in mice. The horizontal reference line is the empirical critical value at the significant level of 0.05 for the LR test statistic generated from permutation tests. Linkage groups are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.
5 DISCUSSION
Based on Mantey's imprinting genetic model (Mantey et al., 2005), we derived a genomic imprinting model with multiple iQTL. Bayesian model selection was employed to identify imprinted loci and estimate imprinting genetic effects. The main feature of this Bayesian mapping is that it allows MCMC sampling of iQTL parameters to be carried out in a reduced model space, by presetting a maximum number of detectable QTL and using latent binary variables that indicate whether the main effects of putative QTL are included in or excluded from the model, thus greatly enhancing the computational efficiency of multiple iQTL mapping. On the basis of Cheverud's classification (Cheverud et al., 2008), we summarize a total of seven imprinting types that provide an exhaustive classification scheme for imprinting patterns (Table 1). Bayesian mapping can easily implement statistical inference for the imprinting types of detected iQTL with appropriate BFs formulated from the posterior probabilities of the genetic effects being compared.
Currently, there are two straightforward approaches for mapping iQTL: association analysis of phenotype with genetic markers (Cheverud et al., 2008) and interval mapping (e.g. Mantey et al., 2005). Both methods are based on a single marker/QTL model. With information from sparse markers, interval mapping can estimate the positions of iQTL by scanning each position over the genome. Our proposed method is based on a multiple-QTL model and implemented through Bayesian model selection, and it provides higher detection power than interval mapping, as demonstrated in the two examples. It should be noted that our model considers only the main effects of each imprinted locus. In fact, beyond these main effects, interactions between imprinted loci may also influence the phenotypic variation of quantitative traits. When mapping multiple interacting imprinted loci, greater computational demand will be required because of the potential 16 interacting effects caused by a pair of imprinting loci where each has four types of main effects. Fortunately, Bayesian model selection for mapping the main effects of imprinted loci can be easily extended to analyze complex genomic imprinting architecture with high computational efficiency.
In principle, neither of the traits analyzed in the case studies is a simple quantitative trait: body weight growth is a dynamic developmental trait and survival time is a time-to-event trait. For dynamic developmental traits, the time points measured over the course of growth and development are typically analyzed separately, without considering the transitional relationship among developmental stages. Thus, it is necessary to incorporate a time-dependent developmental pattern into the mapping of iQTL for dynamic developmental traits. In general, survival time does not follow a normal distribution. Some specific statistical approaches, such as parametric and non-parametric models, are available for mapping Mendelian QTL with survival traits (Diao and Lin, 2005; Diao et al., 2004). Those approaches usually require solving non-linear equations and are therefore not appropriate for simultaneously identifying multiple genomic imprinting loci. However, the accelerated failure time model is an exception due to its linearity (Kalbfleisch and Prentice, 2002). In a simplified case of this method, the survival time is log-transformed and a linear model is then used to analyze the normalized data.
In conclusion, Bayesian model selection provides a statistically efficient way not only to estimate the genetic effects of QTL over the genome, but also to characterize their genetic modes. Our method was developed for an F2 population generated from either an intercross or a reciprocal cross of inbred strains, but it can also be applied to data from backcross and reverse backcross populations without considering the dominance effect. It is also desirable to extend this method to analyze longitudinal/dynamic quantitative traits, and to further evaluate the interactions among genes inherited in various genetic fashions.
ACKNOWLEDGMENTS
We thank James Cheverud for providing us with his mouse data.
Funding: National Natural Science Foundation of China (grant no. 30972077); Clinical and Translational Sciences Award (UL1 RR024128) of the National Institutes of Health.
Conflict of Interest: none declared.
REFERENCES
- Allis CD, et al. Epigenetics. New York: Cold Spring Harbor Laboratory Press; 2007.
- Ball RD. Bayesian methods for quantitative trait loci mapping based on model selection: approximate analysis using the Bayesian information criterion. Genetics. 2001;159:1351–1364. doi: 10.1093/genetics/159.3.1351.
- Cheverud JM, et al. Quantitative trait loci for murine growth. Genetics. 1996;142:1305–1319. doi: 10.1093/genetics/142.4.1305.
- Cheverud JM, et al. Genomic imprinting effects on adult body composition in mice. Proc. Natl Acad. Sci. USA. 2008;105:4253–4258. doi: 10.1073/pnas.0706562105.
- Cui Y, et al. Model for mapping imprinted quantitative trait loci in an inbred F2 design. Genomics. 2006;87:543–551. doi: 10.1016/j.ygeno.2005.11.021.
- de Koning DJ, et al. Genome-wide scan for body composition in pigs reveals important role of imprinting. Proc. Natl Acad. Sci. USA. 2000;97:7947–7950. doi: 10.1073/pnas.140216397.
- de Koning DJ, et al. On the detection of imprinted quantitative trait loci in experimental crosses of outbred species. Genetics. 2002;161:931–938. doi: 10.1093/genetics/161.2.931.
- de Vicente MC, Tanksley SD. Genome-wide reduction in recombination of backcross progeny derived from male versus female gametes in an interspecific cross of tomato. Theor. Appl. Genet. 1991;83:173–178. doi: 10.1007/BF00226248.
- Diao G, Lin DY. Semiparametric methods for mapping quantitative trait loci with censored data. Biometrics. 2005;61:789–798. doi: 10.1111/j.1541-0420.2005.00346.x.
- Diao G, et al. Mapping quantitative trait loci with censored observations. Genetics. 2004;168:1689–1698. doi: 10.1534/genetics.103.023903.
- Dib C, et al. A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature. 1996;380:152–154. doi: 10.1038/380152a0.
- Dietrich WF, et al. A comprehensive genetic map of the mouse genome. Nature. 1996;380:149–152. doi: 10.1038/380149a0.
- Falconer DS, Mackay TFC. Introduction to Quantitative Genetics. London: Longman; 1996.
- George EI, McCulloch RE. Approaches for Bayesian variable selection. Stat. Sin. 1997;7:339–373.
- Groover AT, et al. Sex-related differences in meiotic recombination frequency in Pinus taeda. J. Hered. 1995;86:157–158.
- Haghighi F, Hodge SE. Likelihood formulation of parent-of-origin effects on segregation analysis, including ascertainment. Am. J. Hum. Genet. 2002;70:142–156. doi: 10.1086/324709.
- Haldane JBS. The part played by recurrent mutation in evolution. Am. Nat. 1922;67:5–9.
- Hanson RL, et al. Assessment of parent-of-origin effects in linkage analysis of quantitative traits. Am. J. Hum. Genet. 2001;68:951–962. doi: 10.1086/319508.
- Hayashi T, Awata T. A Bayesian method for simultaneously detecting Mendelian and imprinted quantitative trait loci in experimental crosses of outbred species. Genetics. 2008;178:527–538. doi: 10.1534/genetics.107.081521.
- Huxley JS. Sexual difference of linkage in Gammarus chevreuxi. J. Genet. 1928;20:145–156.
- Kalbfleish JD, Prentice RL. The Statistical Analysis of Failure Time Data. Hoboken, NJ: Wiley; 2002.
- Kass RE, Raftery AE. Bayes factors. J. Am. Stat. Assoc. 1995;90:773–795.
- Kiefer JC. Epigenetics in development. Dev. Dyn. 2007;236:1144–1156. doi: 10.1002/dvdy.21094.
- Knott SA, et al. Methods for multiple-marker mapping of quantitative trait loci in half-sib populations. Theor. Appl. Genet. 1996;93:71–80. doi: 10.1007/BF00225729.
- Knott SA, et al. Multiple marker mapping of quantitative trait loci in a cross between outbred wild boar and large white pigs. Genetics. 1998;149:1069–1080. doi: 10.1093/genetics/149.2.1069.
- Kohn R, et al. Nonparametric regression using linear combinations of basis functions. Stat. Comput. 2001;11:313–322.
- Kuo L, Mallick B. Variable selection for regression models. Sankhya Ser. B. 1998;60:65–81.
- Mantey C, et al. Mapping and exclusion mapping of genomic imprinting effects in mouse F2 families. J. Hered. 2005;96:329–338. doi: 10.1093/jhered/esi044.
- Naumova A, Croteau S. Mechanisms of epigenetic variation: polymorphic imprinting. Curr. Genomics. 2004;84:417–429.
- Neff MW, et al. A second-generation genetic linkage map of the domestic dog, Canis familiaris. Genetics. 1999;151:803–820. doi: 10.1093/genetics/151.2.803.
- Pauler FM, Barlow DP. Imprinting mechanisms–it only takes two. Genes Dev. 2006;20:1203–1206. doi: 10.1101/gad.1437306.
- Plummer M, et al. CODA: output analysis and diagnostics for MCMC, v. 0.9-5. Beachwood, OH: Institute of Mathematical Statistics; 2004. (http://www-fis.iarc.fr/coda)
- Prows DR, et al. A genetic mouse model to investigate hyperoxic acute lung injury survival. Physiol. Genomics. 2007a;30:262–270. doi: 10.1152/physiolgenomics.00232.2006.
- Prows DR, et al. Genetic analysis of hyperoxic acute lung injury survival in reciprocal intercross mice. Physiol. Genomics. 2007b;30:271–281. doi: 10.1152/physiolgenomics.00038.2007.
- Raftery AE, et al. Bayesian model averaging for linear regression models. J. Am. Stat. Assoc. 1997;92:179–191.
- Rao S, Xu S. Mapping quantitative trait loci for ordered categorical traits in four-way crosses. Heredity. 1998;81(Pt 2):214–224. doi: 10.1046/j.1365-2540.1998.00378.x.
- Sandovici I, et al. Familial aggregation of abnormal methylation of parental alleles at the IGF2/H19 and IGF2R differentially methylated regions. Hum. Mol. Genet. 2003;12:1569–1578. doi: 10.1093/hmg/ddg167.
- Sandovici I, et al. Interindividual variability and parent of origin DNA methylation differences at specific human Alu elements. Hum. Mol. Genet. 2005;14:2135–2143. doi: 10.1093/hmg/ddi218.
- Shete S, Amos CI. Testing for genetic linkage in families by a variance-components approach in the presence of genomic imprinting. Am. J. Hum. Genet. 2002;70:751–757. doi: 10.1086/338931.
- Shete S, et al. Genomic imprinting and linkage test for quantitative-trait loci in extended pedigrees. Am. J. Hum. Genet. 2003;73:933–938. doi: 10.1086/378592.
- Sillanpaa MJ, Corander J. Model choice in gene mapping: what and why. Trends Genet. 2002;18:301–307. doi: 10.1016/S0168-9525(02)02688-4.
- Tuiskula-Haavisto M, et al. Quantitative trait loci with parent-of-origin effects in chicken. Genet. Res. 2004;84:57–66. doi: 10.1017/s0016672304006950.
- Vaughn TT, et al. Mapping quantitative trait loci for murine growth: a closer look at genetic architecture. Genet. Res. 1999;74:313–322. doi: 10.1017/s0016672399004103.
- Wang H, et al. Bayesian shrinkage estimation of quantitative trait loci parameters. Genetics. 2005;170:465–480. doi: 10.1534/genetics.104.039354.
- Wood AJ, Oakey RJ. Genomic imprinting in mammals: emerging themes and established theories. PLoS Genet. 2006;2:e147. doi: 10.1371/journal.pgen.0020147.
- Yi N, et al. Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics. 2005;170:1333–1344. doi: 10.1534/genetics.104.040386.
- Yi N, et al. Bayesian mapping of genomewide interacting quantitative trait loci for ordinal traits. Genetics. 2007;176:1855–1864. doi: 10.1534/genetics.107.071142.
- Zhang YM, Xu S. Advanced statistical methods for detecting multiple quantitative trait loci. Recent Res. Devel. Genet. Breeding. 2005;2:1–23.