Abstract
Epigenetics has become one of the major areas of biological research. However, the degree of phenotypic variability that is explained by epigenetic processes still remains unclear. From a quantitative genetics perspective, the estimation of variance components is achieved by means of the information provided by the resemblance between relatives. In a previous study, this resemblance was described as a function of the epigenetic variance component and a reset coefficient that indicates the rate of dissipation of epigenetic marks across generations. Given these assumptions, we propose a Bayesian mixed model methodology that allows the estimation of epigenetic variance from a genealogical and phenotypic database. The methodology is based on the development of a T matrix of epigenetic relationships that depends on the reset coefficient. In addition, we present a simple procedure for the calculation of the inverse of this matrix (T−1) and a Gibbs sampler algorithm that obtains posterior estimates of all the unknowns in the model. The new procedure was used with two simulated data sets and with a beef cattle database. In the simulated populations, the results of the analysis provided marginal posterior distributions that included the population parameters in the regions of highest posterior density. In the case of the beef cattle dataset, the posterior estimate of transgenerational epigenetic variability was very low and a model comparison test indicated that a model that did not included it was the most plausible.
Keywords: epigenetics, Bayesian analysis, genetic variance, resemblance between relatives
Epigenetics studies variations in gene expression that are not caused by modifications of the DNA sequence (Eccleston et al. 2007) and is now considered as one of the most important fields of biological research. Several biochemical mechanisms that alter gene activity (DNA methylation, histone modifications, etc.) underpin epigenetic processes (Jablonka and Raz 2009; Law and Jacobsen 2010). A number of studies have emphasized the importance of epigenetics in cancer (Esteller 2008) and other human illnesses (Feinberg 2007) as well as in other mammal (Reik 2007) and plant traits (Henderson and Jacobsen 2007). Gene activity modifications may occasionally occur in a sperm or an egg cell and can be transferred to the next generation, denoted as transgenerational epigenetic inheritance (Youngson and Whitelaw 2008), a phenomenon that has been reported in a wide range of organisms (Jablonka and Raz 2009). Despite this, the magnitude of the phenotypic variation that is explained by epigenetic processes still remains unclear (Grossniklaus et al. 2013; Heard and Martienssen 2014).
From the perspective of quantitative genetics, the presence of transgenerational epigenetic inheritance involves a redefinition of the covariance between relatives. Tal et al. (2010) developed a model for the calculation of the covariance between relatives for asexual and sexual reproduction as a function of epigenetic heritability (), the reset coefficient (v), and its complement, the epigenetic transmission coefficient (1 − v). According to these authors, the covariance between relatives is reduced as the number of opportunities to dissipate (or reset) the epigenetic marks increase. Therefore, for sexual diploid organisms, the covariance between parent and offspring is greater than the covariance between full sibs, and the covariance between half sibs is greater than the covariance between an uncle and its nephew, despite the fact that the additive numerator relationship between them is identical (0.5 and 0.25, respectively).
In animal and plant breeding, the standard procedure for estimating variance components is by means of a linear mixed model (Henderson 1984) that includes systematic or random environmental effects, one or more random genetic effects, and a residual. The variance components are estimated by using likelihood-based procedures (Patterson and Thompson 1971) or Bayesian approaches (Gianola and Fernando 1986). Under the Bayesian paradigm, the standard procedure for determining the posterior distribution of the variance components uses the Gibbs Sampler algorithm (Gelfand and Smith 1990) that involves an updated iterative sampling scheme from the full conditional distributions of all the unknowns in the model.
The aim of this study is to present a Bayesian linear mixed model that allows one to estimate the heritable epigenetic variability, the reset coefficient, and the epigenetic transmission coefficient based on genealogical and phenotypic information. The model is illustrated with two simulated datasets and with one example that considers a beef cattle database.
Material and Methods
Statistical model
The standard mixed linear model is described by the following equation:
where y is the vector of phenotypic records, b is the vector of systematic effects, u is the vector of random additive genetic effects, and e is the vector of residuals. Then, X and Z are the matrices that link the systematic and additive genetic effects with the data. The usual assumption for the prior distribution of u and e are the following multivariate Gaussian distributions (MVN):
where and are the additive genetic and residual variances, respectively, and A is the numerator relationship matrix. Further, the prior distribution of b is commonly assumed to be a uniform distribution. Conjugate priors for the variance components are the following inverted chi-square distributions:
where and are the prior values of the variances and and are their corresponding prior “degrees of belief”.
To estimate transgenerational epigenetic variability, this standard model can be expanded to:
where w is the vector of individual epigenetic effects. Note that the incidence matrix (Z) is the same for u and w and that both genetic and epigenetic effects are assumed to be independent. The prior distribution of w is defined as:
where T is the matrix of epigenetic relationships between individuals and is the transgenerational epigenetic variance. As before, the prior distribution of is defined as:
with hyperparameters and .
The structure of the T matrix is defined by the recursive relationship between the epigenetic effect of one individual () with respect to the epigenetic effects of its father () and mother ():
where
As defined by Tal et al. (2010), v is the reset coefficient and (1 − v) is the epigenetic transmission coefficient. The reset coefficient represents the proportion of epigenetic marks across the parental genome that are expected to be erased, whereas its opposite, the epigenetic transmission coefficient, indicates the proportion that are transmitted. Furthermore, is the residual epigenetic effect of the ith individual, independent from wi,, and whose distribution is:
assuming that the variance of transgenerational epigenetics effects (V()) is constant across generations:
In matrix notation:
(1) |
where the P matrix defines a recurrent relationship with the epigenetic effects of the father and mother. For nonbase individuals, the ith row of the P matrix contains a parameter λ in the column pertaining to the father and mother of the ith individual. The rest of elements are null.
Furthermore if:
then
where is a diagonal matrix with entries equal to for base individuals, for individuals with one known ancestor, and for individuals whose father and mother are known. The prior distribution for λ will be assumed to be uniform, between 0 and 0.5.
Parameter estimation through a Gibbs sampler
The Gibbs sampler algorithm is an iterative, updating sampling scheme that obtains samples from the marginal posterior distributions of all the unknowns in a model. It requires samples from the full conditional distributions of all the parameters. In the proposed model, the set of parameters can be classified into three main groups: i) the location parameters (b, u, and w); ii) the parameter λ associated with matrix T; iii) the variance components (, , and ).
The full conditional distributions for these groups are:
Sampling of the location parameters (b, u, w):
The conditional distributions of the location parameters are univariate Gaussian (Wang et al. 1994), with parameters drawn from the following mixed model equations:
where
and and .
Specifically, the full conditional distribution for the ith location parameter (si) is:
given the multivariate Gaussian nature of the conditional distribution of the location parameters.
A key limitation of the procedure is the calculation of the A−1 and T−1 matrices. Whereas A−1 is calculated by the standard Henderson’s rules (Henderson 1976) only once, the T−1 matrix needs to be calculated afresh in each cycle as the parameter λ is updated. An algorithm to set-up the T−1 matrix is presented in the appendix of this work (see Appendix).
The full conditional distribution of λ:
The conditional distribution of λ is developed from the recursive definition of the epigenetic effects (equation 1). It is a truncated univariate Gaussian distribution (TN) between 0 and 0.5 due to the prior distribution of the parameter.
where n1 is the number of individuals with known fathers and mothers, n2 is the number of individuals with only father known, and n3 is the number of individuals with only mother known.
The full conditional distributions of the variance components ():
The conditional distributions for the variance components are the following inverted χ2 distributions
where nan is the number of individuals in the population, ndat is the number of phenotypic data, and and are the hyperparameters for .
Simulated datasets
To evaluate the procedure, we simulated two different datasets. The first one had a relative high percentage of variation caused by additive and transgenerational epigenetic effects (35% and 20%, respectively) and an intermediate reset coefficient (v = 0.40). In the second dataset, a lower percentage of variation was explained by additive genetic and transgenerational epigenetic effects (15% and 10%, respectively) and the reset coefficient was greater (v = 0.80). Each dataset was composed by a three-generation population. The base populations included 3000 individuals (1500 sires and 1500 dams) and each of the two subsequent generations were composed by 3000 full sib families of 10 individuals each one. The sire and dam for each family were sampled randomly from the individuals of the previous generation. The genetic and epigenetic effects for each individual (ui and wi) in the base population were generated from the following Gaussian distributions:
Furthermore, the genetic and epigenetic effects for the individuals (uj and wj) of the second and third generation were obtained by sampling from:
where and and and were the additive genetic and transgenerational epigenetic effects of the father and the mother of the jth individual, respectively. In addition, one phenotypic record (yi) was generated for every individual by:
where μ was a general mean set to 100 units. The values of the parameters for the two datasets were:
- Dataset 1.
being
- Dataset 2.
The Gibbs sampler was implemented using own software written in Fortran 95. The analysis consisted of a single chain of 1,250,000 cycles, and the first 250,000 were discarded. Each analysis took 36 hr using a single thread of an Intel Xeon E5-2650 of 2.00 GHz. The source code is included as Supporting Information, File S1. Convergence was checked by the visual inspection of the chains and with the test of Raftery and Lewis (1992). All samples were stored for calculating summary statistics.
Pirenaica beef cattle data
As an illustrative example, we used a database of phenotypic records from the yield recording system of the Pirenaica beef cattle breed. The Pirenaica breed is a meat-type beef population from northern Spain with an approximate census of 20,000 individuals that are typically reared under extensive conditions (Sánchez et al. 2002). The data set was made up of 78,209 records for birth weight with an average value of 41.52 kg and a raw standard deviation of 4.65 kg. In addition, a pedigree file including 125,974 individual-sire-dam records was used. This information was provided by the National Pirenaica Breeders Confederation (Confederación Nacional de Asociaciones de Ganado Pirenaico; http://www.conaspi.net). Animal Care and Use Committee approval was not required for this study as field data were obtained from the Yield Recording System of the Pirenaica breed; furthermore, data were recorded by the stockbreeders themselves, under standard farm management, with no additional requirements.
The full model of analysis was:
Where b is the vector of fixed systematic effects that included sex (2 levels) and age of the mother (16 levels), u and m are the vectors of direct and maternal additive genetic effects with 125,974 elements, p is the vector of permanent environmental maternal effects (21,143 levels), h is the vector of the random herd-year-season effects (12,925 levels), w is the vector of transgenerational epigenetic effects (125,974 levels), and e is the vector of the residuals. Furthermore, X, Z1, Z2, and Z3 are the incidence matrices that link the different effects with the phenotypic data. Appropriate uniform bounded distributions were assumed for the systematic effects and for each variance component (), defined by hyperparameters (nx = −2 and s2x = 0). Furthermore, the prior distribution for the (co) variance matrix between direct and maternal genetic effects (G) was the following inverted Wishart:
where
being and the direct and maternal genetic variances and the covariance between them. In this case, was set to −3 and G0 to a 2 × 2 matrix of zeroes to define a uniform distribution.
Two statistical models (I and II) were fitted. Model I includes transgenerational epigenetic variance (), whereas Model II does not. For each model, the Gibbs sampler was implemented with a single chain of 3,250,000 cycles, and the first 250,000 were discarded. Convergence was checked by the visual inspection of the chains and with the test of Raftery and Lewis (1992). The models were compared based on the pseudo-log-marginal probability of the data (Gelfand 1996; Varona and Sorensen 2014) by computing the logarithm of the Conditional Predictive Ordinate (LogCPO) calculated from the Markov chain Monte Carlo (MCMC) samples. It was calculated as:
where y-i is the vector of data with the ith datum (yi) deleted, Mk is the kth model and
being Ns is the number of MCMC draws, is the jth draw from the posterior distribution of the parameters of the kth model.
Results and Discussion
From the perspective of quantitative genetics, the estimation of variance components is obtained from the statistical information provided by the covariance between the phenotypes of relatives (Falconer and Mckay 1996). Tal et al. (2010) proposed a simple model for the resemblance between relatives under transgenerational epigenetic inheritance for asexual and sexual reproduction. This model assumes that transgenerational epigenetic effects are not correlated with the additive genetic ones, and that they are distributed under a Gaussian law. After the central limit theorem, this may be explained by a large number of epigenetic marks randomly distributed across the genome. In addition, the distribution pattern of these marks is assumed independent between individuals unless they are relatives. In that case, some of these marks should have not been erased during the meiosis drawing them apart from their common ancestor. The population rate of erasure of these marks is measured through the reset coefficient (v).
In this study, we propose a procedure that makes use of the aforementioned authors’ definition for sexual diploid organisms to estimate the parameters under a Bayesian mixed model framework. Our approach makes it feasible to estimate transgenerational epigenetic variability from huge datasets of genealogical and phenotypic data. The key to the procedure is the definition of a T matrix of (co) variance between transgenerational epigenetic effects. This matrix exclusively depends on a single parameter (λ) which is directly related to the reset coefficient (v) put forward by Tal et al. (2010). As an example, for a simple pedigree of seven individuals (Figure 1), the T matrix is:
As Tal et al. (2010) stated, if we compare the covariance between full sibs (individuals 5 and 6) and the covariance between sire-offspring (individuals 1 and 4), the covariance is greater for the sire offspring pair, λ vs. , as λ is defined as being within the interval (0, 0.5). Note that if the reset coefficient is 0, then λ is 1/2 and the T matrix becomes equivalent to the numerator relationship matrix (A), where the covariances between full sibs and sire offspring are equivalent. Similarly, the uncle−nephew covariance (individuals 5 and 7) is , lower than the covariance between half sibs (individuals 4 and 5) which is only .
The mixed model implementation of the covariance between relatives requires the inverse of the T matrix to construct the mixed model equations, in a similar manner as the numerator relationship matrix in the standard model (Henderson 1984). In this study, we describe a simple procedure for the calculation of this T-1 matrix (see the Appendix section). The procedure takes into account the recursive nature of the transgenerational epigenetic effects, using an argument equivalent to Quass (1976), for the inverse of numerator relationship matrix (A−1), and Quintanilla et al. (1999), for dam-related permanent maternal environmental effects. The main consequence is that the inverse of the T matrix can be sequentially constructed by reading the pedigree of the population, in the light of very simple rules. These rules are close to Henderson’s (1976) rules for constructing the inverse of the A matrix.
For the example pedigree, the T−1 matrix is:
Given this algorithm to set-up the T−1 matrix, the implementation of a Gibbs sampling approach becomes straightforward; it merely requires sequential iterative sampling from Gaussian (b, u, and w), truncated Gaussian (λ) and inverted χ2 distributions (). The only minor complication is that the T−1 matrix depends on parameter λ and consequently must be calculated in each Gibbs sampler’s iteration. One possible alternative to avoid this matrix inversion, although not to avoid its updating in each cycle, is to implement an approach similar to the one proposed by Rodríguez-Ramilo et al. (2014) for the genomic relationship matrix. However, setting up the T matrix is computationally more demanding than the calculation of its inverse with the algorithm proposed in this study.
The proposed procedure was first checked with simulated data. The summary of the marginal posterior distributions for the variance components, additive genetic (h2), and epigenetic () heritability, reset coefficient (v), and epigenetic transmission coefficient (1 − v) for both cases of simulation are presented in Table 1. The greatest posterior density at 95% for all parameters in the model included the simulated values. However, some details of the results should be highlighted. In the second case of simulation, with lower and λ, the posterior standard deviation (and the HPD95) for the and were remarkable wider (66.19 and 67.42, respectively). The cause of this phenomenon is that there is a statistical confounding between the epigenetic and residual variance components when the reset coefficient (v) is very high, because the T and I matrices becomes very similar. To illustrate this fact, in Figure 2 and Figure 3, we present the joint posterior densities of and v for both cases of simulation. The results of the first case of simulation showed posterior independence between both parameters, whereas in the second, the marginal posterior density presented a half-moon shape. This indicates that for large values of v, (and ) may take any value on its support with a fairly equal probability. In other words, in the first simulated case the model showed very good ability to discriminate between both parameters, whereas in the second it did not. In addition, mixing of the MCMC procedure in the second case of simulation was clearly worst, and adaptive MCMC algorithms, such as the proposed by Mathew et al. (2012) may represent an interesting alternative for its implementation in large data sets.
Table 1. PM, PSD, and HPD95 for simulation cases I and II.
Case I | Case II | |||||
---|---|---|---|---|---|---|
Parameter | PM | PSD | HPD95 | PM | PSD | HPD95 |
217.87 | 18.21 | 174.66−247.49 | 90.91 | 4.63 | 81.42−99.25 | |
132.92 | 14.42 | 106.30−164.53 | 111.99 | 66.19 | 37.45−288.61 | |
251.00 | 19.71 | 205.07−282.31 | 396.30 | 67.42 | 217.79−475.16 | |
0.362 | 0.029 | 0.291−0.408 | 0.152 | 0.007 | 0.136−0.164 | |
0.221 | 0.024 | 0.176−0.274 | 0.187 | 0.110 | 0.062−0.481 | |
λ | 0.256 | 0.055 | 0.148−0.365 | 0.091 | 0.059 | 0.020−0.233 |
v | 0.488 | 0.111 | 0.270−0.704 | 0.818 | 0.117 | 0.534−0.960 |
1 − v | 0.512 | 0.111 | 0.296−0.730 | 0.182 | 0.117 | 0.040−0.466 |
PM, posterior mean estimate; PSD, posterior standard deviation; HPD95, highest posterior density at 95%;, additive genetic variance, , transgenerational epigenetic variance;, residual variance. Moreover, h2, heritability; , transgenerational epigenetic heritability;λ, autorecursive parameter, v, the reset coefficient; 1 − v, epigenetic transmission coefficient.
It must be noted that the sources of information for the estimation of and v (or λ) comes from the comparison between the covariance between different categories of relatives (see Table 2). Following Tal et al. (2010), one estimate of v can be achieved from the difference between the estimates of covariance between half-sib and uncle−nephew relationships, a difference that becomes almost null for low values of λ (or high v), as seen in the second case of simulation (23.10 vs. 22.62). On the contrary, with greater values of λ(or lower v), as in the first case of simulation, this difference is higher (60.60 vs. 57.36), and, thus, the amount of information available for estimation purposes increases. This uncertainty in the estimation of λ (or v) is also reflected in the estimation of () and and implies wider posterior marginal densities. Nevertheless, it is important to emphasize that the joint posterior mode (v = 0.80, = 0.13) in the second case of simulation was very close to the simulated parameters (0.80 and 0.10, respectively).
Table 2. Expected covariance between relatives in cases of simulation I and II.
Case I | Case II | ||
---|---|---|---|
Relatives | Expected Covariance | ||
Offspring−Progeny | 132 | 51 | |
Full-Sibs | 121.2 | 46.2 | |
Half-Sibs | 60.6 | 23.1 | |
Uncle−Nephew | 57.36 | 22.62 |
, additive genetic variance;, transgenerational epigenetic variance;λ, autorecursive parameter.
We have also checked our model with a real dataset. In Table 3, summary statistics of the marginal posterior distributions for the parameters and LogCPO values are presented for models I and II in the analysis of birth weight from the Pirenaica beef cattle population. The posterior mean estimates for heritability ranged between 0.38 (Model I) to 0.41 (Model II), and the posterior mean estimates for transgenerational epigenetic heritability was only 0.04 (Model I), with a greatest posterior density at 95% that ranged between 0.00 and 0.11. These results are coherent with the output of the model comparison test (LogCPO), which pointed to Model II as the more plausible. If we focus in this latter model, the estimates of direct and maternal heritability were within the range of estimates in the literature for these traits (Meyer 1992; Varona et al. 1999; Jamrozik and Miller 2014), and the absence of the epigenetic transgenerational heritability is coherent with the fact that most of the epigenetic marks are erased during the meiosis in mammals (Jablonka and Raz 2009, Schmitz 2014). However, it should be highlighted that the posterior mean estimate of the reset coefficient under Model I was surprisingly low (0.20), although its posterior standard deviation was very high (0.20), and that the HPD95 region covered almost all the parametric space (0.01−0.89). This wide range reflects the absence of information to properly estimate the parameter, given the low magnitude of .
Table 3. PM, PSD, and HPD95 for models I and II.
Model I | Model II | |||||
---|---|---|---|---|---|---|
PM | PSD | HPD95 | PM | PSD | HPD95 | |
9.033 | 0.914 | 7.059, 10.449 | 9.897 | 0.408 | 9.123, 10.727 | |
3.574 | 0.212 | 3.168, 3.998 | 3.554 | 0.217 | 3.135, 3.981 | |
−4.473 | 0.253 | −4.979, −3.990 | −4.508 | 0.259 | −5.032, −4.013 | |
0.747 | 0.081 | 0.590, 0.906 | 0.765 | 0.079 | 0.611, 0.922 | |
0.876 | 0.820 | 0.000, 2.701 | − | − | − | |
2.634 | 0.069 | 2.501, 2.771 | 2.633 | 0.069 | 2.499, 2.770 | |
7.055 | 0.223 | 6.606, 7.478 | 7.156 | 0.209 | 6.735, 7.556 | |
λ | 0.398 | 0.103 | 0.056, 0.494 | − | − | − |
v | 0.204 | 0.207 | 0.012, 0.888 | − | − | − |
1 − v | 0.796 | 0.207 | 0.112, 0.988 | − | − | − |
0.377 | 0.035 | 0.299, 0.427 | 0.412 | 0.012 | 0.389, 0.436 | |
0.149 | 0.007 | 0.135, 0.164 | 0.148 | 0.007 | 0.133, 0.162 | |
0.036 | 0.034 | 0.000, 0.113 | − | − | − | |
LogCPO | −287542.3 | −287475.1 |
PM, posterior mean estimate; PSD, posterior standard deviation; HPD95, highest posterior density at 95%;, additive genetic variance, , maternal environmental variance, , covariance between them, , permanent maternal environmental variance, , herd-year-season variance, , transgenerational epigenetic variance, and , residual variance. Moreover, h2, heritability, , maternal heritability, , transgenerational epigenetic heritability, λ, autorecursive parameter, v, reset coefficient, and 1 − v, epigenetic transmission coefficient. LogCPO, logarithm of the conditional predictive ordinate.
It is worth noting that the procedure also can provide epigenetic breeding values in the fields of animal and plant breeding. These are calculated by weighting the phenotypic information of relatives according to the magnitude of the elements of the T matrix. In fact, for epigenetic effects, the weight of distantly related individuals is lowered as the number of opportunities to reset the epigenetic marks increases. Both additive genetic and epigenetic effects can be used for prediction of the future performance of the individual: the expected genetic response (R) after one cycle of mass selection should be , where i is the intensity of selection and is the phenotypic variance. However, further research must be undertaken to develop adequate indices of selection that consider genetic and epigenetic effects. Although both of them affect the immediate future performance of the offspring, epigenetic effects are diluted in future generations as the epigenetic marks are cleaned. It is important to note that if this selection pressure is relaxed, the average epigenetic effect declines to zero with a rate of , where n is the number of generations without selection.
It should be further noted that the proposed model uses a very basic definition of transgenerational epigenetic inheritance, as it assumes equal epigenetic variance for all individuals in the population. However, when phenotypic records are recorded through the life of the individual, transgenerational epigenetic variance can be modeled as age dependent, based on the consideration that the number of epigenetic marks was accumulated in the genome of the individuals. Moreover, in the proposed model, it is assumed that the λ parameter (or the reset coefficient) is equal for fathers and mothers, and, in future research, it seems reasonable to assume a different transmission coefficient for males and females, allowing for the presence of sex differential genomic imprinting (Barlow 1997; Reik and Walker 1998). The model can be refined by the inclusion of a hierarchical Bayesian paradigm that includes systematic effects for any environmental factors that may influence the epigenetic transmission coefficient. In addition, a more profound approach could even allow for the consideration of a genetic determinism of the reset coefficient with a model similar to that proposed by Varona et al. (2008) for the degree of asymmetry of a skewed Gaussian distribution.
Finally, and as mentioned by Jablonka and Raz (2009) and Tal et al. (2010), epigenetics may be viewed as a more wide ranging concept that could include several types of cultural transmission (Avital and Jablonka 2000; Richerson and Boyd 2005). The procedure suggested in this work also could be applied to the analysis of human or animal datasets that involve any kind of transgenerational transmission, even those not directly related to gene expression.
By way of a conclusion, we can say that this paper presents an original procedure for the estimation of transgenerational epigenetic variability based on some generalized assumptions. It is hoped that the proposal will lead to future research on variations of epigenetic transmission abilities caused by environmental and/or genetic factors.
Supplementary Material
Acknowledgments
We thank the Pirenaica Breeders Association (Confederación Nacional de Asociaciones de Ganado Pirenaico) for the provision of data used in this work. We also thank the useful comments of the editor and the reviewers. This research was funded by grant AGL2010-15903 (Ministerio de Ciencia e Innovación, Spain).
Appendix
Derivation of the T−1 matrix
The vector of epigenetic effects (w) can be represented as:
where the P matrix defines a recurrent relationship with the epigenetic effects of the father and mother. For nonbase individuals, the ith row of the P matrix contains a parameter λ in the column pertaining to the father and mother of the ith individual. The rest of the elements are null.
Furthermore, if:
then
given that
(A1) |
From , it is possible to define a QT matrix as:
(A2) |
If we consider the structure of and , the QT matrix has a diagonal structure with 1s for base individuals and and for those with one or two known parents, respectively.
Finally, replacing (A2) into (A1):
Taking advantage of this structure, it is possible to define the following algorithm for constructing T−1 by reading the pedigree information and calculating for each individual (i) in the pedigree:
If the father and mother are unknown;
add 1 to the diagonal (i,i)
If only one parent (p) is known;
add to the diagonal (i,i)
add to the elements (i,p), (p,i)
add to the element (p,p)
If both ancestors (p and m) are known;
add to the diagonal (i,i)
add to the elements (i,p), (i,m), (m,i), (p,i)
add to the elements (p,p), (p,m), (m,p), (m,m)
Footnotes
Supporting information is available online at http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.115.016725/-/DC1
Communicating editor: D. J. de Koning
Literature Cited
- Avital E., Jablonka E., 2000. Animal Traditions: Behavioural Inheritance in Evolution. Cambridge University Press, Cambridge. [DOI] [PubMed] [Google Scholar]
- Barlow D. P., 1997. Competition—a common motif for the imprinting mechanism? EMBO J. 16: 6899–6905. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eccleston A., DeWitt N., Gunter C., Marte B., Nath D., 2007. Introduction epigenetics. Nature 447: 395–400. [Google Scholar]
- Esteller M., 2008. Epigenetics of cancer. N. Engl. J. Med. 358: 1148–1159. [DOI] [PubMed] [Google Scholar]
- Falconer D. S., Mckay T. F. C., 1996. Introduction to Quantitative Genetics. Ed. 4 Longmans Green, Harlow, Essex. [Google Scholar]
- Feinberg A. P., 2007. Phenotypic plasticity and the epigenetics of human disease. Nature 447: 433–440. [DOI] [PubMed] [Google Scholar]
- Gelfand A. E., 1996. Model determination using sampling-based methods, pp. 145–161 in Markov Chain Monte Carlo in Practice, edited by Gilks W. R., Richardson S., Spiegelhalter D. J. Chapman & Hall, London. [Google Scholar]
- Gelfand A. E., Smith A. F. M., 1990. Sampling-based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85: 398–409. [Google Scholar]
- Gianola D., Fernando R. L., 1986. Bayesian methods in animal breeding theory. J. Anim. Sci. 63: 217–244. [Google Scholar]
- Grossniklaus U., Kelly W. G., Ferguson-Smith A. C., Pembrey M., Lindquist S., 2013. Transgenerational epigenetic inheritance: how important is it? Nat. Rev. Genet. 14: 228–235. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Heard E., Martienssen R. A., 2014. Transgenerational epigenetic inheritance: myths and mechanisms. Cell 157: 95–109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Henderson C. R., 1976. A simple method for computing the inverse of a numerator relationship matrix used in the prediction of breeding values. Biometrics 32: 69–83. [Google Scholar]
- Henderson C. R., 1984. Application of Linear Models in Animal Breeding. University of Guelph, Guelph, ON. [Google Scholar]
- Henderson I. R., Jacobsen S. E., 2007. Epigenetic inheritance in plants. Nature 447: 418–424. [DOI] [PubMed] [Google Scholar]
- Jablonka E., Raz G., 2009. Transgenerational epigenetic inheritance: prevalence, mechanisms and implications for the study of heredity and evolution. Q. Rev. Biol. 84: 131–176. [DOI] [PubMed] [Google Scholar]
- Jamrozik J., Miller S. P., 2014. Genetic evaluation of calving ease in Canadian Simmentals using birth weight and gestation length as correlated traits. Livest. Sci. 162: 42–49. [Google Scholar]
- Law J. A., Jacobsen S. E., 2010. Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat. Rev. Genet. 11: 204–220. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mathew B., Bauer A. M., Koistinen P., Reetz T. C., Léon J., et al. , 2012. Bayesian adaptive Markov chain Monte Carlo estimation of genetic parameters. Heredity 109: 235–234. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meyer K., 1992. Variance components due to direct and maternal effects for growth traits of Australian beef cattle. Livest. Prod. Sci. 11: 143–177. [Google Scholar]
- Patterson H. D., Thompson R., 1971. Recovery of inter-block information when block sizes are unequal. Biometrika 58: 545–554. [Google Scholar]
- Quass R. L., 1976. Computing the diagonal elements of the inverse of a large numerator relationship matrix. Biometrics 32: 949–953. [Google Scholar]
- Quintanilla R., Varona L., Pujol M. R., Piedrafita J., 1999. Maternal animal model with correlation between maternal environmental effects of related dams. J. Anim. Sci. 77: 2904–2917. [DOI] [PubMed] [Google Scholar]
- Raftery A. E., Lewis S. M., 1992. How many iterations in the Gibbs sampler, pp. 763–774 in Bayesian Statistics IV, edited by Bernardo J. M., Berger J. O., Dawid A. P., Smith A. F. M. Oxford University Press, New York. [Google Scholar]
- Reik W., 2007. Stability and flexibility of epigenetic gene regulation in mammalian development. Nature 447: 425–432. [DOI] [PubMed] [Google Scholar]
- Reik W., Walker J., 1998. Imprinting mechanisms in mammals. Curr. Opin. Genet. Dev. 8: 154–164. [DOI] [PubMed] [Google Scholar]
- Richerson P. J., Boyd R., 2005. Not by Genes Alone: How Culture Transformed Human Evolution. University of Chicago Press, Chicago. [Google Scholar]
- Rodríguez-Ramilo S. T., García-Cortés L. A., González-Recio O., 2014. Combining genomic and genealogical information in a reproducing kernel Hilbert spaces regression model for genome-enabled predictions in dairy cattle. PLoS One 9: e93424. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sánchez A., Ambrona J., Sánchez L., 2002. Razas Ganaderas Españolas Bovinas. MAPA-FEAGAS. Spain, Madrid. [Google Scholar]
- Schmitz R. J., 2014. The secret garden—epigenetic alleles underlies complex traits. Science 343: 1082–1083. [DOI] [PubMed] [Google Scholar]
- Tal O., Kisdi E., Jablonka E., 2010. Epigenetic contribution to covariance between relatives. Genetics 184: 1037–1050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Varona L., Sorensen D., 2014. Joint analysis of binomial and continuous traits with recursive model: a case study using mortality and litter size in pigs. Genetics 196: 643–651. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Varona L., Misztal I., Bertrand J. K., 1999. Threshold-linear versus linear-linear analysis of birth weight and calving ease using an animal model: I. Variance component estimation. J. Anim. Sci. 77: 1994–2002. [DOI] [PubMed] [Google Scholar]
- Varona L., Ibañez-Escriche N., Quintanilla R., Noguera J. L., Casellas J., 2008. Bayesian analysis of quantitative traits using skewed distributions. Genet. Res. 90: 179–190. [DOI] [PubMed] [Google Scholar]
- Wang C. S., Rutledge J. J., Gianola D., 1994. Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet. Sel. Evol. 26: 91–115. [Google Scholar]
- Youngson N. A., Whitelaw E., 2008. Transgenerational epigenetic effects. Annu. Rev. Genomics Hum. Genet. 9: 233–257. [DOI] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.