Genetics, Selection, Evolution : GSE. 2015 Apr 19;47(1):32. doi: 10.1186/s12711-015-0091-y

Genomic best linear unbiased prediction method including imprinting effects for genomic evaluation

Motohide Nishio 1, Masahiro Satoh 1
PMCID: PMC4404063  PMID: 25928098

Abstract

Background

Genomic best linear unbiased prediction (GBLUP) is a statistical method used to predict breeding values using single nucleotide polymorphisms for selection in animal and plant breeding. Genetic effects are often modeled as additively acting marker allele effects. However, the actual mode of biological action can differ from this assumption. Many livestock traits exhibit genomic imprinting, which may substantially contribute to the total genetic variation of quantitative traits. Here, we present two statistical models of GBLUP including imprinting effects (GBLUP-I) on the basis of genotypic values (GBLUP-I1) and gametic values (GBLUP-I2). The performance of these models for the estimation of variance components and prediction of genetic values across a range of genetic variations was evaluated in simulations.

Results

Estimates of total genetic variances and residual variances with GBLUP-I1 and GBLUP-I2 were close to the true values and the regression coefficients of total genetic values on their estimates were close to 1. Accuracies of estimated total genetic values in both GBLUP-I methods increased with increasing degree of imprinting and broad-sense heritability. When the imprinting variances were equal to 1.4% to 6.0% of the phenotypic variances, the accuracies of estimated total genetic values with GBLUP-I1 exceeded those with GBLUP by 1.4% to 7.8%. In comparison with GBLUP-I1, the superiority of GBLUP-I2 over GBLUP depended strongly on degree of imprinting and difference in genetic values between paternal and maternal alleles. When paternal and maternal alleles were predicted (phasing accuracy was equal to 0.979), accuracies of the estimated total genetic values in GBLUP-I1 and GBLUP-I2 were 1.7% and 1.2% lower than when paternal and maternal alleles were known.

Conclusions

This simulation study shows that GBLUP-I1 and GBLUP-I2 can accurately estimate total genetic variance and perform well for the prediction of total genetic values. GBLUP-I1 is preferred for genomic evaluation, while GBLUP-I2 is preferred when the imprinting effects are large, and the genetic effects differ substantially between sexes.

Background

Genomic imprinting is an epigenetic process that involves DNA methylation and histone modifications that distinguish the expression of maternal and paternal alleles [1]. The expression of an imprinted gene depends on the parent from which it is inherited. Complete inactivation of an imprinted gene results in functional haploidy, with only one of the two copies of the gene expressed. Well known examples of such imprinted genes are IGF2 (insulin-like growth factor 2) in pigs [2] and the Callipyge gene in sheep [3]. Moreover, imprinting may not entail the complete inactivation of a gene. In a study of peripheral blood leukocytes in humans, four of 38 cases exhibited substantial biallelic expression of IGF2, although the product level of the maternally-derived gene was lower than that of the paternally-derived gene in all cases [4]. Over 70 imprinted genes have been identified in mice [5], and 24 genes with parent-of-origin effects in beef cattle [6]. Furthermore, quantitative traits such as carcass composition, growth, teat number, and litter size have been suggested to exhibit imprinting effects [7-10]. Thus, imprinting effects may substantially contribute to the total genetic variation of quantitative traits.

There are several statistical methods for modeling imprinting effects. Using a mixed model, Schaeffer et al. [11] replaced the numerator relationship matrix with a gametic relationship matrix to calculate the expectation of covariance among relatives with imprinting. Essl and Voith [12] suggested that sire and dam models should be constructed separately to assess differences between paternal and maternal imprinting. Neugebauer et al. [13,14] recently fitted a model with correlated paternal and maternal gametes to simultaneously estimate imprinting variances between sexes in pigs and beef cattle. These methods are based on the traditional best linear unbiased prediction (BLUP) method, which uses only pedigree information. More recently, the genomic BLUP (GBLUP) method was developed by modifying the BLUP method to incorporate single nucleotide polymorphism (SNP) information in the form of a genomic relationship matrix that defines the additive genetic covariance among individuals. GBLUP includes genomic information into breeding value estimation and has been used for genomic selection in dairy cattle [15-18]. Therefore, modeling genetic effects by including imprinting effects is expected to improve the predictive ability of GBLUP. Thus, the objectives of this study were twofold: (1) develop a GBLUP method including imprinting effects (termed GBLUP-I hereafter) and (2) estimate genetic variances and assess the accuracies and unbiasedness of genomic predictions using simulation data with varying degrees of imprinting.

Methods

Genetic model

Spencer [19] first extended the standard two-allele, one-locus model of quantitative genetics to account for imprinting. Following this approach, consider an autosomal biallelic locus with alleles A1 and A2 at frequencies 1−q and q, respectively, in the population. Allele frequencies were assumed to be the same in males and females, and the population was assumed to be in Hardy-Weinberg equilibrium. For a genotype denoted AiAj, Ai and Aj are the paternally- and maternally-derived alleles, respectively. The genotypic values for genotypes A1A1, A1A2, A2A1, and A2A2 are then given by a, d1, d2, and −a, respectively. In this study, the mean of the two heterozygotes and half the difference between them were defined as δ and ε:

$$\delta = \frac{d_1 + d_2}{2}$$

and

$$\varepsilon = \frac{d_1 - d_2}{2}$$

In this model, the heterozygous genotypic values are deviations of +ε and −ε from δ, i.e., d1 and d2 can be rewritten as δ + ε and δ − ε, respectively (Figure 1). With imprinting, reciprocal heterozygotes differ in their genotypic values. For example, in the case of complete inactivation of the maternal allele (i.e., ε = a and δ = 0), the genotypic value of A2A1 is the same as that of A2A2, whereas the genotypic value of A1A2 is the same as that of A1A1. If the paternally-derived A1 allele randomly combines with maternally-derived alleles from the population, the frequencies of the genotypes produced will be 1−q for A1A1 and q for A1A2. The genotypic values of A1A1 and A1A2 are a and d1, respectively. Taking into account the proportions in which they occur, the mean value of the genotypes produced from the paternally-derived A1 allele is (1−q)a + qd1. The mean genotypic value in the entire population (μ) is as follows:

$$\mu = (1-q)^2 a + (1-q)q\,d_1 + (1-q)q\,d_2 - q^2 a = (1-2q)a + 2(1-q)q\,\delta.$$

Figure 1. Genotypic values for the four genotypes (A1A1, A1A2, A2A1, and A2A2).

Thus, the average effect of the paternally-derived A1 allele is calculated from the difference between the mean value of the genotypes produced and population mean as follows:

$$(1-q)a + q\,d_1 - \left[(1-2q)a + 2(1-q)q\,\delta\right] = q\left[a + (2q-1)\delta + \varepsilon\right] = q\,\alpha_m,$$

where αm is the average effect of the allele substitution in the paternal gamete and is equivalent to the male breeding value of Spencer [19]. Similarly, the average effect of the maternally-derived A1 allele is as follows:

$$q\left[a + (2q-1)\delta - \varepsilon\right] = q\,\alpha_f,$$

where αf is the average effect of the allele substitution in the maternal gamete and is equivalent to the female breeding values of Spencer [19]. The average effects of all alleles are in Table 1.

Table 1.

Average effects of paternal and maternal alleles at a QTL with imprinting

| Gamete type | Allele | A1A1 | A1A2 | A2A1 | A2A2 | Mean value of genotypes produced | Average allele effect |
| --- | --- | --- | --- | --- | --- | --- | --- |
| | Genotypic value | a | δ+ε | δ−ε | −a | | |
| Sire | A1 | 1−q | q | | | (1−q)a + q(δ+ε) | q{a + (2q−1)δ + ε} = qαm |
| Sire | A2 | | | 1−q | q | −qa + (1−q)(δ−ε) | −(1−q){a + (2q−1)δ + ε} = −(1−q)αm |
| Dam | A1 | 1−q | | q | | (1−q)a + q(δ−ε) | q{a + (2q−1)δ − ε} = qαf |
| Dam | A2 | | 1−q | | q | −qa + (1−q)(δ+ε) | −(1−q){a + (2q−1)δ − ε} = −(1−q)αf |

The genotype columns give the values and frequencies of the genotypes produced. αm = a + (2q − 1)δ + ε; αf = a + (2q − 1)δ − ε; a = genotypic value of A1A1; δ = mean of the two heterozygotes; ε = half the difference between the two heterozygotes; q = frequency of allele A2.
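The entries of Table 1 can be checked numerically. The Python sketch below is illustrative only: the function names `average_effects` and `direct_sire_a1` are hypothetical helpers and are not part of the original study. It evaluates the closed-form average effects and compares the sire A1 effect with its direct definition (mean of the genotypes produced minus the population mean).

```python
def average_effects(a, d1, d2, q):
    """Population mean and average allele effects at one imprinted locus.

    a      -- genotypic value of A1A1 (A2A2 has value -a)
    d1, d2 -- genotypic values of A1A2 and A2A1 (paternal allele first)
    q      -- frequency of allele A2
    """
    delta = (d1 + d2) / 2                      # mean of the two heterozygotes
    eps = (d1 - d2) / 2                        # half their difference
    mu = (1 - 2 * q) * a + 2 * (1 - q) * q * delta
    alpha_m = a + (2 * q - 1) * delta + eps    # paternal substitution effect
    alpha_f = a + (2 * q - 1) * delta - eps    # maternal substitution effect
    return {
        "mu": mu,
        "sire_A1": q * alpha_m,
        "sire_A2": -(1 - q) * alpha_m,
        "dam_A1": q * alpha_f,
        "dam_A2": -(1 - q) * alpha_f,
    }


def direct_sire_a1(a, d1, d2, q):
    """Sire A1 effect from first principles: mean of genotypes produced minus mu."""
    mu = (1 - q) ** 2 * a + (1 - q) * q * (d1 + d2) - q ** 2 * a
    return (1 - q) * a + q * d1 - mu


# Complete inactivation of the maternal allele: d1 = a and d2 = -a.
effects = average_effects(a=1.0, d1=1.0, d2=-1.0, q=0.3)
```

With the maternal allele fully silenced, `effects["dam_A1"]` and `effects["dam_A2"]` are both zero, as expected for an allele that is not expressed.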

The genotypic deviation of a particular genotype can be calculated from the difference between its genotypic value and the population mean. For example, the genotypic deviation of A1A2 is as follows:

$$d_1 - \mu = (2q-1)a + \left[1 - 2(1-q)q\right]\delta + \varepsilon = (2q-1)\alpha + 2(1-q)q\,\delta + \varepsilon,$$

where α = a + (2q − 1)δ. When there is no imprinting (d1 = d2 = δ and ε = 0), α is the same as the average effect of the allele substitution [20], and (2q − 1)α and 2(1 − q)qδ are the same as the breeding value and dominance deviation of the traditional genetic model. By using δ and ε, a genotypic deviation can be divided into three terms. Under imprinting, the breeding values and dominance deviations are no longer uncorrelated, which means that the total genetic variance cannot be partitioned into the usual additive and dominance variances [19]. Therefore, in this study, the total genetic variance was partitioned into three variances corresponding to the α, δ, and ε terms ($\sigma_{a'}^2$, $\sigma_{d'}^2$, and $\sigma_{i'}^2$) as follows:

$$\sigma_g^2 = 2(1-q)q\,\alpha^2 + \left[2q(1-q)\right]^2\delta^2 + 2(1-q)q\,\varepsilon^2 = \sigma_{a'}^2 + \sigma_{d'}^2 + \sigma_{i'}^2.$$

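When there is no imprinting, $\sigma_{a'}^2$ and $\sigma_{d'}^2$ are the same as the additive and dominance genetic variances, respectively. With this partition, the covariance between the α and δ terms ($\sigma_{a'd'}$) is equal to 0, as follows:

$$\sigma_{a'd'} = (1-q)^2 (2q\alpha)(-2q^2\delta) + (1-q)q\,(2q-1)\alpha \cdot 2(1-q)q\,\delta + (1-q)q\,(2q-1)\alpha \cdot 2(1-q)q\,\delta + q^2\left[-2(1-q)\alpha\right]\left[-2(1-q)^2\delta\right] = 0.$$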

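The covariance between the α and ε terms ($\sigma_{a'i'}$) is also equal to 0, as follows:

$$\sigma_{a'i'} = (1-q)^2 (2q\alpha)(0) + (1-q)q\,(2q-1)\alpha\,\varepsilon + (1-q)q\,(2q-1)\alpha\,(-\varepsilon) + q^2\left[-2(1-q)\alpha\right](0) = 0.$$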

Similarly, the covariance between the δ and ε terms is also equal to 0.

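Alternatively, paternal and maternal gametic variances ($\sigma_{pat}^2$ and $\sigma_{mat}^2$, respectively) can be calculated from the variances of the average effects of paternally- and maternally-derived alleles:

$$\sigma_{pat}^2 = (1-q)\left(q\,\alpha_m\right)^2 + q\left[-(1-q)\alpha_m\right]^2 = (1-q)q\,\alpha_m^2,$$

and

$$\sigma_{mat}^2 = (1-q)\left(q\,\alpha_f\right)^2 + q\left[-(1-q)\alpha_f\right]^2 = (1-q)q\,\alpha_f^2.$$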

The sum of these variances is as follows:

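$$\sigma_{pat}^2 + \sigma_{mat}^2 = (1-q)q\,\alpha_m^2 + (1-q)q\,\alpha_f^2 = 2(1-q)q\left[a + (2q-1)\delta\right]^2 + 2(1-q)q\,\varepsilon^2 = \sigma_{a'}^2 + \sigma_{i'}^2.$$

Thus, the total genetic variance can be partitioned as follows:

$$\sigma_g^2 = \sigma_{pat}^2 + \sigma_{mat}^2 + \sigma_{d'}^2.$$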
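Both partitions can be verified for a single locus by comparing them with the variance computed directly over the four ordered genotypes. The following Python sketch is a numerical check under the Hardy-Weinberg assumptions stated above; the function name `variance_partition` and the example parameter values are hypothetical and chosen only for illustration.

```python
def variance_partition(a, d1, d2, q):
    """Total genetic variance at one locus, computed directly and by partition."""
    p = 1 - q
    delta = (d1 + d2) / 2
    eps = (d1 - d2) / 2
    alpha = a + (2 * q - 1) * delta

    # Direct calculation over the ordered genotypes A1A1, A1A2, A2A1, A2A2.
    freqs = [p * p, p * q, q * p, q * q]
    values = [a, d1, d2, -a]
    mu = sum(f * v for f, v in zip(freqs, values))
    var_direct = sum(f * (v - mu) ** 2 for f, v in zip(freqs, values))

    return {
        "direct": var_direct,
        "a": 2 * p * q * alpha ** 2,            # variance of the alpha term
        "d": (2 * p * q * delta) ** 2,          # variance of the delta term
        "i": 2 * p * q * eps ** 2,              # variance of the epsilon term
        "pat": p * q * (alpha + eps) ** 2,      # alpha_m = alpha + eps
        "mat": p * q * (alpha - eps) ** 2,      # alpha_f = alpha - eps
    }


parts = variance_partition(a=1.0, d1=0.8, d2=-0.2, q=0.3)
```

For these values, `parts["a"] + parts["d"] + parts["i"]` and `parts["pat"] + parts["mat"] + parts["d"]` both agree with `parts["direct"]` up to floating-point rounding.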

Statistical model

Two statistical models of GBLUP-I based on genotypic values (GBLUP-I1) and gametic values (GBLUP-I2) are proposed here.

First, GBLUP-I1 is defined as follows:

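$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_a\mathbf{a} + \mathbf{Z}_d\mathbf{d} + \mathbf{Z}_i\mathbf{i} + \mathbf{e},$$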

where y is the vector of the phenotypes; β is the vector of the fixed effects; a, d, and i are the vectors of α, δ, and ε terms, respectively; X, Za, Zd, and Zi are incidence matrices linking the phenotypes to β, a, d, and i, respectively; and e is the vector of errors. The variances of a, d, and i are as follows:

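$$\mathrm{Var}(\mathbf{a}) = \mathbf{G}_a\sigma_{a'}^2,$$
$$\mathrm{Var}(\mathbf{d}) = \mathbf{G}_d\sigma_{d'}^2,$$

and

$$\mathrm{Var}(\mathbf{i}) = \mathbf{G}_i\sigma_{i'}^2,$$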

where Ga, Gd, and Gi are the genomic relationship matrices relevant to α, δ, and ε terms, respectively. These matrices describe the relationships among genotyped individuals and can be constructed by using the information from genome-wide SNPs. Let A1j and A2j be two alleles at the jth SNP and qj be the frequency of A2j. Ga and Gd are the same as the genomic relationship matrices for breeding values and dominance deviations without imprinting. Thus, Ga and Gd can be calculated as described previously [21,22]:

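$$\mathbf{G}_a = \frac{\mathbf{M}_a\mathbf{M}_a'}{\sum_{j}^{N_{snp}} 2q_j(1-q_j)},$$

and

$$\mathbf{G}_d = \frac{\mathbf{M}_d\mathbf{M}_d'}{\sum_{j}^{N_{snp}} \left[2q_j(1-q_j)\right]^2},$$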

where Ma and Md are n × Nsnp matrices (n is the number of genotyped individuals, and Nsnp is the number of SNPs); the elements of Ma and Md for the ith individual at the jth SNP are calculated as follows:

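$$M_a(i,j) = \begin{cases} 2q_j & A_1A_1 \\ 2q_j - 1 & A_1A_2 \text{ and } A_2A_1 \\ 2q_j - 2 & A_2A_2, \end{cases}$$

and

$$M_d(i,j) = \begin{cases} -2q_j^2 & A_1A_1 \\ 2q_j(1-q_j) & A_1A_2 \text{ and } A_2A_1 \\ -2(1-q_j)^2 & A_2A_2. \end{cases}$$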

Similarly, Mi is assumed to be an n × Nsnp matrix, and the element of Mi for the ith individual at the jth SNP can be calculated as follows:

$$M_i(i,j) = \begin{cases} 0 & A_1A_1 \\ 1 & A_1A_2 \\ -1 & A_2A_1 \\ 0 & A_2A_2. \end{cases}$$

The elements of Ma, Md, and Mi describe the coefficients of the α, δ, and ε terms in Table 2, respectively. Therefore, i and its variance can be derived as follows:

$$\mathbf{i} = \mathbf{M}_i\boldsymbol{\varepsilon},$$

where ε is the Nsnp-dimensional vector of which the jth element is εj. Thus, the variance of i is calculated as follows:

$$\mathrm{Var}(\mathbf{i}) = \mathbf{M}_i\mathbf{M}_i'\,\mathrm{Var}(\varepsilon),$$
$$\sigma_{i'}^2 = \sum_{j}^{N_{snp}} 2q_j(1-q_j)\,\mathrm{Var}(\varepsilon).$$

Table 2.

Genotypic values in the two-allele model

| | A1A1 | A1A2 | A2A1 | A2A2 |
| --- | --- | --- | --- | --- |
| Genotypic value | a | δ+ε | δ−ε | −a |
| Deviation from population mean | 2qa − 2(1−q)qδ | (2q−1)a + {1−2(1−q)q}δ + ε | (2q−1)a + {1−2(1−q)q}δ − ε | −2(1−q)a − 2(1−q)qδ |
| α term | 2qα | (2q−1)α | (2q−1)α | −2(1−q)α |
| δ term | −2q²δ | 2(1−q)qδ | 2(1−q)qδ | −2(1−q)²δ |
| ε term | 0 | ε | −ε | 0 |

α = a + (2q−1)δ; a = genotypic value of A1A1; δ = mean of the two heterozygotes; ε = half the difference between the two heterozygotes; q = frequency of allele A2.

Consequently, Gi can be calculated using Mi:

$$\mathbf{G}_i = \frac{\mathbf{M}_i\mathbf{M}_i'}{\sum_{j}^{N_{snp}} 2q_j(1-q_j)}.$$
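The construction of Ga, Gd, and Gi can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the genotype coding ("11", "12", "21", "22", paternal allele first), the function names, and the toy data are assumptions made for the example, and the A2 frequencies are taken as given rather than estimated.

```python
def coefficient_rows(genotypes, q):
    """Rows of Ma, Md, and Mi for one individual.

    genotypes -- list of codes "11", "12", "21", "22" (paternal allele first)
    q         -- list of A2 allele frequencies, one per SNP
    """
    ma, md, mi = [], [], []
    for g, qj in zip(genotypes, q):
        if g == "11":
            ma.append(2 * qj)
            md.append(-2 * qj ** 2)
            mi.append(0.0)
        elif g in ("12", "21"):
            ma.append(2 * qj - 1)
            md.append(2 * qj * (1 - qj))
            mi.append(1.0 if g == "12" else -1.0)
        elif g == "22":
            ma.append(2 * qj - 2)
            md.append(-2 * (1 - qj) ** 2)
            mi.append(0.0)
        else:
            raise ValueError("unknown genotype code: " + g)
    return ma, md, mi


def cross_product(m, scale):
    """Return M M' / scale as a list of lists."""
    return [[sum(x * y for x, y in zip(r1, r2)) / scale for r2 in m] for r1 in m]


def gblup_i1_matrices(geno, q):
    """Build Ga, Gd, and Gi from phased genotypes of all individuals."""
    rows = [coefficient_rows(ind, q) for ind in geno]
    het = [2 * qj * (1 - qj) for qj in q]
    ga = cross_product([r[0] for r in rows], sum(het))
    gd = cross_product([r[1] for r in rows], sum(h * h for h in het))
    gi = cross_product([r[2] for r in rows], sum(het))
    return ga, gd, gi


# Two individuals, two SNPs; reciprocal heterozygotes at the first SNP.
geno = [["12", "11"], ["21", "22"]]
ga, gd, gi = gblup_i1_matrices(geno, q=[0.5, 0.5])
```

In this toy example the off-diagonal element of Gi is negative because the two individuals carry reciprocal heterozygotes at the first SNP, whereas the corresponding dominance coefficients at that SNP are identical.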

In general, GBLUP includes only breeding values. The statistical model of GBLUP is as follows:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_a\mathbf{a} + \mathbf{e}.$$

Therefore, in the absence of imprinting and dominance effects, GBLUP-I1 reduces to the GBLUP model.
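As a hedged illustration that is not stated explicitly in the text, suppose that a, d, i, and e are mutually independent and that the residuals have variance $\mathbf{I}\sigma_e^2$ (both are assumptions made here). The phenotypic covariance implied by GBLUP-I1 would then be:

```latex
\mathrm{Var}(\mathbf{y}) =
  \mathbf{Z}_a\mathbf{G}_a\mathbf{Z}_a'\,\sigma_{a'}^2
  + \mathbf{Z}_d\mathbf{G}_d\mathbf{Z}_d'\,\sigma_{d'}^2
  + \mathbf{Z}_i\mathbf{G}_i\mathbf{Z}_i'\,\sigma_{i'}^2
  + \mathbf{I}\,\sigma_e^2
```

Setting $\sigma_{d'}^2 = \sigma_{i'}^2 = 0$ leaves only the first and last terms, which is the covariance structure of the GBLUP model above.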

Second, GBLUP-I2 is defined as:

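$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_{pat}\mathbf{pat} + \mathbf{Z}_{mat}\mathbf{mat} + \mathbf{Z}_d\mathbf{d} + \mathbf{e},$$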

where pat and mat are the vectors of paternal and maternal gametic effects, respectively; and Zpat and Zmat are incidence matrices linking phenotypes to pat and mat, respectively. The variances of pat and mat are as follows:

$$\mathrm{Var}(\mathbf{pat}) = \mathbf{G}_{pat}\sigma_{pat}^2,$$

and

$$\mathrm{Var}(\mathbf{mat}) = \mathbf{G}_{mat}\sigma_{mat}^2,$$

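where Gpat and Gmat are the genomic relationship matrices of the paternal and maternal gametes, respectively. Let Mpat and Mmat be the n × Nsnp matrices that specify the coefficients of αm and αf in Table 1; then, the elements of Mpat and Mmat for the ith individual at the jth SNP are calculated as follows:

$$M_{pat}(i,j) = \begin{cases} q_j & \text{paternally-derived allele } A_1 \\ -(1-q_j) & \text{paternally-derived allele } A_2, \end{cases}$$

and

$$M_{mat}(i,j) = \begin{cases} q_j & \text{maternally-derived allele } A_1 \\ -(1-q_j) & \text{maternally-derived allele } A_2. \end{cases}$$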

Therefore, pat and mat are as follows:

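$$\mathbf{pat} = \mathbf{M}_{pat}\boldsymbol{\alpha}_m$$

and

$$\mathbf{mat} = \mathbf{M}_{mat}\boldsymbol{\alpha}_f,$$

where $\boldsymbol{\alpha}_m$ and $\boldsymbol{\alpha}_f$ are the Nsnp-dimensional vectors of αm and αf, respectively. The variance of pat is equal to:

$$\mathrm{Var}(\mathbf{pat}) = \mathbf{M}_{pat}\mathbf{M}_{pat}'\,\mathrm{Var}(\alpha_m).$$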

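The variance of the paternal gametic effect ($\sigma_{pat}^2$) is the sum of the variances of αm at all SNPs as follows:

$$\sigma_{pat}^2 = \sum_{j}^{N_{snp}} q_j(1-q_j)\,\mathrm{Var}(\alpha_m).$$

From this equation, Var(pat) can be rewritten as follows:

$$\mathrm{Var}(\mathbf{pat}) = \frac{\mathbf{M}_{pat}\mathbf{M}_{pat}'}{\sum_{j}^{N_{snp}} q_j(1-q_j)}\,\sigma_{pat}^2.$$

Consequently,

$$\mathbf{G}_{pat} = \frac{\mathbf{M}_{pat}\mathbf{M}_{pat}'}{\sum_{j}^{N_{snp}} q_j(1-q_j)}.$$

Similarly,

$$\mathbf{G}_{mat} = \frac{\mathbf{M}_{mat}\mathbf{M}_{mat}'}{\sum_{j}^{N_{snp}} q_j(1-q_j)}.$$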
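The gametic matrices can be built in the same way as the genotypic ones. The Python sketch below is illustrative: the genotype coding (two-character strings with the paternal allele first), the function names, and the toy data are assumptions for the example, not the authors' code.

```python
def gamete_coefficient(allele, qj):
    """Coefficient of the substitution effect for one transmitted allele."""
    return qj if allele == "1" else -(1 - qj)


def cross_product(m, scale):
    """Return M M' / scale as a list of lists."""
    return [[sum(x * y for x, y in zip(r1, r2)) / scale for r2 in m] for r1 in m]


def gblup_i2_matrices(geno, q):
    """Build Mpat, Mmat, Gpat, and Gmat from phased genotypes.

    geno -- one list per individual of codes such as "12" (paternal allele first)
    q    -- list of A2 allele frequencies, one per SNP
    """
    m_pat = [[gamete_coefficient(g[0], qj) for g, qj in zip(ind, q)] for ind in geno]
    m_mat = [[gamete_coefficient(g[1], qj) for g, qj in zip(ind, q)] for ind in geno]
    scale = sum(qj * (1 - qj) for qj in q)
    return m_pat, m_mat, cross_product(m_pat, scale), cross_product(m_mat, scale)


geno = [["12", "11"], ["21", "22"]]
m_pat, m_mat, g_pat, g_mat = gblup_i2_matrices(geno, q=[0.2, 0.5])

# Element-wise sum and difference of the gametic coefficients, first individual.
sum_row = [x + y for x, y in zip(m_pat[0], m_mat[0])]
diff_row = [x - y for x, y in zip(m_pat[0], m_mat[0])]
```

A useful consistency check is that, element by element, Mpat + Mmat reproduces the coefficients of Ma (2qj, 2qj − 1, or 2qj − 2) and Mpat − Mmat reproduces those of Mi (0, 1, −1, or 0), which links the gametic and genotypic parameterizations.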

Stochastic simulation

A historical population was simulated to establish mutation-drift equilibrium. The simulated genome comprised 10 chromosomes, each 1 Morgan long, containing 100 000 randomly spaced SNPs and 1000 biallelic quantitative trait loci (QTL). In the first generation of the historical population, the initial allele frequencies of all SNPs and QTL were assumed to be 0.5. A recurrent mutation process was applied with a mutation rate for SNPs and QTL of $1.0 \times 10^{-4}$ per locus per generation. Recombinations were sampled from a Poisson distribution with a mean of 1 per Morgan and then randomly placed along the chromosome. The historical population evolved over 20 000 generations of random selection and random mating, with a population size of 500 (250 males and 250 females) to reach mutation-drift equilibrium [23].

After 20 000 historical generations, the base population (G0) was generated. In G0, the population size decreased to 300 (150 males and 150 females). In total, 10 000 markers and 200 QTL were randomly selected from the segregating SNPs and QTL with minor allele frequencies greater than 0.05. Therefore, Nsnp was equal to 10 000. Let Q1 and Q2 be the two alleles at each QTL. The genotypic values of Q1Q1, Q1Q2, Q2Q1, and Q2Q2 are given by a, d1, d2, and −a, respectively. The value of a was drawn from a gamma distribution with a shape parameter of 0.42 and its sign was drawn at random with equal chance. For QTL with imprinting, the values of d1 and d2 were determined as the product of a and the degree of imprinting (τ). Let Nm and Nf be the numbers of QTL that silence the paternal and maternal alleles, respectively. The total number of QTL with imprinting (Ni) was 60 (Nm + Nf = Ni), which were randomly chosen from the 200 QTL. The total genetic effect (gj) of the jth animal was calculated by summing all QTL genotypic values, and its variance ($\sigma_g^2$) was calculated from the variance of the genotypic deviations:

$$\sigma_g^2 = \sum_{j=1}^{N_{QTL}} (1-q_j)^2\left[2q_j a_j - 2(1-q_j)q_j\delta_j\right]^2 + \sum_{j=1}^{N_{QTL}} q_j(1-q_j)\left[(2q_j-1)a_j + \left\{1-2q_j(1-q_j)\right\}\delta_j + \varepsilon_j\right]^2 + \sum_{j=1}^{N_{QTL}} q_j(1-q_j)\left[(2q_j-1)a_j + \left\{1-2q_j(1-q_j)\right\}\delta_j - \varepsilon_j\right]^2 + \sum_{j=1}^{N_{QTL}} q_j^2\left[-2(1-q_j)a_j - 2(1-q_j)q_j\delta_j\right]^2,$$

where NQTL is the number of QTL. To obtain phenotypic values, an environmental effect sampled from the normal distribution $N\left(0, (1-H^2)\sigma_g^2/H^2\right)$ was added to the true genetic value, where $H^2$ is the broad-sense heritability, which was set to 0.3 in the base scenario. The phenotypic variance was finally standardized to be equal to 1.
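The phenotype-generation step can be sketched as follows. This Python example is a hedged illustration: the function names, the toy genetic values, the fixed seed, and the choice to standardize by dividing by the sample standard deviation are all assumptions, since the text does not specify how the standardization was carried out.

```python
import random


def residual_variance(var_g, h2_broad):
    """Environmental variance implied by the genetic variance and H^2."""
    return (1 - h2_broad) * var_g / h2_broad


def simulate_phenotypes(genetic_values, var_g, h2_broad, rng):
    """Add N(0, residual variance) noise, then rescale to unit sample variance."""
    sd_e = residual_variance(var_g, h2_broad) ** 0.5
    raw = [g + rng.gauss(0.0, sd_e) for g in genetic_values]
    mean = sum(raw) / len(raw)
    sd = (sum((y - mean) ** 2 for y in raw) / (len(raw) - 1)) ** 0.5
    # Assumption: standardization divides by the sample standard deviation.
    return [y / sd for y in raw]


rng = random.Random(1)                                   # fixed seed, illustrative
g = [rng.gauss(0.0, 0.3 ** 0.5) for _ in range(1000)]    # toy genetic values
y = simulate_phenotypes(g, var_g=0.3, h2_broad=0.3, rng=rng)
```

With a genetic variance of 0.3 and H2 = 0.3, the implied environmental variance is 0.7, so that the genetic share of the total variance equals H2 before standardization.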

After G0, the subsequent five generations (G1 to G5) were generated. In G1 to G5, 30 males were selected by BLUP on the basis of estimated breeding values and randomly mated to 150 dams to produce 300 offspring (150 males and 150 females). The reference population with both phenotypes and genotypes comprised 1200 individuals from G1 to G4, and the test population with only genotypes comprised 300 individuals from G5.

The range of d1 and d2, the number of QTL with imprinting (Ni), and Nsnp were varied to investigate their effects on the performance of GBLUP-I. In the base simulation scenario, τ = 1.0, Ni = 60, (Nm, Nf) = (0, 60), Nsnp = 10 000, and paternal and maternal alleles were known. In this scenario, only maternal alleles were silenced. Six alternative scenarios were simulated in addition to the base scenario. In scenario 1, τ = 0.5, 0.75, and 1.0 to meet the condition that −a ≤ d1, d2 ≤ a. In scenario 2, Ni = 20, 60, and 100. In scenario 3, (Nm, Nf) = (0, 60), (15, 45), and (30, 30). In scenario 4, H2 = 0.1, 0.3, and 0.5. In scenario 5, Nsnp = 2000, 10 000, and 50 000. In scenario 6, the paternal and maternal alleles were assumed to be unknown. Parameter settings are outlined in Table 3. Twenty replicates were simulated for each scenario.

Table 3.

Parameters for different scenarios

| Parameter | Base | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 | Scenario 6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| τ | 1.0 | 0.5, 0.75, 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Ni | 60 | 60 | 20, 60, 100 | 60 | 60 | 60 | 60 |
| (Nm, Nf) | (0, 60) | (0, 60) | (0, 60) | (0, 60), (15, 45), (30, 30) | (0, 60) | (0, 60) | (0, 60) |
| H2 | 0.3 | 0.3 | 0.3 | 0.3 | 0.1, 0.3, 0.5 | 0.3 | 0.3 |
| Nsnp | 10 000 | 10 000 | 10 000 | 10 000 | 10 000 | 2000, 10 000, 50 000 | 10 000 |
| Paternal and maternal alleles | Known | Known | Known | Known | Known | Known | Known, predicted |

Six alternative scenarios were simulated in addition to the base scenario: τ = degree of imprinting; Ni = number of QTL with imprinting; Nm and Nf = numbers of QTL silencing paternal and maternal alleles; H2 = broad-sense heritability; Nsnp = number of SNPs.

Outline of the analysis

In the base scenario, the paternal and maternal alleles were assumed to be known. However, such information is unknown when using real data, because only genotypes are available. In scenario 6, the maternal and paternal origins of specific alleles (phase) were predicted using genotype and pedigree information processed by AlphaImpute software [24]. The phasing accuracy was measured as the correlation between true and predicted alleles by origin.

Here, we estimated variance components and genetic values using GBLUP and the two types of GBLUP-I. Variance components were estimated by average information restricted maximum likelihood (AI-REML) [25]. The reference population dataset was used to predict the genetic effects of the genotyped individuals in the test population. The accuracy of the estimated total genetic value (ρ) was assessed as the correlation between estimated and true values. The regression coefficient of the total genetic value on its estimate (b) was calculated to assess unbiasedness.
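The two evaluation criteria can be written down directly. The sketch below is a minimal Python illustration with a hypothetical function name and toy data; it computes ρ as the Pearson correlation between true and estimated values and b as the slope of the regression of true values on estimates.

```python
def accuracy_and_slope(true_values, estimates):
    """Return (rho, b): correlation and regression of true values on estimates."""
    n = len(true_values)
    mean_t = sum(true_values) / n
    mean_e = sum(estimates) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_values, estimates))
    var_t = sum((t - mean_t) ** 2 for t in true_values)
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    rho = cov / (var_t * var_e) ** 0.5
    b = cov / var_e          # slope of true values regressed on estimates
    return rho, b


# Toy example: estimates are perfectly ranked but shrunk by half.
rho, b = accuracy_and_slope([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0])
```

In this toy case ρ is 1 and b is 2: the ranking is perfect, but the estimates are too compressed, so b departs from 1. A value of b close to 1 indicates that the estimates are on the same scale as the true values.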

Results

Tables 4 and 5 show the estimates of variance components and the predictive abilities of total genetic values with varying values of τ and Ni in scenarios 1 and 2. Total genetic variance was underestimated by GBLUP when the degree of imprinting was high. With GBLUP, the estimated total genetic variances were equal to 97.6%, 91.3%, and 82.1% of true variances for τ of 0.5, 0.75, and 1.0, respectively, and 99.3%, 82.1%, and 78.2% of true variances for Ni of 20, 60, and 100, respectively. The estimated total genetic variances by GBLUP-I1 and GBLUP-I2 were almost the same as the true variances regardless of the degree of imprinting.

Table 4.

Variance component estimates and predictive abilities with varying degrees of imprinting (τ) in scenario 1

| τ | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- |
| 0.5 | True value | 0.293 | 0.698 | - | - |
| | GBLUP | 0.286 | 0.692 | 0.626 | 1.026 |
| | GBLUP-I1 | 0.300 | 0.679 | 0.635 | 1.010 |
| | GBLUP-I2 | 0.294 | 0.679 | 0.570 | 0.996 |
| 0.75 | True value | 0.290 | 0.705 | - | - |
| | GBLUP | 0.265 | 0.716 | 0.581 | 1.011 |
| | GBLUP-I1 | 0.295 | 0.681 | 0.599 | 1.000 |
| | GBLUP-I2 | 0.300 | 0.680 | 0.570 | 0.989 |
| 1.0 | True value | 0.291 | 0.701 | - | - |
| | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |

Values are the mean of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance; predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

Table 5.

Variance component estimates and predictive ability with varying numbers of QTL with imprinting (Ni) in scenario 2

| Ni | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- |
| 20 | True value | 0.289 | 0.705 | - | - |
| | GBLUP | 0.287 | 0.696 | 0.659 | 1.078 |
| | GBLUP-I1 | 0.291 | 0.689 | 0.660 | 1.076 |
| | GBLUP-I2 | 0.286 | 0.688 | 0.596 | 1.071 |
| 60 | True value | 0.291 | 0.701 | - | - |
| | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |
| 100 | True value | 0.293 | 0.698 | - | - |
| | GBLUP | 0.229 | 0.756 | 0.518 | 0.975 |
| | GBLUP-I1 | 0.298 | 0.683 | 0.564 | 0.982 |
| | GBLUP-I2 | 0.295 | 0.664 | 0.575 | 0.999 |

Values are the mean of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance; predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

The prediction accuracies (ρ) obtained with GBLUP-I1 exceeded those obtained with GBLUP by 1.4%, 3.1%, and 7.8% for τ of 0.5, 0.75, and 1.0, respectively, and by 0.2%, 7.8%, and 8.2% for Ni of 20, 60, and 100, respectively. Compared to GBLUP-I1, the ρ values obtained with GBLUP-I2 were more affected by the degree of imprinting. When Ni was equal to 60 and 100, the ρ values obtained with GBLUP-I2 exceeded those obtained with GBLUP by 6.8% and 11.0%, whereas when Ni was equal to 20, ρ was smaller with GBLUP-I2 than with GBLUP. For all values of τ and Ni, the b values obtained with GBLUP-I1 and GBLUP-I2 were closer to 1 than with GBLUP. In scenario 3, the predictive abilities of GBLUP-I1 were not affected by the values of Nm and Nf, whereas the ρ values with GBLUP-I2 decreased as the difference between Nm and Nf decreased (Table 6).
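The percentages quoted for scenario 1 appear to be relative gains in ρ, which can be reproduced from the Table 4 means. The helper name `relative_gain` below is hypothetical, and the interpretation as a ratio of accuracies is inferred from the arithmetic rather than stated in the text.

```python
def relative_gain(rho_new, rho_ref):
    """Percentage by which rho_new exceeds rho_ref."""
    return 100 * (rho_new / rho_ref - 1)


# rho for GBLUP-I1 versus GBLUP at tau = 0.5, 0.75, and 1.0 (Table 4).
gains = [
    round(relative_gain(0.635, 0.626), 1),
    round(relative_gain(0.599, 0.581), 1),
    round(relative_gain(0.570, 0.529), 1),
]
```

Rounded to one decimal place, these ratios give 1.4, 3.1, and 7.8, matching the values reported for τ of 0.5, 0.75, and 1.0.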

Table 6.

Variance component estimates and predictive ability with varying numbers of QTL silencing paternal and maternal alleles (Nm and Nf) in scenario 3

| Nm | Nf | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 60 | True value | 0.291 | 0.701 | - | - |
| | | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |
| 15 | 45 | True value | 0.295 | 0.698 | - | - |
| | | GBLUP | 0.234 | 0.739 | 0.519 | 0.981 |
| | | GBLUP-I1 | 0.296 | 0.686 | 0.569 | 1.002 |
| | | GBLUP-I2 | 0.298 | 0.688 | 0.549 | 0.998 |
| 30 | 30 | True value | 0.290 | 0.704 | - | - |
| | | GBLUP | 0.234 | 0.739 | 0.519 | 0.981 |
| | | GBLUP-I1 | 0.296 | 0.686 | 0.573 | 1.009 |
| | | GBLUP-I2 | 0.295 | 0.684 | 0.538 | 0.998 |

Values are the mean of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance; predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

In scenario 4, for all values of H2, the estimated variance components obtained with GBLUP-I1 and GBLUP-I2 were close to the true values (Table 7). The performance of GBLUP-I1 and GBLUP-I2 increased with increasing values of H2. With H2 of 0.1, 0.3, and 0.5, the ρ values obtained with GBLUP-I1 exceeded those obtained with GBLUP by 5.7%, 7.8%, and 9.2%, and the ρ values obtained with GBLUP-I2 exceeded those obtained with GBLUP by 4.1%, 6.8%, and 7.1%. In scenario 5, the predictive abilities of GBLUP, GBLUP-I1, and GBLUP-I2 decreased when Nsnp decreased from 10 000 to 2000, whereas they were unaltered when Nsnp increased from 10 000 to 50 000 (Table 8).

Table 7.

Variance component estimates and predictive ability with varying broad-sense heritability (H2) in scenario 4

| H2 | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- |
| 0.1 | True value | 0.096 | 0.902 | - | - |
| | GBLUP | 0.078 | 0.920 | 0.367 | 1.021 |
| | GBLUP-I1 | 0.094 | 0.885 | 0.388 | 0.990 |
| | GBLUP-I2 | 0.090 | 0.886 | 0.382 | 0.986 |
| 0.3 | True value | 0.291 | 0.701 | - | - |
| | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |
| 0.5 | True value | 0.493 | 0.500 | - | - |
| | GBLUP | 0.403 | 0.597 | 0.608 | 1.010 |
| | GBLUP-I1 | 0.499 | 0.492 | 0.664 | 1.001 |
| | GBLUP-I2 | 0.498 | 0.492 | 0.651 | 1.000 |

Values are the mean of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance; predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

Table 8.

Variance component estimates and predictive ability with varying numbers of SNPs (Nsnp) in scenario 5

| Nsnp | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- |
| 2000 | True value | 0.290 | 0.694 | - | - |
| | GBLUP | 0.201 | 0.770 | 0.481 | 0.939 |
| | GBLUP-I1 | 0.263 | 0.727 | 0.521 | 0.946 |
| | GBLUP-I2 | 0.256 | 0.733 | 0.514 | 0.948 |
| 10 000 | True value | 0.291 | 0.701 | - | - |
| | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |
| 50 000 | True value | 0.292 | 0.693 | - | - |
| | GBLUP | 0.242 | 0.725 | 0.536 | 1.072 |
| | GBLUP-I1 | 0.296 | 0.687 | 0.572 | 1.070 |
| | GBLUP-I2 | 0.294 | 0.694 | 0.566 | 1.048 |

Values are the mean of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance; predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

In scenario 6, the phasing accuracy was equal to 0.979. Prediction accuracies with GBLUP-I1 and GBLUP-I2 were 1.7% and 1.2% lower when paternal and maternal alleles were predicted than when paternal and maternal alleles were known (Table 9).

Table 9.

Accuracies of estimated genetic values with predicted paternal and maternal alleles in scenario 6

| Phasing accuracy | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- |
| 1.0 | True value | 0.291 | 0.701 | - | - |
| | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |
| 0.979 | True value | 0.289 | 0.705 | - | - |
| | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | GBLUP-I1 | 0.295 | 0.690 | 0.560 | 0.984 |
| | GBLUP-I2 | 0.294 | 0.689 | 0.558 | 0.991 |

Values are the mean of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance; predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

Discussion

Performance of GBLUP-I

We present a new GBLUP method that includes imprinting effects for the prediction of total genetic value. For all scenarios, the performance of GBLUP-I1 to estimate variance components was always better than that of GBLUP. Prediction accuracies with GBLUP-I1 and GBLUP-I2 increased with increasing degree of imprinting and broad-sense heritability. Prediction accuracies with GBLUP-I2 were strongly affected by the degree of imprinting (Tables 4 and 5) and the difference between the values of Nm and Nf (Table 6).

Method GBLUP-I2 assumes that paternal and maternal gametic effects are independent. However, when there is no imprinting, sire and dam are genetically correlated, and thus paternal and maternal gametic effects are not independent [26] and the accuracy by GBLUP-I2 should be reduced. The reduction of accuracy by GBLUP-I2 would be small with a high degree of imprinting because of the low correlation between paternal and maternal gametic effects. Thus, the performance of GBLUP-I2 increases as the degree of the imprinting and the difference in genetic values between paternal and maternal gametes increase. Meanwhile, when the degree of imprinting is low and there is little difference in the genetic values between paternal and maternal gametes, GBLUP-I1 is preferred for genomic evaluation.

In a previous study with bovine data, when the number of SNPs was greater than 50 000, reliabilities of genomic evaluations remained almost unaltered as the number of SNPs increased [27]. In this study, with an Nsnp of 10 000, the average distance between neighboring SNPs was 0.1 cM, which is similar to the distance between SNPs in a bovine dataset that includes 50 000 SNPs. In scenario 5, when Nsnp was greater than 10 000, the performances of GBLUP and both GBLUP-I methods were not affected by the value of Nsnp. This suggests that costly high-density chips with more (777 000) SNPs may not be necessary for genomic evaluation, even when imprinting effects exist.

In scenario 6, prediction accuracies obtained with GBLUP-I1 and GBLUP-I2 were higher than those obtained with GBLUP when paternal and maternal alleles were predicted. Thus, both GBLUP-I methods can be applied to real livestock data. The phasing accuracy was improved by increasing sample size [28], number of SNPs [29], and number of high-density genotyped relatives of the individuals to be imputed [24,30], which suggests that the performance of GBLUP-I can be further improved in real livestock data.

Degree of imprinting and number of QTL with imprinting

GBLUP-I1 and GBLUP-I2 could accurately capture the total genetic variance, whereas GBLUP underestimated the total genetic variance. The difference in estimated total genetic variance between GBLUP-I and GBLUP is caused by the imprinting effect. Here, we calculated imprinting variance as the difference in the estimated total genetic variance between GBLUP-I1 and GBLUP. When τ varied from 0.5 to 1.0 and Ni from 20 to 100, imprinting variances were equal to 1.4% to 6.0% and 0.4% to 6.9% of the phenotypic variances (1.0), respectively.

de Vries et al. [31] were the first to estimate imprinting variance in livestock and found that approximately 5% and 4% of the phenotypic variance in back fat thickness and growth rate, respectively, were due to imprinting. More recently, imprinting variances were found to range from 5 to 19% of the total genetic variance for 19 pig performance traits [13] and, on average, to be equal to 28% of the total genetic variance for ultrasonic measurements of body composition in Australian beef cattle [26]. The degree of imprinting reported in our study is similar to those reported in the literature [13,26,31].

Effects of QTL parameters

The choice of QTL parameters may affect the accuracy of genomic predictions. We investigated the effects of the number of QTL, the distribution of their effects, and their location. The values of NQTL ranged from 50 (Ni = 15) to 1000 (Ni = 300), and QTL were evenly spaced throughout the genome. The value of a was drawn from a normal distribution. Under these conditions, the accuracies obtained by GBLUP, GBLUP-I1, and GBLUP-I2 were almost the same as in the base scenario (Table 10).

Table 10.

Accuracies of estimated genetic values with varying numbers of QTL, distributions of homozygous genotypic value, and locations of QTL

| Number of QTL | Distribution of homozygous genotypic value (a) | QTL location | Method | $\sigma_g^2$ | $\sigma_e^2$ | ρ | b |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 50 | Gamma | Random | GBLUP | 0.238 | 0.740 | 0.528 | 0.983 |
| | | | GBLUP-I1 | 0.295 | 0.683 | 0.565 | 0.989 |
| | | | GBLUP-I2 | 0.296 | 0.685 | 0.564 | 0.992 |
| 200 | Gamma | Random | GBLUP | 0.239 | 0.742 | 0.529 | 0.982 |
| | | | GBLUP-I1 | 0.299 | 0.681 | 0.570 | 0.990 |
| | | | GBLUP-I2 | 0.297 | 0.681 | 0.565 | 0.995 |
| 1000 | Gamma | Random | GBLUP | 0.235 | 0.745 | 0.527 | 0.979 |
| | | | GBLUP-I1 | 0.297 | 0.686 | 0.568 | 0.994 |
| | | | GBLUP-I2 | 0.294 | 0.684 | 0.563 | 0.993 |
| 200 | Normal | Random | GBLUP | 0.239 | 0.741 | 0.529 | 0.980 |
| | | | GBLUP-I1 | 0.294 | 0.681 | 0.566 | 0.991 |
| | | | GBLUP-I2 | 0.295 | 0.683 | 0.560 | 0.994 |
| 200 | Gamma | Evenly spaced | GBLUP | 0.245 | 0.738 | 0.531 | 0.989 |
| | | | GBLUP-I1 | 0.295 | 0.688 | 0.572 | 0.997 |
| | | | GBLUP-I2 | 0.294 | 0.689 | 0.567 | 0.999 |

Values are the means of 20 replicates; variance components for each source of genetic variation: $\sigma_g^2$ = total genetic variance; $\sigma_e^2$ = residual variance. Predictive abilities: ρ = accuracy of estimated total genetic value; b = regression coefficient of total genetic value on its estimate.

Significance of genetic effects in GBLUP-I

This study partitioned the total genetic value into three estimated genetic effects (the α, δ, and ε terms) in GBLUP-I1. However, these genetic effects have no direct biological meaning. In order to estimate a breeding value and a dominance deviation, the genetic values should be defined separately for each sex, as presented by Spencer [19]. In such a model, the number of variance components would be doubled and the covariance between the breeding value and the dominance deviation would not be equal to 0. Together, these factors would reduce the accuracy of genetic evaluations.

Practical use of GBLUP-I

When a dominance effect exists, assortative mating or mate allocation can boost the field performance of livestock [32,33]. Similarly, when an imprinting effect exists, the performance of livestock can be improved by optimizing matings, because the genotypic values of A1A2 and A2A1 can be distinguished and evaluated accurately. Let prij(A1A1), prij(A1A2), prij(A2A1), and prij(A2A2) be the probabilities of the genotypes A1A1, A1A2, A2A1, and A2A2 for the ith offspring of future matings at the jth marker. In GBLUP-I1, the elements of the coefficient matrices for these offspring (i.e., Ma, Md, and Mi) can be calculated as the products of the genotype coefficients and the genotype probabilities. For example, the element of Mi for offspring is as follows:

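\[
M_{i_{i,j}} =
\begin{cases}
0 & (A_1A_1) \\
1 \times pr_{ij}(A_1A_2) & (A_1A_2) \\
-1 \times pr_{ij}(A_2A_1) & (A_2A_1) \\
0 & (A_2A_2)
\end{cases}
\]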
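As a rough illustration of how such an element might be obtained for a planned mating, the sketch below derives the four genotype probabilities at a single marker from the phased genotypes of the sire and dam and then sums the products of coefficients and probabilities. Several points are assumptions made only for this example: genotypes are written with the paternal allele first, each parent transmits either of its two alleles with probability 0.5, the imprinting coefficients are taken as 0, 1, −1, and 0 for A1A1, A1A2, A2A1, and A2A2, and all function names are hypothetical.

```python
# Assumed imprinting coefficients per ordered genotype (paternal allele first).
IMPRINT_COEF = {"A1A1": 0.0, "A1A2": 1.0, "A2A1": -1.0, "A2A2": 0.0}


def transmit_prob_a1(parent_alleles):
    """Probability that a parent transmits allele A1 at one marker.

    parent_alleles is a pair such as (1, 2), where 1 codes A1 and 2 codes A2.
    """
    return sum(1 for allele in parent_alleles if allele == 1) / 2.0


def offspring_genotype_probs(sire, dam):
    """Probabilities of the four ordered genotypes for a future offspring."""
    p_sire = transmit_prob_a1(sire)
    p_dam = transmit_prob_a1(dam)
    return {
        "A1A1": p_sire * p_dam,
        "A1A2": p_sire * (1.0 - p_dam),          # A1 from sire, A2 from dam
        "A2A1": (1.0 - p_sire) * p_dam,          # A2 from sire, A1 from dam
        "A2A2": (1.0 - p_sire) * (1.0 - p_dam),
    }


def expected_mi_element(sire, dam):
    """Sum of coefficient x probability over the four ordered genotypes."""
    probs = offspring_genotype_probs(sire, dam)
    return sum(IMPRINT_COEF[g] * probs[g] for g in probs)


# Sire homozygous A1A1 mated to a dam homozygous A2A2: offspring is always A1A2.
print(expected_mi_element((1, 1), (2, 2)))
# Two heterozygous parents: A1A2 and A2A1 are equally likely and cancel out.
print(expected_mi_element((1, 2), (1, 2)))
```

The same genotype probabilities could be combined with the coefficients of Ma and Md to fill the remaining coefficient matrices for the offspring.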

Likewise, in GBLUP-I2, the elements of Mpat and Mmat for the offspring of future matings can be calculated. Thus, the total genetic effects for the offspring of future matings can be predicted and maximized by using an optimum mating plan.
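A mating plan of this kind can be sketched as an assignment problem. The example below is only a toy illustration, not the authors' procedure: it assumes that a predicted total genetic value is already available for the offspring of every sire × dam pair, that each sire is mated to exactly one dam, and that the plan maximizing the sum of predicted values is wanted. The matrix `pred` holds made-up numbers, and the exhaustive search is practical only for a handful of animals.

```python
from itertools import permutations


def best_mating_plan(pred):
    """Return the sire-to-dam assignment with the highest summed prediction.

    pred[s][d] is the predicted total genetic value of the offspring of
    sire s and dam d (square matrix, hypothetical values). The result is a
    tuple plan where plan[s] is the dam assigned to sire s.
    """
    n = len(pred)
    best_plan, best_total = None, None
    for dams in permutations(range(n)):
        total = sum(pred[s][dams[s]] for s in range(n))
        if best_total is None or total > best_total:
            best_plan, best_total = dams, total
    return best_plan, best_total


# Made-up predicted offspring values for 3 sires (rows) and 3 dams (columns).
pred = [
    [0.10, 0.30, 0.25],
    [0.40, 0.20, 0.15],
    [0.05, 0.35, 0.45],
]
plan, total = best_mating_plan(pred)
for sire, dam in enumerate(plan):
    print(f"sire {sire} x dam {dam}: predicted value {pred[sire][dam]:.2f}")
```

For realistic herd sizes, and for constraints such as several dams per sire or limits on inbreeding, a dedicated optimization method would be needed in place of the exhaustive search.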

Conclusions

This study proposed two GBLUP methods, i.e., GBLUP-I1 and GBLUP-I2, which include imprinting effects at the genotypic and gametic levels, respectively. The GBLUP-I1 and GBLUP-I2 methods accurately estimated the variance components and improved unbiasedness regardless of parameter settings. The accuracies of estimated total genetic values in GBLUP-I1 and GBLUP-I2 increased with increasing degree of imprinting and broad-sense heritability. The accuracies of estimated total genetic values obtained with GBLUP-I1 were always higher than those obtained with GBLUP. Thus, in general, GBLUP-I1 should be applied for genetic evaluation. However, GBLUP-I2 is preferred when the imprinting effect is large and the genetic effects differ substantially between paternal and maternal gametes. After predicting the total genetic value by either GBLUP-I method, assortative mating or mate allocation could be used to boost the field performance of livestock.

Acknowledgements

The authors thank John Hickey for kindly providing the AlphaImpute software.

Footnotes

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

MN developed the two GBLUP models that include the imprinting effect, wrote all the computer programs, and drafted the manuscript. MS conceived and set up the study, and helped with writing the manuscript. Both authors have read and approved the final manuscript.

Contributor Information

Motohide Nishio, Email: mtnishio@affrc.go.jp.

Masahiro Satoh, Email: hereford@affrc.go.jp.

References

  1. Reik W, Walter J. Genomic imprinting: parental influence on the genome. Nat Rev Genet. 2001;2:21–32. doi: 10.1038/35047554.
  2. O’Neill MJ, Ingram RS, Vrana PB, Tilghman SM. Allelic expression of IGF2 in marsupials and birds. Dev Genes Evol. 2000;210:18–20. doi: 10.1007/PL00008182.
  3. Georges M, Charlier C, Cockett N. The callipyge locus: evidence for the trans interaction of reciprocally imprinted genes. Trends Genet. 2003;19:248–52. doi: 10.1016/S0168-9525(03)00082-9.
  4. Sakatani T, Wei M, Katoh M, Okita C, Wada D, Mitsuya K, Meguro M, Ikeguchi M, Ito H, Tycko B, Oshimura M. Epigenetic heterogeneity at imprinted loci in normal populations. Biochem Biophys Res Commun. 2001;283:1124–30. doi: 10.1006/bbrc.2001.4916.
  5. Morison IM, Ramsay JP, Spencer HG. A census of mammalian imprinting. Trends Genet. 2005;21:457–65. doi: 10.1016/j.tig.2005.06.008.
  6. Imumorin IG, Kim EH, Lee YM, de Koning DJ, van Arendonk JAM, Donato MD, Taylor JF, Kim JJ. Genome scan for parent-of-origin QTL effects on bovine growth and carcass traits. Front Genet. 2011;2:44. doi: 10.3389/fgene.2011.00044.
  7. de Koning DJ, Rattink AP, Harlizius B, Groenen MAM, Brascamp EW, van Arendonk JAM. Detection and characterization of quantitative trait loci for growth and reproduction traits in pigs. Livest Prod Sci. 2001;72:185–98. doi: 10.1016/S0301-6226(01)00226-3.
  8. Quintanilla R, Milan D, Bidanel JP. A further look at quantitative trait loci affecting growth and fatness in a cross between Meishan and Large White pig populations. Genet Sel Evol. 2002;34:193–210. doi: 10.1186/1297-9686-34-2-193.
  9. Hirooka H, de Koning DJ, Harlizius B, van Arendonk JAM, Rattink AP, Groenen MA, Brascamp EW, Bovenhuis H. A whole-genome scan for quantitative trait loci affecting teat number in pigs. J Anim Sci. 2001;79:2320–6. doi: 10.2527/2001.7992320x.
  10. Stella A, Stalder KJ, Saxton AM, Boettcher PJ. Estimation of variances for gametic effects on litter size in Yorkshire and Landrace swine. J Anim Sci. 2003;81:2171–8. doi: 10.2527/2003.8192171x.
  11. Schaeffer LR, Kennedy BW, Gibson JP. The inverse of the gametic relationship matrix. J Dairy Sci. 1989;72:1266–72. doi: 10.3168/jds.S0022-0302(89)79231-6.
  12. Essl A, Voith K. Genomic imprinting effects on dairy- and fitness-related traits in cattle. J Anim Breed Genet. 2002;119:182–9. doi: 10.1046/j.1439-0388.2002.00334.x.
  13. Neugebauer N, Luther H, Reinsch N. Parent-of-origin effects cause genetic variation in pig performance traits. Animal. 2010;4:672–81. doi: 10.1017/S1751731109991625.
  14. Neugebauer N, Rader I, Schild HJ, Zimmer D, Reinsch N. Evidence for parent-of-origin effects on genetic variability of beef traits. J Anim Sci. 2010;88:523–32. doi: 10.2527/jas.2009-2026.
  15. Dalton R. No bull: genes for better milk. Nature. 2009;457:369. doi: 10.1038/457369a.
  16. Lund M, de Ross S, de Vries A, Druet T, Ducrocq V, Fritz S, Guillaume F, Guldbrandtsen B, Liu Z, Reents R, Schrooten C, Seefried F, Su G. A common reference population from four European Holstein populations increases reliability of genomic predictions. Genet Sel Evol. 2011;43:43. doi: 10.1186/1297-9686-43-43.
  17. Mc Hugh N, Meuwissen TH, Cromie AR, Sonesson AK. Use of female information in dairy cattle genomic breeding programs. J Dairy Sci. 2011;94:4109–18. doi: 10.3168/jds.2010-4016.
  18. Wiggans GR, VanRaden PM, Cooper TA. The genomic evaluation system in the United States: past, present, future. J Dairy Sci. 2011;94:3202–11. doi: 10.3168/jds.2010-3866.
  19. Spencer HG. The correlation between relatives on the supposition of genomic imprinting. Genetics. 2002;161:411–7. doi: 10.1093/genetics/161.1.411.
  20. Falconer DS, Mackay TFC. Introduction to Quantitative Genetics. Addison Wesley Longman: Essex; 1996.
  21. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23. doi: 10.3168/jds.2007-0980.
  22. Nishio M, Satoh M. Including dominance effects in the genomic BLUP method for genomic evaluation. PLoS One. 2014;9:e85792. doi: 10.1371/journal.pone.0085792.
  23. Nishio M, Satoh M. Parameters affecting genome simulation for evaluating genomic selection method. Anim Sci J. 2014;85:879–87. doi: 10.1111/asj.12224.
  24. Hickey JM, Kinghorn BP, Tier B, van der Werf JH, Cleveland MA. A phasing and imputation method for pedigreed populations that results in a single-stage genomic evaluation method. Genet Sel Evol. 2012;44:9. doi: 10.1186/1297-9686-44-9.
  25. Johnson DL, Thompson R. Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and average information. J Dairy Sci. 1995;78:449–56. doi: 10.3168/jds.S0022-0302(95)76654-1.
  26. Tier B, Meyer K. On the analysis of parent-of-origin effects with examples from ultrasonic measures of carcass traits in Australian beef cattle. J Anim Breed Genet. 2012;129:359–68. doi: 10.1111/j.1439-0388.2012.00996.x.
  27. VanRaden PM, O’Connell JR, Wiggans GR, Weigel KA. Genomic evaluations with many more genotypes. Genet Sel Evol. 2011;43:10. doi: 10.1186/1297-9686-43-10.
  28. Huang L, Li Y, Singleton AB, Hardy JA, Abecasis G, Rosenberg NA, Scheet P. Genotype-imputation accuracy across worldwide human populations. Am J Hum Genet. 2009;84:235–50. doi: 10.1016/j.ajhg.2009.01.013.
  29. Zhang Z, Druet T. Marker imputation with low-density marker panels in Dutch Holstein cattle. J Dairy Sci. 2010;93:5487–94. doi: 10.3168/jds.2010-3501.
  30. Hickey JM, Crossa J, Babu R, de los Campos G. Factors affecting the accuracy of genotype imputation in populations from several maize breeding programs. Crop Sci. 2012;52:654–63. doi: 10.2135/cropsci2011.07.0358.
  31. de Vries AG, Kerr R, Tier B, Long T, Meuwissen TH. Gametic imprinting effects on rate and composition of pig growth. Theor Appl Genet. 1994;88:1037–42. doi: 10.1007/BF00220813.
  32. Toro MA, Varona L. A note on mate allocation for dominance handling in genomic selection. Genet Sel Evol. 2010;42:33. doi: 10.1186/1297-9686-42-33.
  33. Zeng J, Toosi A, Fernando RL, Dekkers JCM, Garrick DJ. Genomic selection of purebred animals for crossbred performance in the presence of dominant gene action. Genet Sel Evol. 2013;45:11. doi: 10.1186/1297-9686-45-11.

Articles from Genetics, Selection, Evolution : GSE are provided here courtesy of BMC
