Journal of Animal Science. 2020 Jan 19;98(3):skaa019. doi: 10.1093/jas/skaa019

Variance estimates are similar using pedigree or genomic relationships with or without the use of metafounders or the algorithm for proven and young animals1

Michael N Aldridge 1, Jérémie Vandenplas 1, Rob Bergsma 2, Mario P L Calus 1
PMCID: PMC7053865  PMID: 31955195

Abstract

With an increase in the number of animals genotyped there has been a shift from using pedigree relationship matrices (A) to genomic ones. As the use of genomic relationship matrices (G) has increased, new methods to build or approximate G have been developed. We investigated whether the way variance components are estimated should reflect these changes. We estimated variance components for maternal sow traits with restricted maximum likelihood, using four methods of calculating the inverse of the relationship matrix. These methods were: using only the inverse of A (A1), combining A1 and the direct inverse of G (HDIRECT1), including metafounders (HMETA1), or combining A1 with an approximated inverse of G using the algorithm for proven and young animals (HAPY1). There was a tendency for higher additive genetic variances and lower permanent environmental variances estimated with A1 compared with the three H1 methods, which supports the view that G1 is better than A1 at separating genetic and permanent environmental components, due to a better definition of the actual relationships between animals. There were limited or no differences in variance estimates between HDIRECT1, HMETA1, and HAPY1. Importantly, there were limited differences in variance components, repeatability, or heritability estimates between methods. Heritabilities ranged from <0.01 to 0.04 for stayability after second cycle and farrowing rate, from 0.08 to 0.15 for litter weight variation, maximum cycle number, total number born, total number stillborn, and prolonged interval between weaning and first insemination, and from 0.39 to 0.44 for litter birth weight and gestation length. The limited differences in heritabilities suggest that there would be very limited changes to estimated breeding values or ranking of animals across models using the different sets of variance components. It is suggested that variance estimates continue to be made using A1; however, including G1 is possibly more appropriate when refining the model for traits that fit a permanent environmental effect.

Keywords: pigs, restricted maximum likelihood, single step, variance components

Introduction

Variance estimates are needed for single-step genomic best linear unbiased prediction (ssGBLUP). Traditionally, variance estimates were calculated using a pedigree-based relationship matrix (A). Using a genomic-based relationship matrix (G) when estimating variances has been reported to change the estimates, with a tendency for higher genetic variances estimated with A1 (Legarra, 2016). The full information of A1 and G1 can be combined in the H1 matrix (HDIRECT1), described by Aguilar et al. (2010) and Christensen and Lund (2010). In the past, unknown sires and dams in the base generation have been treated as unrelated, although in reality they have some unknown relationship. One way of accounting for this is to estimate relationships between and within metafounders, computed based on genotypes and pedigree of descendants (Legarra et al., 2015). These relationships can be included in AMETA1 and combined with GMETA1 (HMETA1). The number of genotyped animals is constantly increasing, and there is a need for more computationally efficient methods of building G1. The algorithm for proven and young animals (APY) is one such method (Misztal et al., 2014; Fragomeni et al., 2015). It is an approximation of G1 (GAPY1), requiring inversion of a genomic relationship matrix computed for a subset of genotyped animals (that are a good representation of the population diversity), which can then be combined with A1 (HAPY1). It should be noted that different relationship matrices may require different variance components. It was hypothesized that genetic variances estimated with A1 would be higher compared with HDIRECT1, but that differences would be limited. Since the relationships of metafounders are based on genotypes of descendants, and HAPY1 uses an approximation of G1, it was hypothesized that the three H1 methods would have similar variance estimates. Therefore, our objective was to compare variance estimates using A1, HDIRECT1, HMETA1, and HAPY1, based on empirical pig data for different maternal traits.

Materials and methods

The data used for this study was collected as part of routine data recording in a commercial breeding program. Samples collected for DNA extraction were only used for routine diagnostic purposes of the breeding program. Data recording and sample collection were conducted strictly in line with the Dutch law on the protection of animals (Gezondheids- en welzijnswet voor dieren).

Dataset

Data were provided by Topigs Norsvin for a Large White maternal sow line. Variance components were estimated for 10 traits: mean litter birth weight (LBW); litter variation (LVAR), defined as the within-litter standard deviation of birth weight; stayability after second cycle (STAY), a binary trait indicating whether or not an animal reached second parity; maximum cycle number (MAX), defined as the maximum number of parities, with large parities (more than five) treated as equal to five; total number born (TNB); number stillborn (STB), expressed as log10(STB + 1); litter mortality (LMO); prolonged interval between weaning and first insemination (PIWI), a binary trait scored 0 if insemination was 6 d or fewer after weaning and 1 if insemination was 7 d or more after weaning; gestation length (GLE); and farrowing rate (FRT).

The phenotype data from the maternal line (originally 293,619 animals with at least one record, from 39 generations) were limited to genotyped animals (42,112 records from 10,860 genotyped animals with at least one record). Data filtering removed records where levels of categorical fixed effects had fewer than five records. After filtering based on fixed effects, 34,441 records from 9,695 genotyped animals remained (Table 1). There were also genotyped sires (498) and dams (37) of remaining phenotyped animals that had no own records. Due to software limitations for the variance components estimated using HDIRECT1, the number of genotypes included in the analysis was limited to 10,000, by selecting animals as explained further on (see the section on analysis using genomic relationships approximated with APY for details on selected animals). The pedigree was then limited to these 10,000 animals and their ancestors, with a total of 16,932 animals and 36 generations. The number of genotyped and ungenotyped animals per generation in the pedigree is illustrated in Figure 1.
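The filtering rule on fixed-effect levels can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout, the function name `filter_sparse_levels`, and the single-pass counting are assumptions, and only the threshold of five records comes from the text.

```python
from collections import Counter

MIN_RECORDS_PER_LEVEL = 5  # threshold stated in the text


def filter_sparse_levels(records, effect_names, min_records=MIN_RECORDS_PER_LEVEL):
    """Keep records whose level of every listed categorical effect has enough records.

    Assumption: counts are taken once on the unfiltered data (single pass).
    """
    counts = {name: Counter(rec[name] for rec in records) for name in effect_names}
    return [
        rec
        for rec in records
        if all(counts[name][rec[name]] >= min_records for name in effect_names)
    ]


# Hypothetical toy data: one herd-year-season level with 5 records, one with 2.
toy = [{"animal": i, "hysf": "H1-2015-1"} for i in range(5)]
toy += [{"animal": 10 + i, "hysf": "H2-2015-1"} for i in range(2)]
kept = filter_sparse_levels(toy, ["hysf"])
```

In this toy case only the five records from the well-populated level are kept.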

Table 1.

Summary of data used for the estimation of variance components after filtering

Trait1 Records (no.) Animals (no.) Sires (no.) Dams (no.) Mean SD
LBW, g 33,974 9,465 625 3,442 1246.18 213.77
LVAR, g 33,955 9,462 626 3,441 262.28 77.34
STAY, % 7,782 7,782 508 2,648 0.92 0.27
MAX, number 5,910 5,910 397 2,085 3.96 1.31
TNB, number 33,974 9,465 626 3,442 16.03 3.77
STB, number 33,970 9,465 626 3,442 2.512 1.916
LMO, % 32,754 9,039 621 3,284 14.38 15.21
PIWI, % 7,797 7,797 530 2,877 13.80 34.49
GLE, days 33,274 9,182 625 3,338 115.09 1.54
FRT, % 33,974 9,465 626 3,442 94.76 22.28

1LBW, litter birth weight; LVAR, litter variation; STAY, stayability after second cycle; MAX, maximum cycle number; TNB, total number born; STB, total number stillborn (no log-transformation for summary); LMO, litter mortality; PIWI, prolonged interval between weaning and first insemination; GLE, gestation length; FRT, farrowing rate. Sires and dams are the number of sires and dams of progeny with at least one record for that trait.

Figure 1.

Number of genotyped and ungenotyped animals per generation in the pedigree.

Animal Models

All variance components were estimated with a univariate animal model with restricted maximum likelihood (REML). For the traits STAY, MAX, and PIWI, the model can be summarized in matrix notation as:

$$y = Xb + Za + e$$

where $y$ is a vector of the trait observations, $X$ and $Z$ are incidence matrices associated with the vector of fixed effects $b$ and the vector of random additive genetic effects $a \sim (0, K\sigma_a^2)$, respectively, and $e \sim (0, I\sigma_e^2)$ is a vector of residuals. The terms $\sigma_a^2$ and $\sigma_e^2$ are the additive genetic and residual variances, respectively. The matrix $K$ is the relationship matrix (either A, AMETA, HDIRECT, HMETA, or HAPY), and $I$ is an identity matrix. For each of the methods (A1, AMETA1, HDIRECT1, HMETA1, and HAPY1) the same fixed and random effects were used (Table 2). Details on the computation of the different relationship matrices are given below.

Table 2.

Summary of random and fixed effects fitted for each trait

Effect1 Traits
STAY MAX PIWI LBW LVAR LMO TNB STB GLE FRT
Random effects
 Animal
 PE
 Sire
Fixed effects
 Parity
 HYSF
 HYSI1
 HYSI
 I2
 CIWP
 CTNB
 LP
 LP2
 NW
 PBCB

1Sire, service sire; PE, permanent environment; HYSF, herd-year-season at farrowing; HYSI1, herd-year-season at first insemination; HYSI, herd-year-season at insemination; I2, second insemination; CIWP, class interval between weaning and pregnancy; CTNB, class of total number born; LP, lactation period; LP2, LP × LP; NW, number weaned; PBCB, purebred or crossbred litter; LBW, litter birth weight; LVAR, litter variation; STAY, stayability after second cycle; MAX, maximum cycle number; TNB, total number born; STB, total number stillborn; LMO, litter mortality; PIWI, prolonged interval between weaning and first insemination; GLE, gestation length; FRT, farrowing rate.

When appropriate, random permanent environment and service sire effects were fitted (Table 2). For the traits LBW, LVAR, and LMO, a permanent environmental effect was fitted, and the associated model can be summarized as:

$$y = Xb + Za + Wc + e$$

where $W$ is an incidence matrix associated with the vector of random permanent environmental effects $c \sim (0, I\sigma_{pe}^2)$, and $\sigma_{pe}^2$ is the permanent environmental variance.

For the traits TNB, STB, GLE, and FRT, the associated model included both a permanent environmental and a nongenetic service sire effect, and can be summarized as:

$$y = Xb + Za + Wc + Vs + e$$

where $V$ is an incidence matrix associated with the vector of service sire effects $s \sim (0, I\sigma_{sire}^2)$, and $\sigma_{sire}^2$ is the service sire variance.
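For orientation, the covariance structure implied by the fullest model can be written out. This sketch assumes that the random effects and the residual are mutually uncorrelated, which is the usual convention but is not stated explicitly in the text:

$$\mathrm{Var}(y) = Z K Z^{\top} \sigma_a^2 + W W^{\top} \sigma_{pe}^2 + V V^{\top} \sigma_{sire}^2 + I \sigma_e^2$$

For traits without a service sire effect, or without a permanent environmental effect, the corresponding term is simply absent.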

Analysis Using Pedigree Relationships

Variance components were estimated using AIREMLF90 (Misztal et al., 2002). All methods used the same pedigree so that additive genetic variance estimates are comparable with the same base (Legarra, 2016). The pedigree was provided to AIREMLF90, which for solving built the A1 internally including inbreeding. The variance estimates calculated with A1 were used as starting values for each of the following methods.

Analysis Using Genomic Relationships

The AIREMLF90 software implements A1 by default. For solving with HDIRECT1, the pedigree and genotypes were provided to PREGSF90 which built A1 and G1 internally with the default options. The GDIRECT matrix needed was computed with PREGSF90 as follows:

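$$G_{DIRECT} = \alpha\,(a \times G_{raw} + J \times b) + \beta A$$

where $a$ and $b$ (0.933 and 0.134) were computed following Powell et al. (2010) to scale inbreeding in $G_{raw}$ to the same level as in $A$, $\alpha = 0.95$ and $\beta = 0.05$, $J$ was a matrix of ones, and $G_{raw}$ was computed following the first method of VanRaden (2008):

$$G_{raw} = \frac{ZZ'}{k}$$

where $Z$ here denotes the matrix of centered genotypes (not the incidence matrix of the animal model), $k = 2 \times \sum [p \times (1 - p)]$, and the allele frequencies ($p$) were estimated from the genotyped population.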
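The two steps above can be sketched in a few lines of plain Python. This is an illustration of the formulas only, not of how PREGSF90 implements them. The function names and the toy genotypes are hypothetical, and `a_ped` is assumed to hold pedigree relationships for the same genotyped animals, so that it is conformable with the genomic matrix.

```python
def vanraden_g(genotypes):
    """G_raw = Z Z' / k, with Z = genotypes (coded 0/1/2) centered by 2p."""
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    # Allele frequencies estimated from the genotyped animals themselves.
    p = [sum(row[j] for row in genotypes) / (2.0 * n_animals) for j in range(n_markers)]
    # k = 2 * sum(p * (1 - p)); zero if every marker is monomorphic (not handled here).
    k = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    z = [[row[j] - 2.0 * p[j] for j in range(n_markers)] for row in genotypes]
    return [
        [sum(z[i][m] * z[j][m] for m in range(n_markers)) / k for j in range(n_animals)]
        for i in range(n_animals)
    ]


def blend_g(g_raw, a_ped, a=0.933, b=0.134, alpha=0.95, beta=0.05):
    """G_DIRECT = alpha * (a * G_raw + b * J) + beta * A, with J a matrix of ones."""
    n = len(g_raw)
    return [
        [alpha * (a * g_raw[i][j] + b) + beta * a_ped[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical genotypes for 3 animals and 4 markers.
toy_genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
g_raw = vanraden_g(toy_genotypes)
```

Because the allele frequencies are estimated from the same animals, every column of the centered matrix sums to zero, so each row of `g_raw` also sums to zero.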

PREGSF90 then created $G_{DIRECT}^{-1} - A_{22}^{-1}$ in a binary format, where $A_{22}^{-1}$ is the inverse of the pedigree relationship matrix for genotyped animals. AIREMLF90 built HDIRECT1 internally by using $G_{DIRECT}^{-1} - A_{22}^{-1}$ as follows (Aguilar et al., 2010; Christensen and Lund, 2010):

$$H_{DIRECT}^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G_{DIRECT}^{-1} - A_{22}^{-1} \end{bmatrix}$$
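The assembly of the combined inverse amounts to adding the difference matrix into the rows and columns of the genotyped animals. A minimal sketch follows, assuming dense matrices stored as lists of lists. The function name and the index-based layout are hypothetical and do not reflect AIREMLF90 internals.

```python
def build_h_inverse(a_inv, g_inv_minus_a22_inv, genotyped):
    """Return H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]].

    a_inv: inverse pedigree relationship matrix for all animals.
    g_inv_minus_a22_inv: difference matrix, ordered as in `genotyped`.
    genotyped: positions of the genotyped animals within a_inv.
    """
    h_inv = [row[:] for row in a_inv]  # copy so that a_inv is left untouched
    for r, i in enumerate(genotyped):
        for c, j in enumerate(genotyped):
            h_inv[i][j] += g_inv_minus_a22_inv[r][c]
    return h_inv


# Hypothetical example: 3 animals, of which the last two are genotyped.
a_inv_toy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
diff_toy = [[0.5, -0.1], [-0.1, 0.2]]
h_inv_toy = build_h_inverse(a_inv_toy, diff_toy, genotyped=[1, 2])
```

Using positions rather than assuming the genotyped animals come last keeps the sketch independent of how the pedigree is ordered.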

Analysis Using Metafounders

The data used in the analysis was from a single purebred line, and a single metafounder was defined and added as a pseudo-individual to the pedigree to reflect a single founding group (Christensen, 2012). The pedigree relationship matrix was built with this extended pedigree (AMETA1), using Calc_grm (Calus and Vandenplas, 2016). The self-relationship of the metafounder (γ) was estimated using the method of moments based on summary statistics of Legarra et al. (2015). The estimated self-relationship was equal to 0.364. The HMETA1 was built using Calc_grm (Calus and Vandenplas, 2016) as:

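$$H_{META}^{-1} = A_{META}^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G_{META}^{-1} - A_{22}^{-1} \end{bmatrix}$$

where $A_{22}^{-1}$ is the inverse of the pedigree relationship matrix among genotyped animals, and $G_{META}^{-1}$ is the inverse of the genomic relationship matrix computed with allele frequencies of 0.5 following Legarra et al. (2015).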

The AMETA1 and HMETA1 were then provided to AIREMLF90 directly for solving. By adding the metafounder to AMETA1 and HMETA1, the additive genetic variances (and their standard errors) were no longer on the same scale as the ones obtained with A1 or HDIRECT1. The additive genetic variances (and their standard errors) expressed on the metafounder base were thus rescaled after estimation, by multiplying them by a function of the self-relationship, $1 - \gamma/2$ (Legarra et al., 2015).
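As a small worked illustration of this rescaling, the factor is taken here to be 1 − γ/2 with the estimated γ of 0.364. The function name and the input variance of 1.0 are hypothetical and are used only to show the size of the factor.

```python
GAMMA = 0.364  # estimated self-relationship of the metafounder, from the text


def rescale_from_metafounder_base(value, gamma=GAMMA):
    """Rescale a variance (or its standard error) by the factor 1 - gamma / 2."""
    return value * (1.0 - gamma / 2.0)


# A hypothetical variance of 1.0 on the metafounder base.
rescaled = rescale_from_metafounder_base(1.0)
```

With γ = 0.364 the factor is 0.818, so variances on the metafounder base shrink by roughly 18% when expressed on the pedigree base.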

Analysis Using Genomic Relationships Approximated with APY

The approximation of G1 with APY was done following the method of Misztal et al. (2014) with Calc_grm (Calus and Vandenplas, 2016). To ensure the relationships in GAPY were relative to the same base as A, scaling was based on the method of Powell et al. (2010) as used above for GDIRECT. The matrix $G_{APY}^{-1} - A_{22}^{-1}$ was translated into a binary format and provided as an external matrix to AIREMLF90, which then built HAPY1.
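For reference, the APY inverse is commonly written as below, following Misztal et al. (2014). The subscripts $c$ (core) and $n$ (non-core) and the symbol $M_{nn}$ are notation introduced here and are not taken from the text:

$$G_{APY}^{-1} = \begin{bmatrix} G_{cc}^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} -G_{cc}^{-1} G_{cn} \\ I \end{bmatrix} M_{nn}^{-1} \begin{bmatrix} -G_{nc} G_{cc}^{-1} & I \end{bmatrix}$$

where $M_{nn}$ is diagonal with elements $m_{ii} = g_{ii} - g_{ic} G_{cc}^{-1} g_{ci}$ for each non-core animal $i$. Only the core block $G_{cc}$ needs a direct inversion, which is where the computational saving comes from.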

To determine the number of animals to be included as core animals, recommendations by Bradford et al. (2017) were followed. First, GDIRECT was built in Calc_grm and a principal components analysis was used to identify that 98% of variation was explained by 5,136 individuals. Core animals were selected based on the amount of information available for an individual. The number of records for LBW was used as an indicator, because the other traits with repeated records would provide similar animals to the core as LBW, while the traits with a single observation would provide too many core animals. In total, 4,764 core animals were selected, which included those with four or more LBW phenotype records (4,229), and genotyped sires (498) and dams (37) that were not phenotyped but had progeny with both genotypes and phenotypes. The genotyped animals with three or fewer LBW records (5,466) were selected as non-core animals, and 230 animals were randomly removed from this group to limit the number of genotyped animals to 10,000 to meet software limitations as mentioned previously.
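The selection rule described above can be sketched as follows. The thresholds and the reported counts come from the text, whereas the function name, the dictionary layout, and the random seed are hypothetical choices made for this illustration.

```python
import random


def split_core_noncore(lbw_records, unphenotyped_parents,
                       min_core_records=4, max_genotyped=10_000, seed=0):
    """Split genotyped animals into core and non-core groups.

    lbw_records: dict mapping animal id to its number of LBW records.
    unphenotyped_parents: genotyped sires and dams without own records.
    """
    core = [a for a, n in lbw_records.items() if n >= min_core_records]
    core += list(unphenotyped_parents)
    noncore = [a for a, n in lbw_records.items() if n < min_core_records]
    excess = len(core) + len(noncore) - max_genotyped
    if excess > 0:
        # Randomly drop non-core animals until the cap is met.
        dropped = set(random.Random(seed).sample(noncore, excess))
        noncore = [a for a in noncore if a not in dropped]
    return core, noncore


# Hypothetical toy data with a cap of 5 genotyped animals.
toy_records = {"s1": 5, "s2": 4, "s3": 3, "s4": 1, "s5": 2}
core, noncore = split_core_noncore(toy_records, ["sire1"], max_genotyped=5)

# Counts reported in the text: core group and final number of genotypes.
reported_core = 4_229 + 498 + 37
reported_total = reported_core + 5_466 - 230
```

The reported counts are internally consistent: the core group sums to 4,764 animals, and adding the 5,466 non-core animals minus the 230 removed gives 10,000.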

Estimating Repeatability and Heritability

All models converged for each of the method and trait combinations. Convergence was achieved when the squared relative difference between the variances estimated in two consecutive runs was lower than $1.0 \times 10^{-10}$. The variances estimated with AIREMLF90 were used to calculate repeatability ($r$) and heritability ($h^2$). Repeatability was calculated as $r = (\sigma_a^2 + \sigma_{sire}^2 + \sigma_{pe}^2)/(\sigma_a^2 + \sigma_{sire}^2 + \sigma_{pe}^2 + \sigma_e^2)$, and heritability was calculated as $h^2 = \sigma_a^2/(\sigma_a^2 + \sigma_{sire}^2 + \sigma_{pe}^2 + \sigma_e^2)$. Note that variances for service sire ($\sigma_{sire}^2$) and permanent environment ($\sigma_{pe}^2$) were only included for traits which fit those effects in the model, while the additive ($\sigma_a^2$) and residual ($\sigma_e^2$) variances were estimated for all traits. Standard errors for repeatability and heritability were estimated with Monte Carlo sampling as described by Houle and Meyer (2015), and implemented in AIREMLF90, which were then used to determine significant differences between models, based on a Z-test with a significance of 0.05.
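The two ratios can be checked against the reported estimates with a short sketch. The function name is hypothetical, the input values are the A1 estimates for TNB and PIWI from Table 3, and effects that were not fitted are set to zero.

```python
def repeatability_and_heritability(var_a, var_e, var_pe=0.0, var_sire=0.0):
    """Return (r, h2) from variance components; unfitted effects default to zero."""
    total = var_a + var_sire + var_pe + var_e
    r = (var_a + var_sire + var_pe) / total
    h2 = var_a / total
    return r, h2


# TNB with A1 (Table 3): additive, residual, permanent environment, service sire.
r_tnb, h2_tnb = repeatability_and_heritability(1.893, 8.971, var_pe=1.124, var_sire=0.435)

# PIWI with A1 (Table 3): only additive and residual variances were fitted.
r_piwi, h2_piwi = repeatability_and_heritability(177.5, 976.8)
```

Rounded to two decimals, these reproduce the tabulated values (0.28 and 0.15 for TNB, and 0.15 for both ratios for PIWI). For traits with only an additive genetic effect, repeatability and heritability coincide by construction.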

Results

The models based on A1 tended to have higher estimates for additive genetic effects and lower estimates for the permanent environment compared with the models based on HDIRECT1 (8 of the 10 traits). Variance estimates using the HDIRECT1, HMETA1 (variances rescaled to be expressed on the same base as the other methods), and HAPY1 were all very similar when compared with each other. For most traits, the variances estimated with AMETA1 were the same as when estimated with A1; the additive genetic variance for LBW and GLE (and the sire variance for GLE) had a maximum deviation of less than 1%, with no differences for other variance estimates. Therefore, these results are neither reported nor discussed below. To make the comparison between the methods, the traits have been separated into four categories, based on the difference in additive genetic variance estimated with A1 or HDIRECT1. These are: additive genetic variances estimated with A1 that are significantly higher than with HDIRECT1 (TNB and PIWI; Table 3), additive genetic variances estimated with A1 that are higher than with HDIRECT1 by at least 10% but where the difference is not significant (MAX, LMO, FRT; Table 4), additive genetic variances estimated with A1 and HDIRECT1 that differ by less than 10% and are not significantly different (LVAR, STAY, STB, GLE; Table 5), and additive genetic variances estimated with A1 that are lower than, but not significantly different from, HDIRECT1 (LBW; Table 6). Note that there were no traits where the additive genetic variances estimated with A1 were significantly lower than with HDIRECT1.

Table 3.

Genetic parameters for maternal sow traits where additive genetic variances (σa2) estimated with A1 are significantly greater than with HDIRECT1, with permanent environmental variance (σpe2), sire variance (σsire2), and residual variance (σe2) estimates included for the calculation of repeatability (r) and heritability (h2), with standard errors (± SE)1

Trait Method σa2 σpe2 σsire2 σe2 r h2 −2LogL difference*
TNB A1 1.893 ± 0.158 1.124 ± 0.109 0.435 ± 0.040 8.971 ± 0.083 0.28 ± 0.01 0.15 ± 0.01 6,119
HDIRECT1 1.540 ± 0.120 1.426 ± 0.081 0.429 ± 0.040 9.000 ± 0.083 0.27 ± 0.01 0.12 ± 0.01 5,911
HMETA1 1.488 ± 0.117 1.546 ± 0.077 0.430 ± 0.040 9.003 ± 0.083 0.28 ± 0.01 0.12 ± 0.01 5,881
HAPY1 1.512 ± 0.120 1.534 ± 0.077 0.430 ± 0.040 9.002 ± 0.083 0.28 ± 0.01 0.12 ± 0.01 0
PIWI A1 177.5 ± 25.46 - - 976.8 ± 23.49 0.15 ± 0.02 0.15 ± 0.02 17
HDIRECT1 115.1 ± 15.94 - - 1030.9 ± 18.95 0.10 ± 0.01 0.10 ± 0.01 17
HMETA1 118.4 ± 15.95 - - 1037.6 ± 18.42 0.10 ± 0.02 0.10 ± 0.02 0
HAPY1 117.0 ± 15.96 - - 1038.8 ± 18.47 0.10 ± 0.01 0.10 ± 0.01 5

1TNB, total number born; PIWI, prolonged interval between weaning and first insemination; A1, inverted pedigree relationship matrix; HDIRECT1, G1 inverted from the full genomic relationship matrix (G); HMETA1, G1 inverted from the full G with metafounder included in A1; HAPY1, G1 approximation based on recursion of core animals in G.

*Difference between maximum likelihood function of the model fitted and the lowest value obtained for the trait (TNB: 169,900; PIWI: 76,119), where a smaller value indicates a better fit.

Table 4.

Genetic parameters for maternal sow traits where additive genetic variances (σa2) estimated with A1 are greater, but not significantly so, than with HDIRECT1, with permanent environmental variance (σpe2), sire variance (σsire2), and residual variance (σe2) estimates included for the calculation of repeatability (r) and heritability (h2), with standard errors (± SE)1

Trait Method σa2 σpe2 σsire2 σe2 r h2 -2LogL difference*
MAX A1 0.151 ± 0.034 - - 1.285 ± 0.034 0.11 ± 0.02 0.11 ± 0.02 0
HDIRECT1 0.117 ± 0.025 - - 1.316 ± 0.029 0.08 ± 0.02 0.08 ± 0.02 1
HMETA1 0.143 ± 0.028 - - 1.307 ± 0.028 0.10 ± 0.02 0.10 ± 0.02 35
HAPY1 0.119 ± 0.024 - - 1.326 ± 0.028 0.08 ± 0.02 0.08 ± 0.02 1
LMO A1 16.06 ± 1.99 17.32 ± 1.82 - 195.00 ± 1.80 0.15 ± 0.01 0.07 ± 0.01 23
HDIRECT1 13.45 ± 1.44 18.75 ± 1.49 - 195.71 ± 1.82 0.14 ± 0.01 0.06 ± 0.01 20
HMETA1 13.78 ± 1.43 19.34 ± 1.44 - 195.80 ± 1.82 0.15 ± 0.01 0.06 ± 0.01 0
HAPY1 13.62 ± 1.43 19.34 ± 1.45 - 195.82 ± 1.82 0.14 ± 0.01 0.06 ± 0.01 3
FRT A1 5.31 ± 1.59 3.89 ± 2.47 11.95 ± 1.42 466.06 ± 4.24 0.043 ± 0.001 0.011 ± 0.001 1
HDIRECT1 3.36 ± 1.59 5.41 ± 2.29 11.91 ± 1.41 466.22 ± 4.24 0.043 ± 0.006 0.007 ± 0.002 0
HMETA1 3.05 ± 0.97 5.95 ± 2.28 11.90 ± 1.41 466.22 ± 4.24 0.043 ± 0.006 0.006 ± 0.002 4
HAPY1 2.92 ± 0.93 6.05 ± 2.28 11.89 ± 1.41 466.22 ± 4.24 0.043 ± 0.005 0.006 ± 0.002 5

1MAX, maximum cycle number; LMO, litter mortality; FRT, farrowing rate; A1, inverted pedigree relationship matrix; HDIRECT1, G1 inverted from the full genomic relationship matrix (G); HMETA1, G1 inverted from the full G with metafounder included in A1; HAPY1, G1 approximation based on recursion of core animals in G.

*Difference between maximum likelihood function of the model fitted and the lowest value obtained for the trait (MAX: 18,619; LMO: 265,007; FRT: 300,766), where a smaller value indicates a better fit.

Table 5.

Genetic parameters for maternal sow traits where additive genetic variances (σa2) estimated with A1 are not different than with HDIRECT1, with permanent environmental variance (σpe2), sire variance (σsire2), and residual variance (σe2) estimates included for the calculation of repeatability (r) and heritability (h2), with standard errors (± SE)1

Trait Method σa2 σpe2 σsire2 σe2 r h2 -2LogL difference*
LVAR A1 640 ± 55 256 ± 39 - 4,201 ± 37 0.18 ± 0.01 0.13 ± 0.02 401
HDIRECT1 637 ± 46 286 ± 28 - 4,205 ± 37 0.18 ± 0.01 0.12 ± 0.01 51
HMETA1 673 ± 49 316 ± 27 - 4,206 ± 38 0.19 ± 0.01 0.13 ± 0.01 900
HAPY1 676 ± 49 312 ± 27 - 4,206 ± 37 0.19 ± 0.01 0.13 ± 0.01 0
STAY A1 0.003 ± 0.001 - - 0.068 ± 0.001 0.044 ± 0.014 0.044 ± 0.014 104
HDIRECT1 0.002 ± 0.001 - - 0.069 ± 0.001 0.029 ± 0.009 0.029 ± 0.009 80
HMETA1 0.002 ± 0.001 - - 0.069 ± 0.001 0.029 ± 0.011 0.029 ± 0.011 12
HAPY1 0.001 ± 0.001 - - 0.069 ± 0.001 0.015 ± 0.008 0.015 ± 0.008 0
STB A1 0.048 ± 0.004 0.027 ± 0.003 0.006 ± 0.001 0.299 ± 0.003 0.21 ± 0.01 0.13 ± 0.01 464
HDIRECT1 0.052 ± 0.004 0.027 ± 0.002 0.006 ± 0.001 0.299 ± 0.003 0.22 ± 0.01 0.14 ± 0.01 96
HMETA1 0.051 ± 0.004 0.030 ± 0.002 0.006 ± 0.001 0.299 ± 0.002 0.23 ± 0.01 0.13 ± 0.01 0
HAPY1 0.052 ± 0.004 0.029 ± 0.004 0.006 ± 0.001 0.299 ± 0.003 0.23 ± 0.01 0.14 ± 0.01 238
GLE A1 0.950 ± 0.045 0.149 ± 0.003 0.227 ± 0.001 0.888 ± 0.003 0.60 ± 0.02 0.43 ± 0.01 1,409
HDIRECT1 0.905 ± 0.040 0.226 ± 0.002 0.231 ± 0.001 0.892 ± 0.003 0.60 ± 0.01 0.40 ± 0.01 96
HMETA1 0.911 ± 0.043 0.286 ± 0.002 0.231 ± 0.001 0.892 ± 0.002 0.62 ± 0.01 0.39 ± 0.01 0
HAPY1 0.914 ± 0.044 0.283 ± 0.004 0.231 ± 0.001 0.892 ± 0.003 0.61 ± 0.01 0.39 ± 0.01 38

1LVAR, litter variation; STAY, stayability after second cycle; STB, total number stillborn; GLE, gestation length; A1, inverted pedigree relationship matrix; HDIRECT1, G1 inverted from the full genomic relationship matrix (G); HMETA1, G1 inverted from the full G with metafounder included in A1; HAPY1, G1 approximation based on recursion of core animals in G.

*Difference between maximum likelihood function of the model fitted and the lowest value obtained for the trait (LVAR: 376,393; STAY: 2,215; STB: 61,502; GLE: 102,987), where a smaller value indicates a better fit.

Table 6.

Genetic parameters for maternal sow traits where additive genetic variances (σa2) estimated with A1 are lower, but not significantly so, than with HDIRECT1, with permanent environmental variance (σpe2), sire variance (σsire2), and residual variance (σe2) estimates included for the calculation of repeatability (r) and heritability (h2), with standard errors (± SE)1

Trait Method σa2 σpe2 σsire2 σe2 r h2 -2LogL difference*
LBW A1 13,535 ± 688 3,552 ± 361 - 14,172 ± 129 0.55 ± 0.01 0.43 ± 0.02 1,201
HDIRECT1 14,284 ± 647 4,247 ± 210 - 14,268 ± 130 0.57 ± 0.01 0.44 ± 0.01 120
HMETA1 14,428 ± 699 5,176 ± 194 - 14,281 ± 130 0.58 ± 0.01 0.43 ± 0.01 0
HAPY1 14,140 ± 695 5,195 ± 199 - 14,278 ± 130 0.58 ± 0.01 0.42 ± 0.01 32

1LBW, litter birth weight; A1, inverted pedigree relationship matrix; HDIRECT1, G1 inverted from the full genomic relationship matrix (G); HMETA1, G1 inverted from the full G with metafounder included in A1; HAPY1, G1 approximation based on recursion of core animals in G.

*Difference between maximum likelihood function of the model fitted and the lowest value obtained for the trait (LBW: 423,684), where a smaller value indicates a better fit.

The decision to present the results within these four categories was because there was no observed patterns based on trait heritability. Whether the trait was lowly or highly heritable did not appear to relate to differences in estimated heritability between methods. Nor was there consistency based on whether the trait was lowly to highly heritable when comparing differences in variance estimates.

The additive genetic variance for TNB was significantly higher for A1 (1.893) compared with HDIRECT1 (1.540). The lower additive genetic variance for HDIRECT1 was countered by a higher estimate for the permanent environmental component (1.426 compared with 1.124 for A1). For PIWI, which did not fit a permanent environmental or service sire effect, the higher variance for the additive genetic component with A1 (177.5 compared with 115.1) was moved to the residual with HDIRECT1 (1,030.9 compared with 976.8 with A1; Table 3). For both traits (TNB and PIWI), there was no significant difference in additive genetic variance estimated with HMETA1 or HAPY1 compared with HDIRECT1, nor was there any significant difference for the other variance components, repeatability, or heritability (Table 3).

The additive variance estimates were higher with A1 compared with HDIRECT1 for the traits MAX (0.151 and 0.117), LMO (16.06 and 13.45), and FRT (5.31 and 3.36), but these differences were not significant (Table 4). For MAX using the HDIRECT1 and HAPY1 methods, the variance removed from the additive genetic component (compared with A1) was moved to the residual. It was also the only trait where the additive genetic variance estimated with HMETA1 (0.143) was significantly different to both HDIRECT1 and HAPY1 (0.119). The heritability of MAX for each of the methods was low (between 0.08 and 0.11). For both LMO and FRT, the lower variance estimate for the additive genetic component corresponded with a larger estimate for the permanent environmental component. The repeatability between the four methods for both LMO and FRT were not different, nor was the heritability of LMO (between 0.06 and 0.07), however the heritability of FRT was slightly lower for the HDIRECT1, HMETA1, and HAPY1 methods compared with A1 but all were very low (0.006 to 0.011).

The differences in variance estimates between A1 and the three H1 methods were similar for LVAR, STAY, STB, and GLE (Table 5). Any differences were insignificant and less than 10% relatively. There was still a tendency for A1 to have a larger additive genetic variance and lower permanent environmental variance compared with HDIRECT1. This was true for LVAR, STAY and GLE. The additive genetic variance for LVAR estimated with HMETA1 (673) and HAPY1 (676) were higher than with A1 (640) and HDIRECT1 (637). However, this had limited impact on the repeatability (between 0.18 and 0.19) and heritability (0.12 to 0.13). The additive genetic variances estimated for STAY with A1 and the three H1 methods were all low, between 0.001 and 0.003, and with residual variances of between 0.068 and 0.069. There was a difference in heritability for STAY, between A1 (0.044) and HAPY1 (0.015), however they were both very low and not significantly different. There was however no significant difference between HDIRECT1, HMETA1, and HAPY1. For STB, the additive genetic variances (between 0.048 and 0.052) and residual variances (0.299) were also very low across all methods. Variance estimates for the permanent environmental component (0.027 to 0.030) and service sire component (0.006) were also very low for STB, however unlike for STAY there was no significant difference in heritability between methods (0.13 to 0.14).

LBW was the only trait to have an observably lower additive genetic variance with A1 (13,535) compared with HDIRECT1 (14,284) (Table 6). However, this difference was not significant. The permanent environmental variance estimated with A1 (3,552) was also lower compared with HDIRECT1 (4,247), and this difference was significant. The difference was even greater for HMETA1 (5,176) and HAPY1 (5,195). To investigate this difference further, the solutions for each of the fixed effects were plotted for A1 and HDIRECT1. For each of the fixed effects, there was a linear relationship and a correlation approaching one, except for herd-year-season at farrowing (Figure 2). Similar results were found for HMETA1 and HAPY1 when compared with A1.

Figure 2.

Solutions for herd-year-season at farrowing (HYSF) when solving with either A1 (inverted pedigree relationship matrix) or HDIRECT1 (inverted relationship matrix that combines pedigree and the full genomic information).

The log-likelihood multiplied by negative two was included as an indication of model fit. The differences between the methods were very small. However, there was a limited tendency for A1 to have the poorest fit, with the largest value for 7 of the 10 traits (STAY, LMO, STB, TNB, PIWI, GLE, and LBW), while HMETA1 tended to have the best fit, with the lowest value for 5 of the 10 traits (LMO, STB, PIWI, GLE, and LBW). There appeared to be no relationship between the heritability of the trait, or differences in variance estimates, and which method provided the best fit.

Discussion

We hypothesized that there would be limited differences in variances estimated with A1 and HDIRECT1. There was a tendency for A1 to have a larger estimate for the additive genetic effect, and for HDIRECT1 to have a larger estimate for the permanent environmental component, but these differences were small and support the first hypothesis. Our second hypothesis was that there would be no difference in variance estimates between HDIRECT1, HMETA1, and HAPY1. The second hypothesis was also supported, as the differences between HDIRECT1, HMETA1, and HAPY1 were limited and not significant. The differences in heritability and repeatability are so small across the four methods, for each of the traits, that there should be limited impact on estimated breeding values from ssGBLUP (Henderson, 1984), but some minor re-ranking of animals for traits that fit a permanent environmental effect or have data structure issues is probable when moving from A1 to H1 methods. We also investigated whether any of the methods provided a better model fit. When considering the maximum likelihood function (−2 × LogL), a value closer to zero is considered a better fit. Across the 10 traits, there was a tendency for the A1 method to have the poorest fit and HMETA1 the best fit. Differences in maximum likelihood were, however, generally small, and therefore not conclusive.

Heritability

The variance component estimates for this study were similar to previous estimates published in the literature for maternal sow traits. There were some small differences between variance estimates in this study and previous estimates, which can be explained by different datasets, numbers of animals and records, breeds and lines used, and fixed effects fitted. We compared the heritabilities calculated from the variance estimates and found that the majority of the traits, including LBW, MAX, TNB, STB, and FRT, had heritabilities within the range of previously published estimates (Roehe, 1999; Hanenberg et al., 2001; Knol et al., 2002; van Grevenhof et al., 2015; Sevillano et al., 2016). The heritability for STAY (0.04) was the only estimate that was lower than previous estimates (0.11 ± 0.01; van Grevenhof et al., 2015). Heritability for both LVAR (0.13) and PIWI (0.15) was higher than, but not significantly different from, previously published estimates (0.00 to 0.08 and 0.07 to 0.14, respectively) (Hanenberg et al., 2001; Damgaard et al., 2003; Bergsma et al., 2008). The heritability of GLE (0.43) was the only estimate that was significantly higher than previous estimates (0.25 to 0.33) (Hanenberg et al., 2001; Rydhmer et al., 2008).

Variance Estimates with Full G

The standard errors for variance components estimated with HDIRECT1 were lower than with A1, an indication that HDIRECT1 is a more informative matrix. Estimates with HDIRECT1 tended to show lower additive genetic and higher permanent environmental variances than those with A1. This is likely because HDIRECT1 defines the relationships between individuals more accurately, owing to the additional information in GDIRECT1 (Legarra, 2016). HDIRECT1 is therefore likely to be better at separating the additive genetic and permanent environmental effects. Assuming that the genomic-based variance component estimates are more correct, this could mean that additive genetic variances with A1 may be overestimated in some cases, but these differences were limited.
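For readers who want to see how genomic information enters the combined matrix, the following is a minimal sketch of the usual single-step form of the inverse (Aguilar et al., 2010; Christensen and Lund, 2010): the inverse of A, with the difference between the inverse of G and the inverse of A22 added to the block for genotyped animals. The function name, the toy pedigree relationships, and the toy G are hypothetical, and dense inverses are used only for clarity on a tiny example; this is not the software used in the study.

```python
import numpy as np


def h_inverse(A, G, genotyped):
    """Combine pedigree (A) and genomic (G) relationships into H^-1.

    `genotyped` lists the row indices of A for the genotyped animals,
    in the same order as the rows of G.
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    idx = np.asarray(genotyped)
    block = np.ix_(idx, idx)

    H_inv = np.linalg.inv(A)
    A22 = A[block]  # pedigree relationships among genotyped animals
    H_inv[block] += np.linalg.inv(G) - np.linalg.inv(A22)
    return H_inv


# Toy pedigree: animals 0 and 1 are unrelated founders,
# animals 2 and 3 are their full-sib offspring (both genotyped).
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])
G = np.array([[1.02, 0.45],
              [0.45, 0.98]])  # hypothetical genomic relationships

H_inv = h_inverse(A, G, genotyped=[2, 3])
```

A useful property to check on such a toy example is that inverting the result returns G in the block for the genotyped animals, which is how the genomic relationships replace the pedigree relationships for those animals.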

Variance Estimates with Metafounders

We did not expect significant differences in estimated variances between the HDIRECT1 and HMETA1 methods after scaling, and this was verified for most traits. The reason for this expectation was that we already expected limited differences between A1 and G1 (Legarra, 2016). Including metafounders should make the pedigree and genomic relationships more compatible, so estimates using metafounders should be similar to those from A1, HDIRECT1, or both methods of defining the relationships. Increasing the number of metafounders could be beneficial for more complex data structures, over longer periods of time. In this case all unknown parents were from a single line and were considered as one metafounder. However, even when only a single metafounder is included, pedigree and genomic relationship matrices are more compatible (Christensen, 2012), which improves the ability to estimate variances. The inclusion of a single metafounder in simulations has been shown to be beneficial, with more accurate EBVs (Meyer et al., 2018). However, this was not observed with additional metafounders (van Grevenhof et al., 2018), but could be used to refine HMETA1 as scaling variance components becomes more difficult with more metafounders. Adding a metafounder to A1 yielded the same variances, after rescaling the genetic variance. This confirms that the scaling factor proposed by Legarra et al. (2015) is indeed correct. It also shows that adding metafounders to A1 for analyses based on pedigree only has no practical benefit, simply because there are no genomic relationships that need to be made compatible with pedigree relationships.
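The rescaling mentioned above can be illustrated with a short sketch. It assumes the single-metafounder form of the scaling in Legarra et al. (2015), in which a genetic variance estimated on the metafounder (related base) scale is multiplied by (1 − γ/2) to express it on the scale of an unrelated base population, where γ is the self-relationship of the metafounder. The function names and the numerical values are hypothetical and are not taken from this study.

```python
def to_unrelated_scale(var_related, gamma):
    """Rescale a genetic variance from the metafounder scale to the
    scale of an unrelated base population (single metafounder)."""
    if not 0.0 <= gamma < 2.0:
        raise ValueError("gamma must lie in [0, 2)")
    return var_related * (1.0 - gamma / 2.0)


def to_related_scale(var_unrelated, gamma):
    """Inverse operation: express a pedigree-scale variance on the
    metafounder scale."""
    if not 0.0 <= gamma < 2.0:
        raise ValueError("gamma must lie in [0, 2)")
    return var_unrelated / (1.0 - gamma / 2.0)


# Hypothetical example: a variance of 2.0 estimated with a metafounder
# whose self-relationship is 0.5.
var_meta = 2.0
gamma = 0.5
var_comparable = to_unrelated_scale(var_meta, gamma)
```

Only after such a rescaling are additive genetic variances from models with and without a metafounder expressed on the same scale and therefore directly comparable.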

Variance Estimates with APY

It has already been shown that GAPY1 is an accurate approximation of GDIRECT1 (Misztal et al., 2014, 2016; Bradford et al., 2017). Therefore, variances estimated with HAPY1 and HDIRECT1 were expected to be similar. The results of this study support this: no trait had a significantly different additive genetic variance, and only LBW had a significantly different permanent environmental variance. The number of core animals selected (4,764) was based on the principal components analysis of GDIRECT1. The similar results between HDIRECT1 and HAPY1 observed here confirmed that the number of allocated core animals was sufficient.
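One common way to base the number of core animals on a principal components analysis is to count how many of the largest eigenvalues of G are needed to explain a chosen fraction of its total variance. The sketch below illustrates that idea only; the function name and the default threshold of 0.98 are assumptions, and the threshold actually used to arrive at 4,764 core animals is not stated in this section.

```python
import numpy as np


def n_core_from_eigenvalues(G, fraction=0.98):
    """Number of largest eigenvalues of G needed to explain `fraction`
    of the total variance (the trace of G)."""
    eig = np.linalg.eigvalsh(np.asarray(G, dtype=float))[::-1]  # descending
    eig = np.clip(eig, 0.0, None)  # guard against tiny negative values
    cumulative = np.cumsum(eig) / eig.sum()
    n_needed = int(np.searchsorted(cumulative, fraction)) + 1
    return min(n_needed, eig.size)


# Toy example with known eigenvalues 5, 3, 1, 1 (hypothetical).
G_toy = np.diag([5.0, 3.0, 1.0, 1.0])
print(n_core_from_eigenvalues(G_toy, fraction=0.85))
```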

The main benefit of using HAPY1 compared with HDIRECT1 was the reduced computational requirement of building GAPY1 compared with GDIRECT1. For variance component estimation with REML, there was no benefit to using HAPY1, as the potential advantage of the sparse GAPY1 was canceled by A221 being dense. To utilize the sparsity of GAPY1 and improve the efficiency of variance component estimation, adapted software could be used that avoids the inversion of the left-hand side, such as Gibbs sampling (Misztal et al., 2002), or a sparse approximation of A221 (Faux and Gengler, 2015) could be used to preserve the sparsity of GAPY1.
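The source of the sparsity discussed above can be seen in a minimal sketch of the APY inverse as described by Misztal et al. (2014): only the core block of G is inverted directly, and the block for noncore animals becomes diagonal. The function name and the simulated genotype matrix are hypothetical, and a dense array is used for readability; a production implementation would store the result in a sparse format.

```python
import numpy as np


def apy_inverse(G, core):
    """Approximate G^-1 with the algorithm for proven and young animals.

    `core` lists the indices of the core animals; all others are noncore.
    """
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # regression of noncore on core animals

    # Diagonal of M: m_ii = g_ii - g_ic Gcc^-1 g_ci for each noncore animal.
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    G_inv[np.ix_(core, noncore)] = -P * m_inv
    G_inv[np.ix_(noncore, core)] = (-P * m_inv).T
    G_inv[noncore, noncore] = m_inv  # diagonal noncore block
    return G_inv


# Hypothetical genomic relationship matrix for six animals.
rng = np.random.default_rng(1)
Z = rng.standard_normal((6, 50))
G = Z @ Z.T / 50 + 0.01 * np.eye(6)

G_apy_inv = apy_inverse(G, core=[0, 1, 2, 3])
```

In this sketch the off-diagonal elements between the two noncore animals are exactly zero, and with a single noncore animal the result coincides with the direct inverse, so the approximation only affects relationships among noncore animals conditional on the core.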

Variance Estimates for LBW

The only trait to have a lower additive genetic variance with A1 compared with each of the H1 methods was LBW. We hypothesized that the additive genetic and permanent environmental variances both being higher with H1 could be due to confounding between some genetic component and a fixed effect. All animals phenotyped were also genotyped, and there was a consistently lower fixed effect for herd-year-season at farrowing with H1 methods. There is likely some confounding between some genetic component not captured by the pedigree and herd-year-season at farrowing. The other traits that fit this fixed effect (PIWI, LVAR, LMO, TNB, STB, and GLE) tended to have lower estimates for permanent environmental variance with A1 than with any of the H1 matrices, but unlike LBW the additive genetic variance was either not significantly different or greater with A1 compared with any of the H1 matrices. An alternative explanation is that LBW is treated as a trait of the sow, and not fitting the genetic effect for the piglet (assuming the genetic correlation between the sow and piglet trait is positive) may lead to inflation of the genetic variance for the sow trait (Roehe, 1999). It is unlikely that H1 can better account for this, as it would be expected to correspond with a decrease in either additive genetic or permanent environmental variances. Instead, there is no difference in residual variance with H1, with a larger permanent environmental variance with HDIRECT1 and an even higher one with HMETA1 and HAPY1. This leaves confounding between some genetic component and a fixed effect as the best explanation for the difference between A1 and HDIRECT1, which is exacerbated by HMETA1 and HAPY1. Despite the differences in variance estimates for LBW, there is limited impact on the repeatability or heritability, suggesting limited impact on EBVs (Henderson, 1984).

Possible Reasons for Different Variance Estimates

It is likely that some of the differences in variance estimates observed are due to the estimation process and also to factors not investigated in this study. The animals selected for genotyping were unlikely to be random, with a bias toward animals in recent generations being genotyped (Figure 1). If within generations predominantly the better animals are genotyped, then this would break the assumption of random Mendelian sampling and bias the results from A1 (Patry and Ducrocq, 2011; VanRaden, 2012; Masuda et al., 2018). The too-low variance of observed Mendelian sampling terms is expected to lead to decreased, and thus underestimated, genetic variances with A1 (Gao et al., 2019), while the use of genomic information provides a handle on the Mendelian sampling terms and thus is expected to lead to less biased estimates under nonrandom genotyping. Across the traits analyzed here, there was a tendency for higher rather than lower genetic variances to be estimated with A1, which suggests that any differences observed here in estimated genetic variances between models are unlikely to be due to the genotyping strategy. Furthermore, there were no animals with only phenotypes and pedigree information, as the data were limited to animals with genotypes and their ancestors. Finally, the differences observed with HMETA1 could be due to the approximation of the self-relationship of the metafounder (Garcia-Baccino et al., 2017).

Application and Recommendations

Variance components may continue to be estimated with A1, as there are limited differences between methods and it remains the easiest method to implement. For some traits, it could be appropriate to investigate variance component estimation with one of the H1 methods. Complex traits, traits with large datasets that need sub-setting, and traits that fit a permanent environmental effect could benefit from variance component estimation using H1. Further investigation and application to such traits could help to understand the differences between A1 and G1, and to better define the models used for calculating EBVs of these traits. To the authors' knowledge, this is the only study to compare variance estimates between the four methods of defining the relationship matrix (A1, HDIRECT1, HMETA1, and HAPY1) with the use of industry data. In this study, a single maternal Large White sow line was used, and the number of animals genotyped was limited to 10,000. However, we have no reason to believe that these results and recommendations are not applicable to other lines, breeds, species, or a larger number of genotyped animals.

Conflict of interest statement

The authors declare that there is no conflict of interest.

Footnotes

1

This study was financially supported by the Dutch Ministry of Economic Affairs (TKI Agri & Food project 16022) and the Breed4Food partners Cobb Europe, CRV, Hendrix Genetics, and Topigs Norsvin. The use of the HPC cluster has been made possible by CAT-AgroFood (Shared Research Facilities Wageningen UR).

Literature Cited

  1. Aguilar I., Misztal I., Johnson D., Legarra A., Tsuruta S., and Lawlor T. 2010. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J. Dairy Sci. 93:743–752. doi: 10.3168/jds.2009-2730
  2. Bergsma R., Kanis E., Verstegen M. W., and Knol E. F. 2008. Genetic parameters and predicted selection results for maternal traits related to lactation efficiency in sows. J. Anim. Sci. 86:1067–1080. doi: 10.2527/jas.2007-0165
  3. Bradford H. L., Pocrnić I., Fragomeni B. O., Lourenco D. A. L., and Misztal I. 2017. Selection of core animals in the algorithm for proven and young using a simulation model. J. Anim. Breed. Genet. 134:545–552. doi: 10.1111/jbg.12276
  4. Calus M., and Vandenplas J. 2016. Calc_grm—a program to compute pedigree, genomic, and combined relationship matrices. Wageningen (the Netherlands): ABGC, Wageningen UR Livestock Research.
  5. Christensen O. F. 2012. Compatibility of pedigree-based and marker-based relationship matrices for single-step genetic evaluation. Genet. Sel. Evol. 44:37. doi: 10.1186/1297-9686-44-37
  6. Christensen O. F., and Lund M. S. 2010. Genomic prediction when some animals are not genotyped. Genet. Sel. Evol. 42:2. doi: 10.1186/1297-9686-42-2
  7. Damgaard L. H., Rydhmer L., Løvendahl P., and Grandinson K. 2003. Genetic parameters for within-litter variation in piglet birth weight and change in within-litter variation during suckling. J. Anim. Sci. 81:604–610. doi: 10.2527/2003.813604x
  8. Faux P., and Gengler N. 2015. A method to approximate the inverse of a part of the additive relationship matrix. J. Anim. Breed. Genet. 132:229–238. doi: 10.1111/jbg.12128
  9. Fragomeni B. O., Lourenco D. A., Tsuruta S., Masuda Y., Aguilar I., Legarra A., Lawlor T. J., and Misztal I. 2015. Hot topic: use of genomic recursions in single-step genomic best linear unbiased predictor (BLUP) with a large number of genotypes. J. Dairy Sci. 98:4090–4094. doi: 10.3168/jds.2014-9125
  10. Gao H., Madsen P., Aamand G. P., Thomasen J. R., Sørensen A. C., and Jensen J. 2019. Bias in estimates of variance components in populations undergoing genomic selection: a simulation study. BMC Genomics 20:956. doi: 10.1186/s12864-019-6323-8
  11. Garcia-Baccino C. A., Legarra A., Christensen O. F., Misztal I., Pocrnic I., Vitezica Z. G., and Cantet R. J. 2017. Metafounders are related to Fst fixation indices and reduce bias in single-step genomic evaluations. Genet. Sel. Evol. 49:34. doi: 10.1186/s12711-017-0309-2
  12. Hanenberg E., Knol E., and Merks J. 2001. Estimates of genetic parameters for reproduction traits at different parities in Dutch Landrace pigs. Livest. Prod. Sci. 69:179–186. doi: 10.1016/S0301-6226(00)00258-X
  13. Henderson C. R. 1984. Applications of linear models in animal breeding. Can. Catal. Publ. Data. Guelph (Canada): University of Guelph.
  14. Houle D., and Meyer K. 2015. Estimating sampling error of evolutionary statistics based on genetic covariance matrices using maximum likelihood. J. Evol. Biol. 28:1542–1549. doi: 10.1111/jeb.12674
  15. Knol E. F., Leenhouwers J., and Van der Lende T. 2002. Genetic aspects of piglet survival. Livest. Prod. Sci. 78:47–55. doi: 10.1016/S0301-6226(02)00184-7
  16. Legarra A. 2016. Comparing estimates of genetic variance across different relationship models. Theor. Popul. Biol. 107:26–30. doi: 10.1016/j.tpb.2015.08.005
  17. Legarra A., Christensen O. F., Vitezica Z. G., Aguilar I., and Misztal I. 2015. Ancestral relationships using metafounders: finite ancestral populations and across population relationships. Genetics 200:455–468. doi: 10.1534/genetics.115.177014
  18. Masuda Y., VanRaden P. M., Misztal I., and Lawlor T. J. 2018. Differing genetic trend estimates from traditional and genomic evaluations of genotyped animals as evidence of preselection bias in US Holsteins. J. Dairy Sci. 101:5194–5206. doi: 10.3168/jds.2017-13310
  19. Meyer K., Tier B., and Swan A. 2018. Estimates of genetic trend for single-step genomic evaluations. Genet. Sel. Evol. 50:39. doi: 10.1186/s12711-018-0410-1
  20. Misztal I. 2016. Inexpensive computation of the inverse of the genomic relationship matrix in populations with small effective population size. Genetics 202:401–409. doi: 10.1534/genetics.115.182089
  21. Misztal I., Legarra A., and Aguilar I. 2014. Using recursion to compute the inverse of the genomic relationship matrix. J. Dairy Sci. 97:3943–3952. doi: 10.3168/jds.2013-7752
  22. Misztal I., Tsuruta S., Strabel T., Auvray B., Druet T., and Lee D. 2002. BLUPF90 and related programs (BGF90). In: Proceedings of the 7th World Congress on Genetics Applied to Livestock Production; Montpellier, France; p. 743–744.
  23. Patry C., and Ducrocq V. 2011. Evidence of biases in genetic evaluations due to genomic preselection in dairy cattle. J. Dairy Sci. 94:1011–1020. doi: 10.3168/jds.2010-3804
  24. Powell J. E., Visscher P. M., and Goddard M. E. 2010. Reconciling the analysis of IBD and IBS in complex trait studies. Nat. Rev. Genet. 11:800–805. doi: 10.1038/nrg2865
  25. Roehe R. 1999. Genetic determination of individual birth weight and its association with sow productivity traits using Bayesian analyses. J. Anim. Sci. 77:330–343. doi: 10.2527/1999.772330x
  26. Rydhmer L., Lundeheim N., and Canario L. 2008. Genetic correlations between gestation length, piglet survival and early growth. Livest. Sci. 115:287–293. doi: 10.1016/j.livsci.2007.08.014
  27. Sevillano C. A., Mulder H. A., Rashidi H., Mathur P. K., and Knol E. F. 2016. Genetic variation for farrowing rate in pigs in response to change in photoperiod and ambient temperature. J. Anim. Sci. 94:3185–3197. doi: 10.2527/jas.2015-9915
  28. van Grevenhof E., Knol E., and Heuven H. 2015. Interval from last insemination to culling: I. The genetic background in crossbred sows. Livest. Sci. 181:103–107. doi: 10.1016/j.livsci.2015.09.017
  29. van Grevenhof E. M., Vandenplas J., and Calus M. P. L. 2018. Genomic prediction for crossbred performance using metafounders. J. Anim. Sci. 97:548–558. doi: 10.1093/jas/sky433
  30. VanRaden P. M. 2008. Efficient methods to compute genomic predictions. J. Dairy Sci. 91:4414–4423. doi: 10.3168/jds.2007-0980
  31. VanRaden P. M. 2012. Avoiding bias from genomic pre-selection in converting daughter information across countries. Interbull. [accessed December 2, 2019]. Available from https://journal.interbull.org/index.php/ib/article/view/1243/1241

Articles from Journal of Animal Science are provided here courtesy of Oxford University Press
