Abstract
A spurious negative genetic correlation between direct and maternal effects of weaning weight (WW) in beef cattle has historically been problematic for researchers and industry. Previous research has suggested the covariance between sires and herds may be contributing to this relationship. The objective of this study was to estimate the variance components (VC) for WW in American Angus with and without sire by herd (S×H) interaction effect when genomic information is used or not. Five subsets of ~100k animals each were used. When genomic information was included, genotypes were added for 15,637 animals. Five replicates were performed. Four different models were tested, namely, M1: without S×H interaction effect and with covariance between direct and maternal effects ($\sigma_{am}$) ≠ 0; M2: with S×H interaction effect and $\sigma_{am}$ ≠ 0; M3: without S×H interaction effect and with $\sigma_{am}$ = 0; M4: with S×H interaction effect and $\sigma_{am}$ = 0. VC were estimated using restricted maximum likelihood (REML) and single-step genomic REML (ssGREML) with the average information algorithm. Breeding values were computed using single-step genomic BLUP for the models above and one additional model, which had the covariance zeroed after the estimation of VC (M5). The ability of each model to predict future breeding values was investigated with the linear regression method. Under REML, when the S×H interaction effect was added to the model, both direct and maternal genetic variances were greatly reduced, and the negative covariance became positive (i.e., when moving from M1 to M2). Similar patterns were observed under ssGREML, but with less reduction in the direct and maternal genetic variances and still a negative covariance. Models with the S×H interaction effect (M2 and M4) had a better fit according to the Akaike information criterion. Breeding values from those models were more accurate and had less bias than those from the other three models.
The rankings and breeding values of artificial insemination sires (N = 1,977) greatly changed when the S×H interaction effect was fit in the model. Although the S×H interaction effect accounted for 3% to 5% of the total phenotypic variance and improved the model fit, this change in the evaluation model will cause severe reranking among animals.
Keywords: direct and maternal covariance, single-step GBLUP, single-step genomic REML, sire × herd interaction, weaning weight
Lay Summary
A spurious negative genetic correlation between direct and maternal effects of weaning weight (WW) in beef cattle has been problematic for researchers and industry. Previous research suggested the covariance between sires and herds may contribute to this relationship. The objective of this study was to estimate the variance components (VC) for WW in American Angus with and without sire by herd (S×H) interaction effect when genomic information is used or not. Four models were designed to investigate the S×H effect. The restricted maximum likelihood (REML) and single-step genomic REML (ssGREML) were used to estimate VC. Breeding values were computed using single-step genomic BLUP and the validation was done through the linear regression method. Under REML, when the S×H was added to the model, both direct and maternal genetic variances were greatly reduced, and the negative covariance became positive. Similar patterns were observed under ssGREML, but with less reduction in the direct and maternal genetic variances and still a negative covariance. Breeding values from models with S×H were more accurate and had less bias than the other models. Although the S×H improved the model, this change in the evaluation model will cause severe reranking among key animals.
Including the S×H interaction effect could help better estimate genetic parameters, improving the model fit and breeding values estimation. However, significant changes in EPDs and the ranking of animals are expected if the current model is updated with S×H interactions.
Introduction
In beef cattle, the genetic covariance between the direct and maternal effects of weaning weight (WW) has shown an antagonistic effect that hinders progress in a selection program (Meyer, 1992; Pollak et al., 1994). Several simulation studies reported that this antagonistic estimate could arise from ignoring additional variance among sires, such as S×H and sire by year interaction effects (Robinson, 1996; Lee and Pollak, 1997). In Australian beef cattle, various studies reported significant S×H or sire × herd-year interaction effects for many traits, which accounted for ~5% to 10% of the phenotypic variation. Additionally, including the S×H interaction effect greatly reduced the negative covariance between direct and maternal effects on 200-d weight (Notter et al., 1992; Bradfield, 1999; Meyer and Graser, 1999). As a result, the Australian evaluation system, BREEDPLAN, began to include the S×H interaction effect in its national evaluation model in 1999 (Graser et al., 1999).
The major reasons for the variation due to S×H interaction have not been completely determined, but several possible sources are reported: (1) preferential treatment, (2) non-random mating, (3) use of selected sires, which could lead to heterogeneous residual and additive genetic variance among herds, and (4) extensive use of specific sires in particular herds. Therefore, ignoring S×H interaction effect in the evaluation model could inflate the genetic variance and overestimate the estimated breeding value (EBV; Tong et al., 1977; Meyer, 1987; Banos and Shook, 1990). When the S×H interaction effect was fit in the genetic evaluation model, the direct and maternal variances were lower compared with the model without the S×H interaction effect (Berweger Baschnagel et al., 1999; Dodenhoff et al., 1999). Specifically, Dodenhoff et al. (1999) used data from the American Angus Association (AAA; St. Joseph, MO) and recommended the inclusion of the S×H interaction effect in routine genetic evaluations to avoid biased estimates.
VC have mostly been estimated using the pedigree relationship matrix (A). If the population is undergoing selection based on pedigree and phenotypes with a proper model, the VC based on those two sources of information would be unbiased (Kennedy et al., 1988). However, genomic information is now available and used for selection, so adding this source of information to VC estimation models makes sense. In livestock populations, it is common that only a fraction of animals are genotyped, so using a genomic relationship matrix (G) instead of A could result in biased VC because the information on non-genotyped animals would not be used; therefore, the population would not be well represented (Cesarani et al., 2019). Veerkamp et al. (2011) and Cesarani et al. (2019) recommended using the single-step methodology (Aguilar et al., 2010; Christensen and Lund, 2010) to estimate VC when genotyped and non-genotyped animals coexist in the pedigree. In single-step, G and A are combined into a realized relationship matrix (H), so the information on genotyped and non-genotyped animals can be used.
Because the estimated VC could differ with the choice of the covariance structure among animals and the presence of the S×H interaction, the EBV can also change, causing animals to change rank. Changes in EBV and the ranking of animals are problematic in the commercial marketplace. However, if those changes are moving EBV in the appropriate direction, the modifications should be acceptable. Because most of the routine genetic evaluations ignore the negative covariance between additive direct and maternal effects, room for improvement could be explored if the S×H interaction is deemed important. Therefore, the first objective of this study was to investigate the impact of a random S×H interaction effect on the VC of WW in American Angus cattle in the presence or absence of genomic information. The second objective was to evaluate the prediction models in terms of accuracy, bias, and dispersion using the LR method (Legarra and Reverter, 2018). The last objective was to investigate the changes in EPD and the ranking of artificial insemination (AI) sires among different models.
Materials and Methods
Animal Care and Use Committee approval was not needed because the information was obtained from pre-existing databases.
Data
All datasets were provided by AAA. Over 9.4 million WW phenotypes collected from 1955 to 2020 were available for almost 9.9 million animals. All WW were pre-adjusted for the age of dam and age of calf using the adjustment factors from the standard AAA national cattle evaluation. Data filtering for the VC estimation was performed to remove the following: (1) animals without WW and herd information; (2) contemporary groups (CG) with less than 50 animals; (3) animals with registration ID other than AAA and beef improvement records. After all filtering processes, 2,474,202 animals remained. Five random samples of ~100k animals with WW records were taken for the analysis, which mimics the current procedures and data structures for VC estimation by AAA. Each sample contained all animals in the selected herd over time. Table 1 depicts summary statistics for WW along with the number of animals, herds, sires, and S×H interactions in each replicate.
Table 1.
General statistics for all the replicates
| | Replicate 1 | Replicate 2 | Replicate 3 | Replicate 4 | Replicate 5 |
|---|---|---|---|---|---|
| No. of animals | 112,677 | 105,909 | 102,433 | 109,260 | 102,183 |
| No. of herds | 88 | 93 | 84 | 90 | 97 |
| No. of sires | 3,970 | 4,553 | 4,262 | 4,379 | 4,157 |
| No. of S×H | 5,723 | 6,128 | 5,668 | 6,286 | 5,808 |
| WW min., lbs | 211 | 193 | 262 | 246 | 196 |
| WW mean, lbs | 602.6 | 607.2 | 602.4 | 600 | 604 |
| WW max., lbs | 1,044 | 1,113 | 1,032 | 1,044 | 1,014 |
| WW SD, lbs | 95.43 | 103.83 | 96.20 | 90.20 | 90.93 |
Among those animals, 180,733 were genotyped. Because of the computing limitation of single-step genomic restricted maximum likelihood (ssGREML), a subset of 15,637 animals born from 1972 to 2017 was selected among the 180,733 genotyped animals; the selected animals had phenotypes for WW and at least one progeny as sire or dam. The animals were genotyped for 54,609 single-nucleotide polymorphisms (SNP) originally present in the BovineSNP50k v2 BeadChip (Illumina Inc., San Diego, CA). Quality control of genomic data removed SNP with call rate < 0.9, minor allele frequency < 0.05, and those located on the sex chromosomes. After the quality control, 39,733 SNP were available for animals born from 1972 to 2017. For the estimation of breeding values, a larger dataset was used, which included phenotypes for 2,474,202 animals, 180,733 genotyped animals, and a 4-generation pedigree including 869,583 animals in total. Because of the large number of genotyped animals, the algorithm for proven and young was used to obtain $\mathbf{G}^{-1}$ without the direct inversion of G, as proposed by Misztal et al. (2014a). The number of core animals was set to 19,019, which has been used for routine genomic evaluations by the AAA. Among all those animals, 1,977 were AI sires under investigation for ranking and EPD changes under different models. AI sires in this data are a combination of old sires with many progeny and young sires with no progeny in production yet. These AI sires had direct progeny ranging from 0 to 6,053 with a mean of 117.02, and the number of progeny raised by daughters ranged from 0 to 19 with a mean of 0.23.
Models and analysis
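The following four linear mixed models were used for the VC estimation:

M1 and M3:

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}_1\mathbf{a} + \mathbf{Z}_2\mathbf{m} + \mathbf{Z}_3\mathbf{mpe} + \mathbf{e}$$

M2 and M4:

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}_1\mathbf{a} + \mathbf{Z}_2\mathbf{m} + \mathbf{Z}_3\mathbf{mpe} + \mathbf{Z}_4\mathbf{sh} + \mathbf{e}$$

where $\mathbf{y}$ is a vector of WW records; $\mathbf{b}$ is a vector of the fixed effects of CG, where CG was composed to represent animals of the same sex, born and weaned in the same herd, in the same year, and part of the same management group within that herd; $\mathbf{a}$, $\mathbf{m}$, and $\mathbf{mpe}$ are random vectors of additive direct genetic effect, additive maternal genetic effect, and maternal permanent environmental effect, respectively; $\mathbf{sh}$ is a random vector of S×H interaction effect as an additional uncorrelated random effect; $\mathbf{X}$, $\mathbf{Z}_1$, $\mathbf{Z}_2$, $\mathbf{Z}_3$, and $\mathbf{Z}_4$ are the incidence matrices for the effects in $\mathbf{b}$, $\mathbf{a}$, $\mathbf{m}$, $\mathbf{mpe}$, and $\mathbf{sh}$, respectively; $\mathbf{e}$ is the vector of random residuals. Hence, variances for the random effects in models M1 and M3 were:

$$\mathrm{Var}\begin{bmatrix}\mathbf{a}\\ \mathbf{m}\\ \mathbf{mpe}\\ \mathbf{e}\end{bmatrix} = \begin{bmatrix}\mathbf{A}\sigma^2_a & \mathbf{A}\sigma_{am} & \mathbf{0} & \mathbf{0}\\ \mathbf{A}\sigma_{am} & \mathbf{A}\sigma^2_m & \mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{I}\sigma^2_{mpe} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{I}\sigma^2_e\end{bmatrix}$$

where $\mathbf{A}$ and $\mathbf{I}$ denote pedigree relationship and identity matrices; under single-step (i.e., single-step genomic best linear unbiased prediction (ssGBLUP) and ssGREML), the realized relationship matrix ($\mathbf{H}$) was used instead of $\mathbf{A}$. Models M1 and M2 considered covariance between direct and maternal effects, whereas M3 and M4 forced this covariance to zero.

Models M2 and M4 had a random S×H interaction effect, so the variance structure for the random effects was:

$$\mathrm{Var}\begin{bmatrix}\mathbf{a}\\ \mathbf{m}\\ \mathbf{mpe}\\ \mathbf{sh}\\ \mathbf{e}\end{bmatrix} = \begin{bmatrix}\mathbf{A}\sigma^2_a & \mathbf{A}\sigma_{am} & \mathbf{0} & \mathbf{0} & \mathbf{0}\\ \mathbf{A}\sigma_{am} & \mathbf{A}\sigma^2_m & \mathbf{0} & \mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{I}\sigma^2_{mpe} & \mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{I}\sigma^2_{sh} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{I}\sigma^2_e\end{bmatrix}$$

Phenotypic variance ($\sigma^2_p$) was computed based on all the variances in each model. For example, in M2 and M4:

$$\sigma^2_p = \sigma^2_a + \sigma^2_m + \sigma_{am} + \sigma^2_{mpe} + \sigma^2_{sh} + \sigma^2_e$$

Therefore, the direct ($h^2_d$) and maternal ($h^2_m$) heritabilities were estimated as

$$h^2_d = \frac{\sigma^2_a}{\sigma^2_p}, \qquad h^2_m = \frac{\sigma^2_m}{\sigma^2_p}$$

where $\sigma^2_a$, $\sigma^2_m$, $\sigma_{am}$, $\sigma^2_{mpe}$, $\sigma^2_{sh}$, and $\sigma^2_e$ are additive genetic direct variance, maternal genetic variance, the covariance between direct and maternal genetic effects, maternal permanent environment variance, S×H variance, and residual variance, respectively. The formulas for heritability had no $\sigma^2_{sh}$ for M1 and M3, and $\sigma_{am}$ was zero for M3 and M4.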
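As a worked illustration of these definitions, the REML estimates for M2 reported in Table 2 can be substituted into the phenotypic variance and heritability formulas. The symbols $\sigma^2_p$, $h^2_d$, and $h^2_m$ are notation assumed here for the phenotypic variance and the direct and maternal heritabilities; the numbers are the replicate averages from Table 2, so the result is only an approximate check:

```latex
\begin{align*}
\sigma^2_p &= \sigma^2_a + \sigma^2_m + \sigma_{am} + \sigma^2_{mpe} + \sigma^2_{sh} + \sigma^2_e \\
           &= 467.63 + 266.11 + 42.34 + 366.95 + 150.90 + 1{,}889.96 = 3{,}183.89 \\
h^2_d &= \frac{467.63}{3{,}183.89} \approx 0.15, \qquad
h^2_m = \frac{266.11}{3{,}183.89} \approx 0.08, \qquad
\frac{\sigma^2_{sh}}{\sigma^2_p} = \frac{150.90}{3{,}183.89} \approx 0.05
\end{align*}
```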
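Two methods were used to estimate VC: restricted maximum likelihood (REML) and ssGREML. In REML, the assumption was $\mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma^2_a)$, where $\mathbf{A}$ is the pedigree relationship matrix. Conversely, the assumption under ssGREML was $\mathbf{a} \sim N(\mathbf{0}, \mathbf{H}\sigma^2_a)$, where $\mathbf{H}$ is the realized relationship matrix combining $\mathbf{A}$ with the genomic relationship matrix ($\mathbf{G}$). In the ssGREML algorithm, the inverse of $\mathbf{H}$ is required (Aguilar et al., 2010):

$$\mathbf{H}^{-1} = \mathbf{A}^{-1} + \begin{bmatrix}\mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}\end{bmatrix}$$

where $\mathbf{A}_{22}$ is the pedigree relationship matrix among genotyped animals.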
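The construction of the inverse of H can be sketched on a toy example. The snippet below is a minimal illustration of the Aguilar et al. (2010) formula with NumPy, not the implementation used in this study: the three-animal pedigree, the values in `G`, and the function name `h_inverse` are all assumptions made for illustration, and the blending or scaling of G that evaluation software may apply is omitted.

```python
import numpy as np

def h_inverse(A, G, genotyped):
    """Return H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]].

    A: pedigree relationship matrix for all animals.
    G: genomic relationship matrix for the genotyped animals.
    genotyped: positions of the genotyped animals in A (same order as G).
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    block = np.ix_(genotyped, genotyped)
    A22 = A[block]                      # pedigree relationships among genotyped animals
    H_inv = np.linalg.inv(A)
    H_inv[block] += np.linalg.inv(G) - np.linalg.inv(A22)
    return H_inv

# Hypothetical pedigree: animals 0 and 1 are unrelated parents of animal 2.
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
genotyped = [1, 2]                      # assumed: only animals 1 and 2 are genotyped
G = np.array([[1.02, 0.55],             # hypothetical genomic relationships
              [0.55, 0.98]])

H_inv = h_inverse(A, G, genotyped)
H = np.linalg.inv(H_inv)                # realized relationship matrix
```

In this toy case the genotyped block of `H` reproduces `G`, and replacing `G` with the pedigree block `A22` returns the ordinary inverse of A, which is a convenient sanity check for any implementation.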
VC were estimated using the AIREML algorithm as implemented in AIREMLF90 (Misztal et al., 2014b), which has been modified to incorporate the YAMS package (Masuda et al., 2015) for optimized sparse matrix computations. Genomic EBV (GEBV) were estimated for all four models using ssGBLUP. One additional model was used as a benchmark, mimicking the current procedure in the AAA evaluations. This model was labeled model 5 (M5) and was similar to M1, except that the covariance between direct and maternal effects was zeroed after the VC estimation. As our objective herein was to compare genomic predictions between the models, not between methods, only ssGBLUP evaluations were carried out. The Akaike information criterion (AIC) was used to compare models. In most cases, VC from non-genomic models are used to obtain genomic predictions; however, in this study, GEBV were also computed using VC from genomic models. The VC used were averaged across five replicates. Changes in ranking and predictions for AI bulls were presented on the EPD scale, which was computed as one-half of the EBV. Ranking changes were calculated by comparing the ranking of animals in M1 to M4 against M5; the same was done for investigating EPD changes.
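The comparison against the benchmark model can be sketched as follows. This is a minimal illustration assuming EBV are held in plain dictionaries keyed by animal; the sire names, EBV values, and helper names (`epd_from_ebv`, `rank_animals`, `compare_to_benchmark`) are hypothetical and are not taken from the AAA evaluation.

```python
def epd_from_ebv(ebv):
    """EPD is one-half of the EBV."""
    return {animal: 0.5 * value for animal, value in ebv.items()}

def rank_animals(values):
    """Rank animals from highest (rank 1) to lowest value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {animal: position for position, animal in enumerate(ordered, start=1)}

def compare_to_benchmark(ebv_model, ebv_benchmark):
    """EPD and rank change of each animal in a model relative to the benchmark."""
    epd_model = epd_from_ebv(ebv_model)
    epd_bench = epd_from_ebv(ebv_benchmark)
    rank_model = rank_animals(epd_model)
    rank_bench = rank_animals(epd_bench)
    return {
        animal: {
            "epd_change": epd_model[animal] - epd_bench[animal],
            "rank_change": rank_model[animal] - rank_bench[animal],
        }
        for animal in ebv_benchmark
    }

# Hypothetical EBV for three sires under the benchmark (M5) and a model with S×H (M2).
ebv_m5 = {"sire_a": 60.0, "sire_b": 52.0, "sire_c": 40.0}
ebv_m2 = {"sire_a": 44.0, "sire_b": 48.0, "sire_c": 30.0}

changes = compare_to_benchmark(ebv_m2, ebv_m5)
```

A positive `rank_change` means the animal ranks lower under the tested model than under the benchmark.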
Validation
The LR validation method (Legarra and Reverter, 2018) was used to evaluate model performance. A total of 23,021 young genotyped animals born in 2019 were selected as validation animals and had their phenotypes removed from the evaluation, along with phenotypes for their contemporaries. The total number of records in this dataset was 2,451,181. This will be referred to as the partial data and will be represented by the subscript p. On the other hand, the entire data will be represented by the subscript w and had no phenotype truncation. Under the LR method, the accuracy of GEBV was calculated as $acc = \sqrt{\dfrac{cov(\hat{\mathbf{a}}_w, \hat{\mathbf{a}}_p)}{(1 - \bar{F})\sigma^2_a}}$, where $\hat{\mathbf{a}}$ is the vector of GEBV and $\bar{F}$ is the average inbreeding coefficient for validation animals; $\sigma^2_a$ was model-specific under REML or ssGREML. Bias was calculated as the difference between the mean of partial and whole GEBV, which is $\bar{\hat{a}}_p - \bar{\hat{a}}_w$, with an expected value of 0 if unbiased. Dispersion of GEBV was assessed as the deviation of the regression coefficient ($b_1$) from 1, where $b_1$ was obtained from the regression of $\hat{\mathbf{a}}_w$ on $\hat{\mathbf{a}}_p$: $\hat{\mathbf{a}}_w = b_0 + b_1\hat{\mathbf{a}}_p + \boldsymbol{\varepsilon}$. With neither over- nor under-dispersion, the expectation of this estimator is 1.
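The three LR statistics described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the validation code used in the study: the function name `lr_statistics`, the toy GEBV vectors, the inbreeding value, and the genetic variance are all hypothetical.

```python
import numpy as np

def lr_statistics(gebv_partial, gebv_whole, mean_inbreeding, genetic_variance):
    """Accuracy, bias, and dispersion (b1) for a set of validation animals.

    gebv_partial: GEBV of validation animals from the partial data (p).
    gebv_whole:   GEBV of the same animals from the whole data (w).
    """
    p = np.asarray(gebv_partial, dtype=float)
    w = np.asarray(gebv_whole, dtype=float)
    cov_wp = np.cov(w, p)[0, 1]                       # sample covariance (ddof = 1)
    accuracy = np.sqrt(cov_wp / ((1.0 - mean_inbreeding) * genetic_variance))
    bias = p.mean() - w.mean()                        # expected to be 0 if unbiased
    b1 = cov_wp / np.var(p, ddof=1)                   # slope of the regression of w on p
    return accuracy, bias, b1

# Hypothetical GEBV: whole-data predictions are shifted upward by 0.5.
gebv_p = [1.0, 2.0, 3.0, 4.0, 5.0]
gebv_w = [1.5, 2.5, 3.5, 4.5, 5.5]

accuracy, bias, b1 = lr_statistics(gebv_p, gebv_w, mean_inbreeding=0.0, genetic_variance=10.0)
```

With these toy inputs the slope is 1 (no over- or under-dispersion) while the bias is negative, which shows that the two statistics capture different aspects of the predictions.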
Results and Discussion
Genetic parameter estimation
VC can be estimated considering the covariance structure among animals is given by the pedigree relationship matrix, the genomic relationship matrix, or by the realized relationship matrix. In this study, the first and third assumptions were used to examine the differences in VC when the pedigree information is combined with genomic information or not (Table 2). Under REML, M1 resulted in larger direct and maternal genetic variances compared with the other 3 models. In addition, M1 had greater negative covariance between direct and maternal genetic effects compared to M2. When S×H interaction was fit into the model (M2), both direct and maternal genetic variances were reduced by a ratio of almost 2.3 and 1.6, respectively. However, the residual variance was 16% greater in M2 compared to M1. Remarkably, the negative covariance between direct and maternal effects became positive when moving from M1 to M2. Therefore, adding the S×H interaction effect could mitigate the issue with negative covariance between direct and maternal effects.
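The ratios quoted in this paragraph can be recomputed from the REML estimates in Table 2. The snippet is only an arithmetic check on the reported replicate averages; the dictionary layout and variable names are assumptions made for illustration.

```python
# REML estimates for M1 and M2 taken from Table 2 (averages over five replicates).
reml = {
    "M1": {"direct": 1069.60, "maternal": 415.66, "residual": 1623.96},
    "M2": {"direct": 467.63, "maternal": 266.11, "residual": 1889.96},
}

# How many times smaller the genetic variances are once S×H is fitted.
direct_ratio = reml["M1"]["direct"] / reml["M2"]["direct"]
maternal_ratio = reml["M1"]["maternal"] / reml["M2"]["maternal"]

# Relative increase in the residual variance from M1 to M2, in percent.
residual_increase_pct = 100.0 * (reml["M2"]["residual"] / reml["M1"]["residual"] - 1.0)

print(f"direct ratio:      {direct_ratio:.2f}")
print(f"maternal ratio:    {maternal_ratio:.2f}")
print(f"residual increase: {residual_increase_pct:.1f}%")
```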
Table 2.
Estimated variance components for the four investigated models using the REML and ssGREML methods*
| Method | Model | $\sigma^2_a$ | $\sigma^2_m$ | $\sigma^2_{mpe}$ | $\sigma^2_{sh}$ | $\sigma^2_e$ | $\sigma_{am}$ | Cor (a,m) | $\sigma^2_p$ | AIC |
|---|---|---|---|---|---|---|---|---|---|---|
| REML | M1 | 1,069.60 (47.12) | 415.66 (43.31) | 372.08 (27.13) | 0 | 1,623.96 (63.26) | −251.76 (35.35) | −0.38 (0.06) | 3,229.54 (94.64) | 802,049 (37,686.16) |
| REML | M2 | 467.63 (40.63) | 266.11 (45.67) | 366.95 (27.00) | 150.90 (21.16) | 1,889.96 (79.97) | 42.34 (26.95) | 0.12 (0.08) | 3,183.89 (94.35) | 801,606 (37,619.05) |
| REML | M3 | 858.01 (16.89) | 275.57 (54.57) | 359.78 (27.69) | 0 | 1,730.24 (61.11) | 0 | 0 | 3,223.60 (98.88) | 802,150 (37,669.09) |
| REML | M4 | 517.79 (20.99) | 290.99 (55.71) | 368.25 (27.13) | 143.77 (17.75) | 1,865.80 (68.78) | 0 | 0 | 3,186.59 (95.41) | 801,607 (37,620.00) |
| ssGREML | M1 | 1,185.56 (35.95) | 371.13 (30.45) | 341.52 (25.60) | 0 | 1,517.72 (52.46) | −263.24 (38.19) | −0.40 (0.06) | 3,152.70 (85.16) | 829,542 (37,475.95) |
| ssGREML | M2 | 803.34 (47.82) | 255.05 (26.92) | 335.62 (25.75) | 100.11 (19.86) | 1,672.94 (72.20) | −55.09 (38.65) | −0.12 (0.08) | 3,111.96 (81.46) | 829,174 (37,416.09) |
| ssGREML | M3 | 928.88 (19.57) | 236.99 (39.32) | 326.15 (26.40) | 0 | 1,643.42 (55.07) | 0 | 0 | 3,135.44 (89.08) | 829,663 (37,452.34) |
| ssGREML | M4 | 736.42 (42.19) | 226.54 (38.84) | 333.03 (26.12) | 106.86 (17.44) | 1,708.12 (61.31) | 0 | 0 | 3,110.97 (89.67) | 829,178 (37,413.24) |

*Standard deviation based on five replicates is in parentheses.
REML: restricted maximum likelihood method using only pedigree and phenotype.
ssGREML: single-step genomic restricted maximum likelihood method using pedigree, phenotype, and genotype.
Models: M1, without S×H interaction effect and with covariance between direct and maternal effect $\sigma_{am} \neq 0$; M2, with S×H interaction effect and $\sigma_{am} \neq 0$; M3, without S×H interaction effect and with $\sigma_{am} = 0$; M4, with S×H interaction effect and $\sigma_{am} = 0$.
Berweger Baschnagel et al. (1999) and Dodenhoff et al. (1999) also reported larger estimates of direct and maternal genetic variance and negative covariance between those effects when the S×H interaction effect was not fit in the models. Meyer (1992) outlined that a negative estimate of covariance between direct and maternal effects increased both direct and maternal genetic variances in crosses between Hereford and Zebu cattle, but the same was not true in Angus because the covariance was positive. In the current study, M1 showed negative covariance between direct and maternal effects as well as larger estimates of direct and maternal genetic variance among all the models. Nonetheless, these estimates decreased when the S×H interaction effect was considered, and a positive covariance between direct and maternal effects was observed. Several studies with simulated data also reported biased VC without S×H interaction effect in the model (Robinson, 1996; Lee and Pollak, 1997), supporting the hypothesis of overestimated genetic variances in the models without the S×H interaction effect.
In our study, larger additive direct genetic variances were observed in ssGREML compared with REML. In contrast, smaller estimates of maternal genetic variance, S×H variance, and residual variances were observed in ssGREML; all with smaller standard errors. The large negative covariance between direct and maternal effects was reduced when S×H was added to the model (M1 vs M2) but was still negative.
When the covariance between direct and maternal effects was ignored in M3 and M4, most of the variances decreased, whereas the residual increased for both REML and ssGREML. One opposite pattern was observed in the comparison of M2 vs. M4 under REML, which showed increased estimates of direct and maternal variances, with a decrease in residual variance. This current study’s results agree with Meyer (1992) that the overestimation of both direct and maternal genetic variances is due to a negative covariance between these effects. Based on the current observations, biased direct and maternal genetic variances could be caused by ignoring the additional S×H interaction effect and allowing the negative estimation of a covariance component between direct and maternal effects. Therefore, if a negative covariance is mitigated by adding the S×H interaction effect (M1 to M2), including the covariance may give less overestimated genetic variances. AIC values were calculated for all models to determine the best model fitting the data (Table 2). As the amount of data was different for REML and ssGREML, AIC was not used for comparisons across the methods but only for the comparison of models within each method. In the results of both REML and ssGREML, M2 and M4 showed lower AIC values than models without S×H interaction effect (M1 and M3) although the differences were not very large.
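The within-method comparison of AIC can be made explicit by expressing each model relative to the best-fitting model of the same method. The values below are the replicate averages reported in Table 2; the helper name `delta_aic` is an assumption, and differences of averaged AIC are only a rough summary given the large variation among replicates.

```python
# Average AIC per model and method, as reported in Table 2.
aic = {
    "REML": {"M1": 802_049, "M2": 801_606, "M3": 802_150, "M4": 801_607},
    "ssGREML": {"M1": 829_542, "M2": 829_174, "M3": 829_663, "M4": 829_178},
}

def delta_aic(values):
    """Difference between each model's AIC and the lowest AIC (lower is better)."""
    best = min(values.values())
    return {model: value - best for model, value in values.items()}

# Compare models only within a method, because the data differ between methods.
deltas = {method: delta_aic(values) for method, values in aic.items()}

for method, result in deltas.items():
    print(method, result)
```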
Direct and maternal heritabilities, together with the proportion of the phenotypic variance explained by the S×H interaction effect, are shown in Fig. 1 for REML (a) and ssGREML (b). Overall estimates of direct heritability from ssGREML across all models were larger than the ones from REML. When the S×H interaction effect was considered under REML, direct heritabilities were reduced by a factor of 2.2 from M1 to M2, and by 1.7 from M3 to M4. The reduction was also observed under ssGREML but to a lesser extent (i.e., a factor of 1.5 and 1.25, respectively). The rationale for a larger reduction in the direct heritability when S×H interaction was added under REML is the decrease in direct variance combined with larger S×H interaction and residual variances and a larger phenotypic variance compared with ssGREML.
Figure 1.
Proportion of variance explained by additive direct, maternal, and sire × herd interaction effect using REML and ssGREML.
Overall, the estimation of VC with genomic information is affected by several factors: (1) genotyping strategy, (2) the presence of selection, (3) parameters for the construction of G, and (4) proportion of genotyped animals (Jensen, 2016; Cesarani et al., 2019; Wang et al., 2020). Because genomic selection has been applied to many livestock species, estimating unbiased VC using A becomes more challenging as it does not account for the impact of genomic selection (Jensen, 2016). In the AAA, the initial genotyping strategy included donor dams and proven sires because of the high costs; more recently, about half of the newly registered animals are genotyped each year, so the process is less selective. Wang et al. (2020) reported that VC estimated using H as the covariance structure among animals are sensitive to the genotyping strategy and proportion of genotyping. They emphasized that the strong selective genotyping and the high proportion of genotyped animals could produce overestimated variances; however, the level of overestimation observed in their study has not been confirmed.
In this study, genotyped animals were sampled that had phenotypes and at least one progeny either as a sire or dam, but animals were not filtered based on their phenotypic values. This sampling strategy was expected to reduce the selective genotyping effect while meeting the computing limitation of ssGREML. However, because AAA breeders practiced selective genotyping at the very early stages of genomic selection, and a less intense form of selective genotyping still exists, the genotyped animals generally showed heavier adjusted WW than the non-genotyped animals (Fig. 2, t-value = 67.445 with P-value < $2.2 \times 10^{-16}$). This could be one possible reason why the direct heritability by ssGREML was larger than the estimation by REML among all the models (Fig. 1). Another possible reason could be the small proportion of genotyped animals. In the current study, the proportion of genotyped animals in each replicate was ~8%, which could produce a similar estimate or a modest overestimation in the ssGREML results (Wang et al., 2020).
Figure 2.
Distribution of adjusted WW for genotyped and non-genotyped animals used for ssGREML. Vertical lines indicate the average adjusted weaning weight for genotyped (geno; $\bar{x}$ = 653.30) and non-genotyped (non_geno; $\bar{x}$ = 601.43) animals.
Forni et al. (2011) reported similar variance component (VC) estimates between REML and ssGREML, but smaller standard errors in ssGREML as it could use more data than REML. Moreover, adding genomic information could help to solve possible issues caused by missing or incorrect pedigree information, frequent in many animal species (Banos et al., 2001). Using both genotypes and pedigree for estimating VC might be useful for populations with a high error rate in the pedigree. Cesarani et al. (2019) carried out a simulation study to compare VC using REML, GREML, and ssGREML under different genotyping strategies. Those authors reported biased VC under REML with a small dataset, but no bias under REML and ssGREML with larger datasets. The dataset used in our study was large enough to estimate VC (Table 1), so the different estimates for the direct variance under REML and ssGREML may not be due to the data size.
This is the first study that has estimated VC for WW in the presence of S×H interaction using ssGREML. Therefore, the basis for the differences between estimates under REML and ssGREML is not completely clear. Aldridge et al. (2020) claimed H could better separate the additive direct and permanent environmental effects. If the same theory can be applied to the additive genetic effect and the additional random S×H interaction effect, it could be hypothesized that the additive direct VC estimated using H is more accurate than that estimated using A because H reflects the realized relationships among animals rather than the expected ones (Legarra, 2016).
In US dairy cattle evaluations, multiple daughters of a given bull in the same herd have received reduced weight through an adjustment for S×H interaction since 1967. As the S×H variance decreased from 14% (1967) to 10% (1997), the direct heritability increased from 25% to 30% in the same period (Van Tassell et al., 1997). Additionally, Wiggans et al. (2000) reported that S×H variance in Jersey and Brown Swiss was reduced to 8% when heritability increased from 30% to 35% in November of 2000. Those findings are supported by the current results. When S×H variance was 5% of the phenotypic variance in REML for both M2 and M4, direct heritability was 0.15 and 0.16, respectively. On the other hand, when S×H variance decreased to 3% for both M2 and M4 under ssGREML, direct heritability increased to 0.26 and 0.24, respectively (Fig. 1). Lee and Pollak (1997) scrutinized the sire × year interaction effect and conjectured that the effect might be a true effect due to the different environmental factors associated with different years. Based on their speculation, S×H interaction might also be a true effect due to the different environmental factors related to different herds. Therefore, improving the environment in specific herds could introduce heterogeneous variance among herds, which is a possible factor generating S×H variance.
Genetic trends and genomic prediction
Genetic trends from 1972 to 2019 for all 5 models are shown in Fig. 3. The genetic trends were measured as the average EPD by year of birth. Overall, results indicate direct genetic trends have been increasing over time. The result for the direct effect (Fig. 3a) shows M2 and M4 have lower genetic trends than M1, M3, and M5. Furthermore, M1 and M5 showed almost equivalent genetic trends and were slightly greater than M3. In Fig. 3b, opposite patterns were observed for maternal effects, in which M2 and M4 have greater genetic trends than M1, M3, and M5. In particular, M1 showed the lowest maternal genetic trend among all the models. As with the direct genetic trend, consistent increases were observed since the 1980s; however, the slopes were not very steep after the 2010s, especially for M1. These results suggest adding S×H interaction in the evaluation model increases maternal genetic trends and reduces the direct genetic trends, which could be overestimated without S×H. Legarra and Reverter (2017) outlined that bias was expected to increase with greater genetic gains. Genetic gain is defined as the change in the average breeding value of a population over a period, and the rate of genetic gain per year could be expressed as a genetic trend. The current results show that the models with the greatest bias for the direct effect (Table 3) have larger trends. In Fig. 3, direct genetic trends of M1, M3, and M5 are larger than those of M2 and M4. Also, greater bias is observed (Table 3) for M1, M3, and M5 than for M2 and M4 when VC from either REML or ssGREML were used.
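Computing a genetic trend as the average EPD by year of birth can be sketched in a few lines. The records below are hypothetical (year of birth, EPD) pairs and the function name `genetic_trend` is an assumption; the sketch only illustrates the averaging step, not the evaluation that produced the EPD.

```python
from collections import defaultdict

def genetic_trend(records):
    """Average EPD by year of birth.

    records: iterable of (birth_year, epd) pairs, one per animal.
    Returns a dictionary {birth_year: mean EPD}, ordered by year.
    """
    totals = defaultdict(lambda: [0.0, 0])      # year -> [sum of EPD, number of animals]
    for year, epd in records:
        totals[year][0] += epd
        totals[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}

# Hypothetical direct EPD (lb) for five animals born in three different years.
records = [(2017, 50.0), (2017, 54.0), (2018, 55.0), (2018, 57.0), (2019, 60.0)]

trend = genetic_trend(records)
```

Running the same function on the EPD from each model would give one trend line per model, which is the kind of comparison summarized in Fig. 3.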
Figure 3.
Genetic trends for additive direct (a) and maternal (b) effects.
Table 3.
Accuracy, bias, and dispersion using the LR method (ssGBLUP)
| Effect | Model | Accuracy, REML VC | Accuracy, ssGREML VC | Bias, REML VC | Bias, ssGREML VC | Dispersion ($b_1$), REML VC | Dispersion ($b_1$), ssGREML VC |
|---|---|---|---|---|---|---|---|
| Direct | M1 | 0.72 | 0.69 | −3.60 | −3.80 | 1.00 | 0.99 |
| Direct | M2 | 0.95 | 0.79 | −2.53 | −3.22 | 1.01 | 1.00 |
| Direct | M3 | 0.76 | 0.75 | −3.26 | −3.41 | 1.00 | 1.00 |
| Direct | M4 | 0.92 | 0.81 | −2.65 | −3.09 | 1.01 | 1.00 |
| Direct | M5 | 0.71 | 0.68 | −3.53 | −3.71 | 1.00 | 1.00 |
| Maternal | M1 | 0.59 | 0.62 | 0.55 | 0.58 | 0.97 | 0.97 |
| Maternal | M2 | 0.65 | 0.67 | −0.06 | 0.24 | 0.98 | 0.98 |
| Maternal | M3 | 0.66 | 0.70 | 0.06 | 0.08 | 0.98 | 0.98 |
| Maternal | M4 | 0.63 | 0.69 | 0.07 | 0.11 | 0.98 | 0.98 |
| Maternal | M5 | 0.59 | 0.61 | 0.04 | 0.06 | 0.97 | 0.97 |

REML VC: ssGBLUP using the variance components estimated from REML.
ssGREML VC: ssGBLUP using the variance components estimated from ssGREML.
Models: M1, without S×H interaction effect and with covariance between direct and maternal effect $\sigma_{am} \neq 0$; M2, with S×H interaction effect and $\sigma_{am} \neq 0$; M3, without S×H interaction effect and with $\sigma_{am} = 0$; M4, with S×H interaction effect and $\sigma_{am} = 0$; M5, equivalent to M1, except that $\sigma_{am}$ was set to 0 after variance component estimation.
In beef cattle and many other species, the predictive ability has been used as a tool for predicting future phenotypes (progeny performance), which is calculated as the correlation between (G)EBV and phenotypes adjusted for fixed effects (Legarra et al., 2008; Lourenco et al., 2015). However, this method is difficult to apply to complex cases such as binary traits, maternal effects, and models with multiple random effects. Therefore, in the current study, the LR method was used to calculate both direct and maternal prediction estimators. As the LR method was recently developed, no studies have reported its performance on models with a maternal effect, although some studies validated this method with several simulations and real datasets (Silva et al., 2019; Macedo et al., 2020; Bermann et al., 2021). The estimators of the LR method are shown in Table 3. When VC from REML were used, M2 and M4 showed greater accuracy for the direct effect than the other models, as well as relatively less bias. Dispersion was almost equivalent for all the models. Similar behavior was observed when using VC from ssGREML. The increase in accuracy for the direct effect when adding the S×H interaction to the model (M1 vs M2) was around 24% with VC from REML and 12% with VC from ssGREML. Additionally, bias decreased by approximately 30% and 15% with VC from REML and ssGREML, respectively.
For the maternal effect, the accuracy of M2 and M4 was also greater than that of M1 and M5 in both VC scenarios, whereas M3 showed the greatest accuracy among all the models, although the differences compared with M2 and M4 were small. The largest bias was observed in M1 for both VC scenarios. The other models showed very similar biases when the REML-based VC were used, but those biases increased when the ssGREML-based VC were used, especially in M2 and M4. No large differences in dispersion were seen between the models and VC methods.
In general, lower accuracies and greater biases were observed when the ssGREML-based VC were used. In the LR method, the dispersion estimator may indicate overdispersion of GEBV (if it is below 1) or under-dispersion of GEBV (if it is above 1). The dispersion across the 5 models did not differ with either the REML-based or the ssGREML-based VC. Remarkably, M2 and M4 had the greatest accuracy under the REML-based VC scenario; however, those accuracies dropped by about 16.8% and 12%, respectively, when the ssGREML-based VC were used. Such a large reduction was not observed in the other models. This pattern was also observed for the bias. When the ssGREML-based VC were used for M2 and M4, the bias increased by up to 21.4% and 17.2%, respectively. However, these increases were very small for M1, M3, and M5 (4.3%–5.3%). Based on our findings, fitting the S×H interaction in the model (M2 and M4) resulted in more accurate and less biased breeding values for the validation group, regardless of the choice of the covariance structure among the animals (A vs H) for estimating VC. However, it could also be speculated that using the ssGREML-based VC for genomic prediction, especially with the S×H interaction effect, decreased the accuracy and increased the bias compared with the REML-based VC because the additive genetic variance, which is part of the denominator of the accuracy formula, was larger when genomic information was used, thereby reducing the accuracy.
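The role of the denominator can be made explicit with a short derivation. It assumes the LR accuracy estimator of Legarra and Reverter (2018); the symbols $\hat{u}_w$ and $\hat{u}_p$ (whole and partial (G)EBV), $\bar{F}$ (average inbreeding), and $\sigma^2_u$ (additive genetic variance, with subscripts A and H for the relationship matrix used to estimate it) are notation assumed here for illustration.

```latex
\widehat{acc} = \sqrt{\frac{\mathrm{cov}(\hat{u}_w, \hat{u}_p)}{(1 - \bar{F})\,\sigma^2_u}}
\qquad\Longrightarrow\qquad
\frac{\widehat{acc}_{H}}{\widehat{acc}_{A}}
  = \sqrt{\frac{\sigma^2_{u,A}}{\sigma^2_{u,H}}}
\quad \text{if } \mathrm{cov}(\hat{u}_w, \hat{u}_p) \text{ and } \bar{F} \text{ are held equal.}
```

Under that simplifying condition, $\sigma^2_{u,H} > \sigma^2_{u,A}$ gives a ratio below 1. In practice the covariance term also changes with the VC, so this shows only the direction of the denominator's contribution, not the size of the observed drop.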
Accuracy, bias, and dispersion are the main features used to examine the performance of genomic predictions. These three components can reflect the predictability of response to selection, the correctness of the model, the use of inappropriate VC, and unaccounted effects in the models (Reverter et al., 1994; Legarra and Reverter, 2018; Macedo et al., 2020). Macedo et al. (2020) applied the LR method to examine the possible bias and loss of accuracy caused by using a wrong heritability and by unaccounted environmental effects. They concluded that if an incorrect genetic model was used for genomic evaluations, the LR method could estimate the bias when the model was not severely misspecified. The current results for the models without the S×H interaction effect (M1, M3, and M5) support that finding. These models showed a large bias for direct GEBV and some level of bias for maternal GEBV. Henderson (1975) reported that the use of an incorrect variance and covariance matrix could result in greater prediction error variance (PEV) for the solutions. Schaeffer (1984) extended that theory and concluded that the increase in PEV is directly related to the differences between true and estimated correlations. Therefore, we would argue that M2 and M4 had more appropriate VC because of the S×H interaction effect. However, the large bias still observed in all models may be due to effects that could not be accounted for in the models, affecting the estimation of GEBV. Wang et al. (2020) reported that the inflation of (G)EBV could reflect bias in VC estimation. In the present study, however, the inflation of (G)EBV (i.e., dispersion) was very consistent among models and VC methods. Therefore, based on our results and reports from the literature, we could conjecture that M1, M3, and M5 used inappropriate VC (estimates without the S×H effect) and did not account for the hidden trend in the data (not fitting the S×H effect). Additionally, the use of a negative covariance between direct and maternal effects might result in biased estimates, especially for the maternal GEBV (M1 vs M5).
Wang et al. (2020) tested genomic predictions using VC estimated from A and H for commercial and simulated datasets. Their results agree with those of the current study in that accuracies of GEBV were greater when using VC estimated from A than from H; however, no clear explanation was provided in that study. One possible reason could be selective genotyping. In general, accuracy is the correlation between true breeding value (TBV) and (G)EBV or, in the LR method, a function of the (G)EBV obtained from the partial and whole data. Therefore, greater accuracy reflects a stronger relationship between TBV and (G)EBV, or between the partial and whole (G)EBV. If the VC used for genomic predictions were estimated with only selected genotyped animals, the relationship between true and estimated BV would be weaker than if the true VC were used. In this sense, it could be recommended to use VC from A, especially under a selective genotyping strategy, although more precisely estimated VC are expected from H because it describes the relationship structure among the animals more accurately.
One finding that deserves deeper investigation is the large increase in accuracy and decrease in bias from the ssGREML-based to the REML-based VC when the S×H interaction effect was added (M2 and M4 in Table 3). Further research is needed to understand the changes in predictions and VC when an additional random sire interaction effect is fitted in the model.
Changes in EPD and ranking of AI sires
The changes in the rank of AI sires among the models are illustrated in Fig. 4. The horizontal dotted lines mark rank changes of −100, −50, 0, +50, and +100. G1 to G4 represent the animals with no change (G1), changes within the interval from −50 to +50 (G2), changes from −100 to −50 or from +50 to +100 (G3), and changes of more than 100 (G4). Overall, considerable ranking changes were observed, especially for (b) M2 vs M5 and (d) M4 vs M5 compared with (a) M1 vs M5 and (c) M3 vs M5. Only a few AI sires had the same ranking among comparisons (82, 17, 81, and 16 for (a) to (d), respectively). Because the ranking is an indicator of the genetic merit of the bulls in the population, even small changes could have a large impact on breeding decisions. The changes in direct EPDs of the 1,977 AI sires are shown in Fig. 5. Fig. 5a shows that all the EPDs changed randomly within a very small range (from −2 to 4) regardless of the ranks of the AI sires. In contrast, Fig. 5b–d shows changes that agree with the changes in the rankings of AI sires. Interestingly, the top AI sires had a greater reduction in EPDs, as indicated by the greater negative values on the left-hand side of each plot (Fig. 5b–d). Additionally, a few bottom sires also had greater changes, as observed on the right-hand side of the plots (Fig. 5b–d). Although similar patterns are observed in Fig. 5b–d, the range of EPD changes in Fig. 5c is smaller than that in Fig. 5b and d. These results imply that adding the S×H interaction effect to the evaluation model could generate large changes in the rank and direct EPDs of AI sires, even though it showed unbiased VC estimation along with a better prediction model.
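The G1 to G4 grouping of rank changes can be sketched as below. This is an illustrative example only: the function names and toy EPDs are hypothetical, the sign convention (alternative rank minus reference rank) is an assumption, and the handling of the boundaries (a change of exactly 50 counted in G2, exactly 100 in G3) is also an assumption, since the text does not specify it.

```python
def rank_positions(epd_by_sire):
    """Map each sire to its rank (1 = largest EPD); ties broken by sire id."""
    ordered = sorted(epd_by_sire, key=lambda s: (-epd_by_sire[s], s))
    return {sire: pos for pos, sire in enumerate(ordered, start=1)}


def classify_rank_change(delta):
    """Assign a rank change to G1..G4 using the assumed boundaries."""
    size = abs(delta)
    if size == 0:
        return "G1"   # no change
    if size <= 50:
        return "G2"   # within -50 to +50
    if size <= 100:
        return "G3"   # within 50 to 100 in either direction
    return "G4"       # more than 100


def rank_change_groups(epd_reference, epd_alternative):
    """Count sires per group when moving from a reference to an alternative model."""
    ref = rank_positions(epd_reference)
    alt = rank_positions(epd_alternative)
    counts = {"G1": 0, "G2": 0, "G3": 0, "G4": 0}
    for sire in ref:
        counts[classify_rank_change(alt[sire] - ref[sire])] += 1
    return counts


# Toy data only: three hypothetical sires under two models.
reference = {"sire_a": 3.0, "sire_b": 2.0, "sire_c": 1.0}
alternative = {"sire_a": 1.0, "sire_b": 2.0, "sire_c": 3.0}
print(rank_change_groups(reference, alternative))
```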
Figure 4.
Changes in the ranking of 1,977 AI sires (direct effect).
Figure 5.
Changes of EPDs for 1,977 AI sires (direct effect).
The ranking changes of maternal EPDs for AI sires among the models are shown in Fig. 6. The horizontal dotted lines and G1 to G4 have the same description as in Fig. 4. Similar patterns are detected in Fig. 4b and d and Fig. 6b and d, showing large ranking changes. Unlike Fig. 4a, Fig. 6a also showed very large changes in rankings, implying that the negative covariance between direct and maternal effects may have been the reason for such changes in the maternal effect. Fig. 7 shows changes in maternal EPDs for AI sires. Most of the AI sires had reduced maternal EPDs (Fig. 7a). Many sires had larger maternal EPDs in M2 and M4 than in M5 (Fig. 7b and d, respectively); in addition, the bottom sires had larger maternal EPDs in these models. A similar pattern was observed in Fig. 7c, but many sires had reduced maternal EPDs in M3, with a relatively small magnitude.
Figure 6.
Changes in the ranking of EPDs for 1,977 AI sires (maternal effect).
Figure 7.
Changes of EPDs for 1,977 AI sires (maternal effect).
Conclusions
The inclusion of the S×H interaction effect in the model for WW reduces the direct and maternal genetic variances and results in a positive covariance between direct and maternal effects when genomic information is not used. With genomics, the reduction is less, and the covariance is still negative. Using VC without genomic information may result in greater LR accuracy because of a lower additive genetic variance, with a similar level of dispersion. Adding the S×H interaction effect showed the best estimates of accuracy and bias for the direct effect but not for the maternal effect. Larger additive genetic variance with genomic information may be an artifact of selective genotyping. Fitting the S×H interaction effect in the model is recommended; however, further research is needed to investigate the improvement of prediction accuracy of maternal effects when S×H interaction is considered. Additionally, breeders should expect large changes in EPDs and ranking of animals, especially at the tails of the distributions, if this extra effect were fit into the genetic evaluation model. Before such changes are implemented in practice, more research is needed to ensure the resulting breeding values are better. The results of this study justify further investigation in this area for American Angus.
Acknowledgments
We thank Dale Van Vleck for inspiring this study through his continued research in this area and for generously sharing his ideas. This study was supported by the American Angus Association and its subsidiary Angus Genetics Inc. (St. Joseph, MO).
Glossary
Abbreviations
- A: pedigree relationship matrix
- AAA: American Angus Association
- AI: artificial insemination
- AIC: Akaike information criteria
- AIREML: average information restricted maximum likelihood
- CG: contemporary group
- EBV: estimated breeding value
- EPD: expected progeny difference
- G: genomic relationship matrix
- GEBV: genomic estimated breeding value
- H: realized relationship matrix
- LR: linear regression
- REML: restricted maximum likelihood
- SNP: single-nucleotide polymorphisms
- S×H: sire × herd
- ssGBLUP: single-step genomic best linear unbiased prediction
- ssGREML: single-step genomic restricted maximum likelihood
- TBV: true breeding value
- VC: variance components
- WW: weaning weight
Conflict of Interest Statement
S.M. discloses that he was employed by the American Angus Association when this research was done. The remaining authors declare no real or perceived conflicts of interest.
Literature Cited
- Aguilar, I., Misztal I., Johnson D., Legarra A., Tsuruta S., and Lawlor T. 2010. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J. Dairy Sci. 93(2):743–752.
- Aldridge, M. N., Vandenplas J., Bergsma R., and Calus M. P. 2020. Variance estimates are similar using pedigree or genomic relationships with or without the use of metafounders or the algorithm for proven and young animals. J. Anim. Sci. 98(3):skaa019.
- Banos, G., and Shook G. 1990. Genotype by environment interaction and genetic correlations among parities for somatic cell count and milk yield. J. Dairy Sci. 73(9):2563–2573.
- Banos, G., Wiggans G., and Powell R. 2001. Impact of paternity errors in cow identification on genetic evaluations and international comparisons. J. Dairy Sci. 84(11):2523–2529.
- Bermann, M., Legarra A., Hollifield M. K., Masuda Y., Lourenco D., and Misztal I. 2021. Validation of single-step GBLUP genomic predictions from threshold models using the linear regression method: an application in chicken mortality. J. Anim. Breed. Genet. 138(1):4–13.
- Berweger Baschnagel, M., Moll J., and Künzi N. 1999. Comparison of models to estimate maternal effects for weaning weight of Swiss Angus cattle fitting a sire x herd interaction as an additional random effect. Livest. Prod. Sci. 60(2–3):203–208.
- Bradfield, M. J. 1999. Genetic evaluation of cattle managed under extensive conditions in northern Australia [PhD Thesis]. Armidale, NSW: University of New England.
- Cesarani, A., Pocrnic I., Macciotta N. P., Fragomeni B. O., Misztal I., and Lourenco D. A. 2019. Bias in heritability estimates from genomic restricted maximum likelihood methods under different genotyping strategies. J. Anim. Breed. Genet. 136(1):40–50.
- Christensen, O. F., and Lund M. S. 2010. Genomic prediction when some animals are not genotyped. Genet. Sel. Evol. 42(1):1–8.
- Dodenhoff, J., Van Vleck L. D., and Wilson D. 1999. Comparison of models to estimate genetic effects of weaning weight of Angus cattle. J. Anim. Sci. 77(12):3176–3184.
- Forni, S., Aguilar I., and Misztal I. 2011. Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information. Genet. Sel. Evol. 43(1):1.
- Graser, H., Johnston D., and Tier B. 1999. Sire × herd interaction effect in BREEDPLAN. In: Proceedings of the Association for the Advancement of Animal Breeding and Genetics. p 197–198.
- Henderson, C. R. 1975. Comparison of alternative sire evaluation methods. J. Anim. Sci. 41(3):760–770.
- Jensen, J. 2016. Estimation of genetic variance in the age of genomics. Wiley Online Library.
- Kennedy, B., Schaeffer L., and Sorensen D. 1988. Genetic properties of animal models. J. Dairy Sci. 71:17–26.
- Lee, C., and Pollak E. 1997. Relationship between sire × year interactions and direct-maternal genetic correlation for weaning weight of Simmental cattle. J. Anim. Sci. 75(1):68–75.
- Legarra, A. 2016. Comparing estimates of genetic variance across different relationship models. Theoret. Popul. Biol. 107:26–30.
- Legarra, A., and Reverter A. 2017. Can we frame and understand cross-validation results in animal breeding. In: Proceedings of the 22nd conference association for the advancement of animal breeding and genetics. p 2–5.
- Legarra, A., and Reverter A. 2018. Semi-parametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method. Genet. Sel. Evol. 50(1):53.
- Legarra, A., Robert-Granié C., Manfredi E., and Elsen J.-M. 2008. Performance of genomic selection in mice. Genetics 180(1):611–618.
- Lourenco, D., Tsuruta S., Fragomeni B., Masuda Y., Aguilar I., Legarra A., Bertrand J., Amen T., Wang L., and Moser D. 2015. Genetic evaluation using single-step genomic best linear unbiased predictor in American Angus. J. Anim. Sci. 93(6):2653–2662.
- Macedo, F., Reverter A., and Legarra A. 2020. Behavior of the linear regression method to estimate bias and accuracies with correct and incorrect genetic evaluation models. J. Dairy Sci. 103(1):529–544.
- Masuda, Y., Aguilar I., Tsuruta S., and Misztal I. 2015. Acceleration of sparse operations for average-information REML analyses with supernodal methods and sparse-storage refinements. J. Anim. Sci. 93(10):4670–4674.
- Meyer, K. 1987. Estimates of variances due to sire × herd interactions and environmental covariances between paternal half-sibs for first lactation dairy production. Livest. Prod. Sci. 17:95–115.
- Meyer, K. 1992. Variance components due to direct and maternal effects for growth traits of Australian beef cattle. Livest. Prod. Sci. 31(3–4):179–204.
- Meyer, K., and Graser H.-U. 1999. Estimates of parameters for scan records of Australian beef cattle treating records on males and females as different traits. In: Proceedings of the Association for the Advancement of Animal Breeding and Genetics. p 385–388.
- Misztal, I., Legarra A., and Aguilar I. 2014a. Using recursion to compute the inverse of the genomic relationship matrix. J. Dairy Sci. 97(6):3943–3952.
- Misztal, I., Tsuruta S., Lourenco D., Aguilar I., Legarra A., and Vitezica Z. 2014b. Manual for BLUPF90 family of programs. Athens: University of Georgia.
- Notter, D., Tier B., and Meyer K. 1992. Sire × herd interactions for weaning weight in beef cattle. J. Anim. Sci. 70(8):2359–2365.
- Pollak, E., Wang C., Cunningham B., Klei L., and Van Tassell C. 1994. Considerations on the validity of parameters used in national cattle evaluations. In: Proceedings for the Fourth Genetic Prediction Workshop. January. p 21–22.
- Reverter, A., Golden B., Bourdon R., and Brinks J. 1994. Detection of bias in genetic predictions. J. Anim. Sci. 72(1):34–37.
- Robinson, D. 1996. Models which might explain negative correlations between direct and maternal genetic effects. Livest. Prod. Sci. 45(2–3):111–122.
- Schaeffer, L. 1984. Sire and cow evaluation under multiple trait models. J. Dairy Sci. 67(7):1567–1580.
- Silva, R. M., Evenhuis J. P., Vallejo R. L., Gao G., Martin K. E., Leeds T. D., Palti Y., and Lourenco D. A. 2019. Whole-genome mapping of quantitative trait loci and accuracy of genomic predictions for resistance to columnaris disease in two rainbow trout breeding populations. Genet. Sel. Evol. 51(1):1–13.
- Tong, A., Kennedy B., and Moxley J. 1977. Sire by herd interactions for milk yield and composition traits. Can. J. Anim. Sci. 57(3):383–388.
- Van Tassell, C., Wiggans G., VanRaden P., and Norman H. 1997. Changes in USDA-DHIA genetic evaluations (August 1997). AIPL. Res. Rpt. 9, no 8–97.
- Veerkamp, R., Mulder H., Thompson R., and Calus M. 2011. Genomic and pedigree-based genetic parameters for scarcely recorded traits when some animals are genotyped. J. Dairy Sci. 94(8):4189–4197.
- Wang, L., Janss L. L., Madsen P., Henshall J., Huang C.-H., Marois D., Alemu S., Sørensen A., and Jensen J. 2020. Effect of genomic selection and genotyping strategy on estimation of variance components in animal models using different relationship matrices. Genet. Sel. Evol. 52(1):1–14.
- Wiggans, G., VanRaden P., Powell R., and Van Tassell C. 2000. Changes in USDA-DHIA genetic evaluations (November 2000). AIPL. Res. Rpt.







