Genet Sel Evol. 2014 Oct 14;46(1):66. doi: 10.1186/s12711-014-0066-4

Improving the accuracy of genomic prediction in Chinese Holstein cattle by using one-step blending

Xiujin Li, Sheng Wang, Ju Huang, Leyi Li, Qin Zhang, Xiangdong Ding
PMCID: PMC4196050  PMID: 25315995

Abstract

Background

The one-step blending approach has been suggested for genomic prediction in dairy cattle. The core of this approach is to incorporate pedigree and phenotypic information of non-genotyped animals. The objective of this study was to investigate the improvement of the accuracy of genomic prediction using the one-step blending method in Chinese Holstein cattle.

Findings

Three methods, GBLUP (genomic best linear unbiased prediction), original one-step blending with a genomic relationship matrix, and adjusted one-step blending with an adjusted genomic relationship matrix, were compared with respect to the accuracy of genomic prediction for five milk production traits in Chinese Holstein. For the two one-step blending methods, de-regressed proofs of 17 509 non-genotyped cows, including 424 dams and 17 085 half-sisters of the validation cows, were incorporated in the prediction model. The results showed that, averaged over the five milk production traits, the one-step blending increased the accuracy of genomic prediction by about 0.12 compared to GBLUP. No further improvement in accuracies was obtained from the adjusted one-step blending over the original one-step blending in our situation. Improvements in accuracies obtained with both one-step blending methods were almost completely contributed by the non-genotyped dams.

Conclusions

Compared with GBLUP, the one-step blending approach can significantly improve the accuracy of genomic prediction for milk production traits in Chinese Holstein cattle. Thus, the one-step blending is a promising approach for practical genomic selection in Chinese Holstein cattle, where the reference population mainly consists of cows.

Background

A reference population of sufficient size is essential in genomic selection (GS) [1-3]. For dairy cattle, in almost all countries with a developed dairy industry, thousands of progeny-tested bulls with highly reliable estimated breeding values (EBV) are used to form the national reference population. However, constituting such a reference population is not feasible in some countries, e.g. China, where the number of bulls with highly reliable EBV is limited. As an alternative, cows can be used to form the reference population. Ding et al. [4] investigated the accuracy of genomic prediction using a reference population consisting of cows, and showed that genomic selection using cows is feasible. However, a larger reference population of cows was required to obtain accuracies of genomic prediction comparable to those obtained with a reference population of progeny-tested bulls, because cow EBV are generally less reliable than bull EBV [4]. Further efforts are needed to improve the accuracy of genomic prediction in such a situation.

The term “one-step blending” is used to distinguish this approach, which uses DRP (de-regressed proofs) instead of raw phenotypes as response variables, from the original single-step approach [5]. In the present study, we investigated the possible improvements in the accuracy of genomic prediction by applying the one-step blending approach to Chinese Holsteins, for which the reference population consists primarily of cows. In addition, the influence of the relationship between the non-genotyped animals and genotyped selection candidates on the prediction accuracy of one-step blending was also investigated.

Methods

Data

The data consisted of 4917 Chinese Holstein cows born from 1998 to 2009 and 240 progeny-tested bulls born from 1984 to 2005, all of which had official EBV on five milk production traits (milk yield, fat yield, fat percentage, protein yield, and protein percentage). These official EBV were obtained based on a multiple-trait random regression test-day model [6]. DRP of all animals were derived from their EBV according to VanRaden and Wiggans [7] and used as response variables for genomic prediction. Reliabilities of the DRP were calculated according to Liu et al. [8]. All animals had reliabilities of DRP greater than 0.40 (for cows) or 0.80 (for bulls). Out of the 4917 cows, 4106 born before 2008, together with the 240 bulls, were taken as the reference population, and the remaining 811 cows born in or after 2008 were used as the validation population.

All individuals in the reference and validation populations were genotyped with the Illumina BovineSNP50 BeadChip (Illumina, San Diego, CA). Missing genotypes of single nucleotide polymorphisms (SNPs) with known chromosomal positions were imputed by BEAGLE [9], and those with unknown chromosomal positions were discarded. After imputation, SNPs with minor allele frequency (MAF) less than 0.01 were removed, leaving 46 422 SNPs for genomic prediction.
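The MAF filter described above can be illustrated with a minimal Python sketch. It assumes genotypes are stored as an animals-by-SNPs array of allele counts coded 0/1/2 after imputation; the function name `filter_by_maf` and this coding are assumptions for illustration, not part of the study's pipeline.

```python
import numpy as np

def filter_by_maf(genotypes, threshold=0.01):
    """Drop SNPs whose minor allele frequency is below `threshold`.

    genotypes: (n_animals, n_snps) allele counts coded 0/1/2 (assumed coding).
    Returns the filtered genotype array and the boolean mask of kept SNPs.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0            # frequency of the counted allele
    maf = np.minimum(p, 1.0 - p)        # minor allele frequency per SNP
    keep = maf >= threshold             # SNPs with MAF < threshold are removed
    return m[:, keep], keep
```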

To implement the one-step blending approach, all non-genotyped dams and half-sisters of the validation cows that had DRP with reliabilities greater than 0.40, were considered. Of the 811 validation cows, 425 had non-genotyped dams (424 in total) and all had non-genotyped half-sisters (17 085 in total, ranging from 154 to 2672).

Blood samples were collected from Chinese Holstein cattle when the regular quarantine inspection of the farms was conducted. The procedure for collecting the blood samples was carried out in strict accordance with the protocol approved by the Animal Welfare Committee of China Agricultural University (Permit Number: DK996).

Statistical models

Three methods, GBLUP, the original one-step blending and the adjusted one-step blending, were implemented for genomic prediction of animals in the validation population.

GBLUP

The following genomic BLUP model [10] was used to predict genomic breeding values:

$$\mathbf{y} = \mathbf{1}\mu + \mathbf{Z}\mathbf{g} + \mathbf{e}$$

where y is the vector of DRP of the reference animals, μ is the overall mean, 1 is a vector of ones, Z is the design matrix relating DRP to additive genetic effects, g is the vector of additive genetic effects, which is assumed to follow a normal distribution $N(\mathbf{0}, \mathbf{G}\sigma_g^2)$ with G being the genomic relationship matrix constructed using the first method of VanRaden [10], and e is the vector of random errors, assumed to follow a normal distribution $N(\mathbf{0}, \mathbf{D}\sigma_e^2)$, with D being a diagonal matrix with $d_{ii} = 1/w_i$, where $w_i = r_i^2/(1 - r_i^2)$ ($r_i^2$ is the reliability of DRP of individual i) [10,11]. The estimates of g based on this model are termed direct genomic breeding values (DGV).
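As an illustration of the two ingredients of this model, the following minimal Python sketch builds G by the first method of VanRaden [10] and derives the diagonal of D from DRP reliabilities. The function names, the 0/1/2 genotype coding, the use of allele frequencies estimated from the genotyped sample, and the weight form w_i = r_i²/(1 − r_i²) are assumptions made for this sketch; it is not the code used in the study.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix, VanRaden (2008) method 1.

    genotypes: (n_animals, n_snps) allele counts coded 0/1/2 (assumed coding).
    Allele frequencies are estimated from the supplied genotypes (assumption).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0               # allele frequency per SNP
    z = m - 2.0 * p                        # centre each SNP column by 2p
    scale = 2.0 * np.sum(p * (1.0 - p))    # 2 * sum of p(1 - p) over SNPs
    return z @ z.T / scale

def d_diagonal(reliability):
    """Diagonal of D: d_ii = 1 / w_i with w_i = r_i^2 / (1 - r_i^2)."""
    r2 = np.asarray(reliability, dtype=float)
    return (1.0 - r2) / r2
```

Under this weighting, a DRP with reliability 0.80 receives d_ii = 0.25, whereas a DRP with reliability 0.40 receives d_ii = 1.5, so less reliable records are given a larger residual variance.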

Original one-step blending

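Following Legarra et al. [12], Aguilar et al. [13], and Christensen and Lund [14], the one-step blending method has the same model as GBLUP, except that the vector y also contains the DRP of non-genotyped animals and vector g is assumed to follow a normal distribution $N(\mathbf{0}, \mathbf{H}\sigma_g^2)$, where H is defined as:

$$\mathbf{H} = \begin{bmatrix} \mathbf{A}_{11} + \mathbf{A}_{12}\mathbf{A}_{22}^{-1}(\mathbf{G} - \mathbf{A}_{22})\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G} \\ \mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{G} \end{bmatrix}$$

with A11, A12, A21 (the transpose of A12), and A22 being sub-matrices of A (the pedigree-based relationship matrix), and subscripts 1 and 2 referring to non-genotyped and genotyped animals, respectively. The estimates of g based on this model are termed genomic enhanced breeding values (GEBV).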
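The construction of H can be sketched in a few lines of Python. The sketch assumes that A is ordered with the non-genotyped animals first and the genotyped animals last, and that the blocks follow the form given by Legarra et al. [12]; the function name `build_h` and the toy pedigree are hypothetical.

```python
import numpy as np

def build_h(a, g, n_nongenotyped):
    """Combine pedigree (A) and genomic (G) relationships into H.

    a: pedigree relationship matrix, non-genotyped animals first (assumed order).
    g: genomic relationship matrix of the genotyped animals.
    """
    n1 = n_nongenotyped
    a11, a12, a22 = a[:n1, :n1], a[:n1, n1:], a[n1:, n1:]
    proj = a12 @ np.linalg.inv(a22)          # A12 * A22^-1
    h11 = a11 + proj @ (g - a22) @ proj.T    # proj.T equals A22^-1 * A21
    h12 = proj @ g
    return np.block([[h11, h12], [h12.T, g]])

# Toy example: one non-genotyped dam, then a genotyped sire and their daughter.
a = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
g = np.array([[1.02, 0.48],
              [0.48, 0.97]])
h = build_h(a, g, n_nongenotyped=1)
```

A useful sanity check is that H reduces to A when G is set equal to A22, i.e. when the genotypes carry no information beyond the pedigree.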

Adjusted one-step blending

To avoid the potential incompatibility in scale between the coefficients of G and A22 involved in the H matrix, which could lead to incorrect weighting of the pedigree and genomic information, as pointed out by Forni et al. [15], the G matrix was adjusted following Gao et al. [16], i.e.,

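$$\mathbf{G}^{*} = \beta\mathbf{G} + \alpha$$

where β and α are obtained from the following equations:

$$\mathrm{Avg}(\mathrm{diag}(\mathbf{G}))\,\beta + \alpha = \mathrm{Avg}(\mathrm{diag}(\mathbf{A}_{22}))$$

$$\mathrm{Avg}(\mathrm{offdiag}(\mathbf{G}))\,\beta + \alpha = \mathrm{Avg}(\mathrm{offdiag}(\mathbf{A}_{22}))$$

where Avg(diag(*)) is the average of the diagonal elements of matrix *, and Avg(offdiag(*)) is the average of the off-diagonal elements of matrix *.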
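The two equations for β and α form a 2 × 2 linear system, which the following Python sketch solves before rescaling G. It assumes that α is added to every element of βG, as in Gao et al. [16]; the function name `adjust_g` is hypothetical.

```python
import numpy as np

def adjust_g(g, a22):
    """Rescale G so that its average diagonal and off-diagonal match A22.

    Returns the adjusted matrix G* = beta * G + alpha, plus beta and alpha.
    """
    off = ~np.eye(g.shape[0], dtype=bool)              # mask of off-diagonal cells
    lhs = np.array([[np.diag(g).mean(), 1.0],
                    [g[off].mean(), 1.0]])
    rhs = np.array([np.diag(a22).mean(), a22[off].mean()])
    beta, alpha = np.linalg.solve(lhs, rhs)
    return beta * g + alpha, beta, alpha               # alpha is added to every element
```

When G already matches A22 on both averages, the solution is β = 1 and α = 0 and G is left unchanged, which corresponds to the situation in which the adjustment is expected to have little effect.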

The variance components ($\sigma_g^2$ and $\sigma_e^2$) involved in the three models were estimated using AI-REML, as implemented in the software DMU [17].

Evaluation of the accuracy of genomic prediction

The accuracy of genomic predictions was evaluated as $r_v = r_{(\hat{g},\mathrm{DRP})} / r_{\mathrm{DRP}}$ [5], where $r_{(\hat{g},\mathrm{DRP})}$ is the correlation between the estimated g (DGV or GEBV) and the DRP in the validation population and $r_{\mathrm{DRP}}$ is the average of the square root of the reliability of the DRP of the validation cows.
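This validation accuracy is straightforward to compute once predictions, DRP, and DRP reliabilities are available for the validation cows. The Python sketch below is a minimal illustration; the function and argument names are hypothetical.

```python
import numpy as np

def validation_accuracy(predicted, drp, drp_reliability):
    """r_v: correlation between predictions and DRP, divided by the
    average square root of the DRP reliabilities of the validation animals."""
    r = np.corrcoef(predicted, drp)[0, 1]                   # cor(g_hat, DRP)
    r_drp = np.mean(np.sqrt(np.asarray(drp_reliability)))   # mean accuracy of DRP
    return r / r_drp
```

Dividing by the average accuracy of the DRP compensates for the fact that the DRP are themselves imperfect measures of the true breeding values, so the raw correlation alone would understate the accuracy of prediction.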

In addition, the theoretical accuracy of the DGV or GEBV was calculated for each individual in the same way as in conventional BLUP, following Henderson [18], from the diagonal elements of the inverse of the coefficient matrix of the mixed model equations (MME), and the average theoretical accuracy over validation animals was also used to evaluate the accuracy of genomic predictions.
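A minimal Python sketch of this calculation is given below for a model with an overall mean and one DRP per recorded animal. It assumes that the reliability of animal i is computed as 1 − PEV_i / (k_ii σ_g²), where PEV_i is the prediction error variance and k_ii the diagonal element of the relationship matrix (G or H); other conventions exist, and the function name and arguments are hypothetical rather than taken from DMU.

```python
import numpy as np

def theoretical_accuracy(k, d_diag, has_record, lam):
    """Accuracy of predicted genetic effects from the inverse of the MME.

    k: relationship matrix (G or H) of all animals in the model.
    d_diag: d_ii of the animals with a DRP, in the order they appear in k.
    has_record: boolean mask marking the animals with a DRP.
    lam: variance ratio sigma_e^2 / sigma_g^2.
    """
    k = np.asarray(k, dtype=float)
    idx = np.flatnonzero(has_record)
    z = np.zeros((idx.size, k.shape[0]))
    z[np.arange(idx.size), idx] = 1.0            # one record per recorded animal
    w = np.hstack([np.ones((idx.size, 1)), z])   # columns: overall mean, then animals
    d_inv = np.diag(1.0 / np.asarray(d_diag, dtype=float))
    lhs = w.T @ d_inv @ w
    lhs[1:, 1:] += np.linalg.inv(k) * lam        # add K^-1 * lambda to the animal block
    c_gg = np.diag(np.linalg.inv(lhs))[1:]       # diagonal of the inverse, animal part
    reliability = 1.0 - c_gg * lam / np.diag(k)  # 1 - PEV / (k_ii * sigma_g^2)
    return np.sqrt(np.clip(reliability, 0.0, 1.0))
```

Animals without a record, such as validation cows, obtain their accuracy entirely through their relationships with recorded animals, which is why adding close non-genotyped relatives through H can raise their theoretical accuracy.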

Results and discussion

As shown in Table 1, for the 811 validation cows, rv and average theoretical accuracies from the original one-step blending increased by 0.12 and 0.02, respectively, compared with the accuracies from GBLUP averaged over the five traits. Accuracies from the adjusted one-step blending approach were almost the same as those from the original one-step blending. Theoretical accuracies were much higher than rv, which was also observed in other studies [3,19-21]. The theoretical accuracy may also be overestimated owing to sampling errors in elements of the genomic relationship matrix as pointed out by Goddard et al. [22]. In comparison with GBLUP, the one-step blending approach can significantly improve the accuracy of genomic prediction by incorporating the phenotypes (DRP) of non-genotyped relatives of the selection candidates. However, the adjusted one-step blending did not result in further improvements in accuracy compared with the original one-step blending, probably because the original G matrix was little adjusted in our situation, since the estimates of β and α were 0.992 (close to 1) and 0.017 (close to 0), respectively, while they were 0.859 and 0.298 in the study of Christensen et al. [23]. Similar results were also observed by Gao et al. [16] in the Nordic Holstein population, where the adjusted one-step blending resulted in little improvement in the prediction accuracy and estimates of β and α were 0.976 and 0.085, respectively.

Table 1.

Accuracies of genomic prediction for the validation cows

| Trait | GBLUP rv* | GBLUP theoretical accuracy | Original one-step rv* | Original one-step theoretical accuracy | Adjusted one-step rv* | Adjusted one-step theoretical accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| All validation cows | | | | | | |
| Milk yield | 0.36 | 0.73 | 0.46 | 0.75 | 0.46 | 0.75 |
| Fat yield | 0.44 | 0.73 | 0.60 | 0.75 | 0.60 | 0.75 |
| Fat % | 0.52 | 0.67 | 0.60 | 0.69 | 0.60 | 0.69 |
| Protein yield | 0.37 | 0.72 | 0.51 | 0.74 | 0.51 | 0.75 |
| Protein % | 0.51 | 0.67 | 0.62 | 0.69 | 0.62 | 0.69 |
| Average | 0.44 | 0.70 | 0.56 | 0.72 | 0.56 | 0.73 |
| 386 validation cows with genotyped dams | | | | | | |
| Milk yield | 0.34 | 0.73 | 0.36 | 0.73 | 0.37 | 0.74 |
| Fat yield | 0.42 | 0.73 | 0.44 | 0.74 | 0.44 | 0.74 |
| Fat % | 0.50 | 0.66 | 0.52 | 0.67 | 0.52 | 0.67 |
| Protein yield | 0.36 | 0.72 | 0.37 | 0.73 | 0.37 | 0.73 |
| Protein % | 0.48 | 0.67 | 0.50 | 0.67 | 0.50 | 0.67 |
| Average | 0.42 | 0.70 | 0.44 | 0.71 | 0.44 | 0.71 |
| 425 validation cows with both non-genotyped dams and half-sisters | | | | | | |
| Milk yield | 0.36 | 0.73 | 0.55 | 0.76 | 0.55 | 0.76 |
| Fat yield | 0.46 | 0.73 | 0.73 | 0.76 | 0.72 | 0.76 |
| Fat % | 0.54 | 0.67 | 0.69 | 0.70 | 0.69 | 0.70 |
| Protein yield | 0.39 | 0.75 | 0.62 | 0.76 | 0.62 | 0.76 |
| Protein % | 0.54 | 0.67 | 0.73 | 0.70 | 0.73 | 0.70 |
| Average | 0.46 | 0.71 | 0.66 | 0.74 | 0.66 | 0.74 |

*rv is equal to the correlation between the estimated breeding value and the DRP divided by the average of the square root of the reliability of the DRP in the validation cows.

Among the 811 validation cows, 425 had both non-genotyped dams and half-sisters, while the 386 cows with genotyped dams had only non-genotyped half-sisters. For validation cows with genotyped dams, rv and the theoretical accuracies obtained from both one-step blending approaches were nearly the same as those from GBLUP (Table 1). In contrast, for validation cows with both non-genotyped dams and half-sisters, the one-step blending approach improved rv by 15 to 26 percentage points and the theoretical accuracy by 1 to 3 percentage points (Table 1). Again, in all these cases, the adjusted one-step blending did not perform better than the original one-step blending. These results suggest that, compared with GBLUP, the improvements in accuracy from the one-step blending approach were contributed almost entirely by the non-genotyped dams. To verify this, we discarded all non-genotyped half-sisters and included only the non-genotyped dams of the 425 validation cows in the one-step blending approach. As expected, rv and the theoretical accuracies of the 425 validation cows from the original one-step blending approach (Table 2) were almost the same as those obtained when both non-genotyped dams and half-sisters were included (Table 1). The reason is that all non-genotyped half-sisters were daughters of 19 genotyped sires in the reference population, and the information from these daughters was already part of the DRP of the sires. Therefore, these half-sisters contributed little extra information for genomic prediction.

Table 2.

Accuracies of genomic prediction for 425 validation cows when their non-genotyped dams and not their non-genotyped half-sisters were used in the one-step blending approach

| Trait | GBLUP rv* | GBLUP theoretical accuracy | Original one-step rv* | Original one-step theoretical accuracy |
| --- | --- | --- | --- | --- |
| Milk yield | 0.36 | 0.73 | 0.54 | 0.76 |
| Fat yield | 0.46 | 0.73 | 0.73 | 0.76 |
| Fat percentage | 0.54 | 0.67 | 0.70 | 0.69 |
| Protein yield | 0.39 | 0.75 | 0.61 | 0.76 |
| Protein percentage | 0.54 | 0.67 | 0.74 | 0.69 |
| Average | 0.46 | 0.71 | 0.66 | 0.73 |

*rv is equal to the correlation between the estimated breeding value and the DRP divided by the average of the square root of the reliability of the DRP in the validation cows.

Conclusions

Averaged over the five milk production traits, both one-step blending methods increased rv and the average theoretical accuracy by about 0.12 and 0.02, respectively, compared to GBLUP. However, the adjusted one-step blending did not perform better than the original one-step blending in our situation, and the improvements in accuracy from both one-step blending approaches were contributed almost entirely by the non-genotyped dams of the validation animals.

Acknowledgements

This work was supported by the National ‘948’ Project (2011-G2A), Ph.D. Programs Foundation of Ministry of Education of China (20110008110001), National Natural Science Foundation of China (31272418, 31372158), the earmarked fund for CARS-37, National Dairy Industry System in Beijing Team, Scientific Research Foundation for Returned Scholars, Ministry of Education of China, Program for Changjiang Scholar and Innovation Research Team in University (Grant No. IRT1191). We thank the Dairy Association of China for supplying the official EBV and Beijing Dairy Cattle Center for providing blood and semen samples.

Footnotes

Xiujin Li and Sheng Wang contributed equally to this work.

Competing interests

The authors declare that they have no competing interests.

Authors’ contribution

XJL performed statistical analysis and wrote the manuscript. JH, LYL and SW prepared data. XDD and QZ jointly conceived the design of the study, made substantial contribution to the results interpretation and revised the manuscript. All authors read and approved the manuscript.

Contributor Information

Xiujin Li, Email: lixiujin996@gmail.com.

Sheng Wang, Email: waswangs@gmail.com.

Ju Huang, Email: huangj1023@gmail.com.

Leyi Li, Email: 13126665473@163.com.

Qin Zhang, Email: qzhang@cau.edu.cn.

Xiangdong Ding, Email: xding@cau.edu.cn.

References

  • 1. Goddard ME, Hayes BJ. Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nat Rev Genet. 2009;10:381–391. doi: 10.1038/nrg2575.
  • 2. Hayes B, Bowman P, Chamberlain A, Goddard ME. Invited review: genomic selection in dairy cattle: progress and challenges. J Dairy Sci. 2009;92:433–443. doi: 10.3168/jds.2008-1646.
  • 3. VanRaden P, Van Tassell C, Wiggans G, Sonstegard T, Schnabel R, Taylor J, Schenkel F. Invited Review: Reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009;92:16–24. doi: 10.3168/jds.2008-1514.
  • 4. Ding X, Zhang Z, Li X, Wang S, Wu X, Sun D, Yu Y, Liu J, Wang Y, Zhang Y, Zhang S, Zhang Y, Zhang Q. Accuracy of genomic prediction for milk production traits in Chinese Holstein population using a reference population consisting of cows. J Dairy Sci. 2013;96:5315–5323. doi: 10.3168/jds.2012-6194.
  • 5. Su G, Madsen P, Nielsen US, Mäntysaari EA, Aamand GP, Christensen OF, Lund MS. Genomic prediction for Nordic Red cattle using one-step and selection index blending. J Dairy Sci. 2012;95:909–917. doi: 10.3168/jds.2011-4804.
  • 6. Schaeffer LR, Jamrozik J, Kistemaker GJ, Van Doormaal BJ. Experience with a test-day model. J Dairy Sci. 2000;83:1135–1144. doi: 10.3168/jds.S0022-0302(00)74979-4.
  • 7. VanRaden PM, Wiggans GR. Derivation, calculation, and use of national animal model information. J Dairy Sci. 1991;74:2737–2746. doi: 10.3168/jds.S0022-0302(91)78453-1.
  • 8. Liu Z, Reinhardt F, Bünger A, Reents R. Derivation and calculation of approximate reliabilities and daughter yield-deviations of a random regression test-day model for genetic evaluation of dairy cattle. J Dairy Sci. 2004;87:1896–1907. doi: 10.3168/jds.S0022-0302(04)73348-2.
  • 9. Browning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Hum Genet. 2009;84:210–223. doi: 10.1016/j.ajhg.2009.01.005.
  • 10. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–4423. doi: 10.3168/jds.2007-0980.
  • 11. Garrick DJ, Taylor JF, Fernando RL. Deregressing estimated breeding values and weighting information for genomic regression analyses. Genet Sel Evol. 2009;41:55. doi: 10.1186/1297-9686-41-55.
  • 12. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–4663. doi: 10.3168/jds.2009-2061.
  • 13. Aguilar I, Misztal I, Johnson DL, Legarra A, Tsuruta S, Lawlor TJ. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J Dairy Sci. 2010;93:743–752. doi: 10.3168/jds.2009-2730.
  • 14. Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2. doi: 10.1186/1297-9686-42-2.
  • 15. Forni S, Aguilar I, Misztal I. Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information. Genet Sel Evol. 2011;43:1. doi: 10.1186/1297-9686-43-1.
  • 16. Gao H, Christensen OF, Madsen P, Nielsen US, Zhang Y, Lund MS, Su G. Comparison on genomic predictions using three GBLUP methods and two single-step blending methods in the Nordic Holstein population. Genet Sel Evol. 2012;44:8. doi: 10.1186/1297-9686-44-8.
  • 17. Madsen P, Sørensen P, Su G, Damgaard LH, Thomsen H, Labouriau R. DMU-a Package for Analyzing Multivariate Mixed Models. In: Proceedings of the 8th World Congress on Genetics Applied to Livestock Production: 13-18 August 2006; Belo Horizonte. 2006.
  • 18. Henderson CR. Best linear unbiased estimation and prediction under a selection model. Biometrics. 1975;31:423–447. doi: 10.2307/2529430.
  • 19. Bijma P. Accuracies of estimated breeding values from ordinary genetic evaluations do not reflect the correlation between true and estimated breeding values in selected populations. J Anim Breed Genet. 2012;129:345–358. doi: 10.1111/j.1439-0388.2012.00991.x.
  • 20. Edel C, Neuner S, Emmerling R, Goetz K-U. A note on using ‘forward prediction’ to assess precision and bias of genomic predictions. Interbull Bulletin. 2012;46:16–19.
  • 21. Su G, Guldbrandtsen B, Gregersen V, Lund M. Preliminary investigation on reliability of genomic estimated breeding values in the Danish Holstein population. J Dairy Sci. 2010;93:1175–1183. doi: 10.3168/jds.2009-2192.
  • 22. Goddard M, Hayes B, Meuwissen T. Using the genomic relationship matrix to predict the accuracy of genomic selection. J Anim Breed Genet. 2011;128:409–421. doi: 10.1111/j.1439-0388.2011.00964.x.
  • 23. Christensen OF, Madsen P, Nielsen B, Ostersen T, Su G. Single-step methods for genomic evaluation in pigs. Animal. 2012;6:1565–1571. doi: 10.1017/S1751731112000742.

Articles from Genetics, Selection, Evolution : GSE are provided here courtesy of BMC
