Journal of Animal Science. 2021 Feb 9;99(3):skab041. doi: 10.1093/jas/skab041

Selective genotyping and phenotypic data inclusion strategies of crossbred progeny for combined crossbred and purebred selection in swine breeding

Garrett M See, Benny E Mote, Matthew L Spangler
PMCID: PMC7968076  PMID: 33560334

Abstract

Inclusion of crossbred (CB) data into traditionally purebred (PB) genetic evaluations has been shown to increase the response in CB performance. Currently, it is unrealistic to collect data on all CB animals in swine production systems; thus, a subset of CB animals must be selected to contribute genomic/phenotypic information. The aim of this study was to evaluate selective genotyping strategies in a simulated 3-way swine crossbreeding scheme. The swine crossbreeding scheme was simulated and produced 3-way CB animals for 6 generations with 3 distinct PB breeds each with 25 and 175 mating males and females, respectively. F1 crosses (400 mating females) produced 4,000 terminal CB progeny which were subjected to selective genotyping. The genome consisted of 18 chromosomes with 1,800 QTL and 72k SNP markers. Selection was performed using estimated breeding values (EBV) for CB performance. It was assumed that both PB and CB performance were moderately heritable ($h^2 = 0.4$). Several scenarios altering the genetic correlation between PB and CB performance (rpc = 0.1, 0.3, 0.5, 0.7, or 0.9) were considered. CB animals were chosen based on phenotypes to select 200, 400, or 800 CB animals to genotype per generation. Selection strategies included: (1) Random: random selection, (2) Top: highest phenotype, (3) Bottom: lowest phenotype, (4) Extreme: half highest and half lowest phenotypes, and (5) Middle: average phenotype. Each selective genotyping strategy, except for Random, was considered by selecting animals in half-sib (HS) or full-sib (FS) families. The number of PB animals with genotypes and phenotypes each generation was fixed at 1,680. Each unique genotyping strategy and rpc scenario was replicated 10 times. Selection of CB animals based on the Extreme strategy resulted in the highest (P < 0.05) rates of genetic gain in CB performance (ΔG) when rpc < 0.9. For highly correlated traits (rpc = 0.9), selective genotyping did not impact (P > 0.05) ΔG. No differences (P > 0.05) were observed in ΔG between Top, Bottom, or Middle when rpc > 0.1. Higher correlations between true breeding values (TBV) and EBV were observed using Extreme when rpc < 0.9. In general, family sampling method did not impact ΔG or the correlation between TBV and EBV. Overall, the Extreme genotyping strategy produced the greatest genetic gain and the highest correlations between TBV and EBV, suggesting that 2-tailed sampling of CB animals is the most informative when CB performance is the selection goal.

Keywords: commercial data, selective genotyping, selective phenotyping, swine

Introduction

Genomic selection has dramatically changed the swine breeding industry and has been successfully applied in purebred (PB) populations (Knol et al., 2016). As implemented, it requires collection of PB genotypes and phenotypes. These PB phenotypes are presumably correlated to traits that are drivers of net profit at the commercial level. Estimated breeding values (EBV) of PB selection candidates are then calculated with the goal of generating phenotypic change in crossbred (CB) animal performance. Currently, the collection of genotypes and phenotypes is limited to superior PB animals (Knol et al., 2016). However, the targets of selection are CB traits; thus, if the genetic correlation between PB and CB traits (rpc) is below 1, the selection pressure is suboptimal (Wientjes and Calus, 2017). Dekkers (2007) proposed a genomic combined CB and PB selection scheme (CCPS) utilizing breed-specific marker effects. CCPS has since been shown to improve prediction accuracy of PB performance for CB traits in simulated (Zeng et al., 2013; See et al., 2020) and real data (Hidalgo et al., 2016; Sewel et al., 2018). Further work has shown the benefit of CB reference populations used to inform genetic predictions, compared to strictly PB animals (Esfandyari et al., 2015; van Grevenhof and van der Werf, 2015; See et al., 2020).

In practice, CCPS, which utilizes information from both nucleus and commercial tier animals, has seen limited use due to added costs. Costs could be reduced by carefully designing PB and CB reference populations which limit the number of CB animals with data while maximizing the response in CB performance. van Grevenhof and van der Werf (2015) designed a deterministic simulation, where the amount and sources of information used in CCPS were evaluated using 2 PB lines and the resulting CB progeny. The authors showed that when the rpc is <0.7 and the selection goal does not include PB performance, a reference population comprised of half CB and half PB animals outperforms one that is purely PB. See et al. (2020) investigated the impact of combined PB and CB reference populations used to inform selection decisions targeted at improving CB performance. Inclusion of CB data into existing PB reference populations was shown to improve the response to selection for CB performance when PB records comprised the majority of the reference population.

Further strategies to maximize genetic gain without genotyping all possible animals have previously been investigated through selective genotyping of the PB candidates (Boligon et al., 2012; Ehsani et al., 2010; Zhao et al., 2012; Howard et al., 2018; Chu et al., 2020). The pitfalls of selectively genotyping top performing animals have been extensively discussed and the authors generally concluded that selective genotyping of both top and bottom performing animals does improve predictive ability when not all candidates are genotyped. With the exception of Howard et al. (2018) and Chu et al. (2020), previous works have not focused on the long-term impact of selective genotyping on the rate of genetic gain and phenotyping in breeding schemes. Howard et al. (2018) simulated beef and swine populations which were then subjected to selection based upon an index comprised of EBV for two traits. Using phenotypes of all selection candidates, several genotyping strategies were investigated including selective genotyping of a proportion of top index values or at random, and including all genotypes. Chu et al. (2020) was the only previous work which evaluated selective genotyping across 2 environments and did not include phenotypes from all animals. The authors suggest that selective genotyping of top performing individuals in the breeding environment, and genotyping and phenotyping of top and bottom performing individuals from the commercial environment would increase genetic gains. Adapting the concepts of previous works to a CCPS breeding scheme, selective genotyping could increase genetic gain in CB performance when a limited number of CB animals are included in genomic evaluations.

Therefore, the objective of the present study was to evaluate the prediction of genetic merit for selection candidates in nucleus populations under different selective genotyping strategies of commercial animals in CCPS, across 2 breeding scheme constructs: (1) the value of rpc and (2) the proportion of commercial tier animals with available genomic and phenotypic data.

Methods

Simulated data were utilized in this study, thus animal care and use approval was not required. Simulated phenotypes and genotypes were produced using AlphaSimR (Gaynor et al., 2019) in a series of cohesive R scripts previously described by See et al. (2020). The litter size for PB and CB was set to 4 and 10, respectively, for computational ease.

Genotypes and phenotypes

The simulation structure for the development of founder haplotypes and genotypes has been previously described by See et al. (2020). Briefly, founder haplotypes for each chromosome were simulated using the Markovian Coalescent Simulator (Chen et al., 2009). Simulation of the founder haplotypes began $1 \times 10^{6}$ generations prior to the end of founder haplotype simulation with an effective population size (Ne) of 100,000. The Ne was then gradually reduced to 100 at the end of the simulation of founder haplotypes. Formation of the initial 4,800 founder animals was performed by randomly sampling founder haplotypes with replacement and recombination events to make up the initial founder genome. The simulated genome was designed to mimic the Sus scrofa genome with an SNP density comparable to a 60K SNP chip and included 18 chromosomes each with 100 non-overlapping QTL. A SNP chip with 72,000 markers was simulated by randomly selecting 4,000 bi-allelic sites, per chromosome, to become SNP markers. The base pair mutation and recombination rates were $2.5 \times 10^{-8}$ and $1 \times 10^{-8}$, respectively. Two additive traits with moderate heritability ($h^2 = 0.4$) and phenotypic variance of 10 were simulated to represent CB and PB performance free of dominance or epistatic variation.
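As a quick check of the variance components implied by these settings, the sketch below splits a phenotypic variance of 10 into additive and residual parts for a heritability of 0.4 and builds the 2 × 2 additive genetic (co)variance matrix between PB and CB performance for one value of rpc. The function names are illustrative and are not taken from the authors' scripts; the sketch assumes additive and residual variance are the only variance sources, in line with the absence of dominance and epistasis.

```python
import math


def variance_components(h2, phenotypic_var):
    """Split phenotypic variance into additive and residual parts."""
    additive = h2 * phenotypic_var
    residual = phenotypic_var - additive
    return additive, residual


def genetic_covariance_matrix(var_pb, var_cb, r_pc):
    """2 x 2 additive (co)variance matrix for PB and CB performance."""
    cov = r_pc * math.sqrt(var_pb * var_cb)
    return [[var_pb, cov], [cov, var_cb]]


# Settings stated in the text: h2 = 0.4 and phenotypic variance = 10 for both traits.
additive, residual = variance_components(0.4, 10.0)

# One of the genetic correlation scenarios (rpc = 0.5).
phi = genetic_covariance_matrix(additive, additive, 0.5)
print(additive, residual, phi)
```

Under these settings the additive variance is 4 and the residual variance is 6 for each trait, so the genetic covariance is simply 4 times rpc.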

Breeding program

The general gene flow and breeding scheme has been previously described by See et al. (2020). Briefly, breed formation began by randomly separating founder animals into 3 breeds, with equal numbers of males and females. Prior to the initiation of crossbreeding, within breed random mating occurred for 65 generations. To develop breed differences, during the first 15 generations of random mating a genetic bottleneck occurred in which the population size was gradually reduced from 1,600 to 200 mating animals. At the end of random mating in generation 0, selective mating occurred for 1 generation within breed where selection candidates were chosen based on EBV for PB performance. This selective mating was done to further deviate selection candidate EBV prior to crossbreeding. Data utilized to produce EBV in generation 0 consisted of phenotypes and genotypes of all selection candidates produced from the last generation of random mating and pedigree relationships including the current selection candidates and the previous 2 randomly mated generations (generation −2 and −1).

A 3-tier swine breeding program was simulated with 3 nucleus lines (hereafter breeds A, B, and C) each with 25 and 175 mating males and females, respectively. Each PB dam produced 1 litter per generation with 2 males and 2 females to be selection candidates. Selection schemes have been previously described by See et al. (2020). Prior to the production of CB animals, PB selection candidates were selected as replacements by EBV for PB performance. Beginning in generation 2, replacements for breeds A, B and C were selected by EBV for CB performance according to replacement rates presented in Table 1. Mimicking the 3-way crossbreeding system, an F1 cross (AB) between males of breed A (n = 25) and females of breed B (n = 250) occurred in the second generation of selection. It is important to note that dams from B used to make AB progeny were not the same animals that produced PB progeny in each generation. Further, sires from A used to produce AB progeny were the same sires used to make PB matings in each generation. Females from AB (n = 400) were crossed with sires from C (n = 50) to produce terminal CB progeny (C(AB)). Each dam of C(AB) produced 1 litter with 10 progeny which were candidates for genotyping and phenotyping, resulting in a total of 4,000 CB progeny per generation. Half of the sires of C(AB), those with the highest EBV for CB performance, were the same subset of sires which were used to make PB matings in breed C each generation. The remaining 25 sires of C(AB) were sires which had the next highest EBV for CB performance but were not selected to be sires in breed C PB matings. The crossbreeding scheme continued for 6 generations.

Table 1. Population parameters including average number of litters, progeny produced, and replacement rates for PB (A, B and C) and CB breeds (AB and C(AB)) per generation.

Breed | Sires | Dams | Progeny | Litter size | Sire replacement | Dam replacement
A | 25 | 175 | 700 | 4 | 0.75 | 0.60
B | 25 | 175 | 700 | 4 | 0.75 | 0.60
C | 25 | 175 | 700 | 4 | 0.70 | 0.50
AB | 25 | 250 | 1,000 | 4 | 0.65
C(AB) | 50 | 400 | 4,000 | 10

Genotyping strategies

Five strategies of choosing CB animals to be genotyped were evaluated including: (1) Random: random animals were selected, (2) Top: animals with the highest phenotypes, (3) Bottom: animals with the lowest phenotypes, (4) Extreme: half animals with the highest and half with the lowest phenotypes, and (5) Middle: animals with average phenotypes. Strategies 2 through 5 were conducted by comparing genotype candidates to either their half-sib (HS) or full-sib (FS) contemporaries resulting in 2 sampling methods including full-sib family sampling (FS) and half-sib family sampling (HS) for a total of 9 unique selective genotyping strategies. Different proportions (number) of CB animals with known sire pedigrees were chosen to be genotyped and phenotyped including 5% (200), 10% (400), or 20% (800). For each of the genotyping strategies, it was assumed that 80% (1,680) of PB selection candidates were phenotyped for PB performance and genotyped with full pedigree information. Proportions of CB and PB data utilized to predict EBV for CB performance resulted in reference populations which were ~10%, 20%, and 30% comprised of CB data as suggested by See et al. (2020). Only genotyped CB animals entered genetic evaluations.
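The five strategies and the two family comparison methods can be illustrated with a small sketch. This is not the authors' implementation: the record layout, the function names, and in particular the rule of taking a fixed number of animals from every family are assumptions made for illustration, since the text does not state how the genotyping quota was allocated across families. Middle is interpreted here as the animals closest to their family mean.

```python
import random
from collections import defaultdict


def pick_from_family(members, strategy, n):
    """Pick n animals from one family according to a phenotype-based strategy."""
    ranked = sorted(members, key=lambda a: a["phenotype"])  # lowest first
    n = min(n, len(ranked))
    if strategy == "Top":
        return ranked[len(ranked) - n:]
    if strategy == "Bottom":
        return ranked[:n]
    if strategy == "Extreme":
        n_low = n // 2       # half from the lower tail
        n_high = n - n_low   # remainder from the upper tail
        return ranked[:n_low] + ranked[len(ranked) - n_high:]
    if strategy == "Middle":
        mean = sum(a["phenotype"] for a in ranked) / len(ranked)
        return sorted(ranked, key=lambda a: abs(a["phenotype"] - mean))[:n]
    raise ValueError(f"unknown strategy: {strategy}")


def select_for_genotyping(animals, strategy, per_family, family_key, seed=1):
    """Return the CB animals chosen for genotyping.

    family_key="litter" mimics FS sampling; family_key="sire" mimics HS sampling.
    """
    families = defaultdict(list)
    for animal in animals:
        families[animal[family_key]].append(animal)
    if strategy == "Random":
        # Random ignores family structure; match the total of the other strategies.
        total = min(len(animals), per_family * len(families))
        return random.Random(seed).sample(animals, total)
    chosen = []
    for members in families.values():
        chosen.extend(pick_from_family(members, strategy, per_family))
    return chosen


# Toy data: 2 sires, 2 litters per sire, 4 piglets per litter (hypothetical values).
rng = random.Random(42)
animals = [
    {"id": f"s{s}_l{l}_p{p}", "sire": s, "litter": (s, l),
     "phenotype": rng.gauss(0.0, 10 ** 0.5)}
    for s in range(2) for l in range(2) for p in range(4)
]
extreme_fs = select_for_genotyping(animals, "Extreme", 2, "litter")
extreme_hs = select_for_genotyping(animals, "Extreme", 2, "sire")
```

In this toy example the FS comparison returns the lowest and highest piglet of each litter, whereas the HS comparison returns the lowest and highest progeny of each sire, so HS sampling draws from a wider within-family range but touches fewer dams.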

The 9 selective genotyping strategies were evaluated for different values of rpc. The values of rpc investigated were 0.1, 0.3, 0.5, 0.7, and 0.9. The selective genotyping strategies were compared at the conclusion of the simulation by the rate of genetic gain in CB animals, accuracy of EBV for CB performance in the PB selection candidates and bias of EBV for CB performance in PB selection candidates. Each genetic correlation scenario and selective genotyping strategy was replicated 10 times.
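The size of the resulting design can be tallied with a short sketch. It assumes the strategies, rpc values, CB inclusion levels, and replicates were fully crossed, and it computes the CB share of the reference population from per-generation record counts (CB genotyped animals plus the 1,680 PB records); the variable names are illustrative.

```python
from itertools import product

# Random plus four phenotype-based strategies, each under HS or FS comparisons.
strategies = ["Random"] + [
    f"{strategy}_{family}"
    for strategy in ("Top", "Bottom", "Extreme", "Middle")
    for family in ("HS", "FS")
]
r_pc_values = (0.1, 0.3, 0.5, 0.7, 0.9)
cb_genotyped = (200, 400, 800)   # CB animals genotyped per generation
replicates = range(1, 11)        # 10 replicates
PB_RECORDS = 1680                # PB animals with genotypes and phenotypes per generation

# Every combination of strategy, rpc, CB inclusion level, and replicate.
scenarios = list(product(strategies, r_pc_values, cb_genotyped, replicates))

# Fraction of each generation's records contributed by CB animals.
cb_share = {n: n / (n + PB_RECORDS) for n in cb_genotyped}

print(len(strategies), len(scenarios))
for n, share in cb_share.items():
    print(n, round(share, 3))
```

The CB shares work out to roughly 0.106, 0.192, and 0.323, which is consistent with the approximately 10%, 20%, and 30% CB composition described above.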

Analysis

Genetic evaluations were conducted using 1 and 2 trait single-step genomic best linear unbiased prediction (ssGBLUP) models using the BLUPf90 suite of programs (Misztal et al., 2002). Univariate models were utilized in generation 1 prior to the production of CB animals. All subsequent models were bivariate. Fixed effects included in the univariate models were the overall mean and generation of birth, while the bivariate models included fixed effects of the overall mean, generation of birth, and breed fraction covariates. In both models, the random effect of animal was included. In the univariate model, solutions for the additive genetic effects were assumed $N(0, H\sigma_{a_1}^2)$, where $\sigma_{a_1}^2$ is the additive genetic variance associated with PB performance. In the bivariate model, the solutions for the additive genetic effects were assumed $MVN(0, \Phi \otimes H)$, where $\otimes$ is the Kronecker product and $\Phi$ is the additive genetic (co)variance matrix of PB and CB performance. In both models, the residuals were assumed to have homogeneous variance. The (co)variance values used in all ssGBLUP evaluations were constant throughout the simulation and were the true values which were initially simulated. $H$ was the blended relationship matrix including pedigree and genomic relationships. Solving the ssGBLUP equations required $H^{-1}$, which was calculated according to Aguilar et al. (2010) and in turn required the inverse of the genomic relationship matrix ($G^{-1}$). As suggested by Lourenco et al. (2016), $G$ was formed without breed-specific adjustments to centering and scaling allele frequencies as $G = WW' / \left(2\sum p_i(1 - p_i)\right)$, first proposed by VanRaden (2008).
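The construction of the genomic relationship matrix can be sketched in a few lines. This is a minimal illustration rather than the BLUPf90 implementation: it assumes genotypes are coded as 0, 1, or 2 copies of an allele, that W is the genotype matrix centered by twice the allele frequency, and that allele frequencies are computed across all genotyped animals regardless of breed. The function name is hypothetical.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (rows = animals)."""
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    # Allele frequency per marker across all animals (no breed-specific adjustment).
    p = [sum(row[j] for row in genotypes) / (2 * n_animals) for j in range(n_markers)]
    # Scaling term: 2 * sum of p_i * (1 - p_i) over markers.
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    # W: genotypes centered by twice the allele frequency.
    w = [[row[j] - 2 * p[j] for j in range(n_markers)] for row in genotypes]
    # G = W W' / scale
    return [
        [sum(w[a][j] * w[b][j] for j in range(n_markers)) / scale
         for b in range(n_animals)]
        for a in range(n_animals)
    ]


# Toy example: 4 animals, 3 markers (hypothetical genotypes).
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G = vanraden_g(genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

Because W is centered with frequencies estimated from the same animals, each row of G sums to zero in this toy example; production software typically blends or adjusts G before inversion, which is not shown here.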

The accuracy of prediction for each selective genotyping strategy was calculated as the correlation between the true breeding values (TBV) and the EBV of selection candidates from a given generation. Bias of EBV was calculated as the mean absolute difference between TBV and EBV of PB selection candidates for CB performance for each selective genotyping strategy. Estimated marginal (EM) means were calculated in R using the emmeans package for genetic gain in CB animals for CB performance (ΔG), EBV accuracy and EBV bias for CB performance in PB selection candidates. Fixed linear models used to produce EM means for ΔG, accuracy and bias of PB EBV for CB performance were fit for each rpc scenario and breed as follows:

$$Y_{ijk} = \mu + \mathrm{Rep}_i + \mathrm{Strat}_j + \mathrm{Prop}_k + e_{ijk}$$

where Y is either the difference in average TBV between the last and first generation of CB animals for CB performance (rate of genetic gain), the correlation between TBV and EBV (accuracy), or the absolute value of the difference between TBV and EBV (bias) for PB selection candidates in the last generation. µ is the overall mean, Rep is the fixed effect of replicate, Strat is the fixed effect of CB genotyping strategy, Prop is the fixed effect of the proportion of CB data selectively genotyped, and e is the random residual.
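The three response variables can be computed from vectors of TBV and EBV as in the sketch below. It follows the definitions given in the text (accuracy as the Pearson correlation between TBV and EBV, bias as the mean absolute difference between TBV and EBV, and genetic gain as the difference in mean TBV between the last and first generation); the function names and the toy values are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ebv_accuracy(tbv, ebv):
    """Accuracy: correlation between true and estimated breeding values."""
    return pearson(tbv, ebv)


def ebv_bias(tbv, ebv):
    """Bias: mean absolute difference between true and estimated breeding values."""
    return sum(abs(t - e) for t, e in zip(tbv, ebv)) / len(tbv)


def genetic_gain(tbv_first_generation, tbv_last_generation):
    """Genetic gain: mean TBV in the last generation minus mean TBV in the first."""
    mean_first = sum(tbv_first_generation) / len(tbv_first_generation)
    mean_last = sum(tbv_last_generation) / len(tbv_last_generation)
    return mean_last - mean_first


# Toy values (hypothetical): EBV shifted by a constant relative to TBV.
tbv = [1.0, 2.0, 3.0, 4.0]
ebv = [1.5, 2.5, 3.5, 4.5]
print(ebv_accuracy(tbv, ebv), ebv_bias(tbv, ebv))
print(genetic_gain([0.0, 1.0, 2.0], [3.0, 4.0, 5.0]))
```

The toy example shows why accuracy and bias are reported separately: a constant shift in EBV leaves the correlation at 1 while still producing a nonzero bias.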

Results and Discussion

Characteristics of simulated data

The simulation yielded 3 PB breeds which were considered to be genetically distinct. Breed differentiation was determined using a principal components analysis, the average absolute difference in allele frequencies between breeds (0.24 on average across all replicates and rpc scenarios), and Wright's FST statistics (0.26 on average across all replicates and rpc scenarios) calculated according to Weir and Hill (2002) using all genotypes from all breeds in the first generation of selection. Breed differentiation in the current simulation was practically the same as that reported by See et al. (2020) using the same crossbreeding scheme.
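One of these differentiation measures, the average absolute difference in allele frequencies between breeds, can be sketched as follows. The text does not specify how the average was taken, so this example assumes an unweighted mean over all breed pairs and all markers, with genotypes coded 0/1/2; the function names and toy genotypes are hypothetical. The FST calculation of Weir and Hill (2002) is more involved and is not reproduced here.

```python
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequency per marker from 0/1/2 genotypes (rows = animals)."""
    n_animals = len(genotypes)
    return [sum(column) / (2 * n_animals) for column in zip(*genotypes)]


def mean_abs_frequency_difference(genotypes_by_breed):
    """Mean absolute allele frequency difference over breed pairs and markers."""
    freqs = {breed: allele_frequencies(g) for breed, g in genotypes_by_breed.items()}
    diffs = []
    for breed_1, breed_2 in combinations(sorted(freqs), 2):
        diffs.extend(abs(p1 - p2) for p1, p2 in zip(freqs[breed_1], freqs[breed_2]))
    return sum(diffs) / len(diffs)


# Toy genotypes: 2 animals per breed, 2 markers (hypothetical values).
genotypes_by_breed = {
    "A": [[0, 0], [0, 2]],
    "B": [[2, 2], [2, 2]],
    "C": [[1, 1], [1, 1]],
}
print(mean_abs_frequency_difference(genotypes_by_breed))
```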

The composition of the reference populations in terms of PB representation in genetic evaluations by CB phenotypes and differences in selected CB phenotypes across genotyping and phenotyping strategies is presented in Figure 1 when rpc=0.5. Regardless of selective genotyping strategy or proportion of CB data included, the number of PB sires represented in genetic evaluations ranged from a low of 45 to all 50 sires utilized for CB animal production. Given the degree of relatedness simulated, it would be expected that nearly all sires from breed C used to produce CB progeny would have PB and CB offspring in genetic evaluations (Figure 1A). Representation of maternal grand-sires of CB progeny (breed A) was consistent across selective genotyping strategies and proportion of CB data scenarios. The scenario which represented the largest number of sires (n = 31) from breed A was Middle_HS selective genotyping of 20% of the CB animals, while Extreme_FS selective genotyping of 5% of CB progeny represented the fewest number of breed A sires (n = 27; Figure 1B). The largest differences in PB representation of genotyped CB animals were expected and observed in the maternal grand dams. By design, large differences were shown in representation of dams from breed B between HS and FS selection schemes when including 10% or less of CB animals in genetic evaluations. The extreme selective genotyping strategy among FS while including 5% of CB animals represented the fewest number of maternal grand dams (n = 85) included while the most maternal grand dams (~186) were represented by HS selective strategies when the proportion of CB included was 20% (Figure 1C).

Figure 1.

Sampling statistics for CB inclusion rates of 5%, 10%, or 20% by genotyping strategy using FS or HS comparisons when the genetic correlation between PB and CB traits was 0.5. (A) Average number of CB sires sampled (Breed C). (B) Average number of CB maternal grandsires sampled (Breed A). (C) Average number of CB maternal granddams sampled (Breed B). (D) Average CB phenotype of selected animals to be included in genetic analyses with genotypes and phenotypes.

In general, CB animals selected to be included using Random, Extreme, and Middle selection strategies, regardless of HS or FS comparison method, had the same average phenotype. As expected, Top and Bottom selective strategies had the highest and lowest average phenotypes, respectively. In general, the deviations in average phenotype among the selective strategies were reduced as the proportion of CB data included increased.

Response to selection

EM means of genetic change in CB animals (ΔG) are reported in Figure 2. In general, as rpc increased ΔG also increased and the differences between selective genotyping strategies decreased. When rpc was < 0.9 selective genotyping impacted (P < 0.05) the ΔG of CB animals. As the proportion of CB data included in an evaluation increased the numerical differences between selective genotyping strategies decreased within a rpc scenario. As expected, when the proportion of CB data increased the response in ΔG across selective genotyping strategies also increased.

Figure 2.

EM means for the genetic gain in CB animals, defined as average TBV in last generation minus average TBV in the first generation of CB animals, under various genetic correlations between PB and CB performance (rpc) scenarios, levels of CB data inclusion and selective genotyping strategies using FS and HS comparisons.

When rpc was <0.9, Extreme selective genotyping increased ΔG compared with Random, Top, Bottom, and Middle selection (P < 0.05), in agreement with previous findings (Boligon et al., 2012; Jiménez-Montero et al., 2012; Chu et al., 2020). The increase in ΔG from Extreme selective genotyping is likely explained by the ability of 2-tailed selection strategies to capture more variation in phenotypes and genotypes compared with single-tailed selection strategies. The efficiency of 2-tailed genotyping and phenotyping selection strategies can be likened to the use of case–control studies in QTL detection (Van Gestel et al., 2000). The use of extreme samples has the ability to enhance the detection of influential SNP. Random resulted in greater (P < 0.05) ΔG when rpc=0.1 compared with Top, Bottom, and Middle selection strategies. When the rpc was 0.3 or 0.5, Random provided a numerical advantage compared with Top and Bottom selection strategies. Contrasts showed that Top, Bottom, and Middle did not differentially impact ΔG (P > 0.05).

In agreement with the current study, Chu et al. (2020) showed that Extreme and Random selective genotyping produced greater rates of genetic gain compared with Top. Other simulation studies suggest that selective genotyping of the best animals caused reductions in prediction accuracy of EBV for animals in future generations compared to random sampling (Ehsani et al., 2010). Boligon et al. (2012) showed that selective genotyping by Extreme yield deviations provided an increase in predictive ability in single and correlated trait selection with a heritability ranging from 0.15 to 0.50. Further, Jiménez-Montero et al. (2012) proposed that divergent genotyping could increase the efficiency of genomic selection compared with genotyping only the best performing animals. Genetic evaluations utilized by Boligon et al. (2012) and Jiménez-Montero et al. (2012) produced genomic EBV from a Bayesian LASSO model which differs from the ssGBLUP model utilized in the current study. Further, strategies previously investigated included selective genotyping by minimizing relationships between selected animals (Boligon et al., 2012). Such selective genotyping strategies were shown to be comparable in terms of predictive ability to random selection. Selective genotyping via Middle or similar methods has not been previously investigated (Ehsani et al., 2010; Boligon et al., 2012; Jiménez-Montero et al., 2012; Chu et al., 2020).

Differences in ΔG between HS and FS comparison methods were limited. Comparisons within FS families coincided with a numerically greater response in ΔG using Middle selection compared with HS family selection. In general, there was a greater representation of the population-wide phenotypic distribution of animals when comparisons were made using FS (within litter) compared with HS (within sire family). Additionally, fewer PB maternal grand-dams were represented by CB progeny in genetic evaluations when HS comparisons were used and when the proportion of CB included was <20%. This lack of representation when using HS comparisons led to a decrease in ΔG in CB animals through a likely increase in prediction error from the limited relationships between CB phenotypes and PB selection candidates.

Accuracy and bias of prediction

At the end of the simulation, the EBV accuracy across PB breeds ranged from 0.08 to 0.72. Selection candidates from breed A and B exhibited a larger degree of variation in EBV accuracy within a CB data inclusion level compared with breed C. Further, EBV accuracy was greater for selection candidates belonging to breed C compared with breed A or B as there was a greater degree of connectedness between CB phenotypes and selection candidates from breed C. In general, as the rpc or the CB data inclusion level increased, the accuracy of EBV also increased, as expected. When the rpc was low to moderate (0.1, 0.3, or 0.5), the increase in CB inclusion level increased EBV accuracy to a greater degree than when rpc was high (0.7 or 0.9), in agreement with previous reports (van Grevenhof and van der Werf, 2015; See et al., 2020). Selective genotyping strategy influenced (P < 0.05) EBV accuracy in PB selection candidates when the rpc was < 0.9.

EM means for EBV accuracy of selection candidates from breed C in generation 6 are presented in Figure 3. Extreme genotyping produced greater EBV accuracy (P < 0.05) in breed C selection candidates compared with Random, Top, Bottom, or Middle selective genotyping strategies when rpc was < 0.9. Random genotyping produced the second highest (P < 0.05) EBV accuracy for breed C selection candidates when the rpc was < 0.7. No statistical differences in EBV accuracy for breed C selection candidates were observed when comparing Top, Bottom, and Middle selection strategies. In agreement with the current study, Boligon et al. (2012) reported that Extreme and Bottom selective genotyping produced the greatest and lowest EBV accuracy, respectively.

Figure 3.

EM means of breeding value accuracy for CB performance from Breed C selection candidates, defined as the correlation between EBV and TBV, under various genetic correlations between PB and CB performance (rpc) scenarios, levels of CB data inclusion and selective genotyping strategies using FS and HS comparisons.

Estimates of EBV bias, defined as the absolute value of TBV minus EBV, for selection candidates from breed C are presented in Figure 4. In general, Random and Extreme selective genotyping resulted in the least amount of EBV bias, while Middle produced a numerically greater EBV bias in breed C selection candidates. Extreme selective genotyping of CB animals resulted in the least EBV bias (P < 0.05) for breed C selection candidates regardless of family sampling, CB data inclusion, or rpc. Among the strategies of Random, Top, Middle and Bottom, Random produced less (P < 0.05) biased estimates of EBV when the rpc<0.5. These results are in line with those previously reported by Jiménez-Montero et al. (2012), where EBV bias in beef cattle was shown to be minimized by two-tailed selective genotyping strategies.

Figure 4.

EM means of EBV bias for CB performance from Breed C selection candidates, defined as the absolute value of the difference between TBV and EBV, under various genetic correlations between PB and CB performance (rpc) scenarios, levels of CB data inclusion and selective genotyping strategies using FS and HS comparisons.

Trends in EBV accuracy in generation 6 from animals in breed A are presented in Figure 5. The average EBV accuracy for breed B was similar to that of breed A (results not shown). In general, Extreme and Middle sampling produced the greatest and lowest EBV accuracy, respectively, for breed A selection candidates. Extreme selective genotyping produced the greatest (P < 0.05) EBV accuracy for breed A selection candidates when rpc was <0.9. Further, Random and Middle_FS genotyping strategies produced numerically larger estimates of EBV accuracy than Top or Bottom genotyping strategies. Trends in EBV bias for selection candidates from breed A were similar to those observed in breed C, with the exception of when rpc=0.1 (Figure 6). In general, bias and differences between strategies increased as the rpc increased. The least amount of bias was observed when using Extreme and Random selective genotyping.

Figure 5.

EM means of breeding value accuracy for CB performance from Breed A selection candidates, defined as the correlation between EBV and TBV, under various genetic correlations between PB and CB performance (rpc) scenarios, levels of CB data inclusion and selective genotyping strategies using FS and HS comparisons.

Figure 6.

EM means of EBV bias for CB performance from Breed A selection candidates, defined as the absolute value of the difference between TBV and EBV, under various genetic correlations between PB and CB performance (rpc) scenarios, levels of CB data inclusion and selective genotyping strategies using FS and HS comparisons.

Unlike comparisons in breed C accuracy and bias, family sampling structure played a large role in EBV accuracy and bias for breed A when the rpc<0.7. When using Top, Bottom or Middle genotyping strategies, using FS comparisons produced greater EBV accuracy than HS comparisons. Family sampling method did not greatly impact EBV bias of breed A selection candidates across all strategies. Extreme selective genotyping was the strategy most affected by family sampling. Using Extreme when rpc was 0.1 or 0.3, HS family sampling resulted in a greater degree of bias compared with FS. Differences in EBV accuracy and bias due to family sampling method are multifaceted. The most likely reason for the differences between HS and FS family sampling methods is the difference in the number of animals used to make comparisons. A greater number of animals are within HS families and there are fewer HS families than FS families. Consequently, HS sampling allows for capturing extreme ends of the phenotypic distribution which are further separated from the family and population mean when compared with FS. Further, increases in accuracy could be caused by the greater connectedness provided by FS comparisons between CB phenotypes and selection candidates from breed A. By utilizing FS comparisons, a greater number of maternal grand-sire families are represented in genetic evaluations with CB phenotypes. While the number of CB maternal grand-sires represented is not drastically different across selective genotyping strategies (Figure 1), maximizing the number of records within CB FS families reduces the noise within the genetic evaluations to a greater degree due to unknown CB maternal pedigrees.

General discussion

In this study, the rate of genetic gain and the difference in accuracy of EBV between selective genotyping strategies are similar to previous reports (Boligon et al., 2012; Howard et al., 2018; Gowane et al., 2019; Chu et al., 2020). Selective genotyping of the top and bottom CB animals (2-tailed sampling) resulted in the greatest rates of genetic gain, lowest EBV bias and highest EBV accuracy when the rpc was moderate to low. Random genotyping of CB animals outperformed selective genotyping strategies which chose CB animals with Top, Bottom, or Middle phenotypes. Reduced bias was achieved with FS family sampling compared with HS, yet limited change was observed in the rate of genetic gain. Differences between FS and HS family sampling can likely be attributed to a greater variation of animals sampled using FS due to making comparisons between a smaller group of animals.

Consideration should be given in selective genotyping schemes to limit the potential increase in inbreeding depression caused by nonrandom selection of genotyped individuals. Pedigree-based rates of inbreeding from the current study favored different selective genotyping strategies by breed (Figure 7). Extreme genotyping yielded the lowest and highest amount of pedigree-based inbreeding in breed C and breed A, respectively. Generally, the rate of inbreeding in breed B was not affected by the family sampling method or selective genotyping, which is to be expected as the number of FS families represented is not greatly impacted across methods due to the crossbreeding structure. Increases in inbreeding have been previously noted in CCPS (Bijma et al., 2001) as they increase the dependency on family information. Such increases in family information required for increased prediction accuracy are best noted in breed A where the increase in FS family structure captured in genetic evaluations both increases EBV accuracy and rate of inbreeding.

Figure 7.

Pedigree-based rate of inbreeding (ΔF; %) of selection candidates from PB lines from selection schemes where CB progeny are selectively genotyped using HS and FS comparisons averaged across several genetic correlation scenarios and CB data inclusion rates.
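As a reminder of what ΔF measures, a commonly used definition of the per-generation rate of inbreeding, and its link to effective population size (Ne), is given below. This is the standard textbook form and is stated here as an assumption; the exact calculation used for Figure 7 is described in the study's methods, which are outside this section. The symbol F_t denotes the average pedigree-based inbreeding coefficient in generation t.

```latex
\Delta F_t = \frac{F_t - F_{t-1}}{1 - F_{t-1}},
\qquad
N_e = \frac{1}{2\,\Delta F}
```

Under this definition, a strategy that raises ΔF in a PB line corresponds to a smaller Ne for that line.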

Previous results suggest that ssGBLUP reduces bias in the prediction of breeding values when using selective genotyping compared with GBLUP (Howard et al., 2018; Gowane et al., 2019). The inclusion of phenotypes from nongenotyped animals through combined pedigree and genomic relationships reduces bias by accounting for preselection of genotyped animals. The mixed model equations utilized by ssGBLUP can account for preselection of data records given that the preselection criterion is a linear function of the phenotype (Henderson, 1975). In the current study, the only CB phenotypes included in the evaluation were from genotyped animals. Selectively genotyping Top, Bottom, or Middle animals produced the greatest amount of EBV bias compared with Random, in agreement with Chu et al. (2020). Using true variance components produces less biased EBV when using selective genotyping in ssGBLUP (Gowane et al., 2019; Chu et al., 2020); however, true variance components are not obtainable outside of simulation. In the current study, the phenotypic variation among selected CB animals differed drastically depending upon the selective genotyping strategy employed. Wang et al. (2020) evaluated the bias in variance component estimation and predictive ability using both traditional pedigree-based relationship matrices and single-step relationship matrices in broilers. When selectively genotyping the top 20% of broilers using single-step relationship matrices, variance components were highly overestimated. Further, the predictive ability was worse when using variance components which were estimated using selectively genotyped data compared with randomly selected genotyped individuals. Consequently, the predictive ability was greatest for genotyped animals using a random genotyping strategy. Such results are in agreement with those of the current study considering Extreme selective genotyping was not considered by Wang et al. (2020).
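The point about phenotypic variation among the selected CB animals can be made concrete with a small Python sketch. It is an illustration only: phenotypes are drawn from a standard normal distribution, selection is applied across the whole group rather than within HS or FS families, and the function `pick` is hypothetical. The group size (4,000) and the number genotyped (400) follow one of the scenarios described in the abstract.

```python
import random
import statistics

def pick(phenotypes, n, strategy, rng):
    """Return the n phenotypes chosen under a selective genotyping strategy."""
    ranked = sorted(phenotypes)
    size = len(ranked)
    if strategy == "Random":
        return rng.sample(ranked, n)
    if strategy == "Top":
        return ranked[size - n:]
    if strategy == "Bottom":
        return ranked[:n]
    if strategy == "Extreme":
        low = n // 2                          # half lowest, half highest
        return ranked[:low] + ranked[size - (n - low):]
    if strategy == "Middle":
        start = (size - n) // 2               # centered on the median
        return ranked[start:start + n]
    raise ValueError(f"unknown strategy: {strategy}")

rng = random.Random(2021)                     # fixed seed for repeatability
phenotypes = [rng.gauss(0.0, 1.0) for _ in range(4000)]

strategies = ("Random", "Top", "Bottom", "Extreme", "Middle")
variance = {
    s: statistics.variance(pick(phenotypes, 400, s, rng)) for s in strategies
}
for name, value in variance.items():
    print(f"{name:8s} variance of selected phenotypes = {value:.3f}")
```

Under these assumptions, one-tailed and Middle sampling retain far less phenotypic variance than Random, while Extreme retains more, which is consistent with variance components being difficult to estimate from selectively genotyped records alone.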

The effectiveness of crossbreeding schemes is dependent upon many factors which were not the subject of the current investigation. Non-additive genetic effects such as dominance, which is the most likely cause of heterotic effects in swine breeding, could impact the effectiveness of CCPS and selective genotyping schemes. In comparison with previous reports using ssGBLUP models, the current study produced a greater predictive ability across breeds regardless of selection strategy (Pocrnic et al., 2019). Differences in across-breed predictive ability between the current study and Pocrnic et al. (2019) are likely due to the low genetic correlation between the PB breeds in that study. In contrast, the current study assumed that PB traits were genetically identical, which likely results in an increased effectiveness of CCPS. When dominance is present, it has been shown that further benefits from CCPS could be realized by fitting genetic models which include dominance by accounting for breed-of-origin alleles (Esfandyari et al., 2015; Duenk et al., 2019). If CB traits utilized in the current study were influenced by dominance, further increases in genetic gain and prediction accuracy could be realized by incorporating dominance into ssGBLUP models (Christensen et al., 2014, 2019).

The results obtained by this simulation study can be of practical use in swine crossbreeding systems. When incorporating a proportion of CB data into genetic evaluations which are historically dominated by PB nucleus records, selective genotyping becomes an area of interest in balancing the trade-off between the contribution of CB data and the cost of data collection. Random collection of CB data records is likely the most cost-efficient due to ease of implementation, yet gains in EBV accuracy and genetic gain could be achieved by collecting data on the extreme ends of the CB phenotypic distribution of interest. The current study has shown that Extreme selective genotyping increased EBV accuracy for CB sires and maternal grandparents and increased the genetic gain in CB performance compared with Random genotyping when the correlation between PB and CB performance is <0.9. In the current study, only 1 heritability scenario was investigated; however, results from the current study can be adapted to other traits with heritabilities that differ from 0.40 through the inclusion of additional CB and PB data (Solberg et al., 2008; Chu et al., 2020).

Reliable estimates of variance components are needed to produce unbiased EBV in swine selection schemes when implementing selective genotyping where not all phenotypes are included. Selective genotyping of CB animals likely introduces bias in the estimates of (co)variances related to CB performance. As such, further research is warranted to determine the effect of selective genotyping on the estimates of variance components and the subsequent impacts on EBV bias in swine crossbreeding schemes utilizing CB data to inform PB selection decisions.

Glossary

Abbreviations

CB

crossbred

CCPS

combined crossbred purebred selection

EBV

estimated breeding value

EM

estimated marginal

FS

comparisons between full-sibs

HS

comparisons between half-sibs

Ne

effective population size

PB

purebred

QTL

quantitative trait loci

rpc

genetic correlation between purebred and crossbred performance

ssGBLUP

single-step genomic best linear unbiased prediction

TBV

true breeding value

ΔG

genetic gain in crossbred performance

Conflict of interest statement

The authors declare no real or perceived conflicts of interest.

Literature Cited

  1. Aguilar, I., I. Misztal, D. L. Johnson, A. Legarra, S. Tsuruta, and T. J. Lawlor. 2010. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J. Dairy Sci. 93:743–752. doi: 10.3168/jds.2009-2730
  2. Bijma, P., I. A. Woolliams, and J. A. M. van Arendonk. 2001. Genetic gain of pure line selection and combined crossbred purebred selection with constrained inbreeding. Anim. Sci. 72:225–232. doi: 10.1017/s135772980005715
  3. Boligon, A. A., N. Long, L. G. Albuquerque, K. A. Weigel, D. Gianola, and G. J. Rosa. 2012. Comparison of selective genotyping strategies for prediction of breeding values in a population undergoing selection. J. Anim. Sci. 90:4716–4722. doi: 10.2527/jas.2012-4857
  4. Chen, G. K., P. Marjoram, and J. D. Wall. 2009. Fast and flexible simulation of DNA sequence data. Genome Res. 19:136–142. doi: 10.1101/gr.083634.108
  5. Christensen, O. F., P. Madsen, B. Nielsen, and G. Su. 2014. Genomic evaluation of both purebred and crossbred performances. Genet. Sel. Evol. 46:23. doi: 10.1186/1297-9686-46-23
  6. Christensen, O. F., B. Nielsen, G. Su, T. Xiang, P. Madsen, T. Ostersen, I. Velander, and A. B. Strathe. 2019. A bivariate genomic model with additive, dominance and inbreeding depression effects for sire line and three-way crossbred pigs. Genet. Sel. Evol. 51:45. doi: 10.1186/s12711-019-0486-2
  7. Chu, T. T., A. C. Sørensen, M. S. Lund, K. Meier, T. Nielsen, and G. Su. 2020. Phenotypically selective genotyping realizes more genetic gains in a rainbow trout breeding program in the presence of genotype-by-environment interactions. Front. Genet. 11:866. doi: 10.3389/fgene.2020.00866
  8. Dekkers, J. C. M. 2007. Marker-assisted selection for commercial crossbred performance. J. Anim. Sci. 85:2104–2114. doi: 10.2527/jas.2006-683
  9. Duenk, P., M. P. L. Calus, Y. C. J. Wientjes, V. P. Breen, J. M. Henshall, R. Hawken, and P. Bijma. 2019. Validation of genomic predictions for body weight in broilers using crossbred information and considering breed-of-origin of alleles. Genet. Sel. Evol. 51:38. doi: 10.1186/s12711-019-0481-7
  10. Ehsani, A., L. Janss, and O. F. Christensen. 2010. Effects of selective genotyping on genomic prediction. In: World Congress on Genetics Applied to Livestock Production Abstract; Leipzig, Germany, (No. 444, p. 2–7).
  11. Esfandyari, H., A. C. Sørensen, and P. Bijma. 2015. A crossbred reference population can improve the response to genomic selection for crossbred performance. Genet. Sel. Evol. 47:76. doi: 10.1186/s12711-015-0155-z
  12. Gaynor, R. C. G., G. Gorjanc, E. Wilson, D. Money, and J. M. Hickey. 2019. AlphaSimR: Breeding Program Simulations. https://CRAN.R-project.org/package=AlphaSimR, R package version 0.9.0. Accessed October 2019.
  13. Gowane, G. R., S. H. Lee, S. Clark, N. Moghaddar, H. A. Al-Mamun, and J. H. J. van der Werf. 2019. Effect of selection and selective genotyping for creation of reference on bias and accuracy of genomic prediction. J. Anim. Breed. Genet. 136:390–407. doi: 10.1111/jbg.12420
  14. van Grevenhof, I. E., and J. H. van der Werf. 2015. Design of reference populations for genomic selection in crossbreeding programs. Genet. Sel. Evol. 47:14. doi: 10.1186/s12711-015-0104-x
  15. Henderson, C. R. 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423–447. doi: 10.2307/2529430
  16. Hidalgo, A. M., J. W. Bastiaansen, M. S. Lopes, M. P. Calus, and D. J. de Koning. 2016. Accuracy of genomic prediction of purebreds for cross bred performance in pigs. J. Anim. Breed. Genet. 133:443–451. doi: 10.1111/jbg.12214
  17. Howard, J. T., T. A. Rathje, C. E. Bruns, D. F. Wilson-Wells, S. D. Kachman, and M. L. Spangler. 2018. The impact of selective genotyping on the response to selection using single-step genomic best linear unbiased prediction. J. Anim. Sci. 96:4532–4542. doi: 10.1093/jas/sky330
  18. Jiménez-Montero, J. A., O. González-Recio, and R. Alenda. 2012. Genotyping strategies for genomic selection in small dairy cattle populations. Animal 6:1216–1224. doi: 10.1017/S1751731112000341
  19. Knol, E. F., B. Nielsen, and P. W. Knap. 2016. Genomic selection in commercial pig breeding. Anim. Front. 6:15–22. doi: 10.2527/af.2016-0003
  20. Lourenco, D. A., S. Tsuruta, B. O. Fragomeni, C. Y. Chen, W. O. Herring, and I. Misztal. 2016. Crossbreed evaluations in single-step genomic best linear unbiased predictor using adjusted realized relationship matrices. J. Anim. Sci. 94:909–919. doi: 10.2527/jas.2015-9748
  21. Misztal, I., S. Tsuruta, T. Strabel, B. Auvray, T. Druet, and D. H. Lee. 2002. BLUPF90 and related programs (GF90). In: Proceedings of the 7th World Congress on Genetics Applied to Livestock Production; Montpellier, France; p. 743–744.
  22. Pocrnic, I., D. A. Lourenco, C. Y. Chen, W. O. Herring, and I. Misztal. 2019. Crossbred evaluations using single-step genomic BLUP and algorithm for proven and young with different sources of data. J. Anim. Sci. 97:1513–1522. doi: 10.1093/jas/skz042
  23. See, G. M., B. E. Mote, and M. L. Spangler. 2020. Impact of inclusion rates of crossbred phenotypes and genotypes in nucleus selection programs. J. Anim. Sci. 98:1–13. doi: 10.1093/jas/skaa360
  24. Sewel, A., H. Li, C. Schwab, C. Maltecca, and F. Tiezzi. 2018. On the value of genotyping terminal crossbred pigs for nucleus genomic selection for carcass traits. In: Proceedings of the World Congress on Genetics Applied to Livestock Production; Auckland, New Zealand, (No. 11, p. 755).
  25. Solberg, T. R., A. K. Sonesson, J. A. Woolliams, and T. H. Meuwissen. 2008. Genomic selection using different marker types and densities. J. Anim. Sci. 86:2447–2454. doi: 10.2527/jas.2007-0010
  26. Van Gestel, S., J. J. Houwing-Duistermaat, R. Adolfsson, C. M. van Duijn, and C. Van Broeckhoven. 2000. Power of selective genotyping in genetic association analyses of quantitative traits. Behav. Genet. 30:141–146. doi: 10.1023/a:1001907321955
  27. VanRaden, P. M. 2008. Efficient methods to compute genomic predictions. J. Dairy Sci. 91:4414–4423. doi: 10.3168/jds.2007-0980
  28. Wang, L., L. L. Janss, P. Madsen, J. Henshall, C. H. Huang, D. Marois, S. Alemu, A. C. Sørensen, and J. Jensen. 2020. Effect of genomic selection and genotyping strategy on estimation of variance components in animal models using different relationship matrices. Genet. Sel. Evol. 52:31. doi: 10.1186/s12711-020-00550-w
  29. Weir, B. S., and W. G. Hill. 2002. Estimating F-statistics. Annu. Rev. Genet. 36:721–750. doi: 10.1146/annurev.genet.36.050802.093940
  30. Wientjes, Y. C. J., and M. P. L. Calus. 2017. Board invited review: the purebred-crossbred correlation in pigs: a review of theory, estimates, and implications. J. Anim. Sci. 95:3467–3478. doi: 10.2527/jas.2017.1669
  31. Zeng, J., A. Toosi, R. L. Fernando, J. C. Dekkers, and D. J. Garrick. 2013. Genomic selection of purebred animals for crossbred performance in the presence of dominant gene action. Genet. Sel. Evol. 45:11. doi: 10.1186/1297-9686-45-11
  32. Zhao, Y., M. Gowda, F. H. Longin, T. Würschum, N. Ranc, and J. C. Reif. 2012. Impact of selective genotyping in the training population on accuracy and bias of genomic selection. Theor. Appl. Genet. 125:707–713. doi: 10.1007/s00122-012-1862-2

Articles from Journal of Animal Science are provided here courtesy of Oxford University Press
