Skip to main content
G3: Genes | Genomes | Genetics logoLink to G3: Genes | Genomes | Genetics
. 2019 Jun 6;9(8):2463–2475. doi: 10.1534/g3.118.200986

Genomic Selection with Allele Dosage in Panicum maximum Jacq.

Letícia A de C Lara *, Mateus F Santos , Liana Jank , Lucimara Chiari , Mariane de M Vilela , Rodrigo R Amadeu *, Jhonathan P R dos Santos *, Guilherme da S Pereira , Zhao-Bang Zeng , Antonio Augusto F Garcia *
PMCID: PMC6686918  PMID: 31171567

Abstract

Genomic selection is an efficient approach to get shorter breeding cycles in recurrent selection programs and greater genetic gains with selection of superior individuals. Despite advances in genotyping techniques, genetic studies for polyploid species have been limited to a rough approximation of studies in diploid species. The major challenge is to distinguish the different types of heterozygotes present in polyploid populations. In this work, we evaluated different genomic prediction models applied to a recurrent selection population of 530 genotypes of Panicum maximum, an autotetraploid forage grass. We also investigated the effect of the allele dosage in the prediction, i.e., considering tetraploid (GS-TD) or diploid (GS-DD) allele dosage. A longitudinal linear mixed model was fitted for each one of the six phenotypic traits, considering different covariance matrices for genetic and residual effects. A total of 41,424 genotyping-by-sequencing markers were obtained using 96-plex and Pst1 restriction enzyme, and quantitative genotype calling was performed. Six predictive models were generalized to tetraploid species and predictive ability was estimated by a replicated fivefold cross-validation process. GS-TD and GS-DD models were performed considering 1,223 informative markers. Overall, GS-TD data yielded higher predictive abilities than with GS-DD data. However, different predictive models had similar predictive ability performance. In this work, we provide bioinformatic and modeling guidelines to consider tetraploid dosage and observed that genomic selection may lead to additional gains in recurrent selection program of P. maximum.

Keywords: Plant Breeding, Guinea Grass, Quantitative Genotyping, Polyploidy, Genotyping-by-sequencing (GBS), Recurrent Genomic Selection, Genomic Prediction, GenPred, Shared Data Resources


Many agricultural and forage crops of economic importance are autotetraploids, such as potato (Solanum tuberosum) (Allard 1960), alfalfa (Medicago sativa) (McCoy and Bingham 1988), and guinea grass (Panicum maximum Jacq.) (Warmke 1954; Jank et al. 2014). For these species, marker alleles can be represented with different dosages ranging from 0 to 4. This allele dosage refers to the number of copies of the reference allele, ranging from aaaa for nulliplex, Aaaa for simplex (single dose), and so on up to AAAA (quadruplex), where A is the reference allele, for a biallelic marker, such as Single Nucleotide Polymorphisms (SNPs).

Historically, genetic analysis in polyploids has been based on single dose markers, mostly for building genetic maps (Wu et al. 1992; Hackett and Luo 2003). With the development of Next-Generation Sequencing (NGS) technologies and the advance of genetic and statistical methods, new possibilities have arisen to enable studies of the complex polyploid genomes for many crops (Garcia et al. 2013). In these species, the evaluation of SNP throughout the genome allows one to assess the relative abundance of each allele (Voorrips et al. 2011; Serang et al. 2012; Garcia et al. 2013; Hackett et al. 2013; Mollinari and Serang 2013; Schmitz Carley et al. 2017; Gerard et al. 2018).

According to Osborn et al. (2003), the allele dosage is expected to be related to different expression levels of some target traits. Thus, polyploidy can increase the potential variation in its genic expression, reflecting in phenotypic variation. Therefore, the inclusion of allele dosage information has become important for genetic studies in polyploid species (Garcia et al. 2013; Hackett et al. 2013; Endelman et al. 2018). In addition, it will allow the development of new statistical models in autopolyploid species, such as in the forage P. maximum.

Panicum maximum Jacq. (Syn. Megathyrsus maximum Jacq. B. K. Simon & S. W. L. Jacobs) is a perennial tropical autotetraploid forage grass (2n=4x=32) that reproduces by facultative apomixis (Warmke 1954), which means a high percentage of asexual reproduction by seeds. Cross-breeding started with the doubling of chromosomes by colchicine of rare diploid sexual plants (2n=2x=16) found in the center of origin in East Africa (Pernès 1975). This enabled the successful crossing between apomictic and sexual plants on a tetraploid level, giving rise to a progeny of sexual and apomictic hybrids in a 1:1 ratio (Savidan 1978). This means that it is possible to fix the hybrid vigor by apomixis in the F1 generation. Also, superior sexual hybrids can be used as new parents in sexual × apomictic crosses or as donors of favorable alleles in sexual populations.

Few breeding generations separate the founder diploid accessions from the current sexual parents, and increase in the frequency of favorable alleles by selection for traits such as disease resistance, regrowth, biomass and seasonal yield, and nutritive value is necessary. Currently, conventional methods such as half-sib recurrent selection (RS) scheme (Hallauer and Carena 2012) have been carried out in sexual populations to increase the general combining ability of sexual parents. However, RS is a long term program as each cycle requires from three to five years for developing, evaluating, selecting and recombining superior plants. The evaluation phase needs two years for phenotyping the target traits in different harvests and seasons (rainy and dry seasons), and is the most time and expensive phase of the RS program. Thus, methods that accelerate the RS cycles will speed up the development of superior sexual parents for crossing with apomictic plants and increase the probability of releasing improved apomictic cultivars in the market. This is one of the goals of EMBRAPA, the Brazilian Agricultural Research Corporation. The unit Embrapa Beef Cattle is responsible for forage breeding programs, where the main crops are Brachiaria spp. and P. maximum (Jank et al. 2011). This research was established into the P. maximum recurrent genomic selection program.

Initially proposed by Meuwissen et al. (2001), genomic selection (GS) is an approach that uses statistical methods to predict breeding values from markers distributed throughout the genome. It uses all markers simultaneously to calculate the genomic estimated breeding values (GEBVs) of individuals for ranking and progeny selection (Meuwissen et al. 2001; Bernardo 2014). GS has recently been proposed as a useful tool for rapid genetic gains in RS (Massman et al. 2013; Müller et al. 2017; Zhang et al. 2017). It is a promising approach to accelerate cycles of recurrent selection since an established prediction equation can be used repeatedly for multiple cycles of selection (Müller et al. 2017). Furthermore, GS promises to increase genetic gain per unit of time, reduce costs for phenotyping, and increase the accuracy of selection.

Most of the available GS models were developed for diploids, and are not well adapted and evaluated for polyploid species. The huge challenge in working with polyploid species is in the correct distinction among different types of heterozygotes. Hidden heterozygotes lead to pseudo-diploid models, that are commonly applied in polyploid species (Annicchiarico et al. 2015; Biazzi et al. 2017). However, the impact of this approximation for GS models has only been reported in a few studies (Endelman et al. 2018; Oliveira et al. 2019).

The goal of this research was to develop and to evaluate predictive models for genomic selection in P. maximum. For this purpose, we compared different predictive models considering tetraploid and diploid allele dosage to obtain the GEBVs. The genotype calling was performed to allow the identification of different dose levels, showing all heterozygous genotypes and six different GS models were extended to incorporate the tetraploid allele dosage. To our knowledge, this is the first study that includes tetraploid dosage in whole-genome regression models for genomic selection in this tropical forage grass, combined with a high throughput genotyping in a sexual recurrent selection population of P. maximum.

Materials and Methods

Panicum maximum population

We generated P. maximum multi-parental population using 20 selected plants (JA, S7, S13, S16, A42, B87, T103, T4610, A47, A72, B107, C48, C16, B22, Y34, C54, B74, B96, BX4, and B103) as male parents and 19 sexual plants (the same set of plants except B107) as female parents in a polycross mating design. These parents were selected based on relevant agronomic performance in the forage breeding program at Embrapa Beef Cattle during the last 30 years. After the crossing process, we synthesized 19 half-sibs progeny families, each family composed of 30 individuals, resulting in a total of 570 tetraploid sexual plants.

Phenotypic evaluation

The multi-parental population was evaluated in a split-plot randomized complete block design with six replications at Embrapa Beef Cattle, in Campo Grande city, Mato Grosso do Sul state, Brazil (2027’S, 5437’W, 530m). Half-sib progenies (represented by each female parent) were treated as the whole plot factor, while individuals were treated as sub-plot factors. Experimental units were 5 m by 2 m sized plots containing 5 individual plants of a family or 5 clones of one of three apomictic cultivars (B107, Mombaça, and Tanzânia). Thus, each block consisted of a total of 22 (19 half-sib families and three check cultivars) and a total of 110 individual plants (Figure S1). Therefore 660 plants were evaluated in the experiment, in which 570 were individuals from the 19 half-sib progenies and 90 were clones from the apomictic cultivars.

Each individual plant was evaluated for six traits: leaf dry matter (LDM - g/plant), regrowth capacity (RC), percentage of leaf blade (PLB - %), organic matter (OM - %), crude protein (CP - %), and in vitro digestibility of organic matter (IVD - %). The RC trait was evaluated seven days after the harvest using density notes and speed notes of sprouted stems. We considered these traits due their agronomic importance for forage breeding, especially for better forage quality properties for animal consumption. Thus, forage value is measured indirectly by being converted into animal products (such as meat, milk, leather, and furs) (Jank et al. 2011). The first three traits were evaluated for eight harvests, during the years of 2013 (four harvests), 2014 (one harvest), and 2015 (three harvests), and the last three were evaluated for four harvests, during the years of 2013 and 2015 (two harvests each).

Statistical analysis of phenotypic data

For each trait, we fitted the following longitudinal linear mixed model to obtain predicted means later used in genomic selection analysis. The model is

yijkl=μ+hl+bk(l)+pj(l)+bpkj(l)+til+eijkl (1)

where yijkl was the phenotypic value of the ith plant from the parent j in the block k at the harvest l, μ was the fixed overall mean, hl was the fixed effect of the lth harvest (l=1,,L, with L=8 for LDM, RC, and PLB traits, and L=4 for OM, CP, and IVD traits), bk(l) was the random effect of the kth block (k=1,,K, with K=6) at the harvest l, pj(l) was the random effect of the jth parent (j=1,,J, with J=22) at the harvest l, bpkj(l) was the random interaction effect of block k and parent j at the harvest l, til was the effect of the ith plant (i=1,,I, with I=573) at the harvest l, and eijkl was the random environmental error. The plant effects (til) were separated into two groups, in which gil was the random effect of the ith individual genotype (i=1,,Ig, with Ig=570) at the harvest l, and cil was the fixed effect of the ith apomictic cultivar (i=Ig+1,,Ig+Ic, with Ic=3) at the harvest l. For genotypes, the vector g=(g11,,gIgL) was assumed to follow a multivariate normal distribution with zero mean and genetic VCOV matrix G=GLIIg, i.e., gN(0,G). For residuals, eN(0,R), where e=(e1111,,eIJKL) and R=RLIIJK. For parents, pj(kl)N(0,σp2) and for blocks, bk(l)N(0,σb2).

The choice of the genetic (GL) and residual (RL) VCOV matrices was performed hierarchically as follows. First, we picked the model with the best fit for the VCOV structures of the genetic effects among nine different VCOV structures (Table 1). This was done assuming an identity matrix (ID) as the VCOV matrix of the residual effects. Then, this picked model was evaluated assuming nine different VCOV structures for residual effects. Finally, the model with the best fit for the VCOV structure of the residual effect was selected. The models’ goodness of fitness were computed with Akaike Information Criterion (AIC) (Akaike 1974) and Bayesian Information Criterion (BIC) (Schwarz 1978). These analyses were performed in the R package ASReml-R (Butler et al. 2009). The generalized heritability (broad-sense heritability) for each trait was calculated using the same model as before mentioned but considering GL matrix as CS and RL as ID, according to Cullis et al. (2006) as:

H^C2=1PEV2σG2 (2)

where, PEV is the prediction error variance (average variance of individual plant comparisons) and σG2 is the genetic variance.

Table 1. Variance and covariance structures examined for genetic (GL matrix) and residual effects (RL matrix).

Model nPARa Description
ID 1 Identical variation
DIAG L Heterogeneous variation
CS 2 Compound symmetry with homogeneous variation
CSHet L + 1 Compound symmetry with heterogeneous variation
AR1 2 First-order autoregressive model with homogeneous variation
AR1Het L + 1 First-order autoregressive model with heterogeneous variation
Po 2 Power model with homogeneous variation
PoHet L + 1 Power model with heterogeneous variation
US L(L+1)/2 Unstructured model
a

The number of parameters for the models, where L is the number of harvests.

Molecular data

Due to losses of individuals in the field, we sequenced a total of 530 offspring. DNA of these 530 offspring were extracted using the DNeasy Plant kit (QIAGEN) and sequenced along with female parents each repeated twice. To provide a higher sequence depth, genotyping-by-sequencing (GBS) was conducted in NextSeq 500 platform for 96-plex Pst1 libraries and following the protocol from Genomic Diversity Facility, Cornell University (Elshire et al. 2011). Raw data were analyzed using Tassel-GBS pipeline (Glaubitz et al. 2014) modified to obtain the exact read counts for each SNP allele (Pereira et al. 2018). As this pipeline requires a reference genome and P. maximum does not have one, we aligned the GBS tags against six pseudo-references from other diploid and tetraploid forage genomes, and P. maximum transcriptomes: (i) Panicum hallii genome (v. 2.0, DOE-JGI; 554 Mb arranged in 9 chromosomes and 8,405 scaffolds; diploid forage); (ii) Panicum virgatum genome (v 1.0, DOE-JGI; 1,230 Mb arranged in total of 18 chromosomes, 9 chromosomes named as A and B each, and 220,646 contigs; allotetraploid forage); (iii) Setaria italica genome (v 2.2; 405.7 Mb arranged in 336 scaffolds; diploid forage) (Bennetzen et al. 2012); (iv) Setaria viridis genome (v 1.0, DOE-JGI; 394.9 Mb arranged in 9 chromosomes and 724 scaffolds; diploid forage); (v) transcriptome of P. maximum obtained by EMBRAPA institution (43,803 transcripts with mean length of 841.334 bases); and (vi) transcriptome of P. maximum obtained by UNICAMP institution (138,853 transcripts with mean length of 695.658 bases). All the genomes are available in Phytozome website (http://www.phytozome.jgi.doe.gov/) (Goodstein et al. 2011). The transcriptomes were provided directly by the mentioned institutions. The transcriptome obtained by UNICAMP was published by Toledo-Silva et al. (2013). The transcriptome obtained by EMBRAPA has not yet been published. Bowtie2 algorithm (Langmead and Salzberg 2012) was used to align tags against each reference with -D and -R parameters defined as 20 and 4, respectively, and with very-sensitive-local argument. Common reads that aligned to the different genomes were called as different SNP markers. Then, we performed a preliminary analysis evaluating the prediction accuracy of models containing all the markers against models containing only non-redundant markers in genomic prediction models. As predictive accuracies between them did not differ, we chose to use all the markers in the subsequent analyses.

In Tassel-GBS pipeline, the minimum minor allele frequency (mnMAF) considered was 1%. The counting information derived from the Tassel-GBS pipeline was used to estimate the tetraploid allele dosage for each loci using SuperMASSA (Serang et al. 2012; Pereira et al. 2018). This software is designed especially for SNP calling in polyploid species based on probabilistic graphical models. In SuperMASSA software, the minimum overall depth considered was 25 reads and the model used was “Generalized Population Model”. Markers were fitted and filtered to ploidy 4. Triallelic SNPs and markers with more than 5% of missing data were filtered out. Since the selection of markers with up to 5% of missing data are a very strict filter, it is expected to include few percentage of errors due to the imputation process. The imputation was made using random sampling considering the frequency of each dose within each marker. Redundancy among markers was analyzed and displayed using R package circlize (Gu et al. 2014). The name of the markers was formed by: reference genome plus chromosome number plus position of SNP in the chromosome.

As linkage disequilibrium (LD) can affect the prediction accuracy of GS, the LD was estimated using squared Pearson correlation, r2 (Vos et al. 2017). Correlations were calculated on tetraploid dosage (0, 1, 2, 3, and 4) between marker pairs with up 1500 bp of distance for each reference genome. The average r2 between adjacent markers were calculated and, subsequently, the pairwise correlations were pooled over all chromosomes for each reference genome. A principal component analysis (PCA) was performed on the genotype data to detect population structure and displayed using ggplot2 (Wickham 2009). The relatedness among individuals was assessed computing (i) a pedigree-based additive relationship matrix (Amadeu et al. 2016), and (ii) a marker-based genomic relationship matrix (VanRaden 2008) using the R package AGHmatrix (Amadeu et al. 2016).

Genomic prediction models using tetraploid dosage

Here, we generalized well known GS models for diploid to tetraploid species using the information of tetraploid allele dosage. Our response variable was the marginal predicted values of individual plants across harvests (BLUPs considering all harvests simultaneously). We evaluated both bayesian and frequentist approaches: Bayesian Ridge Regression (BRR) (Whittaker et al. 2000; Meuwissen et al. 2001); Bayes A (BA) (Meuwissen et al. 2001); Bayes B (BB) (Meuwissen et al. 2001); Bayes C (BC) (Habier et al. 2011); Bayesian LASSO (BL) (Park and Casella 2008); and Genomic Best Linear Unbiased Predictor (GBLUP) (VanRaden 2008).

All bayesian models expanded for tetraploid allele dosage (TD models) share the same predictive multiple linear regression model (a complete description of these models can be seen at Pérez and de los Campos (2014) and de los Campos et al. (2013)),

y=1nμ++ε (3)

where, y (n×1) is the n adjusted entry mean response vector (mean values across harvests obtained from the phenotypic analysis); 1n is a vector of 1’s; μ is a scalar representing the population mean; X (n×p) is the tetraploid allele dosage incidence matrix of p marker loci coded as xij{0,1,2,3,4} according to the copy number of the reference allele for individual i at marker j; β (p×1) is the vector of (unknown) marker with tetraploid dosage genetic (TDG) effects; and ε (n×1) is the vector of residual effects, εMVN(0n,Iσε2).

Different assumptions of TDG effects were evaluated. BRR-TD model assumes that all marker loci share the same normal prior distribution, βj|σβ2N(0,σβ2), where the common genetic variance hyperparameter (σβ2) follows a scaled inverse chi-squared hyperprior distribution σβ2|d.f.β,Sβχ2(d.f.β,Sβ), where d.f.β is the number of degrees of freedom and Sβ is the scale parameter of the distribution.

BA-TD model is an extension of the above model, which assumes that each TDG prior effects follows specific normal densities, βj|σβj2N(0,σβj2). As before, each specific genetic variance hyperparameter (σβj2) follows a scaled inverse chi-squared distribution. Due to its property, we expect that BA-TD model tends to shrink TDG effects with different prior strength (desirable for highly parameterized models, pn), as opposed to the BRR-TD that assumes a common genetic variance hyperparameter. Genetically, these assumptions mean that the analyzed traits are controlled by many genes of small effects and few genes of large effects.

BB-TD model is an extension of BA-TD model, that takes into account the TDG effects prior as a mixture of two normal densities, βj|δ,σβj2,ππN(0,δ)+(1π)N(0,σβj2), where δ is the genetic variance hyperparameter assumed as a known infinitesimal small value. In this model, we can interpret the mixture proportion (π) as a known expectation of a Bernoulli random variable, that is, the expectation of which mixture component best describes the TDG effects (Dos Santos et al. 2016).

BC-TD model is a parsimonious variant of the BB-TD model, which considers that all TDG effects follows a common mixture of two normal distributions, βj|δ,σβ2,ππN(0,δ)+(1π)N(0,σβ2).

BL-TD model assumes that all markers follows specific normal priors, with genetic variance hyperprior given by the product σε2σβj2. However, the key difference of BL-TD is the assumption that one component of genetic variance hyperparameters follows exponential distributions, σβj2|λβExp(λβ). The hyperparameter λβ measures the knowledge (precision) about the genetic variance hyperparameter. As the usual procedure for all the above models, we assume the residual genetic variance hyperparameter follows σε2|d.f.ε,Sεχ2(d.f.ε,Sε).

The frequentist model GBLUP-TD was:

y=1nμ+Zg+ε (4)

where y, 1n, μ, and εij are the same as previously defined; Z (n×n) is an identity indicence matrix, and g (n×1) is the vector mapping the individuals total dosage genetic effects (random effect). The GBLUP-TD model assumes that the random variable g follows a multivariate normal distribution, gMVN(0n,Kσg2), where σg2 represents the genetic variance of the population and K=0.99K+0.01A, where K is the genomic relationship matrix and A the additive relationship matrix (VanRaden 2008; Aguilar et al. 2010; Isik et al. 2017). This normalization is necessary to obtain a invertible genomic relationship matrix (Isik et al. 2017) and the use of such weight does not result in critical differences in the outputs (VanRaden 2008; Aguilar et al. 2010). The K matrix, for tetraploid organisms, can be expressed as K=WWTtr(WWT)/n, were W is the centered marker score matrix. Each kii on K can be interpreted as a correlation between genotypes of different individuals (genomic relationship), and each kii as the correlation of the genotypes of an individual with itself (inbreeding). K and A matrices were computated using R package AGHmatrix (Amadeu et al. 2016).

To fit all bayesian models, we used the R package BGLR (Pérez and de los Campos 2014), choosing the default package settings for all known hyperparameters. To obtain the posterior distribution of the unknown parameters and hyperparameters, we used the Gibbs sampler with 20,000 iterations; the first 2,000 cycles were discarded as burn in. The GBLUP-TD model was analyzed using the R package ASReml-R (Butler et al. 2009).

Model evaluation

The best statistical model was selected for each of the phenotypic traits. Cross-validation with fivefolds has been repeated 100 times for bayesian approaches (computationally intensive) and 1,000 times for the frequentist approach, to obtain an asymptotic empirical distribution of the predictive ability. In each replication, the population was randomly split into five disjoint subsets of genotypes. Whereas one subset was used as validation population (20% or 106 individuals), the remaining four were combined as training population (80% or 424 individuals) to predicte the left-out genotypes in the first population. Subsequently, another subset was used as validation population and the left-out genotypes of this set were predicted. These steps were repeated until all five subsets were used as validation population once.

We calculated Pearson correlation between observed (y) and predicted (y^) adjusted entry means for the validation sets, considering, simultaneously, all five cross-validations of each replication. Predictive ability was calculated as the mean of these correlations. In addition, we also derived the empirical distribution of the predicted residual error sums of squares (PRESS), given by the sum of squares of the difference between the predicted and observed adjusted entry means. The narrow-sense heritability was calculated as:

h^2=σA2σP2 (5)

where, σA2 is the additive genetic variance and σP2 is the phenotypic variance from GBLUP analysis. Therefore, this estimate corresponds to a genomic heritability, using the additive relationship matrix from molecular markers.

Genomic prediction models using diploid dosage

We also performed a comparison between GS models using diploid and tetraploid allele dosage (GS-DD and GS-TD models, respectively). To do so, the three possible heterozygotes (Aaaa, AAaa, and AAAa) were coded as a single heterozygote (Aa), while the two tetraploid homozygotes (AAAA and aaaa) as diploid homozygotes (AA and aa). Molecular data were filtered to eliminate markers containing only doses 0 and 1. This step was necessary since the offspring were closely related among them and most of the genotypes in the full data were classified as aaaa or Aaaa (84.27% and 12.57%, respectively). Therefore, the molecular data set decreased from 41,424 to 1,223 markers. In these new data, tetraploid molecular matrix contains 1,223 markers coded as 0, 1, 2, 3, and 4, and diploid molecular matrix contains the same 1,223 markers coded as 0, 1, and 2. Usual GBLUP model with diploid dosage (GBLUP-DD) was evaluated using R package ASReml-R (Butler et al. 2009), considering the genomics relationship matrix as described by VanRaden (2008). Bayesian models with diploid dosage were evaluated using R package BGLR (Pérez and de los Campos 2014). The model evaluation was performed as previously described.

Data Availability

Phenotypic and molecular data and R code files can be found at: <https://github.com/leticia-lara/PM_data>. GBS data have been submitted to the NCBI Sequence Read Archive with the BioProject ID: PR-JNA510446. Additional information can also be found in the supplementary material. Figure S1 contains an illustrative scheme of the experimental design. Figure S2 contains the proportion of missing data in molecular dataset. Figure S3 contains the heatmaps of relationship matrices. Figure S4 contains the principal component analysis. Figure S5 contains the linkage disequilibrium analysis. Table S1 contains model selection for phenotypic traits. Table S2 contains the alignment rate for each reference genome. Table S3 contains the PRESS of predictive models for each trait. Table S4 contains the mean predictive ability of genomic selection models using tetraploid dosage for all markers. Table S5 contains the comparison between individual plant values based on half-sib family means and the genomic prediction of individuals. Supplemental material available at FigShare: https://doi.org/10.25387/g3.7762958.

Results

Phenotypic models

All VCOV structures selected for GL and RL matrices allowed for heterogeneous variation and/or correlations across harvests for genetic and residual effects (Table S1). When the selection by AIC was not in agreement with the selection by BIC, we calculated the differences between these two selected models considering both criteria and selected by the criteria that obtained the higher difference. For example, considering the OM trait and GL matrix, the PoHet had the lowest AIC value (with AIC criteria of 2774.400 and BIC criteria of 2826.809) and the Po had the lowest BIC value (with AIC criteria of 2782.262 and BIC criteria of 2817.201). The differences between these two selected models were 7.862 for AIC (2782.262 - 2774.400) and 9.608 for BIC (2826.809 - 2817.201). As BIC criteria had the higher difference, this criteria was used and the Po matrix was selected for GL matrix in OM trait. The most commom selected matrix for genetic effects was AR1Het, which was selected for CP and IVD, and different VCOV matrices were selected for residual effects. Generalized heritabilities were 0.66, 0.42, 0.48, 0.89, 0.85, and 0.43 for OM, CP, IDV, LDM, RC, and PLB, respectively. On average, traits evaluated in four harvests had lower heritability than traits evaluated in eight harvests.

Genotype calling

Approximately 485 million of reads per lane were obtained from short read sequencing, where 81.73% were good barcoded reads. The alignment of 6,596,939 read tags using P. hallii, P. virgatum, S. italica, S. viridis, and two transcriptomes of P. maximum (Table 2) showed that the overall alignment ranged from 19.05 to 24.24%. Although transcriptomes obtained by UNICAMP had the highest overall alignment rate, transcriptomes obtained by EMBRAPA had the highest unique alignment rate, followed by S. viridis and S. italica (Table S2). A total of 476,904 markers were obtained as the sum of the final number of markers for each reference genome (Table 2).

Table 2. Overall alignment rate and total number of markers obtained after Tassel-GBS pipeline, allele dosage in SuperMASSA software, and selection of up to 5% of missing data (Filter NA), for each reference genome.

Reference Genome Overall alignment rate Tassel-GBS SuperMASSA Filter NA
Panicum hallii 19.05% 77,105 12,835 6,945
Panicum virgatum 22.66% 84,119 11,230 5,598
Setaria italica 22.04% 92,494 15,047 8,066
Setaria viridis 22.07% 92,591 15,271 8,118
Transcriptome (EMBRAPA) 20.11% 74,049 14,129 7,665
Transcriptome (UNICAMP) 24.24% 56,546 9,777 5,032
Total 476,904 78,289 41,424

Due to the nature of GBS technique, the sequencing coverage of different samples are random. It is possible that the same genomic region was not sequenced for all samples. Furthermore, sequences that have mutation in the restriction site of the enzyme are also not observed (Elshire et al. 2011). Therefore, a large amount of missing data are expected. The reference genomes of P. hallii, P. virgatum, S. italica, and S. viridis had around 5,000 markers with 87% missing data (Figure S2). Markers with a high proportion of missing values were eliminated in subsequent steps. Besides, most part of the selected markers had 0% missing data after the Tassel-GBS pipeline. S. italica, S. viridis, and transcriptome obtained by EMBRAPA had more than 15,000 markers with 0% missing data.

VCF files obtained from Tassel-GBS pipeline were used as input in the SuperMASSA software. A total of 78,289 markers was selected with minimum average depth for the population of 25 reads (Table 2). From this, 32,619 markers had more than 100 minimum overall depth. For example, the scatterplot of marker Hallii.1_3461595 (Figure 1) shows the intuition of how SuperMASSA uses count reads of two alleles to classify individuals according to their genotype using a probabilistic graphical model (Serang et al. 2012).

Figure 1.

Figure 1

Tetraploid allele dosage for marker Hallii.1_3461595.

High level of missing data can impact the performance of GS models, and precise imputation of missing entries can be complex, especially when considering polyploid genomes. Therefore, markers were selected with up to 5% of missing data, aiming to reduce imputation bias. The final number of markers was 41,424, which were used in GS models (Table 2). Subsequently, imputation was made using random sampling and considering the dose frequency for each marker.

The redundancy among markers was inspected (Figure 2) and a large similarity was verified within three specific groups of reference: Panicum genus, Setaria genus, and transcriptomes. This result is expected due to phylogenetic proximity of the groups (Aliscioni et al. 2003; Bennetzen et al. 2012). More than half of the markers identified by the different reference genomes have non-redundant information, and 31,046 markers were classified as unique (74.95%). This may be due to the great genomic variability still persistent in each genome, since they are relatively new in terms of breeding.

Figure 2.

Figure 2

Redundancy among markers. Regions in red represent redundant markers within each reference, while regions in black, pink, blue, green, and orange represent redundant markers among six, five, four, three, and two references, respectively. Gray regions represent markers with unique information for each reference.

Population structure and linkage disequilibrium

A trace of population structure was detected for the additive relationship matrix based on pedigree information (Figure S3A) as expected, since these pedigrees consider only female parental information. However, no clear population structure was detected for the genomic relationship matrix based on molecular markers (Figure S3B). We also confirmed a relevant trace of population structure for female parents by principal component analysis of their genotype data, in which the three first principal components explained, respectively, 20.88, 11.98, and 10.23% of the total variability in these parents (Figure S4A). On the other hand, the principal component analysis for the 19 half-sib families (530 individuals) did not detected the population structure in these population. Their first three principal components explained, respectively, 6.20, 4.91, and 4.03% of the total variability in the breeding population (Figure S4B).

Average linkage disequilibrium (LD) between marker pairs was 0.3063, 0.3321, 0.3155, 0.3165, 0.2401, and 0.2810 for markers identified from P. hallii, P. virgatum, S. italica, S. viridis, and transcriptomes of P. maximum obtained by EMBRAPA and UNICAMP, respectively (Figure S5). The LD decay was lower for P. hallii, S. italica, and S. viridis than for others.

Genomic prediction using tetraploid dosage

Genomic selection models were evaluated using all markers (41,424 markers) and only non-redundant ones (31,046 markers). The predictive accuracy did not differ between these two data sets (results not shown), since the predictive models deal well with multicollinearity. From now on, only predictive models using all markers will be presented.

Mean values of predictive ability ranged from 0.1610 (BL-TD for LDM) to 0.4229 (BB-TD for OM) (Table S4). LDM showed the lowest accuracies for all analyzed GS-TD models and OM showed the highest ones (Figure 3). No clear difference in predictive ability was observed among models (Figure 3A). The BL-TD model for LDM showed the highest estimates of standardized PRESS (Figure 3B, Table S3).

Figure 3.

Figure 3

Comparison among six genomic selection models using tetraploid dosage (GS-TD) for organic matter (OM), in vitro digestibility of organic matter (IVD), crude protein (CP), leaf dry matter (LDM), regrowth capacity (RC), and percentage of leaf blade (PLB). Molecular data contains 41,424 markers. (A) Predictive ability. (B) Standardized Predicted Residual Error Sum of Squares.

Comparison between GS-TD and GS-DD models

The mean predictive ability showed clear differences between GS-TD and GS-DD models (Figure 4 and Table 3). Almost all GS-TD models were superior to GS-DD models for most of the analyzed traits. The higest superiority was for LDM, with an average superiority of 50.96% for GS-TD model in relation to GS-DD model. The narrow-sense heritability ranged from 0.11 for LDM to 0.53 to OM. These results were consistent with those obtained for the predictive abilities, which is expected since both were calculated considering the additive effects (Table 3).

Figure 4.

Figure 4

Comparison between genomic selection models using tetraploid dosage (GS-TD) and diploid dosage (GS-DD), for organic matter (OM), crude protein (CP), in vitro digestibility of organic matter (IVD), leaf dry matter (LDM), regrowth capacity (RC), and percentage of leaf blade (PLB). Molecular matrices containing 1,223 markers for each data set.

Table 3. Mean predictive ability of genomic selection models using tetraploid dosage (GS-TD) and diploid dosage (GS-DD) for organic matter (OM), crude protein (CP), in vitro digestibility of organic matter (IVD), leaf dry matter (LDM), regrowth capacity (RC), and percentage of leaf blade (PLB). GS-TD and GS-DD models with the highest mean predictive ability for each trait are indicated in bold. Molecular matrices containing 1,223 markers for each data set. The broad-sense heritability is H^C2 and the narrow-sense heritability is h^2.

Model OM CP IVD LDM RC PLB
GBLUP-TD 0.3782 0.2237 0.2211 0.1282 0.2475 0.1957
GBLUP-DD 0.2905 0.1578 0.1869 0.0685 0.1829 0.1968
BRR-TD 0.3866 0.2299 0.2157 0.1390 0.2621 0.2058
BRR-DD 0.2966 0.1633 0.1837 0.0997 0.1951 0.2155
BA-TD 0.3931 0.2251 0.2174 0.1308 0.2514 0.2001
BA-DD 0.2958 0.1585 0.1835 0.0854 0.1876 0.2067
BB-TD 0.3955 0.2242 0.2190 0.1311 0.2501 0.2004
BB-DD 0.2960 0.1560 0.1843 0.0886 0.1885 0.2110
BC-TD 0.3879 0.2282 0.2186 0.1374 0.2577 0.2049
BC-DD 0.2958 0.1612 0.1855 0.1006 0.1944 0.2161
BL-TD 0.3821 0.2233 0.2187 0.1333 0.2468 0.1991
BL-DD 0.2923 0.1594 0.1849 0.0870 0.1853 0.1998
Average GS-TD 0.3872 0.2257 0.2184 0.1333 0.2526 0.2010
Average GS-DD 0.2945 0.1594 0.1848 0.0883 0.1889 0.2076
Average 31.48 41.59 18.18 50.96 33.72 −3.18
superiority
H^C2 0.66 0.42 0.48 0.89 0.85 0.43
h^2 0.53 0.23 0.25 0.11 0.31 0.12

Discussion

The aim of this study was to develop predictive models considering allele dosage for autotetraploid species, with applications in P. maximum. We extended six different GS models to autotetraploid species and compared the accuracy of predicted breeding values of these models. To the best of our knowledge, this is for the first time where the efficiency of genomic selection models considering the information from hidden heterozygotes in the autotetraploid tropical forage P. maximum has been reported. Furthermore, we evaluated strategies for modeling residual effects during phenotypic analysis, and performing quantitative genotype calling in autotetraploid species. This methodology can be applied to other autotetraploid species as well as can be extended to species with higher ploidy levels.

Before the development of GS models considering tetraploid allele dosage (GS-TD models), it is important to perform a phenotypic analysis carefully since the sucess of GS in breeding for quantitative traits largely depends of phenotyping process (Cabrera-Bosquet et al. 2012). We performed a two stage approach for genomic selection, where we considered only phenotypic data in the first stage and incorporated molecular data in the second stage. In the first stage, for each trait, we fitted a longitudinal linear mixed model and treated genotypes as random. We recognized there is a discussion about fit genotype as random in the first stage (Schulz-Streeck et al. 2013). However, as we have multiple harvests and we are modeling genotype effects nested within harvests, with VCOV structure for genotype values across harvests, the model requires to treat genotype as random. The gain in information by incorporating the correlations among harvests is corroborated in the VCOV structures selection, in which the matrices allowing heterogeneous genetic and residual variation as well as allowing correlation among harvests provided a better fit than other models for most analyzed traits (Table S1). Moreover, there was reported in the literature that the use of BLUPs (Best Linear Unbiased Predictor, and in this case, it refers to the marginal predicted values of individual plants across harvests) does not result in significant differences for selection purposes (Galli et al. 2018), and it has been already applied in other GS studies as a simple approach to include correlations among traits/harvests/environments (Asoro et al. 2011; Resende et al. 2017; Ferrão et al. 2017; de Oliveira et al. 2018). Furthermore, we calculated the deregressed BLUPs, but the GS models had lower predictive ability than using the marginal predicted values of individual plants across harvests (results not shown).

In the second stage, molecular markers were considered in the predictive models. For this, the genotype calling took into consideration the allele dosage, discriminating among the five possible genotypes. According to Uitdewilligen et al. (2013), a high sequence depth is required to identify the genotypic class accurately for autotetraploid species, where 60–80 depth leads to 98.4% accuracy in genotypic calls. Reads were also selected with minimum overall depth of 100 reads. However, as the predictive ability of GS-TD models was similar for both criteria (results not shown), we chose to keep only the analysis with an overall depth of 25 reads (approximately 78.7% of markers selected with minimum overall depth of 25 reads were also selected with minimum overall depth of 100 reads). This is a less strict filter for the practical pipeline of breeding and contains a larger number of markers.

The underlying assumption of genomic selection is the presence of SNPs at some loci in linkage disequilibrium (LD) with Quantitative Trait Loci (QTL) alleles that affect traits which are subject to selection (Calus et al. 2008). LD represents the non-random association between alleles at different loci and it can be estimated using the correlation among markers when the SNP alleles at those loci are coded with numerical values. Vos et al. (2017) calculated several estimators for LD in a simulated and real panel of tetraploid potato and concluded that LD1/2,90 values provides the most consistent estimates of LD decay. This estimator consists of 90% percentile r2 the short-range LD. Short-range LD is calculated across a defined interval of physical distances between marker pairs (Vos et al. 2017). One major reason for the minor differences in prediction accuracies among prediction models is the high level of LD found in breeding populations (Riedelsheimer et al. 2012). Riedelsheimer et al. (2012) obtained similar accuracies of GS models in elite maize germplasm, which had high level of LD. Accuracies did not differ regardless whether the effect of large QTL were precisely captured or spread over a larger region. In this work, we also obtained a high level of LD between markers pairs with up 1500 bp of distance (r2 from 0.2401 to 0.3321), which can explain the similarity among GS-TD models for prediction purposes.

Population structure is another important factor in GS and can result in biased estimates of the predict ability when it is not taken into account (Riedelsheimer et al. 2013). As our population is composed by half-sib families, a stratified sampling was performed for the cross-validation process, in which 20% of individuals in each family were taken from the validation population and 80% from the training population. Therefore, all families were represented in both populations. The GS models and the evaluation process were performed as described in Materials and Methods section. We did not observe differences in the predictive abilities between stratified sampling and random sampling (results not shown) and we are presenting in this paper only results using random sampling. This similarity is expected, since no clear population structure was detected in this population of P. maximum (Figure S3 and S4B). When we observe the genotypes of the parents, few parents are different from the others (Figure S4A). The absence of population structure occurs because most parents are closely related and the individuals analyzed have a high level of relationship among them. Similar results have been reported in wheat (Arruda et al. 2015) and in potato (Sverrisdóttir et al. 2017), with typical values for data with family structure, but without substantial population structure. As reported by Sverrisdóttir et al. (2017), the absence of population structure could in principle be caused by an extremely narrow genetic base of the parents. This is in line with our study since all parents were derived from a common female tetraploidized ancestor.

Synthetic populations are usually formed by crossing several parents and posteriorly cross-pollinating the F1 individuals for one or several generations (Falconer and Mackay 1996). These populations have played an important role in quantitative genetic research on gene action in complex heterotic traits and comparison of selection methods (Hallauer et al. 2010; Schopp et al. 2017). As in our population, the number of parents is usually small and parents are often related, leading to small effective size (Ne) (Müller et al. 2017). We estimated broad and narrow sense heritabilities for the evaluated traits. Besides we did not expect high heritabilities for traits related with yield, we observed a high broad sense heritability (0.89) and low narrow sense heritability (0.11) for LDM (Table 3). Such low value reflects the difficulty to breed for this trait. Differences in magnitude between these estimates were previously reported by Resende et al. (2014b), which observed broad-sense heritability ranging from 0.47 to 0.75 and narrow-sense heritability ranging from 0.15 to 0.24 for LDM considering different years of evaluation. For individual harvests, the same authors found a narrow-sense heritability of 0.05 for LDM. Similar to our results, broad-sense heritabilities were reported in Paspalum for LDM with 0.82 and PLB with 0.55 (Pereira et al. 2017). In respect to the nutritive traits, moderate values were reported by Matias et al. (2016) for narrow-sense heritability considering the mean of seven harvests in Brachiaria decumbens progenies for CP (0.31) and IVD (0.14). Similar moderate values for CP and IVD were also reported in our study (Table 3).

The prediction of the breeding values was made using the marginal predicted values of individual plants across harvests that were obtained in the first stage of the analysis. The goals were to select individuals in the present recurrent selection cycle (already phenotypically evaluated) as well as to select non-phenotyped individuals from the next cycle. Since one selection cycle requires three to five years to complete (Resende et al. 2014a), P. maximum breeding program with genomic selection will reduce from five to one year for each recurrent cycle. This one year is necessary to grow the selected plants and cross them to obtain the new population. In addition, the P. maximum breeding programs for releasing cultivars will be benefited since superior sexual plants can be selected every year to, posteriorly, cross with apomictic plants. From these crosses, new apomictic hybrid combinations will be obtained and tested as new cultivars for releasing the best one in the market. Lipka et al. (2014) applied genomic selection in P. virgatum species with the objective of evaluating genomic selection efficiency to accelerate breeding cycles in this species. Since P. virgatum is an allotetraploid species, the authors used diploid dosage and obtained high prediction accuracy for most of the traits, using association panel and considering seven morphological and thirteen biomass quality trais. Although analyzed traits were different from ours, the range of values were similar. The higher mean prediction accuracy obtained by Lipka et al. (2014) was 0.52 for standability and the lower was -0.08 for minerals. Our higher mean predictive ability was 0.4229 for organic matter and the lower was 0.1610 for leaf dry matter (Table S4). Similar accuracies were also obtained for oats (Asoro et al. 2011), maize (González-Camacho et al. 2012), and rice (Spindel et al. 2015). Despite the difference among traits, the accuracy did not differ among models. Besides that, we suggest to use GBLUP-TD model for all traits, since it is more intuitive for breeders and it is less computational demanding.

We also compared the predictions of individual plant values based on genomic prediction at the level of half-sib family means with that at the individual level. To achieve this, we removed the maternal effect of the phenotypic model and included it in the genomic selection model. Then, we predicted individuals based only on the maternal effects and compared with prediction of individuals (Table S5). The genomic predicted values are more accurate because they capture not only information on the family means, but also the within-family deviations. The accuracy of prediction using only maternal effect ranged from 0.0452 for PLB to 0.1843 for LDM. The prediction of individuals using molecular markers and family effect ranged from 0.1786 for LDM to 0.3466 for OM. Therefore, the GS can operate on individual plants without needing trait data on sibs with considerable accuracy of prediction and genetic gains with selection.

There are few studies on GS for autotetraploid species using allele dosage, most of them are for potato (Slater et al. 2016; Endelman et al. 2018; Stich and Inghelandt 2018). To our knowledge, this is the first study that applied genomic selection with allele dosage in an autotetraploid tropical forage grass. Slater et al. (2016) developed an extension of genomic relationship matrix proposed by Yang et al. (2010) for autotetraploids and applied genomic selection in potato. The authors found accuracies ranging from 0.2, under conditions of low heritability and small reference populations, to 0.8 in larger reference populations. Endelman et al. (2018) and Stich and Inghelandt (2018) analyzed the inclusion of additive and non-additive components (digenic dominance and two order epistasis) for the accuracy of predictive models. The inclusion of dominance effects in GS when selecting the per-se performance of the individuals is fundamental, mainly when working with species that are highly heterozygous, which allows dominance effects to contribute to phenotypic variation (Stich and Inghelandt 2018). Concordantly, Endelman et al. (2018) extended the genomic relationship matrix proposed by VanRaden (2008) for autotetraploids and highlighted the importance of genotypic and additive values for selection. However, for both studies, when the goals is selecting new parents for several cycles, only the additive value needs to be considered since the contribution of digenic dominance diminishes exponentially over the generations (Gallais 2003; Endelman et al. 2018), in other words, non-additive effects are less efficiently transmitted to progeny. In our work, we modeled only additive effects since our study aims to select superior sexual parents in a recurrent genomic selection program in the current and next selection cycles.

A comparison of GS-TD models with GS models considering the usual diploid allele dosage (GS-DD models) was also performed. Although being a rougly approximation, this strategy has been applied in several polyploid crops, such as alfafa (Annicchiarico et al. 2015; Biazzi et al. 2017), and sugarcane (Gouy et al. 2013), as GS software for polyploids have only recently emerged. To investigate the impact of using diploidized markers, Endelman et al. (2018) also compared diploid and tetraploid dosage. The authors observed that the accuracy of the diploidized model was consistently lower. In our study, the accuracy of GS-DD models were also lower than for GS-TD models for all traits (Table 3 and Figure 4). This reinforces that GS models using tetraploid allele dosage are superior for autotetraploid populations, and extensions can and should be applied to other ploidy levels.

Implementation of genomic selection in forage breeding

Recent research shows the potential of genomic selection to reshape crop breeding programs. In particular, the results obtained here imply that GS has great potential for P. maximum breeding programs, especially in a recurrent genomic selection program.

Usually, one cycle of half-sib recurrent selection is split in: i) development of progenies; ii) phenotypic evaluation; and iii) selection and recombination of the best selected individuals to obtain an improved population (Figure 5A). The recurrent genomic selection is a modification to get shorter breeding cycles and greater genetic gains with selection. Therefore, genomic prediction is implemented by genotyping and phenotyping the base population and estimating marker effects to predict hybrid performance in the subsequent recurrent selection cycles and recombine the best individuals based only on the GEBVs (Figure 5B). The persistency of predictive accuracy in GS is fundamental for practical breeding because it determines the number of cycles that can be advanced until it is necessary to retrain the predictive models (Müller et al. 2017). This is because in each cycle of recombination and selection, the individuals of breeding population can accumulate genetic diversity and gene frequencies may differ from the training population (Heffner et al. 2010; Bassi et al. 2016). Therefore the breeder needs to update the GS models by phenotyping the population and re-estimating marker effects each three recurrent selection cycles (Heffner et al. 2009) or whenever necessary.

Figure 5.

Figure 5

Forage breeding program in a: (A) recurrent phenotypic selection. (B) recurrent genomic selection.

To compare the efficiency of GS approach with phenotypic breeding programs, the breeders’ equation can be used. In this equation, expected genetic gains per unit of time is defined as ΔG=(irσA)/T, where i is the selection intensity, r is the selection accuracy, σA is the square root of the additive genetic variance, and T is the length of time to complete one breeding selection cycle (Falconer and Mackay 1996). The success of GS approach is determined by its ability to predict phenotypes that were not evaluated as well as by its ability to increase the rate of ΔG while maintaining affordable costs. Assuming similar selection intensity and similar genetic variance for both methods, greater gains per unit of time can be achieved as long as the reduction in the time of each breeding cycle by GS compensates for the reduction in selection accuracy (Bassi et al. 2016). Furthermore, GS becomes cheaper than phenotypic selection for traits that have a long generation time or are difficult and expensive to evaluate.

Another crucial aspect for implementation of genomic selection in forage breeding programs is the generation of accurate genotypic data. Several softwares have been developed to perform quantitative genotype calling for autotetraploid species, such as R packages fitTetra (Voorrips et al. 2011) and ClusterCall (Schmitz Carley et al. 2017); for “diploidizing” tetraploid species, such as R package breadarrayMSV (Gidskehaug et al. 2010); and species with any ploidy level, such as R packages fitPoly (Voorrips et al. 2011), updog (Gerard et al. 2018), and software SuperMASSA (Serang et al. 2012; Pereira et al. 2018). Therefore, the evaluation of allele dosage for polyploid species has become more accessible by several available softwares.

This is the first work of genomic selection in the tropical forage grass P. maximum which uses a high throughput genotyping and considers tetraploid allele dosage in Bayesian and frequentist models. GBS and allele dosage showed to be promising strategies for genomic analysis in autotetraploid species. Furthermore, the accuracy of predictive models and the time reduction in the breeding cycles justifies the implementation of genomic selection in P. maximum breeding programs.

Acknowledgments

This work was supported by FAPESP (São Paulo Research Foundation), grants 2015/20659-2 and 2016/01279-7. Additional support was provided by CNPq (“Conselho Nacional do Desenvolvimento Científico e Tecnológico”), EMBRAPA (“Empresa Brasileira de Pesquisa Agropecuária”) number 02.14.01.013.00.03.003, UNIPASTO (“Associação para o Fomento à Pesquisa de Melhoramento de Forrageiras”), and CAPES (“Coordenação de Aperfeiçoamento de Pessoal de Nível Superior”). L. A. C. Lara was supported by FAPESP, grants 2015/20659-2 and 2016/01279-7. A. A. F. Garcia was supported by a productivity scholarship from CNPq and CAPES (“Bio Computacional 051/2013”). J. P. R. dos Santos was supported by FAPESP, grants 2017/03625-2 and 2017/25674-5. The authors thank all trainees of Embrapa Beef Cattle that helped on the phenotypic evaluations.

Footnotes

Supplemental material available at FigShare: https://doi.org/10.25387/g3.7762958.

Communicating editor: J. Holland

Literature Cited

  1. Aguilar I., Misztal I., Johnson D. L., Legarra A., Tsuruta S. et al. , 2010.  Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J. Dairy Sci. 93: 743–752. 10.3168/jds.2009-2730 [DOI] [PubMed] [Google Scholar]
  2. Akaike H., 1974.  A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19: 716–723. 10.1109/TAC.1974.1100705 [DOI] [Google Scholar]
  3. Aliscioni S. S., Giussani L. M., Zuloaga F. O., and Kellogg E. A., 2003.  A molecular phylogeny of Panicum (Poaceae: Paniceae): tests of monophyly and phylogenetic placement within the Panicoideae. Am. J. Bot. 90: 796–821. 10.3732/ajb.90.5.796 [DOI] [PubMed] [Google Scholar]
  4. Allard R. W., 1960.  Inheritance in autotetraploids. In Principles of Plant Breeding, edited by J. W. Sons, pp. 385–399, Wiley. [Google Scholar]
  5. Amadeu R. R., Cellon C., Olmstead J. W., Garcia A. A. F., Resende M. F. R. et al. , 2016.  AGHmatrix: R package to construct relationship matrices for autotetraploid and diploid species: A blueberry example. Plant Genome 9: 1–10. 10.3835/plantgenome2016.01.0009 [DOI] [PubMed] [Google Scholar]
  6. Annicchiarico P., Nazzicari N., Li X., Wei Y., Pecetti L. et al. , 2015.  Accuracy of genomic selection for alfalfa biomass yield in different reference populations. BMC Genomics 16: 1020 10.1186/s12864-015-2212-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Arruda M. P., Brown P. J., Lipka A. E., Krill A. M., Thurber C. et al. , 2015.  Genomic selection for predicting Head Blight Resistance in a Wheat Breeding Program. Plant Genome 8: 1–12. 10.3835/plantgenome2015.01.0003 [DOI] [PubMed] [Google Scholar]
  8. Asoro F. G., Newell M. A., Beavis W. D., Scott M. P., and Jannink J.-L., 2011.  Accuracy and training population design for genomic selection on quantitative traits in elite North American oats. Plant Genome 4: 132–144. 10.3835/plantgenome2011.02.0007 [DOI] [Google Scholar]
  9. Bassi F. M., Bentley A. R., Charmet G., Ortiz R., and Crossa J., 2016.  Breeding schemes for the implementation of genomic selection in wheat (Triticum spp.). Plant Sci. 242: 23–36. 10.1016/j.plantsci.2015.08.021 [DOI] [PubMed] [Google Scholar]
  10. Bennetzen J. L., Schmutz J., Wang H., Percifield R., Hawkins J. et al. , 2012.  Reference genome sequence of the model plant Setaria. Nat. Biotechnol. 30: 555–561. 10.1038/nbt.2196 [DOI] [PubMed] [Google Scholar]
  11. Bernardo R., 2014.  Genomewide selection when major genes are known. Crop Sci. 54: 68–75. 10.2135/cropsci2013.05.0315 [DOI] [Google Scholar]
  12. Biazzi E., Nazzicari N., Pecetti L., Brummer E. C., Palmonari A. et al. , 2017.  Genome-wide association mapping and genomic selection for alfalda (Medicago sativa) forage quality traits. PLoS One 12: e0169234 10.1371/journal.pone.0169234 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Butler D. G., Cullis B. R., Gilmour A. R., and Gogel B. J., 2009.  ASReml-R reference manual.
  14. Cabrera-Bosquet L., Crossa J., von Zitzewitz J., Serret M. D., and Araus J. L., 2012.  High-throughput phenotyping and genomic selection: The frontiers of crop breeding converge. J. Integr. Plant Biol. 54: 312–320. 10.1111/j.1744-7909.2012.01116.x [DOI] [PubMed] [Google Scholar]
  15. Calus M. P. L., Meuwissen T. H. E., Roos A. P. W., and Veerkamp R. F., 2008.  Accuracy of genomic selection using different methods to define haplotypes. Genetics 178: 553–561. 10.1534/genetics.107.080838 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Cullis B. R., Smith A. B., and Coombes N. E., 2006.  On the design of early generation variety trials with correlated data. J. Agric. Biol. Environ. Stat. 11: 381–393. 10.1198/108571106X154443 [DOI] [Google Scholar]
  17. de los Campos G., Hickey J. M., Pong-Wong R., Daetwyler H. D., and Calus M. P. L., 2013.  Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 193: 327–345. 10.1534/genetics.112.143313 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. de Oliveira A. A., Pastina M. M., da Costa Parrella R. A., Noda R. W., Simeone M. L. F. et al. , 2018.  Genomic prediction applied to high-biomass sorghum for bioenergy production. Mol. Breed. 38: 49 10.1007/s11032-018-0802-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Dos Santos J. P. R., Pires L. P. M., Vasconcellos R. C. C., Pereira G. S., Pinho R. G. V. et al. , 2016.  Genomic selection to resistance to Stenocarpella maydis in maize lines using DArTseq markers. BMC Genet. 17: 86 10.1186/s12863-016-0392-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Elshire R. J., Glaubitz J. C., Sun Q., Poland J. A., Kawamoto K. et al. , 2011.  A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6: e19379 10.1371/journal.pone.0019379 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Endelman J. B., Carley C. A. S., Bethke P. C., Coombs J. J., Clough M. E. et al. , 2018.  Genetic variance partitioning and genome-wide prediction with allele dosage information in autotetraploid potato. Genetics 209: 77–87. 10.1534/genetics.118.300685 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Falconer D. S., and Mackay T. F. C., 1996.  Quantitative genetics in maize breeding, Pearson, ‎London. [Google Scholar]
  23. Ferrão L. F. V., Ferrão R. G., Ferrão M. A. G., Fonseca A. F. A., and Garcia A. A. F., 2017.  A mixed model to multiple harvest-location trials applied to genomic prediction in Coffea canephora. Tree Genet. Genomes 13: 95 10.1007/s11295-017-1171-7 [DOI] [Google Scholar]
  24. Gallais A., 2003.  Quantitative genetics and breeding methods in autopolyploid plants, Institut National de la Recherche Agronomique, Paris, France. [Google Scholar]
  25. Galli G., Lyra D. H., Alves F. C., Granato S. C., Fritsche-Neto R. et al. , 2018.  Impact of phenotypic correction method and missing phenotypic data on genomic prediction of maize hybrids. Crop Sci. 58: 1481–1491. 10.2135/cropsci2017.07.0459 [DOI] [Google Scholar]
  26. Garcia A. A. F., Mollinari M., Marconi T. G., Serang O. R., Silva R. R. et al. , 2013.  SNP genotyping allows an in-depth characterisation of the genome of sugarcane and other complex autopolyploids. Sci. Rep. 3: 3399 10.1038/srep03399 [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Gerard D., Ferrão L. F. V., Garcia A. A. F., and Stephens M., 2018.  Genotyping polyploids from messy sequencing data. Genetics 210: 789–807. 10.1534/genetics.118.301468 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Gidskehaug L., Kent M., Hayes B. J., and Lien S., 2010.  Genotype calling and mapping of multisite variants using an Atlantic salmon iSelect SNP array. Bioinformatics 27: 303–310. 10.1093/bioinformatics/btq673 [DOI] [PubMed] [Google Scholar]
  29. Glaubitz J. C., Casstevens T. M., Lu F., Harriman J., Elshire R. J. et al. , 2014.  TASSEL-GBS: A high capacity genotyping by sequencing analysis pipeline. PLoS One 9: e90346 10.1371/journal.pone.0090346 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. González-Camacho J. M., De los Campos G., Pérez P., Gianola D., Cairns J. E. et al. , 2012.  Genome-enabled prediction of genetic values using radial basis function neural networks. Theor. Appl. Genet. 125: 759–771. 10.1007/s00122-012-1868-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Goodstein D. M., Shu S., Howson R., Neupane R., Hayes R. D. et al. , 2011.  Phytozome: a comparative platform for green plant genommics. Nucleic Acids Res. 40: D1178–D1186. 10.1093/nar/gkr944 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Gouy M., Rousselle Y., Bastianelli D., Lecomte P., Bonnal L. et al. , 2013.  Experimental assessment of the accuracy of genomic selection in sugarcane. Theor. Appl. Genet. 126: 2575–2586. 10.1007/s00122-013-2156-z [DOI] [PubMed] [Google Scholar]
  33. Gu Z., Gu L., Eils R., Schlesner M., and Brors B., 2014.  circlize implements and enhances circular visualization in R. Bioinformatics 30: 2811–2812. 10.1093/bioinformatics/btu393 [DOI] [PubMed] [Google Scholar]
  34. Habier D., Fernando R. L., Kizilkaya K., and Garrick D. J., 2011.  Extension of the bayesian alphabet for genomic selection. BMC Bioinformatics 12: 186–197. 10.1186/1471-2105-12-186 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Hackett C. A., and Luo Z. W., 2003.  TetraploidMap: Construction of a linkage map in autotetraploid species. J. Hered. 94: 358–359. 10.1093/jhered/esg066 [DOI] [PubMed] [Google Scholar]
  36. Hackett C. A., McLean K., and Bryan G., 2013.  Linkage analysis and QTL mapping using SNP dosage data in a tetraploid potato mapping population. PLoS One 8: e63939 10.1371/journal.pone.0063939 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Hallauer A. R., and Carena M. J., 2012.  Recurrent selection methods to improve germplasm in maize. Maydica 57: 266–283. [Google Scholar]
  38. Hallauer A. R., Carena M. J., and Miranda Filho J. B., 2010.  Quantitative genetics in maize breeding, Springer, New York. [Google Scholar]
  39. Heffner E. L., Lorenz A., Jannink J.-L., and Sorrells M. E., 2010.  Plant breeding with genomic selection: gain per unit time and cost. Crop Sci. 50: 1681–1690. 10.2135/cropsci2009.11.0662 [DOI] [Google Scholar]
  40. Heffner E. L., Sorrells M. E., and Jannink J.-L., 2009.  Genomic selection for crop improvement. Crop Sci. 49: 1–12. 10.2135/cropsci2008.08.0512 [DOI] [Google Scholar]
  41. Isik F., Holland J., and Maltecca C., 2017.  Genetic data analysis for plant and animal breeding, Springer Nature, New York: 10.1007/978-3-319-55177-7 [DOI] [Google Scholar]
  42. Jank L., Barrios S. C., Valle C. B., Simeão R. M., and Alves G. F., 2014.  The value of improved pastures to Brazilian beef production. Crop Pasture Sci. 65: 1132–1137. 10.1071/CP13319 [DOI] [Google Scholar]
  43. Jank L., Valle C. B., and Resende R. M. S., 2011.  Breeding tropical forages. Crop Breed. Appl. Biotechnol. 11: 27–34. 10.1590/S1984-70332011000500005 [DOI] [Google Scholar]
  44. Langmead B., and Salzberg S. L., 2012.  Fast gapped-read alignment with Bowtie 2. Nat. Methods 9: 357–359. 10.1038/nmeth.1923 [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Lipka A. E., Lu F., Cherney J. H., Buckler E. S., Casler M. D. et al. , 2014.  Accelerating the switchgrass (Panicum virgatum L.) breeding cycle using genomic selection approaches. PLoS One 9: e112227 10.1371/journal.pone.0112227 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Massman J. M., Jung H.-J. G., and Bernardo R., 2013.  Genomewide selection vs. marker-assisted recurrent selection to improve grain yield and stover-quality traits for cellulosic ethanol in maize. Crop Sci. 53: 58–66. 10.2135/cropsci2012.02.0112 [DOI] [Google Scholar]
  47. Matias F. I., Barrios S. C. L., Valle C. B., Mateus R. G., Martins L. B. et al. , 2016.  Estimate of genetic parameters in Brachiaria decumbens hybrids. Crop Breed. Appl. Biotechnol. 16: 115–122. 10.1590/1984-70332016v16n2a18 [DOI] [Google Scholar]
  48. McCoy T. J., and Bingham E. T., 1988.  Cytology and cytogenetics of alfalfa, pp. 737–776 in Alfalfa and alfalfa improvement, Vol. 29, edited by Hanson A. A., Barnes D. K., Hill R. R.. American Society of Agronomy, Madison, WI. [Google Scholar]
  49. Meuwissen T. H. E., Hayes B. J., and Goddard M. E., 2001.  Prediction of total genetic value using genome-wide dense marker maps. Genetics 157: 1819–1829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Mollinari M., and Serang O., 2013.  Quantitative SNP genotyping of polyploids with MassARRAY and other platforms, pp. 215–241 in Methods in Molecular Biology, edited by Walker J. M. Springer, New York. [DOI] [PubMed] [Google Scholar]
  51. Müller D., Schopp P., and Melchinger A. E., 2017.  Persistency of prediction accuracy and genetic gain in synthetic populations under recurrent genomic selection. G3 (Bethesda) 7: 801–811. 10.1534/g3.116.036582 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Oliveira I. d. B., Resende M. F., Ferrão L. F. V., Amadeu R. R., Endelman J. B. et al. , 2019.  Genomic prediction of autotetraploids; influence of relationship matrices, allele dosage, and continuous genotyping calls in phenotype prediction. G3 (Bethesda) 9: 1189–1198. 10.1534/g3.119.400059 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Osborn T. C., Chris Pires J., Birchler J. A., Auger D. L., Jeffery Chen Z. et al. , 2003.  Understanding mechanisms of novel gene expression in polyploids. Trends Genet. 19: 141–147. 10.1016/S0168-9525(03)00015-5 [DOI] [PubMed] [Google Scholar]
  54. Park T., and Casella G., 2008.  The Bayesian Lasso. J. Am. Stat. Assoc. 103: 681–686. 10.1198/016214508000000337 [DOI] [Google Scholar]
  55. Pereira E. A., DallAgnol M., Saraiva K. M., Simioni C., Leães A. P. S. et al. , 2017.  Genetic gain in apomictic species of the genus Paspalum. Rev. Ceres 64: 60–67. 10.1590/0034-737x201764010009 [DOI] [Google Scholar]
  56. Pereira G. S., Garcia A. A. F., and Margarido G. R. A., 2018.  A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids. BMC Bioinformatics 19: 398 10.1186/s12859-018-2433-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Pérez P., and de los Campos G., 2014.  Genome-wide regression and prediction with the BGLR statistical package. Genetics 198: 483–495. 10.1534/genetics.114.164442 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Pernès J., 1975.  Organisation évolutive d’un groupe agamique: la section des Maximae du genre Panicum. Ph.D. thesis, Orstom. [Google Scholar]
  59. Resende R. M. S., Casler M. D., and de Resende M. D. V., 2014a Genomic selection in forage breeding: Accuracy and methods. Crop Sci. 54: 143–156. 10.2135/cropsci2013.05.0353 [DOI] [Google Scholar]
  60. Resende R. M. S., Jank L., Valle C. B. d., and Bonato A. L. V., 2014b Biometrical analysis and selection of tetraploid progenies of Panicum maximum using mixed model methods. Crop Sci. 54: 143–156. [Google Scholar]
  61. Resende R. T., Resende M. D. V., Silva F. F., Azevedo C. F., Takahashi E. K. et al. , 2017.  Assessing the expected response to genomic selection of individuals and families in Eucalyptus breeding with an additive-dominant model. Heredity 119: 245–255. 10.1038/hdy.2017.37 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Riedelsheimer C., Endelman J. B., Stange M., Sorrells M. E., Jannink J.-L. et al. , 2013.  Genomic predictability of interconnected bi-parental maize populations. Genetics 194: 493–503. 10.1534/genetics.113.150227 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Riedelsheimer C., Technow F., and Melchinger A. E., 2012.  Comparison of whole-genome prediction models for traits with contrasting genetic architecture in a diversity panel of maize inbred lines. BMC Genomics 13: 452 10.1186/1471-2164-13-452 [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Savidan Y., 1978.  Genetic control of facultative apomixis and application in breeding Panicum maximum.
  65. Schmitz Carley C. A., Coombs J. J., Douches D. S., Bethke P. C., Palta J. P. et al. , 2017.  Automated tetraploid genotype calling by hierarchical clustering. Theor. Appl. Genet. 130: 717–726. 10.1007/s00122-016-2845-5 [DOI] [PubMed] [Google Scholar]
  66. Schopp P., Müller D., Technow F., and Melchinger A. E., 2017.  Accuracy of genomic prediction in synthetic populations depending on the number of parents, relatedness, and ancestral linkage disequilibrium. Genetics 205: 441–454. 10.1534/genetics.116.193243 [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Schulz-Streeck T., Ogutu J. O., and Piepho H.-P., 2013.  Comparisons of single-stage and two stage approaches to genomic selection. Theor. Appl. Genet. 126: 69–82. 10.1007/s00122-012-1960-1 [DOI] [PubMed] [Google Scholar]
  68. Schwarz G., 1978.  Estimating a dimension of a model. Ann. Stat. 6: 461–464. 10.1214/aos/1176344136 [DOI] [Google Scholar]
  69. Serang O., Mollinari M., and Garcia A. A. F., 2012.  Efficient exact maximum a posteriori computation for Bayesian SNP genotyping in polyploids. PLoS One 7: e30906 10.1371/journal.pone.0030906 [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Slater A. T., Cogan N. O. I., Forster J. W., Hayes B. J., and Daetwyler H. D., 2016.  Improving genetic gain with genomic selection in autotetraploid potato. Plant Genome 9: 1–15. 10.3835/plantgenome2016.02.0021 [DOI] [PubMed] [Google Scholar]
  71. Spindel J., Begum H., Akdemir D., Virk P., Collard B. et al. , 2015.  Genomic selection and association mapping in rice (Oryza sativa): Effect of trait, genetic architecture, training population composition, marker number and statistical model on accuracy of rice genomic selection in elite, tropical rice breeding line. PLoS One 11: e1004982 10.1371/journal.pgen.1004982 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Stich B., and Inghelandt D. V., 2018.  Prospects and potential uses of genomic prediction of key performance traits in tetraploid potato. Front. Plant Sci. 9: 159 10.3389/fpls.2018.00159 [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Sverrisdóttir E., Byrne S., Sundmark E. H. R., Johnsen H. O., Grethe K. H. et al. , 2017.  Genomic prediction of starch content and chipping quality in tetraploid potato using genotyping-by-sequencing. Theor. Appl. Genet. 130: 2091–2108. 10.1007/s00122-017-2944-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Toledo-Silva G., Cardoso-Silva C. B., Jank L., and Souza A. P., 2013.  De Novo transcriptome assembly for the tropical grass Panicum maximum Jacq. PLoS One 8: e70781 10.1371/journal.pone.0070781 [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Uitdewilligen J. G. A. M. L., Wolters A.-M. A., D’hoop B. B., Borm T. J. A., Visser R. G. F. et al. , 2013.  A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato. PLoS One 8: 1–14. 10.1371/journal.pone.0062355 [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. VanRaden P. M., 2008.  Efficient methods to compute genomic predictions. J. Dairy Sci. 91: 4414–4423. 10.3168/jds.2007-0980 [DOI] [PubMed] [Google Scholar]
  77. Voorrips R. E., Gort G., and Vosman B., 2011.  Genotype calling in tetraploid species from bi-allelic marker data using mixture models. BMC Bioinformatics 12: 172 10.1186/1471-2105-12-172 [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Vos P. G., João Paulo M., Voorips R. E., Visser R. G. F., van Eck H. J. et al. , 2017.  Evaluation of LD decay and various LD-decay estimators in simulated and SNP-array data of tetraploid potato. Theor. Appl. Genet. 130: 123–135. 10.1007/s00122-016-2798-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Warmke H. E., 1954.  Apomixis in Panicum maximum. Am. J. Bot. 41: 5–11. 10.1002/j.1537-2197.1954.tb14297.x [DOI] [Google Scholar]
  80. Whittaker J. C., Thompson R., and Denham M. C., 2000.  Marker-assisted selection using ridge regression. Genet. Res. 75: 249–252. 10.1017/S0016672399004462 [DOI] [PubMed] [Google Scholar]
  81. Wickham H., 2009.  ggplot2: Elegant graphics for data analysis. Springer-Verlag New York.
  82. Wu K. K., Burnquist W., Sorrells M. E., Tew T. L., Moore P. H. et al. , 1992.  The detection and estimation of linkage in polyploids using single-dose restriction fragments. Theor. Appl. Genet. 83: 294–300. 10.1007/BF00224274 [DOI] [PubMed] [Google Scholar]
  83. Yang J., Benyamin B., McEvoy B. P., Gordon S., Henders A. K. et al. , 2010.  Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42: 565–569. 10.1038/ng.608 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Zhang X., Pérez-Rodríguez P., Burgueño J., Olsen M., Buckler E. et al. , 2017. Rapid cycling genomic selection in a multiparental tropical maize population. G3 (Bethesda) 7: 2315–2326. 10.1534/g3.117.043141 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

Phenotypic and molecular data and R code files can be found at: <https://github.com/leticia-lara/PM_data>. GBS data have been submitted to the NCBI Sequence Read Archive with the BioProject ID: PR-JNA510446. Additional information can also be found in the supplementary material. Figure S1 contains an illustrative scheme of the experimental design. Figure S2 contains the proportion of missing data in molecular dataset. Figure S3 contains the heatmaps of relationship matrices. Figure S4 contains the principal component analysis. Figure S5 contains the linkage disequilibrium analysis. Table S1 contains model selection for phenotypic traits. Table S2 contains the alignment rate for each reference genome. Table S3 contains the PRESS of predictive models for each trait. Table S4 contains the mean predictive ability of genomic selection models using tetraploid dosage for all markers. Table S5 contains the comparison between individual plant values based on half-sib family means and the genomic prediction of individuals. Supplemental material available at FigShare: https://doi.org/10.25387/g3.7762958.


Articles from G3: Genes|Genomes|Genetics are provided here courtesy of Oxford University Press

RESOURCES