Skip to main content
Wiley Open Access Collection logoLink to Wiley Open Access Collection
. 2020 Mar 17;13(1):e20004. doi: 10.1002/tpg2.20004

Implementing within‐cross genomic prediction to reduce oat breeding costs

Greg Mellers 1, Ian Mackay 2, Sandy Cowan 3, Irene Griffiths 3, Pilar Martinez‐Martin 3, Jesse A Poland 4, Wubishet Bekele 5, Nicholas A Tinker 5, Alison R Bentley 1,, Catherine J Howarth 3
PMCID: PMC8638661  PMID: 33016630

Abstract

A barrier to the adoption of genomic prediction in small breeding programs is the initial cost of genotyping material. Although decreasing, marker costs are usually higher than field trial costs. In this study we demonstrate the utility of stratifying a narrow‐base biparental oat population genotyped with a modest number of markers to employ genomic prediction at early and later generations. We also show that early generation genotyping data can reduce the number of lines for later phenotyping based on selections of siblings to progress. Using sets of small families selected at an early generation could enable the use of genomic prediction for adaptation to multiple target environments at an early stage in the breeding program. In addition, we demonstrate that mixed marker data can be effectively integrated to combine cheap dominant marker data (including legacy data) with more expensive but higher density codominant marker data in order to make within generation and between lineage predictions based on genotypic information. Taken together, our results indicate that small programs can test and initiate genomic predictions using sets of stratified, narrow‐base populations and incorporating low density legacy genotyping data. This can then be scaled to include higher density markers and a broadened population base.


Abbreviations

BLUP

Best linear unbiased prediction

CV

Cross‐validation

DArT

Diversity Array Technology

DiPR

Differentially penalized ridge regression

GBS

Genotyping‐by‐sequencing

GEBV

Genomic estimated breeding value

GS

Genomic selection

LD

Linkage disequilibrium

MCCV

Monte Carlo cross‐validation

RIL

Recombinant inbred line

RR‐BLUP

Ridge regression‐BLUP

SSD

Single‐seed descent

SNP

single nucleotide polymorphism

1. INTRODUCTION

The adoption of affordable genetic markers in breeding programs has expanded the use of accelerated, genomic‐based breeding approaches from genome‐wide information (Lorenzana & Bernardo, 2009; Morrell, Buckler, & Ross‐Ibarra, 2011). Genomic selection (GS) based on the selection of individuals using a genomic estimated breeding value (GEBV) can enable faster, more intense and more accurate selection (Heffner, Sorrells, & Jannink, 2009; Meuwissen, Hayes, & Goddard, 2001).

Ongoing research in crops has progressed beyond improving prediction accuracy and now centers on how best to employ GS within breeding programs (Arruda et al., 2015; Bassi, Bentley, Charmet, Ortiz, & Crossa, 2015; Jarquín et al., 2016; Norman, Taylor, Edwards, & Kuchel, 2018; Vivek et al., 2017), although the transition to practical implementation in small programs remains a challenge (Voss‐Fels, Cooper, & Hayes, 2019). This is predominantly due to the initial expense of genotyping existing germplasm. Recent work has shown that a modest number of markers can be sufficient to achieve accurate predictions in small populations with high linkage disequilibrium (LD; Gonen et al., 2018; Norman et al., 2018). It is necessary to consider how to gain value from the upfront cost of genotyping material that may not be progressed within an active breeding program. Additionally, training sets should logically be developed from breeding lines or populations (Akdemir & Isidro, 2019; Akdemir, Sanchez, & Jannink, 2015; Asoro, Newell, Beavis, Scott, & Jannink, 2011; Isidro et al., 2015; Ou & Liao, 2019; Rincent et al., 2012). Therefore, in small programs, the gradual generation and use of genotypic data in narrow‐based populations can support the longer‐term adoption of GS.

In a biparental crossing scheme, high levels of LD can be exploited to minimize genotyping cost. In cultivated oat (Avena sativa L.), high levels of long‐range LD (Esvelt Klos et al., 2016) and large haplotype blocks have been reported (Bekele, Wight, Chai, Howarth, & Tinker, 2018), along with clustering of Diversity Array Technology (DArT) markers (Tinker et al., 2009). Selfing limits the amount of recombination per generation, reducing LD dissipation, and increasing genetic variance between the resulting recombinant inbred lines (RILs), thus promoting the emergence of superior transgressive segregants (McClosky, LaCombe, & Tanksley, 2013). These factors make within‐cross GS feasible to assess performance relative to the other lines in the same (rather than different) subpopulations (Asoro et al., 2013; Gonen et al., 2018; Gorjanc et al., 2017a, 2017b). Previous work has tested biparental prediction approaches via simulation in a maize (Zea mays) genome (McClosky et al., 2013), concluding that gains attributable to selfing are achievable across different population sizes, trait heritabilities, and selection intensities. Lorenzana and Bernardo (2009) evaluated GS in two double haploid biparental barley (Hordeum vulgare) populations using historical trial data on production and quality traits and 223 polymorphic markers. They reported that the simple and computationally efficient best linear unbiased prediction (BLUP) approach was ideally suited to biparental GS. Additionally, extensive LD and large linkage blocks meant that fewer markers were needed for accurate predictions (Lorenzana & Bernardo, 2009).

Cultivated hexaploid oat is a cereal crop used to produce grain in temperate regions and forage in the subtropics (Hoffman, 1995). The allopolyploid oat genome is large (12.5 gigabases) and highly repetitive, making the large‐scale adoption of genomics‐based breeding methods difficult (Yan et al., 2016). Recent advances in genomic resources (e.g., Chaffin et al., 2016; Huang, Poland, Wight, & Jackson, 2014) mean GS is now more tractable for uptake within oat breeding programs. Previous work to evaluate the application of GS in elite‐cultivated North American oat lines for both production and quality traits demonstrated that GS could be effective even at modest marker density (∼every 2cM; Asoro et al., 2011), although no plateau was reached with low density DArT marker numbers. Comparison of GS to traditional phenotypic and marker‐assisted selection for the complex quality trait β‐glucan showed that the benefits of GS could be realized based on a per cycle basis via the scaling of selection to two cycles per year (Asoro et al., 2013). More recently, Bekele et al. (2018) described the prediction of heading date in a large cultivated oat panel, reporting a minimal increase in accuracy from increasing marker density. However, their results showed that prediction from genotyping‐by‐sequencing (GBS) derived single nucleotide polymorphisms (SNPs) gave higher prediction accuracy than using tag‐level haplotype markers (Bekele et al., 2018).

Here we report the implementation of genomic prediction within a biparental cross between two cultivated winter oat varieties, ‘Buffalo’ and ‘Tardis’. The population has been previously used to update the oat consensus map based on GBS‐derived, tag‐level haplotypes (Bekele et al., 2018). In this study the population was stratified for both genotyping (at the F2 and F7 generation) and phenotyping (segregated at the F3 generation with one stream of material progressed to field assessment and the other undergoing rapid single seed descent [SSD] to the F7 generation). Using low‐coverage genotypic information in the early generation, we investigate the recovery of missing phenotypes via genomic prediction, which is required for accurate representation of true phenotypic value and variance. We also extend this to the F7 generation to test prediction of missing yield data. We demonstrate that using mixed marker data is feasible—with both low cost dominant markers and more expensive co‐dominant markers integrated to improve accuracy.

Core Ideas

  • Predictions based on low coverage genotyping can recover missing phenotypes in early generations.

  • Mixed data types can be effectively integrated to improve prediction accuracy in oat.

  • Differentially penalized regression can optimally weight mixed data.

2. MATERIALS AND METHODS

2.1. Plant material, genotyping and phenotyping

An F2 mapping population of 194 individuals was produced from a cross between the two winter oat varieties ‘Buffalo’ and ‘Tardis’ at Aberystwyth University, United Kingdom. The population was created to capture key differences between the parents; ‘Buffalo’ is a dwarf variety with low kernel content and small grains and ‘Tardis’ is a conventional‐height variety with high kernel content and large grains. The DNA was extracted from the seedling leaves of F2 plants and the parents using a QIAGEN DNeasy 96 Plant Kit (QIAGEN, Crawley, United Kingdom) and genotyped using 121 polymorphic microsatellites (Dumlupinar et al., 2016; Jannink & Gardner, 2005; Li, Rossnagel, & Scoles, 2000; Pal, Sandhu, Domier, & Kolb, 2002; Wight, Yan, Fetch, Deyl, & Tinker, 2010; Wu, Zhang, Chen, & He, 2012) and with the oat DArT array (Tinker et al., 2009; Diversity Arrays Technology Pty, Canberra, Australia) which identified a further 424 polymorphic (dominant) loci.

From each F2 plant, families of F3 seed were harvested and multiplied in the field to produce F4 bulks to enable sufficient seed for replicated field trials. Each F2‐origin plant therefore defines a lineage with the resulting progeny forming a family and F3 and F4 genotypes are inferred from the F2. A RIL mapping population of 227 individuals was derived by SSD from individual seeds of the F3 plants, giving a slightly larger number of individuals than the initial 194‐line F2 population. The population size was increased by selecting single seeds from individual F3 dwarf and tall plants. Progeny were advanced through SSD to the F7 generation where leaf material was sampled for DNA extraction as previously described (Figure 1). In addition to microsatellites and DArT markers, GBS libraries were constructed following the oat protocol developed and described by Huang et al. (2014) and processed as reported in Bekele et al. (2018). In this analysis, 1,046 markers were used for the RILs, and between the F2 and RIL datasets there were 401 common markers, of which 100 were codominant and 301 were dominant. Genotype calls and map locations are integrated into The Triticeae Toolbox oat platform (http://triticeaetoolbox.org/oat/genotyping) as reported in Bekele et al., 2018. Stratification of the population for both phenotyping and genotyping is summarized in Figure 1. Phenotypic assessment for production‐related traits (maturity, ear emergence, Internode 1 length, kernel content, panicle length, panicle extrusion, winter hardiness, height, grain yield, mildew, hullability, grain length, grain width, and grain area) was conducted in either the field or polytunnel at the F2 (2005), F3 (2006), and F4 (2007–2010) generation. In addition, the F7 RILs were phenotyped (2010–2014) for both the production characteristics (as previously) and the quality trait grain β‐glucan content at the RIL (F7; 2010–2014) generation (Table 1). All field trials were conducted in Aberystwyth, United Kingdom (52.43 lat, 4.02 long) and used standard pre‐emergence and early spring weed control with no fungicides or growth regulators applied. Nitrogen fertilizer (70 kg ha−1) was applied in a split dose at GS31 and GS35 (Zadoks, Chang, & Konzak, 1974). The traits were assessed using a range of standard phenotyping methods, summarized in Supplemental Table S1. The number of individuals phenotyped for each trait varied, and data was averaged across trial entries to derive phenotypic means (Table 1).

FIGURE 1.

FIGURE 1

The stratification of within‐population advance of material in the ‘Buffalo’ × ‘Tardis’ population, including derivation of phenotyping and genotyping data used in this study

TABLE 1.

Within generation predictions of traits assessed on the ‘Buffalo’ × ‘Tardis’ population with different generations of phenotyping (GenP) and genotyping (GenG) assessed in a range of years in field (F) or polytunnel (PT) trials. The ridge regression best linear unbiased predictor (RR‐BLUP) predictions are made across the full set of available lines (All), and within Dw6 classes (Tall, Dwarf). SD, standard deviation

Trait value RR‐BLUP prediction
Trait GenP GenG Year assessed Trial n Mean SD All Tall Dwarf
Internode 1 length, cm F2 F2 2005 F 180 25.97 10.92 0.79 0.21 0.10
F4 F2 2007, 2008 F 92 34.81 10.06 0.34 0.25 0.00
F7 F7 2011, 2014 PT 227 36.36 12.51 0.83
Kernel content, % F4 F2 2007, 2008, 2010 F 180 63.53 3.99 0.60 0.40 0.26
F7 F7 2012 F 156 62.60 4.60 0.66
Maturity, d after 1 April F4 F2 2008 F 180 112.12 1.22 0.33 0.32 0.33
F7 F7 2013 F 156 95.96 1.62 0.79
Mildew F3 F2 2006 F 180 0.54 1.29 0.40 0.23 0.40
F7 F7 2011 PT 227 0.42 0.50 0.72
F7 F7 2014 PT 91 0.53 0.50 0.86
Panicle extrusion, cm F4 F2 2007, 2008 F 92 8.47 9.98 0.35 0.24 0.00
F7 F7 2011 PT 227 12.02 12.42 0.87
Winter hardiness F3 F2 2006 F 148 8.10 0.58 0.09 0.10 0.10
F4 F2 2007–2009 F 180 7.28 0.44 0.70 0.60 0.67
F7 F7 2011 F 184 1.81 1.13 0.47
F7 F7 2012 F 227 7.29 0.64 0.77
Grain yield, t ha−2 at 85% dry matter F4 F2 2007, 2008, 2010 F 180 1.07 0.34 0.63 0.00 0.30
F7 F7 2014 F 227 4.91 2.19 0.66
Ear emergence, d after 1 April F2 F2 2005 F 180 61.96 3.62 0.64 0.00 0.44
F4 F2 2007, 2008, 2010 F 87 66.42 1.94 0.79 0.34 0.00
F7 F7 2010, 2011 PT 227 78.65 16.60 0.72
F7 F7 2010–2013 F 227 70.44 5.23 0.71
Height, cm F2 F2 2005 F 180 101.38 28.00 0.81 0.00 0.44
F4 F2 2007, 2008, 2010 F 180 113.59 19.81 0.88 0.00 0.76
F7 F7 2010–2014 F 222 110.60 27.13 0.88
F7 F7 2011, 2014 PT 222 137.68 32.56 0.89
Grain length, mm F4 F2 2008 F 177 11.10 0.50 0.45 0.54 0.42
F7 F7 2013 F 150 13.22 0.70 0.61
Grain width, F4 F2 2008 F 177 3.40 0.10 0.61 0.50 0.59
F7 F7 2013 F 150 3.07 0.14 0.65
Hullability, % F4 F2 2008 F 180 75.01 6.40 0.36 0.34 0.37
F7 F7 2012 F 156 91.70 6.63 0.47
Panicle length, cm F2 F2 2005 F 180 21.52 2.73 0.52 0.34 0.22
F4 F2 2007, 2008 F 92 26.49 2.21 0.04 0.00 0.05
β‐glucan, % F7 F7 2012 F 155 4.16 0.29 0.68
F7 F7 2013 F 146 4.10 0.28 0.47
Grain area, mm2 F7 F7 2013 F 150 28.68 2.13 0.61

2.2. Genomic prediction models

Two genomic prediction methods were used: ridge regression‐BLUP (RR‐BLUP; Piepho, 2009) and differentially penalized regression (DiPR; Bentley et al., 2014; Ward, Rakszegi, Bedő, Shewry, & Mackay, 2015). The use of two methods allowed for validation of models, including testing the use of marker information in a single matrix against differential weighting of the dominant (DArT) and codominant (microsatellite and GBS) marker data combined using DiPR. The RR‐BLUP analysis used the package rrBLUP v4.6 (Endelman, 2011) in R v3.3.3 for Windows (R Core Team, 2016). Predictions were compared within a generation between lineages and between generations using an integrated data matrix. For the integrated data matrix, genotype data for dominant markers were attributed half scores to account for their uncertainty, akin to an imputed marker (i.e., AA or AB: 0.5; AB or BB: −0.5), whereas whole value scores were used for codominant marker data (i.e., AA: 1; AB: 0; BB: −1). Imputation of further missing marker data was performed using the random forest algorithm implemented with the R package missForest v1.4 (Stekhoven & Buhlmann, 2012) with 1,000 trees and using Chi‐squared tests for parameterization of the missForest model with artificially removed data. Five‐fold cross‐validation (CV) within generations was performed with 100 replications via Monte Carlo cross‐validation (MCCV; Xu & Liang, 2001). Cross‐validation between generations, between and within lineages, was performed with k‐fold CV (k = 2, 3, 4, and 5) to examine differential sampling depths from the available population (i.e., simulating a breeder having genotyped and phenotyped 0.50, 0.33, 0.25, or 0.20 of the early generation, respectively). Within generation predictions were made independently on Tall and Dwarf classes (as determined by F2 genotyping) to account for the known segregating Dw6 gene (Molnar et al., 2012) when the training set size was greater than 30 individuals. In the full dataset, 27% of lines were classified as Tall.

All prediction accuracies are reported as pairwise Pearson correlations. In the within‐generation models, Fisher's Z‐transformation was used to convert Pearson correlations to a normal distribution (as Z is normally distributed whereas r is not) before averaging and back‐conversion. In the between‐generation models, where all available marker and phenotype data common to both generations were used to train and predict from the early generation (F2, F3 and F4) to RILs, accuracy is reported as the pairwise Pearson correlation within a family. In the between‐generation, between‐lineage models, genotypes were randomly sampled without replacement 100 times according to that k‐fold CV analysis (where k = 2–5). Accuracy is reported for both within and between family Fisher's Z‐transformed mean Pearson correlations with back‐conversion across the 100 per‐k iterations.

To implement DiPR, the common marker data from F2 to RIL genotypes were divided into dominant (DArT) and codominant (microsatellite and GBS) marker types. Markers were thinned at an r2 value of .90 to prevent oversampling and an additive relationship matrix was derived for each marker type. These were linearly combined into a single matrix with separate weighting factors (w and 1−w), between w = 0 and w = 1 in 0.01 steps, to produce a single input to RR‐BLUP, as previously described (Bentley et al., 2014; Ward et al., 2015). Model fitting used the R package ‘RR‐BLUP’ (Endelman, 2011) and the optimal w‐value was determined as the maximum cross‐validation correlation. At w = 0, only the codominant markers contributed to the prediction and at w = 1, only the dominant markers contributed. The intervening weights use differential penalization consistent between matrices but with the two marker sets contributing to the additive relationship matrix proportional to their weighting (Bentley et al., 2014).

3. RESULTS

Across the early generation (F2) genotypes, there were 545 genotyped markers, of which 424 were dominant and 121 codominant. For the RIL (F7) population there were 1,046 codominant genotyped markers. Between the two datasets there were 401 common markers, of which 100 were codominant and 301 were dominant.

3.1. Within‐generation predictions

In order to determine the added value of early generation genotyping, within‐generation models were tested. Predictions were made using F2 genotype data to predict phenotypes at the F2, F3, and F4 level, while RIL genotype data (F7) was used to predict phenotypes at the RIL level (F7). A total of 15 phenotypes were predicted with 12 predicted at both the early and later RIL generation, and all data are presented in Table 1. At the F2 genotype level, all the phenotypes were compared across all lines as well as within Dw6 genotypic Tall and Dwarf classes. The accuracy of prediction varied across traits and generations. The traits that were predicted to the highest levels of accuracy across generations were height (range .81–.89), ear emergence (.64–.79), and kernel content (.60–.66). For the majority of traits, the accuracy of prediction was higher when using F7 genotypic and phenotypic data compared to predicting in early generations (kernel content, maturity, mildew, panicle extrusion, grain yield, grain length, grain width, and hullability). Variation was observed for the accuracy of trait prediction when using different phenotyping generations or trial years for some traits including Internode 1 length (.34 from F4 compared to .79 from F2 and .83 from F7 phenotypes) and winter hardiness (.09 from F3, .47 in 2011 F7 trials to .70 from F4, and .77 from F7 in 2012 phenotypes). Predicting within Dw6 classes gave generally low predictions for all traits when compared to predicting across the full dataset, with the exception of height, grain length, and width in the F4 and the overall low prediction traits (maturity, mildew, winter hardiness, and panicle length).

3.2. Between generation predictions

To examine whether between‐lineage predictions were possible from early to late generations, predictions were made within and between lineages as well as across all available data. The phenotypic correlation between early generation and RIL phenotypes was used as a proxy for the accuracy of imposing selection on phenotype alone at the F4 generation for comparative purposes. Nine traits were selected for comparison to assess differences in predictive accuracy between the early and late generations and all data are presented in Table 2. In general, this showed that low‐level genotyping (combined with phenotyping) in the early generation was sufficient to allow relatively accurate predictions to be made for later generation (F7) phenotypes with the exception of kernel content (.66 for all markers, dropping to .39 with 50% of individuals). High prediction accuracies were maintained for Internode 1 length for predictions from the F2 to F7 (.58–.76) and F4 to F7 (.54–.77) across genotyping coverage as well as for panicle extrusion, grain yield, and height (Table 2). Where predictions overall were low (maturity, mildew, and winter hardiness), accuracies were maintained or slightly reduced with coverage. For ear emergence, the predictions from early generation to field grown F7 lines were high overall (.62) and showed a slow pattern of reduction with genotyping density, but predictions from early generations to F7 polytunnel phenotypes were low (.29 for F2, .12 for F3). This was not the case for height, with predictions stable across both field and polytunnel trials (Figure 2).

TABLE 2.

Comparing within and between generation predictions in the ‘Buffalo’ × ‘Tardis’ population using training populations of early generation phenotyped (GenP) individuals (Phenn) for common traits assessed in a test set composed of F7 recombinant inbred lines (RILs) in either the field (F) or polytunnel (PT). The true correlation between phenotypes (CorrP) is used as a proxy for early phenotypic selection. A comparison is made on the change in accuracy of prediction within and between generation when variable numbers of F2 individuals are genotyped (all including Dw6 Tall and Dwarf class, 50, 33, 25, and 20%)

Accuracy of prediction based on proportion of F2 individuals genotyped
Training set Test set (F7)   Genotype all (F2) Genotype 0.50 Genotype 0.33 Genotype 0.25 Genotype 0.20
Trait GenP Phenn Trial Phenn CorrP All Tall Dwarf Within Between Within Between Within Between Within Between
Internode 1 length F2 161 PT 213 0.38 0.76 0.00 0.22 0.73 0.71 0.70 0.67 0.67 0.62 0.64 0.58
F4 81 PT 213 0.77 0.77 0.30 0.21 0.75 0.69 0.74 0.65 0.72 0.59 0.72 0.54
Kernel content F4 161 F 152 0.10 0.66 0.49 0.26 0.39 0.39 0.39 0.39 0.39 0.39 0.39 0.39
Maturity F4 161 F 152 0.22 0.36 0.23 0.16 0.33 0.31 0.31 0.28 0.30 0.25 0.29 0.23
Mildew F3 161 F 137 0.21 0.49 0.34 0.55 0.49 0.47 0.47 0.46 0.46 0.43 0.44 0.41
Panicle extrusion F4 81 PT 213 0.86 0.82 0.21 0.00 0.81 0.73 0.81 0.69 0.80 0.64 0.80 0.60
Winter hardiness F3 136 F 213 0.22 0.43 0.36 0.18 0.34 0.30 0.29 0.25 0.27 0.21 0.25 0.19
F4 159 F 213 0.40 0.61 0.63 0.64 0.56 0.55 0.53 0.50 0.50 0.45 0.49 0.42
Grain yield F4 161 F 213 0.38 0.63 0.03 0.25 0.57 0.53 0.52 0.46 0.48 0.40 0.46 0.36
Ear emergence F2 161 PT 213 0.05 0.29 0.11 0.13 0.24 0.25 0.20 0.21 0.17 0.19 0.15 0.17
F3 91 PT 213 0.04 0.12 0.00 0.00 0.00 0.12 0.00 0.08 0.00 0.06 0.00 0.05
F2 161 F 213 0.19 0.62 0.15 0.00 0.56 0.57 0.51 0.51 0.47 0.47 0.44 0.42
F3 91 F 213 0.54 0.62 0.03 0.00 0.53 0.57 0.52 0.51 0.51 0.47 0.50 0.44
Height F2 161 PT 208 0.37 0.81 0.18 0.19 0.78 0.76 0.75 0.72 0.71 0.66 0.68 0.61
F3 161 PT 208 0.50 0.86 0.11 0.05 0.84 0.82 0.81 0.78 0.78 0.73 0.75 0.68
F2 161 F 208 0.37 0.82 0.04 0.46 0.79 0.77 0.75 0.73 0.73 0.68 0.68 0.63
  F3 161 F 208 0.50 0.87 0.20 0.05 0.84 0.83 0.81 0.78 0.78 0.74 0.76 0.69

FIGURE 2.

FIGURE 2

Comparison of changes in ridge regression best linear unbiased prediction accuracy from the F2 to F7 generation in ‘Buffalo’ × ‘Tardis’ recombinant inbred lines for height and ear emergence in the field (F) and polytunnel (PT) based on varying proportions of genotyped F2 individuals

3.3. Comparing methods for handling mixed marker data

Dominant markers provide less information than codominant markers but are cheaper to generate, meaning that a greater number are likely to be available (or required) to generate accurate predictions. The DiPR was implemented across nine traits (Table 3) to assess the predictive advantage of proportionally combining marker types in a single additive relationship matrix. When compared to predictions based on a single marker type using RR‐BLUP, the DiPR method performed as well or better than a RR‐BLUP model using a single matrix with results summarized in Table 3. Low RR‐BLUP predictions for maturity (.36), mildew (.49), and winter hardiness (F3 predictions .43) were all improved through the implementation of DiPR (.47, .52, and .48, respectively) although their optimal weighting factors varied. Differential weighting for these low‐prediction traits showed that only using codominant markers (DiPR wopt = 0.00) improved maturity and winter hardiness predictions whereas using only dominant markers (wopt = 1.00) optimized prediction of mildew.

TABLE 3.

Comparison of methods to handle mixed dominant and codominant marker types for predictions in the ‘Buffalo’ × ‘Tardis’ population. The training population consists of early generation phenotyped (GenP) individuals (Phenn) for common traits assessed in the test set composed of F7 recombinant inbred lines (RILs) in either the field (F) or polytunnel (PT). Prediction is compared between standard ridge regression best linear unbiased predictor (RR‐BLUP) and differentially penalized ridge regression (DiPR), with the associated optimal weight value (DiPR wopt) given

Training set Test set (F7) Prediction accuracy  
Trait GenP Phenn Trial Phenn RR‐BLUP DiPR DiPR wopt
Internode 1 length F2 161 PT 213 0.76 0.77 0.29
F4 81 PT 213 0.77 0.76 0.93
Kernel content F4 161 F 152 0.66 0.66 0.22
Maturity F4 161 F 152 0.36 0.47 0.00
Mildew F3 161 F 137 0.49 0.52 1.00
Panicle extrusion F4 81 PT 213 0.82 0.84 0.03
Winter hardiness F3 136 F 213 0.43 0.48 0.00
F4 159 F 213 0.61 0.70 0.00
Grain yield F4 161 F 213 0.63 0.69 0.23
Ear emergence F2 161 PT 213 0.29 0.24 0.90
F3 91 PT 213 0.12 0.09 0.42
F2 161 F 213 0.62 0.60 0.68
F3 91 F 213 0.62 0.63 0.27
Height F2 161 PT 208 0.81 0.82 0.24
F2 161 F 208 0.82 0.86 0.01
F3 161 PT 208 0.86 0.85 0.26
  F3 161 F 208 0.87 0.87 0.05

4. DISCUSSION

As genotyping costs fall, there is an opportunity to use genomic prediction to reduce the number of individuals phenotyped in within‐cross breeding populations. We demonstrate that it is feasible to use the genotypic information from a full set of biparental lines to make within generation, between‐lineage genomic predictions. This can recover information on missing phenotypic data to improve selection resilience representing an added value to early generation genotyping beyond deselection of unfavorable alleles, as previously described for wheat (Triticum aestivum; He et al., 2016) and soybean (Glycine max; Ma et al., 2016). Early generation genotypic data can also be used to reduce the number of lines required in later generation phenotyping based on siblings progressed to generate stable, genotyped homozygous lines. Our results demonstrate that early generation genotyping need not cover the full population in order to attain accuracies in line with true trait correlation between early and late generation phenotypes (a proxy for selecting on early generation phenotypes alone), as has been previously shown in small populations (Wong & Bernardo, 2008). We therefore propose that strong within‐cross selection could be imposed early in a breeding cycle whilst retaining accuracy of selection. Prior simulations of within‐cross genomic prediction have been reported in maize, suggesting that gains plateau with selfing rounds, with the F4 capturing 90% of the F8 gains due to an increase in the maximal breeding value of the population (McClosky et al., 2013). If these gains can be identified within lineages in the early stages, then accurate selection could be imposed before phenotypic selection.

In this study we performed predictions with F2 and F7 (RIL) genotype data. In the first instance, F2 genotypes were employed to make models with early generation (F2, F3, and F4) phenotypes with 80% of available phenotype data as the training set and 20% as the test set. This simulates lost data in early generation phenotyping when an accurate representation of the cross’ phenotypic value is required for selection. A major limitation to the implementation of GS within small breeding programs is the high upfront genotyping cost (Varshney et al., 2012). Our results indicate that there is an advantage to early generation genotyping, and that this need not be at high coverage in order to provide value to between‐generation RIL performance prediction.

Prediction within the F7 RILs had generally high accuracy and demonstrates potential savings in later stage phenotyping costs. Where seed is generated for RIL phenotyping in a shuttle breeding framework (Borlaug, 1968; Forster et al., 2015), there is a requirement to transfer substantial quantities of seed between environments. Our data indicates that an alternative to the movement of large amounts of seed could be to use separate sets of families to be tested in multiple target environments and to use within‐generation prediction for the missing environment performance. However, this would need to be empirically tested as the effect of environmental variability on robustness of prediction are well documented (Burgueño, de los Campos, Weigel, & Crossa, 2012; Jarquín et al., 2014). This would be particularly attractive in Europe where out‐of‐season multiplication takes place in climatically matched environments in the southern hemisphere, representing a major cost. Using sets of small families could also enable the use of GS for adaptation to multiple target environments at an early stage in the breeding program. This is currently limited by seed availability and would have both cost and logistical advantages in using sibling predictions to avoid phytopathological quarantine requirements.

Our data indicate that between family predictions across generations could allow for earlier lineage selection. Early selection is currently limited due to high levels of heterozygosity and uncertain phenotypic value of lineages. However, if a portion of the F2 lineages are genotyped (as single plants), and their derived F4 field phenotypes (based on siblings from F3 rows) are used in conjunction with low‐coverage F2 genotyping, a genomic prediction model could be developed. Following subsequent production and genotyping of fixed RILs, a prediction can be used to select which of the cross’ lineages are likely to perform best and reduce the number of entries into fully replicated field trials, therefore accelerating the breeding cycle (Jannink, Lorenz, & Iwata, 2010). This offers the ability to use F2:4 families to predict F7s derived from different F2s and to rapidly generate F7 lines while producing a prediction equation over one or two seasons of yield testing. Selection among the F7 is then made on the predicted trait values. This theoretical program design is summarized in Supplemental Figure S1.

We also compared different proportions of F2 genotyping as an approximation for a breeder varying the level of financial investment in F2 genotyping, with all derived RILs then being genotyped. There was a reduction in predictive accuracy as the proportion of F2 genotyping declined although, even at low representation, some traits could still be predicted to the same levels as for phenotypic selection at the F4 generation. Similar results have previously been shown in biparental maize population simulations (Bernardo & Yu, 2007).

The employment of within‐cross predictions reported here must be tailored to the existing breeding program, particularly with respect to number of crosses per cycle and selection intensity in order to ensure financial viability. The evaluation of economic aspects of GS implementation are essential for wider application (Abed, Pérez‐Rodríguez, Crossa, & Belzile, 2018). However, given the ability to achieve rapid generation time (Watson et al., 2018), our accuracy results suggest that selections could be made much earlier, although this remains to be empirically tested within breeding programs. Given that between‐lineage accuracies are similar to within‐lineage accuracies, our data suggest that independent families can be used to predict across lineages. In addition to showing that between‐lineage prediction is possible, we also show that F3 and F4 segregated material (as used in shuttle breeding or remote testing) can be used to reduce the costs associated with multi‐environment testing. Bekele et al. (2018) recently demonstrated heading date prediction accuracies of up to .67 in independent training and test populations. The accumulation of data from many crosses also represents a first step to the full implementation of GS within a program (Gorjanc et al., 2017a; Sverrisdóttir et al., 2017; Edwards et al., 2019). However, we note that further work is required to compare the within‐cross predictions reported here to wider performance across a breeding program with analysis of all crosses jointly (Jannink et al., 2010). The longer‐term adoption and implementation of a multi‐subpopulation training population (de Roos, Hayes, & Goddard, 2009) offers an attractive gradual adoption model for GS in small programs if LD can be maintained with higher marker densities (Asoro et al., 2011).

Finally, we demonstrate that the use of mixed marker data can be optimized using DiPR. Although dominant marker use is declining, they still represent the cheapest genotyping method for low‐resource crops and much legacy data exists. Dominant markers are less informative than codominant markers and their use can be problematic for GS across generations because of varying levels of heterozygosity that cannot be accounted for. We considered an alternative to a linear combination of dominant and codominant markers that separately weighted marker types as components of a single additive relationship matrix. Implemented as DiPR (Bentley et al., 2014; Ward et al., 2015), this showed that for some traits an optimized weighted combination of the two marker types improved prediction accuracy compared to a combined matrix using all available data. When the weight factor (w) was zero, only codominant data was used in the prediction. As the weight tends toward one, more weight is applied to the dominant marker data. For example, for kernel content (training: F4 2007, 2008, 2010; test: RIL 2012) an intermediate optimal solution (wopt = 0.22) was found. This compares to mildew (training: F3 2006; test: RIL 2012) which had a dominant marker optimum (wopt = 1.00) and winter hardiness (training: F3 2006 and F4 2007; test: RIL 2012) which had a codominant marker optimum (wopt = 0.00). Although dominant markers have been largely superseded by SNP‐based methods of genotyping, our results indicate that for some traits they provide useful information. The low frequency or uneven distribution of SNP markers across the oat genome (Bekele et al., 2018) may explain why the dominant markers used here made higher, or complete contributions to optimal predictions for disease (typically a dominant genetic effect, controlled by a limited number of loci; Okoń & Ociepa, 2018). Conversely, winter hardiness was optimally predicted from codominant markers and it is a documented complex, quantitative trait that has limited tractability in oat breeding programs (Chawade et al., 2012). Therefore, we propose that the genetic architecture of a trait combined with marker coverage are determinants of optimal DiPR weighting.

Our findings are potentially useful for other studies looking to combine data types in predictions. Asoro et al. (2013) previously proposed the use of selection criteria to weight low‐frequency favorable alleles in GS to avoid loss of diversity with increasing gains for β‐glucan in oat breeding. We also demonstrate the utility of within‐ and between‐generation predictions in a narrow‐base oat population. The predictions reported here would have benefits to a breeding program where genotyping costs are less than field trial costs. In this study we use a modest number of individuals and markers, but scaling to higher density markers, larger numbers of individuals, and broadening the population base are all opportunities for achieving future breeding gains.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

Supporting information

Supplemental Table S1. Phenotypic traits assessed in the ‘Buffalo’ × ‘Tardis’ population in this study including trait name and method used for assessment.

Supplemental Figure S1. Theoretical program design to increase selection pressure within a cross using genomic prediction based on development of a predictive model from F2 genotypes and F4 phenotypes applied to estimate between lineage performance in recombinant inbred lines (RILs). We propose that this scheme would permit stronger selection within cross without a loss of accuracy.

ACKNOWLEDGMENTS

This paper is dedicated to the memory of our friend and colleague Dr. Greg Mellers who passed away on Friday 6th September 2019, aged 29. This work was supported by Biotechnology and Biological Sciences Research Council (BBSRC) grant BB/M000869/1 to IBERS and NIAB through the InnovOat project (www.innovoat.uk). Alison Bentley is supported by the BBSRC Cross‐Institute Strategic Program “Designing Future Wheat” BB/P016855/1.

AUTHOR CONTRIBUTIONS

SC, IG, PM‐M, and CJH developed and phenotyped the population, WB, NAT, and CJH genotyped the material, GM performed the data analysis, ARB, CJH, and IM conceived the study, GM and ARB drafted the manuscript, all authors read and approved the final manuscript.

Mellers G, Mackay I, Cowan S, et al. Implementing within‐cross genomic prediction to reduce oat breeding costs. Plant Genome. 2020;13:e20004. 10.1002/tpg2.20004

DATA AVAILABILITY

The raw data used for the analysis reported in this study is available from the Triticeae Toolbox oat platform (http://triticeaetoolbox.org/oat/genotyping).

REFERENCES

  1. Abed, A. , Pérez‐Rodríguez, P. , Crossa, J. , & Belzile, F. (2018). When less can be better: How can we make genomic selection more cost‐effective and accurate in barley? Theoretical and Applied Genetics, 131, 1873–1890. 10.1007/s00122-018-3120-8 [DOI] [PubMed] [Google Scholar]
  2. Akdemir, D. , & Isidro‐Sánchez, J. (2019). Design of training populations for selective phenotyping in genomic prediction. Scientific Reports, 9. 10.1038/s41598-018-38081-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Akdemir, D. , Sanchez, J. I. , & Jannink, J.‐L. (2015). Optimization of genomic selection training populations with a genetic algorithm. Genetics Selection, Evolution, 47. 10.1186/s12711-015-0116-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Arruda, M. P. , Brown, P. J. , Lipka, A. E. , Krill, A. M. , Thurber, C. , & Kolb, F. L. (2015). Genomic selection for predicting fusarium head blight resistance in a wheat breeding program. The Plant Genome, 8. 10.3835/plantgenome2015.01.0003 [DOI] [PubMed] [Google Scholar]
  5. Asoro, F. G. , Newell, M. A. , Beavis, W. D. , Scott, M. P. , & Jannink, J.‐L. (2011). Accuracy and training population design for genomic selection on quantitative traits in elite North American oats. The Plant Genome, 4, 132–144. 10.3835/plantgenome2011.02.0007 [DOI] [Google Scholar]
  6. Asoro, F. G. , Newell, M. A. , Beavis, W. D. , Scott, M. P. , Tinker, N. A. , & Jannink, J.‐L. (2013). Genomic, marker‐assisted, and pedigree‐BLUP selection methods for β‐glucan concentration in elite oat. Crop Science, 53, 1894–1906. [Google Scholar]
  7. Bassi, F. M. , Bentley, A. R. , Charmet, G. , Ortiz, R. , & Crossa, J. (2015). Breeding schemes for the implementation of genomic selection in wheat (Triticum spp.). Plant Science, 242, 23–26. 10.1016/j.plantsci.2015.08.021 [DOI] [PubMed] [Google Scholar]
  8. Bekele, W. A. , Wight, C. P. , Chai, S. , Howarth, C. J. , & Tinker, N. A. (2018). Haplotype‐based genotyping‐by‐sequencing in oat genome research. Plant Biotechnology Journal, 16, 1452–1463. 10.1111/pbi.12888 [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bentley, A. R. , Scutari, M. , Gosman, N. , Faure, S. , Bedford, F. , Howell, P. , … Mackay, I. J. (2014). Applying association mapping and genomic selection to the dissection of key traits in elite European wheat. Theoretical and Applied Genetics, 127, 2619–2633. 10.1007/s00122-014-2403-y [DOI] [PubMed] [Google Scholar]
  10. Bernardo, R. , & Yu, J. (2007). Prospects for genome‐wide selection for quantitative traits in maize. Crop Science, 47, 1082–1090. 10.2135/cropsci2006.11.0690 [DOI] [Google Scholar]
  11. Borlaug, N. E. (1968). Wheat breeding and its impact on world food supply. In Finlay K. W. & Shephard K. W. (Eds.), Proceedings of the 3rd International Wheat Genetics Symposium (pp. 1–36). Canberra, Australia: Australian Academy of Sciences. [Google Scholar]
  12. Burgueño, J. , de los Campos, G. , Weigel, K. , & Crossa, J. (2012). Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Science, 52, 707–719. 10.2135/cropsci2011.06.0299 [DOI] [Google Scholar]
  13. Chaffin, A. S. , Huang, Y.‐F. , Smith, S. , Bekele, W. A. , Babiker, E. , Gnanesh, B. N. , … Tinker, N. A. (2016). A consensus map in cultivated hexaploid oat reveals conserved grass synteny with substantial sub‐genome rearrangement. The Plant Genome, 9, 10.3835/plantgenome2015.10.0102 [DOI] [PubMed] [Google Scholar]
  14. Chawade, A. , Lindén, P. , Bräutigam, M. , Jonsson, R. , Jonsson, A. , Moritz, T. , & Olsson, O. (2012). Development of a model system to identify differences in spring and winter oat. PLOS ONE, 7. 10.1371/journal.pone.0029792 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. de Roos, A. P. W. , Hayes, B. J. , & Goddard, M. E. (2009). Reliability of genomic breeding values across multiple populations. Genetics, 183, 1545–1553. 10.1534/genetics.109.104935 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Dumlupinar, Z. , Brown, R. , Campbell, R. , Anderson, J. , Bonman, J. M. , Carson, M. , … Jackson, E. (2016). The art of attrition: development of robust oat microsatellites. Plant Breeding, 135, 323–334. 10.1111/pbr.12362 [DOI] [Google Scholar]
  17. Edwards, S. M. , Buntjer, J. B. , Jackson, R. , Bentley, A. R. , Lage, J. , Byrne, E. , … Hickey, J. M. (2019). The effects of training population design on genomic prediction accuracy in wheat. Theoretical and Applied Genetics, 132, 1943–1952. 10.1007/s00122-019-03327-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Endelman, J. B. (2011). Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome, 4, 250–255. 10.3835/plantgenome2011.08.0024 [DOI] [Google Scholar]
  19. Esvelt Klos, K. , Huang, Y.‐F. , Bekele, W. A. , Obert, D. E. , Babiker, E. , Beattie, A. D. , … Tinker, N. A. (2016). Population genomics related to adaptation in elite oat germplasm. The Plant Genome, 9, 10.3835/plantgenome2015.10.0103 [DOI] [PubMed] [Google Scholar]
  20. Forster, B. , Till, B. J. , Ghanim, A. M. A. , Huynh, H. O. A. , Burstmayr, H. , & Caligari, P. D. S. (2015). Accelerated plant breeding. CAB Reviews, 9. 10.1079/PAVSNNR20149043 [DOI] [Google Scholar]
  21. Gonen, S. , Wimmer, V. , Gaynor, R. C. , Byrne, E. , Gorjanc, G. , & Hickey, J. (2018). A heuristic method for fast and accurate phasing and imputation of single‐nucleotide polymorphism data in bi‐parental plant populations. Theoretical and Applied Genetics, 131, 2345–2357. 10.1007/s00122-018-3156-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Gorjanc, G. , Battagin, M. , Dumasy, J.‐F. , Antolin, R. , Gaynor, R. C. , & Hickey, J. M. (2017a). Prospects for cost‐effective genomic selection via accurate within‐family imputation. Crop Science, 57, 216–228. 10.2135/cropsci2016.06.0526 [DOI] [Google Scholar]
  23. Gorjanc, G. , Dumasy, J.‐F. , Gonen, S. , Gaynor, R. C. , Antolin, R. , & Hickey, J. M. (2017b). Potential of low‐coverage genotyping‐by‐sequencing and imputation for cost‐effective genomic selection in biparental segregating populations. Crop Science, 57, 1404–1420. 10.2135/cropsci2016.08.0675 [DOI] [Google Scholar]
  24. He, S. , Schulthess, A. W. , Mirdita, V. , Zhao, Y. , Korzun, V. , Bothe, R. , … Jiang, Y. (2016). Genomic selection in a commercial winter wheat population. Theoretical and Applied Genetics, 129, 641–651. 10.1007/s00122-015-2655-1 [DOI] [PubMed] [Google Scholar]
  25. Heffner, E. L. , Sorrells, M. E. , & Jannink, J.‐L. (2009). Genomic selection for crop improvement. Crop Science, 49, 1–12. 10.2135/cropsci2008.08.0512 [DOI] [Google Scholar]
  26. Hoffman, L. A. (1995). World production and use of oats. In Welch R. W. (Ed.), The oat crop: Production and utilization (pp. 34–61). London, England: Chapman & Hall. [Google Scholar]
  27. Huang, Y. F. , Poland, J. A. , Wight, C. P. , Jackson, E. W. , Tinker, N. A. (2014). Using Genotyping‐By‐Sequencing (GBS) for genomic discovery in cultivated oat. PLOS ONE, 9. 10.1371/journal.pone.0102448 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Isidro, J. , Jannink, J.‐L. , Akdemir, D. , Poland, J. , Heslot, N. , & Sorrells, M. E. (2015). Training set optimization under population structure in genomic selection. Theoretical and Applied Genetics, 128, 145–158. 10.1007/s00122-014-2418-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Jannink, J.‐L. , & Gardner, S. W. (2005). Expanding the pool of PCR‐based markers for oat. Crop Science, 45, 2383–2387. 10.2135/cropsci2005.0285 [DOI] [Google Scholar]
  30. Jannink, J.‐L. , Lorenz, A. J. , & Iwata, H. (2010). Genomic selection in plant breeding: From theory to practice. Briefings in Functional Genomics, 9, 166–177. 10.1093/bfgp/elq001 [DOI] [PubMed] [Google Scholar]
  31. Jarquín, D. , Crossa, J. , Lacaze, Z. , Du Cheyron, P. , Daucourt, J. , Lorgeuo, J. , … de los Campos, G. (2014). A reaction norm model for genomic selection using high‐dimensional genomic and environmental data. Theoretical and Applied Genetics, 127, 595–607. 10.1007/s00122-013-2243-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Jarquín, D. , Kocak, K. , Posadas, L. , Hyma, K. , Jedlicka, J. , Graef, G. , & Lorenz, A. (2016). Genotyping by sequencing for genomic prediction in a soybean breeding population. BMC Genomics, 15. 10.1186/1471-2164-15-740 [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Li, C. D. , Rossnagel, B. G. , & Scoles, G. J. (2000). The development of oat microsatellite markers and their use in identifying relationships among Avena species and oat cultivars. Theoretical and Applied Genetics, 101, 1259–1268. 10.1007/s001220051605 [DOI] [Google Scholar]
  34. Lorenzana, R. E. , & Bernardo, R. (2009). Accuracy of genotypic value predictions for marker‐based selection in biparental plant populations. Theoretical and Applied Genetics, 120, 151–161. 10.1007/s00122-009-1166-3 [DOI] [PubMed] [Google Scholar]
  35. Ma, Y. , Reif, J. C. , Jiang, Y. , Wen, Z. , Wang, D. , Liu, Z. ,…Qiu, L. (2016). Potential of marker selection to increase prediction accuracy for genomic selection in soybean (Glycine max L.). Molecular Breeding, 36. 10.1007/s11032-016-0504-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. McClosky, B. , LaCombe, J. , & Tanksley, S. D. (2013). Selfing for the design of genomic selection experiments in biparental plant populations. Theoretical and Applied Genetics, 126, 2907–2920. 10.1007/s00122-013-2182-x [DOI] [PubMed] [Google Scholar]
  37. Meuwissen, T. H. E. , Hayes, B. J. , & Goddard, M. E. (2001). Prediction of total genetic value using genome‐wide dense marker maps. Genetics, 157, 1819–1829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Molnar, S. J. , Chapados, J. T. , Satheeskumar, S. , Wight, C. P. , Bancroft, B. , Orr, W. , … Kibite, S. (2012). Comparative mapping of the oat Dw6/dw6 dwarfing locus using NILs and association with vacuolar proton ATPase subunit H. Theoretical and Applied Genetics, 124, 1115–1125. 10.1007/s00122-011-1773-7 [DOI] [PubMed] [Google Scholar]
  39. Morrell, P. L. , Buckler, E. S. , & Ross‐Ibarra, J. (2011). Crop genomics: Advances and applications. Nature Reviews Genetics, 13, 85–96. 10.1038/nrg3097 [DOI] [PubMed] [Google Scholar]
  40. Norman, A. , Taylor, J. , Edwards, J. , & Kuchel, H. (2018). Optimising genomic selection in wheat: Effect of marker density, population size and population structure on prediction accuracy. G3: Genes, Genomes, Genetics, 8, 2889–2899. 10.1534/g3.118.200311 [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Okoń, S. M. , & Ociepa, T. (2018). Effectiveness of new sources of resistance against oat powdery mildew identified in A. sterilis. Journal of Plant Diseases and Protection, 125, 505–510. 10.1007/s41348-018-0171-7 [DOI] [Google Scholar]
  42. Ou, J.‐H. , & Liao, C.‐T. (2019). Training set determination for genomic selection. Theoretical and Applied Genetics, 132, 2781–2792. 10.1007/s00122-019-03387-0 [DOI] [PubMed] [Google Scholar]
  43. Pal, N. , Sandhu, J. S. , Domier, L. L. , & Kolb, F. L. (2002). Development and characterization of microsatellites and RFLP‐derived PCR markers in oat. Crop Science, 42, 912–918. 10.2135/cropsci2002.9120 [DOI] [Google Scholar]
  44. Piepho, H. P. (2009). Ridge regression and extensions for genomewide selection in maize. Crop Science, 49, 1165–1176. 10.2135/cropsci2008.10.0595 [DOI] [Google Scholar]
  45. R Core Team . (2016). R: A language and environment for statistical computing. Retrieved from http://www.R-project.org/
  46. Rincent, R. , Laloë, D. , Nicolas, S. , Altmann, T. , Brunel, D. , Revilla, P. , … Moreau, L. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: Comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics, 192, 715–728. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Stekhoven, D. J. , & Buhlmann, P. (2012). MissForest–non‐parametric missing value imputation for mixed‐type data. Bioinformatics, 28, 112–118. 10.1093/bioinformatics/btr597 [DOI] [PubMed] [Google Scholar]
  48. Sverrisdóttir, E. , Byrne, S. , Sundmark, E. H. R. , Johnsen, H. Ø. , Kirk, H. G. , Asp, T. , … Nielsen, K. L. (2017). Genomic prediction of starch content and chipping quality in tetraploid potato using genotyping‐by‐sequencing. Theoretical and Applied Genetics, 130, 2091–2108. 10.1007/s00122-017-2944-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Tinker, N. A. , Kilian, A. , Wight, C. P. , Heller‐Uszynska, K. , Wenzl, P. , Rines, H. W. , … Langdon, T. (2009). New DArT markers for oat provide enhanced map coverage and global germplasm characterization. BMC Genomics, 10, 39. 10.1186/1471-2164-10-39 [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Varshney, R. K. , Ribaut, J.‐M. , Buckler, E. S. , Tuberosa, R. , Rafalski, J. A. , & Langridge, P. (2012). Can genomics boost productivity of orphan crops? Nature Biotechnology, 30, 1172–1176. 10.1038/nbt.2440 [DOI] [PubMed] [Google Scholar]
  51. Vivek, B. S. , Krishna, G. K. , Vengadessan, V. , Babu, R. , Zaidi, P. H. , Kha, L. Q. , … Crossa, J. (2017). Use of genomic estimated breeding values results in rapid genetic gains for drought tolerance in maize. Plant Genome, 10, 1–8. 10.3835/plantgenome2016.07.0070 [DOI] [PubMed] [Google Scholar]
  52. Voss‐Fels, K. P. , Cooper, M. , & Hayes, B. J. (2019). Accelerating crop genetic gains with genomic selection. Theoretical and Applied Genetics, 132, 669–686. 10.1007/s00122-018-3270-8 [DOI] [PubMed] [Google Scholar]
  53. Ward, J. , Rakszegi, M. , Bedő, Z. , Shewry, P. R. , & Mackay, I. (2015). Differentially penalized regression to predict agronomic traits from metabolites and markers in wheat. BMC Genetics, 16. 10.1186/s12863-015-0169-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Watson, A. , Ghosh, S. , Williams, M. J. , Cuddy, W. S. , Simmonds, J. , Rey, M.‐D. , … Hickey, L. T. (2018). Speed breeding is a powerful tool to accelerate crop research and breeding. Nat. Plants, 4, 23–29. 10.1038/s41477-017-0083-8 [DOI] [PubMed] [Google Scholar]
  55. Wight, C. P. , Yan, W. , Fetch, J. M. , Deyl, J. , & Tinker, N. A. (2010). A set of new simple sequence repeat and Avenin DNA markers suitable for mapping and fingerprinting studies in oat (Avena Spp.). Crop Science, 50, 1207–1218. 10.2135/cropsci2009.09.0474 [DOI] [Google Scholar]
  56. Wong, C. , & Bernardo, R. (2008). Genomewide selection in oil palm: Increasing selection gain per unit time and cost with small populations. Theoretical and Applied Genetics, 116, 815–824. 10.1007/s00122-008-0715-5 [DOI] [PubMed] [Google Scholar]
  57. Wu, B. , Zhang, Z. , Chen, L. , & He, M. (2012). Isolation and characterization of novel microsatellite markers for Avena sativa (Poaceae) (oat). American Journal of Botany, 99, 69–71. 10.3732/ajg.1100404 [DOI] [PubMed] [Google Scholar]
  58. Xu, Q.‐S. , & Liang, Y.‐Z. (2001). Monte Carlo cross validation. Chemometrics Intelligent Laboratory Systems, 56, 1–11. 10.1016/S0169-7439(00)00122-2 [DOI] [Google Scholar]
  59. Yan, H. , Martin, S. L. , Bekele, W. A. , Latta, R. G. , Diederichsen, A. , Peng, Y. , & Tinker, N. A. (2016). Genome size variation in the genus Avena . Genome, 59, 209–220. 10.1139/gen-2015-0132 [DOI] [PubMed] [Google Scholar]
  60. Zadoks, J. C. , Chang, T. T. , & Konzak, C. F. (1974). Decimal code for growth stages of cereals. Weed Research, 14, 415–421. 10.1111/j.1365-3180.1974.tb01084.x [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplemental Table S1. Phenotypic traits assessed in the ‘Buffalo’ × ‘Tardis’ population in this study including trait name and method used for assessment.

Supplemental Figure S1. Theoretical program design to increase selection pressure within a cross using genomic prediction based on development of a predictive model from F2 genotypes and F4 phenotypes applied to estimate between lineage performance in recombinant inbred lines (RILs). We propose that this scheme would permit stronger selection within cross without a loss of accuracy.

Data Availability Statement

The raw data used for the analysis reported in this study is available from the Triticeae Toolbox oat platform (http://triticeaetoolbox.org/oat/genotyping).


Articles from The Plant Genome are provided here courtesy of Wiley

RESOURCES