Use of multiple traits genomic prediction, genotype by environment interactions and spatial effect to improve prediction accuracy in yield data

Hsin-Yuan Tsai; Fabio Cericola; Vahid Edriss; Jeppe Reitan Andersen; Jihad Orabi; Jens Due Jensen; Ahmed Jahoor; Luc Janss; Just Jensen

doi:10.1371/journal.pone.0232665

PLoS One. 2020 May 13;15(5):e0232665. doi: 10.1371/journal.pone.0232665


Editor: Aimin Zhang⁶

PMCID: PMC7219756 PMID: 32401769

Abstract

Genomic selection has been extensively implemented in plant breeding schemes. Genomic selection incorporates dense genome-wide markers to predict the breeding values for important traits based on information from genotype and phenotype records on traits of interest in a reference population. To date, most relevant investigations have been performed using single trait genomic prediction models (STGP). However, records for several traits at once are usually documented for breeding lines in commercial breeding programs. By incorporating benefits from genetic characterizations of correlated phenotypes, multiple trait genomic prediction (MTGP) may be a useful tool for improving prediction accuracy in genetic evaluations. The objective of this study was to test whether the use of MTGP together with proper modeling of spatial effects can improve the prediction accuracy of breeding values in commercial barley and wheat breeding lines. We genotyped 1,317 spring barley and 1,325 winter wheat lines from a commercial breeding program with the Illumina 9K barley and 15K wheat SNP chips, respectively, and phenotyped them across multiple years and locations. Results showed that the MTGP approach increased correlations between future performance and estimated breeding value of yields by 7% in barley and by 57% in wheat relative to using the STGP approach for each trait individually. Analyses combining genomic data, pedigree information, and proper modeling of spatial effects further increased the prediction accuracy by 4% in barley and 3% in wheat relative to the model using genomic relationships only. Prediction accuracy for yield in wheat and barley breeding was thus improved by combining MTGP and spatial effects in the model.

Introduction

Wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) are two of the earliest domesticated crop species and are ranked as the first and fourth most-grown cereals worldwide, respectively [1–4]. Approximately 75% of barley’s global production is used as an ingredient in animal feed with the remaining 25% used for alcoholic and non-alcoholic beverages and a variety of other foodstuffs. Due to barley’s diploid genome architecture and its ability to self-fertilize, barley is considered an ideal model species for cereal genetic research [5]. Most wheat varieties are tetraploid (durum) or hexaploid (bread), but a few diploid varieties also exist. Due to their importance in food production, a high quality assembly of the entire genome sequence for barley is publicly available [1]. In contrast, the first genome assembly for wheat became available only recently [4], enhancing the opportunities for plant breeders to advance genome-assisted crop improvements and discover quantitative trait loci (QTLs) of commercial interest.

Previous researchers have indicated that most traits of commercial importance in barley and wheat (e.g., yield) can likely be explained by many QTLs, each of which makes a small contribution to total genetic variance [6,7]. This architecture has significantly restricted the application of traditional marker-assisted selection techniques, particularly for economically important traits with a highly polygenic architecture. The concept of genomic selection (GS) proposed by Meuwissen et al. [8] was developed to incorporate whole-genome marker data in selection programs to accumulate single nucleotide polymorphism (SNP) or haplotype effects that can accurately predict future performance of potential new lines. As such, genomic prediction (GP) is now utilized to predict the breeding values of individuals based on a sufficient number of molecular markers and a training population (TP) that is genotyped and phenotyped for traits of interest. The performances of phenotypes in a validation population (VP) can then be predicted by exploiting dense molecular markers (or QTLs) that are associated with traits in the TP. For commercial breeding programs, large scale phenotyping and genotyping of breeding lines in the TP can lead to the development of promising statistical models for variance component estimation and for predicting breeding values using established approaches (e.g., REML [9] and BLUP [10]). In contrast to animal breeding approaches, the utilization of genomic approaches in plant breeding has been developed only recently [11].

Several methods in statistical genetics have been developed that benefit from genetic correlations between traits [12–14]. Univariate analysis, also known as single trait genomic prediction (STGP), is currently the most common method used in plant breeding schemes (e.g., in cassava [15], wheat [16–18], barley [19], rye [20], and rice [21]). However, for most commercial plant breeding programs, breeders have collected data on several phenotypes, which enable them to take advantage of genetic and phenotypic correlations among traits. Such multiple trait genomic prediction (MTGP) methods have recently been extensively examined [15,17,18,22,23].

The MTGP approach was originally developed to exploit information gained from correlated indicator traits [13]. Results have generally indicated that MTGP can increase the accuracy of genetic evaluations, especially when traits with high genetic correlations are involved in the analyses [13,15,17,20,22–24]. These findings agree with expected advantages of indirect selection [25]. Compared with traditional pedigree-based breeding methods and STGP, MTGP will likely be able to provide an ideal alternative for characterizing a higher number of candidate genes for selection and at lower cost, especially for traits that are labor intensive to evaluate or require a long time before they are expressed (e.g., baking quality or resistance to pests).

For several economic traits of spring wheat, studies have shown that correlations between observed phenotypes and estimated breeding values are higher when the genomic prediction model involves both genomic and pedigree information than when pedigree alone is used [26]. In general, commercial plant breeders usually have phenotypic records across multiple generations for traits of economic importance. In this study, we used data from multiple plots of F₅ generations and analyzed those results jointly with records from replicated experiments of F₆ generations from a variety of field locations. Because testing conditions are not necessarily identical for each generation, it may be necessary to treat records from different generations as being different, but correlated traits. This approach might considerably increase selection accuracy and further increase the genetic gain achieved per generation [15,17].

The major aims of this study were to: (1) compare the predictive ability for both genomic information and spatial effect in breeding lines of winter wheat and spring barley, (2) evaluate the prediction accuracy underlying STGP and MTGP methods, and (3) apply F₅ and F₆ data in the MTGP analysis (as multiple training populations) to predict the future yield in breeding lines of winter wheat and spring barley.

Materials and methods

Field experiment and phenotypes

Our field experiment was performed by Nordic Seed A/S (Galten, Denmark). In total, we tested 1,317 spring barley (H. vulgare) and 1,325 winter wheat (T. aestivum) breeding lines. We tested each line in two consecutive years and at three locations every year (Fig 1). The three locations tested were Dyngby, Holeby, and Skive (for first year only) in Denmark.

Fig 1 — ‘B’ is the number of lines in spring barley in each corresponding set and ‘W’ is the number of lines in winter wheat. Each set contains data from two consecutive years. For instance, set 1 contained data from 2013 to 2014, set 2 contained data from 2014 to 2015, and so on. The green box represents data we included in the test, whereas the white box in Set 4 represents data still under collection at time of analysis, and not yet included in the test. The figure was adapted from and originally drawn by Andrea Bellucci (pers. comm.).

We nested multiple trials within the three test locations and laid out the plots in a randomized complete block design [27] within each trial. Each trial contained the same number of breeding lines. For barley, every trial comprised 22 lines and 3 checks, with 3 replicates in the first year and 2 replicates in the second year. For wheat, every trial had 21 lines and 4 checks, with 2 replicates in each year.

For each trial, lines from any given family were sown in a randomized order, in each replicate, next to each other in the field. Based on the size of the family, a trial consisted of one or more families, and if the last family to be sown was more numerous than the remaining available plots, they were sown in the next trial. Therefore, many families had members in at least two different trials. In general, there were 3–5 full-siblings in each trial.

Yield data for the F₅ and F₆ generations were collected in this study. Every year, we made a new set of crosses and every set contained approximately 330 unique single seed descent lines in F₅, which were then used to produce the F₆ lines. The number of recorded plots in F₆ differed slightly between spring barley and winter wheat, as detailed in Table 1. Yield was measured as kg grain per 8.25-m² plot for both spring barley and winter wheat breeding lines in F₅ and F₆.

Table 1. Descriptive statistics for spring barley and winter wheat phenotypic records.

Species	Trait	Units	No. of plots	Mean (SD)	Min.	Max.
Barley	Yield F₆	kg grain per 8.25-m² plot	15376	6.60 (0.8)	4.2	9.4
Barley	Yield F₅	kg grain per 8.25-m² plot	1317	6.11 (1.0)	3.7	8.0
Wheat	Yield F₆	kg grain per 8.25-m² plot	13329	8.62 (0.9)	3.9	14.8
Wheat	Yield F₅	kg grain per 8.25-m² plot	1325	9.68 (1.8)	4.1	13.4


Phenotypes and pedigree information for every line were recorded by Nordic Seed A/S (Galten, Denmark). The three farms used are owned by Nordic Seed A/S, so no further permission was needed to use the land. The three farms are approved for agricultural use and are not located in any national park or other protected area of land or sea.

Genotypes

We used the Illumina 9K barley SNP-chip and the 15K wheat SNP-chip to genotype all breeding lines. Quality control applied two exclusion filters: (1) a minor allele frequency of <0.01 and (2) a missing SNP frequency per line of >0.02. After quality control, 4,056 SNPs in spring barley and 11,154 SNPs in winter wheat remained for analysis. Of these, 2,841 SNPs in spring barley and 9,290 SNPs in winter wheat were mapped to existing linkage groups according to the genome assemblies [1,4], whereas 1,215 SNPs in spring barley and 1,864 SNPs in winter wheat had unknown positions in the genome.
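The two quality-control filters can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `filter_snps`, the 0/1/2 coding with `NaN` for missing calls, and the choice to apply the missingness threshold per marker are all assumptions made for the example.

```python
import numpy as np

def filter_snps(M, maf_min=0.01, max_missing=0.02):
    """Return a boolean mask of markers that pass both filters.

    M is a (lines x markers) array of allele counts coded 0/1/2,
    with np.nan for missing calls (assumed coding).
    """
    M = np.asarray(M, dtype=float)
    n_called = np.sum(~np.isnan(M), axis=0)
    missing_rate = 1.0 - n_called / M.shape[0]
    # Allele frequency from the called genotypes only
    with np.errstate(invalid="ignore", divide="ignore"):
        p = np.nansum(M, axis=0) / (2.0 * n_called)
    maf = np.minimum(p, 1.0 - p)
    # Markers with no calls get maf = NaN and are dropped by the comparison
    return (maf >= maf_min) & (missing_rate <= max_missing)
```

The mask can then be used as `M[:, filter_snps(M)]` to keep only the markers that pass.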

Statistical methods

Pedigree relationship matrices were constructed based on the pedigree information of spring barley and winter wheat using the tabular approach [28,29], assuming that parental lines had undergone nine cycles of self-fertilization. Genomic relationship matrices (G) were generated for spring barley and winter wheat using the first method of VanRaden (2008) [30], with $G = Z Z' / (2 \sum_{j} p_{j} (1 - p_{j}))$, where the matrix Z was calculated as (M − P). M is a matrix of minor allele counts (0, 1, and 2) with m columns (one for each marker) and n rows (one for each line). P is a matrix containing allele frequencies, with column j defined as $\mathbf{1} \cdot 2 (p_{j} - 0.5)$, wherein $\mathbf{1}$ is a vector of ones and p_j is the frequency of the second allele at the corresponding locus j. (This form of P follows VanRaden's −1, 0, 1 genotype coding; for 0, 1, 2 allele counts the equivalent centring is $2 p_{j}$.) After quality control, about 1% of the genotypes were missing in both species. These missing genotypes were assigned by mean imputation [30] while the genomic relationship matrices were constructed. We performed a principal coordinate analysis (PCoA) (Fig 3) on the genomic relationship matrix using the built-in R function [31]. We used univariate and multivariate linear mixed models to obtain REML estimates of the variance components of traits using the DMU multivariate mixed model package [32].
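As an illustration of the genomic relationship matrix described above, the sketch below builds G from 0/1/2 allele counts with mean imputation of missing genotypes. It assumes markers have already passed quality control (no monomorphic markers) and that missing calls are stored as `NaN`; the function name `vanraden_g` is hypothetical. With 0/1/2 counts, each column is centred by 2p_j, which gives the same Z as VanRaden's −1/0/1 coding centred by 2(p_j − 0.5).

```python
import numpy as np

def vanraden_g(M):
    """Genomic relationship matrix, VanRaden (2008) method 1.

    M is a (lines x markers) array of counts (0/1/2) of the second
    allele, with np.nan for missing genotypes (assumed coding).
    """
    M = np.asarray(M, dtype=float)
    p = np.nanmean(M, axis=0) / 2.0            # allele frequency per marker
    # Mean imputation: a missing genotype becomes 2 * p_j,
    # so it contributes zero after centring
    M_imp = np.where(np.isnan(M), 2.0 * p, M)
    Z = M_imp - 2.0 * p                        # centre each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))
    return Z @ Z.T / denom
```

The mean of `np.diag(G)` corresponds to the quantity d(G) used later for the heritability estimates.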

Model used for yield traits of F₅ and F₆ generations

We developed the following models for the analyses. Model 1 was developed for yield for both F₅ and F₆ generations using only genomic information (G). As yield data were both available for F₅ and F₆ in spring barley and winter wheat, the univariate and multivariate analyses were applied using Model 1:

$$y = X b + Z_{1} g + e \tag{1}$$

In Model 1, b is the vector of fixed effects comprising year, location, and trial (YLT), and g is the vector of genomic effects. In addition, to estimate effects from pedigree information and spatial effects, we also developed Model 2 for F₅ and F₆ yields, in which b and g are defined as in Model 1:

$$y = X b + Z_{1} a + Z_{2} g + \sum_{i = 1}^{n} Z_{i} s + e \tag{2}$$

where the a term corresponds to additive genetic effects with a covariance structure based on pedigree information, and the s term is a spatial effect that accounts for local spatial variation within field experiments.

In the models described above, y is a vector of observations for one trait, X is a design matrix for the fixed effects, and b is the vector of fixed effects, including the combined effects of year, location, and trial (YLT). The Z_n terms are the design matrices of the random effects, and g is a vector of additive genetic effects with $g \sim N (0, G σ_{g}^{2})$, wherein $σ_{g}^{2}$ represents the genomic variance and G is the genomic relationship matrix. The a term is distributed as $a \sim N (0, A σ_{a}^{2})$, where $σ_{a}^{2}$ represents the additive genetic variance and A is the pedigree relationship matrix. The s term is a vector of spatial effects with $s \sim N (0, I σ_{s}^{2})$; it covers the X and Y coordinates of plots in the F₅ test (n = 2), and the plot itself plus its eight surrounding plots in the F₆ test (n = 9), as illustrated in Fig 2. The e term is a vector of random residuals with $e \sim N (0, I σ_{e}^{2})$.
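To make Model 1 concrete, the following sketch computes the generalized least squares estimate of b and the BLUP of g for given variance components. It is not the DMU implementation: in the study the variance components were estimated by REML, whereas here `var_g` and `var_e` are simply passed in, and the function name `gblup` is hypothetical. The formulation through V avoids inverting G, which is often singular after centring.

```python
import numpy as np

def gblup(y, X, Z, G, var_g, var_e):
    """Solve y = Xb + Zg + e with g ~ N(0, G var_g), e ~ N(0, I var_e).

    Returns (b_hat, g_hat) for known variance components.
    """
    y = np.asarray(y, dtype=float)
    # Phenotypic covariance of the records
    V = var_g * Z @ G @ Z.T + var_e * np.eye(len(y))
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    # Generalized least squares estimate of the fixed effects
    b_hat = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
    # BLUP of the genomic effects for every line in G
    g_hat = var_g * G @ Z.T @ np.linalg.solve(V, y - X @ b_hat)
    return b_hat, g_hat
```

Lines without records (for example, a validation set) have all-zero columns in Z and still receive a predicted value through their relationships in G.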

Fig 2 — In the F₅ test, we fitted X- and Y-coordinates as the spatial effect in the model, whereas for the F₆ test, we included its eight surrounding plots as well in the spatial effect (as a moving average). The figure was adapted from and originally drawn by Andrea Bellucci (pers. comm.).
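One plausible reading of the F₆ spatial term is an incidence matrix that links each plot to itself and its eight neighbours on the field grid. The sketch below builds such a matrix from integer row and column positions. The function name, the assumption of one plot per grid cell, and the handling of field edges (neighbours outside the field are simply absent) are illustrative choices, not details taken from the study.

```python
import numpy as np

def neighbourhood_incidence(rows, cols):
    """Return an (n_plots x n_plots) 0/1 matrix W.

    W[k, m] = 1 if plot m is plot k itself or one of its eight
    surrounding plots on the grid, else 0.
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    d_row = np.abs(rows[:, None] - rows[None, :])
    d_col = np.abs(cols[:, None] - cols[None, :])
    return ((d_row <= 1) & (d_col <= 1)).astype(float)
```

An interior plot has nine non-zero entries in its row of W (n = 9), whereas plots on the field border have fewer.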

For the multivariate analysis, we modeled two traits together to estimate all effects, including the marker effects. The tested combinations are detailed in Fig 4. Taking Model 1 as an example, y₁ represents yield for F₆ and y₂ represents yield for F₅. Year, location, and trial (YLT) serve as a fixed factor represented by b_n in the model. The terms X_n and Z_n are the design matrices of the fixed and random factors, respectively. The g_n term is the vector of genomic effects, as described for Model 1. We assumed the residual covariance to be zero because yields in the F₅ and F₆ generations were treated as statistically independent (they were collected in different years and generations).

$$\begin{bmatrix} y_{1} \\ y_{2} \end{bmatrix} = \begin{bmatrix} X_{1} & 0 \\ 0 & X_{2} \end{bmatrix} \begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix} + \begin{bmatrix} Z_{1} & 0 \\ 0 & Z_{2} \end{bmatrix} \begin{bmatrix} g_{1} \\ g_{2} \end{bmatrix} + \begin{bmatrix} e_{1} \\ e_{2} \end{bmatrix} \tag{3}$$

where $\begin{bmatrix} g_{1} \\ g_{2} \end{bmatrix} \sim N (0, G \otimes H)$ with $H = \begin{bmatrix} σ_{g 1}^{2} & σ_{g 12} \\ σ_{g 12} & σ_{g 2}^{2} \end{bmatrix}$, wherein H is the variance and covariance matrix of the genomic breeding values of the two traits, and $\begin{bmatrix} e_{1} \\ e_{2} \end{bmatrix} \sim N (0, I \otimes R)$ with $R = \begin{bmatrix} σ_{e 1}^{2} & σ_{e 12} \\ σ_{e 12} & σ_{e 2}^{2} \end{bmatrix}$, wherein R is the residual variance and covariance matrix of the two traits. For the joint analysis of yield in the F₅ and F₆ generations, the residual covariance $σ_{e 12}$ was set to zero because the traits were recorded in different years. When data were missing for one of the traits, the residual variance was equal to $σ_{e}^{2}$ for the observed trait.
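The Kronecker structure of the bivariate genetic covariance, and the genetic correlation implied by H, can be sketched as below. The order of the Kronecker factors depends on whether the stacked vector is ordered by trait or by line; this example assumes the trait-by-trait stacking of Eq 3, for which `np.kron` takes H as its first argument. Function names are hypothetical.

```python
import numpy as np

def stacked_genetic_cov(G, H):
    """Covariance of [g1; g2] when all lines of trait 1 are stacked
    above all lines of trait 2. Block (i, j) equals H[i, j] * G."""
    return np.kron(np.asarray(H, dtype=float), np.asarray(G, dtype=float))

def genetic_correlation(H):
    """Genetic correlation between the two traits from the 2 x 2 matrix H."""
    H = np.asarray(H, dtype=float)
    return H[0, 1] / np.sqrt(H[0, 0] * H[1, 1])
```

For example, a matrix H with variances 4.0 and 1.0 and covariance 1.4 corresponds to a genetic correlation of 0.7.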

For MTGP, the training population comprised yield in F₅ as Trait I and yield in F₆ from Sets 1, 2, and 3 as Trait II, and was used to predict yield in F₆ for Set 4 (the validation population). For STGP, we used yield in F₆ from Sets 1, 2, and 3 as the training population to predict yield in F₆ for Set 4 (the validation population). Fig 4c shows the corresponding models used for MTGP and STGP. The models themselves are described in the Statistical methods section of Materials and methods.

We used the variances to estimate the heritability of line means. The total phenotypic variance ( $σ_{p}^{2}$ ) of line means was:

$$σ_{p}^{2} = d (G) σ_{g}^{2} + \frac{n_{s} σ_{s}^{2}}{r_{1}} + \frac{σ_{e}^{2}}{r_{2}} \tag{4}$$

Heritability was estimated as:

$$h^{2} = d (G) σ_{g}^{2} / σ_{p}^{2} \tag{5}$$

where d(G) is the mean diagonal element of the genomic relationship matrix, n_s is the number of surrounding plots considered in the spatial effect, and r_1 and r_2 are the numbers of replicates of the corresponding effects for each genotype when estimating line heritability. Both were set to one (1.0) when estimating the narrow-sense plot heritability [33], which is based on the data of a single plot. The narrow-sense plot heritability reflects the random effects of a single plot, whereas the line heritability is based on the mean of records across all replicates of the same breeding line [6]. Line heritability is therefore higher than plot heritability when a line has more than one replicate.
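Eqs 4 and 5 translate directly into a small function. In the sketch below the argument names mirror the symbols above, but the function name and its defaults are hypothetical, and the example call uses a placeholder d(G) = 1.0, so it is not expected to reproduce the published estimates.

```python
def heritability(d_G, var_g, var_e, var_s=0.0, n_s=0, r1=1.0, r2=1.0):
    """Heritability from Eqs 4 and 5.

    With r1 = r2 = 1 this is the plot heritability; with the number
    of replicates per line it is the heritability of line means.
    """
    var_p = d_G * var_g + n_s * var_s / r1 + var_e / r2   # Eq 4
    return d_G * var_g / var_p                            # Eq 5

# Illustration with a placeholder d(G) and no spatial term:
plot_h2 = heritability(d_G=1.0, var_g=1.7, var_e=5.7)
line_h2 = heritability(d_G=1.0, var_g=1.7, var_e=5.7, r2=5.0)
```

Increasing `r2` shrinks the residual contribution to the variance of line means, which is why `line_h2` exceeds `plot_h2`.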

Cross-validation and predictive ability

For our multivariate analysis (MTGP) of the F₅ and F₆ yield dataset in particular, we used four sets (Sets 1, 2, 3, and 4) in F₅ as the first trait, and Sets 1, 2, and 3 in F₆ as the second trait to predict the yield performance of Set 4 in F₆. This strategy helped us test the feasibility of using a multivariate analysis for predicting future traits of interest in the coming year. For univariate analysis (STGP), we used yield in F₆ from Sets 1, 2, and 3 as the training population to predict yield in F₆ for Set 4 (the validation population). We estimated the predictive ability for future yield performance [ρ(ȳ_c, ĝ)] by calculating the correlation between the average of phenotypic records corrected for the fixed effects (ȳ_c) and the genomic predicted breeding values (ĝ). The accuracy of predicting additive breeding values was calculated as the predictive ability divided by the square root of the heritability of line means: ρ(ȳ_c, ĝ)/h.
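The two validation statistics can be written as a short sketch. It assumes the corrected line means and predicted breeding values of the validation lines are already available as equal-length arrays; the function names are hypothetical.

```python
import numpy as np

def predictive_ability(y_corrected_mean, g_hat):
    """Pearson correlation between corrected line means and
    genomic predicted breeding values of the validation lines."""
    return float(np.corrcoef(y_corrected_mean, g_hat)[0, 1])

def prediction_accuracy(rho, h2_line):
    """Predictive ability divided by the square root of the
    heritability of line means."""
    return rho / np.sqrt(h2_line)
```

Dividing by h rescales the correlation with noisy phenotypes to an estimate of the correlation with the true breeding values, so the accuracy is never smaller than the predictive ability when h² is below one.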

Results

Genomic relationship analysis among breeding lines

The first two axes of the PCoA explained 69% (Axis 1) and 13% (Axis 2) of the total variance in genomic relationships for spring barley, and 83% (Axis 1) and 10% (Axis 2) for winter wheat (Fig 3). In general, most lines were closely related to others. Based on genomic information, the PCoA indicated clearly identifiable groups in spring barley and winter wheat, implying that certain lines belonged to the same groups. For example, some lines from Set 2 clustered in the left area of the PCoA plot in barley, and lines from Set 3 clustered in the left area of the PCoA plot in wheat. However, although the PCoA plot showed only two major genetic clusters in both species, we also found that certain lines came from different crosses, sets, and parents. The heat-map of genomic relationships (using a similar dataset) shows the same pattern for both grain species [6,7].
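For readers who want to reproduce this kind of plot, the sketch below performs a principal coordinate analysis directly on a relationship (similarity) matrix by double centring and eigendecomposition. The study used a built-in R function that is not specified further, so this Python version, the function name, and the decision to treat G as a similarity matrix rather than converting it to distances first are assumptions.

```python
import numpy as np

def pcoa_from_similarity(G, n_axes=2):
    """Principal coordinates of a symmetric similarity matrix G.

    Returns (coords, explained), where explained holds the share of
    the total positive eigenvalue sum carried by each returned axis.
    """
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = J @ G @ J                            # double-centred similarities
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1]          # largest eigenvalue first
    evals, evecs = evals[order], evecs[:, order]
    pos = np.clip(evals, 0.0, None)          # ignore negative eigenvalues
    coords = evecs[:, :n_axes] * np.sqrt(pos[:n_axes])
    explained = pos[:n_axes] / pos.sum()
    return coords, explained
```

The entries of `explained` correspond to the percentages quoted for Axis 1 and Axis 2.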

Descriptive statistics and variance components

We studied yield traits in spring barley and winter wheat commercial breeding lines. The number of plots and phenotype statistics for each trait are listed in Table 1. The heritability and variance component estimates of traits are given in Table 2. Heritability (using the genomic-based method) of yield in F₅ was 9% for spring barley and 41% for winter wheat, whereas the heritability of yield in F₆ was 24% for spring barley and 33% for winter wheat (Table 2).

Table 2. Variance components, narrow-sense plot heritability, and correlation estimation of traits using model 1.

Values of $σ_{g}^{2}$ and $σ_{e}^{2}$ are given in units of 10⁻².

Species	Trait	$σ_{g}^{2}$ (× 10⁻²)	$σ_{e}^{2}$ (× 10⁻²)	plot¹ h²	line² h²	Cor_G³
Barley	Yield F₅	0.3	6.6	0.09	0.09	0.7
Barley	Yield F₆	1.7	5.7	0.24	0.75	0.7
Wheat	Yield F₅	2.9	7.8	0.41	0.41	0.72
Wheat	Yield F₆	7.6	22.8	0.33	0.76	0.72


¹ Plot heritability. For yield in F₅, each line has only one plot, so the replicate numbers in the denominators of Eq 4 are always one and the plot heritability equals the line heritability. For yield in F₆, each breeding line has multiple plots, which provide more information on that line; line heritability is therefore higher than plot heritability. See Eq 4 for details.

² Line heritability.

³ The residual (environmental) covariance between yield F₆ and F₅ was set to zero because their records were collected in different years and locations. Therefore, only genetic correlations (Cor_G) are provided for yield traits.

STGP versus MTGP

For multiple trait analysis, we used yield in F₅ as the first trait and yield in F₆ as the second trait to predict the future yield performance for the spring barley and winter wheat breeding lines. Overall, the prediction accuracies of the bivariate analyses were higher than those of the univariate analyses in all scenarios (Fig 4). The bivariate analysis improved predictive accuracy by about 7% in spring barley and by about 57% in winter wheat relative to STGP. The MTGP model that combined pedigree, the genomic relationship matrix, and spatial effects showed higher prediction accuracy than the model using the genomic relationship matrix only.

Discussion

The goal of this study was to utilize pedigree information, genomic information, and genetic covariance between associated traits to increase the accuracy of prediction of economically important traits in cereal breeding programs. Our main findings were that the prediction accuracy of yield performance clearly increased when we modeled yield in F₅ and F₆ simultaneously in the analysis. Furthermore, the prediction accuracy of the model that combined pedigree, genomic, and spatial information was higher (by 4% in barley and 3% in wheat) than that of the model using genomic information only.

Genetic correlation is critical for improving accuracy in MTGP

Theoretically, genetic correlation can arise mainly by pleiotropy or, less commonly, by linkage disequilibrium [34]. A high genetic correlation between two traits does not imply that both traits are highly heritable, but neither does a high phenotypic correlation [25]. Several studies using both real and simulated data have suggested that the genetic correlation between genetically-linked traits is important for multivariate genomic selection to be advantageous [13,15,20,23,25,35]. Therefore, genetic correlations between traits of interest have been recently exploited to increase the statistical power for detecting segregating QTLs [36,37] and to improve accuracy in genomic predictions in plant breeding programs [20,23,35].

To date, there have been only a few published multiple trait studies using field data for plant breeding [15–17,20,23,38]. In a simulation study, Jia and Jannink [23] reported that for two traits with no genetic correlation, the prediction accuracy of STGP was equivalent to or even better than the accuracy of MTGP. In the current study, the genetic correlation between yield in F₅ and in F₆ was approximately 0.7 in both spring barley and winter wheat breeding lines. Because we collected phenotypes in different years and locations, the environmental effects on yield in F₅ and F₆ were treated as independent. Our results showed that MTGP outperformed STGP for yield by 7% in spring barley and by 57% in winter wheat. A similar improvement rate (60%) using MTGP was also reported from pine breeding data [23]. Notably, the predictive ability for spring barley was generally higher than it was for winter wheat, but the relative improvement was not as dramatic as it was for winter wheat. For winter wheat, the predictive ability for yield was 0.23 for the F₆ generation in our single trait analysis and 0.37 using yield data from the F₅ and F₆ generations in the multiple trait analysis. For spring barley, the predictive ability was 0.48 using yield data from F₆ with the STGP model and 0.51 using MTGP. Thus, for predicting yield performance, even the STGP model for spring barley was more accurate than the MTGP model for winter wheat.
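The relative gains can be checked from the predictive abilities quoted above. The helper below is a hypothetical illustration; because it uses the rounded two-decimal values, it gives roughly 6% for barley and 61% for wheat, which is close to, but not identical with, the 7% and 57% reported, presumably because the published figures were computed from unrounded predictive abilities.

```python
def relative_gain(stgp, mtgp):
    """Percentage improvement of MTGP over STGP."""
    return 100.0 * (mtgp - stgp) / stgp

barley_gain = relative_gain(0.48, 0.51)   # spring barley, rounded inputs
wheat_gain = relative_gain(0.23, 0.37)    # winter wheat, rounded inputs
```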

Yield heritability difference between F₅ and F₆

For spring barley and winter wheat, our results showed that the heritability of yield differed between F₅ and F₆. One reason could be the smaller plot size and lower sowing density used for F₅. In addition, the F₅ lines were tested at one location with one replicate only, whereas the F₆ lines were tested at multiple locations and in multiple plots. As a result, the genetic effects estimated for F₅ may include both general additive genetic effects and specific additive genetic effects due to GxE interaction between genotypes and the single location used. These effects cannot be separated for the F₅ data, in contrast to the F₆ data. Together, these factors could lead to the difference in heritability between yield in F₅ and F₆.

Genomic information boosts the prediction accuracy

To our knowledge, only a few major segregating QTLs have been identified (such as the Mlo locus), at least for the economically important traits we investigated in this study. A review by Bernardo [39] stated that approximately 10,000 QTLs have been identified by QTL mapping experiments on twelve major crop species. However, only a few QTLs have been exploited in marker-assisted selection in practical breeding schemes, which is consistent with most economically important traits in spring barley and winter wheat being highly polygenic in nature. Thus, if sufficient genomic information is available (e.g., segregating SNPs across the entire genome), genomic prediction can be an efficient tool for capturing genetic variance, and much more efficient than relying on pedigree records in plant breeding. Previous studies that applied genomic-based BLUP (GBLUP) approaches show consistent prediction accuracies across various genetic architectures under simulated scenarios [40]. Additionally, Jia and Jannink [23] indicated that multiple trait GBLUP performed as well as Bayesian models (Bayes A and Bayes Cπ) when the traits were controlled by a polygenic genetic architecture. Both studies suggested that BLUP is likely an ideal option for modelling the traits we investigated. In this study, when our model involved both genomic and pedigree information, the prediction accuracies were slightly higher than when using genomic information only. The small size of this gain suggests that pedigree information contributed less to the evaluation than genomic information. On the other hand, GBLUP is potentially not as robust as Bayesian models when outliers are involved (e.g., the disease traits in spring barley investigated in this study deviated from the normal distribution). The prediction accuracy reported in this study was sufficiently high (e.g., prediction accuracy > 0.5) for breeders to make selection decisions on favored traits earlier in the breeding cycle, which would enable them to maximize genetic gains [17].

Clear genetic grouping observed in commercial spring barley and winter wheat breeding lines

Although our principal coordinate analysis (PCoA) indicated that the genetic relationships and degree of variation among lines differed slightly between the two species we examined, the PCoA clearly showed distinct groups among the breeding lines, implying that many lines had strong genomic relationships within certain genetic clusters.

Future perspectives

Simulation studies based on STGP have suggested that when a high SNP marker density is used, a substantial improvement in prediction accuracy can be expected in genomic evaluations [41]. Our study used a full set of marker genotypes as well as the total available population in the MTGP model. However, genotyping cost is still a major concern in plant breeding, especially for commercial breeders. Therefore, although our approach has been tested using simulation scenarios [13], the effect of marker density and optimization of TP size may require further investigation based on real data. In addition, non-standard phenotypes, such as those obtained from metabolomics data, may assist practitioners in boosting correlations in MTGP. For example, some investigations have involved metabolomics data in multiple trait analyses, aiming to improve accuracy in plant breeding schemes (and in animal breeding) [17,42–45]. Although STGP achieves a predictive ability similar to MTGP in some cases (e.g., soybeans [38], bread wheat [17], durum wheat [18], and African cassava [15]), our study suggests that the predictive ability of certain traits can be improved using MTGP (based on winter wheat and spring barley breeding lines and the large number of lines we included in our study). As such, cereal breeders can apply MTGP, combined with GxE effects, to improve predictive ability for selecting high yielding cultivars with improved resistance and quality by exploiting the genetic correlation between traits.

Conclusion

Our study showed that the MTGP approach is better than STGP for predicting yield traits in spring barley and winter wheat breeding lines when we included yield in F₅ and yield in F₆ in the evaluation. We also found that a model fitting pedigree, genomic, and spatial information had better prediction accuracy than a model using genomic information only. To conclude, prediction accuracy clearly increased in both species when we modelled yield data from the F₅ and F₆ generations jointly with MTGP and included GxE and spatial effects in the model. Thus, breeders can use the genetic relationship between traits to predict future trait performance, with considerably improved accuracy, by including genetically related traits using multivariate genomic prediction approaches.

Supporting information

S1 Data. (GENOTYPE; 13.5 MB)
S2 Data. (YIELD; 558.2 KB)
S3 Data. (PHENOTYPE; 660.2 KB)
S4 Data. (GENOTYPE; 38.5 MB)
S5 Data. (R; 1.7 KB)
S6 Data. (DIR; 1.4 KB)
S7 Data. (DIR; 1.3 KB)
S1 File. (DOCX; 14.5 KB)
S1 Dataset. (XLSX; 34.9 KB)
S2 Dataset. (XLSX; 84.6 KB)

Acknowledgments

We greatly appreciate the help of research technicians in Nordic Seed A/S who contributed to the phenotypic and genotypic data collection. We also thank Per Madsen for technical help in the use of DMU software, Andrea Bellucci for his generous contribution (Figs 1 and 2), and anonymous colleagues who read the draft to improve the quality of the manuscript.

Abbreviations

BLUP: Best linear unbiased prediction
GP: Genomic prediction
GS: Genomic selection
LD: Linkage disequilibrium
MAS: Marker-assisted selection
MT: Multiple trait
MTGP: Multiple trait genomic prediction
QTL: Quantitative trait loci
SNP: Single nucleotide polymorphism
ST: Single trait
STGP: Single trait genomic prediction
TP: Training population
VP: Validation population

Data Availability

All genotyping data used in the study are provided in a directly runnable format in the supporting information. All phenotype records are also provided in the supporting information.

Funding Statement

This study was funded by the Danish Green Development and Demonstration Program (Grant No. 34009-12-0511) of the Danish Ministry of Food and Agriculture, and by Nordic Seed A/S. The funding was used by the university partner, Aarhus University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Mascher M, Gundlach H, Himmelbach A, Beier S, Twardziok SO, Wicker T, et al. A chromosome conformation capture ordered sequence of the barley genome. Nature. 2017;544: 427. doi: 10.1038/nature22043
2. FAOSTAT. FAO Statistics Division 2016. Rome; 2016. Available: http://www.fao.org/statistics/en/
3. Shewry PR. Wheat. J Exp Bot. 2009;60: 1537–1553. doi: 10.1093/jxb/erp058
4. Appels R, Eversole K, Feuillet C, Keller B, Rogers J, Stein N, et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science. 2018;361. Available: http://science.sciencemag.org/content/361/6403/eaar7191.abstract
5. Sreenivasulu N, Graner A, Wobus U. Barley Genomics: An Overview. Int J Plant Genomics. 2008;2008: 486258. doi: 10.1155/2008/486258
6. Cericola F, Jahoor A, Orabi J, Andersen JR, Janss LL, Jensen J. Optimizing Training Population Size and Genotyping Strategy for Genomic Prediction Using Association Study Results and Pedigree Information. A Case of Study in Advanced Wheat Breeding Lines. PLoS One. 2017;12: e0169606. doi: 10.1371/journal.pone.0169606
7. Nielsen NH, Jahoor A, Jensen JD, Orabi J, Cericola F, Edriss V, et al. Genomic Prediction of Seed Quality Traits Using Advanced Barley Breeding Lines. PLoS One. 2016;11: e0164494. doi: 10.1371/journal.pone.0164494
8. Meuwissen THE, Hayes BJ, Goddard ME. Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps. Genetics. 2001;157: 1819–1829.
9. Jensen J, Mäntysaari E, Madsen P, Thompson R. Residual maximum likelihood estimation of (co)variance components in multivariate mixed linear models using average information. J Indian Soc Agric Stat. 1997;49: 215–236.
10. Henderson CR. Applications of Linear Models in Animal Breeding. Guelph, Canada: University of Guelph; 1984.
11. Hickey JM, Chiurugwi T, Mackay I, Powell W, Implementing Genomic Selection in CGIAR Breeding Programs Workshop Participants. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery. Nat Genet. 2017;49: 1297. doi: 10.1038/ng.3920
12. Jannink J-L, Lorenz AJ, Iwata H. Genomic selection in plant breeding: from theory to practice. Brief Funct Genomics. 2010;9: 166–177. doi: 10.1093/bfgp/elq001
13. Calus MPL, Veerkamp RF. Accuracy of multi-trait genomic selection using different methods. Genet Sel Evol. 2011;43: 26. doi: 10.1186/1297-9686-43-26
14. Heffner EL, Sorrells ME, Jannink J-L. Genomic Selection for Crop Improvement. Crop Sci. 2009;49: 1–12.
15. Okeke UG, Akdemir D, Rabbi I, Kulakow P, Jannink J-L. Accuracies of univariate and multivariate genomic prediction models in African cassava. Genet Sel Evol. 2017;49: 88. doi: 10.1186/s12711-017-0361-y
16. Rutkoski J, Benson J, Jia Y, Brown-Guedira G, Jannink J-L, Sorrells M. Evaluation of Genomic Prediction Methods for Fusarium Head Blight Resistance in Wheat. Plant Genome. 2012;5: 51–61. doi: 10.3835/plantgenome2012.02.0001
17. Hayes BJ, Panozzo J, Walker CK, Choy AL, Kant S, Wong D, et al. Accelerating wheat breeding for end-use quality with multi-trait genomic predictions incorporating near infrared and nuclear magnetic resonance-derived phenotypes. Theor Appl Genet. 2017;130: 2505–2519. doi: 10.1007/s00122-017-2972-7
18. Haile JK, N’Diaye A, Clarke F, Clarke J, Knox R, Rutkoski J, et al. Genomic selection for grain yield and quality traits in durum wheat. Mol Breed. 2018;38: 75. doi: 10.1007/s11032-018-0818-x
19. Heslot N, Yang H-P, Sorrells ME, Jannink J-L. Genomic Selection in Plant Breeding: A Comparison of Models. Crop Sci. 2012;52: 146–160. doi: 10.2135/cropsci2011.06.0297
20. Schulthess AW, Wang Y, Miedaner T, Wilde P, Reif JC, Zhao Y. Multiple-trait- and selection indices-genomic predictions for grain yield and protein content in rye for feeding purposes. Theor Appl Genet. 2016;129: 273–287. doi: 10.1007/s00122-015-2626-6
21. Spindel J, Begum H, Akdemir D, Virk P, Collard B, Redoña E, et al. Genomic Selection and Association Mapping in Rice (Oryza sativa): Effect of Trait Genetic Architecture, Training Population Composition, Marker Number and Statistical Model on Accuracy of Rice Genomic Selection in Elite, Tropical Rice Breeding Lines. PLOS Genet. 2015;11: e1004982. doi: 10.1371/journal.pgen.1004982
22. Guo G, Zhao F, Wang Y, Zhang Y, Du L, Su G. Comparison of single-trait and multiple-trait genomic prediction models. BMC Genet. 2014;15: 30. doi: 10.1186/1471-2156-15-30
23. Jia Y, Jannink J-L. Multiple-Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy. Genetics. 2012;192: 1513–1522. Available: http://www.genetics.org/content/192/4/1513.abstract
24. Tsuruta S, Misztal I, Aguilar I, Lawlor TJ. Multiple-trait genomic evaluation of linear type traits using genomic and phenotypic data in US Holsteins. J Dairy Sci. 2011;94: 4198–4204. doi: 10.3168/jds.2011-4256
25. Falconer DS, Mackay TFC. Introduction to Quantitative Genetics. 4th ed. Harlow, Essex, UK: Longmans Green; 1996.
26. Sukumaran S, Crossa J, Jarquin D, Lopes M, Reynolds MP. Genomic Prediction with Pedigree and Genotype × Environment Interaction in Spring Wheat Grown in South and West Asia, North Africa, and Mexico. G3 Genes|Genomes|Genetics. 2017;7: 481–495. doi: 10.1534/g3.116.036251
27. Petersen RG. Agricultural field experiments: design and analysis. Marcel Dekker, New York: CRC Press; 1994. Available: https://www.crcpress.com/
28. Emik LO, Terrill CE. Systematic procedures for calculating inbreeding coefficients. J Hered. 1949;40: 51–55. doi: 10.1093/oxfordjournals.jhered.a105986
29. Henderson CR. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics. 1976;32. doi: 10.2307/2529339
30. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91: 4414–23. doi: 10.3168/jds.2007-0980
31. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2017. Available: https://www.r-project.org/
32. Madsen P, Jensen J. DMU: A user’s guide. A Package for Analysing Multivariate Mixed Models. Version 6. Release 5.2. Tjele, Denmark; 2013. Available: http://dmu.agrsci.dk/DMU/Doc/Current/dmuv6_guide.5.2.pdf
33. de los Campos G, Vazquez AI, Fernando R, Klimentidis YC, Sorensen D. Prediction of Complex Human Traits Using the Genomic Best Linear Unbiased Predictor. PLOS Genet. 2013;9: e1003608. doi: 10.1371/journal.pgen.1003608
34. Conner JK. Genetic mechanisms of floral trait correlations in a natural population. Nature. 2002;420: 407. doi: 10.1038/nature01105
35. Montesinos-López OA, Montesinos-López A, Crossa J, Toledo FH, Pérez-Hernández O, Eskridge KM, et al. A Genomic Bayesian Multi-trait and Multi-environment Model. G3 Genes|Genomes|Genetics. 2016;6: 2725–2744. Available: http://www.g3journal.org/content/6/9/2725.abstract
36. Xu C, Wang X, Li Z, Xu S. Mapping QTL for multiple traits using Bayesian statistics. Genet Res (Camb). 2009;91: 23–37. doi: 10.1017/S0016672308009956
37. Xue W, Xing Y, Weng X, Zhao Y, Tang W, Wang L, et al. Natural variation in Ghd7 is an important regulator of heading date and yield potential in rice. Nat Genet. 2008;40: 761–767. doi: 10.1038/ng.143
38. Bao Y, Kurle JE, Anderson G, Young ND. Association mapping and genomic prediction for resistance to sudden death syndrome in early maturing soybean germplasm. Mol Breed. 2015;35: 128. doi: 10.1007/s11032-015-0324-3
39. Bernardo R. Molecular Markers and Selection for Complex Traits in Plants: Learning from the Last 20 Years. Crop Sci. 2008;48: 1649–1664. doi: 10.2135/cropsci2008.03.0131
40. Daetwyler HD, Pong-Wong R, Villanueva B, Woolliams JA. The impact of genetic architecture on genome-wide evaluation methods. Genetics. 2010;185. doi: 10.1534/genetics.110.116855
41. Meuwissen T, Hayes B, Goddard M. Accelerating Improvement of Livestock with Genomic Selection. Annu Rev Anim Biosci. 2013;1: 221–237. doi: 10.1146/annurev-animal-031412-103705
42. Guo Z, Magwire MM, Basten CJ, Xu Z, Wang D. Evaluation of the utility of gene expression and metabolic information for genomic prediction in maize. Theor Appl Genet. 2016;129: 2413–2427. doi: 10.1007/s00122-016-2780-5
43. Riedelsheimer C, Czedik-Eysenberg A, Grieder C, Lisec J, Technow F, Sulpice R, et al. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nat Genet. 2012;44: 217. doi: 10.1038/ng.1033
44. Ward J, Rakszegi M, Bedő Z, Shewry PR, Mackay I. Differentially penalized regression to predict agronomic traits from metabolites and markers in wheat. BMC Genet. 2015;16: 19. doi: 10.1186/s12863-015-0169-0
45. Xu S, Xu Y, Gong L, Zhang Q. Metabolomic prediction of yield in hybrid rice. Plant J. 2016;88: 219–227. doi: 10.1111/tpj.13242

