. 2025 Jun 1;18(2):e70053. doi: 10.1002/tpg2.70053

Genomic selection: Essence, applications, and prospects

Diana M Escamilla 1, Dongdong Li 1, Karlene L Negus 1, Kiara L Kappelmann 1, Aaron Kusmec 2, Adam E Vanous 3, Patrick S Schnable 1, Xianran Li 4, Jianming Yu 1,
PMCID: PMC12127607  PMID: 40452138

Abstract

Genomic selection (GS) emerged as a key part of the solution to ensure the food supply for the growing human population thanks to advances in genotyping and other enabling technologies and improved understanding of the genotype–phenotype relationship in quantitative genetics. GS is a breeding strategy to predict the genotypic values of individuals for selection using their genotypic data and a trained model. It includes four major steps: training population design, model building, prediction, and selection. GS revises the traditional breeding process by assigning phenotyping a new role of generating data for the building of prediction models. The increased capacity of GS to evaluate more individuals, in combination with shorter breeding cycle times, has led to wide adoption in plant breeding. Research studies have been conducted to implement GS with different emphases in crop‐ and trait‐specific applications, prediction models, design of training populations, and identifying factors influencing prediction accuracy. GS plays different roles in plant breeding such as turbocharging of gene banks, parental selection, and candidate selection at different stages of the breeding cycle. It can be enhanced by additional data types such as phenomics, transcriptomics, metabolomics, and enviromics. In light of the rapid development of artificial intelligence, GS can be further improved by either upgrading the entire framework or individual components. Technological advances, research innovations, and emerging challenges in agriculture will continue to shape the role of GS in plant breeding.

Core Ideas

  • Genomic selection (GS) is a breeding strategy to predict the genotypic merits of individuals for selection.

  • GS helps to shorten breeding cycle times, increase genetic gains, and facilitate better resource allocation.

  • GS can be enhanced by phenomics, metabolomics, transcriptomics, and enviromics data to improve prediction accuracy.

  • Deep learning (DL)‐based GS is expected to facilitate the integration of multiple layers of information in prediction models.

  • Emerging technologies, research, and agricultural challenges continuously shape the role of GS in plant breeding.


Abbreviations

AI: artificial intelligence
CD: coefficient of determination
CERIS: critical environmental regressor through informed search
CGM: crop growth model
CNN: convolutional neural network
CV: cross‐validation
DH: doubled haploid
DL: deep learning
EC: environmental covariate
EI: environmental index
FNN: feedforward neural network
GCA: general combining ability
GEI: genotype by environment interaction
GP: genomic prediction
GS: genomic selection
HTP: high‐throughput phenotyping
JGRA: joint genomic regression analysis
LOO: leave‐one‐out
MET: multi‐environment trial
ML: machine learning
NAM: nested association mapping
OSGS: origin specific genomic selection
PEV: prediction error variance
QTL: quantitative trait loci
RF: random forest
RKHS: reproducing kernel Hilbert space
RRM: random regression model
SCA: specific combining ability
SVM: support vector machine
TPE: target population of environments
TPG: target population of genotypes

1. GENOMIC SELECTION IN PLANT BREEDING

The primary goal of plant breeding programs is to develop improved cultivars for agricultural production to feed the ever‐growing human population. However, various challenges exist, including limited genetic diversity in elite germplasm pools, biotic and abiotic stresses exacerbated by climate change, fast‐paced human population growth, and resource constraints in breeding. For example, projected temperature increases of 1.4–2.5°C by 2050 could lead to extreme heat and drought stress during crop‐growing seasons with expected yield losses of 7%–23% (Rezaei et al., 2023; Tollefson, 2020). These abiotic stressors can exacerbate the impact of biotic stressors by increasing the prevalence and severity of insects, pests, and pathogens (Mahmood et al., 2022). When coupled with expected population growth and land use change, future climate variability is expected to have negative consequences on global food security (L. T. Hickey et al., 2019; Mahmood et al., 2022; Molotoks et al., 2021). These projected scenarios, however, are also meant to help identify the risks and guide research directions rather than only predict the future (Tollefson, 2020). Given the current rates of yield gains in major crops and the impact of climate change on agricultural production systems, global food security is and continues to be a significant concern for humanity (L. T. Hickey et al., 2019; Molotoks et al., 2021; Rezaei et al., 2023). Therefore, developing high‐yielding, nutritious, and climate‐resilient crops is imperative for improving global food production and securing food for present and future generations.

Approaches to increasing rates of genetic gain and speeding up crop improvement need to be designed and applied widely to combat the challenges associated with global food security. Among recently implemented plant breeding innovations (L. T. Hickey et al., 2019), genomic selection (GS) is a key part of the solution thanks to advances in genotyping and other enabling breeding technologies, improved understanding of quantitative genetics, and new analytical frameworks designed to leverage the genotype–phenotype relationship (Bernardo & Yu, 2007; Heffner et al., 2009; Meuwissen et al., 2001). Two decades after it was proposed, GS has become a common breeding strategy in many crops. It has shortened breeding cycles, increased rates of genetic gains, and improved resource allocation in breeding programs. It can be implemented between and within breeding cycles, at different stages, and for selection of parental materials and candidate individuals. Considering its central role in breeding, we expect that GS will remain a crucial component of plant breeding programs. However, ongoing research is necessary to maintain its relevance to emerging challenges in agriculture, research innovations, and technological advances.

In this review, we intend to highlight the essence of GS, factors affecting its accuracy, key features of its practical implementations, its limitations, and its prospects (Figure 1). Many previous review and research papers have focused on individual aspects such as concept, methodological considerations, applications in different crop species, long‐term response, factors influencing prediction accuracy, and GS optimization (Alemu et al., 2024; Crossa et al., 2017; de los Campos et al., 2013; Desta & Ortiz, 2014; Goddard, 2009; Goddard & Hayes, 2007; J. M. Hickey et al., 2017; O. A. Montesinos‐López et al., 2021; Robertsen et al., 2019; Skøt & Grinberg, 2016; Voss‐Fels et al., 2018; Y. Xu et al., 2020). We also highlight the versatility of GS in plant breeding such as exploring crop genetic diversity, parental selection, candidate selection, and prediction in untested environments, as well as its potential to be enhanced by phenomics, enviromics, and multi‐omics data (Box 1). Some prospects of GS considering the rapid advances in artificial intelligence (AI) are also provided.

FIGURE 1.


The main driving forces behind the development and advancement of genomic selection (GS) in plant breeding. Forces are grouped into breeding challenges, conceptual framework, and technology development. Global food security and resource allocation are constant breeding challenges that led to GS and other breeding innovations. Genotype–phenotype relationship, methodological developments, and improved understanding of the genetic control of complex traits provide a conceptual framework for the inception and continued improvement of GS. Advances in genotyping, phenotyping, and envirotyping technologies enable the revision of the breeding process and enable the implementation of GS. Emerging challenges in agriculture, technological advances, and improvements in the conceptual framework will continue to shape the role of GS in plant breeding.

BOX 1 Key summary statements about genomic selection

  • (1)

    Genomic selection (GS) is a breeding strategy to predict the genotypic values of individuals for selection using their genotypic data and a trained model.

  • (2)

    GS has been widely researched in different crops and implemented in many crops.

  • (3)

    GS can be viewed as a decision‐making process with four main steps: training population design, model building, prediction, and selection.

  • (4)

    GS can shorten breeding cycle time, work with a large candidate pool, and increase rates of genetic gain.

  • (5)

    GS efficiency is affected by many factors including genetic architecture of the trait, genotyping coverage, training population design, and prediction models.

  • (6)

    GS target prediction scenarios include tested genotypes in untested environments, untested genotypes in tested environments, and untested genotypes in untested environments.

  • (7)

    GS can be enhanced by incorporating additional data types and artificial intelligence.

2. ESSENCE OF GENOMIC SELECTION

GS is a breeding strategy that exploits the genotype–phenotype relationship among individuals of a population to establish a model to predict the genotypic values of untested individuals for selection (Bernardo & Yu, 2007; Heffner et al., 2009; Meuwissen et al., 2001). The predicted genotypic value, a generic term to cover different GS scenarios, represents the part of the phenotype determined by the combined effects of all loci. Before GS, breeders predicted the genotypic value of individuals using pedigrees, which capture the expected average relationships among individuals. With GS, predictions are made using genotypic data that capture the realized relationship among individuals. Thus, GS is expected to be more accurate than the pedigree‐based approach as genotypic data can trace genome‐wide marker information to capture random Mendelian sampling and unknown ancestral relationships (Bernardo & Yu, 2007; Burgueño et al., 2012; Heffner et al., 2009; Meuwissen et al., 2001; Velazco et al., 2019).

GS includes four main steps: training population design, model building, prediction, and selection (Figure 2 and Box 2). It replaces some of the phenotypic selection in breeding and gives phenotyping a new role of generating data for model building (Crain et al., 2018; Desta & Ortiz, 2014; Heffner et al., 2009; Meuwissen et al., 2001). The term GS is frequently used interchangeably with genomic prediction (GP). While GP comprises training population design, model building, and prediction, which is typical in research studies, GS includes the selection step, making it cover the entire breeding process. GS is a more generic term than GP. For example, genome‐wide association studies (GWAS) and GS are two major research areas in complex trait dissection and selection.

FIGURE 2.


Genomic selection (GS) steps in plant breeding. GS follows a cyclical four‐step process, with outputs from one step serving as inputs for the following step. (1) Training population design, where breeders define the individuals used for model building, as well as the testing and genotyping approaches. Training populations can be designed from newly created populations that must undergo genotyping and phenotyping or in some cases from historical data where genotypic and phenotypic information already exist. (2) Model building, where information from the training population is used as input data to build models that are assessed using cross‐validation, and the genomic prediction (GP) models with better performance are selected. (3) Prediction, where predicted genotypic values are generated by using the trained GP model and genotypic data from untested individuals in the breeding program. (4) Selection, where decisions are made using the predicted genotypic values alone or together with other criteria. GS is a cyclical process, and as genotypic and phenotypic data of advanced individuals become available, training populations are updated, and models are retrained.

BOX 2 What is genomic selection?

Genomic selection (GS) is a strategy used in plant breeding. Plant breeding involves selecting good parental individuals, crossing parents, evaluating their progeny's performance (e.g., yield, seed quality), and selecting top candidates to become new inbred cultivars or parents for hybrid cultivars. Developing a new cultivar can take several years (∼6–10). GS uses information in a plant's DNA (genotype) and a measurable trait (phenotype), such as yield, to build a model to predict the genotypic values of individuals with known genotypes but unknown phenotypes. It provides breeders an estimate of a plant's potential performance for specific traits before its phenotype is measured. Thus, selection decisions can be made earlier to reduce the cost and time of breeding programs. Breeders can implement GS in different generations of the breeding cycle for parental and candidate selection. Four main steps are necessary to implement GS: (1) training population design, (2) model building, (3) prediction, and (4) selection (Diagram 1):

Diagram 1: Genomic selection steps in plant breeding.

  • (1)

    Training population design: At this stage, breeders define the individuals used for model building (the training population). For long‐established breeding programs, historical data may be utilized as the initial training populations. In contrast, for breeding programs starting to work on GS and after the GS process is started, training populations are specifically designed. The training population should be large enough to represent well the diversity observed in the target population of genotypes (TPG) (Diagram 2). The TPG is the group of individuals for which predictions are needed. Phenotyping accuracy is crucial for developing reliable prediction models. Phenotypes are determined by both genetics and the environmental conditions where plants are grown. Temperature, solar radiation, and precipitation are some examples of environmental factors influencing plant growth and phenotypes. Plants can respond in different ways to environmental variation across locations and years, which is known as genotype by environment interaction (GEI). GEI complicates the prediction of the performance of individuals in different environments. As a result, the training population is assessed in a multi‐environment trial (MET) to build robust prediction models. METs are designed to represent the target population of environments (TPE) where future cultivars will be grown. By doing so, the model can incorporate information on GEI that facilitates predictions of untested plants in new and relevant environments.

Diagram 2: Graphical representation of training population design. This diagram illustrates that shape and color are the primary features contributing to the variation observed in the TPG. To effectively capture the diversity of shapes and colors present in the TPG, the training population is designed to include three objects to represent this variation.

  • (2)

    Model building: Genotypic and phenotypic information from the training population are the input data for building a prediction model. Gene expression data, metabolite profiles, environmental covariates, and high‐throughput phenotyping derived traits can also be included in the model. Several statistical models can be assessed to find the suitable one. To compare between models, cross‐validation is conducted with the data from the training population. The correlation between predicted genotypic values and observed phenotypes measures the model performance. When genotypic and phenotypic data are being collected over time, prediction models can be periodically retrained by considering the most recent data.

  • (3)

    Prediction: At this stage, the trained model for prediction (genomic prediction) is combined with the genotypic data of the population of selection candidates to generate predicted genotypic values.

  • (4)

    Selection: Predicted genotypic values can be used directly as a selection metric, and individuals with the best values are selected to be advanced to the next generation or used as parents of future breeding populations. Alternatively, individuals with undesirable predicted values are removed from consideration. Other selection metrics exist to focus on parental selection and long‐term response.

Plant breeding programs primarily aim to improve crop productivity and resilience. The efficiency of breeding programs is commonly measured by the number of varieties released and the gain in performance achieved by selection (Falconer & Mackay, 1996). This is quantified by the genetic gain per unit of time, known as genetic gain equation or breeder's equation, and given by

$$\Delta G = \frac{i\,h\,\sigma_A}{t}, \tag{1}$$

where ∆G is the change in the average trait value after one cycle of selection, i is the selection intensity, h is the selection accuracy, σA is the square root of genetic variance, and t is the breeding cycle length (Falconer & Mackay, 1996; R2D2 Consortium et al., 2021; Y. Xu et al., 2017).
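As a rough numerical illustration of Equation (1), the sketch below compares two hypothetical breeding schemes. All numbers (selected proportions, accuracies, cycle lengths) are invented for illustration and do not come from the studies cited here; the function names are likewise assumptions. Selection intensity is computed under the standard assumption of truncation selection on a normally distributed criterion.

```python
from statistics import NormalDist

def selection_intensity(proportion_selected):
    """Selection intensity i for truncation selection on a normal criterion.

    i = pdf(z) / p, where z is the truncation point that leaves a
    fraction p of the candidates above it.
    """
    z = NormalDist().inv_cdf(1.0 - proportion_selected)
    return NormalDist().pdf(z) / proportion_selected

def genetic_gain_rate(i, h, sigma_a, t):
    """Right-hand side of Equation (1): i * h * sigma_A / t."""
    if t <= 0:
        raise ValueError("breeding cycle length t must be positive")
    return i * h * sigma_a / t

# Hypothetical scheme A: phenotypic selection, 20% selected, 6-year cycle.
gain_phenotypic = genetic_gain_rate(selection_intensity(0.20), h=0.70, sigma_a=1.0, t=6.0)

# Hypothetical scheme B: genomic selection with lower accuracy, but more
# candidates screened (10% selected) and a 2-year cycle.
gain_genomic = genetic_gain_rate(selection_intensity(0.10), h=0.50, sigma_a=1.0, t=2.0)

print(f"phenotypic: {gain_phenotypic:.3f}  genomic: {gain_genomic:.3f} (sigma_A units per year)")
```

With these made-up inputs, the shorter cycle and higher intensity more than offset the lower accuracy. Different inputs can reverse the ranking, which is why the terms need to be considered jointly.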

The success of GS in plant breeding derives from its capability to accelerate genetic gains by affecting different terms of the breeder's equation. First, predicting genotypic values enables early selection, reducing the breeding cycle time (t) (Alemu et al., 2024; Meuwissen et al., 2001). In maize (Zea mays L.) breeding, the adoption of the doubled haploid (DH) process facilitated the implementation of GS because of the need to eliminate a large proportion of DHs, unlike the conventional breeding process where selection is performed during the selfing generations. Second, unlocking the genetic diversity stored in gene banks and accelerating pre‐breeding allow the creation of new genetic variation, thereby increasing diversity (σA) (Allier, Teyssèdre, Lehermeier, Moreau, et al., 2020; Crossa et al., 2016; Gorjanc et al., 2016b; Rogers et al., 2022; Sanchez et al., 2024, 2023; Yu et al., 2016). Third, utilizing genome‐wide markers allows for a more accurate estimation of genotypic values, enhancing selection accuracy (h) (Bernardo, 2020; O. A. Montesinos‐López et al., 2023a). Fourth, reducing field‐testing costs increases evaluation capacity, which enables higher selection intensity (i) (de los Campos et al., 2013; Meuwissen et al., 2001). Moreover, GS can be more cost‐effective than phenotypic selection, particularly for complex quantitative traits that are expensive and challenging to phenotype, such as yield, abiotic stresses, and end‐quality traits (Crossa et al., 2017; L. T. Hickey et al., 2019; Jarquín et al., 2014, 2020). Seven statements are provided in Box 1 to summarize the key points from this section and capture the essence of GS. In the following sections, we outline the main components necessary to implement GS in a breeding program. These components include model performance and prediction scenarios, factors influencing prediction accuracy, prediction models, training population design, prediction targets and selection metrics, and efficiency of GS in real‐world breeding scenarios.

2.1. Model performance and prediction scenarios

Model performance is assessed by prediction accuracy, using cross‐validation (CV) to ensure the model performs well on unseen data (Meuwissen et al., 2001). Prediction accuracy is measured as Pearson's correlation between predicted genotypic values and observed performance (phenotype) (Bernardo & Yu, 2007; Meuwissen et al., 2001). Technically, this correlation is the predictive ability, and it becomes the prediction accuracy when divided by the square root of heritability. However, most of the literature simply refers to the correlation as prediction accuracy. The main driver of genetic gain due to GS is prediction accuracy, and ensuring robust model performance through CV is fundamental for GS's success in breeding. In CV, the training population is split into a training set to train the model and a testing set to estimate prediction accuracy. Often, k‐fold or leave‐one‐out (LOO) CV is used to split the data (Burgueño et al., 2012; Robertsen et al., 2019). In k‐fold CV, the training population is divided into k random groups, and each group in turn is left out and predicted based on the remaining groups. Meanwhile, in LOO CV (or n‐fold where n is the number of individuals), each individual is left out singly and predicted based on the remaining individuals.
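The k‐fold procedure described above can be sketched as follows. This is a minimal illustration on simulated toy data, not the pipeline of any cited study: the simulation settings, the ridge penalty `lam`, and all function names are assumptions. Ridge regression on marker codes is used as a simple stand‐in for a marker‐effect model, and the correlation is pooled over all folds (averaging per‐fold correlations is an equally common choice).

```python
import numpy as np

def simulate(n=300, p=100, n_qtl=20, h2=0.5, seed=1):
    """Toy data: 0/1/2 marker codes and a purely additive trait."""
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.5, size=(n, p)).astype(float)
    effects = np.zeros(p)
    effects[rng.choice(p, n_qtl, replace=False)] = rng.normal(size=n_qtl)
    g = X @ effects
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)  # sets heritability to about h2
    return X, g + rng.normal(scale=noise_sd, size=n)

def fit_ridge(X, y, lam):
    """Ridge estimates of marker effects with an unpenalized intercept."""
    center = X.mean(axis=0)
    Xc = X - center
    mu = y.mean()
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - mu))
    return mu, center, beta

def kfold_predictive_ability(X, y, k=5, lam=50.0, seed=0):
    """Correlation between held-out predictions and observed phenotypes."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    pred = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        mu, center, beta = fit_ridge(X[train], y[train], lam)
        pred[test] = mu + (X[test] - center) @ beta
    return float(np.corrcoef(pred, y)[0, 1])

h2 = 0.5  # known here only because the data are simulated
X, y = simulate(h2=h2)
ability = kfold_predictive_ability(X, y, k=5)
accuracy = ability / np.sqrt(h2)  # predictive ability scaled by sqrt(heritability)
print(f"predictive ability: {ability:.2f}  prediction accuracy: {accuracy:.2f}")
```

Setting `k` equal to the number of individuals turns the same function into LOO CV, at the cost of one model fit per individual.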

In a multi‐environment context, CV schemes can also partition the environments into tested and untested environments. There are three general prediction scenarios: (1) tested genotypes in untested environments, (2) untested genotypes in tested environments, and (3) untested genotypes in untested environments (Figure 3) (Li et al., 2018). This description scheme can be utilized for both the CV during model building and the actual implementation of GS in breeding (T. Guo et al., 2020; Li et al., 2018, 2021; Mu et al., 2022; Tibbs‐Cortes et al., 2024; Wei et al., 2025; Yu et al., 2016). In brief, CV is a valuable tool for model building, assessing the expected model performance, and comparing models to decide which model is best for a particular set of traits and populations. Model building in breeding programs is an iterative process, and as data become available, models are retrained and updated to maintain good prediction accuracy.

FIGURE 3.


Prediction scenarios for genomic selection in the multi‐environment trial context. (A) Prediction of tested genotypes in untested environments. (B) prediction of untested genotypes in tested environments. (C) prediction of untested genotypes in untested environments.
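The three scenarios can be made concrete by partitioning a genotype × environment grid. The sketch below is a hypothetical helper (names and grid sizes are assumptions) that labels every genotype–environment cell as training data or as one of the three prediction targets, which is one way a CV scheme mimicking these scenarios could be set up.

```python
import numpy as np

def scenario_masks(n_geno, n_env, untested_geno, untested_env):
    """Boolean masks over a genotype x environment grid.

    Cells with a tested genotype in a tested environment form the
    training data; the remaining cells fall into the three scenarios.
    """
    geno_new = np.zeros(n_geno, dtype=bool)
    geno_new[np.asarray(untested_geno, dtype=int)] = True
    env_new = np.zeros(n_env, dtype=bool)
    env_new[np.asarray(untested_env, dtype=int)] = True
    g = geno_new[:, None]  # column vector, broadcasts across environments
    e = env_new[None, :]   # row vector, broadcasts across genotypes
    return {
        "training": ~g & ~e,
        "tested_genotype_untested_environment": ~g & e,    # scenario 1
        "untested_genotype_tested_environment": g & ~e,    # scenario 2
        "untested_genotype_untested_environment": g & e,   # scenario 3
    }

# Hypothetical trial: 6 genotypes x 4 environments, with genotypes 4 and 5
# and environment 3 held out.
masks = scenario_masks(6, 4, untested_geno=[4, 5], untested_env=[3])
for name, mask in masks.items():
    print(f"{name}: {int(mask.sum())} cells")
```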

2.2. Factors influencing prediction accuracy

Extensive research in GS has shown that many factors, besides the statistical model, influence prediction accuracy. Intrinsic factors influencing prediction accuracy include the genetic architecture of the trait, linkage disequilibrium between markers and quantitative trait loci (QTL), population structure, and the prediction scenario (Combs & Bernardo, 2013a; de los Campos et al., 2013; Robertsen et al., 2019; VanRaden et al., 2009). These factors are inherent to the breeding populations and objectives, making them unchangeable. However, there are controllable factors that can be adjusted to improve prediction accuracy. These include marker density, cross‐validation strategy, trait heritability estimation, training population size, and the relatedness between training population and the target population of genotypes (TPG) (Akdemir & Sánchez, 2019; Combs & Bernardo, 2013a; de los Campos et al., 2013; T. Guo et al., 2019; Lado et al., 2013; Merrick et al., 2022; Robertsen et al., 2019; VanRaden et al., 2009). While these factors enable optimization, their effectiveness is often limited by the availability of resources. For instance, prediction accuracy generally improves as the size of the training population increases (Crossa et al., 2017; Daetwyler et al., 2010; Lehermeier et al., 2014; Meuwissen et al., 2001). However, there is not a universal proportion to be used as training population since breeding objectives and breeding populations differ. In the research setting, the optimal training population size is determined by factors such as trait heritability, relatedness of training population and the rest of TPG, and population structure (Bassi et al., 2016; Crossa et al., 2010; Desta & Ortiz, 2014; Isidro et al., 2015; Lorenz & Smith, 2015; Schmidt et al., 2016; Xavier et al., 2016). In actual practice, the training population size is limited by available resources (e.g., land, time, and labor), representing a resource allocation problem where the size must be carefully determined to balance the trade‐off between phenotyping costs and prediction accuracy (P. Y. Wu, Ou, et al., 2023). Given the many factors that affect prediction accuracy and their complex interactions, different outcomes may occur, with some factors having a greater influence than others.
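To illustrate the diminishing return from larger training populations, the sketch below evaluates a widely used deterministic approximation of expected accuracy, r = sqrt(N h² / (N h² + Me)), where N is the training population size, h² the trait heritability, and Me the number of independent chromosome segments. This expression is brought in here as an outside assumption rather than taken from the text above, it ignores relatedness and population structure, and the value of Me used is purely hypothetical.

```python
import math

def expected_accuracy(n_train, h2, m_e):
    """Approximate expected accuracy: sqrt(N*h2 / (N*h2 + Me))."""
    if n_train <= 0 or not (0.0 < h2 <= 1.0) or m_e <= 0:
        raise ValueError("n_train and m_e must be positive and h2 in (0, 1]")
    return math.sqrt(n_train * h2 / (n_train * h2 + m_e))

# Hypothetical setting: h2 = 0.5 and Me = 500 independent segments.
sizes = (100, 500, 1000, 5000)
curve = [expected_accuracy(n, h2=0.5, m_e=500) for n in sizes]
for n, r in zip(sizes, curve):
    print(f"N = {n:5d}  expected accuracy = {r:.3f}")
```

Under this approximation each additional phenotyped individual adds less accuracy than the previous one, which is the trade‐off against phenotyping cost described above.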

2.3. Prediction models

Numerous GP models exist with the primary objective of increasing prediction accuracy by capturing as much of the phenotypic variance as possible (Araus et al., 2018; de los Campos et al., 2013; Goddard, 2009; Meuwissen et al., 2001). The most straightforward models are mixed models to estimate marker effects (RRBLUP) or leverage a genomic relationship matrix (GBLUP) to predict the genotypic values of individuals (Crossa et al., 2017; Meuwissen et al., 2001; Piepho et al., 2008; Robertsen et al., 2019). There are also several Bayesian linear regression models (e.g., BayesA, B, and C; Bayesian ridge regression; Bayesian LASSO) that differ in their assumptions about the prior distributions of marker effects and how to incorporate relevant information about the markers (e.g., p‐value, coding, or non‐coding region) (de los Campos et al., 2013; Gianola, 2013; Robertsen et al., 2019). These are all parametric linear models as they assume that the marker effects are additive, with each marker contributing independently to the phenotype (Heslot et al., 2012). Semiparametric models, such as reproducing kernel Hilbert space (RKHS) regression, and non‐parametric models, such as random forest (RF), support vector machines (SVM), and deep learning (DL) are more complex and capable of accounting for non‐additive effects without explicit modeling (Abraham et al., 2014; Danilevicz et al., 2022; de los Campos et al., 2010; Desta & Ortiz, 2014; Piepho, 2009).
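The relationship between the two mixed‐model formulations can be sketched numerically. The example below builds a genomic relationship matrix in the commonly used VanRaden form (an assumption about which relationship matrix is meant), predicts untested individuals with GBLUP, and checks that a ridge regression on the same centered markers gives identical predictions when its penalty is rescaled. Variance ratios are treated as known and the intercept is taken as the training mean, both simplifications; data and names are hypothetical.

```python
import numpy as np

def genomic_relationship(M):
    """GRM from 0/1/2 marker codes: G = Z Z' / (2 * sum p(1-p))."""
    p = M.mean(axis=0) / 2.0          # allele frequencies from all genotyped individuals
    Z = M - 2.0 * p                   # centered marker matrix
    c = 2.0 * np.sum(p * (1.0 - p))   # scaling constant
    return Z @ Z.T / c, Z, c

def gblup_predict(G, y_train, train, test, lam):
    """Predict genotypic values of `test`; lam = sigma_e^2 / sigma_g^2 (assumed known)."""
    mu = y_train.mean()  # simplification: training mean instead of a GLS estimate
    lhs = G[np.ix_(train, train)] + lam * np.eye(len(train))
    return mu + G[np.ix_(test, train)] @ np.linalg.solve(lhs, y_train - mu)

def rrblup_predict(Z, y_train, train, test, lam_marker):
    """Predict via marker effects; lam_marker = sigma_e^2 / sigma_marker^2."""
    mu = y_train.mean()
    Zt = Z[train]
    beta = np.linalg.solve(Zt.T @ Zt + lam_marker * np.eye(Z.shape[1]), Zt.T @ (y_train - mu))
    return mu + Z[test] @ beta

# Hypothetical data: 100 individuals, 200 markers, first 80 phenotyped.
rng = np.random.default_rng(0)
M = rng.binomial(2, 0.3, size=(100, 200)).astype(float)
G, Z, c = genomic_relationship(M)
y = Z @ rng.normal(scale=0.1, size=200) + rng.normal(size=100)
train, test = np.arange(80), np.arange(80, 100)

lam = 1.0
pred_gblup = gblup_predict(G, y[train], train, test, lam)
pred_rrblup = rrblup_predict(Z, y[train], train, test, lam_marker=c * lam)
print("max abs difference:", float(np.max(np.abs(pred_gblup - pred_rrblup))))
```

GBLUP solves a system whose size is the number of phenotyped individuals, whereas the marker‐effect form solves one whose size is the number of markers, so the choice between them is often a matter of computational convenience.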

Integrating biological knowledge into GP models can enhance prediction accuracy. A straightforward method to incorporate biological information is to use locus‐specific priors for marker effects (Brøndum et al., 2012; Gao et al., 2015; Z. Zhang et al., 2014). More complex methodologies involve grouping markers based on genome annotations, metabolic pathway annotation, and gene ontology categories. Once markers are grouped into classes, GP models can be fitted using haplotype blocks representing biologically functional units (Gao et al., 2017). GP models can also use class‐specific priors by assigning a different prior distribution per class (e.g., BayesCR) or by treating each marker class as a separate random genetic effect with a different variance (i.e., GFBLUP) (Edwards et al., 2015; Farooq et al., 2021; MacLeod et al., 2016). In addition, GP can be performed separately for each marker class to identify the group of markers that leads to higher prediction accuracy (Abdollahi‐Arpanahi et al., 2016; Do et al., 2015; MacLeod et al., 2016; Morota et al., 2014; Turner‐Hissong et al., 2020). These strategies have demonstrated improvement in prediction accuracy and may become increasingly useful as our knowledge of genomes across species grows.

Many GP models have been thoroughly studied and reviewed (Charmet et al., 2020; Crossa et al., 2017; de los Campos et al., 2013; González‐Recio et al., 2008; Meuwissen et al., 2001; O. A. Montesinos‐López et al., 2021; Negus et al., 2024). Although non‐parametric methods have in some cases shown slightly better accuracy than parametric methods, the differences remain small (Charmet et al., 2020; Heslot et al., 2012). Because mixed models and Bayesian linear regression models are easier to implement and generally perform well, they are commonly used in GS (O. A. Montesinos‐López et al., 2023a). However, non‐parametric models may perform better in some scenarios, such as categorical phenotypes, different data types, complex interactions, missing data, outliers, and correlated variables (Negus et al., 2024; Ogutu et al., 2011; Washburn et al., 2020, 2021). Similar prediction accuracy between DL‐GP and linear GP models indicates either a significant contribution of additive effects to complex phenotypes or the difficulty DL‐GP models have in estimating epistatic interactions, as they focus on high‐level genetic relatedness at current data sizes (Negus et al., 2024; Ubbens et al., 2021). Developing DL‐GP models capable of learning epistatic effects is necessary to improve and exploit the full potential of nonlinear models in GS (Ubbens et al., 2021). Additionally, ensembles of different models can improve prediction accuracy over individual models (Kick & Washburn, 2023). Despite the success of current methods, alternative models must continually be developed to accommodate newly available data types, increasing data volumes, and evolving human demands. DL‐GP models remain promising in this scenario since they can effectively manage large, complex datasets (Negus et al., 2024).

2.4. Training population design

Random sampling is a common method for creating training populations (Crossa et al., 2010), but this does not always lead to high prediction accuracy (Isidro et al., 2015). Conceptually, it is helpful to point out that a random sample is not always a representative sample. This statement is highly relevant since in actual GS practice, only one training population is used, unlike in simulation studies where random sampling is repeated for assessment. A representative sample reflects the key characteristics of the original population. Furthermore, when the training population makes up a large portion of the TPG, key characteristics of the training population need to approximate those of the remaining individuals. These considerations, along with data availability and whether TPG is defined or whether it involves progenies, explain the extensive research in training population design.

GP models can have low sensitivity to select high‐performing genotypes if they are not well represented in the training population (Bassi et al., 2016; O. A. Montesinos‐López et al., 2023a; Ornella et al., 2014). Therefore, there is significant interest in designing training populations to enhance prediction accuracy (Endelman et al., 2014; T. Guo et al., 2019; Sarinelli et al., 2019). Several optimization methods have been developed to identify optimal training populations of predefined sizes. One group of methods has been derived from the mixed model framework. These methods use marker data from the training population, the population of selection candidates, or the entire TPG to derive measurements of the prediction quality for different training population subsets and select the optimal one. Common prediction quality metrics are prediction error variance (PEV), coefficient of determination (CD), and prediction accuracy (Akdemir & Sánchez, 2019; Akdemir et al., 2015; Isidro et al., 2015; Ou & Liao, 2019; Rincent et al., 2017, 2012). A second group of methods leverages the concept of representative subset selection to explore the genetic space spanned by the TPG to construct the training population (e.g., PAM, FURS, MaxCD, and uniform sampling) (Bustos‐Korts et al., 2016; T. Guo et al., 2019; Z. Guo et al., 2014; Jansen & Van Hintum, 2007). A third group maximizes the relationship between the training population and TPG (e.g., Avg_GRM, Min_GRM, Min_GRM_size, max_GRM, OPT_MAX, OPT_MIN) (Atanda et al., 2021; Berro et al., 2019; Fernández‐González et al., 2024; Lemeunier et al., 2022). A fourth group includes other approaches: estimated theoretical accuracy based on a causal QTL model (Mangin et al., 2019); adversarial validation, a strategy commonly used in machine learning (ML) to minimize differences between the training and testing distributions (O. A. Montesinos‐López et al., 2023b); and a sparse selection index that identifies a training population for each individual (Lopez‐Cruz & de los Campos, 2021). These methods can be further classified into “targeted” (using information from the entire TPG) or “untargeted” (not using information from the population of selection candidates), with the former often performing better. For more information, detailed reviews and comparative studies of these methods are available (Alemu et al., 2024; Fernández‐González et al., 2023; Fernández‐González et al., 2024; Rio et al., 2022).
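As a rough illustration of the third group of methods, the sketch below ranks candidate training individuals by their mean genomic relationship with the TPG and keeps the top ones. It is a simplified stand-in and not the published Avg_GRM or related algorithms: the function names (`vanraden_grm`, `pick_training_set`), the simulated marker matrix, and the fixed training size are all assumptions made for the example.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from an n x m matrix of 0/1/2 allele counts."""
    p = M.mean(axis=0) / 2.0                  # allele frequency per marker
    Z = M - 2.0 * p                           # center each marker
    denom = 2.0 * np.sum(p * (1.0 - p))       # scaling used in VanRaden's first method
    return Z @ Z.T / denom

def pick_training_set(G, candidates, targets, size):
    """Keep the `size` candidates with the highest mean relationship to the targets."""
    mean_rel = G[np.ix_(candidates, targets)].mean(axis=1)
    order = np.argsort(-mean_rel)             # descending mean relationship
    return [candidates[i] for i in order[:size]]

# Simulated data (hypothetical): 30 individuals, 200 markers.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(30, 200)).astype(float)
G = vanraden_grm(M)

candidates = list(range(20))      # individuals that could be phenotyped
targets = list(range(20, 30))     # individuals to be predicted
train = pick_training_set(G, candidates, targets, size=8)
print(train)
```

Because the ranking uses marker data from the individuals to be predicted, this toy criterion would count as "targeted" in the classification above; a criterion such as PEV or CD would replace the mean relationship with a mixed-model quantity.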

Over generations, the patterns of linkage disequilibrium between markers and quantitative trait loci in breeding populations change due to recombination, selection, and drift. Shifts in the patterns of linkage disequilibrium, if not captured in the training populations, can lead to a decrease in GP accuracy (Neyhart et al., 2017). In addition, breeders have limited resources, and each year, they must allocate phenotyping resources between testing advanced breeding materials and testing earlier generations of breeding populations. This leads to the question of how to effectively leverage data across time to update training populations. According to simulated and empirical research, updating training populations with data from the most recent breeding cycles can improve prediction accuracy if the cycles are well connected by common ancestors, and if current environments are related to old environments (Auinger et al., 2016; Denis & Bouvet, 2013; Jannink, 2010; Pszczola & Calus, 2016). Updating the training population with the best predicted individuals, or both the best and worst predicted individuals, can improve prediction accuracy in the short term, but in the long term, this approach performs similarly to updating each cycle with random lines (Neyhart et al., 2017). Optimization methods discussed earlier could help design efficient updating strategies across breeding cycles (Neyhart et al., 2017; Fernández‐González et al., 2023). Rather than treating it as an optimization problem for designing training populations, this can also be considered a model selection problem to effectively utilize the phenotypic information generated over time in breeding programs (Fernández‐González et al., 2023). Further empirical validation experiments are required to determine the usefulness of training population optimization methods to update training populations over multiple breeding cycles. In summary, updating training populations requires careful consideration of the relationships among individuals within the training population and with recently tested individuals, as well as the relationship between past and current populations of environments.

Leveraging historical datasets in GS can help accelerate crop development pipelines by using already existing datasets and diversifying the training population (Ballén et al., 2022; Dawson et al., 2013; Rutkoski et al., 2015b). However, historical data were not necessarily collected with the requirements of GS in mind and are generally unbalanced across years, posing challenges to their incorporation (Sarinelli et al., 2019). Empirical studies showed that retaining historical data in training populations can favor prediction accuracy when target traits have high heritability, the size of the dataset is large, and the target population of environments and genotypes are well represented in the dataset (Fernández‐González et al., 2024; Rutkoski et al., 2015b; Sarinelli et al., 2019). Moderate‐to‐high prediction accuracies (0.5–0.85) have been observed when using historical data for GS in wheat (Triticum aestivum L.), maize, cotton (Gossypium hirsutum L.), sunflower (Helianthus annuus L.), and sugarcane (Saccharum officinarum L.) (Dawson et al., 2013; Fernández‐González et al., 2024; Gapare et al., 2018; Hao et al., 2019; Sarinelli et al., 2019; Storlie & Charmet, 2013; Shahi et al., 2025). Optimization methods to select training populations from historical data have outperformed random sampling (Fernández‐González et al., 2024; Sarinelli et al., 2019). Given the large and heterogeneous nature of historical datasets, identifying a training population for each individual (i.e., sparse selection index) in the TPG was proposed, and it achieved gains of 5%–10% compared with using the entire data as the training population (Lopez‐Cruz & de los Campos, 2021).

Another study successfully optimized the utilization of historical data in sunflower breeding using a two‐step process: first, a multi‐objective optimization approach determined the best years to include in the training population, balancing genotype and environment diversity, heritability, and the genetic relationship between training population and TPG; second, optimization methods determined the optimal training size and composition (Fernández‐González et al., 2024). Optimization methods could also be helpful to identify subsets of genotypes within historical datasets for further phenotyping if needed (Rutkoski et al., 2015b). High levels of genotype by environment interaction (GEI) across years within and among target populations of environments are among the major challenges faced when dealing with historical data. GP models accounting for GEI were shown to improve prediction accuracy when using historical data collected across multiple environments (Gapare et al., 2018). Researchers also suggested a more decentralized selection based on regional programs with different training populations, which could help deal with GEI in historical datasets, improve prediction accuracy, and enable breeding programs to better target germplasm to their adaptation environments (Dawson et al., 2013).

2.5. Prediction targets and selection metrics

GS can be used for parental and candidate selection (Bernardo, 2020; Robertsen et al., 2019). Most GS strategies focus on predicting the genotypic value contributed by average effects of alleles (also called “breeding value”) (Bernardo, 2020; Falconer & Mackay, 1996). The predicted genotypic value is the output of the standard GS methods (GBLUP and RRBLUP) (Bernardo, 2020; Varona et al., 2018; Piepho, 2009; Robertsen et al., 2019; VanRaden, 2008). In hybrid breeding, the general combining ability (GCA) measures the average performance of a parent in different hybrid combinations as a deviation from the population mean (Robertsen et al., 2019). Alternatively, the total genotypic value of an individual includes average effects of alleles, dominance deviations, and epistatic effects, and it is of most importance when selecting top candidates as new cultivars or hybrids (Goddard, 2009; Jannink, 2010; Meuwissen et al., 2001; Robertsen et al., 2019). Prediction of total genotypic value is more complex and requires methods that consider dominance and epistatic relationships. The specific combining ability (SCA) in hybrid breeding measures the progeny mean of a particular combination of parents as a deviation from the sum of their GCAs and the overall mean (Bernardo, 2020; Robertsen et al., 2019).
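To make the prediction of additive genotypic values concrete, the following minimal sketch computes RRBLUP-style marker effects as a ridge regression and sums them into predicted genotypic values. It assumes the shrinkage parameter `lam` (the ratio of residual variance to marker-effect variance) is already known, whereas in practice it is estimated, for example by REML. The data are simulated and all function and variable names are illustrative.

```python
import numpy as np

def rrblup_fit(Z, y, lam):
    """Ridge solution for marker effects; lam = residual variance / marker variance (assumed known)."""
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                         # center marker codes
    mu = y.mean()
    m = Z.shape[1]
    u_hat = np.linalg.solve(Zc.T @ Zc + lam * np.eye(m), Zc.T @ (y - mu))
    return mu, col_means, u_hat

def rrblup_predict(Z_new, mu, col_means, u_hat):
    """Predicted genotypic value = mean + sum of marker effects."""
    return mu + (Z_new - col_means) @ u_hat

# Simulated example (hypothetical): 260 lines, 50 markers coded 0/1/2.
rng = np.random.default_rng(1)
Z = rng.integers(0, 3, size=(260, 50)).astype(float)
u_true = rng.normal(size=50)
g = Z @ u_true                                 # true additive genotypic values
y = g + rng.normal(size=260)                   # phenotypes with residual noise

train, test = slice(0, 200), slice(200, 260)
mu, col_means, u_hat = rrblup_fit(Z[train], y[train], lam=1.0)
pred = rrblup_predict(Z[test], mu, col_means, u_hat)
accuracy = np.corrcoef(pred, g[test])[0, 1]    # correlation with true values
print(round(accuracy, 2))
```

The sketch captures only average effects of alleles; predicting total genotypic value would require additional terms for dominance and epistasis.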

Once predictions are made, breeders must define a selection metric that maximizes genetic gains (ΔG) while preserving genetic diversity (σA). Using the predicted genotypic values for selection could lead to the loss of rare favorable alleles and reduce long‐term genetic gains and prediction accuracy (Goiffon et al., 2017; Jannink, 2010; H. Liu et al., 2015). Alternative selection metrics focus on parental selection, long‐term response, and maintenance of genetic variation. Examples include (1) weighted genomic selection, which prioritizes low frequency favorable alleles (Jannink et al., 2010; M. Goddard, 2009; H. Liu et al., 2015); (2) predicted cross value, which evaluates SCA (Han et al., 2017); (3) usefulness criterion, which rates crosses based on the expected progeny distribution associated with an estimated mean and genetic variance (Yao et al., 2018); (4) optimal haploid value, optimal population value, and genotype building, which facilitate the efficient development of doubled haploids (Daetwyler et al., 2015; Goiffon et al., 2017); (5) look‐ahead selection, which optimizes the trade‐off between short‐term and long‐term genetic gains taking into consideration timeframe, mating strategy, and resource allocation (Moeinizade et al., 2019); and (6) genomic mating, which focuses on the complementation of parents using genetic information and the estimated genotypic values (Akdemir & Sánchez, 2016).
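As one worked example of these metrics, the sketch below computes a usefulness criterion of the form mean plus selection intensity times genetic standard deviation for a single cross. It rests on strong simplifying assumptions that are not taken from the cited studies: both parents are fully inbred (markers coded 0 or 2), progeny are doubled haploids, loci are unlinked, and selection accuracy is 1. Function names and marker effects are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def selection_intensity(p_selected):
    """Standardized selection differential for truncation selection on a normal trait."""
    z = NormalDist().inv_cdf(1.0 - p_selected)   # truncation point
    return NormalDist().pdf(z) / p_selected

def usefulness_criterion(parent1, parent2, effects, p_selected):
    """UC = progeny mean + i * sigma_G, for DH progeny of two inbred parents."""
    mu = 0.5 * (parent1 + parent2) @ effects     # mid-parent value
    segregating = parent1 != parent2             # only these loci create variance
    # Each segregating locus is 0 or 2 with probability 1/2, so it contributes effect**2.
    sigma_g = np.sqrt(np.sum(effects[segregating] ** 2))
    return mu + selection_intensity(p_selected) * sigma_g

effects = np.array([1.0, 0.5, -0.5, 2.0])        # hypothetical marker effects
p1 = np.array([2.0, 2.0, 0.0, 0.0])
p2 = np.array([2.0, 0.0, 2.0, 0.0])

uc = usefulness_criterion(p1, p2, effects, p_selected=0.05)
uc_selfed = usefulness_criterion(p1, p1, effects, p_selected=0.05)
print(round(selection_intensity(0.05), 3), round(uc, 3), round(uc_selfed, 3))
```

With linked loci, the progeny variance would also depend on recombination frequencies between markers, which this sketch deliberately ignores.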

2.6. Efficiency of GS in real breeding scenarios

Simulation studies help compare the prediction accuracy of different selection methods and explore the dynamics of short‐, medium‐, and long‐term GS in breeding programs (Bernardo & Yu, 2007; Daetwyler et al., 2013; J. M. Hickey et al., 2014; Isidro et al., 2015; Jannink, 2010; H. Liu et al., 2015; Yao et al., 2018). While simulations allow researchers to test multiple hypotheses quickly and at low cost to provide guidelines for optimizing GS, empirical studies are necessary to support the simulation results (Daetwyler et al., 2013; Isidro et al., 2015). Deviations in prediction accuracy of simulations from real breeding scenarios may occur due to epistasis, epigenetic modifications, and complex metabolic interactions (Daetwyler et al., 2013). GS's realized genetic gain and actual prediction accuracy in breeding programs can be evaluated via selection experiments to compare GS with conventional methods (i.e., phenotypic and pedigree‐based selection) (Beyene et al., 2015; Rutkoski et al., 2015a). Selection experiments in wheat, maize, rye (Secale cereale L.), and alfalfa (Medicago sativa L.) breeding programs have demonstrated equal or superior performance of GS compared to conventional selection methods for predictions within and across breeding cycles (Auinger et al., 2016; Krchov et al., 2015; Li et al., 2015; Rutkoski et al., 2015a; Sallam et al., 2015). These studies spanned up to 5 years of the breeding programs; used historical data from the breeding programs as training data; and evaluated biparental populations (Krchov et al., 2015), unbalanced historical datasets (Sallam et al., 2015) or closely related populations employed in recurrent selection (Li et al., 2015).
These studies showed that GS can exceed the conventional selection methods by allowing earlier (smaller t) or more intense selection (larger i) in larger populations (Bonnett et al., 2022; Li et al., 2015; Sallam et al., 2015), with genetic gains that are two‐ to four‐fold higher than phenotypic selection (Beyene et al., 2015). Prediction accuracy across breeding cycles depends on the relatedness between populations from different breeding cycles (Auinger et al., 2016). When GS was combined with measured phenotypes, prediction accuracy improved compared to GS or phenotypic selection alone (Krchov et al., 2015). Studies also showed that GS reduced the genetic variance of breeding populations more rapidly than phenotypic selection (Auinger et al., 2016; Rutkoski et al., 2015a).
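The roles of t and i in these results can be read from the breeder's equation. The form below is the standard textbook expression, written with the symbols used in this article and taking h as selection accuracy; the numerical values in the comparison are hypothetical and only illustrate the arithmetic, assuming equal i and σA for GS and phenotypic selection (PS).

```latex
\[
\Delta G = \frac{i \, h \, \sigma_A}{t}
\qquad\Longrightarrow\qquad
\frac{\Delta G_{\mathrm{GS}}}{\Delta G_{\mathrm{PS}}}
= \frac{h_{\mathrm{GS}} / t_{\mathrm{GS}}}{h_{\mathrm{PS}} / t_{\mathrm{PS}}}
= \frac{0.5 / 1}{0.7 / 3} \approx 2.1
\]
```

In this hypothetical case, a GS accuracy lower than that of phenotypic selection (0.5 versus 0.7) is more than offset by shortening the cycle from 3 years to 1 year.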

3. APPLICATIONS OF GS IN PLANT BREEDING

3.1. GS for harnessing diversity

Broadening the genetic basis (increasing σA) of modern crops is critical to enabling continuous genetic gains, extending crop cultivation to new areas, and ensuring a breeding program's long‐term success (Bernardo, 2014b; Fischer et al., 2010; Sanchez et al., 2024). Tropical landraces, exotic germplasm, wild ancestors, and first‐cycle inbred lines developed from landraces represent a great reservoir of genetic diversity for future crop improvement (Sanchez et al., 2024). Conserving plant genetic resources in gene banks has been of utmost importance to plant scientists since the beginning of modern selection. However, a disconnect existed between plant breeding programs and gene banks due to the need for strategies to identify valuable germplasm (diversity donors) for pre‐breeding (Yu et al., 2016). Considering how impractical screening collections containing hundreds of thousands of accessions can be when evaluating every trait of interest, alternative, efficient methods to target fewer accessions are needed. The development of low‐cost genotyping and advances in GS have offered a new approach to accessing and mining this unexplored diversity (Crossa et al., 2017; Yu et al., 2016).

Turbocharging gene banks through GS is a framework proposed to unlock unexplored diversity (Yu et al., 2016). It uses genotyping profiles of gene bank collections and GS to characterize and prioritize gene bank accessions for pre‐breeding. The main steps are identifying a reference set of reasonable size for genotyping based on prior knowledge, designing a training population representative of the genetic diversity of the gene bank collection, model building, and predicting and selecting untested accessions. Defining a reference set might not be necessary if genotypic data are available for the entire gene bank collection of the crop of interest (Yu et al., 2016). Two critical considerations to implement this framework are identifying phenotyping environments that enable accurate predictions under varied conditions and designing one or more training populations that allow accurate prediction across the entire gene bank collection. Pre‐existing knowledge of adaptation, experimental trials, and patterns of diversity across collections can guide these tasks. Turbocharging GS has shown promising results in mining natural variation in gene banks for wheat (Crossa et al., 2016; Schulthess et al., 2022), sorghum [Sorghum bicolor (L.) Moench] (Yu et al., 2016), barley (Hordeum vulgare L.) (Gonzalez et al., 2021), and maize (Allier, Teyssèdre, Lehermeier, Charcosset, et al., 2020). Additional research also highlighted the importance of training population design and selection to balance trait improvement and diversification (Dzievit et al., 2021; Tibbs‐Cortes et al., 2022; Yu et al., 2020, 2016). Lack of resources and infrastructure limits more extensive use of GS for harnessing diversity in other crops; however, this may change as sequencing costs decrease and tools for exploring plant genetic resources using GS are developed.

Pre‐breeding is the recurrent improvement of diverse donor lines by increasing favorable variants and reducing their performance gap with elite germplasm. Improved diversity donors released from pre‐breeding are then introduced into elite breeding programs. Developments in GS, combined with an increased interest in exploring and harnessing diversity from plant genetic resources, have sparked the development of GS‐based pre‐breeding and bridging strategies (Allier, Teyssèdre, Lehermeier, Moreau, et al., 2020; Gorjanc et al., 2016; Crossa et al., 2016; Rogers et al., 2022; Sanchez et al., 2024, 2023; Yu et al., 2016). GS can speed up pre‐breeding and improvement within elite‐by‐donor crosses (bridging) with more donor germplasm introgressed per unit time compared with phenotypic backcross selection (Bernardo, 2009; Combs & Bernardo, 2013b). However, if the performance gap is still large, some backcrossing may still be necessary (Allier, Teyssèdre, Lehermeier, Moreau, et al., 2020; Ordás et al., 2023; Sanchez et al., 2024; Tarter et al., 2003; Yu et al., 2016). Backcrossing generations help incorporate favorable variants, close the performance gap, and diminish negative impacts on short‐term genetic gains (Allier, Teyssèdre, Lehermeier, Moreau, et al., 2020; Sanchez et al., 2024; Tarter et al., 2003). The best bridging individuals are selected as potential parents for the elite breeding program (Figure 4). Pre‐breeding and bridging are complex tasks, and obtaining valuable new germplasm requires a sustained, long‐term effort (Ordás et al., 2023).

FIGURE 4.

Genomic selection (GS) to explore and harness genetic diversity. Turbocharging gene banks through GS is a framework proposed to explore available genetic resources in gene banks to identify diversity donors for pre‐breeding and bridging. GS can help speed up the recurrent improvement of diversity donors (pre‐breeding) and the improvement within elite by diversity donor crosses (bridging).

GS strategies can also identify and select desirable exotic alleles during the development of bridging populations. One strategy treats the genomic contribution of the diversity donor as a secondary trait and performs simultaneous selection on the donor contribution and agronomic traits using a selection index (Allier et al., 2019; Sukumaran et al., 2022). With this strategy, we cannot determine if the maintained donor genome in selected lines is favorable. However, we can ensure the donor genome is not lost, increasing the chances of transferring exotic favorable alleles. Another strategy partitions the GS equation into a component for markers of favorable alleles carried by the elite parent and another component for markers of favorable alleles carried by the diversity donor, named origin specific genomic selection (OSGS). OSGS worked well in barley and maize nested association mapping (NAM) populations, and authors suggested it could be extended to broader multi‐parental populations (C. J. Yang et al., 2020). Separating the predicted marker effects based on allele origins seems a better strategy than focusing only on donor genome contributions as it avoids biased selection of the elite background and loss of exotic novel variation, explicitly allowing for the selection of exotic favorable alleles. Strategies like OSGS can target the selection of favorable alleles when developing bridging populations and introducing favorable alleles into elite plant breeding programs, which is critical for broadening diversity in major crops.

GS strategies to identify diversity donors complementing elite recipients can also help guide the introgression of favorable exotic variation. Criteria previously proposed for identifying promising crosses in plant breeding can be extended for GS‐based bridging strategies. These methods evaluate the complementarity between parents at individual loci (Bernardo, 2014a), the complementarity between parents at haplotype segments (Optimal Haploid Value, Optimal Population Value) (Daetwyler et al., 2015; Goiffon et al., 2017), or the expected genetic mean and genetic variance of the progeny (Usefulness Criteria) (Yao et al., 2018). They have been applied and compared in the context of diversity donor by elite recipient crosses, suggesting that the selection criterion depends on breeding objectives (Allier, Teyssèdre, Lehermeier, Moreau, et al., 2020). For instance, evaluating complementarity at large haplotype segments or estimating the expected gain with a selection intensity of 5% is suitable if the objective is short‐term gains. For long‐term gains, estimating expected gains using higher selection intensities or evaluating haplotype complementarity at smaller haplotype segments is better, as more attention should be paid to the complementarity between donor‐elite recipients (Allier, Teyssèdre, Lehermeier, Moreau, et al., 2020).
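The effect of segment size on haplotype complementarity can be seen in a small sketch of the optimal haploid value, taken here as twice the sum, over haplotype segments, of the better of an individual's two haplotype values. The phased haplotypes, marker effects, and equal-size segments below are hypothetical, and the function name is an assumption made for the example.

```python
import numpy as np

def optimal_haploid_value(haps, effects, segment_size):
    """haps: 2 x m array of phased 0/1 alleles for one diploid individual."""
    n_markers = haps.shape[1]
    total = 0.0
    for start in range(0, n_markers, segment_size):
        stop = start + segment_size
        # Value of each of the two haplotypes within this segment.
        seg_values = haps[:, start:stop] @ effects[start:stop]
        total += seg_values.max()              # keep the better haplotype
    return 2.0 * total                         # doubled haploid carries two copies

haps = np.array([[1, 0, 0, 0],
                 [0, 1, 1, 1]], dtype=float)
effects = np.array([1.0, -1.0, 2.0, 0.5])      # hypothetical marker effects

ohv_whole = optimal_haploid_value(haps, effects, segment_size=4)
ohv_half = optimal_haploid_value(haps, effects, segment_size=2)
ohv_single = optimal_haploid_value(haps, effects, segment_size=1)
print(ohv_whole, ohv_half, ohv_single)         # 3.0 7.0 7.0
```

Smaller segments assume that more recombination events can be exploited, so the value can only stay the same or increase as segments shrink, which is consistent with using smaller segments when the objective is long-term gain.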

It has long been recognized that developing pre‐breeding and bridging populations requires significant time and resources, which has limited the utilization of the extensive plant genetic resources available for breeding. As a result, collaborative efforts between public and private organizations to share costs and efforts have emerged to characterize and facilitate the incorporation of novel exotic diversity to enrich modern maize's elite germplasm (Allier, Teyssèdre, Lehermeier, Charcosset, et al., 2020; Pollak, 2003; Rogers et al., 2022; Salhuana et al., 1998; Sanchez et al., 2024). Recent incorporation of GS in the Germplasm Enhancement of Maize (GEM) project is a good example (Rogers et al., 2022). Continuation and extension of these collaborative efforts to other crops can help harness genetic resources, facilitate the adoption of new GS advancements for pre‐breeding and bridging, optimize current strategies to increase short‐ and long‐term gains, and generate knowledge of crops’ global diversity.

GS increases the loss of genetic diversity due to intense selection and the use of few elite cultivars in breeding, leading to the loss of rare favorable alleles (Doublet et al., 2019). One way to counteract this effect involves the identification of favorable alleles from germplasm banks that were inadvertently lost during domestication and crop improvement, or alleles that have only recently become favorable due to climate change or changes in agronomic practices. This is feasible when companion GWAS (Tibbs‐Cortes et al., 2021) is conducted with an adequate genome‐wide marker coverage for allele mining. Turbocharging gene banks through GS and targeted introgression of favorable lost alleles into elite breeding pools will be critical for increasing genetic gain while preserving genetic diversity.

Controlled recombination approaches that utilize gene editing techniques, such as the CRISPR/Cas system, offer another exciting opportunity to enhance the frequency and distribution of single crossovers (CO) (Blary & Jenczewski, 2019; Mieulet et al., 2018). This could increase the genetic variation and frequency of favorable haplotypes available to breeders (Hayut et al., 2017; Mieulet et al., 2018; Taagen et al., 2022). However, a simulation study based on an empirical wheat dataset cautioned that increased recombination leads to a loss of prediction accuracy and a decrease in the retention of genetic diversity compared with existing breeding methods (Taagen et al., 2022). Further empirical and simulated research is necessary to assess at which stages in a breeding program altering CO frequency may increase genetic gain, how to optimize CO frequency to preserve genetic diversity, and how to integrate this approach with current breeding methods.

3.2. Integration of phenomics

Phenotyping is essential for the success of both GS and conventional phenotypic selection. The development of high‐throughput phenotyping (HTP) systems is revolutionizing how phenotyping is done in agriculture. It allows the evaluation of large populations and numerous traits across time and space at low cost with less labor, obtaining measurements that are more repeatable and precise than manual measurements (Duddu et al., 2019; J. Rutkoski et al., 2016; Sun et al., 2017). Deployment of HTP systems in breeding can result in higher selection intensity (i), improved selection accuracy (h), and higher genetic gain (ΔG) (Crossa et al., 2017; J. E. Rutkoski, 2019; Sun et al., 2017). Research on HTP systems includes the use of HTP‐derived traits in agriculture, imaging techniques, sensors, phenotyping platforms, data management strategies, image processing and analysis, 3D image reconstruction, robot technologies, and spatial variability handling (Andrade‐Sanchez et al., 2014; Araus et al., 2018; Moreira et al., 2019; Singh et al., 2019; X. Wang et al., 2018). In HTP systems, different sensors (e.g., RGB, multispectral, hyperspectral, thermal, RADAR, LiDAR) capture high‐resolution images, reflectance bands, and 3D point clouds that are used to construct vegetation indices, count plants or organs (e.g., panicles), and measure important traits (e.g., canopy coverage, senescence, biomass, biotic and abiotic stress) (Crossa et al., 2017; Ninomiya, 2022).

Integrating genomics and phenomics is an ongoing process driven by developments in next‐generation sequencing and HTP systems. HTP systems allow the collection of in‐season traits that breeders can use to optimize the efficiency of breeding programs by decreasing their dependence on resource‐intensive end‐season phenotyping (Parmley et al., 2019). Any HTP‐derived trait highly correlated with yield, disease resistance, end‐use quality, or other economically important traits can be used for early testing and selection of candidate lines within individual environments (Crossa et al., 2017; Rutkoski, 2019). These secondary traits can allow selection in generations where the primary trait cannot be measured accurately but secondary traits can, or when primary traits cannot be phenotyped due to insufficient seed, the high cost of manual phenotyping, or severe weather. However, using them as predictors without genotypic data has limited power to predict across environments. HTP‐derived traits can be used as covariates in single‐trait GP models or different traits in multi‐trait GP models that exploit the shared genetic correlations between traits to increase prediction accuracy (Bassi et al., 2016; Crain et al., 2018; Heffner et al., 2009; Rutkoski et al., 2016; Sun et al., 2017). Some traits used in multi‐trait GP models are canopy temperature, vegetation indexes, and canopy coverage because they are highly heritable and genetically correlated to important end‐season traits (e.g., grain yield, seed quality) (Crain et al., 2018; Rutkoski et al., 2016). Studies integrating genomics and phenomics for GS showed improvements in prediction accuracy, indicating that adding more traits generated more gain compared to using only a few HTP‐derived traits (Baba et al., 2020; Fernandes et al., 2018; Jia & Jannink, 2012; Moeinizade et al., 2020; Sun et al., 2017; Volpato et al., 2019). 
It is worth noting that when not considering GEI in these multi‐trait GP models, their performance fluctuates from the highest accuracy in one environment to no accuracy or even negative accuracy in another; this happens as correlations between HTP‐derived and target traits can differ between environments (Crain et al., 2018).

To enhance the effectiveness of multi‐trait GS, it is essential to phenotype HTP‐derived traits in the genotypes subject to prediction (Rutkoski et al., 2016); however, this approach requires field observations, which can be impractical. In contrast, in GS, candidate genotypes can be evaluated without field testing (Persa et al., 2021). An alternative method is to use crop growth model (CGM) predictions to substitute for phenotyping secondary HTP‐derived traits on candidate genotypes in different environments (Robert et al., 2020). Furthermore, understanding the factors that contribute to GEI and accounting for GEI in the GP models can improve the prediction accuracy in new environments when dealing with multiple traits. However, multi‐trait GP models can become computationally demanding as the number of traits increases.

HTP systems can also capture dynamic, or longitudinal, traits measured at multiple times through the growing season. Some examples are light interception, plant height, biomass, and vegetation indexes. There are four common ways of modeling longitudinal traits. First, the simple repeatability model that treats longitudinal traits as repeated measurements with correlations between records assumed constant. Second, a multi‐trait GP model that treats each time point as a different trait. However, this approach is computationally demanding when there are many time points. Third, a two‐stage analysis (parameters as data) that first fits a model to describe the trajectory of the longitudinal trait and then uses a multi‐trait GP model to analyze the fitted trajectory parameters (M. Campbell et al., 2018; Sun et al., 2017). Last, a single‐stage approach that models trait trajectories and performs genetic analysis simultaneously is preferred as information can be lost in two‐stage analysis.

Random regression models (RRMs) offer a robust framework for single‐stage analysis of longitudinal traits. RRMs have been widely used for the genetic evaluation of longitudinal traits in animal breeding (Oliveira et al., 2019) and have been used in plant breeding with examples in wheat, rice (Oryza sativa L.), and soybean [Glycine max (L.) Merr.] (Baba et al., 2020; M. Campbell et al., 2018; Lyra et al., 2020; Morales et al., 2024; Moreira et al., 2021, 2020; Sun et al., 2017). RRMs estimate the regression coefficients using a mixed model and obtain the genotypic values for any time during the continuous trajectory using simple algebra (Mrode, 2014; Sun et al., 2017). A critical step when using RRMs is selecting the mathematical function that best describes the trajectory of the trait. Some mathematical functions used to describe trait trajectories are power, exponential, logistic, splines, and orthogonal polynomials (Brien et al., 2020; Moreira et al., 2020; Oliveira et al., 2019; Sun et al., 2017). RRMs capture more genetic variation than single time point GP models, improving accuracy significantly (M. Campbell et al., 2019, 2018), and achieve higher accuracy than multi‐trait GP models treating each time point as a different trait (Momen et al., 2019; Sun et al., 2017). In addition, multi‐trait RRMs showed higher accuracy than single‐trait RRMs (Baba et al., 2020). Despite being measured at discrete time intervals, longitudinal traits vary continuously between intervals, and treating them as continuous traits during genetic analysis seems more appropriate. RRMs also allow the prediction of temporal genotypic effects, enabling selection based on the complete trajectory of the trait or at specific growth periods. For instance, breeders might be interested in biomass accumulation trajectories in response to stress conditions to identify stress‐resilient individuals (M. Campbell et al., 2018; Momen et al., 2019) or HTP‐derived canopy features (e.g., canopy coverage, volume) at critical developmental stages for yield predictions (Jarquín et al., 2018; G. Yang et al., 2024). A key feature of RRMs is that the correlations between time points can vary with closer pairs of time points having higher correlations than more distant pairs. Models based on random regression can be computationally demanding as they require appropriate modeling of trait trajectories, especially when considering multiple traits (Oliveira et al., 2019). More efficient computational algorithms, continuous reduction in HTP costs, and increased image processing speeds are necessary to facilitate the full integration of longitudinal traits into GS.
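The "simple algebra" that turns random regression coefficients into genotypic values at any time point can be sketched with a Legendre polynomial basis. The example assumes the coefficients have already been predicted by a mixed model; the coefficient values, the coefficient covariance matrix `K`, and the measurement days are hypothetical, and the polynomials are left unnormalized for simplicity.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(times, t_min, t_max, order):
    """Rescale times to [-1, 1] and evaluate Legendre polynomials 0..order."""
    x = 2.0 * (np.asarray(times, dtype=float) - t_min) / (t_max - t_min) - 1.0
    return legendre.legvander(x, order)        # shape: (n_times, order + 1)

times = [20, 50, 80]                           # days after planting (hypothetical)
Phi = legendre_basis(times, t_min=20, t_max=80, order=2)

# Predicted random regression coefficients for two genotypes (hypothetical values).
U = np.array([[1.0, 0.4, -0.2],
              [0.5, -0.3, 0.1]])
trajectories = U @ Phi.T                       # genotypic value of each genotype at each time

# Covariance of the coefficients (assumed) implies a covariance between time points.
K = np.diag([1.0, 0.5, 0.1])
cov_time = Phi @ K @ Phi.T
sd = np.sqrt(np.diag(cov_time))
corr_time = cov_time / np.outer(sd, sd)
print(np.round(trajectories, 2))
print(np.round(corr_time, 2))
```

In this toy case the genetic correlation between days 20 and 50 is higher than between days 20 and 80, which illustrates how the basis lets correlations decay with the distance between time points. Any other day within the measured range can be predicted by adding it to `times`.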

3.3. Integration of intermediate phenotypes

Integrating intermediate phenotypes (e.g., transcripts, metabolites) into GP models has been possible thanks to technological advances that enable their measurement in a high‐throughput manner (Varshney et al., 2021). Intermediate phenotypes can provide novel and complementary information, leading to higher prediction accuracies. For instance, transcriptomic data can help establish a connection between dynamic gene expression profiles and genotypic data. At the same time, metabolomic data can offer insight into what metabolites influence specific growth periods (Washburn et al., 2020). In short, multi‐omic datasets can help unveil biological processes controlling specific phenotypes under different environments, which is beneficial to model GEI and predict additive and non‐additive genetic effects (Ali et al., 2024; Razzaq et al., 2022). However, a study integrating transcriptomic and proteomic data suggested that omic data can dramatically increase prediction accuracy compared to GP only when plants are grown in similar conditions (Ali et al., 2024). Comparative studies between genomic, metabolomic, and transcriptomic data for performance prediction found that the GP model outperforms models using the other types of omics data (Y. Xu et al., 2017). On the other hand, combining transcriptomic and genomic data as predictors has improved predictive abilities with mRNA being more beneficial than small RNAs (Schrag et al., 2018). In other cases, metabolomic data alone or in combination with genomic data have been shown to increase accuracy compared with GP models or models combining genomic and transcriptomic data (Azodi, Pardo, et al., 2020; Z. Guo et al., 2016; X. Hu et al., 2019; Riedelsheimer et al., 2012; S. Wang et al., 2019; S. Xu et al., 2016). The prediction accuracy of models integrating genomic, proteomic, metabolomic, and transcriptomic datasets is highly influenced by the time of measurement of the intermediate phenotypes, as well as the number of transcripts or metabolites included in the model (Z. Guo et al., 2016) because transcriptomic, proteomic, and metabolomic profiles are specific to plant tissues and developmental stages. Thus, defining optimal tissues and sampling times is crucial for enhancing accuracy in multi‐omics GP models, which would depend on the target prediction trait and critical growth periods. In summary, new data types can be integrated into GP, and metabolites seem to be the most beneficial, significantly improving prediction accuracy compared to transcriptomic data (Azodi, Pardo, et al., 2020; Z. Guo et al., 2016; S. Wang et al., 2019; S. Xu et al., 2016). It is likely that the integration of metabolomic data in GP models brings information about trait regulatory networks and biochemical pathways that are not captured by transcriptomic or genomic data (Washburn et al., 2020).

Integrating intermediate phenotypes and genomic data is challenging due to their high dimensionality, which requires specialized GP models. In the simplest case, intermediate phenotypes can assist genomic feature preselection and improve the performance of GP (Ye et al., 2020), but this does not exploit the information contained in intermediate phenotypes. One way to deal with high dimensionality and facilitate multi‐omic data integration in GP models is by using reduced sets. A comparison between whole transcriptomic profiles and profiles of a reduced set of genes for GP of maize hybrids suggests that transcripts from a reduced number of genes may be enough to build GP models with reasonable accuracy (Fu et al., 2012; Zenke‐Philippi et al., 2016). In addition, several methods have been proposed to incorporate multi‐omics datasets in GP models including parametric (i.e., GBLUP, RRBLUP, LASSO, PLS, Bayes‐LASSO, BayesA, and BayesB) and non‐parametric methods (i.e., SVM, RKHS, RF) (Azodi, Pardo, et al., 2020; H. Hu et al., 2021; X. Wang & Wen, 2022; S. Xu et al., 2016; Y. Xu et al., 2017). Hybrid methods combining properties of linear mixed models and sparse regression modeling, re‐parameterization of the multivariate linear mixed model (MegaLMM), kernel methods, and ensemble methods combining the prediction from different algorithms have shown promising results for dealing with multi‐omics datasets (Azodi, Pardo, et al., 2020; H. Hu et al., 2021; Runcie et al., 2021; X. Wang & Wen, 2022; X. Zhou et al., 2013). Some of the limitations to using multi‐omics datasets include the cost of sampling methods, the computational capabilities of prediction models incorporating multiple data types, and the need for improved imputation methods.

3.4. GS under genotype by environment interaction

Genotype by environment interaction, the differential responses of genotypes across environments, is a long‐standing research topic for analyzing multi‐environment trial (MET) data in plant breeding. In practice, the statistical component for GEI may be ignored (i.e., treated as noise) to focus only on the overall performance of genotypes, reduced by subdividing the TPE into homogenous groups, or exploited by modeling the MET data to identify genotypes best suited to specific environments (Bernardo, 2020; Cooper & Delacy, 1994; Gauch, 2006). In the above strategies, the GEI term is either ignored and absorbed by the residual term or explicitly fitted into the model. METs are essential for predicting and selecting consistently high‐performing genotypes. In GS, prediction across environments is possible due to shared alleles between the training population and TPG, and the correlation between the environments used in the MET and the TPE (Cooper et al., 2023). Understanding the genetic basis and physiological and environmental causes of GEI is fundamental for enhancing the selection of superior genotypes. Approaches to considering GEI in GP models evolved in response to data availability and methodological advances, building on knowledge and methods previously developed to study GEI (Malosetti et al., 2013).

Initially, GP models were built using single‐environment phenotypes or adjusted mean phenotypes across environments (Bassi et al., 2016), reducing the computational demand of fitting multi‐environment datasets and covariates. Doing so comes with some loss of information, and selected genotypes are chosen based on their specific performance in any given environment or their average performance across environments. Alternatively, breeders can use more complex statistical methodologies to model GEI. Some examples are multiplicative models, which are a modified analysis of variance with the GEI decomposed into multiple orthogonal components (e.g., AMMI) (Gauch, 1988; Gollob, 1968; Priyadarshan, 2019). Linear mixed models treat GEI as a random effect and model the covariance between pairs of environments using the compound symmetry structure (equal variances and covariances) or an unstructured variance‐covariance matrix (Burgueño et al., 2008, 2011; Crossa et al., 2006, 2004; Cullis et al., 2010; Kelly et al., 2009; Priyadarshan, 2019; Smith et al., 2001; Stringer et al., 2017). The factor analytic model allows the variances and covariances between environments to differ, offering a more realistic approach when modeling GEI. Further developments in statistical modeling allowed breeders to use single‐step multi‐environment GP models, enabling a more efficient prediction for genotypic values in specific environments. Models with more complex covariance structures had higher prediction accuracies (Burgueño et al., 2011, 2012; Crossa et al., 2006; Kelly et al., 2009). However, because these models are based on observable phenotypic covariances among environments, they are explanatory a posteriori rather than predictive and do not allow for prediction in new environments.
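For reference, the covariance structures mentioned above can be written compactly. The notation is an assumption introduced here for illustration and is not taken from the cited studies: Σ is the covariance matrix of genotype effects across J environments, σ²_g and σ²_ge are the genotype and GEI variance components, Λ is a matrix of environment loadings on k latent factors, and Ψ holds environment‐specific variances.

```latex
\begin{aligned}
\text{Compound symmetry:}\quad
  & \Sigma_{jj} = \sigma^2_g + \sigma^2_{ge}, \qquad
    \Sigma_{jj'} = \sigma^2_g \quad (j \neq j') \\
\text{Unstructured:}\quad
  & \Sigma \text{ has } J(J+1)/2 \text{ free parameters} \\
\text{Factor analytic } (k \text{ factors}):\quad
  & \Sigma = \Lambda \Lambda^{\top} + \Psi, \qquad
    \Lambda \in \mathbb{R}^{J \times k}, \quad
    \Psi = \operatorname{diag}(\psi_1, \dots, \psi_J)
\end{aligned}
```

Compound symmetry uses two parameters regardless of J, whereas the factor analytic form approximates the unstructured matrix with far fewer parameters when k is small relative to J.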

Alternatively, joint regression analysis regresses each genotype's performance in different environments on the environmental mean to characterize the response patterns of genotypes to this environmental gradient (a reaction norm) (Finlay & Wilkinson, 1963; Stringer et al., 2017; Yates & Cochran, 1938). However, in the absence of genotype‐independent measures of environmental quality, these models cannot predict new environments, as environmental mean phenotypes can only be obtained after the actual experiment (Eberhart & Russell, 1966). This motivated the development of factorial regression (FR) models, which estimate genotypic sensitivities to explicit environmental covariates (EC) (Alimi et al., 2013; Boer et al., 2007; Crossa et al., 1999; Ly et al., 2018; Malosetti et al., 2004; Priyadarshan, 2019; van Eeuwijk et al., 1996; Vargas et al., 2006) for generating predictions for new environments. Thanks to developments in envirotyping technologies, weather and soil covariates (e.g., temperature, solar radiation, precipitation) at high resolution (e.g., hourly, daily) are now accessible. Modeling large numbers of ECs has the same constraints as genotypic data, with many correlated ECs each explaining a small amount of the total variance, and can be dealt with similarly (Gauch, 2006).
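The joint regression idea can be illustrated with a small sketch. The phenotype values below are made up, and the helper name `ols` is hypothetical; the code only shows the mechanics of regressing each genotype on the environmental mean, not any specific published implementation.

```python
# Toy joint regression on the environmental mean; all numbers are invented.
from statistics import mean

# phenotypes[genotype] = performance in each of four environments
phenotypes = {
    "G1": [4.0, 5.0, 6.0, 7.0],
    "G2": [5.0, 5.5, 6.0, 6.5],
    "G3": [3.0, 4.5, 6.0, 7.5],
}
n_env = 4

# Environmental mean: average of all genotypes within each environment.
env_means = [mean(p[j] for p in phenotypes.values()) for j in range(n_env)]

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# One reaction norm (intercept, slope) per genotype.
reaction_norms = {g: ols(env_means, y) for g, y in phenotypes.items()}
for genotype, (intercept, slope) in reaction_norms.items():
    print(genotype, round(intercept, 2), round(slope, 2))
```

A slope above 1 marks a genotype that responds more strongly than average to better environments. Because `env_means` is computed from the trial itself, the fitted lines cannot be evaluated for an environment that has not yet been observed, which is the limitation noted above.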

However, modeling the explicit interactions between high‐dimensional genomic and enviromic datasets can be challenging. One strategy uses variable selection procedures and then fits a reduced set of covariates using FR (Bernardo, 2020; Crossa, 2012; Heslot et al., 2014). The reaction norm model can be implicitly fit using GBLUP with a high dimensional variance–covariance structure incorporating the interactions between markers and ECs (Crossa et al., 2017; Jarquín et al., 2014; Pérez‐Rodríguez et al., 2015; Velu et al., 2016). This variance–covariance structure is the product of the genomic relationship matrix based on markers and the enviromics relationship matrix based on the ECs. Alternatively, explicit modeling of marker‐by‐environment interactions (GBLUP‐MxE) partitions marker effects into components that are common across environments and environment–specific deviations (Bassi et al., 2016; Cuevas et al., 2016; Lopez‐Cruz et al., 2015). Another approach uses factor analysis to predict observed GEI in terms of latent combinations of ECs (Tolhurst et al., 2022). These methods demonstrate that the extension of classical models for the combined use of genomic and enviromic data in GS can increase prediction accuracy compared with models that do not incorporate ECs.
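As a rough sketch of how such an interaction covariance can be assembled, the code below multiplies genomic and environmental similarities record by record. It assumes the "product" is taken cell by cell after both matrices are expanded to the genotype‐environment records; the matrices, the name `Omega`, and the function `interaction_covariance` are illustrative choices, not part of any cited software.

```python
# Toy matrices; values are illustrative only.
G = [[1.0, 0.5],
     [0.5, 1.0]]            # genomic relationships among 2 genotypes
Omega = [[1.0, 0.2, 0.0],
         [0.2, 1.0, 0.6],
         [0.0, 0.6, 1.0]]   # EC-based similarity among 3 environments

# One record per genotype-environment combination: (genotype index, environment index).
records = [(i, j) for i in range(len(G)) for j in range(len(Omega))]

def interaction_covariance(records, G, Omega):
    """Entry for records (i, j) and (k, l) is G[i][k] * Omega[j][l]."""
    return [[G[i][k] * Omega[j][l] for (k, l) in records] for (i, j) in records]

K = interaction_covariance(records, G, Omega)
```

Two records covary strongly only when the genotypes are related and the environments are similar, which is what allows information to be borrowed for a genotype in an environment characterized only by its ECs.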

Developing a quantitative environmental index (EI) that is biologically relevant and estimable for new environments offers an alternative approach to modeling the environmental effect and some patterned part of GEI (Li et al., 2018, 2021). Critical Environmental Regressor through Informed Search (CERIS) aims to uncover the primary environmental pattern that drives the separation of average performance of the population. CERIS searches for an environmental variable that is readily available and is not connected with the biological materials under evaluation, together with a window of time within which the values of the environmental variable are highly correlated with the mean performance of genotypes in different environments. The resulting EI can be combined with GS in joint genomic regression analysis (JGRA) for predicting performance across environments. This integrated procedure (CERIS‐JGRA), leveraging the concepts of phenotypic plasticity and GS, has been successfully applied to multiple traits and crops (T. Guo et al., 2020, 2024; Li et al., 2018, 2021; Mu et al., 2022; Resende et al., 2021; Tibbs‐Cortes et al., 2024; Wei et al., 2025). Using an EI offers higher interpretability, more targeted characterization of environments, and information that can help in the design of experiments or METs. More broadly, leveraging the EI from CERIS can provide an integrated framework for gene discovery underlying phenotypic plasticity (GWAS) and performance prediction across environments (GS) and offer insights into genetic architecture of complex traits. In summary, different methods to deal with GEI in GS are available and the decision on what method to use depends on balancing the trade‐offs between prediction accuracy, model complexity, computational time, the patterns embedded in the data, and how findings from different methods can be interpreted and used to guide research and practice.
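The window search behind an environmental index can be sketched as an exhaustive scan. This is a simplified illustration under stated assumptions, not the published CERIS implementation: it uses a single daily variable, averages it within each candidate window, and keeps the window whose values correlate most strongly with the environmental means. The data and the function names `pearson` and `search_window` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no signal
    return sxy / sqrt(sxx * syy)

def search_window(daily, env_means, min_len=2):
    """daily[e] holds one variable per day after planting in environment e.

    Returns (correlation, (start, end)) with `end` exclusive.
    """
    n_days = len(daily[0])
    best = (0.0, None)
    for start in range(n_days):
        for end in range(start + min_len, n_days + 1):
            index = [sum(series[start:end]) / (end - start) for series in daily]
            r = pearson(index, env_means)
            if abs(r) > abs(best[0]):
                best = (r, (start, end))
    return best

# Made-up data: 4 environments x 6 days of mean temperature.
daily_temp = [
    [20, 21, 15, 16, 22, 20],
    [21, 20, 18, 19, 21, 21],
    [20, 22, 21, 22, 20, 22],
    [22, 20, 24, 25, 21, 20],
]
env_means = [4.0, 5.0, 6.0, 7.0]   # mean phenotype of the population per environment

r, window = search_window(daily_temp, env_means)
print(window, round(r, 3))
```

Because the index is built from weather records rather than from the phenotypes, it can be computed for a new environment before any trial is grown there, which is what makes it usable for prediction.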

3.5. GS and crop growth models

CGMs are an attractive tool for modeling data with genotype‐by‐environment‐by‐management interactions, allowing the examination of specific combinations. For instance, hybrid‐by‐density interactions have been of great interest for selection in maize breeding and management optimization by agronomists (Cooper et al., 2021). CGMs are systems of differential equations representing physiological processes in plants that simulate crop growth given environmental and management inputs and genotype‐specific parameters. The parameters of CGMs can be tuned to fit different genotypes (model calibration) (Washburn et al., 2020). However, CGM calibration is an expensive and time‐consuming process because multiple physiological variables must be measured on the new genotype (cultivar) in as many environments as possible. A genetics‐based approach to addressing the calibration problem was suggested to link genes with the genotype‐specific parameters to improve simulation models (White & Hoogenboom, 2003; Yin et al., 2003). However, the integration of CGMs with genetics is computationally demanding and the progress of direct integration was slow.

Instead, CGMs were used to derive stress covariates from daily weather data during predicted crop development stages and incorporated into GP models to improve prediction accuracy in untested environments (Heslot et al., 2014; Ly et al., 2017; Rincent et al., 2019; Shahhosseini et al., 2021). Later, CGMs and GP (CGM‐GP) were directly integrated using approximate Bayesian computation (ABC), although the computational demands were still high (Technow et al., 2015). Empirical applications of the CGM‐GP algorithm have been performed in maize, wheat, and rice with extensions to multi‐environment GP models (M. T. Campbell et al., 2020; Cooper et al., 2016; Diepenbrock et al., 2021; Jighly et al., 2023; Messina et al., 2018; Onogi et al., 2016; Rincent et al., 2017; Robertsen et al., 2019; Technow et al., 2015). The CGM‐GP algorithm has shown superiority when predicting unrelated genotypes or in TPE with lower similarity to the training environments. Further integration of CGMs with GP models and extension to other crops and more complex scenarios depends on the availability of suitable CGMs and development of less computationally intensive algorithms. These integrated CGM‐GP frameworks can be useful not only for prediction in untested environments but also for the prediction of specific genotype–management combinations, biological inference, and identification of genetic loci underlying the genotype‐specific parameters.

3.6. Deep learning applications for GS

GS can be performed using any form of prediction model. This includes DL models that are the backbone of modern AI. DL is composed of multi‐layered neural network models like feedforward neural networks (FNNs), convolutional neural networks (CNNs), and Transformers. These three DL model types have been tested extensively for GS. FNNs are an earlier form of DL, popularized in the AI field in the 2000s and early 2010s. FNNs do not capture any implicit positional/relational information from the input, meaning that the order of inputs designated during model training is arbitrary, which could limit their ability to capture complex interactions and linkage disequilibrium. Inputs to FNNs for GS have included relationship matrices (Gianola et al., 2011) and marker genotypes (generally as allele dosages) (González‐Camacho et al., 2012; Pérez‐Rodríguez et al., 2012). FNNs have had mixed results in comparison to linear and Bayesian regression approaches (Azodi et al., 2019; Gianola et al., 2011; González‐Camacho et al., 2012; A. Montesinos‐López et al., 2018), with results often varying by trait and species (Mcdowell, 2016; O. A. Montesinos‐López et al., 2018; Sandhu et al., 2021; Zingaretti et al., 2020).

CNNs excelled at computer vision tasks in the 2010s. CNNs apply weights to inputs via a sliding filter. This strategy works well at capturing the implicit structure of neighboring pixels in an image. CNNs were adopted for GS because it was theorized that the structure of SNPs could be captured by this method. However, SNPs have an irregular structure (irregular distances) in comparison to images, and the appropriateness of using CNN approaches for GS has been questioned (Pook et al., 2020). Like FNNs, CNNs only exceeded the performance of traditional methods in some instances (Azodi et al., 2019; Bellot et al., 2018; Y. Liu & Wang, 2017; Ma et al., 2018).
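The sliding filter can be made concrete with a one‐dimensional convolution over allele dosages. This is a minimal sketch with invented data and fixed weights; in a trained CNN the kernel weights would be learned, and the function name `conv1d` is only illustrative.

```python
def conv1d(dosages, kernel, stride=1):
    """Slide `kernel` along `dosages` and return the weighted sums (no padding)."""
    width = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, dosages[start:start + width]))
        for start in range(0, len(dosages) - width + 1, stride)
    ]

# Allele dosages (0, 1, 2) for one individual at 8 ordered SNPs; values are made up.
snps = [0, 1, 2, 2, 0, 1, 1, 0]
kernel = [0.5, 1.0, 0.5]   # fixed here; learned during training in a real CNN

feature_map = conv1d(snps, kernel)
print(feature_map)  # [2.0, 3.5, 3.0, 1.5, 1.5, 1.5]
```

The filter treats every triplet of adjacent SNPs identically, regardless of whether the markers are a few base pairs or several megabases apart, which is the mismatch with image data raised above.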

Transformer models are another popular DL model that has proven highly successful at natural language tasks through the late 2010s and early 2020s (Devlin et al., 2018; Vaswani et al., 2017). Transformer models are different from their predecessors due to their attention mechanism and positional encoding. Positional encoding explicitly represents the structure of the input data to the model, and the attention mechanism allows the model to learn representations of the input variables in a context dependent manner. Transformer‐type GP models have begun to be investigated (R. Chen et al., 2024; Jubair et al., 2021; C. Wu, Zhang, et al., 2023).
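The two ingredients named above can be sketched in a few lines. The example assumes the sinusoidal positional encoding and scaled dot‐product attention of the original Transformer design, and it omits the learned query, key, and value projections for brevity (Q = K = V). The marker embeddings are made up.

```python
import math

def positional_encoding(n_positions, d_model):
    """Sinusoidal encoding: even columns use sine, odd columns use cosine."""
    pe = []
    for pos in range(n_positions):
        row = []
        for col in range(d_model):
            angle = pos / (10000 ** (2 * (col // 2) / d_model))
            row.append(math.sin(angle) if col % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def softmax(scores):
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product attention with Q = K = V = x (no learned projections)."""
    d = len(x[0])
    out, weights = [], []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(wi * v[c] for wi, v in zip(w, x)) for c in range(d)])
    return out, weights

# Four markers, each embedded as an invented 4-dimensional vector.
embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],   # same embedding as marker 0, different position
    [0.0, 0.0, 1.0, 0.0],
]
pe = positional_encoding(len(embeddings), 4)
x = [[e + p for e, p in zip(row, pos)] for row, pos in zip(embeddings, pe)]
context, weights = self_attention(x)
```

Markers 0 and 2 share an embedding but receive different inputs once the positional encoding is added, and each row of `weights` shows how much every other marker contributes to a marker's context‐dependent representation.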

It is important to note that DL‐GP models have some distinctions from conventional linear and Bayesian GP models described in Section 2.3. While DL and conventional approaches both use a trained model and genotypic data to estimate the performance of selection candidates, DL‐GP models generally operate with a closer tie with the phenotypic values because DL‐GP models have generally been trained directly on unprocessed observed phenotypic data (Ma et al., 2018; K. Wang et al., 2023; C. Wu, Zhang, et al., 2023). Additionally, conventional models are built on explicit genetic hypotheses, whereas DL‐GP models do not make biological assumptions; instead, they learn directly from the data itself.

Another difference that originates from the increased complexity of DL models is the model building process: DL‐GP models encompass many decisions regarding model architecture. In contrast, linear regression approaches either do not have hyperparameters (as seen in GBLUP) or learn them directly from the data (as with Bayesian methods). For DL‐GP, after choosing the model type (FNN, CNN, or Transformer), extensive hyperparameter tuning is essential for obtaining good training results. This tuning process adds complexity to the model building for DL‐GP models, as it requires considerable computing time to explore the entire hyperparameter space; different hyperparameter combinations often yield similar performance, and the optimal values can vary considerably between different traits and populations (Lourenço et al., 2024; O. A. Montesinos‐López et al., 2021). This process can make using DL‐GP models unattractive to practitioners, especially since conventional approaches still perform competitively at moderate data scales. Where DL‐GP models currently have the most promise is in situations where multiple types of input data are utilized.

In addition to their use in GP, DL models are well suited to processing and handling the individual component data types (genomics, phenomics, enviromics, and other multi‐omics). Among these tasks is data interpolation, where they have been successfully applied to genomic (J. Chen & Shi, 2019), spatio‐temporal environmental (Amato et al., 2020), and transcriptomic (Jiang et al., 2023; Talwar et al., 2018) data. GP models have also integrated process‐based CGMs like APSIM (Q. Chen et al., 2022; Kheir et al., 2023) or DSSAT (Von Bloh et al., 2024) using DL. CGMs have integrated ML and DL strategies in several ways that can be broadly categorized into three approaches. In parallel approaches, both CGMs and ML models use the same input variables, with the output being either the sum or product of both models’ outputs. In serial approaches, one model (either CGM or ML) is run before the other, with the first model's output serving as the input for the subsequent model. Lastly, in modular approaches, individual CGMs can be replaced or combined with ML models, utilizing either a parallel or a serial approach (N. Zhang et al., 2023). These approaches mirror the ways DL‐GP strategies can integrate CGMs for a knowledge‐ and data‐driven GP model.

4. PROSPECTS

The conception, implementation, and evolution of GS in plant breeding have been and will continue to be driven by current and future world breeding challenges (Figure 1). As described in this review, developments in GS have significantly improved the efficiency of plant breeding, and given its central role, we anticipate it will remain important. At the same time, continuous research is required to enhance its accuracy and applicability to future climatic scenarios and agricultural challenges. Areas for further improvement in GS include developing decision support systems, increasing the capabilities of current DL‐GP models, efficient integration of multiple data types, a deeper understanding of the genetic control of complex traits, and training for the next generation of plant scientists (Figure 5).

FIGURE 5.

Challenges and future trends of genomic selection (GS) in plant breeding. Critical challenges include decision‐making under uncertainty, the unrealized potential of deep learning (DL) models, data complexity from different data types, partial understanding of how phenotypes emerge from genomes interacting with environments through development, and the growing gap between technology capabilities and adoption. Possible solutions to these challenges are the development of decision support systems, explainable AI, efficient modeling of multiple data types, reconstruction and analysis of gene regulatory networks and pangenomes, and effective education and training programs for breeders and farmers. We foresee GS strategies designed around probability‐based decisions, effective DL models for GS, efficient integration of multiple data types into GP, better understanding of the genetic control of traits under different environments, and more hands‐on training for breeders and farmers on new technologies.

Developing decision support tools (DSTs) to design GS strategies is needed to help breeders make informed decisions and reduce uncertainty. As described in this review, GS strategies involve various decision‐making steps interacting with one another and with stochastic factors (e.g., environmental conditions, genetic recombination) that determine the success of GS. DSTs integrating already developed methods to quantify the effects on multiple objectives of decisions made through the plant breeding pipeline can enable more efficient data‐driven GS strategies (Kusmec et al., 2021). Stochastic modeling of outcomes considering environmental and genetic uncertainties and cost and time constraints can also improve breeders’ ability to make probability‐based decisions when designing GS strategies. DSTs can integrate different elements of GS to make optimized decisions about training populations, METs, GP models, selection metrics, and recurrent selection balancing genetic gains and genetic variation. However, the successful adoption of DSTs requires developing information systems, such as application program interfaces (APIs) that facilitate access to DSTs, databases, training materials, and suitable support services for breeders (Varshney et al., 2016). Some examples are the Integrated Breeding Platform, a web API that provides breeders with analytical tools and information related to designing and carrying out integrated breeding projects (https://integratedbreeding.net/1839/landing) (Delannay et al., 2012), and Breeding Insight, a program developed to provide resources to breeders for managing their breeding programs using new tools and technologies (https://breedinginsight.org/about‐bi/). Developments in this field are currently in their infancy; however, as new technologies emerge, further advances are expected. Adopting such DSTs and information systems also depends on the availability of user training and alignment between these tools and users’ end needs (Kusmec et al., 2021).

Improving current DL‐GP models is a promising avenue to increase accuracy when predicting the combined effects of multiple traits and genes across varied environmental conditions. Current DL‐GP models have been limited to model types predominantly designed for unrelated tasks and have yet to realize their full potential in GS (Negus et al., 2024; Ubbens et al., 2021). Due to the black‐box nature of DL models, it can be extremely challenging to explain why predictions are accurate or not. Incorporating additional information during the training process and progress in explainable AI can help improve the accuracy and interpretability of DL models, especially when dealing with complex interactions (Azodi, Tang, et al., 2020; Negus et al., 2024; Novakovsky et al., 2022; Wen et al., 2022). Explainable AI methods examine the inner workings of DL models to reveal the basis on which predictions are made. Explainability techniques are grouped into two categories: transparency (model‐specific) and post hoc interpretations (model‐agnostic) (Arrieta et al., 2020; Chari et al., 2020). Transparency offers perspective on how a model functions internally, whereas post hoc interpretations focus on how a model behaves (Chari et al., 2020). Prioritizing transparency when developing new types of neural networks with inherently interpretable units can improve overall model interpretability. However, building these models requires prior knowledge to design the network architecture (Novakovsky et al., 2022). Post hoc explainability techniques, also called post‐modeling explainability, are centered on understanding how already existing models perform (Arrieta et al., 2020). These methods can help identify relevant combinations of input features, quantify the contribution of each feature to prediction accuracy, and uncover the underlying interactions between inputs of the model (Novakovsky et al., 2022).
Explainable AI models can act as debugging tools to overcome the shortcomings of current DL‐GP models. However, establishing best practices and integrating these models into accessible analysis tools are still needed to enable their use in GS. More interpretable DL‐GP models could potentially lead to more robust and optimized models, increased breeders’ confidence in the models, and improved understanding of the genetic control of traits by revealing underlying effects and interactions.
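One widely used post hoc, model‐agnostic technique is permutation importance, sketched below as an illustration of how a feature's contribution to prediction accuracy can be quantified. It is offered as an example of the category, not as a method endorsed by the studies cited above. The `predict` function is a stand‐in for any trained black‐box model, and the marker data are simulated.

```python
import random

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)   # break the link between this feature and y
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            increases.append(mse(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Stand-in for a trained black-box model: uses markers 0 and 2, ignores marker 1.
def predict(row):
    return 2.0 * row[0] + 0.5 * row[2]

data_rng = random.Random(1)
X = [[data_rng.choice([0, 1, 2]) for _ in range(3)] for _ in range(50)]
y = [predict(row) for row in X]   # noise-free toy phenotypes

importances = permutation_importance(predict, X, y)
print([round(v, 3) for v in importances])
```

Shuffling the marker the model ignores leaves the error unchanged, while shuffling the large‐effect marker degrades it most. The procedure needs only the model's predictions, so it applies equally to FNN, CNN, or Transformer GP models.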

Embracing technology development holds tremendous potential for GS. Breeders now have access to many data types, including genomics, transcriptomics, metabolomics, enviromics, phenomics, CGM, and pangenomics. Most of them can be collected across time and space, resulting in complex datasets with multiple data types, each with multidimensionality (Kick & Washburn, 2023; Washburn et al., 2020). Therefore, data integration into GP models can be challenging owing to different data formats, levels of noise, sparsity, and environmental and developmental dependencies. These extra data layers offer vast information about factors affecting plant phenotypes and the interplay between those factors, which is becoming increasingly important in GS for predicting across different environments. However, they also impose significant challenges regarding data generation, preprocessing, analysis, and interpretability. Developments in machine learning methodologies, in particular DL algorithms, have great potential for more efficient modeling of multi‐layer datasets for GS, as they can handle not only large data but also raw data without any preprocessing (Feng et al., 2024; Ma et al., 2018; O. A. Montesinos‐López et al., 2021; Negus et al., 2024). Low‐cost sampling strategies and improved imputation methods that enable the regular use of different data types in GS, especially intermediate phenotypes, are necessary (Song et al., 2020; Westhues et al., 2019). Further developments in HTP systems and 3D image processing to characterize plant morphology can advance our capabilities of phenotyping yield‐, root‐, and stress‐related traits to make their phenotyping suitable for large‐scale applications (Araus et al., 2018; Gill et al., 2022). Given the increasing generation of data and scientific developments, we are entering an era where plant breeding is increasingly driven by technology and data.

The recently developed concepts of the pangenome and omnigenic model can help us improve GS strategies through better understanding of the genetic control of complex traits. The pangenome emerges from our ability to compare whole genome sequences across multiple individuals and is defined as the entire genetic diversity of a species with core (across all individuals) and variable (present in some individuals) genes (Bayer et al., 2020; Mahmood et al., 2022). The omnigenic model was hypothesized as a framework for understanding the complex genetic architectures of quantitative traits. It proposes that genes underlying a quantitative trait can be partitioned into a few core genes with direct effects on cellular and organismal processes leading to a change in the expected value of the phenotype and many peripheral genes that indirectly affect the phenotype through their regulatory effects on core genes (Boyle et al., 2017; X. Liu et al., 2019). In the GS framework, these advances improve our understanding of the functionality of genomes and regulatory networks underlying complex phenotypes. Pangenome data can improve imputation accuracy for sparse genotypic data and uncover structural variants, improving the power to identify genetic factors underlying complex phenotypes (Bradbury et al., 2022; Y. Zhou et al., 2022). These newly identified variants can be incorporated into GP models to enhance accuracy (Y. Zhou et al., 2022). At the same time, omnigenic regulatory networks can better model complex epistatic interactions and GEI, which can increase GP accuracy (H. Wang et al., 2021).

Advances in AI can enable complex pangenome assemblies, trait regulatory network discovery and reconstruction, and new algorithms to include new components in GP models. All these developments suggest we are moving toward a time when predictive and explanatory modeling strategies could be integrated for breeding new crops. Considering the rapid development of AI and its versatility, advances in this field can help improve GS by upgrading the entire framework or individual components, leading to optimal solutions for applying GS to develop crops adapted to future climate scenarios. AI's potential roles in GS developments depend significantly on the continuous improvement of our understanding of the genetic control of traits, sampling methods, breeding methodologies that can incorporate new AI developments, and adequate hands‐on training for breeders/farmers on state‐of‐the‐art AI applications in breeding.

5. CONCLUSIONS

In this review, we synthesized the essence of GS as a strategy that exploits the genotype–phenotype relationship and genome‐wide genetic relationships to predict the genetic merits of untested individuals. This strategy has proven successful at shortening breeding cycle lengths and increasing genetic gains. Owing to its advantages, GS is extensively and intensively used in plant breeding. Through the review, we emphasized that GS is more than just a prediction model. It consists of four steps: training population design, model building, prediction, and selection. Furthermore, within each step, several decisions must be made. An ongoing debate in GS is about its optimal implementation, but as described in this review, there is no unique recipe for GS implementation. As demonstrated throughout, GS is highly versatile, having the ability to play different roles in crop improvement (e.g., parental selection, harnessing diversity, selection at different stages of the breeding pipeline), and adaptable, allowing breeders to customize it to meet their individual needs (e.g., multi‐environment, multi‐trait, multi‐omic, CGM‐GP, and longitudinal traits). Many decisions made during the implementation of GS are related to finding solutions to resource allocation problems, and the context‐dependent, optimal solutions depend on each breeder's needs, resources, and the biology of their target species. GS is highly adaptable to new technologies and challenges, and we have seen its transformation over the years from prediction within individual environments or for overall performance to prediction of genotypes in untested environments, prediction within gene banks, and prediction by integrating phenomics, enviromics, CGMs, and other omics datasets. Recent developments in AI hold the potential for further improvements in the GS framework.
In summary, we expect that technological advances, research innovations, and emerging challenges in agriculture will continue to shape the role of GS in plant breeding.

AUTHOR CONTRIBUTIONS

Diana M. Escamilla: Investigation; resources; visualization; writing‐original draft; writing‐review and editing. Dongdong Li: Investigation; resources; visualization; writing‐original draft; writing‐review and editing. Karlene L. Negus: Investigation; visualization; writing‐original draft; writing‐review and editing. Kiara L. Kappelmann: Investigation; visualization; writing‐original draft; writing‐review and editing. Aaron Kusmec: Investigation; visualization; writing‐original draft; writing‐review and editing. Adam E. Vanous: Conceptualization; funding acquisition; writing‐review and editing. Patrick S. Schnable: Conceptualization; funding acquisition; writing‐review and editing. Xianran Li: Conceptualization; funding acquisition; visualization; writing‐review and editing. Jianming Yu: Conceptualization; funding acquisition; project administration; supervision; visualization; writing‐review and editing.

ACKNOWLEDGMENTS

This work was supported by the Agriculture and Food Research Initiative competitive grant (2021‐67013‐33833; 2023‐70412‐41087) and the Hatch project (1021013) from the USDA National Institute of Food and Agriculture, the In‐House Project 2090‐21000‐033‐00D and 5030‐21000‐065‐000‐D of the USDA Agricultural Research Service, the Iowa State University Raymond F. Baker Center for Plant Breeding, and the Iowa State University Plant Sciences Institute.

Open access funding provided by the Iowa State University Library.

CONFLICT OF INTEREST STATEMENT

Patrick S. Schnable is a co‐founder and CEO of Dryland Genetics, Inc and a co‐founder and managing partner of Data2Bio, LLC and EnGeniousAg, LLC. He is a member of the scientific advisory boards of Kemin Industries and Centro de Tecnologia Canavieira. He is a recipient of research funding from Iowa Corn and Bayer Crop Science. The other authors declare no conflicts of interest.

Escamilla, D. M., Li, D., Negus, K. L., Kappelmann, K. L., Kusmec, A., Vanous, A. E., Schnable, P. S., Li, X., & Yu, J. (2025). Genomic selection: Essence, applications, and prospects. The Plant Genome, 18, e70053. https://doi.org/10.1002/tpg2.70053

Assigned to Associate Editor Nonoy Bandillo.

DATA AVAILABILITY STATEMENT

No data are associated with this study because it is a review.

REFERENCES

  1. Abdollahi‐Arpanahi, R. , Morota, G. , Valente, B. D. , Kranis, A. , Rosa, G. J. M. , & Gianola, D. (2016). Differential contribution of genomic regions to marked genetic variation and prediction of quantitative traits in broiler chickens. Genetics, Selection, Evolution, 48(10), 10. 10.1186/s12711-016-0187-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Abraham, G. , Tye‐Din, J. A. , Bhalala, O. G. , Kowalczyk, A. , Zobel, J. , & Inouye, M. (2014). Accurate and robust genomic prediction of celiac disease using statistical learning. PLOS Genetics, 10(4), e1004374. 10.1371/JOURNAL.PGEN.1004137 [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Akdemir, D. , & Isidro‐Sánchez, J. (2019). Design of training populations for selective phenotyping in genomic prediction. Scientific Reports, 9(1), 1–15. 10.1038/s41598-018-38081-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Akdemir, D. , & Sánchez, J. I. (2016). Efficient breeding by genomic mating. Frontiers in Genetics, 7, 210. 10.3389/FGENE.2016.00210 [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Akdemir, D. , Sanchez, J. I. , & Jannink, J. L. (2015). Optimization of genomic selection training populations with a genetic algorithm. Genetics Selection Evolution, 47(1), 1–10. 10.1186/S12711-015-0116-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Alemu, A. , Åstrand, J. , Montesinos‐López, O. A. , Isidro y Sánchez, J. , Fernández‐Gónzalez, J. , Tadesse, W. , Vetukuri, R. R. , Carlsson, A. S. , Ceplitis, A. , Crossa, J. , Ortiz, R. , & Chawade, A. (2024). Genomic selection in plant breeding: Key factors shaping two decades of progress. Molecular Plant, 17(4), 552–578. 10.1016/J.MOLP.2024.03.007 [DOI] [PubMed] [Google Scholar]
  7. Ali, B. , Huguenin‐Bizot, B. , Laurent, M. , Chaumont, F. , Maistriaux, L. C. , Nicolas, S. , Duborjal, H. , Welcker, C. , Tardieu, F. , Mary‐Huard, T. , Moreau, L. , Charcosset, A. , Runcie, D. , & Rincent, R. (2024). High‐dimensional multi‐omics measured in controlled conditions are useful for maize platform and field trait predictions. Theoretical and Applied Genetics, 137(175). 10.1007/s00122-024-04679-w [DOI] [PubMed] [Google Scholar]
  8. Alimi, N. A. , Bink, M. C. , Dieleman, J. A. , Magán, J. J. , Wubs, A. M. , Palloix, A. , & van Eeuwijk, F. A. (2013). Multi‐trait and multi‐environment QTL analyses of yield and a set of physiological traits in pepper. Theoretical and Applied Genetics, 126(10), 2597–2625. 10.1007/S00122-013-2160-3/TABLES/12 [DOI] [PubMed] [Google Scholar]
  9. Allier, A. , Moreau, L. , Charcosset, A. , Teyssèdre, S. , & Lehermeier, C. (2019). Usefulness criterion and post‐selection parental contributions in multi‐parental crosses: Application to polygenic trait introgression. G3 :Genes,Genomes,Genetics, 9(5), 1469–1479. 10.1534/G3.119.400129 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Allier, A. , Teyssèdre, S. , Lehermeier, C. , Charcosset, A. , & Moreau, L. (2020). Genomic prediction with a maize collaborative panel: Identification of genetic resources to enrich elite breeding programs. Theoretical and Applied Genetics, 133(1), 201–215. 10.1007/S00122-019-03451-9 [DOI] [PubMed] [Google Scholar]
  11. Allier, A. , Teyssèdre, S. , Lehermeier, C. , Moreau, L. , & Charcosset, A. (2020). Optimized breeding strategies to harness genetic resources with different performance levels. BMC Genomics, 21(349). 10.1186/S12864-020-6756-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Amato, F. , Guignard, F. , Robert, S. , & Kanevski, M. (2020). A novel framework for spatio‐temporal prediction of environmental data using deep learning. Scientific Reports, 10(1), 1–11. 10.1038/s41598-020-79148-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Andrade‐Sanchez, P. , Gore, M. A. , Heun, J. T. , Thorp, K. R. , Carmo‐Silva, A. E. , French, A. N. , Salvucci, M. E. , & White, J. W. (2014). Development and evaluation of a field‐based high‐throughput phenotyping platform. Functional Plant Biology, 41(1), 68–79. 10.1071/FP13126 [DOI] [PubMed] [Google Scholar]
  14. Araus, J. L. , Kefauver, S. C. , Zaman‐Allah, M. , Olsen, M. S. , & Cairns, J. E. (2018). Translating high‐throughput phenotyping into genetic gain. Trends in Plant Science, 23(5), 451–466. 10.1016/J.TPLANTS.2018.02.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Atanda, S. A. , Olsen, M. , Burgueño, J. , Crossa, J. , Dzidzienyo, D. , Beyene, Y. , Gowda, M. , Dreher, K. , Zhang, X. , Prasanna, B. M. , Tongoona, P. , Danquah, E. Y. , Olaoye, G. , & Robbins, K. R. (2021). Maximizing efficiency of genomic selection in CIMMYT's tropical maize breeding program. Theoretical and Applied Genetics, 134(1), 279–294. 10.1007/S00122-020-03696-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Auinger, H. J. , Schönleben, M. , Lehermeier, C. , Schmidt, M. , Korzun, V. , Geiger, H. H. , Piepho, H. P. , Gordillo, A. , Wilde, P. , Bauer, E. , & Schön, C. C. (2016). Model training across multiple breeding cycles significantly improves genomic prediction accuracy in rye (Secale cereale L.). Theoretical and Applied Genetics, 129(11), 2043–2053. 10.1007/s00122-016-2756-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Azodi, C. B. , Bolger, E. , McCarren, A. , Roantree, M. , de los Campos, G. , & Shiu, S.‐H. (2019). Genomic prediction: Benchmarking parametric and machine learning models for genomic prediction of complex traits. G3: Genes, Genomes, Genetics, 9(10), 3117–3129. 10.1534/g3.119.400498 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Azodi, C. B. , Pardo, J. , VanBuren, R. , de los Campos, G. , & Shiu, S. H. (2020). Transcriptome‐based prediction of complex traits in maize. The Plant Cell, 32(1), 139–151. 10.1105/TPC.19.00332 [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Azodi, C. B. , Tang, J. , & Shiu, S. H. (2020). Opening the black box: Interpretable machine learning for geneticists. Trends in Genetics, 36(6), 442–455. 10.1016/J.TIG.2020.03.005 [DOI] [PubMed] [Google Scholar]
  20. Baba, T. , Momen, M. , Campbell, I. D. M. , Walia, H. , & Morota, G. I. (2020). Multi‐trait random regression models increase genomic prediction accuracy for a temporal physiological trait derived from high‐throughput phenotyping. PLoS ONE, 15(3), e0228118. 10.1371/journal.pone.0228118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Ballén‐Taborda, C. , Lyerly, J. , Smith, J. , Howell, K. , Brown‐Guedira, G. , Babar, M. A. , Harrison, S. A. , Mason, R. E. , Mergoum, M. , Murphy, J. P. , Sutton, R. , Griffey, C. A. , & Boyles, R. E. (2022). Utilizing genomics and historical data to optimize gene pools for new breeding programs: A case study in winter wheat. Frontiers in Genetics, 13, 964684. 10.3389/fgene.2022.964684 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Barredo Arrieta, A. , Díaz‐Rodríguez, N. , Del Ser, J. , Bennetot, A. , Tabik, S. , Barbado, A. , Garcia, S. , Gil‐Lopez, S. , Molina, D. , Benjamins, R. , Chatila, R. , & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115. 10.1016/J.INFFUS.2019.12.01 [DOI] [Google Scholar]
  23. Bassi, F. M. , Bentley, A. R. , Charmet, G. , Ortiz, R. , & Crossa, J. (2016). Breeding schemes for the implementation of genomic selection in wheat (Triticum spp.). Plant Science, 242, 23–36. 10.1016/J.PLANTSCI.2015.08.021 [DOI] [PubMed] [Google Scholar]
  24. Bayer, P. E. , Golicz, A. A. , Scheben, A. , Batley, J. , & Edwards, D. (2020). Plant pan‐genomes are the new reference. Nature Plants, 6(8), 914–920. 10.1038/s41477-020-0733-0 [DOI] [PubMed] [Google Scholar]
  25. Bellot, P. , de los Campos, G. , & Pérez‐Enciso, M. (2018). Can deep learning improve genomic prediction of complex human traits? Genetics, 210(3), 809–819. 10.1534/GENETICS.118.301298 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Bernardo, R. (2009). Genomewide selection for rapid introgression of exotic germplasm in maize. Crop Science, 49(2), 419–425. 10.2135/CROPSCI2008.08.0452 [DOI] [Google Scholar]
  27. Bernardo, R. (2014a). Genomewide selection of parental inbreds: Classes of loci and virtual biparental populations. Crop Science, 54(6), 2586–2595. 10.2135/CROPSCI2014.01.0088 [DOI] [Google Scholar]
  28. Bernardo, R. (2014b). Genomewide selection when major genes are known. Crop Science, 54(1), 68–75. 10.2135/CROPSCI2013.05.0315 [DOI] [Google Scholar]
  29. Bernardo, R. (2020). Breeding for quantitative traits in plants (3rd ed.). Stemma Press. http://stemmapress.com [Google Scholar]
  30. Bernardo, R. , & Yu, J. (2007). Prospects for genomewide selection for quantitative traits in maize. Crop Science, 47(3), 1082–1090. 10.2135/CROPSCI2006.11.0690 [DOI] [Google Scholar]
  31. Berro, I. , Lado, B. , Nalin, R. S. , Quincke, M. , & Gutiérrez, L. (2019). Training population optimization for genomic selection. The Plant Genome, 12(3), 190028. 10.3835/PLANTGENOME2019.04.0028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Beyene, Y. , Semagn, K. , Mugo, S. , Tarekegne, A. , Babu, R. , Meisel, B. , Sehabiague, P. , Makumbi, D. , Magorokosho, C. , Oikeh, S. , Gakunga, J. , Vargas, M. , Olsen, M. , Prasanna, B. M. , Banziger, M. , & Crossa, J. (2015). Genetic gains in grain yield through genomic selection in eight bi‐parental maize populations under drought stress. Crop Science, 55(1), 154–163. 10.2135/cropsci2014.07.0460 [DOI] [Google Scholar]
  33. Blary, A. , & Jenczewski, E. (2019). Manipulation of crossover frequency and distribution for plant breeding. Theoretical and Applied Genetics, 132(3), 575–592. 10.1007/s00122-018-3240-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Boer, M. P. , Wright, D. , Feng, L. , Podlich, D. W. , Luo, L. , Cooper, M. , & Van Eeuwijk, F. A. (2007). A mixed‐model quantitative trait loci (QTL) analysis for multiple‐environment trial data using environmental covariables for QTL‐by‐environment interactions, with an example in maize. Genetics, 177(3), 1801–1813. 10.1534/GENETICS.107.071068 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Bonnett, D. , Li, Y. , Crossa, J. , Dreisigacker, S. , Basnet, B. , Pérez‐Rodríguez, P. , Alvarado, G. , Jannink, J. L. , Poland, J. , & Sorrells, M. (2022). Response to early generation genomic selection for yield in wheat. Frontiers in Plant Science, 12, 718611. 10.3389/fpls.2021.718611 [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Boyle, E. A. , Li, Y. I. , & Pritchard, J. K. (2017). An expanded view of complex traits: From polygenic to omnigenic. Cell, 169(7), 1177. 10.1016/J.CELL.2017.05.038 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Bradbury, P. J. , Casstevens, T. , Jensen, S. E. , Johnson, L. C. , Miller, Z. R. , Monier, B. , Romay, M. C. , Song, B. , & Buckler, E. S. (2022). The Practical Haplotype Graph, a platform for storing and using pangenomes for imputation. Bioinformatics, 38(15), 3718–3724. 10.1093/bioinformatics/btac410 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Brien, C. , Jewell, N. , Watts‐Williams, S. J. , Garnett, T. , & Berger, B. (2020). Smoothing and extraction of traits in the growth analysis of noninvasive phenotypic data. Plant Methods, 16(1), 1–21. 10.1186/S13007-020-00577-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Brøndum, R. F. , Su, G. , Lund, M. S. , Bowman, P. J. , Goddard, M. E. , & Hayes, B. J. (2012). Genome position specific priors for genomic prediction. BMC Genomics, 13, 543. 10.1186/1471-2164-13-543 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Burgueño, J. , Crossa, J. , Cornelius, P. L. , & Yang, R. C. (2008). Using factor analytic models for joining environments and genotypes without crossover genotype × environment interaction. Crop Science, 48(4), 1291–1305. 10.2135/CROPSCI2007.11.0632 [DOI] [Google Scholar]
  41. Burgueño, J. , Crossa, J. , Cotes, J. M. , Vicente, F. S. , & Das, B. (2011). Prediction assessment of linear mixed models for multienvironment trials. Crop Science, 51(3), 944–954. 10.2135/CROPSCI2010.07.0403 [DOI] [Google Scholar]
  42. Burgueño, J. , de los Campos, G. , Weigel, K. , & Crossa, J. (2012). Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Science, 52(2), 707–719. 10.2135/CROPSCI2011.06.0299 [DOI] [Google Scholar]
  43. Bustos‐Korts, D. , Malosetti, M. , Chapman, S. , Biddulph, B. , & van Eeuwijk, F. (2016). Improvement of predictive ability by uniform coverage of the target genetic space. G3: Genes, Genomes, Genetics, 6(11), 3733–3747. 10.1534/G3.116.035410/-/DC1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Campbell, M. , Momen, M. , Walia, H. , & Morota, G. (2019). Leveraging breeding values obtained from random regression models for genetic inference of longitudinal traits. The Plant Genome, 12(2), 180075. 10.3835/PLANTGENOME2018.10.0075 [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Campbell, M. , Walia, H. , & Morota, G. (2018). Utilizing random regression models for genomic prediction of a longitudinal trait derived from high‐throughput phenotyping. Plant Direct, 2(9), e00080. 10.1002/PLD3.80 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Campbell, M. T. , Grondin, A. , Walia, H. , & Morota, G. (2020). Leveraging genome‐enabled growth models to study shoot growth responses to water deficit in rice. Journal of Experimental Botany, 71(18), 5669–5679. 10.1093/JXB/ERAA280 [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Chari, S. , Gruen, D. M. , Seneviratne, O. , & McGuinness, D. L. (2020). Foundations of explainable knowledge‐enabled systems. IOS Press. https://ebooks.iospress.nl/publication/54076 [Google Scholar]
  48. Charmet, G. , Tran, L.‐G. , Auzanneau, J. , Rincent, R. , & Bouchet, S. (2020). BWGS: A R package for genomic selection and its application to a wheat breeding programme. PLoS ONE, 15(4), e0222733. 10.1371/journal.pone.0222733 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Chen, J. , & Shi, X. (2019). Sparse convolutional denoising autoencoders for genotype imputation. Genes, 10(9), 652. 10.3390/GENES10090652 [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Chen, Q. , Zheng, B. , Chen, T. , & Chapman, S. C. (2022). Integrating a crop growth model and radiative transfer model to improve estimation of crop traits based on deep learning. Journal of Experimental Botany, 73(19), 6558–6574. 10.1093/jxb/erac291 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Chen, R. , Han, W. , Zhang, H. , Su, H. , Wang, Z. , Liu, X. , Jiang, H. , Ouyang, W. , & Dong, N. (2024). An embarrassingly simple approach to enhance transformer performance in genomic selection for crop breeding. https://arxiv.org/abs/2405.09585v3
  52. Combs, E. , & Bernardo, R. (2013a). Accuracy of genomewide selection for different traits with constant population size, heritability, and number of markers. The Plant Genome, 6(1), plantgenome2012.11.0030. 10.3835/plantgenome2012.11.0030 [DOI] [Google Scholar]
  53. Combs, E. , & Bernardo, R. (2013b). Genomewide selection to introgress semidwarf maize germplasm into U.S. Corn Belt inbreds. Crop Science, 53(4), 1427–1436. 10.2135/CROPSCI2012.11.0666 [DOI] [Google Scholar]
  54. Cooper, M. , & Delacy, I. H. (1994). Relationships among analytical methods used to study genotypic variation and genotype‐by‐environment interaction in plant breeding multi‐environment experiments. Theoretical Applied Genetics, 88, 561–572. 10.1007/BF01240919 [DOI] [PubMed] [Google Scholar]
  55. Cooper, M. , Powell, O. , Gho, C. , Tang, T. , & Messina, C. (2023). Extending the breeder's equation to take aim at the target population of environments. Frontiers in Plant Science, 14, 1129591. 10.3389/fpls.2023.1129591 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Cooper, M. , Technow, F. , Messina, C. , Gho, C. , & Totir, L. R (2016). Use of crop growth models with whole‐genome prediction: Application to a maize multienvironment trial. Crop Science, 56(5), 2141–2156. 10.2135/CROPSCI2015.08.0512 [DOI] [Google Scholar]
  57. Cooper, M. , Voss‐Fels, K. P. , Messina, C. D. , Tang, T. , & Hammer, G. L. (2021). Tackling G × E × M interactions to close on‐farm yield‐gaps: Creating novel pathways for crop improvement by predicting contributions of genetics and management to crop productivity. Theoretical Applied Genetics, 134, 1625–1644. 10.1007/s00122-021-03812-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Crain, J. , Mondal, S. , Rutkoski, J. , Singh, R. P. , & Poland, J. (2018). Combining high‐throughput phenotyping and genomic information to increase prediction and selection accuracy in wheat breeding. The Plant Genome, 11(1), 170043. 10.3835/PLANTGENOME2017.05.0043 [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Crossa, J. (2012). From genotype × environment interaction to gene × environment interaction. Current Genomics, 13(3), 225–244. 10.2174/138920212800543066 [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Crossa, J. , Burgueño, J. , Cornelius, P. L. , McLaren, G. , Trethowan, R. , & Krishnamachari, A. (2006). Modeling genotype × environment interaction using additive genetic covariances of relatives for predicting breeding values of wheat genotypes. Crop Science, 46(4), 1722–1733. 10.2135/CROPSCI2005.11-0427 [DOI] [Google Scholar]
  61. Crossa, J. , Jarquín, D. , Franco, J. , Pérez‐Rodríguez, P. , Burgueño, J. , Saint‐Pierre, C. , Vikram, P. , Sansaloni, C. , Petroli, C. , Akdemir, D. , Sneller, C. , Reynolds, M. , Tattaris, M. , Payne, T. , Guzman, G. , Peña, R. J. , Wenzl, P. , & Singh, S. (2016). Genomic prediction of gene bank wheat landraces. G3: Genes, Genomes, Genetics, 6(7), 1819–1834. https://academic.oup.com/g3journal/article‐abstract/6/7/1819/6027742 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Crossa, J. , Pérez‐Rodríguez, P. , Cuevas, J. , Montesinos‐López, O. , Jarquín, D. , de los Campos, G. , Burgueño, J. , González‐Camacho, J. M. , Pérez‐Elizalde, S. , Beyene, Y. , Dreisigacker, S. , Singh, R. , Zhang, X. , Gowda, M. , Roorkiwal, M. , Rutkoski, J. , & Varshney, R. K. (2017). Genomic selection in plant breeding: Methods, models, and perspectives. Trends in Plant Science, 22(11), 961–975. 10.1016/J.TPLANTS.2017.08.011 [DOI] [PubMed] [Google Scholar]
  63. Crossa, J. , Vargas, M. , van Eeuwijk, F. A. , Jiang, C. , Edmeades, G. O. , & Hoisington, D. (1999). Interpreting genotype × environment interaction in tropical maize using linked molecular markers and environmental covariables. Theoretical and Applied Genetics, 99(3–4), 611–625. 10.1007/s001220051276 [DOI] [PubMed] [Google Scholar]
  64. Crossa, J. , Yang, R.‐C. , & Cornelius, P. L. (2004). Studying crossover genotype × environment interaction using linear‐bilinear models and mixed models. JABES, 9, 362–380. 10.1198/108571104x4423 [DOI] [Google Scholar]
  65. Crossa, J. , Campos, G. D. L. , Pérez, P. , Gianola, D. , Burgueño, J. , Araus, J. L. , Makumbi, D. , Singh, R. P. , Dreisigacker, S. , Yan, J. , Arief, V. , Banziger, M. , & Braun, H.‐J. (2010). Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics, 186(2), 713–724. 10.1534/genetics.110.118521 [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Cuevas, J. , Crossa, J. , Soberanis, V. , Pérez‐Elizalde, S. , Pérez‐Rodríguez, P. , Campos, G. D. L. , Montesinos‐López, O. A. , & Burgueño, J. (2016). Genomic prediction of genotype × environment interaction kernel regression models. The Plant Genome, 9(3), plantgenome2016.03.0024. 10.3835/PLANTGENOME2016.03.0024 [DOI] [PubMed] [Google Scholar]
  67. Cullis, B. R. , Smith, A. B. , Beeck, C. P. , & Cowling, W. A. (2010). Analysis of yield and oil from a series of canola breeding trials. Part II. Exploring variety by environment interaction using factor analysis. Genome, 53(1), 1002–1016. 10.1139/G10-080 [DOI] [PubMed] [Google Scholar]
  68. Daetwyler, H. D. , Calus, M. P. , Pong‐Wong, R. , de Los Campos, G. , & Hickey, J. M. (2013). Genomic prediction in animals and plants: Simulation of data, validation, reporting, and benchmarking. Genetics, 193(2), 347–365. 10.1534/genetics.112.147983 [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Daetwyler, H. D. , Hayden, M. J. , Spangenberg, G. C. , & Hayes, B. J. (2015). Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection. Genetics, 200(4), 1341–1348. 10.1534/GENETICS.115.178038/-/DC1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Daetwyler, H. D. , Pong‐Wong, R. , Villanueva, B. , & Woolliams, J. A. (2010). The impact of genetic architecture on genome‐wide evaluation methods. Genetics, 185(3), 1021–1031. 10.1534/GENETICS.110.116855 [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Danilevicz, M. F. , Gill, M. , Anderson, R. , Batley, J. , Bennamoun, M. , Bayer, P. E. , & Edwards, D. (2022). Plant genotype to phenotype prediction using machine learning. Frontiers in Genetics, 13, 822173. 10.3389/FGENE.2022.822173 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Dawson, U. C. , Endelman, J. B. , Heslot, N. , Crossa, J. , Poland, J. , Dreisigacker, S. , Manès, Y. , Sorrells, M. E. , & Jannink, J. L. (2013). The use of unbalanced historical data for genomic selection in an international wheat breeding program. Field Crops Research, 154, 12–22. 10.1016/j.fcr.2013.07.020 [DOI] [Google Scholar]
  73. Delannay, X. , McLaren, G. , & Ribaut, J. M. (2012). Fostering molecular breeding in developing countries. Molecular Breeding, 29(4), 857–873. [Google Scholar]
  74. de los Campos, G. , Gianola, D. , Rosa, G. J. M. , Weigel, K. A. , & Crossa, J. (2010). Semi‐parametric genomic‐enabled prediction of genetic values using reproducing kernel Hilbert spaces methods. Genetical Research, 92, 295–308. 10.1017/S0016672310000285 [DOI] [PubMed] [Google Scholar]
  75. de los Campos, G. , Hickey, J. M. , Pong‐Wong, R. , Daetwyler, H. D. , & Calus, M. P. L. (2013). Whole‐genome regression and prediction methods applied to plant and animal breeding. Genetics, 193(2), 327–345. 10.1534/genetics.112.143313 [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Denis, M. , & Bouvet, J. M. (2013). Efficiency of genomic selection with models including dominance effect in the context of Eucalyptus breeding. Tree Genetics & Genomes, 9, 37–51. 10.1007/s11295-012-0528-1 [DOI] [Google Scholar]
  77. Desta, Z. A. , & Ortiz, R. (2014). Genomic selection: Genome‐wide prediction in plant improvement. Trend in Plant Science, 19(9), 592–601. 10.1016/j.tplants.2014.05.006 [DOI] [PubMed] [Google Scholar]
  78. Devlin, J. , Chang, M. W. , Lee, K. , & Toutanova, K. (2018). BERT: Pre‐training of deep bidirectional transformers for language understanding. NAACL HLT 2019—2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies—Proceedings of the Conference, 1, 4171–4186. https://arxiv.org/abs/1810.04805v2 [Google Scholar]
  79. Diepenbrock, C. , Tang, T. , Jines, M. , Technow, F. , Lira, S. , Podlich, D. , Cooper, M. , & Messina, C. (2021). Can we harness digital technologies and physiology to hasten genetic gain in U.S. maize breeding? 10.1101/2021.02.23.432477 [DOI] [PMC free article] [PubMed]
  80. Do, D. N. , Janss, L. L. G. , Jensen, J. , & Kadarmideen, H. N. (2015). SNP annotation‐based whole genomic prediction and selection: An application to feed efficiency and its component traits in pigs. Journal of Animal Science, 93(5), 2056–2063. 10.2527/jas.2014-8651 [DOI] [PubMed] [Google Scholar]
  81. Doublet, A. C. , Croiseau, P. , Fritz, S. , Michenet, A. , Hozé, C. , Danchin‐Burge, C. , Laloë, D. , & Restoux, G. (2019). The impact of genomic selection on genetic diversity and genetic gain in three French dairy cattle breeds. Genetics, Selection, Evolution, 51, 52. 10.1186/s12711-019-0495-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Duddu, H. S. N. , Johnson, E. N. , Willenborg, C. J. , & Shirtliffe, S. J. (2019). High‐throughput UAV image‐based method is more precise than manual rating of herbicide tolerance. Plant Phenomics, 2019, 6036453. 10.34133/2019/6036453 [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Dzievit, M. J. , Guo, T. , Li, X. , & Yu, J. (2021). Comprehensive analytical and empirical evaluation of genomic prediction across diverse accessions in maize. The Plant Genome, 14(3), e20160. 10.1002/tpg2.20160 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Eberhart, S. A. , & Russell, W. A. (1966). Stability parameters for comparing varieties. Crop Science, 6(1), 36–40. 10.2135/cropsci1966.0011183x000600010011x [DOI] [Google Scholar]
  85. Edwards, S. M. , Thomsen, B. , Madsen, P. , & Sørensen, O. (2015). Partitioning of genomic variance reveals biological pathways associated with udder health and milk production traits in dairy cattle. Genetics Selection Evolution, 47, 60. 10.1186/s12711-015-0132-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Endelman, J. B. , Atlin, G. N. , Beyene, Y. , Semagn, K. , Zhang, X. , Sorrells, M. E. , & Jannink, J. L. (2014). Optimal design of preliminary yield trials with genome‐wide markers. Crop Science, 54(1), 48–59. 10.2135/CROPSCI2013.03.0154 [DOI] [Google Scholar]
  87. Eyhart, J. L. , Tiede, T. , Lorenz, A. J. , & Smith, K. P. (2017). Evaluating methods of updating training data in long‐term genomewide selection. G3: Genes,Genomes,Genetics, 7(5), 1499–1510. 10.1534/g3.117.040550 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Falconer, D. S. , & Mackay, T. F. C. (1996). Introduction to quantitative genetics (4th ed.). Longmans Green. https://www.pearson.com/us/higher‐education/program/Falconer‐Introduction‐to‐Quantitative‐Genetics‐4th‐Edition/PGM194806.html [Google Scholar]
  89. Farooq, M. , van Dijk, A. D. J. , Nijveen, H. , Aarts, M. G. M. , Kruijer, W. , Nguyen, T.‐P. , Mansoor, S. , & de Ridder, D. (2021). Prior biological knowledge improves genomic prediction of growth‐related traits in Arabidopsis thaliana . Frontiers in Genetics, 11, 609117. 10.3389/fgene.2020.609117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Feng, W. , Gao, P. , & Wang, X. (2024). AI breeder: Genomic predictions for crop breeding. New Crops, 1, 100010. 10.1016/J.NCROPS.2023.12.005 [DOI] [Google Scholar]
  91. Fernandes, S. B. , Dias, K. O. G. , Ferreira, D. F. , & Brown, P. J. (2018). Efficiency of multi‐trait, indirect, and trait‐assisted genomic selection for improvement of biomass sorghum. Theoretical and Applied Genetics, 131(3), 747–755. 10.1007/s00122-017-3033-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Fernández‐González, J. , Akdemir, D. , & Isidro y Sánchez, J. (2023). A comparison of methods for training population optimization in genomic selection. Theoretical and Applied Genetics, 136(3), 1–20. 10.1007/S00122-023-04265-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Fernández‐González, J. , Haquin, B. , Combes, E. , Bernard, K. , Allard, A. , & Isidro Y Sánchez, J. (2024). Maximizing efficiency in sunflower breeding through historical data optimization. Plant Methods, 20, 42. 10.1186/s13007-024-01151-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Filler Hayut, S. , Melamed Bessudo, C. , & Levy, A. A. (2017). Targeted recombination between homologous chromosomes for precise breeding in tomato. Nature Communications, 8, 15605. 10.1038/ncomms15605 [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Finlay, K. W. , & Wilkinson, G. N. (1963). The analysis of adaptation in a plant‐breeding programme. Australian Journal of Agricultural Research, 14(6), 742–754. 10.1071/AR9630742 [DOI] [Google Scholar]
  96. Fischer, S. , Melchinger, A. E. , Korzun, V. , Wilde, P. , Schmiedchen, B. , Möhring, J. , Piepho, H. P. , Dhillon, B. S. , Würschum, T. , & Reif, J. C. (2010). Molecular marker assisted broadening of the Central European heterotic groups in rye with Eastern European germplasm. Theoretical and Applied Genetics, 120(2), 291–299. 10.1007/S00122-009-1124-0 [DOI] [PubMed] [Google Scholar]
  97. Fu, J. , Falke, K. C. , Thiemann, A. , Schrag, T. A. , Melchinger, A. E. , Scholten, S. , & Frisch, M. (2012). Partial least squares regression, support vector machine regression, and transcriptome‐based distances for prediction of maize hybrid performance with gene expression data. Theoretical and Applied Genetics, 124(5), 825–833. 10.1007/S00122-011-1747-9 [DOI] [PubMed] [Google Scholar]
  98. Fugeray‐Scarbel, A. , Bastien, C. , Dupont‐Nivet, M. , & Lemarié, S. , R2D2 Consortium . (2021). Why and how to switch to genomic selection: Lessons from plant and animal breeding experience. Frontiers in Genetics, 12, 629737. 10.3389/fgene.2021.629737 [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Gao, N., Li, J., He, J., Xiao, G., Luo, Y., Zhang, H., Chen, Z., & Zhang, Z. (2015). Improving accuracy of genomic prediction by genetic architecture‐based priors in a Bayesian model. BMC Genetics, 16, 120. 10.1186/s12863-015-0278-9
  100. Gao, N., Martini, J. W. R., Zhang, Z., Yuan, X., Zhang, H., Simianer, H., & Li, J. (2017). Incorporating gene annotation into genomic prediction of complex phenotypes. Genetics, 207(2), 489–501. 10.1534/genetics.117.300198
  101. Gapare, W., Liu, S., Conaty, W., Zhu, Q.‐H., Gillespie, V., Llewellyn, D., Stiller, W., & Wilson, I. (2018). Historical datasets support genomic selection models for the prediction of cotton fiber quality phenotypes across multiple environments. G3: Genes, Genomes, Genetics, 8(5), 1721–1732. 10.1534/g3.118.200140
  102. Gauch, H. G. (1988). Model selection and validation for yield trials with interaction. Biometrics, 44(3), 705. 10.2307/2531585
  103. Gauch, H. G. (2006). Statistical analysis of yield trials by AMMI and GGE. Crop Science, 46(4), 1488–1500. 10.2135/CROPSCI2005.07-0193
  104. Gianola, D. (2013). Priors in whole‐genome regression: The Bayesian alphabet returns. Genetics, 194(3), 573–596. 10.1534/genetics.113.151753
  105. Gianola, D., Okut, H., Weigel, K. A., & Rosa, G. J. M. (2011). Predicting complex quantitative traits with Bayesian neural networks: A case study with Jersey cows and wheat. BMC Genetics, 12(1), 1–14. 10.1186/1471-2156-12-87
  106. Gill, T., Gill, S. K., Saini, D. K., Chopra, Y., De Koff, J. P., & Sandhu, K. S. (2022). A comprehensive review of high throughput phenotyping and machine learning for plant stress phenotyping. Phenomics, 2(3), 156–183. 10.1007/s43657-022-00048-z
  107. Goddard, M. E. (2009). Genomic selection: Prediction of accuracy and maximization of long term response. Genetica, 136, 245–257. 10.1007/s10709-008-9308-0
  108. Goddard, M. E., & Hayes, B. J. (2007). Genomic selection. Journal of Animal Breeding and Genetics, 124(6), 323–330. 10.1111/j.1439-0388.2007.00702.x
  109. Goiffon, M., Kusmec, A., Wang, L., Hu, G., & Schnable, P. S. (2017). Improving response in genomic selection with a population‐based selection strategy: Optimal population value selection. Genetics, 206(3), 1675–1682. 10.1534/GENETICS.116.197103
  110. Gollob, H. F. (1968). A statistical model which combines features of factor analytic and analysis of variance techniques. Psychometrika, 33(1), 73–115. 10.1007/BF02289676
  111. Gonzalez, M. Y., Zhao, Y., Jiang, Y., Stein, N., Habekuss, A., Reif, J. C., & Schulthess, A. W. (2021). Genomic prediction models trained with historical records enable populating the German ex situ genebank bio‐digital resource center of barley (Hordeum sp.) with information on resistances to soilborne barley mosaic viruses. Theoretical and Applied Genetics, 134(7), 2181–2196. 10.1007/s00122-021-03815-0
  112. González‐Camacho, J. M., de los Campos, G., Pérez, P., Gianola, D., Cairns, J. E., Mahuku, G., Babu, R., & Crossa, J. (2012). Genome‐enabled prediction of genetic values using radial basis function neural networks. Theoretical and Applied Genetics, 125(4), 759–771. 10.1007/S00122-012-1868-9
  113. González‐Recio, O., Gianola, D., Long, N., Weigel, K. A., Rosa, G. J. M., & Avendaño, S. (2008). Nonparametric methods for incorporating genomic information into genetic evaluations: An application to mortality in broilers. Genetics, 178(4), 2305–2313. 10.1534/genetics.107.084293
  114. Gorjanc, G., Jenko, J., Hearne, S. J., & Hickey, J. M. (2016). Initiating maize pre‐breeding programs using genomic selection to harness polygenic variation from landrace populations. BMC Genomics, 17(1), 1–15. 10.1186/S12864-015-2345-Z
  115. Guo, T., Mu, Q., Wang, J., Vanous, A. E., Onogi, A., Iwata, H., Li, X., & Yu, J. (2020). Dynamic effects of interacting genes underlying rice flowering‐time phenotypic plasticity and global adaptation. Genome Research, 30(5), 673–683. 10.1101/GR.255703.119
  116. Guo, T., Wei, J., Li, X., & Yu, J. (2024). Environmental context of phenotypic plasticity in flowering time in sorghum and rice. Journal of Experimental Botany, 75(3), 1004–1015. 10.1093/JXB/ERAD398
  117. Guo, T., Yu, X., Li, X., Zhang, H., Zhu, C., Flint‐Garcia, S., McMullen, M. D., Holland, J. B., Szalma, S. J., Wisser, R. J., & Yu, J. (2019). Optimal designs for genomic selection in hybrid crops. Molecular Plant, 12, 390–401. 10.1016/j.molp.2018.12.022
  118. Guo, Z., Magwire, M. M., Basten, C. J., Xu, Z., & Wang, D. (2016). Evaluation of the utility of gene expression and metabolic information for genomic prediction in maize. Theoretical and Applied Genetics, 129(12), 2413–2427. 10.1007/S00122-016-2780-5
  119. Guo, Z., Tucker, D. M., Basten, C. J., Gandhi, H., Ersoz, E., Guo, B., Xu, Z., Wang, D., & Gay, G. (2014). The impact of population structure on genomic prediction in stratified populations. Theoretical and Applied Genetics, 127(3), 749–762. 10.1007/S00122-013-2255-X
  120. Han, Y., Cameron, J. N., Wang, L., & Beavis, W. D. (2017). The predicted cross value for genetic introgression of multiple alleles. Genetics, 205(2), 885–896. 10.1534/genetics.116.197095
  121. Hao, Y., Wang, H., Yang, X., Zhang, H., He, C., Li, D., Li, H., Wang, G., Wang, J., & Fu, J. (2019). Genomic prediction using existing historical data contributing to selection in biparental populations: A study of kernel oil in maize. The Plant Genome, 12(1), 180025. 10.3835/plantgenome2018.05.0025
  122. Heffner, E. L., Sorrells, M. E., & Jannink, J. L. (2009). Genomic selection for crop improvement. Crop Science, 49(1), 1–12. 10.2135/cropsci2008.08.0512
  123. Heslot, N., Akdemir, D., Sorrells, M. E., & Jannink, J. L. (2014). Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theoretical and Applied Genetics, 127(2), 463–480. 10.1007/S00122-013-2231-5
  124. Heslot, N., Yang, H.‐P., Sorrells, M. E., & Jannink, J.‐L. (2012). Genomic selection in plant breeding: A comparison of models. Crop Science, 52, 146–160. 10.2135/cropsci2011.06.0297
  125. Hickey, J. M., Chiurugwi, T., Mackay, I., & Powell, W. (2017). Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery. Nature Genetics, 49(9), 1297–1303. 10.1038/NG.3920
  126. Hickey, J. M., Dreisigacker, S., Crossa, J., Hearne, S., Babu, R., Prasanna, B. M., Grondona, M., Zambelli, A., Windhausen, V. S., Mathews, K., & Gorjanc, G. (2014). Evaluation of genomic selection training population designs and genotyping strategies in plant breeding programs using simulation. Crop Science, 54(6), 1476–1488. 10.2135/cropsci2013.03.0195
  127. Hickey, L. T., Hafeez, A. N., Robinson, H., Jackson, S. A., Leal‐Bertioli, S. C. M., Tester, M., Gao, C., Godwin, I. D., Hayes, B. J., & Wulff, B. B. H. (2019). Breeding crops to feed 10 billion. Nature Biotechnology, 37, 744–754. 10.1038/s41587-019-0152-9
  128. Hu, H., Campbell, M. T., Yeats, T. H., Zheng, X., Runcie, D. E., Covarrubias‐Pazaran, G., Broeckling, C., Yao, L., Caffe‐Treml, M., Gutiérrez, L., Smith, K. P., Tanaka, J., Hoekenga, O. A., Sorrells, M. E., Gore, M. A., & Jannink, J. L. (2021). Multi‐omics prediction of oat agronomic and seed nutritional traits across environments and in distantly related populations. Theoretical and Applied Genetics, 134(12), 4043–4054. 10.1007/S00122-021-03946-4
  129. Hu, X., Xie, W., Wu, C., & Xu, S. (2019). A directed learning strategy integrating multiple omic data improves genomic prediction. Plant Biotechnology Journal, 17(10), 2011–2020. 10.1111/PBI.13117
  130. Isidro, J., Jannink, J. L., Akdemir, D., Poland, J., Heslot, N., & Sorrells, M. E. (2015). Training set optimization under population structure in genomic selection. Theoretical and Applied Genetics, 128(1), 145–158. 10.1007/s00122-014-2418-4
  131. Jannink, J. L. (2010). Dynamics of long‐term genomic selection. Genetics Selection Evolution, 42(1), 1–11. 10.1186/1297-9686-42-35
  132. Jansen, J., & Van Hintum, T. (2007). Genetic distance sampling: A novel sampling method for obtaining core collections using genetic distances with an application to cultivated lettuce. Theoretical and Applied Genetics, 114(3), 421–428. 10.1007/S00122-006-0433-9
  133. Jarquín, D., Crossa, J., Lacaze, X., Du Cheyron, P., Daucourt, J., Lorgeou, J., Piraux, F., Guerreiro, L., Pérez, P., Calus, M., Burgueño, J., & de los Campos, G. (2014). A reaction norm model for genomic selection using high‐dimensional genomic and environmental data. Theoretical and Applied Genetics, 127(3), 595–607. 10.1007/S00122-013-2243-1
  134. Jarquín, D., Howard, R., Crossa, J., Beyene, Y., Gowda, M., Martini, J. W. R., Covarrubias, G., Burgueño, J., Pacheco, A., Grondona, M., Wimmer, V., & Prasanna, B. M. (2020). Genomic prediction enhanced sparse testing for multi‐environment trials. G3: Genes, Genomes, Genetics, 10, 2725–2739. 10.1534/g3.120.401349
  135. Jarquin, D., Howard, R., Xavier, A., & Das Choudhury, S. (2018). Increasing predictive ability by modeling interactions between environments, genotype and canopy coverage image data for soybeans. Agronomy, 8(4), 51. 10.3390/agronomy8040051
  136. Jia, Y., & Jannink, J. L. (2012). Multiple‐trait genomic selection methods increase genetic value prediction accuracy. Genetics, 192(4), 1513–1522. 10.1534/genetics.112.144246
  137. Jiang, J., Xu, J., Liu, Y., Song, B., Guo, X., Zeng, X., & Zou, Q. (2023). Dimensionality reduction and visualization of single‐cell RNA‐seq data with an improved deep variational autoencoder. Briefings in Bioinformatics, 24(3), bbad152. 10.1093/BIB/BBAD152
  138. Jighly, A., Thayalakumaran, T., O'Leary, G. J., Kant, S., Panozzo, J., Aggarwal, R., Hessel, D., Forrest, K. L., Technow, F., Tibbits, J. F. G., Totir, R., Hayden, M. J., Munkvold, J., & Daetwyler, H. D. (2023). Using genomic prediction with crop growth models enables the prediction of associated traits in wheat. Journal of Experimental Botany, 74(5), 1389–1402. 10.1093/JXB/ERAC393
  139. Jubair, S., Tucker, J. R., Henderson, N., Hiebert, C. W., Badea, A., Domaratzki, M., & Fernando, W. G. D. (2021). GPTransformer: A transformer‐based deep learning method for predicting fusarium related traits in barley. Frontiers in Plant Science, 12, 761402. 10.3389/FPLS.2021.761402
  140. Kelly, A. M., Cullis, B. R., Gilmour, A. R., Eccleston, J. A., & Thompson, R. (2009). Estimation in a multiplicative mixed model involving a genetic relationship matrix. Genetics Selection Evolution, 41(1), 1–9. 10.1186/1297-9686-41-33
  141. Kheir, A. M. S., Mkuhlani, S., Mugo, J. W., Elnashar, A., Nangia, V., Devare, M., & Govind, A. (2023). Integrating APSIM model with machine learning to predict wheat yield spatial distribution. Agronomy Journal, 115(8), 3188–3196. 10.1002/agj2.21470
  142. Kick, D. R., & Washburn, J. D. (2023). Ensemble of best linear unbiased predictor, machine learning and deep learning models predict maize yield better than each model alone. In Silico Plants, 5(2), 1–11. 10.1093/insilicoplants/diad015
  143. Krchov, L.‐M., Gordillo, G. A., & Bernardo, R. (2015). Multienvironment validation of the effectiveness of phenotypic and genomewide selection within biparental maize populations. Crop Science, 55(3), 1068–1075. 10.2135/cropsci2014.09.0608
  144. Kusmec, A., Zheng, Z., Archontoulis, S., Ganapathysubramanian, B., Hu, G., Wang, L., Yu, J., & Schnable, P. S. (2021). Interdisciplinary strategies to enable data‐driven plant breeding in a changing climate. One Earth, 4(3), 372–383. 10.1016/J.ONEEAR.2021.02.005
  145. Lado, B., Matus, I., Rodríguez, A., Inostroza, L., Poland, J., Belzile, F., del Pozo, A., Quincke, M., Castro, M., & von Zitzewitz, J. (2013). Increased genomic prediction accuracy in wheat breeding through spatial adjustment of field trial data. G3: Genes, Genomes, Genetics, 3(12), 2105–2114. 10.1534/g3.113.007807
  146. Lehermeier, C., Krämer, N., Bauer, E., Bauland, C., Camisan, C., Campo, L., Flament, P., Melchinger, A. E., Menz, M., Meyer, N., Moreau, L., Moreno‐González, J., Ouzunova, M., Pausch, H., Ranc, N., Schipprack, W., Schönleben, M., Walter, H., Charcosset, A., & Schön, C. C. (2014). Usefulness of multiparental populations of maize (Zea mays L.) for genome‐based prediction. Genetics, 198(1), 3–16. 10.1534/genetics.114.161943
  147. Lemeunier, P., Paux, E., Babi, S., Auzanneau, J., Goudemand‐Dugué, E., Ravel, C., & Rincent, R. (2022). Training population optimization for genomic selection improves the predictive ability of a costly measure in bread wheat, the gliadin to glutenin ratio. Euphytica, 218(8), 1–16. 10.1007/S10681-022-03062-4
  148. Li, X., Guo, T., Mu, Q., Li, X., & Yu, J. (2018). Genomic and environmental determinants and their interplay underlying phenotypic plasticity. Proceedings of the National Academy of Sciences of the United States of America, 115(26), 6679–6684. 10.1073/PNAS.1718326115
  149. Li, X., Guo, T., Wang, J., Bekele, W. A., Sukumaran, S., Vanous, A. E., McNellie, J. P., Tibbs‐Cortes, L. E., Lopes, M. S., Lamkey, K. R., Westgate, M. E., McKay, J. K., Archontoulis, S. V., Reynolds, M. P., Tinker, N. A., Schnable, P. S., & Yu, J. (2021). An integrated framework reinstating the environmental dimension for GWAS and genomic selection in crops. Molecular Plant, 14(6), 874–887. 10.1016/J.MOLP.2021.03.010
  150. Li, X., Wei, Y., Moore, K. J., Michaud, R., Viands, D. R., Hansen, J. L., Acharya, A., & Brummer, E. C. (2015). Genomic prediction of biomass yield in two selection cycles of a tetraploid alfalfa breeding population. The Plant Genome, 8(1), plantgenome2014.12.0090. 10.3835/plantgenome2014.12.0090
  151. Liu, H., Meuwissen, T. H., Sørensen, A. C., & Berg, P. (2015). Upweighting rare favourable alleles increases long‐term genetic gain in genomic selection programs. Genetics Selection Evolution, 47(1), 1–14. 10.1186/S12711-015-0101-0
  152. Liu, X., Li, Y. I., & Pritchard, J. K. (2019). Trans effects on gene expression can drive omnigenic inheritance. Cell, 177(4), 1022–1034.e6. 10.1016/j.cell.2019.04.014
  153. Liu, Y., & Wang, D. (2017). Application of deep learning in genomic selection [Paper presentation]. 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, USA. 10.1109/BIBM.2017.8218025
  154. Lopez‐Cruz, M., Crossa, J., Bonnett, D., Dreisigacker, S., Poland, J., Jannink, J. L., Singh, R. P., Autrique, E., & de los Campos, G. (2015). Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model. G3: Genes, Genomes, Genetics, 5(4), 569–582. 10.1534/G3.114.016097
  155. Lopez‐Cruz, M., & de los Campos, G. (2021). Optimal breeding‐value prediction using a sparse selection index. Genetics, 218(1), iyab030. 10.1093/genetics/iyab030
  156. Lorenz, A. J., & Smith, K. P. (2015). Adding genetically distant individuals to training populations reduces genomic prediction accuracy in barley. Crop Science, 55(6), 2657–2667. 10.2135/cropsci2014.12.0827
  157. Lourenço, V. M., Ogutu, J. O., Rodrigues, R. A. P., Posekany, A., & Piepho, H.‐P. (2024). Genomic prediction using machine learning: A comparison of the performance of regularized regression, ensemble, instance‐based, and deep learning methods on synthetic and empirical data. BMC Genomics, 25, 152. 10.1186/s12864-023-09933-x
  158. Ly, D., Chenu, K., Gauffreteau, A., Rincent, R., Huet, S., Gouache, D., Martre, P., Bordes, J., & Charmet, G. (2017). Nitrogen nutrition index predicted by a crop model improves the genomic prediction of grain number for a bread wheat core collection. Field Crops Research, 214, 331–340. 10.1016/j.fcr.2017.09.024
  159. Ly, D., Huet, S., Gauffreteau, A., Rincent, R., Touzy, G., Mini, A., Jannink, J.‐L., Cormier, F., Paux, E., Lafarge, S., Le Gouis, J., & Charmet, G. (2018). Whole‐genome prediction of reaction norms to environmental stress in bread wheat (Triticum aestivum L.) by genomic random regression. Field Crops Research, 216, 32–41. 10.1016/j.fcr.2017.08.020
  160. Lyra, D. H., Virlet, N., Sadeghi‐Tehran, P., Hassall, K. L., Wingen, L. U., Orford, S., Griffiths, S., Hawkesford, M. J., & Slavov, G. T. (2020). Functional QTL mapping and genomic prediction of canopy height in wheat measured using a robotic field phenotyping platform. Journal of Experimental Botany, 71(6), 1885–1898. 10.1093/jxb/erz545
  161. Ma, W., Qiu, Z., Song, J., Li, J., Cheng, Q., Zhai, J., & Ma, C. (2018). A deep convolutional neural network approach for predicting phenotypes from genotypes. Planta, 248(5), 1307–1318. 10.1007/S00425-018-2976-9
  162. MacLeod, I. M., Bowman, P. J., Vander Jagt, C. J., Haile‐Mariam, M., Kemper, K. E., Chamberlain, A. J., Schrooten, C., Hayes, B. J., & Goddard, M. E. (2016). Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits. BMC Genomics, 17, 144. 10.1186/s12864-016-2443-6
  163. Mahmood, U., Li, X., Fan, Y., Chang, W., Niu, Y., Li, J., Qu, C., & Lu, K. (2022). Multi‐omics revolution to promote plant breeding efficiency. Frontiers in Plant Science, 13, 1062952. 10.3389/FPLS.2022.1062952
  164. Malosetti, M., Ribaut, J. M., & van Eeuwijk, F. A. (2013). The statistical analysis of multi‐environment data: Modeling genotype‐by‐environment interaction and its genetic basis. Frontiers in Physiology, 4, 44. 10.3389/fphys.2013.00044
  165. Malosetti, M., Voltas, J., Romagosa, I., Ullrich, S. E., & van Eeuwijk, F. A. (2004). Mixed models including environmental covariables for studying QTL by environment interaction. Euphytica, 137(1), 139–145. 10.1023/B:EUPH.0000040511.46388.ef
  166. Mangin, B., Rincent, R., Rabier, C. E., Moreau, L., & Goudemand‐Dugue, E. (2019). Training set optimization of genomic prediction by means of EthAcc. PLoS ONE, 14(2), e0205629. 10.1371/JOURNAL.PONE.0205629
  167. McDowell, R. M. (2016). Genomic selection with deep neural networks [Unpublished PhD thesis]. Iowa State University. 10.31274/etd-180810-5600
  168. Merrick, L. F., Herr, A. W., Sandhu, K. S., Lozada, D. N., & Carter, A. H. (2022). Optimizing plant breeding programs for genomic selection. Agronomy, 12(3), 714. 10.3390/AGRONOMY12030714
  169. Messina, C. D., Technow, F., Tang, T., Totir, R., Gho, C., & Cooper, M. (2018). Leveraging biological insight and environmental variation to improve phenotypic prediction: Integrating crop growth models (CGM) with whole genome prediction (WGP). European Journal of Agronomy, 100, 151–162. 10.1016/j.eja.2018.01.007
  170. Meuwissen, T. H. E., Hayes, B. J., & Goddard, M. E. (2001). Prediction of total genetic value using genome‐wide dense marker maps. Genetics, 157(4), 1819–1829. https://academic.oup.com/genetics/article/157/4/1819/6048353
  171. Mieulet, D., Aubert, G., Bres, C., Klein, A., Droc, G., Vieille, E., Rond‐Coissieux, C., Sanchez, M., Dalmais, M., Mauxion, J.‐P., Rothan, C., Guiderdoni, E., & Mercier, R. (2018). Unleashing meiotic crossovers in crops. Nature Plants, 4(12), 1010–1016. 10.1038/s41477-018-0211-2
  172. Moeinizade, S., Hu, G., Wang, L., & Schnable, P. S. (2019). Optimizing selection and mating in genomic selection with a look‐ahead approach: An operations research framework. G3: Genes, Genomes, Genetics, 9(7), 2123. 10.1534/G3.118.200842
  173. Moeinizade, S., Kusmec, A., Hu, G., Wang, L., & Schnable, P. S. (2020). Multi‐trait genomic selection methods for crop improvement. Genetics, 215(4), 931–945. 10.1534/genetics.120.303305
  174. Molotoks, A., Smith, P., & Dawson, T. P. (2021). Impacts of land use, population, and climate change on global food security. Food and Energy Security, 10(1), e261. 10.1002/FES3.261
  175. Momen, M., Campbell, M. T., Walia, H., & Morota, G. (2019). Predicting longitudinal traits derived from high‐throughput phenomics in contrasting environments using genomic Legendre polynomials and B‐splines. G3: Genes, Genomes, Genetics, 9(10), 3369–3380. 10.1534/g3.119.400346
  176. Montesinos‐López, A., Montesinos‐López, O. A., Gianola, D., Crossa, J., & Hernández‐Suárez, C. M. (2018). Multi‐environment genomic prediction of plant traits using deep learners with dense architecture. G3: Genes, Genomes, Genetics, 8(12), 3813–3828. 10.1534/G3.118.200740
  177. Montesinos‐López, O. A., Kismiantini, & Montesinos‐López, A. (2023a). Two simple methods to improve the accuracy of the genomic selection methodology. BMC Genomics, 24(1), 1–15. 10.1186/S12864-023-09294-5
  178. Montesinos‐López, O. A., Kismiantini, & Montesinos‐López, A. (2023b). Designing optimal training sets for genomic prediction using adversarial validation with probit regression. Plant Breeding, 142(5), 594–606. 10.1111/pbr.13124
  179. Montesinos‐López, O. A., Montesinos‐López, A., Crossa, J., Gianola, D., Hernández‐Suárez, C. M., & Martín‐Vallejo, J. (2018). Multi‐trait, multi‐environment deep learning modeling for genomic‐enabled prediction of plant traits. G3: Genes, Genomes, Genetics, 8(12), 3829–3840. 10.1534/G3.118.200728
  180. Montesinos‐López, O. A., Montesinos‐López, A., Pérez‐Rodríguez, P., Barrón‐López, J. A., Martini, J. W. R., Fajardo‐Flores, S. B., Gaytan‐Lugo, L. S., Santana‐Mancilla, P. C., & Crossa, J. (2021). A review of deep learning applications for genomic selection. BMC Genomics, 22(1). 10.1186/s12864-020-07319-x
  181. Morales, N., Anche, M. T., Kaczmar, N. S., Lepak, N., Ni, P., Romay, M. C., Santantonio, N., Buckler, E. S., Gore, M. A., Mueller, L. A., & Robbins, K. R. (2024). Spatio‐temporal modeling of high‐throughput multispectral aerial images improves agronomic trait genomic prediction in hybrid maize. Genetics, 227(1), iyae037. 10.1093/GENETICS/IYAE037
  182. Moreira, F. F., Hearst, A. A., Cherkauer, K. A., & Rainey, K. M. (2019). Improving the efficiency of soybean breeding with high‐throughput canopy phenotyping. Plant Methods, 15(1), 1–9. 10.1186/s13007-019-0519-4
  183. Moreira, F. F., Oliveira, H. R., Lopez, M. A., Abughali, B. J., Gomes, G., Cherkauer, K. A., Brito, L. F., & Rainey, K. M. (2021). High‐throughput phenotyping and random regression models reveal temporal genetic control of soybean biomass production. Frontiers in Plant Science, 12, 1749. 10.3389/fpls.2021.715983
  184. Moreira, F. F., Oliveira, H. R., Volenec, J. J., Rainey, K. M., & Brito, L. F. (2020). Integrating high‐throughput phenotyping and statistical genomic methods to genetically improve longitudinal traits in crops. Frontiers in Plant Science, 11, 681. 10.3389/fpls.2020.00681
  185. Morota, G., Abdollahi‐Arpanahi, R., Kranis, A., & Gianola, D. (2014). Genome‐enabled prediction of quantitative traits in chickens using genomic annotation. BMC Genomics, 15, 109. 10.1186/1471-2164-15-109
  186. Mrode, R. A. (2014). Linear models for the prediction of animal breeding values (3rd ed.). CABI. 10.1079/9781780643915.0000
  187. Mu, Q., Guo, T., Li, X., & Yu, J. (2022). Phenotypic plasticity in plant height shaped by interaction between genetic loci and diurnal temperature range. New Phytologist, 233(4), 1768–1779. 10.1111/NPH.17904
  188. Negus, K. L., Li, X., Welch, S. M., & Yu, J. (2024). The role of artificial intelligence in crop improvement. Advances in Agronomy, 184, 1–66. 10.1016/BS.AGRON.2023.11.001
  189. Ninomiya, S. (2022). High‐throughput field crop phenotyping: Current status and challenges. Breeding Science, 72, 3–18. 10.1270/jsbbs.21069
  190. Novakovsky, G., Dexter, N., Libbrecht, M. W., Wasserman, W. W., & Mostafavi, S. (2022). Obtaining genetics insights from deep learning via explainable artificial intelligence. Nature Reviews Genetics, 24(2), 125–137. 10.1038/s41576-022-00532-2
  191. Ogutu, J. O., Piepho, H. P., & Schulz‐Streeck, T. (2011). A comparison of random forests, boosting and support vector machines for genomic selection. BMC Proceedings, 5(SUPPL. 3), 1–5. 10.1186/1753-6561-5-S3-S11
  192. Oliveira, H., Brito, L., Lourenco, D., Silva, F., Jamrozik, J., Schaeffer, L., & Schenkel, F. (2019). Invited review: Advances and applications of random regression models: From quantitative genetics to genomics. Journal of Dairy Science, 102(9), 7664–7683. 10.3168/jds.2019-16265
  193. Onogi, A., Watanabe, M., Mochizuki, T., Hayashi, T., Nakagawa, H., Hasegawa, T., & Iwata, H. (2016). Toward integration of genomic selection with crop modelling: The development of an integrated approach to predicting rice heading dates. Theoretical and Applied Genetics, 129(4), 805–817. 10.1007/S00122-016-2667-5
  194. Ordás, B., Malvar, R. A., Revilla, P., & Ordás, A. (2023). Effect of three cycles of recurrent selection for yield in four Spanish landraces of maize. Euphytica, 219(7), 1–11. 10.1007/S10681-023-03199-W
  195. Ornella, L., Pérez, P., Tapia, E., González‐Camacho, J., Burgueño, J., Zhang, X., Singh, S., Vicente, F., Bonnett, D., Dreisigacker, S., Singh, R., Long, N., & Crossa, J. (2014). Genomic‐enabled prediction with classification algorithms. Heredity, 112, 616–626. 10.1038/hdy.2013.144
  196. Ou, J. H., & Liao, C. T. (2019). Training set determination for genomic selection. Theoretical and Applied Genetics, 132(10), 2781–2792. 10.1007/S00122-019-03387-0
  197. Parmley, K., Nagasubramanian, K., Sarkar, S., Ganapathysubramanian, B., & Singh, A. K. (2019). Development of optimized phenomic predictors for efficient plant breeding decisions using phenomic‐assisted selection in soybean. Plant Phenomics, 2019, 1–15. 10.34133/2019/5809404
  198. Pérez‐Rodríguez, P., Crossa, J., Bondalapati, K., De Meyer, G., Pita, F., & de los Campos, G. (2015). A pedigree‐based reaction norm model for prediction of cotton yield in multienvironment trials. Crop Science, 55(3), 1143–1151. 10.2135/CROPSCI2014.08.0577
  199. Pérez‐Rodríguez, P., Gianola, D., González‐Camacho, J. M., Crossa, J., Manès, Y., & Dreisigacker, S. (2012). Comparison between linear and non‐parametric regression models for genome‐enabled prediction in wheat. G3: Genes, Genomes, Genetics, 2(12), 1595–1605. 10.1534/G3.112.003665
  200. Persa, R., Ribeiro, P. C. D. O., & Jarquin, D. (2021). The use of high‐throughput phenotyping in genomic selection context. Crop Breeding and Applied Biotechnology, 21(S), e385921S6. 10.1590/1984-70332021v21Sa19
  201. Piepho, H. P. (2009). Ridge regression and extensions for genomewide selection in maize. Crop Science, 49, 1165–1176. 10.2135/cropsci2008.10.0595
  202. Piepho, H. P., Möhring, J., Melchinger, A. E., & Büchse, A. (2008). BLUP for phenotypic selection in plant breeding and variety testing. Euphytica, 161(1–2), 209–228. 10.1007/s10681-007-9449-8
  203. Pollak, L. M. (2003). The history and success of the public–private project on germplasm enhancement of maize (GEM). Advances in Agronomy, 78, 45–87. 10.1016/s0065-2113(02)78002-4
  204. Pook, T., Freudenthal, J., Korte, A., & Simianer, H. (2020). Using local convolutional neural networks for genomic prediction. Frontiers in Genetics, 11, 561497. 10.3389/FGENE.2020.561497
  205. Priyadarshan, P. M. (2019). Genotype‐by‐environment interactions. Plant breeding: Classical to modern (pp. 457–472). Springer. 10.1007/978-981-13-7095-3_20
  206. Pszczola, M., & Calus, M. P. L. (2016). Updating the reference population to achieve constant genomic prediction reliability across generations. Animal, 10(6), 1018–1024. 10.1017/S1751731115002785
  207. R2D2 Consortium, Fugeray‐Scarbel, A., Bastien, C., Dupont‐Nivet, M., & Lemarié, S. (2021). Why and how to switch to genomic selection: Lessons from plant and animal breeding experience. Frontiers in Genetics, 12, 629737. 10.3389/fgene.2021.629737
  208. Razzaq, A., Wishart, D. S., Wani, S. H., Hameed, M. K., Mubin, M., & Saleem, F. (2022). Advances in metabolomics‐driven diagnostic breeding and crop improvement. Metabolites, 12(6), 511. 10.3390/metabo12060511
  209. Resende, R. T., Piepho, H. P., Rosa, G. J. M., Silva‐Junior, O. B., e Silva, F. F., de Resende, M. D. V., & Grattapaglia, D. (2021). Enviromics in breeding: Applications and perspectives on envirotypic‐assisted selection. Theoretical and Applied Genetics, 134(1), 95–112. 10.1007/s00122-020-03684-z
  210. Rezaei, E. E., Webber, H., Asseng, S., Boote, K., Durand, J. L., Ewert, F., Martre, P., & MacCarthy, D. S. (2023). Climate change impacts on crop yields. Nature Reviews Earth & Environment, 4(12), 831–846. 10.1038/s43017-023-00491-0
  211. Riedelsheimer, C., Czedik‐Eysenberg, A., Grieder, C., Lisec, J., Technow, F., Sulpice, R., Altmann, T., Stitt, M., Willmitzer, L., & Melchinger, A. E. (2012). Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nature Genetics, 44(2), 217–220. 10.1038/ng.1033
  212. Rincent, R., Charcosset, A., & Moreau, L. (2017). Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations. Theoretical and Applied Genetics, 130(11), 2231–2247. 10.1007/s00122-017-2956-7
  213. Rincent, R., Laloë, D., Nicolas, S., Altmann, T., Brunel, D., Revilla, P., Rodríguez, V. M., Moreno‐Gonzalez, J., Melchinger, A., Bauer, E., Schoen, C.‐C., Meyer, N., Giauffret, C., Bauland, C., Jamin, P., Laborde, J., Monod, H., Flament, P., Charcosset, A., & Moreau, L. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: Comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics, 192(2), 715–728. 10.1534/genetics.112.141473
  214. Rincent, R., Malosetti, M., Ababaei, B., Touzy, G., Mini, A., Bogard, M., Martre, P., Le Gouis, J., & Van Eeuwijk, F. (2019). Using crop growth model stress covariates and AMMI decomposition to better predict genotype‐by‐environment interactions. Theoretical and Applied Genetics, 132, 3399–3411. 10.1007/s00122-019-03432-y
  215. Rio, S. , Charcosset, A. , Mary‐Huard, T. , Moreau, L. , & Rincent, R. (2022). Building a calibration set for genomic prediction, characteristics to be considered, and optimization approaches. In Ahmadi N. & Bartholomé J. (Eds.), Genomic prediction of complex traits (pp. 35–50). Humana. 10.1007/978-1-0716-2205-6_3 [DOI] [PubMed] [Google Scholar]
  216. Robert, P. , Le Gouis, J. , & Rincent, R. (2020). Combining crop growth modeling with trait‐assisted prediction improved the prediction of genotype by environment interactions. Frontiers in Plant Science, 11, 827. 10.3389/fpls.2020.00827 [DOI] [PMC free article] [PubMed] [Google Scholar]
  217. Robertsen, C. D. , Hjortshøj, R. L. , & Janss, L. L. (2019). Genomic selection in cereal breeding. Agronomy, 9(2), 1–16. 10.3390/agronomy9020095 [DOI] [Google Scholar]
  218. Rogers, A. R., Bian, Y., Krakowsky, M., Peters, D., Turnbull, C., Nelson, P., & Holland, J. B. (2022). Genomic prediction for the Germplasm Enhancement of Maize project. The Plant Genome, 15(4), e20267. 10.1002/tpg2.20267
  219. Runcie, D. E., Qu, J., Cheng, H., & Crawford, L. (2021). MegaLMM: Mega‐scale linear mixed models for genomic predictions with thousands of traits. Genome Biology, 22, 213. 10.1186/s13059-021-02416-w
  220. Rutkoski, J., Poland, J., Mondal, S., Autrique, E., González Pérez, L., Crossa, J., Reynolds, M., & Singh, R. (2016). Canopy temperature and vegetation indices from high‐throughput phenotyping improve accuracy of pedigree and genomic selection for grain yield in wheat. G3: Genes, Genomes, Genetics, 6(9), 2799–2808. 10.1534/g3.116.032888
  221. Rutkoski, J., Singh, R. P., Huerta‐Espino, J., Bhavani, S., Poland, J., Jannink, J. L., & Sorrells, M. E. (2015a). Genetic gain from phenotypic and genomic selection for quantitative resistance to stem rust of wheat. The Plant Genome, 8(1), plantgenome2014.10.0074. 10.3835/plantgenome2014.10.0074
  222. Rutkoski, J., Singh, R. P., Huerta‐Espino, J., Bhavani, S., Poland, J., Jannink, J. L., & Sorrells, M. E. (2015b). Efficient use of historical data for genomic selection: A case study of stem rust resistance in wheat. The Plant Genome, 8(1), plantgenome2014.09.0046. 10.3835/plantgenome2014.09.0046
  223. Rutkoski, J. E. (2019). A practical guide to genetic gain. In Sparks D. L. (Ed.), Advances in agronomy (Vol. 157, pp. 217–249). Academic Press. 10.1016/bs.agron.2019.05.001
  224. Salhuana, W., Pollak, L. M., Ferrer, M., Paratori, O., & Vivo, G. (1998). Breeding potential of maize accessions from Argentina, Chile, USA, and Uruguay. Crop Science, 38(3), 866–872. 10.2135/CROPSCI1998.0011183x003800030040X
  225. Sallam, A. H., Endelman, J. B., Jannink, J. L., & Smith, K. P. (2015). Assessing genomic selection prediction accuracy in a dynamic barley breeding population. The Plant Genome, 8(1), plantgenome2014.05.0020. 10.3835/plantgenome2014.05.0020
  226. Sanchez, D., Allier, A., Ben Sadoun, S., Mary‐Huard, T., Bauland, C., Palaffre, C., Lagardère, B., Madur, D., Combes, V., Melkior, S., Bettinger, L., Murigneux, A., Moreau, L., & Charcosset, A. (2024). Assessing the potential of genetic resource introduction into elite germplasm: A collaborative multiparental population for flint maize. Theoretical and Applied Genetics, 137(1), 19. 10.1007/S00122-023-04509-5
  227. Sanchez, D., Sadoun, S. B., Mary‐Huard, T., Allier, A., Moreau, L., & Charcosset, A. (2023). Improving the use of plant genetic resources to sustain breeding programs’ efficiency. Proceedings of the National Academy of Sciences of the United States of America, 120(14), e2205780119. 10.1073/pnas.2205780119
  228. Sandhu, K., Patil, S. S., Pumphrey, M., & Carter, A. (2021). Multitrait machine‐ and deep‐learning models for genomic selection using spectral information in a wheat breeding program. The Plant Genome, 14(3), e20119. 10.1002/TPG2.20119
  229. Sarinelli, J. M., Murphy, J. P., Tyagi, P., Holland, J. B., Johnson, J. W., Mergoum, M., Mason, R. E., Babar, A., Harrison, S., Sutton, R., Griffey, C. A., & Brown‐Guedira, G. (2019). Training population selection and use of fixed effects to optimize genomic predictions in a historical USA winter wheat panel. Theoretical and Applied Genetics, 132, 1247–1261. 10.1007/s00122-019-03276-6
  230. Schmidt, M., Kollers, S., Maasberg‐Prelle, A., Großer, J., Schinkel, B., Tomerius, A., Graner, A., & Korzun, V. (2016). Prediction of malting quality traits in barley based on genome‐wide marker data to assess the potential of genomic selection. Theoretical and Applied Genetics, 129, 203–213. 10.1007/s00122-015-2639-1
  231. Schrag, T. A., Westhues, M., Schipprack, W., Seifert, F., Thiemann, A., Scholten, S., & Melchinger, A. E. (2018). Beyond genomic prediction: Combining different types of omics data can improve prediction of hybrid performance in maize. Genetics, 208(4), 1373–1385. 10.1534/genetics.117.300374
  232. Schulthess, A. W., Kale, S. M., Liu, F., Zhao, Y., Philipp, N., Rembe, M., Jiang, Y., Beukert, U., Serfling, A., Himmelbach, A., Fuchs, J., Oppermann, M., Weise, S., Boeven, P. H. G., Schacht, J., Longin, C. F. H., Kollers, S., Pfeiffer, N., Korzun, V., … Reif, J. C. (2022). Genomics‐informed prebreeding unlocks the diversity in genebanks for wheat improvement. Nature Genetics, 54(10), 1544–1552. 10.1038/s41588-022-01189-7
  233. Shahhosseini, M., Hu, G., Huber, I., & Archontoulis, S. V. (2021). Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt. Scientific Reports, 11(1), 1–15. 10.1038/s41598-020-80820-1
  234. Shahi, D., Todd, J., Gravois, K., Hale, A., Blanchard, B., Kimbeng, C., Pontif, M., & Baisakh, N. (2025). Exploiting historical agronomic data to develop genomic prediction strategies for early clonal selection in the Louisiana sugarcane variety development program. The Plant Genome, 18(1), e20545. 10.1002/tpg2.20545
  235. Singh, D., Wang, X., Kumar, U., Gao, L., Noor, M., Imtiaz, M., Singh, R. P., & Poland, J. (2019). High‐throughput phenotyping enabled genetic dissection of crop lodging in wheat. Frontiers in Plant Science, 10, 394. 10.3389/fpls.2019.00394
  236. Skøt, L., & Grinberg, N. F. (2016). Genomic selection in crop plants. Encyclopedia of applied plant sciences (Vol. 3, pp. 88–92). Elsevier Inc. 10.1016/B978-0-12-394807-6.00228-8
  237. Smith, A., Cullis, B., & Thompson, R. (2001). Analyzing variety by environment data using multiplicative mixed models and adjustments for spatial field trend. Biometrics, 57(4), 1138–1147. 10.1111/J.0006-341X.2001.01138.X
  238. Song, M., Greenbaum, J., Luttrell, J., Zhou, W., Wu, C., Shen, H., Gong, P., Zhang, C., & Deng, H. W. (2020). A review of integrative imputation for multi‐omics datasets. Frontiers in Genetics, 11, 570255. 10.3389/fgene.2020.570255
  239. Storlie, E., & Charmet, G. (2013). Genomic selection accuracy using historical data generated in a wheat breeding program. The Plant Genome, 6, plantgenome2013.01.0001. 10.3835/plantgenome2013.01.0001
  240. Stringer, J. K., Atkin, F. C., & Gezan, S. A. (2017). Statistical approaches in plant breeding: Maximising the use of the genetic information. Genetic improvement of tropical crops (pp. 3–17). Springer. 10.1007/978-3-319-59819-2_1
  241. Sukumaran, S., Rebetzke, G., Mackay, I., Bentley, A. R., & Reynolds, M. P. (2022). Pre‐breeding strategies. Wheat improvement: Food security in a changing climate (pp. 451–469). Springer. 10.1007/978-3-030-90673-3_25
  242. Sun, J., Rutkoski, J. E., Poland, J. A., Crossa, J., Jannink, J., & Sorrells, M. E. (2017). Multitrait, random regression, or simple repeatability model in high‐throughput phenotyping data improve genomic prediction for wheat grain yield. The Plant Genome, 10(2), plantgenome2016.11.0111. 10.3835/plantgenome2016.11.0111
  243. Taagen, E., Jordan, K., Akhunov, E., Sorrells, M. E., & Jannink, J.‐L. (2022). If it ain't broke, don't fix it: Evaluating the effect of increased recombination on response to selection for wheat breeding. G3: Genes, Genomes, Genetics, 12(12), jkac291. 10.1093/g3journal/jkac291
  244. Talwar, D., Mongia, A., Sengupta, D., & Majumdar, A. (2018). AutoImpute: Autoencoder based imputation of single‐cell RNA‐seq data. Scientific Reports, 8(1), 1–11. 10.1038/s41598-018-34688-x
  245. Tarter, J. A., Goodman, M. M., & Holland, J. B. (2003). Testcross performance of semiexotic inbred lines derived from Latin American maize accessions. Crop Science, 43(6), 2272–2278. 10.2135/CROPSCI2003.2272
  246. Technow, F., Messina, C. D., Totir, L. R., & Cooper, M. (2015). Integrating crop growth models with whole genome prediction through approximate Bayesian computation. PLoS ONE, 10(6), e0130855. 10.1371/JOURNAL.PONE.0130855
  247. Tibbs‐Cortes, L. E., Zhang, Z., & Yu, J. (2021). Status and prospects of genome‐wide association studies in plants. The Plant Genome, 14, e20077. 10.1002/tpg2.20077
  248. Tibbs‐Cortes, L. E., Guo, T., Andorf, C. M., Li, X., & Yu, J. (2024). Comprehensive identification of genomic and environmental determinants of phenotypic plasticity in maize. Genome Research, 34, 1253–1263. 10.1101/gr.279027.124
  249. Tibbs‐Cortes, L. E., Guo, T., Li, X., Tanaka, R., Vanous, A. E., Peters, D., Gardner, C., Magallanes‐Lundback, M., Deason, N. T., DellaPenna, D., Gore, M. A., & Yu, J. (2022). Genomic prediction of tocochromanols in exotic‐derived maize. The Plant Genome, 16, e20286. 10.1002/tpg2.20286
  250. Tolhurst, D. J., Chris Gaynor, R., Gardunia, B., Hickey, J. M., & Gorjanc, G. (2022). Genomic selection using random regressions on known and latent environmental covariates. Theoretical and Applied Genetics, 135, 3393–3415. 10.1007/s00122-022-04186-w
  251. Tollefson, J. (2020). How hot will Earth get by 2100? Nature, 580(7804), 443–445. 10.1038/D41586-020-01125-X
  252. Turner‐Hissong, S. D., Bird, K. A., Lipka, A. E., King, E. G., Beissinger, T. M., & Angelovici, R. (2020). Genomic prediction informed by biological processes expands our understanding of the genetic architecture underlying free amino acid traits in dry Arabidopsis seeds. G3: Genes, Genomes, Genetics, 10(11), 4227–4239. 10.1534/g3.120.401240
  253. Ubbens, J., Parkin, I., Eynck, C., Stavness, I., & Sharpe, A. G. (2021). Deep neural networks for genomic prediction do not estimate marker effects. The Plant Genome, 14(3), e20147. 10.1002/TPG2.20147
  254. van Eeuwijk, F., Kang, M., & Denis, J. (1996). Incorporating additional information on genotypes and environments in models for two‐way genotype by environment tables. Genotype‐by‐environment interaction (pp. 15–49). CRC Press. 10.1201/9781420049374.CH2
  255. VanRaden, P. M. (2008). Efficient methods to compute genomic predictions. Journal of Dairy Science, 91(11), 4414–4423. 10.3168/JDS.2007-0980
  256. VanRaden, P. M., Van Tassell, C. P., Wiggans, G. R., Sonstegard, T. S., Schnabel, R. D., Taylor, J. F., & Schenkel, F. S. (2009). Invited review: Reliability of genomic predictions for North American Holstein bulls. Journal of Dairy Science, 92(1), 16–24. 10.3168/JDS.2008-1514
  257. Vargas, M., Van Eeuwijk, F. A., Crossa, J., & Ribaut, J. M. (2006). Mapping QTLs and QTL x environment interaction for CIMMYT maize drought stress program using factorial regression and partial least squares methods. Theoretical and Applied Genetics, 112(6), 1009–1023. 10.1007/S00122-005-0204-Z
  258. Varona, L., Legarra, A., Toro, M. A., & Vitezica, Z. G. (2018). Non‐additive effects in genomic selection. Frontiers in Genetics, 9, 78. 10.3389/fgene.2018.00078
  259. Varshney, R. K., Bohra, A., Yu, J., Graner, A., Zhang, Q., & Sorrells, M. E. (2021). Designing future crops: Genomics‐assisted breeding comes of age. Trends in Plant Science, 26(6), 631–649. 10.1016/J.TPLANTS.2021.03.010
  260. Varshney, R. K., Singh, V. K., Hickey, J. M., Xun, X., Marshall, D. F., Wang, J., Edwards, D., & Ribaut, J. M. (2016). Analytical and decision support tools for genomics‐assisted breeding. Trends in Plant Science, 21(4), 354–363. 10.1016/J.TPLANTS.2015.10.018
  261. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 2017‐December, 5999–6009. https://arxiv.org/abs/1706.03762v5
  262. Velazco, J. G., Malosetti, M., Hunt, C. H., Mace, E. S., Jordan, D. R., & Van Eeuwijk, F. A. (2019). Combining pedigree and genomic information to improve prediction quality: An example in sorghum. Theoretical and Applied Genetics, 132, 2055–2067. 10.1007/s00122-019-03337-w
  263. Velu, G., Crossa, J., Singh, R. P., Hao, Y., Dreisigacker, S., Perez‐Rodriguez, P., Joshi, A. K., Chatrath, R., Gupta, V., Balasubramaniam, A., Tiwari, C., Mishra, V. K., Sohu, V. S., & Mavi, G. S. (2016). Genomic prediction for grain zinc and iron concentrations in spring wheat. Theoretical and Applied Genetics, 129(8), 1595–1605. 10.1007/s00122-016-2726-y
  264. Volpato, L., Alves, R. S., Teodoro, P. E., Vilela De Resende, M. D., Nascimento, M., Nascimento, A. C. C., Ludke, W. H., Lopes Da Silva, F., & Borém, A. (2019). Multi‐trait multi‐environment models in the genetic selection of segregating soybean progeny. PLoS ONE, 14(4), e0215315. 10.1371/journal.pone.0215315
  265. von Bloh, M., Lobell, D., & Asseng, S. (2024). Knowledge informed hybrid machine learning in agricultural yield prediction. Computers and Electronics in Agriculture, 227(Part 2), 109606. 10.1016/j.compag.2024.109606
  266. Voss‐Fels, K. P., Cooper, M., & Hayes, B. J. (2018). Accelerating crop genetic gains with genomic selection. Theoretical and Applied Genetics, 132(3), 669–686. 10.1007/S00122-018-3270-8
  267. Wang, H., Ye, M., Fu, Y., Dong, A., Zhang, M., Feng, L., Zhu, X., Bo, W., Jiang, L., Griffin, C. H., Liang, D., & Wu, R. (2021). Modeling genome‐wide by environment interactions through omnigenic interactome networks. Cell Reports, 35(6), 109114. 10.1016/j.celrep.2021.109114
  268. Wang, K., Abid, M. A., Rasheed, A., Crossa, J., Hearne, S., & Li, H. (2023). DNNGP, a deep neural network‐based method for genomic prediction using multi‐omics data in plants. Molecular Plant, 16(1), 279–293. 10.1016/j.molp.2022.11.004
  269. Wang, S., Wei, J., Li, R., Qu, H., Chater, J. M., Ma, R., Li, Y., Xie, W., & Jia, Z. (2019). Identification of optimal prediction models using multi‐omic data for selecting hybrid rice. Heredity, 123(3), 395–406. 10.1038/s41437-019-0210-6
  270. Wang, X., Singh, D., Marla, S., Morris, G., & Poland, J. (2018). Field‐based high‐throughput phenotyping of plant height in sorghum using different sensing technologies. Plant Methods, 14(1), 1–16. 10.1186/s13007-018-0324-5
  271. Wang, X., & Wen, Y. (2022). A penalized linear mixed model with generalized method of moments for prediction analysis on high‐dimensional multi‐omics data. Briefings in Bioinformatics, 23(4), bbac193. 10.1093/BIB/BBAC193
  272. Washburn, J. D., Burch, M. B., & Franco, J. A. V. (2020). Predictive breeding for maize: Making use of molecular phenotypes, machine learning, and physiological crop models. Crop Science, 60(2), 622–638. 10.1002/CSC2.20052
  273. Washburn, J. D., Cimen, E., Ramstein, G., Reeves, T., O'Briant, P., McLean, G., Cooper, M., Hammer, G., & Buckler, E. S. (2021). Predicting phenotypes from genetic, environment, management, and historical data using CNNs. Theoretical and Applied Genetics, 134(12), 3997–4011. 10.1007/S00122-021-03943-7
  274. Wei, J., Guo, T., Mu, Q., Alladassi, M., Mural, R., Boyles, R. E., Hoffman, L., Jr., Hayes, C. M., Sigmon, B., Thompson, A. M., Salas‐Fernandez, M. G., Rooney, W. L., Kresovich, S., Schnable, J. C., Li, X., & Yu, J. (2025). Genetic and environmental patterns underlying phenotypic plasticity in flowering time and plant height in sorghum. Plant, Cell & Environment, 48(4), 2727–2738. 10.1111/pce.15213
  275. Wen, C., Qian, J., Lin, J., Teng, J., Jayaraman, D., & Gao, Y. (2022). Fighting fire with fire: Avoiding DNN shortcuts through priming. Proceedings of Machine Learning Research, 162, 23723–23750. https://proceedings.mlr.press/v162/wen22d.html
  276. Westhues, M., Heuer, C., Thaller, G., Fernando, R., & Melchinger, A. E. (2019). Efficient genetic value prediction using incomplete omics data. Theoretical and Applied Genetics, 132, 1211–1222. 10.1007/s00122-018-03273-1
  277. White, J. W., & Hoogenboom, G. (2003). Gene‐based approaches to crop simulation. Agronomy Journal, 95(1), 52–64. 10.2134/AGRONJ2003.5200
  278. Wu, C., Zhang, Y., Ying, Z., Li, L., Wang, J., Yu, H., Zhang, M., Feng, X., Wei, X., & Xu, X. (2023). A transformer‐based genomic prediction method fused with knowledge‐guided module. Briefings in Bioinformatics, 25(1), 1–11. 10.1093/BIB/BBAD438
  279. Wu, P.‐Y., Ou, J.‐H., & Liao, C.‐T. (2023). Sample size determination for training set optimization in genomic prediction. Theoretical and Applied Genetics, 136, 57. 10.1007/s00122-023-04254-9
  280. Xavier, A., Muir, W. M., & Rainey, K. M. (2016). Assessing predictive properties of genome‐wide selection in soybeans. G3: Genes, Genomes, Genetics, 6(8), 2611–2616. 10.1534/g3.116.032268
  281. Xu, S., Xu, Y., Gong, L., & Zhang, Q. (2016). Metabolomic prediction of yield in hybrid rice. The Plant Journal, 88(2), 219–227. 10.1111/TPJ.13242
  282. Xu, Y., Liu, X., Fu, J., Wang, H., Wang, J., Huang, C., Prasanna, B. M., Olsen, M. S., Wang, G., & Zhang, A. (2020). Enhancing genetic gain through genomic selection: From livestock to plants. Plant Communications, 1(1), 100005. 10.1016/j.xplc.2019.100005
  283. Xu, Y., Xu, C., & Xu, S. (2017). Prediction and association mapping of agronomic traits in maize using multiple omic data. Heredity, 119(3), 174–184. 10.1038/hdy.2017.27
  284. Yang, C. J., Sharma, R., Gorjanc, G., Hearne, S., Powell, W., & Mackay, I. (2020). Origin specific genomic selection: A simple process to optimize the favorable contribution of parents to progeny. G3: Genes, Genomes, Genetics, 10(7), 2445–2455. 10.1534/G3.120.401132
  285. Yang, G., Li, Y., Yuan, S., Zhou, C., Xiang, H., Zhao, Z., Wei, Q., Chen, Q., Peng, S., & Xu, L. (2024). Enhancing direct‐seeded rice yield prediction using UAV‐derived features acquired during the reproductive phase. Precision Agriculture, 25(2), 1014–1037. 10.1007/S11119-023-10103-Y
  286. Yao, J., Zhao, D., Chen, X., Zhang, Y., & Wang, J. (2018). Use of genomic selection and breeding simulation in cross prediction for improvement of yield and quality in wheat (Triticum aestivum L.). Crop Journal, 6(4), 353–365. 10.1016/j.cj.2018.05.003
  287. Yates, F., & Cochran, W. G. (1938). The analysis of groups of experiments. The Journal of Agricultural Science, 28(4), 556–580. 10.1017/S0021859600050978
  288. Ye, S., Li, J., & Zhang, Z. (2020). Multi‐omics‐data‐assisted genomic feature markers preselection improves the accuracy of genomic prediction. Journal of Animal Science and Biotechnology, 11(1), 1–12. 10.1186/S40104-020-00515-5
  289. Yin, X., Stam, P., Kropff, M. J., & Schapendonk, A. H. C. M. (2003). Crop modeling, QTL mapping, and their complementary role in plant breeding. Agronomy Journal, 95(1), 90–98. 10.2134/AGRONJ2003.9000A
  290. Yu, X., Leiboff, S., Li, X., Guo, T., Ronning, N., Zhang, X., Muehlbauer, G. J., Timmermans, M. C. P., Schnable, P. S., Scanlon, M. J., & Yu, J. (2020). Genomic prediction of maize micro‐phenotypes provides insights for optimizing selection and mining diversity. Plant Biotechnology Journal, 18, 2456–2465. 10.1111/pbi.13420
  291. Yu, X., Li, X., Guo, T., Zhu, C., Wu, Y., Mitchell, S. E., Roozeboom, K. L., Wang, D., Wang, M. L., Pederson, G. A., Tesso, T. T., Schnable, P. S., Bernardo, R., & Yu, J. (2016). Genomic prediction contributing to a promising global strategy to turbocharge gene banks. Nature Plants, 2(10), 1–7. 10.1038/nplants.2016.150
  292. Zenke‐Philippi, C., Thiemann, A., Seifert, F., Schrag, T., Melchinger, A. E., Scholten, S., & Frisch, M. (2016). Prediction of hybrid performance in maize with a ridge regression model employed to DNA markers and mRNA transcription profiles. BMC Genomics, 17(1), 1–8. 10.1186/S12864-016-2580-Y
  293. Zhang, N., Zhou, X., Kang, M., Hu, B. G., Heuvelink, E., & Marcelis, L. F. M. (2023). Machine learning versus crop growth models: An ally, not a rival. AoB PLANTS, 15(2), 1–7. 10.1093/AOBPLA/PLAC061
  294. Zhang, Z., Ober, U., Erbe, M., Zhang, H., Gao, N., He, H., Li, J., & Simianer, H. (2014). Improving the accuracy of whole genome prediction for complex traits using the results of genome‐wide association studies. PLoS ONE, 9(10), e93017. 10.1371/journal.pone.0093017
  295. Zhou, X., Carbonetto, P., & Stephens, M. (2013). Polygenic modeling with Bayesian sparse linear mixed models. PLOS Genetics, 9(2), e1003264. 10.1371/JOURNAL.PGEN.1003264
  296. Zhou, Y., Zhang, Z., Bao, Z., Li, H., Lyu, Y., Zan, Y., Wu, Y., Cheng, L., Fang, Y., Wu, K., Zhang, J., Lyu, H., Lin, T., Gao, Q., Saha, S., Mueller, L., Fei, Z., Städler, T., Xu, S., … Huang, S. (2022). Graph pangenome captures missing heritability and empowers tomato breeding. Nature, 606, 527–534. 10.1038/s41586-022-04808-9
  297. Zingaretti, L. M., Gezan, S. A., Ferrão, L. F. V., Osorio, L. F., Monfort, A., Muñoz, P. R., Whitaker, V. M., & Pérez‐Enciso, M. (2020). Exploring deep learning for complex trait genomic prediction in polyploid outcrossing species. Frontiers in Plant Science, 11, 506702. 10.3389/fpls.2020.00025

Data Availability Statement

No data are associated with this study because it is a review.


Articles from The Plant Genome are provided here courtesy of Wiley