
Box 2.

Kernels

One of the most commonly used but potentially most misunderstood terms in GS is the kernel. To introduce this term in context, consider a GBLUP-style model that includes at least one random effect to account for the individuals. These random effects are assumed to follow a multivariate normal distribution with population mean vector 0 and population variance-covariance matrix Σ. One common practice is to set Σ to be proportional to an additive genetic relatedness matrix, for example, the matrix described in VanRaden (2008). Colloquially, the methodology used to calculate each element of this relatedness matrix is called the kernel.
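As a concrete illustration, the sketch below computes VanRaden's (2008) first additive relationship matrix G from a biallelic marker matrix coded 0/1/2; in a GBLUP-style model, the random effects u would then be assumed to follow u ~ N(0, σ²G). The toy data and function name are illustrative only, not a fixed convention.

```python
import numpy as np

def vanraden_additive(M):
    """Additive relatedness matrix (VanRaden 2008, method 1).

    M: (individuals x markers) genotype matrix coded 0/1/2,
       i.e., the count of one allele at each biallelic marker.
    """
    p = M.mean(axis=0) / 2.0              # allele frequency per marker
    W = M - 2.0 * p                       # center each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))   # scales G analogously to pedigree A
    return W @ W.T / denom

# Toy example: 6 individuals x 50 markers (illustrative data only)
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(6, 50))
G = vanraden_additive(M)
```

Here the "kernel" is the rule applied to each pair of individuals, i.e., the centered cross-product of their marker profiles divided by a common scaling constant.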

Different kernels, or methodologies used to estimate the elements of a relatedness matrix, are used in practice. For instance, suppose one were interested in accounting for additive, dominance, and additive × additive epistatic effects in a GS model (the so-called ADE model from Covarrubias-Pazaran (2016)). Then one could fit a GBLUP-style model with three random effects for the individuals. To account for additive and dominance genetic effects, relatedness matrices could be calculated using additive and dominance kernels, respectively (e.g., as described in Su et al. (2012)), and the resulting relatedness matrices could be used to model the variance-covariance of two of the random effects. To account for additive × additive epistasis, a similar relatedness matrix could be created where the kernel is the Hadamard product (i.e., element-wise multiplication) of two additive relatedness matrices. Other kernels, for example, reproducing kernel Hilbert space (RKHS) regression (described in Neves et al. (2012) and Pérez and de los Campos (2014)) or support vector machine regression (described in Howard et al. (2014)), are also commonly used to account for non-additive genetic effects in GS models.
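A minimal sketch of these non-additive kernels follows, assuming the 0/1/2-coded marker matrix M and additive matrix G from the previous block. The dominance coding is one common heterozygosity-based formulation in the spirit of Su et al. (2012) (exact codings vary by author), and the Gaussian kernel is one standard RKHS choice whose bandwidth h is a tuning parameter; the function names are illustrative.

```python
import numpy as np

def dominance_kernel(M):
    """Dominance relatedness from a heterozygosity coding
    (one common formulation in the spirit of Su et al. (2012))."""
    p = M.mean(axis=0) / 2.0                    # allele frequencies
    q = 1.0 - p
    H = (M == 1).astype(float) - 2.0 * p * q    # centered heterozygosity
    denom = np.sum(2.0 * p * q * (1.0 - 2.0 * p * q))
    return H @ H.T / denom

def epistasis_kernel(G):
    """Additive x additive epistasis: the Hadamard product of G with
    itself (in practice often rescaled, e.g., by its mean diagonal)."""
    return G * G

def gaussian_kernel(M, h=1.0):
    """A Gaussian (RKHS-style) kernel; the bandwidth h must be tuned."""
    D2 = ((M[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-h * D2 / M.shape[1])
```

Each resulting matrix would then serve as the variance-covariance structure of one of the three random effects in the ADE model described above.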

Approaches for calculating kernels are critical for future GS research because they can be used to account for additional sources of trait variability in a model. For example, suppose one wanted to account for small RNA (sRNA) data in a GS model (as done in Schrag et al. (2018)). In this instance, one could include an additional random effect for the individuals and incorporate the sRNA information into the kernel used to construct the corresponding relatedness matrix. In summary, the increasing amount of available data has the potential to increase the effectiveness of genomic prediction, and these data can be readily incorporated into GS models through kernels.
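As a rough sketch of this idea (the feature matrix and scaling below are hypothetical placeholders, not the exact procedure of Schrag et al. (2018)), a simple linear kernel can turn any standardized omics matrix, such as sRNA abundances, into a relatedness matrix for an additional random effect:

```python
import numpy as np

def linear_kernel(X):
    """Linear relatedness from an (individuals x features) matrix,
    e.g., sRNA abundances; constant features are dropped and the rest
    standardized so the kernel is on a scale comparable to G."""
    sd = X.std(axis=0, ddof=1)
    Z = (X[:, sd > 0] - X[:, sd > 0].mean(axis=0)) / sd[sd > 0]
    return Z @ Z.T / Z.shape[1]

# Hypothetical sRNA abundance matrix: 6 individuals x 200 sRNAs
rng = np.random.default_rng(1)
S = rng.poisson(5.0, size=(6, 200)).astype(float)
K_srna = linear_kernel(S)   # variance-covariance for the extra random effect
```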