Evolution. 2022 May 18;76(7):1378–1390. doi: 10.1111/evo.14486

Analytical results for directional and quadratic selection gradients for log‐linear models of fitness functions

Michael B. Morrissey 1, I. B. J. Goudie 2
PMCID: PMC9546161  PMID: 35340021

Abstract

Log‐linear models are widely used for assessing determinants of fitness in empirical studies, for example, in determining how reproductive output depends on trait values or environmental conditions. Similarly, theoretical models of fitness and natural selection often employ log‐linear forms, frequently with a negative quadratic term, generating Gaussian fitness functions. However, in the specific application of regression‐based analysis of natural selection, such models are rarely employed. Rather, ordinary least squares (OLS) regression is the predominant means of assessing the form of natural selection. OLS regressions allow specific evolutionary quantitative parameters, selection gradients, to be estimated, and benefit from the fact that the associated statistical models are easily applied. We examine whether selection gradients can be directly expressed in terms of the coefficients of models using exponential fitness functions with linear or quadratic arguments. Such models can be easily fitted with generalized linear models (GLMs). The expressions we obtain coincide with those for Gaussian functions, but relax the major constraint that the (log) fitness function is concave (downwardly curved). Additionally, these results lead to univariate and multivariate analyses of both linear and quadratic selection that potentially incorporate pragmatic and interpretable models of fitness functions, where the parameters can be related analytically to selection gradients, and that can be operationalized using widely available statistical tools.

Keywords: Capture‐mark‐recapture, fitness, generalised linear model, natural selection, selection gradients, survival analysis


The characterization of natural selection, especially in the wild, has long been a major research theme in evolutionary ecology and evolutionary quantitative genetics (Endler 1986, Kingsolver et al. 2001, Lande & Arnold 1983, Manly 1985, Weldon 1901). In recent decades, regression‐based approaches have been used to obtain direct selection gradients (especially following Lande & Arnold 1983), which represent the direct effects of traits on fitness. These, and related, measures of selection have an explicit justification in quantitative genetic theory (Lande 1979, Lande & Arnold 1983), which provides the basis for comparison among traits, taxa, and so on, and ultimately allows meta‐analysis (e.g., Kingsolver et al. 2001). Selection gradients can characterize both directional selection and aspects of nonlinear selection, and so are a very powerful concept in evolutionary quantitative genetics.

The selection gradient may be defined as the vector of partial derivatives of relative fitness with respect to phenotype, averaged over the distribution of phenotype observed in a population. This definition is equivalent to other existing definitions when the phenotype follows a Gaussian distribution (Walsh & Lynch 2018), an assumption we rely on for most of our results. Following Lande & Arnold (1983), given an arbitrary function W(z) for expected fitness of a (multivariate) phenotype z, a general expression for the directional selection gradient vector β is

\beta = \bar{W}^{-1} \int \frac{\partial W(z)}{\partial z}\, p(z)\, dz, \qquad (1)

where p(z) is the probability density function of phenotype, with z being a column vector, and \bar{W} is mean fitness. Mean fitness can itself be obtained as \bar{W} = \int W(z)\, p(z)\, dz. A quadratic selection gradient can also be defined as the average curvature (similarly standardised), rather than the average slope, of the relative fitness function,

\gamma = \bar{W}^{-1} \int \frac{\partial^2 W(z)}{\partial z\, \partial z'}\, p(z)\, dz. \qquad (2)

The directional selection gradient has a direct relationship to evolutionary change, assuming that breeding values (the additive genetic component of individual phenotype, Falconer 1960) are multivariate normally distributed, following the Lande (1979) equation

\Delta\bar{z} = G\beta, \qquad (3)

where Δz¯ is per‐generation evolutionary change, and G is the additive genetic covariance matrix, i.e., the (co)variances among individuals of breeding values. The quadratic selection gradient matrix has direct relationships to the change in the distribution of breeding values due to selection, but not with such simple relationships between generations as for the directional selection gradient and the change in the mean (Lande & Arnold 1983). Walsh & Lynch (2018) provide an extended treatment of the various relationships between the summaries of the local fitness function provided by β and γ to changes in the distribution of phenotype and breeding values.

Some progress has been made at developing generalised regression model methods for inference of selection gradients. Janzen & Stern (1998) proposed a method for binomial fitness components (e.g., per‐interval survival, mated vs. not mated). The Janzen & Stern (1998) method provides estimates of β, and requires fitting a logistic model with linear terms only, calculating the average derivatives at each phenotypic value observed in a sample, and then standardizing to the relative fitness scale. Morrissey & Sakrejda (2013) expanded Janzen & Stern's (1998) basic approach to arbitrary fitness functions (i.e., not necessarily linear) and arbitrary response variable distributions, retaining the basic idea of numerically averaging the slope (and curvature) of the fitness function over the distribution of observed phenotype. Shaw & Geyer (2010) developed a framework for characterizing the distributions of fitness (and fitness residuals) that arise in complex life cycles, and also showed how the method could be applied to estimate selection gradients by averaging the slope or curvature of the fitness function over the observed values of phenotype in a sample.

Perhaps the simplest fitness function, W(z), that arises in evolutionary theory is a log‐linear model, such that

W(z) = e^{\alpha + b'z}, \qquad (4)

where α is an intercept, and b is a vector of linear regression coefficients on the log scale. The derivative of fitness with respect to phenotype is \partial W/\partial z = b\, W(z), and so from equation 1 the selection gradient is

\beta = \frac{b \int W(z)\, p(z)\, dz}{\int W(z)\, p(z)\, dz} = b. \qquad (5)

Thus, under such a model, the directional selection gradient vector β is equal to b (see also Lande 1983, Chevin & Hospital 2008, Chevin et al. 2015). The fitness function given by equation 4 is convex (upwardly curved) on the scale of fitness (as opposed to log fitness), and this may be regarded both as a biological feature of such a model, or as a statistical artifact of fitting a model that is linear on the log scale (Schluter 1988, Chevin & Hospital 2008).
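The identity in equation 5 is easy to verify numerically. The following sketch (in Python, with arbitrary illustrative values of α and b) averages a central-difference slope of W over a phenotype sample and standardizes by mean fitness, as in equation 1; the ratio recovers b regardless of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters of a log-linear fitness function W(z) = exp(alpha + b*z).
alpha, b = 0.1, 0.3
z = rng.normal(0.0, 1.0, size=10_000)  # a phenotype sample
W = np.exp(alpha + b * z)

# Equation 1: average the slope of W over phenotype, standardize by mean fitness.
# The slope is taken by central differences rather than analytically.
h = 1e-5
dW = (np.exp(alpha + b * (z + h)) - np.exp(alpha + b * (z - h))) / (2 * h)
beta = dW.mean() / W.mean()

print(round(beta, 6))  # -> 0.3, i.e., beta = b (equation 5)
```

Because the finite-difference slope is proportional to W(z) itself, the average slope standardized by mean fitness equals b up to the discretization error of the difference quotient.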

More commonly in evolutionary theory (e.g., Lande 1976, Chevin & Haller 2014), Gaussian functions are used, such that

W(z) \propto e^{-\frac{1}{2}(z-\theta)'\omega^{-1}(z-\theta)}, \qquad (6)

where θ is the vector of optimal phenotypes and ω describes the curvature; ω must be positive‐definite. Any time that mean trait values are not equal to the optimum (θ), selection in a Gaussian fitness model has a directional component. Specifically,

\beta = -S(\mu - \theta), \qquad (7)

where μ is mean phenotype and S = (\omega + \Sigma)^{-1}, where Σ is the phenotypic variance‐covariance matrix before selection; this relation is used extensively in evolutionary theory (e.g., Lande 1976, 1979, Gomulkiewicz & Houle 2009, Chevin & Haller 2014). Quadratic selection gradients are more rarely considered in theory that uses Gaussian fitness functions, perhaps because those functions are seen as general models of stabilizing selection, and therefore represent quite constrained models of nonlinear selection. The quadratic selection gradient matrix is given by

\gamma = \beta\beta' - S \qquad (8)

(Chevin et al. 2015; note that this reference gives a mis‐printed version of this relation). The expressions given here for selection gradients in Gaussian fitness models deviate somewhat from typical presentations in so far as we give multivariate expressions for all quantities.
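In the univariate case, the Gaussian-fitness results in equations 7 and 8 can be checked against the defining integrals (equations 1 and 2) by direct quadrature; the values of θ, ω, μ, and σ² below are illustrative only.

```python
import numpy as np

# Illustrative univariate values: optimum theta, width omega, phenotype ~ N(mu, sigma2).
theta, omega = 0.5, 2.0
mu, sigma2 = 0.0, 1.0

z = np.linspace(-12.0, 12.0, 200_001)
p = np.exp(-0.5 * (z - mu) ** 2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
W = np.exp(-0.5 * (z - theta) ** 2 / omega)

# Analytic first and second derivatives of the Gaussian fitness function.
dW = -(z - theta) / omega * W
d2W = (((z - theta) / omega) ** 2 - 1.0 / omega) * W

Wbar = np.sum(W * p)                # grid spacing cancels in the ratios below
beta_num = np.sum(dW * p) / Wbar    # equation 1
gamma_num = np.sum(d2W * p) / Wbar  # equation 2

S = 1.0 / (omega + sigma2)
assert np.isclose(beta_num, -S * (mu - theta))  # equation 7
assert np.isclose(gamma_num, beta_num**2 - S)   # equation 8
```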

It is possible to fit constrained functions (e.g., Gaussian) to empirical data in studies on natural selection, although this is rarely done (see Chevin et al. 2015 for an excellent example, linking theoretical quantities to selection gradients using a Gaussian model). More general exponential functions may be very useful for empirical studies of natural selection, particularly if they make less restrictive assumptions about the overall form of selection. If the coefficients of more general functions can be related analytically to selection gradients, they may help to better link theoretical and empirical evolutionary quantitative genetic studies. We sought to determine whether analytical relationships could be found between selection gradients and the parameters of exponential functions with quadratic exponents, i.e.,

W(z) = e^{a + \sum_i b_i z_i + \sum_i \frac{1}{2} g_{ii} z_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} g_{ij} z_i z_j}. \qquad (9)

Such functions are easily estimated via generalised linear models, using log link functions, and inherently benefit from the property of exponential functions to treat the response variable (expected fitness in this case) as a strictly positive quantity. The class of function we consider is also a generalization of the Gaussian fitness function (as in eq. 6), and is therefore linked to evolutionary quantitative genetic theory, but provides a more flexible model of nonlinear selection. Specifically, it need not be assumed that the function contains a maximum; when it does, relationships to theory that uses Gaussian functions may be invoked. As such, this more general approach, and its immediate link to statistical analysis with generalized linear models, may be very attractive to empiricists. We obtain analytical links between the regression parameters in equation 9 and β and γ. Our expressions coincide with known relations for Gaussian fitness functions (i.e., equations 7 and 8). The results are thus a particularly satisfying link between procedures that are likely to be adopted by empiricists, and the kinds of function that are used by theoreticians in evolutionary quantitative genetics. We also provide expressions for biological and statistical variance in selection gradients, given variance in the parameters of the regression in equation 9, and some links between exponential fitness functions and some other analyses about fitness components in use in evolutionary studies.

Selection Gradients and Exponential Fitness Functions with Quadratic Arguments

Before detailing our results, a brief description of the factor of 1/2 associated with the quadratic terms in equation 9, analogous to the discussion surrounding a similar factor in Lande & Arnold's (1983) paper (see Stinchcombe et al. 2008), may prevent confusion. In order to obtain the correct values of the g_{ii} coefficients, the covariate values for quadratic terms should be squared and then halved. An alternative analysis is possible, in which the squared covariate values are not halved, but the estimated coefficients are doubled (analogous to procedures discussed by Stinchcombe et al. 2008). However, this alternative leads to an additional, and potentially confusing, step in the calculation of standard errors and variance‐covariance matrices of selection gradients in replicated studies (detailed in the appendix).
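The consequence of the factor of 1/2 can be illustrated with a toy least-squares fit to exact log-fitness values (all coefficient values hypothetical): regressing on the halved squared covariate returns g directly, while regressing on the raw squared covariate returns a coefficient that must be doubled.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical log-scale coefficients.
a, b, g = 0.0, 0.4, -0.6
z = rng.normal(size=100)
f = a + b * z + 0.5 * g * z**2  # exact log fitness, no noise

# Squared-and-halved covariate: the fitted coefficient is g itself.
X_half = np.column_stack([np.ones_like(z), z, 0.5 * z**2])
g_half = np.linalg.lstsq(X_half, f, rcond=None)[0][2]

# Merely squared covariate: the fitted coefficient must be doubled to give g.
X_raw = np.column_stack([np.ones_like(z), z, z**2])
g_raw = np.linalg.lstsq(X_raw, f, rcond=None)[0][2]

print(round(g_half, 10), round(2 * g_raw, 10))  # -> -0.6 -0.6
```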

Define a vector b = (b_1, \ldots, b_k)' containing the coefficients of the linear terms in the exponent of the model in equation 9, and a matrix g = (g_{ij}) containing the coefficients of the corresponding quadratic form. We can then write the fitness function more conveniently in matrix form

W(z) = e^{f(z)}, \qquad (10a)
f(z) = a + b'z + \tfrac{1}{2} z' g z. \qquad (10b)

Let d be a vector of the expectations of the first‐order partial derivatives of W(z) and let H be the matrix of expectations of the second‐order partial derivatives of W(z). Thus the elements of d are d_i = E\left[\frac{\partial W(z)}{\partial z_i}\right] and the elements of H are H_{ij} = E\left[\frac{\partial^2 W(z)}{\partial z_i \partial z_j}\right]. We can now rewrite the expressions for directional and quadratic selection gradients as

\beta = \frac{d}{E[W(z)]} \qquad (11)

and

\gamma = \frac{H}{E[W(z)]}. \qquad (12)

Differentiating equation 10 gives

\frac{\partial W(z)}{\partial z} = (b + gz)\, e^{f(z)}, \qquad (13)

and

\frac{\partial^2 W(z)}{\partial z\, \partial z'} = \left[ g + (b + gz)(b + gz)' \right] e^{f(z)}. \qquad (14)

Assume that the phenotype z is multivariate normal, with mean μ and covariance matrix Σ, and denote its probability density by p_{\mu,\Sigma}(z). Provided e^{f(z)} has a finite expectation, the function

K(z) = \frac{e^{f(z)}\, p_{\mu,\Sigma}(z)}{E\left[e^{f(z)}\right]} \qquad (15)

is a probability density function giving the distribution of phenotypes after selection. Define the matrix \Omega^{-1} = \Sigma^{-1} - g and the vector \nu = \mu + \Omega(b + g\mu). We show in the Appendix that Ω is symmetric. Provided it is also positive definite, it is a valid covariance matrix, and, by equation A8, K(z) \propto p_{\nu,\Omega}(z). As K is a probability density function, this implies

K(z) = p_{\nu,\Omega}(z). \qquad (16)

Define Q^{-1} = \Omega^{-1}\Sigma = I_k - g\Sigma. Combining equations (11), (13), and (16) yields \beta = E[b + gz], where the expectation is taken with respect to K. This is an expectation of a linear function of z, and so

\beta = b + g\nu = (b + g\mu) + g\Omega(b + g\mu) = (I_k + g\Omega)(b + g\mu) = Q(b + g\mu), \qquad (17)

by use of equation A5.

Combining equations (12), (14), and (16) yields \gamma = E[g + (b + gz)(b + gz)'], where the expectation is taken with respect to K. Hence

\gamma = g + \mathrm{VAR}(b + gz) + [E(b + gz)][E(b + gz)]' = g + g\Omega g + \beta\beta' = \beta\beta' + (I_k + g\Omega)g = \beta\beta' + Qg, \qquad (18)

where we have noted that g is symmetric and used equation A5.

In univariate analyses, the matrix machinery necessary for implementing the general formulae in equations 17 and 18 can be avoided. If the fitness function is W(z) = e^{a + bz + \frac{1}{2} g z^2}, and z has a mean of μ and a variance of σ², then

\beta = \frac{b + g\mu}{1 - g\sigma^2} \quad \text{and} \quad \gamma = \frac{(b + g\mu)^2 + g(1 - g\sigma^2)}{(1 - g\sigma^2)^2}.

These expressions hold for any univariate analysis, and can be applied to obtain mean‐standardized, variance‐standardized, or unstandardized selection gradients, when appropriate values of μ and σ² are used and the log‐quadratic model of W(z) is fitted to correspondingly standardized phenotypic records. For the common case where the trait is mean‐centred and variance‐standardized (μ = 0, σ² = 1), the expressions simplify further to

\beta = \frac{b}{1 - g} \quad \text{and} \quad \gamma = \frac{b^2 + g(1 - g)}{(1 - g)^2}.
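A quick numerical check of these univariate expressions, under illustrative parameter values satisfying the positive-definiteness condition discussed below, compares the closed forms against quadrature of the defining integrals (equations 1 and 2):

```python
import numpy as np

# Illustrative values; g < 1/sigma2 is required for the integrals to exist.
a, b, g = 0.0, 0.3, 0.4
mu, sigma2 = 0.2, 1.0

# Closed-form univariate selection gradients.
Q = 1.0 / (1.0 - g * sigma2)
beta = Q * (b + g * mu)
gamma = beta**2 + Q * g

# Direct quadrature of the defining integrals (equations 1 and 2).
z = np.linspace(-14.0, 14.0, 400_001)
p = np.exp(-0.5 * (z - mu) ** 2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
W = np.exp(a + b * z + 0.5 * g * z**2)
dW = (b + g * z) * W              # univariate form of equation 13
d2W = (g + (b + g * z) ** 2) * W  # univariate form of equation 14
Wbar = np.sum(W * p)

assert np.isclose(np.sum(dW * p) / Wbar, beta)
assert np.isclose(np.sum(d2W * p) / Wbar, gamma)
```

Note that this example uses a positive g (an upwardly curved log fitness function), a case the Gaussian model cannot represent.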

The notation we have used is designed to relate the parameters one may estimate directly in a regression analysis to selection gradients. This should facilitate the application of log‐linear models of fitness functions in empirical studies of selection. Interestingly, the expressions we obtain, while more general than Gaussian functions (they do not constrain selection to have a stabilising form), coincide with those for Gaussian selection. This equivalence is demonstrated in the appendix. These expressions are thus more general, and more broadly applicable in empirical work, than has previously been appreciated. The requirements for obtaining selection gradients from Gaussian functions and from the more general exponential functions with quadratic arguments are slightly different. For Gaussian functions, ω (as in equation 6) must be positive definite. Equations 17 and 18 require that Ω, which depends on both g and Σ, is positive definite. In univariate analyses, this condition reduces to g < 1/\sigma^2, implying that the fitness function should not curve upward too sharply within the range of observed phenotype.

The main application of the expressions given here is to obtain selection gradient estimates from generalized linear model (GLM) regression analyses with log link functions and log‐linear or log‐quadratic arguments. With appropriate specification of linear, quadratic, and correlational terms, GLMs with any response variable distribution should yield selection gradient estimates, provided that they specify a log link function. This includes, for example, binomial and negative binomial GLMs, with arguments specified as in equation 4 or 9. The equations also apply directly to log‐link models with additive overdispersion (appendix section). Additionally, under certain conditions, several other analyses commonly used to assess the dependence of fitness, or fitness components, on quantities such as phenotypic traits can yield log‐linear or log‐quadratic models of trait‐fitness relationships, and can thus be used with the expressions given in this section for β and γ. These include special cases of parentage analysis, capture‐mark‐recapture analysis, and survival analysis. The conditions under which these methods can yield selection gradient estimates are elaborated in appendix sections.
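As a sketch of this workflow (not the authors' own code, and with all parameter values hypothetical), the following simulates Poisson-distributed fitness under a univariate version of equation 9, fits a log-link Poisson GLM by Newton/IRLS (the algorithm underlying R's glm()), and converts the estimated coefficients to selection gradients with the univariate formulae.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate fitness under a hypothetical log-quadratic fitness function (equation 9).
n = 50_000
a_true, b_true, g_true = 0.0, 0.3, -0.4
z = rng.normal(size=n)
W = rng.poisson(np.exp(a_true + b_true * z + 0.5 * g_true * z**2))

# Design matrix with the squared covariate halved, so the third coefficient is g.
X = np.column_stack([np.ones(n), z, 0.5 * z**2])

# Newton/IRLS for a Poisson GLM with log link (what R's glm() does internally).
coef = np.zeros(3)
for _ in range(50):
    mu_hat = np.exp(X @ coef)
    step = np.linalg.solve(X.T @ (mu_hat[:, None] * X), X.T @ (W - mu_hat))
    coef += step
    if np.max(np.abs(step)) < 1e-10:
        break

a_hat, b_hat, g_hat = coef

# Convert to selection gradients with the univariate formulae (mu = 0, sigma2 = 1).
beta_hat = b_hat / (1.0 - g_hat)
gamma_hat = beta_hat**2 + g_hat / (1.0 - g_hat)
```

In practice one would use glm() in R (or an equivalent library routine) rather than hand-coded IRLS; the point is only that the fitted log-scale coefficients map directly onto β and γ.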

Biological Variation and Statistical Uncertainty

The expressions for selection gradients, given the parameters of a log‐quadratic fitness function (equations 17 and 18), give the selection gradients conditional on the estimated values of b and g. However, we often make inference about selection not merely to quantify selection at a given place and time, but rather to ask larger questions, for example, about how selection varies. For exponential models with a linear argument (equation 4), the variance in b among, for example, cohorts or populations, may be estimated by random‐regression analysis, and would represent the variance in directional selection gradients, since β = b. However, since the b and g coefficients in an exponential model of fitness with a quadratic argument (equation 9) are not themselves selection gradients, their variances and covariances are not the variances and covariances of β and γ. Moreover, in empirical studies of natural selection, b and g will not typically be known quantities, but rather will be estimates with error.

Whether variances and covariances of the elements of b and g are of biological interest (e.g., variation among temporal replicates of a selection analysis), or are statistical (e.g., sampling variance), the corresponding biological or sampling variances of β and γ can potentially be obtained by analytical approximation, bootstrapping, and/or Monte Carlo methods. In particular, approximation of variances in β and γ by a first‐order Taylor approximation (the “delta method”; Lynch & Walsh 1998) may generally be pragmatic. Formulae for approximate biological or statistical variances of β and γ, given covariances of b and g, are derived by this method in the appendix. For univariate analysis, with phenotype standardised to μ = 0 and σ² = 1, the approximate variances of β and γ are given by

\mathrm{Var}[\beta] \approx \frac{\mathrm{Var}[b]}{(1-g)^2} + \frac{b^2\, \mathrm{Var}[g]}{(1-g)^4} + \frac{2b\, \mathrm{Cov}[b,g]}{(1-g)^3}, \qquad (19)

and

\mathrm{Var}[\gamma] \approx \frac{4b^2\, \mathrm{Var}[b]}{(1-g)^4} + \frac{(1 + 2b^2 - g)^2\, \mathrm{Var}[g]}{(1-g)^6} + \frac{4b(1 + 2b^2 - g)\, \mathrm{Cov}[b,g]}{(1-g)^5}, \qquad (20)

where Var[b] and Var[g] represent the variances of the b and g terms and Cov[b,g] is the covariance of the b and g terms. These variances and covariances may represent real variation in trait‐fitness relationships (e.g., variation in time or space), or they may represent statistical uncertainty in parameter values. If equations 19 and 20 are used to represent statistical uncertainty in estimates \hat{\beta} and \hat{\gamma}, Var[b] would be replaced by Var[\hat{b}] (the sampling variance, or squared standard error of the estimate of b), etc. The sampling covariance of b and g (i.e., Cov[\hat{b},\hat{g}]) is not routinely reported by most statistical software packages, but can generally be obtained (the use of the sampling covariance of \hat{b} and \hat{g} terms in a GLM model in R is demonstrated in the supplemental materials).
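The delta-method formulae in equations 19 and 20 can be checked by Monte Carlo propagation of (hypothetical, modest) variation in b and g through the exact univariate expressions for β and γ:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical means and (co)variances of b and g across replicates.
b0, g0 = 0.3, -0.2
var_b, var_g = 0.02**2, 0.02**2
cov_bg = 0.5 * np.sqrt(var_b * var_g)  # correlation of 0.5, for illustration

# Delta-method approximations (equations 19 and 20), for mu = 0, sigma2 = 1.
d = 1.0 - g0
var_beta = var_b / d**2 + b0**2 * var_g / d**4 + 2 * b0 * cov_bg / d**3
var_gamma = (4 * b0**2 * var_b / d**4
             + (1 + 2 * b0**2 - g0)**2 * var_g / d**6
             + 4 * b0 * (1 + 2 * b0**2 - g0) * cov_bg / d**5)

# Monte Carlo propagation through the exact univariate expressions.
cov = np.array([[var_b, cov_bg], [cov_bg, var_g]])
bs, gs = rng.multivariate_normal([b0, g0], cov, size=1_000_000).T
betas = bs / (1.0 - gs)
gammas = betas**2 + gs / (1.0 - gs)

assert np.isclose(var_beta, betas.var(), rtol=0.05)
assert np.isclose(var_gamma, gammas.var(), rtol=0.05)
```

As with any first-order approximation, agreement degrades as the variation in g grows relative to 1 − g.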

STATISTICAL UNCERTAINTY: SIMULATION

We performed a small simulation study to assess the extent of any bias in the estimators of β and γ and the adequacy of the first‐order approximation of their standard errors. We simulated univariate selection, with values of b between −0.5 and 0.5, and with g = −1, 0, and 0.4. Because β and γ are nonlinear functions of g, it is not possible to simultaneously investigate ranges of parameter values with regular intervals of values of both g and the selection gradients. These values of g represent a compromise between investigating a regular range of g and of γ. We used a (log) intercept of the fitness function of a = 0. We simulated a sample size of 200 individuals. This sample size reflects a very modest‐sized study with respect to precision in inference of nonlinear selection, and is therefore a useful scenario in which to judge the performance of different methods for calculating standard errors. Fitness was simulated as a Poisson variable with expectations defined by the ranges of values of b and g, and with phenotypes sampled from a standard normal distribution.

First, we analyzed each simulated dataset using the OLS regression described by Lande & Arnold (1983), i.e., w_i = \mu + \beta z_i + \frac{1}{2}\gamma z_i^2 + e_i, using the R function lm(). For the OLS regressions, we calculated standard errors assuming normality using the standard method implemented in the R function summary.lm(), and by case‐bootstrapping: generating 1000 bootstrapped datasets by sampling with replacement, running the OLS regression analysis, and calculating the standard deviation of the bootstrapped selection gradient estimates. Second, we fitted a Poisson GLM with linear and quadratic terms, using the R function glm(). We then calculated conditional selection gradient estimates using equations 17 and 18. We obtained standard errors by using a first‐order Taylor series approximation (the “delta method”; Lynch & Walsh 1998, appendix A1). For each method of obtaining estimates and standard errors, we calculated the standard deviation of replicate simulated estimates. We could thus evaluate the performance of different methods of obtaining standard errors by their ability to reflect this sampling standard deviation. We also calculated mean squared errors for both estimators of β and γ for all scenarios. Every simulation scenario and associated analysis of selection gradients was repeated 1000 times.
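A minimal sketch of one such scenario (log-linear selection, g = 0; the sample size and replicate number match the description above, but this is an illustration in Python rather than the authors' R code) shows the Lande-Arnold OLS estimator recovering the true gradients on average:

```python
import numpy as np

rng = np.random.default_rng(5)

# One scenario: log-linear selection (g = 0), Poisson fitness, n = 200, 1000 replicates.
n, a, b, g = 200, 0.0, 0.3, 0.0
n_rep = 1000
est = np.empty((n_rep, 2))
for r in range(n_rep):
    z = rng.normal(size=n)
    W = rng.poisson(np.exp(a + b * z + 0.5 * g * z**2))
    w = W / W.mean()                              # relative fitness
    X = np.column_stack([np.ones(n), z, 0.5 * z**2])
    coefs = np.linalg.lstsq(X, w, rcond=None)[0]  # Lande-Arnold OLS
    est[r] = coefs[1:]                            # (beta_hat, gamma_hat)

# With g = 0 and standardized z, the true gradients are beta = b and gamma = b**2.
mean_beta, mean_gamma = est.mean(axis=0)
```

Averaged over replicates, the OLS estimates center on the true β = 0.3 and γ = 0.09, consistent with the near-unbiasedness reported below.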

Selection gradient estimates obtained by all three methods were essentially unbiased (Figure 1a,d,g,j,m,p), except for small biases that occurred when the fitness function was very curved. Thus, GLM‐derived values of selection gradients, conditional on estimated values of b and g, performed very well as estimators of β and γ in our simulations. Similarly, first‐order approximations of standard errors of the GLM‐derived estimates of β and γ closely reflected the simulated standard deviations of the estimators (Figure 1b,e,h,k,n,q). All methods for obtaining standard errors performed well for estimates of β in the pure log‐linear selection simulations (Figure 1h,k). OLS standard errors performed reasonably well under most simulation scenarios, except when g was positive (Figure 1n,q); across all scenarios, bootstrap standard errors of the OLS estimators outperformed OLS standard errors produced using the standard formula. Mean squared error of the GLM estimators was always smaller than that of the OLS estimators of β and γ. This is unsurprising, as the simulation scheme corresponded closely to the GLM model. These results demonstrate the usefulness of the conditional values of β and γ as estimators, and show that gains in precision and accuracy can be obtained when GLM models of fitness functions fit the data well.

Figure 1.


Simulation results for the performance of Lande & Arnold's (1983) least squares‐based (OLS) estimators (red lines), and log‐quadratic (GLM) estimators (blue lines), of directional and quadratic selection gradients. The first column shows bias in estimates of β and γ, where departure from the grey line (the simulated truth) indicates bias. The middle column shows the performance of OLS standard errors (red dashed lines), bootstrap standard errors (red dotted lines), and first‐order approximations (blue dashed lines) of the standard errors of the GLM estimators. Ideally, all values of estimated mean standard errors would fall on the simulated standard deviation of their associated estimators, shown as solid lines. The right column shows the mean squared errors of the OLS and GLM estimators. Note that the y‐axis scale for the MSE in plot (o) differs from that in plots (c) and (i), that the scale in plot (r) differs from that in plots (f) and (l), and that the scales in plots (d), (j), and (p) differ.

It remains plausible that the OLS estimators motivated by Lande & Arnold's (1983) work could outperform GLM‐based analyses in some scenarios. In particular, the OLS estimators will rarely be misleading (Figure 1), whereas the risk of bias or misleading standard errors in GLM‐based workflows arising from, for example, mis‐specification of error structures, is not known. What the theory and limited simulations presented here demonstrate is that (a) given a well‐specified log‐link GLM, selection gradients can be obtained, and (b) first‐order approximations to the standard errors of these GLM‐based selection gradients seem to perform adequately.

VARIATION IN SELECTION: EMPIRICAL EXAMPLE

In order to explore the behavior of estimates of β and γ derived from log‐quadratic fitness models, we conducted a small study of selection, and variation in selection, on birth mass in female Soay sheep (Ovis aries), from a long‐term study on St Kilda, in the Outer Hebrides, Scotland (Clutton‐Brock & Pemberton 2004). Briefly, the data comprise records of mass taken within 5 days of birth, and subsequent lifetime breeding success (i.e., the total number of offspring produced, regardless of the fates of those offspring) of females from cohorts born between 1985 and 2006 (inclusive, except for 2001, when data collection was suspended due to precautions associated with an outbreak of foot‐and‐mouth disease). Fitness data were collected between 1985 and 2016. A small number of ewes from later cohorts (i.e., 2005 and 2006) will have still been alive in 2016, and their fitness will therefore be slightly underestimated. Prior to other analyses, birth masses were adjusted for growth over the first 5 days by fitting a linear model of mass as a function of age, and correcting masses accordingly.

We fitted a single model as our basis for estimating the variance in selection among cohorts. We fitted a generalized linear random regression mixed model describing the dependence of lifetime breeding success, Wij, of individual i from cohort j, as a function of their birth mass zij, according to

W_{ij} = P\left(E[W]_{ij}\right), \qquad (21a)
\log E[W]_{ij} = a_j + \bar{b} z_{ij} + \tfrac{1}{2}\bar{g} z_{ij}^2 + \delta_{b,j} z_{ij} + \tfrac{1}{2}\delta_{g,j} z_{ij}^2 + e_{ij}, \qquad (21b)

where P(λ) denotes samples from a Poisson distribution with expectation λ, aj are cohort‐specific intercepts, and b¯ and g¯ are overall log‐scale slopes and curvatures. δb,j and δg,j are cohort‐specific random effects for slopes and curvature terms, assumed to be distributed according to

\begin{pmatrix} \delta_b \\ \delta_g \end{pmatrix}_j \sim N(0, \Sigma_\delta),

and the elements of the covariance matrix \Sigma_\delta = \begin{pmatrix} \sigma^2_{\delta_b} & \sigma_{\delta_b,\delta_g} \\ \sigma_{\delta_b,\delta_g} & \sigma^2_{\delta_g} \end{pmatrix} are estimated. The e_{ij} are residuals, with zero means and cohort‐specific variances. We implemented the model defined by equations 21 as a mixed model, and collected MCMC samples, using MCMCglmm (Hadfield 2010). We used diffuse normal priors on the fixed effects, an inverse‐Wishart prior for \Sigma_\delta, and informative inverse‐gamma priors on the cohort‐specific log‐scale overdispersion variances, \sigma^2_e, in order to help the model predict cohort‐specific mean fitnesses that agreed well with the observed data. The specific model is fully detailed in the code supplement.

Selection is predominantly directional, favoring larger birth masses (Table 1). Quadratic selection is near zero on average, but not because there is generally a lack of curvature of the fitness function. The combination of positive values of b and negative values of g (from fitting equation 21; Table 1a) implies that the fitness function is often upwardly curved over much of the distribution of birth mass, but becomes downwardly curved in the region of large birth masses (Figure 2), such that the curvature, averaged over the distribution of phenotype, tends to be near zero. Using the system given in the appendix for calculating variances in β and γ (with univariate cases given in equations 19 and 20), we inferred that there is very substantial variation in both directional and quadratic selection. While β varies greatly (\hat{\sigma}_\beta = 0.18, 95% CI 0.04 – 0.34; Table 1b), because selection is so strongly directional, this does not lead to fluctuations in the direction of selection. The fact that the average quadratic selection gradient is near zero does not imply that nondirectional selection is absent. Rather, in contrast to the situation for directional selection, the estimate of \hat{\sigma}_\gamma = 0.35 (0.06 – 0.71; Table 1b), given that \hat{\mu}_\gamma = 0.02 (−0.21 – 0.21), suggests that quadratic selection varies substantially, sometimes taking positive values and sometimes being negative.

Table 1.

Parameters of the random regression mixed model (a) used to characterize variation in selection of lamb body mass in Soay sheep (Ovis aries). Part (b) gives estimates of the means, variances, and covariances of selection gradients, derived from the model reported in part (a). For ease of interpretation, estimates of variation in β and γ are reported as standard deviations, and the relationship between β and γ is reported as a correlation.

Parameter                      Estimate (95% CI)
(a) Mixed model parameters
\bar{b}                         1.07 ( 0.83 –  1.35)
\bar{g}                        −0.72 (−1.08 – −0.33)
\sigma^2_{\delta_b}             0.25 ( 0.05 –  0.50)
\sigma^2_{\delta_g}             0.47 ( 0.07 –  1.02)
\sigma_{\delta_b,\delta_g}     −0.26 (−0.56 –  0.00)
(b) Derived parameters of the distribution of selection gradients
\mu_\beta                       0.62 ( 0.52 –  0.74)
\mu_\gamma                      0.02 (−0.21 –  0.21)
\sigma_\beta                    0.18 ( 0.04 –  0.34)
\sigma_\gamma                   0.35 ( 0.06 –  0.71)
\rho_{\beta,\gamma}             0.64 (−0.18 –  1.00)

Figure 2.


Estimates of functions relating birth mass to lifetime breeding success in 21 cohorts of Soay sheep (Ovis aries) on St Kilda from 1985 to 2006; note that inference of selection in 2001 is not possible because very little phenotypic data were collected on account of the foot‐and‐mouth outbreak in the UK that year. Estimated fitness functions are given with 95% confidence (OLS) or credible (Bayesian generalized linear mixed model, GLMM) intervals of the predictions. OLS estimates are generated independently for each cohort; GLMM estimates depict cohort‐specific predictions from the random regression mixed model described in equation 21.

Annual (log‐)quadratic functions from the random regression mixed model (equation 21) correspond reasonably closely to the quadratic OLS estimates for most cohorts (Figure 2). There are some systematic differences, arising primarily from the fact that the two models have different functional forms: the OLS model is a quadratic function on the data scale, while the mixed model is quadratic on the log scale, i.e., a Gaussian function when the parameter g is negative, as it often is (Table 1a).

Conclusion

We have provided analytical expressions for selection gradients, given the parameters of exponential fitness models with quadratic arguments. These functions can be applied in conjunction with a range of generalised linear model approaches, and with specific situations in capture‐mark‐recapture, survival analysis, and parentage analysis, and relate empirical selection gradients directly to the types of fitness functions used in theoretical studies. The general relationship of selection gradients to the coefficients of log‐linear and log‐quadratic models, and in particular the various ways of estimating these coefficients using generalized linear models, is probably the most generally useful feature of our results. In empirical applications, our preliminary simulation results indicate that, given an appropriate model of a log‐scale fitness function, inference using log‐linear and log‐quadratic models may be very robust, and could provide more reliable statements about uncertainty (i.e., reasonable standard errors) than the main methods used to date. It should be noted, however, that OLS methods proved to be highly robust in our simulations with Poisson fitness residuals (see also McGee, submitted), except in the presence of very strong nonlinear selection. Furthermore, the relationships given here between log‐quadratic fitness functions and selection gradients could lead to better integration between empirical and theoretical strategies for modelling selection.

AUTHOR CONTRIBUTIONS

MBM and IBJG developed the theory, conducted the analyses, and wrote the manuscript together.

Associate Editor: J. Tufto

Handling Editor: A. G. McAdam

Supporting information

Supporting Information

ACKNOWLEDGMENTS

The authors thank Andy Gardner, Graeme Ruxton, Joe Felsenstein, and Kerry Johnson for discussions, comments, and advice. The manuscript was improved by the comments of two reviewers. Peter Jupp provided particular insights that improved this paper. M.B.M. is supported by a Royal Society (London) University Research Fellowship.

GAUSSIAN FITNESS FUNCTIONS AS SPECIAL CASES OF EXPONENTIAL FITNESS FUNCTIONS WITH QUADRATIC ARGUMENTS

Suppose that the fitness function $W(z)$ is proportional to a Gaussian density function for $z$ with mean $\theta$ and covariance matrix $\omega$. The exponent of the corresponding exponential function is then given by

$$-\tfrac{1}{2}(z-\theta)^{\top}\omega^{-1}(z-\theta) = -\tfrac{1}{2}\theta^{\top}\omega^{-1}\theta + \theta^{\top}\omega^{-1}z - \tfrac{1}{2}z^{\top}\omega^{-1}z = a + b^{\top}z + \tfrac{1}{2}z^{\top}gz,$$

where $a = -\tfrac{1}{2}\theta^{\top}\omega^{-1}\theta$, $b^{\top} = \theta^{\top}\omega^{-1}$, and $g = -\omega^{-1}$. Let $z$ have a Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma$. Setting $S = (\omega+\Sigma)^{-1}$, the selection gradient vector $\beta$ given by equation 7 is

$$-S(\mu-\theta) = (\omega+\Sigma)^{-1}(\omega b-\mu) = (-g^{-1}+\Sigma)^{-1}(-g^{-1}b-\mu) = (g^{-1}-\Sigma)^{-1}g^{-1}\,g(g^{-1}b+\mu) = (I_k-g\Sigma)^{-1}(b+g\mu) = Q(b+g\mu), \tag{A1}$$

where Q is defined following equation 16. This shows that the result for the selection gradient using Gaussian fitness functions is a special case of our result in equation 17.

Comparison of the pre‐multipliers of μ in A1 indicates that $S = -Qg$. It follows that equation 8, giving the result for quadratic selection gradients using Gaussian fitness functions, is a special case of our result in equation 18.
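This equivalence is straightforward to verify numerically. The sketch below (Python with NumPy; all parameter values are arbitrary choices for illustration) constructs $b$ and $g$ from a random optimum $\theta$ and width matrix $\omega$, and checks that $Q(b + g\mu)$ reproduces the Gaussian‐fitness‐function result $(\omega + \Sigma)^{-1}(\theta - \mu)$.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3

# Random positive-definite width matrix omega and trait covariance Sigma,
# optimum theta and trait mean mu (all arbitrary illustrative values).
A = rng.normal(size=(k, k)); omega = A @ A.T + k * np.eye(k)
B = rng.normal(size=(k, k)); Sigma = B @ B.T + k * np.eye(k)
theta = rng.normal(size=k)
mu = rng.normal(size=k)

# Gaussian-fitness parameterisation: beta = (omega + Sigma)^{-1} (theta - mu).
beta_gaussian = np.linalg.solve(omega + Sigma, theta - mu)

# Log-quadratic parameterisation: b = omega^{-1} theta, g = -omega^{-1},
# Q = (I - g Sigma)^{-1}, beta = Q (b + g mu) (equation 17).
g = -np.linalg.inv(omega)
b = -g @ theta
Q = np.linalg.inv(np.eye(k) - g @ Sigma)
beta_quadratic = Q @ (b + g @ mu)

print(np.allclose(beta_gaussian, beta_quadratic))
```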

SAMPLING VARIANCES OF SELECTION GRADIENT ESTIMATES

Denote by $\vec{\gamma}$ a vector containing all the unique elements of $\gamma$. In what follows, $\vec{\gamma}$ is composed by vertically stacking the columns of the diagonal and sub‐diagonal elements of $\gamma$. For example, in an analysis with three traits, $\vec{\gamma} = [\gamma_{1,1}, \gamma_{2,1}, \gamma_{3,1}, \gamma_{2,2}, \gamma_{3,2}, \gamma_{3,3}]^{\top}$. Let $v(\cdot)$ denote the function mapping the distinct elements of a symmetric matrix $r$ onto the column vector $\vec{r}$, arranged in the same way.

The first‐order approximation to the sampling covariance matrix of the elements of $\beta$ and $\vec{\gamma}$ is then given by $J\Sigma J^{\top}$, where $\Sigma$ is the sampling covariance matrix of a vector containing the elements of $b$ and $\vec{g}$, the latter being a column vector containing the distinct elements of $g$ arranged according to the same scheme that defines $\vec{\gamma}$. $J$ is the Jacobian, or gradient matrix of first‐order partial derivatives, of $\beta$ and $\vec{\gamma}$ with respect to $b$ and $\vec{g}$, i.e.,

$$J = \begin{bmatrix} \dfrac{\partial \beta}{\partial b} & \dfrac{\partial \beta}{\partial \vec{g}} \\[1ex] \dfrac{\partial \vec{\gamma}}{\partial b} & \dfrac{\partial \vec{\gamma}}{\partial \vec{g}} \end{bmatrix},$$

evaluated at the estimated values of b and g.

Note that some users may prefer to fit model 9 with $g_{ii}$ replaced by $2g^{*}_{ii}$, say, so that the quadratic terms are fitted without the factor of one half. The formulae for $\beta$ and $\vec{\gamma}$ are readily re‐expressed in terms of these variables by making this substitution. If $\Sigma_1$ denotes the covariance matrix obtained when fitting this revised model, the required covariance matrix $\Sigma$ can be calculated using $\Sigma = D\Sigma_1 D$, where $D$ is a diagonal matrix with all the diagonal elements equal to one, apart from those corresponding to the variables $g^{*}_{ii}$, which equal 2.

The four submatrices of $J$ can be treated separately. Noting that $\beta = Q(b+g\mu)$ (equation 17),

$$\frac{\partial \beta}{\partial b} = Q. \tag{A2}$$

Let $s = \tfrac{1}{2}k(k+1)$, where $k$ is the number of traits in the analysis, and let $e_1, \ldots, e_s$ be the standard basis for an $s$‐dimensional space (i.e., $e_1 = [1, 0, \ldots, 0]^{\top}$, etc.). Define an indicator matrix $C_m = C^{(i,j)}$, where $C^{(i,j)}$ is a $k \times k$ matrix in which

$$C^{(i,j)}_{xy} = \begin{cases} 1, & (x,y) = (i,j) \text{ or } (j,i); \\ 0, & \text{otherwise.} \end{cases}$$

Using the standard expression for the derivative of the inverse of a matrix with respect to a scalar, we can obtain $\partial\beta/\partial\vec{g}$, i.e., the upper‐right sub‐matrix of $J$. Writing $\Psi = I_k - g\Sigma$, so that $Q = \Psi^{-1}$,

$$\begin{aligned} \beta &= \Psi^{-1}(b + g\mu) \\ \frac{\partial \beta}{\partial g_m} = \frac{\partial \beta}{\partial g_{ij}} &= -\Psi^{-1}\frac{\partial \Psi}{\partial g_{ij}}\Psi^{-1}(b + g\mu) + \Psi^{-1}\frac{\partial (b + g\mu)}{\partial g_{ij}} \\ &= -Q\frac{\partial (I_k - g\Sigma)}{\partial g_{ij}}Q(b + g\mu) + Q\frac{\partial g}{\partial g_{ij}}\mu \\ &= Q\frac{\partial g}{\partial g_{ij}}\Sigma Q(b + g\mu) + Q\frac{\partial g}{\partial g_{ij}}\mu \\ &= QC^{(i,j)}(\Sigma\beta + \mu) = QC_m(\Sigma\beta + \mu) \\ \frac{\partial \beta}{\partial \vec{g}} &= \sum_{m=1}^{s}\frac{\partial \beta}{\partial g_m}e_m^{\top} = Q\sum_{m=1}^{s}C_m(\Sigma\beta + \mu)e_m^{\top} \end{aligned} \tag{A3}$$

Let $Q_{[u]}$ denote the $u$th column of $Q$. Using the previous relation $\partial\beta/\partial b = Q$, we can obtain $\partial\vec{\gamma}/\partial b$, i.e., the lower‐left sub‐matrix of $J$. Since $\gamma = \beta\beta^{\top} + Qg$,

$$\begin{aligned} \frac{\partial \gamma}{\partial b_u} &= \beta\frac{\partial \beta^{\top}}{\partial b_u} + \frac{\partial \beta}{\partial b_u}\beta^{\top} = \beta Q_{[u]}^{\top} + Q_{[u]}\beta^{\top} \\ \frac{\partial \vec{\gamma}}{\partial b_u} &= v\!\left(\beta Q_{[u]}^{\top} + Q_{[u]}\beta^{\top}\right) \\ \frac{\partial \vec{\gamma}}{\partial b} &= \sum_{u=1}^{k} v\!\left(\beta Q_{[u]}^{\top} + Q_{[u]}\beta^{\top}\right)e_u^{\top} \end{aligned} \tag{A4}$$

Let $M^{(m)} = QC_m(\Sigma\beta + \mu)\beta^{\top}$. Note that $Q^{-1} = \Omega^{-1}\Sigma$ implies $\Omega = \Sigma Q$. Moreover, $\Omega^{-1} = \Sigma^{-1} - g$ implies firstly that

$$I_k + g\Omega = \Sigma^{-1}\Omega = Q \tag{A5}$$

and secondly that $\Omega$ is symmetric, since $\Sigma$ and $g$ are both symmetric. It follows that

$$Q^{\top} = I_k + (g\Omega)^{\top} = I_k + \Omega g. \tag{A6}$$

The lower‐right sub‐matrix of J can then be derived.

$$\begin{aligned} \frac{\partial \gamma}{\partial g_{ij}} &= \frac{\partial \beta}{\partial g_{ij}}\beta^{\top} + \beta\frac{\partial \beta^{\top}}{\partial g_{ij}} + QC^{(i,j)} + QC^{(i,j)}\Sigma Qg \\ &= QC^{(i,j)}(\Sigma\beta + \mu)\beta^{\top} + \beta\left(QC^{(i,j)}(\Sigma\beta + \mu)\right)^{\top} + QC^{(i,j)} + QC^{(i,j)}\Omega g \\ \frac{\partial \vec{\gamma}}{\partial g_{ij}} &= v\!\left(M^{(m)} + (M^{(m)})^{\top} + QC_m(I_k + \Omega g)\right) \\ \frac{\partial \vec{\gamma}}{\partial \vec{g}} &= \sum_{m=1}^{s} v\!\left(M^{(m)} + (M^{(m)})^{\top} + QC_mQ^{\top}\right)e_m^{\top}, \end{aligned} \tag{A7}$$

by use of equation A6.
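The Jacobian blocks in equations A2, A3, A4, and A7 can be checked against finite differences of the map from $(b, \vec{g})$ to $(\beta, \vec{\gamma})$. The sketch below (arbitrary illustrative parameter values for three traits, chosen so that $I_k - g\Sigma$ is invertible) does exactly that.

```python
import numpy as np

rng = np.random.default_rng(2)
k = 3

# Arbitrary illustrative parameters; g negative definite keeps I - g*Sigma invertible.
b = rng.normal(size=k)
G = rng.normal(size=(k, k)); g = -0.1 * (G @ G.T + np.eye(k))
S = rng.normal(size=(k, k)); Sigma = S @ S.T + np.eye(k)
mu = rng.normal(size=k)

# Distinct elements of g (and gamma): stack the columns of the lower triangle.
pairs = [(i, j) for j in range(k) for i in range(j, k)]
s = len(pairs)                      # s = k(k+1)/2

def C(i, j):                        # indicator matrix C^(i,j)
    M = np.zeros((k, k)); M[i, j] = M[j, i] = 1.0
    return M

def v(M):                           # half-vectorisation, same order as `pairs`
    return np.array([M[i, j] for (i, j) in pairs])

def beta_gamma(b, g):               # equations 17 and 18
    Q = np.linalg.inv(np.eye(k) - g @ Sigma)
    beta = Q @ (b + g @ mu)
    return beta, np.outer(beta, beta) + Q @ g

beta, gamma = beta_gamma(b, g)
Q = np.linalg.inv(np.eye(k) - g @ Sigma)

# Analytical Jacobian blocks.
dbeta_db = Q                                                            # A2
dbeta_dg = np.column_stack(
    [Q @ C(i, j) @ (Sigma @ beta + mu) for (i, j) in pairs])            # A3
dgam_db = np.column_stack(
    [v(np.outer(beta, Q[:, u]) + np.outer(Q[:, u], beta))
     for u in range(k)])                                                # A4

def dgam_dg_col(i, j):                                                  # A7
    M = Q @ np.outer(C(i, j) @ (Sigma @ beta + mu), beta)
    return v(M + M.T + Q @ C(i, j) @ Q.T)

dgam_dg = np.column_stack([dgam_dg_col(i, j) for (i, j) in pairs])

# Finite-difference comparison.
eps = 1e-6
for u in range(k):
    db = np.zeros(k); db[u] = eps
    b2, gam2 = beta_gamma(b + db, g)
    assert np.allclose((b2 - beta) / eps, dbeta_db[:, u], atol=1e-4)
    assert np.allclose(v(gam2 - gamma) / eps, dgam_db[:, u], atol=1e-4)
for m, (i, j) in enumerate(pairs):
    b3, gam3 = beta_gamma(b, g + eps * C(i, j))
    assert np.allclose((b3 - beta) / eps, dbeta_dg[:, m], atol=1e-4)
    assert np.allclose(v(gam3 - gamma) / eps, dgam_dg[:, m], atol=1e-4)
print("Jacobian blocks match finite differences")
```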

Finally, note that equations A5 and A6 are also relevant to the derivation of formula 16. By definition, $f(z) = a + z^{\top}b + \tfrac{1}{2}z^{\top}gz$, and we have $\log[p_{\mu,\Sigma}(z)] = -\tfrac{1}{2}z^{\top}\Sigma^{-1}z + z^{\top}\Sigma^{-1}\mu + \alpha$, where $\alpha$ does not depend on $z$. Thus, if $\alpha' = \alpha + a$, it follows that, as a function of $z$,

$$f(z) + \log[p_{\mu,\Sigma}(z)] = -\tfrac{1}{2}z^{\top}(\Sigma^{-1} - g)z + z^{\top}(b + \Sigma^{-1}\mu) + \alpha' = -\tfrac{1}{2}z^{\top}\Omega^{-1}z + z^{\top}\Omega^{-1}\Omega(b + \Sigma^{-1}\mu) + \alpha'.$$

Now, by A5 and A6, we have $\Omega(b + \Sigma^{-1}\mu) = \Omega b + (\Sigma^{-1}\Omega)^{\top}\mu = \Omega b + Q^{\top}\mu = \Omega b + (I_k + \Omega g)\mu = \nu$, implying that

$$f(z) + \log[p_{\mu,\Sigma}(z)] = -\tfrac{1}{2}z^{\top}\Omega^{-1}z + z^{\top}\Omega^{-1}\nu + \alpha' = -\tfrac{1}{2}(z - \nu)^{\top}\Omega^{-1}(z - \nu) + \alpha'', \tag{A8}$$

where $\alpha''$ is constant as a function of $z$. The exponent of $e^{f(z)}p_{\mu,\Sigma}(z)$ is thus identical, as a function of $z$, to that of $p_{\nu,\Omega}(z)$. Hence formula 16 holds.
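As a numerical check of equation A8, the quantity $f(z) + \log[p_{\mu,\Sigma}(z)] - \log[p_{\nu,\Omega}(z)]$ should be the same constant at every $z$. The sketch below (arbitrary parameter values, with $g$ negative definite so that $\Omega$ is positive definite, and taking $a = 0$ without loss of generality) verifies this at several random points.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
k = 2

# Arbitrary parameters; g negative definite so the tilted density is proper.
b = rng.normal(size=k)
G = rng.normal(size=(k, k)); g = -(G @ G.T + np.eye(k))
S = rng.normal(size=(k, k)); Sigma = 0.5 * (S @ S.T) + np.eye(k)
mu = rng.normal(size=k)

# Omega = (Sigma^{-1} - g)^{-1} and nu = Omega b + (I + Omega g) mu, as in A8.
Omega = np.linalg.inv(np.linalg.inv(Sigma) - g)
nu = Omega @ b + (np.eye(k) + Omega @ g) @ mu

def f(z):  # log-quadratic argument, with a = 0 without loss of generality
    return b @ z + 0.5 * z @ g @ z

# At any point z, f(z) + log p_{mu,Sigma}(z) - log p_{nu,Omega}(z) is constant.
zs = rng.normal(size=(5, k))
vals = [f(z) + multivariate_normal.logpdf(z, mu, Sigma)
        - multivariate_normal.logpdf(z, nu, Omega) for z in zs]
print(np.allclose(vals, vals[0]))
```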

OTHER ANALYSES THAT CORRESPOND TO LOG‐LINEAR AND LOG‐QUADRATIC FITNESS FUNCTIONS

In addition to generalised linear models with log link functions, there are other cases where models of trait‐fitness relationships correspond to log‐linear or log‐quadratic fitness functions. In this section, we describe how the formulae for β and γ given in equations 17 and 18 are applicable in four kinds of analyses. In each of these situations, it may not be immediately apparent that a form of the fitness function equivalent to equation 9 is implied.

Models with overdispersion

Key applications of the equations for calculating selection gradients from the parameters of log‐linear fitness models, such as GLM and generalised linear mixed model (GLMM) analyses, are likely to involve fitness measures with more dispersion than is accommodated by standard statistical distributions such as the Poisson. Multiplicative overdispersion of count variables is often handled with a negative binomial distribution, parameterised via its expectation and a dispersion parameter. Negative binomial GLM analyses can thus be used directly to estimate b and g, and hence β and γ. For additive overdispersion, the relations between a generalised model of a fitness function and selection gradients are equally direct, though less immediately obvious. Consider a model for expected fitness such as

$$\log(E[W_i]) = a + bz_i + \tfrac{1}{2}gz_i^2 + \varepsilon_i, \qquad \varepsilon_i \sim N\!\left(0, \sigma_\varepsilon^2\right),$$

which represents a GLMM analysis with additive overdispersion. Heterogeneity in the $\varepsilon_i$ affects the expectation, such that $E[W_i] = e^{a + bz_i + \frac{1}{2}gz_i^2 + \frac{1}{2}\sigma_\varepsilon^2}$ (note that the mean of a lognormal distribution depends on the dispersion on the log scale, such that $\mu_x = e^{\mu_{\log x} + \sigma^2_{\log(x)}/2}$; Aitchison & Brown 1957). In this situation, the selection gradients β and γ are again given by equations 17 and 18, even though this expression for log fitness or expected fitness ostensibly differs from that implied by equations 10a,b. This can be seen by rearranging such that the dependence of expected absolute fitness on the dispersion parameter, $\sigma_\varepsilon^2$, is absorbed into the intercept,

$$E[W_i] = e^{\alpha + bz_i + \frac{1}{2}gz_i^2},$$

where

$$\alpha = a + \tfrac{1}{2}\sigma_\varepsilon^2.$$

Since selection gradients do not depend on the intercept $a$ in equations 17 and 18, they do not depend on $\alpha$ in the above expression, and therefore depend neither on the intercept $a$ nor on the dispersion parameter $\sigma_\varepsilon^2$. Consequently, linear and quadratic terms from log‐link GLMMs with additive overdispersion can also be used with equations 17 and 18 to obtain selection gradients for Gaussian traits.
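As a concrete illustration of this invariance (a simulation sketch with invented parameter values, not code from the article), the following fits a log‐quadratic Poisson regression by IRLS to fitness data simulated with additive log‐scale overdispersion, then converts the estimated $b$ and $g$ to $\beta$ and $\gamma$ using the univariate forms of equations 17 and 18. The fitted intercept absorbs $\tfrac{1}{2}\sigma_\varepsilon^2$, while $b$, $g$, and hence the selection gradients are unaffected.

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented simulation parameters: Gaussian trait, log-quadratic expected fitness
# with additive overdispersion epsilon ~ N(0, sigma_eps^2) on the log scale.
n, a, b_true, g_true, sigma_eps = 40000, 0.5, 0.3, -0.2, 0.5
mu, sigma2 = 0.0, 1.0
z = rng.normal(mu, np.sqrt(sigma2), size=n)
eps = rng.normal(0.0, sigma_eps, size=n)
W = rng.poisson(np.exp(a + b_true * z + 0.5 * g_true * z**2 + eps))

# Log-link Poisson GLM fitted by Newton/IRLS; the estimating equations remain
# consistent for the mean even though the data are overdispersed.
X = np.column_stack([np.ones(n), z, 0.5 * z**2])
coef = np.zeros(3)
for _ in range(25):
    w = np.exp(np.clip(X @ coef, -30, 30))
    coef += np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (W - w))
a_hat, b_hat, g_hat = coef

# Univariate equations 17 and 18: Q = (1 - g sigma^2)^{-1}.
Q = 1.0 / (1.0 - g_hat * sigma2)
beta = Q * (b_hat + g_hat * mu)
gamma = beta**2 + Q * g_hat

# The intercept estimate approaches a + sigma_eps^2 / 2 = 0.625, while beta
# and gamma approach the values implied by b_true and g_true alone.
print(a_hat, beta, gamma)
```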

Multinomial models, as in parentage analysis

In parentage inference, some methods have been proposed wherein the probability that candidate parent i is the parent of a given offspring is modelled according to

$$W(z_i) \propto e^{f(z_i)},$$

and where realised parentages of a given offspring array are then modelled according to a multinomial distribution, potentially integrating over uncertainty in paternity assignments based on molecular data (Hadfield et al. 2006, Smouse et al. 1999). When f(z) is a linear function, Smouse, Meagher & Kobak (1999; T. Meagher, personal communication) interpreted the analysis as being analogous to, but not necessarily identical to, that of Lande & Arnold (1983). For a linear f(z), this analysis does in fact yield estimates of β, and for a quadratic function, directional and quadratic selection gradients can be obtained using equations 17 and 18. This can be seen by noting that the expected fitness, given phenotype, of candidate fathers for any given offspring array will be, in the log‐linear case,

$$W(z) = ce^{a + bz} = e^{a' + bz},$$

where $c$ is a constant and $a' = a + \log(c)$. As $W(z)$ has been expressed in the form of equation 4, it follows from equation 5 that $\beta = b$. Moreover, if expected parentage is modelled as a log‐quadratic function, equations 17 and 18 can be used to recover selection gradients from multinomial parentage models.
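The argument can be illustrated by simulation (all parameter values below are invented). Each offspring's sire is drawn from a multinomial with probabilities proportional to $e^{bz}$, so each candidate's expected fitness is proportional to $e^{bz}$; for a Gaussian phenotype, the Lande–Arnold OLS directional gradient computed from realised sirings should then recover $b$.

```python
import numpy as np

rng = np.random.default_rng(5)
ncand, noff, b = 5000, 500000, 0.3

# Candidate fathers with a standard Gaussian phenotype z.
z = rng.normal(0.0, 1.0, size=ncand)

# Each offspring's sire is drawn with probability proportional to exp(b*z),
# so expected fitness is c * exp(b*z), as in the log-linear case above.
p = np.exp(b * z)
p /= p.sum()
sired = rng.multinomial(noff, p)

# Relative fitness and the OLS directional gradient (Lande & Arnold 1983).
w = sired / sired.mean()
beta_ols = np.cov(w, z)[0, 1] / np.var(z, ddof=1)
print(abs(beta_ols - b) < 0.1)
```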

Mark‐recapture with constant survival functions

Another case where our formulae may be applicable pertains to inference of survival rates. Often, trait‐dependent survival rates are assessed over discrete intervals. While the experimental unit of time may be an interval (e.g., a day or a year), the biologically relevant aspect of variation in survival may be longevity, i.e., for how many intervals an individual survives. One such situation arises when per‐interval survival rate is assessed via logistic regression, and trait‐dependent survival rates are (or may be assumed to be) constant across intervals. Logistic regression analyses satisfying the first condition are commonly implemented in capture‐mark‐recapture procedures. Suppose that per‐interval survival rate, given phenotype, may be assumed to be constant, with death of an individual with phenotype z in any particular interval occurring with probability ρ(z). If fitness may reasonably be assumed to be reflected by the expected number of intervals survived, then it is given, as the expectation of a geometric distribution, by

$$W(z) = \frac{1 - \rho(z)}{\rho(z)}.$$

If trait‐dependent per‐interval survival probability is denoted $\phi(z)$ ($\phi$ being the standard symbol for survival rate in capture‐mark‐recapture analyses; Lebreton et al. 1992), then the fitness function in terms of the expected number of intervals lived is $W(z) = \frac{1-(1-\phi(z))}{1-\phi(z)} = \frac{\phi(z)}{1-\phi(z)}$. If per‐interval survival rate has been modelled as a logistic regression, i.e.,

$$\phi(z) = \frac{e^{f(z)}}{1 + e^{f(z)}},$$

where $\phi(z)$ denotes the per‐interval survival function, and $f(z)$ is the fitness function on the logistic scale, then the fitness function on the discrete longevity scale is

$$W(z) = \frac{e^{f(z)}}{1 + e^{f(z)}} \bigg/ \left(1 - \frac{e^{f(z)}}{1 + e^{f(z)}}\right) = e^{f(z)}.$$

Therefore, if f(z) is a linear function, then its coefficients are the directional selection gradients on the discrete‐longevity scale. If f(z) is a quadratic function, then the corresponding directional and quadratic selection gradients, again if the relevant aspect of fitness is the number of intervals survived, can be obtained using equations 17 and 18. Waller and Svensson (2016) take advantage of these relationships to compare inference of trait‐dependent survival in capture‐mark‐recapture models to classical inference using Lande & Arnold's (1983) least‐squares regression analysis, where fitness is assessed as the number of intervals that individuals survive.

It must be stressed that these results do not justify interpretation of logistic regression coefficients of survival probability as selection gradients in a general way. Such coefficients differ from selection gradients for three reasons: (1) they pertain to a linear predictor scale, and natural selection plays out on the data scale, (2) they directly model absolute fitness, not relative fitness, and (3) they pertain to per‐interval survival, which may not necessarily be the aspect of survival that best reflects fitness in any given study. It is only when the number of intervals survived is of interest (and phenotype‐dependent survival rates can be assumed to be constant across intervals) that these three different aspects of scale cancel out such that the parameters of a logistic regression are selection gradients.
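The identity between the fitness function on the discrete‐longevity scale and $e^{f(z)}$ is easy to check by simulation. The sketch below (hypothetical coefficients for $f(z)$; not code from the article) draws geometrically distributed lifetimes under a logistic per‐interval survival model and confirms that the mean number of intervals survived equals $e^{f(z)}$.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical logistic per-interval survival model: f(z) = 0.2 + 0.4 z.
def f(z):
    return 0.2 + 0.4 * z

n = 200000
for z in (-1.0, 0.0, 1.0):
    phi = np.exp(f(z)) / (1 + np.exp(f(z)))     # per-interval survival
    # Number of intervals survived before death is geometric: subtract one
    # because rng.geometric counts trials up to and including the first
    # "success" (here, death, with probability 1 - phi).
    lived = rng.geometric(1 - phi, size=n) - 1
    # Mean intervals survived should equal phi / (1 - phi) = exp(f(z)).
    assert abs(lived.mean() - np.exp(f(z))) < 0.05 * np.exp(f(z))
print("mean intervals survived matches exp(f(z))")
```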

Survival analysis

Another situation in which an analysis important for understanding trait‐fitness relationships has an immediate (but not necessarily immediately apparent) relationship to selection gradients arises in survival analysis. In a proportional hazards model (Cox 1972), the instantaneous probability of mortality experienced by live individuals, the hazard λ(t), could be modelled as a function of their phenotype as

$$\lambda(t) = \lambda_0 e^{f(z)},$$

where $\lambda_0$ is the baseline hazard, and the factor $e^{f(z)}$ describes individual deviations from this baseline hazard. If the baseline hazard is constant in time, then survival distributions conditional on phenotype are exponential, and have mean $\lambda^{-1}$. So, if fitness may be assumed to be proportional to longevity (as a continuous variable now, not the discrete number of intervals as in the relations given above between logistic models of per‐interval survival and selection gradients), then

$$W(z) \propto \frac{1}{\lambda_0 e^{f(z)}} = \frac{1}{\lambda_0}e^{-f(z)}.$$

In expressions for selection gradients (equations 1 and 2), $\frac{1}{\lambda_0}$ would be a constant in the integrals in both the numerators and denominators, and therefore cancels in calculations of selection gradients. Therefore, if proportional hazards are modelled with f(z) as a linear or quadratic function, then the expressions for selection gradients (equations 5, 17, and 18) hold, but the coefficients of the trait‐dependent hazard function must be multiplied by −1. The relationship to survival analysis will also approximately hold when fitness is determined by survival to a specific age, and mean survival to that age is low.
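This sign reversal can be checked by simulation. Assuming a constant baseline hazard and an invented linear $f(z)$ (hypothetical values, not from the article), mean longevity under the proportional hazards model should be proportional to $e^{-f(z)}$:

```python
import numpy as np

rng = np.random.default_rng(7)
lam0, n = 0.5, 200000

def f(z):
    return 0.3 * z   # hypothetical linear term of the hazard function

for z in (-1.0, 0.0, 1.0):
    # Exponential lifetimes under a constant hazard lam0 * exp(f(z)).
    t = rng.exponential(1.0 / (lam0 * np.exp(f(z))), size=n)
    # Mean longevity is (1/lam0) * exp(-f(z)): the hazard coefficients
    # enter the fitness function with their signs reversed.
    assert abs(t.mean() - np.exp(-f(z)) / lam0) < 0.02 * np.exp(-f(z)) / lam0
print("mean longevity proportional to exp(-f(z))")
```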

LITERATURE CITED

  1. Aitchison, J. & Brown, J. (1957) The Lognormal Distribution, with Special Reference to its Uses in Economics. Cambridge: Cambridge University Press.
  2. Chevin, L.M. & Haller, B.C. (2014) The temporal distribution of directional gradients under selection for an optimum. Evolution, 68, 3381–3394.
  3. Chevin, L.M. & Hospital, F. (2008) Selective sweep at a quantitative trait locus in the presence of background genetic variation. Genetics, 180, 1645–1660.
  4. Chevin, L.M., Visser, M.E. & Tufto, J. (2015) Estimating the variation, autocorrelation, and environmental sensitivity of phenotypic selection. Evolution, 69, 2319–2332.
  5. Clutton‐Brock, T.H. & Pemberton, J.M. (2004) Soay Sheep: Dynamics and Selection in an Island Population. Cambridge: Cambridge University Press.
  6. Cox, D.R. (1972) Regression models and life‐tables. Journal of the Royal Statistical Society, Series B, 34, 187–220.
  7. Endler, J.A. (1986) Natural Selection in the Wild. Princeton, NJ: Princeton University Press.
  8. Falconer, D.S. (1960) Introduction to Quantitative Genetics. Edinburgh: Oliver and Boyd.
  9. Gomulkiewicz, R. & Houle, D. (2009) Demographic and genetic constraints on evolution. The American Naturalist, 174, E218–E229.
  10. Hadfield, J. (2010) MCMC methods for multi‐response generalized linear mixed models: the MCMCglmm R package. Journal of Statistical Software, 33(2), 1–22.
  11. Hadfield, J.D., Richardson, D.S. & Burke, T. (2006) Towards unbiased parentage assignment: combining genetic, behavioural and spatial data in a Bayesian framework. Molecular Ecology, 15, 3715–3731.
  12. Janzen, F.J. & Stern, H.S. (1998) Logistic regression for empirical studies of multivariate selection. Evolution, 52, 1564–1571.
  13. Kingsolver, J.G., Hoekstra, H.E., Hoekstra, J.M., Vignieri, S.N., Berrigan, D., Hill, C.E., Hoang, A., Gibert, P. & Beerli, P. (2001) The strength of phenotypic selection in natural populations. The American Naturalist, 157, 245–261.
  14. Lande, R. (1976) Natural selection and random genetic drift in phenotypic evolution. Evolution, 30, 314–334.
  15. Lande, R. (1979) Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry. Evolution, 33, 402–416.
  16. Lande, R. (1983) The response to selection on major and minor mutations affecting a metrical trait. Heredity, 50, 47–55.
  17. Lande, R. & Arnold, S.J. (1983) The measurement of selection on correlated characters. Evolution, 37, 1210–1226.
  18. Lebreton, J.D., Burnham, K.P., Clobert, J. & Anderson, D.R. (1992) Modeling survival and testing biological hypotheses using marked animals: a unified approach with case studies. Ecological Monographs, 62, 67–118.
  19. Lynch, M. & Walsh, B. (1998) Genetics and Analysis of Quantitative Traits. Sunderland, MA: Sinauer.
  20. Manly, B.F.J. (1985) The Statistics of Natural Selection. New York: Chapman and Hall.
  21. Morrissey, M.B. & Sakrejda, K. (2013) Unification of regression‐based approaches to the analysis of natural selection. Evolution, 67, 2094–2100.
  22. Schluter, D. (1988) Estimating the form of natural selection on a quantitative trait. Evolution, 42, 849–861.
  23. Shaw, R.G. & Geyer, C.J. (2010) Inferring fitness landscapes. Evolution, 64, 2510–2520.
  24. Smouse, P.E., Meagher, T.R. & Kobak, C.J. (1999) Parentage analysis in Chamaelirium luteum (L.) Gray (Liliaceae): why do some males have higher reproductive contributions? Journal of Evolutionary Biology, 12, 1069–1077.
  25. Stinchcombe, J.R., Agrawal, A.F., Hohenlohe, P.A., Arnold, S.J. & Blows, M.W. (2008) Estimating nonlinear selection gradients using quadratic regression coefficients: double or nothing? Evolution, 62, 2435–2440.
  26. Waller, J. & Svensson, E. (2016) The measurement of selection when detection is imperfect: how good are naïve methods? Methods in Ecology and Evolution, 7, 538–548.
  27. Walsh, B. & Lynch, M. (2018) Evolution and Selection of Quantitative Traits. Oxford, UK: Oxford University Press.
  28. Weldon, W.F.R. (1901) A first study of natural selection in Clausilia italica (von Martens). Biometrika, 1, 109–124.

Articles from Evolution; International Journal of Organic Evolution are provided here courtesy of Wiley