Abstract
Genetic variation that influences complex disease susceptibility is introduced into the population by mutation and removed by natural selection and genetic drift. This mutation-selection-drift balance (MSDB) shapes the prevalence of a disease and its genetic architecture. To date, however, MSDB has only been modeled for monogenic (Mendelian) diseases. Here, we develop an MSDB model for complex disease susceptibility: we assume that genotype relates to disease risk according to the canonical liability threshold model and that selection on variants affecting risk derives from the fitness cost of the disease, and focus on diseases that are highly polygenic, entail a substantial fitness cost, and are neither extremely common in the population nor exceedingly rare. Contrasting model predictions with GWAS and other findings in humans suggests that directional selection plays little role in shaping common genetic variation affecting complex disease susceptibility but might substantially affect rare, large-effect variation. In turn, common variation affecting complex disease susceptibility appears to be dominated by pleiotropic stabilizing selection on other traits. Our results further suggest that current estimates of disease heritability are likely biased. More generally, our model provides a better understanding of the evolutionary processes that shape the architecture and prevalence of complex diseases.
Introduction
A central goal of population genetics is to understand how evolutionary processes shape the prevalence of genetic diseases and the population distribution of their underlying genetic variants. This question is of particular interest in humans. Since the late 20th century, we have learned a lot about the genetic basis of simple (Mendelian) diseases and the frequencies of their underlying variants in human populations (Jobling et al. 2013, Ch. 16). We also have long-standing models for the evolutionary processes that generate and maintain these diseases (Haldane 1927; Fuller et al. 2019), whose predictions are in qualitative, if not quantitative, agreement with empirical observations (see, e.g., Amorim et al. 2017).
Most common genetic diseases in humans (e.g., with prevalence ≥ 0.1%) are complex (Jobling et al. 2013, Ch. 17), however, and it is only over the past decade or so that genome-wide association studies (GWAS) have begun to reveal their genetic basis (The Wellcome Trust Case Control Consortium 2007; Trubetskoy et al. 2022). These studies have now identified many thousands of robust associations between genetic variants and many diseases, and in so doing, have begun to uncover the numbers, effect sizes and frequencies of the variants underlying disease risk, henceforth the “genetic architecture” of complex diseases (Abdellaoui et al. 2023). Yet we still lack a good understanding of the evolutionary processes that shape the architecture and prevalence of complex diseases.
The discoveries from GWAS shed some light on these processes. Notably, they reveal that variant effects on disease risk are negatively correlated with their minor allele frequency, indicating that natural selection acts to remove genetic variation affecting disease risk and that the strength of selection on variants increases with their effect on risk (Schoech et al. 2019; Zeng et al. 2021). Additionally, many of the significant associations in GWAS of diseases are common (see, e.g., Trubetskoy et al. 2022) indicating that for much of the variation affecting disease risk, the effects of selection on variant frequencies are comparable to those of random genetic drift. It further appears that variation in the risk of developing common complex diseases is thinly spread among many thousands of segregating variants that are widely distributed across the genome (Loh et al. 2015; Shi, Kichaev, and Pasaniuc 2016; Boyle, Li, and Pritchard 2017). This extreme polygenicity, alongside evidence for selection and drift that lead to the removal of variation, implies that genetic variation is continually replenished by mutations at numerous sites across the genome. Thus, complex disease prevalence and architecture are shaped by a balance between mutation, natural selection, and genetic drift.
Existing models for mutation-selection-drift balance (MSDB) come in three flavors. The first is the classic model for simple Mendelian diseases introduced by Danforth and Haldane and extended by Muller, Kimura and others (Danforth 1923; Haldane 1927; 1937; Muller 1950; Crow 1958; Kimura 1961; Kimura, Maruyama, and Crow 1963; Clark 1998). In its most basic form, the model assumes that mutations arising at a single gene cause the disease in either heterozygotes (the dominant case) or homozygotes (the recessive case) and that the disease markedly reduces individual fitness (see, e.g., Gillespie 2004). The fitness cost of the disease induces strong selection against the alleles that cause it, leading to their loss from the population. The model describes the prevalence of the disease, the frequency of the underlying alleles, and the genetic load as a function of the mutation rate and fitness cost. This model and its generalizations, however, are not applicable to complex diseases, because the risk of developing complex diseases arises from the contribution of many variants (and effects of the environment), which, in turn, generates a less obvious relationship between the fitness cost of the disease and selection on its underlying variants.
The second flavor of MSDB models relates selection on complex traits to the selection acting on the many variants that affect these traits. These models do not focus on disease risk, however, but on quantitative (continuous) traits, assuming that these traits are subject to stabilizing selection, i.e., that traits have an optimal value and individual fitness declines continuously with displacement from it (Robertson 1956; Lande 1975; Keightley and Hill 1988; Simons et al. 2018). Typically, these models further assume that mutations are equally likely to increase or decrease trait values, and that the effect sizes in either direction have the same distribution, an assumption known as symmetric mutation (but see Waxman and Peck 2003; Zhang and Hill 2008; Charlesworth 2013a; 2013b). MSDB models of quantitative, complex traits have been invaluable in studying the processes that maintain heritable variation in complex traits. More recently, they have been used to study the genetic architecture of complex traits and to interpret the results of human GWAS (Simons et al. 2018; O’Connor et al. 2019; Zeng et al. 2021; Simons et al. 2022; Spence et al. 2024). It is not obvious that they apply to complex diseases, however.
Indeed, some diseases, for example hypertension or obesity, are defined in terms of underlying quantitative traits that exceed a threshold value, and these underlying traits may well be subject to stabilizing selection and have symmetric mutation. Other diseases, for instance type 2 diabetes, may reflect a discrete biological dysfunction, such as a breakdown of homeostasis (Alon 2023). In such cases, we might expect the effects of the disease on fitness to be discrete and selection to be directional, always acting to reduce disease risk; mutations may therefore tend to increase disease risk.
The third flavor of MSDB models includes all these elements, while considering fitness rather than disease risk as the focal trait. These models were introduced to study the fitness burden of deleterious mutations, known as the genetic load, in natural populations (King 1966; Kimura and Maruyama 1966; Kondrashov 1995) and were later related to evolutionary advantages of sex (Kondrashov 1982; 1988). They typically assume that fitness drops sharply around some threshold number of deleterious mutations or threshold ‘liability’ (sometimes referred to as ‘fitness potential’; Milkman 1978; Kondrashov 2018), which arises from additive (weighted) contributions over all the deleterious alleles that an individual carries. With fitness always decreasing with an increasing number of deleterious alleles (or with increasing liability), selection is always directed against these alleles. Under such directional selection, loci tend to be fixed for beneficial alleles and therefore mutations at these loci tend to be deleterious. With some modifications, these models could be framed as models of complex diseases that are similar to the model we present below.
In their existing form, however, such models cannot be related to the architecture and prevalence of complex disease. Indeed, some of the models (Kimura and Maruyama 1966; Kondrashov 1982; 1984) assume that the variants that underlie fitness are all subject to strong selection (i.e., selection that is much stronger than genetic drift), an assumption that contradicts what we have learned from GWAS of complex diseases (see above). Other studies rely on rough approximations to describe weakly selected variation (e.g., Kondrashov 1995) or assume a mapping between liability and fitness that is inconsistent with disease models (Charlesworth 1990; 2013b). These assumptions do not allow the genetic architecture and prevalence of complex diseases to be related to their underlying evolutionary parameters.
Motivated by these considerations, we develop an MSDB model of complex disease risk and solve it for the genetic architecture of the disease and its prevalence. Similar to classic models of simple (Mendelian) diseases and analogous to models of genetic load, we assume that directional selection always acts to reduce disease risk, and that the disease state is discrete. Similar to models of complex quantitative traits and analogous to models of genetic load, we assume that disease risk arises from the joint effects of many variants and environmental effects. Unlike models of genetic load, we consider accurate approximations for the behavior of weakly selected variation affecting disease risk. By combining these features, we are able to describe the expected genetic architecture of complex diseases and their prevalence and to contrast our predictions with the findings of GWAS in humans.
The Model
We use the canonical model for binary traits in human and quantitative genetics, the liability threshold model, to relate an individual’s genotype with their risk of developing a disease (Wright 1926; 1934; Lush, Lamoreux, and Hazel 1948; Dempster and Lerner 1950; Falconer 1965). Liability is a quantitative (continuous) trait that cannot be observed directly. When an individual’s liability, $Z$, exceeds a threshold, $T$, they will develop or have the disease (Figure 1A).
Figure 1.
The model. (A) The liability distribution in the population. Individuals whose liability exceeds the threshold develop the disease (in red). (B) The fitness landscape. In these illustrations, we assume that the genetic liability distribution is Normal, that the heritable variance in liability , the fitness cost , and the prevalence .
An individual’s liability relates to their genotype through the standard additive model of quantitative complex traits (Falconer and Mackay 1995). Namely, we assume that the number of genomic sites affecting liability (i.e., the target size) is very large, $L \gg 1$, and that an individual’s liability is given by
$$Z = G + E, \qquad G = \sum_{i=1}^{L} a_i g_i, \tag{1}$$
where $G$ is the genetic contribution, which is the sum of contributions over sites, $a_i$ is the effect size of the liability-increasing allele at site $i$, and $g_i \in \{0, 1, 2\}$ is the number of copies of that allele at that site; and $E \sim N(0, V_E)$ is the environmental contribution.
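This model implies that an individual’s total liability is normally distributed around their genetic liability, with the variance arising from the environmental contribution. Namely, that $Z \mid G \sim N(G, V_E)$, where $Z$ is the total liability, $G$ the genetic liability, and $V_E$ the variance of the environmental contribution. An individual’s probability of developing the disease, their genetic risk, is therefore
$$R(G) = \Pr(Z > T \mid G) = 1 - \Phi(T; G, V_E), \tag{2}$$
where $T$ is the liability threshold and $\Phi(\cdot\,; G, V_E)$ is the Normal cumulative distribution function with mean $G$ and variance $V_E$.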
We model fitness by assuming that the disease entails a fitness cost $C$, such that an individual without the disease has fitness 1 and one with the disease has fitness $1 - C$. The mean fitness of a population with disease prevalence $\bar{R}$ is therefore $1 - C\bar{R}$, and the relative fitness associated with genetic liability $G$ is
$$W(G) = \frac{1 - C\,R(G)}{1 - C\,\bar{R}} \tag{3}$$
(with the genetic risk $R(G)$ given in Eq. 2).
Figure 1B shows the resulting fitness landscape. Fitness plateaus at its maximum at low genetic liability and at its minimum, lower by a factor of $1 - C$ (where $C$ is the fitness cost of the disease), at high genetic liability, and the width of the transition between plateaus reflects the variance of the environmental contribution. This drop-off in fitness leads to selection to reduce the mean genetic liability of the population.
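The mapping from genetic liability to risk and fitness described above can be sketched numerically. The following is a minimal illustration under the stated model (Normal environmental contribution, a fixed threshold, and a fitness cost for affected individuals); the function names and the parameter values are hypothetical and are not taken from the paper's code.

```python
import math

def normal_cdf(x, mean=0.0, var=1.0):
    """Normal cumulative distribution function with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def genetic_risk(g, threshold, v_e):
    """Probability that total liability (genetic liability g plus Normal
    environmental noise with variance v_e) exceeds the threshold."""
    return 1.0 - normal_cdf(threshold, mean=g, var=v_e)

def relative_fitness(g, threshold, v_e, cost, prevalence):
    """Fitness at genetic liability g, relative to the population mean
    fitness 1 - cost * prevalence."""
    return (1.0 - cost * genetic_risk(g, threshold, v_e)) / (1.0 - cost * prevalence)

# Illustrative (assumed) values: threshold at 0, unit environmental variance,
# a 10% fitness cost and 1% prevalence.
threshold, v_e, cost, prevalence = 0.0, 1.0, 0.1, 0.01
for g in (-6.0, -1.0, 0.0, 1.0, 6.0):
    risk = genetic_risk(g, threshold, v_e)
    fit = relative_fitness(g, threshold, v_e, cost, prevalence)
    print(f"G = {g:+.1f}  risk = {risk:.4f}  relative fitness = {fit:.4f}")
```

The two plateaus of the fitness landscape differ by a factor of one minus the fitness cost, and the transition between them widens as the environmental variance grows.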
We assume, for simplicity, that the genomic sites affecting liability are bi-allelic. We set the liability scale such that at a given site $i$, the low liability allele contributes 0 and the high liability allele contributes liability $a_i$. We denote the distribution of effects across sites by $p(a)$ and its mean by $\bar{a}$. In these terms, the possible values of genetic liability range between 0 and $2L\bar{a}$, where $L$ is the number of sites (when all sites are fixed for the low or high liability alleles, respectively).
Each generation, mutation introduces new variants that affect liability into the population. We assume that the two possible alleles mutate to one another with probability $u$ per gamete per generation. We further assume, for simplicity, that the rate of mutational input per site in the population is sufficiently low for us to rely on the infinite sites approximation (specifically, we assume that $4Nu \ll 1$, where $N$ is the population size). In this approximation, segregating derived alleles are assumed to have arisen from a single mutation, and the number of mutations per gamete per generation follows a Poisson distribution with mean $Lu$, where $L$ is the number of sites. The infinite sites approximation is standard and sensible in many contexts in humans (Harpak, Bhaskar, and Pritchard 2016; Schraiber, Spence, and Edge 2024).
The population dynamics follow the standard model of a diploid, panmictic population of constant size $N$, with non-overlapping generations. In each generation, parents are randomly chosen to reproduce with probabilities proportional to their fitness (Eq. 3), i.e., Wright-Fisher sampling with fertility selection, followed by mutation, free recombination (i.e., no linkage) and Mendelian segregation. Our notation is summarized in Table S1.
Scope.
We focus on diseases that are highly polygenic, have a substantial fitness cost, and are not extremely common or exceedingly rare. Specifically, our analysis should apply to diseases with a target size (noting that the target size is typically much greater than the polygenicity), fitness cost , and prevalence . These assumptions plausibly encompass most common complex diseases in humans. Moreover, within these parameter ranges, the liability threshold model on which we rely and other standard models of complex diseases in human statistical genetics are practically interchangeable (Slatkin 2008; Wray and Goddard 2010); these include Risch’s multiplicative model used as premise in linkage studies (Risch 1990) and the logistic model used in case-control GWAS (Sham and Purcell 2014).
We focus on the model’s behavior at MSDB (i.e., at equilibrium). For the type of diseases we consider, most of the liability distribution falls below the threshold, with only a small tail above it (Fig. 1A). For simplicity, we assume that the liability distribution is well approximated by its stationary distribution and ignore stochastic fluctuations around this stationary distribution. Under our assumptions (notably of high polygenicity), this is a sensible assumption that should not have a substantial effect on our results.
Simulations.
We validate our analytic results using simulations. The simulations are implemented in SLiM (version 3.6; Haller and Messer 2019) and realize the models with one or two liability effect sizes. We initialize the simulations with genetically identical, homozygous individuals at all sites. We set effect sizes at sites according to their expected fixed state at MSDB, which we derive below (otherwise, reaching MSDB would take too long to be computationally feasible). We run the simulation for a burn-in period of generations to allow genetic variation to be near the steady state at MSDB. We then run the simulation for an additional generations and sample the population every generations to collect 500 samples per simulation. We run 6 replicate simulations for each parameter setting and estimate quantities of interest by averaging over samples. For more details about the simulations see Supplement section S8.
Resources.
Documented code for the simulations and numerical solutions of the model, as well as the scripts used to produce all figures can be downloaded at https://github.com/jjberg2/msdbPaperCode.
Results
The population dynamics and genetic architecture at individual sites
The dynamics at a site.
The dynamics at a site can be described in terms of the first two moments of change in allele frequency in a single generation (Ewens 2004, Ch. 4). We calculate the moments by averaging the fitness of the three genotypes over genetic backgrounds and plugging these averages into the standard equation for the change in allele frequency at a single locus (Supplement section 2.1). We find that the expected change in frequency of an allele at frequency $x$ that increases liability by $a$ is well approximated by
$$E[\Delta x] \approx -s(a, x)\, x (1 - x). \tag{4}$$
The selection coefficient takes an intuitive form: it equals the fitness cost of the disease, $C$, multiplied by the allele’s effect on disease risk, $\delta_R(a, x)$, namely:
$$s(a, x) = C\, \delta_R(a, x). \tag{5}$$
The risk effect $\delta_R(a, x)$ is defined as the expected increase in individual disease risk caused by substituting a random liability-decreasing allele at this site by a liability-increasing one. As an aside, we note that the allele’s effect on risk equals its (absolute) “population attributable risk” in epidemiology and can be translated into its odds-ratio estimated in case-control GWAS (see Supplement section 2.5). The second moment of change in allele frequency is well approximated by the standard drift term
$$V[\Delta x] \approx \frac{x (1 - x)}{2N}, \tag{6}$$
where $N$ is the population size.
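As a rough check on the first moment, the exact one-generation change under selection alone can be compared with the approximation of the form "minus the selection coefficient times x(1 - x)". The sketch below assumes that the background-averaged genotype fitnesses are additive, 1, 1 - s and 1 - 2s for zero, one and two copies of the liability-increasing allele, which is our reading of the prose rather than the paper's derivation; the function names are hypothetical.

```python
def exact_delta_x(x, s):
    """Exact change in allele frequency after one round of selection with
    random mating and genotype fitnesses 1, 1 - s, 1 - 2s (assumed additive)."""
    w0, w1, w2 = 1.0, 1.0 - s, 1.0 - 2.0 * s
    mean_w = (1.0 - x) ** 2 * w0 + 2.0 * x * (1.0 - x) * w1 + x ** 2 * w2
    x_next = (x ** 2 * w2 + x * (1.0 - x) * w1) / mean_w
    return x_next - x

def approx_delta_x(x, s):
    """First-order approximation to the expected change in allele frequency."""
    return -s * x * (1.0 - x)

def drift_variance(x, n):
    """Standard Wright-Fisher sampling variance for a diploid population of size n."""
    return x * (1.0 - x) / (2.0 * n)

# Illustrative (assumed) values.
s, n = 1e-3, 10_000
for x in (0.01, 0.3, 0.9):
    print(x, exact_delta_x(x, s), approx_delta_x(x, s), drift_variance(x, n))
```

Under this assumption the exact and approximate changes differ by a relative factor of about 2sx, which is negligible for the small selection coefficients considered here.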
To complete the description of these dynamics, we require the functional form of the risk effect $\delta_R$. In Supplement section 3, we show that the risk effect is well approximated by the area under the liability distribution that is pushed over the liability threshold when the distribution is shifted by $a$ on the liability scale (Figure 2). We denote the probability density at liability $l$ by $f(l)$ and the probability that liability exceeds liability $l$ by $F(l)$. In these terms, with $T$ denoting the liability threshold,
$$\delta_R(a) \approx F(T - a) - F(T) = \int_{T - a}^{T} f(l)\, dl. \tag{7}$$
Figure 2.
The mapping between liability and risk effects. For small effect sites, illustrated in yellow, the risk effect is approximately equal to the product of the liability effect and the threshold density. For large effect sites, this linear approximation is inaccurate, and the risk effect must be approximated in terms of the difference in the areas in the tails of the liability distribution.
We can therefore assume that the risk effect and selection coefficient do not vary with allele frequency and adjust our notation to $\delta_R(a)$ and $s(a)$, dropping the dependence on the allele frequency $x$.
The dependence of an allele’s risk effect on its liability effect can be divided into two cases (Figure 2). When the allele’s effect on liability $a$ is sufficiently small such that the corresponding shift to the liability distribution has a negligible effect on the probability density near the threshold $f(T)$, we can approximate the allele’s effect on risk by the area of the rectangle with width $a$ and height $f(T)$, i.e.,
$$\delta_R(a) \approx a\, f(T) \tag{8}$$
(Milkman 1978; Kimura and Crow 1978). An allele is small in this sense if its effect is substantially smaller than the width of the phenotypic distribution, i.e., $a \ll \sigma_P$, where $\sigma_P$ denotes the standard deviation of liability (see Supplement section 2.3 for further analysis).
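The two cases can be compared numerically. The sketch below assumes, purely for illustration, a Normal liability distribution with unit variance and a threshold placed so that prevalence is about 1%; it computes the risk effect as the tail area pushed over the threshold and as the rectangle approximation (width equal to the liability effect, height equal to the threshold density). The function names are hypothetical.

```python
import math

def normal_pdf(x, mean=0.0, var=1.0):
    """Normal probability density."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def normal_sf(x, mean=0.0, var=1.0):
    """Probability that a Normal variable exceeds x."""
    return 0.5 * math.erfc((x - mean) / math.sqrt(2.0 * var))

def risk_effect_area(a, threshold, mean=0.0, var=1.0):
    """Area pushed over the threshold when the distribution shifts up by a."""
    return normal_sf(threshold - a, mean, var) - normal_sf(threshold, mean, var)

def risk_effect_linear(a, threshold, mean=0.0, var=1.0):
    """Rectangle approximation: liability effect times threshold density."""
    return a * normal_pdf(threshold, mean, var)

# Assumed illustration: threshold 2.326 standard deviations above the mean,
# i.e., a prevalence of roughly 1%.
threshold = 2.326
for a in (0.001, 0.01, 0.1, 1.0):
    area = risk_effect_area(a, threshold)
    linear = risk_effect_linear(a, threshold)
    print(f"a = {a:<6} area = {area:.3e}  linear = {linear:.3e}  ratio = {area / linear:.3f}")
```

For effects much smaller than the standard deviation of liability the two agree closely, while for effects comparable to it the area grows faster than linearly, as illustrated in Figure 2.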
In turn, an allele is large if its effect on liability is comparable to or greater than the width of the phenotypic distribution, i.e., $a \gtrsim \sigma_P$, where $\sigma_P$ denotes the standard deviation of liability. When the effect of alleles on liability increases from small to large, the dependence of their risk effect on their liability effect becomes superlinear (Figure 2; but see Figure S1). In Supplement section 3.4, we show that for the kinds of diseases that we consider, which have a substantial fitness cost and are not exceedingly rare, alleles become strongly selected (i.e., $S \gg 1$, where $S$ is the scaled selection coefficient) before the superlinear dependence kicks in. This finding implies that we can divide alleles into three categories: small (in the sense of Eq. 8) and weakly selected (i.e., $S \lesssim 1$), small and strongly selected, and large and strongly selected.
Next, we consider the genetic architecture at mutation-selection-drift balance (MSDB). To this end, we only care about alleles’ selection coefficients (rather than their effects on liability). The insensitivity of alleles’ risk effect to their frequency allows us to treat their selection coefficients as constant, which, in turn, allows us to apply standard approximations to solve for quantities of interest throughout the range allele effect sizes.
The fixed state.
At MSDB, fixations are at detailed balance: for a given effect size, the rates of fixation of liability-increasing and -decreasing alleles at sites are equal, i.e., they balance each other out (Iwasa 1988; Sella and Hirsh 2005). The proportions of sites with effect fixed for the risk increasing and decreasing alleles, and respectively, therefore satisfy
| (9) |
where is the fixation probability of a mutation with selection coefficient and scaled selection coefficient that arises at frequency (Crow and Kimura 1970). We solve for the fixed state and represent the solution in terms of the bias toward risk-decreasing alleles
| (10) |
In these terms, the proportions of sites fixed for the risk-decreasing and risk-increasing alleles are $(1 + b)/2$ and $(1 - b)/2$, respectively, where $b$ denotes the bias. The expected bias satisfies $0 \le b \le 1$, because selection always favors the risk-decreasing allele.
The fixed state exhibits the three standard selection regimes (Figure 3A). When selection is extremely weak and drift dominates, sites are equally likely to be fixed for the risk-increasing and -decreasing alleles, i.e., $b \approx 0$. This effectively neutral regime occurs when $S \ll 1$, where $S$ is the scaled selection coefficient. At the other extreme, when selection is strong and dominates over drift, sites are always fixed for the risk-decreasing allele, i.e., $b \approx 1$. This strong selection regime occurs when $S \gg 1$. In the weak selection regime, when $S \sim 1$, the fixed state transitions between the effectively neutral and strongly selected extremes, with the bias increasing between 0 and 1 as selection becomes stronger.
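The detailed balance argument can be illustrated with a short calculation. The sketch below assumes Kimura's diffusion approximation for the fixation probability of a new mutation under genic selection in a diploid population; this choice, the function names, and the parameter values are our assumptions for illustration. It works directly with the population size and the selection coefficient rather than with a particular scaling of the selection coefficient.

```python
import math

def fixation_probability(s, n):
    """Kimura's diffusion approximation for a new mutation at frequency
    1/(2n) with heterozygote advantage s (s < 0 if deleterious)."""
    if s == 0.0:
        return 1.0 / (2.0 * n)
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)

def fixation_bias(s, n):
    """Bias toward risk-decreasing alleles at detailed balance.

    Balance of fixation fluxes gives
        rho_minus * pi(-s) = rho_plus * pi(+s),  rho_minus + rho_plus = 1,
    where the risk-decreasing allele has advantage s, so that
        bias = rho_minus - rho_plus = (pi(s) - pi(-s)) / (pi(s) + pi(-s)).
    """
    pi_dec = fixation_probability(s, n)   # risk-decreasing mutation fixes
    pi_inc = fixation_probability(-s, n)  # risk-increasing mutation fixes
    return (pi_dec - pi_inc) / (pi_dec + pi_inc)

# Assumed population size; selection coefficients spanning the three regimes.
n = 1000
for s in (2.5e-6, 2.5e-4, 1e-3, 1e-2):
    print(f"4Ns = {4 * n * s:<6g} bias = {fixation_bias(s, n):.4f}")
```

Under this assumption the bias is close to the hyperbolic tangent of 2Ns: near 0 when 4Ns is much smaller than 1, near 1 when it is much larger than 1, and intermediate in between.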
Figure 3.
The genetic architecture at MSDB. A) The fixation bias as a function of the scaled selection coefficient at a site. B) The site frequency spectrum. C) The expected genetic diversity as a function of the scaled selection coefficient.
With the fixed state biased toward risk-decreasing alleles (at all but effectively neutral sites), mutation is biased toward risk-increasing alleles. Like in the classic model of simple (Mendelian) diseases, this mutational asymmetry arises from the dynamics of the model rather than from assumptions about mutation (namely, we assumed symmetric mutation between risk-increasing and -decreasing alleles).
Segregating sites.
Figure 3B shows the frequency distribution of segregating, disease-increasing alleles at MSDB for several values of the population-scaled selection coefficient (see Supplement section 3.1 for derivation). With the fixed state biased toward alleles that decrease risk, derived, segregating alleles tend to increase risk. Even neutral derived alleles, let alone derived alleles that increase risk, segregate at lower frequencies than ancestral ones. These effects explain why risk-increasing alleles tend to segregate below frequency $1/2$, and why this bias is stronger when selection is stronger. The predicted asymmetry between the frequencies of alleles that increase and decrease risk can be tested using data from GWAS (see Discussion and Koch et al. 2024).
Next, we consider diversity levels. We can calculate the expected diversity levels using the diffusion approximation (Crow and Kimura 1970; Ewens 2004). A variant with allele frequency $x$ contributes $2x(1 - x)$ to heterozygosity. To calculate the expectation per site, we multiply the rate at which mutations arise by the expected total contribution of an individual mutation during its sojourn in the population. Namely,
| (11) |
with the sojourn time
| (12) |
defined such that is the expected number of generations that an allele with scaled selection coefficient spends between frequencies and . In this way, we find that the expected heterozygosity per site
| (13) |
where and is the fixed bias.
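Diversity levels also exhibit the three standard selection regimes (Fig. 3C). In the effectively neutral regime (i.e., when $S \ll 1$, with $S$ the scaled selection coefficient), heterozygosity is well approximated by the neutral expectation $4Nu$, where $N$ is the population size and $u$ the mutation rate per site. In the strong selection regime (i.e., $S \gg 1$), sites are always fixed for the risk-decreasing allele, implying that the fixation bias $b \approx 1$ and that the derived, risk-increasing allele segregates at low frequency, i.e., $x \ll 1$, so the expected heterozygosity per site is
$$E[H] \approx \frac{2u}{s}, \tag{14}$$
where $s$ is the selection coefficient, which aligns with the classic expectation under mutation-selection balance. In the weak selection regime (i.e., when $S \sim 1$), diversity levels transition between these two extremes.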
Contribution to variance.
The last facet of architecture that we consider here is the contribution to additive variance in liability. Estimates of this contribution are used to assess how much of the heritable variance in disease risk arises from variants of small and large effects (see Discussion). The total genetic variance in liability will become important when we consider disease prevalence.
We can calculate the expected contribution to variance based on the expected heterozygosity (Eq. 13). A variant with allele frequency $x$ and liability effect $a$ contributes $2a^2 x(1 - x)$ to additive variance in liability. Therefore, the expected contribution per site
| (15) |
where
| (16) |
is the contribution per unit diploid mutation rate.
We can calculate the total additive variance in liability by summing over sites and integrating over the distribution of effect sizes. Namely,
| (17) |
where is the number of sites and is the distribution of effect sizes. If we assume that all effect sizes are small, then (Eq. 8) and . In this case, the total variance is well approximated by
| (18) |
where is the mean effect size and is the mean fixation bias, with sites weighted by their effect sizes.
In Supplement section S4.3, we investigate how the contributions to liability- and risk-scale variance vary with variant effect sizes. We show that in our model, these variances do not exhibit the same kind of asymptotic ‘flattening’ found in models of stabilizing selection (see Discussion and Simons et al. 2018; O’Connor et al. 2019). We also describe how some simple summaries of asymmetry in the architecture at individual sites can be calculated (Supplement section S4.4; Figures S12 and S13). Like the summaries considered here, all these summaries can be described in terms of and .
The mapping between liability effects and selection coefficients
A phenotypic perspective on MSDB.
At MSDB, mutation is biased toward risk-increasing alleles, causing a mutational increase in mean genetic liability each generation; we denote it by $\Delta_M$. The mutational bias is balanced by an equal but opposing selection response that we denote by $\Delta_S$. Thus, at MSDB, $\Delta_M + \Delta_S = 0$.
Here we focus on the mutational bias. We first approximate the mutational bias based on the fixed state. In this approximation, all risk-increasing mutations occur at sites fixed for the risk-decreasing allele, and vice versa. As above, we denote the proportions of sites with effect $a$ fixed for the risk-increasing and -decreasing alleles at MSDB by $\rho_+(a)$ and $\rho_-(a)$, respectively. We then approximate the mutational bias as
$$\Delta_M \approx 2Lu \int a \left[ \rho_-(a) - \rho_+(a) \right] p(a)\, da, \tag{19}$$
where $L$ is the number of sites, $u$ is the mutation rate per site per gamete per generation, and $p(a)$ is the distribution of liability effects across sites. Recalling that the fixation bias $b(a) = \rho_-(a) - \rho_+(a)$, we find that
$$\Delta_M \approx 2Lu\, \bar{a}\, \bar{b}, \tag{20}$$
where $\bar{a} = \int a\, p(a)\, da$ is the mean effect size and $\bar{b} = \frac{1}{\bar{a}} \int a\, b(a)\, p(a)\, da$ is the mean fixation bias, weighted by effect sizes.
We can also express the mutational bias in terms of the mean genetic liability $\bar{G}$. In the fixed state approximation, the mean liability is
$$\bar{G} \approx 2L \int a\, \rho_+(a)\, p(a)\, da = L \bar{a} \left( 1 - \bar{b} \right), \tag{21}$$
where we have set the range of the genetic liability scale between 0 and $2L\bar{a}$, with low liability alleles contributing 0. From Eqs. 19–21, we find that
$$\Delta_M \approx 2u \left( L \bar{a} - \bar{G} \right). \tag{22}$$
In Supplement section 5, we show that this equation is exact when we relax the fixed state approximation and account for segregating genetic variation.
Under our modeling assumptions, the distance between the mean genetic liability and the liability threshold $T$ is tiny relative to the scale of possible genetic liability $2L\bar{a}$, i.e., $|T - \bar{G}| \ll 2L\bar{a}$. To understand why, we consider the case without an environmental contribution to liability. (An environmental contribution only reduces the distance between the mean genetic liability and the threshold, because it reduces the efficacy of selection on individual sites.) Without an environmental contribution, our assumption of a low mutation rate per site implies that the scale of variation in liability among individuals is much smaller than the scale of possible genetic liabilities (i.e., $\sqrt{V_G} \ll 2L\bar{a}$, where $V_G$ is the genetic variance in liability). In turn, our assumption that the disease prevalence is not exceedingly small requires the distance between the mean genetic liability and the threshold to be on the scale of the genetic variation (i.e., $T - \bar{G} \sim \sqrt{V_G}$). It therefore follows that $T - \bar{G} \ll 2L\bar{a}$.
This condition allows us to approximate the mutational bias in terms of the position of the threshold. Specifically, given that $\bar{G} \approx T$ on the scale of $L\bar{a}$, we find that
$$\Delta_M \approx 2Lu\, \bar{a}\, b_T, \tag{23}$$
where $b_T = 1 - T / (L\bar{a})$ measures the position of the threshold relative to the middle of the liability scale; we henceforth refer to it as the threshold bias. Eq. 23 shows that the threshold bias determines the mutational bias. Moreover, comparing Eqs. 20 and 23, we find that
$$\bar{b} \approx b_T. \tag{24}$$
Thus, the threshold bias also determines the mean fixed bias at MSDB, which reflects the strength of selection acting on individual sites (Eq. 10).
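The bookkeeping behind the phenotypic perspective can be made concrete with a few lines of arithmetic. The sketch below assumes the fixed state approximation described above, a liability scale running from 0 to twice the number of sites times the mean effect size, and a threshold bias defined as the offset of the threshold from the middle of that scale in units of half its range; the function names and numerical values are hypothetical.

```python
import math

def mutational_bias(u, n_sites, mean_effect, mean_fixation_bias):
    """Per-generation mutational increase in mean genetic liability in the
    fixed state approximation: 2 * L * u * mean effect * weighted mean bias."""
    return 2.0 * n_sites * u * mean_effect * mean_fixation_bias

def mean_liability_fixed_state(n_sites, mean_effect, mean_fixation_bias):
    """Mean genetic liability when a fraction (1 - bias)/2 of sites (weighted
    by effect size) is fixed for the liability-increasing allele."""
    return n_sites * mean_effect * (1.0 - mean_fixation_bias)

def threshold_bias(threshold, n_sites, mean_effect):
    """Position of the threshold relative to the middle of the liability scale."""
    return 1.0 - threshold / (n_sites * mean_effect)

# Assumed illustrative values.
u, n_sites, mean_effect, bias = 1e-8, 1_000_000, 1.0, 0.4
delta_m = mutational_bias(u, n_sites, mean_effect, bias)
g_bar = mean_liability_fixed_state(n_sites, mean_effect, bias)

# The mutational bias can equivalently be written in terms of mean liability.
assert math.isclose(delta_m, 2.0 * u * (n_sites * mean_effect - g_bar))

# If the threshold sits (almost) at the mean liability, the threshold bias
# recovers the mean fixation bias.
print(delta_m, g_bar, threshold_bias(g_bar, n_sites, mean_effect))
```

Because the distance between the mean liability and the threshold is negligible on this scale, placing the threshold at the mean liability in the last line is a harmless simplification for the purpose of the illustration.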
Selection at sites with small effects.
Equation 24 also tells us how the strength of selection on sites relates to their effects on liability, so long as these effects are small. To understand how, we first consider a simple case in which all sites have the same liability effect size and therefore the same scaled selection coefficient . When we express the fixation bias in terms of the scaled selection coefficient (Eq. 10), Eq. 24 becomes
| (25) |
Solving for the scaled selection coefficient, we find that
| (26) |
Thus, the position of the liability threshold determines the scaled selection coefficient at MSDB (Fig. 4A). Eq. 26 applies so long as the threshold is not extremely close to 0 (specifically, ), such that some sites are fixed for the liability-increasing allele. For this to be the case, selection cannot be too strong, implying that our small effect size approximation (Eq. 8) applies and that . Consequently, the threshold density at MSDB is
| (27) |
Figure 4.
The mapping between liability and fitness effects. A) The threshold position determines the scaled selection coefficients in the model with a single effect size. The solid line depicts the analytic approximation (Eq. 26) and the circles and crosses depict averages in simulations with specified fitness costs (see Model and Supplement section S8 for details). Scaled selection coefficients in simulations were estimated by computing the average of the risk effect, , over many sites and generations and multiplying by . B) Selection on small effect sites is insensitive to the cost of the disease. When the fitness cost, , increases, mean liability is pushed down to reduce the threshold density, , such that and thus scaled selection coefficients at small effect sites remain invariant. C) The mapping between liability and scaled selection effects at large effect sites. The results shown are based on numerically solving the model with two effect sizes (see Supplement section S7 for details), setting , and per site per generation, with fractions of small effect sites and of large effect sites, varying with , and using the specified values of and .
Thus, given the compound evolutionary parameter , the required strength of selection is attained by adjusting the population’s liability distribution, such that the threshold density satisfies Eq. 27.
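The single effect size case can be sketched numerically. The example below assumes that the fixation bias takes the form tanh(2Ns), as implied by Kimura's diffusion approximation for genic selection, and that the small effect approximation holds, so that the selection coefficient is the fitness cost times the liability effect times the threshold density. Both the functional form and the function names are assumptions made for illustration; the exact expressions in the paper may be parameterized differently.

```python
import math

def selection_coefficient_from_threshold_bias(b_t, n):
    """Selection coefficient s at which the fixation bias tanh(2 n s)
    equals the threshold bias b_t (requires 0 <= b_t < 1)."""
    if not 0.0 <= b_t < 1.0:
        raise ValueError("threshold bias must lie in [0, 1)")
    return math.atanh(b_t) / (2.0 * n)

def threshold_density(b_t, n, cost, effect):
    """Threshold density implied by s = cost * effect * density
    (small effect approximation)."""
    s = selection_coefficient_from_threshold_bias(b_t, n)
    return s / (cost * effect)

# Assumed illustrative values.
n, effect, b_t = 10_000, 1.0, 0.5
s = selection_coefficient_from_threshold_bias(b_t, n)
for cost in (0.05, 0.1, 0.2):
    f_t = threshold_density(b_t, n, cost, effect)
    print(f"cost = {cost:<5} s = {s:.3e}  threshold density = {f_t:.3e}")
```

In this sketch the selection coefficient depends only on the threshold bias and the population size, while the threshold density scales inversely with the fitness cost, in line with the insensitivity of selection on small effect sites to the cost of the disease (Fig. 4B).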
Next, we consider the general case with a distribution of effect sizes. In this case, when we express Eq. 24 in terms of scaled selection coefficients, we find that
| (28) |
If we assume the small effect approximation for , the equation becomes
| (29) |
We can solve this equation numerically using a line search for the threshold density given the distribution , and . Importantly, given the that solves Eq. 29, solves Eq. 28.
The solution for the threshold density divides sites into three kinds. On the high end of effect sizes, sites are strongly selected, with , and thus fixed for the liability-decreasing alleles; these sites all contribute to attaining . On the lower end of effect sizes, sites are effectively neutral, with , and thus equally likely to be fixed for the liability-increasing and -decreasing alleles; these sites all contribute to attaining . In between these ends, sites are weakly selected, with , and their fixed state is highly sensitive to the threshold density, with the fixed bias ranging between 0 and 1.
If the distribution of liability effects is highly concentrated around , the solution resembles the case with a single effect size. MSDB is attained by adjusting the threshold density such that most sites—with effects near —are weakly selected, and . In the other extreme, with liability effects thinly distributed over several orders of magnitude, only a small proportion of sites would fall in the intermediate, weakly selected range of effect sizes. In this case, the threshold density at MSDB divides the range of effect sizes such that the fixed bias from strongly selected sites matches the threshold bias, i.e., such that . In Supplementary section 6, we investigate how variation in the distribution of liability effects, , and relative position of the liability threshold, , would affect the solution of Eq. 29, assuming most sites have small effects.
Here we focus on what the solution of Eq. 29 tells us about the mapping between liability effects and selection coefficients. As we already know, the approximation breaks down when effect sizes are sufficiently large (see, e.g., Fig. 2). This does not affect the solution to Eq. 29 because selection at these sites is sufficiently strong (i.e., ) to maximize their fixed bias (i.e., ) regardless of the exact form relating their liability effect sizes and selection coefficients. This reasoning clarifies that Equation 29 is insensitive to, and uninformative about, the strength of selection acting on sites with sufficiently large effects.
In contrast, the strength of selection acting on sites with small effects follows from Equation 29. When effect sizes are sufficiently small, the strength of selection is well approximated by . As we already noted, for diseases with a substantial fitness cost that are not exceedingly rare, sites with such small effect liabilities range from being effectively neutral to being strongly selected. Selection on sites in this range is determined by the relative position of the liability threshold, , and the distribution of effects, , where the mapping between effect sizes and scaled selection coefficients is attained by adjusting the threshold density, .
The genetic architecture at sites with small effects.
Our results imply that scaled selection coefficients at sites with small effects are insensitive to the fitness cost, , environmental variance, , and population size, . Figure 4B illustrates the effects of an increase in fitness cost. In this case, the mean genetic liability is pushed farther below the threshold to reduce the threshold density , leaving unchanged, and maintaining the same mapping between liability effect sizes and scaled selection coefficients. An increase in population size or a decrease in environmental variance acts similarly, although they also affect the total variance in liability. Because the disease is highly polygenic, changes in mean genetic liability are achieved by tiny changes to the fixed state across sites, with negligibly small effects on the scaled selection coefficients.
As we described earlier, the architecture at sites depends only on and . For sites with small effects, the dependence on translates into a dependence on the threshold bias, , and the distribution of liability effects, . In turn, the architecture at sites with small effects is insensitive to the fitness cost of the disease and the environmental variance, while the population size affects only the total number of segregating sites, but not the distribution of their allele frequencies.
Selection on and genetic architecture of sites with large effects.
Large effect sites span a wide range of liability effect sizes, and the factors that determine the strength of selection acting on them vary within this range. At the lower end of this range, liability effect sizes are just above those of strongly selected, small effect sites. Near this boundary, scaled selection coefficients are still strongly affected by the threshold density , as well as by derivatives of the liability distribution at the threshold (see, e.g., Figure 2). We would therefore expect selection on such large effect sites to resemble selection on strongly selected, small effect sites in being affected by the relative position of the threshold, , and distribution of selection effects, .
At the other end, we have sites with effect sizes that are large enough for a single copy of the risk-increasing allele to almost always cause the disease (e.g., when ); these alleles are dominant, with full or nearly full penetrance. At this end, , and the expected frequency of the risk-increasing allele at a site is described by classic MSDB, with . Thus, in contrast to sites with large effect sizes at the lower end (and to sites with small effects), the selection acting on them and their genetic architecture are determined by the fitness cost of the disease and are insensitive to the relative position of the liability threshold, .
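The classic balance invoked here can be illustrated with a short numerical sketch. The expression in the text did not survive extraction, so the code below does not reproduce it. Instead, it iterates the textbook deterministic mutation-selection recursion for a dominant, fully penetrant risk allele (drift is ignored) and checks that the frequency settles near the familiar value u / s. The names `u` (mutation rate toward the risk allele) and `s` (selection against carriers, which the text ties to the fitness cost of the disease) and their values are illustrative assumptions, not the paper's parameters.

```python
def dominant_msdb_frequency(u, s, generations=2000):
    """Deterministic recursion for a dominant risk allele (drift ignored).

    u: per-generation mutation rate from the normal to the risk allele
       (illustrative assumption).
    s: fitness reduction of carriers, i.e. both genotypes with at least
       one copy of the risk allele (illustrative assumption).
    """
    q = 0.0  # start with the risk allele absent
    for _ in range(generations):
        p = 1.0 - q
        # Mean fitness: non-carriers have fitness 1, carriers have 1 - s.
        mean_fitness = p * p + (1.0 - s) * (2.0 * p * q + q * q)
        # Risk-allele frequency after selection; every copy sits in a carrier.
        q_sel = (1.0 - s) * q / mean_fitness
        # One-way mutation toward the risk allele.
        q = q_sel + u * (1.0 - q_sel)
    return q


u = 1e-8  # illustrative mutation rate, not taken from the text
s = 0.5   # illustrative selection against carriers
q_eq = dominant_msdb_frequency(u, s)
print(q_eq, u / s)
```

In this sketch the equilibrium frequency depends only on `u` and `s`; nothing about the liability threshold enters, which mirrors the insensitivity described above.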
Figure 4C shows the mapping between effect size and scaled selection coefficients for a large effect site, in a model with two effect sizes. When the large effect size increases, the dependence of the strength of selection on model parameters varies gradually between the two behaviors that we described.
Disease prevalence
Lastly, we consider the disease prevalence at MSDB. The prevalence is equal to the area under the tail of the liability distribution that lies beyond the threshold, so calculating it requires knowledge of the shape of this distribution. The liability distribution is often assumed to be Normal (see Discussion). This assumption seems sensible when the disease is highly polygenic and genetic contributions are small, such that an individual’s liability arises from many small effect genetic contributions and a normally distributed environmental contribution.
We begin by considering this case and assuming normality, which allows us to calculate the prevalence based on the density of the liability distribution at the threshold. Given the threshold density on the standard scale, , the prevalence is given by
| (30) |
where is the positively defined inverse of the standard Normal PDF, and is the standard Normal CDF. As we assume that the disease is not exceedingly common or exceedingly rare (e.g., ), the dependence on the threshold density is approximately linear, with
| (31) |
(Fig. S2).
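As a rough illustration of the calculation described for Eq. 30, the sketch below maps a threshold density on the standard scale to a prevalence under the normality assumption. It reflects our reading of the verbal description (invert the standard Normal PDF on its positive branch, then take the upper tail of the standard Normal CDF); the function names are hypothetical and the displayed equation itself is not reproduced here.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def normal_pdf(z):
    """Standard Normal density."""
    return math.exp(-0.5 * z * z) / SQRT_2PI


def threshold_from_density(f_std):
    """Positive z at which the standard Normal density equals f_std."""
    if not 0.0 < f_std <= normal_pdf(0.0):
        raise ValueError("density must lie in (0, 1/sqrt(2*pi)]")
    # Solve exp(-z^2 / 2) / sqrt(2*pi) = f_std for z >= 0; the max() guards
    # against a tiny negative argument caused by rounding near the mode.
    return math.sqrt(max(0.0, -2.0 * math.log(f_std * SQRT_2PI)))


def prevalence_from_density(f_std):
    """Upper-tail area beyond the threshold implied by f_std."""
    z = threshold_from_density(f_std)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Example: a threshold two standard deviations above the mean.
f_example = normal_pdf(2.0)
print(f_example, prevalence_from_density(f_example))
```

Taking the positive branch of the inverse restricts the prevalence to values below one half, in line with the focus on diseases that are not extremely common.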
We can rely on our small effect approximations to calculate the threshold density on the standard scale. The standardized threshold density
| (32) |
where is the variance in liability. Noting that in the small effect approximation , that the heritability in liability and relying on our approximation for (Eq. 18), we find that
| (33) |
Under our assumption that effect sizes are small, the mean scaled selection coefficient and thus the term are fully determined by the threshold bias, , and distribution of effect sizes, . In the single effect case, we can substitute the explicit expression for (Eq. 26) to find that the standard threshold density
| (34) |
Figure 5A shows how the prevalence in the single effect size model increases with for several possible values of the compound parameter . As an illustration, we consider parameter values typical of humans, i.e., and per site per generation, a heritability , and a cost . We vary the target size within the wide range estimated for complex, quantitative traits, e.g., (Simons et al. 2022). For these parameters, we find that a disease prevalence of 1% or greater at MSDB is attainable only if the target size or the threshold bias is relatively large.
Figure 5.
Disease prevalence at MSDB. A) Prevalence versus threshold bias in the model with a single effect size. Analytic results are based on Eqs. 30 and 34; for simulation details see Methods and Supplement section S8. The results correspond to setting per site per generation, , and , and varying the target size and threshold bias as indicated. B) Prevalence versus the fraction of sites with large effects in the model with two effect sizes. The model was solved as described in the text (see Supplement section S7 for details). The parameters are as in A, with and . See Figure S3 for other choices of and . C) The impact of large effect sites on the liability distribution. The two distributions shown correspond to the parameter values highlighted in panel B, with and without large effect sites. The dotted outline shows a Normal distribution with the same mean and variance as the Poisson convolution with large effects.
Next, we consider how sites with large effects impact prevalence. To this end, we study a model with two effect sizes: a fraction of sites have a small effect size and a fraction of sites have a large effect size , where . In this parametrization, the boundary case with and corresponds to the single effect model that we considered before, and changes in prevalence when we move away from this boundary by increasing reflect the effect of increasing the proportion of large effects.
If we plausibly assume that individuals carry only a few large effect risk-increasing alleles, then we no longer expect the liability distribution to be Normal. Variation among individuals in the number of these alleles introduces a fat tail of individuals with higher liabilities and thus a skewed liability distribution. In the case with two effect sizes, the number of large effect, risk-increasing alleles follows a Poisson distribution with mean (Felsenstein 1974), where we assume that the mean so that an individual carries at most a few of these alleles.
We therefore model the liability distribution at MSDB as arising from three components: (1) a large effect genetic liability distribution that arises from the Poisson distributed number of large effect risk-increasing alleles, (2) a normally distributed genetic liability arising from small effect sites, and (3) a normally distributed environmental liability. The liability of an individual in the population is randomly sampled from the contributions to liability arising from each of these components. The liability distribution is therefore the convolution of these three distributions.
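The three-component construction above can be sketched numerically. Because the two Normal components (small effect genetic and environmental) sum to a single Normal, the prevalence is a Poisson-weighted sum of Normal tail areas, one term per possible count of large effect risk-increasing alleles. This is a minimal sketch under stated assumptions, not the numerical procedure of Supplement section S7: all parameter names and values are hypothetical, and each large effect allele is assumed to add the same amount `a_large` to liability.

```python
import math


def normal_sf(x):
    """Upper-tail probability of the standard Normal."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def poisson_pmf(k, lam):
    """Probability of k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def mixture_prevalence(threshold, mean_small, var_small, var_env,
                       a_large, lam, k_max=50):
    """Prevalence when liability = Normal (small effects + environment)
    plus a_large times a Poisson(lam) count of large effect alleles.

    All arguments are hypothetical illustration parameters.
    """
    sd = math.sqrt(var_small + var_env)  # the two Normal parts combine
    total = 0.0
    for k in range(k_max + 1):
        # Carrying k large effect alleles shifts liability up by k * a_large.
        z = (threshold - mean_small - k * a_large) / sd
        total += poisson_pmf(k, lam) * normal_sf(z)
    return total


# Illustrative values only: unit Normal variance, threshold 3 SD above the mean.
p_without = mixture_prevalence(3.0, 0.0, 0.5, 0.5, a_large=3.0, lam=0.0)
p_with = mixture_prevalence(3.0, 0.0, 0.5, 0.5, a_large=3.0, lam=0.01)
print(p_without, p_with)
```

With the threshold and Normal component held fixed, as in this example, adding the Poisson component can only raise the prevalence. In the full model the threshold density also shifts, which is why the net effect on prevalence can go either way.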
We solve this model numerically (for details see Supplement section S7). We assume that all large effect sites are fixed for the low liability allele and derive the threshold density, , by requiring that the fixed bias from the small and large effect sites combined matches the threshold position, . Given the threshold density, we solve for the liability distribution arising from small effect sites. We then solve for the selection coefficient of large effect sites, , by requiring that the convolution of the Normal and Poisson components of the liability distribution match the threshold density, .
Figure 5B shows the prevalence as a function of the proportion of large effect sites. Here, we set and but in Figure S3, we explore other choices and obtain similar qualitative results. All other model parameters are set to the same values that we used for the case with a single effect size (as in Fig 5A). We vary the proportion of large effect sites between , corresponding to the case without large effects, and such that . We validated our numerical solution against simulations (Fig. S7).
Increasing the proportion of large effect sites affects the liability distribution in three ways (Fig. 5C). First, it increases the variance in liability, because large effect sites contribute much more variance than small effect sites (Supplement section S4.3). Second, it introduces a right skew in liability due to variation in the number of large effect risk-increasing alleles. Third, it decreases the threshold density , because, with large effect sites all fixed for the risk-decreasing alleles, the fixed bias and thus the selection at small effect sites is weaker. We would expect the first two effects to increase the prevalence and the third one to decrease it. For the parameter values that we consider in Fig. 5B, the combination of these effects leads to a reduction in prevalence when the fraction of large effect sites increases. In Figs. S3–S6, we consider how the prevalence changes with increasing fraction of large effect sites for other choices of and . In general, we find that when is closer to zero, increasing the fraction of large effect sites tends to lead to a decrease in prevalence, whereas when it is closer to one, it tends to lead to an increase.
Discussion
We introduced an evolutionary model of complex disease susceptibility, in which a variant’s effect on disease risk follows from the liability threshold model, and a variant’s effect on fitness follows from its effect on disease risk. The model can be viewed as a generalization of the classic MSDB model of Mendelian diseases and is also closely related to models used to study genetic load (see Introduction). We solved the model for the genetic architecture and prevalence of the disease at MSDB.
Mutation-selection-drift balance in this model can be understood as a ‘mean field’ equilibrium. Selection on sites with a given effect on liability is determined by the phenotypic distribution (the ‘field’), and the phenotypic distribution arises from the aggregate behavior at all sites (the ‘mean’). The density of the phenotypic distribution at the liability threshold is determined by matching the population’s fixed state with the position of the threshold on the liability scale (given the distribution of liability effect sizes). The threshold density divides liability effect sizes into ‘small’ and ‘large’, which differ in their mapping onto the effects on disease risk and fitness. For sites with small effects on the liability scale, the effects on disease risk and fitness are proportional to their liability effects and to the threshold density. For sites with large effects, the effects on disease risk and fitness depend on non-linear cumulants of the liability distribution and the fitness cost of the disease. The selection acting on sites shapes their genetic architecture, which together with environmental effects on liability, determines disease prevalence. We mapped out these relationships and their main implications for observable quantities and solved the model explicitly for simple distributions of effect sizes.
With these predictions in hand, we can ask whether our model fits what is known about the genetic architecture of complex disease susceptibility in humans. Figure 6 shows examples of the “smile” architecture of GWAS hits typical of complex diseases (Koch et al. 2024). These hits were ascertained in GWAS based on genotyping and imputation and therefore include common variants under weak and moderate selection but not variants under very strong selection. The “smile” architecture reflects selection in that variants with larger risk effects segregate at low minor allele frequencies (i.e., risk allele frequencies near 0 or 1). This is not the signature of selection expected under our model. Instead, we predict that if selection acted on variants due to their effects on disease risk, the architecture should be asymmetric, with a depletion of major alleles increasing risk, rather than (approximately) symmetric between minor and major risk-increasing alleles (compare Figs. 3B and 6).
Figure 6.
The “smile” architecture of complex diseases in humans. Points correspond to approximately independent genome-wide significant hits. The data were taken from Dönertaş et al. (2021), Trubetskoy et al. (2022), Liu et al. (2015), and Spracklen et al. (2020). See Supplement section S9 for data processing details.
Our results suggest that selection on variants due to their effects on disease risk should be stronger for diseases with greater fitness costs and higher prevalences (see, e.g., Eqs. 5 and 34). The departure from our predictions might seem all the more surprising then, given that complex diseases often entail substantial fitness costs and are quite common in contemporary human populations. Schizophrenia, for example, has a prevalence of 0.5–1% (Jablensky 2000; Saha et al. 2005; Chan et al. 2015; Simeone et al. 2015), affects several fitness components and has been estimated to reduce fertility by up to 75% (Haukka, Suvisaari, and Lönnqvist 2003; Laursen and Munk-Olsen 2010; Power et al. 2013). Other diseases, such as type 2 diabetes, with a global prevalence of 6% (Ong et al. 2023) and multiple sclerosis, with a prevalence of ~0.3% in the United States (Nelson et al. 2019), are associated with a substantial increase in mortality rates and are plausibly associated with substantial reductions in fitness (Scalfari et al. 2013; Graves et al. 2023; Emerging Risk Factors Collaboration 2023). How then might we make sense of the fact that the architecture of common variation affecting complex disease susceptibility does not reflect selection against these diseases?
One possibility is that complex diseases that are common in contemporary human populations substantially increased in prevalence with very recent changes in environment. Examples plausibly include type 2 diabetes and other diseases associated with obesity (Dai et al. 2020; Teng et al. 2022; Ong et al. 2023), asthma (Eder, Ege, and Mutius 2006), and autism (Atladóttir et al. 2007; Weintraub 2011; Hansen, Schendel, and Parner 2015). More generally, we know little about the fitness cost and prevalence of human diseases more than a century back.
Persistent selection over molecular evolutionary time scales would be needed to attain the fixed bias that shapes the genetic architecture at MSDB in our model. These time scales vary with the scaled selection coefficients affecting sites. As an illustration, given the contemporary mutation rate in humans and assuming that scaled selection coefficients remain constant, sites with a scaled selection coefficient would take on the order of a million generations to approach MSDB (see, e.g., Supplement section 2.2 in Simons et al. 2014); this roughly corresponds to 30 million years, extending back to the common ancestor of humans and Old World monkeys. With , it would take on the order of 40 million generations. Disease biology and genetics have plausibly changed substantially over such time scales. We cannot rule out there being subtle footprints of asymmetry between risk-increasing and -decreasing alleles owing to fitness effects of complex diseases over some evolutionary timescales (see below). Nonetheless, such fitness effects do not explain the “smile” architecture of common variation affecting complex diseases.
What could explain the “smile” architecture is if common variants affecting disease risk were selected on because of their pleiotropic effects on myriad quantitative traits that have been subject to stabilizing selection over long evolutionary time scales (see also Koch et al. 2024). The negative relationship between the effect sizes of the variants and minor allele frequencies would arise if the effects on diseases today are positively correlated with their effects on quantitative traits that were under stabilizing selection over the molecular evolutionary timescales that shape architecture at MSDB. The approximate symmetry between risk-increasing and -decreasing alleles would be expected if mutations with small and moderate fitness effects were (approximately) equally likely to increase or decrease disease risk (because new mutations are always selected against under long-term stabilizing selection). Together these features would generate the “smile” architecture observed for many complex diseases.
This scenario seems plausible. The effects of most variants on complex traits are plausibly mediated by perturbations to the expression of genes in cellular, life history and other contexts in which expression is held close to an optimal, nonzero value by stabilizing selection. Indeed, most heritable variance in complex traits arises from common regulatory variants (Yang et al. 2010; 2011; Finucane et al. 2015) in regions with open chromatin in the cellular contexts that affect these traits (Boyle, Li, and Pritchard 2017; Sinnott-Armstrong et al. 2021; Spence et al. 2024). Moreover, there is an a priori expectation that genes that are expressed in a given context would have some nonzero, optimal expression level, and evidence that selection generally acts against eQTLs (Mostafavi et al. 2023). In this genic perspective, larger perturbations to expression would have greater effects on traits and would be more strongly selected against (see, e.g., Conrad et al. 2006; Glassberg et al. 2019; Zeng et al. 2023; Mostafavi et al. 2023), leading to a negative relationship between effect sizes and minor allele frequencies. The approximate symmetry between trait-increasing and -decreasing alleles would arise if weakly and moderately selected perturbations to gene expression were approximately equally likely to increase or decrease gene expression.
The kind of pleiotropic stabilizing selection that would generate the “smile” architecture has been shown to explain key features of the genetic architecture observed in GWAS of highly polygenic quantitative traits. A simple model (with few parameters) of direct and pleiotropic stabilizing selection was shown to fit the joint distribution of frequencies and effect sizes of GWAS hits for highly polygenic quantitative traits in the UKB (Simons et al. 2022). A single parameter in the model describes the coupling between the effects of the variants on the trait and on fitness. The functional form of this relationship arises from assuming that genetic variation in the trait is highly pleiotropic (Simons et al. 2018). As in the genic case described above, this functional form associates larger effects on fitness with larger effects on a trait, and mutations affecting the trait are assumed to be equally likely to increase or decrease it, giving rise to a “smile” architecture. Additionally, the high polygenicity of complex diseases and quantitative traits that GWAS revealed has been partially attributed to ‘flattening’—whereby variants whose effects on a trait exceed a small threshold value have similar expected (asymptotic) contributions to variance in the trait (O’Connor et al. 2019). Such flattening arises under direct and pleiotropic stabilizing selection (Simons et al. 2018) but does not arise under the kind of directional selection modeled here (Supplementary section S4.2).
We would expect there to be considerable overlap between common variation affecting complex quantitative traits and complex diseases and consequently in the selection pressures that shape their genetic architecture. From a genic perspective, variation in gene expression in myriad contexts plausibly affects both. From the other end, quantitative traits like body mass index have been estimated to have mutational target sizes that exceed half of the fraction of the genome that has been estimated to be functional (Simons et al. 2022), and the high polygenicity of diseases like schizophrenia indicates similarly large target sizes (Loh et al. 2015). While the highly pleiotropic stabilizing selection model explains key observations about the genetic architecture of both complex quantitative traits and diseases, we cannot rule out there being alternative explanations for these observations. For example, a highly pleiotropic model of directional selection on traits in which the effects of variants on different traits are uncorrelated could potentially explain current observations (and could also be viewed as “apparent” stabilizing selection; Barton 1990; A. S. Kondrashov and Turelli 1992). Moreover, as we already noted, while the kind of directional selection we modeled falls short of explaining salient features of the architecture of common variation affecting complex disease susceptibility, we cannot rule out that it does have some effects on architecture.
Notably, the predictions of our model seem to be better aligned with the genetic architecture of rare variants with large effects on disease risk. These variants are too rare to be discovered in GWAS based on genotyping and imputation, and were therefore discovered using other study designs, including association and burden tests based on whole-exome sequencing (Singh et al. 2022; Palmer et al. 2022) or whole-genome sequencing in quartet families (Satterstrom et al. 2020). In these studies, rare, large effect alleles are generally found to increase disease risk as our model predicts. A caveat is that these studies have substantially greater power to identify rare risk-increasing alleles than rare risk-decreasing ones, so the observed asymmetry could also reflect an ascertainment bias. Nevertheless, the variants discovered in this way are often LoF or copy number variants that appear to be under strong purifying selection, suggesting that the asymmetry is real.
This “large-effect” mode of architecture has been found for several common, complex diseases, notably autism spectrum disorder and schizophrenia (Satterstrom et al. 2020; Singh et al. 2022). Rare, strongly deleterious alleles are much younger than the common variation identified in GWAS and are therefore more likely to reflect selection on contemporary diseases. Purifying selection on these alleles could also reflect pleiotropic selection on other traits. However, alleles that are more specific in their effect on a given complex disease are expected to contribute more to heritable variance in that disease and are therefore also more likely to be identified in mapping studies (Spence et al. 2024).
The source of selection notwithstanding, a complex disease architecture that includes both a strongly selected “large effect” mode and a weakly selected “polygenic” mode should resemble the one that we modeled in having a fat-tailed liability distribution. Importantly, this architecture violates the normality often assumed in inference and theory (see, e.g., Dempster and Lerner 1950; Falconer 1965). These departures from normality plausibly bias current estimates of the heritability of complex disease.
In particular, they would bias estimates of the proportional contributions of small and large effect variants. Total liability-scale heritability is estimated from the correlations among relatives in disease state assuming: (i) that the liability distribution is Normal in order to derive the threshold density from the prevalence (using our Eq. 30), and (ii) that variant effect sizes are sufficiently small for liability effects to equal the ratio of risk effects and threshold density (using our Eq. 8; Dempster and Lerner 1950; Falconer 1965). The contribution of small effect variants to liability-scale heritability is estimated from GWAS based on the same assumptions (Hong Lee et al. 2011). When large effect variants contribute substantially to heritability, the departure from these assumptions results in two main biases. First, the contribution of small effect variants to the liability-scale heritability is underestimated (based on either GWAS or correlations among relatives) because assuming normality when the tail is fatter leads to an overestimate of the threshold density. Second, the relative contribution of large effect variants to the total liability-scale heritability is overestimated because the linear transformation between risk and liability effects overestimates their contribution. Consequently, current estimates of the proportional contribution of large effect variants are plausibly inflated. The magnitude of these biases depends on the departures from normality, which are unknown. These biases warrant further investigation.
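The direction of the first bias can be checked with a small numerical sketch. It builds a fat-tailed liability distribution (a unit-variance Normal plus a Poisson-distributed number of large effect alleles), computes its true prevalence and its true threshold density on the standard scale, and compares the latter with the density a Normal model would infer from the same prevalence. All parameter values and names are hypothetical, and the comparison holds for this one illustrative parameter choice; it is not a general proof.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def poisson_pmf(k, lam):
    """Probability of k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def mixture_tail_and_density(threshold, a_large, lam, k_max=50):
    """Tail area and density at the threshold for liability equal to a
    standard Normal plus a_large times a Poisson(lam) count."""
    tail = 0.0
    density = 0.0
    for k in range(k_max + 1):
        weight = poisson_pmf(k, lam)
        x = threshold - k * a_large  # distance left to the threshold
        tail += weight * (1.0 - STD_NORMAL.cdf(x))
        density += weight * STD_NORMAL.pdf(x)
    return tail, density


# Hypothetical illustration values.
threshold, a_large, lam = 3.0, 3.0, 0.01
prevalence, density = mixture_tail_and_density(threshold, a_large, lam)

# Total variance: 1 from the Normal part plus a_large^2 * lam from the
# Poisson part; multiplying by the SD puts the density on the standard scale.
total_sd = math.sqrt(1.0 + lam * a_large ** 2)
true_std_density = density * total_sd

# Density a Normal model would infer from the observed prevalence.
normal_implied_density = STD_NORMAL.pdf(STD_NORMAL.inv_cdf(1.0 - prevalence))

print(prevalence, true_std_density, normal_implied_density)
```

For these values the Normal-implied density exceeds the true standardized density, consistent with the overestimate of the threshold density described above.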
In summary, there is compelling evidence that heritable variation in human complex traits, including in complex disease risk, evolves under mutation-selection-drift balance (Sella and Barton 2019). Evidence from human GWAS and other study designs suggests that, at least for common genetic variation, the mode of selection in this balance is predominantly pleiotropic stabilizing selection (Simons et al. 2018; 2022; Koch et al. 2024; Spence et al. 2024). Rare genetic variation affecting complex disease susceptibility might also be shaped by directional selection of the kind we modeled here, and other modes of selection, such as balancing selection, doubtless contribute, but appear comparatively minor. As this work illustrates, by contrasting the predictions of evolutionary models with empirical findings, we can learn about the nature of selection affecting heritable variation in complex traits, and, more generally, about the evolutionary processes that shape inter-individual differences.
Supplementary Material
Acknowledgements.
We thank Nick Barton, Magnus Nordborg, Molly Przeworski, and Himani Sachdeva for many helpful discussions and for comments on the manuscript. We also thank members of the Sella, Przeworski and Andolfatto labs at Columbia University, and the Berg, Novembre and Steinrücken labs at the University of Chicago, for feedback on the work at various stages. This work was supported by NIH F32 grant GM126787 and R35 grant GM151257 to JJB and NIH R01 grant GM115889 to GS.
Literature Cited
- Abdellaoui Abdel, Yengo Loic, Verweij Karin J.H., and Visscher Peter M.. 2023. “15 Years of GWAS Discovery: Realizing the Promise.” The American Journal of Human Genetics 110 (2): 179–94. 10.1016/j.ajhg.2022.12.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alon Uri. 2023. Systems Medicine. 1st Edition. Chapman and Hall/CRC. [Google Scholar]
- Amorim Carlos Eduardo G., Gao Ziyue, Baker Zachary, José Francisco Diesel Yuval B. Simons, Haque Imran S., Pickrell Joseph, and Przeworski Molly. 2017. “The Population Genetics of Human Disease: The Case of Recessive, Lethal Mutations.” PLOS Genetics 13 (9): e1006915. 10.1371/journal.pgen.1006915. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Atladóttir Hjördís Ósk, Parner Erik T., Schendel Diana, Dalsgaard Søren, Thomsen Per Hove, and Thorsen Poul. 2007. “Time Trends in Reported Diagnoses of Childhood Neuropsychiatric Disorders: A Danish Cohort Study.” Archives of Pediatrics & Adolescent Medicine 161 (2): 193–98. 10.1001/archpedi.161.2.193.
- Barton N H. 1990. “Pleiotropic Models of Quantitative Variation.” Genetics 124 (3): 773–82. 10.1093/genetics/124.3.773.
- Boyle Evan A., Li Yang I., and Pritchard Jonathan K. 2017. “An Expanded View of Complex Traits: From Polygenic to Omnigenic.” Cell 169 (7): 1177–86. 10.1016/j.cell.2017.05.038.
- Chan Kit Yee, Zhao Fei-fei, Meng Shijiao, Demaio Alessandro R., Reed Craig, Theodoratou Evropi, Campbell Harry, Wang Wei, Rudan Igor, and Global Health Epidemiology Reference Group (GHERG). 2015. “Prevalence of Schizophrenia in China between 1990 and 2010.” Journal of Global Health 5 (1): 010410. 10.7189/jogh.05.010410.
- Charlesworth Brian. 1990. “Mutation-Selection Balance and the Evolutionary Advantage of Sex and Recombination.” Genetics Research 55 (3): 199–221. 10.1017/S0016672300025532.
- ———. 2013a. “Stabilizing Selection, Purifying Selection, and Mutational Bias in Finite Populations.” Genetics 194 (4): 955–71. 10.1534/genetics.113.151555.
- ———. 2013b. “Why We Are Not Dead One Hundred Times Over.” Evolution 67 (11): 3354–61. 10.1111/evo.12195.
- Clark Andrew G. 1998. “Mutation-Selection Balance with Multiple Alleles.” In Mutation and Evolution, edited by Woodruff Ronny C. and Thompson James N., 41–47. Dordrecht: Springer Netherlands. 10.1007/978-94-011-5210-5_4.
- Conrad Donald F., Andrews T. Daniel, Carter Nigel P., Hurles Matthew E., and Pritchard Jonathan K. 2006. “A High-Resolution Survey of Deletion Polymorphism in the Human Genome.” Nature Genetics 38 (1): 75–81. 10.1038/ng1697.
- Crow James F. 1958. “Some Possibilities for Measuring Selection Intensities in Man.” Human Biology 30 (1): 1–13.
- Crow James F., and Kimura Motoo. 1970. An Introduction to Population Genetics Theory. The Blackburn Press.
- Dai Haijiang, Alsalhe Tariq A., Chalghaf Nasr, Riccò Matteo, Bragazzi Nicola Luigi, and Wu Jianhong. 2020. “The Global Burden of Disease Attributable to High Body Mass Index in 195 Countries and Territories, 1990–2017: An Analysis of the Global Burden of Disease Study.” PLOS Medicine 17 (7): e1003198. 10.1371/journal.pmed.1003198.
- Dempster Everett R., and Lerner I. Michael. 1950. “Heritability of Threshold Characters.” Genetics 35 (2): 212–36.
- Dönertaş Handan Melike, Fabian Daniel K., Fuentealba Matías, Partridge Linda, and Thornton Janet M. 2021. “Common Genetic Associations between Age-Related Diseases.” Nature Aging 1 (4): 400–412. 10.1038/s43587-021-00051-5.
- Eder Waltraud, Ege Markus J., and von Mutius Erika. 2006. “The Asthma Epidemic.” New England Journal of Medicine 355 (21): 2226–35. 10.1056/NEJMra054308.
- Emerging Risk Factors Collaboration. 2023. “Life Expectancy Associated with Different Ages at Diagnosis of Type 2 Diabetes in High-Income Countries: 23 Million Person-Years of Observation.” The Lancet. Diabetes & Endocrinology 11 (10): 731–42. 10.1016/S2213-8587(23)00223-1.
- Ewens Warren. 2004. Mathematical Population Genetics 1: Theoretical Introduction. Springer.
- Falconer D. S. 1965. “The Inheritance of Liability to Certain Diseases, Estimated from the Incidence among Relatives.” Annals of Human Genetics 29 (1): 51–76. 10.1111/j.1469-1809.1965.tb00500.x.
- Falconer D. S., and Mackay Trudy. 1995. Introduction to Quantitative Genetics. Longman.
- Felsenstein Joseph. 1974. “The Evolutionary Advantage of Recombination.” Genetics 78 (2): 737–56. 10.1093/genetics/78.2.737.
- Finucane Hilary K., Bulik-Sullivan Brendan, Gusev Alexander, Trynka Gosia, Reshef Yakir, Loh Po-Ru, Anttila Verneri, et al. 2015. “Partitioning Heritability by Functional Annotation Using Genome-Wide Association Summary Statistics.” Nature Genetics 47 (11): 1228–35. 10.1038/ng.3404.
- Fuller Zachary L., Berg Jeremy J., Mostafavi Hakhamanesh, Sella Guy, and Przeworski Molly. 2019. “Measuring Intolerance to Mutation in Human Genetics.” Nature Genetics 51 (5): 772–76. 10.1038/s41588-019-0383-1.
- Gillespie John. 2004. Population Genetics: A Concise Guide.
- Glassberg Emily C, Gao Ziyue, Harpak Arbel, Lan Xun, and Pritchard Jonathan K. 2019. “Evidence for Weak Selective Constraint on Human Gene Expression.” Genetics 211 (2): 757–72. 10.1534/genetics.118.301833.
- Graves Jennifer S., Krysko Kristen M., Hua Le H., Absinta Martina, Franklin Robin J. M., and Segal Benjamin M. 2023. “Ageing and Multiple Sclerosis.” The Lancet Neurology 22 (1): 66–77. 10.1016/S1474-4422(22)00184-3.
- Haldane J. B. S. 1927. “A Mathematical Theory of Natural and Artificial Selection, Part V: Selection and Mutation.” Mathematical Proceedings of the Cambridge Philosophical Society 23 (7): 838–44. 10.1017/S0305004100015644.
- ———. 1937. “The Effect of Variation of Fitness.” The American Naturalist 71 (735): 337–49. 10.1086/280722.
- Haller Benjamin C, and Messer Philipp W. 2019. “SLiM 3: Forward Genetic Simulations Beyond the Wright-Fisher Model.” Molecular Biology and Evolution 36 (3): 632–37. 10.1093/molbev/msy228.
- Hansen Stefan N., Schendel Diana E., and Parner Erik T. 2015. “Explaining the Increase in the Prevalence of Autism Spectrum Disorders: The Proportion Attributable to Changes in Reporting Practices.” JAMA Pediatrics 169 (1): 56–62. 10.1001/jamapediatrics.2014.1893.
- Harpak Arbel, Bhaskar Anand, and Pritchard Jonathan K. 2016. “Mutation Rate Variation Is a Primary Determinant of the Distribution of Allele Frequencies in Humans.” PLOS Genetics 12 (12): e1006489. 10.1371/journal.pgen.1006489.
- Haukka Jari, Suvisaari Jaana, and Lönnqvist Jouko. 2003. “Fertility of Patients With Schizophrenia, Their Siblings, and the General Population: A Cohort Study From 1950 to 1959 in Finland.” American Journal of Psychiatry 160 (3): 460–63. 10.1176/appi.ajp.160.3.460.
- Lee Sang Hong, Wray Naomi R., Goddard Michael E., and Visscher Peter M. 2011. “Estimating Missing Heritability for Disease from Genome-Wide Association Studies.” The American Journal of Human Genetics 88 (3): 294–305. 10.1016/j.ajhg.2011.02.002.
- Iwasa Yoh. 1988. “Free Fitness That Always Increases in Evolution.” Journal of Theoretical Biology 135 (3): 265–81. 10.1016/S0022-5193(88)80243-1.
- Jablensky Assen. 2000. “Epidemiology of Schizophrenia: The Global Burden of Disease and Disability.” European Archives of Psychiatry and Clinical Neuroscience 250 (6): 274–85. 10.1007/s004060070002.
- Jobling Mark, Hollox Edward, Hurles Matthew, Kivisild Toomas, and Tyler-Smith Chris. 2013. Human Evolutionary Genetics. 2nd Edition. New York: Garland Science.
- Keightley Peter D., and Hill William G. 1988. “Quantitative Genetic Variability Maintained by Mutation-Stabilizing Selection Balance in Finite Populations.” Genetics Research 52 (1): 33–43. 10.1017/S0016672300027282.
- Kimura Motoo. 1961. “Some Calculations on the Mutational Load.” Japanese Journal of Genetics 36 (Suppl.): 179–90.
- Kimura Motoo, and Crow James F. 1978. “Effect of Overall Phenotypic Selection on Genetic Change at Individual Loci.” Proceedings of the National Academy of Sciences 75 (12): 6168–71. 10.1073/pnas.75.12.6168.
- Kimura Motoo, and Maruyama T. 1966. “The Mutational Load with Epistatic Gene Interactions in Fitness.” Genetics 54 (6): 1337–51.
- Kimura Motoo, Maruyama Takeo, and Crow James F. 1963. “The Mutation Load in Small Populations.” Genetics 48 (10): 1303–12.
- King Jack Lester. 1966. “The Gene Interaction Component of the Genetic Load.” Genetics 53 (3): 403. 10.1093/genetics/53.3.403.
- Koch E., Connally N. J., Baya N., Reeve M. P., Daly M., Neale B., Lander E. S., Bloemendal A., and Sunyaev S. 2024. “Genetic Association Data Are Broadly Consistent with Stabilizing Selection Shaping Human Common Diseases and Traits,” July, 2024.06.19.599789. 10.1101/2024.06.19.599789.
- Kondrashov A. S., and Turelli M. 1992. “Deleterious Mutations, Apparent Stabilizing Selection and the Maintenance of Quantitative Variation.” Genetics 132 (2): 603–18.
- Kondrashov Alexey S. 1982. “Selection against Harmful Mutations in Large Sexual and Asexual Populations.” Genetics Research 40 (3): 325–32. 10.1017/S0016672300019194.
- ———. 1984. “Deleterious Mutations as an Evolutionary Factor: 1. The Advantage of Recombination.” Genetics Research 44 (2): 199–217. 10.1017/S0016672300026392.
- ———. 1988. “Deleterious Mutations and the Evolution of Sexual Reproduction.” Nature 336 (6198): 435–40. 10.1038/336435a0.
- ———. 1995. “Contamination of the Genome by Very Slightly Deleterious Mutations: Why Have We Not Died 100 Times Over?” Journal of Theoretical Biology 175 (4): 583–94. 10.1006/jtbi.1995.0167.
- ———. 2018. “Through Sex, Nature Is Telling Us Something Important.” Trends in Genetics 34 (5): 352–61. 10.1016/j.tig.2018.01.003.
- Lande Russell. 1975. “The Maintenance of Genetic Variability by Mutation in a Polygenic Character with Linked Loci.” Genetics Research 26 (3): 221–35. 10.1017/S0016672300016037.
- Laursen T. M., and Munk-Olsen T. 2010. “Reproductive Patterns in Psychotic Patients.” Schizophrenia Research 121 (1): 234–40. 10.1016/j.schres.2010.05.018.
- Liu Jimmy Z., van Sommeren Suzanne, Huang Hailiang, Ng Siew C., Alberts Rudi, Takahashi Atsushi, Ripke Stephan, et al. 2015. “Association Analyses Identify 38 Susceptibility Loci for Inflammatory Bowel Disease and Highlight Shared Genetic Risk across Populations.” Nature Genetics 47 (9): 979–86. 10.1038/ng.3359.
- Loh Po-Ru, Bhatia Gaurav, Gusev Alexander, Finucane Hilary K., Bulik-Sullivan Brendan K., Pollack Samuela J., de Candia Teresa R., et al. 2015. “Contrasting Genetic Architectures of Schizophrenia and Other Complex Diseases Using Fast Variance-Components Analysis.” Nature Genetics 47 (12): 1385–92. 10.1038/ng.3431.
- Lush J. L., Lamoreux W. F., and Hazel L. N. 1948. “The Heritability of Resistance to Death in the Fowl.” Poultry Science 27 (4): 375–88. 10.3382/ps.0270375.
- Milkman Roger. 1978. “Selection Differentials and Selection Coefficients.” Genetics 88 (2): 391–403. 10.1093/genetics/88.2.391.
- Mostafavi Hakhamanesh, Spence Jeffrey P., Naqvi Sahin, and Pritchard Jonathan K. 2023. “Systematic Differences in Discovery of Genetic Effects on Gene Expression and Complex Traits.” Nature Genetics 55 (11): 1866–75. 10.1038/s41588-023-01529-1.
- Muller H. J. 1950. “Our Load of Mutations.” American Journal of Human Genetics 2 (2): 111.
- Nelson Lorene M., Wallin Mitchell T., Marrie Ruth Ann, Culpepper W.J., Langer-Gould Annette, Campbell Jon, Buka Stephen, et al. 2019. “A New Way to Estimate Neurologic Disease Prevalence in the United States.” Neurology 92 (10): 469–80. 10.1212/WNL.0000000000007044.
- O’Connor Luke J., Schoech Armin P., Hormozdiari Farhad, Gazal Steven, Patterson Nick, and Price Alkes L. 2019. “Extreme Polygenicity of Complex Traits Is Explained by Negative Selection.” The American Journal of Human Genetics 105 (3): 456–76. 10.1016/j.ajhg.2019.07.003.
- Ong Kanyin Liane, Stafford Lauryn K., McLaughlin Susan A., Boyko Edward J., Vollset Stein Emil, Smith Amanda E., Dalton Bronte E., et al. 2023. “Global, Regional, and National Burden of Diabetes from 1990 to 2021, with Projections of Prevalence to 2050: A Systematic Analysis for the Global Burden of Disease Study 2021.” The Lancet 402 (10397): 203–34. 10.1016/S0140-6736(23)01301-6.
- Palmer Duncan S., Howrigan Daniel P., Chapman Sinéad B., Adolfsson Rolf, Bass Nick, Blackwood Douglas, Boks Marco P. M., et al. 2022. “Exome Sequencing in Bipolar Disorder Identifies AKAP11 as a Risk Gene Shared with Schizophrenia.” Nature Genetics 54 (5): 541–47. 10.1038/s41588-022-01034-x.
- Power Robert A., Kyaga Simon, Uher Rudolf, MacCabe James H., Långström Niklas, Landen Mikael, McGuffin Peter, Lewis Cathryn M., Lichtenstein Paul, and Svensson Anna C. 2013. “Fecundity of Patients With Schizophrenia, Autism, Bipolar Disorder, Depression, Anorexia Nervosa, or Substance Abuse vs Their Unaffected Siblings.” JAMA Psychiatry 70 (1): 22–30. 10.1001/jamapsychiatry.2013.268.
- Risch N. 1990. “Linkage Strategies for Genetically Complex Traits. I. Multilocus Models.” American Journal of Human Genetics 46 (2): 222–28.
- Robertson Alan. 1956. “The Effect of Selection against Extreme Deviants Based on Deviation or on Homozygosis.” Journal of Genetics 54 (2): 236. 10.1007/BF02982779.
- Saha Sukanta, Chant David, Welham Joy, and McGrath John. 2005. “A Systematic Review of the Prevalence of Schizophrenia.” PLOS Medicine 2 (5): e141. 10.1371/journal.pmed.0020141.
- Satterstrom F. Kyle, Kosmicki Jack A., Wang Jiebiao, Breen Michael S., De Rubeis Silvia, An Joon-Yong, Peng Minshi, et al. 2020. “Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism.” Cell 180 (3): 568–584.e23. 10.1016/j.cell.2019.12.036.
- Scalfari Antonio, Knappertz Volker, Cutter Gary, Goodin Douglas S., Ashton Raymond, and Ebers George C. 2013. “Mortality in Patients with Multiple Sclerosis.” Neurology 81 (2): 184–92. 10.1212/WNL.0b013e31829a3388.
- Schoech Armin P., Jordan Daniel M., Loh Po-Ru, Gazal Steven, O’Connor Luke J., Balick Daniel J., Palamara Pier F., Finucane Hilary K., Sunyaev Shamil R., and Price Alkes L. 2019. “Quantification of Frequency-Dependent Genetic Architectures in 25 UK Biobank Traits Reveals Action of Negative Selection.” Nature Communications 10 (1): 790. 10.1038/s41467-019-08424-6.
- Schraiber Joshua G., Spence Jeffrey P., and Edge Michael D. 2024. “Estimation of Demography and Mutation Rates from One Million Haploid Genomes.” bioRxiv. 10.1101/2024.09.18.613708.
- Sella Guy, and Barton Nicholas H. 2019. “Thinking About the Evolution of Complex Traits in the Era of Genome-Wide Association Studies.” Annual Review of Genomics and Human Genetics 20 (1): 461–93. 10.1146/annurev-genom-083115-022316.
- Sella Guy, and Hirsh Aaron E. 2005. “The Application of Statistical Physics to Evolutionary Biology.” Proceedings of the National Academy of Sciences 102 (27): 9541–46. 10.1073/pnas.0501865102.
- Sham Pak C., and Purcell Shaun M. 2014. “Statistical Power and Significance Testing in Large-Scale Genetic Studies.” Nature Reviews Genetics 15 (5): 335–46. 10.1038/nrg3706.
- Shi Huwenbo, Kichaev Gleb, and Pasaniuc Bogdan. 2016. “Contrasting the Genetic Architecture of 30 Complex Traits from Summary Association Data.” The American Journal of Human Genetics 99 (1): 139–53. 10.1016/j.ajhg.2016.05.013.
- Simeone Jason C., Ward Alexandra J., Rotella Philip, Collins Jenna, and Windisch Ricarda. 2015. “An Evaluation of Variation in Published Estimates of Schizophrenia Prevalence from 1990–2013: A Systematic Literature Review.” BMC Psychiatry 15 (1): 193. 10.1186/s12888-015-0578-7.
- Simons Yuval B., Bullaughey Kevin, Hudson Richard R., and Sella Guy. 2018. “A Population Genetic Interpretation of GWAS Findings for Human Quantitative Traits.” PLOS Biology 16 (3): e2002985. 10.1371/journal.pbio.2002985.
- Simons Yuval B., Mostafavi Hakhamanesh, Smith Courtney J., Pritchard Jonathan K., and Sella Guy. 2022. “Simple Scaling Laws Control the Genetic Architectures of Human Complex Traits.” bioRxiv. 10.1101/2022.10.04.509926.
- Simons Yuval B., Turchin Michael C., Pritchard Jonathan K., and Sella Guy. 2014. “The Deleterious Mutation Load Is Insensitive to Recent Population History.” Nature Genetics 46 (3): 220–24. 10.1038/ng.2896.
- Singh Tarjinder, Poterba Timothy, Curtis David, Akil Huda, Al Eissa Mariam, Barchas Jack D., Bass Nicholas, et al. 2022. “Rare Coding Variants in Ten Genes Confer Substantial Risk for Schizophrenia.” Nature 604 (7906): 509–16. 10.1038/s41586-022-04556-w.
- Sinnott-Armstrong Nasa, Naqvi Sahin, Rivas Manuel, and Pritchard Jonathan K. 2021. “GWAS of Three Molecular Traits Highlights Core Genes and Pathways alongside a Highly Polygenic Background.” Edited by Flint Jonathan, Wittkopp Patricia J, Lynch Vincent J, Wray Naomi, and Chakravarti Aravinda. eLife 10 (February): e58615. 10.7554/eLife.58615.
- Slatkin Montgomery. 2008. “Exchangeable Models of Complex Inherited Diseases.” Genetics 179 (4): 2253–61. 10.1534/genetics.107.077719.
- Spence Jeffrey P., Mostafavi Hakhamanesh, Ota Mineto, Milind Nikhil, Gjorgjieva Tamara, Smith Courtney J., Simons Yuval B., Sella Guy, and Pritchard Jonathan K. 2024. “Specificity, Length, and Luck: How Genes Are Prioritized by Rare and Common Variant Association Studies.” bioRxiv. 10.1101/2024.12.12.628073.
- Spracklen Cassandra N., Horikoshi Momoko, Kim Young Jin, Lin Kuang, Bragg Fiona, Moon Sanghoon, Suzuki Ken, et al. 2020. “Identification of Type 2 Diabetes Loci in 433,540 East Asian Individuals.” Nature 582 (7811): 240–45. 10.1038/s41586-020-2263-3.
- Teng Margaret LP, Ng Cheng Han, Huang Daniel Q., Chan Kai En, Tan Darren JH, Lim Wen Hui, Yang Ju Dong, Tan Eunice, and Muthiah Mark D. 2022. “Global Incidence and Prevalence of Nonalcoholic Fatty Liver Disease.” Clinical and Molecular Hepatology 29 (Suppl): S32. 10.3350/cmh.2022.0365.
- The Wellcome Trust Case Control Consortium. 2007. “Genome-Wide Association Study of 14,000 Cases of Seven Common Diseases and 3,000 Shared Controls.” Nature 447 (7145): 661–78. 10.1038/nature05911.
- Trubetskoy Vassily, Pardiñas Antonio F., Qi Ting, Panagiotaropoulou Georgia, Awasthi Swapnil, Bigdeli Tim B., Bryois Julien, et al. 2022. “Mapping Genomic Loci Implicates Genes and Synaptic Biology in Schizophrenia.” Nature 604 (7906): 502–8. 10.1038/s41586-022-04434-5.
- Waxman D, and Peck J R. 2003. “The Anomalous Effects of Biased Mutation.” Genetics 164 (4): 1615–26. 10.1093/genetics/164.4.1615.
- Weintraub Karen. 2011. “The Prevalence Puzzle: Autism Counts.” Nature 479 (7371): 22–24. 10.1038/479022a.
- Wray Naomi R., and Goddard Michael E. 2010. “Multi-Locus Models of Genetic Risk of Disease.” Genome Medicine 2 (2): 10. 10.1186/gm131.
- Wright Sewall. 1926. “A Frequency Curve Adapted to Variation in Percentage Occurrence.” Journal of the American Statistical Association, June. https://www.tandfonline.com/doi/abs/10.1080/01621459.1926.10502168.
- ———. 1934. “An Analysis of Variability in Number of Digits in an Inbred Strain of Guinea Pigs.” Genetics 19 (6): 506–36. 10.1093/genetics/19.6.506.
- Yang Jian, Benyamin Beben, McEvoy Brian P., Gordon Scott, Henders Anjali K., Nyholt Dale R., Madden Pamela A., et al. 2010. “Common SNPs Explain a Large Proportion of the Heritability for Human Height.” Nature Genetics 42 (7): 565–69. 10.1038/ng.608.
- Yang Jian, Manolio Teri A., Pasquale Louis R., Boerwinkle Eric, Caporaso Neil, Cunningham Julie M., de Andrade Mariza, et al. 2011. “Genome Partitioning of Genetic Variation for Complex Traits Using Common SNPs.” Nature Genetics 43 (6): 519–25. 10.1038/ng.823.
- Zeng Jian, Xue Angli, Jiang Longda, Lloyd-Jones Luke R., Wu Yang, Wang Huanwei, Zheng Zhili, et al. 2021. “Widespread Signatures of Natural Selection across Human Complex Traits and Functional Genomic Categories.” Nature Communications 12 (1): 1164. 10.1038/s41467-021-21446-3.
- Zeng Tony, Spence Jeffrey P., Mostafavi Hakhamanesh, and Pritchard Jonathan K. 2023. “Bayesian Estimation of Gene Constraint from an Evolutionary Model with Gene Features.” bioRxiv. 10.1101/2023.05.19.541520.
- Zhang Xu-Sheng, and Hill William G. 2008. “The Anomalous Effects of Biased Mutation Revisited: Mean–Optimum Deviation and Apparent Directional Selection Under Stabilizing Selection.” Genetics 179 (2): 1135–41. 10.1534/genetics.107.083428.