Genetics. 2023 Dec 26;226(3):iyad218. doi: 10.1093/genetics/iyad218

The fitness consequences of genetic divergence between polymorphic gene arrangements

Brian Charlesworth
Editor: K Dyer
PMCID: PMC11090464  PMID: 38147527

Abstract

Inversions restrict recombination when heterozygous with standard arrangements, but often have few noticeable phenotypic effects. Nevertheless, there are several examples of inversions that can be maintained polymorphic by strong selection under laboratory conditions. A long-standing model for the source of such selection is divergence between arrangements with respect to recessive or partially recessive deleterious mutations, resulting in a selective advantage to heterokaryotypic individuals over homokaryotypes. This paper uses a combination of analytical and numerical methods to investigate this model, for the simple case of an autosomal inversion with multiple independent nucleotide sites subject to mildly deleterious mutations. A complete lack of recombination in heterokaryotypes is assumed, as well as constancy of the frequency of the inversion over space and time. It is shown that a significantly higher mutational load will develop for the less frequent arrangement. A selective advantage to heterokaryotypes is only expected when the two alternative arrangements are nearly equal in frequency, so that their mutational loads are very similar in size. The effects of some Drosophila pseudoobscura polymorphic inversions on fitness traits seem to be too large to be explained by this process, although it may contribute to some of the observed effects. Several population genomic statistics can provide evidence for signatures of a reduced efficacy of selection associated with the rarer of two arrangements, but there is currently little published data that are relevant to the theoretical predictions.

Keywords: inversion polymorphism, mutational load, heterokaryotype advantage, efficacy of selection, population subdivision

Introduction

Wright and Dobzhansky (1946) obtained evidence that some natural inversion polymorphisms in Drosophila pseudoobscura are associated with major differences in fitness among karyotypes, which can lead to their stable maintenance within a single population under constant environmental conditions. There have subsequently been many other experimental studies documenting strong effects of inversion karyotypes on fitness components in several Drosophila species (reviewed in Krimbas and Powell 1992; Kapun and Flatt 2019), and in some other species such as the seaweed fly Coelopa frigida (Butlin et al. 1984; Mérot et al. 2020). The startling observations of Wright and Dobzhansky (1946) raised the question of the causes of fitness differences between apparently functionally insignificant chromosomal variants. This question is still the subject of ongoing inquiry, stimulated by the new evidence from genome sequencing that inversion polymorphisms are abundant in natural populations of many species (Wellenreuther and Bernatchez 2018; Faria et al. 2019; Berdan et al. 2023).

Well before the work of Wright and Dobzhansky, Sturtevant and Mather (1938) had proposed a process that could cause fitness differences between inversion karyotypes, with a fitness advantage to heterokaryotypes over homokaryotypes. In their words:

… if a chromosome exists, in a population, in two sequences, differing by an inversion, it will in effect show two distinct lines of descent. There is free exchange of material within any line (i.e. sequence), but none between the sequences. Therefore, fluctuations of the genic contents must occur almost independently in the two sequences. Under such conditions, it is inevitable that in time the gene content of the two sequences will become different. It must be supposed that each sequence is susceptible to the same mutations, and with the same frequencies, but, as a result of the relative rarity in the population of a given mutant allelomorph at any one moment, certain genes will be present in one sequence but not the other. ….. It is thus clear, considering two sequences A and B, that the homozygotes AA and BB are more likely to be homozygous for deleterious recessive mutations than is the heterozygote AB. …. Thus in general the sequence heterozygote AB will be at a selective advantage with respect to either of the corresponding homozygotes.

Sturtevant and Mather (1938) did not attempt a quantitative model of this process, simply noting that “there can be no stability in the exact relations of the sequences with each other”, and that “…. a single gene difference can not in general cause such heterosis. The simplest effective condition is that in which each sequence contains a deleterious recessive not present in the other.” This proposal raises the question of what strength of selection on the two arrangements might be expected on its basis.

Ohta (1971) developed a mathematical model of a closely related, but not identical, process. This was based on the concept of associative overdominance, first outlined by Frydenberg (1963). Here, a polymorphic neutral locus can acquire an apparent heterozygote advantage, because of linkage disequilibrium (LD) generated by genetic drift with a locus subject to selection in favor of heterozygotes or to selection against recessive/partially recessive deleterious alleles maintained by mutation pressure. Unlike the model of Sturtevant and Mather (1938), this process does not require the generation of heterosis by multiple selected loci, often referred to as pseudo-overdominance (Waller 2021). Ohta (1971) summed the effects of individual selected loci that were completely linked to a diallelic neutral locus (equivalent to an inversion polymorphism) over a large segment of genome, and generated expressions for the apparent fitness advantage to heterozygotes at the neutral locus.

It was, however, shown by Zhao and Charlesworth (2016) that Ohta's formulae for the relative fitnesses at the neutral locus induced by LD with the selected locus do not predict any change in allele frequency at the neutral locus. Using a different approach, they found that an induced selection pressure in favor of increased variability at the neutral locus only exists when the product of population size and selection coefficient is of the order of 1. Otherwise, variability is reduced by background selection effects, even for recessive or partially recessive deleterious mutations (see also Charlesworth 2022). Ohta's results therefore do not solve the quantitative problem of whether the strength of selection on inversions revealed by the experiments cited above can be explained by this process. Nonetheless, they are still invoked as a potential contributor to the selective maintenance of inversion polymorphisms, e.g. Faria et al. (2019), Berdan et al. (2021), Jay et al. (2021), and Matschiner et al. (2023).

A different perspective was developed by Nei et al. (1967), who examined a purely deterministic model involving the balance between mutation and selection at numerous autosomal loci. This process results in an equilibrium frequency distribution of the number of mutant alleles per haploid genome. A new autosomal inversion then has a chance of arising on a haplotype with a lower number of mutations than average and acquiring a selective advantage. But, as time goes on, the mutant-free loci on the inverted haplotypes will accumulate mutations (Nei et al. 1967). Unless the inversion goes to fixation, the loci in the inversion subpopulation will eventually acquire the same frequencies of mutant alleles as the corresponding loci in the standard arrangement subpopulation. Nei et al. (1967) interpreted this as implying that the inversion would then be selectively neutral. However, if reverse mutations from mutant to wild-type alleles do not occur, the loci at which mutations were present in the original inversion haplotype will all be homozygous in inversion homokaryotypes, causing these to have a reduced fitness compared to homokaryotypes for the standard arrangement. A reanalysis and extension of this model by Connallon and Olito (2021) suggested that it could produce a net heterozygote advantage to an autosomal inversion in a randomly mating population, resulting in the maintenance of the inversion, but only if the inversion has a sufficiently large direct selective (but nonheterotic) advantage that is independent of the deleterious mutations that it carries.

Berdan et al. (2021) conducted simulations of a finite population with multiple loci experiencing mutations to deleterious and completely recessive alleles. They found that a selective advantage to heterokaryotypes could develop, provided that recombinational exchange between arrangements in heterokaryotypes was sufficiently infrequent, and there was a small additional selective advantage to heterokaryotypes that kept the inversion in the population long enough for mutation accumulation to occur. Studies of the effects of deleterious mutations have shown that complete recessivity is unlikely to be frequent, especially for mildly deleterious mutations (Muller 1950; Crow 1993; Manna et al. 2011). Multilocus computer simulations with less extreme assumptions about the degree of recessivity of deleterious mutations showed that autosomal inversion polymorphisms are unlikely to be established at higher than neutral rates (Jay et al. 2022).

These theoretical studies therefore suggest that deleterious mutations in themselves are unlikely to create an initial selective advantage to autosomal inversions in a randomly mating population. Indeed, if a new inversion arises on a unique haplotype, the process of accumulation of new mutational load within the inversion subpopulation will take a long time, and cannot contribute to any initial selective effect of the inversion. It is, however, an open question as to whether the fitness differences among karyotypes mentioned above could have a significant component resulting from the process proposed by Sturtevant and Mather (1938), whereby genetic drift causes the inverted and standard arrangements to differ in their genetic content. There is a strong, but not perfect, analogy with population subdivision, where genetic drift can cause local populations to diverge at weakly selected loci subject to mutation to deleterious variants. In the case of inversions, however, the existence of heterokaryotypes means that selection does not act independently on the two subpopulations, as described below in the section on the mathematical model.

Provided that mutations are at least partially recessive with respect to their fitness effects, interpopulation crosses may show heterosis, due to different loci having accumulated different deleterious mutations in different populations (Whitlock et al. 2000; Glémin et al. 2003; Roze and Rousset 2004; Spigler et al. 2017; Charlesworth 2018). Such heterosis has been observed in populations of animals and plants (reviewed in Charlesworth 2018). In addition, theory predicts that populations with a smaller effective size (Ne) should allow the accumulation of deleterious mutations and hence acquire larger mutational loads than populations with a larger Ne, unless mutations have strong selective effects relative to drift and are highly recessive (Wright 1931; Kimura et al. 1963; Nei 1968; Bataillon and Kirkpatrick 2000; Charlesworth 2018).

Such differences in Ne can arise either from differences in the adult population size itself, differences in the mating system (especially the frequency of inbreeding), differences in recombination rates associated with different levels of genetic hitchhiking effects, or a combination of all 3 factors. Again, there is empirical support for this predicted effect of Ne, both from measurements of fitness components (Leimu et al. 2006; Lohr and Haag 2015; Charlesworth 2018) and from population genomic indicators of the efficacy of selection against deleterious mutations, e.g. Robinson et al. (2022) (population size), Glémin et al. (2006) (mating system), and Campos et al. (2014) (recombination rate).

The purpose of the present paper is to investigate the properties of a population genetic model of mutation, selection, and drift acting on a polymorphism for an autosomal inversion and a standard arrangement, in which the inversion polymorphism is maintained by sufficiently strong selection that the frequency of the inversion is constant over time and space. Similar assumptions were used in an earlier paper that examined neutral differentiation between the two arrangements (Charlesworth 2023). In order to simplify the calculations, and to maximize the effects of drift and mutation within arrangements, no recombinational exchange between arrangements in heterokaryotypes is allowed. The results should therefore provide upper bounds on the likely size of effects.

Models are developed of both a single, randomly mating, population and of a population divided into a large number of local populations. It might be expected that population subdivision, with its greater opportunities for drift, would enhance divergence among arrangements at sites under selection. The model assumes a large number of freely recombining sites subject to selection and mutation, with a wide distribution of selection coefficients over sites. No attempt is made to investigate the consequences of enhanced Hill–Robertson interference among sites due to restricted recombination in heterokaryotypes, which was included in the simulations of the fates of autosomal inversions in Berdan et al. (2021) and Jay et al. (2022). Unless an autosomal inversion is very rare or very common, there should be sufficient recombination within homokaryotypes to prevent major Hill–Robertson effects; very low recombination rates are sufficient to prevent the operation of Muller's ratchet (Charlesworth et al. 1993).

The overall conclusion is that a low-frequency arrangement will have a higher mutational load and exhibit weaker population genomic signals of purifying selection than its counterpart. Heterokaryotypic superiority in fitness is, however, unlikely to be observed unless the inverted and standard sequences are approximately equal in frequency, and it is likely to be small in magnitude unless the inversion contains millions of sites under selection. Population subdivision has only a small effect on the load and population genomic statistics.

A model of the mutational load associated with an inversion polymorphism

General considerations and notation

The main symbols used in this paper are defined in Table 1. Consider first a single randomly mating, discrete generation population of size N, assuming a Wright–Fisher model of reproduction such that N is equal to the effective population size. The frequencies of the two karyotypes, the inverted (In) and standard (St) arrangements, are denoted by x and y = 1 – x; designation as In vs St is purely arbitrary in the situation considered here, so the convention x ≤ ½ is used. Balancing selection on the inversion is assumed to be sufficiently strong that x can be treated as constant over time. Let qi and pi = 1 – qi be the respective frequencies of the mutant (A2) and wild-type allele (A1) at a given nucleotide site within haplotypes carrying a type i karyotype, where i = 1 corresponds to In and i = 2 to St. In a given generation, there will be random drift as well as selection within the populations of In and St karyotypes, so that in general q1 ≠ q2. Drift occurs independently within karyotypes under the assumption of constant sizes of the two subpopulations, so that the effective population sizes of carriers of In and St are N1 = Nx and N2 = Ny, respectively.

Table 1.

Definitions of the most important symbols used in the text.

Symbol      Definition
x and y     Frequencies of the inversion (In) and standard arrangement (St), respectively
N           Size of the population (single population case) or size of a local population (subdivided population case)
d           Number of demes in the subdivided population case
NT = dN     Sum of local population sizes for a subdivided population
Li          Mutational load for the homokaryotypic subpopulation of type i
L12         Mutational load for In/St heterokaryotypes
Hi          Homozygous load for subpopulation i
Bi          Inbreeding load for subpopulation i
ti          Reduction in fitness of homokaryotype i relative to the heterokaryotype's fitness
u and v     Mutation rates from wild-type to mutant alleles and vice versa
α and β     u and v scaled by 4N or 4NT, depending on context
m and M     Migration rate and migration rate scaled by 4N
⟨qi⟩        Expected frequency of mutant alleles for subpopulation i
Vqi         Variance in frequency of mutant alleles for subpopulation i
s           Selection coefficient against homozygotes for a deleterious mutation
h           The dominance coefficient for heterozygotes for a deleterious mutation
a           Shape parameter of the gamma distribution of selection coefficients
γ¯          Mean selection coefficient against a deleterious mutation, scaled by 2N or 2NT

Subscripts i = 1 and i = 2 are used to denote parameters applicable to the In and St subpopulations, respectively. Subscript d is used to denote within-deme parameters.

The expectation of qi is denoted by ⟨qi⟩, with ⟨pi⟩ = 1 – ⟨qi⟩, where the angle brackets denote an expectation taken over the probability distribution of qi. The variance in qi in a given generation over this probability distribution is denoted by Vqi = ⟨qi²⟩ – ⟨qi⟩², with the corresponding F statistic (Wright 1951) given by Fi = Vqi/(⟨pi⟩⟨qi⟩). In the absence of recombination but the presence of selection, there will be a negative covariance C12 between q1 and q2, with a correspondingly negative correlation coefficient R12. This is because a higher frequency of the mutant allele in one karyotype results in a higher frequency of mutant homozygotes in In/St individuals, enhancing the strength of selection against the mutation in the other karyotype. For neutral sites in the absence of recombination between the arrangements, C12 = R12 = 0.

Following Kimura et al. (1963), equations can be written for the genetic loads at a single diallelic autosomal locus, assuming a homozygous selection coefficient s and dominance coefficient h. The fitness of mutant homozygotes relative to wild-type is 1 – s and the fitness of heterozygotes is 1 – hs; s may vary across loci, but h is treated as a constant, although it is easy to relax this assumption. Li is the genetic load for individuals homozygous for karyotype i produced by random mating within the population, defined as the expected reduction below 1 of their mean fitness relative to wild-type homozygotes. L12 is the corresponding load for heterokaryotypes. The homozygous load Hi is the reduction below 1 in the expected relative fitness of individuals made homozygous for gametes with karyotype i, with probability qi of being homozygous for the mutant allele. Bi is the inbreeding load for karyotype i, defined as Hi – Li (Charlesworth and Charlesworth 2010, p. 173).

Following Charlesworth (2018), some simple algebra yields the following expressions for these quantities:

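$$L_i = \langle q_i \rangle \left[ 2h + (1 - 2h)\left( \langle q_i \rangle + F_i \langle p_i \rangle \right) \right] s \tag{1a}$$
$$L_{12} = \left\{ h \left[ \langle q_1 \rangle \langle p_2 \rangle + \langle q_2 \rangle \langle p_1 \rangle \right] + \langle q_1 \rangle \langle q_2 \rangle + (1 - 2h) C_{12} \right\} s \tag{1b}$$
$$H_i = \langle q_i \rangle s \tag{1c}$$
$$B_i = \langle p_i \rangle \langle q_i \rangle (1 - F_i)(1 - 2h) s \tag{1d}$$

where

$$F_i = V_{q_i} / \left( \langle p_i \rangle \langle q_i \rangle \right) \tag{1e}$$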
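As a quick numerical check on Equations (1a)–(1e), the following sketch evaluates the single-site loads from the moments of the allele frequencies. The function name, the argument names, and the example values are illustrative assumptions and are not taken from the paper's supplementary code.

```python
def single_site_loads(q1, q2, F1, F2, C12, h, s):
    """Single-site loads of Equations (1a)-(1d).

    q1, q2 : expected mutant frequencies <q_1>, <q_2> in In and St
    F1, F2 : F statistics of Equation (1e)
    C12    : covariance between q1 and q2
    h, s   : dominance coefficient and homozygous selection coefficient
    """
    def homokaryotype(q, F):
        p = 1.0 - q
        load = q * (2 * h + (1 - 2 * h) * (q + F * p)) * s   # Eq. (1a)
        homozygous = q * s                                   # Eq. (1c)
        inbreeding = p * q * (1 - F) * (1 - 2 * h) * s       # Eq. (1d)
        return load, homozygous, inbreeding

    L1, H1, B1 = homokaryotype(q1, F1)
    L2, H2, B2 = homokaryotype(q2, F2)
    # Eq. (1b): load of In/St heterokaryotypes
    L12 = (h * (q1 * (1 - q2) + q2 * (1 - q1)) + q1 * q2 + (1 - 2 * h) * C12) * s
    return {"L1": L1, "L2": L2, "L12": L12,
            "H1": H1, "H2": H2, "B1": B1, "B2": B2}


# Hypothetical moments for a low-frequency inversion (illustrative values only)
loads = single_site_loads(q1=0.012, q2=0.008, F1=0.05, F2=0.01,
                          C12=0.0, h=0.25, s=1e-3)
```

With these definitions, the inbreeding load returned for each arrangement equals the difference between the homozygous load and the load under random mating, Bi = Hi – Li, as stated in the text.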

The selective difference between heterokaryotypes and a homokaryotype of class i is measured by ti = Li – L12. Heterozygote advantage exists if both t1 and t2 are positive. To obtain conditions for heterozygote advantage, it is useful to rearrange Equations (1) by writing ⟨q1⟩ = ⟨q⟩ + δq and ⟨q2⟩ = ⟨q⟩ – δq, with ⟨p⟩ = 1 – ⟨q⟩, where 2δq is the expected difference in the frequency of A2 between In and St. δq will be nonnegative if x < ½, due to the greater effectiveness of drift relative to selection against nonrecessive, mildly deleterious mutations in a smaller population (Kimura et al. 1963). Substituting these expressions for the ⟨qi⟩ into Equations (1a) and (1b), the following expressions are obtained:

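$$L_1 = (\langle q \rangle + \delta q)\left\{ 2h + (1 - 2h)\left[ \langle q \rangle + \delta q + F_1 (\langle p \rangle - \delta q) \right] \right\} s = \left\{ \langle q \rangle \left[ 2h + (\langle q \rangle + \langle p \rangle F_1)(1 - 2h) \right] + \delta q \left( 2h + \left[ F_1 + 2 \langle q \rangle (1 - F_1) \right](1 - 2h) \right) + (\delta q)^2 (1 - F_1)(1 - 2h) \right\} s \tag{2a}$$
$$L_2 = \left\{ \langle q \rangle \left[ 2h + (\langle q \rangle + \langle p \rangle F_2)(1 - 2h) \right] - \delta q \left( 2h + \left[ F_2 + 2 \langle q \rangle (1 - F_2) \right](1 - 2h) \right) + (\delta q)^2 (1 - F_2)(1 - 2h) \right\} s \tag{2b}$$
$$L_{12} = \left[ \langle q \rangle^2 + 2 \langle p \rangle \langle q \rangle h - (\delta q)^2 (1 - 2h) + (1 - 2h) C_{12} \right] s \tag{2c}$$

Equation (2c) shows that when h < ½ the mean fitness of heterokaryotypes is increased by a difference in the expected frequencies of deleterious mutations between the two karyotypes, as expected intuitively.

These expressions yield the following results for the selective differences between heterokaryotypes and homokaryotypes:

$$t_1 = \left\{ \langle p \rangle \langle q \rangle F_1 (1 - 2h) + \delta q \left[ 2h + \left( 2 \langle q \rangle (1 - F_1) + F_1 \right)(1 - 2h) \right] + (\delta q)^2 (2 - F_1)(1 - 2h) - (1 - 2h) C_{12} \right\} s \tag{3a}$$
$$t_2 = \left\{ \langle p \rangle \langle q \rangle F_2 (1 - 2h) - \delta q \left[ 2h + \left( 2 \langle q \rangle (1 - F_2) + F_2 \right)(1 - 2h) \right] + (\delta q)^2 (2 - F_2)(1 - 2h) - (1 - 2h) C_{12} \right\} s \tag{3b}$$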

It is easily seen that, for an inversion with x < ½ and δq > 0 (see above), we have t1 > 0 provided that h ≤ ½ and C12 ≤ 0, so that In/In then has a lower fitness than In/St. The expectation for a low-frequency inversion is thus that t1 > 0, although the magnitude of the effect is likely to be small for a single, large population. Given that F1 > 0 and h < ½, this is also the case even when δq = 0, reflecting the fact that genetic drift causes a reduction in mean fitness by increasing the frequencies of homozygotes for recessive or partially recessive mutations; this effect is not experienced by In/St individuals unless C12 > 0, which can be ruled out by the argument given above.

If δq > 0 and h ≤ ½, the condition for t2 > 0 is more stringent than for t1 > 0, due to the opposite sign of the term in δq in Equations (3a) and (3b), especially if F2 is close to 0. It may therefore be difficult to find conditions in which there is an advantage to In/St over both homokaryotypes, unless the inversion frequency is close to ½, so that δq ≈ 0.
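The contrast between the two homokaryotypes can be illustrated with a small numerical sketch. It computes ti = Li – L12 directly from Equations (1a) and (1b) after substituting ⟨q1⟩ = ⟨q⟩ + δq and ⟨q2⟩ = ⟨q⟩ – δq, rather than from the expanded forms. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
def selective_differences(q_mean, delta_q, F1, F2, C12, h, s):
    """Return (t1, t2), where t_i = L_i - L_12, from Equations (1a) and (1b)."""
    q1, q2 = q_mean + delta_q, q_mean - delta_q

    def homokaryotype_load(q, F):
        # Eq. (1a)
        return q * (2 * h + (1 - 2 * h) * (q + F * (1 - q))) * s

    # Eq. (1b)
    L12 = (h * (q1 * (1 - q2) + q2 * (1 - q1)) + q1 * q2 + (1 - 2 * h) * C12) * s
    return homokaryotype_load(q1, F1) - L12, homokaryotype_load(q2, F2) - L12


# Unequal expected mutant frequencies (delta_q > 0): hypothetical rare inversion
t1, t2 = selective_differences(q_mean=0.01, delta_q=0.002, F1=0.05, F2=0.01,
                               C12=0.0, h=0.25, s=1e-3)

# Equal expected mutant frequencies (delta_q = 0)
t1_equal, t2_equal = selective_differences(q_mean=0.01, delta_q=0.0, F1=0.05,
                                           F2=0.01, C12=0.0, h=0.25, s=1e-3)
```

For these illustrative values, t1 is positive but t2 is negative when δq > 0, whereas both are positive when δq = 0, in line with the argument that heterokaryotype advantage over both homokaryotypes requires δq ≈ 0.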

These results can be generalized to the case of a subdivided population with a constant frequency of the inversion across all local populations (demes), by taking expectations of within-deme allele frequencies across populations, as described in Section 5 of Supplementary File 1.

A single population: modeling drift and selection

In order to obtain numerical results for the load statistics described above, expressions for the means and variances of the qi, as well as their covariance, are needed. Recombination is assumed to be absent in heterokaryotypes. We first consider the expected changes in allele frequencies due to selection within each karyotype. For In karyotypes, the marginal fitness of haplotypes carrying the wild-type allele (A1) at a locus is easily seen to be

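$$w_{11} = 1 - hs(x q_1 + y q_2) \tag{4a}$$

Similarly, the marginal fitness of In haplotypes carrying the mutant allele is

$$w_{12} = 1 - hs(x p_1 + y p_2) - s(x q_1 + y q_2) \tag{4b}$$

The net expected change in the frequency of A2 within In karyotypes due to selection (neglecting second-order terms in s) is thus

$$\Delta_s q_1 \approx -p_1 q_1 \left[ w_{11} - w_{12} \right] = -s p_1 q_1 \left\{ h + (1 - 2h)\left[ x q_1 + y q_2 \right] \right\} \tag{5a}$$

Similarly, the net expected change in the frequency of A2 within St karyotypes due to selection is

$$\Delta_s q_2 \approx -s p_2 q_2 \left\{ h + (1 - 2h)\left[ x q_1 + y q_2 \right] \right\} \tag{5b}$$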
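The first-order approximation in Equation (5a) can be compared with the change implied by the marginal fitnesses of Equations (4a) and (4b). The sketch below assumes that the exact one-generation change within In is obtained by normalizing by the mean marginal fitness of In haplotypes; this normalization, the function names, and the parameter values are assumptions made for illustration.

```python
def marginal_fitnesses_in(q1, q2, x, h, s):
    """Marginal fitnesses of In haplotypes carrying A1 and A2 (Eqs. 4a, 4b)."""
    y = 1.0 - x
    p1, p2 = 1.0 - q1, 1.0 - q2
    w11 = 1 - h * s * (x * q1 + y * q2)                          # Eq. (4a)
    w12 = 1 - h * s * (x * p1 + y * p2) - s * (x * q1 + y * q2)  # Eq. (4b)
    return w11, w12


def approx_change_in(q1, q2, x, h, s):
    """First-order change in q1 due to selection (Eq. 5a)."""
    y = 1.0 - x
    return -s * (1 - q1) * q1 * (h + (1 - 2 * h) * (x * q1 + y * q2))


def exact_change_in(q1, q2, x, h, s):
    """Change in q1 after weighting alleles by their marginal fitnesses."""
    w11, w12 = marginal_fitnesses_in(q1, q2, x, h, s)
    mean_w = (1 - q1) * w11 + q1 * w12
    return q1 * w12 / mean_w - q1


# Hypothetical values: inversion at frequency 0.2, s = 0.01, h = 0.25
approx = approx_change_in(0.1, 0.05, 0.2, 0.25, 0.01)
exact = exact_change_in(0.1, 0.05, 0.2, 0.25, 0.01)
```

The two values differ only by a factor equal to the mean marginal fitness, which is 1 minus a term of order s, so the discrepancy is of second order in s.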

Equations (4) and (5) bring out the interdependence between the evolutionary processes in the two karyotypes when h ≠ ½. To proceed further, the effects of mutation and drift also need to be analyzed, such that an expression for the stationary joint probability density function (p.d.f.) for q1 and q2, ϕ(q1, q2), can be obtained. Kimura (1964, p. 41) derived a pair of coupled forward diffusion equations describing the joint stationary p.d.f. for two variables, using the principle that a zero flux of the probability density of each variable across all their values guarantees a stationary joint distribution. This method can be applied to the above selection equations, together with the terms arising from mutation.

Let the rates of mutation from A1 to A2 and vice versa be u and v, respectively, with scaled mutation rates α1 = 4N1u, α2 = 4N2u, β1 = 4N1v, β2 = 4N2v. The mutational bias toward deleterious mutations, κ, is equal to u/v. As shown by Kimura (1964), in order to analyze this type of situation it is most convenient to use the natural logarithm of ϕ, ψ = ln ϕ. If drift affects q1 and q2 independently, as assumed here, the conditions for a stationary distribution are

$$V_{\delta q_i} \frac{\partial \psi}{\partial q_i} = 2 \Delta q_i - \frac{\partial V_{\delta q_i}}{\partial q_i} \quad (i = 1, 2) \tag{6}$$

where Vδq1 = p1q1/(2N1) and Vδq2 = p2q2/(2N2) are the variances in the changes in allele frequencies per generation due to drift within In and St, respectively. Correspondingly, ∂Vδq1/∂q1 = (1 – 2q1)/(2Nx) and ∂Vδq2/∂q2 = (1 – 2q2)/(2Ny). The Δqi are given by the selection equations (4) and (5) together with the relevant mutation terms.

For a meaningful solution of Equation (6) to exist, ψ must have an exact differential, dψ = (∂ψ/∂q1)dq1 + (∂ψ/∂q2)dq2, which requires ∂²ψ/∂q1∂q2 = ∂²ψ/∂q2∂q1 (Kimura 1964, p. 41). Kimura showed that this is the case for the mutation terms, so we need only consider the selection terms contributed by Equations (4) and (5). The following expression satisfies both this condition and Equation (6) for the selection contribution to ψ:

$$\psi_s(q_1, q_2) = -2\gamma \left[ h(x q_1 + y q_2) + \tfrac{1}{2}(1 - 2h)(x q_1 + y q_2)^2 \right] \tag{7}$$

where γ is the scaled selection coefficient, 2Ns.
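That Equation (7) satisfies the selection part of Equation (6) can be checked numerically. For the selection contribution alone, Equation (6) requires the partial derivative of ψs with respect to q1 to equal 2Δsq1/Vδq1, with Δsq1 from Equation (5a) and Vδq1 = p1q1/(2Nx). The sketch below compares a finite-difference derivative of Equation (7) with this ratio; the function names, step size, and parameter values are illustrative assumptions.

```python
def psi_s(q1, q2, x, h, gamma):
    """Selection contribution to ln(phi), Equation (7), with gamma = 2Ns."""
    z = x * q1 + (1.0 - x) * q2
    return -2.0 * gamma * (h * z + 0.5 * (1 - 2 * h) * z * z)


def d_psi_s_dq1(q1, q2, x, h, gamma, eps=1e-6):
    """Central finite-difference derivative of psi_s with respect to q1."""
    return (psi_s(q1 + eps, q2, x, h, gamma)
            - psi_s(q1 - eps, q2, x, h, gamma)) / (2 * eps)


def selection_over_drift_in(q1, q2, x, h, N, s):
    """2 * (selection change in q1, Eq. 5a) / (drift variance in q1)."""
    y = 1.0 - x
    delta = -s * (1 - q1) * q1 * (h + (1 - 2 * h) * (x * q1 + y * q2))
    variance = (1 - q1) * q1 / (2 * N * x)
    return 2 * delta / variance


# Hypothetical parameters: N = 10,000 and s = 0.001, so gamma = 2Ns = 20
N, s, x, h = 10_000, 1e-3, 0.2, 0.25
lhs = d_psi_s_dq1(0.1, 0.05, x, h, gamma=2 * N * s)
rhs = selection_over_drift_in(0.1, 0.05, x, h, N, s)
```

The same check with q1 and q2 interchanged (and x replaced by y in the drift variance) applies to the St arrangement.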

The full solution to Equation (6) is thus

$$\phi(q_1, q_2) = C \exp(\psi_s)\, q_1^{\alpha_1 - 1} p_1^{\beta_1 - 1} q_2^{\alpha_2 - 1} p_2^{\beta_2 - 1} \tag{8}$$

where C is a normalization constant, given by the inverse of the double integral of ϕ over the closed intervals (0, 1) for q1 and q2.
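A minimal sketch of how moments can be extracted from Equation (8) is given below. It uses a simple midpoint rule on a square grid, which is an assumption made for illustration and is not the integration procedure of Supplementary File 1. For per-site mutation rates, α and β would typically be well below 1, in which case the density is singular at the boundaries and a midpoint rule is inadequate; the example therefore uses hypothetical scaled mutation rates of 1 or more, purely so that the integrand is bounded.

```python
import math


def stationary_moments(x, gamma, h, alpha1, beta1, alpha2, beta2, n=200):
    """Moments of q1 and q2 under the joint density of Equation (8).

    A midpoint rule on an n x n grid is used; the normalization constant C
    cancels because every moment is a ratio of sums.
    """
    y = 1.0 - x
    grid = [(k + 0.5) / n for k in range(n)]
    # Mutation factors q^(alpha - 1) * p^(beta - 1) for each arrangement
    m1 = [q ** (alpha1 - 1) * (1 - q) ** (beta1 - 1) for q in grid]
    m2 = [q ** (alpha2 - 1) * (1 - q) ** (beta2 - 1) for q in grid]
    tot = s1 = s2 = s11 = s22 = s12 = 0.0
    for q1, f1 in zip(grid, m1):
        for q2, f2 in zip(grid, m2):
            z = x * q1 + y * q2
            psi_s = -2.0 * gamma * (h * z + 0.5 * (1 - 2 * h) * z * z)  # Eq. (7)
            w = f1 * f2 * math.exp(psi_s)
            tot += w
            s1 += w * q1
            s2 += w * q2
            s11 += w * q1 * q1
            s22 += w * q2 * q2
            s12 += w * q1 * q2
    mean1, mean2 = s1 / tot, s2 / tot
    return {"q1": mean1, "q2": mean2,
            "V1": s11 / tot - mean1 ** 2, "V2": s22 / tot - mean2 ** 2,
            "C12": s12 / tot - mean1 * mean2}


# Hypothetical case: x = 0.25, 4Nu = 8 and 4Nv = 4, so alpha1 = 2, alpha2 = 6,
# beta1 = 1, beta2 = 3; gamma = 2Ns = 20 and h = 0.25
selected = stationary_moments(0.25, 20.0, 0.25, 2.0, 1.0, 6.0, 3.0)
```

Setting gamma to 0 in this sketch recovers independent beta distributions for q1 and q2, with zero covariance, whereas gamma > 0 with h < ½ gives a negative C12, as argued in the text.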

A single population: obtaining the mutational load and population genomic statistics

To obtain the single-locus load statistics for a given h, numerical integration of Equation (8) and its product with powers and crossproducts of the qi can be performed, for the purpose of determining the expectations and variances of the qi and their covariance C12. The corresponding Fi statistics can be obtained from Equation (1e). The means of the single-locus load statistics, with the terms in s omitted, can then be obtained using Equations (1). Details of the integration procedures are given in Section 1 of Supplementary File 1.

In order to calculate the load statistics themselves under reasonably realistic assumptions, a gamma distribution of the scaled selection coefficient γ = 2Ns is assumed, with a p.d.f. given by

$$\psi(\gamma) = \frac{\gamma^{a - 1} \exp(-\gamma / b)}{b^{a}\, \Gamma(a)} \tag{9}$$

where a is the shape parameter, b = γ¯/a is the scale parameter, and Γ(a) is the gamma function. This distribution has been widely used in population genomic methods for estimating the distribution of fitness effects of deleterious mutations (e.g. Charlesworth 2015; Booker et al. 2017).
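The parameterization of Equation (9) can be made concrete with a short sketch using only the Python standard library. The function names are hypothetical, and the mean of 1,000 used in the example is simply one value within the range of hundreds to thousands mentioned in the text.

```python
import math
import random


def gamma_pdf(g, a, mean):
    """Density of Equation (9) with shape a and scale b = mean / a."""
    b = mean / a
    return g ** (a - 1) * math.exp(-g / b) / (b ** a * math.gamma(a))


def sample_scaled_selection(a, mean, n, seed=1):
    """Draw n scaled selection coefficients (gamma = 2Ns) from Equation (9).

    random.gammavariate takes the shape and the scale as its two arguments,
    so the expected value of each draw is a * b = mean.
    """
    rng = random.Random(seed)
    b = mean / a
    return [rng.gammavariate(a, b) for _ in range(n)]


# Illustrative values: shape 0.3 and mean scaled selection coefficient 1000
draws = sample_scaled_selection(a=0.3, mean=1000.0, n=20_000)
fraction_nearly_neutral = sum(g < 1.0 for g in draws) / len(draws)
```

With a shape parameter this small, a noticeable fraction of draws falls below γ = 1 even though the mean is large, which is what is meant by a wide distribution of selection coefficients.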

The values of γ¯ and the shape parameter a are chosen to correspond to estimates from the population genomics studies just mentioned, which indicate γ¯ values of hundreds or thousands for nonsynonymous mutations and shape parameters of approximately 0.3, implying a wide distribution of selection coefficients. It is assumed that fitness effects are multiplicative across sites, so that the products of the number of sites, ns, and the expectations of Equations (1a)–(1c) over the distribution of γ correspond to the negatives of the natural logarithms of the corresponding multisite mean fitnesses. The exponentials of the negatives of these expressions then yield the mean fitnesses of the karyotypes concerned, relative to that of mutant-free individuals. The exponential of the negative of the product of ns and the expectation of Equation (1d) yields the fitness of totally inbred individuals of karyotype i relative to outbred individuals that are homozygous for karyotype i, i.e. a measure of the inbreeding depression experienced by carriers of karyotype i.

The net selection coefficient against homozygosity for karyotype i relative to heterokaryotypes for a given ns is measured by

$$t_{si} = 1 - \exp\left( n_s E\{ L_{12} - L_i \} \right) \tag{10}$$

where E{} indicates the expectation over the distribution of γ (as opposed to an expectation over the distribution of q, denoted by angle brackets).
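The multisite quantities follow directly from the expected single-site loads under the multiplicative assumption. The sketch below is a minimal illustration: the function names are hypothetical, and the expected loads per site and the number of sites are made-up values rather than results from the paper.

```python
import math


def mean_fitness(n_sites, expected_load):
    """Mean fitness relative to mutant-free individuals, exp(-n_s * E{L})."""
    return math.exp(-n_sites * expected_load)


def net_selection_coefficient(n_sites, expected_L12, expected_Li):
    """Equation (10): t_si = 1 - exp(n_s * E{L_12 - L_i}).

    -expm1(z) equals 1 - exp(z) and stays accurate when z is close to 0.
    """
    return -math.expm1(n_sites * (expected_L12 - expected_Li))


# Hypothetical expected single-site loads and 100,000 selected sites
n_s = 100_000
E_L12, E_L1 = 2.0e-7, 3.0e-7
t_s1 = net_selection_coefficient(n_s, E_L12, E_L1)
relative_fitness = mean_fitness(n_s, E_L1) / mean_fitness(n_s, E_L12)
```

Here t_s1 equals 1 minus the fitness of the homokaryotype relative to the heterokaryotype, so a positive value indicates a disadvantage to the homokaryotype.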

It is also of interest to summarize the expected patterns of variation at the loci themselves. To do this, the joint p.d.f. for q1 and q2 is used to calculate the expected nucleotide site diversities within each karyotype subpopulation for a given selection coefficient, πi = 2⟨pi⟩⟨qi⟩(1 – Fi), which are then averaged over the distribution of s. In addition, the expected proportion of segregating sites for a sample of n alleles, Sni, can be determined as described in Section 2 of Supplementary File 1. Division by the sum of the harmonic series, an = Σ(1/j) for j = 1 to n – 1, yields the expected values of Watterson's theta (θwni) for each subpopulation (Watterson 1975). The skew of the distribution of segregating variants toward rare variants for subpopulation i can conveniently be measured by Δθwni = 1 – πi/θwni (Campos and Charlesworth 2019).
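These summary statistics are simple functions of the moments and of the expected proportion of segregating sites. The sketch below assumes that the latter, Sni, has already been obtained by some other means; the function names and the input values are hypothetical.

```python
def nucleotide_diversity(q_mean, F):
    """Expected diversity 2<p><q>(1 - F) for one subpopulation and one s."""
    return 2 * (1 - q_mean) * q_mean * (1 - F)


def harmonic_sum(n):
    """a_n = sum of 1/j for j = 1, ..., n - 1."""
    return sum(1.0 / j for j in range(1, n))


def wattersons_theta(prop_segregating, n):
    """Watterson's theta from the expected proportion of segregating sites."""
    return prop_segregating / harmonic_sum(n)


def delta_theta_w(pi, theta_w):
    """Skew toward rare variants, 1 - pi / theta_w."""
    return 1 - pi / theta_w


# Hypothetical inputs for a sample of n = 20 alleles
pi_1 = nucleotide_diversity(q_mean=0.002, F=0.1)
theta_1 = wattersons_theta(prop_segregating=0.015, n=20)
skew_1 = delta_theta_w(pi_1, theta_1)
```

A positive value of the skew statistic indicates an excess of rare variants relative to the neutral expectation, for which πi and θwni have the same expected value.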

For practical purposes of computation, it is convenient to divide the range of values of γ into several zones according to the strength of selection and to compute the integrals of the load statistics over each zone separately. The overall values of the load statistics are then given by summing the results over all zones. The details are given in Section 1 of the Appendix, and the computer code for generating the numerical results for this case is given in Supplementary File 2.

A finite island model metapopulation

In this case, a metapopulation of total size NT is divided into a large number d of subpopulations (demes), each of size N = NT/d. A Wright–Fisher model is assumed to apply to each deme. N is assumed to be sufficiently large that the frequency of the inversion is held at the same frequency x in all demes. A fraction m of each deme is derived by migration from a pool with equal contributions from all demes. Let the current mean frequency across all demes of the mutant allele A2 at a locus be q¯i for karyotype i, so that migrants contribute mq¯i to the new frequency of A2 among type i haplotypes within a deme and mp¯i to the new frequency of A1.

This model poses the problem that the evolutionary processes within demes change the values of the q¯i, so that they cannot realistically be treated as fixed quantities. Following previous treatments of this problem, it is assumed here that the process of change in the q¯i can be described by a pair of coupled diffusion equations, using the expectations of piqi/(2NT) over all d demes as the drift variance terms together with the corresponding expectations of the expressions for the deterministic changes in q¯i (Whitlock 2002; Cherry and Wakeley 2003; Roze and Rousset 2003; Wakeley 2003). The mutational contributions to the latter are simply obtained by substituting q¯i for qi in the standard formulae for the deterministic changes in the q¯i, since the mutational changes are linear in qi. The expectation of piqi over demes conditioned on q¯i can be written as p¯iq¯i(1 – FSTi), where FSTi is the variance among demes in qi divided by p¯iq¯i. The total effective population size for the metapopulation for karyotype i, Nemi, is thus equal to NT/(1 – FSTi) (Wakeley and Aliacar 2001).

The nonlinearity with respect to qi of the selection terms for allele frequency changes within demes (after division by piqi) when h ≠ 0.5 means that an exact closed expression for their contributions to the expected changes in the q¯i cannot be obtained, except in the absence of dominance (Whitlock 2002; Roze and Rousset 2003; Wakeley 2003). However, a useful approximation can be obtained by neglecting the 3rd moments about their means of the within-deme allele frequencies; these moments are necessarily smaller than the variances of the q¯i and will be considerably smaller when selection is sufficiently strong in relation to drift that the q¯i are close to 0. When selection is sufficiently weak, a neutral approximation for the 3rd moment can be used (Whitlock 2002), but this will in general overcorrect when selection is strong and so is not employed here. The error introduced by ignoring this correction affects only the small portion of the distribution of selection coefficients where s is O(1/2NT) and is thus unlikely to be important for the load and population genomic statistics calculated here.

The following expressions are obtained after some algebra:

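$$\Delta_s \bar{q}_1 \approx -s\,\bar{p}_1\bar{q}_1\left\{(1 - F_{ST1})\left[h + (1 - 2h)(x\bar{q}_1 + y\bar{q}_2)\right] + (1 - 2h)\,x\,F_{ST1}\,(1 - 2\bar{q}_1)\right\} \tag{11a}$$

$$\Delta_s \bar{q}_2 \approx -s\,\bar{p}_2\bar{q}_2\left\{(1 - F_{ST2})\left[h + (1 - 2h)(x\bar{q}_1 + y\bar{q}_2)\right] + (1 - 2h)\,y\,F_{ST2}\,(1 - 2\bar{q}_2)\right\} \tag{11b}$$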

We also have the following expression for the variance in q¯i due to drift, e.g. Whitlock (2002):

$$V_{\delta \bar{q}_i} = \frac{\bar{p}_i\bar{q}_i\,(1 - F_{STi})}{2N_{Ti}} \tag{12}$$

where NT1 = NTx and NT2 = NTy.
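The selection and drift terms above can be evaluated numerically as in the following minimal sketch. It assumes that the selection term has the form −s p̄q̄{(1 − FST)[h + (1 − 2h)(xq̄1 + yq̄2)] + (1 − 2h) × (arrangement frequency) × FST(1 − 2q̄)}, which is how Equations (11) are read here; the function names, argument names, and parameter values are hypothetical and are not taken from the paper's Supplementary File 3.

```python
def selection_change(q_own, q_other, freq_own, s, h, fst_own):
    """Approximate expected change in the mean mutant frequency due to selection.

    q_own, q_other: metapopulation mean mutant frequencies on the focal and
    the alternative arrangement; freq_own: frequency of the focal arrangement.
    The form of the expression is an assumed reading of Equations (11).
    """
    p_own = 1.0 - q_own
    # Mean mutant frequency among the haplotypes a focal haplotype pairs with.
    q_partner = freq_own * q_own + (1.0 - freq_own) * q_other
    # Term weighted by (1 - FST), and the extra term proportional to FST.
    first_term = (1.0 - fst_own) * (h + (1.0 - 2.0 * h) * q_partner)
    fst_term = (1.0 - 2.0 * h) * freq_own * fst_own * (1.0 - 2.0 * q_own)
    return -s * p_own * q_own * (first_term + fst_term)


def drift_variance(q_own, freq_own, fst_own, n_total):
    """Drift variance of the mean frequency, with N_Ti = n_total * freq_own."""
    return q_own * (1.0 - q_own) * (1.0 - fst_own) / (2.0 * n_total * freq_own)


# Illustrative values only (not taken from the paper's figures).
dq = selection_change(q_own=0.10, q_other=0.05, freq_own=0.1,
                      s=0.001, h=0.25, fst_own=0.05)
var = drift_variance(q_own=0.10, freq_own=0.1, fst_own=0.05, n_total=1e6)
print(dq, var)
```

With h = 0.5 the factor (1 − 2h) vanishes and the selection term reduces to −(s/2)p̄q̄(1 − FST), which is linear in the within-deme frequencies, in line with the statement that an exact expression exists in the absence of dominance.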

Combining Equations (11) and (12), and using the approach that led to Equation (8) and carrying out some rearrangements of terms, we obtain the equivalent of the ψ function for the panmictic case, which describes the selection contribution to the logarithm of the p.d.f. for the metapopulation. The details are given in Section 2 of the Appendix. Section 5 of Supplementary File 1 describes how the p.d.f. can be used to obtain the load statistics. The computer code for generating the numerical results is given in Supplementary File 3.

Results

General considerations

Intuitively, the mutational load associated with each homokaryotype in an inversion polymorphism would be expected to be strongly affected by the dominance coefficient (h), the scaled strength of selection (2Ns = γ), the mutation rate to deleterious mutations (u), the number of selected sites that it contains (ns), and its frequency (x). Dominance coefficients less than one-half are well known to be required for inbreeding depression and heterosis, whose magnitudes are inversely related to h (Charlesworth and Charlesworth 2010, Chap. 4). It would thus be expected that the reduction in fitness associated with an arrangement, and any fitness advantage to In/St heterokaryotypes, would decrease with h but increase with ns and u, if γ is sufficiently small that deleterious mutations are significantly affected by drift.

It is less clear how these fitness effects are related to γ, since stronger selection reduces the frequencies of deleterious mutations but also reduces their effects on fitness if they rise to high frequencies. However, if selection is so strong in relation to drift that allele frequencies are at mutation–selection equilibrium, no differentiation in allele frequencies between In and St will occur, removing any possibility of a selective advantage to heterokaryotypes (Sturtevant and Mather 1938; Nei et al. 1967). The mean fitnesses of both homokaryotypes with ns selected sites will then be equal to the deterministic value, exp(−2nsu) ≈ 1 − 2nsu, when fitnesses are multiplicative and h > 0 (Haldane 1937).

The rarer of the two arrangements experiences more genetic drift than its counterpart, so that x < 0.5 means that inversion homokaryotypes should have a lower overall mean fitness than standard homokaryotypes. It is less clear when an advantage to heterokaryotypes can be generated; the increased load associated with the rarer arrangement may simply generate a net selective advantage to its counterpart, especially when mutations are only partially recessive. Finally, it is likely that population subdivision will increase the magnitude of the mutational loads for each homokaryotype and the selective differences among the three karyotypes, because drift within demes in a metapopulation occurs at a faster rate than in a single population with the same size as the metapopulation. However, this is counterbalanced by a slower rate of drift for the metapopulation as a whole (Whitlock 2002; Cherry and Wakeley 2003; Wakeley 2003), so the net effect is hard to predict intuitively.

A single population: load statistics

Numerical results for a single randomly mating population are presented here. These are based on Equations (1) for individual selected sites together with the procedures for combining the effects of mutation, drift, and selection for all sites that were described above. The number of sites was set to 10⁵, corresponding to an inversion containing 100 genes with a mean of 1,000 nonsynonymous sites per gene. The expectations of the single-locus mutational loads (Li and L12) and the inbreeding loads (Bi), defined by Equations (1), are then multiplied by 10⁵ to obtain their net values. If multiplicative fitnesses are assumed, these quantities are equivalent to the negatives of the natural logarithms of the corresponding mean fitnesses (for the Ls) or differences in log mean fitnesses (for the Bs). The values of the corresponding mean fitnesses for a different number of sites, ns1, can be found by taking the exponentials of their negatives multiplied by ns1/10⁵. The selection coefficients ti against the 2 homokaryotypes are obtained by exponentiation of the product of ns1 and the expectation of Li − L12 (Equation 10). If these products are small, as is mostly the case in practice, ti ≈ ns E{Li − L12}. Heterokaryotype advantage requires both of the ti to be positive; if one is positive and the other negative, there is directional selection in favor of the karyotype with the negative ti.
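The rescaling to a different number of sites can be sketched as follows. The code assumes multiplicative fitnesses and takes 1 − ti = exp(−ns1 E{Li − L12}), which is one reading of the "exponentiation" described above; the function names and the per-site value used in the example are hypothetical.

```python
import math

REFERENCE_SITES = 10**5  # number of selected sites used for the net values in the text


def mean_fitness(net_load, n_sites, reference_sites=REFERENCE_SITES):
    """Mean fitness for n_sites sites, given a net load computed for reference_sites."""
    return math.exp(-net_load * n_sites / reference_sites)


def selection_coefficient(per_site_excess_load, n_sites):
    """t_i under the assumed reading 1 - t_i = exp(-n_sites * E{L_i - L_12})."""
    return 1.0 - math.exp(-n_sites * per_site_excess_load)


excess = 4e-8  # hypothetical per-site value of E{L_i - L_12}, for illustration only
t_small = selection_coefficient(excess, 10**5)
t_large = selection_coefficient(excess, 10**6)
# When the product is small, t_i is close to n_sites * excess.
print(t_small, 10**5 * excess)
print(t_large / t_small)
```

For small products the exact and linearized values are nearly identical, and a 10-fold increase in the number of sites gives a nearly 10-fold increase in ti.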

Figure 1 displays the values of the relevant load statistics and the selection coefficients against homokaryotypes as a function of the dominance coefficient h, for three different inversion frequencies and two different mean scaled selection coefficients (γ¯ = 1,000 and 4,000), using a gamma distribution of selection coefficients with shape parameter a = 0.3, and a mutation rate to deleterious alleles of u = 5 × 10⁻⁹. These values are broadly consistent with population genomic estimates from Drosophila melanogaster (Charlesworth 2015). A strong mutational bias to deleterious mutations of κ = 1.5 was assumed, in order to maximize the magnitude of the mutational loads and selection coefficients.

Fig. 1.

The mutational load statistics for a single population of size N = 10⁶ with 10⁵ selected sites, plotted against the dominance coefficient h. The results for 3 different frequencies of the inversion are shown. The selection coefficients follow a gamma distribution with a mean of 5 × 10⁻⁴ and a shape parameter of 0.3. The mutation rate to deleterious alleles is 5 × 10⁻⁹ per basepair, with a mutational bias toward deleterious variants of 1.5. The upper and lower panels have products of 2N and mean s of 4,000 and 1,000, respectively. The filled and half-filled symbols denote values for the In and St subpopulations, respectively. The lozenges are the net mutational loads within the respective karyotypes, and the crosses are the corresponding inbreeding loads. The triangles are the selection coefficients against homokaryotypes relative to the heterokaryotype. Only the values for In are shown when x = 0.5.

If there were no effects of drift, the net load (L) for both subpopulations would be approximately 2nsu = 0.001, and the net inbreeding load (B) would equal nsu(h⁻¹ − 2); see Charlesworth and Charlesworth (2010, Chap. 4). This gives B = 0.002 with h = 0.25 and 0.036 with h = 0.05, with zero selection coefficients for both homokaryotypes. The numerical results show that the loads for both In and St subpopulations are always substantially larger than the deterministic values, even with h = ½ (Supplementary Table 1). With x = 0.1 and γ¯ = 1,000, L1, the load for In, is more than 20-fold greater than the deterministic value for h between 0.05 and 0.5. Both L values are higher for the lower γ¯ values, as might be expected from the larger effects of drift in this case. Similarly, with x < ½, L1 is always larger than L2, and decreases with x for a given h and γ¯. Furthermore, L1 strongly decreases with h for x = 0.1 and 0.3, whereas L2 is only slightly affected by h. The fact that L1 > L2 with h = ½ and x < ½ shows that the increase in load due to smaller population size is partly caused by increased mean frequencies of deleterious mutations, not simply by increased frequencies of homozygotes.

In contrast, the Bi values are always lower than the deterministic values, reflecting the reduction in variability caused by drift, and are decreasing functions of h (vanishing when h = 0.5), although this is hard to see from Fig. 1 due to their overall very small values. These effects are especially noticeable for B1 when x = 0.1 and γ¯ = 1,000. The small Bi values reflect the fact that, except for the lowest dominance coefficient (h = 0.05), the Li are always quite close to the corresponding homozygous loads (the Hi of Equation 1c). For example, with γ¯ = 1,000 and x = 0.1, for h = 0.05, we have L1 = 0.212, H1 = 0.224, L2 = 0.002, H2 = 0.006; for h = 0.25, we have L1 = 0.057, H1 = 0.058, L2 = 0.002, H2 = 0.003.
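The comparison with the deterministic load can be checked with a few lines of arithmetic. The sketch below uses the parameter values stated in the text (ns = 10⁵, u = 5 × 10⁻⁹) and the L1 values quoted above for γ¯ = 1,000 and x = 0.1; the variable names are hypothetical.

```python
import math

n_sites = 10**5   # number of selected sites
u = 5e-9          # mutation rate to deleterious alleles per site

# Deterministic net load, 2 * ns * u, and the corresponding mean fitness.
deterministic_load = 2 * n_sites * u
deterministic_fitness = math.exp(-deterministic_load)

# L1 values quoted in the text for gamma-bar = 1,000 and x = 0.1.
l1_by_h = {0.05: 0.212, 0.25: 0.057}
fold_excess = {h: l1 / deterministic_load for h, l1 in l1_by_h.items()}

print(deterministic_load, deterministic_fitness)
print(fold_excess)
```

Both ratios are well above 20, in keeping with the statement that L1 is more than 20-fold greater than the deterministic value for these parameters.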

There can be substantial selection against the inversion homokaryotypes at the lower h values, especially for the smaller x values: t1 reaches 0.19 with x = 0.1, h = 0.05, and γ¯ = 1,000, but decreases sharply with x and h; it is only 0.04 for h = 0.25, x = 0.1, and γ¯ = 1,000. The values for γ¯ = 4,000 are somewhat smaller, consistent with smaller effects of drift in causing divergence between the 2 subpopulations. In contrast, t2 is mostly negative and quite small (approximately −0.01 for a wide range of h values when x = 0.1 and γ¯ = 1,000), indicating weak directional selection against the inversion. Only when x is close to 0.5 does the heterokaryotype experience a slight advantage over both homokaryotypes (a maximal value of t1 = t2 ≈ 0.004 for h = 0.05 and γ¯ = 1,000). With x = 0.5, the selective advantage to heterokaryotypes declines sharply with h, and is only 0.0016 for h = 0.25 and γ¯ = 1,000; it vanishes when h = 0.5. With 10⁶ instead of 10⁵ selected sites, the heterokaryotype advantage would be approximately 10 times as large. With x = 0.5, symmetry implies that t1 = t2, and the equality of effective population sizes means that the mean fitness of the heterokaryotypes is superior to that of the homokaryotypes purely because they have a lower frequency of mutant homozygotes than either of the homokaryotypic subpopulations.
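The rule used to interpret the two selection coefficients (heterokaryotype advantage only if both ti are positive; directional selection favoring the karyotype with the negative ti otherwise) can be written as a small helper. This is an illustrative sketch; the function name and the returned labels are hypothetical, and cases not covered by the two rules stated in the text are reported as such.

```python
def selection_regime(t1, t2):
    """Classify the selection regime from t1 (against In/In) and t2 (against St/St)."""
    if t1 > 0 and t2 > 0:
        return "heterokaryotype advantage"
    if t1 > 0 > t2:
        return "directional selection favoring St"
    if t2 > 0 > t1:
        return "directional selection favoring In"
    return "case not covered by the two rules"


# Approximate values quoted in the text for h = 0.05 and gamma-bar = 1,000.
print(selection_regime(0.19, -0.01))    # x = 0.1
print(selection_regime(0.004, 0.004))   # x = 0.5
```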

It is also of interest to examine the effect of differences in population size and selection coefficients on the load statistics for a constant scaled selection strength. Figure 2 is similar to Fig. 1, but with population sizes of 2 × 10⁶ and 10⁵, and a mean selection coefficient (0.001) that is one-half of that in Fig. 1. A comparison of the top panels of Figs. 1 and 2 shows that reducing the mean selection coefficient by one-half, but keeping γ¯ constant, results in a reduction in the L values, by a factor of close to 2 for the case of the In subpopulation with x = 0.1, but by considerably less for the St population, where L2 is only slightly greater than the deterministic value of 0.001, even for h = 0.05. The relative effect on a subpopulation is reduced as it increases in size, as might be expected intuitively. These effects reflect the fact that the load caused by drift is greater when the selection coefficients involved are larger (for fixed γ¯), in contrast to the deterministic formula L = 2u. The two selection coefficients on homokaryotypes relative to heterokaryotypes, ti, show a similar pattern. In contrast, the inbreeding loads are only slightly reduced by a reduction in the strength of selection when γ¯ is fixed. Further results for different deme sizes are shown in Supplementary Table 2.

Fig. 2.

The mutational load statistics for single populations of size N = 2 × 10⁶ (upper panel) and N = 10⁵ (lower panel), with the same mean selection coefficient (0.001), plotted against the dominance coefficient h. The results for 3 different frequencies of the inversion are shown. The other parameters are the same as in Fig. 1. The filled and half-filled symbols denote values for the In and St subpopulations, respectively. The lozenges are the net mutational loads within the respective karyotypes, and the crosses are the corresponding inbreeding loads. The triangles are the selection coefficients against homokaryotypes relative to the heterokaryotype. Only the values for In are shown when x = 0.5.

The effect of a reduced population size while holding mean s constant can be seen by comparing the upper and lower panels of Fig. 2. As might be expected from the greater effects of drift in a smaller population, the Li and ti are considerably larger when the population size is reduced by a factor of 20, especially for the inversion subpopulation with x = 0.1, for which N1 = 10,000 in the lower panel. However, with x = 0.5 and h = 0.25, the selective advantage to the heterokaryotypes is only 0.0003 for the smaller population size. The inbreeding loads are barely affected by the population size difference.
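The quantities compared across the two panels of Fig. 2 follow directly from the definitions γ¯ = 2N × (mean s) and N1 = Nx. The sketch below applies these definitions to the population sizes and mean selection coefficient stated for Fig. 2; the function names are hypothetical, and the scaled value for the lower panel is simply computed from the definition.

```python
def scaled_selection(pop_size, mean_s):
    """Mean scaled selection coefficient, gamma-bar = 2 * N * mean s."""
    return 2 * pop_size * mean_s


def arrangement_size(pop_size, frequency):
    """Size of the subpopulation carrying an arrangement at the given frequency."""
    return pop_size * frequency


mean_s = 0.001
for label, pop_size in (("upper panel", 2e6), ("lower panel", 1e5)):
    gamma_bar = scaled_selection(pop_size, mean_s)
    n_inversion = arrangement_size(pop_size, 0.1)
    print(label, gamma_bar, n_inversion)
```

Holding the mean s fixed while reducing N by a factor of 20 reduces γ¯ by the same factor, which is why the lower panel shows much stronger effects of drift.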

A single population: population genomics statistics

It is also of interest to examine the effects of an inversion polymorphism on population genomics statistics that can be used to assess the effects of the differences in effective population size between the In and St subpopulations, and between these and the part of the genome that is independent of the inversion polymorphism. Figure 3 shows the results for the St population using the same parameter values as in Fig. 1 for 3 different values of x; the case with x = 0.1 is nearly equivalent to the situation for the rest of the genome. It can be seen, somewhat surprisingly, that the mean frequency of deleterious alleles in the St subpopulation (q¯2, half-filled triangles), which includes both fixed and segregating sites, is nearly independent of h. It increases slightly as the size of the St population, given by N2 = N(1 – x), decreases as x changes from 0.1 to 0.5; the effect of N2 is somewhat greater for γ¯ = 1,000, as would be expected from the greater effects of drift in this case. This difference partly reflects the higher proportion of sites that can become fixed for deleterious mutations with weaker selection, as well as their higher mean frequency at segregating sites. Its behavior implies that the effect of population size on the load for a subpopulation when the mean strength of selection is strong is largely caused by changes in the variance of allele frequency rather than its mean.

Fig. 3.


The population genomics statistics for the St subpopulation, plotted against the dominance coefficient h. The evolutionary parameters are the same as in Fig. 1. The results for 3 different frequencies of the inversion are shown. The half-filled triangles are the mean frequencies of A2 and the half-filled lozenges are the mean diversities at selected sites. The filled triangles are the ratios of mean diversities at selected sites to mean diversities at neutral sites. The filled lozenges are the values of Δθw for the selected sites. Note that the distinction between filled and half-filled symbols does not refer to the In vs St subpopulations, as was the case for the previous figures.

The mean nucleotide site diversity at the selected sites for St (πsel2, half-filled lozenges) decreases somewhat with increasing h and decreasing N2, but is always between 0.0011 and 0.0025 for γ¯ = 4,000 and between 0.0017 and 0.0037 for γ¯ = 1,000. The ratio of πsel2 to the mean nucleotide site diversity at neutral sites (πsel2/πneut2, filled triangles) decreases with N2, especially when h is small. The measure of skew toward low-frequency variants at selected sites, Δθw2, for a sample of 20 haploid genomes (filled lozenges) increases with N2, but the effect is weak unless h is small; the effect of h on Δθw2 is smaller than on πsel2/πneut2, and plateaus around h = 0.25 (for the definition of Δθw, see A single population: modeling drift and selection).

These effects of h and N2 can also be seen in the right-hand panels of Fig. 4 (labeled as x = 0.5 vs x = 0.1), which plot the ratios of q¯2, πsel2, πsel2/πneut2, and Δθw2 with x = 0.5 to their values with x = 0.1, a 1.8-fold difference in N2. The plots for πsel2/πneut2 bring out clearly that a lower subpopulation size is associated with larger πsel2/πneut2, especially when h is small. For Δθw2, the ratio is approximately 0.85 for h = 0.05, but increases rapidly toward 1 as h increases. The 4-fold difference in γ¯ between the upper and lower panels has a remarkably small effect on the ratios for all 4 statistics.

Fig. 4.


The first 2 panels in each row show the ratios of the population genomics statistics for the In subpopulation to those for the St subpopulation, plotted against the dominance coefficient h. The third panel shows the ratios of these statistics for the St subpopulation with an inversion frequency of 0.5 to their values for a frequency of 0.1, for which the corresponding ratio of population sizes is 0.556. The evolutionary parameters are the same as in Fig. 1. The half-filled triangles are the ratios of the mean frequencies of A2; the half-filled lozenges are the ratios of the mean diversities at selected sites. The filled triangles are the ratios of mean diversities at selected sites to mean diversities at neutral sites. The filled lozenges are the ratios of Δθw values for the selected sites.

The other 2 panels of Fig. 4 display plots against h of the ratios of these statistics for the In subpopulation to their values for the St subpopulation for x = 0.1 and 0.3. The ratios of population sizes for St vs In are 9-fold for x = 0.1 and 2.3-fold for x = 0.3, flanking the ratio N2/N1 for the right-hand panels. N2 for x = 0.3 is 1.4-fold greater than N2 for x = 0.5, which enhances the contrast between In and St. Accordingly, the patterns are more marked than for the right-hand-most panels, and are strongest for the case with x = 0.1, with a maximal ratio of πsel1/πneut1 to πsel2/πneut2 of 2.4 at h = 0.05 and γ¯ = 4,000. This is still much less than the ratio of 9 for neutral diversity. The ratio Δθw1/Δθw2 with x = 0.1 is close to 0.5 for both γ¯ values when h = 0.05, but rapidly becomes close to 1 as h increases; for x = 0.1, h = 0.25, and γ¯ = 4,000, we have Δθw1/Δθw2 = 0.915. The skew in the site frequency spectrum at selected sites is thus unlikely to be a powerful statistic for detecting a reduced efficacy of selection on a low-frequency inversion, given that very small h values are unlikely for mildly deleterious mutations (Crow 1993; Manna et al. 2011).
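The population size ratios quoted for the panels of Fig. 4 follow from N1 = Nx and N2 = N(1 − x), so that the total size N cancels. A minimal sketch of the arithmetic, with hypothetical function names:

```python
def st_to_in_ratio(x):
    """Ratio N2/N1 of St to In subpopulation sizes for inversion frequency x."""
    return (1.0 - x) / x


def st_size_ratio(x_a, x_b):
    """Ratio of N2 at inversion frequency x_a to N2 at inversion frequency x_b."""
    return (1.0 - x_a) / (1.0 - x_b)


print(round(st_to_in_ratio(0.1), 3))      # St vs In, x = 0.1
print(round(st_to_in_ratio(0.3), 3))      # St vs In, x = 0.3
print(round(st_size_ratio(0.1, 0.5), 3))  # N2 at x = 0.1 vs N2 at x = 0.5
print(round(st_size_ratio(0.5, 0.1), 3))  # the inverse ratio, as in the Fig. 4 caption
print(round(st_size_ratio(0.3, 0.5), 3))  # N2 at x = 0.3 vs N2 at x = 0.5
```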

A subdivided population

As described in General considerations, A finite island model metapopulation, this case assumes that a metapopulation of total size NT is divided into a large number d of demes, each of size N = NT/d. A Wright–Fisher model applies to each deme, and the deme size is assumed to be sufficiently large that the frequency of the inversion is held at the same frequency x in all demes. A fraction m of each deme is derived by migration from a pool with equal contributions from all demes. For sites that are independent of the inversion, the level of neutral genetic differentiation between demes is described by FSTn ≈ 1/(1 + M), where M is the scaled migration rate 4Nm. Different levels of subdivision are characterized by different values of FSTn; for simplicity, the subscript n is dropped in what follows. The scaled selection parameter γ is now defined as 2NTs, and the scaled mutation parameters are α = 4NTu and β = 4NTv.
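The relation FSTn ≈ 1/(1 + M) with M = 4Nm can be inverted to find the migration fraction that corresponds to a given level of neutral differentiation. The sketch below treats the approximation as an equality, which is an assumption made for illustration; the function names are hypothetical, and the deme size of 5,000 is used only as an example value.

```python
def neutral_fst(deme_size, migration_fraction):
    """Approximate neutral FST for the island model: 1 / (1 + 4 N m)."""
    return 1.0 / (1.0 + 4.0 * deme_size * migration_fraction)


def migration_for_fst(deme_size, fst):
    """Migration fraction m giving the requested neutral FST (approximation inverted)."""
    scaled_migration = 1.0 / fst - 1.0  # M = 4 N m
    return scaled_migration / (4.0 * deme_size)


deme_size = 5000  # example deme size
for fst in (0.05, 0.25):
    m = migration_for_fst(deme_size, fst)
    print(fst, m, neutral_fst(deme_size, m))
```

For a fixed FST, a smaller deme size requires a proportionally larger m, because only the product Nm enters the approximation.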

Figure 5 shows the effects of population subdivision on the load statistics of most interest, together with the ratio of the diversities at selected sites for the In vs St subpopulations. There is an inversion frequency of 0.1 in a metapopulation of 200 demes with a total size NT = 10⁶ (N = 5,000); the selection and mutation parameters are the same as in Fig. 1. The results for FST = 0 were obtained from the single population calculations described above. The results can be summarized very simply: there is a remarkably small effect of population subdivision as FST changes from 0 to 0.25, with the most marked effect occurring over the change from FST = 0 to FST = 0.05, especially for the smallest dominance coefficient (h = 0.05). For example, with γ¯ = 1,000 and h = 0.05, the selection coefficients against the In and St homokaryotypes relative to the heterokaryotype change from t1 = 0.1868 and t2 = −0.0107 with FST = 0 to 0.211 and −0.0120 with FST = 0.05, reaching 0.250 and −0.0142 at FST = 0.25. With the more plausible value of h = 0.25, the changes are much smaller: t1 = 0.0405 and t2 = −0.0136 at FST = 0, and 0.0442 and −0.0146 at FST = 0.25. Both the absolute values of the loads and selection coefficients and their dependence on FST are much smaller with γ¯ = 4,000 than 1,000. As in the case of a single population, a selective advantage to the heterokaryotype is not found unless there are nearly equal frequencies of In and St (Supplementary Table 3). The magnitude of such an advantage is not greatly increased by subdivision; for example, with x = 0.5, γ¯ = 1,000 and h = 0.05, t1 = t2 = 0.0039 at FST = 0.0, and 0.00860 at FST = 0.25; with h = 0.25, the corresponding values are 0.00160 and 0.00196.

Fig. 5.

The mutational load and polymorphism statistics for an inversion polymorphism with inversion frequency x = 0.1 in a subdivided island population of total size NT = 10⁶ with 200 demes and 10⁵ selected sites, plotted against FST for neutral sites independent of the inversion. The results for 3 different values of the dominance coefficient, h, are shown. The selection coefficients follow a gamma distribution with a mean of 5 × 10⁻⁴ and a shape parameter of 0.3. The mutation rate to deleterious alleles is 5 × 10⁻⁹ per basepair, with a mutational bias toward deleterious variants of 1.5. The upper and lower panels have products of 2NT and mean s of 4,000 and 1,000, respectively. The filled and half-filled symbols denote values for the In and St subpopulations, respectively. The lozenges are the net mutational loads within the respective karyotypes, and the triangles are the selection coefficients against homokaryotypes relative to the heterokaryotype. The crosses are the ratios of the nucleotide site diversities for In vs St subpopulations.

The relation with FST of the ratio of diversities at selected sites for In vs St with x = 0.1 is also shown in Fig. 5, and is similarly rather weak. This finding also applies to the population genomic statistics that are not displayed in Fig. 5—see Supplementary Table 3 for details (the Δθw statistic was not calculated, as this statistic is hard to evaluate with population subdivision and was not very informative in the single population case). For example, with x = 0.1, γ¯ = 1,000 and h = 0.05, the ratio of πsel/πneut for In vs St decreases from 2.32 with FST = 0 to 1.80 with FST = 0.25, indicating a greater efficacy of selection on the smaller subpopulation when there is greater subdivision. With h = 0.25, the change is much smaller, from 1.97 to 1.90. Similarly, the ratio of the mean frequency of A2 for In vs St with h = 0.05 changes from 3.02 with FST = 0 to 3.78 with FST = 0.25, but only from 2.21 to 2.32 with h = 0.25.

These results are all for a relatively large deme size of N = 5,000. Intuitively, it might seem that reducing the deme size would enhance the effects of drift within demes, and lead to larger loads and magnitudes of the selection coefficients, as well as reducing the values of such signatures of purifying selection as the mean frequencies of mutant alleles and the ratios of diversities at selected sites to neutral sites. The example in the upper part of Table 2, where deme sizes of 500 and 5,000 are compared for the same neutral FST value, shows that this expectation is met for the case of equal frequencies of In and St. For the mean fitnesses of the homokaryotypes and the selection coefficients, the effects are small and are only visible in the table in a few cases. However, the mean frequencies of mutations and the ratios of diversity at selected sites vs neutral sites show a clear pattern of reduced efficacy of selection with the smaller deme size, which is reduced in magnitude by larger h and FST. Further results for cases with smaller deme sizes are shown in Supplementary Table 4.

Table 2.

Some load and population genomics statistics for a subdivided population of total size 10⁶ for 2 different local deme sizes.

x = 0.5

| FST, h | w¯1 | t1 = t2 | q¯1 = q¯2 | πsel1/πneut1 |
| --- | --- | --- | --- | --- |
| FST = 0.05, h = 0.05 | 0.992 / 0.992 | 0.005 / 0.005 | 0.090 / 0.088 | 0.341 / 0.325 |
| FST = 0.05, h = 0.25 | 0.994 / 0.998 | 0.002 / 0.002 | 0.087 / 0.085 | 0.262 / 0.254 |
| FST = 0.05, h = 0.45 | 0.995 / 0.995 | 0.0003 / 0.0003 | 0.086 / 0.083 | 0.229 / 0.226 |
| FST = 0.25, h = 0.05 | 0.987 / 0.987 | 0.090 / 0.083 | 0.101 / 0.099 | 0.263 / 0.258 |
| FST = 0.25, h = 0.25 | 0.994 / 0.998 | 0.002 / 0.002 | 0.089 / 0.087 | 0.238 / 0.232 |
| FST = 0.25, h = 0.45 | 0.995 / 0.995 | 0.0003 / 0.0003 | 0.086 / 0.084 | 0.228 / 0.221 |

x = 0.1

| FST, h | w¯1/w¯2 | t1 | t2 | q¯1/q¯2 | (πsel1/πneut1)/(πsel2/πneut2) |
| --- | --- | --- | --- | --- | --- |
| FST = 0.05, h = 0.05 | 0.780 / 0.780 | 0.210 / 0.211 | −0.012 / −0.012 | 3.23 / 3.29 | 2.17 / 2.04 |
| FST = 0.05, h = 0.25 | 0.944 / 0.944 | 0.042 / 0.042 | −0.014 / −0.014 | 2.26 / 2.29 | 1.93 / 1.88 |
| FST = 0.05, h = 0.45 | 0.971 / 0.970 | 0.016 / 0.016 | −0.013 / −0.013 | 1.93 / 1.96 | 1.88 / 1.85 |
| FST = 0.25, h = 0.05 | 0.739 / 0.739 | 0.250 / 0.250 | −0.014 / −0.014 | 3.72 / 3.78 | 1.83 / 1.88 |
| FST = 0.25, h = 0.25 | 0.941 / 0.942 | 0.045 / 0.042 | −0.015 / −0.015 | 2.29 / 2.32 | 1.91 / 1.90 |
| FST = 0.25, h = 0.45 | 0.969 / 0.970 | 0.017 / 0.016 | −0.014 / −0.014 | 1.93 / 1.96 | 1.90 / 1.86 |

Results for N = 500 and 5,000 are shown before and after the slash in each cell, respectively.

w¯1 and w¯2 are the mean fitnesses of the products of random mating among In and St karyotypes, respectively.

The selection and mutation parameters are the same as in Fig. 1.

The patterns are somewhat more complex, however, when there is a large difference in frequency between arrangements, as shown in Table 2 for the case of x = 0.1. Here, the differences in the load statistics between the large and small deme size cases are negligible. There is, however, a signature of a slightly enhanced efficacy of selection with the smaller deme size in the In subpopulation relative to St, indicated by a consistently smaller value of q¯1/q¯2 when N = 500. Confusingly, πsel1/πneut1 is greater than πsel2/πneut2 when N = 500, indicating the opposite pattern. The first of these results is explained by the fact that the much smaller size of the In subpopulation means that mutations are behaving nearly neutrally within demes, so that a lower deme size will not have much of an effect, whereas it will reduce the overall efficacy of selection on the St subpopulation, resulting in an increase in q¯2. For example, with h = 0.05 and FST = 0.05, the ratio of q¯2 for N = 500 vs q¯2 for N = 5,000 is 1.024, whereas the value for q¯1 is 1.006. The corresponding ratio for πsel1/πneut1 is 1.072, whereas that for πsel2/πneut2 is 1.005. This is entirely due to a higher ratio of πsel1 for N = 500 vs N = 5,000, since there is no effect of local deme size on neutral diversity for a fixed neutral FST. It is not entirely clear how to interpret this pattern, but one factor is likely to be the fact that the relaxation of the efficacy of selection in the St subpopulation leads to an increase in FST at selected sites, which would work against any increase in πsel2 due to a reduced efficacy of selection.

Discussion

It should be borne in mind that the results described above relate to a situation in which an inversion polymorphism is maintained by balancing selection that is invariant over space and sufficiently strong that the frequency of the arrangements is constant over time and space. This obviously does not apply to many natural situations, e.g. when there are clinal patterns of variation in inversion frequencies or temporal fluctuations in frequencies, as is often the case (Krimbas and Powell 1992; Kapun and Flatt 2019). Nevertheless, the theoretical results described above address several questions that can in principle be answered by comparisons with empirical studies. First, does a low-frequency arrangement accumulate a larger mutational load than its counterpart? Second, can mutational load contribute a significant selective advantage to heterokaryotypes, which might help to stabilize the polymorphism? Third, can differences in population genomic statistics between common and rare arrangements shed light on differences in mutational load between arrangements? These questions are each discussed in turn below. The focus will be on the results for a single population, since the analysis of the properties of a subdivided population showed that there were no major differences from those of a single population over the range of FST values examined, although subdivision tends to slightly magnify the differences between rare and common arrangements.

Are the relative values of the fitness components of carriers of different gene arrangements consistent with mutational load?

The relevant variables with respect to the relative fitnesses of the In and St subpopulations are the Li, the mutational loads associated with individuals produced by random mating among outbred individuals homozygous for karyotype i, where i = 1 for the rarer arrangement (In) and i = 2 for the more common one (St). These quantities are, of course, not directly observable, since the fitness of mutation-free individuals is unknown, but they can be used to predict the relative values of the mean fitnesses (or fitness components) for In and St. On the assumption of multiplicative fitnesses, the ratio of the mean fitnesses of type 1 and type 2 homokaryotypes is w¯1/w¯2 = exp(L2 − L1), corresponding to a difference L2 − L1 in the natural logarithm of mean fitness. The mutational and selection parameters used here (a mutation rate of 5 × 10⁻⁹ per site and a gamma distribution of selection coefficients with shape parameter 0.3) are consistent with the results of population genomic studies of D. melanogaster, e.g. Kousathanas and Keightley (2013) and Wang et al. (2023).

In order to relate the theoretical predictions to data on inversion polymorphisms, it is necessary to have information about h and the parameters of the distribution of s, which is made difficult by the fact that there is much uncertainty about the relation between h and s. The evidence from Drosophila studies of the effects of mutations on fitness components suggests that only very strongly selected deleterious mutations such as homozygous lethals have h values as low as 0.05, whereas the much more abundant deleterious mutations with s ≤ 0.02 have h values of the order of 0.25 or more (Crow 1993; Manna et al. 2011). It therefore seems safe to use h = 0.25 as a working value for comparing theory with data, since mutations like lethals that are strongly selected against when heterozygous will be held close to their deterministic equilibrium values in both the In and St subpopulations (Nei 1968), and hence will not contribute to the genetic differences between them. It is worth noting that a higher load for the rarer arrangement does not require mutations to have h < ½, contrary to what is often stated (e.g. Jay et al. 2021). If drift is sufficiently strong in relation to selection that there is a higher expected frequency of deleterious mutations in the smaller subpopulation, a higher load can arise even for completely dominant mutations.

Estimates of the distribution of γ = 2Nes from population genomics studies reflect the more weakly selected part of the distribution of selection coefficients and are thus the most useful source of information for the present purpose. With a shape parameter of 0.3 and h = 0.25, πsel2/πneut2 = 0.15 with γ¯ = 4,000 when x = 0.1. This is only slightly higher than the observed ratios of nonsynonymous to silent site diversities in normally recombining regions of the genome in putatively ancestral range populations of D. melanogaster (e.g. Campos et al. 2014), suggesting that this value of γ¯ can be used as a working estimate for nonsynonymous mutations.

One of the best-studied D. melanogaster inversions is In(3R)P. This has a frequency of 0.1 in populations in its ancestral range in Africa (Kapun and Flatt 2019), which are the most relevant populations for comparisons with the theoretical predictions. It is in the middle of the size range for polymorphic inversions in this species, covering approximately 8 Mb of sequence (Kapun et al. 2023). This corresponds to approximately 1,000 protein-coding sequences, i.e. 10⁶ nonsynonymous sites rather than the 10⁵ selected sites illustrated in the figures. Functional noncoding sequences may also contribute to the mutational load, and these appear to be under weaker selective constraints than nonsynonymous mutations (Andolfatto 2005; Casillas et al. 2007; Campos et al. 2017). As a rough estimate of their contribution to the load and population genomic statistics, it is plausible to assume that functional noncoding sites are three times as abundant as nonsynonymous sites (Halligan and Keightley 2006) but have γ¯ = 1,000 rather than 4,000. With x = 0.1 and h = 0.25, the results in Supplementary Table 1 imply that, by combining the effects of 10⁶ nonsynonymous sites with γ¯ = 4,000 and 3 × 10⁶ noncoding sites with γ¯ = 1,000 and h = 0.25, we would have L1 = 2.09 and L2 = 0.08 for an inversion with frequency 0.1, giving a predicted w¯1/w¯2 = 0.13.

Predictions of this kind are highly sensitive to the frequency of the inversion, and to assumptions concerning the abundance of functional sites and the strength and mode of selection. With the model just described and an inversion frequency of 0.3, L1 = 0.42 and L2 = 0.11, giving w¯1/w¯2 = 0.73. If only nonsynonymous sites are considered, w¯1/w¯2 rises to 0.69 for x = 0.1 and 0.95 for x = 0.3. Synergistic epistasis among deleterious mutations is expected to reduce the Li by a factor of approximately 2 (Kondrashov 1995; Charlesworth 2013), so that the model of a mixture of 10⁶ nonsynonymous and 3 × 10⁶ functional noncoding sites would generate w¯1/w¯2 values of 0.37 and 0.86 for x = 0.1 and x = 0.3, respectively. These complexities mean that it is difficult to make rigorous comparisons between theory and data, and the estimate of w¯1/w¯2 = 0.13 for In(3R)P is probably at the lower end of what is to be expected.
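The fitness ratios quoted above can be reproduced from the loads with a short calculation. The sketch below assumes that mean homokaryotype fitness is exp(−L), i.e. that fitness effects multiply across sites, which is consistent with the quoted numbers but is stated here as an assumption; the function name and the `epistasis_factor` argument are illustrative only.

```python
import math

def fitness_ratio(load_rare, load_common, epistasis_factor=1.0):
    """Ratio of mean fitnesses w1/w2, assuming w = exp(-L).

    epistasis_factor divides both loads; a value of 2 mimics the
    approximate halving of the loads under synergistic epistasis.
    """
    return math.exp(-(load_rare - load_common) / epistasis_factor)

# Loads quoted in the text for the mixed nonsynonymous + noncoding model.
ratio_x01 = fitness_ratio(2.09, 0.08)            # inversion frequency x = 0.1
ratio_x03 = fitness_ratio(0.42, 0.11)            # inversion frequency x = 0.3
ratio_x01_epi = fitness_ratio(2.09, 0.08, 2.0)   # loads halved
ratio_x03_epi = fitness_ratio(0.42, 0.11, 2.0)

print(round(ratio_x01, 2), round(ratio_x03, 2))          # 0.13 0.73
print(round(ratio_x01_epi, 2), round(ratio_x03_epi, 2))  # 0.37 0.86
```

Because the ratio depends exponentially on the difference in loads, modest changes in the assumed number of sites or strength of selection shift it substantially, which is the sensitivity noted in the text.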

A large number of studies of the effects of inversions on fitness and components of fitness under laboratory conditions have been published, with D. melanogaster and D. pseudoobscura being the most intensively studied species; see reviews by Sperlich and Pfriem (1986), Krimbas and Powell (1992), and Kapun and Flatt (2019). Many of these studies do not, however, provide useful information for the present purpose, either because they involve traits like male mating success that are hard to quantify in terms of relative fitness measures, because only small numbers of independently sampled arrangement haplotypes were used, or because samples from laboratory rather than natural populations were used. The most informative available estimates are shown in Supplementary File 4, with their standard errors where available.

For the two large-scale studies of egg-to-adult viability in D. melanogaster (Mukai and Yamaguchi 1974; Watanabe et al. 1976), where balancer crosses were used to extract 2nd or 3rd chromosomes from wild flies, the crosses involving pairs of independently extracted chromosomes were divided into cases where either each member of a pair was inversion-free or at least one member of each pair carried an inverted chromosome. There is no evidence for a difference between these categories in either study. One possible explanation for this discrepancy is the fact that a single component of fitness such as viability reflects only a portion of the net effect of deleterious mutations on fitness. Table S8 in Charlesworth (2015) presents estimates of the ratios of the fitness effects of deleterious mutation on various fitness components to their effects on net fitness (the α parameters). The estimates of α are subject to considerable uncertainty, but even the lowest estimate for viability (0.09) cannot explain the absence of effects on viability in these experiments.

Studies such as these that involve populations of D. melanogaster outside the ancestral range of the species in south-eastern Africa are difficult to interpret, because of the effects on variability of the severe population bottlenecks associated with the spread of the species out of Africa (e.g. Haddrill et al. 2005). Indeed, a curious feature of the data of Mukai and Yamaguchi (1974) on the 2nd chromosome is the much higher frequency of homozygous lethal In chromosomes than St chromosomes (54 vs 36%); this is associated with a higher frequency of pairs of crosses in which combinations of different lethal-bearing chromosomes were lethal (19 vs 11%), suggesting that there may have been a recent population size bottleneck that affected the In chromosomes more severely than the more abundant St chromosomes. Another possibility, proposed by a reviewer, is that high frequency lethals associated with the inversions were also associated with the Sd segregation distorter haplotype, as has been found in samples from US natural populations (reviewed by Larracuente and Presgraves 2012). Unfortunately, no information on the fitness effects of inversion genotypes appears to be available for samples from the ancestral range of D. melanogaster.

Species that are less subject to the problem of recent range expansion, such as D. pseudoobscura, are thus more favorable material. Estimates of the net relative fitnesses and of two measures of viability for D. pseudoobscura 3rd chromosome arrangements in population cages are also given in Supplementary File 4 (the horizontal lines separate data from different experiments). AR and CH cover approximately 30 and 20% of chromosome 3 (the homolog of chromosome arm 2R of D. melanogaster), respectively (Powell 1992), and are thus suitable for comparison with the theoretical prediction for In(3R)P given above. For the viability experiments of Dobzhansky (1947), involving CH and ST from a California population, CH had a frequency of between 0.20 and 0.35 in the original population, although this varied substantially over the year (Wright and Dobzhansky 1946, Fig. 2). The larval viability of CH/CH individuals was 86% of that of ST/ST individuals, which is reasonably consistent with expectations for an inversion with a frequency of around 0.3. The relative viabilities of CH/CH and AR/AR individuals were similar, as expected from the similar frequencies of CH and AR in this population.

The results for egg-to-adult viability measurements that used a balancer chromosome to extract 3rd chromosomes from a natural population of D. pseudoobscura (Crumpacker and Salceda 1968) show smaller effects than those predicted from mutational load; only the results for the 2 most common arrangements are shown in the table, with AR and CH having frequencies of approximately 0.5 and 0.28 in the population; the ratio of viabilities of CH/CH vs AR/AR in crosses between carriers of independently extracted chromosomes is 0.98, which is unlikely to differ significantly from 1.

The net fitness estimates obtained from population cage experiments on the natural population used for Dobzhansky's viability estimates, but which segregated for ST, AR, and CH (Wright and Dobzhansky 1946), show patterns that are inconsistent with the mutational load predictions; in particular, AR/AR has a much lower fitness than CH/CH despite their similar frequencies. In contrast, the ratio of net female fitnesses for CH/CH to AR/AR estimated by Anderson and Watanabe (1997) for a laboratory population derived from the same population but segregating only for CH and AR is 0.85, which is consistent with the fact that the population equilibrated at about 25% CH. However, the ratio for ST/ST vs AR/AR in another experimental population was 0.70, despite the fact that ST is usually at least as frequent as AR (Powell 1992).

The overall conclusion from these analyses of the relative performance of rare vs common arrangements is that some measurements fit the expectation of a larger equilibrium mutational load for the less frequent Drosophila inversions, but that the overall patterns imply that other factors obscure the contribution of load to homokaryotype fitnesses.

Can mutational load create a selective advantage to heterokaryotypes?

The theoretical results described earlier make it clear that the mutational load model used here can only create a heterokaryotypic advantage when In and St are present at nearly equal frequencies (see Figs. 1 and 2, and Supplementary Tables 1 and 2). Furthermore, the size of such an advantage is likely to be small, even in subdivided populations, except for large inversions. For example, with x = 0.5, h = 0.25 and 10⁵ selected sites, the selection coefficient against both homokaryotypes vs the heterokaryotype, as given by Equations (3), is 0.0016 with γ¯ = 1,000, and 0.0010 with γ¯ = 4,000. However, if the above model of 3 × 10⁶ sites with γ¯ = 1,000 and 10⁶ sites with γ¯ = 4,000 is used, the selection coefficient becomes 0.057. With x = 0.4, the selection coefficients under this model are t1 = 0.13 for In/In and t2 = 0.0013 for St/St. For lower values of x, the selective advantage to In/St heterokaryotypes over St/St is replaced by a selective disadvantage, as shown in Figs. 1 and 2.

The reason for this behavior is that a rarer arrangement accumulates a larger mutational load than its counterpart, due to the lower efficacy of selection with smaller Ne for mildly deleterious, partially recessive mutations (see Introduction). When a haplotype from the In population is made heterozygous with a haplotype from the St subpopulation, there is a smaller expected number of heterozygous mutations in the In/St individuals than in the In subpopulation. This means that t1 > 0 if h < ½, as is evident from Equation (3a), where F1, 1 – F1, and −C12 are all positive, as is the difference in the expected frequency of mutations, 〈δq〉, between the In and St subpopulations. In contrast, In/St individuals have a larger expected number of mutations than the St subpopulation. As can be seen from Equation (3b), if h < ½ and the magnitude of 〈δq〉 is sufficiently large, t2 is negative. But if the 2 arrangements are equally frequent, 〈δq〉 = 0, and the remaining terms guarantee that t1 = t2 > 0 when h < ½. In all cases, t1 = t2 = 0 if h = ½. The only surprising aspect of the theoretical results is that t2 is so sensitive to the effect of the relative subpopulation sizes on 〈δq〉.

These theoretical predictions can be compared with the data in Supplementary File 4. The 2 studies of viability in D. melanogaster showed no difference in viability between chromosomal heterozygotes that were free of inversions and chromosomal heterozygotes where at least 1 of the pairs of chromosomes involved carried an inversion. Since the inversions were rare, most of the latter cases will have involved heterokaryotypes, so the lack of any difference is consistent with an absence of heterokaryotypic superiority, as expected for rare inversions. For D. pseudoobscura, the results on net fitness and on viability from population cage experiments indicate strong heterokaryotypic advantages, much larger than the theoretical predictions for inversions of the size involved here. In contrast, the balancer cross data on viability showed a small heterokaryotypic advantage (Crumpacker and Salceda 1968), consistent with the theoretical predictions for equally frequent arrangements. Mérot et al. (2020), Huang et al. (2022), and Pei et al. (2022) found no evidence for heterokaryotypic superiority for several fitness components in seaweed flies, sunflowers, and zebra finches, respectively. Overall, it seems likely that the mutational load model may contribute modestly to heterokaryotypic superiority for inversions that are at intermediate frequencies, but cannot explain the large net fitness effects seen in the D. pseudoobscura or C. frigida inversions. This parallels the finding that mutational load is unlikely to provide a selective advantage to new autosomal inversions in randomly mating populations (Nei et al. 1967; Connallon and Olito 2021; Jay et al. 2022).

Population genomic indicators of a reduced efficacy of selection on low-frequency arrangements

As described in A single population: population genomics statistics, the two most informative population genomic statistics concerning a reduced efficacy of selection on a low-frequency arrangement are the mean frequencies of variants at the selected sites (the q¯i) and the ratios of diversities at selected and neutral sites (the πseli/πneuti). The measure of skew toward low-frequency variants relative to neutral expectation (Δθwi) is much less sensitive to subpopulation size, unless the dominance coefficient is implausibly small. In addition, measures of skew are sensitive to population size changes (Tajima 1989), and must be treated with caution when making inferences about selection. For this reason, only the other two statistics will be considered here.

A problem with using the q¯i is that these are not directly observable unless bioinformatic methods for classifying variants as deleterious are used; simply using the mean frequency of derived nonsynonymous variants in a sample as a proxy (cf., Campos et al. 2014) is not necessarily adequate when selection is weak, since it does not take into account fixed sites. Stenløk et al. (2022) used PROVEAN scores to estimate the mean numbers of deleterious missense mutations in inverted and standard arrangements of Atlantic salmon, but found no significant differences; an enrichment of small indels in the large (3.09 Mb) Chr18 inversion was, however, detected. The frequencies of this inversion are, however, highly variable between populations, so it is not clear how to interpret this difference.

There are, however, theoretical reasons for expecting the ratio q¯1/q¯2 to behave similarly to the ratio Rπ = (πsel1/πneut1)/ (πsel2/πneut2), at least when h is not too small. Welch et al. (2008) analyzed the properties of the ratio of selected site to neutral site diversity under a similar model to that used here, assuming h = 0.5 and a gamma distribution of selection coefficients (see also James et al. 2017). With γ¯ >> 1/a, where a is the shape parameter of the gamma distribution in Equation (9), the following approximate relation holds:

$$\ln\left(\frac{\pi_{sel}}{\pi_{neut}}\right) \approx k - a\,\ln(\pi_{neut}) \tag{13a}$$

where k is a constant that depends on the effective population size and the mean strength of selection.

In the present case, this relation implies that

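$$R_{\ln(\pi)} = \ln\left(\frac{\pi_{sel1}\,\pi_{neut2}}{\pi_{sel2}\,\pi_{neut1}}\right) \approx a\,\ln\left(\frac{\pi_{neut2}}{\pi_{neut1}}\right) \tag{13b}$$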

While we would not expect this relation to be exact for h < 0.5, it is plausible to assume that it would be a reasonably good approximation when h is not too close to 0. The argument used by Welch et al. (2008) also implies that a similar relation should apply to the mean frequencies of deleterious mutations:

$$R_{\ln(q)} = \ln\left(\frac{\bar{q}_1}{\bar{q}_2}\right) \approx a\,\ln\left(\frac{\pi_{neut2}}{\pi_{neut1}}\right) \tag{13c}$$

The accuracy of these approximations can be tested using the numerical results in Supplementary Tables 1 and 2. Table 3 gives some examples for the case of a single randomly mating population, showing that the two ratios on the left-hand sides of Equations (13b) and (13c) behave very similarly as functions of h, with Rln(q) > Rln(π) when h < 0.25, approaching the prediction on the right-hand sides of the equations for h ≥ 0.35. Similar results apply to the case of a subdivided population.

Table 3.

Values of the ratios with respect to In vs St of the natural logarithms of πsel/πneut (Rln(π)) and q¯ (Rln(q)) for the case of a single population.

| h | γ¯ = 1,000, x = 0.1: Rln(π) | γ¯ = 1,000, x = 0.1: Rln(q) | γ¯ = 1,000, x = 0.3: Rln(π) | γ¯ = 1,000, x = 0.3: Rln(q) | γ¯ = 4,000, x = 0.1: Rln(π) | γ¯ = 4,000, x = 0.1: Rln(q) | γ¯ = 4,000, x = 0.3: Rln(π) | γ¯ = 4,000, x = 0.3: Rln(q) |
|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.840 | 1.106 | 0.395 | 0.485 | 0.884 | 1.110 | 0.403 | 0.487 |
| 0.15 | 0.720 | 0.907 | 0.313 | 0.393 | 0.744 | 0.909 | 0.319 | 0.394 |
| 0.25 | 0.679 | 0.794 | 0.282 | 0.329 | 0.694 | 0.795 | 0.286 | 0.330 |
| 0.35 | 0.661 | 0.718 | 0.267 | 0.285 | 0.675 | 0.719 | 0.270 | 0.285 |
| 0.45 | 0.653 | 0.658 | 0.257 | 0.250 | 0.665 | 0.659 | 0.261 | 0.250 |
| 0.50 | 0.650 | 0.632 | 0.254 | 0.235 | 0.662 | 0.633 | 0.257 | 0.235 |

The predicted values of the ratios with the shape parameter a = 0.3 are 0.659 for x = 0.1 and 0.254 for x = 0.3.

The mutational parameters and population size are the same as in Fig. 1.
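The predicted values quoted under Table 3 follow from the right-hand sides of Equations (13b) and (13c). The sketch below assumes that neutral diversity is proportional to subpopulation size, so that πneut2/πneut1 = y/x with y = 1 − x; the function name is illustrative.

```python
import math

def predicted_log_ratio(x, shape):
    """Right-hand side of Equations (13b)/(13c): a * ln(pi_neut2 / pi_neut1).

    Assumes pi_neut2 / pi_neut1 = y / x, where x is the frequency of the
    rarer arrangement (In) and y = 1 - x is the frequency of St.
    """
    y = 1.0 - x
    return shape * math.log(y / x)

for x in (0.1, 0.3):
    print(x, round(predicted_log_ratio(x, shape=0.3), 3))
# 0.1 0.659
# 0.3 0.254
```

These are the values that the tabulated Rln(π) and Rln(q) approach as h increases toward 0.5.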

These results suggest that Rπ can be used as a conservative proxy for q¯1/q¯2, at least for the case of a gamma distribution of selection coefficients with γ¯ >> 1/a and an intermediate dominance coefficient. An objection to using the correlation between Rln(π) and a measure of neutral diversity such as synonymous site diversity to investigate whether the efficacy of purifying selection declines with Ne is that πneut is the denominator of πsel/πneut. In addition to the statistical problem of a negative correlation introduced by this relationship, discussed by James et al. (2017), sites under sufficiently strong purifying selection could maintain a constant diversity across different Ne values (Campos et al. 2014). If this were the case, the expectation of πsel/πneut would simply be proportional to 1/πneut, and we would then have Rπ = πneut2/πneut1. As described in A single population: population genomics statistics, this is not the case with the model used here; Rπ with x < 0.5 is always less than y/x. For example, with γ¯ = 4,000 and h = 0.25, Rπ = 2.00 for x = 0.1 (y/x = 9) and Rπ = 1.33 for x = 0.3 (y/x = 2.33). This indicates that Rπ provides a signal that purifying selection is weakened by smaller subpopulation size. Comparisons of this kind could easily be done using real data.
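The comparison of Rπ with y/x can be made explicit by exponentiating the Rln(π) entries of Table 3. The sketch below uses the two tabulated values for γ¯ = 4,000 and h = 0.25; the function names are illustrative, and the only assumption is that Rπ = exp(Rln(π)), which follows from the definition of Rln(π) as the natural logarithm of Rπ.

```python
import math

def r_pi_from_log(r_ln_pi):
    """Convert the log ratio R_ln(pi) back to the diversity ratio R_pi."""
    return math.exp(r_ln_pi)

def constant_diversity_expectation(x):
    """Value of R_pi if selected-site diversity did not depend on Ne: y / x."""
    return (1.0 - x) / x

# (x, R_ln(pi)) pairs from Table 3 for mean gamma = 4,000 and h = 0.25.
table_entries = [(0.1, 0.694), (0.3, 0.286)]

for x, r_ln_pi in table_entries:
    r_pi = r_pi_from_log(r_ln_pi)
    upper = constant_diversity_expectation(x)
    print(x, round(r_pi, 2), round(upper, 2))
# 0.1 2.0 9.0
# 0.3 1.33 2.33
```

In both cases Rπ falls well below y/x, which is the pattern expected when selected-site diversity responds to subpopulation size rather than staying constant.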

Overall, therefore, these considerations suggest that, despite the above reservations, Rπ is quite a useful index of the efficacy of purifying selection, and that one might expect Rln(π) to be approximately equal to a ln(πneut2/πneut1). A related principle was used by James et al. (2017) and Castellano et al. (2018) to test for relations between the efficacy of purifying selection and Ne for animal mitochondrial genomes and different regions of the D. melanogaster nuclear genome, respectively. Unfortunately, there appears to be relatively little relevant information for autosomal inversions, other than cases such as the mimicry supergene in Heliconius numata (Jay et al. 2021) and the behavioral supergene of the white-throated sparrow Zonotrichia albicollis (Jeong et al. 2021), which are largely maintained as heterozygotes due to negative assortative mating. These systems are analogous to sex chromosomes, where one arrangement is permanently heterozygous and effectively lacks recombination. There is thus likely to be intense Hill–Robertson interference (Charlesworth and Charlesworth 2000), which would greatly reduce the efficacy of selection below the simple effect of a lower subpopulation size. This is consistent with the strongly elevated πsel/πneut values found for the H. numata inversions; Z. albicollis showed, however, only a modest effect. Jay et al. (2021) also found a large increase in the density of transposable elements (TEs) in the H. numata inversions and interpreted this as evidence for an increased mutational load. The accumulation of TEs in low recombination regions of genomes, including low-frequency Drosophila inversions (Sniegowski and Charlesworth 1994), has long been documented (Charlesworth et al. 1994). Most insertions are found in intergenic regions, where direct selective effects are likely to be weak, and where ectopic exchange inducing deleterious chromosome rearrangements is probably a major factor in causing their elimination. It is thus likely that a reduced frequency of ectopic recombination is the major factor in causing higher densities of TE insertions in such cases (Charlesworth et al. 1994), so that this phenomenon cannot be taken as evidence for an increased mutational load.

What strength of selection has the main effect on the load and population genomic statistics?

Another question raised by the theoretical results is: what part of the distribution of selection coefficients contributes to the differences between the In and St subpopulations? Stronger selection reduces the frequencies of deleterious mutations and their chances of fixation within a subpopulation, but also increases the sizes of any resulting loads. The major contribution to the relevant load statistics is thus likely to come from selection coefficients that are neither too large nor too small. This expectation can be tested by examining the contributions from the different zones described in A single population: obtaining the mutational load and population genomic statistics. These are shown in Supplementary Table 1 for the case of a single population. These results show that the major contributions to the Li and ti come from zone 2a, defined by an intermediate intensity of selection (for more details, see Supplementary Table 5). For example, for a randomly mating single population with γ¯ = 4,000, x = 0.1, and h = 0.25, zones 1, 2a, 2b, and 3 contribute 4, 36, 24, and 36%, respectively, to the distribution of selection coefficients against deleterious mutations. Zone 2a alone gives L1 = 0.0375, L2 = 0.0012, t1 = 0.0270, and t2 = −0.0090, compared with net values of L1 = 0.0381, L2 = 0.0016, t1 = 0.0271, and t2 = −0.0090. It covers the interval (0.278, 463) of γ for the whole population.

The finer dissection of the distribution of selection coefficients used in the case of a subdivided population reveals that the so-called quasineutral zone 2 in this case (see Section 5 of Supplementary File 1) contributes most to the load statistics. For example, with γ¯ = 4,000, x = 0.1, h = 0.25, and FST = 0.05 (M = 19), zones 1, 2, 3, and 4 for the subdivided case contribute 4, 17, 21, and 58%, respectively. Zone 2 covers the γ interval (0.278, 55.6), and contributes L1 = 0.0344, L2 = 0.0011, t1 = 0.0249, and t2 = −0.0082, compared with values of L1 = 0.0401, L2 = 0.0019, t1 = 0.0285, and t2 = −0.0094 for the whole distribution. In this case, with 200 demes of size 5,000 each, both the In and St subpopulations behave as effectively neutral within demes for this part of the distribution of selection coefficients. A similar pattern holds even with FST = 0.25. Nevertheless, the results are very similar to those for the single population, implying that the expectations of the population genetic parameters for an island model with a relatively low level of isolation between demes are mainly controlled by the size of the metapopulation, as was shown to be the case analytically by Wakeley (2003). Intuitively, this reflects the fact that, under these conditions, a mutation spends relatively little of its total sojourn time in the deme in which it arose.

Confounding factors

The present study assumes that selected sites are at statistical equilibrium under mutation, selection and drift, an absence of recombinational exchange between In and St in heterokaryotypes, and complete independence among sites under selection. These assumptions are likely to be violated in many real-life situations. First, consider the question of departure from equilibrium. If the inversion arises as a unique mutational event, as the available evidence seems to suggest (reviewed in Charlesworth 2023), the In subpopulation will initially completely lack genetic variability, and much time will be needed for equilibrium to be approached. The St subpopulation can be assumed to be close to equilibrium initially and will thus approach its new equilibrium much faster than the In subpopulation, so that only the latter need be considered here. For this subpopulation, the magnitudes of the Li, Bi, and ti will be below their equilibrium values for a long time after the inversion has approached its equilibrium frequency under balancing selection.

It is difficult to make exact predictions about the rate of approach to statistical equilibrium when both drift and selection play a role, which has been shown above to be the situation that contributes the most to the selective differences among karyotypes. For the limiting case of complete neutrality, it is known that the divergence of the expected nucleotide site diversity from its equilibrium value at time t is equal to the product of its initial value and exp(−t/2Ne) in a randomly mating population (Malécot 1969, p. 40), so that the timescale for approach to equilibrium is of the order of 2Ne generations. For the other limiting case of fully deterministic evolution, with γ >> 1, the divergence at time t of the mutant allele frequency q from its equilibrium value of u/hs is approximately equal to the product of its initial value and exp(−hst). For the intermediate situation when γ is not much greater than 1, the two measures of the rate of approach to equilibrium are not very different, so it is plausible to assume that the true rate lies between them. The time needed to approach equilibrium with respect to mutations that have the largest effect on the load statistics is thus likely to be < 2Nx generations under a Wright–Fisher model. For a population with an Ne of 10⁶ and x = 0.1, this would correspond to 10⁵ generations, i.e. about 10,000 years for a species like D. melanogaster with approximately 10 generations per year. For x = 0.5, about 50,000 years would be required.
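The statement that the two limiting rates are similar when γ is not much greater than 1 can be illustrated numerically. The sketch below assumes that the neutral deviation decays as exp(−t/2Ne) and the deterministic deviation as exp(−hst), so that the corresponding timescales are 2Ne and 1/(hs) generations; with γ = 2Nes their ratio is 1/(hγ). The function names and the value of Ne are hypothetical.

```python
def neutral_timescale(ne):
    """Generations for neutral diversity to relax: deviation ~ exp(-t / (2 * ne))."""
    return 2.0 * ne

def deterministic_timescale(h, s):
    """Generations for allele frequency to relax: deviation ~ exp(-h * s * t)."""
    return 1.0 / (h * s)

def timescale_ratio(h, gamma):
    """Deterministic over neutral timescale, using gamma = 2 * ne * s."""
    return 1.0 / (h * gamma)

ne = 1.0e5   # hypothetical effective size of the In subpopulation
h = 0.25     # dominance coefficient used in the text's examples
for gamma in (1.0, 4.0, 40.0):
    s = gamma / (2.0 * ne)
    direct = deterministic_timescale(h, s) / neutral_timescale(ne)
    print(gamma, direct, timescale_ratio(h, gamma))
```

With h = 0.25, the two timescales coincide at γ = 4 and differ by a factor of four at γ = 1, whereas for γ = 40 the deterministic timescale is ten times shorter than the neutral one.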

A problem with assessing whether these timescales are consistent with data on inversion polymorphisms is that the occurrence of recombinational exchange between In and St due to gene conversion and/or double crossing over (reviewed by Krimbas and Powell 1992; Korunes and Noor 2019), means that estimates of inversion age based on sequence divergence between different arrangements tend to produce underestimates of the age of the derived arrangement (Charlesworth 2023). A variety of lines of evidence suggests, however, that inversions such as In(3R)P are close to selection-drift-recombination equilibrium with respect to neutral variability (Charlesworth 2023); since selection against deleterious mutations will cause a faster approach to equilibrium than with neutrality, it is likely that the load statistics will also be close to their equilibrium values, unless demographic disturbances have caused serious perturbations.

However, the theory developed here for predicting mutational loads has ignored recombination. If the estimate of a typical rate of gene conversion of 10⁻⁵ in female meiosis in inversion heterokaryotypes in Drosophila (Chovnick 1973; Korunes and Noor 2019) is accepted, the effective rate is 0.5 × 10⁻⁵ due to the absence of exchange in males. An hs value that somewhat exceeds 10⁻⁵ would thus be sufficient to overcome the effects of gene flow between In and St; with h = 0.25 and NT = 10⁶, this would correspond to γ > 80, which lies outside the range of γ values that contribute to a noticeable difference in load statistics between In and St with x = 0.1, as discussed in the previous section. It is therefore likely that recombination will significantly reduce such differences, providing another reason for regarding the above estimates as providing upper bounds to the predictions.
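The threshold of γ > 80 follows directly from the quoted values. The sketch below assumes the scaling γ = 2NTs used elsewhere in the paper and treats an hs of 10⁻⁵ as the threshold; the function name is illustrative.

```python
def scaled_selection_from_hs(hs, h, n_total):
    """Scaled selection coefficient gamma = 2 * N_T * s, where s = hs / h."""
    s = hs / h
    return 2.0 * n_total * s

# Values from the text: hs = 1e-5, h = 0.25, N_T = 1e6.
gamma_threshold = scaled_selection_from_hs(hs=1.0e-5, h=0.25, n_total=1.0e6)
print(round(gamma_threshold, 6))  # approximately 80
```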

In addition, the fitness effects of any deleterious mutations that were present on the initial inversion haplotype have been ignored; however, the final equilibrium state considered here, where reverse mutations have been included, means that such effects will have been removed. During the approach to equilibrium they must, of course, play a role in reducing any selective advantage to a new inversion (Nei et al. 1967; Connallon and Olito 2021; Jay et al. 2022).

The low frequency of recombination in inversion heterokaryotypes for a rare inversion may create Hill–Robertson interference effects, as noted above in Population genomic indicators of a reduced efficacy of selection on low-frequency arrangements. Berdan et al. (2021) used simulations to study this possibility, and found strong interference effects under the assumption that mutations were completely recessive. As pointed out previously, this assumption is implausible, even for mutations with large homozygous fitness effects. In addition, recombination between In and St will reduce interference effects. These will tend to increase the load in the less frequent subpopulation and in heterokaryotypes, and so will not contribute to the maintenance of the inversion polymorphism. There is little evidence for any effects of Hill–Robertson interference on the available population genomic statistics for the D. melanogaster inversions (Charlesworth 2023).

Conclusions

The theoretical results described here show that a long-maintained autosomal inversion polymorphism with no recombination in heterokaryotypes may develop a substantially higher mutational load for the less frequent arrangement. The magnitude of the difference between arrangements can be large for rare polymorphic inversions of the size usually encountered in Drosophila populations, but declines quickly as the frequency of the rare arrangement increases. It is also strongly influenced by the abundance of relatively weakly selected noncoding sequences, since drift acts more strongly on these than on strongly selected nonsynonymous mutations. A selective advantage to heterokaryotypes is only expected when the two alternative arrangements are nearly equal in frequency, and is likely to be small even in this case. Experiments on the effects of several Drosophila inversion polymorphisms on fitness components give inconsistent results, although mutational load may contribute to some of the effects that have been detected. It should also be possible to detect molecular signatures of an increased load, such as an enhanced ratio of nonsynonymous to synonymous nucleotide site diversities, but the data are currently too scanty to draw firm conclusions. The effects of recombinational exchange in heterokaryotypes and Hill–Robertson interference, which oppose each other, were ignored here, and deserve further study.

Supplementary Material

iyad218_Supplementary_Data

Acknowledgments

I thank Deborah Charlesworth, Tim Connallon, Thomas Flatt and Michael Whitlock for their helpful comments on a draft of this paper. The comments of two anonymous reviewers considerably improved the final version.

Appendix

Load calculations for a single population

Zone 1 corresponds to a quasineutral region where γ is sufficiently small (e.g. 0.25) that the selection term in Equations (5) can be ignored. The two qis can be then treated as independent variables, each following a beta distribution with parameters αi and βi. In this case, we have

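$$\bar{q}_i = \frac{\alpha_i}{\alpha_i + \beta_i} = \frac{u}{u + v} \tag{A.1a}$$
$$F_i = \frac{1}{1 + \alpha_i + \beta_i} \tag{A.1b}$$
$$\pi_i = 2\,\overline{p_i q_i} = \frac{2\alpha_i \beta_i}{(\alpha_i + \beta_i)^2}\,(1 - F_i) \tag{A.1c}$$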

where Equation (A.1c) gives the expected nucleotide site diversity for subpopulation i. In most applications, we have αi, βi << 1, so that this equation can be approximated by

$$\pi_i \approx \frac{2\alpha_i \beta_i}{\alpha_i + \beta_i} = 4N_i\tilde{u} \tag{A.1d}$$

where ũ = 2κv/(1 + κ) is the net mutation rate at base composition equilibrium (Charlesworth and Charlesworth 2010, p. 274). This is identical in form to the infinite sites model formula for nucleotide site diversity, which does not explicitly take reverse mutations into account (Kimura 1971).
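The quality of the approximation in Equation (A.1d) can be checked numerically. The sketch below uses the standard beta-distribution result E[2q(1 − q)] = 2αβ/[(α + β)(1 + α + β)] for the exact diversity, and assumes αi = 4Niu, βi = 4Niv and u = κv, which are the relations that make the two sides of (A.1d) agree; all numerical values and function names are hypothetical.

```python
def exact_diversity(alpha, beta):
    """Exact E[2q(1 - q)] when q follows a beta(alpha, beta) distribution."""
    total = alpha + beta
    return 2.0 * alpha * beta / (total * (1.0 + total))

def approximate_diversity(n_i, v, kappa):
    """Approximation 4 * N_i * u_net with u_net = 2 * kappa * v / (1 + kappa)."""
    u_net = 2.0 * kappa * v / (1.0 + kappa)
    return 4.0 * n_i * u_net

# Hypothetical parameter values, chosen only so that alpha and beta << 1.
n_i = 1.0e5      # size of subpopulation i
v = 1.0e-9       # mutation rate in one direction
kappa = 2.0      # mutational bias, so that u = kappa * v (assumption)
u = kappa * v

alpha = 4.0 * n_i * u   # assumed scaling of the beta parameters
beta = 4.0 * n_i * v

exact = exact_diversity(alpha, beta)
approx = approximate_diversity(n_i, v, kappa)
print(exact, approx, abs(exact - approx) / approx)
```

The relative error is of the order of αi + βi, so the approximation is accurate whenever both scaled mutation rates are small.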

These expressions can be inserted into Equations (1). For this zone, the exponential term in γ in the gamma distribution p.d.f. can be neglected. As it is assumed that x ≤ 0.5, the focus is on ensuring that the St population behaves as neutral. As shown in Section 3 of Supplementary File 1, if the upper bound to γy (the scaled selection coefficient for St in zone 1) is denoted by γc1, the probability Pz1 that γy ≤ γc1 and the integral of γy over this zone, I1, are given by

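$$P_{z1} \approx \Gamma(a)^{-1}\,(\bar{\gamma}y)^{-a}\,a^{a-1}\,\gamma_{c1}^{\,a} \tag{A.2a}$$
$$I_1 \approx a\,(a + 1)^{-1}\,\gamma_{c1}\,P_{z1} \tag{A.2b}$$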
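As a check on Equations (A.2a) and (A.2b), the zone 1 probability can be evaluated for the example discussed in the main text (γ¯ = 4,000, x = 0.1, a = 0.3, γc1 = 0.25). The sketch below reads (A.2a) as the integral of a gamma p.d.f. with shape a and mean γ¯y from 0 to γc1 with the exponential term neglected, which gives (aγc1/γ¯y)^a/Γ(a + 1); this reading and the function names are assumptions.

```python
import math

def zone1_probability(mean_gamma, y, shape, gamma_c1):
    """Approximate P(gamma * y <= gamma_c1), neglecting the exponential term.

    gamma * y is taken to be gamma distributed with the given shape and
    mean mean_gamma * y, so its rate parameter is shape / (mean_gamma * y).
    """
    rate = shape / (mean_gamma * y)
    return (rate * gamma_c1) ** shape / math.gamma(shape + 1.0)

def zone1_integral(mean_gamma, y, shape, gamma_c1):
    """Approximate integral of gamma * y over zone 1, as in Equation (A.2b)."""
    p_z1 = zone1_probability(mean_gamma, y, shape, gamma_c1)
    return shape / (shape + 1.0) * gamma_c1 * p_z1

p_z1 = zone1_probability(mean_gamma=4000.0, y=0.9, shape=0.3, gamma_c1=0.25)
i_1 = zone1_integral(mean_gamma=4000.0, y=0.9, shape=0.3, gamma_c1=0.25)
print(round(p_z1, 3))  # 0.044
print(i_1)
```

The result of about 4% agrees with the contribution of zone 1 to the distribution of selection coefficients quoted in the main text for these parameter values.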

Zone 2 involves a moderate intensity of selection, with γc1 < γy ≤ γc2, where γc2 is such that there is a nontrivial probability that either q1 or q2 reaches an intermediate or high frequency. In order to increase the accuracy of the integrations of the joint p.d.f. over the distribution of γ, zone 2 was divided into 2 subzones, with zone 2a having an upper limit of γy = γc2a, and zone 2b with γc2a < γy ≤ γc2. Numerical values for γc1, γc2a, and γc2 can be found in Supplementary Table 1. The probability of s falling into zone 2a, Pz2a, is given by the excess over Pz1 of the integral of the gamma distribution from 0 to γc2a; the probability for zone 2b, Pz2b, is found in a similar way by using γc2a and γc2 as integration limits. The bivariate distribution of Equation (8) is used to determine the expected value of the load statistics in Equations (1) for a given γ, using the means and variances of the qi generated by the distribution. The integrals over γc1 < γy ≤ γc2a and γc2a < γy ≤ γc2 of the products of Equations (1a)–(1c) with ns and the p.d.f. for the gamma distribution of γy yield the net contributions from zones 2a and 2b to the load statistics.

It was found that use of Equation (8) for values of γy that were close to γc1 (usually set to 0.25) gave inaccurate results, with expected frequencies of A2 somewhat greater than those obtained for zone 1. To avoid this problem, an approximation to Equation (8), described in Section 4 of Supplementary File 1, was used for values of γy less than a threshold value. Trial and error showed that accurate results, where both methods agreed closely in the results for integration over the whole zone, were obtained by setting this threshold to 0.25γy/(hx), equivalent to 40γy with h = 0.05 and x = 0.5 (this adjusts for the fact that selection against rare mutations is effectively weaker when they are more recessive).

Zone 3 involves a high intensity of selection, such that q1 and q2 are both confined to their boundaries close to 0; here γc2 < γy ≤ γc3, where γc3 is assigned as the upper limit to the distribution of γy, usually chosen to correspond to the upper 99th percentile of the gamma distribution. The probability of s falling into this zone, Pz3, can be found as for Pz2a and Pz2b. The two frequencies q1 and q2 can be treated as independent variables, since their expected product is negligible and A2 alleles within In and St haplotypes have little chance of encountering each other. Provided that h is sufficiently greater than 0, their p.d.f.s for a given h and s are well approximated by gamma distributions (Nei 1968), with shape parameters αi and means u/(hs). The corresponding variances are [u/(hs)]²/αi, allowing the F and other load statistics of Equations (1) to be determined easily. The net contributions of zone 3 to the load statistics can then be found by integrations of the same type as described for zone 2.

The integrals over the distribution of γ for each of zones 2a, 2b, and 3 were evaluated numerically using Simpson's rule (Atkinson 1989), usually with 650 points over a single interval for each zone (trial and error showed that this number was sufficient to produce a high degree of stability for the values of the statistics of interest). This procedure was also used for the integrals involved in obtaining the load statistics from the probability distribution of allele frequencies. No integration over γ values is needed for zone 1, since neutrality is assumed and the integral of γ over this interval is obtained using Equation (A.2b).
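For readers unfamiliar with the method, a minimal sketch of composite Simpson's rule is given here, applied to the integral of a gamma p.d.f. over one zone. This is not the author's program (which is in the Supplementary Files). The function names are hypothetical, and treating the 650 points as 650 equal subintervals is an assumption, since the exact grid is not specified in the text.

```python
import math

def simpson(f, lo, hi, n_sub=650):
    """Composite Simpson's rule with n_sub (even) equal subintervals."""
    if n_sub < 2 or n_sub % 2:
        raise ValueError("n_sub must be a positive even integer")
    step = (hi - lo) / n_sub
    total = f(lo) + f(hi)
    for k in range(1, n_sub):
        # Interior points alternate between weights 4 (odd k) and 2 (even k).
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def gamma_pdf(g, a, mean):
    """p.d.f. of a gamma distribution with shape a and the given mean."""
    rate = a / mean
    return rate ** a * g ** (a - 1) * math.exp(-rate * g) / math.gamma(a)

if __name__ == "__main__":
    # Probability mass between two hypothetical cutoffs.
    print(simpson(lambda g: gamma_pdf(g, 0.3, 10.0), 0.25, 2.0))
```

Simpson's rule is exact for polynomials up to degree three, which makes a cubic a convenient sanity check before applying the routine to the load statistics.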

Probability distribution for a subdivided population

Writing γm = 2NTs for the scaled selection coefficient with regard to the whole metapopulation and Gi = FSTi/(1 − FSTi), we have

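$$\psi_{sm}(\bar{q}_1, \bar{q}_2) = -2\gamma_m\left(a_1\bar{q}_1 + a_2\bar{q}_2 + b_{11}\bar{q}_1^{2} + b_{22}\bar{q}_2^{2} + b_{12}\bar{q}_1\bar{q}_2\right) \tag{A.3a}$$

where

$$a_1 = [G_1 + (1 - 2G_1)h]\,x^{2} + hxy \tag{A.3b}$$
$$a_2 = [G_2 + (1 - 2G_2)h]\,y^{2} + hxy \tag{A.3c}$$
$$b_{11} = \tfrac{1}{2}(1 - 2G_1)(1 - 2h)\,x^{2} \tag{A.3d}$$
$$b_{22} = \tfrac{1}{2}(1 - 2G_2)(1 - 2h)\,y^{2} \tag{A.3e}$$
$$b_{12} = (1 - 2h)\,xy \tag{A.3f}$$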

When selection is sufficiently weak in relation to migration that 2NTs ≪ M, where M is the scaled migration parameter 4Nm, we have FST1 ≈ 1/(1 + Mx) and FST2 ≈ 1/(1 + My). For sites independent of the inversion, FSTn ≈ 1/(1 + M). Purifying selection is expected to reduce the FSTi below their neutral values (Charlesworth and Charlesworth 2010, p. 353), so that use of the neutral approximations overestimates the effect of drift relative to selection. An approximate correction for the effect of selection on FSTi is used here, based on Equation (B7.8.4) of Charlesworth and Charlesworth (2010, p. 354); the term 4Nhs is added to M in the formula for FST. This procedure is only rigorous for Nhs ≫ 1; numerical studies show that it very slightly reduces the effect of population subdivision compared with using the purely neutral expression (Supplementary Table 6).

Combining Equations (A.3) with the mutational terms, the bivariate p.d.f. has a similar form to Equation (8), except that p¯i and q¯i are used instead of pi and qi, and the scaled mutational parameters involve the products of 4NT1/(1 – FST1) and 4NT2/(1 – FST2) with the relevant mutation rates. Thus, αi and βi are replaced by αim = 4NTi u/(1 – FSTi) and βim = 4NTi v/(1 – FSTi), respectively.

In order to use Equations (1) to calculate the load statistics, it is necessary to consider the p.d.f. of allele frequencies within demes. Following Cherry and Wakeley (2003) and Wakeley (2003), by conditioning on the current values of the q̄i for a deme with frequencies qi, we obtain the general expression:

$$\phi(q_1, q_2 \mid \bar{q}_1, \bar{q}_2) = \tilde{C}\,\exp(\psi_s)\,q_1^{\alpha_{1d}-1}\,p_1^{\beta_{1d}-1}\,q_2^{\alpha_{2d}-1}\,p_2^{\beta_{2d}-1} \tag{A.4}$$

where ψs has the same form as in Equation (7), α1d = Mxq̄1 + 4Nxu, β1d = Mxp̄1 + 4Nxv, α2d = Myq̄2 + 4Nyu, and β2d = Myp̄2 + 4Nyv (the subscript d is used to indicate within-deme values).
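The within-deme density of Equation (A.4) can be evaluated up to its normalizing constant as sketched below. The selection term ψs is passed in as a function, because Equation (7) is not reproduced in this appendix. It is assumed that y = 1 − x and that the exponents for the second arrangement use y and the second arrangement's mean frequencies, by symmetry with the first. All names are hypothetical.

```python
import math

def deme_exponents(qbar1, qbar2, M, N, x, u, v):
    """Within-deme parameters alpha_1d, beta_1d, alpha_2d, beta_2d."""
    y = 1.0 - x  # assumption: the two arrangement frequencies sum to one
    pbar1, pbar2 = 1.0 - qbar1, 1.0 - qbar2
    return {
        "alpha1d": M * x * qbar1 + 4 * N * x * u,
        "beta1d": M * x * pbar1 + 4 * N * x * v,
        "alpha2d": M * y * qbar2 + 4 * N * y * u,
        "beta2d": M * y * pbar2 + 4 * N * y * v,
    }

def log_phi_unnormalized(q1, q2, exps, psi_s):
    """Log of Equation (A.4) without the constant; requires 0 < q1, q2 < 1."""
    return (psi_s(q1, q2)
            + (exps["alpha1d"] - 1) * math.log(q1)
            + (exps["beta1d"] - 1) * math.log(1.0 - q1)
            + (exps["alpha2d"] - 1) * math.log(q2)
            + (exps["beta2d"] - 1) * math.log(1.0 - q2))

if __name__ == "__main__":
    e = deme_exponents(0.1, 0.2, M=2.0, N=1000, x=0.3, u=1e-6, v=1e-7)
    print(log_phi_unnormalized(0.05, 0.1, e, lambda a, b: 0.0))
```

Working on the log scale avoids underflow when the exponents are small and the frequencies are close to 0 or 1. The constant would still need to be obtained by numerical integration.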

An exact calculation of the load statistics for a given s would require multiplication of this function by the bivariate p.d.f. for the q¯i, followed by integrations over the qi and q¯i to obtain the requisite quantities to insert into Equations (1); this would then be followed by integration over the distribution of s. In order to avoid this cumbersome procedure, partitioning of the distribution of s into different zones was used together with some approximations, similarly to what was done for the case of a single panmictic population. The details of the procedure are described in Section 5 of Supplementary File 1.

Data availability

No new data or reagents were generated for this work. The codes for the computer programs used to generate the results described above are available in Supplementary Files 2 and 3. The numerical results used to produce the figures are presented in Supplementary Tables 1–4.

Supplemental material is available at GENETICS online.

Funding

This work was not funded.

Literature Cited

  1. Anderson WW, Watanabe TK. 1997. A demographic approach to selection. Proc Natl Acad Sci U S A. 94(15):7742–7747. doi: 10.1073/pnas.94.15.7742.
  2. Andolfatto P. 2005. Adaptive evolution of non-coding DNA in Drosophila. Nature. 437(7062):1149–1152. doi: 10.1038/nature04107.
  3. Atkinson KE. 1989. Introduction to Numerical Analysis. New York, NY: John Wiley.
  4. Bataillon T, Kirkpatrick M. 2000. Inbreeding depression due to mildly deleterious mutations in finite populations: size does matter. Genet Res. 75(1):75–81. doi: 10.1017/S0016672399004048.
  5. Berdan EL, Barton NH, Butlin R, Charlesworth B, Faria R, Fragata I, Gilbert KJ, Jay P, Kapun M, Lotterhos KE, et al. 2023. How chromosomal inversions reorient the evolutionary process. J Evol Biol. 36:1762–1782. doi: 10.1111/jeb.14242.
  6. Berdan EL, Blankaert A, Butlin RK, Bank C. 2021. Deleterious mutation accumulation and the long-term fate of chromosomal inversions. PLoS Genet. 17(3):e1009411. doi: 10.1371/journal.pgen.1009411.
  7. Booker TR, Jackson BC, Keightley PD. 2017. Detecting positive selection in the genome. BMC Biol. 15(1):98. doi: 10.1186/s12915-017-0434-y.
  8. Butlin RK, Collins PM, Day TH. 1984. The effect of larval density on an inversion polymorphism in the seaweed fly Coelopa frigida. Heredity. 52(3):415–423. doi: 10.1038/hdy.1984.49.
  9. Campos JL, Charlesworth B. 2019. The effects on neutral variability of recurrent selective sweeps and background selection. Genetics. 212(1):287–303. doi: 10.1534/genetics.119.301951.
  10. Campos JL, Halligan DL, Haddrill PR, Charlesworth B. 2014. The relation between recombination rate and patterns of molecular evolution and variation in Drosophila melanogaster. Mol Biol Evol. 31(4):1010–1028. doi: 10.1093/molbev/msu056.
  11. Campos JL, Zhao L, Charlesworth B. 2017. Estimating the parameters of background selection and selective sweeps in Drosophila in the presence of gene conversion. Proc Natl Acad Sci U S A. 114(24):E4762–E4771. doi: 10.1073/pnas.1619434114.
  12. Casillas S, Barbadilla A, Bergman CM. 2007. Purifying selection maintains highly conserved noncoding sequences in Drosophila. Mol Biol Evol. 24(10):2222–2234. doi: 10.1093/molbev/msm150.
  13. Castellano D, James J, Eyre-Walker A. 2018. Nearly neutral evolution across the Drosophila melanogaster genome. Mol Biol Evol. 35(11):2685–2694. doi: 10.1093/molbev/msy164.
  14. Charlesworth B. 2013. Why we are not dead one hundred times over. Evolution. 67(11):3354–3361. doi: 10.1111/evo.12195.
  15. Charlesworth B. 2015. Causes of natural variation in fitness: evidence from studies of Drosophila populations. Proc Natl Acad Sci U S A. 112(6):1662–1669. doi: 10.1073/pnas.1423275112.
  16. Charlesworth B. 2018. Mutational load, inbreeding depression and heterosis in subdivided populations. Mol Ecol. 27(24):4991–5003. doi: 10.1111/mec.14933.
  17. Charlesworth B. 2022. The effects of weak selection on neutral diversity at linked sites. Genetics. 221(1):iyac027. doi: 10.1093/genetics/iyac027.
  18. Charlesworth B. 2023. The effects of inversion polymorphisms on patterns of neutral genetic diversity. Genetics. 224(4):iyad116. doi: 10.1093/genetics/iyad116.
  19. Charlesworth B, Charlesworth D. 2000. The degeneration of Y chromosomes. Phil Trans R Soc B. 355(1403):1563–1572. doi: 10.1098/rstb.2000.0717.
  20. Charlesworth B, Charlesworth D. 2010. Elements of Evolutionary Genetics. Greenwood Village, CO: Roberts and Company.
  21. Charlesworth D, Morgan MT, Charlesworth B. 1993. Mutation accumulation in finite outbreeding and inbreeding populations. Genet Res. 61(1):39–56. doi: 10.1017/S0016672300031086.
  22. Charlesworth B, Sniegowski P, Stephan W. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature. 371(6494):215–220. doi: 10.1038/371215a0.
  23. Cherry JL, Wakeley J. 2003. A diffusion approximation for selection and drift in a subdivided population. Genetics. 163(1):421–428. doi: 10.1093/genetics/163.1.421.
  24. Chovnick A. 1973. Gene conversion and transfer of genetic information within the inverted region of inversion heterozygotes. Genetics. 75(1):123–131. doi: 10.1093/genetics/75.1.123.
  25. Connallon T, Olito C. 2021. Natural selection and the distribution of chromosomal lengths. Mol Ecol. 31(13):3627–3641. doi: 10.1111/mec.16091.
  26. Crow JF. 1993. Mutation, mean fitness, and genetic load. Oxf Surv Evol Biol. 9:3–42.
  27. Crumpacker DW, Salceda VM. 1968. Uniform heterokaryotypic superiority for viability in a Colorado population of Drosophila pseudoobscura. Evolution. 22(2):256–261. doi: 10.2307/2406523.
  28. Dobzhansky T. 1947. Genetics of natural populations. XIV. A response of certain gene arrangements in the third chromosome of Drosophila pseudoobscura to natural selection. Genetics. 32(2):142–160. doi: 10.1093/genetics/32.2.142.
  29. Faria R, Johannesson K, Butlin RK, Westram AM. 2019. Evolving inversions. Trends Ecol Evol. 34(3):239–248. doi: 10.1016/j.tree.2018.12.005.
  30. Frydenberg O. 1963. Population studies of a lethal mutant in Drosophila melanogaster. I. Behaviour in populations with discrete generations. Hereditas. 50(1):89–116. doi: 10.1111/j.1601-5223.1963.tb01896.x.
  31. Glémin S, Bazin E, Charlesworth D. 2006. Impact of mating systems on patterns of sequence polymorphism in flowering plants. Proc R Soc B. 273(1604):3011–3019. doi: 10.1098/rspb.2006.3657.
  32. Glémin S, Ronfort J, Bataillon T. 2003. Patterns of inbreeding depression and architecture of the load in subdivided populations. Genetics. 165(4):2193–2212. doi: 10.1093/genetics/165.4.2193.
  33. Haddrill PR, Thornton KR, Charlesworth B, Andolfatto P. 2005. Multilocus patterns of nucleotide variability and the demographic and selection history of Drosophila melanogaster populations. Genome Res. 15(6):790–799. doi: 10.1101/gr.3541005.
  34. Haldane JBS. 1937. The effect of variation on fitness. Am Nat. 71(735):337–349. doi: 10.1086/280722.
  35. Halligan DL, Keightley PD. 2006. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide sequence comparison. Genome Res. 16(7):875–884. doi: 10.1101/gr.5022906.
  36. Huang K, Ostevik KL, Elphinstone C, Todesco M, Bercovich N, Owens GL, Rieseberg LH. 2022. Mutation load in sunflower inversions is negatively correlated with inversion heterozygosity. Mol Biol Evol. 39(5):msac101. doi: 10.1093/molbev/msac101.
  37. James J, Castellano D, Eyre-Walker A. 2017. DNA sequence diversity and the efficiency of natural selection in animal mitochondrial DNA. Heredity. 118(1):88–95. doi: 10.1038/hdy.2016.108.
  38. Jay P, Chouteau M, Whibley A, Bastide H, Parinello H, Llaurens V, Joron M. 2021. Mutation load at a mimicry supergene sheds new light on the evolution of inversion polymorphisms. Nat Genet. 53(3):288–293. doi: 10.1038/s41588-020-00771-1.
  39. Jay P, Tezenas E, Véber A, Giraud T. 2022. Sheltering of deleterious mutations explains the stepwise extension of recombination suppression on sex chromosomes and other supergenes. PLoS Biol. 20(7):e3001698. doi: 10.1371/journal.pbio.3001698. [Retracted]
  40. Jeong H, Baran NM, Sun D, Chatterjee P, Layman TS, Balakhrishnan CN, Maney DL, Yi SV. 2021. Dynamic molecular evolution of a supergene with suppressed recombination in white-throated sparrows. Elife. 11:e79387. doi: 10.7554/eLife.79387.
  41. Kapun M, Durmaz Mitchell E, Kawecki T, Schmidt P, Flatt T. 2023. An ancestral balanced inversion polymorphism confers global adaptation. Mol Biol Evol. 40(6):msad118. doi: 10.1093/molbev/msad118.
  42. Kapun M, Flatt T. 2019. The adaptive significance of chromosomal inversion polymorphisms in Drosophila melanogaster. Mol Ecol. 28(6):1263–1282. doi: 10.1111/mec.14871.
  43. Kimura M. 1964. Diffusion models in population genetics. J App Prob. 1(2):177–223. doi: 10.2307/3211856.
  44. Kimura M. 1971. Theoretical foundations of population genetics at the molecular level. Theor Pop Biol. 2(2):174–208. doi: 10.1016/0040-5809(71)90014-1.
  45. Kimura M, Maruyama T, Crow JF. 1963. The mutation load in small populations. Genetics. 48(10):1303–1312. doi: 10.1093/genetics/48.10.1303.
  46. Kondrashov AS. 1995. Contamination of the genome by very slightly deleterious mutations: why have we not died 100 times over? J Theor Biol. 175(4):583–594. doi: 10.1006/jtbi.1995.0167.
  47. Korunes K, Noor MAF. 2019. Pervasive gene conversion in chromosomal inversion heterozygotes. Mol Ecol. 28(6):1302–1315. doi: 10.1111/mec.14921.
  48. Kousathanas A, Keightley PD. 2013. A comparison of models to infer the distribution of fitness effects of new mutations. Genetics. 193(4):1197–1208. doi: 10.1534/genetics.112.148023.
  49. Krimbas CB, Powell JR. 1992. Drosophila Inversion Polymorphism. Boca Raton, FL: CRC Press.
  50. Larracuente AM, Presgraves DC. 2012. The selfish Segregation Distorter gene complex of Drosophila melanogaster. Genetics. 192(1):33–53. doi: 10.1534/genetics.112.141390.
  51. Leimu R, Mutikainen P, Koricheva J, Fischer M. 2006. How general are positive relationships between plant population size, fitness and genetic variation? J Ecol. 94(5):943–953. doi: 10.1111/j.1365-2745.2006.01150.x.
  52. Lohr JN, Haag CR. 2015. Genetic load, inbreeding depression and hybrid vigour covary with population size in Daphnia magna: an empirical evaluation of theoretical predictions. Evolution. 69(12):3109–3122. doi: 10.1111/evo.12802.
  53. Malécot G. 1969. The Mathematics of Heredity. San Francisco, CA: W.H. Freeman.
  54. Manna F, Martin G, Lenormand T. 2011. Fitness landscapes: an alternative theory for the dominance of mutation. Genetics. 189(3):923–937. doi: 10.1534/genetics.111.132944.
  55. Matschiner M, Barth JMI, Tørresen OK, Star B, Baalsrud HT, Brieuc MSO, Pampoulie C, Bradbury I, Jakobsen KT, Jentoft S. 2023. Supergene origin and maintenance in Atlantic cod. Nat Ecol Evol. 6(4):469–481. doi: 10.1038/s41559-022-01661-x.
  56. Mérot C, Llaurens V, Normandeau E, Bernatchez L, Wellenreuther M. 2020. Balancing selection via life-history trade-offs maintains an inversion polymorphism in a seaweed fly. Nat Commun. 11(1):670. doi: 10.1038/s41467-020-14479-7.
  57. Mukai T, Yamaguchi O. 1974. The genetic structure of natural populations of Drosophila melanogaster. XI. Genetic variability in a local population. Genetics. 76(2):339–366. doi: 10.1093/genetics/76.2.339.
  58. Muller HJ. 1950. Our load of mutations. Am J Hum Genet. 2:111–176.
  59. Nei M. 1968. The frequency distribution of lethal chromosomes in finite populations. Proc Natl Acad Sci U S A. 60(2):517–524. doi: 10.1073/pnas.60.2.517.
  60. Nei M, Kojima K-I, Schaffer HE. 1967. Frequency changes of new inversions in populations under mutation-selection equilibria. Genetics. 57(4):741–750. doi: 10.1093/genetics/57.4.741.
  61. Ohta T. 1971. Associative overdominance caused by linked detrimental mutations. Genet Res. 18(3):277–286. doi: 10.1017/S0016672300012684.
  62. Pei Y, Forstmeier W, Knief U, Kempenaers B. 2022. Weak antagonistic fitness effects can maintain an inversion polymorphism. Mol Ecol. 32(13):3575–3585. doi: 10.1111/mec.16963.
  63. Powell JR. 1992. Inversion polymorphisms in Drosophila pseudoobscura and Drosophila persimilis. In: Krimbas CB, Powell JR, editors. Drosophila Inversion Polymorphism. Boca Raton, FL: CRC Press. p. 73–126.
  64. Robinson JA, Kyriazis CC, Nigenda-Morales SF, Beichman AC, Rojas-Bracho L, Robertson KM, Fontaine MC, Wayne RK, Lohmueller KE, Taylor BL, et al. 2022. The critically endangered vaquita is not doomed to extinction by inbreeding depression. Science. 376(6593):635–639. doi: 10.1126/science.abm1742.
  65. Roze D, Rousset F. 2003. Selection and drift in subdivided populations: a straightforward method for deriving diffusion approximations and applications involving dominance, selfing and local extinctions. Genetics. 165(4):2153–2166. doi: 10.1093/genetics/165.4.2153.
  66. Roze D, Rousset F. 2004. Joint effects of self-fertilization and population structure on mutation load, inbreeding depression and heterosis. Genetics. 167(2):1001–1015. doi: 10.1534/genetics.103.025148.
  67. Sniegowski PD, Charlesworth B. 1994. Transposable element numbers in cosmopolitan inversions from a natural population of Drosophila melanogaster. Genetics. 137(3):815–827. doi: 10.1093/genetics/137.3.815.
  68. Sperlich D, Pfriem P, Ashburner M, Carson HL, Thompson JN. 1986. Chromosomal polymorphisms in natural and experimental populations. In: Ashburner M, Carson HL, Thompson JN, editors. The Genetics and Biology of Drosophila, Vol. 3e. London: Academic Press. p. 257–309.
  69. Spigler RB, Theodorou K, Chang S-M. 2017. Inbreeding depression and drift load in small populations at demographic disequilibrium. Evolution. 71(1):81–94. doi: 10.1111/evo.13103.
  70. Stenløk K, Saitou M, Rud-Johansen L, Nome T, Moser M, Áryasi M, Kent M, Barson NJ, Sigbørn L. 2022. The emergence of supergenes from inversions in Atlantic salmon. Phil Trans R Soc B. 377(1856):20210195. doi: 10.1098/rstb.2021.0195.
  71. Sturtevant AH, Mather K. 1938. The interrelations of inversions, heterosis and recombination. Am Nat. 72(742):447–452. doi: 10.1086/280797.
  72. Tajima F. 1989. The effect of change in population size on DNA polymorphism. Genetics. 123(3):597–601. doi: 10.1093/genetics/123.3.597.
  73. Wakeley J. 2003. Polymorphism and divergence for island-model species. Genetics. 163(1):411–420. doi: 10.1093/genetics/163.1.411.
  74. Wakeley J, Aliacar N. 2001. Gene genealogies in a metapopulation. Genetics. 159(2):893–905. doi: 10.1093/genetics/159.2.893.
  75. Waller DM. 2021. Addressing Darwin's dilemma: can pseudo-overdominance explain persistent inbreeding depression and load? Evolution. 75(4):779–793. doi: 10.1111/evo.14189.
  76. Wang Y, McNeil P, Abdulazeez R, Pascual M, Johnston SE, Keightley PD, Obbard DJ. 2023. Variation in mutation, recombination, and transposition rates in Drosophila melanogaster and Drosophila simulans. Genome Res. 33(4):587–598. doi: 10.1101/gr.277383.122.
  77. Watanabe T, Yamaguchi O, Mukai T. 1976. The genetic variability of third chromosomes in a local population of Drosophila melanogaster. Genetics. 82(1):63–82. doi: 10.1093/genetics/82.1.63.
  78. Watterson GA. 1975. On the number of segregating sites in genetical models without recombination. Theor Pop Biol. 7(2):256–276. doi: 10.1016/0040-5809(75)90020-9.
  79. Welch JJ, Eyre-Walker A, Waxman D. 2008. Divergence and polymorphism under the nearly neutral theory of molecular evolution. J Mol Evol. 67(4):418–426. doi: 10.1007/s00239-008-9146-9.
  80. Wellenreuther M, Bernatchez L. 2018. Eco-evolutionary genomics of chromosomal inversions. Trends Ecol Evol. 33(6):427–440. doi: 10.1016/j.tree.2018.04.002.
  81. Whitlock MC. 2002. Selection, load and inbreeding depression in a large metapopulation. Genetics. 160(3):1191–1202. doi: 10.1093/genetics/160.3.1191.
  82. Whitlock MC, Ingvarsson PK, Hatfield T. 2000. Local drift load and the heterosis of interconnected populations. Heredity. 84(4):452–457. doi: 10.1046/j.1365-2540.2000.00693.x.
  83. Wright S. 1931. Evolution in Mendelian populations. Genetics. 16(2):97–159. doi: 10.1093/genetics/16.2.97.
  84. Wright S. 1951. The genetical structure of populations. Ann Eugen. 15(1):323–354. doi: 10.1111/j.1469-1809.1949.tb02451.x.
  85. Wright S, Dobzhansky T. 1946. Genetics of natural populations. XII. Experimental reproduction of some of the changes caused by natural selection in certain populations of Drosophila pseudoobscura. Genetics. 31(2):125–156. doi: 10.1093/genetics/31.2.125.
  86. Zhao L, Charlesworth B. 2016. Resolving the conflict between associative overdominance and background selection. Genetics. 203(3):1315–1334. doi: 10.1534/genetics.116.188912.
