Abstract
The evolution of microbial and viral organisms often generates clonal interference, a mode of competition between genetic clades within a population. Here we show how interference impacts systems biology by constraining genetic and phenotypic complexity. Our analysis uses biophysically grounded evolutionary models for molecular phenotypes, such as fold stability and enzymatic activity of genes. We find a generic mode of phenotypic interference that couples the function of individual genes and the population’s global evolutionary dynamics. Biological implications of phenotypic interference include rapid collateral system degradation in adaptation experiments and long-term selection against genome complexity: each additional gene carries a cost proportional to the total number of genes. Recombination above a threshold rate can eliminate this cost, which establishes a universal, biophysically grounded scenario for the evolution of sex. In a broader context, our analysis suggests that the systems biology of microbes is strongly intertwined with their mode of evolution.
Subject terms: Evolutionary theory, Molecular evolution, Population genetics, Complexity, Evolvability
In asexual populations, selection at different genomic loci can interfere. Here, using a biophysical model of molecular evolution, the authors show that interference results in long-term degradation of molecular function, an effect that depends strongly on genome size.
Introduction
In the absence of recombination, evolution is constrained by genetic linkage. That is, selection on an allele at one genomic locus can interfere with the evolution of simultaneously present alleles throughout the genome. Interference interactions between loci include background selection (the spread of a beneficial allele is impeded by linked deleterious alleles), hitchhiking or genetic draft (a neutral or deleterious allele is driven to fixation by a linked beneficial allele), and clonal interference between beneficial alleles originating in disjoint genetic clades (only one of which can reach fixation). These interactions and their consequences for genome evolution have been studied extensively in laboratory experiments1,2 and in natural populations3,4. Recent theory5–13 has quantified two broad interference effects in asexual evolution. First, interference selection rather than genetic drift constrains the genetic diversity in large populations, which, in turn, limits the efficacy of selection10,13–15. Second, interference reduces the speed of evolution7–9,11–13; this has been observed in laboratory evolution experiments16–19. The resulting fitness cost of interference, which has also been observed in microbial laboratory evolution20–23, is the centerpiece of classic arguments for the evolutionary advantage of sex24–28.
Much less clear is how interference affects the evolution of molecular phenotypes, such as protein stabilities and affinities governing gene regulation and cellular metabolism. The systems-biological consequences of interference evolution are the topic of this paper. Our analysis is based on biophysical models of molecular evolution29–36. In a minimal model, each gene of an organism carries a single quantitative trait G, the stability of its protein fold. A fitness landscape f(G) quantifies the effect of protein stability on reproductive success. This landscape is a sigmoid function with a high-fitness plateau corresponding to stable proteins and a low-fitness plateau corresponding to unfolded proteins (Fig. 1a). We also discuss a stability-affinity protein model with a two-dimensional fitness landscape f(G, E); this model includes enzymatic or regulatory functions of genes, specifically the protein binding affinity E to a molecular target. From the perspective of molecular evolution, these landscapes provide a generic biophysical model of local fitness epistasis, which couples all sequence sites contributing to a stability or affinity trait in the same gene. Importantly, local epistasis in protein-coding sequence operates independently of fitness interactions across genes. Beyond proteins, local epistasis occurs ubiquitously in quantitative molecular traits associated with binding interactions. This form of epistasis is an important building block of our model that is not covered by the standard theory of asexual evolution5–13.
The system-wide evolution of molecular quantitative traits under genetic linkage defines a particular mode of phenotypic interference, which occurs broadly under conditions of typical microbial systems. This mode couples global and local evolution in a specific way: the global pace of evolution sets the average selection coefficient of local trait changes. In the first part of the paper, we develop the theory of phenotypic interference and derive a key quantitative result: in a system of g genes, the steady-state fitness cost of interference increases quadratically with g. This super-linear cost reflects a specific evolutionary mechanism: each additional gene degrades stability and function of all other genes by increasing the accumulation of deleterious mutations. We then turn to biological implications of phenotypic interference. We show that the interference cost can outweigh the metabolic cost of genes37,38 and generate long-term impact on systems biology: it strongly constrains genome complexity in viable, asexually reproducing organisms and drives the loss of non-essential genes. On the time scales of laboratory evolution experiments, phenotypic interference reduces fitness through the attrition of molecular traits; we compare this prediction to experimental data20–23. Finally, phenotypic interference provides a surprisingly simple pathway for the evolution of sex. We show that facultative recombination at low rates R can evolve near neutrality yet, once R exceeds a threshold R*, provides a large competitive advantage against competing non-recombining lineages. The predicted threshold R* is of order of the mutation rate, which is consistent with observed recombination rates.
Results
Housekeeping evolution under phenotypic interference
Here we analyze the evolution of genetically linked systems in a conservative environment, where populations maintain the functionality of molecular traits in the presence of deleterious mutations but there is no adaptive pressure on these traits. This scenario defines a system-wide mutation-selection steady state that we call housekeeping evolution (here, housekeeping does not refer to a particular class of genes or metabolic processes). It builds on the assumption that over long time scales, selection acts primarily to repair the deleterious effects of mutations, because these processes are continuous and affect the entire genome. In contrast, adaptive processes are often environment-dependent and transient, and they affect only specific genes. In the Methods section, we extend our analysis to scenarios of adaptive evolution and show that these do not affect the conclusions of the paper.
Figure 1 illustrates the ingredients of phenotypic interference in the housekeeping state (and can serve as a shortcut through the theory for readers primarily interested in the biological implications). First, local quantitative traits of a given gene are in an evolutionary equilibrium, where the long-term average of the trait value and its position on the fitness landscape are determined by the uphill force of selection and the downhill force of mutations (Fig. 1a and Supplementary Fig. 1a). Second, global genome evolution takes place in a so-called fitness wave; that is, genetic and phenotypic variants in multiple genes co-exist in a population and generate a broad distribution of fitness values7–9,11–13 (Fig. 1b and Supplementary Fig. 1b). These levels are linked by a common evolutionary parameter, the coalescence rate, or equivalently by the effective population size (Supplementary Table 1 lists all mathematical symbols). The joint solution of the local and global evolutionary dynamics identifies a broad regime of phenotypic interference, which is marked by a system-wide genetic load depending quadratically on genome size (Fig. 1c).
Evolution of a quantitative trait under interference selection
In the framework of the minimal biophysical model, we study the housekeeping evolution of genome-wide protein fold stability. The stability trait G of a given gene is defined as the free energy difference between the unfolded and the folded state (and usually denoted by ΔG; we abbreviate this notation to avoid confusion with the variance measures defined below). The trait G evolves in a fitness landscape f(G) of sigmoid form (Fig. 1a, see Methods section).
The mutation-selection equilibrium on a flank of the landscape f(G) can be characterized by the equilibrium values of the population mean trait and of the trait diversity, or genetically heritable trait variance, Δ_G (overbars denote averages within a population39). First, the diversity Δ_G takes the simple, effectively neutral equilibrium form
1 |
which is proportional to the total mutation rate u, to the mean square stability effect of the relevant sequence sites, and to the effective population size. This form extends previous results on neutral sequence diversity14,40–42 and on quantitative trait diversity under genetic drift43–45. In the Methods section, we derive Eq. 1 for quantitative traits in a fitness landscape f(G) by showing that stabilizing selection on Δ_G can be neglected throughout the phenotypic interference regime; this scaling is confirmed by simulations (Supplementary Fig. 2a). In a fitness wave, this effective population size couples each individual trait to the global evolutionary dynamics of all genetically linked genes (Fig. 1a, b). In contrast, an independently evolving trait would depend on an effective population size Ne set by genetic drift. Next, we compute the equilibrium point for the mean trait by equating the rate of stability increase by selection with the rate of stability degradation by mutations,
2 |
details are given in the Methods section. This mutation-selection equilibrium depends on the effective population size, in contrast to protein evolution models in the infinite population limit31. By inserting Eq. 1, we can express the mean square selection coefficient at trait sites and the fitness variance in terms of the coalescence rate,
3 |
a similar relation for s² under genetic drift has been derived in refs. 46,47. These equations describe stable trait equilibria on the downward-curved shoulder of the fitness landscape f(G), which is a non-linear trait interval. They express universal characteristics of these equilibria, which do not depend on details of the fitness landscape and of the trait effect distribution of sequence sites. Their validity is confirmed by numerical simulations (Supplementary Fig. 2). The above derivation neglects fluctuations by genetic drift and genetic draft; cf. Supplementary Fig. 1a. However, Eqs. 3 remain exactly valid in the full mutation-selection-coalescence dynamics (Supplementary Methods 1 and Supplementary Fig. 3).
A salient feature of selection on quantitative traits becomes apparent from Eq. 3: the selection coefficients of new genetic variants are not fixed a priori, but are an emergent property of the global evolutionary process. A faster pace of evolution, i.e., an increase in the coalescence rate, reduces the efficacy of selection10,11,14. On the downward-curved shoulder of the fitness landscape, this drives the population to an equilibrium point of lower fitness and higher fitness gradients. In other words, trait-changing mutations are under ubiquitous negative epistasis: the combined (log) fitness effect of two deleterious trait changes is larger in magnitude than the sum of the individual effects, whereas the combined effect of two beneficial mutations is smaller than that sum. This epistasis tunes typical selection coefficients to marginal relevance, where mean allele sojourn times between low and high frequencies, 1/s, are of the order of the coalescence time. That point marks the crossover between effective neutrality (s well below the coalescence rate) and strong selection (s well above the coalescence rate)10; consistently, most but not all trait sites carry their beneficial allele.
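The negative epistasis on the concave shoulder of the landscape can be checked numerically with a small sketch. The sigmoid form f(G) = f0 / (1 + exp(−G/kT)), the parameter values, and the function names below are illustrative assumptions, not the exact landscape specified in the Methods section; f is treated here as a log (Malthusian) fitness.

```python
import math


def sigmoid_fitness(G, f0=1.0, kT=1.0):
    """Assumed sigmoid landscape: plateau f0 for stable proteins, 0 for unfolded ones."""
    return f0 / (1.0 + math.exp(-G / kT))


def fitness_effect(G, dG, f0=1.0, kT=1.0):
    """Fitness change caused by shifting the stability trait from G to G + dG."""
    return sigmoid_fitness(G + dG, f0, kT) - sigmoid_fitness(G, f0, kT)


# Hypothetical starting point on the concave shoulder (G > 0) and trait step.
G, eps = 2.0, 0.5

single_del = fitness_effect(G, -eps)      # one destabilizing change
double_del = fitness_effect(G, -2 * eps)  # two destabilizing changes combined
single_ben = fitness_effect(G, +eps)      # one stabilizing change
double_ben = fitness_effect(G, +2 * eps)  # two stabilizing changes combined

# Negative epistasis: deleterious effects compound, beneficial effects saturate.
deleterious_compound = abs(double_del) > 2 * abs(single_del)
beneficial_saturate = double_ben < 2 * single_ben
```

Both flags evaluate to true for any starting point and step that stay on the concave part of this assumed landscape; on the convex part near the unfolded plateau the inequalities reverse.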
Interference of multiple traits
We now obtain a closed solution of housekeeping evolution under phenotypic interference by matching the individual trait equilibria given by Eq. 3 with a fitness wave model for global evolution. First, the total fitness variance σ2 is simply the sum of the fitness variances Δf of the individual genes (Supplementary Fig. 4). Using Eq. 3, this sum rule takes the form , which relates the scales of global selection and coalescence, σ and . Second, given a sufficient supply of non-neutral mutations, global evolution proceeds in a fitness wave (the condition for wave occurrence will be made precise below). General fitness wave theory then provides another relation between global selection and coalescence,
4 |
where N is the population size and c0 is a model-dependent prefactor12,13 (Methods). Combining these relations, we obtain the global fitness wave of phenotypic interference,
5 |
Equations 3 then determine the corresponding characteristics of individual traits,
6 |
Equations 5 and 6 involve the fitness wave parameter defined in Eq. 4,
7 |
which depends only weakly on the evolutionary parameters and provides corrections to the scaling. This parameter estimates the complexity of the fitness wave, that is, the average number of genes with simultaneously segregating beneficial genetic variants destined for fixation (Fig. 1 and see Methods section). A wave pattern with temporally stable fitness polymorphism of approximately Gaussian form occurs whenever the mutation rate exceeds the average site selection coefficient, 15. This regime underlies the closure of Eqs. 5, 6; cf. Supplementary Fig. 1b. As shown in Methods section, it applies to gene numbers above a threshold g0 given by the condition
8 |
These relations are the centerpiece of phenotypic interference theory. They show that the collective evolution of molecular quantitative traits under genetic linkage depends strongly on the number of genes that encode these traits. The dependence is generated by a feedback between the global fitness variation, σ2, and mean square local site selection coefficients, s2. This feedback also tunes the evolutionary process to the crossover point between independently evolving genomic sites and strongly correlated fitness waves composed of multiple small-effect mutations (Supplementary Methods 2). Remarkably, local and global characteristics of phenotypic interference are strongly universal: they depend only on the parameters g, u, and c but decouple from details of gene fitness landscapes and site effect distributions.
The scaling of phenotypic interference is confirmed by extensive numerical simulations of Fisher-Wright populations, which are detailed in the Methods section. Figure 2 shows global observables, including σ², and the local observables Δ_G and s² as functions of g. The data display a crossover from a weak-interference regime of independently evolving genes at low values of g (brown dashed lines) to the phenotypic interference scaling given by Eqs. 5–7 (red dashed lines); this crossover occurs around a modest gene number. The calibration between theory and data involves the fitting of a single model-dependent amplitude c0; the calibrated theory matches the data for realistic gene numbers (g ~ 10³–10⁴) without additional fit parameters. The data also show the universality of the leading scaling behavior; gene selection coefficients f0 varying by more than three orders of magnitude introduce only small corrections to scaling. Supplementary Fig. 1 displays the separation of diversity scaling between predominantly monomorphic individual traits and standing fitness variation, as detailed in Eqs. 19, 20 of the Methods section. The underlying near-linear relation between global fitness variance σ² and coalescence rate, which is a general property of fitness waves, is checked in Supplementary Fig. 2d.
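For readers who want to experiment with this kind of simulation, the following is a minimal sketch of a non-recombining Fisher-Wright population with one stability trait per gene. It is not the simulation code described in the Methods section: the sigmoid landscape, the fixed mutational step `eps`, the destabilizing fraction `bias`, all default parameter values, and all function names are assumptions made for illustration.

```python
import math
import random
import statistics


def gene_fitness(G, f0=0.01, kT=1.0):
    """Assumed sigmoid landscape: f0 for a fully stable gene, 0 for an unfolded one."""
    return f0 / (1.0 + math.exp(-G / kT))


def genome_fitness(genome, f0=0.01):
    """Additive log fitness over genes (no fitness interactions across genes)."""
    return sum(gene_fitness(G, f0) for G in genome)


def wright_fisher_step(pop, rng, u=1e-3, eps=1.0, bias=0.8, f0=0.01):
    """One generation: mutation, then selection and drift by resampling."""
    mutated = []
    for genome in pop:
        child = list(genome)
        for i in range(len(child)):
            if rng.random() < u:
                # A fraction `bias` of mutations is assumed to be destabilizing.
                child[i] += -eps if rng.random() < bias else eps
        mutated.append(child)
    # Whole genomes are inherited together (complete linkage); parents are
    # drawn with weight exp(log fitness).
    weights = [math.exp(genome_fitness(gm, f0)) for gm in mutated]
    parents = rng.choices(mutated, weights=weights, k=len(pop))
    return [list(gm) for gm in parents]


def simulate(N=100, g=10, generations=100, seed=0):
    """Return the final population and its fitness variance."""
    rng = random.Random(seed)
    pop = [[3.0] * g for _ in range(N)]  # start from stable proteins
    for _ in range(generations):
        pop = wright_fisher_step(pop, rng)
    variance = statistics.pvariance([genome_fitness(gm) for gm in pop])
    return pop, variance
```

Calling `simulate` for increasing `g` and recording the fitness variance is one way to probe the crossover qualitatively; reaching the scaling regime discussed in the text would need far larger populations, gene numbers, and run times than these toy defaults.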
Interference selection against complexity
The evolutionary cost of deleterious mutations is quantified by the genetic load, which is defined as the mean fitness of a population compared to the fitness maximum. In the biophysical fitness landscape f(G) of the minimal model, the load of a given gene takes the approximate form , where denotes the population mean stability and f0 is the fitness of a fully stable gene (); see Fig. 1a and Eq. 13 in Methods section. We now compute the genetic load under phenotypic interference for stable and functional genes, which are located in the concave part of the minimal model landscape f(G). This part can be approximated by its exponential tail, where the load is proportional to the slope . Equation 6, , then predicts a load per gene, where we have used that typical reduced effect sizes are of order 1 (see Methods section). With Eq. 5, we obtain a quadratic scaling of the total equilibrium genetic load,
9 |
which sets in at a small gene number g0 given by Eqs. 7, 8 (Fig. 1c; numerical simulations are shown in Fig. 3). The superlinearity of the load is the most important biological consequence of phenotypic interference and the main difference from previous results on protein evolution31. It is generated by the evolutionary feedback between global and local selection discussed in Fig. 1: increasing the number of genes reduces the coalescence time and, thus, the efficacy of selection on every single gene.
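A short arithmetic sketch makes the consequence of quadratic scaling explicit: if the total load grows as g², the cost of one additional gene grows linearly with g. The amplitude `a` below is a hypothetical placeholder that absorbs the mutation rate and the fitness wave parameter; only the g-dependence is taken from the text.

```python
def total_load(g, a=1e-9):
    """Total equilibrium load with assumed quadratic scaling, a * g**2."""
    return a * g ** 2


def marginal_cost(g, a=1e-9):
    """Load added by one extra gene: a * (2 * g + 1), i.e. proportional to g."""
    return total_load(g + 1, a) - total_load(g, a)


# Doubling the genome size roughly doubles the cost of the next gene.
cost_ratio = marginal_cost(4000) / marginal_cost(2000)

# A 10% increase in gene number raises the total load by a factor 1.1**2.
load_ratio = total_load(1.1 * 4000) / total_load(4000)
```

The gene numbers 2000 and 4000 are illustrative. Under a load that is linear in g, `cost_ratio` would instead equal 1 and `load_ratio` would equal 1.1.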
In Supplementary Methods 3 and Supplementary Fig. 5, we discuss phenotypic interference in extended biophysical models. These include active protein degradation at the cellular level, a ubiquitous process that drives the thermodynamics of folding out of equilibrium48. Another example is the stability-affinity model, which has two quantitative traits per gene that evolve in a two-dimensional sigmoid fitness landscape f(G, E)35,49. Under reasonable biophysical assumptions, evolution in the stability-affinity model produces a 2-fold higher interference load than the minimal model, . Alternative models with a quadratic single-peak fitness landscape describe, for example, gene expression levels under stabilizing selection50. Such landscapes generate an even stronger load nonlinearity, . In contrast, a discrete model with a fitness effect f0 of each gene shows a linear load up to a characteristic gene number associated with the onset of mutational meltdown by Muller’s ratchet8,51,52. These examples suggest that superlinear scaling of the genetic load holds quite generally, given a sufficient number of quantitative traits evolving under genetic linkage and in fitness landscapes with negative epistasis. This type of landscape is ubiquitous in biophysical models.
The equilibrium load generates strong long-term selection against genome complexity: the fitness cost for each additional gene, , can take sizeable values even at moderate genome size. For example, in a “standard” microbe of the complexity of E. coli, a 10% increase in gene number may incur an additional load under the stability-affinity model (with parameters , , ). This estimate should be regarded as a lower bound, which is based only on core protein functions but ignores, for example, regulatory functions encoded in intergenic DNA. In comparison, the discrete model leads to a much smaller value for the same parameters.
Genetic load can exceed metabolic fitness cost
We can compare the interference load of an extra gene with its physiological fitness cost , which is generated primarily by the synthesis of additional proteins (and is part of the fitness amplitude f0). Metabolic theory shows that spurious expression leads to a re-allocation of metabolic resources in the cell and a reduced growth rate, , where is the proteome fraction of unnecessary genes and is the total proteome fraction available for growth ( for E. coli in exponential growth)37,38. A single gene with average expression level encodes a proteome fraction ; this leads to a metabolic cost per generation. Similarly, the energetic cost of a gene is of order 1/g53. While the precise form of these cost components depends on details of cell metabolism, we expect generically . For evolution under phenotypic interference, this implies for , which is similar to the interference load per gene in a standard microbe but becomes subleading in larger genomes.
The physiological cost per gene acts as a selective force on changes of genome size within a coalescence interval . The inequality says that such changes are weakly selected and suggests a two-scale evolution of genome sizes. On short time scales, the dynamics of gene numbers is permissive and allows the rapid acquisition of adaptive genes. On longer time scales (of order ; see Eq. 11 below), the interference load prunes marginally relevant genes in a more stringent way, for example, by invasion of strains with more compact genomes.
Interference drives gene loss
The near-neutral dynamics of genome size extends to gene losses, which become likely when a gene gets close to the inflection point of the sigmoid fitness landscape and the stability condition underlying Eq. 2 no longer holds (Fig. 4a). The relevant threshold gene selection, , is
10 |
in the minimal model; see Eq. 5. Strongly selected genes () have equilibrium trait values firmly on the concave part of the landscape, resulting in small loss rates of order 10; these genes can be maintained over extended evolutionary periods. Marginally selected genes () have near-neutral loss rates of order u10, generating a continuous turnover of genes. According to Eq. 10, the threshold for gene loss increases with genome size, which expresses again the evolutionary constraint on genome complexity. The dependence of the gene loss rate on f0 and is confirmed by simulations (Fig. 4b). The housekeeping coalescence rate sets a lower bound for , adaptive evolution can lead to much larger values of and .
Load accumulation in evolution experiments
After a change in gene number or other systems parameters, the evolutionary process reaches a new steady state. Because (additional) deleterious trait changes are only marginally selected (i.e., have selection coefficients of magnitude ), the relaxation time is of the order of the inverse mutation rate per trait,
11 |
where we have used Eq. 5. This time scale exceeds the coalescence time and is of order 10⁶ generations for a standard microbe. Hence, interference selection against complexity is a potent evolutionary force affecting natural populations but is beyond the time scales of laboratory evolution experiments.
Nevertheless, the phenotypic interference model makes testable predictions on load accumulation in laboratory populations. Consider a standard microbe that has an initial housekeeping interference load per gene and is subject to strong adaptive pressure in the experiment, generating an increased coalescence rate . Equations 6, 11 then predict a lower bound for the genome-wide rate of load increase, per generation. This loss reflects the system-wide collateral degradation of protein stability, which is caused by deleterious hitchhiker mutations of the adaptive process.
A collateral fitness decline of this type and magnitude has been observed in E. coli populations from long-term evolution experiments20–23. While the decline is masked in the original long-term experiments by a larger adaptive fitness gain21, it has been revealed by fitness measurements of the evolved strains on other substrates20. A substantial part of the fitness loss can be rescued in fitness assays at lower temperature, suggesting a link to protein stability20. The phenotypic interference model supports this interpretation. Protein stability G, as well as quantitative protein function traits, provides a large, genome-wide supply of weakly selected mutations prone to hitchhiking (). Moreover, the biophysical fitness landscapes of protein stability and affinity are explicitly temperature-dependent, which explains why fitness losses by deleterious mutations can be compensated by temperature reduction. We obtain a lower bound on the fitness loss related to the genome-wide attrition of these biophysical traits, per generation, by evaluating the temperature-rescuable part of the fitness decline in mutator lines (see Methods section). Nonsynonymous substitutions have been observed at a genome-wide rate per generation in these lines, and a large part appears to be effectively neutral hitchhikers22. Associating these substitutions with quantitative traits, the phenotypic interference model provides a lower-bound estimate per generation (see Methods section), which is consistent with the observed loss rate.
The pathway to sexual evolution
Recombination reshuffles genome segments at a rate R per genome and per generation (R is also called the genetic map length). Evolutionary models show that recombination generates linkage blocks that are units of selection. A block contains an average number of genes, such that there is one recombination event per block and per coalescence time, as given by the relation 13,15,54,55. Depending on R, these models predict a regime of asexual evolution, where selection acts on entire genotypes (), and a distinct regime of sexual evolution with selection acting on individual alleles (). Here we focus on the evolution of the recombination rate itself and establish a selective avenue for the transition from asexual to sexual evolution.
With the phenotypic interference scaling for , as given by Eq. 5, our minimal model produces an instability at a threshold recombination rate
12 |
signaling a first-order phase transition with the genetic load as order parameter. For , the population is in the asexual mode of evolution (), where phenotypic interference produces a superlinear load . For , efficient sexual evolution generates much smaller block sizes (). In this regime, the load drops to the linear form providing a net long-term evolutionary fitness gain . The first-order transition is a specific consequence of phenotypic interference. Because recombination rate and coalescence rate in a linkage block are both proportional to the block size , the recombination-coalescence balance criterion takes the -independent form . That is, linkage blocks cover either the entire genome () or just small genome segments (). The resulting drop of in recombining populations close to R* is confirmed by simulations (Fig. 5a). The process of recombination comes with a direct, short-term cost per generation, which includes mating costs, physiological costs, and deleterious reshuffling costs, and can potentially prevent the evolution of recombination. The classical factor 2 scenario of obligately sexual populations says that this cost is of order 1 per recombination event; that is, 28,56,57. For early isogamous populations without the full machinery of sexual reproduction, is likely to be smaller58. Importantly, in the phenotypic interference mode, this cost remains always marginal. Even the upper-bound assumption leads to a cost at the transition, which implies only weak negative selection.
Together, the theory of phenotypic interference suggests a specific selective pathway for the evolution of recombination (Fig. 5b). First, given that the evolution of recombination at a rate of order R* is near-neutral, a recombining sub-lineage with arising in an asexual background population can fix by genetic drift and draft. Second, a recombining strain with can eliminate the interference load by the parallel fixation of beneficial mutations in unlinked genome segments. This leads to a long-term benefit over non-recombining but otherwise equivalent strains; by Eq. 9, this benefit is of order 1 for a standard microbe. Hence, the evolved recombining strain can readily outcompete related non-recombining strains in the same ecological niche. The threshold recombination rate R* is of the order of the genome-wide mutation rate ug, so even rare facultative recombination provides a robust pathway to sexual evolution. This pathway builds on a separation of selection scales: the near-neutral establishment of recombination is followed by the buildup of a large benefit. We can compare observed recombination rates in natural populations with the predicted threshold rates R* (Supplementary Table 2). Consistently, genome-wide average rates for species in different parts of the tree of life are always well above R*; a high-resolution recombination map of the Drosophila genome shows low-recombining regions with values above but of order R*59,60.
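The two regimes on either side of the threshold can be summarized in a toy calculation. This sketch assumes R* = u·g with a prefactor of 1, a quadratic load below the threshold, and a linear load above it; the amplitudes `a_asex` and `a_sex`, the mutation rate, and the gene number are hypothetical values chosen only to show the discontinuous drop in load.

```python
def threshold_rate(u, g, prefactor=1.0):
    """Order-of-magnitude threshold R* ~ u * g (prefactor assumed to be 1)."""
    return prefactor * u * g


def equilibrium_load(g, R, u, a_asex=1e-8, a_sex=1e-6):
    """Quadratic load below R*, linear load above it (hypothetical amplitudes)."""
    if R < threshold_rate(u, g):
        return a_asex * g ** 2  # asexual mode: linkage blocks span the genome
    return a_sex * g            # sexual mode: genes evolve nearly independently


u, g = 1e-6, 4000  # illustrative mutation rate per trait and gene number
r_star = threshold_rate(u, g)
load_below = equilibrium_load(g, 0.5 * r_star, u)
load_above = equilibrium_load(g, 2.0 * r_star, u)
```

With these placeholder amplitudes the load falls by a factor of order g across the threshold, which mirrors the jump expected at a first-order transition; the actual magnitude depends on parameters not fixed in this sketch.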
The phenotypic interference pathway to recombination has highly universal characteristics: its long-term benefit of recombination is g-fold higher than the upper-bound cost, independently of details of the genome-wide selection and mutation landscape. In particular, this pathway does not require any of the strong assumptions of previous models for the evolution of recombination, which include direct benefits of recombination28,58,61,62, strong and continual adaptation61,63–65, and genome-wide epistasis between mutations28,65–68. It builds instead on local diminishing-return epistasis for functional traits of individual proteins, which is a natural consequence of their underlying biophysical mechanism. Recent fitness-wave models, which have an interference dynamics qualitatively similar to ours, quantify the difference in adaptive speed between clonally evolving and recombining populations7–9,11–13, but a direct cost-benefit balance of recombination based on genetic load has not been attempted. We note that these models assume mutations with a fixed distribution of selection coefficients and no local epistasis, which creates important quantitative differences to phenotypic interference. First, strongly deleterious effects of asexual evolution, which are associated with the onset of Muller’s ratchet, set in at larger genome sizes8 than under phenotypic interference (Fig. 1c). Second, the crossover to sexual evolution, which has been studied in the context of adaptive fitness waves, takes place at a larger recombination rate R15,69 and, hence, a larger recombination cost. A more detailed model comparison is given in Supplementary Discussion.
Discussion
Here we have developed the evolutionary genetics of multiple biophysical traits in non-recombining populations. Our approach combines quantitative trait theory with fitness wave theory. We find a specific evolutionary mode of phenotypic interference, which is characterized by a feedback between global and local selection. The system-wide genetic variation of the traits generates fitness variance, which, in turn, determines the scale of selection at local genomic sites encoding the traits. This feedback generates highly universal features, which do not depend on system details. These include the complexity of the evolutionary process and the scaling of coalescence rate and genetic load with the gene number, as given by Eqs. 5–9. A similar destructive feedback generating a superlinear cost has been identified in crosstalk of gene regulation70. Importantly, phenotypic interference also generates universal local selection. By Eq. 3, the average selective amplitude of trait-changing mutations decouples from the total fitness effect of the trait. That is, the spectrum of site selection coefficients is not a fixed input, but a dynamical output of the evolutionary process. This selection filter is the main difference of our approach to previous population-genetic models of asexual evolution5–13. We argue it is a relevant step towards biological realism.
Phenotypic interference depends on two prerequisites: selection is globally clonal and its local genomic units are broadly epistatic. The clonality of selection is a generic consequence of low recombination rates; broad fitness epistasis is a ubiquitous feature of biophysical gene traits, including protein stabilities and activities. Such traits have non-linear fitness landscapes, in which the selection on trait changes depends on the trait value (Fig. 1a).
We have shown that phenotypic interference produces systems-biological effects on different evolutionary time scales. In clonal adaptation experiments, it predicts a system-wide functional and fitness degradation in line with observations20–23. On macro-evolutionary scales, it generates strong selection against genome complexity in clonally reproducing populations. The underlying genetic load originates from the interference of phenotypic variants within a population and accumulates with a time delay beyond the coalescence time, as given by Eq. 11. Interference load acts as an evolutionary force in an ecological context: microbial strains with shorter genomes can outcompete otherwise similar strains with longer genomes that are in the same ecological niche. We have shown that this force, which arises naturally from a systems perspective of multiple biophysical traits, provides a robust eco-evolutionary pathway for the transition to recombination. Its selective input is local fitness epistasis, which occurs ubiquitously in quantitative molecular traits. Therefore, unlike previous models based on global epistasis28,65–68, this pathway does not require ad-hoc assumptions on the form of selection.
The target of phenotypic interference is molecular complexity, which can be regarded as a key systems-biological observable. In our simple biophysical models, we measure complexity by the number of stability and affinity traits in a proteome. This is clearly just a starting point towards a broader systems-biological approach that includes regulatory, signaling, and metabolic networks. These define additional landscapes of biophysical interactions, but the key evolutionary mechanisms of phenotypic interference (globally clonal selection and tuned, epistatic selection on system components) are expected to play out in a similar way. In a systems model, we can define complexity as the number of (approximately) independent molecular quantitative traits, which includes network contributions that scale in a nonlinear way with genome size. Interference selection affects the complexity and architecture of all of these networks, establishing new links between evolutionary and systems biology to be explored in future work.
Methods
Biophysical fitness models
In thermodynamic equilibrium at temperature T, a protein is folded with probability $p_+(G) = 1/[1 + \exp(-G/k_BT)]$, where G is the Gibbs free energy difference between the unfolded and the folded state and kB is Boltzmann’s constant. A minimal biophysical fitness model for proteins takes the form
$$f(G) = \frac{f_0}{1 + \exp(-G/k_BT)} + C \qquad (13)$$
with a single selection coefficient capturing functional benefits of folded proteins and metabolic costs of misfolding32–34. The constant C is irrelevant for the computation of fitness differences (selection coefficients). This model describes the effect of a protein on Malthusian (logarithmic) fitness, depending on its free energy of folding. Similar fitness models based on binding affinity have been derived for transcriptional regulation29,30,71,72; the rationale of biophysical fitness models has been reviewed in refs. 36,73. Equation 13 applies to genes with individually small fitness effects (). An appropriate extension to essential genes is a landscape describing zero growth (lethality) at a finite stability threshold G0, which corresponds to a singularity of the Malthusian fitness, f(G) → −∞ for . An example is the landscape , which has a threshold G0 given by for ; alternative models for essential genes are described in refs. 31,32. However, the extended fitness landscape retains the form Eq. 13 in the regime of stable folding (), which implies that our conclusions remain unaffected. In particular, the load per gene remains independent of the selection amplitude f0, as given by Eq. 9 and confirmed by simulations (Fig. 3). In Supplementary Methods 3, we introduce further alternative fitness landscapes for proteins and show that our results depend only on broad characteristics of these landscapes.
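The shape of this landscape can be explored numerically. The following Python sketch is only an illustration: it assumes the standard two-state form of the fold probability, 1/(1 + exp(−G/kBT)), measures G in units of kBT, and uses an arbitrary amplitude f0 and mutational effect. All function names are hypothetical and are not taken from the Supplementary Software.

```python
import math


def fold_probability(g_kt: float) -> float:
    """Two-state fold probability; g_kt is the free energy G in units of kB*T."""
    return 1.0 / (1.0 + math.exp(-g_kt))


def fitness(g_kt: float, f0: float, c: float = 0.0) -> float:
    """Minimal landscape: amplitude f0 times the fold probability, plus a constant C."""
    return f0 * fold_probability(g_kt) + c


def selection_coefficient(g_kt: float, effect_kt: float, f0: float) -> float:
    """Fitness difference caused by a mutation that shifts G by effect_kt."""
    return fitness(g_kt + effect_kt, f0) - fitness(g_kt, f0)


if __name__ == "__main__":
    f0 = 1e-3  # illustrative amplitude, not a value from the paper
    for g in (0.0, 2.0, 4.0, 6.0):
        s = selection_coefficient(g, -1.0, f0)
        print(f"G = {g:.0f} kBT: p_fold = {fold_probability(g):.4f}, s(-1 kBT) = {s:.2e}")
```

In this sketch, a destabilizing mutation of fixed size costs less fitness the more stable the protein already is, which is the diminishing-return epistasis of the landscape shoulder. The constant C cancels in every fitness difference.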
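The minimal global fitness landscape for a system of g genes with traits $G_1, \dots, G_g$ and selection coefficients $f_{0,1}, \dots, f_{0,g}$ is taken to be additive, i.e., without epistasis between genes,
$$f(G_1, \dots, G_g) = \sum_{i=1}^{g} \frac{f_{0,i}}{1 + \exp(-G_i/k_BT)} + C \qquad (14)$$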
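Additivity means that the fitness effect of a mutation in one gene does not depend on the state of any other gene. The short Python sketch below illustrates this under stated assumptions: a two-state single-gene landscape with traits in units of kBT and arbitrary example amplitudes. The function names are hypothetical.

```python
import math


def gene_fitness(g_kt, f0):
    """Single-gene contribution: amplitude f0 times the two-state fold probability."""
    return f0 / (1.0 + math.exp(-g_kt))


def genome_fitness(traits_kt, amplitudes):
    """Additive global fitness: a plain sum over genes with no cross-gene terms."""
    if len(traits_kt) != len(amplitudes):
        raise ValueError("need one amplitude per trait")
    return sum(gene_fitness(g, f0) for g, f0 in zip(traits_kt, amplitudes))


def mutation_effect(traits_kt, amplitudes, gene, effect_kt):
    """Change in global fitness when the trait of one gene shifts by effect_kt."""
    mutated = list(traits_kt)
    mutated[gene] += effect_kt
    return genome_fitness(mutated, amplitudes) - genome_fitness(traits_kt, amplitudes)


if __name__ == "__main__":
    amplitudes = [1e-3, 2e-3]  # illustrative values only
    for other_trait in (1.0, 5.0):
        s = mutation_effect([2.0, other_trait], amplitudes, gene=0, effect_kt=-1.0)
        print(f"trait of gene 1 = {other_trait} kBT: effect of mutation in gene 0 = {s:.3e}")
```

Both printed effects agree up to rounding. Epistasis therefore enters this model only through the nonlinearity of each single-gene landscape.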
Evolutionary model
We characterize the population genetics of an individual trait G by its population mean Γ and its expected variance ΔG. These follow the stochastic evolution equations45
15 |
16 |
These equations contain white noise of mean and variance and of mean and variance with an effective population size generated by genetic draft. This dynamics is characterized by the rate u, the mean effect (−κ), and the mean square effect of trait-changing mutations. We use effects − 3kBT, which have been measured for fold stability31,74 and for molecular binding traits29,75,76. Furthermore, we approximate the mutational bias by a constant , which reflects the observation that most mutations affecting a functional trait are deleterious.
Evolutionary equilibria for individual traits
We now derive the equilibrium conditions of the model given by Eqs. 15, 16, which are used in the main text. This involves three steps. First, the deterministic term in Eq. 16 determines the average trait diversity ΔG as given in Eq. 1, if we neglect the selection component (this will be justified in step three below). That is, ΔG follows from a mutation-coalescence balance: the trait gains a heritable variance ΔG by new mutations at a speed , and it loses variation by coalescence at a rate . Equation 1 is consistent with well-known results for the average sequence diversity Δ, indicating that diversity expectation values do not depend on details of the coalescence process. These results include the relation in the standard theory of neutral evolution, where Ne is proportional to the actual population size40. The same relation is obtained for the sequence diversity of neutral genomic sites in models of genetic draft41 and in fitness wave models, where is determined by selection14,42. To obtain the equivalent form for a quantitative trait G, we simply rescale the sequence diversity by the mean square effect 44,45, which leads to Eq. 1.
Second, the equilibrium point of the mean trait follows from a mutation-selection balance, as given by Eq. 2. For a trait with population mean Γ and variance ΔG, the rate of stability increase by selection, $\Delta_G f'(\Gamma)$, is essentially a statement of Fisher’s theorem; the corresponding rate of fitness increase reads
$$\Delta_G \, [f'(\Gamma)]^2 = \Delta_f \qquad (17)$$
The rate of stability decrease by mutations is the product of the total mutation rate per trait, u, and the mean effect per mutation (−κ) with the approximation as discussed above. In Supplementary Methods 1 and Supplementary Fig. 3, we derive the equilibrium of the mean trait in a fully stochastic calculus. We also note that the weakness of stabilizing selection on the trait diversity is consistent with finite directional selection on the population mean trait45.
Third, we can check a posteriori that the selection term in Eq. 16 can be self-consistently neglected. For stable genes, our biophysical traits live on the downward-curved shoulder of the fitness landscape (where ). The neutral relation (1) remains approximately valid for these traits if the resulting stabilizing selection on the trait diversity is negligible. This condition can be written in terms of the diversity load ,
18 |
see ref. 45. We now show that this condition is self-consistently fulfilled throughout the phenotypic interference regime. Evaluating the expected fitness curvature in the high-fitness part of the minimal fitness landscape, Eq. 13, where , and in the mutation-coalescence equilibrium given by Eq. 1, we obtain . By Eqs. 6, 18 then reduces to
19 |
which is identical to the condition for phenotypic interference, Eq. 8. We conclude that Eq. 1 is a valid approximation for the trait diversity throughout the phenotypic interference regime. This is confirmed by our simulation results (Supplementary Fig. 2a).
Housekeeping equilibrium and fitness waves of phenotypic interference
The deterministic equilibrium solution (, ) of Eq. 15 determines the dependence of ΔG and the associated fitness variance on , as given by Eq. 3; the same scaling follows from the full stochastic equation (Supplementary Methods 1). The derivation of the global housekeeping steady state, Eqs. 5–7, uses two additional inputs: the additivity of the fitness variance, , which is confirmed by our simulations (Supplementary Fig. 4), and the universal relation Eq. 4 in a fitness wave12,13. This relation is obtained by evaluating the total fitness span, in a population of finite census size N. Here fmax is the fitness maximum in the set of established mutations (i.e., mutations that have overcome genetic drift), which requires a mutant clone frequency . Given a Gaussian bulk fitness distribution , the tail condition for established mutations, , produces . Equation 4 then follows via the kinematic relation given by Fisher’s theorem. The prefactor c0 is model-dependent and known only in the infinitesimal fitness wave limit, e.g., in the model of refs. 12,13. Here we treat c0 as a fit parameter in simulations. The wave parameter c has a double interpretation in generic fitness wave models: it relates the total fitness span and the coalescence time to the fitness variance, and . The dependence of c on genome size under phenotypic interference, Eq. 7, is obtained by inserting Eqs. 5 into 4 and neglecting subleading terms . It is important to note that the housekeeping fitness wave describes a genome-wide mutation-selection steady state of constant mean fitness and without adaptive changes12,77, which is consistent with the equilibria of deleterious and beneficial substitutions in each gene30.
Local and global diversity scaling under phenotypic interference
Equation 19 expresses an important scaling property of the phenotypic interference regime: individual traits evolve in the low-mutation regime and are monomorphic at most times. In contrast, the cumulative variance of all traits defines a polymorphic fitness wave,
20 |
where we used Eq. 19. A related measure is the complexity of the fitness wave, defined as the average number of beneficial substitutions per coalescence time, . Here is the spectrum of site selection coefficients, which has the average by Eq. 3, and v+(s) is the equilibrium beneficial substitution rate at a site of selection coefficient s, which has a near-neutral regime for and rapidly decreases for . Hence, we obtain a wave complexity
21 |
with a prefactor of order 1; here we have used Eq. 5. By Eq. 7, the fitness wave measures Eqs. 20, 21 depend only weakly on g.
Onset of phenotypic interference
Interference effects on quantitative traits can be read off from the scaling of the genetic load, which has the linear form for independently evolving genes and is given by Eq. 9 in the phenotypic interference regime. Equating these relations identifies an onset gene number g0 given by
22 |
or equivalently by Eq. 8.
Evolutionary equilibria of stable genes
Equilibrium traits of genes with are located in the high-fitness part of the minimal fitness landscape, . These genes have an average fitness slope
23 |
an average trait , and an average load given by Eq. 9. This is in accordance with well-known population data of protein stability in microbial populations34: typical genes balance a few kBT above the melting point , which corresponds to the shoulder of the fitness landscape above the inflection point (Fig. 1a). The average stability has only a log-dependence on evolutionary rates.
Phenotypic interference in adaptive evolution
Here we show that the phenotypic interference scaling extends to simple models of adaptive evolution. In the minimal biophysical model, we assume that protein stabilities are still at local evolutionary equilibria of the universal form given by Eq. 3, generating a combined housekeeping component of the fitness variance, . The global fitness variance acquires an additional contribution from adaptive evolution of other system functions,
24 |
where is the adaptive fitness flux or rate of adaptive fitness gain78. This term quantifies the deviations of the adaptive evolutionary process from housekeeping evolution. Closure of the modified dynamics leads to an increased coalescence rate
25 |
and total interference load
26 |
Hence, the load retains the leading nonlinearity generated by housekeeping evolution, as given by Eq. 9; this is true even if we assume that is proportional to g. At high fitness flux (), coalescence becomes dominated by adaptation, leading to a further substantial decrease in the efficacy of selection. This is the likely regime of the laboratory evolution experiments discussed in the main text.
Fitness loss in evolution experiments
Bacterial lineages from the long-term evolution experiment of ref. 21 have been subject to fitness measurements in diverse environments20. These measurements show heterogeneous combinations of environment-specific fitness gains and losses compared to the ancestor strain. Mutator lines evolved over 50,000 generations show a higher average growth rate λ at temperature 30 °C than at temperature 37 °C. To extract a bona fide order-of-magnitude estimate of the fitness loss due to attrition of quantitative traits, we evaluate the population-average difference in log growth rate, generations, using the data provided in ref. 79. The observed average number of fixations per stable population clade is about 500/50 k generations22. These data provide the estimates and used in the main text, and they inform the model estimate with the standard microbe housekeeping value . We note two additional consistency checks: (a) The inferred average deleterious fitness effect per substitution, is of the order of the observed inverse coalescence time, supporting the conclusion that a large fraction of these changes is effectively neutral22. (b) Non-mutator lines, which have a 100-fold lower mutation rate, do not show evidence of a large proportion of effectively neutral fixations and have significantly lower ΔL.
Numerical simulations of phenotypic interference
We use a Wright-Fisher process to simulate the evolution of stability traits in a population. A population consists of N individuals with genomes . A genotype consists of g segments; each segment is a subsequence with binary alleles (; ). A segment a defines a stability trait , where G0 is the minimum trait value. The resulting effect distribution of point mutations has as a second moment and a first moment , where is the state-dependent probability of a mutation at site k being beneficial and brackets denote averaging across parallel simulations or time. The genomic fitness is with f(G) given by Eq. 13 and gene-specific amplitudes f0,i. In each generation, the sequences undergo point mutations with probability for each site, where is the generation time, and the sequences of the next generation are drawn by multinomial sampling with probabilities proportional to .
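The update rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the code of Supplementary Software 1. It assumes that each allele equal to 1 adds a uniform effect to the trait above its minimum value, that traits are measured in units of kBT, and that the sampling weights are the exponential of the Malthusian fitness per generation. These choices, like all function and parameter names, are assumptions made for the example.

```python
import math
import random


def trait(segment, g_min, effect):
    """Stability trait: minimum value plus `effect` per stabilizing allele (allele == 1)."""
    return g_min + effect * sum(segment)


def genome_fitness(genome, g_min, effect, amplitudes):
    """Additive fitness over gene segments, each with a two-state stability landscape."""
    return sum(
        f0 / (1.0 + math.exp(-trait(segment, g_min, effect)))
        for segment, f0 in zip(genome, amplitudes)
    )


def mutate(genome, mu, rng):
    """Flip each binary site independently with probability mu (returns a new genome)."""
    return [[1 - a if rng.random() < mu else a for a in segment] for segment in genome]


def next_generation(population, mu, g_min, effect, amplitudes, rng):
    """One Wright-Fisher step: point mutations, then fitness-weighted resampling."""
    mutated = [mutate(genome, mu, rng) for genome in population]
    # Assumed weights: exponential of the Malthusian fitness per generation.
    weights = [math.exp(genome_fitness(g, g_min, effect, amplitudes)) for g in mutated]
    # Drawing N parents with replacement is multinomial sampling of offspring numbers.
    return rng.choices(mutated, weights=weights, k=len(population))


if __name__ == "__main__":
    rng = random.Random(1)
    n_genes, n_sites, n_individuals = 3, 20, 50  # toy sizes, far below the paper's
    population = [[[1] * n_sites for _ in range(n_genes)] for _ in range(n_individuals)]
    for _ in range(100):
        population = next_generation(population, 1e-3, -2.0, 0.5, [0.05] * n_genes, rng)
    traits = [trait(seg, -2.0, 0.5) for genome in population for seg in genome]
    print(f"mean trait after 100 generations: {sum(traits) / len(traits):.2f} kBT")
```

Storing full genomes as nested lists keeps the sketch close to the verbal description; it is far too slow for population sizes of practical interest.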
Simulations are performed with parameters , , each trait with genomic base of size , and each site with equal effect . The population size N is smaller than in natural populations; this is compensated by an increased mutation rate to keep the product Nμ at a realistic value. The quantitative trait dynamics is insensitive to the form of the effect distribution45,80. To increase the performance of the simulations, we do not keep track of the full genome. We only store the number of deleterious alleles for each trait, we draw mutations with rate , and we assign to each mutation a beneficial change with probability and a deleterious change − otherwise. This procedure produces the correct genome statistics for bi-allelic sites with uniform trait effects . Simulation data are shown with theory curves for , which provide a good fit to all amplitudes; the input is different by a factor of order 1 which includes fluctuation effects (Supplementary Methods 1).
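The reduced bookkeeping can be illustrated as follows. This Python sketch assumes that a mutation hits a uniformly random site of a trait, so that it is beneficial with probability equal to the current fraction of deleterious alleles, and that the per-trait mutation rate u is small enough that at most one mutation per trait occurs in a generation. Both assumptions, and all names, belong to the example rather than to the original code.

```python
import random


def mutate_counts(deleterious, n_sites, u, rng):
    """Update the number of deleterious alleles per trait for one generation.

    deleterious: list with one count per trait; n_sites: sites per trait;
    u: mutation probability per trait and generation (assumed much smaller than 1).
    """
    updated = []
    for k in deleterious:
        if rng.random() < u:
            if rng.random() < k / n_sites:
                k -= 1  # the mutation hits a deleterious allele: beneficial change
            else:
                k += 1  # the mutation hits a beneficial allele: deleterious change
        updated.append(k)
    return updated


def trait_value(k, n_sites, g_min, effect):
    """Trait implied by k deleterious alleles when every site has the same effect."""
    return g_min + effect * (n_sites - k)


if __name__ == "__main__":
    rng = random.Random(2)
    counts = [0] * 5  # five traits, all sites initially in the beneficial state
    for _ in range(20000):
        counts = mutate_counts(counts, 100, 0.01, rng)
    print("deleterious alleles per trait without selection:", counts)
```

Because the beneficial probability is k divided by the number of sites, the counts can never leave the interval between zero and the number of sites. Without selection they drift towards half of the sites, as expected for unbiased bi-allelic mutation.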
Simulations are run until they reach a stationary state; we then take 2000–128,000 consecutive measurements (for largest to smallest ) every 400 generations. These intervals exceed the correlation time of the coalescence process. Therefore, measurements of the global observables σ2, , and , as well as the local variance δg, are decorrelated. Measurements of the other local variables s2, the loss rate, and Δf are averaged over all g genes.
For the simulations of housekeeping evolution in Figs. 2, 3, where we are not explicitly interested in the loss of genes, we use an exponential approximation of the stable regime of the stability fitness landscape. The reason is a limited accessible parameter range in simulations constraining the values of f0 and due to finite N. We checked that the exponential approximation gives the same results as the full model in the regime , where the gene loss rate in the biophysical landscape is negligible.
For the loss rate measurements of Fig. 4b, a long-term stationary population is maintained by evolving 70% of the traits in a biophysical fitness landscape with selection f0; the remaining 30% of the traits are modeled to be essential with selection 10f0. Gene loss is defined by the condition . To maintain a constant number of genes, lost genes are replaced immediately with an input trait value .
For simulations with recombination (Fig. 5a), we draw recombination events with rate NR for the whole population from a Poisson distribution. Each recombination event is implemented as one crossover between the genomes of two individuals at a random, uniformly distributed position of the genomes.
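A recombination step of this kind might look as follows in Python. The sketch assumes flat genomes stored as lists and a reciprocal exchange, in which both participating individuals swap the genome tails behind the crossover point; the text above only specifies a single crossover between two individuals, so the reciprocal form is an assumption, as are all names. The Poisson sampler uses Knuth's multiplication method because the Python standard library does not provide one.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample by Knuth's multiplication method (intended for moderate means)."""
    if mean > 500:
        raise ValueError("mean too large for this simple sampler")
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def crossover(genome_a, genome_b, position):
    """Reciprocal crossover: exchange the parts of two genomes from `position` onwards."""
    return (genome_a[:position] + genome_b[position:],
            genome_b[:position] + genome_a[position:])


def recombine(population, rate, rng):
    """Apply Poisson(N * rate) crossover events to a population of flat genomes."""
    population = list(population)
    n = len(population)
    length = len(population[0])
    for _ in range(poisson(n * rate, rng)):
        i, j = rng.sample(range(n), 2)     # two distinct individuals
        position = rng.randrange(1, length)  # uniform interior crossover point
        population[i], population[j] = crossover(population[i], population[j], position)
    return population


if __name__ == "__main__":
    rng = random.Random(3)
    population = [[0] * 8 for _ in range(5)] + [[1] * 8 for _ in range(5)]
    for genome in recombine(population, 0.5, rng):
        print("".join(map(str, genome)))
```

A reciprocal exchange leaves the allele frequencies at every site unchanged, so recombination in this sketch reshuffles existing variation without creating or removing any.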
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We thank T. Bollenbach and A. Sousa for discussions. This work has been supported by Deutsche Forschungsgemeinschaft grants SFB 680 and SFB 1310 (to M.L.). We acknowledge computational support by the CHEOPS platform at University of Cologne.
Author contributions
Conceptualization, all; Methodology, all; Software, T.H. and D.K.; Validation, all; Formal analysis, all; Investigation, all; Writing, all; Visualization, all; Supervision, M.L.; Funding Acquisition, M.L.
Data availability
The data generated from the simulations are available from the corresponding author upon reasonable request.
Code availability
The code for the simulations of this study is included as Supplementary Software 1.
Competing interests
The authors declare no competing interests.
Footnotes
Journal peer review information: Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
These authors contributed equally: Torsten Held, Daniel Klemmer.
Supplementary information
Supplementary Information accompanies this paper at 10.1038/s41467-019-10413-8.
References
- 1. Wiser MJ, Ribeck N, Lenski RE. Long-term dynamics of adaptation in asexual populations. Science. 2013;342:1364–1367. doi: 10.1126/science.1243357.
- 2. Barroso-Batista J, et al. The first steps of adaptation of Escherichia coli to the gut are dominated by soft sweeps. PLoS Genet. 2014;10:e1004182. doi: 10.1371/journal.pgen.1004182.
- 3. Betancourt AJ, Welch JJ, Charlesworth B. Reduced effectiveness of selection caused by a lack of recombination. Curr. Biol. 2009;19:655–660. doi: 10.1016/j.cub.2009.02.039.
- 4. Strelkowa N, Lässig M. Clonal interference in the evolution of influenza. Genetics. 2012;192:671–682. doi: 10.1534/genetics.112.143396.
- 5. Tsimring LS, Levine H, Kessler DA. RNA virus evolution via a fitness-space model. Phys. Rev. Lett. 1996;76:4440–4443. doi: 10.1103/PhysRevLett.76.4440.
- 6. Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population. Genetica. 1998;102:127–144. doi: 10.1023/A:1017067816551.
- 7. Desai MM, Fisher DS. Beneficial mutation–selection balance and the effect of linkage on positive selection. Genetics. 2007;176:1759–1798. doi: 10.1534/genetics.106.067678.
- 8. Rouzine IM, Brunet É, Wilke CO. The traveling-wave approach to asexual evolution: Muller’s ratchet and speed of adaptation. Theor. Popul. Biol. 2008;73:24–46. doi: 10.1016/j.tpb.2007.10.004.
- 9. Hallatschek O. The noisy edge of traveling waves. Proc. Natl Acad. Sci. USA. 2011;108:1783–1787. doi: 10.1073/pnas.1013529108.
- 10. Schiffels S, Szöllösi GJ, Mustonen V, Lässig M. Emergent neutrality in adaptive asexual evolution. Genetics. 2011;189:1361–1375. doi: 10.1534/genetics.111.132027.
- 11. Good BH, Rouzine IM, Balick DJ, Hallatschek O, Desai MM. Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc. Natl Acad. Sci. USA. 2012;109:4950–4955. doi: 10.1073/pnas.1119910109.
- 12. Neher RA, Hallatschek O. Genealogies of rapidly adapting populations. Proc. Natl Acad. Sci. USA. 2013;110:437–442. doi: 10.1073/pnas.1213113110.
- 13. Neher RA, Kessinger TA, Shraiman BI. Coalescence and genetic diversity in sexual populations under selection. Proc. Natl Acad. Sci. USA. 2013;110:15836–15841. doi: 10.1073/pnas.1309697110.
- 14. Rice DP, Good BH, Desai MM. The evolutionarily stable distribution of fitness effects. Genetics. 2015;200:321–329. doi: 10.1534/genetics.114.173815.
- 15. Neher RA. Genetic draft, selective interference, and population genetics of rapid adaptation. Annu. Rev. Ecol. Evol. Syst. 2013;44:195–215. doi: 10.1146/annurev-ecolsys-110512-135920.
- 16. de Visser AGJM, Zeyl CW, Gerrish PJ, Blanchard JL, Lenski RE. Diminishing returns from mutation supply rate in asexual populations. Science. 1999;283:404–406. doi: 10.1126/science.283.5400.404.
- 17. Cooper TF. Recombination speeds adaptation by reducing competition between beneficial mutations in populations of Escherichia coli. PLoS Biol. 2007;5:e225. doi: 10.1371/journal.pbio.0050225.
- 18. Perfeito L, Fernandes L, Mota C, Gordo I. Adaptive mutations in bacteria: high rate and small effects. Science. 2007;317:813–815. doi: 10.1126/science.1142284.
- 19. McDonald MJ, Rice DP, Desai MM. Sex speeds adaptation by altering the dynamics of molecular evolution. Nature. 2016;531:233–236. doi: 10.1038/nature17143.
- 20. Leiby N, Marx CJ. Metabolic erosion primarily through mutation accumulation, and not tradeoffs, drives limited evolution of substrate specificity in Escherichia coli. PLoS Biol. 2014;12:1–10. doi: 10.1371/journal.pbio.1001789.
- 21. Tenaillon O, et al. Tempo and mode of genome evolution in a 50,000-generation experiment. Nature. 2016;536:165–170. doi: 10.1038/nature18959.
- 22. Good BH, McDonald MJ, Barrick JE, Lenski RE, Desai MM. The dynamics of molecular evolution over 60,000 generations. Nature. 2017;551:45.
- 23. Couce A, et al. Mutator genomes decay, despite sustained fitness gains, in a long-term experiment with bacteria. Proc. Natl Acad. Sci. USA. 2017;114:E9026–E9035. doi: 10.1073/pnas.1705887114.
- 24. Fisher, R. A. The Genetical Theory of Natural Selection. (The Clarendon Press, Oxford, 1930).
- 25. Muller HJ. Some genetic aspects of sex. Am. Nat. 1932;66:118–138. doi: 10.1086/280418.
- 26. Eigen M. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften. 1971;58:465–523. doi: 10.1007/BF00623322.
- 27. Felsenstein J. The evolutionary advantage of recombination. Genetics. 1974;78:737–756. doi: 10.1093/genetics/78.2.737.
- 28. Kondrashov AS. Classification of hypotheses on the advantage of amphimixis. J. Hered. 1993;84:372–387. doi: 10.1093/oxfordjournals.jhered.a111358.
- 29. Gerland U, Hwa T. On the selection and evolution of regulatory DNA motifs. J. Mol. Evol. 2002;55:386–400. doi: 10.1007/s00239-002-2335-z.
- 30. Berg J, Willmann S, Lässig M. Adaptive evolution of transcription factor binding sites. BMC Evol. Biol. 2004;4:42. doi: 10.1186/1471-2148-4-42.
- 31. Zeldovich KB, Chen P, Shakhnovich EI. Protein stability imposes limits on organism complexity and speed of molecular evolution. Proc. Natl Acad. Sci. USA. 2007;104:16152–16157. doi: 10.1073/pnas.0705366104.
- 32. Chen P, Shakhnovich EI. Lethal mutagenesis in viruses and bacteria. Genetics. 2009;183:639–650. doi: 10.1534/genetics.109.106492.
- 33. Goldstein RA. The evolution and evolutionary consequences of marginal thermostability in proteins. Proteins. 2011;79:1396–1407. doi: 10.1002/prot.22964.
- 34. Serohijos AW, Shakhnovich EI. Merging molecular mechanism and evolution: theory and computation at the interface of biophysics and evolutionary population genetics. Curr. Opin. Struct. Biol. 2014;26:84–91. doi: 10.1016/j.sbi.2014.05.005.
- 35. Manhart M, Morozov AV. Protein folding and binding can emerge as evolutionary spandrels through structural coupling. Proc. Natl Acad. Sci. USA. 2015;112:1797–1802. doi: 10.1073/pnas.1415895112.
- 36. Chi PB, Liberles DA. Selection on protein structure, interaction, and sequence. Protein Sci. 2016;25:1168–1178. doi: 10.1002/pro.2886.
- 37. Scott M, Gunderson CW, Mateescu EM, Zhang Z, Hwa T. Interdependence of cell growth and gene expression: origins and consequences. Science. 2010;330:1099–1102. doi: 10.1126/science.1192588.
- 38. Basan M, et al. Overflow metabolism in Escherichia coli results from efficient proteome allocation. Nature. 2015;528:99–104. doi: 10.1038/nature15765.
- 39. Lynch, M. & Walsh, B. Genetics and Analysis of Quantitative Traits (Sinauer Associates Inc, Sunderland, 1998).
- 40. Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge University Press, Cambridge, 1983).
- 41. Gillespie JH. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics. 2000;155:909–919. doi: 10.1093/genetics/155.2.909.
- 42. Good BH, Walczak AM, Neher RA, Desai MM. Genetic diversity in the interference selection limit. PLoS Genet. 2014;10:e1004222. doi: 10.1371/journal.pgen.1004222.
- 43. Lynch M, Hill WG. Phenotypic evolution by neutral mutation. Evolution. 1986;40:915–935. doi: 10.1111/j.1558-5646.1986.tb00561.x.
- 44. Keightley PD, Hill WG. Quantitative genetic variability maintained by mutation-stabilizing selection balance in finite populations. Genet. Res. 1988;52:33–43. doi: 10.1017/S0016672300027282.
- 45. Nourmohammad A, Schiffels S, Lässig M. Evolution of molecular phenotypes under stabilizing selection. J. Stat. Mech. Theor. Exp. 2013;2013:P01012. doi: 10.1088/1742-5468/2013/01/P01012.
- 46. Wylie CS, Shakhnovich EI. A biophysical protein folding model accounts for most mutational fitness effects in viruses. Proc. Natl Acad. Sci. USA. 2011;108:9916–9921. doi: 10.1073/pnas.1017572108.
- 47. Charlesworth B. Stabilizing selection, purifying selection, and mutational bias in finite populations. Genetics. 2013;194:955–971. doi: 10.1534/genetics.113.151555.
- 48. Hochstrasser M. Ubiquitin-dependent protein degradation. Annu. Rev. Genet. 1996;30:405–439. doi: 10.1146/annurev.genet.30.1.405.
- 49. Chéron N, Serohijos AWR, Choi J-M, Shakhnovich EI. Evolutionary dynamics of viral escape under antibodies stress: a biophysical model. Protein Sci. 2016;25:1332–1340. doi: 10.1002/pro.2915.
- 50. Nourmohammad A, et al. Adaptive evolution of gene expression in Drosophila. Cell Rep. 2017;20:1385–1395. doi: 10.1016/j.celrep.2017.07.033.
- 51. Muller HJ. The relation of recombination to mutational advance. Mutat. Res. 1964;106:2–9. doi: 10.1016/0027-5107(64)90047-8.
- 52. Gordo I, Charlesworth B. The degeneration of asexual haploid populations and the speed of Muller’s ratchet. Genetics. 2000;154:1379–1387. doi: 10.1093/genetics/154.3.1379.
- 53. Lynch M, Marinov GK. The bioenergetic costs of a gene. Proc. Natl Acad. Sci. USA. 2015;112:15690–15695. doi: 10.1073/pnas.1421641112.
- 54. Weissman DB, Barton NH. Limits to the rate of adaptive substitution in sexual populations. PLoS Genet. 2012;8:1–18. doi: 10.1371/journal.pgen.1002740.
- 55. Weissman DB, Hallatschek O. The rate of adaptation in large sexual populations with linear chromosomes. Genetics. 2014;196:1167–1183. doi: 10.1534/genetics.113.160705.
- 56. Maynard Smith, J. Group Selection 163–175 (Aldine Atherton, Chicago, 1971).
- 57. Maynard Smith, J. The Evolution of Sex. Technical Report (Cambridge University Press, Cambridge, 1978).
- 58. Lehtonen J, Jennions MD, Kokko H. The many costs of sex. Trends Ecol. Evol. 2012;27:172–178. doi: 10.1016/j.tree.2011.09.016.
- 59. Comeron JM, Ratnappan R, Bailin S. The many landscapes of recombination in Drosophila melanogaster. PLoS Genet. 2012;8:1–21. doi: 10.1371/journal.pgen.1002905.
- 60.Schiffels, S., Mustonen, V. & Lässig, M. The asexual genome of Drosophila. Preprint at https://arxiv.org/abs/1711.10849 (2017).
- 61.Bernstein, H., Hopf, F. A. & Michod, E. in The Evolution of Sex 139–160 (Sinauer Press, Sunderland, MA, 1988).
- 62.Whitlock MC, Agrawal AF. Purging the genome with sexual selection: reducing mutation load through selection on males. Evolution. 2009;63:569–582. doi: 10.1111/j.1558-5646.2008.00558.x.
- 63.Hamilton WD. Sex versus non-sex versus parasite. Oikos. 1980;35:282–290. doi: 10.2307/3544435.
- 64.Salathé M, Kouyos RD, Bonhoeffer S. The state of affairs in the kingdom of the red queen. Trends Ecol. Evol. 2008;23:439–445. doi: 10.1016/j.tree.2008.04.010.
- 65.Hartfield M, Keightley PD. Current hypotheses for the evolution of sex and recombination. Integr. Zool. 2012;7:192–209. doi: 10.1111/j.1749-4877.2012.00284.x.
- 66.Kondrashov AS. Selection against harmful mutations in large sexual and asexual populations. Genet. Res. 1982;40:325–332. doi: 10.1017/S0016672300019194.
- 67.Kouyos RD, Silander OK, Bonhoeffer S. Epistasis between deleterious mutations and the evolution of recombination. Trends Ecol. Evol. 2007;22:308–315. doi: 10.1016/j.tree.2007.02.014.
- 68.Neher RA, Shraiman BI. Competition between recombination and epistasis can cause a transition from allele to genotype selection. Proc. Natl Acad. Sci. USA. 2009;106:6866–6871. doi: 10.1073/pnas.0812560106.
- 69.Neher RA, Shraiman BI, Fisher DS. Rate of adaptation in large sexual populations. Genetics. 2010;184:467–481. doi: 10.1534/genetics.109.109009.
- 70.Friedlander T, Prizak R, Guet CC, Barton NH, Tkačik G. Intrinsic limits to gene regulation by global crosstalk. Nat. Commun. 2016;7:12307.
- 71.Mustonen V, Kinney J, Callan CGJ, Lässig M. Energy-dependent fitness: A quantitative model for the evolution of yeast transcription factor binding sites. Proc. Natl Acad. Sci. USA. 2008;105:12376–12381. doi: 10.1073/pnas.0805909105.
- 72.Friedlander T, Prizak R, Barton NH, Tkačik G. Evolution of new regulatory functions on biophysically realistic fitness landscapes. Nat. Commun. 2017;8:216. doi: 10.1038/s41467-017-00238-8.
- 73.Lässig M. From biophysics to evolutionary genetics: statistical aspects of gene regulation. BMC Bioinform. 2007;8:S7. doi: 10.1186/1471-2105-8-S6-S7.
- 74.Tokuriki N, Stricher F, Schymkowitz J, Serrano L, Tawfik DS. The stability effects of protein mutations appear to be universally distributed. J. Mol. Biol. 2007;369:1318–1332. doi: 10.1016/j.jmb.2007.03.069.
- 75.Kinney JB, Murugan A, Callan CG, Cox EC. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA. 2010;107:9158–9163. doi: 10.1073/pnas.1004290107.
- 76.Tuğrul M, Paixão T, Barton NH, Tkačik G. Dynamics of transcription factor binding site evolution. PLoS Genet. 2015;11:1–28. doi: 10.1371/journal.pgen.1005639.
- 77.Goyal S, et al. Dynamic mutation–selection balance as an evolutionary attractor. Genetics. 2012;191:1309–1319. doi: 10.1534/genetics.112.141291.
- 78.Mustonen V, Lässig M. Fitness flux and ubiquity of adaptive evolution. Proc. Natl Acad. Sci. USA. 2010;107:4248–4253. doi: 10.1073/pnas.0907953107.
- 79.Leiby, N. & Marx, C. Data from: metabolic erosion primarily through mutation accumulation, and not tradeoffs, drives limited evolution of substrate specificity in Escherichia coli. Dryad Digital Repository. doi: 10.5061/dryad.7g401 (2014).
- 80.Held T, Nourmohammad A, Lässig M. Adaptive evolution of molecular phenotypes. J. Stat. Mech. 2014;2014:P09029. doi: 10.1088/1742-5468/2014/09/P09029.
Data Availability Statement
The data generated from the simulations are available from the corresponding author upon reasonable request.
The code for the simulations of this study is included as Supplementary Software 1.