Abstract
Predicting evolutionary outcomes is an important research goal in a diversity of contexts. The focus of evolutionary forecasting is usually on adaptive processes, and efforts to improve prediction typically focus on selection. However, adaptive processes often rely on new mutations, which can be strongly influenced by predictable biases in mutation. Here, we provide an overview of existing theory and evidence for such mutation-biased adaptation and consider the implications of these results for the problem of prediction, in regard to topics such as the evolution of infectious diseases, resistance to biochemical agents, as well as cancer and other kinds of somatic evolution. We argue that empirical knowledge of mutational biases is likely to improve in the near future, and that this knowledge is readily applicable to the challenges of short-term prediction.
This article is part of the theme issue ‘Interdisciplinary approaches to predicting evolutionary biology’.
Keywords: adaptation, mutation, prediction, theory, population genetics
1. Introduction
Predicting the dynamics and outcome of evolution is an important goal of the biological sciences, offering the potential to design better drugs, combat pathogens and conserve endangered species [1–11]. Targets of prediction include genetic changes underlying adaptation, such as those causing antibiotic resistance or enhancing thermostability, as well as their corresponding phenotypes, such as minimum inhibitory concentration or melting temperature [12,13]. Higher-level targets include the diversity, abundance and ecosystem functions of microbial communities [14], as well as the rate of adaptation itself [13].
Owing to the stochastic nature of the evolutionary process, forecasting offers the greatest potential over short and intermediate timescales. Our ability to make accurate forecasts depends crucially on high-quality experimental data, such as those describing the phenotypic or fitness effects of mutations. For example, over short timescales, where one may wish to predict the next beneficial mutation to arise and go to fixation, empirical knowledge of the distribution of fitness effects is key, because this provides information about the fixation probabilities of new mutations [15]. At intermediate timescales, where one may wish to predict which of several possible mutational trajectories to adaptation is the most likely, empirical knowledge of the fitness effects of combinations of mutations is key, because this can be used to delineate between mutational trajectories that ascend adaptive peaks from those that fall into maladaptive valleys [2]. As such, the project of predicting evolution has benefited greatly from recent advances in high-throughput sequencing technologies and phenotypic assays, which ameliorate so-called ‘data limits’ on accurate forecasting [7]. These technologies have been used to characterize the phenotypic and fitness effects of mutations in a diversity of biological systems, including regulatory elements [16,17], macromolecules [18–24], gene regulatory circuits [25,26] and metabolic pathways [27].
However, empirical knowledge of the phenotypic and fitness effects of mutations only takes us so far. Whereas these data provide useful information about the likelihood of mutations going to fixation, they tell us nothing about the rate with which new mutations are introduced into a population. This is an important limitation, because evolution often proceeds via the introduction of new mutations, and some types of mutations are more likely to arise than others [28,29]. For example, studies of the rates and spectra of spontaneous mutations, such as those based on mutation accumulation experiments, have revealed a bias towards transitions (purine-to-purine or pyrimidine-to-pyrimidine changes), relative to transversions (purine-to-pyrimidine changes, or vice versa) in a wide range of species [30]. The exact degree of transition bias emerging under any particular set of conditions is the net outcome of biases in all stages in the genesis of nucleotide mutations, including biases in susceptibility to damage (e.g. oxidative damage), in the efficiency of damage recognition and repair, in rates of polymerase errors and proofreading, and in the efficiency of recognition and repair of mispaired bases (see [31,32]). Because transition bias and other kinds of mutation bias make some mutational steps to adaptation more likely than others, empirical knowledge of mutation bias offers the potential to improve evolutionary forecasting, both at short and intermediate timescales.
Here, we address how effects of mutational biases—predictable differences in rates between different categories of mutational conversions—make evolution more predictable, focusing mostly on the case of short-term adaptation from new mutations, and setting aside some related topics such as the role of specialized mutation-generating systems [33, ch. 5] and hypermutators [34]. First, we review theoretical work suggesting that such biases can exert a strong influence on the outcome of evolutionary processes, including adaptive processes, that depend on new mutations. Next, we review the empirical case for an influence of mutation bias on adaptation in the laboratory and in nature. Finally, we discuss specific applications where empirical knowledge of mutation bias is anticipated to improve evolutionary forecasting, in regard to topics such as infectious diseases, cancer and other kinds of somatic evolution, as well as resistance to biochemical agents. We note some recurring themes: (i) the most commonly observed outcome is often the most mutationally favourable of the adaptive options, not the most fit, (ii) ordinary nucleotide mutation biases often have strong and predictable effects on the genetic changes underlying adaptation, (iii) perturbing the mutation spectrum alters the distribution of such changes, and (iv) the influence of mutation biases can be altered by the beneficial mutation supply and other population-genetic and environmental conditions. In general, we argue that knowledge of mutation can improve predictability in practical contexts. We conclude with comments on open questions and future prospects.
2. Theory
Under what conditions will empirical knowledge of mutation bias improve evolutionary forecasting? To address this question, we first turn to theory. The classic ‘Modern Synthesis’ view assumes evolution from standing variation in an abundant gene pool, so that the process of evolution is formally a process of recombining and shifting frequencies of available alleles without new mutations [33,35,36]. In this context, adaptation happens by selectively favourable shifts in frequencies of multi-locus combinations of small-effect alleles generated by recombination [37–40]. The role of mutation is strictly limited: recurrent mutation acts only as a weak pressure, ineffectual except when mutation rates are high and unopposed by selection [41–43]. Therefore, in this theory, the predictability of evolution emerges from a consideration of selection: in the short-term, an evolving population ascends a fitness gradient in a multi-locus allele-frequency space; in the long term, it approaches a local or global maximum of fitness.
A different view of the roles of mutation and selection emerged during the molecular revolution. Comparisons of protein sequences suggested that evolutionary divergence occurs by the accumulation of individual substitutions of amino acid residues, where each substitution reflects a mutation that was promoted—or at least, tolerated—by selection, which was conceptualized as a filter acting on individual mutations [44–46]. This way of thinking placed the process of mutation in the more important role of offering individual variants directly for selective filtering (rather than merely filling up the gene pool to facilitate subsequent recombination). This conception of evolution as a two-step process was formalized in ‘origin-fixation’ models, which depict the limiting behaviour of evolution when the number of new mutations introduced per generation becomes arbitrarily small [47]. In an origin-fixation model, the rate of evolution is determined by the product of a rate of ‘introduction’ or origin Nμ and a probability of fixation π, i.e. R = Nμπ.
Importantly, this new way of thinking about evolution suggests an increased influence for mutation biases, because the likelihood of each possible step will depend on the likelihood of the underlying mutation. For evolution in the origin-fixation regime, mutational biases (i.e. biases in origination) and biases in fixation each have proportional effects on the course of evolution [29], i.e. we can express a ratio of origin-fixation rates in terms of these two different types of biases:
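$$\frac{R_{ij}}{R_{ik}} = \frac{N\mu_{ij}\pi_{ij}}{N\mu_{ik}\pi_{ik}} = \frac{\mu_{ij}}{\mu_{ik}} \times \frac{\pi_{ij}}{\pi_{ik}} \tag{2.1}$$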
where Rij is the rate of change from allele i to allele j, μij is the mutation rate from allele i to allele j, πij is the chance of fixation of a new allele of type j in a population otherwise of type i, and N is the population size (see also [48]). That is, the evolutionary bias between two alternative types of changes, i → j versus i → k, can be expressed as the product of a bias in origination (e.g. transition-transversion bias or GC-AT bias) and a bias in fixation [29,48]. This means that biases in the introduction process can influence adaptation even when mutation rates are low and selection is strong, in contrast to the classical view in which internal biases are assumed to require evolution by mutation pressure [41–43], which requires high rates of mutation.
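The product rule described above can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the population size, mutation rates and selection coefficients are hypothetical, and the fixation probability is approximated by Haldane's 2s, an assumption that holds only for small positive s in large populations.

```python
def origin_fixation_rate(N, mu, pi):
    """Rate of an origin-fixation step, R = N * mu * pi."""
    return N * mu * pi


def fixation_prob(s):
    """Haldane's approximation pi ~ 2s (an assumption; small s > 0 only)."""
    return 2.0 * s


# Hypothetical values chosen only for illustration.
N = 1e6                      # population size
mu_ij, mu_ik = 4e-9, 1e-9    # fourfold mutation bias favouring i -> j
s_ij, s_ik = 0.01, 0.02      # allele k is twice as beneficial as allele j

R_ij = origin_fixation_rate(N, mu_ij, fixation_prob(s_ij))
R_ik = origin_fixation_rate(N, mu_ik, fixation_prob(s_ik))

mutation_bias = mu_ij / mu_ik                            # bias in origination
fixation_bias = fixation_prob(s_ij) / fixation_prob(s_ik)  # bias in fixation
rate_ratio = R_ij / R_ik                                 # product of the two biases

print(mutation_bias, fixation_bias, rate_ratio)
```

In this hypothetical case the less fit allele j is still the more likely outcome, because its fourfold mutational advantage outweighs its twofold disadvantage in fixation.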
The equation above reflects origin-fixation conditions, and is useful for thinking about short-term evolution, or about long-term evolution in an infinite space. What about less ideal conditions, e.g. extended adaptive walks on finite landscapes? To grasp the potential effects of mutation bias on adaptive walks, it is helpful to consider the different perspectives of points, paths, local peaks and landscapes. From a typical point on a complex landscape, multiple upward (fitness-increasing) steps are possible, and some are mutationally favourable (whereas others are not), so that the orientation of an evolving system may be biased. Any path of upward steps eventually ends at some local peak, and some paths are enriched in mutationally favourable steps (whereas others are not), so that a system evolving under a bias may favour some paths over others. From the perspective of peaks, each fitness peak is accessible by some set of upward paths, and this set of paths may differ in size, and may be more or less enriched for mutationally favourable paths, so that certain peaks may be more likely outcomes of evolution, averaging over many possible starting points. Finally, for a given landscape with many peaks, we can define all the upward paths, i.e. all the possible adaptive walks, and thus some landscapes will have more mutationally favoured walks, making them more navigable.
Evolutionary simulations on complex adaptive landscapes confirm these broad expectations and provide some guidance on the size of effects [49–53]. For instance, one study [52] modelled adaptive walks using an NK model of fitness applied to a protein encoded by a gene subject to variable GC : AT bias, finding that a several-fold bias in mutation can have a substantial impact on the amino acid composition of evolved proteins. Cano & Payne [49] explored the effect of transition-transversion bias on the navigability of empirical landscapes for transcription-factor binding sites, finding that the landscapes are most navigable when the mutation bias matches the bias inherent in the landscape. Schaper & Louis [51] found that RNA folds with the most sequences are more findable in adaptation.
How far do effects of mutation biases extend outside of the strict origin-fixation regime that emerges as the mutation supply Nμ becomes arbitrarily small? In the hypothetical case of an infinite-sites model, mutation biases are influential regardless of mutation supply (appendix A). For finite cases, the results of Yampolsky & Stoltzfus [29] suggest that biases in the introduction process decay with mutation supply but remain influential well outside the origin-fixation regime. Subsequent work has clarified this relationship [56–58]. In particular, Cano et al. [56] used simulations to study the effect of mutation supply in a codon-based model of protein adaptation. They quantified the effect of mutation bias with a single statistic, β, which ranges from 0, indicating no influence, to 1, indicating that the spectrum of amino acid-changing mutations has a proportional influence on the spectrum of changes fixed in adaptation. They found that β ≈ 1 when the mutation supply is low (Nμ ≈ 10⁻⁴), and ultimately goes to 0 for high mutation supply, with most of the shift from 1 to 0 happening as mutation supply goes from 10⁻² to 10⁰.
Finally, what are the implications for predictability? As explained in appendix B, considering a single adaptive step, predictability (in the sense of repeatability) decreases with the number of possibilities, and increases with the variance in their probabilities [63,66]. This predictability can be partitioned further (under limiting conditions explained in appendix B) into contributions of mutation and fixation. The separate terms have the same property that, the greater the variability in the probability of fixation π, or the greater the variability in μ, the greater the contribution to repeatability. An important implication of this theoretical result is that, in designing approaches to prediction, it is important to capture as much variance as possible in elementary chances, and to treat mutation and selection comparably to avoid a skewed picture of their contributions. For instance, if 40 different beneficial mutations are possible, and we use individual fitness measurements for each s, but characterize each μ with an average rate from a model of six types of rates, this artificially reduces the expected contribution of mutation to repeatability, given that such simplified models capture only a minority of the variance in individual mutation rates [67].
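The dependence of repeatability on the number of options and on the variance in their probabilities can be made concrete with a short sketch. It assumes a common definition that is not spelled out in this section (the details are in appendix B): repeatability is taken to be the chance that two independent replicates take the same step, i.e. the sum of squared outcome probabilities, which equals (1 + CV²)/n for n options whose probabilities have coefficient of variation CV. All rates below are hypothetical.

```python
from statistics import mean, pvariance


def outcome_probs(mu, pi):
    """Origin-fixation chances: p_i proportional to mu_i * pi_i."""
    weights = [m * p for m, p in zip(mu, pi)]
    total = sum(weights)
    return [w / total for w in weights]


def repeatability(probs):
    """Chance that two independent replicates take the same step."""
    return sum(p * p for p in probs)


def repeatability_from_cv(probs):
    """The same quantity written as (1 + CV^2) / n."""
    n = len(probs)
    cv2 = pvariance(probs) / mean(probs) ** 2
    return (1 + cv2) / n


# Hypothetical example: four beneficial mutations with equal fixation chances.
pi = [0.02, 0.02, 0.02, 0.02]
uniform = outcome_probs([1e-9] * 4, pi)                # no mutation bias
biased = outcome_probs([8e-9, 1e-9, 1e-9, 1e-9], pi)   # one eightfold-favoured mutation

rep_uniform = repeatability(uniform)  # equals 1/n when all options are equally likely
rep_biased = repeatability(biased)    # larger, because the probabilities vary

print(rep_uniform, rep_biased, repeatability_from_cv(biased))
```

Replacing the individual mutation rates with a class average would shrink the variance term and therefore understate the contribution of mutation, which is the point made above about treating μ and s at comparable resolution.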
What about predictability in long-term adaptive walks? In the special case of adaptation on a fixed and finite landscape without epistasis, the evolving system will converge on a single global peak, and mutation bias will influence the trajectory and the length of the walk, but not the final destination. In any other case, mutation bias may influence the direction, length and ultimate destination of a walk, as outlined above. Predictability has a somewhat counterintuitive relationship to mutation bias when a system with a particular bias is on a landscape enriched for upward paths favoured by that bias. In this case, as shown by Cano & Payne [49], there is a larger set of upward paths enriched for mutationally favourable changes, and so the particular path taken in any instance of adaptation is less predictable.
Predicting evolutionary trajectories is further complicated by the potential for changes in the mutation spectrum itself, which can occur even on short timescales, owing to transient changes in environmental conditions [68,69]. Durable genetic changes in the mutation spectrum that may be important in evolution on various timescales include (i) the emergence of hypermutators with greatly enhanced mutation rates and distinct mutation spectra [34,70], (ii) changes that modify mutation spectra without dramatic changes in total mutation rate [71,72], (iii) long-term changes in DNA repair repertoire including the loss and gain of entire pathways [73], (iv) shifts in (and long-term equilibration of) the genomic frequency of sequence contexts under the long-continued influence of context-dependent mutation [74], (v) genome-wide patterns of adaptive amelioration reducing the frequency or severity of deleterious mutations [75,76], and (vi) bias reversals that temporarily enhance the rate of adaptation by enhancing mutational access to previously under-sampled classes of beneficial mutations [50,53].
In summary, theory suggests that mutation bias can influence adaptation under a broad range of population genetic conditions, with the strongest signal of mutational influence appearing when the mutation supply is low. Mutation biases can influence both the outcomes of short-term adaptation, and the trajectory, length, and outcome of adaptive walks, dependent on conditions. The extent to which empirical knowledge of mutation bias will improve evolutionary forecasting depends on the extent to which natural systems evolve under conditions favourable to these effects. Because this is an empirical issue, we next turn to experimental evidence, from the laboratory and nature.
3. Evidence
As outlined above, theory suggests that, where conditions allow, systematic biases in mutation can shape the course of adaptation via biases in the introduction process. What is the evidence that this kind of causation is real? What do we know about effect-sizes under various conditions? How well do these effects fit theoretical expectations? How broadly are such effects expected?
(a) Causal agency
To begin, one may ask which studies establish causal agency, i.e. show beyond reasonable doubt that X causes Y. The gold standard is to manipulate X and show the expected effects on Y under controlled conditions. This standard is satisfied by the work of Couce et al. [77] and Horton et al. [78], laboratory studies with microbial systems, involving adaptation from new mutations under controlled conditions that include direct manipulation of the mutation spectrum.
Couce et al. [77] subjected 192 replicate lines of Escherichia coli to increasing concentrations of the β-lactam antibiotic cefotaxime, using three different parental strains: wild-type, mutH and mutT. The latter two are mutators with higher overall rates of mutation and distinctive biases toward transitions (mutH) or A : T → C : G transversions (mutT). Figure 1 shows the resistance-conferring mutations that arose in ftsI, the gene in which most of these mutations are found. The resistance-conferring mutations from mutT isolates (blue) tend to be A : T → C : G transversions (left block of bars), which are the type favoured by mutT, whereas the resistance-conferring mutations that evolved in the mutH strain (red) tend to be the transitions (centre block of bars) favoured by mutH. That is, changing the mutation spectrum changes the spectrum of adaptive changes in a corresponding manner.
The second study, by Horton et al. [78], was motivated by the observation that two different strains of Pseudomonas fluorescens adapt to the loss of motility in strikingly different ways. In one strain, over 95 per cent of the time, adaptation involved an A289C change in the ntrB locus, whereas in the other strain, adaptation involved mutations in diverse genes. They identified a hotspot mutation associated with synonymous sequence differences in the two strains. To test that the mutational hotspot caused the difference in adaptation, they used genetic engineering to create the hotspot in one strain, and remove it in the other—all without changing the protein sequence (because the engineered changes were synonymous). The results confirmed the mutational hypothesis. When the hotspot was removed, adaptation no longer relied on the mutation in the ntrB locus; and when the hotspot was engineered, adaptation no longer involved mutations in diverse genes, but rather relied on the A289C mutation.
(b) Range of effect-sizes
Having established causal agency with studies that involve unusual conditions—some mutators and a hotspot—let us now ask about effect-sizes when ordinary nucleotide mutation biases are involved, and particularly, let us consider whether quantitative relationships between s, μ and the frequency with which a variant evolves f are roughly what we expect from theory. Several studies are useful in this regard. We will focus here on Maclean et al. [62], Rokyta et al. [59] and Cano et al. [56].
Maclean et al. [62] tracked the emergence of resistance to Rifampicin in replicate cultures of Pseudomonas aeruginosa. Resistant strains typically have mutations in rpoB, encoding the main RNA polymerase subunit. Maclean et al. [62] measured selection coefficients for 35 resistant variants, and mutation rates for 11 of these. The mutation rates—all for single-nucleotide substitutions—ranged 30-fold. However, the selection coefficients are very large and show a much smaller range, from 0.3 to 0.9, so that the range expected for the probability of fixation is even smaller, just 0.45 to 0.83 (using the formula of [60]). Thus, under origin-fixation conditions, we expect a 30-fold effect of mutation but only about a twofold effect of selection (given that clonal interference is not expected). The results shown in figure 2 confirm this expectation and provide some additional useful evidence (these results are also used as an example in appendix B). As shown in the left panel, the most frequent outcome is not the most fit; the top two most frequent outcomes fall in the middle of the fitness distribution. Meanwhile, there is a strong and roughly proportional effect of mutation rate, as shown in the centre panel. The right panel confirms that this effect of mutation rate is not owing to confounding with selection coefficients, which are uncorrelated with the mutation rates.
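The contrast between the 30-fold range in mutation rates and the roughly twofold range in fixation probabilities can be checked with a few lines of arithmetic. The sketch below assumes that the fixation probability is π = 1 − e^(−2s); this form is an assumption on our part, adopted because it reproduces the 0.45 to 0.83 range quoted above, and the formula of [60] is not restated in this section.

```python
import math


def p_fix(s):
    """Assumed fixation probability for a new beneficial mutation: 1 - exp(-2s)."""
    return 1.0 - math.exp(-2.0 * s)


s_min, s_max = 0.3, 0.9          # range of selection coefficients given in the text
pi_min, pi_max = p_fix(s_min), p_fix(s_max)

fixation_fold = pi_max / pi_min  # range attributable to selection
mutation_fold = 30.0             # range of measured mutation rates, from the text

print(round(pi_min, 2), round(pi_max, 2), round(fixation_fold, 2), mutation_fold)
```

Under these assumptions the fixation probabilities differ by less than twofold, so differences in mutation rate are expected to dominate the variation in how often each variant evolves.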
In a well-known study of recurrent evolution, Rokyta et al. [59] carried out one-step adaptation 20 times in replicate populations of bacteriophage ϕX174, under conditions of adaptation from new mutations. They found that the most frequent change, repeated six times, was not the most fit, but rather the fourth most fit. These results were not in agreement with the model of Orr [79], which assumes uniform mutation, prompting the authors to seek a mutational explanation. They found that an origin-fixation model incorporating (i) measured selection coefficients, and (ii) a model of nucleotide mutation rates (from comparative data) performed better in predicting outcomes than Orr’s [79] model, which assumes homogeneity in mutation rates. Thus knowledge of mutation rates improved the predictability of adaptive outcomes.
As explained in §2, Cano et al. [56] developed a method to capture the influence of the mutation spectrum with a single coefficient of mutational influence β that ranges from 0 (no influence) to 1 (proportional influence). They also applied this method to three datasets of adaptive amino acid substitutions, including substitutions implicated in natural adaptation of Mycobacterium tuberculosis to antibiotics, as well as laboratory adaptation of E. coli and Saccharomyces cerevisiae to environmental stress, using independently curated species-specific mutation spectra that describe the relative rates of the six possible nucleotide changes within double-stranded DNA. For each species, they found that β is close to 1 and significantly different from 0, indicating a proportional influence of the mutation spectrum. Moreover, they showed this was not just an effect of transition bias, but rather of the entire distribution of rates across the six types of single-nucleotide changes. Indeed, the frequencies of the six types of nucleotide changes among adaptive substitutions are strongly correlated with the independently curated species-specific mutation spectra (figure 3). The authors note that the three species differ in important population genetic conditions, such as mutation supply. Whereas M. tuberculosis has one of the lowest mutation rates of all bacteria [80] and is therefore likely to experience only limited clonal interference during adaptation to a new human host [81,82], E. coli and S. cerevisiae have relatively higher mutation rates [83,84] and often experience clonal interference in laboratory evolution experiments [85,86]. The results of Cano et al. [56] therefore provide empirical support for the theoretical result that mutation bias can influence adaptation across a broad range of population genetic conditions [29,57].
(c) Scope of applicability
Now, having established causal agency and the potential for large effect-sizes, let us consider the scope and generality of this kind of cause-effect relationship. How widely can we expect it to apply? An ideal way to address this question would be to carry out a meta-analysis of published studies of adaptation. We would want to include in this analysis all of the relevant work, dividing it into experimental and natural adaptation, and perhaps considering other factors such as taxonomy and population size. At present, such an analysis would be quite difficult and would cover only a very minor fraction of the literature. The difficulties may be summarized as follows. In over a century of experimental studies of adaptation, the vast majority do not include a genetic analysis. Those with a genetic analysis typically implicate loci or alleles (e.g. involved in the adaptation of quantitative traits) without identifying specific mutations. The adaptation studies that implicate specific mutations (a tiny fraction of all adaptation studies) typically do not have sufficient replicates to support powerful tests, e.g. sometimes they are a one-off case [87]. In addition, most reports implicating adaptive mutations do not follow a rigorous standard for making this determination, so that mis-attributions are common [88], a serious problem given the prior expectation that non-adaptive changes will show effects of mutation biases. Furthermore, even in cases where adaptation can be traced confidently to specific mutations, we rarely have the kind of information on mutation biases and selection coefficients that would be needed to reach the conclusion that mutational effects are consequential once selection is taken into account.
The meta-analysis strategy of Stoltzfus & McCandlish [89], focused on transition-transversion bias among amino acid changes, was designed to maximize the use of available data given these difficulties. Briefly, it takes advantage of the following: (i) many contemporary studies of adaptation implicate specific amino acid changes, typically caused by single-nucleotide substitutions, doing so in a rigorous way based on verifying effects with genetic comparisons or engineering, (ii) for a broad range of taxa, nucleotide mutations show a bias towards transitions, typically two- to fourfold above null expectations [30,90], and (iii) experimental studies of mutational effects do not reveal any substantial tendency for transitions to be more benign than transversions [90,91], so that a reasonable null expectation for beneficial (or neutral) changes in the absence of mutation bias is a simple 1 : 2 ratio, given that there are twice as many possible transversions, as argued by Stoltzfus & McCandlish [89]. A substantial excess of transitions, e.g. a 1 : 1 or 2 : 1 ratio, would indicate an effect of mutation bias. Note that, in the literature of molecular evolution, it was long supposed that transitions are more conservative in their effects on proteins, as discussed by Stoltzfus & Norris [90]. However, this idea is not supported by systematic fitness measurements for amino acid-changing mutations, which show that transitions and transversions hardly differ at the upper end of the fitness distribution [90,91], though there may be some differences at the lower end, as argued by Lyons & Lauring [91].
On this basis, one may gather qualifying results and combine them, applying statistical tests for an excess of transitions relative to the 1 : 2 expectation. For instance, Meyer et al. [92] carried out replicate laboratory evolution experiments with bacteriophage λ under conditions that favour changes in the J gene, the product of which helps the virus target its bacterial host. Among 241 putatively adaptive changes, the ratio of transitions to transversions was 192 : 49, roughly eightfold higher than the 1 : 2 null expectation. The meta-analysis by Stoltzfus & McCandlish [89] covers experimental and natural adaptation using this approach, with the added safeguard that results are restricted to recurrent amino acid changes, i.e. their dataset is conditioned on parallel evolution. The experimental dataset covers five different experimental systems, the largest of which are the study of Meyer et al. [92] and the studies of Crill et al. [93] and Bull et al. [94] that uncovered numerous reversals and parallels in lines of ϕX174 propagated through successive host reversals (between E. coli and Salmonella typhimurium). Combining the data from all five studies, Stoltzfus & McCandlish [89] find a highly significant 304 : 83 ratio of transitions to transversions among events of parallel adaptive amino acid changes.
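The comparison against the 1 : 2 null can be sketched as follows. This is an illustrative calculation rather than the test used in [89]: it treats each event as an independent draw with probability 1/3 of being a transition under the null, and computes a one-sided exact binomial tail using only the Python standard library. The function names are ours.

```python
import math


def fold_enrichment(ti, tv, null_ratio=0.5):
    """Observed Ti:Tv ratio divided by the null ratio (1:2 by default)."""
    return (ti / tv) / null_ratio


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    logs = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log(1.0 - p)
        for i in range(k, n + 1)
    ]
    top = max(logs)
    return math.exp(top) * sum(math.exp(x - top) for x in logs)


# Counts quoted in the text.
meyer_fold = fold_enrichment(192, 49)    # bacteriophage lambda, J gene
pooled_fold = fold_enrichment(304, 83)   # five experimental systems combined

# One-sided test of an excess of transitions among the 241 changes.
meyer_p = binom_upper_tail(192, 192 + 49, 1.0 / 3.0)

print(round(meyer_fold, 2), round(pooled_fold, 2), meyer_p)
```

Because parallel events within one experimental system are not strictly independent, the p-value from this sketch should be read as a rough indication of the size of the excess rather than as a definitive significance level.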
Several subsequent studies have shown effects of transition-transversion bias. Sackman et al. [95] extended the earlier study of Rokyta et al. [59] by applying the same 20-replicate protocol to three additional types of phages, for a total of 80 adaptive changes. For each of the four phages, the most common variant to evolve was not the one with the largest fitness benefit. Out of 20 × 4 = 80 changes, the transition-transversion ratio was 74 : 6, a striking result. Likewise, Bertels et al. [96] observed a strong enrichment of transitions among adaptive mutations in propagation of HIV-1 in human T-cell lines, and Katz et al. [97] observed a bias towards transitions during adaptation of E. coli to long-term stationary phase.
What about adaptation in nature? The meta-analysis of Stoltzfus & McCandlish [89] includes data from 10 cases of natural adaptation traced to specific mutations, with results shown in table 1. For example, species such as monarch butterflies (Danaus plexippus) evolve resistance to cardiac glycosides by changes in the sodium pump ATPα1 [98–100], which not only allows them to eat Apocynaceae, but also to sequester the toxin in their tissues, making them noxious to predators. Other cases involved adaptation to natural or anthropogenic toxins (tetrodotoxin, insecticides, benzimidazole and the antiviral agent Ritonavir), altitude adaptation via haemoglobin changes, convergent foregut fermentation, trichromatic vision and echolocation. Combining the data from these cases of natural adaptation, Stoltzfus & McCandlish [89] uncovered a ratio of 132 transitions to 99 transversions (table 1)—a 2.7-fold enrichment over the null.
Table 1.
phenotype | taxon | target | Ti events: counts | Ti events: sum | Tv events: counts | Tv events: sum |
---|---|---|---|---|---|---|
insecticide resistance* | insecta | Rdl, Kdr, Ace | 2, 2, 5, 2, 3 | 14 | 9, 2, 4 | 15 |
tetrodotoxin resistance | vertebrata | Na channels | 2, 6, 3 | 11 | 2, 2, 2, 3, 3 | 12 |
glycoside resistance | metazoa | Na+/K+-ATPase | 4, 4, 2, 2 | 12 | 7, 2, 2, 4 | 15 |
herbicide resistance* | Poaceae | ACCase | 5, 2 | 7 | 7, 2, 4, 5 | 18 |
altitude adaptation | Aves | β-haemoglobin | 4, 13 | 17 | 2, 3, 2 | 7 |
trichromatic vision | vertebrata | opsins | 2, 5 | 7 | 6, 4, 2 | 12 |
echolocation | mammalia | prestin | 2, 2, 2 | 6 | 3, 2 | 5 |
growth in Ritonavir* | HIV-1 | protease | 25, 7, 9 | 41 | 4 | 4 |
foregut fermentation | vertebrata | ribonucleases | 2, 4, 4 | 10 | | 0 |
benzimidazole resistance* | ascomycota | β-tubulin | 7 | 7 | 5, 6 | 11 |
totals | | | | 132 | | 99 |
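The totals in table 1 and the 2.7-fold figure quoted in the text can be recomputed directly from the per-case counts. The sketch below simply transcribes the counts from the table (an empty list stands for the case with no transversion events) and compares the pooled ratio with the 1 : 2 null; the variable names are ours.

```python
# Per-case event counts transcribed from table 1: (Ti counts, Tv counts).
table1 = {
    "insecticide resistance": ([2, 2, 5, 2, 3], [9, 2, 4]),
    "tetrodotoxin resistance": ([2, 6, 3], [2, 2, 2, 3, 3]),
    "glycoside resistance": ([4, 4, 2, 2], [7, 2, 2, 4]),
    "herbicide resistance": ([5, 2], [7, 2, 4, 5]),
    "altitude adaptation": ([4, 13], [2, 3, 2]),
    "trichromatic vision": ([2, 5], [6, 4, 2]),
    "echolocation": ([2, 2, 2], [3, 2]),
    "growth in Ritonavir": ([25, 7, 9], [4]),
    "foregut fermentation": ([2, 4, 4], []),
    "benzimidazole resistance": ([7], [5, 6]),
}

ti_total = sum(sum(ti) for ti, _ in table1.values())
tv_total = sum(sum(tv) for _, tv in table1.values())

# Enrichment of the pooled Ti:Tv ratio over the 1:2 null expectation.
fold = (ti_total / tv_total) / 0.5

print(ti_total, tv_total, round(fold, 1))
```

Note that the pooled ratio is driven partly by a few large cases (for example, growth in Ritonavir), so per-case ratios vary considerably around the pooled figure.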
Another example of transition-transversion bias in natural adaptation involves a very large set of resistance mutations identified clinically in the global human pathogen M. tuberculosis, which exhibits a strong mutation bias towards transitions [101] and evolves resistance to antibiotics exclusively through chromosomal mutations [102]. Examining two independently curated datasets, Payne et al. [103] uncovered transition-transversion ratios of 1755 : 1020 and 1771 : 900, a 3.4-fold and 3.9-fold enrichment over the null, respectively. They also took advantage of the special case of Met-to-Ile replacements, which can occur via one transition (ATG → ATA) or two transversions (ATG → ATT and ATG → ATC), so that a 1 : 2 ratio is expected under the null hypothesis in which mutation bias has no effect. Instead, they observed ratios of 88 : 49 and 96 : 39 in the two datasets, a 3.6-fold and 4.9-fold excess over the null expectation, respectively.
What about other forms of mutation bias? In mammals and birds, mutation rates are elevated at cytosine-guanine dinucleotides (CpG) relative to other sequence contexts, owing to the effects of cytosine methylation on DNA damage and repair [104–106]. Genetic studies of high-altitude birds provided the first hints that this form of mutation bias may influence adaptation in nature, specifically the evolution of increased affinity of haemoglobin for oxygen, which is probably adaptive in hypoxic conditions and preferentially occurs via missense mutations at CpG dinucleotides [107,108]. Building off these observations, Storz et al. [109] systematically analysed the genetic sequences of haemoglobins in 35 matched, phylogenetically independent pairs of high- and low-altitude bird populations. Among the 35 pairs, they found 22 changes in oxygen affinity plausibly linked to altitude adaptation, implicating 10 different amino acid changes in haemoglobins. Of these 10 amino acid changes, six involved CpG mutations, whereas only one CpG mutation would be expected by chance, a significant excess. Thus, altitude adaptation in natural bird populations shows a significant enrichment of mutationally likely genetic changes, specifically mutations at CpG dinucleotides.
Taken together, the evidence summarized in this section provides robust support for a large and predictable influence of mutation biases on the changes involved in adaptation. The most common adaptive variants are often not the most fit, but the ones with the highest mutation rates. Quantitative biases in nucleotide mutation rates can have proportional effects, leading to a detailed match between the mutation spectrum and the spectrum of adaptive changes, and results from episodes of natural adaptation traced to the molecular level suggest a broad taxonomic scope.
4. Applications
Addressing ecological, agricultural and biomedical challenges often involves seeking to limit the reproduction of threatening biological agents such as microbial pathogens and parasites. Accordingly, understanding the evolutionary processes that give rise to problems of drug and pesticide resistance can lead to marked advancements in the agricultural and biomedical sciences. Extrapolating from its general use in evolutionary modelling, here we discuss how consideration of mutation-biased evolution shows tremendous potential in addressing challenges of widespread human concern, with a particular focus on evolutionary dynamics in somatic contexts such as cancer, drug and pesticide resistance and infectious disease.
(a) . Somatic evolution
Human somatic DNA mutates throughout adulthood in a manner that can cause disease, particularly as repeated rounds of genome replication in mitotically active cells provide opportunities for the emergence of mutant cells that have a replicative advantage, often to the detriment of the organism. Accounting for biases in mutation rates can provide improved insight into the evolutionary dynamics that occur among somatic cells [110]. In recent years, several studies exploring the evolutionary conversion of healthy somatic cells to cancer cells have yielded valuable insights into the roles played by mutation biases [110,111]. These are typically characterized as so-called mutation signatures [112,113], which describe nucleotide mutation rates within a triplet context. When such signatures are constructed using DNA sequencing data from cancer cells, they simultaneously reflect mutation biases from endogenous sources such as DNA repair processes, as well as exogenous sources such as tobacco smoke [114]. Bioinformatic techniques can then be used to decompose this global mutation signature into underlying mutation signatures that can be attributed to these endogenous and exogenous sources [112]. For example, the mutational signature associated with APOBEC enzymes, which catalyse the deamination of cytosine bases, is a major contributor to mutational burden in head and neck squamous cell carcinomas [112]. A recent analysis of such mutations found that the relative importance of mutations for the cancer phenotype often differed from their prevalence, with some variants occurring infrequently despite being highly favoured by selection [115]. A more comprehensive analysis featuring 7815 cancer exomes identified dozens of highly statistically significant associations between cancer-driving mutations and specific mutational signatures such as those associated with environmental carcinogens and mutagenic enzyme activity [116]. The vast majority of these associations include deamination by APOBEC and deficiencies in proofreading and mismatch repair during replication. Intriguingly, this study also identified a negative association between tobacco smoke and the G12D substitution in KRAS; in other words, KRAS G12D is more common among the lung cancers of non-smokers [116]. Consistent with this finding, lung cancers harbouring the KRAS G12D substitution were recently associated with a lower tumour mutation burden [117,118], for which reason this mutation may serve as a negative biomarker for the success of immunotherapy. In addition to showing that the mutations most strongly favoured by selection are not necessarily the most prevalent among cancer patients, these findings suggest that mutational biases facilitate a link between the source of carcinogenesis and the predicted success of a given treatment.
What other factors may alter mutation rates in a manner that predictably influences the progression and treatability of cancer? Importantly, chemotherapy itself represents a source of mutagenesis, suggesting that attempts to treat cancer may inadvertently induce adaptive changes in the cancer that complicate further treatment options. For example, although mutations at residues 12 and 13 of the cell-signalling GTPase KRAS have a higher selective advantage, the Q61H mutation is common in colorectal cancers with resistance to the anti-EGFR antibody cetuximab [119], owing to a mutational signature associated with chemotherapy that elevates T > G transversions. Importantly, this work suggests that mutation signatures can serve as a basis for predicting the evolution of drug resistance in cancer patients. More recent investigation has also found that, depending on the cancer type, the predominant driver mutations can arise from ‘actionable’ mutation signatures. In addition to tobacco, these drivers include mutations associated with exposure to ultraviolet light and endogenous processes associated with ageing [120]. By identifying specific causal factors underlying the likelihood of driver and drug-resistance mutations across different types of cancers, these findings provide a basis for predicting the efficacy of preventative and therapeutic strategies.
Finally, in tissues such as skin and blood, the relative over-proliferation of cell lineages with mutations conferring growth advantages is another medically important evolutionary process, and a target for prediction that may be informed by mutation rates. For instance, in the case of clonal haematopoiesis, context-dependent nucleotide mutation rates play an important role in determining the prevalence of different variants [121]. The most frequent variant in the most frequently implicated gene, DNMT3A, is a CpG hotspot mutation changing Arg882 to histidine; but a change from Arg882 to cysteine, which occurs with a lower mutation rate, confers a larger growth advantage [121]. Similarly, a recent study of chronic myelogenous leukaemia found that for the tyrosine kinase inhibitor imatinib, epidemiological incidences of mutations conferring drug resistance are best predicted by the likelihood of the mutations rather than by their fitness effects [122]. Together, these results highlight mutation bias as an important predictor of somatic disease risk as well as drug resistance.
(b) . Resistance to biochemical agents
The evolution of resistance to drugs and host immunity represents a substantial obstacle in the fight against disease. Accordingly, by providing insights into the processes underlying adaptive evolution, accounting for the combined roles of mutation and selection can improve our ability to understand and thus predict how resistance evolves among microbial pathogens. Specifically, are some mutational trajectories towards drug resistance enriched for higher-probability mutations than others, and can this information be used to fight infectious disease, in particular by tailoring treatment approaches that minimize the predicted likelihood of drug resistance? Recent work exploring large datasets of drug-resistance mutations has begun to shed light on these questions. For example, the study by Payne et al. [103] discussed in the previous section suggests the evolution of antibiotic resistance in M. tuberculosis is at least partially predictable, with some mutational paths towards resistance occurring more frequently than others depending on the relative abundance of transition mutations. Moving forward, it will be highly informative to comprehensively characterize mutational paths towards resistance in a greater diversity of infectious pathogens and across a wide panel of antibiotics. By identifying drugs or drug cocktails for which mutational paths towards resistance tend to be relatively depleted of high-probability mutations, it may be possible to employ treatment regimens that minimize the predicted likelihood of evolved resistance, enabling treatments with longer-lasting effectiveness.
Besides transition-transversion bias, how else might biases in mutation rate guide the evolution of drug resistance? The idea that the most mutationally probable changes are not necessarily the most strongly favoured by selection implies the existence of potential mutations that would be highly adaptive but whose rates of occurrence are negligible. Accordingly, by altering the relative rates of mutation types, changes in the sources of mutation may promote adaptation by enhancing access to otherwise-rare beneficial mutations [50,53]. In one recent example, point mutations in a DNA topoisomerase gene, which is important for relieving topological stress in DNA strands, were reported to introduce mutation hotspots that result in new adaptive paths towards antibiotic resistance in E. coli [123]. Although the relevance to infectious pathogens remains unclear, these findings highlight a promising approach towards identifying new potential mutational paths to the evolution of antibiotic resistance. In particular, future work may be able to determine whether mutations in DNA maintenance or repair genes shift the mutation spectrum in a manner that promotes drug resistance evolution in pathogenic bacteria. Granting such insights, we anticipate the potential for bacterial genotyping as a predictor for the likelihood of evolved resistance to specific classes of antibiotic.
As with antibiotics, the widespread use of pesticides and fungicides in agriculture can select for the evolution of resistance, which has been reported in hundreds of species [124,125]. Similar to many examples discussed in the previous sections, mutation biases have been implicated in instances of insecticide, fungicide and herbicide resistance (table 1) [89], suggesting a broad range of potential agricultural applications for incorporating mutation bias into evolutionary forecasting. In addition to transition-transversion biases, how else might mutation biases improve the predictability of resistance evolution? To address this question, we consider examples of natural mutators. Powdery mildews are fungal plant pathogens that belong to the genera Erysiphe and Blumeria and represent a major agricultural threat [126]. These taxa have undergone extensive loss of DNA mismatch repair genes throughout their evolutionary history, leading to rapid, mutation-biased genome evolution [73]. Importantly, heavy use of fungicide has been reported to accelerate the evolution of resistance in these taxa [126]. Whether variation in the mismatch repair pathway predictably alters the likelihood of resistance evolution remains unclear. However, the number of mismatch repair genes lost during evolution varies greatly across taxa and correlates with nucleotide substitution and composition biases [73], which raises the interesting possibility that the tendency to acquire the genomic changes that facilitate fungicide resistance might also correlate with the loss of these genes. It would be fascinating to address this possibility in future work, in particular by interrogating the mutation spectra produced from targeted disruption of mismatch repair for their tendencies towards fungicide resistance.
(c) . Infectious disease
The COVID-19 pandemic represents an exceptional case study in the importance of forecasting evolutionary trajectories among both real and potential pathogens. Since the start of the pandemic, scientists and medical professionals have sought to understand the mechanisms underlying both disease severity and viral evolution, with the goal of maximizing mitigation efforts and vaccine effectiveness. Towards this end, numerous investigations of SARS-CoV-2 genomes have identified mutation biases strongly favouring uracil content, with potential implications ranging from vaccine design to personalized therapies and the emergence of new viral variants. For example, Rice et al. [127] recently reported that although mutation bias strongly favours U content, selection largely occurs against U content, which raises the question of how informative this mutation bias may be for predicting adaptive changes. On the other hand, a strong C-to-U mutation bias was more recently reported to drive the diversification of CD8+ T-cell epitopes and the depletion of proline residues, which has been suggested to compromise T-cell immunity among individuals carrying the human leucocyte antigen B7 serotype [128]. Because host immunity represents a strong source of selection pressure on viral replication, the C-to-U bias may be helping to sustain high COVID-19 infection rates by facilitating immunity evasion within at least a subset of the population. Thus, in addition to the evolution of drug resistance as described above, mutation bias may shape the evolution of viral pathogens in a manner that predictably disrupts host immunity.
Given the abundance of recent changes in the spike protein [129], it may be possible to draw statistical inferences about whether, and to what extent, recent adaptations in SARS-CoV-2 are mutation-biased. This will greatly inform our ability to develop and implement accurate pandemic forecasting. In particular, the most commonly observed adaptive mutations are not necessarily the most fit. Accordingly, if the recent evolution of SARS-CoV-2 has been largely determined by amino acid changes that are mutationally favoured but selectively suboptimal, then there may exist adaptive ‘jackpot’ mutations that have yet to be sampled. In this case, a prolonged high rate of infections could be expected to enable the eventual occurrence of low-probability mutations that substantially enhance viral transmission. This scenario seems consistent with the recurrent emergence of increasingly transmissible variants. On the other hand, high COVID-19 infection rates raise the question of whether the evolution of SARS-CoV-2 is mutation-limited, especially given the ability of new variants to spread between geographical regions and populations. In either case, the rapid accumulation of amino acid replacements provides a considerable sample of empirical data. These data could be combined with estimates of mutation rates in order to determine whether the recent or future emergence of increasingly transmissible variants is driven by systematic relationships between mutation rates and fitness effects.
In addition to SARS-CoV-2, the rapid pace of adaptive evolution has made some pathogens such as HIV notorious for their ability to evade our efforts to employ treatments and vaccinations with long-term efficacy. As a retrovirus, HIV requires reverse transcriptase to infect hosts, and numerous reverse transcriptase mutations can confer resistance to reverse transcriptase inhibitors that are used to treat HIV infection. Importantly, owing to a bias favouring the G-to-A transition mutation, the resistance-conferring M184I replacement in reverse transcriptase was found to occur more readily than M184V, despite the latter conferring greater replicative fitness [130]. Consistent with this finding, theoretical modelling has implicated G-to-A mutations, mediated by the APOBEC family of host deaminases, in the evolution of drug resistance in HIV [131].
How might such biases aid in the predictability of HIV evolution? Recent work suggests that instances of parallel evolution serve as a promising source of insight on this question. In particular, the relative number of independent occurrences of a given type of evolutionary change reflects its underlying probability: if one of two types of evolutionary change has a twofold greater probability of occurring, it can be expected to occur twice as often in independent lineages. Since the chance of parallel or repeated evolution increases with greater variance in mutation rates (see appendix B), mutation biases raise the probability of particular types of evolutionary change. Consistent with these theoretical expectations, a long-term evolution experiment involving the continued passaging of HIV in human T-cells revealed numerous instances of parallel changes, characterized by a strong bias for G-to-A transitions [96]. Unfortunately, because long-term evolution is bound to involve the accumulation of both adaptive and neutral changes, such experiments pose the challenge of disentangling the roles of mutation and selection.
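The link between rate heterogeneity and parallelism can be made concrete with a small calculation. As a simplifying assumption made here for illustration, suppose each lineage's change is drawn independently from n possible changes with probabilities proportional to their rates. The chance that two lineages make the same change is then the sum of the squared probabilities, which equals (1 + CV²)/n, where CV is the coefficient of variation of the rates. The rates below are hypothetical.

```python
from statistics import mean, pvariance

def parallelism_probability(rates):
    """Chance that two independent lineages make the same change,
    assuming each change is chosen with probability proportional to its rate."""
    total = sum(rates)
    return sum((r / total) ** 2 for r in rates)

def parallelism_from_cv(rates):
    """Same quantity via the identity (1 + CV^2) / n, using the population variance."""
    cv_squared = pvariance(rates) / mean(rates) ** 2
    return (1 + cv_squared) / len(rates)

# Hypothetical relative rates for six possible changes.
uniform = [1, 1, 1, 1, 1, 1]  # no bias among changes
biased = [10, 1, 1, 1, 1, 1]  # one change favoured tenfold

for label, rates in [("uniform", uniform), ("biased", biased)]:
    direct = parallelism_probability(rates)
    via_cv = parallelism_from_cv(rates)
    print(f"{label}: direct = {direct:.3f}, via CV = {via_cv:.3f}")
```

With equal rates the probability is 1/n; any heterogeneity in rates raises it, which is the sense in which mutation bias increases the chance of repeated evolution.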
To overcome this difficulty, deep mutational scanning can be used to isolate the functional effects of massive numbers of individual mutations. For example, Haddox et al. [132] used deep mutational scanning to characterize the amino acid preferences at every site in the envelope proteins from two HIV lineages. Results from such experiments can be combined with measures of mutation rates to generate pairwise estimates of rate and fitness effect for large numbers of potential mutations. Such pairwise estimates enable the prediction of likely sequence changes during evolution, since the rate of such changes is jointly proportional to both mutation rate and selection coefficient (equation (2.1)). Finally, given the rapid rate of evolution of HIV, long-term evolution experiments such as the one performed by Bertels et al. [96] provide a wealth of empirical sequence changes for testing and refining evolutionary predictions. By identifying adaptive paths involving low-probability mutations, such an approach could potentially uncover new drug and vaccine targets that minimize the likelihood of evolved resistance, leading to treatment regimens with longer-term effectiveness.
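The pairing of rate and fitness-effect estimates described above lends itself to a simple ranking. The sketch below assumes, as stated in the text, that the rate of each change is jointly proportional to its mutation rate and selection coefficient, and normalizes the products to obtain relative probabilities. The variant names and all numerical values are hypothetical placeholders, not measurements.

```python
def predicted_probabilities(variants):
    """Relative probability of each change, taking its rate as proportional to mu * s.

    `variants` maps a label to a (mutation_rate, selection_coefficient) pair.
    Only beneficial variants (s > 0) are considered.
    """
    weights = {name: mu * s for name, (mu, s) in variants.items() if s > 0}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical inputs: (mutation rate, selection coefficient) for each variant.
variants = {
    "variant_A": (4e-9, 0.02),   # mutationally favoured, modest benefit
    "variant_B": (1e-9, 0.05),   # rarer mutation, larger benefit
    "variant_C": (5e-10, 0.08),  # rarest mutation, largest benefit
}

probabilities = predicted_probabilities(variants)
for name, p in sorted(probabilities.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {p:.2f}")
```

In this toy example the most probable outcome is the mutationally favoured variant rather than the fittest one, mirroring the pattern discussed in the preceding sections.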
5. Challenges
Theory and empirical evidence indicate that mutation biases can have predictable effects on the genetic changes fixed in adaptation under a broad range of population-genetic conditions. In the context of research on the predictability of evolution, the obvious application of these results is simply to absorb the science that is already well established—nucleotide substitution biases shape short-term adaptation—and apply that by using available information on the mutation spectrum.
Beyond these obvious applications, what further gains would be possible with new technology or a shift in resources and attention? In this section, we suggest specific areas in which a stronger focus on effects of biases in variation might yield substantial gains, including (i) improving measurements of basic quantities, (ii) expanding our attention beyond nucleotide substitution biases, and (iii) assessing the predictability of mutational effects across diverse conditions and timescales.
(a) . Expanding coverage of basic measurements
The results reported above show the value of obtaining fundamental measurements of the following three quantities, for each possible outcome: selection coefficient (s), mutation rate (μ) and frequency of evolution (f). In particular, the example of Maclean et al. [62], as employed in figure 2 and appendix B, shows that such data are extraordinarily valuable, yet this case is small—just 11 variants—and we know of no other comparable dataset.
More commonly, we have access to individual measures of functional effects via deep mutational scanning, but no individual mutation rates, which are instead represented by a model of average rates for classes, e.g. a model of two rates for transitions and transversions, or a model of six types of nucleotide substitutions. Yet, as explained in appendix B, prediction will always suffer when models of average rates are used. Direct and indirect evidence indicates individual rates have a large amount of variance (i.e. useful information) that simplified models of mutation rates simply do not capture, e.g. Maclean et al. [62] find a 30-fold range in mutation rates for just 11 nucleotide substitution mutations in the same gene; Hodgkinson & Eyre-Walker [67] use a comparative method to estimate that a triplet context model captures only about one-third of the actual variance in mutation rates.
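One way to see what is lost by averaging is to ask how much of the variance in individual rates a class-average model retains. The sketch below replaces each rate by the mean of its class and reports the fraction of total variance explained. Working with log-transformed rates is a choice made here for illustration, and the rates and class labels are hypothetical.

```python
from collections import defaultdict
from math import log10
from statistics import mean, pvariance

def variance_captured(rates, classes):
    """Fraction of the variance in log10 rates that is explained by class means."""
    log_rates = [log10(r) for r in rates]
    by_class = defaultdict(list)
    for value, label in zip(log_rates, classes):
        by_class[label].append(value)
    class_mean = {label: mean(values) for label, values in by_class.items()}
    # Replace every individual rate by the average for its class.
    fitted = [class_mean[label] for label in classes]
    # The variance of the fitted values is the between-class (explained) part.
    return pvariance(fitted) / pvariance(log_rates)

# Hypothetical per-mutation rates spanning a 30-fold range, in two classes.
rates = [1e-10, 2e-10, 3e-9, 4e-10, 1.5e-9, 2.5e-10]
classes = ["Tv", "Tv", "Ti", "Tv", "Ti", "Ti"]

fraction = variance_captured(rates, classes)
print(f"variance captured by a two-rate model: {fraction:.2f}")
```

A value well below 1 indicates that individual rates carry information that the class averages discard.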
The technical barriers to addressing this rather stunning deficit are rapidly disappearing. Until recently, methods for measuring mutation rates dated from the 1940s and were used infrequently [133]. However, new methods for identifying and tracking mutations are now appearing rapidly, including methods based on real-time visualization [134], and methods specifically designed to measure mutation rates accurately in deep sequencing experiments [135,136]. We note that, if estimates of μ, s and f are used to interrogate their relationships, it is imperative to ensure that the estimates are unbiased with respect to these relationships. For instance, some methods used to study somatic evolution, e.g. clonal haematopoiesis in Watson et al. [121], infer both μ and s from a joint distribution of population frequency and somatic prevalence (measures obviously related to f), and this raises the question of whether they are subject to correlated errors, e.g. if under-estimation of s induces over-estimation of μ.
We look forward to a future in which quantitative scientists have access to well defined sets of fundamental measurements for diverse model systems in somatic evolution, the emergence of resistance to toxins, and the adaptive evolution of infectious agents exploiting host resources.
(b) . Exploring diverse sources of variational bias
Most approaches to analysis and modelling that incorporate rates for nucleotide substitution mutations use a simplified model, e.g. a two-parameter model (i.e. transitions and transversions) or a six-parameter model. However, as noted above, such models capture only a minority of the variance in individual rates [67]. This is particularly important given the common observation that adaptive outcomes are often highly enriched for a few high-rate mutations that happen to be favourable. This suggests the importance of improving models for mutation hotspots, a topic that is rarely treated (e.g. [137]). In addition, one must not overlook the possibility of highly consequential correlations between mutational and functional features of genomes, e.g. as in Monroe et al. [76]. Such correlations are important to consider whether one thinks of them as evolved features (as argued by [76]) or coincidences, e.g. Monroe et al. [138] find that, in prokaryotic genomes, transcription-replication collisions result in some very specific and large effects, including an orientation-dependent fourfold increase in point mutations affecting promoters, mostly due to T → C mutations at position −7 relative to the start of transcription.
Although much work remains to be done in terms of basic measurements and models regarding single-nucleotide mutations, there is a far larger universe of possible mutations to explore, including multi-nucleotide mutations, compound changes affecting dispersed sites, microindels (very small insertions and deletions), the expansions and contractions of highly variable repeat loci, segmental duplications, transposable element insertions, inversions, and chromosome fission and fusion. A quantitative overview of this universe of mutations is given in (Stoltzfus [33], app. B). Within each of these categories, the distribution of individual mutation rates will reflect (i) the immediate sequence context [112,139], (ii) the regional chromosomal environment including local states of expression and chromatinization [32,140], (iii) aspects of the state of the cell reflecting age, DNA repair activities and cell-cycle state (e.g. differences in nucleotide precursors, repair enzymes or reactive oxygen species) [141–143], and (iv) broad environmental factors such as ambient radiation (e.g. exposure to ultraviolet light) and temperature [68,69].
Opportunities to improve prediction in this regard arise most obviously when, for some specific prediction problem, evolution commonly involves mutations other than nucleotide substitutions (and they also arise, less obviously, when such mutations are probable under some conditions but are not observed). For instance, segmental duplications occur commonly in experimental yeast adaptation (e.g. [144]), transposable element insertions are commonly implicated in local adaptation of bacteria in nature (e.g. [145]), and highly variable short-tandem-repeat loci have been implicated in cases of short-term adaptation such as the domestication of dogs [146]. A case of particular interest is that of multi-nucleotide changes to codons [147], which have been observed in studies of cancer [148], developmental disorders [149], SARS-CoV-2 [150,151], and resistance to antimicrobials, e.g. in M. tuberculosis [152]. Such mutations are a target of opportunity given the kind of information already available, namely: (i) deep mutational scanning studies, which cover the amino acid changes that occur by double- and triple-nucleotide changes to codons, and (ii) prior information on the underlying rate for tandem double or triple mutations in eukaryotes, which appears to be (in total) about two or three orders of magnitude less than the total rate of single-nucleotide changes [147].
Finally, we stress that the literature on natural and experimental evolution covers a variety of phenomena, under the headings of predictability, contingency (repeatability), constraints, genotype–phenotype maps, and so on, that are not usually associated with the concept of mutation bias but which are subject to the same rules of population genetics as mutational effects under a scheme of aggregation (appendix B). For instance, the genetic code is a genotype–phenotype map dictating that there is one single-nucleotide mutational path from Met (ATG) to Val (GTG), two paths from Met to Leu (TTG, CTG) and three paths from Met to Ile (ATT, ATC and ATA). The biases induced by this mapping are not the same thing as mutation biases in the narrow sense of biases imposed by the mechanism of mutagenesis: instead, they are induced by an asymmetric mapping of genetic changes into a phenotype space. Nevertheless, from the perspective of understanding effects of biases in the introduction of variation, a scheme of aggregation that imposes twofold or threefold biases has the same impact as a twofold or threefold mutation bias. Likewise, when the mutational target size of a trait or the mutational accessibility of an alternative phenotype is identified [48,63,153–155], this corresponds to a scheme of aggregation over elementary mutational events. For instance, the series of studies from [155,156] dissecting the emergence of the wrinkly spreader phenotype in P. fluorescens provides a detailed look at asymmetries in the mutational accessibility of an alternative phenotype. Analyses of genotype–phenotype maps in a wide diversity of biological systems reveal that such asymmetries are common [157]. In long-term evolution, the phenotypes that have more genotypes are on the whole more ‘findable’ [51,158,159]. Understanding the extent to which these biases have predictable effects on the genetic changes fixed in adaptation is an important direction for future research, i.e. the challenge is to measure the predictive accuracy of different kinds of aggregation (and some guidance for doing so is provided in appendix B).
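The Met example can be checked by enumerating the nine single-nucleotide neighbours of ATG under the standard genetic code. The sketch below does so and also labels each change as a transition (Ti) or transversion (Tv); the helper names are ours.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

PURINES = {"A", "G"}

def mutation_type(old, new):
    """Transition if both bases are purines or both are pyrimidines, else transversion."""
    return "Ti" if (old in PURINES) == (new in PURINES) else "Tv"

def single_nucleotide_paths(codon):
    """Map each amino acid reachable by one substitution to its (codon, type) paths."""
    paths = defaultdict(list)
    for position, old in enumerate(codon):
        for new in BASES:
            if new != old:
                mutant = codon[:position] + new + codon[position + 1:]
                paths[CODON_TABLE[mutant]].append((mutant, mutation_type(old, new)))
    return dict(paths)

met_paths = single_nucleotide_paths("ATG")
for amino_acid, routes in met_paths.items():
    print(amino_acid, routes)
```

For Met to Ile the enumeration gives one transition and two transversions, which is the 1 : 2 null expectation used for the M. tuberculosis Met-to-Ile comparison in the previous section.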
(c) . Considering diverse conditions and timescales
The most robust empirical and theoretical results available today focus on short-term or one-step adaptation, and the effects are best understood for the case of mutation-limited conditions, although we are beginning to get a clear sense of what happens as mutation supply increases and clonal interference becomes common in finite spaces [29,56,58] or infinite ones (appendix B, [57]). These results are relevant to many challenges in prediction, as we have argued above, e.g. antibiotic resistance. However, a challenge for future research is to expand the consideration of mutational effects to cover longer timescales and a greater diversity of contexts, including evolution from standing variation and synthetic evolution. Attempts to predict long-term effects of mutation bias, for instance, can take advantage of limited theory currently available on how mutation bias influences multi-step trajectories to adaptation [49–53].
Regarding evolution from standing genetic variation, when multiple beneficial alleles are present in numbers high enough to escape random loss, the most fit allele typically wins [63]. This appears to leave no room for effects of mutation bias, but actually it just pushes the question of origination biases into a different realm, where the primary question concerns how the distribution of standing variation is shaped by tendencies of mutation. For instance, the rate of length changes in short tandem repeat loci is so high that the vast majority of such loci will exhibit standing variation for length in a moderately sized population, and this is relevant for cases such as short-term evolution of gene expression [160].
Finally, mutation bias may improve evolutionary forecasting in synthetic evolution [161], such as in laboratory evolution experiments with genomically recoded organisms [162–165] or engineered mutagenesis mechanisms [166]. For instance, a directed evolution technique called OrthoRep uses an orthogonal DNA polymerase to introduce mutations to plasmid-borne target genes [166]. The polymerase’s mutation spectrum is heavily biased towards G : C → A : T transitions [167], which may influence evolutionary outcomes, such as the primary and promiscuous functions of enzymes [168]. More broadly, synthetic evolution provides a useful testbed for mutation-biased adaptation theory, as the mutation spectrum can be manipulated under controlled laboratory conditions, and evolutionary outcomes can be quantified with DNA sequencing and phenotypic assays.
6. Conclusion
We have presented theoretical arguments and empirical evidence that mutation bias can have predictable effects on the genetic changes fixed in adaptation. In studies of adaptation in diverse contexts, where fitness effects have been measured, it is regularly observed that the most common outcome is not the most fit: instead, it is often a beneficial variant with an extreme rate of mutational origin. More generally, the spectrum of changes observed in adaptation reflects the mutation spectrum. Sometimes this effect can be quite strong, even proportional. The study of mutation-biased adaptation has achieved some degree of quantitative and theoretical sophistication, although much remains to be determined about factors such as the influence of population-genetic conditions, and the scope of applicability in natural adaptation.
On this basis, we can make a strong argument that knowledge of mutation bias can be used to improve evolutionary forecasting. We have highlighted applications where we think this approach may prove particularly valuable, in relation to somatic evolution, resistance to toxins and the adaptation of infectious agents to make use of host resources. Our hope is that this review will serve as a useful source of guidance for those implementing approaches to prediction, and that the information contained in it will be quickly eclipsed by more diverse, general and precise results.
Acknowledgements
The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. D.M.M. also acknowledges additional support from an Alfred P. Sloan Research Fellowship and from the Simons Center for Quantitative Biology.
Appendix A. Mutational biases in models with finite and infinite sites
Consider the case where there are m possible beneficial mutations, and the ith beneficial mutation has selection coefficient si and mutation rate μi. We assume that the si’s are drawn independently from the same distribution of mutational effects on fitness. For a population of size N, under what circumstances do the μi’s influence which of these m mutations will be the first to reach fixation?
To gain some intuition for this question it is helpful to consider two limiting situations:
(i) in the limit of very large mutation supply, in particular if Nμi ≫ 1 for all i, all possible beneficial mutations are introduced into the population in each generation. These mutations will all compete with each other and, assuming that mutation rates are small relative to selection coefficients, ultimately the fittest of the m mutations will go to fixation. To see the consequences of this result, consider single-nucleotide substitutions, which can be classified as either transitions or transversions. Because there are twice as many possible transversions as transitions, the fittest variant, that is, the one that will eventually go to fixation, is twice as likely to be a transversion as a transition. Thus, under these conditions, the expected transition : transversion ratio among adaptive substitutions would be 1 : 2, regardless of whether transitions arise at a higher rate; and
(ii) now let us consider the limit where all the Nμi are small (specifically, assume , so that the first beneficial mutation to become established in the population will typically have sufficient time to reach fixation before the next beneficial mutation is able to become established [54]). In this setting, the probability that a mutation will be the first to go to fixation is proportional to both its mutation rate and its selection coefficient (equation (2.1)), so that, all other things being equal, we expect that classes of mutation with high mutation rates, such as transitions, will be over-represented among fixed mutations. For instance, if the mutation rate for individual transitions is κ times the mutation rate for individual transversions (μi/μj = κ if i is a transition and j is a transversion), then we expect a transition : transversion ratio of κ : 2 among fixed mutations.
Thus, broadly speaking, mutational biases will tend to have a stronger influence on molecular adaptation when the beneficial mutation supply is low, because in this regime the first beneficial mutation that becomes established in the population is likely to go to fixation rather than the fittest possible mutation, and the waiting time until a mutation becomes established is inversely proportional to its mutation rate.
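The contrast between the two limits can be illustrated numerically. The following is a minimal sketch, not the authors' simulation code: it assumes an exponential distribution of selection coefficients, a hypothetical pool of 20 sites (each with one transition and two transversions), and a transition bias κ = 4. The function name `ts_fraction` and all parameter values are illustrative assumptions.

```python
import random

def ts_fraction(kappa, n_sites=20, reps=3000, seed=1):
    """Estimate the chance that the first fixed mutation is a transition.

    Returns (low_supply, high_supply):
      low_supply  - origin-fixation limit, P(fix first) proportional to mu_i * s_i
      high_supply - large N*mu limit, the fittest mutation always wins
    Assumes selection coefficients are i.i.d. exponential (an illustrative choice).
    """
    rng = random.Random(seed)
    n_ts, n_tv = n_sites, 2 * n_sites  # one transition, two transversions per site
    low, high = 0.0, 0
    for _ in range(reps):
        s_ts = [rng.expovariate(1.0) for _ in range(n_ts)]
        s_tv = [rng.expovariate(1.0) for _ in range(n_tv)]
        # Low supply: weight each mutation by (relative mutation rate) * s
        w_ts = kappa * sum(s_ts)
        w_tv = sum(s_tv)
        low += w_ts / (w_ts + w_tv)
        # High supply: mutation rates are irrelevant, only the largest s matters
        high += max(s_ts) > max(s_tv)
    return low / reps, high / reps

low, high = ts_fraction(kappa=4.0)
print(f"low supply:  fraction of transitions ~ {low:.2f} (kappa/(kappa+2) = {4/6:.2f})")
print(f"high supply: fraction of transitions ~ {high:.2f} (1/3 = {1/3:.2f})")
```

With a finite pool, the low-supply value falls slightly below κ/(κ + 2) because the ratio is averaged over random draws of the selection coefficients; the gap shrinks as the pool grows.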
Another common class of models is that of infinite sites models, where we assume that each new mutation has a selection coefficient that is drawn independently from some distribution of fitness effects. In this class of models, if different types of mutations share the same distribution of fitness effects, then the relative proportions of different types of mutations among fixed mutations are always directly proportional to the mutational bias. For example, if we consider transitions and transversions, for each selection coefficient s, the transition : transversion ratio among mutations with that selection coefficient is κ : 2. Thus, no matter how competition between co-segregating mutations alters the distribution of fixed selection coefficients relative to the overall distribution of fitness effects, the expected transition : transversion ratio among fixed mutations will always be κ : 2. This intuition is illustrated using evolutionary simulations in figure 4. Even though the effects of competition between multiple adaptive mutations appear as a departure from the origin-fixation expectation that sets in at a total mutation supply 2Nμ ≈ 1 (figure 4), the ratios of fixed mutations are proportional to the introduction rates regardless of the value of the mutation supply (figure 4). In the more general case of infinite sites models for mutational types that do not share the same distribution of fitness effects, the strict proportionality with mutation rates need not hold. However, a similar intuition applies: we can consider the relative proportion of each mutational type for each possible selection coefficient s, and then determine the relative frequencies of fixed mutations by averaging these proportions over the characteristic distribution of selection coefficients [55] fixed during adaptation.
The simulation code is available in a GitHub repository at https://github.com/alejvcano/infiniteSites.
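The infinite-sites argument above can be checked with a toy calculation that is separate from the repository code. In this minimal sketch, each new mutation gets a type (transition with probability κ/(κ + 2)) and, independently, a selection coefficient from a shared distribution. An arbitrary function of s alone then decides which mutations are retained, standing in for whatever competition does to the distribution of fixed selection coefficients. The exponential distribution, the two filters and the name `ts_fraction_after_filter` are illustrative assumptions.

```python
import random

def ts_fraction_after_filter(kappa, keep_prob, n_mut=100_000, seed=2):
    """Fraction of transitions among mutations that pass an s-dependent filter.

    keep_prob(s) is any function of the selection coefficient alone; it is a
    stand-in for the net effect of competition, not a population-genetic model.
    """
    rng = random.Random(seed)
    p_ts = kappa / (kappa + 2.0)  # chance that a new mutation is a transition
    kept_ts = kept_total = 0
    for _ in range(n_mut):
        is_ts = rng.random() < p_ts       # type is drawn independently of s
        s = rng.expovariate(1.0)          # shared distribution of fitness effects
        if rng.random() < keep_prob(s):   # retention depends on s only
            kept_total += 1
            kept_ts += is_ts
    return kept_ts / kept_total

# Two very different filters: mild preference for large s, and a hard threshold
mild = ts_fraction_after_filter(4.0, lambda s: min(1.0, 0.2 * s))
harsh = ts_fraction_after_filter(4.0, lambda s: 1.0 if s > 2.0 else 0.0)
print(f"mild filter:  {mild:.3f}")
print(f"harsh filter: {harsh:.3f}")
print(f"kappa/(kappa+2) = {4/6:.3f}")
```

Both filters change which selection coefficients are retained, yet the transition fraction stays near κ/(κ + 2), because the filter never sees the mutational type.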
Appendix B. Quantifying contributions of mutation and selection to repeatability
How do mutation and selection contribute to the predictability of evolution? Can we partition their effects? One sense of predictability is repeatability, the chance that the outcome of evolution will match what we have seen before. Let us define repeatability as the chance of parallel evolution between a pair of trials. If we have n elementary outcomes, each happening with some probability pi, then repeatability Ppara is the sum of squares of pi:
$$P_{\mathrm{para}} = \sum_{i=1}^{n} p_i^{2} \tag{B 1}$$
This is equivalent to the measure known as Simpson’s index S(p), where by p we denote the vector of pi’s (analogously, we use bold symbols to denote vectors of variables in the following). Simpson’s index has a simple relationship to the variance or coefficient of variation c(p) of the distribution of elementary probabilities:
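$$S(\mathbf{p}) = \sum_{i=1}^{n} p_i^{2} = n\,\mathrm{Var}(\mathbf{p}) + \frac{1}{n} = \frac{c(\mathbf{p})^{2} + 1}{n} \tag{B 2}$$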
Under uniformity, c(p) = 0, and Ppara = 1/n. The greater the variance in probabilities, the greater the chance of parallelism. Any factor that increases variance also increases parallelism. Likewise, any approach to prediction that ignores variance, e.g. by aggregating outcomes into classes whose members are assigned the average behaviour of the class, will underestimate parallelism.
One of the ways to quantify the effect of heterogeneity is to compute an effective number of options with the same chance of recurrence under uniformity, equal to the inverse of the probability of parallelism, i.e. ne = 1/Ppara, and to compare that with n. If one state has p = 1 and the others have p = 0, then repeatability is 1, and ne = 1. If all states are equally likely, then pi = 1/n for all i, ne = n and the repeatability is 1/n. From the 20 replicates of Rokyta et al. [59], the counts ranked by selection coefficient are k = [1, 5, 3, 6, 1, 1, 1, 1, 1], thus repeatability is 76/400 = 0.19, and ne ≈ 5.3, i.e. the effect of heterogeneity in the pi’s is like reducing the choices from 9 to about 5.3. Note that this calculation ignores sampling error by treating the observed frequency fi as the true probability pi.
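The repeatability calculation for these counts can be reproduced in a few lines. This is a minimal sketch that, like the text, treats observed frequencies as true probabilities; the function name `simpson_index` is an illustrative choice.

```python
def simpson_index(counts):
    """Sum of squared relative frequencies (chance that two trials match)."""
    total = sum(counts)
    return sum((k / total) ** 2 for k in counts)

# Counts of the nine observed outcomes among 20 replicates, as listed in the text
counts = [1, 5, 3, 6, 1, 1, 1, 1, 1]

p_para = simpson_index(counts)   # repeatability
n_eff = 1.0 / p_para             # effective number of equally likely options

print(round(p_para, 3), round(n_eff, 2))  # prints: 0.19 5.26
```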
How could we partition repeatability into effects of mutation and selection? This is possible for the special case of origin-fixation dynamics [47]. For event i, an origin-fixation process specifies a rate Nμiπi, with πi being the fixation probability of the event, so that its application in the current context means that pi ∝ μiπi. Then the chance of parallelism is given by
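$$P_{\mathrm{para}} = \frac{\sum_{i=1}^{n} \mu_i^{2}\pi_i^{2}}{\left(\sum_{i=1}^{n} \mu_i\pi_i\right)^{2}} = \frac{\left(c(\boldsymbol{\mu})^{2} + 1\right)\left(c(\boldsymbol{\pi})^{2} + 1\right)}{n} \tag{B 3}$$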
if we can assume that the covariance of the μi’s and πi’s, as well as the covariance of the squares of the μi’s and the squares of the πi’s, is 0. In general, the fixation probability π( · ) is a function of s, the selection coefficient, and N, the population size [60]. However, under strong-selection weak-mutation conditions π(s, N) ≈ 2s [42], and this leads directly to the result of Chevin et al. [61]:
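$$P_{\mathrm{para}} \approx \frac{\left(c(\boldsymbol{\mu})^{2} + 1\right)\left(c(\mathbf{s})^{2} + 1\right)}{n} \tag{B 4}$$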
Consider some applications of equation (B 3). For the results of Rokyta et al. [59], c(π)² + 1 = 1.086 for the probabilities of fixation computed from the reported selection coefficients, and c(μ)² + 1 = 1.33 for mutation rates (given the model described by the authors), so mutational heterogeneity contributes slightly more. For the 11 variants from Maclean et al. [62] with known mutation rates, c(μ)² + 1 = 7.93 and c(π)² + 1 = 1.05, so mutational heterogeneity contributes much more. This disparity reflects a 30-fold variability in mutation rates, but only about a twofold range in probabilities of fixation, given that all the resistant variants have large selection coefficients (because π(s, N) ≈ 2s does not apply for large s, we must use equation (B 3) instead of (B 4)). Note that this framework does not fully partition the effects of mutation and selection, as these factors are conflated in determining n, which in practice reflects the number of mutations that are both sufficiently beneficial and sufficiently mutationally likely to have an appreciable chance of being detected.
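The partition of repeatability into mutational and selective factors can be sketched as follows. The mutation rates and fixation probabilities below are hypothetical values, not data from the cited studies; they are arranged as a full factorial grid so that the zero-covariance assumptions hold exactly and the two ways of computing repeatability agree. The function names are illustrative.

```python
from itertools import product

def cv2_plus_1(x):
    """Squared coefficient of variation plus one, i.e. E[x^2] / E[x]^2."""
    n = len(x)
    mean = sum(x) / n
    mean_sq = sum(v * v for v in x) / n
    return mean_sq / mean ** 2

def p_para_direct(mu, pi):
    """Repeatability when p_i is proportional to mu_i * pi_i."""
    w = [m * f for m, f in zip(mu, pi)]
    return sum(v * v for v in w) / sum(w) ** 2

# Hypothetical levels; every mu level is paired with every pi level,
# so mu and pi (and their squares) are uncorrelated by construction.
mu_levels = [1e-9, 2e-9, 8e-9]
pi_levels = [0.02, 0.05]
pairs = list(product(mu_levels, pi_levels))
mu = [m for m, _ in pairs]
pi = [f for _, f in pairs]

direct = p_para_direct(mu, pi)
partitioned = cv2_plus_1(mu) * cv2_plus_1(pi) / len(mu)

print(f"mutation factor  c(mu)^2 + 1 = {cv2_plus_1(mu):.3f}")
print(f"selection factor c(pi)^2 + 1 = {cv2_plus_1(pi):.3f}")
print(f"direct = {direct:.4f}, partitioned = {partitioned:.4f}")
```

With real data the covariances are rarely exactly zero, so the product form should be read as an approximation and compared against the direct sum.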
Finally, it is of interest to consider repeatability when outcomes are aggregated. In the case of predicting phenotypes, a genotype–phenotype map is used to assign a phenotype to each of the n elementary outcomes. Alternatively, the categories could be defined by genes [63,64], pathways, or gene ontology (GO) categories, as per Tenaillon et al. [65]. Suppose that the n elementary outcomes are assigned fully to mutually exclusive groups of size m1, m2, …mℓ by a function f such that f(i) = j when outcome i is in group j, and suppose further that f( · ) assigns outcomes randomly. Then (as given in the mathematical appendix), the expected probability Pgpara that two outcomes of evolution are in the same group is:
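$$P^{g}_{\mathrm{para}} = S(\mathbf{p}) + \left(1 - S(\mathbf{p})\right)\frac{\sum_{j=1}^{\ell} m_j\left(m_j - 1\right)}{n(n-1)} \approx S(\mathbf{p}) + S(\mathbf{g}) - S(\mathbf{p})\,S(\mathbf{g}) \tag{B 5}$$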
where S(p) is Simpson’s index over n elementary outcomes and $S(\mathbf{g}) = \sum_{j=1}^{\ell} (m_j/n)^{2}$ is Simpson’s index of the partition into ℓ groups (and where the approximation is valid for large n). The effect of aggregation is always to increase parallelism. Note that the approximation in equation (B 5) is symmetric in S(p) and S(g), so that the effect of the probability distribution for elementary events is the same as the effect of the probability distribution of categories. The use of this formula is that the prediction success of a concrete scheme of aggregation (e.g. GO categories) can be compared with the baseline expectation for a random aggregation with the same distribution of category sizes.
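The baseline for random aggregation can be computed directly. The sketch below is derived from first principles under the stated random-assignment assumption: two trials land in the same group either because they produce the same elementary outcome, or because two distinct outcomes happen to be assigned to the same group. The probabilities and group sizes are hypothetical, and the function names are illustrative.

```python
def simpson(probs):
    """Simpson's index: sum of squared probabilities."""
    return sum(p * p for p in probs)

def grouped_parallelism(probs, sizes):
    """Expected chance that two trials fall in the same randomly assigned group.

    probs: probabilities of the n elementary outcomes (n > 1, summing to 1)
    sizes: group sizes m_1..m_l, summing to n
    Returns (exact, approx), where approx is the large-n form.
    """
    n = len(probs)
    if sum(sizes) != n:
        raise ValueError("group sizes must sum to the number of outcomes")
    s_p = simpson(probs)
    # Chance that two distinct outcomes share a group under random assignment
    same_group = sum(m * (m - 1) for m in sizes) / (n * (n - 1))
    exact = s_p + (1.0 - s_p) * same_group
    s_g = simpson([m / n for m in sizes])
    approx = s_p + s_g - s_p * s_g
    return exact, approx

probs = [0.4, 0.3, 0.2, 0.1]          # hypothetical elementary probabilities
exact, approx = grouped_parallelism(probs, sizes=[2, 2])
print(f"S(p) = {simpson(probs):.3f}, exact = {exact:.3f}, approx = {approx:.3f}")
```

For a small n such as this one the exact and approximate values differ noticeably, which is why the approximation is stated for large n.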
Contributor Information
Arlin Stoltzfus, Email: arlin@umd.edu.
David M. McCandlish, Email: mccandlish@cshl.edu.
Joshua L. Payne, Email: joshua.payne@env.ethz.ch.
Data accessibility
Data for figure 1 are provided as the electronic supplementary material [169]. Data for figure 2 are provided in Maclean et al. [62]; data for figure 3 are provided in Cano et al. [49]; data for table 1 are provided in Stoltzfus & McCandlish [89]. Data and code for figure 4 are provided in the GitHub repository indicated in the figure caption.
Authors' contributions
A.V.C.: conceptualization, formal analysis, investigation, resources, software, visualization, writing—original draft, writing—review and editing; B.L.G.: conceptualization, investigation, writing—original draft, writing—review and editing; H.R.: conceptualization, formal analysis, investigation, methodology, validation, writing—review and editing; A.S.: conceptualization, data curation, formal analysis, investigation, methodology, project administration, resources, supervision, visualization, writing—original draft, writing—review and editing; D.M.M.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, supervision, writing—original draft, writing—review and editing; J.L.P.: conceptualization, funding acquisition, investigation, project administration, supervision, writing—original draft, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
The identification of any specific commercial products is for the purpose of specifying a protocol, and does not imply a recommendation or endorsement by the National Institute of Standards and Technology. This work was made possible through the support of a grant from the John Templeton Foundation (grant no. 61782, D.M.M.) and from the Swiss National Science Foundation (grant nos. PP00P3_292672 and 310030_192541, J.L.P.).
References
- 1. Blount Z, Lenski R, Loso J. 2018. Contingency and determinism in evolution: replaying life’s tape. Science 362, eaam5979. (doi:10.1126/science.aam5979)
- 2. de Visser JA, Krug J. 2014. Empirical fitness landscapes and the predictability of evolution. Nat. Rev. Genet. 15, 480-490. (doi:10.1038/nrg3744)
- 3. de Visser JA, Elena SF, Fragata I, Matuszewski S. 2018. The utility of fitness landscapes and big data for predicting evolution. Heredity 121, 401-405. (doi:10.1038/s41437-018-0128-4)
- 4. Fragata I, Blanckaert A, Dias Louro MA, Liberles DA, Bank C. 2019. Evolution in the light of fitness landscape theory. Trends Ecol. Evol. 34, 69-82. (doi:10.1016/j.tree.2018.10.009)
- 5. Lind P. 2019. Repeatability and predictability in experimental evolution. In Evolution, origin of life, concepts and methods (ed. P Pontarotti), pp. 57–83. Cham, Switzerland: Springer.
- 6. Lässig M, Mustonen V, Walczak AM. 2017. Predicting evolution. Nat. Ecol. Evol. 1, 77. (doi:10.1038/s41559-017-0077)
- 7. Nosil P, Flaxman S, Feder J, Gompert Z. 2020. Increasing our ability to predict contemporary evolution. Nat. Commun. 11, 5592. (doi:10.1038/s41467-020-19437-x)
- 8. Papp B, Notebaart R, Pál C. 2011. Systems-biology approaches for predicting genomic evolution. Nat. Rev. Genet. 12, 591-602. (doi:10.1038/nrg3033)
- 9. Stern DL, Orgogozo V. 2008. The loci of evolution: how predictable is genetic evolution? Evolution 62, 2155-2177. (doi:10.1111/j.1558-5646.2008.00450.x)
- 10. Stern DL, Orgogozo V. 2009. Is genetic evolution predictable? Science 323, 746-751. (doi:10.1126/science.1158997)
- 11. Wortel M, et al. 2021. The why, what and how of predicting evolution across biology: from disease to biotechnology to biodiversity. ecoevoarxiv. (doi:10.32942/osf.io/4u3mg)
- 12. Duan J, Lupyan D, Wang L. 2020. Improving the accuracy of protein thermostability predictions for single point mutations. Biophys. J. 119, 115-127. (doi:10.1016/j.bpj.2020.05.020)
- 13. Martinez J, Baquero F, Andersson D. 2008. Predicting antibiotic resistance. Nat. Rev. Microbiol. 5, 958-965. (doi:10.1038/nrmicro1796)
- 14. van den Berg N, Machado D, Santos S, Rocha I, Chacón J, Harcombe W, Mitri S, Patil K. 2022. Ecological modelling approaches for predicting emergent properties in microbial communities. Nat. Ecol. Evol. 6, 855-865. (doi:10.1038/s41559-022-01746-7)
- 15. Eyre-Walker A, Keightley P. 2007. The distribution of fitness effects of new mutations. Nat. Rev. Genet. 8, 610-618. (doi:10.1038/nrg2146)
- 16. de Boer C, Vaishnav E, Sadeh R, Abeyta E, Friedman N, Regev A. 2019. Deciphering eukaryotic gene-regulatory logic with 100 million random promoters. Nat. Biotechnol. 38, 56-65. (doi:10.1038/s41587-019-0315-8)
- 17. Vaishnav E, et al. 2022. The evolution, evolvability, and engineering of gene regulatory DNA. Nature 603, 455-463. (doi:10.1038/s41586-022-04506-6)
- 18. Bank C, Matuszewski S, Hietpas R, Jeffrey J. 2016. On the (un)predictability of a large intragenic fitness landscape. Proc. Natl Acad. Sci. USA 113, 14 085-14 090. (doi:10.1073/pnas.1612676113)
- 19. Li C, Zhang J. 2018. The fitness landscapes of a tRNA gene. Science 352, 1025-1032. (doi:10.1126/science.aae0568)
- 20. Lite T-L, Grant R, Nocedal I, Littlehale M, Guo M, Laub M. 2020. Uncovering the basis of protein-protein interaction specificity with a combinatorially complete library. eLife 9, e60924. (doi:10.7554/eLife.60924)
- 21. Qiu C, et al. 2016. High-resolution phenotypic landscape of the RNA polymerase II trigger loop. PLoS Genet. 12, e1006321. (doi:10.1371/journal.pgen.1006321)
- 22. Sarkisyan K, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533, 397-401. (doi:10.1038/nature17995)
- 23. Tack D, Tonner P, Pressman A, Olson N, Levy S, Romantseva E, Alperovich N, Vasilyeva O, Ross D. 2021. The genotype-phenotype landscape of an allosteric protein. Mol. Syst. Biol. 17, e10179. (doi:10.15252/msb.202010179)
- 24. Wu N, Dai C, Olson C, Lloyd-Smith J, Sun R. 2016. Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 5, e16965. (doi:10.7554/eLife.16965)
- 25. Santos-Moreno K, Tasiudi E, Kusumawardhani H, Stelling J, Schaerli Y. 2022. Synthetic genotype networks. bioRxiv. (doi:10.1101/2022.09.01.506159)
- 26. Schaerli Y, Jimenez A, Duarte J, Mihajlovic L, Renggli J, Isalan M, Sharpe J, Wagner A. 2018. Synthetic circuits reveal how mechanisms of gene regulatory networks constrain evolution. Mol. Syst. Biol. 14, e8102. (doi:10.15252/msb.20178102)
- 27. Bassalo M, Garst A, Choudhury A, Grau W, Oh E, Spindler E, Lipscomb T, Gill R. 2018. Deep scanning lysine metabolism in Escherichia coli. Mol. Syst. Biol. 14, e8371. (doi:10.15252/msb.20188371)
- 28. Stoltzfus A, Yampolsky LY. 2009. Climbing mount probable: mutation as a cause of nonrandomness in evolution. J. Hered. 100, 637-647. (doi:10.1093/jhered/esp048)
- 29. Yampolsky LY, Stoltzfus A. 2001. Bias in the introduction of variation as an orienting factor in evolution. Evol. Dev. 3, 73-83. (doi:10.1046/j.1525-142x.2001.003002073.x)
- 30. Katju V, Bergthorsson U. 2019. Old trade, new tricks: insights into the spontaneous mutation process from the partnering of classical mutation accumulation experiments with high-throughput genomic approaches. Genome Biol. Evol. 11, 136-165. (doi:10.1093/gbe/evy252)
- 31. Friedberg EC, Friedberg EC, A. S. for Microbiology. 2006. DNA repair and mutagenesis, 2nd edn. Washington, DC: ASM Press.
- 32. Lujan SA, et al. 2014. Heterogeneous polymerase fidelity and mismatch repair bias genome variation and composition. Genome Res. 24, 1751-1764. (doi:10.1101/gr.178335.114)
- 33. Stoltzfus A. 2021. Mutation, randomness and evolution. Oxford, UK: Oxford University Press.
- 34. Couce A, Guelfo J, Blázquez J. 2013. Mutational spectrum drives the rise of mutator bacteria. PLoS Genet. 9, e1003167. (doi:10.1371/journal.pgen.1003167)
- 35. Beatty J. 2022. The synthesis and the two scenarios. Evolution 76, 6-14. (doi:10.1111/evo.14423)
- 36. Stoltzfus A. 2017. Why we don’t want another ‘synthesis’. Biol. Direct 12, 23. (doi:10.1186/s13062-017-0194-1)
- 37. Dobzhansky T, Ayala F, Stebbins G, Valentine J. 1977. Evolution. San Francisco, CA: W.H. Freeman.
- 38. Mayr E. 1963. Animal species and evolution. Cambridge, MA: Harvard University Press.
- 39. Simpson GG. 1964. Organisms and molecules in evolution. Science 146, 1535-1538. (doi:10.1126/science.146.3651.1535)
- 40. Stebbins G. 1966. Processes of organic evolution. Englewood Cliffs, NJ: Prentice Hall.
- 41. Fisher R. 1930. The genetical theory of natural selection. London, UK: Oxford University Press.
- 42. Haldane J. 1927. A mathematical theory of natural and artificial selection. v. selection and mutation. Proc. Cam. Phil. Soc. 26, 220-230. (doi:10.1017/S0305004100015644)
- 43. Haldane JB. 1933. The part played by recurrent mutation in evolution. Am. Nat. 67, 5-19. (doi:10.1086/280465)
- 44. Margoliash E. 1963. Primary structure and evolution of cytochrome c. Proc. Natl Acad. Sci. 50, 672-679. (doi:10.1073/pnas.50.4.672)
- 45. Zuckerkandl E, Pauling L. 1962. Molecular disease, evolution, and genetic heterogeneity, pp. 189-225. New York, NY: Academic Press.
- 46. Zuckerkandl E, Pauling L. 1965. Evolutionary divergence and convergence in proteins. New York, NY: Academic Press.
- 47. McCandlish D, Stoltzfus A. 2014. Modeling evolution using the probability of fixation: history and implications. Q. Rev. Biol. 89, 225-252. (doi:10.1086/677571)
- 48. Streisfeld MA, Rausher MD. 2011. Population genetics, pleiotropy, and the preferential fixation of mutations during adaptive evolution. Evolution 65, 629-642. (doi:10.1111/j.1558-5646.2010.01165.x)
- 49. Cano AV, Payne JL. 2020. Mutation bias interacts with composition bias to influence adaptive evolution. PLoS Comput. Biol. 16, 1-26. (doi:10.1371/journal.pcbi.1008296)
- 50. Sane M, Diwan GD, Bhat BA, Wahl LM, Agashe D. 2022. Shifts in mutation spectra enhance access to beneficial mutations. bioRxiv. (doi:10.1101/2020.09.05.284158)
- 51. Schaper S, Louis AA. 2014. The arrival of the frequent: how bias in genotype-phenotype maps can steer populations to local optima. PLoS ONE 9, e86635. (doi:10.1371/journal.pone.0086635)
- 52. Stoltzfus A. 2006. Mutation-biased adaptation in a protein NK model. Mol. Biol. Evol. 23, 1852-1862. (doi:10.1093/molbev/msl064)
- 53. Tuffaha M, Varakunan S, Castellano D, Gutenkunst R, Wahl L. 2022. Shifts in mutation bias promote mutators by altering the distribution of fitness effects. bioRxiv. (doi:10.1101/2022.09.27.509708)
- 54. Gerrish PJ, Lenski RE. 1998. The fate of competing beneficial mutations in an asexual population. Genetica 102, 127-144. (doi:10.1023/A:1017067816551)
- 55. Neher RA. 2013. Genetic draft, selective interference, and population genetics of rapid adaptation. Annu. Rev. Ecol. Evol. Syst. 44, 195-215. (doi:10.1146/annurev-ecolsys-110512-135920)
- 56. Cano AV, Rozhonova H, Stoltzfus A, McCandlish DM, Payne JL. 2022. Mutation bias shapes the spectrum of adaptive substitutions. Proc. Natl Acad. Sci. USA 119, e2119720119. (doi:10.1073/pnas.2119720119)
- 57. Gomez K, Bertram J, Masel J. 2020. Mutation bias can shape adaptation in large asexual populations experiencing clonal interference. Proc. R. Soc. B 287, 20201503. (doi:10.1098/rspb.2020.1503)
- 58. Soares AA, Wardil L, Klaczko LB, Dickman R. 2021. Hidden role of mutations in the evolutionary process. Phys. Rev. E 104, 044413. (doi:10.1103/PhysRevE.104.044413)
- 59. Rokyta D, Joyce P, Caudle S, Wichman H. 2005. An empirical test of the mutational landscape model of adaptation using a single-stranded DNA virus. Nat. Genet. 37, 441-444. (doi:10.1038/ng1535)
- 60. Kimura M. 1962. On the probability of fixation of mutant genes in a population. Genetics 47, 713-719. (doi:10.1093/genetics/47.6.713)
- 61. Chevin LM, Martin G, Lenormand T. 2010. Fisher’s model and the genomics of adaptation: restricted pleiotropy, heterogenous mutation, and parallel evolution. Evolution 64, 3213-3321. (doi:10.1111/j.1558-5646.2010.01058.x)
- 62. Maclean C, Perron G, Gardner A. 2010. Diminishing returns from beneficial mutations and pervasive epistasis shape the fitness landscape for Rifampicin resistance in Pseudomonas aeruginosa. Genetics 186, 1345-1354. (doi:10.1534/genetics.110.123083)
- 63. Bailey SF, Blanquart F, Bataillon T, Kassen R. 2017. What drives parallel evolution?: how population size and mutational variation contribute to repeated evolution. Bioessays 39, 1-9. (doi:10.1002/bies.201600176)
- 64. Bailey SF, Guo Q, Bataillon T. 2018. Identifying drivers of parallel evolution: a regression model approach. Genome Biol. Evol. 10, 2801-2812. (doi:10.1093/gbe/evy210)
- 65. Tenaillon O, Rodríguez-Verdugo A, Gaut RL, McDonald P, Bennett AF, Long AD, Gaut BS. 2012. The molecular diversity of adaptive convergence. Science 335, 457-461. (doi:10.1126/science.1212986)
- 66. Lenormand T, Chevin LM, Bataillon T. 2016. Parallel evolution: what does it (not) tell us and why is it (still) interesting? In Chance in evolution (eds G Ramsey, C Pence), pp. 196–220. Chicago, IL: University of Chicago Press.
- 67. Hodgkinson A, Eyre-Walker A. 2011. Variation in the mutation rate across mammalian genomes. Nat. Rev. Genet. 12, 756-766. (doi:10.1038/nrg3098)
- 68. Liu H, Zhang J. 2019. Yeast spontaneous mutation rate and spectrum vary with environment. Curr. Biol. 29, 1584-1591. (doi:10.1016/j.cub.2019.03.054)
- 69. Maharjan RP, Ferenci T. 2017. A shifting mutational landscape in 6 nutritional states: stress-induced mutagenesis as a series of distinct stress input-mutation output relationships. PLoS Biol. 15, e2001477. (doi:10.1371/journal.pbio.2001477)
- 70. Wei W, Ho W-C, Behringer M, Miller S, Bcharah G, Lynch M. 2022. Rapid evolution of mutation rate and spectrum in response to environmental and population-genetic challenges. Nat. Commun. 13, 4752. (doi:10.1038/s41467-022-32353-6)
- 71. Harris K. 2015. Evidence for recent, population-specific evolution of the human mutation rate. Proc. Natl Acad. Sci. USA 112, 3439-3444. (doi:10.1073/pnas.1418652112)
- 72. Harris K, Pritchard JK. 2017. Rapid evolution of the human mutation spectrum. eLife 6, e24284. (doi:10.7554/eLife.24284)
- 73. Phillips MA, Steenwyk JL, Shen XX, Rokas A. 2021. Examination of gene loss in the DNA mismatch repair pathway and its mutational consequences in a fungal phylum. Genome Biol. Evol. 13, evab219. (doi:10.1093/gbe/evab219)
- 74. Oman M, Alam A, Ness RW. 2022. How sequence context-dependent mutability drives mutation rate variation in the genome. Genome Biol. Evol. 14, evac032. (doi:10.1093/gbe/evac032)
- 75. Martincoreña I, Luscombe NM. 2013. Non-random mutation: the evolution of targeted hypermutation and hypomutation. BioEssays 35, 123-130. (doi:10.1002/bies.201200150)
- 76. Monroe JG, et al. 2022. Mutation bias reflects natural selection in Arabidopsis thaliana. Nature 602, 101-105. (doi:10.1038/s41586-021-04269-6)
- 77. Couce A, Rodríguez-Rojas A, Blazquez J. 2015. Bypass of genetic constraints during mutator evolution to antibiotic resistance. Proc. R. Soc. B 282, 20142698. (doi:10.1098/rspb.2014.2698)
- 78. Horton JS, Flanagan LM, Jackson RW, Priest NK, Taylor TB. 2021. A mutational hotspot that determines highly repeatable evolution can be built and broken by silent genetic changes. Nat. Commun. 12, 6092. (doi:10.1038/s41467-021-26286-9)
- 79. Orr HA. 2002. The population genetics of adaptation: the adaptation of DNA sequences. Evolution 56, 1317-1330. (doi:10.1111/j.0014-3820.2002.tb01446.x)
- 80. Eldholm V, Balloux F. 2016. Antimicrobial resistance in Mycobacterium tuberculosis: the odd one out. Trends Microbiol. 24, 637-648. (doi:10.1016/j.tim.2016.03.007)
- 81.Liu Q, et al. 2015. Within patient microevolution of Mycobacterium tuberculosis correlates with heterogeneous responses to treatment. Sci. Rep. 5, 17507. ( 10.1038/srep17507) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Trauner A, et al. 2017. The within-host population dynamics of Mycobacterium tuberculosis vary with treatment efficacy. Genome Biol. 18, 71. ( 10.1186/s13059-017-1196-0) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Lee H, Popodi E, Tang H, Foster PL. 2012. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc. Natl Acad. Sci. USA 109, E2774-E2783. ( 10.1073/pnas.1210309109) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Zhu YO, Siegal ML, Hall DW, Petrov DA. 2014. Precise estimates of mutation rate and spectrum in yeast. Proc. Natl Acad. Sci. USA 111, E2310-E2318. ( 10.1073/pnas.1323011111) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Good B, Mcdonald M, Barrick J, Lenski R, Desai M. 2017. The dynamics of molecular evolution over 60,000 generations. Nature 551, 45-50. ( 10.1038/nature24287) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Lang GI, Rice DP, Hickman MJ, Sodergren E, Weinstock GM, Botstein D, Desai MM. 2013. Pervasive genetic hitchhiking and clonal interference in forty evolving yeast populations. Nature 500, 571-574. ( 10.1038/nature12344) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Xie K, et al. 2019. DNA fragility in the parallel evolution of pelvic reduction in stickleback fish. Science 363, 81-84. ( 10.1126/science.aan1425) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Coombes D, Moir JWB, Poole AM, Cooper TF, Dobson RCJ. 2019. The fitness challenge of studying molecular adaptation. Biochem. Soc. Trans. 47, 1533-1542. ( 10.1042/BST20180626) [DOI] [PubMed] [Google Scholar]
- 89.Stoltzfus A, McCandlish DM. 2017. Mutational biases influence parallel adaptation. Mol. Biol. Evol. 34, 2163-2172. ( 10.1093/molbev/msx180) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Stoltzfus A, Norris RW. 2016. On the causes of evolutionary transition:transversion bias. Mol. Biol. Evol. 33, 595-602. ( 10.1093/molbev/msv274) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Lyons DM, Lauring AS. 2017. Evidence for the selective basis of transition-to-transversion substitution bias in two RNA viruses. Mol. Biol. Evol. 34, 3205-3215. ( 10.1093/molbev/msx251) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Meyer JR, Dobias DT, Weitz JS, Barrick JE, Quick RT, Lenski RE. 2012. Repeatability and contingency in the evolution of a key innovation in phage lambda. Science 335, 428-432. ( 10.1126/science.1214449) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Crill WD, Wichman HA, Bull JJ. 2000. Evolutionary reversals during viral adaptation to alternating hosts. Genetics 154, 27-37. ( 10.1093/genetics/154.1.27) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Bull JJ, Badgett MR, Wichman HA, Huelsenbeck JP, Hillis DM, Gulati A, Ho C, Molineux IJ. 1997. Exceptional convergent evolution in a virus. Genetics 147, 1497-1507. ( 10.1093/genetics/147.4.1497) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Sackman AM, McGee LW, Morrison AJ, Pierce J, Anisman J, Hamilton H, Sanderbeck S, Newman C, Rokyta DR. 2017. Mutation-driven parallel evolution during viral adaptation. Mol. Biol. Evol. 34, 3243-3253. ( 10.1093/molbev/msx257) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Bertels F, Leemann C, Metzner KJ, Regoes RR. 2019. Parallel evolution of HIV-1 in a long-term experiment. Mol. Biol. Evol. 36, 2400-2414. ( 10.1093/molbev/msz155) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Katz S, Avrani S, Yavneh M, J. Gross HS, Hershberg R. 2021. Dynamics of adaptation during three years of evolution under long-term stationary phase. Mol. Biol. Evol. 38, 2778-2790. ( 10.1093/molbev/msab067) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Aardema ML, Zhen Y, Andolfatto P. 2012. The evolution of cardenolide-resistant forms of Na+,K+-ATPase in Danainae butterflies. Mol. Ecol. 21, 340-349. ( 10.1111/j.1365-294X.2011.05379.x) [DOI] [PubMed] [Google Scholar]
- 99.Ujvari B, et al. 2015. Widespread convergence in toxin resistance by predictable molecular evolution. Proc. Natl Acad. Sci. USA 112, 11 911-11 916. ( 10.1073/pnas.1511706112) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Zhen Y, Aardema ML, Medina EM, Schumer M, Andolfatto P. 2012. Parallel molecular evolution in an herbivore community. Science 337, 1634-1637. ( 10.1126/science.1226630) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Hershberg R, Petrov DA. 2010. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet. 6, e1001115. ( 10.1371/journal.pgen.1001115) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Gagneux S. 2018. Ecology and evolution of Mycobacterium tuberculosis. Nat. Rev. Microbiol. 16, 202-213. ( 10.1038/nrmicro.2018.8) [DOI] [PubMed] [Google Scholar]
- 103.Payne JL, Menardo F, Trauner A, Borrell S, Gygli SM, Loiseau C, Gagneux S, Hall AR. 2019. Transition bias influences the evolution of antibiotic resistance in Mycobacterium tuberculosis. PLoS Biol. 17, 1-23. ( 10.1371/journal.pbio.3000265) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Ehrlich M, Zhang X-Y, Inamdar NM. 1990. Spontaneous deamination of cytosine and 5-methylcytosine residues in DNA and replacement of 5-methylcytosine residues with cytosine residues. Mutat. Res. 238, 277-286. ( 10.1016/0165-1110(90)90019-8) [DOI] [PubMed] [Google Scholar]
- 105.Hess ST, Blake JD, Blake R. 1994. Wide variations in neighbor-dependent substitution rates. J. Mol. Biol. 236, 1022-1033. ( 10.1016/0022-2836(94)90009-4) [DOI] [PubMed] [Google Scholar]
- 106.Smeds L, Qvarnström A, Ellegren H. 2016. Direct estimate of the rate of germline mutation in a bird. Genome Res. 26, 1211-1218. ( 10.1101/gr.204669.116) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Galen SC, et al. 2015. Contribution of a mutational hot spot to hemoglobin adaptation in high-altitude Andean house wrens. Proc. Natl Acad. Sci. USA 112, 13 958-13 963. ( 10.1073/pnas.1507300112) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Zhu X, et al. 2018. Divergent and parallel routes of biochemical adaptation in high-altitude passerine birds from the Qinghai-Tibet plateau. Proc. Natl Acad. Sci. USA 115, 1865-1870. ( 10.1073/pnas.1720487115) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Storz JF, Natarajan C, Signore AV, Witt CC, McCandlish DM, Stoltzfus A. 2019. The role of mutation bias in adaptive molecular evolution: insights from convergent changes in protein function. Phil. Trans. R. Soc. B 374, 20180238. ( 10.1098/rstb.2018.0238) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Cannataro VL, Gaffney SG, Townsend JP. 2018. Effect sizes of somatic mutations in cancer. J. Natl Cancer Inst. 110, 1171-1177. ( 10.1093/jnci/djy168) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Temko D, Tomlinson I, Severini S, Schuster-Bockler B, Graham T. 2018. The effects of mutational processes and selection on driver mutations across cancer types. Nat. Commun. 9, 1857. ( 10.1038/s41467-018-04208-6) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Alexandrov L, Stratton M. 2013. Mutational signatures: the patterns of somatic mutations hidden in cancer genomes. Curr. Opin. Genet. Dev. 24, 52-60. ( 10.1016/j.gde.2013.11.014) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Alexandrov L, et al. 2018. The repertoire of mutational signatures in human cancer. Nature 578, 94-101. ( 10.1101/322859) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Alexandrov L, Nik-Zainal S, Wedge D, Campbell P, Stratton M. 2013. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 3, 246-259. ( 10.1016/j.celrep.2012.12.008) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Cannataro VL, et al. 2019. APOBEC-induced mutations and their cancer effect size in head and neck squamous cell carcinoma. Oncogene 38, 3475-3487. ( 10.1038/s41388-018-0657-6) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Poulos R, Wong YT, Ryan R, Pang H, Wong J. 2018. Analysis of 7,815 cancer exomes reveals associations between mutational processes and somatic driver mutations. PLoS Genet. 14, e1007779. ( 10.1371/journal.pgen.1007779) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Gao G, Liao W, Ma Q, Zhang B, Chen Y, Wang Y. 2020. KRAS G12D mutation predicts lower TMB and drives immune suppression in lung adenocarcinoma. Lung Cancer 149, 41-45. ( 10.1016/j.lungcan.2020.09.004) [DOI] [PubMed] [Google Scholar]
- 118.Tan C, Mandell JD, Dasari K, Cannataro VL, Alfaro-Murillo JA, Townsend JP. 2022. Heavy mutagenesis by tobacco leads to lung adenocarcinoma tumors with KRAS G12 mutations other than G12D, leading KRAS G12D tumors-on average-to exhibit a lower mutation burden. Lung Cancer 166, 265-269. ( 10.1016/j.lungcan.2021.10.008) [DOI] [PubMed] [Google Scholar]
- 119.Woolston A, et al. 2021. Mutational signatures impact the evolution of anti-EGFR antibody resistance in colorectal cancer. Nat. Ecol. Evol. 5, 1024-1032. ( 10.1038/s41559-021-01470-8) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Cannataro VL, Mandell JD, Townsend JP. 2022. Attribution of cancer origins to endogenous, exogenous, and preventable mutational processes. Mol. Biol. Evol. 39, msac084. ( 10.1093/molbev/msac084) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Watson CJ, Papula AL, Poon GYP, Wong WH, Young AL, Druley TE, Fisher DS, Blundell JR. 2020. The evolutionary dynamics and fitness landscape of clonal hematopoiesis. Science 367, 1449-1454. ( 10.1126/science.aay9333) [DOI] [PubMed] [Google Scholar]
- 122.Leighow S, Liu C, Inam H, Zhao B, Pritchard J. 2020. Multi-scale predictions of drug resistance epidemiology identify design principles for rational drug design. Cell Rep. 30, 3951-3963. ( 10.1016/j.celrep.2020.02.108) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Bachar A, Itzhaki E, Gleizer S, Shamshoom M, Milo R, Antonovsky N. 2020. Point mutations in topoisomerase I alter the mutation spectrum in E. coli and impact the emergence of drug resistance genotypes. Nucl. Acids Res. 48, 761-769. ( 10.1093/nar/gkz1100) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Gould F, Brown ZS, Kuzma J. 2018. Wicked evolution: can we address the sociobiological dilemma of pesticide resistance? Science 360, 728-732. ( 10.1126/science.aar3780) [DOI] [PubMed] [Google Scholar]
- 125.Hawkins NJ, Fraaije BA. 2021. Contrasting levels of genetic predictability in the evolution of resistance to major classes of fungicides. Mol. Ecol. 30, 5318-5327. ( 10.1111/mec.15877) [DOI] [PubMed] [Google Scholar]
- 126.Jones L, Riaz S, Morales-Cruz A, Amrine KC, McGuire B, Gubler WD, Walker MA, Cantu D. 2014. Adaptive genomic structural variation in the grape powdery mildew pathogen, Erysiphe necator. BMC Genomics 15, 1081. ( 10.1186/1471-2164-15-1081) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Rice AM, et al. 2021. Evidence for strong mutation bias toward, and selection against, U content in SARS-CoV-2: implications for vaccine design. Mol. Biol. Evol. 38, 67-83. ( 10.1093/molbev/msaa188) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Hamelin DJ, et al. 2022. The mutational landscape of SARS-CoV-2 variants diversifies T cell targets in an HLA-supertype-dependent manner. Cell Syst. 13, 143-157. ( 10.1016/j.cels.2021.09.013) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129.Abas AH, et al. 2022. Can the SARS-CoV-2 omicron variant confer natural immunity against COVID-19? Molecules 27, 2221. ( 10.3390/molecules27072221) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Keulen W, Back NK, van Wijk A, Boucher CA, Berkhout B. 1997. Initial appearance of the 184Ile variant in lamivudine-treated patients is caused by the mutational bias of human immunodeficiency virus type 1 reverse transcriptase. J. Virol. 71, 3346-3350. ( 10.1128/JVI.71.4.3346-3350.1997) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Jern P, Russell RA, Pathak VK, Coffin JM. 2009. Likely role of APOBEC3G-mediated G-to-A mutations in HIV-1 evolution and drug resistance. PLoS Pathog. 5, e1000367. ( 10.1371/journal.ppat.1000367) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132.Haddox HK, Dingens AS, Hilton SK, Overbaugh J, Bloom JD. 2018. Mapping mutational effects along the evolutionary landscape of HIV envelope. eLife 7, e34420. ( 10.7554/eLife.34420) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 133.Luria SE, Delbrück M. 1943. Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28, 491-511. ( 10.1093/genetics/28.6.491) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 134.Robert L, Ollion J, Robert J, Song X, Matic I, Elez M. 2018. Mutation dynamics and fitness effects followed in single cells. Science 359, 1283-1286. ( 10.1126/science.aan0797) [DOI] [PubMed] [Google Scholar]
- 135.Acevedo A, Brodsky L, Andino R. 2014. Mutational and fitness landscapes of an RNA virus revealed through population sequencing. Nature 505, 686-690. ( 10.1038/nature12861) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 136.Dolan PT, Taguwa S, Rangel MA, Acevedo A, Hagai T, Andino R, Frydman J. 2021. Principles of dengue virus evolvability derived from genotype-fitness maps in human and mosquito cells. eLife 10, e61921. ( 10.7554/eLife.61921) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 137.Rogozin IB, Pavlov YI. 2003. Theoretical analysis of mutation hotspots and their DNA sequence context specificity. Mutat. Res. 544, 65-85. ( 10.1016/s1383-5742(03)00032-2) [DOI] [PubMed] [Google Scholar]
- 138.Sankar T, Wastuwidyaningtyas B, Dong Y, Lewis S, Wang J. 2016. The nature of mutations induced by replication-transcription collisions. Nature 535, 178-181. ( 10.1038/nature18316) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 139.Long H, Sung W, Miller S, Ackerman M, Doak T, Lynch M. 2014. Mutation rate, spectrum, topology, and context-dependency in the DNA mismatch repair-deficient Pseudomonas fluorescens ATCC948. Genome Biol. Evol. 7, 262-271. ( 10.1093/gbe/evu284) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Terekhanova NV, Seplyarskiy VB, Soldatov RA, Bazykin GA. 2017. Evolution of local mutation rate and its determinants. Mol. Biol. Evol. 34, 1100-1109. ( 10.1093/molbev/msx060) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 141.Dillon M, Sung W, Lynch M, Cooper V. 2018. Periodic variation of mutation rates in bacterial genomes associated with replication timing. mBio 9, e01371-18. [DOI] [PMC free article] [PubMed]
- 142.Watt DL, Buckland RJ, Lujan SA, Kunkel TA, Chabes A. 2015. Genome-wide analysis of the specificity and mechanisms of replication infidelity driven by imbalanced dNTP pools. Nucleic Acids Res. 44, 1669-1680. ( 10.1093/nar/gkv1298) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Witt E, Langer C, Svetec N, Zhao L. 2023. Transcriptional and mutational signatures of the Drosophila ageing germline. Nat. Ecol. Evol. 7, 1-10. ( 10.1038/s41559-022-01958-x) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 144.Chang SL, Lai HY, Tung SY, Leu JY. 2013. Dynamic large-scale chromosomal rearrangements fuel rapid adaptation in yeast populations. PLoS Genet. 9, e1003232. ( 10.1371/journal.pgen.1003232) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Vandecraen J, Chandler M, Aertsen A, Van Houdt R. 2017. The impact of insertion sequences on bacterial genome plasticity and adaptability. Crit. Rev. Microbiol. 43, 709-730. ( 10.1080/1040841X.2017.1303661) [DOI] [PubMed] [Google Scholar]
- 146.Fondon JW, Garner HR. 2004. Molecular origins of rapid and continuous morphological evolution. Proc. Natl Acad. Sci. USA 101, 18 058-18 063. ( 10.1073/pnas.0408118101) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Schrider D, Hourmozdi J, Hahn M. 2011. Pervasive multinucleotide mutational events in eukaryotes. Curr. Biol. 21, 1051-1054. ( 10.1016/j.cub.2011.05.013) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 148.Wang Q, et al. 2020. Landscape of multi-nucleotide variants in 125 748 human exomes and 15 708 genomes. Nat. Commun. 11, 2539. ( 10.1038/s41467-019-12438-5) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 149.Kaplanis J, et al. 2019. Exome-wide assessment of the functional impact and pathogenicity of multinucleotide mutations. Genome Res. 29, 1047-1056. ( 10.1101/gr.239756.118) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 150.Borges V, et al. 2021. Mutation rate of SARS-CoV-2 and emergence of mutators during experimental evolution. Evol. Med. Public Health 10, 142-155. ( 10.1101/2021.05.19.444774) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 151.Tonkin-Hill G, et al. 2021. Patterns of within-host genetic diversity in SARS-CoV-2. eLife 10, e66857. ( 10.7554/eLife.66857) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 152.Yadon A, Maharaj K, Adamson J, Lai Y-P, Sacchettini J, Ioerger T, Rubin E, Pym A. 2017. A comprehensive characterization of pncA polymorphisms that confer resistance to pyrazinamide. Nat. Commun. 8, 588. ( 10.1038/s41467-017-00721-2) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 153.Besnard F, Picao-Osorio J, Dubois C, Felix MA. 2020. A broad mutational target explains a fast rate of phenotypic evolution. eLife 9, e54928. ( 10.7554/eLife.54928) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 154.Houle D. 1998. How should we explain variation in the genetic variance of traits? Genetica 102, 241-253. ( 10.1023/A:1017034925212) [DOI] [PubMed] [Google Scholar]
- 155.Lind PA, Farr AD, Rainey PB. 2015. Experimental evolution reveals hidden diversity in evolutionary pathways. eLife 4, e07074. ( 10.7554/eLife.07074) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 156.Lind PA, Libby E, Herzog J, Rainey PB. 2019. Predicting mutational routes to new adaptive phenotypes. eLife 8, e38822. ( 10.7554/eLife.38822) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157.Ahnert S. 2017. Structural properties of genotype-phenotype maps. J. R. Soc. Interface 14, 20170275. ( 10.1098/rsif.2017.0275) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 158.Dingle K, Ghaddar F, Šulc P, Louis AA. 2020. Phenotype bias determines how RNA structures occupy the morphospace of all possible shapes. Mol. Biol. Evol. 39, msab280. ( 10.1093/molbev/msab280) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 159.Garson J, Wang L, Sarkar S. 2003. How development may direct evolution. Biol. Phil. 18, 353-370. ( 10.1023/A:1023996321257) [DOI] [Google Scholar]
- 160.Gymrek M, et al. 2015. Abundant contribution of short tandem repeats to gene expression variation in humans. Nat. Genet. 48, 22-29. ( 10.1038/ng.3461) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 161.Simon A, Doelsnitz S, Ellington A. 2019. Synthetic evolution. Nat. Biotechnol. 37, 730-746. ( 10.1038/s41587-019-0157-4) [DOI] [PubMed] [Google Scholar]
- 162.Calles J, Justice I, Brinkley D, Garcia A, Endy D. 2019. Fail-safe genetic codes designed to intrinsically contain engineered organisms. Nucl. Acids Res. 47, 10 439-10 451. ( 10.1093/nar/gkz745) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 163.Drienovska I, Roelfes G. 2020. Expanding the enzyme universe with genetically encoded unnatural amino acids. Nat. Catal. 3, 193-202. ( 10.1038/s41929-019-0410-8) [DOI] [Google Scholar]
- 164.Moratorio G, et al. 2017. Attenuation of RNA viruses by redirecting their evolution in sequence space. Nat. Microbiol. 2, 17088. ( 10.1038/nmicrobiol.2017.88) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 165.Pines G, Winkler JD, Pines A, Gill RT. 2017. Refactoring the genetic code for increased evolvability. mBio 8, e01654-17. ( 10.1128/mBio.01654-17) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 166.Ravikumar A, M. Obadi AGA, Javanpour A, Liu CC. 2018. Scalable, continuous evolution of genes at mutation rates above genomic error thresholds. Cell 175, 1946-1957. ( 10.1016/j.cell.2018.10.021) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 167.Ravikumar A, Arrieta A, Liu CC. 2014. An orthogonal DNA replication system in yeast. Nat. Chem. Biol. 10, 175-177. ( 10.1038/nchembio.1439) [DOI] [PubMed] [Google Scholar]
- 168.Rix G, Watkins-Dulaney E, Almhjell P, Boville C, Arnold F, Liu C. 2020. Scalable, continuous evolution for the generation of diverse enzyme variants encompassing promiscuous activities. Nat. Commun. 11, 5644. ( 10.1038/s41467-020-19539-6) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 169.Cano AV, Gitschlag BL, Rozhoňová H, Stoltzfus A, McCandlish DM, Payne JL. 2023. Mutation bias and the predictability of evolution. Figshare. ( 10.6084/m9.figshare.c.6444362) [DOI] [PMC free article] [PubMed]
Data Availability Statement
Data for figure 1 are provided as the electronic supplementary material [169]. Data for figure 2 are provided in Maclean et al. [62]; data for figure 3 are provided in Cano et al. [49]; data for table 1 are provided in Stoltzfus & McCandlish [89]. Data and code for figure 4 are provided in the GitHub repository indicated in the figure caption.