eLife. 2022 Dec 22;11:e82240. doi: 10.7554/eLife.82240

Adaptation dynamics between copy-number and point mutations

Isabella Tomanek 1, Călin C Guet 1
Editors: Sergey Kryazhimskiy2, Molly Przeworski3
PMCID: PMC9833825  PMID: 36546673

Abstract

Together, copy-number and point mutations form the basis for most evolutionary novelty, through the process of gene duplication and divergence. While a plethora of genomic data reveals the long-term fate of diverging coding sequences and their cis-regulatory elements, little is known about the early dynamics around the duplication event itself. In microorganisms, selection for increased gene expression often drives the expansion of gene copy-number mutations, which serves as a crude adaptation, prior to divergence through refining point mutations. Using a simple synthetic genetic reporter system that can distinguish between copy-number and point mutations, we study their early and transient adaptive dynamics in real time in Escherichia coli. We find two qualitatively different routes of adaptation, depending on the level of functional improvement needed. In conditions of high gene expression demand, the two mutation types occur as a combination. However, under low gene expression demand, copy-number and point mutations are mutually exclusive; here, owing to their higher frequency, adaptation is dominated by copy-number mutations, in a process we term amplification hindrance. Ultimately, due to high reversal rates and pleiotropic cost, copy-number mutations may not only serve as a crude and transient adaptation, but also constrain sequence divergence over evolutionary time scales.

Research organism: E. coli

Introduction

Adaptive evolution proceeds by selection acting on mutations, which are often implicitly equated with point mutations, that is, changes to a single nucleotide in the DNA sequence. However, nature is full of different types of bigger-scale mutations, such as mutations to the copy-number of genomic regions ranging from only a few base pairs up to half a bacterial chromosome (Anderson and Roth, 1977; Darmon and Leach, 2014). The specific properties of mutations, such as their rate of formation and reversal, might influence the evolutionary dynamics in major ways, but are rarely considered.

In bacteria, which are our focus, the duplication of genes or genomic regions occurs orders of magnitude more frequently than point mutations, ranging from 10⁻⁶ up to 10⁻² per cell per generation (Roth, 1988; Drake et al., 1998; Andersson and Hughes, 2009; Elez et al., 2010; Reams and Roth, 2015). Moreover, while duplications can form via different mechanisms, they all are genetically unstable (Andersson and Hughes, 2009); the repeated stretch of DNA sequence is prone to recA-dependent homologous recombination. At rates between 10⁻³ and 10⁻¹ per cell per generation, duplications will reverse to the single copy (deletion) or duplicate further (amplification) (Roth, 1988; Andersson and Hughes, 2009; Pettersson et al., 2009; Reams and Roth, 2015; Tomanek et al., 2020). Amplification of a gene or genomic region will, to a first approximation, increase its expression by means of elevated gene dosage (Elde et al., 2012; Gruber et al., 2012; Näsvall et al., 2012; Yona et al., 2015; Steinrueck and Guet, 2017; Belikova et al., 2020; Todd and Selmecki, 2020). Not surprisingly, due to their high rate of formation, gene amplifications are adaptive in situations where a rapid increase in gene expression is needed: resistance to antibiotics, pesticides or drugs via over-expression of resistance determinants (Prody et al., 1989; Albertson, 2006; Bass and Field, 2011; Nicoloff et al., 2019), immune evasion (Belikova et al., 2020), or novel metabolic capabilities through increased expression of spurious enzymatic side activities (Blount et al., 2020; Richts et al., 2021). Due to their high intrinsic rate of deletion, often combined with significant fitness cost (Bergthorsson et al., 2007; Pettersson et al., 2009; Reams et al., 2010), copy-number mutations not only differ from point mutations in their frequency of occurrence, but also in the nature of their reversibility.
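To make the consequence of these formation and reversal rates concrete, the following minimal sketch treats a locus as a two-state system (single copy or duplicated) in the absence of selection. This is an illustrative simplification of ours, not a model used in this study: it ignores further amplification, fitness costs, and selection. The names `k_dup` and `k_loss` are hypothetical parameter names for the per-cell, per-generation rates of duplication formation and loss. Under these assumptions the duplicated fraction f follows f(t+1) = f(t) + k_dup·(1 − f(t)) − k_loss·f(t) and approaches k_dup / (k_dup + k_loss).

```python
def duplicated_fraction(k_dup, k_loss, generations, f0=0.0):
    """Iterate the neutral two-state recursion for the duplicated fraction.

    k_dup  : assumed rate of duplication formation (per cell per generation)
    k_loss : assumed rate of reversal to single copy (per cell per generation)
    f0     : starting fraction of cells carrying the duplication
    """
    f = f0
    for _ in range(generations):
        # Gain from single-copy cells, loss from duplicated cells.
        f = f + k_dup * (1.0 - f) - k_loss * f
    return f


def steady_state_fraction(k_dup, k_loss):
    """Closed-form equilibrium of the recursion above (no selection)."""
    return k_dup / (k_dup + k_loss)


if __name__ == "__main__":
    # Illustrative values chosen from within the ranges quoted in the text.
    k_dup, k_loss = 1e-4, 1e-2
    print(steady_state_fraction(k_dup, k_loss))
    print(duplicated_fraction(k_dup, k_loss, generations=2000))
```

With a loss rate a hundredfold higher than the formation rate, the sketch predicts that only about 1% of cells carry the duplication at equilibrium, which illustrates why duplications are expected to be transient unless selection maintains them.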

Together, copy-number and point mutations are responsible for the evolution of most functional novelty of genes through the process of duplication and divergence of existing genes (Ohno, 1970; Kacser and Beeby, 1984; Conant and Wolfe, 2008; Andersson et al., 2015). Owing to the dynamic nature of gene duplication formation and reversal, the interplay between copy-number and point mutations may lead to complex evolutionary dynamics around the time point of origin of a new gene duplication event. However, so far most attention has been focused on understanding the long-lasting process of how duplicate gene pairs diverge by accumulating point mutations (Lynch and Conery, 2000; Teufel et al., 2015; Friedlander et al., 2017), while we know little about the potentially short-lived initial duplication event itself (Innan and Kondrashov, 2010). On one hand, this bias is due to significant technical challenges in studying transient copy-number variation experimentally (Andersson and Hughes, 2009; Lauer and Gresham, 2019; Belikova et al., 2020; Tomanek et al., 2020), and on the other hand, research has focused on the plethora of long-term evolutionary data that document the sequence divergence of paralogs, as ‘attention is shifted to where the data are’ (Kondrashov, 2012).

In bacteria adaptive amplification, that is, amplification as a response to selection as opposed to neutral duplication and divergence, is considered the default mode of paralog evolution (Andersson and Hughes, 2009; Treangen and Rocha, 2011; Copley, 2020) and has been conceptualized in the innovation-amplification-divergence (IAD) model (Bergthorsson et al., 2007), which was later validated by evolution experiments (Elde et al., 2012; Näsvall et al., 2012). The IAD model posits that selection for a novel enzymatic activity leads to adaptive gene amplification that increases expression of an existing enzyme if it exhibits low levels of a beneficial secondary enzymatic activity (also referred to as promiscuous functions; Aharoni et al., 2005; Tawfik, 2010; Copley, 2017). Eventually, protein sequences diverge as point mutations improve the secondary enzymatic function: a new protein function is born from an existing one. After the new (improved) function is present, superfluous additional gene copies will be lost due to their cost and high rate of reversibility, leaving only the copies of the two (ancestral and evolved) paralogs (Bergthorsson et al., 2007; Reams et al., 2010; Elde et al., 2012; Näsvall et al., 2012).

Similarly, adaptive amplification can precede the divergence of promoter sequences under selection favouring increased gene expression (Steinrueck and Guet, 2017). Thus, gene amplifications serve as a fast adaptation which can later be replaced by point mutations either within the coding region of a gene, increasing a cryptic enzymatic activity, or in its non-coding promoter region, increasing its expression (Elde et al., 2012; Näsvall et al., 2012; Yona et al., 2015; Steinrueck and Guet, 2017).

Since elevated numbers of gene copies provide an increased target for point mutations to occur (San Millan et al., 2017), it has been suggested that copy-number mutations speed up the process of divergence (Andersson and Hughes, 2009). However, if both copy-number and point mutations are adaptive (Gruber et al., 2012), they also have the potential to interact, either epistatically or through clonal interference (Gerrish and Lenski, 1998). This interaction could result in unexpected evolutionary dynamics due to the different rates of formation and reversal of the two different mutation types.

To fill the knowledge gap that exists at around ‘time zero’ of the duplication-divergence process (Innan and Kondrashov, 2010), we designed a synthetic genetic system with which we can monitor, in real time, arising copy-number and point mutations in evolving populations of Escherichia coli. Importantly, while our results are also relevant to the divergence of paralogous protein sequences, here we study the process of divergence in a model gene promoter. Our genetic reporter system allows us to phenotypically distinguish between copy-number and point mutations, by specifically selecting for the increased expression of an existing but barely expressed gene. With our system at hand, we set out to test whether adaptive copy-number mutations facilitate or hinder adaptation by point mutation.

Results

The motivation for this work was sparked by an evolution experiment conducted in E. coli at a locus exhibiting high rates of gene amplification (Steinrueck and Guet, 2017), which failed to produce any evolved clones with point mutations and thus led us to hypothesize that copy-number mutations may interfere with evolution by point mutations under certain conditions.

An experimental system that distinguishes copy-number and point mutations

To study the interplay between copy-number and point mutations during adaptation, we follow the fate of a barely expressed gene during its evolution towards higher expression. Our experimental system consists of an intact endogenous galK gene of E. coli that harbours a random promoter sequence (P0) that replaces its endogenous promoter. By growing E. coli in the presence of the sugar galactose, we are selecting for increased galK expression. Adaptation to selection for increased expression can happen in two different, non-mutually exclusive ways: through increased copy-number (duplication or amplification) or through point mutations in the P0 promoter region of galK (divergence) (Tomanek et al., 2020).

Importantly, our genetic reporter system allows us to distinguish between the two mutation types. The galK gene is part of a chromosomal reporter gene cassette and is transcriptionally fused to a yfp gene (Figure 1A). Hence, any increases in galK expression – be it by copy-number or point mutations – can be detected as increases in YFP expression. However, only mutations to the copy-number of the entire galK locus lead to an additional increase in the expression of an independently transcribed cfp gene downstream of galK-yfp (Steinrueck and Guet, 2017; Tomanek et al., 2020; Figure 1A, Figure 1—figure supplement 1A–C). Hence, increases in yfp alone indicate the divergence of the galK promoter sequence P0 by point mutations, while increases of both fluorophores indicate copy-number mutations of the whole locus. Finally, clones with increased yfp but without point mutations in P0 would indicate the presence of a trans-acting mutation at a different locus on the chromosome or a rare amplification event occurring independently of the repeated IS elements and excluding CFP (Steinrueck and Guet, 2017; Tomanek et al., 2020). Moreover, while in principle possible, an adaptive mutation in the coding sequence of galK itself is extremely unlikely to be selected under our experimental conditions given that growth is limited only by expression of the endogenous and fully functional galactokinase enzyme.
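The decision logic of the reporter system described above can be summarized as a small classification function. This is a minimal sketch, not code from the study: the function name, the fold-change inputs, and the cutoff `threshold=1.5` are hypothetical choices made for illustration, and real data would require a calibrated threshold.

```python
def classify_clone(yfp_fold, cfp_fold, p0_mutated, threshold=1.5):
    """Infer the likely mutation type of a clone from its reporter phenotype.

    yfp_fold, cfp_fold : fluorescence relative to the ancestor (1.0 = ancestral)
    p0_mutated         : True if sequencing found a point mutation in P0
    threshold          : hypothetical fold-change cutoff for "increased"
    """
    yfp_up = yfp_fold > threshold
    cfp_up = cfp_fold > threshold

    if not yfp_up and not cfp_up:
        return "ancestral"
    if yfp_up and cfp_up:
        # Both reporters increased: the whole locus changed copy-number.
        if p0_mutated:
            return "amplification plus promoter mutation"
        return "amplification"
    if yfp_up:
        # Only galK-yfp increased: promoter divergence, unless P0 is unchanged.
        if p0_mutated:
            return "promoter mutation"
        return "trans-acting mutation or amplification excluding cfp"
    # CFP increased without YFP is not a case described in the text.
    return "unclassified"


if __name__ == "__main__":
    print(classify_clone(yfp_fold=4.0, cfp_fold=1.0, p0_mutated=True))
    print(classify_clone(yfp_fold=3.0, cfp_fold=3.0, p0_mutated=False))
```

The last branch is included only so the function is total; the text gives no interpretation for increased CFP without increased YFP.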

Figure 1. An experimental system to study gene duplication and divergence in strains with different duplication rates.

(A) Cartoon of chromosomal selection and reporter cassette. The galK-yfp gene fusion does not have a functional promoter, but instead a random sequence, P0 (thin arrow), drives very low levels of baseline gene expression. Cfp expression is driven by a constitutive promoter (black arrow). Light bulbs symbolize fluorescence. Two fundamentally different kinds of adaptive mutations are shown on the right: (i) point mutations in P0 lead to increases in GalK-YFP while CFP remains at ancestral single-copy levels (top), (ii) mutations to the copy-number of the whole reporter cassette will increase both YFP and CFP expression (bottom). (B) Growth rate (as a proxy for fitness) as a function of different induction levels of galK expression in four different concentrations of galactose. Expression of a synthetic para-galK cassette (schematic below the figure) is induced by the addition of arabinose. Growth rate increases along with increasing galK expression, but it plateaus at different values for different gene expression levels depending on galactose concentration (low, intermediate, and high gene expression demand). (C–D) Experimental layout. The adaptive dynamics and sequence divergence in P0 are compared between two otherwise isogenic strains (IS- and IS+) that differ in their rate of forming duplications. For IS- the second endogenous copy of IS1C located 12 kb downstream of the selection and reporter cassette has been deleted (C). Ninety-six replicate populations of each strain are evolved for 12 days in three different levels of galactose, which select for increasing levels of gene expression improvement. Throughout, fluorescence is analysed in bulk and on a single-cell level to analyse evolutionary dynamics, and relevant clones are sequenced (D).

Figure 1—source data 1. Contains an R script along with optical density measurements to plot Figure 1B.

Figure 1—figure supplement 1. An experimental system to study gene duplication and divergence in strains with different duplication rates.

(A–C) Fluorescence phenotype and copy-number as measured by qPCR of bacterial clones with different levels of galK expression and copy-number grown on LB agar. Point colour indicates P0 sequence of clones (black = ancestral, green = promoter mutation ‘H5’ [–30T>A and –37C>T]). (A) Colony CFP fluorescence plotted against copy-number relative to a single-copy control strain as determined by qPCR. Error bars represent the standard deviation of three and four replicates for copy-number and CFP fluorescence, respectively. Linear fit: Adjusted R-squared=0.956, p-value = 3.35e-06. (B) Colony YFP fluorescence plotted against copy-number relative to a single-copy control strain as determined by qPCR. Linear model fitted to all data points with ancestral P0 sequence. Adjusted R-squared=0.97, p-value = 3.57e-06. (C) Colony YFP fluorescence plotted against CFP fluorescence. Linear model fitted to all data points with ancestral P0 sequence. Adjusted R-squared=0.97, p-value = 7.6e-06.
Figure 1—figure supplement 1—source data 1. Contains an R script along with qPCR and fluorescence intensity data to plot Figure 1—figure supplement 1.

Different substrate levels result in different enzyme expression demands

Our experimental environment consists of liquid minimal medium containing amino acids as a basic carbon and energy source, such that cells can grow even in the absence of galK expression (Figure 1B – grey line). Adding galactose to this basic medium renders galK expression highly beneficial. To characterize the relation between fitness and galK expression, we engineered a construct where the expression of galK is induced by the addition of arabinose. Growth rate increased along with galK expression and saturated at a certain expression level, which depended on the galactose medium used (Figure 1B). Thus, our system allows studying adaptation in environments with different gene expression demands: low concentrations of galactose demand a low level of galK expression (and increasing expression above this level does not add any extra benefit), while high concentrations of galactose demand a higher level of galK expression to obtain maximum growth rate. In other words, our experimental system allows selecting for different levels of improvement of a biological function (in our case increased galK expression) by growing cells in different galactose concentrations.

Evolution of galK expression in IS+ and IS- strains

Given the vast range of duplication rates observed at different chromosomal loci in bacteria (Roth, 1988; Andersson and Hughes, 2009; Elez et al., 2010; Reams and Roth, 2015), our objective was to experimentally manipulate the ability of galK to form duplications and study its effect on evolutionary dynamics. A common way to manipulate the duplication rate is by deleting the recA gene involved in homologous recombination (Goldberg and Mekalanos, 1986; Reams et al., 2010; Dhar et al., 2014). However, given its role in DNA repair, a comparison of recA and ΔrecA strains would be strongly influenced by the growth defects that such a mutation entails. To avoid pleiotropic effects caused by a difference in the genome-wide duplication rate, we instead compare two identical strains whose difference in duplication rate is restricted to a single genomic locus. To this end, we take advantage of a chromosomal location that is characterized by high rates of duplication and amplification due to homologous recombination occurring between two endogenous identical insertion sequence (IS) elements that flank this specific locus (Steinrueck and Guet, 2017; Tomanek et al., 2020). By deleting one copy of IS1, we generated two otherwise isogenic strains of E. coli that differ solely by the presence of one IS1 element approximately 10 kb downstream of galK (Figure 1C), and are thus predicted to show strong differences in their rates of duplication formation at this locus. In the following, we will refer to these strains as IS+ and IS-.

To understand how the duplication rate affects adaptive dynamics, we conducted an evolution experiment with 96 replicate populations of the IS+ and IS- strains (Figure 1D). Growing these populations in minimal medium containing only amino acids (control) or supplemented with three different galactose concentrations enabled us to follow adaptation to different gene expression demands (levels of selective pressure) (Figure 2A). Daily measurements of population fluorescence prior to dilution (1:820) allowed us to monitor population phenotypes roughly every 10 generations over 12 days.
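The figure of roughly 10 generations per day follows from the dilution factor, as the short calculation below shows. It is a sketch that assumes each population regrows to approximately its pre-dilution density between transfers, so that the number of doublings per day equals log2 of the dilution factor; the variable names are ours.

```python
import math

DILUTION_FACTOR = 820  # daily 1:820 dilution, as stated in the text
DAYS = 12              # duration of the evolution experiment


def generations_per_transfer(dilution_factor):
    """Doublings needed to regrow to the pre-dilution density (assumed)."""
    return math.log2(dilution_factor)


gens_per_day = generations_per_transfer(DILUTION_FACTOR)
total_gens = gens_per_day * DAYS
print(f"{gens_per_day:.2f} generations per day, {total_gens:.0f} in total")
```

Under this assumption a 1:820 dilution corresponds to about 9.7 doublings per day, or roughly 116 generations over the 12 days of the experiment.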

Figure 2. Evolutionary dynamics depend on galactose concentration and duplication rate.

(A) Daily measurements of normalized CFP fluorescence as a proxy for gene copy-number of 96 populations of IS+ (black) and IS- (red) strains growing in three different galactose concentrations (% indicated in the plot), respectively, as well as 33 replicates of IS+ and IS- strain, respectively, growing in the absence of galactose (control, black). (B) Logarithmic plots for an overview of fold changes in YFP and CFP fluorescence of populations from (A). YFP and CFP were normalized to the mean fluorescence of ancestral populations (anc.) evolved in 0% galactose (top panel). Lines connect measurements of each population. Populations’ fluorescence phenotypes occupy three different areas: increased YFP only (YFP+), increased CFP and YFP (YFP+CFP+, i.e. amplified) and increased CFP with an additional elevation in YFP above the YFP+CFP+ fraction (mixed). The number of populations for IS- (red) and IS+ (black) in the respective fractions are indicated (see Figure 1—figure supplement 1A and Figure 3A–B). (C) Representative flow cytometry plots showing single-cell YFP and CFP fluorescence for populations from the YFP+ (left), mixed (middle), and YFP+CFP+ (right) fraction (indicated in panel B), respectively.

Figure 2—source data 1. Contains an R script along with optical density and fluorescence data to plot Figure 2A-B.

Figure 2—figure supplement 1. Number of amplified populations and their copy-number depends on the gene expression demand of the environment.

(A) Data replotted from Figure 2A. Green line indicates the threshold used to classify a population as amplified (CFP/OD600 exceeds the mean ancestral CFP/OD600 by four standard deviations). (B) Using the same threshold, mean CFP/OD600 fluorescence as a proxy for copy-number of all evolved populations is shown for 0.01%, 0.1%, and 1% galactose (68, 19, and 34 populations for low, intermediate, and high galactose, respectively). p-Values (two-sided t-test): 3.6 × 10⁻⁶ (between 0.01% and 1% galactose) and 3 × 10⁻² (between 0.01% and 0.1% galactose).
Figure 2—figure supplement 2. Evolutionary dynamics depend on galactose concentration.

(A) Additional evolution experiment with daily measurements of normalized CFP fluorescence as a proxy for gene copy-number of 96 populations of the IS+ strain growing in three different galactose concentrations (% indicated next to the plots), as well as in the absence of galactose (control). (B) Growth rate in M9 minimal medium with increasing concentrations of galactose (left panel) as well as glycerol (control, right panel) of strain H5 with two SNPs in P0 (–30T>A and –37C>T) and the ancestral strain. Error bars represent the standard deviation of four (galactose) and five (glycerol) replicates, respectively.
Figure 2—figure supplement 2—source data 1. Contains an R script along with optical density measurements to plot Figure 2—figure supplement 2B.
Figure 2—figure supplement 3. YFP-only amplifications occur in IS- populations evolved in 0.1% galactose.

(A) Normalized YFP fluorescence as a proxy for galK expression of 96 populations in the IS- strain growing in 0.1% galactose. Populations with increased YFP fluorescence are highlighted. (B) galK copy-number of the YFP+ IS- populations evolved in 0.1% galactose shown in (A) as estimated by qPCR. For each population, genomic DNA of one colony with ancestral (black bars) and one with increased YFP (yellow bars) fluorescence was analysed. (C) Scheme of galK-yfp-only amplification with a duplication junction upstream of the cfp gene.
Figure 2—figure supplement 3—source data 1. Contains an R script along with qPCR data to plot Figure 2—figure supplement 3B.

The evolution experiment confirmed that the two strains differ strongly in their rate of copy-number mutations of the galK locus. The strain lacking one of the flanking IS1 elements (IS-) showed a drastic reduction in the ability to undergo galK amplification. In contrast to the IS+ strain, very few IS- populations evolved increased CFP expression (Figure 2A – red traces). Interestingly, in the IS+ strain, the number of populations amplified by the end of the experiment depended on the environment. At least twice as many populations were amplified in the low (0.01%) galactose environment compared to the other two environments (68, 19, and 34 populations for low, intermediate, and high galactose, respectively) (Figure 2—figure supplement 1A). Not only the number of amplified populations, but also the maximum CFP fluorescence attained by IS+ populations differed significantly between the low (0.01%) and higher (0.1% and 1%) galactose environments (Figure 2—figure supplement 1B). Populations that evolved increases in CFP fluorescence did so within 2 days and maintained this level relatively stably for the duration of the experiment. (See Figure 2—figure supplement 2A for an independent evolution experiment confirming the environment-dependent patterns of amplification.) The observed difference in the number of galK copies is consistent with the observation that the three environments select for different levels of increasing gene expression (‘levels of improvement’) (Figure 1B) and confirms that amplifications are an efficient way of tuning gene expression (Tomanek et al., 2020).
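The counts of amplified populations rest on the rule stated in the figure supplement: a population is scored as amplified when its CFP/OD600 exceeds the mean ancestral CFP/OD600 by four standard deviations. The sketch below shows one way such a rule could be applied. It is not the authors' R script; the function names are hypothetical, the numbers in the example are invented for illustration, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, stdev


def amplification_threshold(ancestral_values, n_sd=4.0):
    """Mean ancestral CFP/OD600 plus n_sd sample standard deviations."""
    return mean(ancestral_values) + n_sd * stdev(ancestral_values)


def count_amplified(evolved_values, ancestral_values, n_sd=4.0):
    """Number of evolved populations whose CFP/OD600 exceeds the threshold."""
    threshold = amplification_threshold(ancestral_values, n_sd)
    return sum(value > threshold for value in evolved_values)


if __name__ == "__main__":
    # Invented example values (arbitrary fluorescence units per OD600).
    ancestral = [100, 102, 98, 101, 99]
    evolved = [101, 150, 106, 230]
    print(amplification_threshold(ancestral))
    print(count_amplified(evolved, ancestral))
```

A threshold defined relative to the ancestral distribution keeps the classification independent of day-to-day differences in absolute fluorescence, provided ancestral controls are measured alongside the evolved populations.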

We then asked whether other differences in the nature of adaptive mutations exist between the three different environments. To get a coarse-grained overview, we plotted the YFP fluorescence of evolving populations as a proxy for galK expression against their CFP fluorescence as a proxy for galK copy-number for all time points (Figure 2B). The YFP-CFP plot shows that evolving populations exhibit qualitatively different distributions of fluorescence levels in the three different environments, indicating that adaptation has followed different trajectories.

In the absence of galactose, populations retain their ancestral fluorescence phenotype. In the lowest galactose concentration (0.01%), data points show a correlated increase between YFP and CFP fluorescence indicative of gene copy-number mutations (‘YFP+CFP+’ in Figure 2B). In the intermediate galactose concentration (0.1%), 5/96 IS- populations exhibit increased YFP fluorescence with ancestral (single-copy) CFP fluorescence, indicative of promoter mutants (‘YFP+’ fraction in Figure 2B; Figure 2—figure supplement 3A). However, sequencing the P0 region upstream of galK of evolved clones from these populations with strongly increased YFP fluorescence showed that they harboured an ancestral P0 sequence (Figure 2—figure supplement 3A). We hypothesized that the YFP+ populations carried an amplification extending into galK-yfp, yet excluding cfp. Quantitative real-time PCR confirmed our suspicion (Figure 2—figure supplement 3B). As the IS- strain cannot undergo the frequent duplication via the two flanking IS elements, it cannot access a major adaptive route available to the IS+ strain. Thus, its adaptation follows an alternative trajectory, which occurs through a repeat-independent lower-frequency duplication with junctions between yfp and cfp (Figure 2—figure supplement 3C).

While increased CFP still reliably reports on increased copy-number, the yfp-only amplification compromises our ability to unambiguously infer ancestral copy-number from ancestral CFP fluorescence alone. As increasing CFP itself bears no adaptive benefit, populations with increased CFP must carry amplifications that also include galK. In contrast, ancestral copy-number can only be confirmed by qPCR. The fact that some populations carry IS-independent yfp-only amplifications implies that our system of fluorescence reporters will yield a slight underestimate of the number of amplified populations both in the IS+ and IS- strain. However, we were ultimately interested in the divergence of promoter sequences, and going forward relied on sequencing to unambiguously determine the presence of adaptive promoter mutations.

In the high (1%) and intermediate (0.1%) galactose environment, data points occupy an additional space (‘mixed fraction’ in Figure 2B) between the other two fractions, where both YFP and CFP are increased, but the YFP increase is larger than in the YFP+CFP+ fraction. Based on these population-level data, we hypothesized that this phenotypic space is occupied either by a population of double mutants carrying a combination of point and copy-number mutations, or by populations consisting of cells with only promoter mutations and cells with only copy-number mutations (i.e. the two mutations being mutually exclusive). Knowing the single-cell phenotype is therefore crucial for distinguishing between the two cases. Importantly, single-cell fluorescence (using FACS) recapitulated the population measurements with the YFP-CFP phenotype falling into three distinct fractions (Figure 2C).

Copy-number and point mutations occur as a combination in the intermediate and high demand environment

To understand whether copy-number and point mutations are mutually exclusive or if they occur as a combination in the IS+ strain after evolution in intermediate (0.1%) and high (1%) galactose, we determined the single-cell fluorescence of all mixed fraction populations using flow cytometry (Figure 3A–B). It is worth noting that after 12 days of evolution, cells with ancestral YFP and CFP fluorescence were still present in every single amplified population. While some populations consisted of a high fraction of cells with elevated CFP fluorescence, mutants did not yet spread to complete fixation in any of them, highlighting the fact that our experiments are capturing the transient adaptive dynamics.

Figure 3. Confirming the presence of a combination of copy-number and point mutations in intermediate and high galactose.

(A–B) Log plot of YFP and CFP fluorescence of all 96 IS+ populations during evolution in 0.1% (A) and 1% (B) galactose (black points), respectively. Data replotted from Figure 2B for an overview of population fluorescence of all mixed fraction populations (coloured points). Time points of measurements are indicated by the degree of shading. (C–D) Single-cell fluorescence phenotypes as measured by flow cytometry of all mixed fraction populations identified in (A–B) after 12 days of evolution, respectively, indicate the presence of combination mutations (an increase of both YFP and CFP within a single cell as opposed to a mixed population of cells with either an increase in YFP or an increase in CFP, compare to Figure 2C). (E) Sanger sequencing of individual colonies allows the genotype of an evolved clone of any fluorescence phenotype to be determined. Images of CFP (left) and YFP (right) fluorescence of individual colonies from a representative IS+ population (A10) streaked onto LB agar after having evolved in 0.1% galactose for 12 days. Sanger sequencing of the P0 sequence revealed a T>A point mutation in an amplified (red arrow) but not an ancestral colony (grey arrow). Scale bars: 1 cm.

Figure 3—source data 1. Contains an alignment of sequencing data for Figure 3E.

Flow cytometry results showed that IS+ populations of the mixed fraction from intermediate (0.1%) galactose (Figure 3A) consisted of a single type of mutant with increased YFP/CFP fluorescence relative to the ancestral values (Figure 3C). If instead a population consisted of two mutually exclusive mutants, we would expect cells to fall into two distinct phenotypic clusters, one with only increased YFP (corresponding to the ‘YFP+’ fraction) and one with only amplifications (corresponding to the ‘YFP+CFP+’ fraction). Moreover, YFP fluorescence of the mixed fraction cells was greater than that of pure amplification mutants, which fall along the diagonal axis (Figure 2C – right panel), again indicating a combination of copy-number and promoter mutations. To confirm the presence of combination mutants, we randomly picked three populations of the mixed fraction. Sequencing revealed that within these populations, only amplified clones, but not clones with single-copy cfp, harboured an SNP (–30T>A) in P0 (Figure 3E).
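The reasoning used here to tell combination mutants from mutually exclusive mutants can be expressed as a rule applied to single-cell data. The following is a minimal sketch under stated assumptions, not the analysis pipeline of the study: fluorescence is expressed as fold change over the ancestor, pure amplifications are assumed to lie on the diagonal where YFP and CFP fold changes are equal, and the cutoffs `up`, `excess`, and `min_fraction` are hypothetical values chosen for illustration.

```python
from collections import Counter


def classify_cell(yfp_fold, cfp_fold, up=1.5, excess=1.5):
    """Assign one cell to a phenotypic class from its fold changes."""
    yfp_up = yfp_fold > up
    cfp_up = cfp_fold > up
    if cfp_up and yfp_fold > excess * cfp_fold:
        return "combination"  # amplified, with YFP above the diagonal
    if cfp_up and yfp_up:
        return "amplified"    # YFP and CFP increased together
    if yfp_up:
        return "YFP+"         # increased YFP at single-copy CFP
    if cfp_up:
        return "other"        # not a class described in the text
    return "ancestral"


def summarize_population(cells, min_fraction=0.01):
    """Decide between the two hypotheses for a list of (yfp, cfp) pairs."""
    if not cells:
        raise ValueError("no cells measured")
    counts = Counter(classify_cell(yfp, cfp) for yfp, cfp in cells)
    present = {c for c, n in counts.items() if n / len(cells) >= min_fraction}
    if "combination" in present:
        return "combination"
    if {"YFP+", "amplified"} <= present:
        return "mutually exclusive"
    return "no mixed signal"


if __name__ == "__main__":
    combo = [(1.0, 1.0)] * 50 + [(8.0, 2.0)] * 50
    exclusive = [(1.0, 1.0)] * 40 + [(4.0, 1.0)] * 40 + [(3.0, 3.0)] * 20
    print(summarize_population(combo))
    print(summarize_population(exclusive))
```

The `min_fraction` cutoff is there to ignore classes represented by only a handful of events, which in real flow cytometry data could arise from measurement noise.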

Similar to intermediate galactose, IS+ populations from the high (1%) galactose mixed fraction (Figure 3B) harboured cells with the combination mutation phenotype and, in addition, cells with pure amplifications (Figure 3D). Taken together, these data indicate that copy-number and point mutations can occur as a combination in environments with sufficiently high gene expression demand.

Copy-number and point mutations are mutually exclusive in the low demand environment

After finding combination mutants in the high galactose environments, we analysed the single-cell fluorescence of all IS+ populations from the low (0.01%) galactose environment. Surprisingly, and in contrast to the intermediate and high galactose environments, in low galactose adaptive amplification of IS+ populations happened more rapidly, with the majority of populations (68/96) showing increases in CFP fluorescence during the course of the experiment (Figure 4A – left top and bottom panel, Figure 4—figure supplement 1A–B). Notably, cells of those few populations that did not follow this general trend (Figure 4A – right top and bottom panel) showed an increase in YFP without a concomitant increase in CFP. As this small increase in YFP was not visible in the initial population measurements of liquid cultures (Figure 2B), we turned to patching populations onto LB agar, a potentially more sensitive method, which alleviates changes in fluorescence related to growth rate. Imaging populations confirmed the increase in YFP for all populations with elevated YFP in single-cell measurements (Figure 4—figure supplement 2A). We first examined population B1, which showed clearly increased YFP, more carefully by re-streaking it on LB agar (Figure 4C). Consistent with flow cytometry results (Figure 4B), we found colonies with three different fluorescence phenotypes: ancestral, increased YFP (‘YFP+’), and a small subpopulation with both increased YFP and CFP (amplified). Sequencing of the amplified colony type confirmed it to be a bona fide amplification without additional promoter SNPs. Sequencing of the YFP+ colony type uncovered two adaptive SNPs in P0 (–30T>A and –37C>T), which were identical to a previously identified promoter mutation ‘H5’ (Figure 2—figure supplement 2B; Steinrueck and Guet, 2017; Tomanek et al., 2020).

Figure 4. Confirming the presence of mutually exclusive mutations in low galactose.

(A) Representative flow cytometry density plot showing YFP fluorescence (upper left and right panel) and CFP fluorescence (lower left and right panel) of IS+ populations B3 (left panels) and B1 (right panels) over time (grey – ancestral, black – day 4, dark blue – day 8, purple – day 12). The small YFP+CFP+ subpopulation is indicated by a magenta arrow (see corresponding arrow in B – right panel). (B) YFP versus CFP plot of populations B3 (left panel, black) and B1 (right panel, magenta) at day 12 together with an ancestral population (grey) in order to better visualize the two distinct subpopulations in B1 (magenta arrows indicate YFP+ and YFP+CFP+ subpopulation, respectively). Data is replotted from A in order to visualize subpopulations. (C) Images of CFP (left) and YFP (right) fluorescence of individual colonies from IS+ population B1 (shown in B) streaked onto LB agar after 12 days of evolution in 0.01% galactose. The population consists of amplified colonies with increased CFP and YFP fluorescence (grey arrows) and single-copy colonies with a promoter mutation (red arrows). Scale bars: 1 cm. (D) Quantitative analysis of patched populations indicates that promoter mutants (YFP+) evolve only in single-copy backgrounds. YFP-CFP plot of median colony fluorescence intensity of populations patched onto agar (as shown in B) on days 1, 4, 8, and 12 of evolution in 0.01% galactose. Populations were classified as YFP+ if their YFP but not CFP fluorescence intensity values exceeded ancestral fluorescence (red triangles, confirmed by flow cytometry). In all these populations, the YFP+ phenotype evolved from an ancestral phenotype. The blue triangle represents an amplified population, which was classified as YFP+ in the previous time point (flow cytometry showed that this population became dominated by copy-number mutations later). The black triangle marks a population incorrectly classified as YFP+ (ancestral fluorescence according to flow cytometry). See also Table 1.

Figure 4—source data 1. Contains an R script along with colony fluorescence intensity data over time to plot Figure 4D.


Figure 4—figure supplement 1. Adaptation to the low galactose environment is dominated by gene amplification.


(A) Flow cytometry density plot showing YFP fluorescence and (B) CFP fluorescence of IS+ populations over time (grey – ancestral, black – day 4, dark blue – day 8, purple – day 12).
Figure 4—figure supplement 2. Monitoring population fluorescence under neutral conditions with respect to galK expression reveals small increases in YFP fluorescence in the absence of amplification.


(A) Representative images of CFP (left panel) and YFP (middle panel) fluorescence of populations patched onto LB agar, which allows comparing population fluorescence in the absence of galactose-dependent growth effects. Magenta arrows indicate population B1, which exhibits increased YFP but ancestral CFP fluorescence (quantification of patch fluorescence intensity in Figure 4D). Scale bars: 1 cm.

As we failed to find combination mutants (i.e. a mixed fraction) in population measurements from the low galactose environment (Figure 2B), we used agar patches from four different time points of the evolution experiment (Figure 4—figure supplement 2A) to screen IS+ populations more comprehensively (Figure 4D). Re-streaking, sequencing and flow cytometry analysis revealed that all populations with elevated YFP and ancestral CFP (Figure 4D – red triangles) harboured either only promoter mutants or a mixed population of a few amplified cells and a majority of promoter mutants (Table 1). As opposed to high and intermediate galactose, we did not find a single population with combination mutants in low galactose. Moreover, the fact that mutations were mutually exclusive within populations was also reflected when we analysed their fate over time. Quantitative analysis of the fluorescence intensity of patched populations (Figure 4D) confirmed that populations with a significant fraction of promoter mutants (i.e. visibly YFP+ on the agar patch) did not become amplified later in the experiment. As a single exception, population F6 gained the YFP+ phenotype early, but became dominated by gene amplifications by the end of the experiment (Figure 4D – right panel, blue triangle). Nevertheless, also in this case, copy-number and point mutations did not occur in the same genetic background. Conversely, all YFP+ populations evolved exclusively from those with ancestral phenotype; no single amplified population gained a functional promoter within the time frame of the experiment (Figure 4D).

Table 1. Sequencing and phenotypic analysis of all YFP+IS+ populations evolved in 0.01% galactose (Figure 4D – red triangles).

Increase in fluorescence relative to ancestral (anc) phenotype indicated by YFP+ and CFP+. Results shown for day 12 populations unless otherwise noted (d4, d8).

Population Seq (all YFP+) Flow cytometry phenotype Agar streak Comment
A6 –30T>A YFP+, v. few CFP+ (mixed populations) YFP+, few CFP+
B1 –30T>A, –37C>T (“mutation H5”) YFP+, CFP+ (mixed populations) Few YFP+, few CFP+, mixed pop
B2 –30T>A YFP+ YFP+, v. few CFP+
C1 –30T>A YFP+ (d12) YFP+, v. few CFP+
C9 Ancestral YFP (d8), only CFP+(d12) Few CFP+ Incorrectly classified as YFP+ (Figure 4D – grey triangle)
D2 –30T>A YFP+ (d12) YFP+ only
D9 anc YFP+ (d8, d12) YFP+ only
E10 –30T>A YFP+ (d12) YFP+ only
F6 YFP+ (d4), CFP+(d12) CFP+ subpopulation YFP+ at d8, then amplified population (Figure 4D – blue triangle)
F10 –30T>A YFP+, CFP+, anc (mixed populations) YFP+, CFP+, mixed pop qPCR confirmed
G1 –30T>A YFP+(d4–8), v. few CFP+ (d12) YFP+, v. few CFP+
G12 –30T>A YFP+ (d8) YFP+, no CFP+ (d12) FACS CFP+ carry over

The complete absence of combination mutants in the low demand environment is consistent with the fact that only a modest increase in galK expression is necessary to reach maximal fitness (Figure 1B). Thus, while a combination of amplification and promoter point mutation evolves in response to selection for a strong increase in galK expression (intermediate and high demand environments), either mutation alone might provide a sufficient increase in gene expression to allow for maximal growth in the low demand environment. This would mean that the fitness benefits of the two mutations do not add up when combined. In other words, we hypothesize that negative epistasis precludes the evolution of combination mutants in the low demand environment. Alternatively, the lack of combination mutants could be explained by clonal interference between competing adaptive amplifications and point mutations (a possibility we discuss in the last section of Results).
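
The negative epistasis hypothesis can be illustrated with a toy calculation. The following Python sketch is not part of the study. It assumes a hypothetical fitness function that rises linearly with galK expression and plateaus once the demand of the environment is met, and it assumes that an amplification and a promoter mutation each add a fixed, made-up amount of expression. All names and numbers are illustrative.

```python
def fitness(expression, demand):
    """Toy fitness: proportional to expression until demand is met, then flat at 1.0."""
    return min(expression / demand, 1.0)


def epistasis(base, amp_gain, snp_gain, demand):
    """Double-mutant fitness gain minus the sum of the two single-mutant gains."""
    w_anc = fitness(base, demand)
    w_amp = fitness(base + amp_gain, demand)
    w_snp = fitness(base + snp_gain, demand)
    w_both = fitness(base + amp_gain + snp_gain, demand)
    return (w_both - w_anc) - ((w_amp - w_anc) + (w_snp - w_anc))


# Hypothetical expression units, chosen only for illustration.
BASE, AMP_GAIN, SNP_GAIN = 1.0, 3.0, 3.0

# Low demand: either mutation alone already reaches the fitness plateau.
low = epistasis(BASE, AMP_GAIN, SNP_GAIN, demand=4.0)
# High demand: neither mutation alone reaches the plateau.
high = epistasis(BASE, AMP_GAIN, SNP_GAIN, demand=8.0)

print(f"low demand epistasis:  {low:+.2f}")
print(f"high demand epistasis: {high:+.2f}")
```

Under these assumptions the epistasis term is negative in the low demand setting and zero in the high demand setting, which mirrors the qualitative argument above. The real expression-growth relation (Figure 1B) need not be linear below the plateau.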

An increased fraction of adaptive promoter mutations is found in IS- populations evolved in the low demand environment

If copy-number mutations are more frequent than point mutations and their combination does not spread to observable frequencies in the low demand environment, we would expect divergence to proceed more slowly as compared to an intermediate or high demand environment.

To directly test this hypothesis, we estimated the level of divergence between all of the IS+ and IS- populations evolved in the low demand (0.01% galactose) environment. We divided all 96 populations into three pools of 32 and quantified the fraction of SNPs in P0 previously known to be adaptive (Tomanek et al., 2020). To do so, we subjected PCR amplicons of the pooled populations to next-generation sequencing (Figure 5A, Figure 5—figure supplement 1A). We designed our sequencing experiment such that we were able to analyse 39 bp upstream and downstream of the galK start codon. We calculated the fraction of sequence reads carrying either one or both of the most frequently observed adaptive SNPs at positions –30 and –37 upstream of the galK start codon (Table 1). As a control, we also compared the fraction of SNPs within the galK gene of the IS+ and IS- populations evolved under different galactose conditions. In our experimental system, galactose selection is not expected to lead to adaptive mutations anywhere in the coding region of galK, as the enzyme itself is fully functional despite lacking a functional promoter sequence. Comparing the fraction of reads with SNPs (i.e. reads with a single SNP in galK divided by the number of reads with ancestral galK) allowed us to compare across samples with different absolute numbers of sequencing reads (Figure 5—figure supplement 1A). Consistent with our expectation, the fraction of sequencing reads with a single SNP at any position in galK was similar in populations evolved in different galactose concentrations and in the control populations evolved in the absence of galactose (Figure 5A–B).

Figure 5. Amplicon deep sequencing of P0 in pooled evolved populations.

(A) (Left panel) Number of reads carrying a P0 sequence with two adaptive SNPs 30 and 37 bp upstream of galK, respectively (‘T>A + C>T’ in blue) or its respective single SNPs (‘T>A’ in green, ‘C>T’ in cyan). Values are normalized to the number of reads with ancestral P0 for IS- and IS+ populations evolved in 0.01% galactose. The mean fraction of reads with any single SNP in galK is shown as a control (grey). Error bars represent the standard deviation of three replicates, each consisting of 32 pooled evolved populations. (Right panel) Read fractions of the same respective SNPs shown for a pool of all 96 IS+ and IS- populations evolved in the absence of galactose. (B) Mean read fractions as in (A) shown for three replicates of 32 pooled populations each, evolved in intermediate (0.1%) galactose.


Figure 5—figure supplement 1. Total number of sequencing reads for all replicates.


(A) Log plot of total read numbers showing contamination of P0 amplicons with P02 amplicons stemming from pooled samples of the 0.1% galactose populations of both promoter sequences (blue rectangles; see Methods).

We then compared the fraction of reads with the two adaptive SNPs in P0 previously known to confer increased galK expression (Figure 5A). While the fraction of reads carrying SNPs in galK is similar in all media, SNPs in P0 were more frequent in media containing galactose than in the control (Figure 5A – left and right panels) in agreement with strains adapting to galactose selection. Intriguingly, in low galactose, we found a higher fraction of reads carrying both adaptive single SNPs (–30T>A and –37C>T) in IS- populations than in the IS+ populations. This is consistent with our hypothesis that the more frequent amplification mutants effectively out-compete point mutations in the low demand environment.

Here, we use the fraction of sequencing reads (‘alleles’) with adaptive SNPs divided by the number of ancestral reads as a simple metric of divergence. However, this normalization leads to an underestimation of SNPs if they occur in an amplified background. For instance, a cell with four P0-galK copies, one of which carries an SNP, contributes a smaller SNP fraction than a cell with a single copy of P0-galK carrying the SNP. The rationale for using the fraction of adaptive alleles as our metric of divergence as opposed to the alternative, which is the number of SNPs per cell, is twofold: First, the methodology used here does not allow comparing absolute read counts between samples. Second, and more importantly, due to the random nature of deletion mutations, a single SNP in an amplified array of four copies has a one in four chance of being retained as a lasting divergent copy in the process of amplification and divergence. Hence, the dilution of SNPs by additional amplified copies is not simply a counting artefact, but reflects a biological reality relevant to the very process that we are studying. Therefore, we conclude that in the low demand environment a strain which cannot adapt by gene amplification exhibits a higher level of divergence than a strain which frequently adapts by gene amplification.
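
The dilution of SNPs in an amplified background can be made concrete with a small calculation. The Python sketch below is illustrative only. The function names and cell counts are hypothetical, and it assumes an idealized case in which sequencing reads are sampled in proportion to the number of P0-galK copies present in the pool.

```python
def allele_counts(cells):
    """Sum SNP-carrying and ancestral P0-galK copies over (total_copies, snp_copies) cells."""
    snp = sum(snp_copies for _, snp_copies in cells)
    ancestral = sum(total - snp_copies for total, snp_copies in cells)
    return snp, ancestral


def adaptive_fraction(cells):
    """Divergence metric described in the text: SNP alleles divided by ancestral alleles."""
    snp, ancestral = allele_counts(cells)
    if ancestral == 0:
        raise ValueError("no ancestral alleles to normalize against")
    return snp / ancestral


# Hypothetical pool: nine ancestral single-copy cells plus one mutant cell.
background = [(1, 0)] * 9
single_copy_mutant = background + [(1, 1)]  # one copy, carrying the SNP
amplified_mutant = background + [(4, 1)]    # four copies, one carrying the SNP

print(adaptive_fraction(single_copy_mutant))  # 1 SNP allele over 9 ancestral alleles
print(adaptive_fraction(amplified_mutant))    # 1 SNP allele over 12 ancestral alleles
```

In this toy pool the same single SNP yields a smaller fraction when it sits in a four-copy array, because the three unmutated copies are counted as ancestral alleles.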

Evolutionary dynamics between mutation types differ for different initial random promoter sequences

Given the paucity of point mutations that we observed for the evolution of the random P0 sequence (either a combination of –30T>A and –37C>T or each SNP alone), we wondered whether a greater variety of mutations could be obtained when using a different random promoter sequence as a starting point for evolution. Therefore, we repeated our evolution experiment in the intermediate (0.1%) galactose environment with three additional random promoter sequences (P0-1, P0-2, P0-3).

After 10 days of evolution, only two out of the four random P0 sequences evolved increased galK-yfp expression (Figure 6A). This is roughly consistent with the fact that approximately 60–80% of random sequences are one point mutation away from a functional constitutive promoter (Yona et al., 2018; Lagator et al., 2022). Interestingly, P0-1 and P0-3 did not gain any gene duplications or amplifications. At first glance, this drastic difference in gene amplification was unexpected, since the IS+ strains only differ in their P0 sequence, and not in their gene duplication rate. However, random sequences have different abilities to recruit RNA-polymerase, and as a result, different baseline expression levels (Yona et al., 2018; Lagator et al., 2022). Given that a plateau exists in the expression growth relation for low levels of expression (Figure 1B), the initial expression level conferred by P0-1 and P0-3 might be too low to yield a selective benefit upon gene duplication alone. According to this hypothesis, these random (non-)promoters are not only two (or more) point mutations away from a beneficial sequence, but also two (or more) copy-number mutations.
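
As a rough plausibility check of the "two out of four" observation, one can ask how likely such an outcome is under a simple binomial model. The Python sketch below is a back-of-the-envelope illustration, not an analysis from the study. It assumes that the four sequences are independent, that each has a probability p between 0.6 and 0.8 of being one point mutation away from a functional promoter, and, optimistically, that every such sequence would evolve increased expression within the experiment.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N_SEQUENCES = 4  # random promoter sequences tested
OBSERVED = 2     # sequences that evolved increased galK-yfp expression

for p in (0.6, 0.8):
    exactly = binomial_pmf(OBSERVED, N_SEQUENCES, p)
    at_most = sum(binomial_pmf(k, N_SEQUENCES, p) for k in range(OBSERVED + 1))
    print(f"p={p}: P(exactly 2 of 4)={exactly:.3f}, P(at most 2 of 4)={at_most:.3f}")
```

With only four sequences, an outcome of two successes is not unusual across this range of p, which fits the "roughly consistent" wording above. The model ignores the limited duration of the experiment and differences in mutation supply.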

Figure 6. Evolutionary dynamics for different random P0 sequences in 0.1% galactose.

(A) YFP versus CFP fluorescence normalized to the ancestral value of 96 populations of IS+ (black) and IS- (red) strain each harbouring a different random sequence upstream of galK (‘P0’, ‘P0-1’, ‘P0-2’, ‘P0-3’) grown in 0.1% galactose and without galactose (grey lines, control), respectively. Time points are indicated by the degree of shading. The number of populations for IS- (red) and IS+ (black) in the respective fractions are indicated. (B) YFP/CFP fluorescence to visualize increases in galK-YFP expression not caused by copy-number increases plotted for the duration of the evolution experiment for P0-2 populations of IS+ (left panel) and IS- (right panel). Here, gene amplifications (see Figure 6—figure supplement 1A) are visible as slight decrease in YFP/CFP relative to the 0% galactose control (grey), putative promoter mutations are visible as an increase in YFP/CFP. (C) Distribution of P0-2 mutants in IS+ and IS- populations after 12 days of evolution in 0.1% galactose. Mutations in P0-2 are exclusively found in populations with increased YFP and ancestral CFP fluorescence (YFP+). IS+ clones from all six YFP+ populations were sequenced, while IS- clones from a random subsample of 21 YFP+ populations were sequenced. (D) Mean normalized YFP fluorescence of reconstituted P0-2 mutants and the P0-2 ancestor strain (grey) grown in control medium (0% galactose). (E) Mean growth rate of reconstituted P0-2 mutants and the ancestor strain (grey) in 0.01% galactose, 0.1% galactose, and control medium (0% galactose). Error bars represent the standard deviation of four replicates.

Figure 6—source data 1. Contains an R script along with optical density and fluorescence intensity measurements to plot Figure 6A-B.
Figure 6—source data 2. Contains an R script along with optical density and fluorescence intensity measurements to plot Figure 6D-E.


Figure 6—figure supplement 1. Rapid amplification of IS+ populations with P0-2.


(A) CFP/OD600 as a proxy for copy-number plotted over the course of the evolution experiment for IS+ with P0-2 populations in 0.1% galactose and control populations in 0% galactose (grey). Green line indicates threshold to classify population as amplified (day 8 CFP/OD600 exceeds the mean ancestral CFP/OD600 by four standard deviations). (B) YFP/OD600 plotted versus CFP/OD600 of evolved IS+ populations with P0 (black) and P0-2 (blue) (data replotted from Figure 6A). (C) Flow cytometry measurement of YFP fluorescence intensity as a proxy for galK expression of IS- strains harbouring the four random promoter sequences as well as a P0 with adaptive SNPs as a comparison (‘H5’; indicated at the bottom of the figure), respectively, normalized to a strain without fluorescence marker. Error bars represent the standard deviation of three biological replicates. (D) End-point OD600 (‘yield’) of IS- populations carrying P0, P0-1, P0-2, and P0-3 after 24 hr of growth in 0.1% galactose (left panel) and in the absence of galactose (right panel). Boxes indicate the mean and standard deviation of 96 populations (left panel) and 12 populations (right panel), respectively. Asterisks indicate a significant difference between mean OD600 (two-sided t-test, p<0.0001).
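
The amplification threshold described in panel A (day 8 CFP/OD600 exceeding the ancestral mean by four standard deviations) can be sketched as follows. This Python example is illustrative and does not reproduce the R scripts provided as source data. The function names and measurement values are hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not specify which estimator was used.

```python
from statistics import mean, stdev


def amplification_threshold(ancestral_values, n_sd=4.0):
    """Mean ancestral CFP/OD600 plus n_sd sample standard deviations."""
    return mean(ancestral_values) + n_sd * stdev(ancestral_values)


def is_amplified(day8_value, ancestral_values, n_sd=4.0):
    """Classify a population as amplified if its day 8 CFP/OD600 exceeds the threshold."""
    return day8_value > amplification_threshold(ancestral_values, n_sd)


# Hypothetical CFP/OD600 values in arbitrary units.
ancestral = [100.0, 102.0, 98.0, 101.0, 99.0]
day8 = {"population_1": 110.0, "population_2": 105.0}

for name, value in day8.items():
    label = "amplified" if is_amplified(value, ancestral) else "not amplified"
    print(f"{name}: {label}")
```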

Copy-number and point mutations are mutually exclusive in the intermediate demand environment for P0-2

For P0, the evolution experiment in intermediate galactose reproduced our previous findings, namely a YFP+CFP+ (amplified) and a mixed (amplified with increased YFP) fraction for IS+ populations and a YFP+ fraction for IS- populations (compare Figure 6A with Figure 2B), which corresponds to an amplification of YFP, but not CFP (Table 2).

Table 2. Mutations of P0-2 underlying increased YFP fluorescence in IS+ and IS- populations evolved in 0.1% galactose.

IS+ clones IS- clones
P02-A11 –131_–144del P02-A7 –100C>T
P02-B10 –122_–134del P02-H12 –100C>T
P02-F4 –100C>T P02-C3 –100C>T
P02-F4 –100C>T, poor quality read P02-H9 –122_–134del
P02-F2 –122_–134del
P02-D1 –100C>T
P02-E2 –100C>T
P02-A1 Bigger band, maps to insD1 coding sequence
P02-E5 –41del
P02-C5 201 bp deletion leaving 20 bp of P02
P02-H5 201 bp deletion leaving 20 bp of P02
(seven different kinds of mutations)

For P0-2, the evolutionary dynamics differed from P0. In the IS+ strain, almost every single population evolved amplifications within the first 2 days of the evolution experiment (Figure 6B, Figure 6—figure supplement 1A). Moreover, only two fractions are visible in the YFP-CFP plots of P0-2. The first fraction is occupied by YFP+ populations carrying a single copy of cfp. The second fraction along the diagonal between YFP and CFP is occupied by amplified populations (YFP+CFP+). Moreover, it is shifted towards higher values of YFP/CFP relative to values found for P0 (Figure 6—figure supplement 1B), suggesting that P0-2 exhibits a higher baseline expression level than the other three random promoter sequences. In contrast to the population-level measurements, single-cell measurements were not sufficiently sensitive to corroborate any difference in leaky expression amongst the four random promoter sequences (Figure 6—figure supplement 1C). However, in line with the observed evolutionary dynamics, P0 and even more so P0-2 confer a significant growth advantage over the other two promoters (Figure 6—figure supplement 1D). As mentioned above, this suggests that the observed growth advantage of P0-2 populations can explain their rapid amplification dynamics. In agreement with the evolution experiments with P0, the YFP+CFP+ (amplification) fraction is also strongly reduced in the IS- strain for P0-2.

Intriguingly, with the majority (88/96) of P0-2 IS+ populations amplified, six P0-2 IS+ populations that failed to evolve amplifications show an increase in YFP/CFP early in the evolution experiment (Figure 6B – left panel, Figure 6—figure supplement 1A). This result combined with the idea that P0-2 exhibits a relatively high baseline expression level and the absence of a mixed fraction for P0-2 (Figure 6A) suggests that increases in gene expression evolve either via gene amplification or via point mutation. In other words, because initial galK expression is high in P0-2, a small improvement (either amplification or a promoter mutation) is sufficient to reach the required gene expression demand. Thus, the adaptive trajectory of P0-2 in intermediate galactose resembles that of P0 in low galactose as both environments select only for a modest improvement in galK expression.

In contrast to the IS+ strain, where only six populations showed increased YFP/CFP fluorescence that emerged only within the first 3 days of evolution, populations of the IS- strain were evolving increased YFP/CFP fluorescence throughout the experiment (Figure 6B – right panel). We were curious whether the increase in YFP/CFP in both IS+ and IS- populations was due to promoter mutations. Sequencing of randomly picked evolved clones revealed that the majority (4/6 for IS+, 11/21 for IS-) of clones with increased YFP/CFP indeed harboured a mutation in P0-2, including an SNP, a 12 bp deletion and a 13 bp deletion (Table 2; Figure 6C). Importantly, colonies of the same populations but with ancestral fluorescence harboured ancestral P0-2 sequences (Table 1), indicating that the observed mutations (Table 2) are causal for the increased YFP expression. While finding the causal mutations for the remaining evolved clones with increased YFP but ancestral P0-2 (Figure 6C) lies outside the scope of the current work, we speculate that they may occur further upstream of P0-2 or could be acting in trans such as mutations in the transcription factor rho (Steinrueck and Guet, 2017).

To confirm that the 12 bp deletion mutation, the 13 bp deletion mutation and the SNP were in fact adaptive, we reconstituted these mutations into the ancestral P0-2 strain, where they conferred increased YFP expression (Figure 6D) resulting in increased growth in medium supplemented with galactose (Figure 6E). The finding that the promoter mutations were responsible for increased galK-yfp expression was corroborated by the fact that these mutations occurred exclusively in populations with increased YFP but ancestral CFP, and were completely absent in amplified (YFP+CFP+) and ancestral colonies from a random set of 14 IS+ populations (Figure 6C). It is worth noting that mutations observed in P0-2 were more diverse than those observed in P0 (seven different mutations including indels, an IS insertion and an SNP in P0-2 versus three different SNPs in P0 – compare Tables 1 and 2). Thus, amplification can interfere with divergence not only by point mutations but also by small insertions and deletions.

Taken together, the facts that (i) the majority of IS+ populations become rapidly amplified, (ii) few promoter mutations arise, and these exclusively in the first day in non-amplified populations (mutations are mutually exclusive), and (iii) many more promoter mutations occur in IS- populations throughout the evolution experiment strongly suggest that negative epistasis between frequent copy-number mutations and point mutations hinders fixation of the latter.

Amplification hinders divergence by point mutations in the low demand environment

The experimental results we have presented thus far suggest that the evolutionary dynamics of duplication/amplification and divergence depend on the level of gene expression increase selected for (Figure 7). In both environments, promoter point mutations evolve at a low rate in a single-copy background. However, if rates of copy-number mutation are high, evolutionary dynamics are dominated by amplification. Irrespective of the environment, this amplification increases the mutational target size for rarer adaptive point mutations to occur. However, only if a strong increase in galK expression is selected for (high demand environment) do the beneficial effects of both types of mutation add up, and we observe a combination of amplifications and point mutations, in agreement with the IAD model (Bergthorsson et al., 2007; Näsvall et al., 2012; Andersson et al., 2015; Figure 7A).

Figure 7. Frequent copy-number mutation can hinder adaptation by point mutations.

Genotype-fitness map (‘fitness landscape’) illustrating the difference between adaptive trajectories of a high demand (A) and low demand (B) environment, which differ solely by the increase in gene expression they select for. The dashed line indicates the level of gene expression sufficient to reach maximal growth rate (‘fitness’) (see also Figure 1B). Right panels show the experimentally observed genotypes for each environment. (A) For an environment selecting for a large increase in gene expression (high demand), more than one adaptive mutation is necessary to reach maximal fitness. If copy-number mutations are frequent (as in the IS+ strain), adaptation by amplification is most likely (bold arrow). Alternatively, at a lower frequency, adaptation occurs via a point mutation in the promoter sequence (thin arrow). Due to an increased mutational target size, cells with gene amplifications are more likely to gain a beneficial point mutation than cells with a single copy of galK. Alternatively, rare promoter mutants can become amplified, in either case leading to the combination mutant observed in experiments. (B) For an environment selecting for only a modest increase in gene expression (low demand), maximal growth rate is attained either by gene amplification (more frequent, bold arrow) or by point mutations (less frequent, thin arrow). Therefore, combination mutants do not provide an additional fitness benefit and would only increase in frequency due to drift (horizontal faint dashed lines), not selection. Combination mutants are not observed in the experiment (right panel).


Figure 7—figure supplement 1. Amplification hindrance is consistent with negative epistasis under conditions of low gene expression demand.


(A) Maximal OD600 (‘yield’) as a function of different induction levels of galK expression in four different concentrations of galactose. Expression of a synthetic para-galK cassette is induced by the addition of arabinose (see Figure 1B for a plot of growth rate data). At the lowest level of galK expression, yield as a proxy for population size is similar across all environments. (B) OD600 of evolved strains is plotted over time for the evolution medium supplemented with three different galactose concentrations and glucose (control; indicated on top of figure panels). For each galactose concentration, two amplified strains (blue), which have evolved in this environment for 7 days, respectively, are compared to two strong promoter mutants (‘H5r’, ‘D8c’, green, see Tomanek et al., 2020) grown in the same environment. (C) Yield (max OD600) and (D) maximal growth rate of the data shown in A.

The IAD model assumes that amplification and point mutations only occur in the same genetic background. However, whether the two different types of mutation fix consecutively in the same genetic background or in different competing clones depends on the effective population size and the respective mutation rates (Gerrish and Lenski, 1998). High rates of duplication and amplification may cause clonal interference between competing mutants, slowing down the fixation of either. Moreover, there needs to be sufficient selective benefit (‘demand’) for two consecutive selective sweeps to occur. If, however, only a modest level of gene expression increase is selected for (low demand environment) (Figure 1B), a single mutational event may be sufficient to provide it. Therefore, adaptation is dominated by the more frequent type of mutation, namely copy-number mutation. In other words, amplifications effectively hinder divergence in the low demand environment due to their negative epistatic interaction with point mutations. Thus, in a process that we term amplification hindrance, the high rate of amplification results in evolutionary dynamics that slow down divergence via two different, non-mutually exclusive mechanisms: clonal interference and negative epistasis.

However, in our experiments mutation rates can be assumed to be equal across environments. Moreover, in the absence of galK expression (i.e. for the ancestral strain) population sizes are similar across different galactose concentrations (Figure 7—figure supplement 1A). Hence, clonal interference is an unlikely explanation for the absence of combination mutants in the low galactose environment. However, there is a difference in the degree to which strains that harbour amplifications fulfil the necessary gene expression demand posed by the environment they have evolved in. Strains with amplifications evolved in the high and intermediate galactose environments grow more slowly and to lower densities than a strain with a strong constitutive promoter. In contrast, in the low galactose concentration, strains with amplifications evolved in this environment exhibit both yield and growth rate comparable to that of the promoter mutant strain (Figure 7—figure supplement 1B–D).

These results suggest that gaining additional promoter point mutations on top of an amplification would only be beneficial in the higher galactose concentrations, but yield little or no fitness benefit in the low galactose environment. Therefore, under the experimental conditions presented here, gene expression demand – and hence negative epistasis – plays a major role in amplification hindrance.

Discussion

In this study, we investigated the interaction dynamics between two different types of mutations, adaptive copy-number and point mutations. While the process of gene duplication and divergence per se has been intensely studied since the pioneering work of Ohno more than half a century ago, no experiments have scrutinized the early phase of this process, where transient evolutionary changes may prevail. So far, the few existing experimental studies simply introduced mutations a priori without studying their formation dynamics (Dhar et al., 2014), while in silico studies used genomics to query the ‘archaeological’ results of millions of years of sequence evolution (Innan and Kondrashov, 2010).

Here, we used experimental evolution to investigate how the early adaptive dynamics of diverging promoter sequences is influenced by the rate of copy-number mutations as well as the level of expression increase selected for. We found that the spectrum of adaptive mutations differed drastically between environments selecting for different levels of expression of the same gene (Figures 1B, 3A and 6A). Combination mutants carrying both copy-number and promoter point mutations only evolved under conditions selecting for big increases in the levels of galK expression. In contrast, selection for only a modest increase in galK expression led to populations adapting by either gene amplifications or point mutations in their random promoter sequence, but not both simultaneously. Moreover, if amplification occurred early in the experiment, the random promoter sequence P0 did not diverge within the timespan of the experiment (Figure 4D). This phenomenon was even more pronounced for a second random promoter sequence, P0-2 (Figure 6B–C).

Moreover, comparing the number of point mutations between strains that differ solely in the rate of undergoing copy-number mutations in the galK locus, we found that under a low demand environment, a strain with a high duplication rate (IS+) diverged more slowly compared to a strain with low duplication rate (IS-).

Taken together, our results suggest that frequent gene amplification hinders the fixation of adaptive point mutations, most likely due to negative epistasis between these two different mutation types. While epistatic interactions can occur with any two adaptive mutations, copy-number mutations are unique, in that they are orders of magnitude more frequent than point mutations in bacteria (Roth, 1988; Drake et al., 1998; Andersson and Hughes, 2009; Elez et al., 2010; Reams and Roth, 2015) and in eukaryotes (Lynch et al., 2008; Lipinski et al., 2011; Schrider et al., 2013; Keith et al., 2016). This large difference in rates means that a competition between point and copy-number mutations is heavily skewed in favour of the latter (Figure 7B).

Unlike the phenomenon of clonal interference (which occurs between any two beneficial mutations even if their adaptive benefits are additive) (Gerrish and Lenski, 1998), negative epistasis does not slow down adaptation per se, as adaptation is agnostic to whether point or copy-number mutations lead to an improved phenotype. However, negative epistasis slows down divergence, because populations have already reached the fitness peak with an alternative kind of adaptive mutation. Negative epistasis between point and copy-number mutations can be expected to occur in any selective condition that requires only a relatively modest increase in a particular biological function, namely an increase in gene expression or enzyme activity by only a few-fold. Thus, amplification hindrance may not only be of general relevance for the evolution of gene expression in bacteria, but also for the evolution of promiscuous enzyme functions, which, analogous to a barely expressed gene, can be enhanced by either copy-number mutations or point mutations in the coding sequence.

While we found that amplification slows down divergence under conditions of negative epistasis, the consensus in the literature has been that copy-number mutations not only serve as a first step in the ‘relay race of adaptation’ (Yona et al., 2015), but that they also facilitate divergence, either indirectly by providing a first ‘crude’ adaptation to cope with a new environment until more refined adaptation occurs by point mutations, or directly by increasing the target size for point mutations (Andersson and Hughes, 2009; Elde et al., 2012; Yona et al., 2015; Cone et al., 2017; Bayer et al., 2018; Lauer et al., 2018; Todd and Selmecki, 2020). The intuitive idea that amplification speeds up divergence (Andersson et al., 1998) was originally developed as strong evidence against the adaptive mutagenesis hypothesis proposed by Cairns and others (Cairns et al., 1988; Cairns and Foster, 1991).

Based on this idea, various experimental studies interpreted observations of adaptation to dosage selection in the light of ‘amplification as a facilitator of divergence’ (Song et al., 2009; Pränting and Andersson, 2011; Elde et al., 2012; Näsvall et al., 2012; Yona et al., 2012; Yona et al., 2015; Cone et al., 2017; Bayer et al., 2018; Lauer et al., 2018; Todd and Selmecki, 2020). However, despite showing that adaptive amplification precedes divergence by point mutations, none of the studies provided a direct experimental test of the hypothesis that amplification causes increased rates of divergence. Experiments controlling for the rate of amplification were needed in order to dissect the ensuing evolutionary dynamics and establish causality.

All else being equal, more copies indeed mean more DNA targets for point mutations to occur (San Millan et al., 2017). However, as our experiments show, all else is not necessarily equal, and the evolutionary dynamics may differ strongly between an organism that can increase copy-number as an adaptation and an organism that cannot. Intriguingly, indications for more complex dynamics can be found in the existing literature (Yona et al., 2012; Lauer et al., 2018; Richts et al., 2021). One study showed that rapid adaptive gene amplification in yeast results in strong clonal interference between lineages (Lauer et al., 2018). A second study in yeast found that adaptation to an abrupt increase in temperature was dominated by rapid copy-number mutation, with SNPs occurring only much later (Yona et al., 2012; Yona et al., 2015). Lastly, in an experimental evolution study in Bacillus, adaptation was dominated by copy-number mutations and the authors noted the surprising lack of promoter mutations (Richts et al., 2021).

The transient dynamics of gene amplification allow tuning of gene expression on short evolutionary time scales in the absence of an evolved promoter (Tomanek et al., 2020). In principle, such transient evolutionary dynamics do not leave traces in the record of genomic sequence data on evolutionary time scales and, as such, their detailed study may not seem warranted. This is especially true in the context of duplication and divergence of paralogs, which is studied because abundant genomic sequence data are available (Kondrashov, 2012). Our present study proved this intuition wrong, as we uncovered a potentially long-lasting effect resulting from the transient dynamics associated with copy-number mutations: if adaptation by amplification is both the fastest route and sufficient, other, less frequent mutations may not have a chance to compete. While our evolution experiments were conducted under continuous selection, natural environments are often characterized by regimes of fluctuating selection. Due to the pleiotropic cost often associated with copy-number increases as well as their high rate of deletion, adaptive amplification returns to the ancestral single-copy state in the absence of selection (Andersson and Hughes, 2009; Reams et al., 2010). This means that once the selective benefit of the transient adaptation has ceased, no change at the level of genomic DNA remains (Roth, 1996). Therefore, the idea that gene amplifications act as a transient ‘regulatory state’ rather than a mutation (Roth, 1996; Tomanek et al., 2020) can be extended by an implication found here, namely that amplifications could effectively act as a buffer against long-lasting point mutations. In this view, amplification could repeatedly provide rapid adaptation to selection for increased gene expression, collapse back to the single-copy ancestral state once selection has subsided, and yet hinder sequence divergence each time it does so.
Thus, on sufficiently long time scales, the transient dynamics that play out before the fixation of mutations may ultimately shape entire genomes (Cvijović et al., 2018).

Amplification hindrance is in agreement with the observation that gene duplication and divergence is not a dominant force in the expansion of protein families in bacteria (Treangen and Rocha, 2011; Tria and Martin, 2021). Consequently, in all situations where rapid amplification provides sufficient adaptation, amplification hindrance could work as a mutational force that, in addition to purifying selection, acts to conserve existing genes and their expression level. While purifying selection affects deleterious alleles only, amplification hindrance, counterintuitively, prevents beneficial mutations from reaching fixation.

Methods

Bacterial strain construction

To construct the IS- strain, we replaced the second copy of IS1 downstream of the selection and reporter cassette in IT030 (Tomanek et al., 2020) with a kanamycin cassette using pSIM6-mediated recombineering (Datta et al., 2006). Recombinants were selected on 25 µg/ml kanamycin to ensure single-copy integration.

To generate the additional random promoter sequences P0-1, P0-2, and P0-3, we generated 189 nucleotides using the ‘Random DNA sequence generator’ (https://faculty.ucr.edu/~mmaduro/random.htm) with the same GC content as P0 (55%). We synthesized these three sequences as gBlocks (Integrated DNA Technologies, BVBA, Leuven, Belgium) with attached XmaI and XhoI restriction sites, which we used to clone P0-1, P0-2, and P0-3 into plasmid pMS6* (Tomanek et al., 2020) by replacing P0. We used pMS6* with the respective P0 sequence as a template to amplify the selection and reporter cassette and integrate it into MS022 (IS+) and IT049 (IS-) as described previously (Tomanek et al., 2020).

>P0

ACCGGAAAGACGGGCTTCAAAGCAACCTGACCACGGTTGCGCGTCCGTATCAAGATCCTCTTAATAAGCCCCCGTCACTGTTGGTTGTAGAGCCCAGGACGGGTTGGCCAGATGTGCGACTATATCGCTTAGTGGCTCTTGGGCCGCGGTGCGTTACCTTGCAGGAATTGAGGCCGTCCGTTAATTTCC.

>P0_1

GTAGGCCCGCACGCAAGACAAACTGCTGGGGAACCGCGTTTCCACGACCGGTGCACGATTTAACTTCGCCGACGTGACGACATTCCAGGCAGTGCCTCCGCCGCCGGACCCCCCTCGTGATCGGGTAGCTGGGCATGCCCTTGTGAGATATAACGAGAGCCTGCCTGTCTAATGATCTCACGGCGAAAG.

>P0_2

TCGGGGGGACAGCAGCGGCTGCAGACATTATACCGCAACAACACCAAGGTGAGATAACTCCGTAGTTGACTACGCGTCCCTCTAGGCCTTACTTGACCGGATACAGTGTCTTTGACACGTTTGTGGGCTACAGCAATCACATCCAAGGCTGGCTATGCACGAAGCAACTCTTGGGTGTTAGAATGTTGA.

>P0_3

CCCCTGTATTTGGGATGCGGGTAGTAGATGAGCGCAGGGACTCCGAGGTCAAGTACACCACCCTCTCGTAGGGGGCGTTCCAGATCACGTTACCACCATACCATTCGAGCATGGCACCATCTCCGCTGTGCCCATCCTGGTAGTCATCATCCCTATCACGCTTTCGAGTGTCTGGTGGCGGATATCCCC.
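The random promoter sequences listed above were produced with a web tool whose algorithm the text does not describe. As a rough illustration only, the following Python sketch draws a 189-nucleotide sequence with an expected GC content of 55%, assuming each base is sampled independently; the function names `random_dna` and `gc_content` are hypothetical, and the realized GC fraction of any single draw will scatter around the target.

```python
import random

def random_dna(length, gc_fraction, rng):
    """Draw each base independently: G or C with probability gc_fraction, else A or T."""
    bases = []
    for _ in range(length):
        if rng.random() < gc_fraction:
            bases.append(rng.choice("GC"))
        else:
            bases.append(rng.choice("AT"))
    return "".join(bases)

def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

rng = random.Random(0)  # fixed seed only so that the sketch is reproducible
seq = random_dna(189, 0.55, rng)
print(len(seq), round(gc_content(seq), 2))
```

The same `gc_content` helper can be applied to the listed P0 sequences (after removing the trailing period) to check their composition.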

Reconstitution of P0-2 mutants in the ancestral strain

The reconstituted P0-2 mutant strains were obtained using pSIM6-mediated oligo recombineering (Sawitzke et al., 2011) of the ancestral strain and selecting recombinants on M9 0.1% galactose agar. The sequences of the oligonucleotides used are listed below. Successful recombinants were confirmed by Sanger sequencing of P0-2. Amongst the recombinants transformed with the –122_–134del construct, we also recovered one colony with higher YFP fluorescence intensity than the other recombinants. Sequencing showed a single-nucleotide deletion (–118del) in addition to the –122_–134del created by recombineering. Fluorescence and growth rate of this serendipitously obtained mutant are shown in Figure 6D–E along with the three intended mutants.

>A11 oligo (–131_–144del)

ACCGCAACAACACCAAGGTGAGATAACTCCGTAGTTGACTGGCCTTACTTGACCGGATACAGTGTCTTTGACACGTTTGTGGG.

>H12 oligo (–100C>T)

CTAGGCCTTACTTGACCGGATACAGTGTCTTTGATACGTTTGTGGGCTACAGCAATCACATCCAAGGCTG.

>F2 oligo (–122_–134del)

CAACACCAAGGTGAGATAACTCCGTAGTTGACTACGCGTCCCTTGACCGGATACAGTGTCTTTGACACGTTTGTGGGCTACAGCA.

List of strains used

| Strain name | Genotype | Purpose | Source |
| --- | --- | --- | --- |
| MG1655 | F- λ- ilvG- rfb-50 rph-1 | Strain background for all evolution experiments | Lab collection |
| IT013-TCD | BW27784, JA23100::galP, mglBAC::FRT, galK::FRT, locus1::pBAD-galK | Strain with pBAD-galK for testing expression-growth relation | Tomanek et al., 2020 |
| BW25142 | lacIq rrnB3 ∆lacZ4787 hsdR514 ∆(araBAD)567 ∆(rhaBAD)568 ∆phoBR580 rph-1 galU95 ∆endA9 uidA(∆MluI)::pir-116 recA1 | Host for pir plasmid pMS6* | Khlebnikov et al., 2001 |
| MS022 | MG1655, JA23100::galP, mglBAC::FRT, galK::FRT | IS+ background for ancestor strain construction | Lab collection |
| IT030 | MS022 locus2::P0-RBS-galK-RBS-yfp-FRT-pR-cfp | IS+ ancestor strain | Tomanek et al., 2020 |
| IT049 | MS022 deleted for IS1C | IS- background for ancestor strain construction | This study |
| IT049-P0 | IT049 locus2::P0-RBS-galK-RBS-yfp-FRT-pR-cfp | IS- ancestor strain P0 | This study |
| IT049-P0-1 | IT049 locus2::P0-1-RBS-galK-RBS-yfp-FRT-pR-cfp | IS- ancestor strain P0-1 | This study |
| IT049-P0-2 | IT049 locus2::P0-2-RBS-galK-RBS-yfp-FRT-pR-cfp | IS- ancestor strain P0-2 | This study |
| IT049-P0-3 | IT049 locus2::P0-3-RBS-galK-RBS-yfp-FRT-pR-cfp | IS- ancestor strain P0-3 | This study |
| MS022-P0 | MS022 locus2::P0-RBS-galK-RBS-yfp-FRT-pR-cfp | IS+ ancestor strain P0 | This study |
| MS022-P0-1 | MS022 locus2::P0-1-RBS-galK-RBS-yfp-FRT-pR-cfp | IS+ ancestor strain P0-1 | This study |
| MS022-P0-2 | MS022 locus2::P0-2-RBS-galK-RBS-yfp-FRT-pR-cfp | IS+ ancestor strain P0-2 | This study |
| MS022-P0-3 | MS022 locus2::P0-3-RBS-galK-RBS-yfp-FRT-pR-cfp | IS+ ancestor strain P0-3 | This study |
| IT030-H5r | MS022 locus2::pconst-RBS-galK-RBS-yfp-FRT-pR-cfp | Strain with constitutive galK expression conferred by two SNPs in P0 | Tomanek et al., 2020 |
| IT030-D8c | MS022 locus2::pconst-RBS-galK-RBS-yfp-FRT-pR-cfp | Strain with constitutive galK expression conferred by one SNP in P0 | Tomanek et al., 2020 |

List of primers used

| Name | Sequence | Purpose |
| --- | --- | --- |
| E_flank_f | GCTGGAGCCACTTGTAGCC | cassette integration test locus 2, sequencing P0s |
| E_flank_r | TCCTTGCTGAATCATTTTGTTC | cassette integration test locus 2 |
| P0_check_Fw | GTGTGAGTGGCAGGGTAG | sequencing P0s (together with E_flank_f) |
| qPCR_galK_Fw | GCTACCCTGCCACTCACA | estimating galK copy number |
| qPCR_galK_Rv | CGCAGGGCAGAACGAAAC | estimating galK copy number |
| rbsB_qPCR_Fw | GGCACAAAAATTCTGCTGATTAA | qPCR control locus |
| rbsB_qPCR_Rv | GCAGCTCGATAACTTTGGC | qPCR control locus |
| P1_P0-1 | GCCTTAGTTGTAAGTGTCTACCATGTCCCCGAACAAGTGTTCACTATGTCTAGGCCCGCACGCAAGAC | integration of the selection and reporter cassette with P0-1 (Fw primer) |
| P1_P0-2 | GCCTTAGTTGTAAGTGTCTACCATGTCCCCGAACAAGTGTTCACTATGTCTCGGGGGGACAGCAGCG | integration of the selection and reporter cassette with P0-2 (Fw primer) |
| P1_P0-3 | GCCTTAGTTGTAAGTGTCTACCATGTCCCCGAACAAGTGTTCACTATGTCTGTATTTGGGATGCGGGTAGTAGA | integration of the selection and reporter cassette with P0-3 (Fw primer) |
| E_int_Rv | TCGGAAGGGAAGAGGGAGTGCGGGAAATTTAAGCTGGATCACATATTGCCGAGGCCTTATGCTAGCTTC | integration of the selection and reporter cassette (Rv primer) |
| E_int_Fw | GCCTTAGTTGTAAGTGTCTACCATGTCCCCGAACAAGTGTTCACTATGTCACCGGAAAGACGGGCTTC | integration of the selection and reporter cassette with P0 (Fw primer) |
| deep_seq_Fw | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGACGGGTTCTTATGCCTTAGTT | 1st step PCR for amplicon deep sequencing (with 5′ Nextera anchor for Illumina sequencing) |
| deep_seq_Rv | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGTGTGAGTGGCAGGGTAG | 1st step PCR for amplicon deep sequencing (with 5′ Nextera anchor for Illumina sequencing) |

Culture conditions

Bacterial strains were grown at 37°C. All evolution experiments, as well as growth experiments with the purpose of measuring OD600 and fluorescence, were conducted in M9 medium supplemented with 2 mM MgSO4, 0.1 mM CaCl2, 0.1% casaminoacids (‘evolution medium’), and carbon source (galactose, glucose, or glycerol) at the concentration indicated in the respective figures (Sigma-Aldrich, St Louis, MO), with the exception of Figure 2—figure supplement 2B, where bacteria were grown in M9 medium without casaminoacids (carbon sources as indicated in the figure).

Evolution experiments

Evolution experiments were inoculated with ancestral colonies of IS+ and IS- strains grown overnight in 3 ml of LB medium, following two washing steps in M9 medium without carbon source (M9 buffer) and a 1:200 dilution.

Bacterial cultures were grown in 200 µl liquid evolution medium with the indicated galactose concentrations in clear flat-bottom 96-well plates and shaken in a Titramax plateshaker at 750 rpm (Heidolph, Schwabach, Germany), allowing for a total population size of ~10^8 colony forming units for the ancestral strain. Every day, populations were transferred to fresh plates using a VP408 pin replicator (V&P SCIENTIFIC, Inc, San Diego, CA), resulting in a dilution of ~1:820 (Steinrueck and Guet, 2017), corresponding to ~10 generations. Immediately after the transfer, growth and fluorescence measurements were performed in the overnight plates using a Biotek H1 plate reader (Biotek, Winooski, VT). Thus, population phenotypes were measured every 10 generations.
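The correspondence between the ~1:820 daily dilution and ~10 generations follows from counting doublings. The short Python sketch below shows the arithmetic, under the assumption that each population regrows to its pre-transfer density before the next transfer; the function name is hypothetical.

```python
import math

def generations_per_transfer(dilution_factor):
    """Doublings needed to regrow to the pre-dilution density after a 1:dilution_factor transfer."""
    return math.log2(dilution_factor)

print(round(generations_per_transfer(820), 2))  # 9.68
```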

Growth rate measurements in liquid cultures

To measure the growth rate in a 2D gradient of arabinose and galactose (Figure 1B), an overnight culture of strain IT013 (Tomanek et al., 2020) grown in M9 supplemented with 1% glycerol and 0.1% casaminoacids was diluted 1:200 into 96-well plates containing 200 µl of M9 supplemented with 0.1% casaminoacids, with concentrations of galactose and the inducer arabinose as indicated in Figure 1B. For the full duration of the experiment, cultures were grown in the plate reader with continuous orbital shaking, and OD600 and fluorescence were measured at 10 min intervals.

Growth rate was calculated using a custom R script. Briefly, the script applies a linear model (base R function lm()) to a 20-datapoint sliding window of log(OD600) as a function of time. The script then outputs the steepest slope (maximal growth rate) amongst all possible sliding windows (Figure 1—source data 1). The growth rates plotted in Figure 6E and Figure 2—figure supplement 2B were obtained in the same manner (see Figure 6—source data 2 and Figure 2—figure supplement 2—source data 1), with strains and carbon sources as indicated in the respective figures.
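The sliding-window procedure described above can be sketched as follows. This is a Python illustration, not the authors' R script: it fits an ordinary least-squares line to log(OD600) against time in every window of 20 consecutive points and returns the steepest slope. The natural logarithm, the function names, and the synthetic growth curve are assumptions made for the example.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y on x, as reported by a simple linear model."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def max_growth_rate(times, od, window=20):
    """Steepest slope of log(OD) versus time over all windows of `window` points."""
    log_od = [math.log(v) for v in od]
    slopes = [
        ols_slope(times[i:i + window], log_od[i:i + window])
        for i in range(len(times) - window + 1)
    ]
    return max(slopes)

# Synthetic curve: a flat lag phase of 2 h, then exponential growth at 0.6 per hour,
# sampled every 10 min (times in hours).
times = [i / 6 for i in range(60)]
od = [0.01 if t < 2 else 0.01 * math.exp(0.6 * (t - 2)) for t in times]
print(round(max_growth_rate(times, od), 3))
```

On this synthetic curve the function recovers a rate of about 0.6 per hour, because windows lying entirely within the exponential phase have that slope. Real plate-reader data would typically need blank subtraction first, and a series shorter than the window raises an error in this sketch.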

Flow cytometry experiments

Frozen evolved populations (–80°C, 15% glycerol) from day 4, day 8, or day 12 (as indicated in the figures) were pinned (1:820) into M9 buffer and kept on ice until the measurement. Fluorescence was measured using a BD FACSCanto II system (BD Biosciences, San Jose, CA) equipped with FACSDiva software. CFP fluorescence was collected with a 450/50 nm bandpass filter by exciting with a 405 nm laser. YFP fluorescence was collected with a 510/50 nm bandpass filter by exciting with a 488 nm laser. The bacterial population was gated on the FSC and SSC signals, resulting in approximately 6000 events analysed per sample out of 10,000 recorded events.

Quantitative real-time PCR

For qPCR, gDNA was isolated from overnight cultures grown in the respective evolution medium inoculated by single evolved colonies using Wizard Genomic DNA purification kit (Promega, Madison, WI). We performed qPCR using Promega qPCR 2× Mastermix (Promega, Madison, WI) and a C1000 instrument (Bio-Rad, Hercules, CA). To quantify the copy-number of samples of an evolving population, we designed one primer pair within galK (target) and one primer pair within rbsB as a reference, which lies outside the amplified region. We compared the ratio of the target and the reference loci to the ratio of the same two loci in the single-copy control. Using dilution series of one of the gDNA extracts as template, we calculated the efficiency of primer pairs and quantified the copy-number of galK in each sample employing the Pfaffl method, which takes amplification efficiency into account (Pfaffl, 2001). qPCR was performed in three technical replicates.
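As an illustration of the Pfaffl calculation referred to above, the following Python sketch computes the relative abundance of a target locus in a sample versus a control, normalised to a reference locus. It uses the standard relation between the slope of a dilution-series standard curve (Ct against log10 template amount) and the amplification efficiency E. All Ct values and function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def efficiency_from_slope(slope):
    """Amplification efficiency E from the slope of Ct versus log10(template amount)."""
    return 10 ** (-1.0 / slope)

def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Relative target abundance (sample vs. control), normalised to the reference locus."""
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)

# Hypothetical values: perfect doubling per cycle (E = 2) for both primer pairs,
# target detected two cycles earlier in the sample, reference unchanged.
ratio = pfaffl_ratio(2.0, 20.0, 18.0, 2.0, 19.0, 19.0)
print(ratio)  # 4.0
```

When the control is a single-copy strain, this ratio can be read as an estimate of the copy-number of the target locus in the sample.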

Measurement of colony fluorescence

Evolving populations were pinned onto LB agar supplemented with 1% charcoal and imaged using a macroscope setup (https://openwetware.org/wiki/Macroscope) (Chait et al., 2010). To obtain median colony YFP and CFP fluorescence intensity, a region of interest was determined using the ImageJ plugin ‘Analyze Particles’ (settings: 200px-infinity, 0.5–1.0 roundness) to identify colonies on 16-bit images with threshold adjusted according to the default value. The region of interest including all colonies was then used to measure intensity and plotted using a custom R script (Figure 4—source data 1).

Amplicon deep sequencing of P0

Frozen samples of evolved populations were diluted 1:10 into 100 µl of LB and grown for 5 hr (37°C, shaking) to increase cell numbers prior to DNA extractions. Columns 1–4 (populations A1, B1, C1, …, F4, G4, H4), 5–8 (populations A5, B5, C5, …, F8, G8, H8), and 9–12 (populations A9, B9, C9, …, F12, G12, H12) of each 96-well plate were pooled prior to DNA extraction using Wizard Genomic DNA purification kit (Promega, Madison, WI). The P0 region including the beginning of galK was amplified for 25 PCR cycles using primers deep_seq_Fw and deep_seq_Rv carrying 5′ adaptors for Illumina sequencing. In parallel, PCRs were performed for 35 cycles to confirm bands on a gel. Illumina sequencing was carried out by Microsynth (Balgach, Switzerland).
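The pooling scheme described above groups the 96 populations of a plate into three pools of four columns each. A minimal Python sketch of that well-to-pool mapping is given below; the function name and the zero-based pool indices are illustrative choices, not part of the original protocol.

```python
ROWS = "ABCDEFGH"

def pool_index(well):
    """Return 0, 1 or 2 for a well in columns 1-4, 5-8 or 9-12 of a 96-well plate."""
    column = int(well[1:])
    if not 1 <= column <= 12:
        raise ValueError(f"column out of range: {well}")
    return (column - 1) // 4

# Enumerate wells column by column (A1, B1, ..., H1, A2, ...) and sort them into pools.
pools = {0: [], 1: [], 2: []}
for col in range(1, 13):
    for row in ROWS:
        well = f"{row}{col}"
        pools[pool_index(well)].append(well)

print([len(p) for p in pools.values()])  # [32, 32, 32]
```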

We note that our amplicon libraries of P0 were contaminated with reads carrying the sequence of P0-2, which we had prepared for sequencing in parallel (Figure 5—figure supplement 1). We therefore excluded all reads of P0-2 for our analysis of P0 and do not report the result of the P0-2-specific samples as they could not be trusted.

Reads of P0 were analysed using a custom R script. Briefly, we defined four sequence motifs, each 39 bp long, which represented the ancestral P0 sequence and the same region with known adaptive SNPs (–30T>A, –37C>T or both). We calculated the fraction of reads with an evolved versus ancestral 39 bp motif in all samples, including those of control populations evolved in the absence of galactose. We also calculated the fraction of reads carrying a 39 bp ancestral galK sequence motif with any single SNP versus those with the same 39 bp motif of the ancestral galK sequence.
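The motif-counting step can be sketched in Python as follows. This is not the authors' R script: it assumes exact substring matching on reads in a single orientation, and it uses short toy motifs in place of the real 39 bp motifs, which are not given in the text. The function name, motif names, and reads are all hypothetical.

```python
def motif_fractions(reads, motifs):
    """Share of each motif among all reads that contain one of the motifs."""
    counts = {
        name: sum(motif in read for read in reads)
        for name, motif in motifs.items()
    }
    matched = sum(counts.values())
    if matched == 0:
        return {name: 0.0 for name in motifs}
    return {name: n / matched for name, n in counts.items()}

# Toy 8 bp motifs standing in for the 39 bp ancestral and SNP-carrying motifs.
motifs = {"ancestral": "ACGTACGT", "snp_variant": "ACGAACGT"}
reads = [
    "TTACGTACGTGG",  # ancestral motif
    "CCACGAACGTAA",  # variant motif
    "GGACGTACGTCC",  # ancestral motif
    "TTTTTTTTTTTT",  # no motif, ignored in the denominator
]
fractions = motif_fractions(reads, motifs)
print(fractions)
```

Reads matching no motif are left out of the denominator here, so the result is the evolved versus ancestral share among informative reads; a different denominator (for example all reads) may be more appropriate depending on the question.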

Acknowledgements

We are grateful to N Barton, F Kondrashov, M Lagator, M Pleska, R Roemhild, D Siekhaus, and G Tkacik for input on the manuscript and to K Tomasek for help with flow cytometry.

Funding Statement

No external funding was received for this work.

Contributor Information

Călin C Guet, Email: calin@ist.ac.at.

Sergey Kryazhimskiy, University of California, San Diego, United States.

Molly Przeworski, Columbia University, United States.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Resources, Supervision, Funding acquisition, Project administration, Writing – review and editing.

Additional files

MDAR checklist

Data availability

Source Data and R scripts to generate the plots shown in the Figures are uploaded as the respective source code files. Flow cytometry and Illumina sequencing data are uploaded on Dryad together with R scripts to generate the plots shown in the respective Figures (Flow cytometry data: Figure 2C, 3C-D and Figure Supplements, 4A; Illumina sequencing data: Figure 5 and Figure Supplement).

The following dataset was generated:

Tomanek I, Guet C. 2022. Flow cytometry YFP and CFP data and deep sequencing data of populations evolving in galactose. Dryad Digital Repository.

References

  1. Aharoni A, Gaidukov L, Khersonsky O, McQ Gould S, Roodveldt C, Tawfik DS. The “ evolvability ” of promiscuous protein functions. Nature Genetics. 2005;37:73–76. doi: 10.1038/ng1482. [DOI] [PubMed] [Google Scholar]
  2. Albertson DG. Gene amplification in cancer. Trends in Genetics. 2006;22:447–455. doi: 10.1016/j.tig.2006.06.007. [DOI] [PubMed] [Google Scholar]
  3. Anderson RP, Roth JR. Tandem genetic duplications in phage and bacteria. Annual Review of Microbiology. 1977;31:473–505. doi: 10.1146/annurev.mi.31.100177.002353. [DOI] [PubMed] [Google Scholar]
  4. Andersson DI, Slechta ES, Roth JR. Evidence that gene amplification underlies adaptive mutability of the bacterial lac operon. Science. 1998;282:1133–1135. doi: 10.1126/science.282.5391.1133. [DOI] [PubMed] [Google Scholar]
  5. Andersson DI, Hughes D. Gene amplification and adaptive evolution in bacteria. Annual Review of Genetics. 2009;43:167–195. doi: 10.1146/annurev-genet-102108-134805. [DOI] [PubMed] [Google Scholar]
  6. Andersson DI, Jerlström-Hultqvist J, Näsvall J. Evolution of new functions de novo and from preexisting genes. Cold Spring Harbor Perspectives in Biology. 2015;7:1–19. doi: 10.1101/cshperspect.a017996. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Bass C, Field LM. Gene amplification and insecticide resistance. Pest Management Science. 2011;67:886–890. doi: 10.1002/ps.2189. [DOI] [PubMed] [Google Scholar]
  8. Bayer A, Brennan G, Geballe AP. Adaptation by copy number variation in monopartite viruses. Current Opinion in Virology. 2018;33:7–12. doi: 10.1016/j.coviro.2018.07.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Belikova D, Jochim A, Power J, Holden MTG, Heilbronner S. “Gene accordions” cause genotypic and phenotypic heterogeneity in clonal populations of Staphylococcus aureus. Nature Communications. 2020;11:3526. doi: 10.1038/s41467-020-17277-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bergthorsson U, Andersson DI, Roth JR. Ohno’s dilemma: evolution of new genes under continuous selection. PNAS. 2007;104:17004–17009. doi: 10.1073/pnas.0707158104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Blount ZD, Maddamsetti R, Grant NA, Ahmed ST, Jagdish T, Baxter JA, Sommerfeld BA, Tillman A, Moore J, Slonczewski JL, Barrick JE, Lenski RE. Genomic and phenotypic evolution of Escherichia coli in a novel citrate-only resource environment. eLife. 2020;9:e55414. doi: 10.7554/eLife.55414. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Cairns J, Overbaugh J, Miller S. The origin of mutants. Nature. 1988;335:142–145. doi: 10.1038/335142a0. [DOI] [PubMed] [Google Scholar]
  13. Cairns J, Foster PL. Adaptive reversion of a frameshift mutation in Escherichia coli. Genetics. 1991;128:695–701. doi: 10.1093/genetics/128.4.695. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Chait R, Shrestha S, Shah AK, Michel J-B, Kishony R. A differential drug screen for compounds that select against antibiotic resistance. PLOS ONE. 2010;5:e15179. doi: 10.1371/journal.pone.0015179. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Conant GC, Wolfe KH. Turning a hobby into a job: how duplicated genes find new functions. Nature Reviews. Genetics. 2008;9:938–950. doi: 10.1038/nrg2482. [DOI] [PubMed] [Google Scholar]
  16. Cone KR, Kronenberg ZN, Yandell M, Elde NC. Emergence of a viral RNA polymerase variant during gene copy number amplification promotes rapid evolution of vaccinia virus. Journal of Virology. 2017;91:e01428-16. doi: 10.1128/JVI.01428-16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Copley SD. Shining a light on enzyme promiscuity. Current Opinion in Structural Biology. 2017;47:167–175. doi: 10.1016/j.sbi.2017.11.001. [DOI] [PubMed] [Google Scholar]
  18. Copley SD. Evolution of new enzymes by gene duplication and divergence. The FEBS Journal. 2020;287:1262–1283. doi: 10.1111/febs.15299. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Cvijović I, Nguyen Ba AN, Desai MM. Experimental studies of evolutionary dynamics in microbes. Trends in Genetics. 2018;34:693–703. doi: 10.1016/j.tig.2018.06.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Darmon E, Leach DRF. Bacterial genome instability. Microbiology and Molecular Biology Reviews. 2014;78:1–39. doi: 10.1128/MMBR.00035-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Datta S, Costantino N, Court DL. A set of recombineering plasmids for gram-negative bacteria. Gene. 2006;379:109–115. doi: 10.1016/j.gene.2006.04.018. [DOI] [PubMed] [Google Scholar]
  22. Dhar R, Bergmiller T, Wagner A. Increased gene dosage plays a predominant role in the initial stages of evolution of duplicate TEM-1 beta lactamase genes. Evolution; International Journal of Organic Evolution. 2014;68:1775–1791. doi: 10.1111/evo.12373. [DOI] [PubMed] [Google Scholar]
  23. Drake JW, Charlesworth B, Charlesworth D, Crow JF. Rates of spontaneous mutation. Genetics. 1998;148:1667–1686. doi: 10.1093/genetics/148.4.1667. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Elde NC, Child SJ, Eickbush MT, Kitzman JO, Rogers KS, Shendure J, Geballe AP, Malik HS. Poxviruses deploy genomic accordions to adapt rapidly against host antiviral defenses. Cell. 2012;150:831–841. doi: 10.1016/j.cell.2012.05.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Elez M, Murray AW, Bi L-J, Zhang X-E, Matic I, Radman M. Seeing mutations in living cells. Current Biology. 2010;20:1432–1437. doi: 10.1016/j.cub.2010.06.071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Friedlander T, Prizak R, Barton NH, Tkačik G. Evolution of new regulatory functions on biophysically realistic fitness landscapes. Nature Communications. 2017;8:216. doi: 10.1038/s41467-017-00238-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population. Genetica. 1998;102:127–144. [PubMed] [Google Scholar]
  28. Goldberg I, Mekalanos JJ. Effect of a RecA mutation on cholera toxin gene amplification and deletion events. Journal of Bacteriology. 1986;165:723–731. doi: 10.1128/jb.165.3.723-731.1986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Gruber JD, Vogel K, Kalay G, Wittkopp PJ. Contrasting properties of gene-specific regulatory, coding, and copy number mutations in Saccharomyces cerevisiae: frequency, effects, and dominance. PLOS Genetics. 2012;8:e1002497. doi: 10.1371/journal.pgen.1002497. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Innan H, Kondrashov F. The evolution of gene duplications: classifying and distinguishing between models. Nature Reviews. Genetics. 2010;11:97–108. doi: 10.1038/nrg2689. [DOI] [PubMed] [Google Scholar]
  31. Kacser H, Beeby R. Evolution of catalytic proteins or on the origin of enzyme species by means of natural selection. Journal of Molecular Evolution. 1984;20:38–51. doi: 10.1007/BF02101984. [DOI] [PubMed] [Google Scholar]
  32. Keith N, Tucker AE, Jackson CE, Sung W, Lucas Lledó JI, Schrider DR, Schaack S, Dudycha JL, Ackerman M, Younge AJ, Shaw JR, Lynch M. High mutational rates of large-scale duplication and deletion in Daphnia pulex. Genome Research. 2016;26:60–69. doi: 10.1101/gr.191338.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Khlebnikov A, Datsenko KA, Skaug T, Wanner BL, Keasling JD. Homogeneous expression of the p(bad) promoter in Escherichia coli by constitutive expression of the low-affinity high-capacity arae transporter. Microbiology. 2001;147:3241–3247. doi: 10.1099/00221287-147-12-3241. [DOI] [PubMed] [Google Scholar]
  34. Kondrashov FA. Gene duplication as a mechanism of genomic adaptation to a changing environment. Proceedings. Biological Sciences. 2012;279:5048–5057. doi: 10.1098/rspb.2012.1108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Lagator M, Sarikas S, Steinrueck M, Toledo-Aparicio D, Bollback JP, Guet CC, Tkačik G. Predicting bacterial promoter function and evolution from random sequences. eLife. 2022;11:e43. doi: 10.7554/eLife.64543. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Lauer S, Avecilla G, Spealman P, Sethia G, Brandt N, Levy SF, Gresham D. Single-Cell copy number variant detection reveals the dynamics and diversity of adaptation. PLOS Biology. 2018;16:e3000069. doi: 10.1371/journal.pbio.3000069. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Lauer S, Gresham D. An evolving view of copy number variants. Current Genetics. 2019;65:1287–1295. doi: 10.1007/s00294-019-00980-0. [DOI] [PubMed] [Google Scholar]
  38. Lipinski KJ, Farslow JC, Fitzpatrick KA, Lynch M, Katju V, Bergthorsson U. High spontaneous rate of gene duplication in Caenorhabditis elegans. Current Biology. 2011;21:306–310. doi: 10.1016/j.cub.2011.01.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151–1155. doi: 10.1126/science.290.5494.1151. [DOI] [PubMed] [Google Scholar]
  40. Lynch M, Sung W, Morris K, Coffey N, Landry CR, Dopman EB, Dickinson WJ, Okamoto K, Kulkarni S, Hartl DL, Thomas WK. A genome-wide view of the spectrum of spontaneous mutations in yeast. PNAS. 2008;105:9272–9277. doi: 10.1073/pnas.0803466105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Näsvall J, Sun L, Roth JR, Andersson DI. Real-time evolution of new genes by innovation, amplification, and divergence. Science. 2012;338:384–387. doi: 10.1126/science.1226521. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Nicoloff H, Hjort K, Levin BR, Andersson DI. The high prevalence of antibiotic heteroresistance in pathogenic bacteria is mainly caused by gene amplification. Nature Microbiology. 2019;4:504–514. doi: 10.1038/s41564-018-0342-0. [DOI] [PubMed] [Google Scholar]
  43. Ohno S. Evolution by Gene Duplication. Springer link; 1970. [DOI] [Google Scholar]
  44. Pettersson ME, Sun S, Andersson DI, Berg OG. Evolution of new gene functions: simulation and analysis of the amplification model. Genetica. 2009;135:309–324. doi: 10.1007/s10709-008-9289-z. [DOI] [PubMed] [Google Scholar]
  45. Pfaffl MW. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Research. 2001;29:45e–445. doi: 10.1093/nar/29.9.e45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Pränting M, Andersson DI. Escape from growth restriction in small colony variants of salmonella typhimurium by gene amplification and mutation. Molecular Microbiology. 2011;79:305–315. doi: 10.1111/j.1365-2958.2010.07458.x. [DOI] [PubMed] [Google Scholar]
  47. Prody CA, Dreyfus P, Zamir R, Zakut H, Soreq H. De novo amplification within a “silent” human cholinesterase gene in a family subjected to prolonged exposure to organophosphorous insecticides. PNAS. 1989;86:690–694. doi: 10.1073/pnas.86.2.690. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Reams AB, Kofoid E, Savageau M, Roth JR. Duplication frequency in a population of Salmonella enterica rapidly approaches steady state with or without recombination. Genetics. 2010;184:1077–1094. doi: 10.1534/genetics.109.111963. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Reams AB, Roth JR. Mechanisms of gene duplication and amplification. Cold Spring Harbor Perspectives in Biology. 2015;7:a016592. doi: 10.1101/cshperspect.a016592. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Richts B, Lentes S, Poehlein A, Daniel R, Commichau FM. A Bacillus subtilis ΔpdxT mutant suppresses vitamin B6 limitation by acquiring mutations enhancing pdxS gene dosage and ammonium assimilation. Environmental Microbiology Reports. 2021;13:218–233. doi: 10.1111/1758-2229.12936. [DOI] [PubMed] [Google Scholar]
  51. Roth JR. Rearrangements of the bacterial chromosome: formation and applications. Science. 1988;241:1314–1318. doi: 10.1126/science.3045970. [DOI] [PubMed] [Google Scholar]
  52. Roth JR. In: Escherichia coli and Salmonella: Cellular and Molecular Biology. 2nd edn. Neidhardt FC, editor. Washington, D.C: American Society for Microbiology; 1996. Rearrangements of the bacterial chromosome: formation and applications; pp. 2256–2276. [Google Scholar]
  53. San Millan A, Escudero JA, Gifford DR, Mazel D, MacLean RC. Multicopy plasmids potentiate the evolution of antibiotic resistance in bacteria. Nature Ecology & Evolution. 2017;1:10. doi: 10.1038/s41559-016-0010. [DOI] [PubMed] [Google Scholar]
  54. Sawitzke JA, Costantino N, Li X-T, Thomason LC, Bubunenko M, Court C, Court DL. Probing cellular processes with oligo-mediated recombination and using the knowledge gained to optimize recombineering. Journal of Molecular Biology. 2011;407:45–59. doi: 10.1016/j.jmb.2011.01.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Schrider DR, Houle D, Lynch M, Hahn MW. Rates and genomic consequences of spontaneous mutational events in Drosophila melanogaster. Genetics. 2013;194:937–954. doi: 10.1534/genetics.113.151670. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Sun S, Berg OG, Roth JR, Andersson DI. Contribution of gene amplification to evolution of increased antibiotic resistance in Salmonella typhimurium. Genetics. 2009;182:1183–1195. doi: 10.1534/genetics.109.103028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Steinrueck M, Guet CC. Complex chromosomal neighborhood effects determine the adaptive potential of a gene under selection. eLife. 2017;6:e25100. doi: 10.7554/eLife.25100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Tawfik DS. Messy biology and the origins of evolutionary innovations. Nature Chemical Biology. 2010;6:692–696. doi: 10.1038/nchembio.441. [DOI] [PubMed] [Google Scholar]
  59. Teufel AI, Masel J, Liberles DA. What fraction of duplicates observed in recently sequenced genomes is segregating and destined to fail to fix? Genome Biology and Evolution. 2015;7:2258–2264. doi: 10.1093/gbe/evv139. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Todd RT, Selmecki A. Expandable and reversible copy number amplification drives rapid adaptation to antifungal drugs. eLife. 2020;9:e58349. doi: 10.7554/eLife.58349. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Tomanek I, Grah R, Lagator M, Andersson AMC, Bollback JP, Tkačik G, Guet CC. Gene amplification as a form of population-level gene expression regulation. Nature Ecology & Evolution. 2020;4:612–625. doi: 10.1038/s41559-020-1132-7. [DOI] [PubMed] [Google Scholar]
  62. Treangen TJ, Rocha EPC. Horizontal transfer, not duplication, drives the expansion of protein families in prokaryotes. PLOS Genetics. 2011;7:e1001284. doi: 10.1371/journal.pgen.1001284. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Tria FDK, Martin WF. Gene duplications are at least 50 times less frequent than gene transfers in prokaryotic genomes. Genome Biology and Evolution. 2021;13:1–14. doi: 10.1093/gbe/evab224. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Yona AH, Manor YS, Herbst RH, Romano GH, Mitchell A, Kupiec M, Pilpel Y, Dahan O. Chromosomal duplication is a transient evolutionary solution to stress. PNAS. 2012;109:21010–21015. doi: 10.1073/pnas.1211150109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Yona AH, Frumkin I, Pilpel Y. A relay race on the evolutionary adaptation spectrum. Cell. 2015;163:549–559. doi: 10.1016/j.cell.2015.10.005. [DOI] [PubMed] [Google Scholar]
  66. Yona AH, Alm EJ, Gore J. Random sequences rapidly evolve into de novo promoters. Nature Communications. 2018;9:1530. doi: 10.1038/s41467-018-04026-w. [DOI] [PMC free article] [PubMed] [Google Scholar]

Editor's evaluation

Sergey Kryazhimskiy 1

This is an important paper that proposes a novel evolutionary mechanism by which copy-number mutations can slow down the accumulation of point mutations in populations evolving in certain environments. The authors use an evolution experiment in bacteria equipped with a clever reporter system to provide convincing evidence that this mechanism indeed operates. This paper will be of broad interest to readers in evolutionary biology and related fields.

Decision letter

Editor: Sergey Kryazhimskiy1
Reviewed by: Sergey Kryazhimskiy2, Joakim Näsvall3, Alejandra Rodríguez4

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "The adaptation dynamics between copy-number and point mutations" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Sergey Kryazhimskiy as Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Molly Przeworski as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: Joakim Näsvall (Reviewer #2); Alejandra Rodríguez (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

All reviewers agree that the paper asks an interesting question, and addresses it with a powerful reporter system. The most important issue that the reviewers identified is a lack of a convincing argument against clonal interference as an alternative explanation for the observations made in the low-demand environment. In your revision, please address this and other major criticisms listed below, and consider various suggestions for improving clarity.

Essential revisions:

(1) While your explanation involving negative epistasis makes a lot of sense, the lack of mutants carrying both duplications and point mutations in the low-demand condition could be explained by clonal interference between these types of mutations.

Please explicitly describe this alternative hypothesis in the main text and make an argument why the negative epistasis hypothesis is favored. The reviewers suggest several possibilities for making this argument.

The gold standard experiment is to construct double mutants (carrying a duplication and a point mutation that arose in the low-demand condition) and compare their competitive fitness with that of single mutants. The direct prediction of your hypothesis is that the double mutants should not be more fit than either of the single mutants. If engineering double mutants is not feasible, it may be possible instead to compare the competitive fitness (in the low-demand environments) of double mutants isolated from the high- or medium-demand environments with that of single mutants. In addition, it may also be possible to make an argument against clonal interference in the low-demand condition based on the observations made in other conditions and/or by carrying out simulations.

If a convincing argument against clonal interference cannot be made at this time, then please adjust your claims accordingly. Once the new data and/or analyses become available, you can publish them as a Research Advance article.

(2) Your main line of reasoning currently relies on the assumption that the YFP+CFP+ cells have only duplications but no promoter mutations. While this seems reasonable, we ask you to clearly spell out the argument supporting this assumption.

(3) All reviewers found the Amplification Hindrance hypothesis very interesting and useful. However, duplications are inherently unstable and are very quickly lost once selection stops favoring them. Please discuss the Amplification Hindrance hypothesis in light of this important fact.

(4) The reviewers found that your Materials and methods section lacks key pieces of information. For example, how were growth rate assays performed? How was the growth rate estimated? How were genetic reconstructions of mutations, described on LL. 472--484 and in Figure 6, carried out? This list is not exhaustive but simply illustrates the methodological gaps that the reviewers found particularly glaring. We ask you to revise the Methods section by adding enough detail so that all experiments and analyses can be reproduced.

Reviewer #1 (Recommendations for the authors):

LL. 241-248. I don't understand this discussion. "Despite the occurrence of yfp-only mutations in the IS- strain, increased CFP still reliably reports on increased copy-number." This sentence is self-contradictory as written. Perhaps the authors meant to add "in IS+ strains" in the second clause?

"However, the yfp-only amplification hijacks our ability to unambiguously infer ancestral copy-number from ancestral CFP fluorescence alone." This also doesn't make sense to me. Isn't the ancestral copy number known ( = 1)? Why does it need to be inferred from CFP fluorescence?

L. 253. "Mixed mutants" is a strange term. Please replace it with "double mutants" or "combination mutants". But calling populations "mixed" in the next sentence would be appropriate and helpful.

LL. 296--298. "in low galactose adaptive amplification of IS+ populations happened rapidly with the majority of populations showing increases in CFP fluorescence during the course of the experiment". The speed has not been quantified, so I suggest making the language more precise here by dropping "rapidly". Similarly, instead of saying "the majority", it would be good to give the exact number as well as the percentage.

L. 306. "populations with clearly increased YFP levels". Which ones are these? Are these the ones labelled with red triangles in Figure 4D? If so, please say so explicitly. It would also be helpful to identify them somehow in panel 4B.

L. 310. "Sequencing of the amplified colony". Was only one colony sequenced here? I thought all YFP+ colonies were sequenced and reported in Supp Table 1. Please clarify.

L. 347. "do not occur". Point mutations presumably do occur, but they don't spread. Please correct.

L. 434-435. "Moreover, it is shifted towards higher values of YFP/CFP relative to values found for P0" Please clarify how you arrived at this claim. To say that, P0 and P0-2 must be shown on the same plot, which I do not see.

LL. 446. "with the majority of P0-2 IS+ population amplified, those few P0-2 IS+ populations that failed to evolve amplifications". Please be precise: "majority" – how many and what fraction? "Few" – how many and what fraction?

Figure 1. In panel B, it would be helpful to have error bars to see whether the decline in growth rate at higher expression levels is significant.

Figure 2:

(i) In panel A (0% galactose), please color IS- populations red as in other panels for the sake of consistency. Lines can be made more transparent to increase clarity.

(ii) In other panels, it would be helpful to show the threshold of fluorescence for calling an amplification (as a horizontal line) and then also indicate on each panel the number and fraction of populations that have acquired an amplification by the end of the experiment.

(iii) In panel B, I would suggest doing the same thing as above, i.e., showing the number (and fraction) of populations in each class e.g., YFP+CFP+ (18). Also, please show grid lines and the diagonal so that it is possible to compare data across panels.

Figure 3: It is currently impossible to associate each panel in C with a specific point in panel A because multiple points have the same color. One possible way to resolve this ambiguity is to draw the trajectories in A as lines but only show one time point (the one that is shown as a panel in C) with a circle.

Figure 4: I have several issues with panel A.

(i) How are the cell counts normalized and why?

(ii) I think the authors should consider splitting the CFP vs YFP plot into two, one showing population B1 and the other showing population B3. Right now, B1 completely obscures B3. I also suggest somehow highlighting the three subpopulations that the authors refer to in the text on LL. 308-309.

(iii) I don't understand whether the right-most panel and the middle set of panels show the same data. Specifically, on the CFP vs YFP plot, it is clear that population B1 has a subpopulation with log(CFP) ~ 6 and log(YFP) ~ 6. On the YFP plot, there is a visible tail around the same value, but the CFP plot does not seem to have a corresponding peak. Is this a plotting error? If not, perhaps the issue is with the normalization, or maybe the cell counts need to be plotted on the log scale.

(iv) It seems important for the authors' argument to show that ALL populations evolved in 0.01% show the same pattern as B1 and B3. I would suggest showing the YFP vs CFP plot for every population as a supplementary figure.

(v) Panel B. It is nice to show the raw data, but I think the exposition would improve if this panel was in the supplement. The reason is that differences in YFP fluorescence values are very difficult to discern by eye, and the same data (as far as I understood) are anyway presented in a more discernible way in panel D.

Figure 5:

(i) In the caption and in the text (l. 369) the authors refer to the "number of reads", but I think in all cases they mean "fraction of reads". I am also a bit confused about what the authors mean by "normalized to the number of reads with ancestral P0". Do they show the actual fraction of reads carrying the respective mutations in each set of populations or is it something else?

(ii) I would suggest adding the position number to the labels of point mutations, e.g., T30A, C37T

Figure 6:

(i) please add a legend to panel A, as in Figure 2A.

(ii) I am very confused about panel C. If I understand it correctly, it shows that >20 colonies sampled from IS- populations are labelled as YFP+CFP+. Based on what data was this assignment made? Panel A shows that there are apparently no IS- populations that have the YFP+CFP+ phenotype, and we don't expect them to happen at such high rates in the IS- populations. Please clarify.

(iii) In panel E, mutants are labeled A11, F2, etc., but I don't seem to find a correspondence between these labels and the specific mutations that have been reconstructed. Please clarify.

(iv) By the same token, I don't see a description of how these mutants were constructed.

Figure 7B: The double-mutant is missing from this panel. I understand that the authors have not observed such double-mutants in low-demand environments, but I think it would be helpful to show it for the purposes of clarifying their hypothesis.

Reviewer #2 (Recommendations for the authors):

I find the present manuscript acceptable for publication but think the authors could consider adding a few points where it's suitable:

1. Could the authors discuss or comment on the "possible weakness" mentioned in the public review?

2. The evolutionary dynamics at different galactose concentrations are shown in Supplementary Figure 2, but there might be more to be learned by looking at the exact conditions of the evolution experiments (M9+casaminoacids+various concentrations of galactose): what are the growth dynamics of the ancestral strain and the evolved strains? Is the main evolutionary advantage on the maximum growth rate, final population size, or somewhere else (or a combination of several parameters)? If data on this is not already available I think it should be a quick and simple experiment to add (note that I do not think this is critical for publication, but that it could potentially add some additional value to the discussion).

3. Could selection at slightly different conditions (smaller/larger populations, smaller/larger bottlenecks) or with fluctuating selection pressures affect the outcome? Could variations in the conditions used produce a different skew of duplications vs. promoter mutations?

I enjoyed reading the manuscript and hope to see it published soon!

Reviewer #3 (Recommendations for the authors):

Overall, I think this is an exciting study, and the research was rigorously conducted and well presented.

eLife. 2022 Dec 22;11:e82240. doi: 10.7554/eLife.82240.sa2

Author response


Essential revisions:

(1) While your explanation involving negative epistasis makes a lot of sense, the lack of mutants carrying both duplications and point mutations in the low-demand condition could be explained by clonal interference between these types of mutations.

Please explicitly describe this alternative hypothesis in the main text and make an argument why the negative epistasis hypothesis is favored. The reviewers suggest several possibilities for making this argument.

The gold standard experiment is to construct double mutants (carrying a duplication and a point mutation that arose in the low-demand condition) and compare their competitive fitness with that of single mutants. The direct prediction of your hypothesis is that the double mutants should not be more fit than either of the single mutants. If engineering double mutants is not feasible, it may be possible instead to compare the competitive fitness (in the low-demand environments) of double mutants isolated from the high- or medium-demand environments with that of single mutants. In addition, it may also be possible to make an argument against clonal interference in the low-demand condition based on the observations made in other conditions and/or by carrying out simulations.

If a convincing argument against clonal interference cannot be made at this time, then please adjust your claims accordingly. Once the new data and/or analyses become available, you can publish them as a Research Advance article.

We thank the reviewers for the suggested experiment, which we agree would unequivocally distinguish between epistasis and clonal interference. We intend to perform this experiment and would like to take the opportunity to publish it as a Research Advance article. For the current revision, however, this was not possible: the first author has already started her postdoctoral work in Oxford and, owing to family obligations (two small children), cannot at present travel to Vienna to perform the work.

Therefore, we chose to address this criticism in two ways: (i) We explicitly discuss clonal interference (CI) in the main text while at the same time we weaken our epistasis claims to entertain the possibility of CI, as we are not showing the presence of epistasis directly with experiments. (ii) We also present arguments (in this response but also in the main text) of why we think epistasis is the more likely explanation. Please see the final section of the Results section, starting at line 551 (together with the newly added Figure 7 —figure supplement 1).

Clonal interference (CI) is the effect of the competition between lineages arising from different beneficial mutations in a population without recombination (i.e. asexually reproducing). When the product of population size and mutation rate (N*µ) is smaller than one (N*µ<1), mutations fix one by one, whereas if N*µ exceeds one significantly (N*µ>>1), mutations compete, thereby slowing down the fixation of each other (Gerrish and Lenski, 1998; Park and Krug, 2007).
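The two regimes described above can be illustrated with a minimal sketch. The function name, the example numbers, and in particular the cutoff used to stand in for "N*µ>>1" are illustrative assumptions and are not taken from the manuscript.

```python
def mutation_supply_regime(population_size, beneficial_mutation_rate,
                           much_greater_factor=10.0):
    """Classify the adaptive regime from the product N*mu.

    much_greater_factor is an arbitrary stand-in for "N*mu >> 1";
    it is an assumption made for illustration only.
    """
    supply = population_size * beneficial_mutation_rate
    if supply < 1.0:
        # Beneficial mutations arise rarely and fix one by one.
        return supply, "sequential fixation"
    if supply >= much_greater_factor:
        # Several beneficial lineages coexist and compete.
        return supply, "clonal interference expected"
    return supply, "intermediate"


# Hypothetical values, chosen only to show both regimes.
print(mutation_supply_regime(1e6, 1e-9))
print(mutation_supply_regime(1e8, 1e-5))
```

The classification depends only on the product of the two inputs, which is why the response goes on to compare population size and mutation rate between environments.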

At first glance, the fact that significantly more replicate populations become amplified in the lowest galactose concentration suggests that clonal interference could play a role in this environment. All else being equal, we expect CI to occur for larger population sizes or at higher mutation rates. However, in the absence of strong galK expression (i.e. expression levels of galK with ancestral P0) the maximal OD600 (“yield”) as a proxy for population size is similar between populations in the different galactose environments (please see new Figure 7 —figure supplement 1a). The reason for this is that in the absence of galK expression the cells essentially only metabolize the casaminoacid component of the evolution medium, which is the same across all galactose environments.

At the same time, there is no reason to assume that the mutation rate should differ between the different galactose concentrations. Instead, the observation that amplifications evolve faster in the low galactose environment can be explained by differences in the expression-growth relation for galK (“fitness landscape”, Figure 1b) between the different environments. Specifically, it appears as though the environments differ in their adaptive benefit of increasing galK expression starting from a low level (at low levels of inducer arabinose in Figure 1b). This would imply that acquiring a duplication of galK confers little fitness benefit in 1% galactose (the fitness landscape exhibits a plateau for low expression values), while the same mutation confers a bigger benefit in 0.01% galactose (the fitness landscape is slightly steeper when going from one copy of galK to two copies of galK). Because we can with relative confidence rule out both very significant differences in the population size and differences in the duplication rate between environments, CI is an unlikely explanation for the fact that we see mutually exclusive mutations in low galactose, but not in higher galactose concentrations.
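The landscape argument can be made concrete with a toy calculation. The sketch below assumes that expression scales linearly with galK copy number and uses two hypothetical Hill-type expression-growth curves; none of the parameter values are fitted to Figure 1b, and the function names are invented for illustration.

```python
def hill_growth(expression, max_gain, half_saturation, steepness):
    """Toy expression-growth relation (illustrative parameters only)."""
    e_n = expression ** steepness
    return max_gain * e_n / (half_saturation ** steepness + e_n)


def duplication_benefit(growth_fn, expression):
    """Growth gain from doubling expression (one copy -> two copies).

    Assumes expression is proportional to copy number.
    """
    return growth_fn(2 * expression) - growth_fn(expression)


def plateau_landscape(expression):
    # Hypothetical shape: nearly flat at low expression.
    return hill_growth(expression, 1.0, 10.0, 4)


def steep_landscape(expression):
    # Hypothetical shape: rises immediately from low expression.
    return hill_growth(expression, 1.0, 10.0, 1)


low_expression = 1.0  # arbitrary units, far below half-saturation
print(duplication_benefit(plateau_landscape, low_expression))
print(duplication_benefit(steep_landscape, low_expression))
```

Under these assumed shapes, the same duplication yields a much larger gain on the landscape that is steep at low expression, which is the qualitative point made for 0.01% versus 1% galactose.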

There is one additional argument to be made in favour of epistasis: in our (new) Figure 7 —figure supplement 1b-d, the growth of amplified strains (blue) that evolved for seven days in each of the three galactose concentrations is compared to the growth of two different promoter mutants (a double mutant and a slightly weaker single mutant, both in green). Intriguingly, the difference in both yield and growth rate between amplified and promoter mutant strains is bigger in the two higher galactose environments (1% and 0.1%) than in low galactose (0.01%). The amplified strains that evolved in low galactose grow comparably well to the promoter mutants. One of the amplified strains (blue continuous lines) even reaches a higher population density (yield) than either promoter mutant. This suggests that amplification can in fact provide sufficient expression in low galactose, while a higher expression level is required in high galactose. These results are in agreement with the epistasis hypothesis, namely that additional point mutations would be beneficial in the higher galactose concentrations, but that they would yield little or no fitness benefit in low galactose (=resulting in negative epistasis).

Finally, a second line of evidence argues against clonal interference playing a role in making mutations mutually exclusive in low galactose. CI essentially means that many adaptive mutations arise simultaneously but in different individuals. Hence, each beneficial mutation competes not only against the ancestral (low fitness) allele, but also against all other beneficial alleles, and its fixation is therefore slowed down. However, even in low galactose, despite the rapid amplification of most populations, a significant fraction of cells is still ancestral by the end of the evolution experiment (see new Figure 4 —figure supplement 1, Figure 4a). This suggests that the population is not yet “saturated” with competing mutations. Extending this argument, in the case of CI we would expect to find at least small subpopulations of combination mutants when monitoring populations at the single-cell level (Figure 4 —figure supplement 1, Figure 4a) or at the single-colony level (essentially corresponding to single cells; e.g. Figure 4C).

The arguments listed above, together with the fact that the maximal copy-number differs significantly for evolved populations in the three environments (Figure 2 —figure supplement 1b), strongly suggest (even in the absence of direct experimental evidence) that combination mutants are absent in the low demand condition because additional point mutations are obsolete (i.e. negative epistasis).

We made the following changes in the main text to address the point about CI versus epistasis:

– We adjusted the abstract by removing the claim of epistasis (and pointing out the high frequency of amplifications, which is underlying amplification hindrance irrespective of whether it occurs through negative epistasis or clonal interference): line 23.

– We adjusted line 109 in the Introduction: “However, if both, copy-number and point mutations are adaptive (Gruber et al., 2012), they also have the potential to interact epistatically or due to clonal interference.”

– We added a new section to the final part of the Results, starting at line 551 (together with the newly added Figure 7 —figure supplement 1).

– We adjusted the discussion in line 626 (“most likely epistasis”)

(2) Your main line of reasoning currently relies on the assumption that the YFP+CFP+ cells have only duplications but no promoter mutations. While this seems reasonable, we ask you to clearly spell out the argument supporting this assumption.

We agree that this is an important point which we have not made sufficiently explicit (in part because YFP+CFP+ cells having only copy-number mutations has been a repeated observation we have had in past studies (Steinrueck and Guet, 2017; Tomanek et al., 2020)).

We adjusted the manuscript in two places to make this point clearer and also support it with additional data:

(i) Using data from previous experiments, we added a new Figure 1 —figure supplement 1, which shows the correlation between YFP and CFP and copy-number as measured by qPCR. In these plots, a promoter mutant (-30T>A and -37C>T) is an obvious outlier in the levels of YFP. This is mentioned in line 153.

(ii) In Figure 6C we summarise the sequencing result of (among other things) 21 colonies from the amplified fraction (i.e. with a correlated increase in both fluorophores (“YFP+CFP+”)), which all harbour an ancestral P0-2 sequence. We point this out more clearly in the text now (line 520).
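The reporter logic behind this argument can be sketched as a simple classification of colonies by fold change in each fluorophore relative to the ancestor. The function name and the 1.5-fold threshold are hypothetical choices made for illustration; the manuscript's actual criteria for calling a population amplified are given in its supplement and are not reproduced here.

```python
def classify_colony(yfp, cfp, yfp_ancestral, cfp_ancestral,
                    fold_threshold=1.5):
    """Assign a reporter phenotype from two fluorescence readings.

    fold_threshold is an assumed cutoff, not the authors' criterion.
    """
    yfp_up = yfp / yfp_ancestral >= fold_threshold
    cfp_up = cfp / cfp_ancestral >= fold_threshold
    if yfp_up and cfp_up:
        # Correlated increase: consistent with an amplification.
        return "YFP+CFP+"
    if yfp_up:
        # YFP alone: candidate promoter mutation; sequencing needed.
        return "YFP+ only"
    if cfp_up:
        return "CFP+ only"
    return "ancestral"


# Hypothetical readings in arbitrary units.
print(classify_colony(400.0, 410.0, 100.0, 100.0))
print(classify_colony(400.0, 105.0, 100.0, 100.0))
```

A fluorescence-only call remains provisional: as discussed elsewhere in this response, yfp-only amplifications also raise YFP without CFP, so promoter mutations are confirmed by sequencing and copy-number by qPCR.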

(3) All reviewers found the Amplification Hindrance hypothesis very interesting and useful. However, duplications are inherently unstable and are very quickly lost once selection stops favoring them. Please discuss the Amplification Hindrance hypothesis in light of this important fact.

We agree with this point and want to emphasise that we are aware of the unstable nature of amplifications, which we have studied previously under regimes of fluctuating selection (Tomanek et al., 2020).

In our previous study, we evolved populations with amplifications in P0-galK in environments that alternately selected for and against galK expression. Under our experimental conditions, copy-number polymorphisms accumulate rapidly and provide an ample diversity of galK expression levels for selection to pick from. Most likely because evolutionary dynamics were dominated by strong selection and high rates of duplication/deletion, in that study we failed to recover any point mutations in P0 in populations evolved under our fluctuating conditions.

In fact, it is plausible that under fluctuating selection amplification hindrance may be even stronger than under continuous selection. The reason for this is that promoter point mutations leading to constitutive expression might be slightly costly when expression is not selected for. Unlike point mutations, amplifications have a very high rate of reversal, which allows amplified populations to rapidly return to ancestral (low) expression levels; amplifications can then re-appear once selection for increased gene expression resumes.

That being said, we agree with the general point that the study uses simplified environments. Therefore, we would be very interested in whether the phenomenon described here actually occurs in natural or clinical settings, for example whether antibiotic resistance genes that are known to be amplified and to cause heteroresistance are more conserved than comparable resistance genes that are not amplified.

In our present manuscript, we touched on this topic in the Discussion section (see below). We have now added the following statements to discuss fluctuating selection more explicitly (line 694).

“If adaptation by amplification is the fastest and sufficient, other, less frequent, mutations may not have a chance to compete. While our evolution experiments were conducted under continuous selection, natural environments may provide regimes of fluctuating selection. Due to the pleiotropic cost often associated with copy-number increases as well as their high rate of deletion, adaptive amplification returns to the ancestral single copy state in the absence of selection (Andersson and Hughes, 2009; Reams et al., 2010). This means that once the selective benefit of the transient adaptation is gone, no change at the level of genomic DNA remains (Roth et al., 1996). Therefore, the idea that gene amplifications act as a transient “regulatory state” rather than a mutation (Roth et al., 1996; Tomanek et al., 2020) can be extended by an implication found here, namely that amplifications could effectively act as buffer against long-lasting point mutations. In this view, amplification could repeatedly provide rapid adaptation to selection for increased gene expression, but collapse back to the single copy ancestral state once selection has subsided and yet hinder sequence divergence each time it does so. Thus, on sufficiently long time-scales, the transient dynamics that play out before the fixation of mutations may ultimately shape entire genomes”

(4) The reviewers found that your Materials and methods section lacks key pieces of information. For example, how were growth rate assays performed? How was the growth rate estimated? How were genetic reconstructions of mutations, described on LL. 472--484 and in Figure 6, carried out? This list is not exhaustive but simply illustrates the methodological gaps that the reviewers found particularly glaring. We ask you to revise the Methods section by adding enough detail so that all experiments and analyses can be reproduced.

We thank the reviewers for pointing out this issue. We added the following new sections to Methods:

(i) OD600 and fluorescence measurements in liquid cultures and calculation of growth rate (line 824)

(ii) Reconstitution of P0-2 mutants in the ancestral strain by oligo-recombineering (line 768)
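The growth rate calculation itself is described in the Methods and is not reproduced in this response. As a generic illustration only, and not necessarily the authors' procedure, a growth rate is commonly estimated as the slope of ln(OD600) against time over the exponential phase. The function name and the synthetic data below are assumptions.

```python
import math


def exponential_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time (per hour).

    Assumes all readings come from the exponential phase and are
    blank-corrected and positive; this is a generic approach, not
    necessarily the one used in the manuscript.
    """
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_h, logs))
    return sxy / sxx


# Synthetic readings generated with a known rate of 0.5 per hour.
times = [0.0, 1.0, 2.0, 3.0]
readings = [0.01 * math.exp(0.5 * t) for t in times]
print(exponential_growth_rate(times, readings))
```

In practice the result is sensitive to which time window is treated as exponential, so that window would need to be stated alongside any reported rate.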

Reviewer #1 (Recommendations for the authors):

LL. 241-248. I don't understand this discussion. "Despite the occurrence of yfp-only mutations in the IS- strain, increased CFP still reliably reports on increased copy-number." This sentence is self-contradictory as written. Perhaps the authors meant to add "in IS+ strains" in the second clause?

We agree that this sentence (and the following one mentioned below) was confusing as stated. To make it clearer we now elaborate more on the issue. In fact, increased CFP still reports on increased copy-number in both the IS+ and IS- strains (although this is rare in the latter due to its lowered rate of (IS-dependent) amplification).

We replaced the sentence in line 258 with the following paragraph:

“While increased CFP still reliably reports on increased copy-number, the yfp-only amplification hijacks our ability to unambiguously infer ancestral copy-number from ancestral CFP fluorescence alone. As increasing CFP itself bears no adaptive benefit, populations with increased CFP must carry amplifications that also include galK. In contrast, ancestral copy-number can only be confirmed by qPCR. The fact that some populations carry IS-independent yfp-only amplifications implies that our system of fluorescence reporters will yield a slight underestimate of the number of amplified populations both in the IS+ and IS- strain. However, we were ultimately interested in the divergence of promoter sequences, and going forward relied on sequencing to unambiguously determine the presence of adaptive promoter mutations.”

"However, the yfp-only amplification hijacks our ability to unambiguously infer ancestral copy-number from ancestral CFP fluorescence alone." This also doesn't make sense to me. Isn't the ancestral copy number known (=1)? Why does it need to be inferred from CFP fluorescence?

We corrected the issue together with the preceding sentence. Please see above paragraph.

L. 253. "Mixed mutants" is a strange term. Please replace it with "double mutants" or "combination mutants". But calling populations "mixed" in the next sentence would be appropriate and helpful.

We changed “mixed mutants” to “double mutants”.

LL. 296--298. "in low galactose adaptive amplification of IS+ populations happened rapidly with the majority of populations showing increases in CFP fluorescence during the course of the experiment". The speed has not been quantified, so I suggest making the language more precise here by dropping "rapidly". Similarly, instead of saying "the majority", it would be good to give the exact number as well as the percentage.

The numbers were unfortunately buried in the caption of Supplementary Figure 1A (now Figure 1 —figure supplement 1a). We have now added them to the main text (pointing also to the Supplement for our definition of what constitutes an “amplified” population). We changed the wording to “more rapidly” (i.e. a relative term) as we found it important to draw the readers' attention to the difference between the environments (line 319).

L. 306. "populations with clearly increased YFP levels". Which ones are these? Are these the ones labelled with red triangles in Figure 4D? If so, please say so explicitly. It would also be helpful to identify them somehow in panel 4B.

Yes, indeed. We added “(Figure 4d – red triangles)” to clarify.

L. 310. "Sequencing of the amplified colony". Was only one colony sequenced here? I thought all YFP+ colonies were sequenced and reported in Supp Table 1. Please clarify.

Yes, indeed, we sequenced all YFP+ colonies (corresponding to the red triangles in Figure 4D and reported in Table 1). The reviewer is right, as the way we were reporting our results was confusing. We now make it clearer by first reporting on one specific population (B1, the one in Figure 4a-b) for which we show the agar streak to illustrate our procedure for sequencing and point out the obvious difference in phenotype that can be noticed “by eye” (Figure 4c). We then explain that we sequenced the remaining YFP+ (red triangle) populations in the same manner.

L. 347. "do not occur". Point mutations presumably do occur, but they don't spread. Please correct.

We corrected the wording to: “If copy-number mutations are more frequent than point mutations and their combination does not spread to observable frequencies in the low demand environment…” (line 379)

L. 434-435. "Moreover, it is shifted towards higher values of YFP/CFP relative to values found for P0" Please clarify how you arrived at this claim. To say that, P0 and P0-2 must be shown on the same plot, which I do not see.

In order to clarify, we added Figure 6 —figure supplement 1b, where we plot YFP/OD600 against CFP/OD600 for both IS+ populations of P0 and P0-2 in one plot.

LL. 446. "with the majority of P0-2 IS+ population amplified, those few P0-2 IS+ populations that failed to evolve amplifications". Please be precise: "majority" – how many and what fraction? "Few" – how many and what fraction?

We added this information (“majority”: 88/96 and “few”: 6/96) to the main text (line 487). We also added the threshold of considering a population amplified (mean of ancestral CFP + 4 SD of ancestral CFP) as a green line in Figure 6 —figure supplement 1A.
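
To make the threshold rule concrete, here is a minimal Python sketch of the criterion stated above (mean of ancestral CFP plus 4 SD of ancestral CFP). It is not the R code used for the figures. The fluorescence values are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption.

```python
from statistics import mean, stdev


def amplification_threshold(ancestral_cfp, n_sd=4.0):
    """Mean of ancestral CFP plus n_sd standard deviations.

    Uses the sample standard deviation (statistics.stdev); this choice
    is an assumption, not something stated in the text.
    """
    return mean(ancestral_cfp) + n_sd * stdev(ancestral_cfp)


def count_amplified(evolved_cfp, threshold):
    """Number of populations whose CFP exceeds the threshold."""
    return sum(1 for value in evolved_cfp if value > threshold)


# Hypothetical CFP readings (arbitrary units), for illustration only.
ancestral = [100.0, 102.0, 98.0, 101.0, 99.0]
evolved = [100.5, 250.0, 99.0, 180.0]

threshold = amplification_threshold(ancestral)
n_amplified = count_amplified(evolved, threshold)
fraction_amplified = n_amplified / len(evolved)
```

In this toy example two of the four evolved populations lie above the threshold; on real data the same two functions would be applied to the CFP measurements of each population.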

Figure 1. In panel B, it would be helpful to have error bars to see whether the decline in growth rate at higher expression levels is significant.

The plot shown in Figure 1 panel B is a single replicate. We repeated the experiment twice three years later, and it produced qualitatively similar results, most importantly the fact that growth rate increases with increasing galK expression, but levels off at different induction levels for different galactose concentrations (please see Author response image 1). The differences between replicate experiments are probably mainly due to batch effects in the sugar- and casaminoacids stocks used for preparing the different media.

Author response image 1. Replicates of the Experiment shown in Figure 1b, where growth rate is plotted against arabinose concentration used to induce galK expression for different galactose concentrations (line colors, see legend).

Left panels show all tested galactose concentrations, while right panels show the relevant galactose concentrations used in the experiments of the manuscript.

As the reviewer noted, it seems there is indeed a growth rate reduction in 1% arabinose (in the control, i.e. in the absence of galactose), but this is likely an artifact of the strain we are using rather than the cost of high galK expression itself. We used strain background BW27784 for our chromosomal pBAD-galK construct. BW27784 has the advantage of allowing an almost linear induction behaviour for the arabinose promoter (pBAD) as opposed to the all-or-none behaviour typical of most catabolic promoters (Khlebnikov et al., 2001). To this end, it carries a constitutive arabinose import system and a deletion of the arabinose catabolic genes. Therefore, BW27784 is not able to catabolize arabinose, such that high concentrations are probably toxic. Importantly, in our evolution experiments we used an MG1655 strain background (and no arabinose), so we don’t expect high levels of galK expression to be toxic. Indeed, we do not see a cost of galK expression even in the absence of galactose when comparing strains with very low levels of galK expression (ancestral P0) and strains with high galK expression (P0 double promoter mutant) such as in Figure 2 —figure supplement 2b.

Moreover, we are confident that the data in Figure 1b capture the most important aspects of the galK expression-growth relation in the different environments, as the level of amplification (#of copies) evolving in our evolution experiment increases with increasing galactose concentration (Figure 2 —figure supplement 1B).

Figure 2:

(i) In panel A (0% galactose), please color IS- populations red as in other panels for the sake of consistency. Lines can be made more transparent to increase clarity.

(ii) In other panels, it would be helpful to show the threshold of fluorescence for calling an amplification (as a horizontal line) and then also indicate on each panel the number and fraction of populations that have acquired an amplification by the end of the experiment.

We are showing the threshold of fluorescence for determining an amplification in Figure 2 —figure supplement 1A.

We report the numbers in the main text and added them to the figure as well.

(iii) In panel B, I would suggest doing the same thing as above, i.e., showing the number (and fraction) of populations in each class e.g., YFP+CFP+ (18). Also, please show grid lines and the diagonal so that it is possible to compare data across panels.

We added grid lines to the plots and added the numbers to the figure panel.

Figure 3: It is currently impossible to associate each panel in C with a specific point in panel A because multiple points have the same color. One possible way to resolve this ambiguity is to draw the trajectories in A as lines but only show one time point (the one that is shown as a panel in C) with a circle.

We drew evolutionary trajectories in color and added the last time point in addition to the lines, as suggested by the reviewer.

Figure 4: I have several issues with panel A.

(i) How are the cell counts normalized and why?

For this plot we use the geom_density() function of the R package ggplot2 to draw a kernel density estimate, a smoothed version of the underlying histogram, as this is often used for flow cytometry data in the field. The default plot used by us scales the data such that the area under the curve equals 1. We now replaced “histogram” in the caption with “density plot”.
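
To illustrate what "area under the curve equals 1" means for a kernel density estimate, here is a small pure-Python sketch of a Gaussian KDE. It is not the ggplot2 implementation: the fixed bandwidth and the log-fluorescence values are hypothetical choices made only for this example.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function giving the Gaussian kernel density estimate at x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def trapezoid_area(func, lo, hi, steps=2000):
    """Approximate the integral of func over [lo, hi] by the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    total += sum(func(lo + i * width) for i in range(1, steps))
    return total * width


# Hypothetical log-scale fluorescence values with a small second subpopulation.
log_cfp = [3.9, 4.0, 4.1, 4.2, 4.0, 6.0, 6.1]
density = gaussian_kde(log_cfp, bandwidth=0.2)

# The curve integrates to approximately 1 regardless of the number of cells,
# so the y-axis shows density rather than cell counts.
area = trapezoid_area(density, 2.0, 8.0)
```

Because the curve is scaled to unit area, a subpopulation containing few cells appears as a correspondingly small peak, whatever the total number of cells measured.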

(ii) I think the authors should consider splitting the CFP vs YFP plot into two, one showing population B1 and the other showing population B3. Right now, B1 completely obscures B3. I also suggest somehow highlighting the three subpopulations that the authors refer to in the text on LL. 308-309.

We split the plots into two as suggested and added arrows to highlight subpopulations.

(iii) I don't understand whether the right-most panel and the middle set of panels show the same data. Specifically, on the CFP vs YFP plot, it is clear that population B1 has a subpopulation with log(CFP) ~ 6 and log(YFP) ~ 6. On the YFP plot, there is a visible tail around the same value, but the CFP plot does not seem to have a corresponding peak. Is this a plotting error? If not, perhaps the issue is with the normalization, or maybe the cell counts need to be plotted on the log scale.

We agree that the plotting has not been ideal. The data is indeed the same. Cell numbers of the amplified subpopulation in B1 are small (this is hard to judge from the YFP-CFP plot as it was shown). The amplified sub-peak is actually there in the histogram but basically not visible in the small plot we show (please see histograms in a log scale, author response image 2). Therefore, we now stretched the plot along the y-axis and added a small arrow to highlight the rather tiny peak.

Author response image 2. Histogram of population B1 CFP fluorescence intensity [a.u.] plotted on a log scale to visualize the small amplified subpopulation (black=day 4, blue=day 8, purple = day 12).

(iv) It seems important for the authors' argument to show that ALL populations evolved in 0.01% show the same pattern as B1 and B3. I would suggest showing the YFP vs CFP plot for every population as a supplementary figure.

We now added this data as new Figure 4 —figure supplement 1a-b.

(v) Panel B. It is nice to show the raw data, but I think the exposition would improve if this panel was in the supplement. The reason is that differences in YFP fluorescence values are very difficult to discern by eye, and the same data (as far as I understood) are anyway presented in a more discernible way in panel D.

We added the data as a new Figure 4 —figure supplement 2a, since (as correctly pointed out) it is quantified anyways in panel D.

Figure 5:

(i) In the caption and in the text (l. 369) the authors refer to the "number of reads", but I think in all case they mean "fraction of reads". I am also a bit confused what the authors mean by "normalized to the number of reads with ancestral P0". Do they show the actual fraction of reads carrying the respective mutations in each set of population or is it something else?

Our wording here was indeed misleading. We used “numbers” with the intent to describe in simple words how our algorithm works, but we should have instead said fractions from the beginning of the paragraph, to avoid confusion. We have now fixed this issue. By “Normalized to the number of reads with ancestral P0” we meant that we divided #reads_with_mutations/#reads_without_mutations. We added this as an explicit statement in line 397 and also corrected a similar issue in the methods (paragraphs after line 894).
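
The normalization described above can be written out as a short Python sketch. The read counts are hypothetical, the mutation labels are borrowed from the reviewer's example labels, and the second function (fraction of all reads) is included only to show how that quantity differs from the ratio to ancestral reads.

```python
def ratio_to_ancestral(reads_with_mutation, reads_without_mutations):
    """#reads_with_mutations / #reads_without_mutations, as described above."""
    if reads_without_mutations == 0:
        raise ValueError("no ancestral reads to normalize against")
    return reads_with_mutation / reads_without_mutations


def fraction_of_all_reads(reads_with_mutation, total_reads):
    """Share of all reads carrying the mutation (shown for contrast only)."""
    if total_reads == 0:
        raise ValueError("no reads")
    return reads_with_mutation / total_reads


# Hypothetical read counts for one population.
mutant_reads = {"T30A": 150, "C37T": 50}
ancestral_reads = 1000
total_reads = ancestral_reads + sum(mutant_reads.values())

normalized = {
    name: ratio_to_ancestral(count, ancestral_reads)
    for name, count in mutant_reads.items()
}
fractions = {
    name: fraction_of_all_reads(count, total_reads)
    for name, count in mutant_reads.items()
}
```

The two quantities are close when mutant reads are rare and diverge as mutants become common, which is why stating the denominator explicitly matters.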

(ii) I would suggest adding the position number to the labels of point mutations, e.g., T30A, C37T

We added the position numbers in the plot legends as suggested.

Figure 6:

(i) please add a legend to panel A, as in Figure 2A.

Thank you for pointing that out. We have now added both the fraction names and the respective numbers.

(ii) I am very confused about panel C. If I understand it correctly, it shows that >20 colonies sampled from IS- populations are labelled as YFP+CFP+. Based on what data was this assignment made? Panel A shows that there are apparently no IS- populations that have the YFP+CFP+ phenotype, and we don't expect them to happen at such high rates in the IS- populations. Please clarify.

Thank you for spotting this mistake. We seem to have unintentionally swapped the labels of “YFP+CFP+” and “ancestral” while preparing the figure from the original R plots. Indeed, IS- just has a single CFP+YFP+ population. We actually sequenced 21 ancestral colonies instead. Similarly, for the IS+ strain the numbers of sequenced colonies are swapped: we sequenced 17 YFP+CFP+ and 14 ancestral ones (not the other way around). We have now corrected this mistake so that the table makes sense again (and is in agreement with the numbers we describe in the text).

(iii) In panel E, mutants are labeled A11, F2, etc., but I don't seem to find a correspondence between these labels and the specific mutations that have been reconstructed. Please clarify.

Thank you for pointing that out. In one case we had added the population name, in the other the kind of mutation (both types of information are combined in Table 2). We have now corrected the issue so that the labels correspond.

(iv) By the same token, I don't see a description of how these mutants were constructed.

We added this missing information in the methods section named: Reconstitution of P0-2 mutants by oligo-recombineering.

Figure 7B: The double-mutant is missing from this panel. I understand that the authors have not observed such double-mutants in low-demand environments, but I think it would be helpful to show it for the purposes of clarifying their hypothesis.

We followed the suggestion and added the combination mutant to the scheme. We connected this mutant with the other ones by faint horizontal lines indicating neutral evolution (described in the captions).

Reviewer #2 (Recommendations for the authors):

I find the present manuscript acceptable for publication but think the authors could consider adding a few points where it's suitable:

1. Could the authors discuss or comment on the "possible weakness" mentioned in the public review?

We discuss the issue about epistasis versus clonal interference in our reply here (please see the section starting on page 1) and made several adjustments to the text (please see explanations above).

2. The evolutionary dynamics at different galactose concentrations are shown in Supplementary Figure 2, but there might be more to be learned by looking at the exact conditions of the evolution experiments (M9+casaminoacids+various concentrations of galactose): what are the growth dynamics of the ancestral strain and the evolved strains? Is the main evolutionary advantage on the maximum growth rate, final population size, or somewhere else (or a combination of several parameters)? If data on this is not already available I think it should be a quick and simple experiment to add (note that I do not think this is critical for publication, but that it could potentially add some additional value to the discussion).

This is an interesting point. The data behind Figure 1b (where we induced expression of galK along a gradient of arabinose and measured growth rates in the different galactose concentrations) hint that both growth rate and population size (yield) increase with increasing galK expression or galactose concentration (at a given expression level). The data also suggest that when galK is expressed at a low level, low galactose concentrations result in higher growth rate. (As mentioned above, we suspect that this is the reason behind the steeper fitness peak for low galactose and the reason why amplifications evolve there more rapidly than in the higher galactose concentrations). Please see Author response image 3.

Author response image 3. Example growth curves in M9 evolution medium and a 2D gradient of galactose and arabinose (as an inducer of galK expression).

These measurements are the basis for Figure 1b. The top left panel shows growth in 1% glycerol (control) instead of galactose.

The idea that both yield and growth rate are increased by increasing galK expression (irrespective of the nature of the underlying mutation) is corroborated by the growth curves of the different evolved strains in the different galactose concentrations. Specifically, we compared the growth of six different amplified strains, two evolved in each of the three galactose concentrations, to two promoter mutants (the double mutant “H5” and a single mutant isolated in a previous study). While we do not have data on how the ancestral strain compares in the same environment, the data from the different mutants suggest that populations gain increases in both growth rate and yield with increasing galactose concentration (please see new Figure 7 —figure supplement 1; for convenience the figure is also plotted within Author response image 3 where we discuss it in the context of clonal interference/epistasis – Figure 2).

3. Could selection at slightly different conditions (smaller/larger populations, smaller/larger bottlenecks) or with fluctuating selection pressures affect the outcome? Could variations in the conditions used produce a different skew of duplications vs. promoter mutations?

I enjoyed reading the manuscript and hope to see it published soon!

Intuitively, we think that slightly different population sizes/bottlenecks should not matter too much for the experiments shown. We base this on some preliminary experiments where we used different dilution factors, and one experiment where we also used a 10-fold bigger population size.

However, the rather big difference in outcome between the different environments seems to suggest that the distribution of mutations should quite sensitively depend on the galactose concentration, especially for lower galactose concentrations. We would speculate that the fitness landscape (shown in Figure 1b) if resolved at a finer scale (i.e. plotting growth rate along a series of very small increases in gene expression), might show an initial plateau for the high galactose concentrations, but not for the low one. We suspect that metabolic/biochemical constraints differ in the different galactose concentrations and the strength of selection will not simply be a linear function of the galactose concentration.

Associated Data

    Data Citations

    1. Tomanek I, Guet C. 2022. Flow cytometry YFP and CFP data and deep sequencing data of populations evolving in galactose. Dryad Digital Repository.

    Supplementary Materials

    Figure 1—source data 1. Contains an R script along with optical density measurements to plot Figure 1B.
    Figure 1—figure supplement 1—source data 1. Contains an R script along with qPCR and fluorescence intensity data to plot Figure 1—figure supplement 1.
    Figure 2—source data 1. Contains an R script along with optical density and fluorescence data to plot Figure 2A-B.
    Figure 2—figure supplement 2—source data 1. Contains an R script along with optical density measurements to plot Figure 2—figure supplement 2B.
    Figure 2—figure supplement 3—source data 1. Contains an R script along with qPCR data to plot Figure 2—figure supplement 3B.
    Figure 3—source data 1. Contains an alignment of sequencing data for Figure 3E.
    Figure 4—source data 1. Contains an R script along with colony fluorescence intensity data over time to plot Figure 4D.
    Figure 6—source data 1. Contains an R script along with optical density and fluorescence intensity measurements to plot Figure 6A-B.
    Figure 6—source data 2. Contains an R script along with optical density and fluorescence intensity measurements to plot Figure 6D-E.
    MDAR checklist

    Data Availability Statement

    Source Data and R scripts to generate the plots shown in the Figures are uploaded as the respective source code files. Flow cytometry and Illumina sequencing data are uploaded on Dryad together with R scripts to generate the plots shown in the respective Figures (Flow cytometry data: Figure 2C, 3C-D and Figure Supplements, 4A; Illumina sequencing data: Figure 5 and Figure Supplement).

    The following dataset was generated:

    Tomanek I, Guet C. 2022. Flow cytometry YFP and CFP data and deep sequencing data of populations evolving in galactose. Dryad Digital Repository.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
