eLife. 2015 Aug 14;4:e06492. doi: 10.7554/eLife.06492

Reverse evolution leads to genotypic incompatibility despite functional and active site convergence

Miriam Kaltenbach 1,2, Colin J Jackson 3, Eleanor C Campbell 3, Florian Hollfelder 2, Nobuhiko Tokuriki 1,*
Editor: Michael Laub 4
PMCID: PMC4579389  PMID: 26274563

Abstract

Understanding the extent to which enzyme evolution is reversible can shed light on the fundamental relationship between protein sequence, structure, and function. Here, we perform an experimental test of evolutionary reversibility using directed evolution from a phosphotriesterase to an arylesterase, and back, and examine the underlying molecular basis. We find that wild-type phosphotriesterase function could be restored (>10⁴-fold activity increase), but via an alternative set of mutations. The enzyme active site converged towards its original state, indicating evolutionary constraints imposed by catalytic requirements. We reveal that extensive epistasis prevents reversions and necessitates fixation of new mutations, leading to a functionally identical sequence. Many amino acid exchanges between the new and original enzyme are not tolerated, implying sequence incompatibility. Therefore, the evolution was phenotypically reversible but genotypically irreversible. Our study illustrates that the enzyme's adaptive landscape is highly rugged, and different functional sequences may constitute separate fitness peaks.

DOI: http://dx.doi.org/10.7554/eLife.06492.001

Research organism: E. coli

eLife digest

Enzymes in bacteria and other organisms are built following instructions contained within each cell's DNA. Changes in the DNA, that is to say, mutations, can alter the shape and activity of the enzymes that are produced, which can ultimately affect the ability of the organism to survive and reproduce. Mutations that are beneficial to the organism are more likely to be passed on to future generations, which can lead to populations changing over time.

The DNA sequences that an organism carries are referred to as its ‘genotype’ and the resulting physical characteristics of the organism are known as its ‘phenotype’. Studies of evolution tend to focus on how particular species or molecules become more different over time. However, one area that remains controversial is whether it is possible for evolution to be reversed so that an organism or molecule returns to a previous form.

An enzyme called PTE is said to have phosphotriesterase activity because it catalyzes this particular type of chemical reaction. Recently, a group of researchers used a method called ‘directed evolution’ to demonstrate that it is possible for PTE to evolve in a way that means it loses its phosphotriesterase activity and becomes able to catalyze a different type of chemical reaction. Here, Kaltenbach et al.—including some of the researchers from the previous work—investigated whether it was possible to use the same method to reverse this evolution and restore the enzyme's original activity.

The experiments show that reverse evolution is possible as phosphotriesterase activity was restored to the PTE enzyme from the previous study. However, although the phenotype of the final enzyme matched that of the original PTE enzyme, the genotypes did not match as the DNA sequences of the genes that encode these enzymes differ. The DNA does not revert to its original sequence because the effect of individual mutations on the phenotype depends on what other mutations are present. For example, as the enzyme evolved its new activity, additional mutations accumulated that did not alter enzyme activity. During the reverse evolution experiment, some of these mutations could have started to exert influence on the phenotype so that different mutations were required to restore the phosphotriesterase activity.

In the future, Kaltenbach et al.'s findings may aid efforts to engineer artificial enzymes for use in medicine or industry.

DOI: http://dx.doi.org/10.7554/eLife.06492.002

Introduction

The controversy surrounding evolutionary reversibility pertains to one of the fundamental questions in evolutionary biology: the extent to which selection pressure determines evolutionary outcomes (Teotonio and Rose, 2001; Gould, 2007; Collin and Miglietta, 2008; Lobkovsky and Koonin, 2012). Also, through understanding reversibility on the levels of both phenotype and genotype, one could catch a glimpse of the structure of the respective fitness (or adaptive) landscape. The extent of ruggedness of adaptive landscapes (that is, the prevalence of epistasis, and thus historical contingency) has recently received considerable attention (Whitlock et al., 1995; Poelwijk et al., 2007; de Visser et al., 2011; Breen et al., 2012; Harms and Thornton, 2013; McCandlish et al., 2013; Kaltenbach and Tokuriki, 2014). While the unlikelihood of reversing a historical pathway taken by evolution has been demonstrated (Bridgham et al., 2009), a large number of sequences can encode functionally identical proteins (‘genotypic redundancy’) and phenotypic reversion can still occur via alternative pathways (Clarke, 1985; Lenski, 1988; Crill et al., 2000; Teotonio and Rose, 2000; Kitano et al., 2008). Yet, the evolutionary dynamics underlying phenotypic reversion have not been addressed. Does phenotypic reversion lead back to the ancestral peak on the adaptive landscape or to a new peak (Carneiro and Hartl, 2010; Lobkovsky and Koonin, 2012)? In other words, to what extent are the sequences of the ancestral and reverse-evolved proteins accessible via a neutral network: are amino acid exchanges between the two proteins tolerated, or do they result in loss of function? The inability to exchange amino acids between homologous proteins due to epistasis represents ‘genotypic incompatibility’ and can result in a non-functional enzyme, a phenomenon which can be compared to the ‘Dobzhansky-Muller effect’ of hybrid incompatibility (Orr, 1995; Kondrashov et al., 2002).

Another important aspect to be explored is the underlying molecular mechanism of phenotypic reversibility. Restoration of function can either be brought about by the same structure and mechanism as in the ancestor, or by a distinct, alternative state. Structural convergence would indicate that functional requirements exist, which deterministically lead to one particular structural solution. On the other hand, structural divergence would imply the accessibility of various solutions that can bring about efficient catalysis. Thus, understanding the molecular basis for (ir)reversibility and (in)compatibility would provide valuable insights into protein sequence-function-structure relationships. What are the molecular requirements for a specific function? What structural changes are required to switch from one function to another? Identifying such changes, which are often based on subtle effects (e.g., on mutations occurring in remote locations, or mutations which only show a favorable effect in combination), remains a great challenge in protein science. What is the molecular basis underlying mutational epistasis, which leads to alternative evolutionary outcomes?

Directed evolution is a powerful tool to address these questions and explore adaptive landscapes because it allows the study of evolution in a highly controlled setup (Peisajovich and Tawfik, 2007; Romero and Arnold, 2009; Kawecki et al., 2012). High selection pressure can prevent fixation of neutral, functionally irrelevant mutations, resulting in an adaptive trajectory without mutational noise. All evolutionary intermediates (the ‘molecular fossil record’) are obtained, so the evolutionary dynamics and their molecular basis can be characterized in detail. Performing evolution in both the forward and reverse direction and comparing the changes in each direction provides a unique handle for identifying such effects. Understanding these phenomena would improve our ability to design and engineer novel proteins in the laboratory.

Here, we experimentally test the reversibility of enzyme evolution and investigate its molecular basis. We previously evolved the enzyme PTE, a phosphotriesterase, into an arylesterase (Roodveldt and Tawfik, 2005; Tokuriki et al., 2012; Wyganowski et al., 2013). In this work, we applied a selection pressure to restore the original phosphotriesterase activity. We characterized the entire trajectory including both the forward and reverse process in terms of phenotypic reversibility (function or enzymatic activity), genotypic irreversibility (sequence), as well as in terms of the underlying molecular basis (structure and mechanism). We find that PTE has a rugged adaptive landscape on which the accessibility of functional mutations is severely limited, and describe the mechanisms that lead to genotypic irreversibility and incompatibility.

Results

Phenotypic reversibility in the laboratory evolution of PTE

We previously reported the laboratory evolution of PTE (wtPTE) into a highly efficient arylesterase for 2-naphthyl hexanoate (2NH) (Roodveldt and Tawfik, 2005; Tokuriki et al., 2012; Wyganowski et al., 2013). In the course of the trajectory, the original phosphotriesterase activity decreased drastically (10⁴-fold) although no selection pressure was applied against it. In this work, we first completed the functional transition by further decreasing the remaining phosphotriesterase activity (∼10-fold) by four additional rounds of directed evolution for maintaining arylesterase but reducing phosphotriesterase activity (Supplementary file 1). Briefly, libraries were generated by error-prone PCR and transformed into Escherichia coli (BL21 (DE3)). As a pre-screen for arylesterase activity, protein expression was induced in the bacterial colonies on agar plates, and a mixture of the substrate 2NH and a product stain (Fast Red) was added as previously described (Figure 1A) (Roodveldt and Tawfik, 2005; Tokuriki et al., 2012; Wyganowski et al., 2013). Upon hydrolysis of 2NH, Fast Red forms a red complex with the naphtholate leaving group, meaning colonies that develop a red color contain active arylesterase variants. In each round, 2000–10,000 colonies were screened in this fashion, theoretically covering most single point mutations in the 330 amino acid PTE gene. Positive colonies were then re-grown and re-assayed in 96-well plates and initial rates of both 2NH and paraoxon hydrolysis were determined in clarified lysate. In our experience, activity increases >1.3-fold compared to the respective parent yielded reliably improved variants. The variant with the largest improvement in initial rate was then used as the template for the next round of error-prone PCR or several variants were subjected to DNA shuffling.
To buffer the destabilizing effects of functional mutations and minimize reductions in soluble protein expression levels, we used GroEL/ES overexpression as previously described (Supplementary file 1) (Tokuriki and Tawfik, 2009; Wyganowski et al., 2013). In total, with 22 rounds of ‘forward evolution’, the accumulation of 26 mutations from wtPTE resulted in a highly efficient and specialized arylesterase (AE) with a ∼10⁵-fold increase in arylesterase rates (kcat/KM for 2NH >10⁶ M⁻¹s⁻¹) and an overall ∼10⁵-fold decrease in phosphotriesterase activity (kcat/KM for paraoxon ≈10² M⁻¹s⁻¹, Figure 1B,C). Because selection was specific for arylester hydrolysis until round 18, the change in phosphotriesterase activity was stochastic: many mutations decreased phosphotriesterase activity (11 mutations), some were neutral (nine mutations), and others increased phosphotriesterase activity (six mutations). Starting from AE, we then performed the reverse evolution to restore phosphotriesterase activity using an experimental setup equivalent to the forward process (Figure 1A) with the following modifications: the pre-screen was carried out using a fluorogenic phosphotriester as a surrogate for paraoxon (Supplementary file 1) and then validated in 96-well format as described above. The selection criterion was now an increased initial rate of paraoxon hydrolysis. In our evolutionary model system, variant fitness is defined as the level of enzymatic activity in cell lysate. All variants were also purified and the kinetic parameters determined, which correlated well with lysate activity (Figure 1—figure supplement 1, Supplementary file 2).
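
The statement that 2000–10,000 colonies per round theoretically cover most single point mutations can be illustrated with a rough estimate. The sketch below is not taken from the paper: it assumes, purely for illustration, that every clone carries exactly one nucleotide substitution and that all 3 × L substitutions in a gene of L ≈ 1000 bp (the length given in the Figure 1 legend) are equally likely. The function name `expected_coverage` is hypothetical.

```python
def expected_coverage(n_clones: int, gene_length_bp: int = 1000) -> float:
    """Expected fraction of distinct single-nucleotide variants sampled at least once.

    Illustrative assumptions (not from the paper): each clone carries exactly one
    substitution, and all 3 * gene_length_bp substitutions are equally likely.
    """
    n_variants = 3 * gene_length_bp
    # Probability that a given variant is missed by every one of n_clones draws.
    p_missed = (1.0 - 1.0 / n_variants) ** n_clones
    return 1.0 - p_missed


for n in (2000, 10000):
    print(f"{n} clones -> expected coverage {expected_coverage(n):.2f}")
```

Under these idealized assumptions the upper end of the screening range samples nearly all single-base variants, whereas the lower end samples roughly half. Real error-prone PCR libraries have mutational bias and variable mutation load, so actual coverage is likely to differ.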

Figure 1. Activity and sequence changes of PTE over the evolution.

(A) Overview of the experimental evolution. Libraries were generated and transformed into Escherichia coli. Proteins were expressed and screened for paraoxon and/or 2NH hydrolysis in bacterial lysates. Several thousand variants were screened per round, theoretically covering most single point mutations in the ∼1000 bp PTE gene. Details are given in Supplementary file 1. (B) Activity changes during the forward (screening for arylesterase hydrolysis) and reverse evolution (screening for re-increase in phosphotriesterase hydrolysis). Steady-state kinetic parameters for all variants are provided in Supplementary file 2A. (C) Type, position, and order of occurrence of the 33 mutations obtained in the evolution. Mutations are shown relative to wtPTE (GenBank accession number KJ680379) with lower case italics denoting the amino acid found in wtPTE. Note that wtPTE was obtained in previous screens for improved expression levels in E. coli and contains six mutations relative to the naturally occurring PTE (I106L, F132L, K185R, D208G, R319S) (Roodveldt and Tawfik, 2005; Tokuriki et al., 2012). The following mutations occurred in individual variants, but were not fixated after DNA shuffling: R7a: a204G, R7c: a102V, R19: a78T, v143A, t311A, revR1: c59Y, s238R, revR5: i176V, revR8a: d264E, revR8b: i296V. All additional variants characterized and sequenced in each round are shown in Supplementary file 1.

DOI: http://dx.doi.org/10.7554/eLife.06492.003

Figure 1—figure supplement 1. Correlation between activities measured in cell lysate and using purified enzyme for all variants selected over the evolution (Supplementary file 2).

(A) Phosphotriesterase activity. (B) Arylesterase activity. All measurements were carried out at 200 μM substrate. Activities in cell lysate are given relative to wtPTE.

The restoration of phosphotriesterase activity in the reverse evolution followed a pattern similar to that observed for arylesterase activity in the forward evolution: increasing smoothly and gradually through the stepwise accumulation of mutations (Figure 1B,C). Moreover, it followed a ‘diminishing returns’ pattern characteristic for the development of a function under selection; that is, the activity gain per mutation gradually decreased in later stages of the functional transition, where fitness reached a plateau (Figure 1B) (Stebbins, 1944; MacLean et al., 2010; Chou et al., 2011; Khan et al., 2011; Tokuriki et al., 2012). Furthermore, trade-offs between the two activities were weak in the early rounds of evolution, resulting in a generalist, bifunctional intermediate (Aharoni et al., 2005; Khersonsky and Tawfik, 2010). In the forward evolution, trade-offs then became stronger, leading to specialization of the arylesterase. The reverse evolution, however, retained characteristics of a generalist: the large increase in phosphotriesterase activity (>10⁴-fold) was accompanied by only a small (five-fold) reduction in arylesterase activity. A possible reason for this is that the reverse evolution is still at an early phase of the functional transition after 12 rounds (vs. 22 in the forward evolution). Because we were unable to isolate any variant with further improved phosphotriesterase activity, it might be necessary to impose a negative selection pressure to specialize the enzyme. The molecular basis of substrate binding and trade-offs is described further below (see also Figure 2 and Figure 2—figure supplement 1). Overall, we obtained a new, efficient enzyme (neoPTE) on par with wtPTE (kcat/KM >10⁶ M⁻¹s⁻¹ for paraoxon in both cases). The recovery of identical phosphotriesterase rates in neoPTE compared to wtPTE establishes that evolution of the phenotype was fully reversible.

Figure 2. Reshaping of the PTE active site over the evolution.

(A) WtPTE (PDB ID: 4PCP) features an active site which is well adapted for paraoxon hydrolysis, but suboptimal for 2NH. (B) In the forward evolution, selection for arylesterase activity leads to several changes in the binding pocket from wtPTE to AE (PDB ID: 4PCN). (C) The reverse evolution leads to restoration of the ancestral state in neoPTE (PDB ID: 4PBF). The four regions of change are highlighted in different colors. Top row: the 2NH analogue (yellow) was modeled into the three structures by superposition with PTE-R18 in complex with the analogue (PDB ID: 4E3T) (Tokuriki et al., 2012). Bottom row: the paraoxon analogue diethyl 4-methoxyphenyl phosphate (yellow) was modeled into the structures by superposition with Agrobacterium radiobacter PTE in complex with the analogue (PDB ID: 2R1N) (Hong and Raushel, 1996). Amino acids found in wtPTE are shown in lower case italics.

DOI: http://dx.doi.org/10.7554/eLife.06492.005

Figure 2—figure supplement 1. Details of the active site changes.

By overlaying the structures of wtPTE, AE, and neoPTE with the structure of A. radiobacter PTE in complex with the paraoxon analogue diethyl 4-methoxyphenyl phosphate (see Figure 2), regions important for paraoxon binding could be identified and the effect of mutations derived. The loss of phosphotriesterase activity in AE and its restoration in neoPTE are achieved mainly by changes in shape complementarity between enzyme and substrate, changes in hydrophobicity, and π-π stacking (A–C). It is likely that the movement of the β-metal also influences catalysis, although the exact effects of the metal displacement on catalysis are as yet unclear (D). (A) Interaction between paraoxon and residues 306 and 308. Substitution of the bulky Phe306 by Ile improves 2NH binding in AE, but results in a loss of interaction with paraoxon. In neoPTE, rather than reversion of f306I, s308 is mutated to the more hydrophobic Cys, improving interaction with the para-nitrophenyl group (Figure 2 pink region, Figure 7A). (B) Steric hindrance between Phe271 and paraoxon. Substitution of Leu271 by the larger Phe improves 2NH binding in AE, but causes steric hindrance with paraoxon. In neoPTE, in addition to reversion to the ancestral Leu, repositioning of the loop through a combination of remote mutations results in a ‘downward’ movement of Leu271, further enlarging the pocket (see also Figure 2 orange region, Figure 7A). (C) Shift in position of Leu106/Trp131/Leu132 (Figure 2 purple region, Figure 7B). While wt- and neoPTE feature edge-to-face π-π stacking between Trp131 and the para-nitrophenyl ring, in AE the shift in position brings Trp131 closer to the partially positive edges of the ring, resulting in electrostatic repulsion. Moreover, the shift in Leu106 brings it closer to the ethoxy group of the substrate in AE. (D) Shift in position of the β-metal. The inter-metal distance is reduced from 3.8 Å in wtPTE to 3.3 Å in AE through a movement of His201 and the β-metal. In neoPTE, the original spacing seen in wtPTE is restored (Figure 2 light blue region, Figure 7C), perhaps because the decreased distance in AE destabilized the transition state for paraoxon hydrolysis.
Figure 2—figure supplement 2. Overlay of electron density maps for the active sites of (A) wtPTE (salmon) and AE (cyan) and (B) wtPTE (salmon) and neoPTE (magenta).

Electron density for wtPTE is shown as grey isosurface (2 σ), while electron density of AE and neoPTE is shown as isomesh (2 σ), for contrast between the two maps. Comparison between (A) and (B) illustrates the shift of the β-metal ion closer to the α-metal ion in AE, and the shift back to the wtPTE position in neoPTE. Likewise, comparison between the positions of Leu106, Trp131 and Leu132 in (C) wtPTE (salmon) and AE (cyan) and (D) wtPTE (salmon) and neoPTE (magenta) illustrates that these sidechains adopt alternative positions in AE, but return to their original conformations in neoPTE.
Figure 2—figure supplement 3. Development of B-factors over the evolution.

In wtPTE, loop 7 shows the maximum B-factor. The forward evolution for 2NH activity resulted in stabilization of loop 7, whereas flexibility of loops 4 and 5 increased. In neoPTE, the original dynamics of the structure were restored as shown by the increased flexibility of loop 7 as well as the reduced B-factor of loops 4 and 5.
Figure 2—figure supplement 4. Linear free energy relationships of wtPTE, AE, and neoPTE.

(A) Arylester hydrolysis. The kcat/KM of all three variants is independent of the leaving group pKa. (B) Phosphotriester hydrolysis. In neoPTE, the break in leaving group dependence around pKa 7, which is characteristic for wtPTE (Hong and Raushel, 1996; Tokuriki et al., 2012), is restored. Information about each substrate is provided in Supplementary file 2.

Genotypic irreversibility and constraints underlying phenotypic reversion

To examine the genetic changes causing phenotypic reversion, the sequence of all evolutionary intermediates was determined. Only five of the 26 mutations that accumulated in the forward evolution were reverted to the original sequence (‘reversions’, A49v, I172t, Q180h, L271f, M314t, Figure 1C; amino acids shown in lower case italics denote the wtPTE state, while amino acids not present in the wild type are shown in capital letters). Nine additional ‘new mutations’ accumulated, two of which occurred in positions that were mutated in the forward evolution (V130M, originally leu; I306M, originally phe), and seven were in positions that were not previously mutated (p135S, y156H, g174D, a203E, m293K, s258N, s308C). Overall, neoPTE is separated further from wtPTE (28 out of 333 amino acids) than AE from wtPTE (26 amino acids). Additional rounds of evolution failed to yield more reversions or activity increases (Supplementary file 1). In the forward evolution, the loss of phosphotriesterase activity was largely a side product of the property under selection, the increase in arylesterase activity. Therefore, not all mutations decreased phosphotriesterase activity (Figure 1B), and it is not surprising that phenotypic reversion did not require full genotypic reversion. However, a number of mutations that did contribute to decreasing phosphotriesterase activity in the forward process were also not reverted. Moreover, the new mutations are located in the same mutational clusters seen in the forward evolution (Figure 1C), indicating they may be alternative solutions to the same functional requirement and replace reversions, as detailed further below. Taken together, although the phenotype was reversible, PTE evolution was genotypically irreversible; instead, an alternative trajectory was readily taken.
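
The sequence distances quoted above follow from simple bookkeeping of the mutation counts. The short sketch below only re-derives the figure of 28 differing positions from the numbers given in the text; the variable and function names are illustrative.

```python
# Counts taken from the text above.
forward_mutations = 26         # accumulated from wtPTE to AE
reversions = 5                 # A49v, I172t, Q180h, L271f, M314t
new_at_mutated_positions = 2   # V130M, I306M (still differ from wtPTE)
new_at_fresh_positions = 7     # p135S, y156H, g174D, a203E, m293K, s258N, s308C


def positions_differing_from_wt(forward: int, reverted: int, new_fresh: int) -> int:
    """Number of positions at which the reverse-evolved enzyme differs from wtPTE.

    New mutations at already-mutated positions change the residue but not the
    number of differing positions, so they do not enter the count.
    """
    return forward - reverted + new_fresh


neo_vs_wt = positions_differing_from_wt(
    forward_mutations, reversions, new_at_fresh_positions
)
print(neo_vs_wt)  # 28
```

The two new mutations at previously mutated positions (130 and 306) do not change the count, because those positions already differed from wtPTE in AE and still differ in neoPTE.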

The active site converged towards its original state in the reverse evolution

To unravel the molecular basis of the observed genotypic irreversibility, we solved crystal structures of wtPTE, AE, and neoPTE (Supplementary file 3). We compared the structures and modeled both a paraoxon and a 2NH analogue into each active site (by superposition with structures containing these analogues [Hong and Raushel, 1996; Tokuriki et al., 2012]). The phosphotriester paraoxon is characterized by tetrahedral ground-state geometry and P–O cleavage proceeds via a trigonal bipyramidal transition state. The arylester 2NH is planar and C–O bond hydrolysis proceeds via a tetrahedral transition state. The structural comparison indicates that AE adapted to the planar substrate 2NH in the forward evolution, but that this came at a cost of phosphotriesterase activity, as the bulky paraoxon is no longer efficiently recognized (Figure 2). We identify several regions of the active site that may be responsible for the functional transition (Figure 2 and Figure 2—figure supplement 1). First, a binding pocket for the naphthyl leaving group of 2NH was excavated through the combined action of h254R and d233E (Figure 2A,B, green region) (Hong and Raushel, 1996; Tokuriki et al., 2012). Leaving group coordination was further improved through a subtle ∼1.0 Å shift of Trp131 (Figure 2A,B, purple region). Moreover, the pocket was elongated through the f306I mutation (Figure 2A,B, pink region) and narrowed by l271F (Figure 2A,B, orange region), resulting in better accommodation of the long hexanoate chain of 2NH. These changes may lead to the reduction of phosphotriesterase activity through loss of interactions (either shape complementarity, hydrophobicity, or π-π stacking) in several regions and steric hindrance in others, as described in further detail below (Figure 2—figure supplement 1). Additionally, the distance between the two active site zinc ions decreased from 3.8 Å to 3.3 Å (Figure 2A,B, light blue region and Figure 2—figure supplement 1). 
The observed structural changes are subtle, at the sub-angstrom scale, and their contributions to catalysis unquantified. However, the dispersion precision indicator (DPI; Cruickshank, 1999) for each of the structures is less than one-tenth of an angstrom, meaning that the observed distance changes (including the 0.5 Å shift in the metal position) are significant (Figure 2—figure supplement 2).

In neoPTE, the part of the active site necessary for phosphotriesterase activity has converged back towards its original state. The regions of suboptimal binding were re-optimized for paraoxon and the metal distance was restored to 3.8 Å (Figure 2C). Moreover, the pattern of loop flexibility that is characteristic of wtPTE was also restored in neoPTE (Figure 2—figure supplement 3). Furthermore, we measured linear free energy relationships for wtPTE, AE, and neoPTE for both arylester and phosphotriester hydrolysis (Figure 2—figure supplement 4), that is, the dependence of the catalytic parameter kcat/KM on the pKa of the leaving group. For phosphotriester hydrolysis by wtPTE, a break in pKa dependence around 7 is consistent with the rate-limiting step changing on either side of this break (Hong and Raushel, 1996; Tokuriki et al., 2012). By contrast, AE shows a continuous, linear dependence over the whole pKa range, indicating that the rate-limiting step does not change. In neoPTE, the pattern characteristic for wtPTE was restored. Together with the observed structural convergence, the simplest interpretation is that the very similar active site environment enables similar residue contributions to phosphotriesterase catalysis in wt- and neoPTE. However, active site convergence is not complete, as the naphthyl binding pocket remains intact (Figure 2C, green region, Arg254 and Glu233), which likely explains why neoPTE is still bifunctional.
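
For orientation, a linear free energy relationship of the kind described above is commonly written in Brønsted form. The expression below is a generic sketch rather than a fit reported in the paper; the symbols \beta_{\mathrm{LG}} (slope) and C (intercept) are notation introduced here.

```latex
\log_{10}\!\left(\frac{k_{\mathrm{cat}}}{K_{\mathrm{M}}}\right)
  = \beta_{\mathrm{LG}}\,\mathrm{p}K_{\mathrm{a}} + C
```

In this notation, a break in the plot corresponds to \beta_{\mathrm{LG}} taking different values on either side of a particular leaving group pKa (around 7 for wtPTE and neoPTE), whereas a single value of \beta_{\mathrm{LG}} over the whole range corresponds to the continuous dependence seen for AE.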

It should be pointed out that, at this stage, we do not know the extent to which the modification of each structural element contributes to the overall >10⁴-fold activity change. Also, we cannot exclude the existence of alternative substrate binding modes from our model, as well as the role protein dynamics play in the functional switch. However, in the combined forward and reverse evolution, which involved a change in catalytic activity of >10⁴ M⁻¹s⁻¹ in each direction, only four mutations were located in the active site. Instead, most functional mutations occur in more remote positions. Therefore, it is likely that fine-tuning of the active site by these remote mutations contributes significantly to the activity changes. Taken together, the restoration of all structural elements key for phosphotriesterase activity as well as the catalytic mechanism occurred despite the alternative genotypic trajectory, suggesting that biophysical requirements exist for this particular active site shape, and that phosphotriesterase activity may otherwise be inefficient.

To further investigate whether mutational accessibility is dictated by the necessity for structural convergence to the wild-type active site, a parallel evolutionary experiment was performed. In this experiment, we attempted to restore phosphotriesterase activity by a trajectory containing only new mutations. To this end, we sequenced the improved variants after each round and removed all those containing reversions. This trajectory only resulted in a 70-fold improvement in five rounds (Figure 3A), after which the activity plateaued and no further improved variants could be found. This failure to reach wild-type activity levels without reversions, as well as the fact that three out of the five new mutations obtained (p135S, a203E, s308C, Figure 3B) were identical to those in the successful trajectory containing reversions, emphasizes that the number of adaptive trajectories that lead to a wild-type level fitness peak from AE is highly limited. However, trajectories involving neutral mutations, or trajectories which do not pass through the best variant in each round but through less improved intermediates, may exist. It is likely that a wild-type-like paraoxon binding pocket is compulsory to achieve efficient phosphotriesterase activity, and only a small set of mutations (e.g., reversions or the combination of reversions and new mutations that we identified) can provide such a solution.

Figure 3. An alternative experimental evolution, where fixation of back-to-wild-type reversions was prohibited, failed to restore the original level of PTE activity.

(A) Activity changes in the alternative trajectory. After five rounds, PTE activity plateaued at a 65-fold improvement (trajectory 2), 340-fold lower than the main trajectory (trajectory 1). (B) Mutations accumulated in the alternative trajectory. All clones containing reversions, which occurred frequently, were removed after sequencing and thereby prohibited from fixating. Three of the five new mutations also fixated in the main trajectory. The mutations c59Y and s238R occurred in variants revR1 and revR2b, but were not fixated after DNA shuffling. Amino acids found in wtPTE are shown in lower case italics. All additional variants characterized and sequenced in each round are shown in Supplementary file 1.

DOI: http://dx.doi.org/10.7554/eLife.06492.010
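
As a quick consistency check on the Figure 3 legend, the two fold values it quotes can be combined to estimate the improvement achieved by the main trajectory. The sketch uses only the numbers given in the legend; the variable names are illustrative.

```python
alt_improvement = 65   # fold improvement reached by trajectory 2 (Figure 3A legend)
gap_to_main = 340      # fold by which trajectory 2 falls short of trajectory 1

# Implied fold improvement of the main trajectory over the AE starting point.
main_improvement = alt_improvement * gap_to_main
print(main_improvement)  # 22100
```

The implied value of roughly 2 × 10⁴ agrees with the more than 10⁴-fold restoration of phosphotriesterase activity reported for the main reverse trajectory.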

Emergence of sequence incompatibility between the two PTEs

Next, we asked how the two enzymes exhibiting identical phosphotriesterase activity, wt- and neoPTE, are connected on the adaptive landscape. If they populate the same fitness plateau, amino acid exchanges between them should be neutral. On the other hand, a loss of function upon interconversion between amino acids would indicate genotypic incompatibility (Kondrashov et al., 2002; Lunzer et al., 2010; Wellner et al., 2013), meaning that the two enzymes occupy distinct positions on the landscape that are poorly connected through a neutral network. To this end, we characterized the effect of all 28 single point exchanges separating the two enzymes in each background (56 mutants in total, Figure 4 and Supplementary file 2). Mutations are considered non-neutral if they cause a >1.3-fold change in phosphotriesterase activity in lysate compared to the parent background because, in our screening system, this cut-off enabled us to reliably identify improved variants. Moreover, we have performed a statistical analysis of the mutational effects, which confirms that a >1.3-fold change is significant (p-values <0.05) in almost all cases (statistics are provided in Figure 4—source data 1 and Supplementary file 2B). According to this analysis, only eight of 28 exchanges were compatible; they were neutral in both backgrounds (Figure 4A). The remaining 20 positions showed incompatibility, 15 of which were partially incompatible, as the exchange was neutral in one background but deleterious in the other (Figure 4B,C). Five exchanges were completely incompatible; they severely decreased activity in both backgrounds (Figure 4D). Taken together, despite >90% sequence identity between wt- and neoPTE, the reverse trajectory led to a functional sequence that is poorly connected with the original one.
It remains unknown whether the two sequences comprise completely separate peaks on the adaptive landscape or are connected through a neutral network, that is, if the neutral exchanges would permit the subsequent occurrence of initially deleterious exchanges. However, because >70% of the mutated positions cause incompatibility, only one out of the 56 exchanges confers higher fitness, and this exchange (neoPTE + f306M) would require two simultaneous base changes, it is unlikely that an evolutionary transition between the two could easily occur by adaptive or strong purifying selection.
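The compatibility categories used above can be sketched in a few lines of Python. This is only an illustration of the stated rule (a >1.3-fold change with a p-value <0.05 counts as non-neutral); the names `Effect`, `is_neutral`, `is_deleterious`, and `classify` are hypothetical, the treatment of a decrease as a fold change below 1/1.3 is an assumption, and the example numbers are invented rather than taken from Supplementary file 2.

```python
from dataclasses import dataclass

CUTOFF = 1.3   # fold-change threshold stated in the text
ALPHA = 0.05   # p-value threshold stated in the text


@dataclass
class Effect:
    fold_change: float  # activity of mutant / activity of parent background
    p_value: float      # p-value of the mutant versus its parent background


def is_neutral(effect: Effect) -> bool:
    """Neutral unless the change exceeds 1.3-fold (either direction) and is significant."""
    beyond_cutoff = effect.fold_change > CUTOFF or effect.fold_change < 1 / CUTOFF
    return not (beyond_cutoff and effect.p_value < ALPHA)


def is_deleterious(effect: Effect) -> bool:
    """Significant decrease of more than 1.3-fold (assumed to mean fold change < 1/1.3)."""
    return effect.fold_change < 1 / CUTOFF and effect.p_value < ALPHA


def classify(in_wt: Effect, in_neo: Effect) -> str:
    """Assign one exchange to a category based on its effect in both backgrounds."""
    if is_neutral(in_wt) and is_neutral(in_neo):
        return "compatible"
    if is_deleterious(in_wt) and is_deleterious(in_neo):
        return "completely incompatible"
    if (is_neutral(in_wt) and is_deleterious(in_neo)) or (
        is_deleterious(in_wt) and is_neutral(in_neo)
    ):
        return "partially incompatible"
    return "other"  # e.g., beneficial in one of the backgrounds


# Invented example values, for illustration only.
print(classify(Effect(1.1, 0.40), Effect(0.9, 0.30)))
print(classify(Effect(1.0, 0.50), Effect(0.2, 0.001)))
print(classify(Effect(0.1, 0.001), Effect(0.3, 0.01)))
```

Keeping the fold-change and significance tests in one predicate makes it easy to see that an exchange with a large but statistically non-significant effect is still treated as neutral.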

Figure 4. Genotypic incompatibility between wtPTE and neoPTE.

Figure 4.

(A–D) The effect of the 28 amino acid exchanges separating the two enzymes was tested in the background of wtPTE and neoPTE, respectively. Activities are given relative to the parent mutational background, wtPTE or neoPTE. Amino acids found in wtPTE are shown in lower case italics. Color code as in Figure 1. (A) Compatible exchanges, neutral in both backgrounds. (B, C) Partially incompatible exchanges, neutral in one background but deleterious in the other. (D) Mutually incompatible exchanges. Mutations causing a >1.3-fold change compared to the respective parent mutant (dotted line) are considered non-neutral. p-values compared to each parent (Supplementary file 2B) and p-values for the effect of each mutation in the two backgrounds (Figure 4—source data 1) were calculated. Note that the effect of i313F, which causes a significant decrease in wtPTE, is statistically not significant between wtPTE and neoPTE.

DOI: http://dx.doi.org/10.7554/eLife.06492.011

Figure 4—source data 1. Comparison of the effect of mutations in wt- and neoPTE.
Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds AND a p-value <0.05 are considered significant. Non-significant values are underlined. These belong to the ‘PANEL A’ series, which consists of mutations that are neutral in both cases. The only exception is i313F from PANEL C, which is significantly different from wtPTE, but not significantly different between wt- and neoPTE.
elife06492s001.xlsx (45.8KB, xlsx)
DOI: 10.7554/eLife.06492.012

Mutational epistasis underlies genotypic irreversibility

To understand how convergence to the original function and architecture was achieved despite genotypic irreversibility and incompatibility, we performed a comprehensive mutational analysis. All 33 mutated positions were examined in the background of the three enzymes (wtPTE, AE, and neoPTE), and in the background in which they originally occurred in the evolution (i.e., in the different rounds) to identify mutations that are epistatic—that is, change their effect depending on the genetic background (Figure 5 and Supplementary file 2). To determine whether or not the measured changes were significant, the same stringent cut-off as described above for the comparison between wt- and neoPTE was applied (statistics are provided in Supplementary file 2B–G). Furthermore, we analyzed the crystal structures to determine which mutations caused the divergence and convergence of the active site configuration.

Figure 5. Changes in phosphotriesterase activity upon mutations in five different backgrounds: wtPTE, the forward evolution, AE, the reverse evolution, and neoPTE.

Figure 5.

Thirty-three positions were mutated in the entire evolution, two of which (130 and 306) were mutated to two different amino acids. Amino acids found in wtPTE are shown in lower case italics. Numbers indicate the fold change in activity caused by a mutation in a certain background (Supplementary file 2B–F). Mutations causing a >1.3-fold change compared to the respective parent mutant are considered non-neutral. p-values compared to each parent were calculated (Supplementary file 2B,D,F). The mutations T341i in AE, l140M and t199I in the forward evolution, and V49a and s258N in the reverse evolution are not significant (p-values >0.05). Therefore, out of 144 mutations, only five show a >1.3-fold effect but are statistically not significant. Boxes that are crossed out indicate that a mutation did not occur in this background. For direct comparison, the activity changes resulting from a mutation are adjusted to the same direction: from the amino acid found in AE to the respective other amino acid (label at the top). To illustrate, the effect of R254h was measured as follows: AE and neoPTE contain Arg254, and thus the effect of R254h is directly calculated based on the comparison between AE and AE-R254h (Fold change[R254h] = Activity[AE-R254h]/Activity[AE]) and between neoPTE and neoPTE-R254h (Fold change[R254h] = Activity[neoPTE-R254h]/Activity[neoPTE]). However, because wtPTE and the forward evolution background already contain His254, the effect of introducing this amino acid has to be calculated ‘in reverse’ by notionally removing this mutation and then adding it back in, that is, based on the comparison between wtPTE-h254R and wtPTE (Fold change[R254h] = Activity[wtPTE]/Activity[wtPTE-h254R]). All mutational effects that were calculated in this ‘reverse’ way are underlined. Note that wtPTE-h254R is identical to the round 1 variant and therefore the effect in the forward evolution is the same as in the wtPTE background. 
Because R254h did not occur in the reverse evolution, no effect could be calculated in this background and the respective box is crossed out.
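The direction adjustment described in this legend can be written as a small helper. The sketch below is only an illustration of the stated rule; the function name `fold_change_ae_direction` and its parameters are hypothetical, and the activity values are invented.

```python
def fold_change_ae_direction(activity_background: float,
                             activity_point_mutant: float,
                             background_has_ae_residue: bool) -> float:
    """Fold change expressed in the direction 'AE residue -> other residue'.

    background_has_ae_residue=True  : direct case, e.g. AE (Arg254) vs AE-R254h,
                                      fold change = mutant / background.
    background_has_ae_residue=False : 'reverse' case, e.g. wtPTE (His254) vs
                                      wtPTE-h254R, fold change = background / mutant.
    """
    if activity_background <= 0 or activity_point_mutant <= 0:
        raise ValueError("activities must be positive")
    if background_has_ae_residue:
        return activity_point_mutant / activity_background
    return activity_background / activity_point_mutant


# Invented activities, for illustration only.
direct = fold_change_ae_direction(100.0, 10.0, background_has_ae_residue=True)
reverse = fold_change_ae_direction(100.0, 10.0, background_has_ae_residue=False)
print(direct, reverse)
```

Expressing every effect in the same direction means that values for a given position can be compared across backgrounds without tracking which residue each background happened to carry.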

DOI: http://dx.doi.org/10.7554/eLife.06492.013

The analysis revealed extensive epistasis during the forward and reverse evolution. In the forward evolution, the effect of mutations is significantly altered after their fixation due to epistasis caused by mutations subsequently accumulated in the trajectory. For example, some mutations initially increased (t172I in round 6 and l271F in round 14) or were neutral to (l130V in round 14) phosphotriesterase activity when they occurred in the trajectory, and were thus unfavorable to revert as their reversion would not change (V130l) or decrease (I172t, F271l) activity (Figure 6A). However, reversion of these mutations became possible (i.e., would lead to an increase in activity) in the background of AE (Figure 6A). On the contrary, h254R decreased phosphotriesterase activity when it occurred in round 1 and therefore its reversion (R254h) would initially be favorable. However, the effect of this reversion switched to unfavorable (R254h) when it was tested in AE (Figure 6B). Moreover, mutations in the forward evolution had a permissive effect on the accumulation of new mutations and, in this way, opened up a path towards the alternative trajectory taken in the reverse process; all new mutations had a neutral or negative effect on phosphotriesterase activity in the genetic background of wtPTE but most of them become positive in AE (Figure 6C); for example, AE-s308C (6.4-fold), AE-V130M (5.4-fold) and AE-p135S (2.9-fold). Because these mutations can compete with the most favorable reversions (>1.3–8-fold effect, Supplementary file 2), they were selected in the early rounds of the reverse evolution, laying the foundation for the alternative trajectory.

Figure 6. Epistasis between mutations in the forward evolution restricts some reversions while permitting others as well as new mutations.

Figure 6.

(A) Several reversions change their effect from unfavorable upon their initial occurrence in the forward evolution to favorable in AE. (B) Other reversions change their effect from favorable to unfavorable. Note that, in the forward evolution, mutations occurred in the opposite direction as shown (l130V, t172I, l271F, and h254R), but are given in the same direction as AE for direct comparison. Phosphotriesterase activity was too low to be determined in AE + R254h, but at least 10-fold reduced. (C) The effect of new mutations changes from wtPTE (small panel) to AE (large panel). Relative activities were calculated by comparing a variant containing a certain mutation with one lacking only this mutation. Mutations causing a >1.3-fold change compared to the respective parent mutant (dotted line) are considered non-neutral. p-values compared to each parent (Supplementary file 2B) and p-values for the effect of each mutation in the two respective backgrounds shown in each panel were calculated (Figure 6—source data 1, 2). Note that the mutation m293K, which causes a significant increase in AE, does not have a significantly different effect in the two backgrounds. Amino acids found in wtPTE are shown in lower case italics. Color code as in Figure 1. All other mutational effects in the different backgrounds are given in Figure 5 and Supplementary file 2.

DOI: http://dx.doi.org/10.7554/eLife.06492.014

Figure 6—source data 1. Comparison of the effect of mutations in the forward evolution and in AE (panels A, B).
Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. [b] Phosphotriesterase activity was too low to be determined in AE + R254h, but at least 10-fold reduced.
elife06492s002.xlsx (38.1KB, xlsx)
DOI: 10.7554/eLife.06492.015
Figure 6—source data 2. Comparison of the effect of mutations in wtPTE and AE (panel C).
Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. Non-significant values are underlined. Note that the effect of m293K, which causes a significant increase in AE, is statistically not significant between wtPTE and AE.
elife06492s003.xlsx (36.4KB, xlsx)
DOI: 10.7554/eLife.06492.016

In the reverse evolution, the active site architecture necessary for phosphotriesterase activity was restored largely through new mutations, which restricted the reversion of mutations accumulated in the forward evolution. Overall, nine of the 10 reversions that were initially favorable in the background of AE lost their favorable effect in neoPTE because of epistasis during the reverse evolution (Figure 7A). We were able to trace the molecular basis of this effect in several cases as described in the following. First, f306I enlarged the active site in the forward evolution, resulting in a loss of shape complementarity to paraoxon. In the reverse evolution, the nearby s308C offsets this effect by increasing the hydrophobicity of the pocket (Figure 2, Figure 7C, Figure 2—figure supplement 1A). The redundancy of the mutations f306I and s308C was also evidenced by combinatorial mutational analysis; incorporation of s308C restricts subsequent reversion of I306f due to sign epistasis (Figure 7B). While this reversion would have been favorable in isolation, phosphotriesterase activity of the double mutant AE-I306f-s308C is reduced compared to AE-s308C. Second, the active site was narrowed in the forward evolution by l271F and several other mutations in loop7/8 including l272M and i313F (Figure 7C and Figure 2—figure supplement 1B), causing steric hindrance for paraoxon. The pocket was re-opened initially by the reversion F271l. Subsequently, the new mutation s258N destabilized and altered the conformation of loop 7 and further enlarged the pocket (Figure 2, Figure 2—figure supplement 1B). We also observed that incorporation of s308C and F271l restricted the reversion of both l272M and i313F (M272l and F313i, Figure 7B). 
Third, the active site was reshaped by a subtle ∼1.0 Å shift of Leu106 and Trp131, which was likely triggered by a cluster of remote mutations occurring in the same loops (s102T, l130V, m138I, s137T, and l140M, Figure 7D, and Figure 2—figure supplement 1C). In the reverse evolution, these residues are shifted back to their original positions through two new remote mutations, p135S and V130M (Figure 7D). Again, the two mutations are redundant and mutually exclusive; p135S restricts the reversion of m138I (I138m, Figure 7B). Fourth, the distance between the two active site zinc ions decreased from 3.8 to 3.3 Å in the forward evolution through displacement of the metal-chelating His201 and the β-metal (Figure 7E), which was likely triggered by the combined action of several remote mutations in loops 4 and 5 (t172I, q180H, t199I, and a204G, Figure 2). In the reverse evolution, the positions of His201 and the β-metal, as well as the original inter-metal distance of 3.8 Å, were restored through the reversion I172t and formation of a new hydrogen bonding network with two additional new mutations, a203E and g174D (Figure 7E). These examples demonstrate that rewiring the intramolecular interaction network of the protein can result in the same physical solution in key elements in the active site. Rewiring occurs because new mutations act as ‘epistatic ratchets’ (Bridgham et al., 2009) for potential reversions, restricting their fixation and thus leading to the incompatible new enzyme neoPTE.
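The notion of sign epistasis used for the I306f/s308C pair can be illustrated with a short check. This is a minimal sketch under stated assumptions: it applies only the 1.3-fold cut-off (no p-values), the helper names `effect_sign` and `shows_sign_epistasis` are hypothetical, and the activities are invented placeholders rather than measured values.

```python
def effect_sign(fold_change: float, cutoff: float = 1.3) -> int:
    """Return +1 (beneficial), -1 (deleterious) or 0 (neutral) for a fold change."""
    if fold_change > cutoff:
        return 1
    if fold_change < 1 / cutoff:
        return -1
    return 0


def shows_sign_epistasis(act_bg1: float, act_bg1_mut: float,
                         act_bg2: float, act_bg2_mut: float,
                         cutoff: float = 1.3) -> bool:
    """True if the same mutation is beneficial in one background and deleterious in the other."""
    sign_1 = effect_sign(act_bg1_mut / act_bg1, cutoff)
    sign_2 = effect_sign(act_bg2_mut / act_bg2, cutoff)
    return sign_1 * sign_2 == -1


# Invented relative activities, for illustration only:
# a reversion that helps in AE but hurts once s308C is present.
activity = {
    "AE": 1.0,
    "AE-I306f": 3.0,
    "AE-s308C": 6.0,
    "AE-I306f-s308C": 4.0,
}
print(shows_sign_epistasis(activity["AE"], activity["AE-I306f"],
                           activity["AE-s308C"], activity["AE-I306f-s308C"]))
```

In this toy case the reversion is favorable on its own but lowers activity in the s308C background, which is the pattern that would block its fixation after s308C has been acquired.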

Figure 7. Convergence to the original active site configuration in the reverse evolution through rewiring of the molecular interaction network leads to genetic incompatibility.

Figure 7.

(A, B) Epistasis during the reverse evolution causes irreversibility and incompatibility. (A) The activity change of mutations that were favorable in the initial stage of reverse evolution, but not reverted. neoPTE background: small panel; AE background: large panel. Color code as in Figures 1, 2. Mutations causing a >1.3-fold change compared to the respective parent mutant (dotted line) are considered non-neutral. p-values compared to each parent (Supplementary file 2B) and p-values for the effect of each mutation in the two respective backgrounds shown in each panel were calculated (Figure 7—source data 1). (B) Combinations of mutations that constrained the evolutionary trajectory due to sign epistasis. Phosphotriesterase activity is shown on a linear scale. p-values are given in Supplementary file 2G. Note that the two mutants AE + F271l + s308C and AE + M272l + s308C have non-significant p-values compared to the ‘double mutant’ in this series, AE + F271l + M272l + s308C. However, determination of kcat/KM values confirms sign epistasis in this series (see also Supplementary file 2G). (C–E) Amino acid changes in the forward (left panel) and reverse evolution (right panel). (C) Reorganization of loops 7 and 8. A new mutation, s258N, caused the reorganization (see also Figure 2—figure supplement 1A,B). (D) Different combinations of remote mutations in loop 3 resulted in identical positioning of Leu106, Trp131, and Leu132 in wtPTE and neoPTE (see also Figure 2—figure supplement 1C). (E) Rewiring the interaction network in neoPTE by remote mutations in loops 4 and 5 led to β-metal displacement (see also Figure 2—figure supplement 1D). Amino acids found in wtPTE are shown in lower case italics.

DOI: http://dx.doi.org/10.7554/eLife.06492.017

Figure 7—source data 1. Comparison of the effect of mutations in neoPTE and AE (panel A).
Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. Non-significant values are underlined.
elife06492s004.xlsx (37.2KB, xlsx)
DOI: 10.7554/eLife.06492.018

Discussion

Our work demonstrated that a >10⁴-fold loss in phosphotriesterase activity, which accompanied the functional transition to a distinct chemical reaction (arylester hydrolysis) via accumulation of 26 mutations, is readily restored when the selection pressure is reverted. Phenotypic reversal has been observed in previous cases (Clarke, 1985; Lenski, 1988; Crill et al., 2000; Teotonio and Rose, 2000; Kitano et al., 2008), supporting the notion that the phenotype is largely subject to deterministic forces. The likelihood of evolutionary reversibility depends on the complexity of the system and the distance in sequence and function from the ancestor, and it is possible that starting from a more distantly evolved arylesterase would have failed to restore phosphotriesterase function. Moreover, we modulated protein stability throughout the entire trajectory using overexpression of GroEL/ES to avoid evolutionary dead ends caused by stability bottlenecks (Socha and Tokuriki, 2013; Wyganowski et al., 2013). In the absence of chaperones, adaptation may have occurred through a different pathway. Another limitation of our work is that we only examined two evolutionary trajectories (the main trajectory and the trajectory without reversions). One could imagine conducting multiple parallel evolutionary experiments to shed light on the repeatability of the trajectory taken, but unfortunately our screening system is not amenable to such a throughput. Nevertheless, our experiment shows that the genotype is subject to strong constraints: an alternative mutational pathway was taken which prevented retracing of the original pathway. Genotypic irreversibility was caused by several factors. 
First, because selection in the forward evolution was only for increased arylesterase activity (except in rounds 19–22), the effect of the mutations on phosphotriesterase activity was stochastic: most decreased phosphotriesterase activity, some did not affect it, and some increased phosphotriesterase activity. Therefore, even if one were to revert the mutations in the reverse order of their occurrence (from rounds 22 back to 1), the lack of continuous activity increases would prevent a gradual adaptive trajectory. Second, by the end of the forward evolution, several new mutations able to increase paraoxonase activity emerged due to epistasis. The fixation of these mutations then acts as an epistatic ratchet (Bridgham et al., 2009) that prevents reversions. Therefore, as soon as the first new mutation accumulates, the trajectory deviates further from the original path.

Our work suggests that only certain sets of mutations are able to cause phenotypic reversal. Although genotypic redundancy was observed, the presence of at least some reversions was essential for complete restoration of catalytic activity, and several new mutations were shared between the two trajectories examined. Similarly, other experimental evolution studies that examined parallel evolutionary trajectories starting from the same sequence often resulted in accumulation of the same mutations (Bull et al., 1997; Salverda et al., 2011; Dickinson et al., 2013; Khanal et al., 2014). These observations indicate that the number of functional mutations accessible from a particular starting point is highly limited, and that the genotype is also subjected to deterministic forces to some extent. In our case, the limited accessibility of functional mutations can be explained by the requirement to adopt the wild-type active site configuration in order to obtain efficient phosphotriester hydrolysis. Recent work by Harms et al. showed that the accessibility of functional and permissive mutations on hormone receptors is also strongly constrained by biophysical requirements imposed on the binding pocket as well as by protein dynamics (Harms and Thornton, 2014). Understanding such biophysical requirements and, in the case of enzymes, imperatives of chemical reactivity, is essential to develop our knowledge of evolutionary dynamics and constraints, although the exact nature of such requirements may be unique to each protein.

In the case of PTE, the combination of multiple subtle changes is required to fulfill these biophysical requirements and completely switch the enzyme's ability to recognize two different substrates (paraoxon vs. 2NH: tetrahedral vs. planar, P–O bond vs C–O bond cleavage, trigonal bipyramidal vs. tetrahedral transition state geometry). All but four of the 33 mutations occur in locations remote from the active site and act by fine-tuning rather than directly changing the active site configuration. Some changes occur at the sub-Å level (e.g., the shift in Trp131, Leu132, and the β-metal), and possibly act by influencing the dynamics of the active site loops. It may be that only remote mutations can achieve such subtle optimization. A mutation directly in the active site would result in a larger, more disruptive change (e.g., even a single additional carbon center would fill an additional 4 Å radius) and therefore be unable to provide the necessary fine-tuning. Other directed evolution studies also observed the accumulation of remote mutations (Morley and Kazlauskas, 2005), suggesting that fine-tuning of the active site may be a common strategy to implement a new function.

Our work reveals that the adaptive landscape of PTE is highly rugged: even single amino acid changes can regulate activities upwards or downwards and also predetermine the potential effect of subsequent mutations. As discussed above, because multiple mutations can directly or indirectly affect the same key component for catalysis, their effects are likely to be epistatic. Therefore, the alternative trajectory is caused by epistasis between mutations: frequently, those mutations that accumulate first have a permissive or restrictive effect on subsequent mutations. Overall, >70% of mutations have highly variable effects on phosphotriesterase activity, depending on the genetic background (26 out of 33 positions, Figure 5), and ∼40% showed sign epistasis (13 out of 33). The role of epistasis in natural evolution has recently received much attention, but its extent and prevalence are still under debate (Whitlock et al., 1995; Poelwijk et al., 2007; de Visser et al., 2011; Breen et al., 2012; Harms and Thornton, 2013; McCandlish et al., 2013; Kaltenbach and Tokuriki, 2014). Our findings suggest a high frequency of strong epistatic interactions during functional adaptation and therefore support the view that epistasis is paramount in shaping evolution. However, while restrictive mutations block many of the possible evolutionary trajectories, as has been previously emphasized, permissive mutations simultaneously open up new pathways, avoiding ‘evolutionary dead-ends’ and contributing to the diversity of enzyme homologs found in nature.

Moreover, our study demonstrates how genotypic irreversibility leads to the emergence of a functional sequence incompatible with the original one (Kondrashov et al., 2002; Maheshwari and Barbash, 2011), and implies the importance of evolutionary contingency on the genotypic level. In nature, the environment never ceases to change and a temporary relaxation in selection pressure (i.e., the level and type of nutrients or toxins) followed by re-adaptation (through both reversions and new compensatory mutations) may be common (Akashi et al., 2012). Higher levels of organization (such as metabolic or regulatory networks) might be subject to similar contingency; restoration of a certain function may be achieved by alternative mutations in other parts of the protein structure, in other domains, or in a different protein altogether. If the mutations are mutually exclusive, sequence incompatibilities may arise rapidly. Therefore, in addition to proposed mechanisms such as genetic drift (Akashi et al., 2012), genotypic irreversibility may contribute to the prevalence of incompatibility between orthologous enzymes (Lunzer et al., 2010; Kvitek and Sherlock, 2011; Corbett-Detig et al., 2013; Wellner et al., 2013; Schumer et al., 2014; Shafee et al., 2015).

Finally, our observations have important implications for the engineering of highly efficient enzymes—for example, how fine-tuning of multiple active site regions can confer significant activity changes, and how context-dependent such changes are. As our understanding of protein sequence-structure-function relationships grows, further rational and computational approaches need to be developed to address the role of remote mutations and epistasis to enhance our ability to create tailor-made proteins.

Materials and methods

Error-prone PCR

Error-prone PCR libraries were generated using nucleotide analogues (8-oxo-2′-deoxyguanosine-5′-triphosphate [8-oxo-dGTP] and 2′-deoxy-P-nucleoside-5′-triphosphate [dPTP]) or Mutazyme (GeneMorph II Random Mutagenesis kit, Agilent, Santa Clara, CA, United States). A typical protocol using nucleotide analogues can be found in Tokuriki and Tawfik (2009). A typical protocol using Mutazyme starts with a 50 μl PCR reaction containing 50 ng of pET-Strep-PTE template and 0.8 μM of outer primers (forward TTCCCCATCGGTGATGTC, reverse GTCACGCTGCGCGTAAC). Cycling conditions were: initial denaturation at 95°C for 2 min followed by 10 cycles of denaturation (30 s, 95°C), annealing (30 s, 63°C) and extension (1 min, 72°C), and a final extension step at 72°C for 10 min. Plasmid was removed by treatment with Dpn I (NEB, Ipswich, MA, United States). The PCR product was purified using the QIAquick PCR purification kit (Qiagen, Netherlands), and amplified further with BIOTAQ DNA polymerase (Bioline, United Kingdom) using inner primers (forward ACGATGCGTCCGGCGTA, reverse GCTAGTTATTGCTCAGCG) and starting from 20 ng of template in a 100 μl reaction volume. Cycling conditions were: initial denaturation at 95°C for 2 min followed by 20 cycles of denaturation (30 s, 95°C), annealing (1 min, 58°C) and extension (30 s, 72°C), and a final extension step at 72°C for 2 min. This gave an average of two amino acid substitutions per gene.

DNA shuffling

PTE genes of selected variants were amplified by PCR from pET-Strep-PTE plasmids using the outer primers and BIOTAQ DNA polymerase. Cycling conditions were: initial denaturation at 95°C for 2 min followed by 25 cycles of denaturation (30 s, 95°C), annealing (1 min, 63°C) and extension (1 min, 72°C), and a final extension step at 72°C for 2 min. PCR products were purified using the QIAquick PCR purification kit and mixed at equal amounts. Before the preparative digest, conditions were optimized by digesting 1 μg of template DNA with a range of DNase I concentrations (Fermentas, Waltham, MA, United States). DNase digest buffer (10×) consists of 0.5 M Tris-HCl pH 7.5 supplemented with 0.5 mg/ml BSA. In addition, reactions contained 10 mM MnCl2. Reactions were incubated for 10 min at 37°C, stopped by addition of 1/5 vol of stop buffer (30 mM EDTA pH 8.0, 30% glycerol and ≈0.6× of a DNA loading buffer) and analyzed by agarose gel electrophoresis in TBE buffer (2% agarose gel, 45 mM Tris, 45 mM boric acid, and 1 mM EDTA pH 8.0; for all other agarose gel electrophoresis procedures, we used 1% agarose gels and TAE buffer, which is 40 mM Tris, 20 mM acetic acid, and 1 mM EDTA pH 8.0). Reactions were scaled up to 10–15 μg of DNA and the digest repeated at the appropriate DNase dilution to give fragments in the range of 50–150 bp. Fragments were purified by gel extraction and 60–80 ng used in a 20 μl assembly PCR. This PCR was performed with Herculase I (Stratagene, La Jolla, CA, United States). Cycling conditions were: initial denaturation at 96°C for 90 s followed by 35 cycles of denaturation (30 s, 94°C), annealing (incremental 3°C steps from 65°C down to 41°C, 90 s each) and extension (2 min, 72°C), and a final extension step at 72°C for 10 min. 
Full-length assembly products were amplified using the inner primers and BIOTAQ DNA polymerase under the following cycling conditions: initial denaturation at 95°C for 2 min followed by 25 cycles of denaturation (30 s, 95°C), annealing (1 min, 58°C) and extension (1 min, 72°C), and a final extension step at 72°C for 2 min. The amount of assembly product used as template for this reaction was varied and product formation verified by 1% agarose gel electrophoresis. Fractions containing product were pooled and purified using the QIAquick PCR purification kit.
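The stepped annealing of the assembly PCR (3°C steps from 65°C down to 41°C, 90 s each) can be enumerated as follows. This is a sketch only: it assumes that every cycle runs through all temperature steps in sequence, and the function names are hypothetical.

```python
def annealing_steps(start_c: int = 65, end_c: int = 41, step_c: int = 3) -> list[int]:
    """Annealing temperatures from start_c down to end_c (inclusive) in step_c decrements."""
    return list(range(start_c, end_c - 1, -step_c))


def annealing_seconds_per_cycle(seconds_per_step: int = 90) -> int:
    """Total annealing time in one cycle, assuming all steps are run in every cycle."""
    return len(annealing_steps()) * seconds_per_step


steps = annealing_steps()
print(steps)                          # nine temperatures, 65 down to 41
print(annealing_seconds_per_cycle())  # 9 steps x 90 s
```

Under this assumption the annealing phase alone takes 13.5 min per cycle, which is worth keeping in mind when planning the 35-cycle assembly run.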

Construction of single and double mutants

Mutants were constructed by site-directed mutagenesis as described in the QuikChange Site-Directed Mutagenesis manual (Agilent).

Cloning

PCR products and pET-Strep-ACP vector were digested with Fermentas FastDigest Nco I and Hind III (or Kpn I, see Supplementary file 1) for 1 hr at 37°C. The vector was treated with CIP (calf-intestinal alkaline phosphatase, NEB, Ipswich, MA, United States) for an additional hour and subsequently insert and vector were purified from 1% agarose gel using the QIAquick gel extraction kit followed by the Qiagen PCR purification kit. Ligations were performed at a vector:insert mass ratio of 1:1 using T4 DNA ligase (NEB, Ipswich, MA, United States) supplemented with 0.5 mM ATP (NEB, Ipswich, MA, United States) for 2 hr at 22°C or 16°C overnight. Prior to transformation, reactions were purified by ethanol/glycogen (Fermentas, Waltham, MA, United States) precipitation. Transformation into electrocompetent E. cloni 10G (Lucigen, Middleton, WI, United States) yielded at least 10⁵ colonies.

Pre-screen on agar plates

Plasmids were extracted and re-transformed into E. coli BL21 (DE3) containing pGro7 plasmid for overexpression of the GroEL/ES chaperone system. Transformation reactions were plated on an average of 10 agar plates (140 mm diameter) containing 100 μg/ml ampicillin (or 50 μg/ml kanamycin, see Supplementary file 1) and 34 μg/ml chloramphenicol such that each plate contained 200–1000 colonies, leading to a final library size of 2000–10,000 variants. Colonies were transferred onto nitrocellulose membrane (BioTrace NT Pure Nitrocellulose Transfer Membrane 0.2 μm, PALL Life Sciences, Port Washington, NY, United States), which was then placed onto a second plate additionally containing 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), 200 μM ZnCl2 (to ensure availability of Zn2+ ions necessary for enzymatic activity), and either 20% (wt/vol) arabinose for chaperone overexpression or 20% (wt/vol) glucose for repression of chaperone expression. After expression overnight at room temperature for plates containing arabinose or for 1 hr at 37°C for plates containing glucose, the membrane was placed into an empty petri dish. For low activity levels of the parent gene where a maximum signal is desirable, cells were lysed prior to the activity assay by alternating three times between storage at −20°C and 37°C. For higher activities, the lysis step was omitted, making it easier to differentiate between different colonies. For the activity assay, 20–25 ml of 0.5% agarose in 50 mM Tris-HCl buffer, pH 7.5 containing 200 μM 2NH (Sigma, St. Louis, MO, United States) and Fast Red (Sigma, St. Louis, MO, United States) was poured onto the membrane. Red color developed within 30 min. To screen for phosphotriesterase activity, the buffer contained varying concentrations of fluorogenic phosphotriester instead of 2NH/Fast Red as indicated in Supplementary file 1. 
Turnover of O-fluoresceinyl-O,O-diethyl-thiophosphate (fluoresceinyl-DETP, excitation 495 nm, emission 520 nm) was detected in a Typhoon 9400 scanner (GE Healthcare, Wauwatosa, WI, United States) after an appropriate incubation time (0–3 hr). In the case of 7-O-diethylphosphoryl-3-cyano-4-methyl-7-hydroxycoumarin (Me-DEPCyC), activity was detected in an agarose gel imager (excitation 365 nm) using a SYBR Safe filter.

Screens in 96-well plates

Colonies exhibiting high enzymatic activity identified in the pre-screen were picked and re-grown in four to six 96-deep well plates overnight at 30°C, leading to a library of 400–600 pre-selected variants. Wells contained 200 μl lysogeny broth (LB) supplemented with 100 μg/ml ampicillin and 34 μg/ml chloramphenicol. Subsequently, deep well plates containing 500 μl LB per well supplemented with ampicillin, chloramphenicol, and 20% (wt/vol) arabinose or glucose (depending on whether chaperone overexpression was to be induced or repressed) were inoculated with 25 μl of pre-culture and grown for 2–3 hr at 37°C until the OD600 reached ∼0.6. Expression of PTE variants was induced by adding IPTG to a final concentration of 1 mM and cultures were incubated for an additional 2 hr at 30°C or for 1 hr at 37°C in rounds aimed at reducing chaperone dependence. Cells were spun down at 4°C at maximum speed (3320×g) for 5–10 min and the supernatant was removed. Pellets were frozen for a minimum of 30 min at −80°C and subsequently lysed by addition of 200 μl 50 mM Tris-HCl pH 7.5 supplemented with 0.1% (wt/vol) Triton-X100, 200 μM ZnCl2, 100 μg/ml lysozyme, and ∼1 μl of benzonase (25 U/μl, Novagen, Madison, WI, United States) per 100 ml. After 30 min of lysis at room temperature, cell debris was spun down at 4°C at 3320×g for 20 min. Depending on the activity level of the library, clarified lysate was diluted prior to the activity assay to obtain a good signal in the initial linear phase of the reaction. Reactions were performed in transparent 96-well plates containing 200 μl per well (20 μl lysate + 180 μl of 200 μM substrate in 50 mM Tris-HCl, pH 7.5 supplemented with 0.02% Triton-X100 in the case of paraoxon and 0.1% in the case of 2NH/FR). Paraoxon hydrolysis was monitored at 405 nm; 2NH hydrolysis was monitored at 500 nm via complex formation with Fast Red. Improvements >1.3-fold relative to the previous round were considered significant. 
The best clones were picked and re-grown in triplicate. The observed initial rates were normalized to cell density (determined by the OD600) and the average values determined. Approximately 10 improved variants were sequenced after each round. A description of each directed evolution round including selection criteria, the mutations found in each sequenced variant, and mention of the variants chosen as templates for the next library generation can be found in Supplementary file 1.
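The normalization and the 1.3-fold cut-off described above can be sketched in a few lines of Python. This is only an illustration: the function names, the well readings, and the use of the previous round's template as the "parent" are assumptions and do not come from the original screening pipeline.

```python
from statistics import mean

FOLD_THRESHOLD = 1.3  # cut-off for a significant improvement, as stated in the text


def normalized_rate(initial_rate, od600):
    """Initial rate divided by cell density (OD600)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return initial_rate / od600


def fold_improvement(variant_wells, parent_wells):
    """Mean normalized rate of a variant relative to its parent.

    Each argument is a list of (initial_rate, od600) tuples, one per replicate well.
    """
    variant = mean(normalized_rate(rate, od) for rate, od in variant_wells)
    parent = mean(normalized_rate(rate, od) for rate, od in parent_wells)
    return variant / parent


def is_improved(variant_wells, parent_wells, threshold=FOLD_THRESHOLD):
    """True if the variant exceeds the parent by more than the threshold."""
    return fold_improvement(variant_wells, parent_wells) > threshold


# Hypothetical triplicate readings: (initial rate in mOD/min, OD600)
parent = [(10.0, 0.50), (11.0, 0.55), (9.0, 0.45)]
variant = [(16.0, 0.50), (18.0, 0.60), (15.0, 0.50)]
print(round(fold_improvement(variant, parent), 2), is_improved(variant, parent))
```

Averaging the normalized replicates before taking the ratio mirrors the order of operations in the text (normalize to OD600, then average).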

Purification of Strep-tagged proteins for enzyme kinetics

pET-Strep-PTE plasmids were transformed into E. coli BL21 (DE3) and the cells were grown at 37°C in TB medium containing 100 μg/ml ampicillin and 200 μM ZnCl2. Expression was induced with 0.4 mM IPTG when the cell density reached an OD600 of 0.6, and the cells were then grown overnight at 20°C. Cells were harvested by centrifugation at 3320×g and 4°C for 10 min, resuspended, and lysed for 1 hr at room temperature using a 1:1 mixture of B-PER Protein Extraction Reagent (Thermo Scientific, Waltham, MA, United States) and 50 mM Tris-HCl buffer, pH 7.5 containing 200 μM ZnCl2, 100 μg/ml lysozyme and ∼1 μl of benzonase per 100 ml. Cell debris was removed by centrifugation at 30,000×g and 4°C for 45 min and the clarified lysate was passed through a 45 μm filter before loading onto a Strep-Tactin Superflow High capacity column (1 ml column volume). After several washes with 50 mM Tris-HCl buffer, pH 7.5 containing 200 μM ZnCl2, Strep-PTE variants were eluted in the same buffer containing 2.5 mM desthiobiotin according to the manufacturer's instructions (IBA BioTAGnology, Germany). Protein was dialyzed overnight against 50 mM Tris-HCl buffer, pH 7.5 containing 100 mM NaCl and concentrated if necessary. This protocol was adapted for purification in 96-well format by using AcroPrep 96 Filter Plates (Pall Life Sciences, Port Washington, NY, United States) according to the manufacturer's instructions. Lysates were clarified using Lysate Clearance plates (3 μm GxF, 0.2 μm Supor) and transferred to filter plates (0.45 μm GHP) containing 50 μl Strep-Tactin resin per well. Wells were washed 3× with 50 mM Tris-HCl pH 8.5 containing 100 mM NaCl and 200 μM ZnCl2 and 3× with pH 7.5 buffer. After elution, samples were concentrated and the elution buffer was removed using ultrafiltration plates (Omega 10K membrane).

Kinetic characterization of variants

Paraoxon, 2NH, and Fast Red were purchased from Sigma (St. Louis, MO, United States). Substrates for linear free energy relationships were gifts from Dan Tawfik's laboratory and their synthesis is described in Khersonsky and Tawfik (2005). Absorbance wavelengths and extinction coefficients are given in Supplementary file 2. For determination of initial rates in lysate, cells were grown and assayed in at least duplicate as described under the section ‘Screens in 96-well plates’. The experiment was repeated and the average change relative to the respective parent variant and the standard deviation were determined (Supplementary file 2). A Student's t-test was performed to obtain p-values. Where applicable, p-values were also calculated to determine whether the effect of a certain mutation in two different backgrounds (rather than compared to the parent mutant lacking this mutation) is significant. For determination of initial rates using purified enzyme, variants were expressed and purified in 96-well format in at least duplicate and assayed as described above in ‘Screens in 96-well plates’. For determination of Michaelis–Menten parameters, reactions were performed in triplicate at a range of substrate concentrations (0–2000 μM). Reactions were initiated by addition of 180 μl of substrate solution (in 50 mM Tris-HCl, pH 7.5 supplemented with 0.02–0.1% Triton X-100) to 20 μl of enzyme in 50 mM Tris-HCl, pH 7.5 supplemented with 200 μM ZnCl2 and 0.02% Triton X-100. Data were fit to Michaelis–Menten kinetics in Kaleidagraph.
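As an illustration of the Michaelis–Menten fit, the following Python sketch estimates Vmax and KM by least squares. It is not the Kaleidagraph procedure used for the reported parameters. It assumes a simple scheme in which Vmax is solved in closed form for each trial KM and KM is located by a golden-section search, and the data points are hypothetical, noise-free values.

```python
import math


def best_vmax(km, s, v):
    """Closed-form least-squares Vmax for a fixed Km."""
    x = [si / (km + si) for si in s]
    return sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)


def sse(km, s, v):
    """Sum of squared residuals at a given Km (with Vmax optimized)."""
    vmax = best_vmax(km, s, v)
    return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))


def fit_michaelis_menten(s, v, km_lo=1e-2, km_hi=1e5, iterations=200):
    """Fit v = Vmax * S / (Km + S) by golden-section search over log10(Km).

    Assumes the residual sum of squares has a single minimum in the bracket.
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_lo), math.log10(km_hi)
    for _ in range(iterations):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(10 ** c, s, v) < sse(10 ** d, s, v):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = 10 ** ((a + b) / 2)
    return best_vmax(km, s, v), km


# Hypothetical data generated with Vmax = 100 (arbitrary units) and Km = 250 uM
substrate = [0, 25, 50, 100, 250, 500, 1000, 2000]  # uM, spanning the 0-2000 uM range
rates = [100 * s / (250 + s) for s in substrate]
vmax, km = fit_michaelis_menten(substrate, rates)
print(f"Vmax = {vmax:.2f}, Km = {km:.2f} uM")
```

Dividing the fitted Vmax by the enzyme concentration in the well would then give kcat, and kcat/KM follows directly.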

Purification of untagged proteins for crystallization

AE and neoPTE genes were cloned into the pET32-trx plasmid without a Strep-tag using FastDigest NcoI and HindIII as described above, transformed into E. coli BL21 (DE3), and grown for 72 hr at 30°C in TB medium containing 100 μg/ml ampicillin and 500 μM ZnCl2. Cells were harvested by centrifugation at 3320×g and 4°C for 10 min, resuspended in 20 mM Tris-HCl pH 8 containing 100 μM ZnCl2, and lysed by sonication (OMNI Sonic Ruptor 400, Thermo Scientific, Waltham, MA, United States, 3× 30 s on/60 s off, amplitude 40%). Cell debris was removed by centrifugation at 30,000×g and 4°C for 45 min and the lysate was filtered through 45 μm filters (Millipore). The lysate was loaded onto two HiPrep Q FF columns (GE Healthcare, Wauwatosa, WI, United States) in series. PTE elutes in the flow-through as well as in the early wash fractions. Active fractions were pooled and passed through a 45 μm filter. The sample was concentrated over a Millipore spin column (MWCO 30,000) and purified by gel filtration (HiLoad 16/60 Superdex 200 prep grade, GE Healthcare, Wauwatosa, WI, United States). Protein was concentrated to 12 mg/ml and stored at 4°C.

Crystallization, data collection and structure determination

Crystals of wtPTE, AE, and neoPTE were obtained by vapor diffusion from a solution containing protein (10 mg/ml) plus 20–30% wt/vol 2-methyl-2,4-pentanediol (MPD), buffered to pH 6.5 by 0.1 M sodium cacodylate, as described previously (Tokuriki et al., 2012). Serial microseeding was performed to increase crystal size (Bergfors, 2003). Crystals grew to approximately 200 μm and were soaked in 40% MPD, 0.1 M sodium cacodylate as cryoprotectant for 5–10 min and then flash-cooled to 100 K in the gaseous nitrogen cryostream of a cooling device (Oxford Cryosystems, United Kingdom). Data were collected from frozen crystals on beamline MX1 of the Australian Synchrotron (AS). The data were indexed and integrated with XDS (Kabsch, 2010) and scaled with Aimless (Evans and Murshudov, 2013), with the data cut-off being made at the highest resolution that retained a mean half-dataset correlation coefficient (CC1/2) of at least 0.5 in the outer shell (Supplementary file 3) (Karplus and Diederichs, 2012). Although all three proteins were crystallized under the same conditions, neoPTE crystallized in a different space group with different unit cell dimensions (P65, with a h, −h−k, l merohedral twin operator, vs C2221). A starting model for refinement (R18; PDB ID: 4E3T) (Tokuriki et al., 2012) was used to provide initial phases. Structures were refined with phenix.refine (Afonine et al., 2012) and validated with MolProbity (Chen et al., 2010), as implemented in the PHENIX software suite (Supplementary file 3) (Adams et al., 2010).
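The CC1/2-based resolution cut-off can be illustrated with a short Python sketch. The shell statistics below are hypothetical and the function is not part of XDS or Aimless. It simply walks from low to high resolution and keeps the last shell whose CC1/2 is still at least 0.5, assuming per-shell CC1/2 values are already available from the scaling step.

```python
def resolution_cutoff(shells, min_cc_half=0.5):
    """Return the high-resolution limit (in angstroms) of the last acceptable shell.

    `shells` is a list of (d_min, cc_half) tuples. Shells are examined from low
    resolution (large d_min) to high resolution (small d_min), and the scan stops
    at the first shell whose CC1/2 drops below `min_cc_half`.
    Returns None if even the lowest-resolution shell fails the criterion.
    """
    ordered = sorted(shells, key=lambda shell: shell[0], reverse=True)
    cutoff = None
    for d_min, cc_half in ordered:
        if cc_half < min_cc_half:
            break
        cutoff = d_min
    return cutoff


# Hypothetical per-shell statistics: (d_min in angstroms, CC1/2)
example_shells = [
    (3.0, 0.998),
    (2.4, 0.990),
    (2.0, 0.950),
    (1.8, 0.800),
    (1.7, 0.620),
    (1.6, 0.510),
    (1.5, 0.310),
]
print(resolution_cutoff(example_shells))
```

Stopping at the first failing shell, rather than taking any later shell that happens to pass, is a design choice made here so that a noisy outlier at very high resolution cannot extend the cut-off.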

Acknowledgements

We thank Dan S Tawfik and members of the Tokuriki laboratory for comments on the manuscript and Kirsten Wyganowski for technical support. This work was supported by the Natural Sciences and Engineering Research Council of Canada. MK was partially supported by the EU ITN ProSA. NT is a CIHR new investigator and a Michael Smith Foundation for Health Research (MSFHR) career investigator. CJJ acknowledges an ARC Discovery Early Career Researcher Award. FH is an ERC Starting Investigator.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

  • Natural Sciences and Engineering Research Council of Canada (Conseil de Recherches en Sciences Naturelles et en Génie du Canada) Discovery Grants RGPIN 418262-12 to Nobuhiko Tokuriki.

  • Australian Research Council (ARC) FT140101059 to Colin J Jackson.

  • European Commission (EC) MRTN-CT-2005-019475 to Florian Hollfelder.

  • Biotechnology and Biological Sciences Research Council (BBSRC) BB/I004327/1 to Florian Hollfelder.

  • European Research Council (ERC) 208813 to Florian Hollfelder.

Additional information

Competing interests

The authors declare that no competing interests exist.

Author contributions

MK, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article.

NT, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article.

CJJ, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article.

ECC, Acquisition of data, Analysis and interpretation of data.

FH, Analysis and interpretation of data, Drafting or revising the article.

Additional files

Supplementary file 1.

Description of the directed evolution rounds.

DOI: http://dx.doi.org/10.7554/eLife.06492.019

Supplementary file 2.

Kinetic parameters of all variants.

DOI: http://dx.doi.org/10.7554/eLife.06492.020

Supplementary file 3.

Crystallographic information.

DOI: http://dx.doi.org/10.7554/eLife.06492.021


References

  1. Adams PD, Afonine PV, Bunkoczi G, Chen VB, Davis IW, Echols N, Headd JJ, Hung LW, Kapral GJ, Grosse-Kunstleve RW, McCoy AJ, Moriarty NW, Oeffner R, Read RJ, Richardson DC, Richardson JS, Terwilliger TC, Zwart PH. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallographica Section D: Biological Crystallography. 2010;66:213–221. doi: 10.1107/S0907444909052925.
  2. Afonine PV, Grosse-Kunstleve RW, Echols N, Headd JJ, Moriarty NW, Mustyakimov M, Terwilliger TC, Urzhumtsev A, Zwart PH, Adams PD. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallographica Section D: Biological Crystallography. 2012;68:352–367. doi: 10.1107/S0907444912001308.
  3. Aharoni A, Gaidukov L, Khersonsky O, Mc QG, Roodveldt C, Tawfik DS. The ‘evolvability’ of promiscuous protein functions. Nature Genetics. 2005;37:73–76. doi: 10.1038/ng1482.
  4. Akashi H, Osada N, Ohta T. Weak selection and protein evolution. Genetics. 2012;192:15–31. doi: 10.1534/genetics.112.140178.
  5. Bergfors T. Seeds to crystals. Journal of Structural Biology. 2003;142:66–76. doi: 10.1016/S1047-8477(03)00039-X.
  6. Breen MS, Kemena C, Vlasov PK, Notredame C, Kondrashov FA. Epistasis as the primary factor in molecular evolution. Nature. 2012;490:535–538. doi: 10.1038/nature11510.
  7. Bridgham JT, Ortlund EA, Thornton JW. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature. 2009;461:515–519. doi: 10.1038/nature08249.
  8. Bull JJ, Badgett MR, Wichman HA, Huelsenbeck JP, Hillis DM, Gulati A, Ho C, Molineux IJ. Exceptional convergent evolution in a virus. Genetics. 1997;147:1497–1507. doi: 10.1093/genetics/147.4.1497.
  9. Carneiro M, Hartl DL. Colloquium papers: adaptive landscapes and protein evolution. Proceedings of the National Academy of Sciences of USA. 2010;107(Suppl 1):1747–1751. doi: 10.1073/pnas.0906192106.
  10. Chen VB, Arendall WB, III, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, Murray LW, Richardson JS, Richardson DC. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallographica Section D: Biological Crystallography. 2010;66:12–21. doi: 10.1107/S0907444909042073.
  11. Chou HH, Chiu HC, Delaney NF, Segre D, Marx CJ. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science. 2011;332:1190–1192. doi: 10.1126/science.1203799.
  12. Clarke CA. Evolution in reverse: clean air and the peppered moth. Biological Journal of the Linnean Society. 1985;26:189–199. doi: 10.1111/j.1095-8312.1985.tb01555.x.
  13. Collin R, Miglietta MP. Reversing opinions on Dollo's Law. Trends in Ecology & Evolution. 2008;23:602–609. doi: 10.1016/j.tree.2008.06.013.
  14. Corbett-Detig RB, Zhou J, Clark AG, Hartl DL, Ayroles JF. Genetic incompatibilities are widespread within species. Nature. 2013;504:135–137. doi: 10.1038/nature12678.
  15. Crill WD, Wichman HA, Bull JJ. Evolutionary reversals during viral adaptation to alternating hosts. Genetics. 2000;154:27–37. doi: 10.1093/genetics/154.1.27.
  16. Cruickshank DW. Remarks about protein structure precision. Acta Crystallographica. Section D, Biological Crystallography. 1999;55:583–601. doi: 10.1107/S0907444998012645.
  17. de Visser JA, Cooper TF, Elena SF. The causes of epistasis. Proceedings of the Royal Society B: Biological Sciences. 2011;278:3617–3624. doi: 10.1098/rspb.2011.1537.
  18. Dickinson BC, Leconte AM, Allen B, Esvelt KM, Liu DR. Experimental interrogation of the path dependence and stochasticity of protein evolution using phage-assisted continuous evolution. Proceedings of the National Academy of Sciences of USA. 2013;110:9007–9012. doi: 10.1073/pnas.1220670110.
  19. Evans PR, Murshudov GN. How good are my data and what is the resolution? Acta Crystallographica Section D: Biological Crystallography. 2013;69:1204–1214. doi: 10.1107/S0907444913000061.
  20. Gould SJ. Wonderful life: burgess shale and the nature of history. New York: W.W. Norton & Company, Inc; 2007.
  21. Harms MJ, Thornton JW. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nature Reviews Genetics. 2013;14:559–571. doi: 10.1038/nrg3540.
  22. Harms MJ, Thornton JW. Historical contingency and its biophysical basis in glucocorticoid receptor evolution. Nature. 2014;512:203–207. doi: 10.1038/nature13410.
  23. Hong SB, Raushel FM. Metal-substrate interactions facilitate the catalytic activity of the bacterial phosphotriesterase. Biochemistry. 1996;35:10904–10912. doi: 10.1021/bi960663m.
  24. Kabsch W. Xds. Acta Crystallographica Section D: Biological Crystallography. 2010;66:125–132. doi: 10.1107/S0907444909047337.
  25. Kaltenbach M, Tokuriki N. Dynamics and constraints of enzyme evolution. Journal of Experimental Zoology B: Molecular and Developmental Evolution. 2014;322:468–487. doi: 10.1002/jez.b.22562.
  26. Karplus PA, Diederichs K. Linking crystallographic model and data quality. Science. 2012;336:1030–1033. doi: 10.1126/science.1218231.
  27. Kawecki TJ, Lenski RE, Ebert D, Hollis B, Olivieri I, Whitlock MC. Experimental evolution. Trends in Ecology & Evolution. 2012;27:547–560. doi: 10.1016/j.tree.2012.06.001.
  28. Khan AI, Dinh DM, Schneider D, Lenski RE, Cooper TF. Negative epistasis between beneficial mutations in an evolving bacterial population. Science. 2011;332:1193–1196. doi: 10.1126/science.1203801.
  29. Khanal A, Yu McLoughlin S, Kershner JP, Copley SD. Differential effects of a mutation on the normal and promiscuous activities of orthologs: implications for natural and directed evolution. Molecular Biology and Evolution. 2014;32:100–108. doi: 10.1093/molbev/msu271.
  30. Khersonsky O, Tawfik DS. Structure-reactivity studies of serum paraoxonase PON1 suggest that its native activity is lactonase. Biochemistry. 2005;44:6371–6382. doi: 10.1021/bi047440d.
  31. Khersonsky O, Tawfik DS. Enzyme promiscuity: a mechanistic and evolutionary perspective. Annual Review of Biochemistry. 2010;79:471–505. doi: 10.1146/annurev-biochem-030409-143718.
  32. Kitano J, Bolnick DI, Beauchamp DA, Mazur MM, Mori S, Nakano T, Peichel CL. Reverse evolution of armor plates in the threespine stickleback. Current Biology. 2008;18:769–774. doi: 10.1016/j.cub.2008.04.027.
  33. Kondrashov AS, Sunyaev S, Kondrashov FA. Dobzhansky-Muller incompatibilities in protein evolution. Proceedings of the National Academy of Sciences of USA. 2002;99:14878–14883. doi: 10.1073/pnas.232565499.
  34. Kvitek DJ, Sherlock G. Reciprocal sign epistasis between frequently experimentally evolved adaptive mutations causes a rugged fitness landscape. PLOS Genetics. 2011;7:e1002056. doi: 10.1371/journal.pgen.1002056.
  35. Lenski RE. Experimental studies of pleiotropy and epistasis in Escherichia coli. II. Compensation for maladaptive effects associated with resistance to T4 virus. Evolution; International Journal of Organic Evolution. 1988;42:433–440. doi: 10.2307/2409029.
  36. Lobkovsky AE, Koonin EV. Replaying the tape of life: quantification of the predictability of evolution. Frontiers in Genetics. 2012;3:246. doi: 10.3389/fgene.2012.00246.
  37. Lunzer M, Golding GB, Dean AM. Pervasive cryptic epistasis in molecular evolution. PLOS Genetics. 2010;6:e1001162. doi: 10.1371/journal.pgen.1001162.
  38. MacLean RC, Perron GG, Gardner A. Diminishing returns from beneficial mutations and pervasive epistasis shape the fitness landscape for rifampicin resistance in Pseudomonas aeruginosa. Genetics. 2010;186:1345–1354. doi: 10.1534/genetics.110.123083.
  39. Maheshwari S, Barbash DA. The genetics of hybrid incompatibilities. Annual Review of Genetics. 2011;45:331–355. doi: 10.1146/annurev-genet-110410-132514.
  40. McCandlish DM, Rajon E, Shah P, Ding Y, Plotkin JB. The role of epistasis in protein evolution. Nature. 2013;497:E1–E2. doi: 10.1038/nature12219. discussion E2–E3.
  41. Morley KL, Kazlauskas RJ. Improving enzyme properties: when are closer mutations better? Trends in Biotechnology. 2005;23:231–237. doi: 10.1016/j.tibtech.2005.03.005.
  42. Orr HA. The population genetics of speciation: the evolution of hybrid incompatibilities. Genetics. 1995;139:1805–1813. doi: 10.1093/genetics/139.4.1805.
  43. Peisajovich SG, Tawfik DS. Protein engineers turned evolutionists. Nature Methods. 2007;4:991–994. doi: 10.1038/nmeth1207-991.
  44. Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ. Empirical fitness landscapes reveal accessible evolutionary paths. Nature. 2007;445:383–386. doi: 10.1038/nature05451.
  45. Romero PA, Arnold FH. Exploring protein fitness landscapes by directed evolution. Nature Reviews Molecular Cell Biology. 2009;10:866–876. doi: 10.1038/nrm2805.
  46. Roodveldt C, Tawfik DS. Shared promiscuous activities and evolutionary features in various members of the amidohydrolase superfamily. Biochemistry. 2005;44:12728–12736. doi: 10.1021/bi051021e.
  47. Salverda ML, Dellus E, Gorter FA, Debets AJ, van der Oost J, Hoekstra RF, Tawfik DS, de Visser JA. Initial mutations direct alternative pathways of protein evolution. PLOS Genetics. 2011;7:e1001321. doi: 10.1371/journal.pgen.1001321.
  48. Schumer M, Cui R, Powell DL, Dresner R, Rosenthal GG, Andolfatto P. High-resolution mapping reveals hundreds of genetic incompatibilities in hybridizing fish species. eLife. 2014;3:e02535. doi: 10.7554/eLife.02535.
  49. Shafee T, Gatti-Lafranconi P, Minter R, Hollfelder F. Handicap-recover evolution leads to a chemically versatile, nucleophile-permissive protease. Chembiochem. 2015 doi: 10.1002/cbic.201500295.
  50. Socha RD, Tokuriki N. Modulating protein stability—directed evolution strategies for improved protein function. The FEBS Journal. 2013;280:5582–5595. doi: 10.1111/febs.12354.
  51. Stebbins J. The Law of diminishing returns. Science. 1944;99:267–271. doi: 10.1126/science.99.2571.267.
  52. Teotonio H, Rose MR. Variation in the reversibility of evolution. Nature. 2000;408:463–466. doi: 10.1038/35044070.
  53. Teotonio H, Rose MR. Perspective: reverse evolution. Evolution; International Journal of Organic Evolution. 2001;55:653–660. doi: 10.1554/0014-3820(2001)055[0653:PRE]2.0.CO;2.
  54. Tokuriki N, Jackson CJ, Afriat-Jurnou L, Wyganowski KT, Tang R, Tawfik DS. Diminishing returns and tradeoffs constrain the laboratory optimization of an enzyme. Nature Communications. 2012;3:1257. doi: 10.1038/ncomms2246.
  55. Tokuriki N, Tawfik DS. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature. 2009;459:668–673. doi: 10.1038/nature08009.
  56. Wellner A, Raitses Gurevich M, Tawfik DS. Mechanisms of protein sequence divergence and incompatibility. PLOS Genetics. 2013;9:e1003665. doi: 10.1371/journal.pgen.1003665.
  57. Whitlock MC, Phillips PC, Moore FB, Tonsor SJ. Multiple fitness peaks and epistasis. Annual Review of Ecology, Evolution, and Systematics. 1995;26:601–629. doi: 10.1146/annurev.es.26.110195.003125.
  58. Wyganowski KT, Kaltenbach M, Tokuriki N. GroEL/ES buffering and compensatory mutations promote protein evolution by stabilizing folding intermediates. Journal of Molecular Biology. 2013;425:3403–3414. doi: 10.1016/j.jmb.2013.06.028.
eLife. 2015 Aug 14;4:e06492. doi: 10.7554/eLife.06492.022

Decision letter

Editor: Michael Laub

eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.

Thank you for sending your work entitled “Reverse evolution leads to molecular speciation despite functional and active-site convergence” for consideration at eLife. Your article has been favorably evaluated by Michael Marletta (Senior Editor) and three reviewers, one of whom is a member of our Board of Reviewing Editors.

The Reviewing Editor and the reviewers discussed their comments before we reached a decision to invite you to revise your submission. As you'll see below, the reviewers were somewhat enthusiastic about the work and its potential suitability for eLife. However, there were a number of major concerns about the clarity and precision of the writing and discussion of your results. Although standard practice for eLife is to provide authors a single review that integrates the major comments from the reviewers, we have decided to provide you the reviewer comments in full. As you'll see the reviewer comments are thoughtful and detailed, so any attempt to summarize them runs the risk of obscuring or confusing their message.

Reviewer #1:

Overall, I like this manuscript and I think it presents a nice story about enzyme evolution. The data are relatively clean and straightforward. My primary concern with this manuscript is the frequent claim throughout it that evolution is “irreversible”. This isn't really accurate, I don't think. The mutations that accumulated could always be reversed in the exact, reverse order in which they initially occurred. I think what the authors really mean when they say “irreversible” is that (i) an exact reversion of multiple mutations is improbable and (ii) reversion of only some mutations may not restore the original phenotype if other mutations have also occurred. In other words, let's say evolution of wtPTE into AE required substitutions Z1z and Y2y (i.e. ZY at positions 1/2 = phenotype 1 and zy at positions 1/2 = phenotype 2) but that a third mutation, say X3x, occurred at position 3. If that mutation is incompatible with ZY (i.e. ZYX = phenotype 1, while ZYx = dead protein) then the authors are saying evolution is irreversible. But that's not really logical. It is formally reversible, provided one also reverses the mutation at position 3. So, bottom line, I get what the authors are driving at, but their language is imprecise and, as a consequence, misleading. This issue affects much of the manuscript and needs attention.
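The three-position example above can be written out as a small lookup table. The Python sketch below is purely illustrative. The genotype strings follow the reviewer's notation, the `revert` helper is hypothetical, and it is assumed that zy at positions 1/2 gives phenotype 2 regardless of the residue at position 3.

```python
# Genotype-to-phenotype map for the reviewer's toy example.
# Upper case = ancestral residue, lower case = derived residue.
PHENOTYPE = {
    "ZYX": "phenotype 1",  # ancestral enzyme
    "zyX": "phenotype 2",  # assumption: zy gives phenotype 2 with X at position 3
    "zyx": "phenotype 2",  # assumption: zy gives phenotype 2 with x at position 3
    "ZYx": "dead",         # x at position 3 is incompatible with ZY
}


def revert(genotype, ancestor, positions):
    """Reset the given 0-based positions of `genotype` to the ancestral residue."""
    residues = list(genotype)
    for position in positions:
        residues[position] = ancestor[position]
    return "".join(residues)


ancestor, evolved = "ZYX", "zyx"

partial = revert(evolved, ancestor, [0, 1])      # revert positions 1 and 2 only
full = revert(evolved, ancestor, [0, 1, 2])      # revert all three positions

print(partial, "->", PHENOTYPE[partial])
print(full, "->", PHENOTYPE[full])
```

Reverting only the two function-switching positions yields the dead genotype, whereas reverting all three restores the ancestral phenotype, which is the distinction the reviewer draws between "improbable" and "irreversible".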

Rephrasing the argument above, what the authors have demonstrated here is that there is redundancy, or degeneracy, in PTE catalysis, meaning that multiple genotypes can give rise to the same phenotype. Because of this redundancy, evolution is unlikely to return to the exact same genotype following selection for a new trait followed by selection for the old trait. Additionally, they have nicely documented the epistasis that occurs during their experimental evolution, showing how mutations can have very different effects on the wtPTE, AE, and neoPTE genetic backgrounds, or what they call genetic incompatibility. I think these ideas should be articulated more clearly rather than selling the idea of irreversibility, which, as noted above, doesn't seem at all logical to me.

In the subsection “Genotypic irreversibility and constraints underlying phenotypic reversion” the authors state: “Nine additional, ‘new mutations’ were needed to accomplish the phenotypic reversion.” Nine additional mutations certainly arose, but was each one of them needed? This statement needs to be better justified, or I missed something in a supplemental figure/table. If it's the latter, then the data should probably be better explained/laid out in the manuscript and main figures.

Reviewer #2:

The authors use directed evolution techniques to study evolutionary irreversibility and epistasis. They put an enzyme under selection pressure to evolve a new activity and then, after it has done so, reverse the selection to drive reacquisition of the ancestral activity. This allows the authors to chart the step-by-step mutational trajectories taken in both directions, to identify any genetic incompatibilities between the starting and ending states, and to use X-ray crystallography to identify causes of incompatibility.

This is a careful and detailed study. It sheds useful light on an important question in molecular evolution: the extent of epistasis during evolutionary trajectories and its potential effects on evolutionary processes, particularly evolutionary reversal to ancestral states. Numerous studies have addressed this problem computationally or at the phenotypic level, but our understanding of molecular irreversibility is limited. A previous molecular study showed that epistasis can prevent functionally important substitutions during historical evolution from being directly reversed, but that study did not show whether or how the ancestral function could be reacquired if the protein were subject to selection pressure to do so. The submitted manuscript is the first to use a directed evolution approach to address this question, and this strategy is quite appropriate for this purpose.

I believe that the paper has the potential to be a strong contribution to eLife. There are some significant issues in the way the argument is framed and the data presented that make the paper more confusing than it ought to be or, in a few places, imprecise or inadequately supported in its claims. It should be possible for the authors to address all these concerns. I detail these below.

Issues related to interpretation and analysis:

1) One significant limitation is that the authors have examined only a single evolutionary pathway in both directions. Decisively answering questions of contingency and determinism and repeatability and the number of paths between states requires examination of many paths, not just one. The authors' detailed characterization of the trajectory they studied does allow certain general inferences on these subjects to be drawn, but they are indirect because a single instance of evolution is studied. This does not invalidate the work, but some acknowledgement and discussion of the cautions required by this limitation are important.

2) Claims about effects of mutations are made without appropriate statistical analysis and rationale. For example, in Figure 4 and the claims based on these data, the authors use a 1.3-fold change in activity as a threshold of significance: changes smaller than this are considered neutral, while those that reduce activity by a factor larger than this (or by its reciprocal, to be precise), are considered deleterious. I don't see a justification for this threshold, which seems arbitrary, although it is very important for the conclusions. The authors must justify this choice rationally or change their claims. In this figure, the claim the authors want to make pertains to incompatibility; for such a claim to be made, there should be evidence to show that a state from PTE is deleterious in neoPTE, or vice versa: the evidence of a deleterious effect must be strong enough for us to rule out the possibility that the state is in fact neutral or beneficial. I see error bars, but I don't know what they represent, and I therefore don't know how confident I should be that many of these mutations are authentically deleterious (that is, that I can rule out the possibility that the true effect of the mutation is zero, but error over a small number of measurements has yielded a spurious reduction in activity.) The authors should apply an appropriate statistical procedure to keep the rate of false discoveries of incompatibility to an acceptable level, considering the large number of tests being conducted.
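One common way to control the false discovery rate across many tests is the Benjamini–Hochberg procedure. The Python sketch below is offered only as an example of such a procedure. It is not a method named in the letter or used in the manuscript, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans in the original order: True where the null
    hypothesis (here, "the exchange is neutral") is rejected at the given FDR.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            max_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[index] = True
    return rejected


# Hypothetical p-values for six amino acid exchanges
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
calls = benjamini_hochberg(p_values, fdr=0.05)
print(calls)
```

With these made-up numbers, four exchanges fall below an uncorrected threshold of 0.05 but only two survive the correction, which illustrates why the number of tests matters for claims of incompatibility.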

Similarly, in Figure 6, the authors say that R111s exhibits sign epistasis when tested in the two different backgrounds. It looks like one has a very small negative effect and the other a very small positive effect, and it is not obvious whether we can rule out zero (and therefore no difference in sign) for either one, based on the error bars. There are many other putative examples of sign epistasis in this figure for which it appears to be difficult to rule out different signs due to stochastic/measurement error around a near-zero effect. Again, we need a proper test or statistical characterization of confidence for all these claims. Similar problems affect many of the paper's figures.

3) The authors venture hypotheses about biophysical mechanisms underlying the observed functional effects of mutations based on X-ray crystal structures and structural models in which the various ligands have been computationally placed into the active site of the crystal structures. The conclusions about the causes of functional incompatibility are mostly based on observations of clashes observed in models between proteins and their non-preferred substrates. Proteins are flexible, however, and can often accommodate new ligands in ways that models do not predict. This possibility and the uncertainty introduced by using a model of a complex should be acknowledged.

4) The authors propose structural mechanisms for some of the functional effects of mutations that they observe: some of these involve changes in structure at the sub-angstrom level, but the structures themselves are only at a resolution of 1.6 to 2.0 angstrom. How confident can we be in such fine-scale putative differences in the location of atoms given the actual resolution of the structures? These inferences should be made with caution or not made at all.

There are some conceptual issues that I feel should be thought out more clearly and expressed more precisely:

1) The authors swap states between PTE and neoPTE, observe that swapping some residues severely compromises function, and conclude that the proteins occupy “separate adaptive peaks” that are not “connected” on the adaptive landscape. The thinking and language here are imprecise. All proteins are by definition connected via some number of mutations on the adaptive landscape. The question the authors seek to address is whether they are accessible from each other via a continuously connected neutral (or functional) network of single-replacement changes (in the sense used by Maynard Smith and A. Wagner). Incompatibility of single mutation-swaps does not mean that proteins cannot be reached through such a connected network. If permissive mutations interact epistatically with the “incompatible” amino acid and if the permissive mutations can be introduced without deleterious effect on the function, then the genotypes with and without the “incompatible” amino acid can in fact be connected. Incompatibility does mean that the cluster of functional genotypes containing the ancestral state at the residue of interest and the functional network of genotypes containing the derived state have fewer connections than they might have if no incompatibility were present. That is all that can be said about connectivity without further analysis. The large number of incompatibilities the authors observe may indicate that the number of connections is reduced in a fairly dramatic way, but how dramatic the reduction is relative to the total number of possible connections a priori is also unknown without further analysis.

Another issue related to epistasis, sequence space and contingency is the sufficiency of selection to move a protein through sequence space from one protein to another. If the permissive mutations that make residues tolerable in one genetic background but not the other can be introduced without affecting the PTE function, then the authors' observations would show that selection for PTE function would not be sufficient to drive the reacquisition of the ancestral state at the residue of interest; in fact, this does seem to be true. There are probably other interesting and valid ways for the authors to describe their findings, but they should modify their language to make the conceptual model of an adaptive landscape and the conclusions they draw more precise.
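The connectivity argument in point 1 can be sketched with a toy two-site landscape. Everything here is hypothetical: site 0 stands for the "incompatible" residue, site 1 for a permissive mutation, and the set of functional genotypes is chosen by hand to show that a deleterious single swap does not by itself disconnect two functional genotypes.

```python
from collections import deque


def neighbors(genotype, alphabet="01"):
    """Yield all genotypes that differ from `genotype` at exactly one site."""
    for i, state in enumerate(genotype):
        for alt in alphabet:
            if alt != state:
                yield genotype[:i] + alt + genotype[i + 1:]


def functional_path(start, goal, functional):
    """Breadth-first search for a path of single changes through functional genotypes."""
    if start not in functional or goal not in functional:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt in functional and nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical landscape: "10" (swap at site 0 without the permissive change) is nonfunctional.
functional = {"00", "01", "11"}
path = functional_path("00", "11", functional)
```

Here the direct swap at site 0 fails in the "00" background, yet "00" and "11" remain connected through "01". Removing "01" from the functional set would disconnect them.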

2) According to the Abstract, the fact that selection for the ancestral function drove the acquisition of a new genotype provides evidence for irreversibility. This is not clearly thought out. If there were no epistasis and no selective barriers to reacquisition of the ancestral genotype (or any other genotype that confers the ancestral function), and so long as there were numerous genotypes that encode the ancestral function, then selection for the ancestral phenotype is likely to produce different outcomes every time. That is, the simple fact that under selection the genotype that evolved is different from the ancestral genotype does not establish epistasis, incompatibility, or irreversibility. It is the authors' other experiments that show this.

3) The authors say that selection for aryl esterase activity was accompanied by a decrease in PTE activity, which indicates an “intrinsic trade-off.” This conclusion is not correct. If most random mutations reduce PTE activity and there is no necessary or intrinsic mechanistic trade-off between PTE and AE activity, then mutations that increase AE activity are more likely than not to reduce PTE activity.

4) The authors use a metaphor of speciation for the evolution of incompatible enzyme genotypes. In the Discussion, however, it appears that they use the term literally, as if the evolution of this enzyme could cause reproductive isolation between populations of organisms. There is no evidence to support this causal leap, and the authors should avoid this type of discussion of the issue. If they are making an analogy between reproductive isolation of populations due to epistasis/Dobzhansky-Muller effect and the evolution of incompatible amino acids in their PTE enzymes that is fine, but the analogy should be labeled as such – as a metaphor. The authors say that evolution of the genotype is “constrained” to follow certain paths deterministically under selection for the ancestral function, and they say that Harms et al. “showed that the evolution of hormone receptors is similarly constrained by functional requirements imposed on the binding pocket.” The comparison seems imprecise. The Harms paper argued that evolution of the protein does not proceed deterministically, because function-changing mutations require prior rare mutations that do not change the function; the rarity of these permissive mutations comes from the fact they must fulfill numerous requirements. The authors are comparing this case to their own, in which they say the evolution of certain aspects of the ancestral genotype is deterministic because they represent the only way to achieve the ancestral function. The language should be cleaned up here.

5) The authors use the word constraint in a way that I find confusing. (This is not surprising, because the term is used variously and inconsistently in the literature, too.) Here, the authors say that catalytic requirements impose evolutionary constraints that virtually guarantee that something like the ancestral structure will re-evolve when selection for the ancestral function is imposed. This usage seems inconsistent with the way the term constraint has usually been used. Although the term has numerous meanings, it is almost always used to describe requirements that slow the rate of evolution or limit the capacity of selection to produce an outcome: 1) “functional constraints,” which refers to limitations on the process of genetic drift that are imposed by purifying selection and thus limit the rate of evolution; 2) “developmental (and similar) constraints,” which refers to limits on the kinds of variation that mutation can produce and make available to selection; or 3) biochemical constraints or trade-offs, which make it nearly impossible for mutation and selection to produce an outcome that is “optimal” for some property looked at in isolation. The authors' point seems to be that there is one easily accessible way of solving the “problem” of esterase activity imposed by the selection pressure, so evolution is likely to always produce this outcome. This doesn't seem to me to be a constraint but rather a lack of constraint on the capacity of mutation and selection to produce exactly what the experimenters have asked it to do. I suggest finding another word.

Reviewer #3:

This manuscript describes the evolution of a new arylesterase activity in a phosphotriesterase, and then a reversal to the phenotype of the ancestral enzyme. The central question is whether the trajectory toward the new activity is reversible.

The authors report beautiful structural data showing how the active site of the enzyme can be reshaped to allow an efficient new activity, and how it can be reshaped again to restore the initial activity. The manuscript includes a vast amount of kinetic data for enzymes at various stages of the forward and reverse trajectories. The work is fascinating and of high quality. It is particularly intriguing that the final enzyme had very high phosphotriesterase and arylesterase activity.

Major comments:

1) I feel that the Introduction is somewhat overblown. We know from the structures of orthologous enzymes that there are many ways (in terms of primary sequence) to achieve the end result of an efficient enzyme when starting from a common ancestor. We also know that epistasis is important. Thus, the demonstration that the reverse evolution toward the original activity proceeds via a different trajectory does not seem surprising to me. In fact, I would be surprised if the evolutionary trajectory had been perfectly reversible.

2) I object to the use of terms such as speciation and genetic incompatibility that apply to reproductive isolation of organisms to describe enzyme variants. Enzymes do not reproduce or exchange residues. These terms should be replaced with something more appropriate for molecular evolutionary processes.

3) The authors should discuss the fact that they have explored only one possible trajectory for both the forward and reverse evolution stages, and that there are likely to be others. In fact, there may be trajectories that are fully reversible in terms of genotype. A fair statement from this one example is that reversion of a phenotype does not require reversion of a genotype.

4) In the subsection “The active site converged to its original state in the reverse evolution”, the authors say that “the rate-limiting step of phosphotriester hydrolysis was changed in AE, but restored in neoPTE” (Figure 2–figure supplement 3). This point requires an expanded discussion in the text, which should explain how the experiment was done, why the results shown in the supplementary material indicate that the rate-limiting step has changed, and what the rate-limiting step might be in each case.

5) Beginning with the discussion of Figure 5, the manuscript is very difficult to follow due to imprecise wording and obscure terminology.

A) The presentation of the data in Figure 5 is very hard to follow: “Underlined values indicate that the amino acid in question (Thr in this case) is already present in a certain background, and the effect is calculated as its reintroduction after removal (reversion of wtPTE-t45A to wtPTE).” So some effects are measured in one direction and some in the other, I think. The authors should find a better way to communicate these data. They did an enormous amount of work and the results are important.

B) I could not figure out the second paragraph of the subsection “Mutational epistasis underlies genotypic irreversibility” even after spending considerable time trying. The results need to be described in more precise language in order to avoid confusing the reader. For example, the passage: “the effect of mutations is significantly altered by the accumulation of subsequent mutations”. How can the effect of a mutation be altered by something that has not yet occurred?

C) Please clarify: “reversions that were initially deleterious for phosphotriesterase activity became favorable in the reverse evolution”. During the forward evolution, mutations occurred (not reversions), and the mutations were deleterious for phosphotriesterase activity.

D) The authors state: “all new mutations had no effect or a negative effect on phosphotriesterase activity at the onset of forward evolution”. Figure 1 shows that the first mutation that occurred decreased phosphotriesterase activity.

E) Figure 7A is also difficult to understand. The mutations are listed as, for example, I341t. To me, that suggests reversion of I341 to the ancestral Thr. But looking at Figure 1 tells me that the ancestral residue was Ile, and that the mutant enzymes (AE and neoPTE) have Thr at that position. Figure 7A also shows the effect of V144e, which is even more confusing because Figure 1 indicates that the ancestral residue was Thr and it changed to Val.

F) In the subsection “Mutational epistasis underlies genotypic irreversibility”, the sentence “The replacement of f306I by s308c…” does not make sense.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled “Reverse evolution leads to molecular speciation despite functional and active-site convergence” for further consideration at eLife. Your revised article has been favorably evaluated by Michael Marletta (Senior Editor), a Reviewing Editor, and two reviewers. The manuscript has been improved but there are some remaining issues that need to be addressed. These issues are listed below. Each item needs to be fully addressed if a re-revised manuscript is to be considered further for publication.

Reviewer #2:

1) I remain concerned about the threshold of 1.3-fold change. An increase in activity of 1.3-fold by a mutation “is considered significant.” The authors say that clones displaying a 1.3-fold improvement are generally verified as authentically displaying an increase in activity upon repeat examination. There are several issues about which I am concerned here. The first is that no data are provided to support this justification for the threshold; in fact, it is never articulated in the paper, although it is found in the letter. Thus the 1.3-fold threshold remains unjustified in the text. Second, the use of this threshold in defining examples of epistasis strikes me as problematic. Mutations that produce an observed increase in activity by >1.3x in one background but not another are judged to be epistatically modified. Thus, an increase of 1.4x in one background but 1.2x in another exhibit epistasis, because they are significantly improved in the first case but not the second. The authors do not present any way to establish whether the differences between these two genotypes are really statistically or biologically significant, so the claim of epistasis remains unjustified. The authors argue that a statistical analysis shows that most cases of 1.3x improvement are statistically significant (also not shown), but this is beside the point; the relevant issue is whether the difference between the effects of a mutation in two genetic backgrounds is statistically and biologically significant, not whether a 1.3-fold effect of one mutation is different from no effect. Many, although not all, of the authors' examples of epistasis in the paper are, in fact, rather subtle examples of phenomena like this – not sign epistasis, or very large effects vs. very small or no effects, but the difference between an effect slightly greater than 1.3 and one slightly less. I therefore believe that the authors need to strengthen this aspect of the analysis. I recommend a test of differences of effect between backgrounds.
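One way such a test of differences between backgrounds could look is an exact two-sample permutation test on log-transformed fold-effects. This is a sketch under stated assumptions, not the analysis used in the paper: it assumes that a fold-effect (mutant activity divided by parent activity) is available for each replicate in each background, and the replicate values below are invented.

```python
from itertools import combinations
from math import log2
from statistics import mean


def permutation_test(effects_a, effects_b):
    """Exact two-sided permutation test for a difference in mean log2 fold-effect.

    effects_a, effects_b: per-replicate fold-effects of the same mutation in
    two genetic backgrounds. Returns (observed difference, p-value).
    """
    log_a = [log2(x) for x in effects_a]
    log_b = [log2(x) for x in effects_b]
    pooled = log_a + log_b
    n_a, n_total = len(log_a), len(pooled)
    total = sum(pooled)
    observed = mean(log_a) - mean(log_b)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling the replicates into the two backgrounds.
    for idx in combinations(range(n_total), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / (n_total - n_a)
        if abs(diff) >= abs(observed) - 1e-12:  # small tolerance for float ties
            extreme += 1
        n_splits += 1
    return observed, extreme / n_splits


# Hypothetical fold-effects: about 1.4x in one background, about 1.2x in the other.
background_1 = [1.4, 1.5, 1.3, 1.45]
background_2 = [1.2, 1.15, 1.25, 1.1]
observed, p_value = permutation_test(background_1, background_2)
```

With four replicates per background there are only 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70. Very small replicate numbers therefore limit how strongly any such test can support a claim of epistasis.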

2) The authors make inferences about clashes on a sub-angstrom scale from structures that are only at 1.6 to 2 angstrom resolution. They say that the B-factors are low, but this is of limited relevance. This indicates that there is limited thermal motion for these atoms, but the resolution of the structure still does not allow location of the atom with precision greater than the stated resolution, which is necessary to strongly support the claims made. I believe this could be dealt with in the manuscript with more direct cautionary language and softening of the claims.

3) The idea of different protein “species” that are “genetically incompatible” with each other is still obscure, and I do not find it persuasive. Do the authors mean that if the alleles were brought together in a heterozygote that recombination within the gene would produce nonfunctional proteins and might be selected against? Are they arguing that this might contribute to speciation? If that's what the authors mean, they should say so, but it seems very far-fetched and marginally relevant to our understanding of protein evolution. I think the effects of epistasis on the evolutionary potential of a protein are a much more solid basis for the authors' interpretation.

Reviewer #3:

1) The “speciation” concept that another reviewer and I both objected to is still present in the title, Abstract and text. I think alternative wording should be used whenever possible, and the title should be changed.

2) The Introduction is still somewhat grandiose. In particular, I am bothered by the first sentence. I do not see how reversibility relates to the question of whether re-playing life's tape would lead to the same outcome. The latter question addresses whether the same things would happen again starting from the same place, while the former asks whether an individual trajectory can be reversed. Either omit or rephrase/expand to clarify the connection.

3) Is there any reason not to plot kcat/KM values in Figure 1B? They would be more relevant in terms of the changes in the two enzyme activities as the evolution proceeds.

4) The section head in the Results section states that “The active site converged to its original state in the reverse evolution”. I feel that this is misleading. The active site residues were not restored, and the shape of the active site is rather different. It would be correct to say that a functional active site had been restored, but I don't agree with the claim in this section (and elsewhere in the manuscript) that the active site converged to its original state. Further, the authors say that “the naphthyl binding pocket remains intact […] which likely explains why neoPTE is still bi-functional”. This statement is not consistent with the statement that the active site converged to its original state.

5) In the aforementioned subsection, it is claimed that: “the number of accessible mutational trajectories that lead to a wild-type level fitness peak from AE are highly limited”. I don't believe this statement is justified. There may well be many trajectories that could lead to wt-level activity that start from states that were discarded during the evolution because they were not the best variants at that particular stage.

6) A picture showing smooth and rugged fitness landscapes would be useful for readers who are not experts in molecular evolution.

7) Subsection “Emergence of incompatibility between the two seemingly identical enzymes”: the heading of this section refers to “two seemingly identical enzymes”. The enzymes are not seemingly identical, as neoPTE still has fairly robust arylesterase activity.

8) The terminology used to describe mutations in the discussion of Figures 5 and 6 is problematic. Usually the authors use the conventional notation; e.g. AxxB, which means that A at position xx was changed to B. But sometimes they use the reverse. For example, in the subsection “Mutational epistasis underlies genotypic irreversibility”, the text says: “some mutations initially increased (I172t and F271l) or were neutral (V130l) to phosphotriesterase activity…”. The mutations during the forward evolution were actually t172I, l271F and l130V. According to Figure 1 and Figure 1–figure supplement 1, t172I decreased, rather than increased, phosphotriesterase activity. This section needs to be clarified.

9) Figure 5 is still nearly incomprehensible to me. For example, in the column labeled “R254h”, the legend says that the effect is calculated in the direction R254h in all cases. So the figure suggests that changing R254 to H increases phosphotriesterase activity by 13-fold. But the wt enzyme has His at 254, not Arg. So what is really meant is that changing His254 to Arg decreases phosphotriesterase activity by 13-fold, which is consistent with Figure 1B. Having to do these mental gymnastics for each position and in each background makes my brain hurt. I do not think this way of conveying the data is salvageable.

10) Figure 6A shows the effects of V130l on phosphotriesterase activity in the “forward evolution” and “AE background” contexts. Figure 6 would be easier to understand if it specified V130l in R13b, and I172T in R5a (and so on), rather than using the term “forward evolution”, which initially gave me the impression that V130 was changed to I during the forward evolution. In fact, I am still confused by whether the bars at the left side of Figure 6A refer to the actual mutation that occurs during the forward evolution, or the reversion of the mutation at position 130. Even after several tries, I am unable to follow the discussion in subsection “Mutational epistasis underlies genotypic irreversibility”.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled “Reverse evolution leads to genotypic incompatibility despite functional and active-site convergence” for further consideration at eLife. Your revised article has been favorably evaluated by Michael Marletta (Senior Editor) and a Reviewing Editor. The manuscript has been improved but there is one remaining issue that needs to be addressed before acceptance:

An issue raised by several of the reviewers during both rounds of revision related to the terms ‘molecular speciation’ and ‘incompatibility’. While you have removed the confusing former term, the latter still appears in several places in the text. This is fine, but it needs, in some instances, a bit of clarification and adjustment of the language. For instance, the Abstract states that the reverse evolution experiment performed led “to a different sequence incompatible with the original one, despite functional identity.” As written, the statement is ambiguous and confusing in the sense that what you really mean is that certain substitutions that occurred during the reverse evolution process are incompatible with some of the residues present in the ancestral/original protein. It's not that the whole sequence is somehow incompatible with the original one, as written. Please adjust the language throughout the text where this concept arises to improve precision and meaning.

eLife. 2015 Aug 14;4:e06492. doi: 10.7554/eLife.06492.023

Author response


Reviewer #1:

Overall, I like this manuscript and I think it presents a nice story about enzyme evolution. The data are relatively clean and straightforward. My primary concern with this manuscript is the frequent claim throughout it that evolution is “irreversible”. This isn't really accurate, I don't think. The mutations that accumulated could always be reversed in the exact, reverse order in which they initially occurred. I think what the authors really mean when they say “irreversible” is that (i) an exact reversion of multiple mutations is improbable and (ii) reversion of only some mutations may not restore the original phenotype if other mutations have also occurred. In other words, let's say evolution of wtPTE into AE required substitutions Z1z and Y2y (i.e. ZY at positions 1/2 = phenotype 1 and zy at positions 1/2 = phenotype 2) but that a third mutation, say X3x occurred at position 3. If that mutation is incompatible with ZY (i.e. ZYX = phenotype 1, while ZYx = dead protein) then the authors are saying evolution is irreversible. But that's not really logical. It is formally reversible, provided one also reverses the mutation at position 3. So, bottom line, I get what the authors are driving at, but their language is imprecise and, as a consequence, misleading. This issue affects much of the manuscript and needs attention.

What we mean when we say “evolution is genotypically irreversible” is that a certain mutation cannot be reverted when a selection pressure to restore the original function is applied. First, if all mutations accumulated in the forward trajectory were either detrimental or neutral to phosphotriesterase activity, it would certainly be the case that exact reversal of the mutations from round 22 back to round 1 would lead to a theoretically possible trajectory. However, some of the forward mutations actually increased phosphotriesterase activity, meaning their reversal would not be possible. Second, as soon as new mutations occur, they negate the effect of some reversions, making even these initially possible changes impossible.

In the example given by the reviewer, mutation X3x could be such a new mutation occurring in the reverse trajectory: because it increases phosphotriesterase activity, it cannot be reverted. If it interacts with the amino acids in positions 1 and 2 and negates the effect of reversions z1Z and y2Y, these transitions will be blocked. Alternatively, X3x could also be a mutation that occurred in the forward trajectory and increased phosphotriesterase activity, in which case it could also not be reverted. Again, if it shows sign-epistasis with the other two positions, they cannot be reverted either. Therefore, we argue that irreversibility and incompatibility are strongly connected. In our original manuscript, we describe these effects (subsections “Phenotypic reversibility in the laboratory evolution of PTE”, “Genotypic irreversibility and constraints underlying phenotypic reversion” and “Mutational epistasis underlies genotypic irreversibility”). However, we agree with the reviewer that this should be emphasized and better explained, and have therefore added a summary to the Discussion (“Genotypic irreversibility was caused by several factors […] the trajectory deviates further from the original path.”).
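The three-position example discussed above can be made concrete with a small sketch that separates formal reversibility (a path of single changes exists) from accessibility under selection (no step lowers the selected activity). The activity values are entirely hypothetical and chosen only to mirror the scenario described: x raises phosphotriesterase activity, ZYx is nearly dead, and the rule that each step must not decrease activity is an assumed simplification of the selection regime.

```python
from itertools import permutations

# Hypothetical phosphotriesterase activities (arbitrary units); NOT measured data.
ACTIVITY = {
    "zyx": 1.0,    # starting point (AE-like genotype carrying x)
    "zyX": 0.5,    # reverting x alone lowers activity
    "Zyx": 0.8,    # x negates the benefit of reversion z1Z
    "zYx": 0.9,    # x negates the benefit of reversion y2Y
    "ZYx": 0.01,   # ZY combined with x: "dead protein" in the reviewer's example
    "ZyX": 2.0,
    "zYX": 2.0,
    "ZYX": 100.0,  # original genotype
}


def paths(start, target):
    """Yield every direct path from start to target, changing one site per step."""
    sites = [i for i in range(len(start)) if start[i] != target[i]]
    for order in permutations(sites):
        genotype = start
        path = [genotype]
        for i in order:
            genotype = genotype[:i] + target[i] + genotype[i + 1:]
            path.append(genotype)
        yield path


def is_accessible(path, activity):
    """A path is accessible if no step decreases the selected activity."""
    return all(activity[b] >= activity[a] for a, b in zip(path, path[1:]))


all_paths = list(paths("zyx", "ZYX"))
accessible = [p for p in all_paths if is_accessible(p, ACTIVITY)]
```

With these invented numbers all six direct paths exist, so the change is formally reversible, but none is accessible because every first step from "zyx" lowers activity.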

Rephrasing the argument above, what the authors have demonstrated here is that there is redundancy, or degeneracy, in PTE catalysis, meaning that multiple genotypes can give rise to the same phenotype. Because of this redundancy, evolution is unlikely to return to the exact same genotype following selection for a new trait followed by selection for the old trait. Additionally, they have nicely documented the epistasis that occurs during their experimental evolution, showing how mutations can have very different effects on the wtPTE, AE, and neoPTE genetic backgrounds, or what they call genetic incompatibility. I think these ideas should be articulated more clearly rather than selling the idea of irreversibility, which, as noted above, doesn't seem at all logical to me.

We appreciate this suggestion and have rewritten the Abstract and Introduction accordingly. The revised version puts more emphasis on genotypic incompatibility while the idea of irreversibility has been toned down. It also includes the term “genotypic redundancy” throughout the manuscript (Introduction, Results and Discussion).

In the subsection “Genotypic irreversibility and constraints underlying phenotypic reversion” the authors state: “Nine additional ‘new mutations’ were needed to accomplish the phenotypic reversion.” Nine additional mutations certainly arose, but was each one of them needed? This statement needs to be better justified, or I missed something in a supplemental figure/table. If it's the latter, then the data should probably be better explained/laid out in the manuscript and main figures.

We have rephrased the sentence (“Nine additional, ‘new mutations’ accumulated…”).

Reviewer #2:

[…] Issues related to interpretation and analysis:

1) One significant limitation is that the authors have examined only a single evolutionary pathway in both directions. Decisively answering questions of contingency and determinism and repeatability and the number of paths between states requires examination of many paths, not just one. The authors' detailed characterization of the trajectory they studied does allow certain general inferences on these subjects to be drawn, but they are indirect because a single instance of evolution is studied. This does not invalidate the work, but some acknowledgement and discussion of the cautions required by this limitation are important.

We agree with the reviewers (Reviewer 3 has also raised this point). First, we would like to note that we explored a second trajectory where we prevented the fixation of reversions (see last paragraph of subsection “The active site converged to its original state in the reverse evolution” and the second paragraph of the Discussion). This trajectory (1) plateaued at a phosphotriesterase activity level far below that of wtPTE and (2) consisted largely of the same new mutations that also accumulated in the main trajectory. This gives strong support for the observations made in the main trajectory that (1) some reversions are necessary for phenotypic reversibility, i.e. appear repeatedly and (2) certain new mutations also appear repeatedly, which shows that evolution is constrained by the necessity to realize a particular active-site configuration. Nevertheless, we fully agree that multiple parallel evolutionary experiments would be the method of choice to gain more general insights and strengthen these claims. Therefore, we have added the following to the Discussion:

“Another important limitation of our work is that we only examined two evolutionary trajectories (the main trajectory and the trajectory without reversions). One could imagine conducting multiple parallel evolutionary experiments to shed light on the repeatability of the trajectory taken, but our screening system is not amenable to such a throughput.”

2) Claims about effects of mutations are made without appropriate statistical analysis and rationale. For example, in Figure 4 and the claims based on these data, the authors use a 1.3-fold change in activity as a threshold of significance: changes smaller than this are considered neutral, while those that reduce activity by a factor larger than this (or by its reciprocal, to be precise), are considered deleterious. I don't see a justification for this threshold, which seems arbitrary, although it is very important for the conclusions. The authors must justify this choice rationally or change their claims. In this figure, the claim the authors want to make pertains to incompatibility; for such a claim to be made, there should be evidence to show that a state from PTE is deleterious in neoPTE, or vice versa: the evidence of a deleterious effect must be strong enough for us to rule out the possibility that the state is in fact neutral or beneficial. I see error bars, but I don't know what they represent, and I therefore don't know how confident I should be that many of these mutations are authentically deleterious (that is, that I can rule out the possibility that the true effect of the mutation is zero, but error over a small number of measurements has yielded a spurious reduction in activity.) The authors should apply an appropriate statistical procedure to keep the rate of false discoveries of incompatibility to an acceptable level, considering the large number of tests being conducted.

Similarly, in Figure 6, the authors say that R111s exhibits sign epistasis when tested in the two different backgrounds. It looks like one has a very small negative effect and the other a very small positive effect, and it is not obvious whether we can rule out zero (and therefore no difference in sign) for either one, based on the error bars. There are many other putative examples of sign epistasis in this figure for which it appears to be difficult to rule out different signs due to stochastic/measurement error around a near-zero effect. Again, we need a proper test or statistical characterization of confidence for all these claims. Similar problems affect many of the paper's figures.

We thank the reviewer for this comment and apologize for the lack of explanation provided on this point. We chose a 1.3-fold improvement as cut-off throughout the directed evolution experiment because in our experience, clones with initial improvements of >1.3-fold were consistently confirmed as improved variants, whereas smaller improvements often turned out to be false positives. Because this threshold was applied as selection threshold, we used it to evaluate all other mutants generated as well.

However, as the reviewer suggested, we have re-analyzed the lysate data to determine p-values. The data was calculated from two biological replicates (independent bacterial growth, protein expression, etc.), each containing 2-3 technical replicates (4-6 samples in total). We find that a 1.3-fold change is consistently significant (in some cases, smaller changes are also significant). Therefore, we can rule out the reviewer’s concern that changes >1.3-fold are neutral instead of authentically deleterious.

3) The authors venture hypotheses about biophysical mechanisms underlying the observed functional effects of mutations based on X-ray crystal structures and structural models in which the various ligands have been computationally placed into the active site of the crystal structures. The conclusions about the causes of functional incompatibility are mostly based on observations of clashes observed in models between proteins and their non-preferred substrates. Proteins are flexible, however, and can often accommodate new ligands in ways that models do not predict. This possibility and the uncertainty introduced by using a model of a complex should be acknowledged.

We thank the reviewer for raising this important point. The active sites of enzymes, such as PTE, are indeed flexible. However, the observation of some level of substrate turnover suggests that the substrate can bind and undergo hydrolysis at least some of the time. Because of this, analysis of the dominant structure observed in the crystal can be informative, since it gives an indication of the state most often adopted by the enzyme. Likewise, modeling the ligand into the active site, based on solved crystal structures, needs to be interpreted with some caution, as the existence of alternative binding modes cannot be excluded. Nevertheless, it strongly suggests that a number of the mutations that have accumulated increase steric hindrance for correct binding of paraoxon in AE, and in reverse for the neoPTE variant.

To acknowledge this level of uncertainty, we have added a list of caveats to the subsection “The active site converged to its original state in the reverse evolution”.

4) The authors propose structural mechanisms for some of the functional effects of mutations that they observe: some of these involve changes in structure at the sub-angstrom level, but the structures themselves are only at a resolution of 1.6 to 2.0 angstrom. How confident can we be in such fine-scale putative differences in the location of atoms given the actual resolution of the structures? These inferences should be made with caution or not made at all.

The observed changes are indeed subtle, at the sub-angstrom scale. However, the active sites of these proteins superimpose almost perfectly and the atoms of the active site have some of the lowest B-factors in the protein, suggesting that the positions of the atoms can be predicted with some certainty (we have added this explanation to the subsection “The active site converged to its original state in the reverse evolution”). We have added a supplement to Figure 2 (Figure 2–figure supplement 2) showing an overlay of the electron density maps of wtPTE, AE, and neoPTE to support this claim.

There are some conceptual issues that I feel should be thought out more clearly and expressed more precisely:

1) The authors swap states between PTE and neoPTE, observe that swapping some residues severely compromises function, and conclude that the proteins occupy “separate adaptive peaks” that are not “connected” on the adaptive landscape. The thinking and language here are imprecise. All proteins are by definition connected via some number of mutations on the adaptive landscape. The question the authors seek to address is whether they are accessible from each other via a continuously connected neutral (or functional) network of single-replacement changes (in the sense used by Maynard Smith and A. Wagner). Incompatibility of single mutation-swaps does not mean that proteins cannot be reached through such a connected network. If permissive mutations interact epistatically with the “incompatible” amino acid and if the permissive mutations can be introduced without deleterious effect on the function, then the genotypes with and without the “incompatible” amino acid can in fact be connected. Incompatibility does mean that the cluster of functional genotypes containing the ancestral state at the residue of interest and the functional network of genotypes containing the derived state have fewer connections than they might have if no incompatibility were present. That is all that can be said about connectivity without further analysis. The large number of incompatibilities the authors observe may indicate that the number of connections is reduced in a fairly dramatic way, but how dramatic the reduction is relative to the total number of possible connections a priori is also unknown without further analysis.

We agree with the reviewer that a more careful and precise description of the connectivity of wtPTE and neoPTE on the adaptive landscape is necessary. We had initially described their connectivity as follows: “It remains unknown whether the two species comprise separate peaks on the fitness landscape or are connected by mutational ridges.” To clarify, we have now added the following: “It remains unknown whether the two species comprise separate peaks on the fitness landscape or are connected through a neutral network, i.e. if the neutral exchanges would permit the subsequent occurrence of initially deleterious exchanges.”

In addition, we have rephrased all other instances where adaptive peaks were mentioned.

Another issue related to epistasis, sequence space and contingency is the sufficiency of selection to move a protein through sequence space from one protein to another. If the permissive mutations that make residues tolerable in one genetic background but not the other can be introduced without affecting the PTE function, then the authors' observations would show that selection for PTE function would not be sufficient to drive the reacquisition of the ancestral state at the residue of interest; in fact, this does seem to be true. There are probably other interesting and valid ways for the authors to describe their findings, but they should modify their language to make the conceptual model of an adaptive landscape and the conclusions they draw more precise.

As mentioned in our reply to the reviewer’s previous comment, we have modified the way we describe the connectivity of wt- and neoPTE on the adaptive landscape. Regarding the possibility of neutral mutations occurring first that epistatically interact with other, initially deleterious mutations, making them neutral and therefore tolerable, we are now describing this using the concept of a “neutral network” (subsection “Emergence of incompatibility between the two seemingly identical enzymes”).

2) According to the Abstract, the fact that selection for the ancestral function drove the acquisition of a new genotype provides evidence for irreversibility. This is not clearly thought out. If there were no epistasis and no selective barriers to reacquisition of the ancestral genotype (or any other genotype that confers the ancestral function), and so long as there were numerous genotypes that encode the ancestral function, then selection for the ancestral phenotype is likely to produce different outcomes every time. That is, the simple fact that under selection the genotype that evolved is different from the ancestral genotype does not establish epistasis, incompatibility, or irreversibility. It is the authors' other experiments that show this.

We thank the reviewer for pointing this out and have rewritten the Abstract accordingly.

3) The authors say that selection for aryl esterase activity was accompanied by a decrease in PTE activity, which indicates an “intrinsic trade-off.” This conclusion is not correct. If most random mutations reduce PTE activity and there is no necessary or intrinsic mechanistic trade-off between PTE and AE activity, then mutations that increase AE activity are more likely than not to reduce PTE activity.

We point out in the discussion of the crystal structures (subsection “The active site converged to its original state in the reverse evolution”):

“The structural comparison indicates that AE adapted to the planar substrate 2NH in the forward evolution, but that this came at a cost of phosphotriesterase activity, as the bulky paraoxon is no longer efficiently recognized (Figure 2).”

We then give a detailed explanation of the factors leading to this intrinsic, mechanistic trade-off (in subsections “The active site converged to its original state in the reverse evolution” and “Mutational epistasis underlies genotypic irreversibility”, Figure 2, and Figure 2–figure supplement 1): Steric hindrance, loss of shape complementarity, changes in hydrophobicity, and π-π-stacking. Therefore, it seems that the reduction of phosphotriesterase activity is a side product of the increase in arylesterase activity.

However, we agree with the referee that this does not prove that there are “intrinsic” trade-offs between the two activities, and there is the possibility that the enzyme active site can adapt to both activities simultaneously. Therefore we removed this wording from the section in question (“Phenotypic reversibility in the laboratory evolution of PTE”).

4) The authors use a metaphor of speciation for the evolution of incompatible enzyme genotypes. In the Discussion, however, it appears that they use the term literally, as if the evolution of this enzyme could cause reproductive isolation between populations of organisms. There is no evidence to support this causal leap, and the authors should avoid this type of discussion of the issue. If they are making an analogy between reproductive isolation of populations due to epistasis/Dobzhansky-Muller effect and the evolution of incompatible amino acids in their PTE enzymes that is fine, but the analogy should be labeled as such – as a metaphor.

Reviewer 3 has also raised this point. We agree with both reviewers that we went overboard in our usage of terminology from speciation and have cleaned up our language accordingly.

The authors say that evolution of the genotype is “constrained” to follow certain paths deterministically under selection for the ancestral function, and they say that Harms et al. “showed that the evolution of hormone receptors is similarly constrained by functional requirements imposed on the binding pocket.” The comparison seems imprecise. The Harms paper argued that evolution of the protein does not proceed deterministically, because function-changing mutations require prior rare mutations that do not change the function; the rarity of these permissive mutations comes from the fact they must fulfill numerous requirements. The authors are comparing this case to their own, in which they say the evolution of certain aspects of the ancestral genotype is deterministic because they represent the only way to achieve the ancestral function. The language should be cleaned up here.

The Harms paper showed that biophysical requirements dictate the limited availability of permissive mutations. In their case, evolution becomes contingent under an adaptive selection pressure because the permissive mutations are neutral by themselves, but without fixation of these permissive mutations, functional adaptation cannot occur. In the Discussion, we argue that the similarity between our and Harms’ work is the strong genetic constraint on accessible mutations dictated by biophysical requirements. We did not argue that biophysical requirements result in either “contingency” or “determinism” under an adaptive selection pressure. To clarify this, we have rephrased the sentence as follows (Discussion, second paragraph):

“Recent work by Harms et al. showed that the accessibility of functional and permissive mutations on hormone receptors is also strongly constrained by biophysical requirements imposed on the binding pocket as well as by protein dynamics (Harms and Thornton).”

5) The authors use the word constraint in a way that I find confusing. (This is not surprising, because the term is used variously and inconsistently in the literature, too.) Here, the authors say that catalytic requirements impose evolutionary constraints that virtually guarantee that something like the ancestral structure will re-evolve when selection for the ancestral function is imposed. This usage seems inconsistent with the way the term constraint has usually been used. Although the term has numerous meanings, it is almost always used to describe requirements that slow the rate of evolution or limit the capacity of selection to produce an outcome: 1) “functional constraints,” which refers to limitations on the process of genetic drift that are imposed by purifying selection and thus limit the rate of evolution; 2) “developmental (and similar) constraints,” which refers to limits on the kinds of variation that mutation can produce and make available to selection; or 3) biochemical constraints or trade-offs, which make it nearly impossible for mutation and selection to produce an outcome that is “optimal” for some property looked at in isolation. The authors' point seems to be that there is one easily accessible way of solving the “problem” of esterase activity imposed by the selection pressure, so evolution is likely to always produce this outcome. This doesn't seem to me to be a constraint but rather a lack of constraint on the capacity of mutation and selection to produce exactly what the experimenters have asked it to do. I suggest finding another word.

We agree that our use of the term “constraint” had multiple definitions and was therefore confusing. In the revised manuscript, we have restricted it to “genotypic constraint”, or the “restrictions on accessible functional mutations”. This definition is different from what the reviewer describes as “evolutionary constraint”, but we acknowledge that many previous papers use “genetic constraint” similarly to our definition (e.g. (Weinreich et al., 2005), (Bridgham et al., 2009), (Taute et al., 2014), and (Podgornaia and Laub, 2015)). We have changed the term accordingly. The term “biophysical constraints” has been changed to “biophysical requirements” throughout the manuscript. Moreover, the sentence “To minimize constraints on PTE evolution caused by limited protein stability, we used GroEL/ES overexpression to buffer the destabilizing effect of mutations as previously reported […]” has been changed to: “To buffer the destabilizing effects of functional mutations and minimize reductions in soluble protein expression levels, we used GroEL/ES overexpression as previously described […]”.

Reviewer #3:

1) I feel that the Introduction is somewhat overblown. We know from the structures of orthologous enzymes that there are many ways (in terms of primary sequence) to achieve the end result of an efficient enzyme when starting from a common ancestor. We also know that epistasis is important. Thus, the demonstration that the reverse evolution toward the original activity proceeds via a different trajectory does not seem surprising to me. In fact, I would be surprised if the evolutionary trajectory had been perfectly reversible.

We have rewritten the Introduction (in particular the first paragraph) and toned down the description of genotypic irreversibility and contingency vs. determinism.

2) I object to the use of terms such as speciation and genetic incompatibility that apply to reproductive isolation of organisms to describe enzyme variants. Enzymes do not reproduce or exchange residues. These terms should be replaced with something more appropriate for molecular evolutionary processes.

Reviewer 2 has also raised this point. We agree with both reviewers that we went overboard in our usage of terminology from speciation and have cleaned up our language accordingly (please see our joint reply to Reviewer 2).

However, the term “incompatibility” has been expanded to include the protein level by the community (e.g. “protein sequence incompatibility” (Wellner et al., 2013), “incompatible mutations” (Lunzer et al., 2010), “Dobzhansky-Mueller incompatibility” (Kondrashov, Sunyaev and Kondrashov, 2002)). Therefore, we believe that “genotypic incompatibility”, as we call it, adequately describes the relationship between the different PTE sequences.

3) The authors should discuss the fact that they have explored only one possible trajectory for both the forward and reverse evolution stages, and that there are likely to be others. In fact, there may be trajectories that are fully reversible in terms of genotype. A fair statement from this one example is that reversion of a phenotype does not require reversion of a genotype.

Reviewer 2 has also raised this point. Please see our joint reply.

4) In the subsection “The active site converged to its original state in the reverse evolution”, the authors say that “the rate-limiting step of phosphotriester hydrolysis was changed in AE, but restored in neoPTE (Figure 2–figure supplement 3)”. This point requires an expanded discussion in the text, which should explain how the experiment was done, why the results shown in the supplementary material indicate that the rate-limiting step has changed, and what the rate-limiting step might be in each case.

We have added more explanation to describe this experiment (“Furthermore, we measured linear free energy relationships for wtPTE, AE, and neoPTE […]. In neoPTE, the pattern characteristic for wtPTE was restored.”).

However, while we can state that the rate-limiting step changed in AE as described above, we cannot conclude from our data what the rate-limiting step may be. The same is true for the other cases. Therefore, we prefer to limit the interpretation of this experiment to comparing the overall pattern of the linear free energy relationships of the different variants.

5) Beginning with the discussion of Figure 5, the manuscript is very difficult to follow due to imprecise wording and obscure terminology.

A) The presentation of the data in Figure 5 is very hard to follow: “Underlined values indicate that the amino acid in question (Thr in this case) is already present in a certain background, and the effect is calculated as its reintroduction after removal (reversion of wtPTE-t45A to wtPTE).” So some effects are measured in one direction and some in the other, I think. The authors should find a better way to communicate these data. They did an enormous amount of work and the results are important.

We have removed this sentence and replaced it by a stepwise explanation of one example mutation, hoping this will make the data more clear. Because Reviewer 2 also raised this point, please see our reply above.

B) I could not figure out the second paragraph of the subsection “Mutational epistasis underlies genotypic irreversibility” even after spending considerable time trying. The results need to be described in more precise language in order to avoid confusing the reader. For example, the passage: “the effect of mutations is significantly altered by the accumulation of subsequent mutations”. How can the effect of a mutation be altered by something that has not yet occurred?

We apologize for this confusion. We have attempted to clean this up and rephrase difficult sentences to address points B) to F) of the reviewer’s comments. This now reads: “In the forward evolution, the effect of mutations is significantly altered after their fixation due to epistasis caused by mutations subsequently accumulated in the trajectory.”

C) Please clarify: “reversions that were initially deleterious for phosphotriesterase activity became favorable in the reverse evolution”. During the forward evolution, mutations occurred (not reversions), and the mutations were deleterious for phosphotriesterase activity.

This has been clarified in the subsection “Mutational epistasis underlies genotypic irreversibility” (“For example, some mutations initially increased […] in the background of AE (Figure 6A)”).

D) The authors state: “all new mutations had no effect or a negative effect on phosphotriesterase activity at the onset of forward evolution”. Figure 1 shows that the first mutation that occurred decreased phosphotriesterase activity.

We have rephrased this sentence.

E) Figure 7A is also difficult to understand. The mutations are listed as, for example, I341t. To me, that suggests reversion of I341 to the ancestral Thr. But looking at Figure 1 tells me that the ancestral residue was Ile, and that the mutant enzymes (AE and neoPTE) have Thr at that position. Figure 7A also shows the effect of V144e, which is even more confusing because Figure 1 indicates that the ancestral residue was Thr and it changed to Val.

We have corrected these mistakes (T341i in Figure 7A and e144V in Figure 1C).

F) In the third paragraph of subsection “Mutational epistasis underlies genotypic irreversibility”, the sentence “The replacement of f306I by s308C…” does not make sense.

We have changed this to say: “The redundancy of the mutations f306I and s308C was also evidenced by combinatorial mutational analysis…”.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #2:

1) I remain concerned about the threshold of 1.3-fold change. An increase in activity of 1.3-fold by a mutation “is considered significant.” The authors say that clones displaying a 1.3-fold improvement are generally verified as authentically displaying an increase in activity upon repeat examination. There are several issues about which I am concerned here. The first is that no data are provided to support this justification for the threshold; in fact, it is never articulated in the paper, although it is found in the letter. Thus the 1.3-fold threshold remains unjustified in the text. Second, the use of this threshold in defining examples of epistasis strikes me as problematic. Mutations that produce an observed increase in activity by >1.3x in one background but not another are judged to be epistatically modified. Thus, an increase of 1.4x in one background but 1.2x in another exhibit epistasis, because they are significantly improved in the first case but not the second. The authors do not present any way to establish whether the differences between these two genotypes are really statistically or biologically significant, so the claim of epistasis remains unjustified. The authors argue that a statistical analysis shows that most cases of 1.3x improvement are statistically significant (also not shown), but this is beside the point; the relevant issue is whether the difference between the effects of a mutation in two genetic backgrounds is statistically and biologically significant, not whether a 1.3-fold effect of one mutation is different from no effect. Many, although not all, of the authors' examples of epistasis in the paper are, in fact, rather subtle examples of phenomena like this – not sign epistasis, or very large effects vs. very small or no effects, but the difference between an effect slightly greater than 1.3 and one slightly less. I therefore believe that the authors need to strengthen this aspect of the analysis. I recommend a test of differences of effect between backgrounds.

We now include all statistics for each figure in Supplementary file 2 (Tables B-G) and are giving p-values for the effect of all mutations compared to the respective background in which they occur. Where applicable, we have also calculated p-values for the difference in effect of a certain mutation in two genetic backgrounds to support our claim of epistasis, as suggested by the reviewer. This information can be found in four new source data files (Figure 4–source data 1, Figure 6–source data 1 and 2, Figure 7–source data 1). Our analysis shows that out of 144 mutations, only six show a >1.3-fold effect, but are statistically not significant compared to the parent. When comparing the effect of a mutation in two different backgrounds, only two mutations are not statistically significant. These eight mutations are mentioned both in the relevant table and in the relevant figure legend (please see the new figure supplements as well as changes to the figure legends). We have remade all relevant figures according to the new analysis and would like to note the following changes: In Figure 4 (comparison of mutations in the wt- and neoPTE backgrounds), t341I has moved from panel B (neutral in wtPTE, deleterious in neoPTE) to panel A (neutral in both backgrounds) and a204G has moved from panel B to D (deleterious in both backgrounds).
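The comparison of one mutation's effect in two genetic backgrounds can be sketched as a difference of log fold changes with a bootstrap confidence interval. This is not the authors' procedure (their method is documented only in the cited source data files); the bootstrap, the function names, and the activity values are assumptions for illustration.

```python
import math
import random
import statistics


def log_effect(mutant, parent):
    """Log fold change of a mutation relative to its parent background."""
    return math.log(statistics.mean(mutant) / statistics.mean(parent))


def resample(values, rng):
    """Draw a bootstrap sample of the same size, with replacement."""
    return [rng.choice(values) for _ in values]


def epistasis_ci(mut_a, par_a, mut_b, par_b, n_boot=2000, seed=0):
    """Approximate 95% percentile interval for the difference in log effect
    of the same mutation in background A versus background B."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        effect_a = log_effect(resample(mut_a, rng), resample(par_a, rng))
        effect_b = log_effect(resample(mut_b, rng), resample(par_b, rng))
        diffs.append(effect_a - effect_b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


# Hypothetical activities: the mutation is beneficial in background A
# and deleterious in background B (not data from the paper).
mut_a, par_a = [2.0, 2.2, 1.8, 2.0], [1.0, 1.1, 0.9, 1.0]
mut_b, par_b = [0.5, 0.45, 0.55, 0.5], [1.0, 1.0, 1.1, 0.9]
print(epistasis_ci(mut_a, par_a, mut_b, par_b))
```

An interval that excludes zero supports a background-dependent effect. One that spans zero corresponds to the cases the reviewer worried about, where effects of 1.4-fold and 1.2-fold fall on opposite sides of the cut-off without differing demonstrably from each other.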

In Figure 6B, we have removed the R111s mutation as an example for sign epistasis, because its effect is statistically not significant in one of the two backgrounds considered. Furthermore, we now explain the 1.3-fold cut-off in the main text.

2) The authors make inferences about clashes on a sub-angstrom scale from structures that are only at 1.6 to 2 angstrom resolution. They say that the B-factors are low, but this is of limited relevance. This indicates that there is limited thermal motion for these atoms, but the resolution of the structure still does not allow location of the atom with precision greater than the stated resolution, which is necessary to strongly support the claims made. I believe this could be dealt with in the manuscript with more direct cautionary language and softening of the claims.

Respectfully, we disagree with this comment. We believe that the reviewer has mistaken precision for resolution. Resolution refers to the ability to fully distinguish between atoms, i.e. the density from two metal ions separated by 3.3 Å will be fully separated at resolutions lower than this, e.g. 1.9 Å. But this does not mean that the position of each metal ion is equally likely to be anywhere within 1.9 Å of its modeled position. The precision of the atomic locations is best measured by the dispersion precision indicator (DPI; described in detail in (Cruickshank, 1999)). The B-factors are actually of direct relevance: in macromolecular crystallography, thermal motion, crystallographic disorder, and even model building errors all contribute to the B-factor. Moreover, the DPI directly relies on the B-factor and Rfree in its calculation. The greater the accuracy of the model (Rfree) and the lower the atomic disorder (B-factor), the more precise the atomic positions. In this case, the overall DPI values for the structures of R0 (0.09), R22 (0.07), and Rev12 (0.02) are all well within the distance change that we observe in these structures. We have included the comment:

“However, the dispersion precision indicator (DPI; Cruickshank, 1999) for each of these structures is less than a tenth of an angstrom, meaning that the observed distance changes (including the 0.5 Å shift in the metal position) are significant”.

3) The idea of different protein “species” that are “genetically incompatible” with each other is still obscure, and I do not find it persuasive. Do the authors mean that if the alleles were brought together in a heterozygote that recombination within the gene would produce nonfunctional proteins and might be selected against? Are they arguing that this might contribute to speciation? If that's what the authors mean, they should say so, but it seems very far-fetched and marginally relevant to our understanding of protein evolution. I think the effects of epistasis on the evolutionary potential of a protein are a much more solid basis for the authors' interpretation.

To address the concerns of both Reviewers 2 and 3, we have removed the idea of “molecular speciation” in all instances and only talk about “genotypic incompatibility” between the wt- and neoPTE sequences.

It is our impression that the reviewers object only to the use of the word “speciation”, but not “incompatibility”, a term which has been applied to the level of protein sequences as discussed in our previous reply (“…the term ‘incompatibility’ has been expanded to include the protein level by the community (e.g. ‘protein sequence incompatibility’ (Wellner et al., 2013), ‘incompatible mutations’ (Lunzer et al., 2010), ‘Dobzhansky-Mueller incompatibility’ (Kondrashov et al., 2002)). Therefore, we believe that ‘genotypic incompatibility’, as we call it, adequately describes the relationship between the different PTE sequences.”). We have also taken up the suggestion from Reviewer 2’s previous comments to refer to the “Dobzhansky-Muller effect” as a metaphor in one instance (Introduction, see below). We have made substantial changes throughout the text and changed the title to: “Reverse evolution leads to genotypic incompatibility despite functional and active-site convergence”.

Reviewer #3:

1) The “speciation” concept that another reviewer and I both objected to is still present in the title, Abstract and text. I think alternative wording should be used whenever possible, and the title should be changed.

Please see our reply to Reviewer 2.

2) The Introduction is still somewhat grandiose. In particular, I am bothered by the first sentence. I do not see how reversibility relates to the question of whether re-playing life's tape would lead to the same outcome. The latter question addresses whether the same things would happen again starting from the same place, while the former asks whether an individual trajectory can be reversed. Either omit or rephrase/expand to clarify the connection.

We have rephrased the first sentence of the Introduction.

3) Is there any reason not to plot kcat/KM values in Figure 1B? They would be more relevant in terms of the changes in the two enzyme activities as the evolution proceeds.

We consistently use lysate activities throughout the paper because they were the basis for the selection of improved variants in our screens and therefore more adequately reflect “fitness” in our model evolutionary system, as mentioned in the text (subsection “Phenotypic reversibility in the laboratory evolution of PTE”).

In Figure 1–figure supplement 1, we show that the development of activities measured with purified enzyme at a constant enzyme and substrate concentration correlates well with the lysate data. In Supplementary file 2, we give the kcat/KM values for all variants shown in Figure 1B, which are also in agreement with the development of lysate activities. If the reviewer feels that these values need to occupy a more prominent position in the paper, we are happy to add another figure supplement to Figure 1 equivalent to panel 1B but showing kcat/KM values.

4) The section head in the Results section states that “The active site converged to its original state in the reverse evolution”. I feel that this is misleading. The active site residues were not restored, and the shape of the active site is rather different. It would be correct to say that a functional active site had been restored, but I don't agree with the claim in this section (and elsewhere in the manuscript) that the active site converged to its original state. Further, the authors say that “the naphthyl binding pocket remains intact […] which likely explains why neoPTE is still bi-functional”. This statement is not consistent with the statement that the active site converged to its original state.

We agree with the reviewer that our wording was imprecise. We have modified the text to show that the active site converges only “towards”, not “to” its original state and that this convergence is only in the key elements required for phosphotriesterase activity, not arylesterase activity.

5) In the aforementioned subsection, it is claimed that: “the number of accessible mutational trajectories that lead to a wild-type level fitness peak from AE are highly limited”. I don't believe this statement is justified. There may well be many trajectories that could lead to wt-level activity that start from states that were discarded during the evolution because they were not the best variants at that particular stage.

We thank the reviewer for pointing this out and have modified the section in question (“This failure to reach wild-type activity levels […] through less improved intermediates, may exist”).

6) A picture showing smooth and rugged fitness landscapes would be useful for readers who are not experts in molecular evolution.

Ruggedness of the fitness landscape is certainly one of our conclusions but not entirely central to our theme. Moreover, it is hard to provide a schematic view of what we observe in our experiment (reversibility and incompatibility), and we do not believe that an oversimplified scheme of a smooth vs. a rugged fitness landscape would fit our story well, but instead may be confusing and misleading. We cited a number of reviews on the topic that non-expert readers can follow (Introduction).

7) Subsection “Emergence of incompatibility between the two seemingly identical enzymes”: the heading of this section refers to “two seemingly identical enzymes”. The enzymes are not seemingly identical, as neoPTE still has fairly robust arylesterase activity.

We have changed this to say: “Emergence of incompatibility between the two PTEs”. We have also rephrased the following sentence to say: “Next, we set out to answer the question how the two enzymes exhibiting identical phosphotriesterase activity, wt- and neoPTE, are connected on the adaptive landscape.”

8) The terminology used to describe mutations in the discussion of Figures 5 and 6 is problematic. Usually the authors use the conventional notation; e.g. AxxB, which means that A at position xx was changed to B. But sometimes they use the reverse. For example, in the subsection “Mutational epistasis underlies genotypic irreversibility”, the text says: “some mutations initially increased (I172t and F271l) or were neutral (V130l) to phosphotriesterase activity…”. The mutations during the forward evolution were actually t172I, l271F and l130V. According to Figure 1 and Figure 1–figure supplement 1, t172I decreased, rather than increased, phosphotriesterase activity. This section needs to be clarified.

We sincerely apologize that the text was still so difficult to digest and we agree that the mutations mentioned (t172I, l271F and l130V) should be written in the direction pointed out by the reviewer. To clarify our arguments, we have gone over the whole section in question (“Mutational epistasis underlies genotypic irreversibility”) and made significant changes.

9) Figure 5 is still nearly incomprehensible to me. For example, in the column labeled “R254h”, the legend says that the effect is calculated in the direction R254h in all cases. So the figure suggests that changing R254 to H increases phosphotriesterase activity by 13-fold. But the wt enzyme has His at 254, not Arg. So what is really meant is that changing His254 to Arg decreases phosphotriesterase activity by 13-fold, which is consistent with Figure 1B. Having to do these mental gymnastics for each position and in each background makes my brain hurt. I do not think this way of conveying the data is salvageable.

We are sorry for the confusion. Figure 5 shows, in a comprehensive manner, how the mutational effects changed in the five different backgrounds. We believe that this is very important because all other figures show only certain parts of the complete data set. However, due to the nature of the dataset, i.e., each background has a different amino acid, the exact numbers in Figure 5 are certainly not easy to digest, although the color coding already gives a good impression of the prevalence of epistasis. This is why it is important in this figure to give all mutations in the same direction, even though this means sometimes adding a mutation and sometimes “taking it out” and calculating the effect of “putting it back in”. To better explain the numbers and directions of the effects, we rewrote the figure legend and sincerely hope that this will now make it easier to understand how the data were processed.
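
The change of direction discussed in this exchange can be illustrated with a minimal sketch. It is not the authors' processing script. It assumes that a mutation label has the form letter, position, letter (as in “R254h”) and that reading a measured effect in the opposite direction simply inverts the fold-change. The function name `reverse_mutation` is hypothetical.

```python
import re


def reverse_mutation(label, fold_change):
    """Express a mutational effect in the opposite direction.

    Assumption (not from the paper): the label is <letter><position><letter>
    and reversing the direction inverts the fold-change.
    """
    match = re.fullmatch(r"([A-Za-z])(\d+)([A-Za-z])", label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    start, position, end = match.groups()
    # Swap the two residues (keeping their case) and invert the effect.
    return f"{end}{position}{start}", 1.0 / fold_change


# The reviewer's example: a 13-fold increase for R254h read the other way.
label, effect = reverse_mutation("R254h", 13.0)
print(label, round(effect, 3))
```

Under these assumptions, a 13-fold increase reported for “R254h” corresponds to a 13-fold decrease for “h254R”, which is the mental conversion the reviewer describes. The two readings refer to backgrounds that differ only at the position in question.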

10) Figure 6A shows the effects of V130I on phosphotriesterase activity in the “forward evolution” and “AE background” contexts. Figure 6 would be easier to understand if it specified V130I in R13b, and I172T in R5a (and so on), rather than using the term “forward evolution”, which initially gave me the impression that V130 was changed to I during the forward evolution. In fact, I am still confused by whether the bars at the left side of Figure 6A refer to the actual mutation that occurs during the forward evolution, or the reversion of the mutation at position 130. Even after several tries, I am unable to follow the discussion in subsection “Mutational epistasis underlies genotypic irreversibility”.

We apologize again for the confusion and have added an additional explanation to the legend of Figure 6.

We have also attempted to clarify the discussion of Figures 5 and 6 as described in our reply to comment no. 8. We have also taken up the reviewer’s suggestion to add the rounds of occurrence to the “forward evolution” variants in Figure 6 and in the text.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

An issue raised by several of the reviewers during both rounds of revision related to the terms ‘molecular speciation’ and ‘incompatibility’. While you have removed the confusing former term, the latter still appears in several places in the text. This is fine, but it needs, in some instances, a bit of clarification and adjustment of the language. For instance, the Abstract states that the reverse evolution experiment performed led “to a different sequence incompatible with the original one, despite functional identity.” As written, the statement is ambiguous and confusing in the sense that what you really mean is that certain substitutions that occurred during the reverse evolution process are incompatible with some of the residues present in the ancestral/original protein. It's not that the whole sequence is somehow incompatible with the original one, as written. Please adjust the language throughout the text where this concept arises to improve precision and meaning.

We agree with your concern about our description of “incompatibility”. We have revised the text to clarify the meaning of the term, in particular in the Abstract, Results (subsection “Emergence of incompatibility between the two PTEs”), and Discussion.

Associated Data

    Supplementary Materials

    Figure 4—source data 1. Comparison of the effect of mutations in wt- and neoPTE.

    Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. Non-significant values are underlined. These belong to the ‘PANEL A’ series, which consists of mutations that are neutral in both cases. The only exception is i313F from PANEL C, which is significantly different from wtPTE, but not significantly different between wt- and neoPTE.

    DOI: http://dx.doi.org/10.7554/eLife.06492.012

    elife06492s001.xlsx (45.8KB, xlsx)
    DOI: 10.7554/eLife.06492.012
    Figure 6—source data 1. Comparison of the effect of mutations in the forward evolution and in AE (panels A, B).

    Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. [b] Phosphotriesterase activity was too low to be determined in AE + R254h, but at least 10-fold reduced.

    DOI: http://dx.doi.org/10.7554/eLife.06492.015

    elife06492s002.xlsx (38.1KB, xlsx)
    DOI: 10.7554/eLife.06492.015
    Figure 6—source data 2. Comparison of the effect of mutations in wtPTE and AE (panel C).

    Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. Non-significant values are underlined. Note that the effect of m293K, which causes a significant increase in AE, is statistically not significant between wtPTE and AE.

    DOI: http://dx.doi.org/10.7554/eLife.06492.016

    elife06492s003.xlsx (36.4KB, xlsx)
    DOI: 10.7554/eLife.06492.016
    Figure 7—source data 1. Comparison of the effect of mutations in wtPTE and AE (panel A).

    Fold-changes between the two backgrounds as well as p-values calculated according to the t-test are given. [a] Only mutations with an average >1.3-fold difference between backgrounds and a p-value <0.05 are considered significant. Non-significant values are underlined.

    DOI: http://dx.doi.org/10.7554/eLife.06492.018

    elife06492s004.xlsx (37.2KB, xlsx)
    DOI: 10.7554/eLife.06492.018
    Supplementary file 1.

    Description of the directed evolution rounds.

    DOI: http://dx.doi.org/10.7554/eLife.06492.019

    elife06492s005.docx (1.1MB, docx)
    DOI: 10.7554/eLife.06492.019
    Supplementary file 2.

    Kinetic parameters of all variants.

    DOI: http://dx.doi.org/10.7554/eLife.06492.020

    elife06492s006.docx (1.2MB, docx)
    DOI: 10.7554/eLife.06492.020
    Supplementary file 3.

    Crystallographic information.

    DOI: http://dx.doi.org/10.7554/eLife.06492.021

    elife06492s007.docx (143.6KB, docx)
    DOI: 10.7554/eLife.06492.021
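
The significance criterion stated in the source data captions above (an average >1.3-fold difference between backgrounds and a p-value <0.05) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis. It assumes that the fold difference is the ratio of a mutation's effect in the two backgrounds, taken in whichever direction is larger, and that the p-value comes from a t-test computed elsewhere. The function name `is_significant` and its parameters are hypothetical.

```python
def is_significant(effect_a, effect_b, p_value, min_fold=1.3, alpha=0.05):
    """Apply both thresholds from the captions to one mutation.

    effect_a, effect_b: fold-changes caused by the same mutation in two
    backgrounds (assumed positive). p_value: taken as given.
    """
    if effect_a <= 0 or effect_b <= 0:
        raise ValueError("effects must be positive fold-changes")
    ratio = effect_a / effect_b
    # Assumption: the difference is direction-independent.
    fold_difference = max(ratio, 1.0 / ratio)
    return fold_difference > min_fold and p_value < alpha


print(is_significant(2.0, 1.0, 0.01))  # both thresholds met
print(is_significant(1.2, 1.0, 0.01))  # fold difference too small
print(is_significant(2.0, 1.0, 0.20))  # p-value too large
```

Both conditions must hold at once, so a large fold difference with a weak p-value, or a strong p-value with a small fold difference, is treated as non-significant.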

    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
