Abstract
Mutation of the tobacco etch virus (TEV) protease nucleophile from cysteine to serine causes an approximately 10⁴-fold loss in activity. Ten rounds of directed evolution of the mutant, TEVSer, overcame the detrimental effects of nucleophile exchange to recover near-wild-type activity in the mutant TEVSerX. Rather than respecialising TEV to the new nucleophile, all the enzymes along the evolutionary trajectory also retained the ability to use the original cysteine nucleophile. Therefore, the adaptive evolution of TEVSer is paralleled by a neutral trajectory for TEVCys, in which mutations that increase serine nucleophile reactivity hardly affect the reactivity of cysteine. This apparent nucleophile permissiveness explains how nucleophile switches can occur in the phylogeny of the chymotrypsin-like protease PA superfamily. Despite the changed key component of their chemical mechanisms, the evolved variants TEVSerX and TEVCysX have similar activities; this could potentially facilitate escape from adaptive conflict to enable active-site evolution.
Keywords: directed evolution, nucleophilic catalysis, PA clan, proteases, tobacco etch virus
Enzymes achieve efficient catalysis through the precise orientation of a key set of active-site residues. This arrangement is dependent on chemical constraints, to the extent that some active-site geometries have convergently evolved many times.1 Consequently, active-site residues are the most evolutionarily conserved within enzyme families. Phylogenetic analysis of extended protein superfamilies suggests that even residues that are crucial for activity are exchanged during evolution over sufficiently long timescales.2 The evidence for such exchanges raises the question of how a gene coding for an inefficient enzyme can persevere in the transition from one type of active site to another. There is no evolutionary advantage for maintaining a gene coding for a catalytically impaired or inactive protein, thus creating the scenario of “adaptive conflict”.3 We know from studies on enzymes4 and enzyme models5 that precise positioning is easily disturbed. Minute disturbances down to the picometer scale cause substantial rate reductions.6 Given the delicacy of catalytic arrangements, it is unknown how the evolution of active sites avoids unfit variants.
Serine and cysteine proteases are textbook examples of enzymes employing covalent, nucleophilic catalysis (Figure 1 A)7 that leads to substantial rate acceleration of a difficult reaction (with a half-life of ≈500 years).8 The sophisticated interplay of the multiple active-site residues involved, including, for example, the charge relay system of the catalytic triad,7c suggests that any deviation from such a highly efficient arrangement is likely to be penalised.9 However, phylogenetic analysis of proteases suggests that nucleophile exchanges do occur during evolution. The PA clan of chymotrypsin-like proteases10 encompasses both serine and cysteine proteases evolved from a hypothetical common ancestor.11 Constructing a phylogeny of this clan of proteases (Figure 1 B) shows that, within a highly conserved structure, nucleophile switches must have occurred by divergent evolution at least once: cellular proteases use a serine nucleophile, but both cysteine and serine protease families are found in viruses.12
When the active-site nucleophiles of serine or cysteine proteases are interconverted, the single atomic change typically leads to a >10⁴-fold reduction in $k_{\mathrm{cat}}/K_{\mathrm{M}}$.9a, 14 Although both thiol and hydroxy groups can act as nucleophilic catalysts, their positioning is likely to be suboptimal after mutation due to structural differences between oxygen and sulfur: oxygen’s smaller atomic radius (by ∼0.4 Å)14a and the formation of shorter bonds (decreasing $d_{\mathrm{C-X}}$ and $d_{\mathrm{X-H}}$ by ∼1.3-fold each) would be expected to disturb the precise nucleophile positioning. In addition, there are reactivity differences: sulfur is softer, and its different pKa (4–5 units lower for RSH compared to ROH) provides a larger fraction of the active form of the nucleophile at physiological pH; this explains why the reactivity of the hydroxy side chain of serine that replaces cysteine would be compromised.14d
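The pKa argument above can be made concrete with the Henderson–Hasselbalch relation. The following is a minimal sketch, not a calculation from the paper: the two pKa values and the pH are assumed, illustrative numbers chosen only so that they differ by 4.5 units (within the 4–5 unit range quoted), and the function name is hypothetical. In a real active site the catalytic triad perturbs these values.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an acid HA present as A- at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumed, illustrative side-chain pKa values (not taken from the text);
# they differ by 4.5 units, inside the 4-5 unit range quoted above.
PKA_THIOL = 8.5     # RSH (assumption)
PKA_HYDROXY = 13.0  # ROH (assumption)
PH = 7.4            # assumed physiological pH

thiolate = fraction_deprotonated(PKA_THIOL, PH)
alkoxide = fraction_deprotonated(PKA_HYDROXY, PH)
ratio = thiolate / alkoxide

print(f"thiolate fraction: {thiolate:.3g}")
print(f"alkoxide fraction: {alkoxide:.3g}")
print(f"thiolate/alkoxide: {ratio:.3g}")
```

Under these assumptions the deprotonated (active) fraction of the thiol exceeds that of the hydroxy group by roughly four orders of magnitude, which is the sense in which a serine replacing cysteine starts at a reactivity disadvantage.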
These considerations of chemical reactivity and structure raise the question of how such nucleophile transitions have occurred in proteases, despite the enzyme inactivation typically associated with mutating a key active-site residue. Handicap-recover experiments can be used to find if any mutations can epistatically offset a known deleterious mutation or the replacement of a native cofactor.15 Here we use this approach to demonstrate a scenario that could satisfy the fitness requirements of protein evolution by mutating the crucial nucleophile of tobacco etch virus cysteine protease (TEVCys) to serine (TEVSer), recovering activity by directed evolution (DE) and measuring trade-offs in response to nucleophile switches.
The nucleophile mutation compromised the centrepiece of the catalytic mechanism and consequently had a much greater effect on activity than in previous handicap-recover experiments.15 A TEVSer mutant (C151S mutation) was constructed, and the substitution of the cysteine thiol nucleophile by a serine alcohol was found to reduce activity by four orders of magnitude (Figure 2 and Figure S4 in the Supporting Information). The effect of this deliberately introduced handicap was then quantified by measuring the reaction kinetics (by monitoring the hydrolysis of the C-Y FRET-pair substrate,16 Figure S2). Conversion of the catalytic nucleophile from sulfur to oxygen resulted in biphasic kinetics (Figures 2 A and S3, Table S2); this is consistent with the formation and breakdown of an acyl–enzyme intermediate (i.e., a fast first step followed by a slower, rate-limiting step, Figure S5). The second-order rate constants $k_{2}^{\mathrm{obs1}}$ and $k_{2}^{\mathrm{obs2}}$ of TEVSer were found to be 80 and 20 000 times lower, respectively, than the measured second-order rate constant of TEVCys (representing $k_{\mathrm{cat}}/K_{\mathrm{M}}$).
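A fast first step followed by a slower, rate-limiting step is often described with a burst-type progress curve. The sketch below uses one common textbook form, an exponential burst plus a linear steady-state phase; whether this exact functional form was used for the fits reported here is an assumption, and the function name and all parameter values are hypothetical.

```python
import math


def burst_progress(t: float, amplitude: float, k_fast: float, v_slow: float) -> float:
    """Product formed at time t for a biphasic (burst) progress curve.

    amplitude: size of the initial burst (assumed model parameter)
    k_fast:    first-order rate constant of the fast phase
    v_slow:    steady-state rate of the slow, rate-limiting phase
    """
    return amplitude * (1.0 - math.exp(-k_fast * t)) + v_slow * t


# Hypothetical parameters, chosen only to make the two phases visible.
AMPLITUDE, K_FAST, V_SLOW = 1.0, 2.0, 0.1

for t in (0.0, 0.5, 1.0, 5.0, 20.0):
    print(f"t = {t:5.1f}  product = {burst_progress(t, AMPLITUDE, K_FAST, V_SLOW):.3f}")
```

At early times the curve is dominated by the exponential burst; at late times it becomes a straight line with slope `v_slow`, offset by the burst amplitude. A monophasic curve corresponds to the special case of zero burst amplitude.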
In order to investigate how the enzyme can compensate for the handicap of using a non-native nucleophile and altered reaction chemistry, ten rounds of DE for activity recovery were performed (numbered TEVSer to TEVSerX). Each round of DE consisted of error-prone PCR (1.3±0.4 amino acid mutations per gene), activity screening of 350 enzyme variants by destruction of FRET in cell lysate (Figure S2), and selection of the single best variant. Any S151C revertants were discarded to force evolution to follow a forward pathway. Measurement of turnover rates in cells during screening reflects enzyme fitness as the product of both chemical reactivity and catalyst concentration (determined by biophysical properties, such as folding and solubility). The same FRET-pair substrate was used for both in vivo screening and in vitro kinetics to describe the enzyme–substrate interactions that were relevant for the selection (and to avoid unique effects of the recognition of, for example, small-peptide substrates with different reactivity and affinity).
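The selection scheme described above (mutagenise, screen 350 variants, discard cysteine revertants, keep the single best) can be sketched as a greedy loop. This is a toy illustration only: sequences are short strings, each variant carries exactly one random substitution (a simplification of the 1.3±0.4 mutations per gene quoted above), and `toy_screen` with its target string is an invented stand-in for the FRET lysate assay. All names are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def run_round(parent, screen, nucleophile_pos, rng, library_size=350):
    """One greedy round: mutate, discard Cys revertants, keep the single best variant."""
    library = []
    for _ in range(library_size):
        variant = list(parent)
        pos = rng.randrange(len(variant))
        variant[pos] = rng.choice(AMINO_ACIDS)  # one substitution per variant (simplification)
        library.append("".join(variant))
    # Discard revertants that restore cysteine at the nucleophile position.
    allowed = [v for v in library if v[nucleophile_pos] != "C"]
    if not allowed:
        return parent
    return max(allowed, key=screen)


def toy_screen(variant, target="SWYDE"):
    """Invented fitness proxy: number of positions matching a made-up target."""
    return sum(a == b for a, b in zip(variant, target))


rng = random.Random(0)
variant = "SAAAA"  # position 0 plays the role of the serine nucleophile
for _ in range(10):
    variant = run_round(variant, toy_screen, nucleophile_pos=0, rng=rng)
print(variant, toy_screen(variant))
```

Because revertants are filtered out before selection, the nucleophile position can never return to cysteine in this loop, which mirrors the intent of forcing a forward pathway.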
During the rounds of evolution, no mutations in residues that make direct contacts with the triad (or are within a radius of 4 Å) resulted from experimental selections. Conversely, nine of the 13 point mutations accumulated 4–8 Å from the catalytic triad, in the second shell of residue interactions (Figure 3). In vitro kinetics of purified variants showed that the process of directed evolution recovered proteolytic activity by an improvement in both $k_{2}^{\mathrm{obs1}}$ (2×10³-fold) and $k_{2}^{\mathrm{obs2}}$ (3×10³-fold) and also changed the burst amplitude (Figure 2 A, Tables S2 and S3), with diminishing improvements in later rounds. The accumulation of mutations around the enzyme active site (second shell) that increased catalysis reduced soluble expression sixfold; however, in rounds V and VI, surface mutations (W130C and E194D) and a C-terminal truncation (due to a frameshift) were selected that improved both solubility and activity (Figure 2 C).
In addition to the adaptive trajectory of TEVSer (Figure 2 B, front), the identified adaptive mutations were examined in the parental background, by reverting the nucleophile to the original cysteine at each step of the evolutionary trajectory (Figure 2 B, back). The kinetics of these TEVCys variants could be fitted to a monophasic model with good correlation coefficients. Rather than respecialising the active site to use serine, as is typical of directed-evolution experiments, the 14 mutations accumulated by DE proved nearly neutral to activity with the original cysteine nucleophile (i.e., did not trade off). Whereas the $k_{2}^{\mathrm{obs1}}$ and $k_{2}^{\mathrm{obs2}}$ of TEVSerX are 1000-fold improved over those of TEVSer, the nucleophile revertants retained activity within fourfold of TEVCys (Figures 2 B and S6). The evolutionary trajectory therefore results in twin enzymes, differing only in their nucleophile (TEVSerX↔TEVCysX) and with only a small, 2.3-fold difference in activity upon nucleophile exchange.
By forcing TEV into a local fitness valley (TEVSer) and experimentally evolving for activity recovery, we mapped out an uphill trajectory (TEVSer→TEVSerX) that lies parallel in sequence space to a nearly flat, neutral trajectory of mutants with constant Cys nucleophile (TEVCys→TEVCysX; Figure 2 B). Therefore, despite deliberately evolving TEVCys by using a fitness valley, the nucleophile-permissive TEVSerX can also be accessed from TEVCys by nearly neutral mutations without any large drops in activity. The most closely related natural serine protease only retains 15 % sequence identity to TEVCys. However, the mutant TEVSerX shows 92.4 % identity (Table S2, Figure 3), thus suggesting that only a few mutations are necessary to accommodate a nucleophile switch. The >1000-fold improvement to generate a nucleophile generalist with only 13 mutations explains how divergent evolution of core catalytic machinery can occur within evolutionary superfamilies such as the PA clan (Figure 1 B). It also emphasises the power of directed evolution to find solutions for the challenge of retuning chemical reactivity.17
The challenge of nucleophile permissiveness is conceptually similar to that of catalytic promiscuity (the ability of an enzyme to accept different substrates): how can an enzyme make and break bonds different from those it has evolved for?18 The evolution of promiscuous activities has been previously observed to pass through catalytic generalists, which were able to promote a new reaction, while retaining some activity on their original substrate. Generalist enzymes are proposed to perform an important role in the evolution of new functions by being particularly evolvable.19 Although enzyme promiscuity towards different substrates is well documented,18, 20 the ability to use different residues for nucleophilic, covalent catalysis represents an alternative kind of chemical versatility in the core catalytic machinery.21
Even though the lynchpin of catalysis in the protease active site—the nucleophile—was mutated, the large activity drop was readily recoverable by evolution. Quantitatively, both the handicap introduced and the extent of the recovery exceed those previously observed by approximately two orders of magnitude.15 Specifically, a 120-fold reduction triggered by cofactor exchange was followed by a 70-fold recovery in a study by Miller et al.15a and a 400-fold reduction caused by mutation of a conserved residue was followed by a 40-fold recovery observed by Wellner et al.,15b compared with a 20 000-fold reduction and 3000-fold recovery in this work.
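The "approximately two orders of magnitude" comparison can be checked directly from the fold changes quoted above. The short sketch below only re-does that arithmetic; the dictionary layout and function name are illustrative choices, and the numbers are the ones given in this paragraph.

```python
import math

# Fold changes quoted in the text: (handicap, recovery).
FOLD_CHANGES = {
    "Miller et al.": {"handicap": 120, "recovery": 70},
    "Wellner et al.": {"handicap": 400, "recovery": 40},
    "this work": {"handicap": 20_000, "recovery": 3_000},
}


def orders_of_magnitude(ratio: float) -> float:
    """Express a fold ratio as a number of decades (log10)."""
    return math.log10(ratio)


this_work = FOLD_CHANGES["this work"]
differences = {}
for study, values in FOLD_CHANGES.items():
    if study == "this work":
        continue
    differences[study] = {
        key: orders_of_magnitude(this_work[key] / values[key])
        for key in ("handicap", "recovery")
    }

for study, diff in differences.items():
    print(f"{study}: handicap +{diff['handicap']:.1f} decades, "
          f"recovery +{diff['recovery']:.1f} decades")
```

Each of the four ratios falls between about 1.6 and 2.2 decades, consistent with the rounded statement that both the handicap and the recovery exceed the earlier examples by roughly two orders of magnitude.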
What is more, the similar rates of TEVCysX and TEVSerX (3×10³ vs. 7×10³ min⁻¹ M⁻¹) represent a rare example of catalytic promiscuity at high, wild-type levels (in contrast to promiscuous, yet low-activity catalytic generalists).22 Paradoxically, a nucleophile mutation would be predicted to be more difficult to recover from, when compared to evolution to accommodate promiscuous binding of multiple substrates, as two different types of bonds (O−C vs. S−C) are being formed and cleaved. However, our data suggest that the differences in nucleophile reactivity and structure can be readily accommodated by TEV protease with apparently little trade-off between rates for different nucleophiles. The unexpected nucleophile tolerance suggests that chemically versatile intermediates such as TEVSer/CysX exist that could facilitate the phylogenetically observed switch between protease clans that differ in their nucleophiles prior to specialisation. The protein framework of TEVSer/CysX allows two nucleophiles to execute their function with good efficiency and constitutes a molecular solution to escape from adaptive conflict.
Acknowledgments
We thank several colleagues, particularly Sean Devenish, for insightful comments on the manuscript. We acknowledge financial support from the Biotechnology and Biological Sciences Research Council and MedImmune. F.H. is an ERC Starting Investigator.
References
- 1a. Furnham N, Holliday GL, de Beer TA, Jacobsen JO, Pearson WR, Thornton JM. Nucleic Acids Res. 2014;42:485–489. doi: 10.1093/nar/gkt1243.
- 1b. Buller AR, Townsend CA. Proc. Natl. Acad. Sci. USA. 2013;110:E653–E661. doi: 10.1073/pnas.1221050110.
- 2. Galperin MY, Walker DR, Koonin EV. Genome Res. 1998;8:779–790. doi: 10.1101/gr.8.8.779.
- 3a. Hittinger CT, Carroll SB. Nature. 2007;449:677–681. doi: 10.1038/nature06151.
- 3b. Des Marais DL, Rausher MD. Nature. 2008;454:762–765. doi: 10.1038/nature07092.
- 3c. Sikosek T, Chan HS, Bornberg-Bauer E. Proc. Natl. Acad. Sci. USA. 2012;109:14888–14893. doi: 10.1073/pnas.1115620109.
- 4. Kraut DA, Carroll KS, Herschlag D. Annu. Rev. Biochem. 2003;72:517–571. doi: 10.1146/annurev.biochem.72.121801.161617.
- 5a. Kirby AJ, Hollfelder F. From Enzyme Models to Model Enzymes. Cambridge: Royal Society of Chemistry; 2009.
- 5b. Kirby AJ. Adv. Phys. Org. Chem. 1980:183–278.
- 6a. Sigala PA, Kraut DA, Caaveiro JM, Pybus B, Ruben EA, Ringe D, Petsko GA, Herschlag D. J. Am. Chem. Soc. 2008;130:13696–13708. doi: 10.1021/ja803928m.
- 6b. Kirby AJ, Hollfelder F. Nature. 2008;456:45. doi: 10.1038/456045a.
- 7a. Radisky ES, Lee JM, Lu CJ, Koshland DE Jr. Proc. Natl. Acad. Sci. USA. 2006;103:6835–6840. doi: 10.1073/pnas.0601910103.
- 7b. Polgár L. Cell. Mol. Life Sci. 2005;62:2161–2172. doi: 10.1007/s00018-005-5160-x.
- 7c. Hedstrom L. Chem. Rev. 2002;102:4501–4524. doi: 10.1021/cr000033x.
- 8. Radzicka A, Wolfenden R. J. Am. Chem. Soc. 1996;118:6105–6109.
- 9a. Beveridge AJ. Protein Sci. 1996;5:1355–1365. doi: 10.1002/pro.5560050714.
- 9b. Higaki JN, Gibson BW, Craik CS. Cold Spring Harbor Symp. Quant. Biol. 1987;52:615–621. doi: 10.1101/sqb.1987.052.01.070.
- 10. Di Cera E. IUBMB Life. 2009;61:510–515. doi: 10.1002/iub.186.
- 11. Rawlings ND, Barrett AJ, Bateman A. Nucleic Acids Res. 2012;40:D343–D350. doi: 10.1093/nar/gkr987.
- 12a. Goldfarb PS. Nature. 1988;336:429.
- 12b. Brenner S. Nature. 1988;334:528–530. doi: 10.1038/334528a0.
- 12c. Gorbalenya AE, Blinov VM, Donchenko AP. FEBS Lett. 1986;194:253–257. doi: 10.1016/0014-5793(86)80095-3.
- 12d. Irwin DM. Nature. 1988;336:429–430. doi: 10.1038/336429b0.
- 13. Holm L, Rosenström P. Nucleic Acids Res. 2010;38:W545–W549. doi: 10.1093/nar/gkq366.
- 14a. McGrath ME, Wilke ME, Higaki JN, Craik CS, Fletterick RJ. Biochemistry. 1989;28:9264–9270. doi: 10.1021/bi00450a005.
- 14b. Turkenburg JP, Lamers MB, Brzozowski AM, Wright LM, Hubbard RE, Sturt SL, Williams DH. Acta Crystallogr. Sect. D Biol. Crystallogr. 2002;58:451–455. doi: 10.1107/s0907444901021825.
- 14c. Neet KE, Koshland DE Jr. Proc. Natl. Acad. Sci. USA. 1966;56:1606–1611. doi: 10.1073/pnas.56.5.1606.
- 14d. Polgár L, Asboth B. J. Theor. Biol. 1986;121:323–326. doi: 10.1016/s0022-5193(86)80111-4.
- 14e. Cheah KC, Leong LE, Porter AG. J. Biol. Chem. 1990;265:7180–7187.
- 14f. Hahn CS, Strauss JH. J. Virol. 1990;64:3069–3073. doi: 10.1128/jvi.64.6.3069-3073.1990.
- 15a. Miller SP, Lunzer M, Dean AM. Science. 2006;314:458–461. doi: 10.1126/science.1133479.
- 15b. Wellner A, Raitses Gurevich M, Tawfik DS. PLoS Genet. 2013;9:e1003665. doi: 10.1371/journal.pgen.1003665.
- 16. Nguyen AW, Daugherty PS. Nat. Biotechnol. 2005;23:355–360. doi: 10.1038/nbt1066.
- 17. Renata H, Wang ZJ, Arnold FH. Angew. Chem. Int. Ed. 2015;54:3351–3367. doi: 10.1002/anie.201409470. Angew. Chem. 2015;127.
- 18. O’Brien PJ, Herschlag D. Chem. Biol. 1999;6:R91–R105. doi: 10.1016/S1074-5521(99)80033-7.
- 19a. Aharoni A, Gaidukov L, Khersonsky O, Gould SMcQ, Roodveldt C, Tawfik DS. Nat. Genet. 2004;37:73–76. doi: 10.1038/ng1482.
- 19b. Jensen RA. Annu. Rev. Microbiol. 1976;30:409–425. doi: 10.1146/annurev.mi.30.100176.002205.
- 20a. Nobeli I, Favia AD, Thornton JM. Nat. Biotechnol. 2009;27:157–167. doi: 10.1038/nbt1519.
- 20b. Khersonsky O, Tawfik DS. Annu. Rev. Biochem. 2010;79:471–505. doi: 10.1146/annurev-biochem-030409-143718.
- 20c. Copley SD. Trends Biochem. Sci. 2015;40:72–78. doi: 10.1016/j.tibs.2014.12.004.
- 21. Babtie A, Tokuriki N, Hollfelder F. Curr. Opin. Chem. Biol. 2010;14:200–207. doi: 10.1016/j.cbpa.2009.11.028.
- 22a. Matsumura I, Ellington AD. J. Mol. Biol. 2001;305:331–339. doi: 10.1006/jmbi.2000.4259.
- 22b. Tokuriki N, Jackson CJ, Afriat-Jurnou L, Wyganowski KT, Tang R, Tawfik DS. Nat. Commun. 2012;3:1257. doi: 10.1038/ncomms2246.