Abstract
Our ability to design new or improved biomolecular activities depends on understanding the sequence-function relationships in proteins. The large size and fold complexity of most proteins, however, obscure these relationships, and protein-optimization methods continue to rely on laborious experimental iterations. Recently, a deeper understanding of the roles of stability-threshold effects and biomolecular epistasis in proteins has led to the development of hybrid methods that combine phylogenetic analysis with atomistic design calculations. These methods enable reliable and even single-step optimization of protein stability, expressibility, and activity in proteins that were considered outside the scope of computational design. Furthermore, ancestral-sequence reconstruction produces insights into missing links in the evolution of enzymes and binders that may be used in protein design. Through the combination of phylogenetic and atomistic calculations, the long-standing goal of general computational methods that can be universally applied to study and optimize proteins finally seems within reach.
Introduction
Proteins have a central and increasingly dominant role in modern life sciences research, technology, and medicine. Natural proteins that can be applied as-is, however, are rare as many proteins exhibit significant shortcomings, such as unacceptably low stability, activity, or specificity, or high production costs. Therefore, the ability to optimize natural proteins has revolutionized research and applications. One of the most common and successful approaches in protein optimization is lab evolution, in which variants are selected from a genetic pool if they exhibit desirable properties[1]. This strategy has been advanced to the point that it can be applied, particularly if a high-throughput assay is available, to engineer very efficient enzymes[2] and high-affinity binders[3] including therapeutics, and the pioneers of this strategy were honored with the 2018 Nobel Prize in Chemistry[4–6].
Powerful though this strategy is, however, reaching acceptable results requires iterations of medium-to-high throughput screening. But such screens are available for many though not all binders and for only a small subset of enzymes. Therefore, it has been a long-standing goal to complement or replace iterative screening with rational or structure-based design calculations. Indeed, success in the design of protein folds[7,8] (which has been impressively extended over the past decade[9,10]) raised expectations for “black-box” computational protein-optimization strategies that could be applied by non-experts to, in principle, any natural protein. Until recently, however, protein optimization required a higher level of accuracy than computational methods provided[11] and therefore continued to rely on laborious iterations of design and experiment[12–15].
The difficulties of developing general design methods for protein optimization are typically ascribed to two reasons: first, design methods rely on inherently inaccurate energy functions, and protein optimization usually requires introducing several mutations, leading to accumulating errors[11,12,16]. Second, the benefits resulting from many mutations observed in lab-evolution studies were difficult to rationalize from a molecular perspective as the mutations localized to positions far from the protein’s active site[17–19]. These difficulties limited the scope of automated optimization methods to proteins that exhibit relatively simple and predictable sequence-structure relationships such as small or repeat proteins[11,12,20]. Thus, without mitigating the risk from accumulating errors in the design of multipoint variants and without understanding the molecular basis for the benefits resulting from remote mutations, progress in the development of general computational methods for protein optimization was severely impeded.
The conceptual bases of improved optimization-design methods
Broadly speaking, the solution to both of these conundrums came in part from the evidence that accumulated in lab-evolution and high-throughput selection studies. Over the past decade, deep mutational-scanning experiments, which measure the relative activities of every single-point mutant of a target protein (reviewed by [21,22]), revealed that only a small fraction of mutations (typically ≤5%) improve protein activity[23–25]. Nevertheless, focusing screening experiments on combinations of such tolerated single-point mutations often yielded surprisingly large, orders-of-magnitude increases in activity[13,24]. This result provided an intriguing clue to solving the problem of accumulating errors by suggesting that multipoint design calculations might focus on mutations that are predicted to be individually tolerated[23–25] to reduce the risk stemming from the errors accumulating in multipoint design calculations[26].
A second insight was that most natural proteins are only marginally stable, meaning that even a handful of mutations may reduce the fraction of correctly folded protein below its “stability threshold” to unacceptable levels[27,28]. Marginal stability constrains the engineering or design of improved variants since function-enhancing mutations may inadvertently destabilize the protein[29–31]. Fortunately, compensatory stabilizing mutations can accumulate outside the active site, while improving the protein’s tolerance to function-enhancing mutations within the active site[19]. The pervasiveness of the “stability-threshold effect” and the ability to improve protein stability through remote mutations provide a rational molecular explanation for a large fraction of the remote mutations accrued in lab-evolution experiments. From a practical standpoint, the potential of remote mutations to optimize proteins well above the threshold necessary for their proper folding also suggested an important direction for the development of general function-optimization strategies: first, stability-design calculations introduce remote mutations that improve the protein’s thermal stability and expressibility well beyond the stability threshold, and second, function-enhancing mutations are designed within the active site.
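The stability-threshold effect can be illustrated with a simple two-state folding model, in which the fraction of folded protein depends on the folding free energy. The sketch below is a minimal illustration and not part of the methods reviewed here; the temperature, the starting stability of -5 kcal/mol, and the uniform penalty of 1 kcal/mol per destabilizing mutation are assumed values chosen only to show the trend.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def fraction_folded(dg_fold, temp_k=298.15):
    """Two-state model: dg_fold = G(folded) - G(unfolded) in kcal/mol.

    Negative values mean the folded state is favored.
    """
    return 1.0 / (1.0 + math.exp(dg_fold / (R_KCAL * temp_k)))


def stability_after_mutations(dg_start, ddg_per_mutation, n_mutations):
    """Assume (for illustration only) that each mutation adds the same penalty."""
    return dg_start + ddg_per_mutation * n_mutations


# Marginally stable protein (assumed -5 kcal/mol) accumulating destabilizing
# mutations (assumed +1 kcal/mol each).
for n in range(9):
    dg = stability_after_mutations(-5.0, 1.0, n)
    print(f"{n} mutations: dG = {dg:+.1f} kcal/mol, folded = {fraction_folded(dg):.3f}")
```

Under these assumptions the folded fraction stays close to 1 for the first few mutations and then collapses over a narrow range, which is why stabilizing the scaffold first leaves more room for function-enhancing mutations.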
Automated methods for activity optimization
The automation of activity-optimization methods has been a long-standing goal of protein-design methodology[12,32]. As described above, however, the development of such methods was stymied by the pervasiveness of stability-threshold effects and energy function inaccuracies. Therefore, the recent development of automated stability-design methods, such as PROSS and FireProt, that could reliably and dramatically improve protein expression levels and thermal stability of diverse proteins[28,33,34] provided a necessary platform for developing function-enhancing methods through active-site design. Still, the sequence-structure relationships in an active site are notoriously complex and a subject of intense research. Foremost, active sites are characterized by a high density of interactions among long amino acid side chains. Due to the density of interactions in the active site, whether a mutation is tolerated may strongly depend on whether another mutation is already present, a phenomenon known as biomolecular epistasis[35].
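Pairwise epistasis is commonly quantified as the deviation of a double mutant from the additive expectation of its two single mutants on a log scale. The sketch below assumes hypothetical activities expressed relative to the wild type; the function names and numbers are illustrative and are not taken from the studies cited here.

```python
import math


def log_effect(activity, reference=1.0):
    """Log10 change in activity relative to a reference background."""
    return math.log10(activity / reference)


def pairwise_epistasis(act_a, act_b, act_ab, wild_type=1.0):
    """Double-mutant log effect minus the sum of the single-mutant log effects.

    Zero means the mutations combine additively on the log scale.
    """
    expected = log_effect(act_a, wild_type) + log_effect(act_b, wild_type)
    return log_effect(act_ab, wild_type) - expected


def is_sign_epistasis(act_a, act_b, act_ab, wild_type=1.0):
    """True if either mutation flips from harmful to beneficial (or the
    reverse) depending on whether the other mutation is present."""
    a_in_wt, a_in_b = log_effect(act_a, wild_type), log_effect(act_ab, act_b)
    b_in_wt, b_in_a = log_effect(act_b, wild_type), log_effect(act_ab, act_a)
    return a_in_wt * a_in_b < 0 or b_in_wt * b_in_a < 0


# Hypothetical case: each single mutant loses 10-fold activity,
# yet the double mutant gains 10-fold.
print(pairwise_epistasis(0.1, 0.1, 10.0))
print(is_sign_epistasis(0.1, 0.1, 10.0))
```

In a case like this hypothetical one, a stepwise search that accepts only improved single mutants would never reach the double mutant, which is the situation that motivates designing multipoint mutants in one step.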
To address these challenges, HotspotWizard, for instance, uses bioinformatics and structure-based analysis to recommend sites for genetic diversification as starting points for experimental screening[36–38]. The method was applied to numerous enzyme-engineering challenges, such as optimizing the activity of the enzyme α1,3-fucosyltransferase towards its target substrate, resulting in a 15-fold improvement in catalytic rate[39]. Recently, HotspotWizard was used to select positions to design a library of chymotrypsin variants for isolating enzymes with altered substrate selectivities[40]. One of the variants exhibited a selectivity for asparagine that was not observed in the wild type and can be useful for mass-spectrometry-based detection of N-glycosylation patterns. In a recent development, the method was extended to address proteins of unknown structure through homology modeling[38]. This and other methods that use homology modeling to design libraries for screening, such as the NewProt web server[41], represent an important direction for the future as they may overcome critical gaps in experimentally determined protein structures[42].
A method called FuncLib was developed in our lab to address the problem of how to design multipoint mutants given the highly epistatic nature of a protein active site[43]. FuncLib first predicts the tolerated sequence space within the active site through a combination of sequence conservation and Rosetta mutational scanning calculations; it then enumerates and ranks all multipoint mutants by energy, selecting a set of designs that exhibit stable and dense constellations of amino acid interactions. Thus, FuncLib first drastically reduces the size and complexity of sequence space in the active-site pocket, focusing on a diverse though reduced subspace that is likely to be enriched for stable and functional variants (Figure 1). Unlike HotspotWizard and other design methods[44], FuncLib results in a set typically comprising 20-50 designs for experimental testing rather than a library for iterative screening.
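The enumerate-and-rank logic described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the FuncLib implementation: the positions and tolerated identities in `TOLERATED` are hypothetical, `toy_energy` is a made-up stand-in for the Rosetta energy calculations, and the limits of two to three mutations per design are arbitrary.

```python
from itertools import combinations, product

# Hypothetical tolerated mutations per active-site position (wild type excluded),
# standing in for the output of conservation and mutational-scanning filters.
TOLERATED = {
    106: ["L", "M"],
    132: ["E"],
    254: ["R", "G", "H"],
    271: ["F", "W"],
}


def enumerate_multipoint_mutants(tolerated, min_mut=2, max_mut=3):
    """Yield every combination of tolerated mutations with min_mut..max_mut
    mutated positions, as tuples of (position, amino_acid) pairs."""
    positions = sorted(tolerated)
    for k in range(min_mut, max_mut + 1):
        for chosen in combinations(positions, k):
            for aas in product(*(tolerated[p] for p in chosen)):
                yield tuple(zip(chosen, aas))


def toy_energy(mutant):
    """Made-up score (lower is better); NOT a physical energy function."""
    single = {"L": -0.5, "M": -0.2, "E": 0.3, "R": -1.0,
              "G": 0.1, "H": -0.4, "F": -0.8, "W": -0.3}
    score = sum(single[aa] for _, aa in mutant)
    by_pos = dict(mutant)
    # Illustrative epistatic term: assume R254 and W271 clash when combined.
    if by_pos.get(254) == "R" and by_pos.get(271) == "W":
        score += 2.0
    return score


def rank_designs(mutants, energy_fn, top_n):
    """Return the top_n lowest-scoring designs."""
    return sorted(mutants, key=energy_fn)[:top_n]


designs = list(enumerate_multipoint_mutants(TOLERATED))
print(len(designs), "candidate designs")
for design in rank_designs(designs, toy_energy, 3):
    print(design, round(toy_energy(design), 2))
```

Restricting each position to a handful of tolerated identities keeps the enumeration small enough to score exhaustively, so combinations are ranked as whole constellations rather than assembled one mutation at a time.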
We tested FuncLib on enzymes that were previously stabilized through the PROSS method (Figure 2). Strikingly, FuncLib designs exhibited orders of magnitude higher catalytic efficiency than the natural parent enzymes, including enzymes with nearly 4,000-fold improved hydrolytic rates of toxic nerve agents. Furthermore, mutational analysis of one of the designs indicated that it was extremely epistatic, with all double mutants exhibiting much lower activity than one of the single-point mutants, thus blocking most of the evolutionarily plausible mutational trajectories from the parent to the designed enzyme (Figure 3)[26]. We also demonstrated that the FuncLib methodology works equally well in improving binding affinities in protein-protein interactions[45] and could lead to large increases in antibody stability, expressibility, and binding affinity when applied to the interfaces between antibody light and heavy chains[25]. Thus, the accurate design of stable and dense constellations of interacting sidechains provided a surprisingly general and automated solution to various protein-optimization problems.
Taken together, the results of the last few years suggest a quite rational and effective strategy for fully computational protein optimization: a first step consisting of automated stability design (e.g., through PROSS or FireProt) introduces dozens of mutations to stabilize the protein scaffold outside the active site; and in a second step, the active site itself is targeted to improve activities (Figure 2). Furthermore, our lab has shown that this strategy could vastly accelerate studies devoted to fundamental and challenging protein-design problems, including the design of structurally and functionally diverse, high-efficiency enzymes[46], ultrahigh specificity binding partners that exhibited new and accurately designed backbones and polar interactions[45], and a biosynthetic pathway[47]. Thus, the combination of phylogenetic analyses and atomistic design calculations that serves as the basis for the methods described in this section is providing general and practical solutions to function-optimization problems that were previously addressed with great difficulty by lab evolution or ended in failure.
ASR identifies connections between binders and enzymes
Ancestral-sequence reconstruction (ASR) uses phylogenetic relationships among homologous proteins to infer likely ancestral sequences. In the field of protein engineering and design, ASR was traditionally used to improve thermal stability[48,49] or extend the substrate range of enzymes and binders[50] thanks to the tendency of ancestral proteins to be functionally less specialized[51] and closer to the family consensus[52] than are extant proteins. High stability and broad substrate range also make ancestral inference quite useful in providing starting points for lab-evolution studies that aim at new or improved activities[53,54].
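As a simplified illustration of inferring ancestral states from a phylogeny, the sketch below applies Fitch's maximum-parsimony algorithm to a single alignment column on a small rooted binary tree. This is a deliberately simple substitute: ASR studies typically rely on probabilistic (maximum-likelihood or Bayesian) inference with substitution models, which is not shown here. The tree, leaf names, and residues are hypothetical.

```python
# Hypothetical rooted binary tree as nested tuples of leaf names,
# and one alignment column giving the residue observed in each leaf.
TREE = (("sp1", "sp2"), ("sp3", ("sp4", "sp5")))
LEAF_STATES = {"sp1": "L", "sp2": "L", "sp3": "M", "sp4": "L", "sp5": "M"}


def fitch_up(node, leaf_states):
    """Bottom-up pass: each internal node gets the intersection of its
    children's state sets if it is non-empty, otherwise their union."""
    if isinstance(node, str):
        return {"name": node, "set": {leaf_states[node]}, "children": []}
    left, right = (fitch_up(child, leaf_states) for child in node)
    common = left["set"] & right["set"]
    return {"name": None,
            "set": common or (left["set"] | right["set"]),
            "children": [left, right]}


def fitch_down(node, parent_state=None):
    """Top-down pass: keep the parent's state when it is allowed at this node;
    otherwise pick one allowed state (alphabetically, to be deterministic)."""
    if parent_state in node["set"]:
        node["state"] = parent_state
    else:
        node["state"] = min(node["set"])
    for child in node["children"]:
        fitch_down(child, node["state"])
    return node


root = fitch_down(fitch_up(TREE, LEAF_STATES))
print("candidate root states:", sorted(root["set"]))
print("chosen root state:", root["state"])
```

When the root set contains more than one residue, as in this toy column, parsimony alone cannot decide between them; ambiguity of this kind is one reason probabilistic reconstructions that report per-site support are generally preferred.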
In an interesting recent development, ASR was used to evolve enzymes from ancestral binding proteins with no catalytic activity[55,56]. In one case, ASR produced three inferred ancestors of the extant chalcone isomerase family that were enzymatically inactive, and using lab evolution, the authors reproduced an active variant, allowing them to map in detail the mutational events linking a presumed ancestral binder and extant enzymes[56]. In a similar vein, ASR of a family of binding proteins that lack catalytic activity resulted in catalytically active proteins[55]. These studies suggest ways to reconstruct critical evolutionary events that linked binders and enzymes and provide insights into the types of mutations that may enable the design of new enzymes from extant binders.
Outlook
The past few years have seen a significant advance in applied protein design methodology: thanks to our improved understanding of how to overcome stability-threshold effects and biomolecular epistasis, it is now practical, even for non-experts, to apply completely automated methods to optimize diverse and very challenging proteins[57,58]. There is an important caveat to keep in mind, however: most reliable design methods require accurate molecular structures, but experimentally determined structures, which are the gold standard in terms of accuracy, are available for only a small fraction of natural proteins, and some classes of proteins may actually resist structure determination. Recent advances in ab initio structure prediction methods[59] are raising the exciting possibility of a future completely automated protein-optimization pipeline in which sequences are subjected to structure prediction followed by computational design. Furthermore, machine-learning methods may in the future be reliable enough to use phylogenetic information, together with limited structural and experimental data, to circumvent the requirement of accurate structural data[60].
Acknowledgments
We thank Dan Tawfik and members of the Fleishman lab for discussions on the concepts that underpin evolution-guided atomistic design and Shiran Barber-Zucker for critical reading. The research was funded by a European Research Council Consolidator Award (815379), the Israel Science Foundation (1844/19), and charitable donations from Sam Switzer and family.
References
- 1. Romero PA, Arnold FH. Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol. 2009;10:866–876. doi: 10.1038/nrm2805.
- 2. Goldsmith M, Tawfik DS. Enzyme engineering: reaching the maximal catalytic efficiency peak. Curr Opin Struct Biol. 2017;47:140–150. doi: 10.1016/j.sbi.2017.09.002.
- 3. Boder ET, Midelfort KS, Wittrup KD. Directed evolution of antibody fragments with monovalent femtomolar antigen-binding affinity. Proc Natl Acad Sci U S A. 2000;97:10701–10705. doi: 10.1073/pnas.170297297.
- 4. Winter G. Harnessing Evolution to Make Medicines (Nobel Lecture). Angew Chem Int Ed Engl. 2019;58:14438–14445. doi: 10.1002/anie.201909343.
- 5. Arnold FH. Innovation by Evolution: Bringing New Chemistry to Life (Nobel Lecture). Angew Chem Int Ed Engl. 2019. doi: 10.1002/anie.201907729.
- 6. Smith GP. Phage Display: Simple Evolution in a Petri Dish (Nobel Lecture). Angew Chem Int Ed Engl. 2019;58:14428–14437. doi: 10.1002/anie.201908308.
- 7. Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a novel globular protein fold with atomic-level accuracy. Science. 2003;302:1364–1368. doi: 10.1126/science.1089427.
- 8. Dahiyat BI, Mayo SL. De novo protein design: fully automated sequence selection. Science. 1997;278:82–87. doi: 10.1126/science.278.5335.82.
- 9. Kuhlman B, Bradley P. Advances in protein structure prediction and design. Nat Rev Mol Cell Biol. 2019. doi: 10.1038/s41580-019-0163-x.
- 10. Huang P-S, Boyken SE, Baker D. The coming of age of de novo protein design. Nature. 2016;537:320–327. doi: 10.1038/nature19946.
- 11. Baker D. What has de novo protein design taught us about protein folding and biophysics? Protein Sci. 2019;28:678–683. doi: 10.1002/pro.3588.
- 12. Magliery TJ. Protein stability: computation, sequence statistics, and new experimental methods. Curr Opin Struct Biol. 2015;33:161–168. doi: 10.1016/j.sbi.2015.09.002.
- 13. Whitehead TA, Baker D, Fleishman SJ. Computational design of novel protein binders and experimental affinity maturation. Methods Enzymol. 2013;523:1–19. doi: 10.1016/B978-0-12-394292-0.00001-1.
- 14. Fleishman SJ, Whitehead TA, Ekiert DC, Dreyfus C, Corn JE, Strauch E-M, Wilson IA, Baker D. Computational Design of Proteins Targeting the Conserved Stem Region of Influenza Hemagglutinin. Science. 2011;332:816–821. doi: 10.1126/science.1202617.
- 15. Rothlisberger D, Khersonsky O, Wollacott AM, Jiang L, DeChancie J, Betker J, Gallaher JL, Althoff EA, Zanghellini A, Dym O, et al. Kemp elimination catalysts by computational enzyme design. Nature. 2008;453:190–195. doi: 10.1038/nature06879.
- 16. Fleishman SJ, Baker D. Role of the biomolecular energy gap in protein design, structure, and evolution. Cell. 2012;149:262–273. doi: 10.1016/j.cell.2012.03.016.
- 17. Zhao H, Arnold FH. Directed evolution converts subtilisin E into a functional equivalent of thermitase. Protein Eng. 1999;12:47–53. doi: 10.1093/protein/12.1.47.
- 18. Wrenbeck EE, Azouz LR, Whitehead TA. Single-mutation fitness landscapes for an enzyme on multiple substrates reveal specificity is globally encoded. Nat Commun. 2017;8:15695. doi: 10.1038/ncomms15695.
- 19. Wilding M, Hong N, Spence M, Buckle AM, Jackson CJ. Protein engineering: the potential of remote mutations. Biochem Soc Trans. 2019;47:701–711. doi: 10.1042/BST20180614. [* A perspective summarizing the evidence on contributions of mutations away from a protein’s active site to its activity.]
- 20. Geiger-Schuller K, Sforza K, Yuhas M, Parmeggiani F, Baker D, Barrick D. Extreme stability in de novo-designed repeat arrays is determined by unusually stable short-range interactions. Proc Natl Acad Sci U S A. 2018. doi: 10.1073/pnas.1800283115.
- 21. Fowler DM, Fields S. Deep mutational scanning: a new style of protein science. Nat Methods. 2014;11:801–807. doi: 10.1038/nmeth.3027.
- 22. Wrenbeck EE, Faber MS, Whitehead TA. Deep sequencing methods for protein engineering and design. Curr Opin Struct Biol. 2017;45:36–44. doi: 10.1016/j.sbi.2016.11.001.
- 23. Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol. 2012;30:543–548. doi: 10.1038/nbt.2214.
- 24. Warszawski S, Dekel E, Campeotto I, Marshall JM, Wright KE, Lyth O, Knop O, Regev-Rudzki N, Higgins MK, Draper SJ, et al. Design of a basigin-mimicking inhibitor targeting the malaria invasion protein RH5. Proteins. 2019. doi: 10.1002/prot.25786. [* Deep mutational scanning followed by one-step Rosetta design results in nearly 2,000-fold improved affinity of a host protein to its natural pathogenic binder. This strategy may be particularly useful to generate high-affinity soluble antagonists from natural receptor-ligand interactions.]
- 25. Warszawski S, Borenstein Katz A, Lipsh R, Khmelnitsky L, Ben Nissan G, Javitt G, Dym O, Unger T, Knop O, Albeck S, et al. Optimizing antibody affinity and stability by the automated design of the variable light-heavy chain interfaces. PLoS Comput Biol. 2019;15:e1007207. doi: 10.1371/journal.pcbi.1007207.
- 26. Weinreich DM, Watson RA, Chao L. Perspective: Sign epistasis and genetic constraint on evolutionary trajectories. Evolution. 2005;59:1165–1174.
- 27. Shoichet BK, Baase WA, Kuroki R, Matthews BW. A relationship between protein stability and protein function. Proc Natl Acad Sci U S A. 1995;92:452–456. doi: 10.1073/pnas.92.2.452.
- 28. Goldenzweig A, Fleishman SJ. Principles of Protein Stability and Their Application in Computational Design. Annu Rev Biochem. 2018;87:105–129. doi: 10.1146/annurev-biochem-062917-012102.
- 29. Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature. 2006;444:929–932. doi: 10.1038/nature05385.
- 30. Bloom JD, Labthavikul ST, Otey CR, Arnold FH. Protein stability promotes evolvability. Proc Natl Acad Sci U S A. 2006;103:5869–5874. doi: 10.1073/pnas.0510098103.
- 31. Warszawski S, Netzer R, Tawfik DS, Fleishman SJ. A “fuzzy”-logic language for encoding multiple physical traits in biomolecules. J Mol Biol. 2014;426:4125–4138. doi: 10.1016/j.jmb.2014.10.002.
- 32. Nivón LG, Bjelic S, King C, Baker D. Automating human intuition for protein design. Proteins. 2014;82:858–866. doi: 10.1002/prot.24463.
- 33. Goldenzweig A, Goldsmith M, Hill SE, Gertman O, Laurino P, Ashani Y, Dym O, Unger T, Albeck S, Prilusky J, et al. Automated Structure- and Sequence-Based Design of Proteins for High Bacterial Expression and Stability. Mol Cell. 2016;63:337–346. doi: 10.1016/j.molcel.2016.06.012.
- 34. Bednar D, Beerens K, Sebestova E, Bendl J, Khare S, Chaloupkova R, Prokop Z, Brezovsky J, Baker D, Damborsky J. FireProt: Energy- and Evolution-Based Computational Design of Thermostable Multiple-Point Mutants. PLoS Comput Biol. 2015;11:e1004556. doi: 10.1371/journal.pcbi.1004556.
- 35. Tokuriki N, Tawfik DS. Stability effects of mutations and protein evolvability. Curr Opin Struct Biol. 2009;19:596–604. doi: 10.1016/j.sbi.2009.08.003.
- 36. Pavelka A, Chovancova E, Damborsky J. HotSpot Wizard: a web server for identification of hot spots in protein engineering. Nucleic Acids Res. 2009;37:W376–83. doi: 10.1093/nar/gkp410.
- 37. Bendl J, Stourac J, Sebestova E, Vavra O, Musil M, Brezovsky J, Damborsky J. HotSpot Wizard 2.0: automated design of site-specific mutations and smart libraries in protein engineering. Nucleic Acids Res. 2016;44:W479–87. doi: 10.1093/nar/gkw416.
- 38. Sumbalova L, Stourac J, Martinek T, Bednar D, Damborsky J. HotSpot Wizard 3.0: web server for automated design of mutations and smart libraries based on sequence input information. Nucleic Acids Res. 2018;46:W356–W362. doi: 10.1093/nar/gky417. [* Smart mutational libraries for experimental screening are designed through a web server even in cases where an experimental structure is not available.]
- 39. Choi YH, Kim JH, Park BS, Kim B-G. Solubilization and Iterative Saturation Mutagenesis of α1,3-fucosyltransferase from Helicobacter pylori to enhance its catalytic efficiency. Biotechnol Bioeng. 2016;113:1666–1675. doi: 10.1002/bit.25944.
- 40. Ramesh B, Abnouf S, Mali S, Moree WJ, Patil U, Bark SJ, Varadarajan N. Engineered ChymotrypsiN for Mass Spectrometry-Based Detection of Protein Glycosylation. ACS Chem Biol. 2019;14:2616–2628. doi: 10.1021/acschembio.9b00506.
- 41. Schwarte A, Genz M, Skalden L, Nobili A, Vickers C, Melse O, Kuipers R, Joosten H-J, Stourac J, Bendl J, et al. NewProt – a protein engineering portal. Protein Eng Des Sel. 2017;30:441–447. doi: 10.1093/protein/gzx024.
- 42. Junker S, Roldan R, Joosten H-J, Clapés P, Fessner W-D. Complete Switch of Reaction Specificity of an Aldolase by Directed Evolution In Vitro: Synthesis of Generic Aliphatic Aldol Products. Angew Chem Int Ed Engl. 2018;57:10153–10157. doi: 10.1002/anie.201804831.
- 43. Khersonsky O, Lipsh R, Avizemer Z, Ashani Y, Goldsmith M, Leader H, Dym O, Rogotner S, Trudeau DL, Prilusky J, et al. Automated Design of Efficient and Functionally Diverse Enzyme Repertoires. Mol Cell. 2018;72:178–186.e5. doi: 10.1016/j.molcel.2018.08.033. [** FuncLib, a method available on a web server, designs dense interaction networks within an enzyme active site, producing variants with as much as 4,000-fold improved catalytic rate against toxic nerve agents.]
- 44. Wijma HJ, Floor RJ, Jekel PA, Baker D, Marrink SJ, Janssen DB. Computationally designed libraries for rapid enzyme stabilization. Protein Eng Des Sel. 2014;27:49–58. doi: 10.1093/protein/gzt061.
- 45. Netzer R, Listov D, Lipsh R, Dym O, Albeck S, Knop O, Kleanthous C, Fleishman SJ. Ultrahigh specificity in a network of computationally designed protein-interaction pairs. Nat Commun. 2018;9:5286. doi: 10.1038/s41467-018-07722-9. [* Phylogenetic analysis and atomistic design are used to compute several ultrahigh specificity binding pairs that exhibit atomically accurate new backbones and polar interaction networks. The FuncLib method is extended here to automatically improve binding affinity in one of the designed pairs, leading to orders of magnitude improvement in both affinity and specificity.]
- 46. Lapidoth G, Khersonsky O, Lipsh R, Dym O, Albeck S, Rogotner S, Fleishman SJ. Highly active enzymes by automated combinatorial backbone assembly and sequence design. Nat Commun. 2018;9:2780. doi: 10.1038/s41467-018-05205-5. [* New enzymes are designed by assembling backbones of naturally occurring homologous structures followed by sequence design calculations. The design strategy relies on PROSS stability design to obtain functionally diverse enzymes that exhibit more than 100 mutations from any natural enzyme and equal or surpass them in catalytic efficiency and thermal stability. This strategy may be used to generate libraries of enzymes with very large selectivity changes.]
- 47. Trudeau DL, Edlich-Muth C, Zarzycki J, Scheffen M, Goldsmith M, Khersonsky O, Avizemer Z, Fleishman SJ, Cotton CAR, Erb TJ, et al. Design and in vitro realization of carbon-conserving photorespiration. Proc Natl Acad Sci U S A. 2018. doi: 10.1073/pnas.1812605115.
- 48. Gumulya Y, Baek J-M, Wun S-J, Thomson RES, Harris KL, Hunter DJB, Behrendorff JBYH, Kulig J, Zheng S, Wu X, et al. Engineering highly functional thermostable proteins using ancestral sequence reconstruction. Nature Catalysis. 2018;1:878–888.
- 49. Babkova P, Sebestova E, Brezovsky J, Chaloupkova R, Damborsky J. Ancestral Haloalkane Dehalogenases Show Robustness and Unique Substrate Specificity. ChemBioChem. 2017;18:1448–1456. doi: 10.1002/cbic.201700197.
- 50. Clifton BE, Whitfield JH, Sanchez-Romero I, Herde MK, Henneberger C, Janovjak H, Jackson CJ. Ancestral Protein Reconstruction and Circular Permutation for Improving the Stability and Dynamic Range of FRET Sensors. Methods Mol Biol. 2017;1596:71–87. doi: 10.1007/978-1-4939-6940-1_5.
- 51. Khersonsky O, Roodveldt C, Tawfik DS. Enzyme promiscuity: evolutionary and mechanistic aspects. Curr Opin Chem Biol. 2006;10:498–508. doi: 10.1016/j.cbpa.2006.08.011.
- 52. Trudeau DL, Kaltenbach M, Tawfik DS. On the Potential Origins of the High Stability of Reconstructed Ancestral Proteins. Mol Biol Evol. 2016;33:2633–2641. doi: 10.1093/molbev/msw138.
- 53. Trudeau DL, Tawfik DS. Protein engineers turned evolutionists – the quest for the optimal starting point. Curr Opin Biotechnol. 2019;60:46–52. doi: 10.1016/j.copbio.2018.12.002.
- 54. Gomez-Fernandez BJ, Garcia-Ruiz E, Martin-Diaz J, Gomez de Santos P, Santos-Moriano P, Plou FJ, Ballesteros A, Garcia M, Rodriguez M, Risso VA, et al. Directed -in vitro- evolution of Precambrian and extant Rubiscos. Sci Rep. 2018;8:5532. doi: 10.1038/s41598-018-23869-3.
- 55. Clifton BE, Kaczmarski JA, Carr PD, Gerth ML, Tokuriki N, Jackson CJ. Evolution of cyclohexadienyl dehydratase from an ancestral solute-binding protein. Nat Chem Biol. 2018;14:542–547. doi: 10.1038/s41589-018-0043-2. [** ASR is used to reconstruct evolutionarily plausible intermediates that may have linked primordial binding proteins with extant enzymes. The intermediate steps are characterized biochemically and structurally to reveal how neofunctionalization can manifest.]
- 56. Kaltenbach M, Burke JR, Dindo M, Pabis A, Munsberg FS, Rabin A, Kamerlin SCL, Noel JP, Tawfik DS. Evolution of chalcone isomerase from a noncatalytic ancestor. Nat Chem Biol. 2018;14:548–555. doi: 10.1038/s41589-018-0042-3. [** Similar to the above study, ASR is used to infer mutations that might have linked non-catalytic primordial proteins with extant enzymes. The detailed study shows that a residue that is key both for binding in the putative primordial protein and to catalysis in the extant proteins is conserved through evolution, though its positioning is modified to enable catalysis. Such insights may enable the design of new enzymes from natural binding proteins.]
- 57. Brazzolotto X, Igert A, Guillon V, Santoni G, Nachon F. Bacterial Expression of Human Butyrylcholinesterase as a Tool for Nerve Agent Bioscavengers Development. Molecules. 2017;22. doi: 10.3390/molecules22111828.
- 58. Tullman J, Christensen M, Kelman Z, Marino JP. A ClpS-based N-terminal amino acid binding reagent with improved thermostability and selectivity. Biochem Eng J. 2020;154:107438.
- 59. Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of Protein Structure Prediction (CASP) – Round XIII. Proteins: Struct Funct Bioinf. 2019. doi: 10.1002/prot.25823.
- 60. Mazurenko S, Prokop Z, Damborsky J. Machine Learning in Enzyme Engineering. ACS Catalysis. 2019. doi: 10.1021/acscatal.9b04321. [* A perspective on current trends in the use of machine-learning methods in enzyme engineering and the important challenges that face the field.]