Author manuscript; available in PMC: 2022 May 24.
Published in final edited form as: Cell Chem Biol. 2021 Dec 17;29(2):177–190. doi: 10.1016/j.chembiol.2021.12.001

Making the Cut with Protease Engineering

Rebekah P Dyer 2, Gregory A Weiss 1,2,3,*
PMCID: PMC9127713  NIHMSID: NIHMS1805019  PMID: 34921772

Summary

Proteases cut with enviable precision and regulate diverse molecular events in biology. Such qualities drive a seemingly inexhaustible appetite for proteases with new activities and capabilities. Comprising 60% of the total industrial enzyme market, proteases appear in consumer goods, such as detergents, textile processing, and numerous foods; additionally, proteases include 25 FDA-approved medicines and various research tools. Recent advances in protease engineering strategies address target specificity, catalytic efficiency, and stability. This tunable guide for protease engineering surveys best practices and emerging strategies. We further highlight gaps and flexibilities inherent to each system that suggest opportunities for new technology development along with engineered proteases to solve challenges in proteomics, protein sequencing, and synthetic gene circuits.

eTOC blurb

Proteases address ever-expanding commercial, medical, and research needs through clever engineering strategies. Dyer et al. review how to reshape protease activity and stability to achieve the right cut in the required context and discuss future challenges and opportunities for evolving proteases.

Introduction

As shown by the MEROPS database of proteolytic enzymes (Rawlings et al., 2018), endogenous proteases account for ≈2% of all known proteins and regulate diverse molecular events across biology. Proteases are also widely applied in commercial processes, medicine, and research (Fig. 1). Indeed, proteases represent 60% of the total industrial enzyme market (Ward, 2019) and include 25 FDA-approved products (U.S. Food and Drug Administration, 2020). Therefore, redirecting protease activity, including altering target specificity, catalytic efficiency, and stability, is a key capability in many technology sectors.

Figure 1. The ubiquity of proteases in industry, medicine, and research.

The representative examples listed here include both native and engineered proteases. Commercial proteases can reduce the environmental footprint and increase the effectiveness of laundry detergents, along with processing food and fabrics. In medicine, proteases can control blood clotting, boost digestion, and treat neuromuscular disorders. Proteases also address various challenges in biochemical research.

Today, growing applications for proteases drive the demand for user-directed peptide cleavage activity. For example, protease-based biotherapeutics can offer the selectivity of antibodies with the power of catalytic targeting; examples include blood coagulation factors and botulinum neurotoxins (BoNTs) for neuroparalysis (Fig. 1). Tailoring the activity of these proteases can improve their therapeutic utility (Craik et al., 2011; Steward et al., 2020). Similarly, commercial products, such as textiles and stain removers, benefit from the cost-effective, environmentally friendly activities of evolved proteases (Li et al., 2013). Engineered proteases also solve challenges in basic research (e.g., in proteomics, protein sequencing, and synthetic biology), which are discussed below. The wide range of existing cleavage activities paired with the evolvability of proteases makes them well-suited to address such diverse applications.

In this article, we present a how-to guide for protease engineering based upon recent literature examples — including adjustments to substrate specificity, catalytic efficiency, and stability. Our review begins with choosing a starting protease for engineering, followed by best practices for library generation and screening. We then focus on current methods for engineering and assessing the activity of proteases, with additional discussion on the requisite characterization of engineered variants. Lastly, we touch on emerging strategies and applications for present and future engineered proteases. By highlighting points of flexibility in existing platforms we hope to inspire the next generation of protease engineering technology.

Recent advances in protein engineering strategies address multiple aspects of enzymatic catalysis, including thermodynamic stability (Tm), protein solubility (S), catalytic efficiency on a single substrate (kcat/Km), or substrate specificity (Km1, Km2, …, Kmn). Identifying a protein engineering project’s desired metrics is critical to its outcome. Experimental choices should be guided by the First Law of Directed Evolution, “you get what you screen for” (Schmidt-Dannert and Arnold, 1999). For example, if a screen aims for greater catalytic efficiency, then its selection criteria must consider substrate turnover rate and ideally also total turnover number. Most importantly, screens must accurately reflect an enzyme’s final conditions for usage, including the range of pH, temperature, salt, and other parameters. This review provides guidance for the optimization of such factors for protease engineering.
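The metrics named above can be made concrete with a short calculation. The sketch below is illustrative only: the `Kinetics` class, the function name, and all numerical values are hypothetical and are not taken from any study cited here. It computes catalytic efficiency (kcat/Km) for two substrates and expresses specificity as the ratio of the two efficiencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Michaelis-Menten parameters for one protease-substrate pair."""
    kcat: float  # turnover number, s^-1
    km: float    # Michaelis constant, M

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency kcat/Km in M^-1 s^-1."""
        return self.kcat / self.km


def specificity_ratio(preferred: Kinetics, other: Kinetics) -> float:
    """Ratio of catalytic efficiencies, preferred substrate over the other."""
    return preferred.efficiency / other.efficiency


# Hypothetical values chosen only to illustrate the arithmetic.
native_substrate = Kinetics(kcat=0.2, km=50e-6)
target_substrate = Kinetics(kcat=0.1, km=200e-6)

print(f"native efficiency: {native_substrate.efficiency:.0f} M^-1 s^-1")
print(f"target efficiency: {target_substrate.efficiency:.0f} M^-1 s^-1")
print(f"native/target ratio: {specificity_ratio(native_substrate, target_substrate):.1f}")
```

With these made-up numbers the variant still prefers its native substrate by roughly 8-fold, which is the kind of result that a screen lacking negative selections might be expected to produce.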

Guiding protease choice and its evolution with bioinformatics and other data

Independent of the desired outcome, all protease engineering strategies begin by choosing an appropriate starting enzyme, typically through bioinformatics searches (Fig. 2). The ideal starting protease exhibits a small amount of the desired activity, which can be expanded through rounds of mutagenesis and screening (Romero and Arnold, 2009). Bioinformatics-guided approaches and homology modeling are often used to identify potential starting proteases. For thermal stability, proteases from thermophilic organisms found in ProtDataTherm (Modarres et al., 2018) or from other extremophiles can be adapted for recombinant expression. For targeting a substrate, PROSPER (Song et al., 2012), DeepCleave (Li et al., 2019), and iProt-Sub (Song et al., 2019) predict protease cleavage activity; additionally, MEROPS provides known substrate specificities of proteases (Rawlings et al., 2018). Given the challenges associated with directed evolution, investing time in selecting a protease saves considerable resources and effort.

Figure 2. Protease engineering overview.

The desired activity of the protease in its ultimate conditions (dotted green square) guides the directed evolution of proteases. Diversification through random, focused, or DNA shuffling mutagenesis techniques generates libraries of protease clones. Quality control ensures only high-quality libraries are screened. Protease variants are then produced and subjected to screens or selections for the desired activity. Ideally, the experimental screening conditions mimic or match the desired ultimate conditions. Iterative diversification and screening eventually yield a final protease variant. The activity (Km and kcat), stability (Tm), solubility (S), and substrate specificity evaluate the quality of the directed evolution.

Next, techniques for mutagenesis, screens, and characterization must be chosen. In general, the large number of choices for each technique can be narrowed quickly to fit the application. For example, the number of mutants generated for a library, termed its diversity, should match the screening capabilities of the engineering platform. Screening highly diverse libraries with low-throughput methods wastes time, and screening low diversity libraries with high-throughput methods wastes resources.
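As a rough illustration of matching diversity to throughput, the sketch below counts the codon combinations in a saturation library built with NNK or NNS degenerate codons (32 codons per randomized site in either scheme) and compares that number with a screening capacity. The function names are hypothetical, and the comparison is only an upper bound because it ignores repeated sampling of the same variant.

```python
NNK_CODONS = 32   # N = A/C/G/T and K = G/T give 4 * 4 * 2 codons; NNS also gives 32
AMINO_ACIDS = 20  # canonical residues encoded by an NNK or NNS codon


def library_size(n_sites: int, variants_per_site: int) -> int:
    """Theoretical diversity of a library randomized at n_sites positions."""
    return variants_per_site ** n_sites


def screen_fraction(n_sites: int, throughput: float) -> float:
    """Upper bound on the fraction of codon combinations one screen can sample."""
    return min(1.0, throughput / library_size(n_sites, NNK_CODONS))


for sites, throughput in ((4, 1e6), (11, 1e7)):
    dna = library_size(sites, NNK_CODONS)
    protein = library_size(sites, AMINO_ACIDS)
    print(f"{sites} sites: {dna:.2e} codon combinations, "
          f"{protein:.2e} protein variants, "
          f"at most {screen_fraction(sites, throughput):.1e} sampled")
```

Four randomized sites give about 10^6 codon combinations, which a screen of 10^6 clones can approach, whereas eleven sites give more than 10^16 combinations, so any practical screen samples only a sliver of that space.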

Screening libraries of protease variants follows the “garbage in, garbage out” principle. Therefore, libraries are ideally subjected to quality control before screening. For instance, sequencing a number of individual clones from a library can estimate its clonal diversity, mutagenesis rate, and the presence of gene fragments or stop codons (Cadwell and Joyce, 1992). Applying this technique to a DNA shuffling library reveals the number of recombination events between parent templates (Zhao et al., 1998). For site-directed libraries with a known theoretical diversity, Qpool analysis calculates and evaluates the experimental diversity of a library (Sullivan et al., 2013; Püllmann et al., 2019). Implementing quality control reserves resources for the libraries most likely to yield improved protease variants.
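A minimal sketch of this kind of sequencing-based quality control follows. It assumes each clone has been sequenced across the full open reading frame and is compared, without alignment, to an in-frame parent sequence; the sequences, function names, and report fields are hypothetical and serve only to show the bookkeeping.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def inspect_clone(parent: str, clone: str) -> dict:
    """Compare one sequenced clone with the parent open reading frame."""
    if len(clone) != len(parent):
        # Insertions, deletions, or gene fragments: flag rather than align.
        return {"full_length": False, "mutations": None, "premature_stop": None}
    mutations = sum(a != b for a, b in zip(parent, clone))
    # Read in frame and skip the final codon, which is the natural stop.
    internal = [clone[i:i + 3] for i in range(0, len(clone) - 3, 3)]
    return {
        "full_length": True,
        "mutations": mutations,
        "premature_stop": any(codon in STOP_CODONS for codon in internal),
    }


def summarize_library(parent: str, clones: list[str]) -> dict:
    """Aggregate per-clone reports into library-level quality metrics."""
    reports = [inspect_clone(parent, c) for c in clones]
    full = [r for r in reports if r["full_length"]]
    return {
        "clones_sequenced": len(clones),
        "unique_fraction": len(set(clones)) / len(clones),
        "fragment_fraction": 1 - len(full) / len(clones),
        "stop_fraction": sum(r["premature_stop"] for r in full) / len(clones),
        "mean_mutations_per_gene": (
            sum(r["mutations"] for r in full) / len(full) if full else None
        ),
    }


# Toy data: a 15-nt "gene" and four sequenced clones (all hypothetical).
PARENT = "ATGGCTAAAGGTTAA"
CLONES = [
    "ATGGCTAAAGGTTAA",  # unmutated parent
    "ATGGCTCAAGGTTAA",  # one substitution
    "ATGGCTTAAGGTTAA",  # one substitution that creates a premature stop
    "ATGGCTAAAGGT",     # truncated fragment
]
print(summarize_library(PARENT, CLONES))
```

A real pipeline would align reads before counting mutations; the position-by-position comparison here is a simplification that holds only for substitution-only clones of the same length as the parent.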

Irrespective of mutagenesis approach, directed evolution projects benefit from existing data for a given protease. For example, inspecting a peptide-bound structure of caspase-7 (MEROPS: C14.004) uncovered four substrate-binding subsites which significantly alter substrate specificity after site-directed mutagenesis (Hill et al., 2016). Alternatively, evaluating orthologous protease sequences identifies residues potentially influencing catalysis or substrate specificity. For example, comparing chymotrypsin (S01.152) to related serine proteases with different selectivities revealed 11 residues that enabled morphing chymotrypsin into an asparagine-targeting protease through site-saturation mutagenesis (Ramesh et al., 2019). One useful tool, HotSpot Wizard, mines bioinformatics data to suggest target sites for mutagenesis (Sumbalova et al., 2018).

Biochemical characterization of both substrate and protease residues also provides key insights for engineering. For example, mutating the substrate recognition sequence of the highly selective botulinum neurotoxin serotype E protease (M27.002) revealed crucial substrate binding sites in the enzyme (Chen and Barbieri, 2007). Biochemically-informed docking studies then demonstrated a single amino acid substitution was sufficient to generate cleavage activity against a non-native substrate (Chen and Barbieri, 2009). Thus, modeling and computational tools frequently complement such empirical analysis.

In the absence of structures, homologs, or other biochemical information, the protein engineer can employ stochastic techniques. Numerous rounds of random mutagenesis, for example, reveal protease residues that influence its stability or activity (Gong et al., 2017; Zhao and Feng, 2018; Hu et al., 2020; Zhu et al., 2020). However, searching randomly through the vast sequence space of an enzyme returns mostly non-functional variants (Bershtein and Tawfik, 2008). Site-directed or DNA shuffling strategies often yield higher fitness protease variants in fewer rounds (Chen et al., 2012; Goldsmith and Tawfik, 2012; Qu et al., 2019). Therefore, we recommend identifying and targeting specific residues during library generation for efficient laboratory evolution.

Multiple protease residues are typically responsible for substrate binding and catalysis; therefore, methods to expedite screening are especially valued. Statistical tools offer a useful estimate of the number of variants to screen per library based upon the library’s diversity (Firth and Patrick, 2005; Patrick and Firth, 2005; Ferla, 2016). For random and DNA shuffling libraries, the number of clones chosen for screening is typically based upon ensuring library coverage (e.g., to exceed the requirements with 95% confidence). The requisite number of variants to screen can be quite large by this criterion; therefore, these libraries are well-suited to high-throughput screening methods (e.g., FACS). For site-directed libraries, tools such as TopLib (Nov, 2012; Acevedo-Rocha et al., 2015) calculate the number of clones required for screening to uncover the top one or two performing variants within a defined confidence interval. The number obtained through this calculation is significantly lower than the number required for full library coverage. This time-saving approach can accelerate protease engineering campaigns, especially when screens are lower throughput.
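The difference between these screening targets can be sketched with textbook sampling formulas. The code below is not the algorithm used by TopLib or the other cited tools; it assumes every variant is equally likely to be drawn (ignoring codon redundancy and synthesis bias), and the function names are hypothetical.

```python
import math


def clones_for_expected_coverage(n_variants: int, fraction: float = 0.95) -> int:
    """Clones to sample so that, on average, `fraction` of variants are seen."""
    return math.ceil(-n_variants * math.log(1.0 - fraction))


def clones_for_full_coverage(n_variants: int, confidence: float = 0.95) -> int:
    """Clones to sample so that every variant is seen with probability `confidence`.

    Uses the Poisson approximation P(all seen) ~ exp(-V * exp(-L / V)).
    """
    return math.ceil(n_variants * math.log(n_variants / -math.log(confidence)))


def clones_for_top_hit(n_variants: int, top_k: int = 2, confidence: float = 0.95) -> int:
    """Clones to sample so that at least one of the `top_k` best variants is seen."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - top_k / n_variants))


V = 32 ** 2  # two NNK-randomized sites, counted at the codon level
print("one of the top two variants:", clones_for_top_hit(V))
print("95% of variants on average: ", clones_for_expected_coverage(V))
print("every variant, 95% certain:  ", clones_for_full_coverage(V))
```

Under these assumptions, finding one of the two best variants takes about half as many clones as sampling 95% of the library, and several-fold fewer than guaranteeing that no variant is missed, which is the saving the paragraph above describes.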

If bioinformatics and other established data guide the choice of protease and mutagenesis approach, choosing a protease screening strategy depends almost entirely on the hoped-for outcome. Out of respect for the immutable logic of the First Law of Directed Evolution, screening conditions should closely match the final conditions in which the protease must perform. If such conditions are impossible to replicate (e.g., to obtain proteolytic activity in a tissue), approximations can often suffice. Here we illustrate these principles by surveying recent in vitro (Fig. 3) and in vivo (Fig. 4) strategies for engineering protease specificity, catalysis, or stability.

Figure 3. In vitro protease evolution methods.

(a) Computational tools, (b) unnatural amino acid substrates, and (c) non-native screening conditions allow for the tailoring of protease activity, specificity, and stability.

Figure 4. In vivo protease evolution methods.

(a) In E. coli, protease variants can be displayed or expressed in the cytosol. Methods for screening protease activity include fluorogenic peptides, fluorogenic proteins, and conditional phage replication. (b) In yeast, protease variants can be cytosolically expressed or localized to target membranes for fluorescence- or FRET-based selections.

Engineering Protease Specificity

Engineering protease specificity represents a common goal in both research and medicine. Protease biotherapeutics typically require high specificity to avoid off-target toxicities (Craik et al., 2011). Similarly, proteases featuring predictable and consistent cut sites form the basis of protein mass spectrometry (MS) and proteomics (Giansanti et al., 2016). Here, we survey directed evolution platforms designed to reprogram protease specificity.

In the past decade, the cell-based evolution of protease specificity started with proteases displayed on the surfaces of cells to avoid intracellular toxicity. The E. coli surface protease, OmpT, was subjected to FACS-based selections and counterselections with labeled peptide substrates (Fig. 4a). The expansion of OmpT’s substrate specificity at the P1 and P1’ positions allowed first-in-class cleavage of the Glu-Arg peptide bond (Varadarajan et al., 2008a, 2009). Subsequent platforms expanded the approach to include proteases not native to the host organism. For instance, the tobacco etch virus protease (TEVp, C04.004) is commonly used to develop protease engineering platforms due to its high substrate specificity, utility in biotechnology, and minimal cellular toxicity (Renicke et al., 2013; Yi et al., 2013; Dickinson et al., 2014; Carrico et al., 2016; Packer et al., 2017). Yeast endoplasmic sequestration screening (YESS), for example, was used to evolve TEVp variants capable of selective cleavage of substrates with non-native P1 residues (Fig. 4b) (Yi et al., 2013). Recent methods continue the trend of engineering recombinant proteases in prokaryotic and eukaryotic hosts.

Host selection is critical, as certain organisms include the post-translational machinery required for the proper folding and activity of a protease. The yeast-based PreCISE method, for instance, can introduce the post-translational modifications (PTMs) found in kallikrein 7 (S01.300); this human protease requires multiple disulfide bonds and N-linked glycosylation. Reengineering kallikrein 7’s P1 preference from Tyr to Phe enabled cleavage of the Alzheimer’s disease-associated Aβ peptide (Guerrero et al., 2016) (Fig. 4b). Since human-derived proteins are less likely to be immunogenic (De Groot and Scott, 2007), evolving human proteases provides a route to their development as biotherapeutics.

New experimental methods are driving development of ambitious protease engineering projects (Table 1), including some reviewed previously (Guerrero et al., 2017). A prescriptive guideline of enzyme engineering is to choose a starting catalyst exhibiting a small amount of target activity (Romero and Arnold, 2009; Packer and Liu, 2015). As discussed below, recent engineering studies targeting protease specificity have worked around this guideline in a variety of ways.

Table 1.

Recent protease engineering strategies and outcomes.

| Platform (name) | Goal | Protease | Host | Variants screened per round | Library strategy | Outcome | Reference |
|---|---|---|---|---|---|---|---|
| E. coli display and screening with fluorophore-labeled peptides | Alter specificity (F/L/W/Y↓ to F/L/W/Y/N↓) | Chymotrypsin | E. coli | 10^7 | Focused mutagenesis (NNS at 11 positions; 6 rounds), DNA shuffling (4 rounds), and random mutagenesis (4 rounds) | Generated cleavage activity on non-native substrate; improved catalytic efficiency for non-native versus native substrate by 3.5-fold. | Varadarajan (2019) |
| Phage-assisted continuous evolution in vivo (PACE) | Alter specificity (ENLYFQ↓S to HPLVGH↓M) | TEVp | E. coli & M13 bacteriophage | 10^10* | Random mutagenesis (PACE mutation plasmid) & focused mutagenesis (NNK at 11 positions; 2500 generations) | Generated cleavage activity on non-native substrate; 8-fold greater catalytic efficiency for native versus non-native substrate. | Liu (2017) |
| Substrate sandwiched between GFP and quenching peptide in vivo | Alter specificity (DEVD↓G to VEID↓G) while maintaining all caspase-7 exosites | Caspase-7 | E. coli | 10^5 | Focused mutagenesis (NNS at 3 positions, Q/A/C/D at 1 position; 1 round) | Generated cleavage activity on non-native substrate; equivalent catalytic efficiency for native versus non-native substrate. | Hardy (2016) |
| In vitro screening with peptide-quenched fluorophore | Alter specificity (K/R↓ to K/R/Cit↓) | Trypsin | IVTT | 10^6 | Focused mutagenesis (NNS at 4 positions; 1 round) | Generated cleavage activity on non-native substrate; improved catalytic efficiency for non-native versus native substrate by 11-fold. | Paegel (2016) |
| Light-induced de-caging of substrate in vivo | Increase activity | TEVp | S. cerevisiae | 10^6 | Random mutagenesis (4 mutations/gene; 3 rounds) | 2.8-fold higher kcat/Km compared to wtTEVp; 6-fold improvement in kapp (proxy for kcat) | Ting (2020) |
| Substrate fused to aggregation-prone peptide in vivo | Increase activity | 3Cp | E. coli | 10^7 | Random mutagenesis (7 mutations/gene; 3 rounds) | 3.7-fold improvement in specific activity compared to wt3Cp | Löfblom (2019) |
| In vitro screening | Increase activity | Subtilisin | B. subtilis | 10^3 | Focused mutagenesis (NNK at 9 positions; 1 round) | 300% increase in protease performance with polymer compared to wtSubtilisin | Schwaneberg (2018) |
| Plate-based screening in vivo | Improve stability | PT121 | E. coli | 10^3 | Random mutagenesis (1 round) and focused mutagenesis (NNK at 3 positions; 1 round) | 2- to 3.5-fold improved half-life in organic solvents compared to wtPT121 | Chen (2020) |
| Plate-based screening in vivo | Improve stability | Neutral protease I | P. pastoris | 10^3 | Random mutagenesis (2 rounds) | 2.8-fold improved specific activity at pH 5.5 compared to wtNPI | Hu (2020) |
| Agar plate-based screening in vivo | Improve stability | Dehairing alkaline protease | B. subtilis | 10^5 | Random mutagenesis (2 rounds) | 3-fold increase in catalytic efficiency at 15 °C compared to wtDHAP | Feng (2018) |
| In vitro validation of in silico mutations | Improve stability | SUMO protease Ulp1 | E. coli | 10^1 | Computational mutagenesis (Rosetta) | 12-fold higher maximum solubility compared to wtUlp1 | Bahl (2018) |
| In vitro screening | Improve stability | Pro1437 | E. coli | 10^5 | Random mutagenesis (7 mutations/gene; 2 rounds) | 4.8-fold higher soluble expression compared to wtPro1437 | Li (2017) |

Diverse in vitro, in vivo, and in silico platforms evolve protease specificity, activity, and stability. Platforms featuring continuous mutagenesis, selection, and replication are indicated with an asterisk (*).

Protease substrates are readily genetically modified, allowing for the generation of substrates that feature both native and target residues. Such chimeric substrates can act as evolutionary footholds to bridge native and target activity. Liu & coworkers demonstrated the feasibility of this substrate walking method by engineering TEVp to cleave a recognition sequence that retained only a single amino acid from its original substrate (Packer et al., 2017). In their phage-assisted continuous evolution (PACE) platform, the evolving protease is delivered to E. coli cells through phage infection. Within the cell, substrate cleavage is linked to the transcription of a bacteriophage gene essential for infection (Fig. 4a). Therefore, only phage harboring an active protease will generate more phage.

More than 2500 generations of continuous evolution against seven chimeric substrates yielded a TEVp variant capable of cleaving the target substrate, a peptide loop in the pro-inflammatory cytokine IL-23. The evolved TEVp exhibited an 8-fold greater catalytic efficiency for the native over the target substrate, likely due to the lack of negative selections against the native substrate. Additional gel-based assays confirmed the engineered TEVp cleaves an endogenous IL-23 substrate as predicted with some off-target cleavage activity. Indeed, characterizing the substrate profile of the evolved TEVp revealed a loss of selectivity compared to wild-type (wtTEVp). Encouragingly, the evolved protease inhibited IL-23-mediated immune signaling in primary cultures of murine splenocytes. Though PACE requires a delicately balanced and challenging experimental setup, it is a powerful platform for engineering protease specificity and even predicting protease resistance to drugs (Dickinson et al., 2014).

Recently, simultaneous negative selections have been added to positive selections in PACE (termed PANCE) to improve its selection for protease specificity. Three BoNT proteases (BoNT/X, BoNT/F, and BoNT/E) were reprogrammed to selectively cleave native or non-native target substrates with therapeutic potential. Employing chimeric substrates to tune the stringency of positive and negative selections yielded proteases with changes in specificity ranging from 218- to >10^7-fold (Blum et al., 2021). These results confirm the benefits of including negative selections in directed evolution efforts, even for a powerful evolutionary system.

Other intracellular platforms also expand the substrate preferences of proteases. For instance, Hardy & coworkers reported an engineering campaign designed to alter the specificity of caspase-7 while maintaining all caspase-7 exosites. Their selection method coupled proteolysis with separation of GFP from a quenching peptide (Hill et al., 2016). By co-expressing structure-guided libraries of caspase-7 variants with this GFP-quencher substrate, four caspase-7 variants with caspase-6 substrate cleavage activity were identified in a single round of mutagenesis and selections by FACS (Fig. 4a). Each caspase-7 variant featured similar catalytic efficiencies against caspase-7 and caspase-6 cognate peptide substrates. The expansion of substrate specificity likely resulted from a choice to omit negative selections for caspase-7 activity. However, differences in cleavage activity emerged when variants were screened against full-length substrates, indicating the important role of substrate binding exosites in the substrate specificity of caspases. This system provides an accessible method for evolving protease specificity in E. coli and potentially other hosts.

If the target substrate for proteolytic cleavage includes a post-translationally modified or a non-canonical amino acid, very high-throughput in vitro methods can overcome the evolutionary barrier of not having a protease with starting activity. For instance, Paegel & coworkers engineered citrulline-specific cleavage into trypsin (S01.151) to address the need for better MS tools to dissect PTMs (Tran et al., 2016). Citrulline results from the deimination of arginine, an amino acid recognized and cleaved by trypsin. As citrulline is not readily genetically encoded, the authors synthesized a bisamide rhodamine substrate that fluoresces upon cleavage (Fig. 3b). By tethering this substrate to both a bead and the DNA of the evolving protease, 1.3 million trypsin variants were compartmentalized and expressed by in vitro transcription/translation (IVTT). The resulting droplets were then sorted for citrulline cleavage activity by FACS. The best variant, trypsin+cit, featured six amino acid substitutions and efficiently cleaved citrulline-containing peptides and proteins in MS assays.

Mapping glycan PTMs represents another important goal in proteomics. Varadarajan & coworkers engineered trypsin’s homolog, chymotrypsin (S01.152), to recognize and cleave unmodified asparagine residues to reveal potential N-linked glycosylation sites (Ramesh et al., 2019). A FACS-based E. coli display platform identified chymotrypsiN, which features eight added mutations and enables the identification of glycosylated asparagines for MS assays (Fig. 4a). Chymotrypsin and trypsin derive their substrate specificity from several loops; therefore, both Paegel’s and Varadarajan’s mutagenesis strategies implemented high-diversity loop libraries and very high-throughput screening. Unlike trypsin+cit, chymotrypsiN was the result of positive (Asn-peptide) and negative (Tyr-peptide) FACS selections. Despite this experimental difference, trypsin+cit and chymotrypsiN exhibit similarly reduced cleavage of wild-type amino acids arginine and tyrosine, respectively, compared to their target substrates (Table 1). These engineered enzymes join existing examples of how PTM-targeting proteases can expand the MS reagent toolbox (Knight et al., 2007; Varadarajan et al., 2008b) and enable the discovery of unknown biological or pathological roles of PTMs.
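The logic of combining a positive with a negative selection can be sketched as a two-channel sorting gate. This is a generic illustration rather than the gating scheme of either study: the `Event` fields, thresholds, and signal values are hypothetical, and it assumes each cell reports cleavage of the target and the counter-selection substrate in separate fluorescence channels.

```python
from typing import NamedTuple


class Event(NamedTuple):
    clone_id: str
    target_signal: float       # fluorescence from the target (positive) substrate
    off_target_signal: float   # fluorescence from the counter-selection substrate


def sort_gate(events: list[Event], target_min: float, off_target_max: float) -> list[str]:
    """Keep clones that cleave the target but leave the off-target substrate intact."""
    return [
        e.clone_id
        for e in events
        if e.target_signal >= target_min and e.off_target_signal <= off_target_max
    ]


# Hypothetical cytometry events in arbitrary fluorescence units.
events = [
    Event("inactive", 40.0, 35.0),
    Event("promiscuous", 900.0, 850.0),
    Event("wild-type-like", 60.0, 950.0),
    Event("selective", 880.0, 70.0),
]

print(sort_gate(events, target_min=500.0, off_target_max=200.0))
# A positive-only gate also collects the promiscuous clone.
print(sort_gate(events, target_min=500.0, off_target_max=float("inf")))
```

Dropping the upper bound on the off-target channel recovers a positive-only selection, which illustrates how omitting a negative selection can admit variants with broadened rather than switched specificity.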

The above examples demonstrate the power of existing methods to re-shape the specificities of proteases with peptide-length recognition sequences. However, cleavage of peptide-sized substrates cannot always predict an engineered protease’s activity on full-length proteins (Hill et al., 2016; Packer et al., 2017). Since substrate specificity is critical to protease-based biotherapeutics, further platform development—including accommodation of longer, folded, or post-translationally modified substrates—is required to reliably engineer specific proteases.

Engineering Protease Catalytic Efficiency

Engineering a highly specific or selective enzyme can sacrifice its catalytic rate. For the goal of attaining sufficient catalytic speeds, screening must include consideration of catalytic efficiency. Recently, several platforms have emerged for engineering the catalytic efficiency of proteases. This field is still emerging, and most examples feature model proteases with unmodified, peptide-sized recognition sequences.

For example, certain polymers improve the catalytic activity of proteases (Gaertner and Puigserver, 1992; Kurinomaru et al., 2014). Schwaneberg & coworkers sought to improve such enzyme-polymer interactions through protease engineering to boost the catalytic activity of the B. lentus alkaline subtilisin (S08.003) (J. Thiele et al., 2018). Molecular dynamics simulations identified nine protease residues in contact with two polyelectrolyte polymers known to increase the activity and stability of S08.003 (Fig. 3a). Site-saturation mutagenesis of these residues revealed four beneficial substitutions that increased proteolytic performance by 300% compared to wild-type S08.003 with no polymer. Multiple orthogonal protease cleavage assays evaluated conditions expected for the evolved S08.003 variant. However, a lack of kinetic characterization limits the understanding of how the added mutations improved catalysis compared to the wild-type protease. Such kinetic experiments are not always possible with milk-based substrates used to assess nonspecific proteolytic activity. Additionally, the engineered protease resulted from a single round of mutagenesis and screening. Investigating combinations of favorable mutations through DNA shuffling could potentially further improve protease performance. This study illustrates how in silico hypotheses investigated through protein engineering can improve protease catalysis.

For cell-based selections, controlling substrate availability can improve catalytic efficiency. Towards this aim, Löfblom & coworkers sandwiched the substrate recognition sequence of the 3C protease from coxsackievirus B3 (C03.011) between a fluorescent protein and the aggregation-prone Aβ peptide (Meister et al., 2019). If the rate of proteolysis exceeds the rate of aggregation, the GFP stays folded, and the cell can be isolated by FACS (Fig. 4a). While the authors report a 3.7-fold improvement in activity, it is unclear how the evolved variants compare to the wt3C protease without full kinetic characterization. Additionally, the rate of aggregation is not readily tunable, precluding its broad application to very slow or very fast proteases. Still, this platform has the advantage of selecting for both solubility as well as cleavage activity.
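Why a fixed aggregation rate limits the usable range of protease activities can be seen with a simple kinetic competition model. The sketch below is an assumption-laden simplification, not a model from the cited study: it treats cleavage and aggregation as competing first-order processes on the same reporter pool, and the rate constants are hypothetical and in arbitrary units.

```python
def rescued_fraction(k_cleave: float, k_agg: float) -> float:
    """Fraction of reporter cleaved before it aggregates.

    Assumes cleavage and aggregation are competing first-order processes
    acting on the same pool of uncleaved reporter.
    """
    return k_cleave / (k_cleave + k_agg)


K_AGG = 1.0  # hypothetical aggregation rate constant, arbitrary units

for k in (0.001, 0.01, 1.0, 100.0, 1000.0):
    print(f"k_cleave={k:g}  rescued={rescued_fraction(k, K_AGG):.3f}")
```

In this picture the readout discriminates well only when the cleavage rate is comparable to the aggregation rate; variants far slower all look dark and variants far faster all look bright, so a 10-fold improvement at either extreme barely changes the signal.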

Stringent control of protease localization and substrate access over time in vivo can also result in improved proteolytic activity. For example, Ting & coworkers developed an intricate yeast-based system where incident light temporarily induces protease-substrate dimerization and uncages the substrate for cleavage (Fig. 4b). Selection stringency is elegantly tunable through the duration of light exposure, a readily controlled and reproducible condition. Thus, only the proteases with the highest overall cleavage rate will be selected (Sanchez and Ting, 2020).
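How the length of the light window tunes stringency can be illustrated with a first-order cleavage model. This is a sketch under stated assumptions, not the analysis from the cited work: it assumes the fraction of substrate cleaved during the window follows 1 - exp(-k_app * t), and the rate constants and window lengths are hypothetical.

```python
import math


def fraction_cleaved(k_app: float, window_s: float) -> float:
    """Fraction of substrate cleaved during a light window of window_s seconds."""
    return 1.0 - math.exp(-k_app * window_s)


def enrichment(k_fast: float, k_slow: float, window_s: float) -> float:
    """Signal ratio between a faster and a slower variant for a given window."""
    return fraction_cleaved(k_fast, window_s) / fraction_cleaved(k_slow, window_s)


K_FAST, K_SLOW = 0.02, 0.01  # hypothetical apparent rate constants, s^-1

for window in (10.0, 60.0, 600.0):
    print(f"window={window:>5.0f} s  enrichment={enrichment(K_FAST, K_SLOW, window):.2f}")
```

Under this model a short window keeps the signal ratio close to the ratio of rate constants, whereas a long window lets both variants run to completion and erases the difference, which is consistent with shorter light exposure acting as a more stringent selection.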

Applying this system’s temporal control resulted in changes to both Km and kapp (kcat could not be calculated due to substrate solubility limits), with several top-performing variants showing changes in Km only. Therefore, this platform is best suited for increasing catalytic efficiency, not necessarily kcat. Notably, employing an in cellulo platform matches the environment required for other cellular applications of TEVp in yeast such as FLARE (Wang et al., 2017). This platform can be additionally modified to address multiple aspects of protease activity. Libraries of substrates, for instance, could characterize a protease’s substrate profile. Implementing a second substrate and reporter could allow for engineering protease specificity as well as catalysis. Additionally, linking proteolysis to conditional growth could allow continuous evolution in mutator strains of yeast such as the OrthoRep system (Ravikumar et al., 2018).

Enhancing an enzyme’s catalytic activity remains perhaps the most challenging aspect of protein engineering (Goldsmith and Tawfik, 2017). Chemical intuition about molecular recognition can effectively guide altering an enzyme’s specificity, but such logic often doesn’t readily apply to tinkering with catalytic rates, for example. The examples outlined here, however, illustrate panache in tackling this challenge. Notably, several approaches sift through large collections of mutant enzymes to allow more “shots on goal” in response to the unpredictability of enzyme activity screening. A continuous evolution system, such as OrthoRep (Ravikumar et al., 2018), introduces new mutations after each round of selection; such recursive mutagenesis can build upon even weak initial improvements in catalytic rates. Notably, the novelty of the systems surveyed here suggests the beginning of a breakthrough era allowing scientists to more readily dial in the catalytic activities of their enzymes.

Engineering Protease Stability

Evolving thermo- or solvent-stable proteases represents some of the first examples of protein engineering (Pantoliano et al., 1987; Chen and Arnold, 1991, 1993; Zhao and Arnold, 1999). Industrial proteases in particular must function in non-native environments, often at extreme pH values, temperatures, or in the presence of organic solvents and detergents. Biochemical and computational techniques have expanded the possibilities for evolution of stable proteases.

Choosing library sites to improve stability while maintaining catalytic activity is less straightforward than targeting substrate-binding residues. For this task, many groups apply random mutagenesis together with careful selection of the starting protease and activity screen (Fig. 3c). For example, Feng & coworkers needed proteases with high activity at low temperatures for energy-efficient leather processing. The authors subjected the mesophilic dehairing alkaline protease from B. pumilus (DHAP, S08.005) to error-prone PCR followed by two screens for temperature-dependent milk-hydrolytic activity. Seven proteases from a library of 30,000 variants exhibited two-fold increased caseinolytic activity at 15 °C. Combining individual mutations generated a DHAP variant with three-fold improved catalytic efficiency and similar thermostability to wtDHAP (Zhao and Feng, 2018). Though the gains are modest, the authors’ approach has the tremendous benefit of being quite straightforward and easy to implement.

Application of random mutagenesis also recently improved the acid stability of A. oryzae neutral protease I (M36.003) 2.8-fold for soy sauce fermentation (Hu et al., 2020) and the solvent stability of P. aeruginosa PT121 (M04.005) 3.5-fold for aspartame production (Zhu et al., 2020). Error-prone PCR and screening of Pro1437 (M48.UPC), a metalloprotease extracted from an oil-polluted mud flat, resulted in four-fold improved soluble expression and six-fold improved activity. Importantly, the industrial utility of engineered Pro1437 was demonstrated through its ability to remove blood stains from cloth during a low-temperature wash cycle with detergent (Gong et al., 2017). These examples illustrate the power of error-prone PCR and screening to successfully engineer protease stability in just a few evolutionary rounds.

Complementing the random mutagenesis examples above, computational approaches can also engineer protease stability. Bahl & coworkers identified the need for small ubiquitin-like modifier (SUMO) proteases with higher solubility and stability to increase their utility as biochemical reagents. Starting with the SUMO protease Ulp1 (C48.001), the authors developed a computational approach to identify polar residues to replace surface-exposed hydrophobic residues (Fig. 3a). Four variants mutated at ten sites predicted to have increased solubility and stability were validated in vitro with precipitation assays, kinetic characterization, and biophysical methods. The best variant, Ulp1_R3, features a 12-fold increased maximum soluble concentration, a similar melting temperature to wtUlp1, and a two-fold increase in catalytic efficiency. At six of the ten mutated sites, their algorithm predicted several favorable substitutions. Investigating these sites through focused libraries or DNA shuffling could reveal potential epistatic effects for further improving solubility. Final variants were not subjected to specificity profiling to determine if they retain the wild-type enzyme’s specificity; however, the small changes to the enzyme’s kinetic parameters suggest that its substrate binding and catalysis were not greatly disrupted (Lau et al., 2018). These results demonstrate the potential for leveraging structural information and computational tools to guide the design of proteases with increased solubility.
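The first step of such a design, flagging surface-exposed hydrophobic residues as candidates for polar substitution, can be sketched in a few lines. This is only an illustrative sketch and not the algorithm used by Lau et al.; the hydrophobic residue set, the relative solvent accessibility (RSA) inputs, the 0.4 cutoff, and the toy sequence are all assumptions made for the example.

```python
# Hypothetical sketch: flag surface-exposed hydrophobic residues.
# The residue set and the RSA cutoff are illustrative assumptions.
HYDROPHOBIC = set("AVILMFW")


def surface_hydrophobics(sequence, rsa, cutoff=0.4):
    """Return (1-based position, residue) pairs for hydrophobic residues
    whose relative solvent accessibility is at or above the cutoff."""
    if len(sequence) != len(rsa):
        raise ValueError("sequence and rsa must have the same length")
    return [
        (i + 1, aa)
        for i, (aa, acc) in enumerate(zip(sequence, rsa))
        if aa in HYDROPHOBIC and acc >= cutoff
    ]


# Toy input (not real Ulp1 data): one RSA value per residue.
toy_sequence = "MLKDV"
toy_rsa = [0.10, 0.60, 0.80, 0.50, 0.45]
candidates = surface_hydrophobics(toy_sequence, toy_rsa)
```

In practice, the RSA values would come from a solved or modeled structure, and each flagged position would still need a scoring step to choose which polar substitutions are tolerated.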

Characterization of evolved proteases

The engineering stage of protease development often focuses on a single aspect of enzymatic function, such as catalytic activity, specificity, or stability. Narrowing promising candidates to true lead molecules, however, requires a more holistic view of each protease. As with protease evolution, protease characterization is guided by the final application. For example, a protease engineered for stability in laundry detergent will be expected to act on a variety of protein stains in the presence of chelators and surfactants. Alternatively, a protease drug may be expected to cleave only a single therapeutic bond in a specific type of cell. In addition to specificity, characterizing properties such as expression scalability, kinetic performance, compatibility with formulation reagents, or immunoreactivity distinguishes the top performing proteases. Of particular importance is understanding what bonds an engineered protease will or will not cleave, either as peptides or full substrates.

Nonspecific cleavage activity

Nonspecific cleavage activity can be detrimental to the development of proteases as biotherapeutics (off-target toxicity), biotechnology reagents (reduced efficiency), and research tools (confounding or complicating results). Even when not explicitly engineering for protease specificity, mutations can still result in unintentional changes to the protease’s substrate specificity landscape. Empirically investigating the substrate specificity landscape through specificity profiling allows for critically understanding an engineered protease’s utility.

Many methods can dissect protease specificity. For example, biological display methods allow selections with high-diversity substrate libraries to analyze protease specificity (Matthews and Wells, 1993; Diamond, 2007; Chen et al., 2019). The recent addition of next-generation sequencing (NGS) analyses has enabled more in-depth characterization of a protease’s substrate preference at each residue of its recognition sequence with phage (Packer et al., 2017; Kretz et al., 2018; Zhou et al., 2020) and yeast display (Li et al., 2017; Pethe et al., 2019). While molecular display uniquely allows enrichment of protease substrates from vast libraries, such techniques cannot unequivocally identify the scissile bond. Therefore, display-based substrate profiling methods are most useful when combined with knowledge of the protease’s substrate recognition sequence.
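The per-position analysis of substrate preference described above can be sketched as a comparison of residue frequencies before and after selection. This is a minimal illustration, assuming aligned, equal-length substrate sequences read from a naive library and from a post-selection pool; the function names, the pseudocount, and the toy sequences are hypothetical and are not taken from the cited studies.

```python
import math
from collections import Counter


def position_frequencies(peptides):
    """Return one {residue: frequency} dict per position of the aligned peptides."""
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        freqs.append({aa: n / len(peptides) for aa, n in counts.items()})
    return freqs


def log2_enrichment(selected, naive, pseudo=1e-3):
    """Per-position log2 ratio of residue frequency in the selected pool
    versus the naive library; the pseudocount avoids division by zero."""
    sel = position_frequencies(selected)
    nai = position_frequencies(naive)
    if len(sel) != len(nai):
        raise ValueError("pools must contain peptides of the same length")
    scores = []
    for s, n in zip(sel, nai):
        residues = set(s) | set(n)
        scores.append(
            {aa: math.log2((s.get(aa, 0.0) + pseudo) / (n.get(aa, 0.0) + pseudo))
             for aa in residues}
        )
    return scores


# Toy pools (invented for illustration only).
naive_pool = ["AAQS", "GLQG", "ENQS", "TVLF"]
selected_pool = ["ENQS", "ENQG", "ELQS"]
scores = log2_enrichment(selected_pool, naive_pool)
```

Positive scores mark residues enriched by selection at a given position and negative scores mark depleted ones. As noted above, such a profile describes the recognition sequence but does not by itself locate the scissile bond.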

Substrate profiling

MS-based protease substrate profiling complements studies of protease specificity by directly revealing the peptide bonds cleaved in proteomic samples (Schilling and Overall, 2008; Schilling et al., 2011). Recent MS techniques trend towards maintaining as much of the endogenous identity of the sample as possible (Vidmar et al., 2017; Nguyen et al., 2018). MS-based approaches for substrate profiling can even reveal turnover rates on different peptide substrates with isobaric peptide labeling (Lapek et al., 2019). Protease profiling strategies were reviewed in detail by Bogyo and coworkers (Chen et al., 2019), and further discussion of them is beyond the scope of this review.

As with directed evolution screens, assessing protease selectivity is most relevant when the specificity screen matches the final context in which the protease will perform. This applies to both the specificity screening conditions and the substrates tested. Furthermore, the optimal screening substrate might not reflect the protease’s native substrate preferences. For example, phage- and yeast-display methods may not recapitulate endogenous PTMs that could affect protease recognition. MS techniques using endogenous proteomes address this issue but are also imperfect; the sites available for cleavage during substrate screening can be shielded or buried in full-length, folded protein targets. Therefore, we recommend pairing specificity profiling with careful kinetic and biochemical characterization of the top substrate candidates as full-length, fully processed proteins.

Emerging applications for engineered proteases

As surveyed here, engineered proteases find broad utility in the laboratory, in medicine, and in various commercial applications. Their formidable properties keep finding new uses, and are driving rapidly growing fields such as synthetic biology. Here, we discuss two emerging applications with an eye towards inspiring new uses for this class of enzymes.

Engineered proteases in gene circuits

Synthetic biological or gene circuits aim to program cells to perform desired functions such as disease detection, synthesis of valuable products, and self-assembly into new tissues. Proteases offer specific regulation of multiple cellular targets, enabling the construction of highly complex synthetic biological circuits (Chung and Lin, 2020). Recently, the SPOC platform (split-protease-cleavable orthogonal-CC-based) successfully demonstrated fast-responding Boolean logic functions in cellulo by splitting proteases engineered with controllable dimerization domains (Fink et al., 2019) (Fig. 5a). Similarly, the CHOMP system (circuits of hacked orthogonal modular proteases) demonstrated that proteases engineered with degrons can be programmed to perform logic functions and induce cell death upon Ras oncogene activation (Gao et al., 2018) (Fig. 5b).

Figure 5. Emerging applications for engineered proteases.

(a) The SPOC platform achieves chemically induced proteolysis through the fusion of dimerization domains (grey) to split proteases (teal, blue). Here, luciferase output (yellow) is achieved upon the addition of rapamycin and abscisic acid (ABA). (b) Engineering orthogonal protease cleavage sites (circles) into split proteases allows for regulation through proteolysis in the CHOMP system. Here, reconstituting a split TEVp (teal) induces programmed cell death through caspase-3 (Casp3, purple) in response to activation of the Ras oncogene by SOS. Inhibiting TEVp with a TEVp-controlled split TVMVp (pink) reduces off-target effects. (c) The ClpXP protease unfolds then digests its substrate (pink). The FRET signal increases when labeled substrate lysines (red) are in proximity to a labeled protease cysteine (yellow). The peptide can be identified by its Lys and Cys locations. (d) In enzyme-catalyzed Edman degradation, the N-terminal amine of a peptide substrate reacts with PITC. In the active site, nucleophilic attack by the PITC sulfur atom forms an anilinothiazolinone derivative. The active site histidine then helps release a phenylthiohydantoin product (pink).

In gene circuits, irreversible proteolysis provides user-defined, rapid-onset, diverse signals that can direct endogenous cellular processes. For instance, Liu & coworkers recently developed several protease-based circuits for regulating metabolic flux in E. coli, improving production of an influenza drug precursor molecule six-fold (Gao et al., 2019). Moreover, protease gene circuits can facilitate fluorescence-guided surgery with a protease-based AND gate (Widen et al., 2020) and light-up ultrasound imaging (Lakshmanan et al., 2020). The protease-based circuits discussed here predominantly rely on the orthogonal cleavage activities offered by plant potyvirus proteases, including the highly specific TEVp. Engineering new protease specificities or tuning existing activities could further adapt gene circuits for translational medicine and other goals in synthetic biology.

Engineered proteases for next-gen protein sequencing

Protease engineering can also accelerate breakthroughs in protein sequencing. Extending NGS capabilities to protein sequencing, for example, could open new possibilities in research and medicine, as reviewed previously (Callahan et al., 2020). One approach employs the ATP-dependent ClpXP protease from E. coli to unfold then proteolyze proteins for a single-molecule approach to FRET-based protein fingerprinting (Fig. 5c). Engineering cysteines into the protease domains of the ClpXP complex enables bioconjugation of a donor fluorophore using thiol-maleimide chemistry (van Ginkel et al., 2018). The target protein for sequencing is treated to fluorescently label every lysine or cysteine residue with a FRET acceptor. As ClpXP digests the target protein, the appearance of each lysine or cysteine residue is analyzed by FRET, and their spacing inferred by timing the FRET events. Patterns of lysine and cysteine residues can be sufficient to characterize proteomes (Yao et al., 2015); however, the approach requires prior knowledge of the locations for lysine and cysteine residues in each protein. Additionally, ClpXP only recognizes substrates tagged with certain leader peptides, preventing its use with endogenous proteomes. Expanding the substrate promiscuity of ClpXP through protease engineering could make this system applicable to broad protein sequencing.
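The fingerprinting idea can be illustrated with a toy lookup: each protein is reduced to the order and spacing of its lysine and cysteine residues, and an observed pattern is matched against a reference set. This is a simplified sketch under idealized assumptions (every labeled residue is detected, and spacing is recovered exactly in residue units rather than inferred from FRET timing); the function names and the two toy sequences are hypothetical.

```python
def labeled_events(sequence, labeled=("K", "C")):
    """Return (residue, 0-based position) for each labeled residue, in order."""
    return [(aa, i) for i, aa in enumerate(sequence) if aa in labeled]


def spacing_pattern(sequence, labeled=("K", "C")):
    """Reduce a sequence to (order of labeled residues, gaps between them)."""
    events = labeled_events(sequence, labeled)
    order = "".join(aa for aa, _ in events)
    gaps = tuple(b - a for (_, a), (_, b) in zip(events, events[1:]))
    return order, gaps


def identify(observed, reference):
    """Return names of reference proteins whose pattern equals the observed one."""
    return [name for name, seq in reference.items()
            if spacing_pattern(seq) == observed]


# Toy reference set (invented sequences, for illustration only).
reference = {"protA": "MKTACGGK", "protB": "MCTAKGGC"}
observed = ("KCK", (3, 3))  # Lys, Cys, Lys, each separated by three residues
matches = identify(observed, reference)
```

The `reference` dictionary stands in for the prior knowledge of lysine and cysteine locations that the approach requires. A realistic matcher would also need to tolerate missed labels and noisy spacing estimates.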

A second protein sequencing platform features an enzymatic approach to the classic Edman degradation. First developed in 1950, Edman degradation begins with labeling the N-terminus of a target polypeptide with phenyl isothiocyanate (PITC) (Edman et al., 1950). High heat and acidic, anhydrous conditions selectively release the N-terminal amino acid as an identifiable PITC derivative, generating a new N-terminus. Repeated cycles of labeling and cleavage yield a protein sequence. While chemical Edman degradation suffers from low efficiency and harsh conditions, an enzymatic approach could offer tunable catalysis in mild, aqueous solutions. As a proof of concept, mutating the catalytic cysteine from cruzain (C01.075) to glycine accompanied by four computationally derived substitutions creates Edmanase. The modified active site accepts PITC and its sulfur atom as the nucleophile for N-terminal cleavage of the peptide substrate (Fig. 5d). Then, substrate-assisted catalysis can achieve stepwise removal of N-terminal amino acids from the sequencing target. The catalytic efficiencies for different side chains, however, spanned several orders of magnitude with low kcat values overall (Borgo and Havranek, 2015). One or more of the protease engineering strategies presented here could enhance the kinetic properties of Edmanase towards a viable protein sequencing platform.

Conclusions

The examples surveyed here reveal current trends in protease engineering. Small-scale, carefully curated mutagenesis and screening platforms can yield equivalent results to massive screening efforts for existing activity. For instance, screens of high-diversity, randomized, protease libraries (Meister et al., 2019; Sanchez and Ting, 2020) yielded improvements in catalytic efficiency equivalent to a focused engineering strategy (Thiele et al., 2018). Similarly, engineering protease stability while maintaining native activity is possible with screening throughputs orders of magnitude below fluorescence cytometry or phage-based evolution (Table 1). Conversely, high-throughput methods tackle formidable evolutionary challenges, such as generating activity on a non-native substrate.

Structure-guided focused mutagenesis libraries generally produce higher-fitness proteases compared to randomized variants. This especially applies to proteases with extensive substrate-binding sites such as trypsin (Tran et al., 2016), chymotrypsin (Ramesh et al., 2019), and BoNT (Blum et al., 2021). Indeed, the authors of this review also attempted large-scale screening efforts with the highly specific BoNT protease. However, a carefully curated selection of mutations, with some residues identified by epPCR, more efficiently reshaped the enzyme’s specificity from SNAP25 to SNAP23 (Dyer et al., 2021). Thus, random mutagenesis can identify nonobvious residues influencing protease specificity and catalysis (Tran et al., 2016; Packer et al., 2017; Ramesh et al., 2019), and is the standard approach to improving protease stability. Interestingly, the random mutagenesis strategies surveyed here are rarely followed up with focused mutagenesis. As demonstrated for engineering protease specificity, targeting such hot spots with saturation mutagenesis libraries can optimize each mutated residue, thus further improving the protease. Additionally, interfacing machine-learning approaches (Yang et al., 2019) with protease screening data could accelerate directed evolution.

Introducing new protease specificity is difficult, but possible (Hill et al., 2016; Tran et al., 2016; Packer et al., 2017; Ramesh et al., 2019). Flipping specificity from the native to the new substrate, however, is particularly challenging. In the examples discussed here, the proteases engineered to have altered specificity typically retained their native cleavage activity. This observation supports the concept that promiscuous enzymatic activity oftentimes precedes selectivity during evolution (Khersonsky et al., 2006). Further negative screens or selections, such as those demonstrated for chymotrypsin (Ramesh et al., 2019) or BoNT (Blum et al., 2021), would likely remove undesired activity. Computational tools can also guide the elimination of native cleavage activity in proteases (Gordon et al., 2012). Therefore, the approaches discussed here find new-to-nature cleavage activities that can serve as starting points for evolving truly selective proteases for biomedical or synthetic biology applications.

Thoroughly characterizing the activity of a final engineered protease is crucial to understanding its limits and utility. For instance, kinetic characterization evaluates rates of proteolysis over a range of substrate or inhibitor concentrations. Such kinetic experiments reveal important protease behavior, including catalytic efficiency, substrate affinity, maximum cleavage rate, substrate inhibition, and allostery. These results can in turn provide valuable sequence-function hypotheses for further engineering. Importantly, if screens used in protease engineering feature non-natural components (fluorophores, small peptides, fusion proteins, etc.), cleavage of the endogenous target substrate(s) should be assessed by SDS-PAGE or MS. In brief, the relevant context for the protease’s application should always be a foremost consideration.
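As an illustrative sketch, several of the kinetic quantities named above can be related through the standard Michaelis-Menten rate law and its common substrate-inhibition extension. These are textbook models rather than the analysis of any study cited here; the function names, parameter names, units, and example values are assumptions chosen for the example.

```python
def michaelis_menten(s, kcat, km, enzyme):
    """Initial rate v = kcat * [E] * [S] / (KM + [S])."""
    return kcat * enzyme * s / (km + s)


def substrate_inhibition(s, kcat, km, ki, enzyme):
    """Rate with substrate inhibition: v = kcat * [E] * [S] / (KM + [S] + [S]^2 / Ki)."""
    return kcat * enzyme * s / (km + s + s ** 2 / ki)


def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat / KM."""
    return kcat / km


# Hypothetical parameters: kcat in 1/s, concentrations in micromolar.
kcat, km, ki, enzyme = 10.0, 5.0, 50.0, 0.01
vmax = kcat * enzyme                                      # maximum cleavage rate
v_half = michaelis_menten(km, kcat, km, enzyme)           # rate at [S] = KM
v_mm = michaelis_menten(500.0, kcat, km, enzyme)          # near saturation
v_inhibited = substrate_inhibition(500.0, kcat, km, ki, enzyme)
```

At a substrate concentration equal to KM, the Michaelis-Menten rate is half of the maximum rate, and at high substrate concentrations the substrate-inhibition model falls below the simple model. Fitting measured initial rates to these forms is one way to extract the parameters and compare engineered variants.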

Pairing screening conditions with the protease’s ultimate performance objective ensures engineering success more than any other factor. Proteases evolved outside of cells, for example, are well-matched to applications in cleaning products (Gong et al., 2017; Thiele et al., 2018; Zhao and Feng, 2018), protein preparation (Lau et al., 2018), or peptide digestion (Tran et al., 2016; Ramesh et al., 2019). Similarly, proteases required to perform in complex biological environments are best engineered in vivo (Hill et al., 2016; Packer et al., 2017; Hu et al., 2020; Sanchez and Ting, 2020; Blum et al., 2021). Fortunately, screens focused on a single aspect of protease activity often include solubility and stability as selection criteria within the screening environment. However, confronting protease variants with final performance conditions during screens avoids unanticipated activity, selectivity, stability, or other issues.

Directing proteolysis within human cells is an attractive goal for biotherapeutics, as demonstrated by the therapeutic successes of BoNTs (Fonfria et al., 2018). Effectively evolving therapeutic proteases remains a challenge in the field of protease engineering. Existing platforms are restricted to bacteria or yeast, inherently limiting which proteases and substrates are targetable. To address this, recently developed systems for directed evolution in mammalian cells (Berman et al., 2018; English et al., 2019) or entirely new methods could be modified to evolve proteases. For instance, an evolution platform could link virus production with proteolysis, similar to PACE, for high-throughput, continuous evolution of proteases in mammalian cells. Ideally these mammalian platforms would mimic the protease’s ultimate performance conditions by accommodating full-length, folded human proteases and substrates with appropriate PTMs.

The expanding utility of proteases in commercial processes, therapeutics, and research continues to inspire elegant and diverse protein engineering platforms. Specifically, recent advances in protease screening methodologies present solutions to challenging evolutionary barriers with capabilities exceeding nature’s. Notably, a wide array of platforms offers accessible routes to engineering protease activity. Robust systems for characterizing protease cleavage can illuminate the molecular basis of substrate recognition as inspiration for further engineering. Additionally, emerging applications of proteases in synthetic biology and biotechnology sustain the demand for novel, tailored peptide cleavage activities. We hope this review inspires future advances in protease engineering technology to address the most challenging problems in biotechnology and biomedicine.

Highlights.

  • Proteases are widely engineered for use in commercial processes, therapeutics, and research.

  • Recent engineering strategies generate proteases with tailored activities.

  • Key engineering targets include protease specificity, catalytic efficiency, and stability.

  • Challenges and opportunities inform the next generation of protease engineering.

Acknowledgements:

We thank Gabriela Salcedo, Hariny Isoda, Jason Garrido, and Yasmeen Rodriguez for helpful conversations. We gratefully acknowledge support from Allergan Aesthetics, an AbbVie Company (108201209000).

Footnotes

Declaration of Interests: The authors’ research on protease engineering was funded by Allergan Aesthetics, an AbbVie Company. A provisional patent for engineered variants of Botulinum neurotoxin serotype A protease has been filed with the authors as co-inventors.

References

  1. Acevedo-Rocha CG, Reetz MT, and Nov Y (2015). Economical analysis of saturation mutagenesis experiments. Sci. Rep. 5, 10654.
  2. Berman CM, Papa LJ, Hendel SJ, Moore CL, Suen PH, Weickhardt AF, Doan N-D, Kumar CM, Uil TG, Butty VL, et al. (2018). An Adaptable Platform for Directed Evolution in Human Cells. J. Am. Chem. Soc. 140, 18093–18103.
  3. Bershtein S, and Tawfik DS (2008). Ohno’s model revisited: Measuring the frequency of potentially adaptive mutations under various mutational drifts. Mol. Biol. Evol. 25, 2311–2318.
  4. Blum TR, Liu H, Packer MS, Xiong X, Lee PG, Zhang S, Richter M, Minasov G, Satchell KJF, Dong M, et al. (2021). Phage-assisted evolution of botulinum neurotoxin proteases with reprogrammed specificity. Science 371, 803–810.
  5. Borgo B, and Havranek JJ (2015). Computer-aided design of a catalyst for Edman degradation utilizing substrate-assisted catalysis. Protein Sci. 24, 571–579.
  6. Cadwell RC, and Joyce GF (1992). Randomization of genes by PCR mutagenesis. Genome Res. 2, 28–33.
  7. Callahan N, Tullman J, Kelman Z, and Marino J (2020). Strategies for Development of a Next-Generation Protein Sequencing Platform. Trends Biochem. Sci. 45, 76–89.
  8. Carrico ZM, Strobel KL, Atreya ME, Clark DS, and Francis MB (2016). Simultaneous selection and counter-selection for the directed evolution of proteases in E. coli using a cytoplasmic anchoring strategy. Biotechnol. Bioeng. 113, 1187–1193.
  9. Chen K, and Arnold FH (1991). Enzyme Engineering for Nonaqueous Solvents: Random Mutagenesis to Enhance Activity of Subtilisin E in Polar Organic Media. Bio/Technology 9, 1073–1077.
  10. Chen K, and Arnold FH (1993). Tuning the activity of an enzyme for unusual environments: sequential random mutagenesis of subtilisin E for catalysis in dimethylformamide. Proc. Natl. Acad. Sci. 90, 5618–5622.
  11. Chen S, and Barbieri JT (2007). Multiple pocket recognition of SNAP25 by botulinum neurotoxin serotype E. J. Biol. Chem. 282, 25540–25547.
  12. Chen S, and Barbieri JT (2009). Engineering botulinum neurotoxin to extend therapeutic intervention. Proc. Natl. Acad. Sci. U. S. A. 106, 9180–9184.
  13. Chen MMY, Snow CD, Vizcarra CL, Mayo SL, and Arnold FH (2012). Comparison of random mutagenesis and semi-rational designed libraries for improved cytochrome P450 BM3-catalyzed hydroxylation of small alkanes. Protein Eng. Des. Sel. 25, 171–178.
  14. Chen S, Yim JJ, and Bogyo M (2019). Synthetic and biological approaches to map substrate specificities of proteases. Biol. Chem. 401, 165–182.
  15. Chung HK, and Lin MZ (2020). On the cutting edge: protease-based methods for sensing and controlling cell biology. Nat. Methods 17, 885–896.
  16. Craik CS, Page MJ, and Madison EL (2011). Proteases as therapeutics. Biochem. J. 435, 1–16.
  17. Diamond SL (2007). Methods for mapping protease specificity. Curr. Opin. Chem. Biol. 11, 46–51.
  18. Dickinson BC, Packer MS, Badran AH, and Liu DR (2014). A system for the continuous directed evolution of proteases rapidly reveals drug-resistance mutations. Nat. Commun. 5.
  19. Dyer RP, Isoda HM, Salcedo GS, Speciale G, Fletcher MH, Le LQ, Liu Y, Malik SZ, Vazquez-Cintron EJ, Chu AC, et al. (2020). Reengineering the Specificity of the Highly Selective Clostridium botulinum Protease via Directed Evolution. BioRxiv 2020.09.29.319145.
  20. Edman P, Högfeldt E, Sillén LG, and Kinell P-O (1950). Method for Determination of the Amino Acid Sequence in Peptides. Acta Chem. Scand. 4, 283–293.
  21. English JG, Olsen RHJ, Lansu K, Patel M, White K, Cockrell AS, Singh D, Strachan RT, Wacker D, and Roth BL (2019). VEGAS as a Platform for Facile Directed Evolution in Mammalian Cells. Cell 178, 748–761.e17.
  22. Ferla MP (2016). Mutanalyst, an online tool for assessing the mutational spectrum of epPCR libraries with poor sampling. BMC Bioinformatics 17.
  23. Fink T, Lonzarić J, Praznik A, Plaper T, Merljak E, Leben K, Jerala N, Lebar T, Strmšek Ž, Lapenta F, et al. (2019). Design of fast proteolysis-based signaling and logic circuits in mammalian cells. Nat. Chem. Biol. 15, 115–122.
  24. Firth AE, and Patrick WM (2005). Statistics of protein library construction. Bioinformatics 21, 3314–3315.
  25. Fonfria E, Maignel J, Lezmi S, Martin V, Splevins A, Shubber S, Kalinichev M, Foster K, Picaut P, and Krupp J (2018). The Expanding Therapeutic Utility of Botulinum Neurotoxins. Toxins (Basel). 10, 208.
  26. Gaertner HF, and Puigserver AJ (1992). Increased activity and stability of poly(ethylene glycol)-modified trypsin. Enzyme Microb. Technol. 14, 150–155.
  27. Gao C, Hou J, Xu P, Guo L, Chen X, Hu G, Ye C, Edwards H, Chen J, Chen W, et al. (2019). Programmable biomolecular switches for rewiring flux in Escherichia coli. Nat. Commun. 10, 1–12.
  28. Gao XJ, Chong LS, Kim MS, and Elowitz MB (2018). Programmable protein circuits in living cells. Science 361, 1252–1258.
  29. Giansanti P, Tsiatsiani L, Low TY, and Heck AJR (2016). Six alternative proteases for mass spectrometry–based proteomics beyond trypsin. Nat. Protoc. 11, 993–1006.
  30. van Ginkel J, Filius M, Szczepaniak M, Tulinski P, Meyer AS, and Joo C (2018). Single-molecule peptide fingerprinting. Proc. Natl. Acad. Sci. 115, 3338–3343.
  31. Goldsmith M, and Tawfik DS (2012). Directed enzyme evolution: Beyond the low-hanging fruit. Curr. Opin. Struct. Biol. 22, 406–412.
  32. Goldsmith M, and Tawfik DS (2017). Enzyme engineering: reaching the maximal catalytic efficiency peak. Curr. Opin. Struct. Biol. 47, 140–150.
  33. Gong B-L, Mao R-Q, Xiao Y, Jia M-L, Zhong X-L, Liu Y, Xu P-L, and Li G (2017). Improvement of enzyme activity and soluble expression of an alkaline protease isolated from oil-polluted mud flat metagenome by random mutagenesis. Enzyme Microb. Technol. 106, 97–105.
  34. Gordon SR, Stanley EJ, Wolf S, Toland A, Wu SJ, Hadidi D, Mills JH, Baker D, Pultz IS, and Siegel JB (2012). Computational design of an α-gliadin peptidase. J. Am. Chem. Soc. 134, 20513–20520.
  35. De Groot AS, and Scott DW (2007). Immunogenicity of protein therapeutics. Trends Immunol. 28, 482–490.
  36. Guerrero JL, O’Malley MA, and Daugherty PS (2016). Intracellular FRET-based Screen for Redesigning the Specificity of Secreted Proteases. ACS Chem. Biol. 11, 961–970.
  37. Guerrero JL, Daugherty PS, and O’Malley MA (2017). Emerging technologies for protease engineering: New tools to clear out disease. Biotechnol. Bioeng. 114, 33–38.
  38. Hill ME, Macpherson DJ, Wu P, Julien O, Wells JA, and Hardy JA (2016). Reprogramming Caspase-7 Specificity by Regio-Specific Mutations and Selection Provides Alternate Solutions for Substrate Recognition. ACS Chem. Biol. 11, 1603–1612.
  39. Hu Y, Li T, Tu Z, He Q, Li Y, and Fu J (2020). Engineering a recombination neutral protease I from Aspergillus oryzae to improve enzyme activity at acidic pH. RSC Adv. 10, 30692–30699.
  40. Thiele MJ, Davari MD, König M, Hofmann I, Junker NO, Mirzaei Garakani T, Vojcic L, Fitter J, and Schwaneberg U (2018). Enzyme–Polyelectrolyte Complexes Boost the Catalytic Performance of Enzymes. ACS Catal. 8, 10876–10887.
  41. Khersonsky O, Roodveldt C, and Tawfik DS (2006). Enzyme promiscuity: evolutionary and mechanistic aspects. Curr. Opin. Chem. Biol. 10, 498–508. [DOI] [PubMed] [Google Scholar]
  42. Knight ZA, Garrison JL, Chan K, King DS, and Shokat KM (2007). A remodelled protease that cleaves phosphotyrosine substrates. J. Am. Chem. Soc. 129, 11672–11673. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Kretz CA, Tomberg K, Van Esbroeck A, Yee A, and Ginsburg D (2018). High throughput protease profiling comprehensively defines active site specificity for thrombin and ADAMTS13. Sci. Rep. 8, 1–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Kurinomaru T, Tomita S, Hagihara Y, and Shiraki K (2014). Enzyme hyperactivation system based on a complementary charged pair of polyelectrolytes and substrates. Langmuir 30, 3826–3831. [DOI] [PubMed] [Google Scholar]
  45. Lakshmanan A, Jin Z, Nety SP, Sawyer DP, Lee-Gosselin A, Malounda D, Swift MB, Maresca D, and Shapiro MG (2020). Acoustic biosensors for ultrasound imaging of enzyme activity. Nat. Chem. Biol. 16, 988–996. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Lapek JD, Jiang Z, Wozniak JM, Arutyunova E, Wang SC, Joanne Lemieux M, Gonzalez DJ, and O’Donoghue AJ (2019). Quantitative multiplex substrate profiling of peptidases by mass spectrometry. Mol. Cell. Proteomics 18, 968–981. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Lau Y-TK, Baytshtok V, Howard TA, Fiala BM, Johnson JM, Carter LP, Baker D, Lima CD, and Bahl CD (2018). Discovery and engineering of enhanced SUMO protease enzymes. J. Biol. Chem. 293, 13224–13233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Li F, Chen J, Leier A, Marquez-Lago T, Liu Q, Wang Y, Revote J, Smith AI, Akutsu T, Webb GI, et al. (2019). DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites. Bioinformatics 36, 1057–1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Li Q, Yi L, Marek P, and Iverson BL (2013). Commercial proteases: Present and future. FEBS Lett. 587, 1155–1163. [DOI] [PubMed] [Google Scholar]
  50. Li Q, Yi L, Hoi KH, Marek P, Georgiou G, and Iverson BL (2017). Profiling Protease Specificity: Combining Yeast ER Sequestration Screening (YESS) with Next Generation Sequencing. ACS Chem. Biol. 12, 510–518. [DOI] [PubMed] [Google Scholar]
  51. Matthews DJ, and Wells JA (1993). Substrate phage: Selection of protease substrates by monovalent phage display. Science (80-.). 260, 1113–1117. [DOI] [PubMed] [Google Scholar]
  52. Meister SW, Hendrikse NM, and Löfblom J (2019). Directed evolution of the 3C protease from coxsackievirus using a novel fluorescence-assisted intracellular method. Biol. Chem. 400, 405–415. [DOI] [PubMed] [Google Scholar]
  53. Modarres HP, Mofrad MR, and Sanati-Nezhad A (2018). ProtDataTherm: A database for thermostability analysis and engineering of proteins. PLoS One 13, e0191222. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Nguyen MTN, Shema G, Zahedi RP, and Verhelst SHL (2018). Protease Specificity Profiling in a Pipet Tip Using “charge-Synchronized” Proteome-Derived Peptide Libraries. J. Proteome Res. 17, 1923–1933. [DOI] [PubMed] [Google Scholar]
  55. Nov Y (2012). When Second Best Is Good Enough: Another Probabilistic Look at Saturation Mutagenesis. Appl. Environ. Microbiol. 78, 258–262.
  56. Packer MS, and Liu DR (2015). Methods for the directed evolution of proteins. Nat. Rev. Genet. 16, 379–394.
  57. Packer MS, Rees HA, and Liu DR (2017). Phage-assisted continuous evolution of proteases with altered substrate specificity. Nat. Commun. 8.
  58. Pantoliano MW, Ladner RC, Bryan PN, Rollence ML, Wood JF, and Poulos TL (1987). Protein Engineering of Subtilisin BPN′: Enhanced Stabilization through the Introduction of Two Cysteines to Form a Disulfide Bond. Biochemistry 26, 2077–2082.
  59. Patrick WM, and Firth AE (2005). Strategies and computational tools for improving randomized protein libraries. Biomol. Eng. 22, 105–112.
  60. Pethe MA, Rubenstein AB, and Khare SD (2019). Data-driven supervised learning of a viral protease specificity landscape from deep sequencing and molecular simulations. Proc. Natl. Acad. Sci. U. S. A. 116, 168–176.
  61. Püllmann P, Ulpinnis C, Marillonnet S, Gruetzner R, Neumann S, and Weissenborn MJ (2019). Golden Mutagenesis: An efficient multi-site-saturation mutagenesis approach by Golden Gate cloning with automated primer design. Sci. Rep. 9, 10932.
  62. Qu G, Li A, Sun Z, Acevedo-Rocha CG, and Reetz MT (2019). The Crucial Role of Methodology Development in Directed Evolution of Selective Enzymes. Angew. Chemie Int. Ed.
  63. Ramesh B, Abnouf S, Mali S, Moree WJ, Patil U, Bark SJ, and Varadarajan N (2019). Engineered ChymotrypsiN for Mass Spectrometry-Based Detection of Protein Glycosylation. ACS Chem. Biol. 14, 2616–2628.
  64. Ravikumar A, Arzumanyan GA, Obadi MKA, Javanpour AA, and Liu CC (2018). Scalable, Continuous Evolution of Genes at Mutation Rates above Genomic Error Thresholds. Cell 175, 1946–1957.e13.
  65. Rawlings ND, Barrett AJ, Thomas PD, Huang X, Bateman A, and Finn RD (2018). The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res. 46, D624–D632.
  66. Renicke C, Spadaccini R, and Taxis C (2013). A Tobacco Etch Virus Protease with Increased Substrate Tolerance at the P1’ position. PLoS One 8, e67915.
  67. Romero PA, and Arnold FH (2009). Exploring protein fitness landscapes by directed evolution. Nat. Rev. Mol. Cell Biol. 10, 866–876.
  68. Sanchez MI, and Ting AY (2020). Directed evolution improves the catalytic efficiency of TEV protease. Nat. Methods 17, 167–174.
  69. Schilling O, and Overall CM (2008). Proteome-derived, database-searchable peptide libraries for identifying protease cleavage sites. Nat. Biotechnol. 26, 685–694.
  70. Schilling O, Huesgen PF, Barré O, Auf Dem Keller U, and Overall CM (2011). Characterization of the prime and non-prime active site specificities of proteases by proteome-derived peptide libraries and tandem mass spectrometry. Nat. Protoc. 6, 111–120.
  71. Schmidt-Dannert C, and Arnold FH (1999). Directed evolution of industrial enzymes. Trends Biotechnol. 17, 135–136.
  72. Song J, Tan H, Perry AJ, Akutsu T, Webb GI, Whisstock JC, and Pike RN (2012). PROSPER: An Integrated Feature-Based Tool for Predicting Protease Substrate Cleavage Sites. PLoS One 7, e50300.
  73. Song J, Wang Y, Li F, Akutsu T, Rawlings ND, Webb GI, and Chou KC (2019). IProt-Sub: A comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief. Bioinform. 20, 638–658.
  74. Steward L, Brin MF, and Brideau-Andersen A (2020). Novel Native and Engineered Botulinum Neurotoxins. (Springer, Berlin, Heidelberg), pp. 1–27.
  75. Sullivan B, Walton AZ, and Stewart JD (2013). Library construction and evaluation for site saturation mutagenesis. Enzyme Microb. Technol. 53, 70–77.
  76. Sumbalova L, Stourac J, Martinek T, Bednar D, and Damborsky J (2018). HotSpot Wizard 3.0: Web server for automated design of mutations and smart libraries based on sequence input information. Nucleic Acids Res. 46, W356–W362.
  77. Tran DT, Cavett VJ, Dang VQ, Torres HL, and Paegel BM (2016). Evolution of a mass spectrometry-grade protease with PTM-directed specificity. Proc. Natl. Acad. Sci. U. S. A. 113, 14686–14691.
  78. U.S. Food and Drug Administration (2020). Purple Book: Licensed Biological Products.
  79. Varadarajan N, Rodriguez S, Hwang B-Y, Georgiou G, and Iverson BL (2008a). Highly active and selective endopeptidases with programmed substrate specificities. Nat. Chem. Biol. 4, 290–294.
  80. Varadarajan N, Georgiou G, and Iverson BL (2008b). An engineered protease that cleaves specifically after sulfated tyrosine. Angew. Chemie Int. Ed. 47, 7861–7863.
  81. Varadarajan N, Cantor JR, Georgiou G, and Iverson BL (2009). Construction and flow cytometric screening of targeted enzyme libraries. Nat. Protoc. 4, 893–901.
  82. Vidmar R, Vizovišek M, Turk D, Turk B, and Fonović M (2017). Protease cleavage site fingerprinting by label-free in-gel degradomics reveals pH-dependent specificity switch of legumain. EMBO J. 36, 2455–2465.
  83. Wang W, Wildes CP, Pattarabanjird T, Sanchez MI, Glober GF, Matthews GA, Tye KM, and Ting AY (2017). A light- and calcium-gated transcription factor for imaging and manipulating activated neurons. Nat. Biotechnol. 35, 864–871.
  84. Ward OP (2019). Proteases. In Comprehensive Biotechnology, (Elsevier), pp. 604–615.
  85. Widen JC, Tholen M, Yim JJ, Antaris A, Casey KM, Rogalla S, Klaassen A, Sorger J, and Bogyo M (2020). AND-gate contrast agents for enhanced fluorescence-guided surgery. Nat. Biomed. Eng. 1–14.
  86. Yang KK, Wu Z, and Arnold FH (2019). Machine-learning-guided directed evolution for protein engineering. Nat. Methods 16, 687–694.
  87. Yao Y, Docter M, Van Ginkel J, De Ridder D, and Joo C (2015). Single-molecule protein sequencing through fingerprinting: Computational assessment. Phys. Biol. 12, 055003.
  88. Yi L, Gebhard MC, Li Q, Taft JM, Georgiou G, and Iverson BL (2013). Engineering of TEV protease variants by yeast ER sequestration screening (YESS) of combinatorial libraries. Proc. Natl. Acad. Sci. U. S. A.
  89. Zhao H-Y, and Feng H (2018). Engineering Bacillus pumilus alkaline serine protease to increase its low-temperature proteolytic activity by directed evolution. BMC Biotechnol. 18, 34.
  90. Zhao H, and Arnold FH (1999). Directed evolution converts subtilisin E into a functional equivalent of thermitase. Protein Eng. Des. Sel. 12, 47–53.
  91. Zhao H, Giver L, Shao Z, Affholter JA, and Arnold FH (1998). Molecular evolution by staggered extension process (StEP) in vitro recombination. Nat. Biotechnol. 16, 258–261.
  92. Zhou J, Li S, Leung KK, O’Donovan B, Zou JY, DeRisi JL, and Wells JA (2020). Deep profiling of protease substrate specificity enabled by dual random and scanned human proteome substrate phage libraries. Proc. Natl. Acad. Sci. U. S. A. 202009279.
  93. Zhu F, He B, Gu F, Deng H, Chen C, Wang W, and Chen N (2020). Improvement in organic solvent resistance and activity of metalloprotease by directed evolution. J. Biotechnol. 309, 68–74.
