Abstract
As protein engineering grows more salient, many strategies have emerged to alter protein structure and function, with the goal of redesigning and optimizing natural product biosynthesis. Computational tools, including machine learning and molecular dynamics simulations, have enabled the rational mutagenesis of key catalytic residues for enhanced or altered biocatalysis. Semi-rational, directed evolution and microenvironment engineering strategies have optimized catalysis for native substrates and increased enzyme promiscuity beyond the scope of traditional rational approaches. These advances are made possible using novel high-throughput screens, including designer protein-based biosensors with engineered ligand specificity. Herein, we detail the most recent of these advances, focusing on polyketides, non-ribosomal peptides and isoprenoids, including their native biosynthetic logic to provide clarity for future applications of these technologies for natural product synthetic biology.
Keywords: biosynthesis, natural products, peptides, polyketides, synthetic biology, terpenes
Graphical Abstract

Introduction
Complex biosynthetic machinery has evolved in nature to convert small molecule building blocks into structurally diverse secondary metabolite natural products. Natively leveraged as signaling molecules and chemical defenses or offering other competitive advantages, natural products provide significant utility as commercial and industrial chemicals due to their diverse functional profiles as insecticides, herbicides, flavorings, fragrances and biofuels (da Silva and Rodrigues, 2014). Moreover, natural products have given rise to many blockbuster pharmaceuticals, owing to their wide-ranging therapeutic activities, including antibiotic, antiviral and anticancer, and continue to make up nearly one-third of newly approved small molecule therapeutics (Newman and Cragg, 2016). Efforts to enhance the bioactive properties of these privileged scaffolds have yielded potent new analogues, which have dramatically expanded the growing natural products market (Davison and Brimble, 2019).
Yet, accessing natural products and their analogues has proven challenging due to their significant structural complexity, which demands precise chemical control during synthesis. Consequently, leveraging the logic of natural product biosynthetic pathways has emerged as a promising alternative to chemical synthesis, owing to the exquisite selectivity and biocatalytic efficiency of the enzymes involved. However, limitations associated with enzymatic activity and substrate specificity have created critical bottlenecks for accessing natural products and their analogues.
Protein engineering provides an expansive toolbox whereby biosynthetic machinery can be optimized for production of natural products or leveraged to enable the biosynthesis of new-to-nature analogues (Fig. 1). Generally, these engineering efforts can be categorized as rational or derived from directed evolution. Rational design typically relies on structural information or homology models to map and mutate residues that interact with a ligand to alter activity; however, such information is not always readily accessible, and rational design is rarely proficient at predicting distal mutations that may enhance the desired activity (Yang et al., 2020). Conversely, directed evolution, an engineering strategy first developed in the late 1990s, leverages a random mutagenesis approach, mimicking traditional Darwinian evolution, to afford novel or improved protein function (Arnold and Volkov, 1999; Cobb et al., 2013). Notably, this strategy often produces libraries that are too large to be screened for activity using conventional analytical techniques. Despite the limitations and challenges of these approaches, both strategies have been successfully leveraged to afford catalytic improvements and alter substrate or product promiscuity. Moreover, protein engineering has also facilitated the development of synthetic biology tools that have enhanced the biosynthesis of natural products. For instance, metabolite-responsive transcriptional activators and repressors have been engineered for the development of designer biosensor platforms for molecular-targeted high-throughput screening (Mitchler et al., 2021). These, among other advancements in high-throughput screening, have promoted extraordinarily successful metabolic, pathway and host engineering efforts for increasing titers.
Fig. 1.

Leveraging the design-build-test cycle to produce engineered proteins for improved ‘natural’ natural product titers and designer natural product biosynthesis. The engineering of these proteins can proceed through a rational or semi-rational approach or through directed evolution.
This review describes the state-of-the-art in protein engineering to enhance access to members of three of the largest classes of natural products: polyketides, non-ribosomal peptides (NRPs), and isoprenoids. Herein, we survey the native biosynthetic logic of these pathways and highlight key advances in the optimization and diversification of these metabolites from the last 3 years. Within the context of the synthetic biology toolbox, we also consider the impacts of protein engineering on the development of high-throughput screening platforms for the engineering of natural product biosynthesis pathways and the future of the field.
Natural Product Classes and their Biosynthetic Logic
Secondary metabolites are classified based on the biosynthetic logic that converts simple primary metabolite precursors into complex and biologically diverse structures. To date, synthetic biology has primarily targeted three of these classes due to their apparent biosynthetic modularity and relevant bioactivity: polyketides, NRPs and isoprenoids. Here, we will briefly review the biosynthetic logic of these three natural product classes and the reader is directed to several recent review articles for additional information (Süssmuth and Mainz, 2017; Malico et al., 2020a,b).
The scaffolds of polyketides are produced by polyketide synthases (PKSs). There are three types of PKS: type I, which are large and have discrete domains and active sites; type II, which are formed by individual proteins associating together into one complex which forms the polyketide iteratively; and type III, which have a single active site performing all roles of catalysis to form the final product. Of these, type I PKSs, which produce blockbuster drugs such as erythromycin, lovastatin, and epothilone, are the most well-studied. These megaenzymes are composed of discrete catalytic domains, which are organized into modules that are each responsible for a single chain-elongation step (Fig. 2A). Natively, the acyltransferase (AT) domain, in some cases a free enzyme, selects an activated acyl-CoA substrate to serve as a starter or extender unit, which is then shuttled to the acyl carrier protein (ACP) (Helfrich et al., 2019). Notably, the AT acts as a ‘gatekeeper’ domain in the producing organism. Therefore, it often demonstrates stringent selectivity for a single acyl-CoA unit to contribute to the production of a single macrolactone in the native environment. The ketosynthase (KS) domain, bearing the growing chain, then catalyzes a decarboxylative thio-Claisen condensation with the ACP-bound extender unit to facilitate a single chain extension. In some modules, tailoring is performed by ketoreductase (KR), dehydratase (DH) and enoylreductase (ER) domains resulting in various levels of reduction of the ketide unit. Once the chain has undergone its final extension, the thioesterase (TE) domain then cleaves it from the assembly line and commonly cyclizes the product to yield a macrolactone core, which can then be decorated using tailoring enzymes to furnish a final product (Barajas et al., 2017; Malico et al., 2020a,b).
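The dependence of the ketide unit's reduction level on the tailoring domains present in a module can be sketched as a simple lookup. This is a minimal illustration only: the function name and the example modules are hypothetical, every listed domain is assumed to be catalytically active, and stereochemistry is ignored.

```python
# Hypothetical helper: map the optional reductive domains of a type I PKS
# module to the functionality left at the beta-carbon after one extension.
# Assumes every listed domain is active; real modules can carry inactive domains.

def beta_carbon_state(domains):
    """Return the beta-carbon functionality for a module's domain set."""
    present = set(domains)
    # KR, DH and ER act in sequence; each one needs the product of the previous.
    if "KR" not in present:
        return "beta-keto"
    if "DH" not in present:
        return "beta-hydroxy"
    if "ER" not in present:
        return "alpha,beta-alkene"
    return "saturated methylene"

# Illustrative module architectures (not taken from a specific PKS).
example_modules = {
    "minimal": ["KS", "AT", "ACP"],
    "KR only": ["KS", "AT", "KR", "ACP"],
    "KR + DH": ["KS", "AT", "DH", "KR", "ACP"],
    "full reductive loop": ["KS", "AT", "DH", "ER", "KR", "ACP"],
}

for name, domains in example_modules.items():
    print(f"{name}: {beta_carbon_state(domains)}")
```

Reading a module's domain list this way is what makes the assembly line appear collinear: each module's architecture predicts the functionality of one ketide unit in the product.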
Fig. 2.

Biosynthetic logic of chain extension mechanism for (A) polyketides, (B) NRPs and (C) isoprenoids. These extended chains may undergo a variety of fates including cyclization or tailoring to yield the final biosynthetic product.
NRPs, including daptomycin and cyclosporin, commonly produced in bacteria and fungi, are synthesized by dedicated NRP synthetases (NRPSs). NRPSs are categorized into seven types (A–G), contingent on the method of producing non-proteinogenic amino acids for incorporation into their product scaffolds (Süssmuth and Mainz, 2017). Much like type I PKSs, these enzymatic assemblies are generally organized into discrete domains housed in modules. In NRPSs, the adenylation (A) domain selects the appropriate amino acid and links it to the peptidyl carrier protein (PCP or T) domain. The addition of the upstream chain to the new amino acid is catalyzed by the condensation (C) domain, through either an amide or an ester linkage (Fig. 2B). In some modules, tailoring domains epimerize, perform redox chemistry on or otherwise modify the newly added amino acid before the peptide is released by a thioesterase (TE) domain (Süssmuth and Mainz, 2017). While not a focus of this review, ribosomally synthesized and post-translationally modified peptides (RiPPs) are another important class of natural products and have shown amenability to engineering efforts (Hudson and Mitchell, 2018).
Terpenoids are biosynthesized from the isomeric hemiterpene building blocks: isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP). These monomers are synthesized either via the mevalonate (MVA) pathway, which uses six steps to convert acetyl-CoA to hemiterpenes, or through the methylerythritol-4-phosphate (MEP) pathway, which uses pyruvate as a starting material; members of all three domains of life variably use one pathway or the other, and a few use both (Kuzuyama and Seto, 2012). Recently, de novo alcohol-dependent hemiterpene pathways have been described, which utilize promiscuous phosphatases and kinases to furnish both the native hemiterpene monomers, as well as a variety of non-native analogues (Chatzivasileiou et al., 2019; Lund et al., 2019). These monomers can be polymerized by specialized prenyltransferases (PTases), referred to as prenylelongases, to furnish various C5n polyprenyl diphosphate chains, such as geranyl diphosphate (C10), farnesyl diphosphate (C15), and geranylgeranyl diphosphate (C20), which can be subsequently cyclized by terpene synthases and tailored to respectively yield diverse monoterpenes (e.g. limonene), sesquiterpenes (e.g. artemisinin) and diterpenes (e.g. taxol) (Fig. 2C) (Malico et al., 2020a,b). Prenyl diphosphates are also regioselectively appended to various aromatic acceptor molecules including polyketides (e.g. cannabinoids), amino acids (e.g. ergot alkaloids) and NRPs (cyclic dipeptides) via the action of dedicated PTases.
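The C5n bookkeeping of prenyl chain elongation can be made explicit with a short sketch. The function name is hypothetical, and the abbreviations FPP and GGPP are standard shorthand rather than terms used in this review; only the three chain lengths named above are mapped.

```python
# Chain length after head-to-tail elongation: one DMAPP starter (C5)
# plus n IPP extender units (C5 each) gives a C5(n+1) prenyl diphosphate.

PRENYL_DIPHOSPHATES = {
    10: "geranyl diphosphate (GPP)",
    15: "farnesyl diphosphate (FPP)",
    20: "geranylgeranyl diphosphate (GGPP)",
}
TERPENE_CLASSES = {10: "monoterpenes", 15: "sesquiterpenes", 20: "diterpenes"}

def elongation_product(n_ipp_units):
    """Return (carbon count, chain name, downstream terpene class)."""
    carbons = 5 * (1 + n_ipp_units)
    name = PRENYL_DIPHOSPHATES.get(carbons, "other polyprenyl diphosphate")
    terpene_class = TERPENE_CLASSES.get(carbons, "other terpenoids")
    return carbons, name, terpene_class

for n in (1, 2, 3):
    carbons, name, terpene_class = elongation_product(n)
    print(f"DMAPP + {n} IPP -> C{carbons} {name} -> {terpene_class}")
```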
Optimization of Enzymatic Activity for Biosynthesis of Natural Products
Despite millennia of Darwinian evolution of the enzymes responsible for the biosynthesis of natural products from their respective precursors, biosynthetic pathways often require optimization through protein engineering to produce industrially relevant titers of these valuable compounds. To enhance the activity of these pathways, semi-rational and computational approaches have emerged to redesign the component enzymes for enhanced catalysis (Löffler et al., 2017; Buß et al., 2018; Huang et al., 2020a,b).
Traditional methods, involving saturation mutagenesis of residues identified through homology modeling or structural analysis, have proven effective for optimization of kinetic efficiency. For example, the efficiency of CYP76AH15, a cytochrome P450 enzyme involved in forskolin biosynthesis, was improved over 5-fold by a single point mutation identified after finding key substrate recognition ‘hotspots’ based on homology modeling with similar, better-characterized CYP proteins (Forman et al., 2018). Moreover, a single Ser → Cys active site mutation in the PKS thioesterase domain PikIII-TE increased the kcat/KM 4.3-fold for the macrocyclization of its native pentaketide substrate. Following quantum mechanics studies, this increase in catalysis was determined to be a result of a change in preferred mechanism: the transesterification reaction shifted from a two-step process to a concerted one (Koch et al., 2017). In another example, the natural precursor pathway for the potent anticancer agent paclitaxel was engineered by traditional mutagenesis methods. Rather than solely focusing on the rate-limiting CYP enzyme, mutations to the upstream terpene cyclase generated a variant which produced a more stable alternate intermediate, avoiding loss of precursors to degradation. Combined with mutations to the CYP enzyme, an alternative early pathway with a higher yield was developed for the synthesis of a valuable pharmaceutical. This more holistic pathway engineering approach has great promise for other natural products arising from common precursors like isoprenoids (Edgar et al., 2017).
The kcat of amorpha-4,11-diene synthase (ADS), a terpene cyclase catalyzing the first committed step of artemisinin biosynthesis, was increased 5-fold via mutability landscape-guided engineering (Abdallah et al., 2018). A ‘hotspot’ of residues within regions that affect protein function, including both the active site and protein–protein interaction surface, was identified through homology modeling that was then subjected to saturation mutagenesis. The resulting libraries were assayed for product ratio, reaction speed and other desired parameters. This approach has been used to alter the activity of enzymes from other classes (Hecht et al., 2013; van der Meer et al., 2016).
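To give a sense of the library sizes involved in hotspot saturation mutagenesis, the following sketch enumerates every single-residue substitution at a set of chosen positions. The sequence, hotspot positions and function names are all hypothetical and are not taken from the ADS study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 proteinogenic amino acids

def single_site_library(sequence, hotspots):
    """List all single substitutions (e.g. 'T3A') at 1-based hotspot positions."""
    variants = []
    for pos in hotspots:
        wild_type = sequence[pos - 1]
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                variants.append(f"{wild_type}{pos}{aa}")
    return variants

def combinatorial_size(n_sites):
    """Number of sequences if n_sites positions are saturated simultaneously."""
    return len(AMINO_ACIDS) ** n_sites

toy_sequence = "MKTAYIAKQR"   # invented 10-residue sequence
toy_hotspots = [3, 7]         # invented hotspot positions

library = single_site_library(toy_sequence, toy_hotspots)
print(len(library), "single-site variants, e.g.", library[:3])
print(combinatorial_size(len(toy_hotspots)), "variants if both sites vary at once")
```

Single-site libraries grow linearly with the number of hotspots (19 variants per position), whereas simultaneous saturation grows as 20 to the power of the number of sites, which is why restricting mutagenesis to a small hotspot keeps screening tractable.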
Other examples of engineering enzymatic activity leverage factors other than protein primary structure. For instance, altering the microenvironment of the enzyme by increasing substrate concentration around the protein surface has been demonstrated using small enzymes as a proof-of-concept. One such study showed that covalent ligation of enzymes to a scaffold can force accumulation of the substrate near the active site due to the scaffold’s steric effects. In this specific case, horseradish peroxidase and alcohol dehydrogenase were bound to small DNA scaffolds, with the former displaying a 3-fold increase in kcat and an unchanged KM (Lancaster et al., 2018). This approach has been extended to coupled biocatalytic reactions to minimize efficiency loss due to intermediate diffusion by attaching multiple pathway enzymes to a single protein scaffold, thus localizing several metabolic steps (Dueber et al., 2009). DNA has been used as a scaffold to localize related NRPS modules, allowing for increased activity (Huang et al., 2020a,b). Related methods have been used with short complementary peptide tags to assemble isoprenoid-synthesizing enzymes, with an approximate doubling in product titer (Kang et al., 2019; Zhao et al., 2016). For enzymes with a narrow optimum pH range, covalent modification of surface residues with charged polymers has modified the activity in non-optimal conditions: this approach was used to increase the reactivity of cytochrome c at high pH (Zhang et al., 2017).
Another solution that has been used to address engineering challenges is cell-free expression systems. By escaping the confines of in vivo systems, cell-free processes can ignore toxicity and cell-membrane permeability concerns. A recent example used cell-free synthesis to produce cannabinoids, which normally face significant challenges to in vivo heterologous production in microbes due to the toxicity of the intermediate isoprenoid geranyl pyrophosphate (GPP) (Valliere et al., 2019). Complete pathways were introduced to the cell-free matrix to produce GPP. Several PTases were then screened, and NphB was selected for mutagenesis to improve production of cannabigerolic acid (CBGA). Residues were chosen for mutagenesis based on an in silico docking study with the desired precursors, followed by the use of Rosetta to analyze the docking data to model the effect of residue changes on substrate binding (Alford et al., 2015). Protein expression was also improved via a nonane overlay which was continually exchanged and extracted, avoiding protein solubility issues related to high levels of hydrophobic cannabinoid product. The final titer of CBGA was 1.25 g/L over 24 hours, a nearly 140-fold improvement compared to the non-optimized cell-free pathway, highlighting the promise of cell-free methods for systems with toxic intermediates (Valliere et al., 2019).
While some of these approaches have not yet been widely used to modify bioactive natural product biosynthesis, they may prove to be powerful tools for increasing the catalytic activity of the enzymes involved.
Reprogramming Enzyme Specificity and Catalysis toward Non-Natural Products
While the native promiscuity of many enzymes has enabled precursor-directed biosynthesis of new-to-nature analogues, limitations to in situ non-natural precursor availability pose a significant challenge to non-natural biosynthesis efforts (Kalkreuter et al., 2018). Engineering primary metabolism for production of non-native building blocks has recently garnered significant attention, particularly for the production of non-native acyl-CoA substrates, which can be utilized as precursors for polyketides, fatty acids, and isoprenoids. For example, rationally engineering the acetyl-CoA synthetase (ACS) carboxylate binding pocket, guided by I-TASSER modeling, led to a shift in substrate promiscuity toward novel branched-chain carboxylates (Roy et al., 2010; Yang et al., 2014; Sofeo et al., 2019). Engineering ACS to utilize novel carboxylate substrates has the potential to impact end products of metabolism. However, this would also generate alternative acyl-CoA substrates for fatty acid or polyketide biosynthesis allowing the opportunity to explore the flexibility of enzymology and this biosynthetic machinery in vivo (Sofeo et al., 2019).
Computational approaches based on machine learning (ML) have accelerated directed evolution by reducing the combinatorial sequence space that must be explored experimentally. ML-based algorithms take the sequences of a subset of library variants as ‘inputs’ and the measured activities of those variants as ‘outputs’, creating a training set from which statistical models can predict the variants with optimal enzymatic performance (Volk et al., 2020). Exploiting such sequence-function relationships, Wu et al. (2019) used ML-assisted directed evolution to engineer the enantiodivergent activity of a putative nitric oxide dioxygenase to selectively furnish new-to-nature carbon–silicon bond formation with 93% and 79% enantiomeric excess for the two enantiomeric products, respectively, through just two rounds of directed evolution. Such ML strategies could be similarly leveraged to drive engineering efforts of complex biosynthetic pathways to enhance natural product biosynthesis or enable the production of their non-natural analogues.
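The input-output mapping described above can be illustrated with a deliberately small sketch. It is not the workflow of Wu et al. (2019): the reduced alphabet, the parent sequence, the invented additive activity function and the choice of ridge regression on one-hot features are all assumptions made to keep the example self-contained. The model is trained on a parent plus its single-site variants and then ranks the full combinatorial library.

```python
import itertools
import numpy as np

ALPHABET = "AVLF"   # reduced amino acid alphabet (assumption, keeps the toy small)
N_SITES = 3         # number of mutated positions (assumption)
PARENT = "AAA"      # hypothetical parent residues at those positions

def one_hot(variant):
    """Encode a variant as a concatenated one-hot vector (the model 'input')."""
    vec = np.zeros(N_SITES * len(ALPHABET))
    for site, aa in enumerate(variant):
        vec[site * len(ALPHABET) + ALPHABET.index(aa)] = 1.0
    return vec

def toy_activity(variant):
    """Stand-in for a measured activity: a purely additive, invented landscape."""
    effect = {"A": 0.0, "V": 0.5, "L": 1.0, "F": -0.5}
    return sum(effect[aa] * (site + 1) for site, aa in enumerate(variant))

# Training set: the parent plus every single-site substitution (10 variants).
train = [PARENT]
for site in range(N_SITES):
    for aa in ALPHABET:
        if aa != PARENT[site]:
            train.append(PARENT[:site] + aa + PARENT[site + 1:])

X = np.array([one_hot(v) for v in train])
y = np.array([toy_activity(v) for v in train])

# Ridge regression in closed form; the tiny penalty only makes the
# rank-deficient one-hot system invertible.
lam = 1e-6
weights = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Predict the full combinatorial library (4**3 = 64 variants) and rank it.
library = ["".join(v) for v in itertools.product(ALPHABET, repeat=N_SITES)]
predicted = {v: float(one_hot(v) @ weights) for v in library}
ranked = sorted(library, key=predicted.get, reverse=True)
print("top predicted variants:", ranked[:3])
```

Because the invented landscape is additive, ten measurements suffice to rank all 64 variants here. Real fitness landscapes contain epistasis, so practical campaigns typically train on sampled combinatorial variants and retrain over successive rounds.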
Enzymes with stringent precursor selectivity have also been the target of engineering for biosynthesis of non-natural analogues. Traditionally, structural diversification of polyketides has been accomplished by engineering the specificity of the AT domain for the selection of non-native acyl-CoA extender units via mutagenesis or domain swapping (Barajas et al., 2017). Recently, the selectivity of the pikromycin PKS AT domains of PikAIII (PikAT5) and PikAIV (PikAT6) was shifted via homology-guided rational site-directed mutagenesis of the tyrosine residue at position 755 (PikAIII) and position 753 (PikAIV), enabling the first reported precursor-directed biosynthesis of narbonolide derivatives with consecutive non-natural extender unit integration (Fig. 3) (Kalkreuter et al., 2019a,b). Other AT domains, particularly those from PKSs in hosts that produce only a few extender units, display low levels of extender unit promiscuity that can provide a platform for engineering new substrate specificities (Eustáquio et al., 2009; Musiol-Kroll et al., 2017). Recently, molecular dynamics (MD) simulations have assisted in prioritizing specific point mutations for increased substrate promiscuity in modular PKSs. Within the AT domain of erythromycin module 6 (EryAT6), MD simulations were used to identify 13 mutations across 10 residues proximal to the YASH loop or VDVVQP large subunit motif (LSM) which were likely to alter promiscuity. In vitro analyses of these mutations revealed a >10-fold increase in selectivity toward the non-natural extender units ethylmalonyl-CoA, propargylmalonyl-CoA, isopropylmalonyl-CoA and butylmalonyl-CoA. The substrate promiscuity of EryAT6 was further shifted via larger structural motif swaps that targeted the residues highlighted by MD, leaving the AT > 95% unchanged by harnessing inter- and intra-motif interactions while affording novel promiscuities (Kalkreuter et al., 2021).
These works demonstrate a significant step forward for the promise of synthetic biology toward the regioselective diversification of these assembly line derived natural products.
Fig. 3.

Engineered pathways for the production of non-natural polyketide (A and B) and polyketide-NRP hybrid (C) analogues. Locations of the pathway mutations and chimeras are highlighted. The corresponding collinear change in the structure is also highlighted in the non-natural or non-native structure. The respective native pathway and product are shown side-by-side for comparison. The relative percentages of product in competition-based conversion assays (A and B) are shown below each product. Titers obtained with chimeric pathways (C) are shown below each product. See text for references.
Previously, the C domain in NRPSs was thought to play a critical role in proofreading substrate specificity; however, structural biology and bioinformatics approaches have been unsuccessful at identifying specific residues that would be involved in such a role (Süssmuth and Mainz, 2017). Calcott et al. (2020) disputed that the C domain plays a role in substrate specificity: semi-rational mutagenesis revealed that NRPS divergence is primarily driven by recombination of A domains and subdomains, rather than C domains, and these findings were supported by phylogenetic analysis. A domains are highly specific for the substrate they recognize, and this recognition is determined by 10 residues in the substrate-binding pocket (the specificity code). Using directed evolution and genetic selection, the positions in this code that require specific residues and those that tolerate variation were determined. EntF, an NRPS involved in enterobactin biosynthesis, has functional sequence space for L-Ser recognition and thus has the potential for engineering specificity with minimal perturbations by targeting key residues that confer specificity (Throckmorton et al., 2019). Mutagenesis of EntE, an aryl acid A domain within the enterobactin NRPS, was conducted to probe the precise role of active site residues for its specific substrate, 2,3-dihydroxybenzoic acid (DHB) (Ishikawa et al., 2020). Enzyme kinetics and modeling studies of the EntE variants demonstrated that residues 236, 240 and 339 collectively regulate the substrate specificity toward DHB while also permitting activation of the non-native aryl acids 3-hydroxybenzoic acid, 3-aminobenzoic acid, 3-fluorobenzoic acid and 3-chlorobenzoic acid (Ishikawa et al., 2020).
Interrupted A domains have multifunctional enzymatic activity in which they possess the methylation capabilities usually seen in the NRPS methylation (M) domain. As a proof-of-concept for generating novel interrupted A domains for diversification of NRPs, two noncognate M domains, KtzH (MH) and TioS (M35), were inserted into the A domain, Ecm6(A1T1). The novel, fully functional, engineered mono-interrupted A domains emulated natural interrupted A domains by achieving bio-functionality, retaining the same adenylation and methylation functions as their original source, and demonstrating that the identity of the M domain dictates the location of methylation (Lundy et al., 2018). A subsequent attempt to engineer interrupted A domains containing two consecutive M domains did not result in the target methylation pattern but did demonstrate that the region between the a8 and a9 conserved motifs was highly tolerant of artificial insertions and could be considered a privileged location for interrupting A domains without a loss of function, which is impactful for future enzyme engineering (Lundy et al., 2020). To further probe the limits of interrupted A domains, a di-interrupted A domain was engineered by adding a backbone-methylating M35 domain from TioS between the a8 and a9 region of the mono-interrupted A domain, TioN(AaMNAb), which already contained a side chain-methylating MN domain between the a2 and a3 region. This resulted in the first engineered di-interrupted A domain, which was able to carry out three functions: adenylation, backbone N-methylation and side chain S-methylation, while the M domains maintained their regiospecificity (Lundy et al., 2019).
Phylogenetic analyses have also been leveraged to alter the product profiles derived from PKSs and PKS-NRPS hybrids. For example, via an evolution-guided strategy, Peng et al. (2019) successfully interconverted the biosynthetic activities of the aureothin and neoaureothin PKSs via the respective addition and removal of key modules. Similar work was also recently conducted to successfully diversify the modular organization of the antimycin-type cyclic depsipeptides JBIR-06 and neoantimycin (Awakawa et al., 2018). Such analyses have also been critical to revealing the mechanisms for substrate specificity and elucidating evolutionary paths to product diversification (Calcott et al., 2020). While these approaches have often been leveraged with assembly line-based machinery to facilitate non-natural production by leveraging the specificity of other modules or domains, evolutionary analyses may also be used reversibly to reconstruct ancestral, general catalysis enzymes with greater promiscuity. For example, using ancestral sequence reconstruction of the Streptomyces violens class I diterpene cyclase, spiroviolene synthase, a variant with significant promiscuity for the truncated farnesyl diphosphate substrate was obtained (Hendrikse et al., 2018; Schriever et al., 2021). Notably, this promiscuity was achieved via improvements in the kinetic activity of the engineered synthase with the target non-natural substrate.
Engineering Protein-Based Biosensors for Natural Product Synthetic Biology
Engineering biosynthetic pathways, as well as their individual protein components, is often throttled by low-throughput analytical methods for metabolite analysis. Coupling genetic diversity to artificial high-throughput selections or screens has emerged as a method for the rapid development of designer biocatalysts with enhanced or non-native functions (Renata et al., 2015). In particular, metabolite-sensing allosteric transcription factors (aTFs) have been leveraged as the basis for various high-throughput biosensor platforms to expedite the optimization of metabolic pathways and microbial strains (Hossain et al., 2020; Mitchler et al., 2021). The ligand-binding domain (LBD) of the aTF specifically binds to an inducer molecule (Fig. 4), which subsequently induces a conformational change that modulates transcription via the DNA-binding domain (DBD), consequently resulting in the expression of a reporter gene, such as a fluorescent protein (Cheng et al., 2018). More specifically, metabolite-responsive transcriptional regulator proteins and their cognate promoters have gained significant attention as promising tools for enhancing natural product biosynthetic pathways and enabling their dynamic metabolic control (Shen et al., 2016; Siu et al., 2018; Kalkreuter et al., 2019a,b; Lund et al., 2019; Qian et al., 2019; Qiu et al., 2020; Wen et al., 2020).
Fig. 4.

Natural products and their precursors detected by transcription factor-based biosensor platforms.
The robustness of aTF-based biosensors relies on a number of inherent transfer-function properties, including sensitivity (K1/2), dynamic range and effector selectivity, which can be tailored to yield an enhanced biosensor (Taylor et al., 2016; Kasey et al., 2018; Leopoldo et al., 2019; Yu et al., 2019). Although aTFs are a major class of regulatory proteins, few have been engineered to respond to non-natural effectors, likely because altering effector specificity may also disrupt allostery. Variants of the Escherichia coli lac repressor LacI, which are functional in both allosteric states, were captured by utilizing single-site saturation libraries, random mutagenesis and computational mutagenesis in conjunction with an in vivo selection-screening method. With this approach, four new inducers (fucose, gentiobiose, lactitol and sucralose) were identified with comparable specificity and induction compared to the wild-type (WT) inducer, isopropyl β-D-1-thiogalactopyranoside (IPTG). Thus, allostery was not disrupted, suggesting that ligand specificity can be engineered without compromising allostery (Taylor et al., 2016). Similarly, MphR, a promiscuous macrolide-sensing aTF, was engineered to improve its promiscuity toward a variety of natural and non-natural macrolides which were previously poor inducers of the WT MphR (Kasey et al., 2018). The tunability of aTFs is fundamental for subsequent precision engineering of metabolism. Indeed, by building promoter libraries with varying operator sites, it was revealed that there are interdependencies between a biosensor’s dynamic range and response threshold (Mannan et al., 2017). By targeting the LBD for mutagenesis, Snoek et al. (2020) were able to select an aTF variant of BenM for either quantitative (operational and dynamic range) or qualitative (specificity and inversion of function) changes in functionality via user-defined, fluorescence-activated cell sorting (FACS)-based toggled selection methods.
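The relationship between sensitivity and dynamic range can be made concrete with a dose-response sketch. The Hill equation used here is a common empirical description of biosensor transfer functions and is an assumption of this example, not a model specified by the studies cited above; all parameter values and function names are invented.

```python
def hill_response(ligand, basal, maximum, k_half, n=1.0):
    """Reporter output of an inducible aTF biosensor under a Hill model.

    ligand  : inducer concentration (same units as k_half)
    basal   : reporter signal without inducer (leakiness)
    maximum : reporter signal at saturating inducer
    k_half  : inducer concentration giving half-maximal induction (K1/2)
    n       : Hill coefficient (steepness of the response)
    """
    fraction = ligand**n / (k_half**n + ligand**n)
    return basal + (maximum - basal) * fraction

def dynamic_range(basal, maximum):
    """Fold change between fully induced and uninduced output."""
    return maximum / basal

# Invented parameters for two hypothetical sensor variants.
leaky = {"basal": 100.0, "maximum": 1100.0, "k_half": 2.0, "n": 2.0}
tight = {"basal": 40.0, "maximum": 1100.0, "k_half": 2.0, "n": 2.0}

for label, p in (("leaky", leaky), ("tight", tight)):
    half = hill_response(p["k_half"], **p)
    print(label, "output at K1/2:", half,
          "dynamic range:", dynamic_range(p["basal"], p["maximum"]))
```

In this picture, lowering background expression raises the dynamic range without moving K1/2, whereas mutations that alter ligand affinity shift K1/2 along the concentration axis.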
Engineered biosensors will allow the previously described bottlenecks in engineering natural product pathways to be overcome. For example, triacetic acid lactone (TAL) is a polyketide natural product with the potential to be a bio-renewable platform if microbial biosynthetic titers can be improved to be competitive with chemical synthesis. Thus, to improve TAL production, a sensor-reporter system, AraC-TAL1 (P8V, T24I, H80G, Y82L, H93R), was developed based on the E. coli AraC regulatory protein. This was then leveraged in a high-throughput screen to identify variants of 2-pyrone synthase (2-PS) improved for TAL production. Ultimately, plasmid-based overexpression of betT, ompN and pykA led to a 49% increase in TAL product titer (Li et al., 2018). Despite the success of the AraC-TAL1 platform for such engineering efforts, AraC-TAL1 suffers from leaky expression, poor sensitivity and relaxed specificity, including a response to orsellinic acid (OA). Subsequently, the sensitivity and selectivity of AraC-TAL with either TAL or OA has been improved to better facilitate the engineering of PKSs. Directed evolution afforded AraC-TAL12 (AraC-TAL1, P11L), which demonstrated a ~2.3-fold increase in the dynamic range of the biosensor at 4 mM TAL, as well as AraC-TAL14 (AraC-TAL1, L72V), which showed a ~2.5-fold decrease in background GFP levels and enhanced selectivity (1.9) to OA compared to AraC-TAL1. As such, it was used as a parent to furnish AraC-OA8 (P25G, I36F, H81Q, P128S, A140V, Q142L), which displayed a 24-fold improvement in specificity toward OA over TAL, as compared to AraC-TAL1 (Wang et al., 2020). Such selectivity was also developed for TAL compared to OA in other mutant strains. Through these engineering efforts, AraC-based sensors aided in the improvement of TAL product titers and yielded two improved biosensors with shifted ligand specificity, which can be applied to further PKS engineering.
High-throughput assays have also aided efforts to reprogram ligand-specific NRPS modules to accept and process non-native amino acids, resulting in backbone-modified peptides. More specifically, a modified L-Phe-specific TycA module (TycAF, lacking the epimerization domain) was displayed on the surface of yeast cells to enable FACS, which demonstrated that the yeast strain EBY100 produced the minimal TycAF. Furthermore, A-domain catalysis was coupled to the fluorescence readout by using a previously described TycA W239S point mutation, which allowed for the incorporation of O-propargyl-L-Tyr and thus for bioorthogonal ‘click’ labeling of the yeast cells encoding the TycA variants. This allowed random mutagenesis of the TycA active site to be combined with FACS as a high-throughput screen. The resulting variant, TycAβpY, with the A236V mutation and a Cys-Leu-Val β13β14 loop sequence, displayed an inverted substrate preference in favor of (S)-β-Phe while maintaining high catalytic efficiency (Niquille et al., 2018).
Conclusions and Future Directions
Despite the challenges associated with the structural and mechanistic complexity of natural product biosynthesis, the component proteins have been successfully engineered both to enhance the titer of native products and to produce non-natural analogues. Traditional, structure-guided approaches to protein engineering continue to offer remarkable success in the field of natural product biosynthetic engineering, including the ability to selectively produce non-natural analogues of natural products and to modestly enhance native productivity and specificity. Recent advances have harnessed the power of machine learning and directed evolution and offer exciting opportunities to access designer natural products. Although advances in this field have been laborious owing to the low throughput of traditional analytical methods, the development and application of synthetic biology technologies, such as high-throughput screening platforms, are rapidly accelerating progress. As a result of such advances, the choice of which designer natural product analogue to target by protein engineering and synthetic biology has become critical. The development of databases that inventory natural products and their activities could be leveraged to better guide future natural product engineering efforts. For example, TeroKit could be leveraged to rapidly identify relevant isoprenoid scaffolds as starting points for synthetic biology platforms. The cheminformatic tools ‘PKS Enumerator’ and the ‘Synthetic Insight-Based Macrolide Enumerator’ generate macrolactone libraries derived from knowledge of polyketide biosynthesis that can be mined for potential bioactive structures (Zin et al., 2018, 2020; Zeng et al., 2020). Additionally, high-throughput MD simulations that improve the discrimination between active and decoy ligands in protein-ligand docking results are being developed as suitable platforms for synthetic biology (Guterres and Im, 2020).
These computational tools will advance the field by potentially allowing for discrimination between ligand binders and non-binders after engineering. We envision that the future of protein engineering and directed evolution will leverage advancements in computational tools in tandem with engineered aTF biosensors to fully enable the production of designer natural products.
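One common way to quantify how well a scoring method separates binders from non-binders is the area under the receiver operating characteristic curve (ROC AUC). The following Python sketch computes it by direct pairwise comparison. The scores are invented for illustration, higher scores are assumed to indicate stronger predicted binding, and the code is not the workflow of Guterres and Im (2020).

```python
def roc_auc(active_scores, decoy_scores):
    """Probability that a random active outscores a random decoy.

    Equivalent to the area under the ROC curve; ties count as half a win.
    Assumes that a higher score means stronger predicted binding.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for active in active_scores:
        for decoy in decoy_scores:
            if active > decoy:
                wins += 1.0
            elif active == decoy:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical scores for known binders (actives) and presumed non-binders.
actives = [9.1, 8.4, 7.9, 6.0]
decoys = [7.0, 5.5, 5.1, 4.2]

auc = roc_auc(actives, decoys)
print(f"ROC AUC = {auc:.4f}")  # 15 of 16 active/decoy pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. For scoring schemes in which lower (more negative) values indicate better binding, the scores would need to be negated before use.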
Funding
Financial support is provided in part by the National Institutes of Health [grants GM124112 to G.J.W., T32GM133366 to M.M.] and the Thomas Lord Distinguished Professorship Endowment (G.J.W.).
References
- Abdallah, I.I., Van Merkerk, R., Klumpenaar, E., Quax, W.J. (2018) Sci. Rep., 8, 1–11.
- Alford, R.F., Koehler Leman, J., Weitzner, B.D., Duran, A.M., Tilley, D.C., Elazar, A., Gray, J.J. (2015) PLoS Comput. Biol., 11, e1004398.
- Arnold, F.H. and Volkov, A.A. (1999) Curr. Opin. Chem. Biol., 3, 54–59.
- Awakawa, T., Fujioka, T., Zhang, L., Hoshino, S., Hu, Z., Hashimoto, J., Kozone, I. et al. (2018) Nat. Commun., 9, 1–10.
- Barajas, J.F., Blake-Hedges, J.M., Bailey, C.B., Curran, S., Keasling, J.D. (2017) Synth. Syst. Biotechnol., 2, 147–166.
- Buß, O., Rudat, J., Ochsenreither, K. (2018) Comput. Struct. Biotechnol. J., 16, 25–33.
- Calcott, M.J., Owen, J.G., Ackerley, D.F. (2020) Nat. Commun., 11, 1–10.
- Chatzivasileiou, A.O., Ward, V., Edgar, S.M.B., Stephanopoulos, G. (2019) Proc. Natl. Acad. Sci. U. S. A., 116, 506–511.
- Cheng, F., Tang, X.-L., Kardashliev, T. (2018) Biotechnol. J., 13, 1700648.
- Cobb, R.E., Chao, R., Zhao, H. (2013) AIChE J., 59, 1432–1440.
- Davison, E.K. and Brimble, M.A. (2019) Curr. Opin. Chem. Biol., 52, 1–8.
- Dueber, J.E., Wu, G.C., Malmirchegini, G.R., Moon, T.S., Petzold, C.J., Ullal, A.V., Prather, K.L.J., Keasling, J.D. (2009) Nat. Biotechnol., 27, 753–759.
- Edgar, S., Li, F.S., Qiao, K., Weng, J.K., Stephanopoulos, G. (2017) ACS Synth. Biol., 6, 201–205.
- Eustáquio, A.S., McGlinchey, R.P., Liu, Y., Hazzard, C., Beer, L.L., Florova, G., Alhamadsheh, M.M. et al. (2009) Proc. Natl. Acad. Sci. U. S. A., 106, 12295–12300.
- Leopoldo, F.M., Currin, A., Dixon, N. (2019) J. Biol. Eng., 13, 91.
- Forman, V., Bjerg-Jensen, N., Dyekjær, J.D., Møller, B.L., Pateraki, I. (2018) Microb. Cell Fact., 17, 181.
- Guterres, H. and Im, W. (2020) J. Chem. Inf. Model., 60, 2189–2198.
- Hecht, M., Bromberg, Y., Rost, B. (2013) J. Mol. Biol., 425, 3937–3948.
- Helfrich, E.J.N., Ueoka, R., Dolev, A., Rust, M., Meoded, R.A., Bhushan, A., Califano, G. et al. (2019) Nat. Chem. Biol., 15, 813–821.
- Hendrikse, N.M., Charpentier, G., Nordling, E., Syrén, P.-O. (2018) FEBS J., 285, 4660–4673.
- Hossain, G.S., Saini, M., Miyake, R., Ling, H., Chang, M.W. (2020) Trends Biotechnol., 38, 797–810.
- Huang, H.M., Stephan, P., Kries, H. (2020a) Cell Chem. Biol., 28, 221–227.
- Huang, X., Pearce, R., Zhang, Y. (2020b) Bioinformatics, 36, 1135–1142.
- Hudson, G.A. and Mitchell, D.A. (2018) Curr. Opin. Microbiol., 45, 61–69.
- Ishikawa, F., Nohara, M., Nakamura, S., Nakanishi, I., Tanabe, G. (2020) Biochemistry, 59, 351–363.
- Kalkreuter, E., Bingham, K.S., Keeler, A.M., Lowell, A.N., Schmidt, J.J., Sherman, D.H., Williams, G.J. (2021) Nat. Commun., 12, 2193.
- Kalkreuter, E., Carpenter, S.M., Williams, G.J. (2018) The Royal Society of Chemistry, 275–312.
- Kalkreuter, E., Crowetipton, J.M., Lowell, A.N., Sherman, D.H., Williams, G.J. (2019a) J. Am. Chem. Soc., 141, 1961–1969.
- Kalkreuter, E., Keeler, A.M., Malico, A.A., Bingham, K.S., Gayen, A.K., Williams, G.J. (2019b) ACS Synth. Biol., 8, 1391–1400.
- Kang, W., Ma, T., Liu, M., Qu, J., Liu, Z., Zhang, H., Shi, B. et al. (2019) Nat. Commun., 10, 4248.
- Kasey, C.M., Zerrad, M., Li, Y., Cropp, T.A., Williams, G.J. (2018) ACS Synth. Biol., 7, 227–239.
- Koch, A.A., Hansen, D.A., Shende, V.V., Furan, L.R., Houk, K.N., Jiménez-Osés, G., Sherman, D.H. (2017) J. Am. Chem. Soc., 139, 13456–13465.
- Kuzuyama, T. and Seto, H. (2012) Proceedings of the Japan Academy Series B: Physical and Biological Sciences. The Japan Academy.
- Lancaster, L., Abdallah, W., Banta, S., Wheeldon, I. (2018) Chem. Soc. Rev., 47, 5177–5186.
- Li, Y., Qian, S., Dunn, R., Cirino, P.C. (2018) J. Ind. Microbiol. Biotechnol., 45, 789–793.
- Löffler, P., Schmitz, S., Hupfeld, E., Sterner, R., Merkl, R. (2017) PLoS Comput. Biol., 13, e1005600.
- Lund, S., Hall, R., Williams, G.J. (2019) ACS Synth. Biol., 8, 232–238.
- Lundy, T.A., Mori, S., Garneau-Tsodikova, S. (2018) ACS Synth. Biol., 7, 399–404.
- Lundy, T.A., Mori, S., Garneau-Tsodikova, S. (2019) Org. Biomol. Chem., 17, 1169–1175.
- Lundy, T.A., Mori, S., Garneau-Tsodikova, S. (2020) RSC Adv., 10, 34299–34307.
- Malico, A.A., Calzini, M.A., Gayen, A.K., Williams, G.J. (2020a) J. Ind. Microbiol. Biotechnol., 47, 675–702.
- Malico, A.A., Nichols, L., Williams, G.J. (2020b) Curr. Opin. Chem. Biol., 58, 45–53.
- Mannan, A.A., Liu, D., Zhang, F., Oyarzún, D.A. (2017) ACS Synth. Biol., 6, 1851–1859.
- van der Meer, J.Y., Biewenga, L., Poelarends, G.J. (2016) ChemBioChem, 17, 1792–1799.
- Mitchler, M.M., Garcia, J.M., Montero, N.E., Williams, G.J. (2021) Curr. Opin. Biotechnol., 69, 172–181.
- Musiol-Kroll, E.M., Zubeil, F., Schafhauser, T., Härtner, T., Kulik, A., McArthur, J., Koryakina, I. et al. (2017) ACS Synth. Biol., 6, 421–427.
- Newman, D.J. and Cragg, G.M. (2016) J. Nat. Prod., 79, 629–661.
- Niquille, D.L., Hansen, D.A., Mori, T., Fercher, D., Kries, H., Hilvert, D. (2018) Nat. Chem., 10, 282–287.
- Peng, H., Ishida, K., Sugimoto, Y., Jenke-Kodama, H., Hertweck, C. (2019) Nat. Commun., 10, 1–14.
- Qian, S., Li, Y., Cirino, P.C. (2019) Microb. Cell Fact., 18, 18.
- Qiu, X., Xu, P., Zhao, X., Du, G., Zhang, J., Li, J. (2020) Metab. Eng., 60, 66–76.
- Renata, H., Wang, Z.J., Arnold, F.H. (2015) Angew. Chem. Int. Ed., 54, 3351–3367.
- Roy, A., Kucukural, A., Zhang, Y. (2010) Nat. Protoc., 5, 725–738.
- Schriever, K., Saenz-Mendez, P., Rudraraju, R.S., Hendrikse, N.M., Hudson, E.P., Biundo, A., Schnell, R., Syrén, P.-O. (2021) J. Am. Chem. Soc., 143, 3794–3807.
- Shen, H.J., Cheng, B.Y., Zhang, Y.M., Liang, T., Li, Z., Yi Fan, B., Li, X.R., Tian, G.Q., Liu, J.Z. (2016) Metab. Eng., 38, 180–190.
- da Silva, V.C. and Rodrigues, C.M. (2014) Chemical and Biological Technologies in Agriculture. Springer International Publishing.
- Siu, Y., Fenno, J., Lindle, J.M., Dunlop, M.J. (2018) ACS Synth. Biol., 7, 16–23.
- Snoek, T., Chaberski, E.K., Ambri, F., Kol, S., Bjørn, S.P., Pang, B., Barajas, J.F., Welner, D.H., Jensen, M.K., Keasling, J.D. (2020) Nucleic Acids Res., 48, e3.
- Sofeo, N., Hart, J.H., Butler, B., Oliver, D.J., Yandeau-Nelson, M.D., Nikolau, B.J. (2019) ACS Synth. Biol., 8, 1325–1336.
- Süssmuth, R.D. and Mainz, A. (2017) Angew. Chem. Int. Ed., 56, 3770–3821.
- Taylor, N.D., Garruss, A.S., Moretti, R., Chan, S., Arbing, M.A., Cascio, D., Rogers, J.K. et al. (2016) Nat. Methods, 13, 177–183.
- Throckmorton, K., Vinnik, V., Chowdhury, R., Cook, T., Chevrette, M.G., Maranas, C., Pfleger, B., Thomas, M.G. (2019) ACS Chem. Biol., 14, 2044–2054.
- Valliere, M.A., Korman, T.P., Woodall, N.B., Khitrov, G.A., Taylor, R.E., Baker, D., Bowie, J.U. (2019) Nat. Commun., 10, 1–9.
- Volk, M.J., Lourentzou, I., Mishra, S., Vo, L.T., Zhai, C., Zhao, H. (2020) ACS Synth. Biol., 9, 1514–1533.
- Wang, Z., Doshi, A., Chowdhury, R., Wang, Y., Maranas, C.D., Cirino, P.C. (2020) Prot. Eng. Design Select., 33, gzaa027.
- Wen, J., Tian, L., Liu, Q., Zhang, Y., Cai, M. (2020) J. Biotechnol., 320, 80–85.
- Wu, Z., Jennifer Kan, S.B., Lewis, R.D., Wittmann, B.J., Arnold, F.H. (2019) Proc. Natl. Acad. Sci. U. S. A., 116, 8852–8858.
- Yang, G., Miton, C.M., Tokuriki, N. (2020) Protein Sci., 29, 1724–1747.
- Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., Zhang, Y. (2014) Nat. Methods, 12, 7–8.
- Yu, H., Chen, Z., Wang, N., Yu, S., Yan, Y., Huo, Y.X. (2019) Metab. Eng., 56, 28–38.
- Zeng, T., Liu, Z., Zhuang, J., Jiang, Y., He, W., Diao, H., Lv, N. et al. (2020) J. Chem. Inf. Model., 60, 2082–2090.
- Zhang, Y., Wang, Q., Hess, H. (2017) ACS Catal., 7, 2047–2051.
- Zhao, C., Gao, X., Liu, X., Wang, Y., Yang, S., Wang, F., Ren, Y. (2016) J. Agric. Food Chem., 64, 3380–3385.
- Zin, P.P.K., Williams, G., Fourches, D. (2018) J. Cheminform., 10, 53.
- Zin, P.P.K., Williams, G., Fourches, D. (2020) J. Cheminform., 12, 23.
