Redesigning enzymes based on adaptive evolution for optimal function in synthetic metabolic pathways

Yasuo Yoshikuni; Jeffrey A Dietrich; Farnaz F Nowroozi; Patricia C Babbitt; Jay D Keasling

doi:10.1016/j.chembiol.2008.05.006

Author manuscript; available in PMC: 2014 May 22.

Published in final edited form as: Chem Biol. 2008 Jun;15(6):607–618. doi: 10.1016/j.chembiol.2008.05.006

PMCID: PMC4030648 NIHMSID: NIHMS573833 PMID: 18559271

Abstract

Nature has balanced most metabolic pathways such that no one enzyme in the pathway controls the flux through that pathway. However, unnatural or non-native, constructed metabolic pathways may have limited product flux due to unfavorable in vivo properties of one or more enzymes in the pathway. One such example is the mevalonate-based isoprenoid biosynthetic pathway that we previously reconstructed in Escherichia coli. We have used a probable mechanism of adaptive evolution to engineer the in vivo properties of two enzymes in this pathway (hydroxy-3-methylglutaryl-CoA reductase (tHMGR) and many terpene synthases) and thereby eliminate or minimize the bottleneck created by these inefficient or non-functional enzymes. Here, we demonstrate how we significantly improved the productivity (approximately 1000-fold) of this reconstructed biosynthetic pathway using this strategy. We anticipate that this strategy will find broad applicability in the functional construction (or reconstruction) of biological pathways in heterologous hosts.

INTRODUCTION

It is well known that many enzymes are able to catalyze highly specific chemical reactions with surprising accuracy and efficiency (Kraut et al. 2003). In metabolic pathways, these enzymes, each catalyzing different reactions in a series, often cooperate to minimize the unnecessary accumulation of metabolic intermediates while maximizing the production of the final product or optimizing the control of its production. Although natural metabolic pathways tend to be well balanced (no one enzyme controlling flux), constructed (heterologous) metabolic pathways may not be so balanced, particularly when the enzymes for that metabolic pathway come from organisms very different from the host. If the function of any single enzyme were suboptimal, the enzyme would cause a significant bottleneck in the pathway, lower the yield of the final product, and/or limit the growth of the host organism. The flux through any particular step is determined by the expressibility, solubility, stability, substrate specificity, reaction selectivity, and activity (collectively referred to as in vivo properties) of the enzyme catalyzing that reaction as well as the concentration of substrates, co-factors, inhibitors, and products. One might be able to use traditional metabolic engineering techniques (such as expression modulation, promoter engineering, and codon optimization, to name a few) to overexpress a rate-limiting enzyme and thereby improve flux through the pathway. A better strategy, however, would be to improve the in vivo properties of that rate-limiting enzyme so that it need not be overexpressed; the amino acids and energy otherwise spent on overexpression could then be used for other cellular processes. In cases where pathway flux cannot be improved by traditional methods, improving the in vivo properties of the enzyme may be the only way to improve it. Therefore, the in vivo enzyme properties are important determinants of the efficiency of metabolic pathways and should be a major concern in designing and constructing synthetic pathways.

We previously constructed a synthetic mevalonate pathway in E. coli (Supporting Figure 1) and coupled it with a terpene synthase for production of amorpha-4,11-diene (a precursor to the anti-malarial drug artemisinin) (Martin et al. 2003). Since other terpenoids have also found use as drugs, flavors, fragrances, and nutraceuticals, and have potential use as advanced biofuels, this microbial system has become an attractive alternative to extraction from the natural host for production of these chemically complex terpenoids. However, two reaction steps, catalyzed by an N-terminally truncated hydroxy-3-methylglutaryl-CoA (HMG-CoA) reductase (tHMGR) (Donald et al. 1997) and by many of the terpene synthases we have examined (Martin et al. 2001; Reiling et al. 2004), have been identified as the primary bottlenecks in this synthetic pathway due to their unfavorable in vivo properties. That these two steps limited product formation was evident from two observations: expressing amorpha-4,11-diene synthase, an enzyme with excellent in vivo properties, significantly reduced the accumulated FPP and allowed production of large titers of amorpha-4,11-diene (over 200 mg/L), and vastly over-expressing the gene encoding tHMGR improved terpene production (Pitera et al. 2007). Although amorphadiene synthase had favorable in vivo properties, most terpene synthases have much lower activity in vivo (Martin et al. 2001; Reiling et al. 2004). Not only do tHMGR and terpene synthases have poor in vivo properties, but the substrates for these enzymes (HMG-CoA and farnesyl diphosphate (FPP), respectively) also appeared to be toxic and to inhibit the growth of E. coli when they accumulated inside the cell (Martin et al. 2003; Pitera et al. 2007). However, the products of these enzymes (mevalonate and sesquiterpenes, respectively) do not interfere with host growth and are readily secreted into the medium.

The genes for the terpene biosynthetic pathway were encoded in three separate constructs. The MevT operon contains the genes encoding the enzymes in the top part of the pathway (acetyl-CoA to mevalonate: atoB encoding acetoacetyl-CoA thiolase, ERG13 encoding HMG-CoA synthase (HMGS), and the 3’ portion of HMG1 (without the 5’ portion of the gene encoding the membrane-binding N-terminus of the enzyme) encoding tHMGR (Supporting Figure 1A)). The MBIS operon contains the genes encoding the bottom part of the pathway (mevalonate to FPP: ERG12 encoding mevalonate kinase (MK), ERG8 encoding phosphomevalonate kinase (PMK), MVD1 encoding mevalonate diphosphate decarboxylase (MVD), idi encoding isopentenyl diphosphate isomerase, and ispA encoding prenyl diphosphate synthase (Supporting Figure 1B)). The third construct consists of the terpene synthase, which catalyzes the conversion of the prenyl diphosphates (geranyl-, farnesyl-, and geranylgeranyl diphosphates) to various terpenes with different regio- and stereochemistry (Supporting Figure 1C).

An example of a terpene synthase with poor in vivo properties is γ-humulene synthase (HUM). HUM is a sesquiterpene synthase from the gymnosperm Abies grandis that is known to produce at least 52 different sesquiterpene olefins from FPP through a wide variety of carbocation reaction mechanisms (Steele et al. 1998). Based on the theory of divergent molecular evolution, we previously demonstrated the redesign of many specific mutant variants that produce a single terpene or a few terpenes (Martin et al. 2003; Yoshikuni et al. 2006a; Yoshikuni and Keasling 2007). Because of its versatility and plasticity in catalysis, HUM might find use in industrial terpenoid production and is an excellent model to study the biochemistry of terpene synthases. However, the over-expression of HUM with this system resulted in the formation of less than 0.02 mg/L of terpenes even after expression optimization at 37°C. Detailed analyses have revealed that a large fraction of HUM misfolds and accumulates in the insoluble fraction. As a result, only a small fraction of this enzyme is functional in vivo. Although various genetic tools have been tried to improve expression of this enzyme, none of these strategies has worked well. Due to the potential importance of this enzyme and the difficulties in increasing its activity in vivo, we have chosen this enzyme and its specific mutant variants as models for redesign.

As has been discussed, since in vivo properties are strongly dependent on innate characteristics of enzymes, neither traditional nor more recently developed metabolic engineering strategies (Pitera et al. 2007) are likely to improve poorly functioning enzymes. Instead, protein engineering strategies that actually improve the function of these enzymes might be very useful or even become the sole solution to these problems. Protein engineering has been widely adopted as an effective design strategy to improve many different enzyme properties. However, the application of conventional protein engineering strategies such as directed evolution (Roodveldt et al. 2005) and rational and computational design (Daujotyte et al. 2003; Nasreen et al. 2006) is extremely difficult or nearly impossible for enzymes like HUM and other terpene synthases because of the lack of both a high-throughput screen and enzyme structures with atomic-level accuracy. Although several screening methodologies using fused reporter proteins have been previously described, the reporter activity is not necessarily correlated with the activity of the target proteins, and hence a large number of mutants still need to be screened (Yoshikuni et al. 2006b).

In addition, since it is thought that each terpene synthase has evolved after a series of speciation events, the primary sequences of terpene synthases of diverse function are more related within an organism or closely related organisms than are terpene synthases of similar function but in very different organisms. Primary sequences for terpene synthases are limited and biased to the enzymes derived from model species. Thus, it is also extremely difficult to extract useful information from a conventional analysis of their evolutionary relationship (Amin et al. 2004; Lehmann et al. 2002; Steipe et al. 1994). Furthermore, as the function of these enzymes is highly plastic, any single mutation introduced into a terpene synthase could alter its product selectivity. Therefore, a methodology that does not affect product selectivity would be ideal. For these reasons, we have considered how evolution might essentially optimize and maintain the favorable in vivo properties of enzymes and thereby efficiencies of metabolic pathways. Here, we present a simple and fast redesign strategy for enzymes based on plausible mechanisms of molecular evolution and demonstrate its use on two enzymes (tHMGR, HUM, and its specific mutant variants) whose in vivo activities have been difficult to improve using traditional metabolic engineering techniques.

RESULTS AND DISCUSSION

Immutability of amino acids

In nature, pathways are thought to be constructed and gradually optimized through adaptive evolution of each enzyme in the pathway to render the host organisms adaptable to changing environments (Schmidt et al. 2003). In molecular evolution, the fixation probability of mutations is primarily determined by their fitness effects if a population size is small; these effects can be deleterious (opposed by purifying selection and likely discarded from a population), neutral or nearly neutral (subject to genetic drift), or advantageous (favored by positive selection and fixed in a population) (Pal et al. 2006). It has been proposed that many enzymes are horizontally transferred from one species to another and are subsequently adapted to the new intracellular environment (Pal et al. 2005). Therefore, when mutations to a particular amino acid at a given position are kept to a minimum in this process (that is, the amino acid is less mutable), that amino acid is probably more essential for maintaining the in vivo properties of the enzyme. If such amino acids could be identified, then it would be possible to improve the function of an enzyme in a host organism by properly realigning these amino acids based on the evolutionary relationships of that enzyme.

Although several groups have previously considered the immutability of amino acids, the results have not been consistent and appear to change with the types of proteins being analyzed (Endo et al. 2002; Graur 1985). To reconcile these findings, we analyzed over 30,000 homologous sequences to over 200 different E. coli enzymes involved in central metabolism across multiple species (see Materials and Methods for details). Because of the essential roles of central metabolic enzymes in host viability, the in vivo properties of these enzymes are expected to be optimal. The probabilities of mutations to each amino acid X between two sequences ( $P_{Mut}^{X}$ ) and that for all amino acids ( $P_{Mut}$ ) were determined (see Materials and Methods section for more information); interestingly, $P_{Mut}^{X}$ and $P_{Mut}$ were linearly correlated in many cases (Figure 1A–D). Thus, we are able to determine which amino acid contributes most to the diversification of enzymes. The average $P_{Mut}^{X} / P_{Mut}$ for each amino acid X was determined and then used to calculate the change in free energy due to the amino acid change ( $Δ G_{Mut}^{X}$ ) (Figure 1E). The analysis clearly indicates that Gly and Pro are significantly less mutable compared to other amino acids in E. coli central metabolic enzymes. The results imply that Gly and Pro are more essential and have been acquired and/or maintained during the adaptation of these enzymes toward function in E. coli.
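
The per-alignment quantities can be illustrated with a short Python sketch. It follows the definitions given in Materials and Methods (pairs in which exactly one member is X, divided by pairs in which at least one member is X), but the function names, the toy sequences, the handling of gaps as ordinary mismatches, and the use of an ordinary least-squares slope with an intercept are assumptions made for illustration, not the authors' implementation.

```python
def mutation_probabilities(ref, hom, residue):
    """Return (P_Mut^X, P_Mut) for one gapped pairwise alignment.

    ref, hom: equal-length aligned strings; '-' marks a gap.
    residue:  one-letter code of amino acid X.
    """
    if len(ref) != len(hom):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns that are gaps in both sequences carry no information.
    pairs = [(a, b) for a, b in zip(ref, hom) if (a, b) != ("-", "-")]
    if not pairs:
        raise ValueError("alignment has no aligned columns")
    n_all = len(pairs)                      # N: all aligned pairs
    n_mut = sum(a != b for a, b in pairs)   # N_Mut: mutated pairs
    with_x = [(a, b) for a, b in pairs if residue in (a, b)]
    n_x = len(with_x)                       # N^X: at least one member is X
    n_x_mut = sum(a != b for a, b in with_x)  # N_Mut^X: exactly one member is X
    p_x = n_x_mut / n_x if n_x else float("nan")
    return p_x, n_mut / n_all


def ols_slope(xs, ys):
    """Least-squares slope of ys against xs (with intercept; an assumption)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy alignment (hypothetical sequences): one Gly conserved, one Gly deleted.
p_gly, p_all = mutation_probabilities("MGAPGKL", "MGSP-KL", "G")
print(p_gly, p_all)
```

In this reading, one (P_Mut, P_Mut^X) point is produced per pairwise alignment, and the slope over all alignments in a family gives the ratio for that family; a slope below 1 would correspond to an amino acid that is less mutable than average.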

The immutability of each amino acid (its likelihood of being gained or lost through substitutions, insertions, and deletions) was calculated by comparing over 200 *E. coli* proteins involved in central metabolism with their homologues (30,000 in total) and was converted to a free energy difference ( $Δ G_{Mut}^{X}$ ; kT^* denotes an arbitrary unit). The probability of mutation for each amino acid X ( $P_{Mut}^{X}$ ) was plotted against that for all amino acids ( $P_{Mut}$ ), using the glutamate synthase large subunit protein family as an example; plots for alanine ( $P_{Mut}^{A} / P_{Mut} ≅ 1$ ) (A), glutamine ( $P_{Mut}^{Q} / P_{Mut} > 1$ ) (B), glycine ( $P_{Mut}^{G} / P_{Mut} < 1$ ) (C), and proline ( $P_{Mut}^{P} / P_{Mut} < 1$ ) (D) are shown. The average relative stability of each amino acid to mutation, obtained from analyses of 209 different protein families (or superfamilies), is shown in (E) (mean ± 2 S.E.). The results clearly indicate that Gly and Pro were significantly more immutable over the course of evolution.

Redistribution of Gly and Pro residues at predicted positions in proteins

As mentioned above, through prior work in our laboratory, we determined that tHMGR and HUM limited flux through the terpene biosynthetic pathway (Pitera et al. 2007; Yoshikuni et al. 2006a). Although both enzymes were expressed at a high level, neither enzyme had sufficient in vivo activity. Thus, we focused our work on these two enzymes. We selectively redistributed Gly and Pro in both tHMGR and HUM based on their evolutionary relationships to other HMGRs and terpene synthases, respectively. First, multiple sequence alignments (MSA) for both tHMGR and HUM were constructed. The MSA for tHMGR was constructed using other HMGR sequences derived from Archaea, because HMGR from Archaea is produced in soluble form. The MSA for HUM was constructed using mono- and diterpene synthases derived from gymnosperms, because only a few sesquiterpene synthases derived from gymnosperms have been discovered. Next, the probabilities of conservation for Gly ( $P_{i}^{G}$ ) and Pro ( $P_{i}^{P}$ ) at each ith residue of tHMGR and HUM were calculated (represented by blue and red bars, respectively, in Figure 2A and C for tHMGR and in Figure 3A and C for HUM). Finally, substitutions involving Gly and Pro were introduced into both tHMGR and HUM according to the predicted profiles.
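
The profile calculation described above can be sketched in Python as follows. The sketch assumes the MSA is given as equal-length gapped strings, takes the probability of conservation at a position to be the fraction of aligned homologue residues that are Gly (or Pro), and applies the *P_i* = 0.4 threshold mentioned in the figure legends. The function names, the toy sequences, the exclusion of the target sequence from the frequency count, and the exact decision rule (replace a poorly conserved Gly or Pro with Ala; introduce Gly or Pro where it is conserved in the homologues) are illustrative assumptions rather than the authors' procedure.

```python
def conservation_profile(target, homologs, residue):
    """Return [(target_residue, P_i)] for each non-gap position of the target.

    P_i is the fraction of homologue residues equal to `residue` in that
    column; it is None when no homologue aligns there.
    """
    if any(len(h) != len(target) for h in homologs):
        raise ValueError("all aligned sequences must have equal length")
    profile = []
    for i, t in enumerate(target):
        if t == "-":
            continue  # column absent from the target sequence
        column = [h[i] for h in homologs if h[i] != "-"]
        p = column.count(residue) / len(column) if column else None
        profile.append((t, p))
    return profile


def suggest_substitutions(target, homologs, residue, threshold=0.4):
    """List candidate substitutions such as 'G2A' (X -> Ala) or 'A3G' (Xaa -> X)."""
    suggestions = []
    for pos, (t, p) in enumerate(conservation_profile(target, homologs, residue), 1):
        if p is None:
            continue  # unaligned residue: no prediction is made
        if t == residue and p < threshold:
            suggestions.append(f"{t}{pos}A")          # poorly conserved Gly/Pro
        elif t != residue and p >= threshold:
            suggestions.append(f"{t}{pos}{residue}")  # conserved Gly/Pro is missing
    return suggestions


# Toy MSA (hypothetical sequences), target first.
target = "MGAKP"
homologs = ["MAGKP", "MSGRP", "MAG-A"]
print(suggest_substitutions(target, homologs, "G"))
print(suggest_substitutions(target, homologs, "P"))
```

Positions are numbered along the ungapped target sequence, so the suggested labels can be read directly as mutations in the target enzyme.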

The distributions of Gly (blue bars in (A)) and Pro (red bars in (C)) expected to affect the *in vivo* properties of tHMGR were predicted from an MSA constructed using the primary sequences of HMGRs derived from archaea as a guide (sharing 30–40% sequence identity). Residues that did not align in the MSA are represented by yellow bars in these panels. According to these profiles, Gly→Ala, Xaa→Gly, Pro→Ala, and Xaa→Pro substitutions were introduced into tHMGR, and the functional consequences of these substitutions were monitored by *in vivo* mevalonate production. The effects of Gly→Ala and Xaa→Gly mutations are represented by the blue bars in (B), the effects of Pro→Ala and Xaa→Pro mutations are represented by the red bars in (D), and the effects of mutations to residues that did not align in the MSA are represented by the yellow bars in (B) and (D) (mean ± S.D. of triplicate measurements). The correlations between the fitness effects of the mutations and $P_{i}^{x}$ are plotted in (E) and (F). Gly→Xaa and Pro→Xaa mutations are represented by the blue and red dots, respectively, in (E), and the Xaa→Gly and Xaa→Pro mutations are represented by the blue and red dots, respectively, in (F). These results show that over 80% of the aligned mutations were accurately predicted from these profiles using *P_i* = 0.4 as a threshold. *P_i* = 0.4 was chosen as the threshold because the MSA was constructed using homologous sequences with 30–40% identity.

The distributions of Gly (blue bars in (A)) and Pro (red bars in (C)) expected to affect the *in vivo* properties of HUM were predicted from an MSA constructed using the primary sequences of mono-, sesqui-, and diterpene synthases derived from gymnosperms as a guide. Residues that did not align, aligned only in monoterpene synthases, and aligned only in diterpene synthases in the MSA are represented by yellow, green, and purple bars, respectively. According to this profile, Gly→Ala, Xaa→Gly, Pro→Ala, and Xaa→Pro substitutions were introduced into HUM, and the fitness effects of these substitutions were monitored by *in vivo* sesquiterpene production. The effects of Gly→Ala and Xaa→Gly mutations are represented by the blue bars in (B), the effects of Pro→Ala and Xaa→Pro mutations are represented by the red bars in (D), and the effects of mutations to residues that did not align in the MSA are represented by the yellow bars in (B) and (D) (mean ± S.D. of triplicate measurements). The correlations between the fitness effects of the mutations and $P_{i}^{x}$ are shown in (E) and (F). Gly→Xaa and Pro→Xaa mutations are represented by blue and red dots, respectively, in (E), and Xaa→Gly and Xaa→Pro mutations are represented by blue and red dots, respectively, in (F). These results indicate that 80–90% of the aligned mutations were accurately predicted from these profiles with *P_i* = 0.4 as a threshold; *P_i* = 0.4 was chosen as the threshold because the MSA was constructed using homologous sequences with 30–40% identity.

When the product of a particular enzyme does not affect cell physiology and the reaction step catalyzed by this enzyme is the primary bottleneck in a biosynthetic pathway, it can be assumed that improvements to the in vivo properties of the enzyme will be directly reflected in product formation, so mutational effects can be easily evaluated. Therefore, the fitness effects of these mutations on tHMGR and HUM were monitored by the level of in vivo mevalonate and sesquiterpene production, respectively (represented by the blue and red bars in Figure 2B and D for tHMGR and Figure 3B and D for HUM).

In both cases, the predictions were more than 80% accurate (Gly- and Pro-related mutations are represented by the blue and red dots, respectively, in Figures 2E and F and 3E and F); the exceptions were the residues that did not align well in the MSA (represented by the yellow bars in Figures 2A–D and 3A–D). In the case of HUM, the effects of mutations to residues predominantly conserved in either mono- or diterpene synthases (represented by the green and purple bars in Figure 3A–D, respectively) were difficult to evaluate, because changes to these residues substantially impacted enzyme activity. As such, the quality of the MSA would significantly affect the accuracy of the predictions. Although the MSA for HUM was constructed primarily from mono- and diterpene synthases, a relatively high correlation between product formation and probability of conservation was observed (represented by the blue and red dots in Figure 3E and F for HUM). The correlation for tHMGR was not as good as that for HUM (represented by the blue and red dots in Figure 2E and F). Because the pathway containing the native tHMGR was already capable of producing mevalonate at almost 10% of its theoretical maximum, mevalonate production could be improved only a few fold. Thus, improvement to the in vivo properties of tHMGR did not have as significant an impact on mevalonate production as improvements to HUM had on sesquiterpene production. It is also interesting to note how the method performed for cumulative neighbor mutations. For example, introduction of each of the tHMGR mutations G353A and G349A alone resulted in increased mevalonate production even though the mutations were predicted to reduce mevalonate production. However, when these mutations were introduced into the G352A mutant, they decreased mevalonate production (10–40% reduction in mevalonate productivity compared to that from G352A alone). Thus, the methodology accurately predicted the effect of cumulative neighbor mutations.

The growth (A) and mevalonate production (B) of strains harboring pBADMevT containing tHMGR-WT (yeast wild-type HMG1p with its membrane-binding domain truncated), tHMGR-G5 (G206A/G319A/G352A/G417A/G495A), or tHMGR-G9 (P200A/G206A/T239P/G319A/G352A/G417A/P428G/K474G/G495A) were measured. Both growth and mevalonate production improved approximately 2.5- to 3-fold, and the increase in growth was proportional to the increase in mevalonate production. We previously demonstrated that accumulation of HMG-CoA inhibits cell growth. Thus, improving the *in vivo* properties of tHMGR reduced HMG-CoA accumulation and hence the toxicity of the pathway.

Finally, to determine if another amino acid might be a better change at these positions, saturation mutagenesis was carried out on G148, G227, G327, and G361 in HUM. In all cases, substitution of the native amino acid with Ala improved sesquiterpene production, and no further mutations were considered.

Accumulation of single mutations improved the in vivo enzyme properties and production of various terpenes

Mutations that improved the in vivo properties of tHMGR and HUM were subsequently recombined, and the effects of mutations on their in vivo properties were cumulative; we have constructed the tHMGR-G9 mutant containing the changes P200A/G206A/T239P/G319A/G352A/G417A/P428G/K474G/G495A and HUM-G6 mutant containing the changes K126P/R142G/G148A/G227A/G327A/G361A. When integrated into the host alone, tHMGR-G9 increased mevalonate production three fold and HUM-G6 improved sesquiterpene production 80 fold (Figure 4B and Figure 5B, respectively). The improvements in product titer were due to increases in flux through the pathway as well as improvements in growth, as E. coli harboring tHMGR-G9 did not suffer the growth inhibition of accumulated HMG-CoA that the strain harboring wild-type tHMGR did (Figure 4A and Figure 5A). When the tHMGR-G9 and HUM-G6 mutants were integrated into the same host, host growth improved 3–4 fold and sesquiterpene production improved nearly 1,000 fold (Figure 5A and B).

*E. coli* DH1 harboring pBADMevT (tHMGR-WT or tHMGR-G9 (P200A/G206A/T239P/G319A/G352A/G417A/P428G/K474G/G495A)), pBBRMBIS, and pTrcHUM15 (containing HUM-WT, HUM-G3 (K126P/R142G/G227A), or HUM-G6 (K126P/R142G/G148A/G227A/G327A/G361A)) was used for *in vivo* sesquiterpene production. The growth curve (A) and sesquiterpene production at 24 hours after inoculation (B) are shown. HUM-WT, HUM-G3, and HUM-G6 co-integrated with tHMGR-WT are shown in light blue, medium blue, and dark blue, respectively, and those co-integrated with tHMGR-G9 are shown in yellow, light green, and dark green, respectively. The strain containing tHMGR-G9 grew to a 3-fold higher density and produced 3-fold more mevalonate (Figure 4), resulting in a synergistic improvement in overall sesquiterpene production. The mutations in HUM-G6 were also applied to specific variants of HUM previously constructed in our laboratory (SIB, sibirene synthase; sHUM, specific γ-humulene synthase; LFN, longifolene synthase; ALP, α-longipinene synthase; BBA, β-bisabolene synthase; and AYG, α-ylangene synthase), and the resulting variants were co-integrated with tHMGR-G9 (C). The resulting specific terpene production at 48 hours after inoculation improved dramatically for each variant and is compared to total sesquiterpene production for G6. All data represent mean ± S.D. of triplicate measurements.

The same mutations that were introduced into HUM-G6 were also introduced into six previously designed mutant variants of HUM that produce a single product or a few products rather than the many products generated by HUM (Yoshikuni et al. 2006a); the effects of the mutations on their in vivo properties were similar to those observed for HUM. The terpene productivities of the hosts harboring these improved enzymes were also dramatically improved (8–34 mg/L from ~0.02 mg/L; Figure 5C). Given that the conservation analysis is supported by over 200 different enzyme families (or superfamilies), we predict that similar mutations could improve the in vivo properties of many other enzymes.

The effects of Gly- and Pro-related mutations on enzyme activity

To investigate how Gly and Pro redistributions contribute to such a large improvement in sesquiterpene production, product selectivity, steady state kinetics, and in vivo sesquiterpene production from S-tagged versions of HUM, HUM-G3 (K126P/R142G/G227A), and HUM-G6 were measured. (The S-tag was removed for the steady state kinetics study). Although terpene synthases are known to be very plastic (Yoshikuni et al. 2006a; Yoshikuni et al. 2006b), the product selectivity of HUM-G6 was comparable to that of wild-type HUM (Table 1). In addition, its activity (k_cat/K_m) was similar to that of wild-type HUM (Table 1). In contrast, in vivo sesquiterpene production from S-tagged HUM-G6 increased nearly two fold over that of the non-S-tagged HUM-G6. The improvement in sesquiterpene production from HUM to HUM-G6 increased with temperature (3.3-fold at 20°C, 10-fold at 30°C, and 220-fold at 37°C; Figure 6A). Of all the temperatures tested, wild-type HUM showed the highest terpene production at 30°C, whereas HUM-G6 showed the highest terpene production at 37°C. Given that more enzyme was found in the soluble fraction at 37°C than at 30°C (Figure 6B) it appears that wild-type HUM does not fold properly at higher temperatures (many enzymes found even in the soluble fraction at 37°C were not functional), and that Gly and Pro redistributions increased the ability of HUM-G6 to functionally fold in the E. coli intracellular environment at higher temperature. In fact, quantification of in vivo enzyme levels in both the soluble fraction and crude lysate revealed that the mutations significantly improved the concentration of soluble enzymes (Figure 6C).

Table 1.

Product selectivity of HUM and its mutant variants

		Kinetic Parameters^*2			Product Distributions (%)^*3 ^*4
Name^*1	Mutations	k_cat (10⁻³ s⁻¹)	K_m (µM)	k_cat/K_m (10³ M⁻¹s⁻¹)	1	2	3	4	5	6
WT	None	12.00 ± 0.34	2.01 ± 0.17	5.96	8.3	7.2	14.9	26.1	34.0	9.5
G3	K126P, R142G, G227A	7.62 ± 0.21	4.66 ± 0.39	1.64	7.2	6.3	16.5	27.9	31.7	10.4
G6	K126P, R142G, G148A, G227A, G327A, G361A	1.71 ± 0.17	0.69 ± 0.13	2.47	7.1	6.8	15.2	27.4	32.8	10.7
SIB	K126P, R142G, G148A, G227A, G327A, G361A, F312Q, M339A, M447F	ND	ND	ND	0.2	2.8	2.2	80.1	13.8	0.9
HUM	K126P, R142G, G148A, G227A, G327A, G361A, M339N, S484C, M565I	ND	ND	ND	5.6	11.7	5.9	0.7	75.6	0.6
LFN	K126P, R142G, G148A, G227A, G327A, G361A, A317N^*5, A336S, S484C, I562V	ND	ND	ND	12.8	3.4	62.1	1.6	11.9	8.1
ALP	K126P, R142G, G148A, G227A, G327A, G361A, A336C, T445C, S484C, I562L, M565L	ND	ND	ND	60.2	4.6	13.7	0.4	14.6	6.5
BBA	K126P, R142G, G148A, G227A, G327A, G361A, A336V, M447H, I562T	ND	ND	ND	1.6	0.2	3.9	0.5	4.7	89.1
AYG	K126P, R142G, G148A, G227A, G327A, G361A, S484A, Y566F	ND	ND	ND	14.6	27.5	0.5	0.6	47.1	9.6

^*1 WT: wild-type γ-humulene synthase, G3: third generation of mutant γ-humulene synthase, G6: sixth generation of mutant γ-humulene synthase, SIB: sibirene synthase, HUM: new γ-humulene synthase, LFN: longifolene synthase, ALP: α-longipinene synthase, BBA: β-bisabolene synthase, AYG: α-ylangene synthase.

^*2 ND: parameters were not determined for a particular mutant.

^*3 1: α-longipinene, 2: α-ylangene, 3: longifolene, 4: sibirene, 5: γ-humulene, 6: β-bisabolene.

^*4 Product distributions are normalized so that products 1–6 total 100%; these six products correspond to more than 85–95% of total products in the mutants and to 75% of total products in the wild type (including G3 and G6).

^*5 A317N occurred during recombination and improved in vivo terpene production without a change in product distribution.
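
As a check on the units in Table 1, the catalytic efficiencies can be recomputed from the tabulated k_cat and K_m values. The short Python sketch below does only this unit conversion; the function name and dictionary layout are illustrative assumptions. The recomputed ratios agree with the tabulated k_cat/K_m values to within about 0.01 × 10³ M⁻¹s⁻¹, a difference that presumably reflects rounding of the published parameters.

```python
def catalytic_efficiency(kcat_milli_per_s, km_micromolar):
    """Return k_cat/K_m in units of 10^3 M^-1 s^-1.

    kcat_milli_per_s: k_cat in 10^-3 s^-1 (as tabulated)
    km_micromolar:    K_m in micromolar (as tabulated)
    """
    kcat = kcat_milli_per_s * 1e-3  # convert to s^-1
    km = km_micromolar * 1e-6       # convert to M
    return kcat / km / 1e3


# Mean k_cat and K_m values copied from Table 1 (uncertainties omitted).
table1 = {"WT": (12.00, 2.01), "G3": (7.62, 4.66), "G6": (1.71, 0.69)}
efficiency = {name: catalytic_efficiency(*params) for name, params in table1.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} x 10^3 M^-1 s^-1")
```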

S-tagged versions of HUM-WT (yellow), HUM-G3 (light green), and HUM-G6 (dark green) were co-integrated with tHMGR-G9 into the synthetic pathway to examine how temperature influences the effect of the accumulated Gly and Pro mutations. Sesquiterpene production by these strains (A), total HUM protein concentration in culture (B), and soluble HUM protein concentration in culture (C) at 24 hours after inoculation are shown. The OD_600nm reached approximately 2.5 at 20°C, 8.0 at 30°C, and 6.5 (G6) to 8.0 (WT) at 37°C. Interestingly, sesquiterpene productivity from HUM-G6 improved almost 1.7-fold with an N-terminal S-tag. The higher the temperature, the more HUM was produced. At 37°C, the amount of HUM-G6 in the soluble fraction was significantly higher than that of HUM-WT. All data represent mean ± S.D. of triplicate measurements.

SIGNIFICANCE

We successfully redesigned the in vivo properties of tHMGR, HUM, and six specific mutant variants of HUM to improve in vivo sesquiterpene production without affecting the respective catalyzed reactions. Because of their unfavorable in vivo properties, these enzymes have been previously found to be significant bottlenecks in the terpene biosynthetic pathway. From our initial analysis of over 30,000 homologues to over 200 E. coli enzymes involved in central metabolism, Gly and Pro were suggested to be essential for the in vivo properties of these enzymes. These amino acids were then selectively redistributed in tHMGR and HUM based on their evolutionary relationships. The results show that this protein engineering strategy is simple and fast, yet powerful and effective for redesigning in vivo enzyme properties without altering product distributions, a task that would have been extremely difficult or nearly impossible using conventional protein engineering strategies. Since proper distributions of these amino acids can be predicted largely from their evolutionary relationships and their redistribution dramatically impacted the flux through the heterologous biosynthetic pathway, it is likely that proper distributions of these amino acids are essential for many enzymes expressed in E. coli and could have been achieved primarily as a result of adaptive evolution. Together with our previous studies (Yoshikuni et al. 2006a; Yoshikuni and Keasling 2007), these results provide evidence that protein engineering strategies based on the theories of molecular evolution are very useful for the design and construction of synthetic systems.

MATERIALS AND METHODS

Reagents and equipment

All enzymes and chemicals were purchased from New England Biolabs and Sigma-Aldrich Co., respectively, unless otherwise stated. An HP6890 gas chromatograph equipped with a 5973 mass selective detector (Hewlett Packard) or flame ionization detector, a CyclosilB capillary column (30 m × 250 µm i.d. × 0.25 µm thickness, Agilent Technologies) or DB5-MS capillary column (30 m × 250 µm i.d. × 0.25 µm thickness, Agilent Technologies), and a Combi PAL auto sample-injector (LEAP Technologies) were used for analysis. An LS6500 multi-purpose scintillation counter (Beckman Coulter) was used for enzyme kinetics.

Analysis of amino acid composition changes in proteins across multiple species

To examine the relative importance of each of the twenty amino acids (X: Ala, Cys, Asp, …), we calculated the relative immutability (the likelihood of gain or loss through substitutions, deletions, and insertions) of each amino acid from over 200 different protein families (or superfamilies) and expressed it as a free energy difference ( $\Delta G_{Mut}^{X}$ ). These protein families are all involved in central metabolism, including glycolysis, citric acid cycle, pentose phosphate pathway, oxidative phosphorylation, fatty acid metabolism, amino acid metabolism, and nucleic acid metabolism. For each protein family (or superfamily) (F), homologous protein sequences (H) were retrieved using the basic local alignment search tool for proteins (BLASTP: http://www.ncbi.nih.gov). In each pair-wise alignment between a particular E. coli protein sequence and its homologous protein sequence derived from a particular species, the probability of mutation ( $P_{Mut, H, F}^{X}$ ) for each amino acid (X) was calculated from the composition of all aligned pairs of amino acids using the following formula:

$$P_{Mut, H, F}^{X} = \frac{N_{Mut, H, F}^{X}}{N^{X}}$$

where $N_{Mut, H, F}^{X}$ and $N^{X}$ denote, for each pair-wise alignment, the number of aligned amino acid pairs in which exactly one residue is X and the number in which at least one residue is X, respectively. The pair-wise alignments used herein covered at least 80% of the corresponding E. coli protein sequence and shared over 50% identity with it. For this calculation, we assumed that if proteins had evolved without any constraint, $P_{Mut, H, F}^{X}$ would be identical to the mutation probability for all amino acids ( $P_{Mut, H, F}$ ), given by the following formula:

$$P_{Mut, H, F} = \frac{N_{Mut, H, F}}{N}$$

where $N_{Mut, H, F}$ denotes the number of mutated amino acid pairs, and $N$ denotes the number of all amino acid pairs in each pair-wise alignment. We then plotted $P_{Mut, H, F}^{X}$ against $P_{Mut, H, F}$. On average, 164 (2 × S.E. = 22) data points (one per pair-wise sequence alignment) were plotted for each of over 200 protein families. $P_{Mut, F}^{X} / P_{Mut, F}$ is defined as the slope of the linear regression of the data in the plot. The free energy difference of each amino acid X for the mutations in each protein family (or superfamily) F ( $\Delta G_{Mut, F}^{X}$ ) was then calculated according to Boltzmann statistics as follows:

$$\frac{P_{Mut, F}^{X}}{P_{Mut, F}} = \exp\left(\frac{-\Delta G_{Mut, F}^{X}}{k T^{*}}\right)$$

where $kT^{*}$ denotes an arbitrary constant. In this analysis, we calculated $\Delta G_{Mut, F}^{X}$ only when the R² of the $P_{Mut, F}^{X} / P_{Mut, F}$ plot was greater than 0.9, or greater than 0.7 if the data points were distributed symmetrically about the trend line, to minimize the bias introduced by duplicate or nearly duplicate data points.
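The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, alignments are assumed to be given as two equal-length gapped strings, a residue aligned to a gap is assumed to count as a mutated pair (to cover insertions and deletions), and the regression is assumed to be ordinary least squares with an intercept, which the text does not specify.

```python
import math


def mutation_probabilities(ecoli_seq, homolog_seq, residue):
    """Return (P_mut_X, P_mut) for one pair-wise alignment.

    Both arguments are aligned strings of equal length; '-' marks a gap.
    Returns None when no aligned pair contains `residue`.
    """
    if len(ecoli_seq) != len(homolog_seq):
        raise ValueError("aligned sequences must have equal length")
    n = n_mut = n_x = n_mut_x = 0
    for a, b in zip(ecoli_seq, homolog_seq):
        if a == "-" and b == "-":
            continue  # nothing aligned in this column
        mutated = a != b  # substitution, insertion, or deletion (assumption)
        n += 1
        n_mut += mutated
        if residue in (a, b):  # at least one member of the pair is X
            n_x += 1
            n_mut_x += mutated  # a mutated pair here has exactly one X
    if n_x == 0:
        return None
    return n_mut_x / n_x, n_mut / n


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (with intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def delta_g_over_kt(slope):
    """Invert slope = exp(-dG / kT*) to obtain dG in units of kT*."""
    return -math.log(slope)
```

For one family, `mutation_probabilities` would be called once per pair-wise alignment, the resulting pairs plotted as (P_mut, P_mut_X), and the slope passed to `delta_g_over_kt`. A slope below 1 (amino acid X mutates less often than average) gives a positive value, corresponding to a more immutable residue.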

Design methodology to improve in vivo properties of enzymes using MSA as a guide

To predict where to distribute Gly, Pro, and Xaa (where Xaa denotes any amino acid residue other than Gly and Pro), we first created an MSA for both HUM and tHMGR using MUSCLE (http://phylogenomics.berkeley.edu/cgi-bin/muscle/input_muscle.py). The primary sequence of HUM from the gymnosperm Abies grandis was aligned with other mono-, sesqui-, and diterpene synthases derived from gymnosperms. Although many sesquiterpene synthases have been isolated from angiosperms, mono- and diterpene synthases from gymnosperms are more closely related to HUM at the primary sequence level. The primary sequence of tHMGR derived from yeast, which was solubilized for expression in E. coli by removal of the N-terminus, was aligned with other orthologous sequences derived from archaeal species, as the archaeal HMGR is produced in a soluble form as opposed to the membrane-bound form found in most eukaryotes. The conservation probability for Gly ( $P_{i}^{G}$ ) and Pro ( $P_{i}^{P}$ ) at column i in a given MSA was calculated based on the composition of Gly and Pro at column i as follows:

$$P_{i}^{X} = \frac{N_{i}^{X}}{N_{i}}$$

where $N_{i}^{X}$ and $N_{i}$ denote the number of occurrences of amino acid X (Gly or Pro) and the total number of aligned amino acids in column i of the MSA, respectively. The fitness effects contributed by these mutations were predicted to depend on the value of $P_{i}^{X}$; we used $P_{i}^{X} = 0.4$ as a threshold because each sequence used to construct the MSA shared approximately 35–40% identity with HUM or tHMGR. We compared the value of $P_{i}^{X}$ and the fitness effects resulting from single mutations. When $P_{i}^{X} \geq 0.4$ , a mutation to amino acid X is predicted to show neutral, nearly neutral, or positive fitness effects, and a mutation away from amino acid X is predicted to show neutral, nearly neutral, or negative fitness effects. When $P_{i}^{X} < 0.4$ , a mutation to amino acid X is predicted to show neutral, nearly neutral, or negative fitness effects, and a mutation away from amino acid X is predicted to show neutral, nearly neutral, or positive fitness effects. For example, if X is Gly and $P_{i}^{Gly}$ is 0.4 or larger, the ith residue remains Gly if it is already Gly and is changed to Gly if it is any other residue. If $P_{i}^{Gly}$ is smaller than 0.4, the ith residue is changed to Ala if it is Gly and remains unchanged if it is any other residue.
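The decision rule above can be written as a short Python sketch. The function names are hypothetical, and several details are assumptions rather than statements from the text: an MSA column is represented as a string with `-` for gaps, gaps are excluded from the total count, and the residue substituted when X is disfavored is passed as a parameter. The text specifies Ala as the replacement for Gly; it does not state the replacement used for Pro, so that value would have to be supplied by the user.

```python
THRESHOLD = 0.4  # threshold on P_i^X stated in the text


def column_frequency(column, residue):
    """P_i^X: fraction of aligned (non-gap) residues in a column equal to X."""
    aligned = [c for c in column if c != "-"]  # excluding gaps is an assumption
    if not aligned:
        return 0.0
    return aligned.count(residue) / len(aligned)


def redesign_position(current, column, target="G", replacement="A",
                      threshold=THRESHOLD):
    """Return the residue to place at one position under the threshold rule.

    current     -- residue currently present in the enzyme at this position
    column      -- the corresponding MSA column as a string
    target      -- amino acid X being redistributed ('G' or 'P')
    replacement -- residue used when X is removed ('A' for Gly per the text)
    """
    p = column_frequency(column, target)
    if p >= threshold:
        return target  # keep X, or mutate to X
    if current == target:
        return replacement  # X is disfavored here, so mutate away from it
    return current  # leave any other residue unchanged
```

Running the function over every aligned position for Gly, and separately for Pro, yields the candidate single mutations that would then be screened experimentally.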

Strains and Plasmids

Escherichia coli strains DH10B and DH1 were used for both mevalonate and sesquiterpene production, and BL21(DE3) was used for protein over-expression and purification. Plasmid pBADMevT (Martin et al. 2003) and its mutant variants were used for mevalonate production. Plasmid pBBRMBIS (Martin et al. 2003) was used for FPP production. Plasmids pTrcHUM (Yoshikuni et al. 2006a), pTrcHUM15, and their mutant variants were used for sesquiterpene production (Supporting Figure 1). Plasmid pTrcSHUM15 and its mutant variants were used for quantification of protein concentrations in vivo. Plasmid pETHUM (Yoshikuni et al. 2006a) and its mutant variants were used for protein over-expression and purification. All genes in pBADMevT and pTrcHUM had previously been re-synthesized with optimized codon usage.

Since reduced expression of HUM slightly improved sesquiterpene production, an extra seven base pairs were introduced between the ribosome-binding site (RBS) and the start codon at the NcoI site of pTrcHUM. The RBS region was amplified by polymerase chain reaction (PCR): 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward (5’-GCGCGTTGGTGCGGATATC-3’) and reverse (5’-CATGCCATGGAGCTTATTCTGTTTCCTGTGTGAAATTG-3’) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng pTrcHUM as a template in a total volume of 100 µl. The amplified fragments were then digested with EcoRV/NcoI and inserted into the corresponding site of pTrcHUM to form pTrcHUM15.

pTrcSHUM15 was constructed on the pTrcHUM15 backbone, with the S-tag fused to the N-terminus of HUM. The RBS region in pTrcHUM15 was amplified by PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward (5’-GCGCGTTGGTGCGGATATC-3’) and reverse (5’-GCAGCAGCGGTTTCTTTCATGGAGCTTATTCTGTTTC-3’) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng pTrcHUM15 as a template in a total volume of 100 µl. The S-tag was amplified by PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward (5’-GAAACAGAATAAGCTCCATGAAAGAAACCGCTGCTGC-3’) and reverse (5’-CATGCCATGGAACCGCGTGGC-3’) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng pET29 (Novagen) as a template in a total volume of 100 µl. These two amplified fragments were spliced by overlap PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward (5’-GCGCGTTGGTGCGGATATC-3’) and reverse (5’-CATGCCATGGAACCGCGTGGC-3’) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and the abovementioned fragments as a template in a total volume of 100 µl. The spliced fragment was then digested with EcoRV/NcoI and inserted into the corresponding site of pTrcHUM to form pTrcSHUM15.
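Overlap PCR relies on the two inner primers sharing a complementary region, so that the two first-round fragments can anneal to each other in the splicing reaction. A quick way to check this for the primers listed above is to compare one inner primer with the reverse complement of the other. The sketch below uses only the primer sequences given in the text; the function and variable names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Inner primers from the pTrcSHUM15 construction, written 5'->3'.
rbs_reverse = "GCAGCAGCGGTTTCTTTCATGGAGCTTATTCTGTTTC"
stag_forward = "GAAACAGAATAAGCTCCATGAAAGAAACCGCTGCTGC"

# The two inner primers are exact reverse complements of each other,
# so the RBS and S-tag fragments share a 37-nt overlap for splicing.
overlap_ok = reverse_complement(stag_forward) == rbs_reverse
print(overlap_ok, len(stag_forward))
```

The same check can be applied to any pair of mutagenic primers before ordering them, which catches transcription errors in primer tables.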

GC-MS analysis for in vivo mevalonate production

To screen the single mutation library, a single colony harboring pBADMevT (wild type tHMGR or its mutant variants) was inoculated into LB medium containing 50 µg/ml chloramphenicol (Cm⁵⁰) and grown overnight at 37°C. An aliquot (50 µl) of this seed culture was inoculated into fresh LB medium (5 ml) containing Cm⁵⁰ and 13.3 mM (+)-L-arabinose and grown for 24 hours at 37°C. An aliquot of culture (560 µl) was mixed with 140 µl of 0.5 M HCl to dehydrate the mevalonate to form mevalonolactone, and 700 µl of ethyl acetate was then added to the sample. The mixture was vortexed for 5 minutes, and the ethyl acetate phase was analyzed by GC-MS on the CyclosilB capillary column using a GC oven temperature program of 90°C for 1 min, then ramping at 30°C/min to 250°C. Mevalonolactone was identified from its mass spectrum and retention time by comparison to an authentic standard.

For the final mevalonate production assay, a single colony harboring pBADMevT (wild type tHMGR or its mutant variants) was inoculated into LB medium containing Cm⁵⁰ and grown overnight at 37°C. An aliquot (500 µl) of this seed culture was inoculated into fresh modified M9 medium (50 ml; the formulation is given in the description of the final sesquiterpene production assay) containing Cm⁵⁰. Two hours after the inoculation, (+)-L-arabinose was added to a final concentration of 13.3 mM. Mevalonate production was analyzed as described above.

GC-FID and GC-MS analysis for in vivo sesquiterpene production

To screen the single mutation library, a single colony harboring pTrcHUM (wild type HUM or its mutant variants) and pBBRMBIS was inoculated into Luria Bertani (LB) medium containing 50 µg/ml carbenicillin (Cb⁵⁰) and 50 µg/ml kanamycin (Km⁵⁰) and grown overnight at 37°C. An aliquot (50 µl) of this seed culture was inoculated into fresh LB medium (5 ml) containing 10 mM D/L-mevalonate, Cb⁵⁰, and Km⁵⁰, overlaid with 500 µl dodecane, and grown for 24 hours at 37°C. An aliquot of dodecane (50 µl) was diluted into 200 µl of ethyl acetate, and the mixture was analyzed by GC-MS or GC-FID using a GC oven temperature program of 80°C for 1 min, then ramping 30°C/min to 110°C, 5°C/min to 160°C, and 130°C/min to 250°C for CyclosilB capillary column analysis and of 80°C for 3 min, then ramping 5°C/min to 160°C, and 120°C/min to 300°C for DB-5MS capillary column analysis. Camphor was used as an internal standard. Sesquiterpenes were identified from their mass spectra and GC retention times by comparison to available authentic standards and spectra in libraries previously reported in the literature.
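When setting up these oven programs on an instrument, it can be useful to know the run time each one implies. The helper below is a small sketch with a hypothetical function name; it assumes that each ramp is linear and that there is no hold at the final temperature, since the text does not mention one.

```python
def oven_program_minutes(start_temp, initial_hold, ramps):
    """Total run time in minutes for a GC oven temperature program.

    start_temp   -- initial oven temperature in degrees C
    initial_hold -- hold time at the initial temperature in minutes
    ramps        -- list of (rate in degrees C per min, target temperature)
    Assumes linear ramps and no final hold.
    """
    total = initial_hold
    temp = start_temp
    for rate, target in ramps:
        total += abs(target - temp) / rate
        temp = target
    return total


# Programs as described in the text for the two columns.
cyclosil_b = oven_program_minutes(80, 1, [(30, 110), (5, 160), (130, 250)])
db_5ms = oven_program_minutes(80, 3, [(5, 160), (120, 300)])
print(round(cyclosil_b, 2), round(db_5ms, 2))
```

Under these assumptions, the CyclosilB program takes roughly 12.7 min per injection and the DB-5MS program roughly 20.2 min, excluding oven cool-down and equilibration.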

For the final sesquiterpene production assay, a bacterial system containing three plasmids was used (Martin et al. 2003). A single colony harboring pTrcHUM15 or pTrcSHUM15 (wild type HUM or its mutant variants), pBBRMBIS, and pBADMevT (wild type tHMGR (Donald et al. 1997) or its mutant variants) was inoculated into LB medium containing Cb⁵⁰, Km⁵⁰, and chloramphenicol (Cm⁵⁰) and grown overnight at 37°C. An aliquot of this seed culture was inoculated to a final OD_600nm of 0.05 into 50 ml of fresh modified M9 medium (pH 7, M9 salts, 75 mM MOPS, 3% glycerol, 5 g/L yeast extract, 2 mM MgSO₄, 1 mg/L thiamine, 10 µM FeSO₄, 0.01 mM CaCl₂, and micronutrients) containing Cb⁵⁰, Km⁵⁰, and Cm⁵⁰, and the culture was overlaid with 10 ml of dodecane. Two hours after the inoculation, isopropyl-β-D-thiogalactopyranoside (IPTG) and (+)-L-arabinose were added to final concentrations of 1 mM and 13.3 mM, respectively. Sesquiterpene production was analyzed as described above.

Site directed mutagenesis of tHMGR by overlap PCR

Site-directed mutagenesis of tHMGR was carried out using overlap PCR (see Supporting Table 1 for the primer sequences). DNA fragments on the N- and C-terminal sides of the mutation site were amplified by PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward and reverse primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng pBADMevT as a template for tHMGR in a total volume of 100 µl. Each amplified DNA fragment was gel purified using a gel purification kit (Qiagen) or treated with DpnI and purified using a PCR purification kit (Qiagen). The two amplified DNA fragments were spliced via overlap PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward and reverse primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng of the abovementioned DNA fragments as a template in a total volume of 100 µl. The fully amplified tHMGR fragment was digested with SpeI/HindIII and inserted into the corresponding site of pBADMevT.

Site directed mutagenesis of HUM by overlap PCR

Site-directed mutagenesis of HUM was carried out using overlap PCR (see Supporting Table 2 for the primer sequences). DNA fragments on the N- and C-terminal sides of the mutation site were amplified by PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward and reverse primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng pTrcHUM as a template for γ-humulene synthase in a total volume of 100 µl. Each amplified DNA fragment was gel purified using a gel purification kit (Qiagen) or treated with DpnI and purified using a PCR purification kit (Qiagen). The two amplified DNA fragments were spliced via overlap PCR: 98°C for 30 sec, 55°C for 30 sec, and 72°C for 30 sec, repeated 30 times. The reaction mixture contained 1× Phusion buffer, 2 mM dNTP, 0.5 µM forward and reverse primers, 2.5 U Phusion DNA polymerase (Finnzymes), and 50 ng of the abovementioned DNA fragments as a template in a total volume of 100 µl. The fully amplified HUM fragment was digested with NcoI/XbaI and cloned into the corresponding site in pTrcHUM.

Quantification of in vivo HUM concentrations

A single colony harboring pTrcSHUM15 (wild type or its mutant variant), pBBRMBIS, and pBADMevT was inoculated into LB medium containing Cb⁵⁰, Km⁵⁰, and Cm⁵⁰ and grown overnight at 37°C. An aliquot of this seed culture was inoculated into fresh modified M9 medium (50 ml, see above formulation) containing Cb⁵⁰, Km⁵⁰, and Cm⁵⁰ to a final OD_600nm of 0.05 and was grown at 37°C. Two hours after the inoculation, IPTG and (+)-L-arabinose were added to final concentrations of 1 mM and 13.3 mM, respectively. The cultures were then grown at 20°C, 30°C, or 37°C. An aliquot of culture (1 ml) was taken and centrifuged at 14,000 × g. The resulting pellet was resuspended in BugBuster containing the recommended amount of Lysonase (Novagen) to a final OD_600nm of 20 and incubated for half an hour at room temperature. This lysis solution was centrifuged for 10 min at 14,000 × g. Total cell lysate (24 µl; both soluble and insoluble fractions) and the soluble fraction only (24 µl) were each mixed with 75 µl of 8 M guanidinium chloride and 1 µl of 4 M dithiothreitol. These solutions were incubated for another hour at room temperature. The concentration of HUM was determined with the FRETWorks S-tag assay kit (Novagen) following the recommended protocol. In vivo sesquiterpene production from each culture was measured as described above.
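As a quick check on the denaturation step in this protocol (24 µl of lysate mixed with 75 µl of 8 M guanidinium chloride and 1 µl of 4 M dithiothreitol), the final concentrations follow from simple dilution. The calculation assumes that the volumes are additive, which the text does not state explicitly:

$$V_{total} = 24 + 75 + 1 = 100\ \mu l$$

$$[\text{guanidinium chloride}] = 8\ M \times \frac{75}{100} = 6\ M, \qquad [\text{dithiothreitol}] = 4\ M \times \frac{1}{100} = 40\ mM$$

The lysate itself is diluted by a factor of 100/24 (about 4.2-fold) in this step, which would need to be taken into account when converting assay readings back to in-culture concentrations.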

Protein expression and purification of HUM

Wild type HUM and its variants were cloned into pET29 and transformed into BL21(DE3). Each transformant was inoculated into LB medium (5 ml) containing Km⁵⁰ and was grown overnight at 30°C. An aliquot (2 ml) of this seed culture was inoculated into fresh terrific broth (TB) medium containing Km⁵⁰ (500 ml), and the culture was grown at 30°C. When the culture reached an OD_600nm of 0.6–0.8, IPTG was added to 0.1 mM, and the culture was grown at 20°C for another 16 hours. Cells were harvested by centrifugation at 6,000 × g for 15 min. The pellet was suspended in 50 ml of BugBuster (Novagen) containing 20 U DNaseI and bacterial protease inhibitor cocktail II (Novagen) and was incubated for an hour at 4°C. The solution was then centrifuged at 20,000 × g for 30 min and filtered through a 0.45-µm filter. The S-Tag™ Thrombin purification kit (Novagen) was used for the purification following the protocol recommended by Novagen. All purifications were done at half scale. The eluted protein solution was dialyzed twice (Pierce, MW 3,000 Da) against 1 L of buffer containing 10 mM TES (pH 7.0), 10 mM MgCl₂, 1 mM DTT, and 5% glycerol overnight. The protein concentration was measured using the Bradford method. We obtained approximately 3 ml of 25–500 µg/ml protein solution with about 95% purity (confirmed by SDS-PAGE, data not shown).

Enzyme kinetics

The kinetic studies of HUM and its variants were carried out following a slightly modified version of the protocol previously reported by Little and Croteau (Little and Croteau 2002). Kinetics for each enzyme were measured in a 40 µl reaction containing 0.15–0.4 µM enzyme in the buffer described in the previous section, overlaid with dodecane. The concentration of FPP was varied from 0.229 to 58.6 µM with a fixed ratio of [³H]FPP. Seven to nine different concentrations of FPP were used for each enzyme (n = 3). The reaction mixture was incubated for 20 minutes at 31°C. To stop the reaction, 40 µl of a solution containing 4 M NaOH and 1 M EDTA was added and mixed. To extract sesquiterpene products, the reaction mixture was vortexed for 2 min, and 400 µl of dodecane was taken from the solution and mixed with 15 ml of scintillation fluid. Radioactivity was measured by scintillation counting. k_cat, K_m, and k_cat/K_m were calculated using Enzyme Kinetics!Pro (ChemSW).
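For readers who want to see how k_cat and K_m can be estimated from a rate-versus-substrate data set of this kind, the following Python sketch fits the Michaelis–Menten equation through a Hanes–Woolf linearization ([S]/v plotted against [S]). This is a swapped-in technique chosen because it needs only the standard library; it is not the method used by Enzyme Kinetics!Pro, whose fitting procedure is not described in the text. The function names are hypothetical, and the numbers in the usage example are synthetic, not data from this study.

```python
def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) from paired substrate concentrations and rates.

    Uses the Hanes-Woolf form  [S]/v = Km/Vmax + [S]/Vmax,
    fitted by ordinary least squares.
    """
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def kcat(vmax, enzyme_conc):
    """k_cat = Vmax / [E]; both must use the same concentration unit."""
    return vmax / enzyme_conc


# Synthetic example: ideal Michaelis-Menten data with Vmax = 2, Km = 5.
s_values = [1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
v_values = [2.0 * s / (5.0 + s) for s in s_values]
vmax_fit, km_fit = hanes_woolf_fit(s_values, v_values)
```

With real, noisy data a direct nonlinear fit of the Michaelis–Menten equation is generally preferable, because linearizations distort the error structure of the measurements.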

Supplementary Material

Supplemental Data

NIHMS573833-supplement-Supplemental_Data.pdf (164.3 KB, pdf)

REFERENCES

Amin N, Liu AD, Ramer S, Aehle W, Meijer D, Metin M, Wong S, Gualfetti P, Schellenberger V. Construction of stabilized proteins by combinatorial consensus mutagenesis. Protein Eng Des Sel. 2004;17(11):787–793. doi: 10.1093/protein/gzh091.
Daujotyte D, Vilkaitis G, Manelyte L, Skalicky J, Szyperski T, Klimasauskas S. Solubility engineering of the HhaI methyltransferase. Protein Eng. 2003;16(4):295–301. doi: 10.1093/proeng/gzg034.
Donald KA, Hampton RY, Fritz IB. Effects of overproduction of the catalytic domain of 3-hydroxy-3-methylglutaryl coenzyme A reductase on squalene synthesis in Saccharomyces cerevisiae. Appl Environ Microbiol. 1997;63(9):3341–3344. doi: 10.1128/aem.63.9.3341-3344.1997.
Endo T, Fedorov A, de Souza SJ, Gilbert W. Do introns favor or avoid regions of amino acid conservation? Mol Biol Evol. 2002;19(4):521–525. doi: 10.1093/oxfordjournals.molbev.a004107.
Graur D. Amino acid composition and the evolutionary rates of protein-coding genes. J Mol Evol. 1985;22(1):53–62. doi: 10.1007/BF02105805.
Kraut DA, Carroll KS, Herschlag D. Challenges in enzyme mechanism and energetics. Annu Rev Biochem. 2003;72:517–571. doi: 10.1146/annurev.biochem.72.121801.161617.
Lehmann M, Loch C, Middendorf A, Studer D, Lassen SF, Pasamontes L, van Loon AP, Wyss M. The consensus concept for thermostability engineering of proteins: further proof of concept. Protein Eng. 2002;15(5):403–411. doi: 10.1093/protein/15.5.403.
Little DB, Croteau RB. Alteration of product formation by directed mutagenesis and truncation of the multiple-product sesquiterpene synthases delta-selinene synthase and gamma-humulene synthase. Arch Biochem Biophys. 2002;402(1):120–135. doi: 10.1016/S0003-9861(02)00068-1.
Martin VJ, Pitera DJ, Withers ST, Newman JD, Keasling JD. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat Biotechnol. 2003;21(7):796–802. doi: 10.1038/nbt833.
Martin VJ, Yoshikuni Y, Keasling JD. The in vivo synthesis of plant sesquiterpenes by Escherichia coli. Biotechnol Bioeng. 2001;75(5):497–503. doi: 10.1002/bit.10037.
Nasreen A, Vogt M, Kim HJ, Eichinger A, Skerra A. Solubility engineering and crystallization of human apolipoprotein D. Protein Sci. 2006;15(1):190–199. doi: 10.1110/ps.051775606.
Pal C, Papp B, Lercher MJ. Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nat Genet. 2005;37(12):1372–1375. doi: 10.1038/ng1686.
Pal C, Papp B, Lercher MJ. An integrated view of protein evolution. Nat Rev Genet. 2006;7(5):337–348. doi: 10.1038/nrg1838.
Pitera DJ, Paddon CJ, Newman JD, Keasling JD. Balancing a heterologous mevalonate pathway for improved isoprenoid production in Escherichia coli. Metab Eng. 2007;9(2):193–207. doi: 10.1016/j.ymben.2006.11.002.
Reiling KK, Yoshikuni Y, Martin VJ, Newman J, Bohlmann J, Keasling JD. Mono and diterpene production in Escherichia coli. Biotechnol Bioeng. 2004;87(2):200–212. doi: 10.1002/bit.20128.
Roodveldt C, Aharoni A, Tawfik DS. Directed evolution of proteins for heterologous expression and stability. Curr Opin Struct Biol. 2005;15(1):50–56. doi: 10.1016/j.sbi.2005.01.001.
Schmidt S, Sunyaev S, Bork P, Dandekar T. Metabolites: a helping hand for pathway evolution? Trends Biochem Sci. 2003;28(6):336–341. doi: 10.1016/S0968-0004(03)00114-2.
Steele CL, Crock J, Bohlmann J, Croteau R. Sesquiterpene synthases from grand fir (Abies grandis). Comparison of constitutive and wound-induced activities, and cDNA isolation, characterization, and bacterial expression of delta-selinene synthase and gamma-humulene synthase. J Biol Chem. 1998;273(4):2078–2089. doi: 10.1074/jbc.273.4.2078.
Steipe B, Schiller B, Pluckthun A, Steinbacher S. Sequence statistics reliably predict stabilizing mutations in a protein domain. J Mol Biol. 1994;240(3):188–192. doi: 10.1006/jmbi.1994.1434.
Yoshikuni Y, Ferrin TE, Keasling JD. Designed divergent evolution of enzyme function. Nature. 2006a;440(7087):1078–1082. doi: 10.1038/nature04607.
Yoshikuni Y, Keasling JD. Pathway engineering by designed divergent evolution. Curr Opin Chem Biol. 2007;11(2):233–239. doi: 10.1016/j.cbpa.2007.02.033.
Yoshikuni Y, Martin VJ, Ferrin TE, Keasling JD. Engineering cotton (+)-delta-cadinene synthase to an altered function: germacrene D-4-ol synthase. Chem Biol. 2006b;13(1):91–98. doi: 10.1016/j.chembiol.2005.10.016.
