In-depth Sequence–Function Characterization Reveals Multiple Pathways to Enhance Enzymatic Activity

Vikas D Trivedi; Todd C Chappell; Naveen B Krishna; Anuj Shetty; Gladstone G Sigamani; Karishma Mohan; Athreya Ramesh; Kumar R Pravin; Nikhil U Nair

doi:10.1021/acscatal.1c05508

Author manuscript; available in PMC: 2023 Jun 15.

Published in final edited form as: ACS Catal. 2022 Feb 1;12(4):2381–2396. doi: 10.1021/acscatal.1c05508


Vikas D Trivedi ¹,*, Todd C Chappell ¹,*, Naveen B Krishna ², Anuj Shetty ², Gladstone G Sigamani ², Karishma Mohan ¹, Athreya Ramesh ¹, Kumar R Pravin ², Nikhil U Nair ¹,✉

PMCID: PMC10270700 NIHMSID: NIHMS1899566 PMID: 37325394

Abstract

Deep mutational scanning (DMS) has recently emerged as a powerful method to study protein sequence-function relationships, but it has not been well explored as a guide for enzyme engineering or for identifying pathways by which the catalytic cycle may be improved. We report such a demonstration in this work using a phenylalanine ammonia-lyase (PAL), which deaminates L-phenylalanine to trans-cinnamic acid and has widespread application in chemo-enzymatic synthesis, agriculture, and medicine. In particular, the PAL from Anabaena variabilis (AvPAL*) has garnered significant attention as the active ingredient in Pegvaliase®, the only FDA-approved drug treating classical phenylketonuria (PKU). Although an extensive body of literature exists on the structure, substrate specificity, and catalytic cycle, protein-wide sequence determinants of function remain unknown, as do intermediate reaction steps that limit turnover frequency, all of which has hindered rational engineering of these enzymes. Here, we created a detailed sequence-function landscape of AvPAL* by performing DMS and revealed 112 mutations at 79 functionally relevant sites that effect a positive change in enzyme fitness. Using fitness values and structure-function analysis, we picked a subset of positions for comprehensive single- and multi-site saturation mutagenesis and identified combinations of mutations that led to improved reaction kinetics in cell-free and cellular contexts. We then performed quantum mechanics/molecular mechanics (QM/MM) and molecular dynamics (MD) simulations to understand the mechanistic role of the most beneficial mutations and observed that different mutants confer improvements via different mechanisms, including stabilizing transition and intermediate states, improving substrate diffusion into the active site, and decreasing product inhibition.
This work demonstrates how DMS can be combined with computational analysis to effectively identify significant mutations that enhance enzyme activity along with the underlying mechanisms by which these mutations confer their benefit.

Keywords: PAL, phenylalanine ammonia-lyase, phenylketonuria, PKU, deep mutational scanning, directed evolution, QM/MM, molecular dynamics

INTRODUCTION.

Deep mutational scanning (DMS)^{1–15} has emerged as a powerful method to assess sequence-function relationships¹⁶, identify functional hotspots¹⁷, and accelerate and/or broaden engineering campaigns^{18, 19}. Specifically, DMS provides a comprehensive map of sequence–function relationships to explore protein fitness landscapes¹⁷, discover new functionally relevant sites²⁰, and identify beneficial combinations of mutations for protein engineering²¹. Building upon these previous efforts at DMS-guided protein engineering, we developed a workflow to engineer an ammonia-lyase (AL, EC 4.3.1.*), a relatively understudied family of enzymes, and to further assess how specific mutations overcome different steps in the catalytic cycle that limit activity.

Phenylalanine ammonia-lyases (PALs), which deaminate L-phenylalanine (Phe) to trans-cinnamic acid (tCA) and ammonium (NH₄⁺), are widely found associated with secondary metabolism in plants, bacteria, and fungi²² and contain the rare 4-methylideneimidazole-5-one (MIO) adduct. The MIO adduct enables deamination without an exogenous cofactor such as pyridoxal 5′-phosphate (PLP) and/or co-substrate(s)²³. Biocatalytic applications in natural product and fine chemical synthesis, as well as therapeutic potential, have driven the discovery, expression, characterization, and engineering of PALs^{1, 24–28}. In particular, the recent success in translating PALs into enzyme replacement therapies for phenylketonuria (PKU) management, and their potential use as cancer therapeutics, have further increased interest in engineering this class of enzymes^{29–32}. While there is extensive literature on the structure and catalytic mechanism of PALs, and a generalized understanding from semi-rational and homology-guided mutagenesis studies of how residues in the substrate-binding pocket contribute to specificity and turnover^{33–35}, there is poor understanding of how enzymatic performance can be improved by mutating residues that are not in contact with the bound substrate.

We previously developed a growth-coupled enrichment for rapid screening of high-activity variants of AvPAL* (the double mutant C503S-C565S PAL from Anabaena variabilis, currently used to formulate the PKU drug Pegvaliase®) in E. coli³⁶. Building on our prior work, we first utilized DMS to provide a detailed sequence-function landscape of AvPAL* and identified 79 functionally relevant sites at which mutations improve activity. Next, we selected seven hotspots for single- or multi-site saturation mutagenesis to study their interactions and further enhance catalytic activity. Interestingly, many beneficial mutations were not well represented in the natural sequence diversity of homologous PAL enzymes. We observed that few mutations showed positive fitness with increasing number of co-mutating residues, as evidenced by the two best combinations of mutations among the 7 sites, a double (T102E-M222L) and a triple (T102R-M222L-D306G) mutant. These displayed a ~2.5-fold improvement in k_cat (and >3-fold increase in catalytic efficiency). To understand the mechanistic role of key mutations, we performed modelling studies (quantum mechanics/molecular mechanics, QM/MM, and molecular dynamics, MD, including metadynamics) and concluded that there are multiple pathways to enhance PAL catalytic activity, including i) decreased root mean square fluctuation (RMSF) of substrate in the active site, ii) greater proximity of the substrate to catalytic residues, iii) stabilization of the substrate in the near-attack conformation, iv) stabilization of the transition and intermediate states, and v) facilitated diffusion of the substrate to the active site. Based on these experimental and computational insights, we also created another variant (T102E-M222L-N453S) that displayed lower product inhibition and ~6-fold higher activity in a whole-cell context.
In summary, we provide a combined DMS and modeling workflow that we used to engineer enhanced catalytic properties of PALs, while simultaneously advancing the understanding of their basic enzymology.

RESULTS & DISCUSSION.

Overview.

The overall workflow of this work is summarized in Fig. 1. Starting with a randomly mutagenized library, we first performed deep mutational scanning (DMS) of AvPAL* using a growth-based high-throughput screen (HTS) to evaluate the fitness – or change in relative frequency – of each mutation. Briefly, we deep sequenced plasmid libraries from the naïve and three enrichment passage populations to identify mutations that occurred at each position, calculated the change in frequency for every mutation in each of the enriched passage populations relative to the naïve library (fitness), and mapped the fitness onto the protein sequence and structure. Using fitness, structural insights, and domain knowledge, we classified certain positions as mutational hotspots from which we generated site-saturation mutagenesis libraries of each position alone, or in combination. We then enriched these libraries using our HTS, as before, and identified additional variants that further enhanced activity. Next, we performed MD, including metadynamics, and QM/MM studies to characterize the catalytic mechanism of AvPAL*, and assess the functional impact that these mutations have on its catalytic activity. Finally, we used data from all investigations to devise more active AvPAL* variants.
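The fitness calculation described above (change in relative frequency of a mutation in an enriched passage relative to the naïve library) can be illustrated with a minimal sketch. The exact scoring formula is not given in this section, so the log2 ratio used here is an assumption, and the read counts and the helper names `relative_frequencies` and `fitness` are hypothetical.

```python
import math

def relative_frequencies(counts):
    """Convert raw read counts per mutation into relative frequencies."""
    total = sum(counts.values())
    return {mut: n / total for mut, n in counts.items()}

def fitness(naive_counts, enriched_counts):
    """Assumed definition: log2 of (enriched frequency / naive frequency)."""
    f_naive = relative_frequencies(naive_counts)
    f_enriched = relative_frequencies(enriched_counts)
    scores = {}
    for mut, f0 in f_naive.items():
        f1 = f_enriched.get(mut, 0.0)
        if f0 == 0 or f1 == 0:
            # Undefined when the mutation is absent from either population
            scores[mut] = None
        else:
            scores[mut] = math.log2(f1 / f0)
    return scores

# Hypothetical read counts (illustrative only, not data from this study)
naive = {"M222L": 50, "G218S": 40, "L108G": 60, "other": 850}
passage3 = {"M222L": 400, "G218S": 250, "L108G": 5, "other": 345}

scores = fitness(naive, passage3)
for mut, score in scores.items():
    print(mut, None if score is None else round(score, 2))
```

Under this assumed definition, a positive score marks a mutation that became more frequent during enrichment and a negative score marks one that was depleted.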

Deep mutational scanning (DMS) of AvPAL* and analysis of active site residues.

The naïve library contained approximately 2–4 amino acid mutations per gene, had a broad distribution of mutations (no major bias), and had an average of 5.6 (range 2–7) substitutions per residue (Fig. 2a&b). Comparatively, the enriched library from the third passage averaged only 0.6 (range 0–5) mutations per position, and also contained only 222 positions with at least one mutation (relative to the overall protein size of 565 amino acids). We found that all premature nonsense codons in the naïve library were rapidly depleted, and the library also shifted from majority non-synonymous to synonymous mutations during enrichment (Fig. S1).

Fig. 2: a) Mutation frequency in naïve and passages #1–3. Positions (a.a.) corresponding to the three most highly enriched positions are labeled (218, 222, and 453). b) Number of mutations sampled at each position in the naïve (dark blue) and passage #3 (red) libraries. The naïve library had an average of 5.6 mutations per position across all 565 positions. The passage #3 library had an average of 0.6 mutations per position across 222 positions. c) Fitness of all mutations present at all positions in the passage #3 library. Negative fitness denotes mutations that decrease in frequency over passages; the highest fitness mutations are labeled (G218A, G218S, M222L, I268T, D306G, G360C, and N453S). Position labels (x-axis) are the same for panels (a–c). From these data we identified 79 functionally relevant sites; of these, 7 positions (T102, G218, M222, I268, D306, G360, and N453) showed the maximum fitness gradient and are thus referred to as hotspots. d) Active site residues of AvPAL* (grey sticks) with phenylalanine ligand (yellow) docked. e) Fitness heatmap of active site residues at passage #3. Wildtype residues of AvPAL* are bordered in black and listed above the residue position. Grey boxes indicate mutations not sampled in the library.

To evaluate the relative increase of each mutant variant in the library during enrichment, we calculated a fitness score for each mutation (File S1). We found that the maximum fitness scores and growth rate generally increased across passages, and there was a good correlation between enzyme specific activity and fitness, indicating strong positive selection for PAL activity using our PAL enrichment method (Fig. S1–4). Overall, 93% of mutations in the library had negative fitness by the third passage, meaning that most mutations were decreasing in their relative proportion and thus resulted in deleterious variants (Fig. 2c). Notably, the active site residues – catalytic and substrate binding – were generally non-permissive to mutations (Fig. 2d, e), and our data are in good agreement with published literature. Y78, Y314, and the MIO-forming triad (A167, S168, and G169), which are implicated in essential catalytic roles³⁷ and are highly conserved in PALs, were found to be non-permissive to mutations. L104, F107, L108, and L171 in AvPAL* are part of the substrate binding pocket that interacts with the hydrophobic moiety of Phe. Among these, L104, L171, and L219 are three of the most highly conserved residues in tyrosine/phenylalanine ammonia lyases (TPALs) and mutases²³, and any substitution at these positions had negative fitness (Fig. 2e). The F107-equivalent position shows more diversity: PALs contain F¹⁴, TPALs contain the basic amino acids R¹⁵ or H⁸, and aminomutases carry the more polar amino acids C^{6, 38} or S³⁹. In PALs, F107 forms an edge-edge interaction with the phenyl ring of the substrate²⁸, and hence a mutation to Tyr is likely to be minimally disruptive, consistent with the neutral fitness of F107Y in our data. In an earlier study, the L108A and L108G mutations were shown to drastically reduce enzyme activity, suggesting that this position is not permissive to mutations²⁸.
However, we found L108Q and L108M to have positive fitness, suggesting the need for a large uncharged sidechain to fill the active site and maintain favorable interaction with the substrate. We confirmed L108Q and L108M to be more active, and L108G to be less active, than parental AvPAL* on phenylalanine (Fig. S5). Though L108M agrees with the requirement for a hydrophobic residue to maintain nonpolar contacts²⁸, L108Q is an unusual change to a polar residue at an otherwise conserved position. Generally, the L108-equivalent position is conserved in ammonia-lyases active on phenylalanine, and His is found in enzymes active on Tyr²³. On performing sequence analysis of AvPAL* homologs against the RefSeq protein database, we observed that, out of 998 sequences, Leu and His are each present at the 108-equivalent position in 490 sequences, together accounting for 98.2% of the sequences, and Ala, Lys, Met, Gln, or Thr is found in the remaining 18 sequences (Table S1). The M222 position shows greater permissivity both in the library and in its natural diversity among homologs. Of the same 998 homologs, 469 had Val and 396 had Met, whereas Ile, Asn, Thr, and Leu were found in 105, 15, 7, and 6 sequences, respectively. We found M222L and M222V to have higher fitness, and thus higher activity, compared to parental AvPAL*. This is consistent with a recent study on PcPAL, in which the bulkiness of the residues lining the active site was considered important³⁵. F363, K419, and E448 showed negative fitness for all substituted amino acids sampled and are conserved in PALs and TALs with available structures. I423 showed negative fitness for all mutations, including Thr, the equivalent of which in Petroselinum crispum PAL (PcPAL) (I460T) demonstrated a modest 1.15-fold increase in k_cat⁴⁰.
Thus, many of the outcomes of the analysis of DMS data related to active site residues are largely consistent with published data on PALs, supporting the validity of the workflow and calculated fitness scores as proxy for enzyme activity (Fig. S4). Further, we identified 4 distinct active site mutations at L108 and M222 that increase AvPAL* activity, only one of which has been previously reported (M222L, by our group)³⁶.
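As a quick arithmetic check of the homolog statistics quoted above, the short sketch below tallies the residue counts reported for the 108- and 222-equivalent positions and converts them to percentages. The counts are copied from the text; the variable and function names are illustrative only.

```python
# Residue counts across the 998 RefSeq homologs, as reported in the text.
TOTAL_HOMOLOGS = 998

pos108_counts = {"Leu": 490, "His": 490, "other (Ala, Lys, Met, Gln, Thr)": 18}
pos222_counts = {"Val": 469, "Met": 396, "Ile": 105, "Asn": 15, "Thr": 7, "Leu": 6}

def percentages(counts, total=TOTAL_HOMOLOGS):
    """Percentage of homologs carrying each residue, rounded to one decimal."""
    return {res: round(100 * n / total, 1) for res, n in counts.items()}

pct108 = percentages(pos108_counts)
pct222 = percentages(pos222_counts)

# Combined share of Leu and His at the 108-equivalent position
leu_his_pct = round(100 * (490 + 490) / TOTAL_HOMOLOGS, 1)

print(pct108)
print(pct222)
print(leu_his_pct)
```

Both sets of counts sum to 998, and Leu, the fittest substitution at position 222 in the screen, is the rarest of the six residues listed at that position among the homologs.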

Sequence-function characterization highlights hotspots that enhance activity.

To accurately identify “hotspots”, positions where mutations contribute most highly to enhance activity, we calculated a fitness gradient. The fitness gradient was determined by calculating the fitness of each mutation at each position for each passage (P#1, #2 and #3) relative to the Naïve library (P#1 versus Naïve, P#2 versus Naïve, P#3 versus Naïve), followed by linear regression over passages, thereby overcoming any subtle biases from a single timepoint (Fig. S6). We omitted the N-terminal residues from consideration because they were previously shown to be dispensable for AvPAL* activity²⁸, and also filtered the fitness scores to ignore any mutation with a frequency of zero in any passage or less than 0.625 % in passage #3, to better identify the most significant mutations. From the 225 positively fit mutations in P#3, we identified 12 positions (and 14 mutations) with the most positive fitness gradients (Fig. 3a). We then mapped the fitness onto the structure of AvPAL (PDB ID: 2NYN) with phenylalanine docked in the active site to investigate where the most fit positions are located relative to the active site and one another (Fig. 3b). Interestingly, we found that the most fit mutations were clustered at either end of the protein chain, with many quite distal from the active site. To investigate further, we calculated the distance of the α-carbon of each residue to the α-carbon of Phe docked in the active site of the same chain (Fig. 3c). We found that 7 of 12 of the fittest mutations were >50 Å from the docked Phe. Noting that AvPAL* is a homotetramer composed of dimers with chains oriented anti-parallel, we found that intramolecular distal residues were proximal to adjacent active sites (Fig. 3d). We see that positions 268, 294, 306, 400, and 494 from the B-chain are closest to the A-chain substrate Phe, as are the 407, 533 and 534 positions from the C-chain. 
In fact, previous investigations into the structure of AvPAL* have identified loops in the adjacent chains that play an important role in forming the active site pocket (Fig. 3c, f)²⁸. Interestingly, we also observed that the top 4 mutations (G218S/A, M222L, and N453S) were in highest abundance as single-site mutants (i.e., did not co-occur), and triple mutants were completely absent, suggesting a need to further investigate these hotspots combinatorially (Table S2).
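The fitness gradient described above (a linear regression of per-passage fitness over passages, after filtering out poorly sampled mutations) can be sketched as follows. This is a minimal illustration under stated assumptions: the regression is taken against passage number, the 0.625% threshold is applied to the passage #3 frequency as stated in the text, and all input values and function names are hypothetical.

```python
PASSAGES = [1, 2, 3]  # P#1, P#2, P#3, each scored relative to the naive library

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def fitness_gradient(fitness_by_passage, freq_by_passage, min_final_freq=0.00625):
    """Return the fitness gradient, or None if the mutation is filtered out.

    A mutation is ignored if its frequency is zero in any passage or below
    0.625% (0.00625) in passage #3.
    """
    if any(f == 0 for f in freq_by_passage):
        return None
    if freq_by_passage[-1] < min_final_freq:
        return None
    return slope(PASSAGES, fitness_by_passage)

# Hypothetical mutations: (fitness per passage, frequency per passage)
data = {
    "mutA": ([1.0, 2.0, 3.0], [0.01, 0.02, 0.04]),      # steadily enriched
    "mutB": ([0.5, 0.0, 1.0], [0.01, 0.0, 0.02]),       # zero frequency in P#2
    "mutC": ([0.2, 0.4, 0.6], [0.001, 0.002, 0.003]),   # too rare in P#3
}
gradients = {m: fitness_gradient(fit, freq) for m, (fit, freq) in data.items()}
print(gradients)
```

Using the slope across all three passages rather than a single endpoint is what reduces the influence of noise at any one timepoint.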

Combinatorial mutagenesis reveals optimal single and combined mutations at hotspots.

From our analyses, we selected 7 hotspot positions (102, 218, 222, 268, 306, 360, and 453) for further investigation (Fig. 3e). We generally classified these residues as comprising 2 regions: i) residues located within the loops surrounding the active site (268, 306, 360, and 453) (Fig. 3f), and ii) residues clustered within a bundle of α-helices that form a surface in the active site (102, 218, 222) (Fig. 3g). The 102, 218, 222, 360, and 453 positions most likely act on the intra-chain active site, while the 268 and 306 positions are from an adjacent chain (chain B, relative to the active site of chain A, and vice versa).

Though computational approaches such as GRAPE⁴¹, CompassR⁴², SNAP2⁴³, PROVEAN⁴⁴, and DeMask⁴⁵ can be used to inform the choice of sites for a combinatorial library, we observed that the trends in fitness scores from our study, which correlate well with the observed specific activity, did not agree well with some of the computational predictions (Fig. S7, Table S3). Hence, to identify improved variants, we generated combinatorial libraries (similar to synthetic shuffling⁴⁶ and CASTing⁴⁷) from the 7 hotspots using single (⁷C₁ – i.e., 7 choose 1), triple (⁷C₃ – 7 choose 3), and septuple (⁷C₇ – 7 choose 7) site saturation mutagenesis libraries. We constructed the three libraries in a manner that allowed us to cover as many of the combinations as practically possible. For instance, the ⁷C₃ library allows us to sample up to a maximum of three sites, whereas ⁷C₇ allows us to sample all seven sites (see Materials and Methods and Fig. S8 for more details). We then enriched these libraries using our HTS and calculated the fitness of each residue. Fig. 4a shows the fitness of single site saturation mutagenesis of all sites when passaged individually (⁷C₁). For all sites, we found substitutions that had higher fitness than the wildtype. For example, the T102 position shows the most permissive behavior, with seven amino acid substitutions showing positive fitness (A, S, P, E, R, K, and H). The other six positions display a more restrictive pattern, with only 3–4 amino acids showing positive fitness (Fig. 4a). In total we found 28 individual substitutions with higher fitness than the native residue at that position and 24 substitutions with positive fitness. Interestingly, only 2 of these 24 have previously been described to enhance AvPAL* activity (our previous work³⁶), the remainder being first described here. However, fewer mutations showed positive fitness scores when evaluated in combination using the ⁷C₃ and ⁷C₇ libraries (Fig. 4b, c).
This indicates that while many mutations contribute to positive fitness individually, most are not readily additive or synergistic in combination. This agrees with the observation that double mutants from a G218X-M222X library are less fit than single mutants at either of those positions (Fig. S9). Indeed, the G218S mutation, while highly fit in isolation, generally does not contribute significantly to fitness when combined with 1 to 6 other mutations (Fig. 4a–c, Fig. S9). Looking at the naturally occurring diversity at the seven sites investigated in the present study, we found both agreement with the natural diversity and numerous mutations with positive fitness that are not found naturally (Fig. 4d). For four of the seven positions (218, 222, 306, 360), we find that the natural diversity fully recapitulated the fittest variants found in our study. However, for the other sites (102, 268, 453), we were able to find novel mutations that are not predictable through sequence alignment alone. We also noticed that position 306 has very high natural diversity but is restricted to only two mutations with positive fitness (Glu, Gly). Conversely, position 453 is naturally restricted to only two amino acids (Asn, Glu), but many alternate mutations increase fitness.
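To give a sense of scale for the ⁷C₁, ⁷C₃, and ⁷C₇ notation used above, the sketch below counts the number of ways to choose k of the 7 hotspots and the theoretical number of protein variants in which exactly k chosen sites are substituted. It assumes 19 non-parental amino acids per substituted site and does not model the actual library construction, codon degeneracy, or sampling depth, which are described in the Methods.

```python
from math import comb

N_HOTSPOTS = 7            # positions 102, 218, 222, 268, 306, 360, 453
NON_PARENTAL_RESIDUES = 19  # assumption: 20 amino acids minus the parental one

def site_combinations(k):
    """Number of ways to choose k of the 7 hotspot positions (7 choose k)."""
    return comb(N_HOTSPOTS, k)

def variants_with_exactly_k_substitutions(k):
    """Theoretical protein variants with exactly k hotspot sites substituted."""
    return site_combinations(k) * NON_PARENTAL_RESIDUES ** k

for k in (1, 3, 7):
    print(k, site_combinations(k), variants_with_exactly_k_substitutions(k))
```

The steep growth in theoretical variant count from single to septuple substitution illustrates why a growth-coupled enrichment, rather than clone-by-clone screening, is needed to search these libraries.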

Fig. 4. Fitness heatmaps of a) ⁷C₁, b) ⁷C₃, and c) ⁷C₇ mutant libraries. Relative fitness is shown as two gradients, from highest fitness (red) to zero fitness (white) and from zero fitness to lowest fitness (dark blue). Relative fitness for the ⁷C₁ library was calculated at each position individually, and at all positions in combination for ⁷C₃ and ⁷C₇. d) Sequence comparison to the 100 proteins with greatest homology to AvPAL*. Black cells are the 7 hotspot positions, numbered for AvPAL*. Yellow cells indicate positively fit mutations found during our enrichments that were unique relative to the homologous proteins. A green gradient was applied to the natural residues to indicate the frequency of each residue at each position, with dark green as the most frequent and light green as infrequent.

To identify the most active variants from the ⁷C₃ and ⁷C₇ libraries, we picked 20 random colonies from each enriched culture, tested them for growth and tCA production, and sequenced the top 14 (Fig. S10–11, Table S4). From this subset, we identified new variants with improved kinetic parameters (Table 1). Among these, T102E-M222L and T102R-M222L-D306G displayed 2.5- and 2.3-fold improvements in k_cat, respectively (k_cat of 2.43 and 2.25 s⁻¹ versus 0.97 s⁻¹ for AvPAL*). Interestingly, T102R-M222L-D306G showed some substrate inhibition (Fig. S12, apparent at > 10 mM Phe), although it did display the highest activity at lower substrate concentrations (< 300 μM), which are the most relevant for PKU treatment. Beyond having isolated the most active AvPAL* variants, these results have two additional implications. First, only a small number of combinations of mutations at these sites synergistically and/or additively enhance AvPAL* activity (Fig. 4c). Second, the propensity of these mutations to act additively and/or synergistically may be explained by the mechanism by which they contribute to PAL activity. To understand the mechanisms by which the different mutations may contribute to increased activity, we performed in silico modelling studies.

Table 1.

Kinetic constants of highest activity AvPAL* variants.

PAL variant	Model	V_max (μmol·min⁻¹·mg⁻¹)	K_M (μM)	K_i (mM)	k_cat (s⁻¹)	k_cat/K_M (s⁻¹·μM⁻¹)	Fold increase in k_cat

AvPAL*	MM	0.93 ± 0.01	137 ± 0	-	0.97	0.007	1.00
M222L	MM	1.85 ± 0.02	145 ± 7	-	1.93	0.013	1.99
L4P-G218S	SI	1.86 ± 0.02	199 ± 9	208 ± 39	1.94	0.010	2.00
G218S	SI	1.49 ± 0.02	180 ± 9	164 ± 26	1.55	0.009	1.60
G218A	SI	1.01 ± 0.02	43 ± 4	96 ± 18	1.05	0.024	1.09
T102P	MM	1.88 ± 0.02	253 ± 20	-	1.96	0.008	2.02
T102E-M222L	MM	2.33 ± 0.02	144 ± 6	-	2.43	0.017	2.51
T102S-M222L	SI	1.88 ± 0.02	111 ± 6	371 ± 136	1.96	0.018	2.02
T102M-M222L-D306G-N453G	SI	1.15 ± 0.01	48 ± 3	147 ± 24	1.20	0.025	1.24
T102R-M222L-D306G	SI	2.16 ± 0.02	96 ± 4	169 ± 21	2.25	0.023	2.32


MM – Michaelis-Menten (Eqn 2); SI – Substrate inhibition (Eqn 3)
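The derived columns of Table 1 and the difference between the two kinetic models can be illustrated with a short sketch. Eqn 2 and Eqn 3 are not reproduced in this section, so the standard textbook forms of the Michaelis-Menten and uncompetitive substrate-inhibition rate laws are assumed here; the parameter values are taken from Table 1, and the function names are illustrative.

```python
def michaelis_menten(s_uM, vmax, km_uM):
    """Assumed form of Eqn 2: v = Vmax*S / (KM + S)."""
    return vmax * s_uM / (km_uM + s_uM)

def substrate_inhibition(s_uM, vmax, km_uM, ki_mM):
    """Assumed form of Eqn 3: v = Vmax*S / (KM + S*(1 + S/Ki))."""
    ki_uM = ki_mM * 1000.0  # Table 1 lists K_i in mM and K_M in uM
    return vmax * s_uM / (km_uM + s_uM * (1 + s_uM / ki_uM))

# Derived columns of Table 1, recomputed from k_cat (s^-1) and K_M (uM)
KCAT_PARENT = 0.97
efficiency_double = 2.43 / 144             # T102E-M222L, k_cat/K_M
fold_double = round(2.43 / KCAT_PARENT, 2)  # fold increase in k_cat
fold_triple = round(2.25 / KCAT_PARENT, 2)  # T102R-M222L-D306G

# Predicted rates (umol/min/mg) at a low and a high Phe concentration
low, high = 100.0, 30000.0  # uM
v_double_low = michaelis_menten(low, 2.33, 144)
v_triple_low = substrate_inhibition(low, 2.16, 96, 169)
v_double_high = michaelis_menten(high, 2.33, 144)
v_triple_high = substrate_inhibition(high, 2.16, 96, 169)

print(fold_double, fold_triple, round(efficiency_double, 3))
print(round(v_double_low, 2), round(v_triple_low, 2))
print(round(v_double_high, 2), round(v_triple_high, 2))
```

With these assumed rate laws, the triple mutant is predicted to be faster than the double mutant at 100 μM Phe but slower at 30 mM Phe, in line with the substrate inhibition reported for T102R-M222L-D306G.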

MD studies reveal mutants with local fluctuations in the active site impact the near attack conformation.

Having identified mutations that enhance AvPAL* activity, we were interested in understanding the mechanisms contributing to increased activity. We conducted extensive all-atom MD studies of different AvPAL* mutants to ascertain the stability of the near-attack conformation. The starting enzyme-substrate (E-S) complex was derived using docking studies; each MD simulation was 500 ns long and run in duplicate (500 ns × 2), and all post-simulation analyses were conducted from the 100th ns onwards. AvPAL* is a homotetramer of closely interlocking monomers^{1, 14, 28}. Each tetramer contains four catalytic sites, and each active site comprises residues from three different monomers and one MIO group. Each active site is capped by two flexible loops: an inner loop, which is packed tightly in the active site and forms interactions with the substrate, and an outer loop, which serves as an external cap. The outer loop is thought to form a barrier to bulk solvent, preventing access to the active site²⁸. Reaction mechanisms of PALs have been extensively studied and involve the formation of the N-MIO intermediate (Fig. Sf)^{3, 8}. Here, we focused on the formation of the first intermediate state of the substrate in the active site (before covalent binding with MIO, Fig. S13a–b). During this step, Y314 functions as the catalytic base.

As the binding conformation of the substrate has not yet been identified in any of the PAL crystal structures, we docked Phe into the AvPAL* structure using Autodock4 and then chose an energetically and structurally feasible conformation for further studies. We observed the binding energy to be −3.03 kcal mol⁻¹ for the E-S complex. Maintenance of close proximity between the substrate and catalytic residues over the period of the MD simulation is indicative of a stable E-S complex and formation of a near-attack conformation. We measured the distances from the substrate amino nitrogen (Phe(N)) to the MIO methylidene carbon (MIO(Cβ2)) and to the adjacent-chain tyrosine 314 hydroxyl oxygen (Y314(O)) for the variants (Fig. S14). We used L108G as a control for our modeling studies as it has previously been validated as a deleterious mutation²⁸. The fitter variants M222L and G218S, but not N453S, show an improved near-attack conformation, as indicated by closer proximity to MIO when compared to the controls, parental AvPAL* and the low-activity variant L108G (Fig. 5a–d, Table S5). The double and triple mutants, T102E-M222L and T102R-M222L-D306G, also interact closely with the reactive MIO(Cβ2) compared to controls (Fig. 5e–f). Next, we calculated the root mean square fluctuation (RMSF) of the mutant backbone atoms and normalized it to that of parental AvPAL* to identify the flexible regions. Higher values are characteristic of regions that are more flexible than in the parental enzyme, i.e., that are more readily displaced from their average position during simulation, and negative values indicate the opposite. Four flexible regions in the high-activity single mutants show altered dynamics compared to parental: residues 65–110, 290–325, and 400–430 (Fig. 5g, S14). These regions are located around the active site and the interface of the dimeric subunits. The loop residues 290–325 constitute an access channel of the enzyme, and residues 310–318 form the second shell of the active site. Region 65–110 interacts with the phenyl group of the substrate.
From Fig. 5g, we concluded that the dynamics of AvPAL* changed with every mutation and that the negative control (L108G) showed very high fluctuation in the access channel (290–325) and second shell, which likely disrupts productive substrate interactions. In addition, we evaluated the free energy surface (FES) obtained from different sets of metadynamics simulations, which provides an understanding of the energy basins within the active site and the regions around it. These data further support our assertion that many of the beneficial mutations stabilize interactions between the substrate and active site (Fig. S15). Overall, our MD simulations suggest that for four of the five higher-activity variants studied (M222L, G218S, T102E-M222L, and T102R-M222L-D306G), the substrate more readily approaches the catalytic site, forming a stable near-attack conformation. However, for N453S, our analysis revealed behavior largely unchanged from parental, suggesting that its mode of action may not involve direct modulation of interactions between substrate and catalytic residues.
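The RMSF comparison described above can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline used in the study: it assumes trajectories have already been aligned to a common reference and trimmed to the analysis window, it assumes that "normalized" means the per-atom difference between mutant and parental RMSF (which is consistent with the mention of negative values), and the toy coordinates are hypothetical.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as frames of (x, y, z) tuples."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Average position of atom i over all frames
        mean = [sum(frame[i][d] for frame in trajectory) / n_frames for d in range(3)]
        # Mean squared displacement of atom i from its average position
        msd = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values

def rmsf_difference(mutant_traj, parent_traj):
    """Mutant RMSF minus parental RMSF, per atom (assumed normalization)."""
    return [m - p for m, p in zip(rmsf(mutant_traj), rmsf(parent_traj))]

# Toy two-atom, two-frame trajectories (hypothetical coordinates, in angstrom)
parent = [
    [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]
mutant = [
    [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(0.0, 0.0, 0.0), (5.0, 5.0, 9.0)],
]
diff = rmsf_difference(mutant, parent)
print(diff)
```

In this toy case the first atom is more rigid in the mutant (negative difference) and the second is more mobile (positive difference), which is how the boxed regions in an RMSF difference plot would be read.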

Figure 5. Scatter plots of the distances calculated between Y314(O)–Phe(N) and MIO(Cβ2)–Phe(N) atoms in parental AvPAL* (black), a) G218S (green), b) M222L (red), c) N453S (pink), d) L108G (blue), e) T102E-M222L (purple), and f) T102R-M222L-D306G (khaki). The scatter plots shown here are the average of two runs. g) RMSF (root mean square fluctuation) plot of protein backbone atoms (carboxylate, Cα, amine). The RMSF values of mutants are normalized to that of the parental enzyme so that only major movements are amplified. All variants are plotted on the same scale. Regions with high deviation from the parent are boxed.

Steered molecular dynamics (SMD) studies show steady and seamless diffusion of Phe in mutant N453S.

Due to the location of N453S at the periphery of the active site (~12 Å), we suspected it could impact Phe diffusion. We therefore performed SMD of AvPAL* and the mutant for comparison. One of the near-attack conformations was chosen to simulate the egress path taken by Phe from the active site. Subsequently, we used the same path as a shadow to simulate Phe reassociation. With the primary force constant, both egress and reassociation of Phe favored substrate diffusion in N453S compared to AvPAL* (Fig. 6, Movie S1–S2); details are given in the supplementary section (Fig. S16). We conducted umbrella sampling as an extension of the SMD studies to estimate the energetics of Phe translocation along the path. A series of configurations, or reaction coordinates, along the path were chosen from the SMD studies and constructed based on the distance between the center of mass (COM) of MIO and that of Phe. The path was discretized into multiple windows, one for every 0.5 Å of Phe movement from the active site until it reached the periphery of the protein. The umbrella sampling studies on Phe translocation shed light on mutation N453S and the residues along the path that are responsible for stabilizing and anchoring the substrate as it enters the active site. The potential of mean force (PMF) graph shows that N453S has two minima, which were not observed in AvPAL*, at distances of ~5–6 Å and ~10–11 Å between the COM of MIO and that of Phe (Fig. 6a). The conformational changes of Phe were extracted from umbrella sampling and mapped onto the protein for AvPAL* and N453S (Fig. 6b–c). For AvPAL*, the path is narrow towards the active site, leading to slightly constrained and energetically less favorable entry (Fig. 6b).
The path for N453S is wider and more energetically favorable, especially in a region close to the active site where the substrate shows well-organized conformations projecting the amino group towards the positively charged residues (Fig. 6f). Due to this, the phenyl group of Phe likely enters the active site and forms a precise Michaelis complex (Fig. S17).
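The discretization of the translocation path into umbrella sampling windows can be sketched as follows. Only the 0.5 Å spacing along the MIO–Phe center-of-mass distance comes from the text; the start and end distances, the force constant, and the harmonic form of the restraint are hypothetical choices made for illustration.

```python
def window_centers(start, stop, spacing=0.5):
    """Evenly spaced restraint centers (in angstrom) from start to stop."""
    n_steps = int(round((stop - start) / spacing))
    return [start + i * spacing for i in range(n_steps + 1)]

def harmonic_bias(distance, center, force_constant):
    """Harmonic umbrella potential: 0.5 * k * (distance - center)^2."""
    return 0.5 * force_constant * (distance - center) ** 2

# Hypothetical path from 3 to 15 angstrom COM-COM distance
centers = window_centers(3.0, 15.0)

# Bias felt by a configuration at 5.5 angstrom in the window centered at 5.0,
# with a hypothetical force constant of 10 (energy units per angstrom^2)
bias = harmonic_bias(5.5, 5.0, 10.0)

print(len(centers), centers[0], centers[-1], bias)
```

Each window restrains the substrate near one center so that neighboring windows overlap in sampled distance, which is what allows the per-window histograms to be combined into a continuous PMF.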

Figure 6. a) The conformational transition of the substrate along the PMF profile, b) extracted from the parental and c) mutant N453S. The peripheral regions of the substrate entry path are highlighted as R1, R2, R3, and R4. They are composed of residues 397–403 and 308–315 of chain C and residues 83–94 and 446–455 of chain A, respectively. The region marked as (*) in (a) is the free energy dip that facilitates substrate entry in N453S. d) Binding free energy calculations showed higher affinity of Phe for mutant N453S. e, f) E-S complexes extracted from free energy calculations with the lowest binding energies for parental and mutant (dotted box in (d)) reveal that the substrate is stabilized by salt bridges and hydrogen bonds in N453S and only by hydrogen bonds in the parent. Green lines show hydrogen bond interactions, orange lines are salt bridge interactions, and light pink lines are hydrophobic interactions.

In addition, we calculated the binding energy of the E-S complex from the region of the PMF profile that differed between AvPAL* and N453S, denoted by an asterisk (*) (Fig. 6a). The binding energy for N453S improved by ~35 kcal mol⁻¹ when compared to AvPAL* (Fig. 6d). The conformations of the substrate and its interacting residues were extracted from the region denoted by the asterisk in Fig. 6b–c, which represents a low-energy region for both AvPAL* and N453S. In AvPAL*, Phe was found to interact with Y78 and the backbones of Q311 and G85, whereas in N453S, Phe showed interactions with A88, R313, and R317 (Fig. 6e–f). In N453S, A88 was observed to form a π-alkyl interaction, and R317 a π-cation interaction, with the phenyl ring of Phe. Among electrostatic interactions, salt bridges play a major role in stabilizing and anchoring Phe. R313 and R317 form salt bridges with the carboxyl group, and N451 interacts with the amino group, of Phe. In AvPAL*, the Phe amine interacts with the backbone of Q311, and the carboxyl group forms hydrogen bonds with the backbone nitrogen of G85 and the hydroxyl group of Y78 (Fig. 6e). We did not observe any interactions with the phenyl ring that could slow movement of the substrate in either case.

Because N453S showed improved access of the substrate Phe to the active site, we hypothesized that combining it with our other mutants might further improve enzyme activity. To investigate this, we constructed N453S mutants of the M222L, L4P-G218S, T102E-M222L, and T102R-M222L-D306G variants. We characterized these new variants by evaluating the kinetics of the purified enzymes (Fig. 7a–e, Table S6) and by determining the whole-cell conversion of Phe to tCA (Fig. 7f) and the growth rate (Table S7) of strains expressing them. We found that all the active N453S-containing variants displayed kinetic parameters similar to those of parental AvPAL* (Fig. 7a–e, Table S6), even though their parental counterparts had >2-fold higher activity. Further, the presence of N453S in the G218S and T102R-M222L-D306G backgrounds completely abolished activity (Fig. 7c, e). The reduced v_max for all new variants was surprising because all the active N453S combinations displayed improved whole-cell tCA conversion compared to their parental counterparts (Fig. 7f). In fact, T102E-M222L-N453S gave >6-fold higher conversion of Phe to tCA than AvPAL* (Fig. 7f). Since the SMD studies suggested more favorable substrate ingress in N453S, we hypothesized that Phe may more readily displace tCA from the active site, reducing product inhibition, a known issue with PALs⁴⁸–⁵⁰, which may explain the higher fitness in the cellular context. To test this, we determined the activity of AvPAL*, N453S, T102E-M222L, and T102E-M222L-N453S in the presence and absence of 150 μM tCA (Fig. S18). We found that in both cases, the N453S variants were less inhibited by tCA. This illuminates the significance of product inhibition in applications where PAL is encapsulated and tCA concentrations are likely to build to higher levels. The T102E-M222L-N453S variant, which exhibits 6-fold better whole-cell conversion of Phe to tCA, might be a good candidate for these applications (e.g., probiotic therapy for PKU).

Figure 7. — **a–e)** Michaelis-Menten plots of AvPAL* and highly active variants combined with N453S. L4P-G218S-N453S and T102R-M222L-D306G-N453S did not exhibit activity at any of the Phe concentrations tested. Kinetic parameters are listed in Table S3. f) Whole cell conversion assay indicates that active enzymes are more active when encapsulated in *E. coli* cells even though they do not display superior kinetic parameters.

QM/MM reveals stabilization of the transition state in the hyperactive mutants.

Although two reaction mechanisms have been proposed for PALs, proceeding either through a Friedel-Crafts (FC)-like intermediate⁵¹ or an N-MIO adduct⁸, there is increasing support for the latter³. To investigate, we simulated the N-MIO adduct reaction mechanism, which involves formation of a near-attack conformation in which Phe is oriented suitably for proton abstraction by the hydroxyl group of Y314 (Fig. S13a–b). This deprotonation forms the nucleophilic amino moiety, activating Phe for interaction with the electrophilic MIO⁸. These rearrangements are referred to as the first step. The starting point was the E-S complex derived from well-equilibrated MD simulations (1 μs simulations) that showed the least distance between the reactive groups, i.e., Y314(O)–Phe(N) and MIO(Cβ2)–Phe(N). To obtain this least distance for the QM/MM simulation, we scanned the MD simulation and this E-S coordinate (MD-ES, Table S8) for the distance between the combined COM of MIO and Y314 and that of the amino group of the substrate.

All quantum chemical calculations in this study were performed to delineate the reaction pathway and free-energy barrier for the first step of proton abstraction and the formation of intermediate state 1 (IS1), where the substrate amine forms an attack conformation with MIO(Cβ2) (Fig. 8). The first step of the investigation is based on the generally proposed mechanism, as shown in Fig. S19. We obtained the detailed reaction mechanism and the optimized structures of transition states and intermediates, which are shown in Fig. 8a–c; the calculated overall relative free energy graph is given in Fig. 8d. To understand the mechanism, we used the new functionality in NAMD that can execute multiple QM regions in parallel⁵². All four active sites of PAL were treated with the QM code simultaneously, while the rest of the protein was treated with the MM force field (Fig. S20). The E-S complex is defined as the zero-point ground state (GS, 0 kcal mol⁻¹), to which all other energies are compared for each mutation. To define the attack conformation from the Michaelis complex, we conducted QM/MM using PM7, including active site residues Y314, Q452, R317, MIO, and Phe, until the distance between Y314(O) and the substrate amine approached ~1.5 Å (Table S9). The distance we observed is similar to that derived by QM/MM on TAL from Rhodobacter capsulatus⁵³ and the crystal structure of PcPAL³. This was used as the starting reaction coordinate for transition state (TS) optimization using higher-level QM/MM simulations based on B3LYP def2-SVP D4 TS. TS and IS were studied until the substrate H⁺ was abstracted by Y314, and then until the Phe moved to an optimum attack conformation with MIO. We observed the following sequence of events in the first step of the reaction in parental AvPAL*.
After the near-attack conformation was formed, the bond between Y314(O) and the abstracted H⁺ was stretched from 1.04 Å to 1.21 Å and, at the same time, the carboxyl oxygen of the Phe abstracts the H⁺ from Y314. The first transition state (TS1) is formed when the H⁺ from the amino nitrogen is stretched to 1.28 Å, before being completely extracted by Y314, followed by a small conformational change in the Phe that brings it closer to the MIO to form IS1. For all tested mutations, the same events were observed for the first step, but the mutations showed different energies for TS1 and IS1. The energies for both TS1 and IS1 correlated with the experimental data and can be graded as L108G < AvPAL* < M222L < G218S < T102E-M222L < T102R-M222L-D306G, suggesting that formation of IS1 through TS1 may be rate-limiting in AvPAL* but shifts to a downstream step in all mutants other than N453S (Fig. S21). The conformational changes observed during the formation of IS1 are described in the supplementary information and Fig. S22.

Figure 8. — Proposed reaction mechanism. **a-c)** Transition states for the first step of the reaction, where proton transfer takes place, with Phe (pink), MIO (yellow), catalytic Y314 (blue), and R317 (green), and other protein residues (light grey) as stick models. Only polar hydrogens are shown for clarity. All distances are in Å. d) Relative energy landscape for AvPAL* (black), M222L (red), G218S (blue), L108G (green), T102E-M222L (pink), and T102R-M222L-D306G (cyan) for all the steps from ground state (GS) to IS1. Energies along paths are not to scale. Relative energy values for TS1 and IS1 from three independent runs are also in Table S10.

Next, to understand the barrier-crossing events shown in the QM/MM simulations in depth, we combined hybrid QM/MM with metadynamics, which enhances the sampling of coordinates relevant to the reaction. In this way, we can observe how the system crosses the reaction barriers by itself and escapes from local minima (Fig. S23). We further characterized the TS1 and IS1 structures by transition path sampling (TPS) simulations, and the results were plotted over the free energy surface (FES). The FES and TPS derived from QM/MM metadynamics clearly differentiated the mutants from AvPAL* and showed thermodynamically favorable energy paths for T102E-M222L and T102R-M222L-D306G (details in supplemental information, Fig. S23).

CONCLUSIONS.

In summary, we report an approach where insight gleaned from DMS can guide further protein engineering while also providing a starting point for fundamental studies that elucidate limitations to the catalytic cycle. We also provide the most extensive sequence-function analysis of an MIO-containing enzyme, AvPAL*, and created variants with improved activity. By performing computational studies (QM/MM, MD, steered MD + QM/MM), we identified the mechanisms through which the mutations enhanced enzyme activity, which, in turn, allowed us to identify variants that have promising applications in cell-free biocatalysis (T102E-M222L, T102R-M222L-D306G) and cell-based systems (T102E-M222L-N453S). Not only does this significantly advance the enzymology and engineering of PALs, but it also demonstrates the power of using DMS to guide basic and applied enzymology.

METHODS.

Strains and general techniques for DNA manipulation.

The AvPAL* error-prone library, enriched as described previously, was used in the current study³⁶. PCR was performed using Phusion DNA polymerase or Platinum™ SuperFi II Green PCR Master Mix (ThermoFisher Scientific). E. coli NEB5α (New England Biolabs) was used for plasmid propagation, and E. coli MG1655 rph+ was used for screening of libraries and purification of recombinant AvPAL* and its mutants. Sequences of constructed plasmids were confirmed through DNA sequencing (Genewiz). AvPAL* was expressed under the constitutive T5 promoter from plasmid pBAV1k carrying chloramphenicol resistance.

Next generation sequencing (NGS) of library and data processing.

Plasmid libraries and PCR products were outsourced to Genewiz (New Jersey, USA) for sequencing on the Illumina MiSeq Nextera paired-end sequencing platform (2 × 250 bp). For all the samples sequenced, we received 1–5 million reads with an average length of 160 bp after trimming. The bioinformatic workflow is depicted in Fig. S24. Briefly, the raw FASTQ files were evaluated for quality score, read length, and adaptor and duplicate read content using the FastQC package. Subsequent analysis was performed using Geneious Prime® 2020.2.4. The reads were paired and merged using the BBMerge package⁵⁴ and filtered for adaptor sequences and short and poor-quality reads using the BBDuk package. The reads were then mapped onto the reference gene (AvPAL*) using the Bowtie2 package⁵⁵. The mapped reads were then analyzed for single nucleotide variants to detect mutations. This variant call file was used to calculate the fitness score using Eqn 1.

$$L_v = \ln\frac{C_{v,\mathrm{sel}} + 0.5}{C_{\mathrm{total},\mathrm{sel}} + 0.5} - \ln\frac{C_{v,\mathrm{inp}} + 0.5}{C_{\mathrm{total},\mathrm{inp}} + 0.5}$$

Equation 1

The fitness values thus determined were represented using heatmap to show the residues with positive and negative fitness.
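As an illustration of Equation 1, the following minimal Python sketch computes the fitness score from read counts. It assumes that C_v and C_total denote the read count of variant v and the total read count, and that "sel" and "inp" refer to the selected and input libraries; these readings, the function name, and the example counts are ours and are not taken from the original analysis pipeline.

```python
import math

def fitness_score(c_v_sel, c_total_sel, c_v_inp, c_total_inp, pseudo=0.5):
    """Log-enrichment L_v of a variant, following the form of Equation 1.

    The pseudocount (0.5) keeps the logarithm finite when a variant
    has zero reads in one of the libraries.
    """
    selected = math.log((c_v_sel + pseudo) / (c_total_sel + pseudo))
    unselected = math.log((c_v_inp + pseudo) / (c_total_inp + pseudo))
    return selected - unselected

# Hypothetical read counts, for illustration only.
counts = {
    "enriched": dict(c_v_sel=400, c_total_sel=100_000, c_v_inp=100, c_total_inp=100_000),
    "depleted": dict(c_v_sel=20, c_total_sel=100_000, c_v_inp=100, c_total_inp=100_000),
    "lost": dict(c_v_sel=0, c_total_sel=100_000, c_v_inp=100, c_total_inp=100_000),
}
scores = {name: fitness_score(**c) for name, c in counts.items()}

for name, score in scores.items():
    print(f"{name}: L_v = {score:+.3f}")
```

Variants with a positive score are enriched after selection and those with a negative score are depleted, which is the distinction the heatmap is meant to convey.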

Physical linking of distal mutations for amplicon sequencing.

The workflow followed for physical linkage of G218, M222, and N453 is indicated in Fig. S25. Briefly, the region immediately downstream of M222 and immediately upstream of N453 was amplified with primers carrying a homologous region. The amplicon flanked by the homologous region was sealed using NEB HiFi assembly. The circular plasmid was then used as a template to amplify the ~300 bp region spanning G218–N453. The segment was amplified using primers with Illumina sequencing overhangs. The amplicon was sequenced using the AmpliconEZ Seq Illumina platform at Genewiz.

Construction of site saturation mutagenesis (SSM) libraries.

The pBAV1k plasmid containing AvPAL* was used as the template for constructing the site saturation libraries. The SSM libraries for seven sites of interest were constructed in three ways. i) Individual-site libraries using the NNS codon at the target location were constructed following a QuikChange-like method. Briefly, partially overlapping primers were used to perform inverse PCR, and the amplicon was subjected to DpnI (NEB) digestion to remove the parental plasmid, followed by NEB HiFi assembly (NEB) to assemble and seal the overlapping ends for improved transformation efficiency. The assembled product was purified using a PCR clean-up kit and electroporated into MG1655 rph+. ii) A new approach of scaling by mutation was developed to mutate three sites in varying combinations (Fig. S8). In this approach, the clean-up product from approach i) was pooled in equimolar amounts and used as the template for a second round of inverse PCR using the seven primer pairs individually. This process was repeated a total of three times to generate the ⁷C₃ SSM library, which was transformed into E. cloni DH10B (Lucigen) to achieve a large library size. iii) The third library was constructed by using a restricted codon at the seven sites of interest (⁷C₇). The restricted codon was chosen based on the DMS data from the error-prone PCR library screen (Table S11). The fragments were assembled using NEB HiFi assembly and electroporated into E. cloni DH10B after PCR clean-up. The plasmid libraries from approaches ii) and iii) were isolated from E. cloni and transformed into E. coli MG1655 rph+ for enrichment on minimal media containing 30 mM Phe. Fitness data for these libraries were obtained by sequencing the seven sites of interest using AmpliconEZ seq (Genewiz). The data were processed as described above.
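To give a sense of the scale of these libraries, the short Python sketch below counts site combinations and codon-level variants for NNS randomization. It is a rough upper bound under our own assumptions: every randomized site is counted as carrying any of the 32 NNS codons (including the parental codon and the one NNS stop codon), and the ⁷C₃ library is treated as containing only exact three-site combinations. The actual library composition was not derived this way in the original work.

```python
from math import comb

# NNS: N = A/C/G/T at the first two positions, S = C/G at the third.
NNS_CODONS = 4 * 4 * 2  # 32 codons covering all 20 amino acids

def nns_library_size(total_sites, mutated_sites):
    """Return (site combinations, codon-level variants) for NNS libraries."""
    combinations = comb(total_sites, mutated_sites)
    variants = combinations * NNS_CODONS ** mutated_sites
    return combinations, variants

single_site = nns_library_size(7, 1)  # approach i): one site at a time
triple_site = nns_library_size(7, 3)  # approach ii): three of seven sites

print("single-site:", single_site)
print("three-site :", triple_site)
```

Under these assumptions the three-site library spans 35 site combinations and roughly 1.1 million codon-level variants, which is consistent with the need for a high-efficiency cloning strain for this library.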

Enzyme assay, purification, and kinetic characterization.

PAL activity was monitored by measuring the production of tCA at 290 nm over time. Briefly, a 200 μL reaction was performed by adding 1 μg of purified enzyme to pre-warmed PBS containing 30 mM Phe. The assay was performed in a 96-well F-bottom UVStar (Greiner Bio-One, Kremsmünster, Austria) microtiter plate, and absorbance at 290 nm was measured every 15 s at 37 °C using a SpectraMax M3 (Molecular Devices) plate reader.
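A minimal Python sketch of how an initial rate could be extracted from such an absorbance trace is shown below. It fits a straight line to the early readings and converts the slope to a concentration rate with the Beer–Lambert law. The extinction coefficient and path length are placeholder values that we chose for illustration, and the trace itself is synthetic; none of these numbers come from the original work.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def initial_rate_uM_per_s(times_s, a290, epsilon_M_cm, path_cm):
    """Convert the A290 slope (absorbance/s) to a tCA formation rate (uM/s)."""
    slope = least_squares_slope(times_s, a290)
    return slope / (epsilon_M_cm * path_cm) * 1e6

# Readings every 15 s, as in the assay; the trace is a hypothetical linear one.
times_s = [15 * i for i in range(8)]
a290 = [0.10 + 0.002 * t for t in times_s]

EPSILON_TCA = 1.0e4  # M^-1 cm^-1, placeholder value (assumption)
PATH_LENGTH = 0.5    # cm, placeholder for a 200 uL well (assumption)

rate = initial_rate_uM_per_s(times_s, a290, EPSILON_TCA, PATH_LENGTH)
print(f"initial rate = {rate:.3f} uM tCA per second")
```

In practice only the linear portion of the trace should be used, and the extinction coefficient and effective path length need to be determined for the specific plate and buffer.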

For purification, the enzyme was isolated from a 25 mL culture. The pellet was washed once with PBS and resuspended in 500 μL PBS. This cell suspension was sonicated on ice using a Sonifier SFX 150 (Branson Ultrasonics, Danbury, CT) (10 s ON; 1 min OFF; 2 min; 40 %), and cell debris was separated from the lysate by centrifuging at 20,000 × g for 10 min at 4 °C. As each construct included an N-terminal His-tag, the enzyme was purified via immobilized metal affinity chromatography (IMAC). Briefly, the lysate was loaded onto HisPur™ Ni-NTA Spin Plates (ThermoFisher Scientific) and incubated for 2 min. After four washes with equilibration buffer, pure protein was eluted using 200 μL of elution buffer (300 mM NaCl, 50 mM NaH₂PO₄, 500 mM imidazole, pH 8.0). Elution fractions were then dialyzed using Tube-O-Dialyser tubes (1 kDa MWCO, Geno-Tech). Protein concentration was estimated by Bradford reagent (VWR) using bovine serum albumin (BSA) as the standard.

For kinetic analysis, AvPAL* and selected mutants were purified and assayed as described above. The activity was measured at twelve concentrations of Phe ranging from 15 μM to 30 mM in PBS (pH 7.4) at 37 °C. The initial velocities at different phenylalanine concentrations for AvPAL* and the mutants were analyzed using both Michaelis–Menten (Eqn S2) and substrate-inhibition (Eqn S3) models by nonlinear least-squares regression in GraphPad Prism (v9). The model with the better fit and R² was chosen as the preferred one.
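The model comparison described above can be sketched in plain Python as follows. This is not the GraphPad Prism procedure: we assume the standard textbook forms v = Vmax·S/(Km + S) and v = Vmax·S/(Km + S(1 + S/Ki)) for Eqns S2 and S3 (which are not reproduced here), and we swap in a simple least-squares search that scans Km (and Ki) on a logarithmic grid while solving Vmax in closed form. The log-spaced concentrations, parameter values, and velocities are synthetic and purely illustrative.

```python
from itertools import product

def mm_shape(s, km):
    """Michaelis-Menten rate divided by Vmax."""
    return s / (km + s)

def si_shape(s, km, ki):
    """Substrate-inhibition rate divided by Vmax."""
    return s / (km + s * (1.0 + s / ki))

def log_grid(lo, hi, n):
    """n logarithmically spaced values from lo to hi (inclusive)."""
    return [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]

def fit(shape_fn, grids, s, v):
    """Least squares: grid over nonlinear parameters, closed-form Vmax."""
    mean_v = sum(v) / len(v)
    ss_tot = sum((y - mean_v) ** 2 for y in v)
    best = None
    for params in product(*grids):
        shape = [shape_fn(x, *params) for x in s]
        vmax = sum(f * y for f, y in zip(shape, v)) / sum(f * f for f in shape)
        ss_res = sum((vmax * f - y) ** 2 for f, y in zip(shape, v))
        if best is None or ss_res < best["ss_res"]:
            best = {"vmax": vmax, "params": params,
                    "ss_res": ss_res, "r2": 1.0 - ss_res / ss_tot}
    return best

# Twelve Phe concentrations from 15 uM to 30 mM, expressed in mM.
phe_mM = log_grid(0.015, 30.0, 12)
# Synthetic noise-free velocities from hypothetical parameters.
true_vmax, true_km = 2.0, 0.1
v_obs = [true_vmax * mm_shape(s, true_km) for s in phe_mM]

km_grid = log_grid(1e-3, 1e2, 101)
ki_grid = log_grid(1e-1, 1e4, 51)
mm_fit = fit(mm_shape, [km_grid], phe_mM, v_obs)
si_fit = fit(si_shape, [km_grid, ki_grid], phe_mM, v_obs)

print("Michaelis-Menten    :", mm_fit)
print("Substrate inhibition:", si_fit)
```

Because the substrate-inhibition model has one more free parameter, comparing raw R² values alone tends to favor it; a fitted Ki far above the highest substrate concentration tested is a sign that the simpler model is adequate.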

Modelling and induced-fit conformation sampling/enzyme-substrate interaction studies.

The 3D structure of the PAL enzyme from Anabaena variabilis was chosen (PDB ID: 2NYN). The structure had two missing regions (residues 74–92 and 302–309), which were modelled using MODELLER⁵⁶. The PAL structure with the lowest DOPE score was selected for further studies. The binding conformation of phenylalanine has not been identified in any AvPAL* crystal structure. To compensate, Phe was docked in the active site of the modelled AvPAL* structure using the AutoDock4 tool⁵⁷. An energetically and structurally feasible conformation was chosen for the interaction studies and structural analysis in Chimera. The binding energy was found to be −3.03 kcal mol⁻¹ for the E-S complex.

Molecular dynamics (MD) simulations.

MD simulations were conducted for AvPAL* and the mutant complexes from the interaction studies. The systems were built for the complexes using the AMBER99SB⁵⁸ force field as implemented in GROMACS⁵⁹–⁶³ tools. Each complex was placed in a box of volume 1000 nm³ and then solvated with ~26,230 water molecules. To emulate conditions similar to the in vitro experiments, a salt concentration of 0.15 M NaCl was incorporated into the solvated system. This has the added benefit of neutralizing the charge of the system. LINCS was employed to constrain bond lengths and fix all bonds containing hydrogen atoms. The Berendsen thermostat⁶⁴ was chosen to control the temperature at 310 K. The particle-mesh Ewald (PME) algorithm⁶⁵ was used to calculate electrostatic interactions with a 10 Å cut-off. The V-rescale and Parrinello–Rahman algorithms were applied to couple the temperature and pressure. Energy minimization of the system was performed using the steepest descent algorithm with a tolerance value of 1000 kJ mol⁻¹ nm⁻¹ in 1000 steps. The minimized system was equilibrated for 1 ns each in the constant-volume and constant-pressure ensembles. The system was then subjected to a production run of 500 ns at 1 atm pressure and 310 K, twice for statistical significance. The coordinates obtained from the production run were used for post-simulation analysis to observe the effect of the mutations on the dynamics of the protein. The distances between the MIO methylidene atom and the substrate amino nitrogen (MIO(Cβ2)–Phe(N)) and between the Y314 hydroxyl oxygen and the substrate amino nitrogen (Y314(O)–Phe(N)) were plotted against each other as a scatter plot. Backbone atoms of domains around the active site were used to calculate the root mean square fluctuation (RMSF).
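As a rough consistency check on the solvation setup, the Python sketch below estimates how many Na⁺/Cl⁻ pairs correspond to 0.15 M NaCl, once from the full box volume and once from the number of water molecules. Both estimates are ours: they ignore the extra counterions needed for neutralization, and the water molarity of 55.5 M is an approximation for bulk water.

```python
AVOGADRO = 6.02214076e23  # mol^-1
WATER_MOLARITY = 55.5     # mol/L, approximate value for liquid water

def ion_pairs_from_volume(conc_molar, volume_nm3):
    """Ion pairs if the whole box volume were filled with salt solution."""
    litres = volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    return conc_molar * AVOGADRO * litres

def ion_pairs_from_waters(conc_molar, n_waters):
    """Ion pairs estimated from the solvent only (ratio of salt to water)."""
    return n_waters * conc_molar / WATER_MOLARITY

by_volume = ion_pairs_from_volume(0.15, 1000.0)  # box volume from the text
by_waters = ion_pairs_from_waters(0.15, 26_230)  # water count from the text

print(f"from box volume : {by_volume:.1f} ion pairs")
print(f"from water count: {by_waters:.1f} ion pairs")
```

The two estimates differ (about 90 versus about 71 pairs) because the protein occupies part of the box; the solvent-based figure is the more relevant one for the bulk salt concentration.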

Metadynamics-based MD simulations.

Metadynamics simulations were performed to understand the free energy landscape of the active site. On completion of simulations, the substrate was expected to find different potential minima to attain near attack conformation in the active site. Comparative studies were conducted for AvPAL* and mutations. In metadynamics based approaches, the choice of a collective variable (CV) in the design of the experiment is crucial. We chose two CVs i.e., distance between COM (center of mass) of substrate atoms and COM of Y314 atoms (CV1), and distance between COM of substrate atoms and COM of heavy atoms in the backbone of residues in conserved secondary structures that were present within 5 Å of G218, M222 and L108 residues (CV2) (Fig. S15a).
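Both collective variables are distances between centers of mass. A minimal Python sketch of such a CV is given below; the coordinates and masses are made-up values, and the function ignores periodic boundary conditions, which a real analysis of a solvated box would need to handle.

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[k] for xyz, m in zip(coords, masses)) / total
        for k in range(3)
    )

def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centers of mass of two atom groups."""
    return math.dist(center_of_mass(coords_a, masses_a),
                     center_of_mass(coords_b, masses_b))

# Hypothetical two-atom "substrate" and one-atom "residue" (arbitrary units).
substrate_xyz = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
substrate_mass = [1.0, 1.0]
residue_xyz = [(1.0, 3.0, 4.0)]
residue_mass = [12.0]

cv1 = com_distance(substrate_xyz, substrate_mass, residue_xyz, residue_mass)
print(f"CV value = {cv1:.2f}")
```

CV1 and CV2 in the text would correspond to calling such a function with the substrate atoms as one group and either the Y314 atoms or the selected backbone heavy atoms as the other.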

Steered molecular dynamics (SMD) studies.

SMD simulations were conducted to identify conformational changes and associated path samplings when the substrate is exposed to mechanical strain or rupture force, which cannot be achieved through standard MD simulations. Well-equilibrated systems were chosen as starting points for the SMD studies. The pulling simulations were implemented using GROMACS tools. The substrate was pulled by its COM away from the active site in the unbinding process and towards the COM of the MIO group in the reassociation process. A pull velocity of 0.0005 nm ps⁻¹ with bias force constants of 310 kJ mol⁻¹ nm⁻² and −30 kJ mol⁻¹ nm⁻² was used in the unbinding and entry processes, respectively. Umbrella sampling was conducted as an extension of the SMD studies to estimate the energetics during translocation of the substrate along the path. A series of configurations (reaction coordinates) across the path was chosen from the SMD studies and constructed based on the distance between the COM of MIO and that of the substrate. The path was discretized into multiple windows, chosen for every 0.5 Å of substrate movement from the active site until it reached the periphery of the protein.
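The window selection can be sketched as follows: given the MIO–substrate COM distance for each SMD frame, pick the frame closest to each target distance spaced 0.5 Å apart. This is a simplified illustration with a synthetic distance trace and a helper name of our choosing, not the script used in the study.

```python
def pick_windows(com_distances, spacing=0.5):
    """Return (target distance, frame index) pairs spaced `spacing` apart.

    `com_distances` holds one COM-COM distance per SMD frame, in the same
    unit as `spacing` (Angstrom here).
    """
    start, end = min(com_distances), max(com_distances)
    n_windows = int((end - start) / spacing + 1e-6) + 1
    windows = []
    for i in range(n_windows):
        target = start + i * spacing
        frame = min(range(len(com_distances)),
                    key=lambda f: abs(com_distances[f] - target))
        windows.append((target, frame))
    return windows

# Synthetic pulling trace: the distance grows by 0.25 A per saved frame.
trace = [3.0 + 0.25 * f for f in range(21)]  # 3.0 A to 8.0 A
windows = pick_windows(trace, spacing=0.5)

for target, frame in windows:
    print(f"window at {target:.1f} A -> frame {frame}")
```

Each selected frame would then seed an independent restrained simulation, and the resulting histograms would be combined to obtain the PMF along the path.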

QM/MM simulation.

E-S complexes were placed in a cubic box with a solute-solvent separation margin of 12 Å in each dimension by means of the QwikMD⁶⁶ program implemented in VMD. The electroneutrality of the system was maintained by adding NaCl to a salt concentration of 0.15 M. The CHARMM36 force field was used for the protein topology (generated using the psfgen and autopsf programs), and the TIP3P water model was used in the system. During the simulations, a 12.0 Å cut-off was applied to short-range, non-bonded interactions, whereas long-range electrostatic interactions were treated using the particle-mesh Ewald (PME) method. The equations of motion were integrated using the r-RESPA multiple time step scheme to update the short-range interactions every step and long-range interactions every two steps. The integration time step was set to 2 fs for all simulations performed. Thermal equilibration was conducted by first subjecting the system to energy minimization using the conjugate gradient method for 1000 steps (2 ps) and then coupling it to a heat bath kept constant at 300 K by the Langevin thermostat with a collision coefficient of 1 ps⁻¹ and a barostat maintained at 1 atm.

The last step of classical equilibration was taken to the QM/MM interface to select the QM region and initiate the QM/MM simulation using the QwikMD interface provided in VMD. Four regions in different chains, each comprising the MIO adduct, Q452, Y314, the substrate, and water molecules within 3.5 Å of the MIO, were selected as QM regions. The total charge for each QM region was maintained between +1 and −1 for effective semi-empirical QM calculations.

The system was optimized by a 1,000-step minimization, followed by 10,000 steps of simulated annealing and equilibration, and then subjected to an average of 5000 ps of QM/MM hybrid production using PM7⁶⁷ together with the CHARMM36 force field. The least distance between the reactive groups was considered a good guess of the transition state geometry. This was followed by an average of 1000 ps of QM/MM hybrid production using PM7-TS. Simultaneously, an average of 1000 ps of density functional theory (DFT)-based QM/MM hybrid production at the B3LYP/6–31G(d) level def2-SVP level implemented in ORCA⁶⁸ was carried out, and NEB-TS was used to find and optimize the TS. Three input files were provided for the QM/MM hybrid production run: i) the initial conformation from well-equilibrated MD, ii) the transition state geometry derived from QMMM-PM7, and iii) the final product (refer to IS1 in Fig. 8), which was manually modelled using the transition state geometry and equilibrated using QMMM-PM7 simulation.

The mutants and AvPAL* were subjected to the QM/MM simulation protocol. In this report, only the first step of the reaction and the corresponding PM7-TS energy values are reported. DFT calculations were used only for validating the reaction coordinates of the transition states and the intermediate state. For benchmarking, we used a total of 8 QM regions to study the changes in activation energy with varying QM region sizes. QM region 1 (QM1) includes L-Phe and residues within 3 Å of L-Phe, excluding solvent atoms. The QM1w5A system includes atoms from QM1 and solvent atoms within 5 Å of L-Phe; similarly, QM1w6A, QM1w7A, and QM1w10A combine atoms from QM1 and solvent atoms within 6 Å, 7 Å, and 10 Å of L-Phe, respectively. QM2 includes Y78 in addition to QM1. Adding R317 to QM2 gives QM3. For QM4, F84 is added to QM3, leading to 148 atoms (Fig. S26, Table S12). The activation energy (E_a) differs by ±1.0 kcal/mol among QM systems. QM1 (the smaller QM region comprising an important residue required for the reaction), with a low E_a compared to other systems, appears to be sufficient for studying the reaction. Convergence was observed to be slow and computationally expensive as the QM size increased.

QM/MM Metadynamics.

The 5000 ps equilibrated QM/MM reaction coordinate of the E-S complex was used as the initial input structure for the metadynamics simulations. QM/MM-metadynamics simulations were carried out at 300 K and 1 bar with a 0.5 fs time step and periodic boundary conditions for 1000 ps using NAMD 2.13⁵² and the colvars module⁶⁹. The distance between the amino nitrogen of the substrate and the hydroxyl oxygen of Y314, and the distance between the amino nitrogen of the substrate and the methylidene carbon of the MIO adduct, were used as the two collective variables (CVs). Gaussians of height 0.2 kcal/mol were added onto the CV coordinate at every step to construct the metadynamics bias potential with a width of 1°. The QM/MM hybrid production run used the PM7-TS integrator.
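The way the bias potential accumulates can be illustrated with a short Python sketch of the standard metadynamics sum of Gaussians over two CVs. The Gaussian height is the 0.2 kcal/mol given above; the Gaussian width in distance units and the deposited CV positions are hypothetical values chosen for illustration, and this is a conceptual sketch rather than the colvars implementation.

```python
import math

HEIGHT = 0.2  # kcal/mol, Gaussian height stated in the text

def bias(cv, deposited, width, height=HEIGHT):
    """Metadynamics bias at `cv`: a sum of Gaussians centered on the
    CV values at which earlier hills were deposited."""
    total = 0.0
    for center in deposited:
        dist2 = sum((x - c) ** 2 for x, c in zip(cv, center))
        total += height * math.exp(-dist2 / (2.0 * width ** 2))
    return total

# Hypothetical deposition history for (CV1, CV2), in Angstrom.
deposited = [(3.0, 3.5), (3.0, 3.5), (2.6, 3.2)]
WIDTH = 0.1  # Angstrom, hypothetical Gaussian width (assumption)

at_visited = bias((3.0, 3.5), deposited, WIDTH)
far_away = bias((1.5, 2.0), deposited, WIDTH)

print(f"bias at a twice-visited point: {at_visited:.3f} kcal/mol")
print(f"bias at an unvisited point   : {far_away:.3e} kcal/mol")
```

Regions that the system has already visited accumulate bias and become less favorable, which pushes the trajectory over barriers; the negative of the converged bias then approximates the free energy surface.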

PRE-PRINT.

A pre-print version of the original submission is available on bioRxiv⁷⁰.

Supplementary Material

Movie S1

Supplemental Information

Movie S2

ACKNOWLEDGEMENTS.

The authors would like to thank current and former Nair lab members, Dr. Zachary J. S. Mays, Dr. Debika Choudhury, Sean F. Sullivan, Trevor B. Nicks, Rana Said, Maya Vaishnaw, and Alexis Barselau for helpful discussions. We would like to thank Prof. Nicholas Turner (University of Manchester, UK) for providing us with the AvPAL-encoding plasmid. This work was supported by NIH grant DP2HD91798 and Tufts Launcher Accelerator to N.U.N.

Footnotes

COMPETING INTERESTS.

Authors (N.U.N., V.D.T., T.C.C., K.M.) and Tufts University have applied for a patent on the workflow and enhanced activity variants. N.U.N., V.D.T., and T.C.C. are cofounders of Enrich Bio, Inc.

CODE AVAILABILITY.

All open-source and commercial software used are described in the methods section.

SUPPORTING INFORMATION.

Mutational library design and characterization, DMS results, mutant enzyme characterization data, results from MD metadynamics, steered MD, QM/MM, and QM/MM metadynamics simulations.

DATA AVAILABILITY.

Deep sequencing data has been submitted to NCBI SRA and is available under accession # PRJNA730338.

REFERENCES.

1. Weise NJ; Ahmed ST; Parmeggiani F; Galman JL; Dunstan MS; Charnock SJ; Leys D; Turner NJ, Zymophore Identification Enables the Discovery of Novel Phenylalanine Ammonia Lyase Enzymes. Scientific Reports 2017, 7 (1), 1–9.
2. Araya CL; Fowler DM; Chen W; Muniez I; Kelly JW; Fields S, A Fundamental Protein Property, Thermodynamic Stability, Revealed Solely from Large-Scale Measurements of Protein Function. Proceedings of the National Academy of Sciences 2012, 109 (42), 16858–16863.
3. Bata Z; Molnár Z; Madaras E; Molnár B; Sánta-Bell E; Varga A; Leveles I; Qian R; Hammerschmidt F; Paizs C, Substrate Tunnel Engineering Aided by X-Ray Crystallography and Functional Dynamics Swaps the Function of Mio-Enzymes. ACS Catalysis 2021, 11, 4538–4549.
4. Brenan L; Andreev A; Cohen O; Pantel S; Kamburov A; Cacchiarelli D; Persky NS; Zhu C; Bagul M; Goetz EM, Phenotypic Characterization of a Comprehensive Set of Mapk1/Erk2 Missense Mutants. Cell Reports 2016, 17 (4), 1171–1183.
5. Calabrese JC; Jordan DB; Boodhoo A; Sariaslani S; Vannelli T, Crystal Structure of Phenylalanine Ammonia Lyase: Multiple Helix Dipoles Implicated in Catalysis. Biochemistry 2004, 43 (36), 11403–11416.
6. Heberling MM; Masman MF; Bartsch S; Wybenga GG; Dijkstra BW; Marrink SJ; Janssen DB, Ironing out Their Differences: Dissecting the Structural Determinants of a Phenylalanine Aminomutase and Ammonia Lyase. ACS Chemical Biology 2015, 10 (4), 989–997.
7. Hermes JD; Weiss PM; Cleland W, Use of Nitrogen-15 and Deuterium Isotope Effects to Determine the Chemical Mechanism of Phenylalanine Ammonia-Lyase. Biochemistry 1985, 24 (12), 2959–2967.
8. Jun S-Y; Sattler SA; Cortez GS; Vermerris W; Sattler SE; Kang C, Biochemical and Structural Analysis of Substrate Specificity of a Phenylalanine Ammonia-Lyase. Plant Physiology 2018, 176 (2), 1452–1468.
9. Louie GV; Bowman ME; Moffitt MC; Baiga TJ; Moore BS; Noel JP, Structural Determinants and Modulation of Substrate Specificity in Phenylalanine-Tyrosine Ammonia-Lyases. Chemistry & Biology 2006, 13 (12), 1327–1338.
10. Melnikov A; Rogov P; Wang L; Gnirke A; Mikkelsen TS, Comprehensive Mutational Scanning of a Kinase in Vivo Reveals Substrate-Dependent Fitness Landscapes. Nucleic Acids Research 2014, 42 (14), e112–e112.
11. Ritter H; Schulz GE, Structural Basis for the Entrance into the Phenylpropanoid Metabolism Catalyzed by Phenylalanine Ammonia-Lyase. The Plant Cell 2004, 16 (12), 3426–3436.
12. Rockah-Shmuel L; Tóth-Petróczy Á; Tawfik DS, Systematic Mapping of Protein Mutational Space by Prolonged Drift Reveals the Deleterious Effects of Seemingly Neutral Mutations. PLoS Computational Biology 2015, 11 (8), e1004421.
13. Starita LM; Pruneda JN; Lo RS; Fowler DM; Kim HJ; Hiatt JB; Shendure J; Brzovic PS; Fields S; Klevit RE, Activity-Enhancing Mutations in an E3 Ubiquitin Ligase Identified by High-Throughput Mutagenesis. Proceedings of the National Academy of Sciences 2013, 110 (14), E1263–E1272.
14. Wang L; Gamez A; Archer H; Abola EE; Sarkissian CN; Fitzpatrick P; Wendt D; Zhang Y; Vellard M; Bliesath J, Structural and Biochemical Characterization of the Therapeutic Anabaena Variabilis Phenylalanine Ammonia Lyase. Journal of Molecular Biology 2008, 380 (4), 623–635.
15. Wang L; Gamez A; Sarkissian CN; Straub M; Patch MG; Han GW; Striepeke S; Fitzpatrick P; Scriver CR; Stevens RC, Structure-Based Chemical Modification Strategy for Enzyme Replacement Treatment of Phenylketonuria. Molecular Genetics and Metabolism 2005, 86 (1–2), 134–140.
16. Fowler DM; Araya CL; Fleishman SJ; Kellogg EH; Stephany JJ; Baker D; Fields S, High-Resolution Mapping of Protein Sequence-Function Relationships. Nature Methods 2010, 7 (9), 741.
17. Hietpas RT; Jensen JD; Bolon DN, Experimental Illumination of a Fitness Landscape. Proceedings of the National Academy of Sciences 2011, 108 (19), 7896–7901.
18. Wrenbeck EE; Faber MS; Whitehead TA, Deep Sequencing Methods for Protein Engineering and Design. Current Opinion in Structural Biology 2017, 45, 36–44.
19. Wrenbeck EE; Azouz LR; Whitehead TA, Single-Mutation Fitness Landscapes for an Enzyme on Multiple Substrates Reveal Specificity Is Globally Encoded. Nature Communications 2017, 8 (1), 1–10.
20. Jones EM; Lubock NB; Venkatakrishnan A; Wang J; Tseng AM; Paggi JM; Latorraca NR; Cancilla D; Satyadi M; Davis JE, Structural and Functional Characterization of G Protein–Coupled Receptors with Deep Mutational Scanning. Elife 2020, 9, e54895.
21. McLaughlin RN Jr; Poelwijk FJ; Raman A; Gosal WS; Ranganathan R, The Spatial Architecture of Protein Function and Adaptation. Nature 2012, 491 (7422), 138–142.
22. Kong J-Q, Phenylalanine Ammonia-Lyase, a Key Component Used for Phenylpropanoids Production by Metabolic Engineering. RSC Advances 2015, 5 (77), 62587–62603.
23. Parmeggiani F; Weise NJ; Ahmed ST; Turner NJ, Synthetic and Therapeutic Applications of Ammonia-Lyases and Aminomutases. Chemical Reviews 2018, 118 (1), 73–118.
24. Klumbys E; Zebec Z; Weise NJ; Turner NJ; Scrutton NS, Bio-Derived Production of Cinnamyl Alcohol Via a Three Step Biocatalytic Cascade and Metabolic Engineering. Green Chemistry 2018, 20 (3), 658–663.
25. Toogood HS; Scrutton NS, Discovery, Characterization, Engineering, and Applications of Ene-Reductases for Industrial Biocatalysis. ACS Catalysis 2018, 8 (4), 3532–3549.
26. Parmeggiani F; Lovelock SL; Weise NJ; Ahmed ST; Turner NJ, Synthesis of D-and L-Phenylalanine Derivatives by Phenylalanine Ammonia Lyases: A Multienzymatic Cascade Process. Angewandte Chemie 2015, 127 (15), 4691–4694.
27. Zhang F; Ren J; Zhan J, Identification and Characterization of an Efficient Phenylalanine Ammonia-Lyase from Photorhabdus Luminescens. Applied Biochemistry and Biotechnology 2021, 193 (4), 1099–1115.
28. Moffitt MC; Louie GV; Bowman ME; Pence J; Noel JP; Moore BS, Discovery of Two Cyanobacterial Phenylalanine Ammonia Lyases: Kinetic and Structural Characterization. Biochemistry 2007, 46 (4), 1004–1012.
29. Isabella VM; Ha BN; Castillo MJ; Lubkowicz DJ; Rowe SE; Millet YA; Anderson CL; Li N; Fisher AB; West KA, Development of a Synthetic Live Bacterial Therapeutic for the Human Metabolic Disease Phenylketonuria. Nature Biotechnology 2018, 36 (9), 857–864.
30. Burton BK; Longo N; Vockley J; Grange DK; Harding CO; Decker C; Li M; Lau K; Rosen O; Larimore K, Pegvaliase for the Treatment of Phenylketonuria: Results of the Phase 2 Dose-Finding Studies with Long-Term Follow-Up. Molecular Genetics and Metabolism 2020, 130 (4), 239–246.
31. Yang J; Tao R; Wang L; Song L; Wang Y; Gong C; Yao S; Wu Q, Thermosensitive Micelles Encapsulating Phenylalanine Ammonia Lyase Act as a Sustained and Efficacious Therapy against Colorectal Cancer. Journal of Biomedical Nanotechnology 2019, 15 (4), 717–727.
32. Babich OO; Pokrovsky VS; Anisimova NY; Sokolov NN; Prosekov AY, Recombinant L-Phenylalanine Ammonia Lyase from Rhodosporidium Toruloides as a Potential Anticancer Agent. Biotechnology and Applied Biochemistry 2013, 60 (3), 316–322.
33. Bartsch S; Bornscheuer UT, Mutational Analysis of Phenylalanine Ammonia Lyase to Improve Reactions Rates for Various Substrates. Protein Engineering, Design & Selection 2010, 23 (12), 929–933.
34.Bencze LC; Filip A; Bánóczi G; Toşa MI; Irimie FD; Gellért Á; Poppe L; Paizs C, Expanding the Substrate Scope of Phenylalanine Ammonia-Lyase from Petroselinum Crispum Towards Styrylalanines. Organic & Biomolecular Chemistry 2017, 15 (17), 3717–3727. [DOI] [PubMed] [Google Scholar]
35. Nagy EZ; Tork SD; Lang PA; Filip A; Irimie FD; Poppe L; Toşa MI; Schofield CJ; Brem J; Paizs C, Mapping the Hydrophobic Substrate Binding Site of Phenylalanine Ammonia-Lyase from Petroselinum crispum. ACS Catalysis 2019, 9 (9), 8825–8834.
36. Mays ZJ; Mohan K; Trivedi VD; Chappell TC; Nair NU, Directed Evolution of Anabaena variabilis Phenylalanine Ammonia-Lyase (PAL) Identifies Mutants with Enhanced Activities. Chemical Communications 2020, 56 (39), 5255–5258.
37. Cooke HA; Christianson CV; Bruner SD, Structure and Chemistry of 4-Methylideneimidazole-5-One Containing Enzymes. Current Opinion in Chemical Biology 2009, 13 (4), 460–468.
38. Feng L; Wanninayake U; Strom S; Geiger J; Walker KD, Mechanistic, Mutational, and Structural Evaluation of a Taxus Phenylalanine Aminomutase. Biochemistry 2011, 50 (14), 2919–2930.
39. Cooke HA; Bruner SD, Probing the Active Site of MIO-Dependent Aminomutases, Key Catalysts in the Biosynthesis of β-Amino Acids Incorporated in Secondary Metabolites. Biopolymers 2010, 93 (9), 802–810.
40. Tomoiagă RB; Tork SD; Horváth I; Filip A; Nagy LC; Bencze LC, Saturation Mutagenesis for Phenylalanine Ammonia Lyases of Enhanced Catalytic Properties. Biomolecules 2020, 10 (6), 838.
41. Sun J; Cui Y; Wu B, GRAPE, a Greedy Accumulated Strategy for Computational Protein Engineering. Methods in Enzymology 2021, 648, 207–230.
42. Cui H; Cao H; Cai H; Jaeger KE; Davari MD; Schwaneberg U, Computer-Assisted Recombination (CompassR) Teaches Us How to Recombine Beneficial Substitutions from Directed Evolution Campaigns. Chemistry (Weinheim an der Bergstrasse, Germany) 2020, 26 (3), 643.
43. Hecht M; Bromberg Y; Rost B, Better Prediction of Functional Effects for Sequence Variants. BMC Genomics 2015, 16 (8), 1–12.
44. Choi Y; Chan AP, PROVEAN Web Server: A Tool to Predict the Functional Effect of Amino Acid Substitutions and Indels. Bioinformatics 2015, 31 (16), 2745–2747.
45. Munro D; Singh M, DeMaSk: A Deep Mutational Scanning Substitution Matrix and Its Use for Variant Impact Prediction. Bioinformatics 2020, 36 (22–23), 5322–5329.
46. Ness JE; Kim S; Gottman A; Pak R; Krebber A; Borchert TV; Govindarajan S; Mundorff EC; Minshull J, Synthetic Shuffling Expands Functional Protein Diversity by Allowing Amino Acids to Recombine Independently. Nature Biotechnology 2002, 20 (12), 1251–1255.
47. Reetz MT; Bocola M; Carballeira JD; Zha D; Vogel A, Expanding the Range of Substrate Acceptance of Enzymes: Combinatorial Active-Site Saturation Test. Angewandte Chemie International Edition 2005, 44 (27), 4192–4196.
48. MacDonald MJ; D’Cunha GB, A Modern View of Phenylalanine Ammonia Lyase. Biochemistry and Cell Biology 2007, 85 (3), 273–282.
49. Sato T; Kiuchi F; Sankawa U, Inhibition of Phenylalanine Ammonia-Lyase by Cinnamic Acid Derivatives and Related Compounds. Phytochemistry 1982, 21 (4), 845–850.
50. Zoń J; Laber B, Novel Phenylalanine Analogues as Putative Inhibitors of Enzymes Acting on Phenylalanine. Phytochemistry 1988, 27 (3), 711–714.
51. Poppe L; Rétey J, Friedel–Crafts-Type Mechanism for the Enzymatic Elimination of Ammonia from Histidine and Phenylalanine. Angewandte Chemie International Edition 2005, 44 (24), 3668–3688.
52. Phillips JC; Hardy DJ; Maia JD; Stone JE; Ribeiro JV; Bernardi RC; Buch R; Fiorin G; Hénin J; Jiang W, Scalable Molecular Dynamics on CPU and GPU Architectures with NAMD. The Journal of Chemical Physics 2020, 153 (4), 044130.
53. Pinto GP; Ribeiro AJ; Ramos MJ; Fernandes PA; Toscano M; Russo N, New Insights in the Catalytic Mechanism of Tyrosine Ammonia-Lyase Given by QM/MM and QM Cluster Models. Archives of Biochemistry and Biophysics 2015, 582, 107–115.
54. Bushnell B; Rood J; Singer E, BBMerge–Accurate Paired Shotgun Read Merging Via Overlap. PLoS One 2017, 12 (10), e0185056.
55. Langmead B; Salzberg SL, Fast Gapped-Read Alignment with Bowtie 2. Nature Methods 2012, 9 (4), 357.
56. Webb B; Sali A, Comparative Protein Structure Modeling Using MODELLER. Current Protocols in Bioinformatics 2016, 54 (1), 5.6.1–5.6.37.
57. Morris GM; Huey R; Lindstrom W; Sanner MF; Belew RK; Goodsell DS; Olson AJ, AutoDock4 and AutoDockTools4: Automated Docking with Selective Receptor Flexibility. Journal of Computational Chemistry 2009, 30 (16), 2785–2791.
58. Lindorff-Larsen K; Piana S; Palmo K; Maragakis P; Klepeis JL; Dror RO; Shaw DE, Improved Side-Chain Torsion Potentials for the Amber ff99SB Protein Force Field. Proteins: Structure, Function, and Bioinformatics 2010, 78 (8), 1950–1958.
59. Berendsen HJ; van der Spoel D; van Drunen R, GROMACS: A Message-Passing Parallel Molecular Dynamics Implementation. Computer Physics Communications 1995, 91 (1–3), 43–56.
60. Van Der Spoel D; Lindahl E; Hess B; Groenhof G; Mark AE; Berendsen HJ, GROMACS: Fast, Flexible, and Free. Journal of Computational Chemistry 2005, 26 (16), 1701–1718.
61. Pronk S; Páll S; Schulz R; Larsson P; Bjelkmar P; Apostolov R; Shirts MR; Smith JC; Kasson PM; van der Spoel D, GROMACS 4.5: A High-Throughput and Highly Parallel Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29 (7), 845–854.
62. Páll S; Abraham MJ; Kutzner C; Hess B; Lindahl E, Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS. In International Conference on Exascale Applications and Software, Springer: 2014; 3–27.
63. Abraham MJ; Murtola T; Schulz R; Páll S; Smith JC; Hess B; Lindahl E, GROMACS: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1, 19–25.
64. Berendsen HJ; Postma JPM; van Gunsteren WF; DiNola A; Haak JR, Molecular Dynamics with Coupling to an External Bath. The Journal of Chemical Physics 1984, 81 (8), 3684–3690.
65. Darden T; York D; Pedersen L, Particle Mesh Ewald: An N·log(N) Method for Ewald Sums in Large Systems. The Journal of Chemical Physics 1993, 98 (12), 10089–10092.
66. Melo MC; Bernardi RC; Rudack T; Scheurer M; Riplinger C; Phillips JC; Maia JD; Rocha GB; Ribeiro JV; Stone JE, NAMD Goes Quantum: An Integrative Suite for Hybrid Simulations. Nature Methods 2018, 15 (5), 351.
67. Stewart JJP, MOPAC2016. Colorado Springs, CO, USA, 2016.
68. Neese F, Software Update: The ORCA Program System, Version 4.0. WIREs: Computational Molecular Science 2018, 2, 73–78.
69. Fiorin G; Klein ML; Hénin J, Using Collective Variables to Drive Molecular Dynamics Simulations. Molecular Physics 2013, 111 (22–23), 3345–3362.
70. Trivedi VD; Chappell TC; Krishna NB; Shetty A; Sigamani GG; Mohan K; Ramesh A; Kumar P; Nair NU, In-Depth Sequence-Function Characterization Reveals Multiple Paths to Enhance Phenylalanine Ammonia-Lyase (PAL) Activity. bioRxiv 2021.

Associated Data

Supplementary Materials

Movie S1 (mp4, 43.4 MB)

Supplemental Information: NIHMS1899566-supplement-Supplemental_Information.docx (docx, 8.1 MB)

Movie S2 (mp4, 63.3 MB)

Data Availability Statement

Deep sequencing data have been submitted to the NCBI SRA and are available under accession number PRJNA730338.
