Abstract
The means by which superfamilies of specialized enzymes arise by gene duplication and functional divergence are poorly understood. The escape from adaptive conflict hypothesis, which posits multiple copies of a gene encoding a primitive inefficient and highly promiscuous generalist ancestor, receives support from experiments showing that resurrected ancestral enzymes are indeed more substrate-promiscuous than their modern descendants. Here, we provide evidence in support of an alternative model, the innovation–amplification–divergence hypothesis, which posits a single-copied ancestor as efficient and specific as any modern enzyme. We argue that the catalytic mechanisms of plant esterases and descendent acetone cyanohydrin lyases are incompatible with each other (e.g., the reactive substrate carbonyl must bind in opposite orientations in the active site). We then show that resurrected ancestral plant esterases are as catalytically specific as modern esterases, that the ancestor of modern acetone cyanohydrin lyases was itself only very weakly promiscuous, and that improvements in lyase activity came at the expense of esterase activity. These observations support the innovation–amplification–divergence hypothesis, in which an ancestor gains a weak promiscuous activity that is improved by selection at the expense of the ancestral activity, and not the escape from adaptive conflict in which an inefficient generalist ancestral enzyme steadily loses promiscuity throughout the transition to a highly active specialized modern enzyme.
Keywords: catalysis, transition state stabilization, gene duplication and divergence, escape from adaptive conflict, innovation, amplification, divergence.
Introduction
Enzyme superfamilies are the products of gene duplication and functional divergence. Members of a superfamily share a common protein fold and similar amino acid sequences (indicating descent from a common ancestor) yet have different activities (suggesting selection for novel functions). Members’ activities often overlap. Substrate promiscuous enzymes favor one substrate, yet also catalyze the same chemical transformation, to a greater or lesser extent, with other substrates (Humble and Berglund 2011). Enzymes that catalyze different reactions (often similar to those of other superfamily members) are said to display catalytic promiscuity (O’Brien and Herschlag 1999; Baas et al. 2013; Baier and Tokuriki 2014). For example, o-succinylbenzoate synthase catalyzes both the normal dehydration of its substrate and the racemization of N-acyl amino acids through the same metal ion stabilized enediolate characteristic of members of the enolase superfamily (Ringia et al. 2004).
Functional promiscuity is key to several models of superfamily evolution. The escape from adaptive conflict hypothesis (Hughes 1994) posits multiple copies of a gene that encodes a primitive inefficient and highly promiscuous generalist enzyme. The modern superfamily emerges as selection for catalytic efficiency forces functional specialization through trade-offs (the idea that a jack-of-all-trades is a master of none). An alternative model, the innovation–amplification–divergence hypothesis (Bergthorsson et al. 2007; Näsvall et al. 2012), posits a single-copied ancestor as efficient and specialized as any modern enzyme. Once mutation has created a weakly beneficial promiscuous function, selection acts to increase expression through gene amplification until the costs associated with excess protein expression become prohibitive. Once again, trade-offs ensure that increases in catalytic efficiency force increases in functional specialization, whereas selection against excess protein expression eliminates unneeded copies.
The key difference between these models lies not in their outcomes—both predict superfamilies of enzymes with related, though distinct (if overlapping) functions—but in their origins. The escape from adaptive conflict hypothesis posits an inefficient but highly promiscuous ancestor. The innovation–amplification–divergence hypothesis posits an ancestor as efficient and as specialized as any modern enzyme.
At first glance it might seem impossible to distinguish between the models; after all, the ancestor that existed millions of years ago is extinct. However, its genes were passed on, generation upon generation, albeit modified by mutation, to its living descendants. By analyzing the gene sequences of extant species, researchers can infer the sequences of ancestral genes (Thornton 2004; Harms and Thornton 2013). Such an analysis uses a set of aligned sequences from extant species, a phylogenetic tree, and a probabilistic model of sequence evolution that incorporates differences in the rates of evolution and the likelihood that one amino acid will be exchanged for another. Then, the ancient gene is chemically synthesized and the protein expressed in a suitable host. Where analysis suggests several possible amino acids at a location, several plausible ancestors might be constructed and characterized. In most cases, such ancestors have similar properties. For example, reconstructed ancestral β-lactamases and plausible alternative protein sequences have similar phenotypes (Risso et al. 2013).
Reconstructed ancestral enzymes support the escape from adaptive conflict model because they have broader substrate ranges than their modern descendants. Voordeckers et al. (2012) showed that reconstructed ancestral α-glycosidases catalyzed hydrolysis of a wide range of substrates (maltose and isomaltose analogs), whereas modern glycosidases are comparatively specific. Likewise, reconstructed 600-My-old β-lactamases catalyze hydrolysis of a wider range of β-lactams, including third-generation semisynthetic antibiotics that never existed in nature (Risso et al. 2013). However, the hypothesis that ancestral enzymes were more catalytically promiscuous than modern enzymes has not been tested.
Acetone cyanohydrin lyases (ACLs) are hydroxynitrile lyases, a heterogeneous class of enzymes that produce cyanide as a defense against insects (Poultan 1990) and that independently have evolved in at least five distinct protein folds. We predict that ancestral plant esterases in the α/β-hydrolase fold superfamily, which gave rise to modern ACLs, were as catalytically specialized as modern esterases. Unlike substrate promiscuity and catalytic promiscuity among mechanistically related reactions, the reaction mechanisms of esterases and ACLs differ dramatically (fig. 1; Gruber et al. 2004), even though both contain the same Ser–His–Asp catalytic triad (a chemical example of the Darwinian principle of finding new uses for old parts). First, the binding orientations of the substrates differ. In esterases, the ester carbonyl binds in an oxyanion hole with its carbon near the Ser. In ACLs, the oxyanion hole is blocked and the substrate aldehyde carbonyl binds with its oxygen near the Ser. Second, ester hydrolysis forms and later releases an acyl enzyme intermediate in which the acyl part of the ester is linked covalently to the active site Ser. In contrast, acetone cyanohydrin cleavage occurs in a single step without covalent links between substrate and enzyme. Third, ester hydrolysis releases an uncharged hydrophobic alcohol whereas acetone cyanohydrin cleavage releases a small charged cyanide group. These mechanistic differences are so marked that stabilizing an esterase transition state must necessarily impede stabilizing the ACL transition state. For this reason we predict that the ACLs evolved, not by escape from adaptive conflict, but by innovation, amplification and divergence, from a weakly promiscuous ancestor in a lineage of otherwise highly specific esterases.
Fig. 1.
The catalytic mechanisms of esterases (left) and hydroxynitrile lyases (right) are quite different. Esterases use a two-step mechanism. First, the catalytic Ser acts as a nucleophile to attack the ester’s carbonyl carbon with the resulting formation of an acyl enzyme intermediate and release of the alcohol. Second, hydrolysis of the carboxylic acid from the Ser is initiated when the catalytic His abstracts a proton from water, the resulting hydroxide attacks the carbonyl carbon and the same mechanism effectively proceeds in reverse. Hydroxynitrile cleavage occurs in a single concerted step involving simple acid-base chemistry. Note that the orientation of the carbonyl differs in the two reactions. In esterases, the carbonyl carbon points toward the Ser and the oxygen toward the oxyanion hole. In hydroxynitrile lyases, the carbonyl oxygen points toward the catalytic Ser.
Results
Phylogenetic Analysis
The α/β-hydrolase superfamily contains greater than 60,000 proteins (Kourist et al. 2010; Lenfant et al. 2013) with wide-ranging catalytic activities (Ollis et al. 1992; Holmquist 2000; Nardini and Dijkstra 1999). Most are hydrolases (Enzyme Classification group 3), although the superfamily also includes ACLs (Enzyme Classification group 4). Phylogenetic analysis shows the ACLs cluster deep within a larger group of plant esterases (fig. 2). The maximum-likelihood tree is well resolved with most clusters receiving strong (>80%) bootstrap support. The topology of the neighbor-joining tree is similar (differences are restricted to short branches with low bootstrap support).
Fig. 2.
Maximum-likelihood phylogenetic tree of α/β hydrolases showing extant ACLs (red text) nested within a larger group of esterases (EST, blue text), the neighboring methylketone synthase (MKS, a decarboxylase, green text), and distantly related meta-cleavage product hydrolases (connected by dotted lines). Gray triangles represent clusters of proteins (numbers shown at the tips of the triangles). Resurrected ancestors EST1 and EST2 (blue) are dedicated esterases, EST3 (lavender) shows weak acetone cyanohydrin activity, ACL1 (purple) is an effective ACL with some esterase activity, whereas extant ACLs (red) are dedicated ACLs. Numbers at bifurcations and triangle bases represent bootstrap support.
ACLs, from the rubber tree Hevea brasiliensis (HbACL; Gruber et al. 1999), cassava Manihot esculenta (MeACL; Selmar et al. 1989; Hughes et al. 1994), and wild castor Baliospermum montanum (Nakano et al. 2014), share approximately 40% identity with related plant esterases. The ACL cluster is allied to a cluster that includes an esterase from Arabidopsis thaliana (AtEST5; Andexer et al. 2007) and other esterases of unknown function. Next is a cluster of 174 uncharacterized sequences, and then a cluster that includes two well-characterized esterases, salicylic acid binding protein 2 (SABP2) from the tobacco, Nicotiana tabacum (NtEST; Forouhar 2005), and polyneuridine-aldehyde esterase from snakeroot, Rauvolfia serpentina (RsEST; Mattern-Dogru 2002).
More distantly related (with only 20–40% amino acid identity to the ACLs) are the decarboxylase methylketone synthase I (ShMKS) from Solanum habrochaites (Yu 2010; Auldridge 2012) and, in a separate cluster of 41 uncharacterized sequences, a previously uncharacterized esterase from Ricinus communis (RcEST). Other enzymes in the α/β-hydrolase superfamily from bacteria and animals, including C–C hydrolases (e.g., BphD from Burkholderia xenovorans; Horsman et al. 2007), epoxide ring hydrolases (e.g., human soluble epoxide hydrolase, Beetham et al. 1993; Gomez et al. 2006; Thalji et al. 2013) and haloalkane hydrolases (e.g., LinB from Sphingobium sp.; PDB: 4H77_A; Okai 2013), share the same fold but are too distantly related (only 10–20% amino acid identity) for reliable phylogenetic analysis.
Structural Evolution
Different sites in these plant esterases and ACLs evolve at different rates. The broad pattern is typical of many globular enzymes. Sites critical to function are invariant. For example, the Ser–His–Asp catalytic triad is invariant and only in distantly related α/β hydrolases, such as dehalogenases and epoxide hydrolases, are variations found. Conservative replacements (e.g., Ile for Leu) predominate in the slowly evolving hydrophobic core. Surface sites far from the active site evolve rapidly, as do the amino and carboxy termini which have been subject to numerous insertions and deletions. The catalytic domain is more conserved than the lid domain that forms part of the substrate-binding site. Several residues contacting ligands bound in the active site are not conserved. These presumably contribute to differences in substrate specificity.
Ancestral Enzyme Reconstruction
Plant esterases and ACLs share greater than 40% amino acid identity, making it feasible to robustly infer ancestral enzyme sequences. Ancestral sequences at focal nodes ACL1, EST1, EST2, and EST3 (fig. 2) were estimated using maximum likelihood in RAxML (Stamatakis 2006) for two trees, one constructed by maximum likelihood (Felsenstein 1981) and the other using neighbor joining (Saitou and Nei 1987). As expected, all predicted ancestral enzyme sequences contain conserved residues common to most enzymes in the family, including the Ser–His–Asp catalytic triad. Nevertheless, the ancestral enzyme sequences are different from the sequences of modern enzymes. Most similar are those at node ACL1 and HbACL (∼81% identical), then those at nodes EST1 and EST2 to NtEST (∼73% identical) and least similar are those at node EST3 to AtEST5 and HbACL (∼67% identical). Note that ancestral sequences at EST3 occupy positions in raw sequence space intermediate between modern ACLs and modern plant esterases. The longer branch lengths leading to the ACLs reflect multiple mutations at each evolving site.
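The percent identities quoted above can be illustrated with a minimal Python sketch. The source does not state how identity was computed, so the convention here (matches divided by aligned columns in which neither sequence has a gap) is an assumption, and the two fragments are toy strings rather than real enzyme sequences.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical residues over aligned columns where neither row has a gap.

    Assumed convention: gap-containing columns are skipped in both the
    numerator and the denominator.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be rows of the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments (hypothetical, for illustration only).
ancestor_fragment = "MAFAHFVLIH-GS"
modern_fragment = "MAVAHFVLVHTGS"

identity = percent_identity(ancestor_fragment, modern_fragment)
print(f"{identity:.1f}% identical")
```

Other conventions (for example, dividing by the full alignment length) give different values, so identities are only comparable when computed the same way.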
Reconstruction of an exact ancestral sequence is highly unlikely because of uncertainties at some sites. Marginal ancestral state reconstructions suggest that the most likely amino acids were identified with greater than 90% confidence at greater than 85% of sites. Only at node EST3 was the maximum-likelihood sequence less robustly estimated (>90% confidence at >75% of sites), a consequence of its position at the intersection of three relatively long branches. To accommodate these uncertainties, we constructed enzymes for each focal node that contained the most likely alternative amino acids at all uncertain sites. Although these alternative sequences are highly improbable ancestors, their activities can be expected to be similar to their respective maximum-likelihood-estimated ancestors because most differences are physically conservative (e.g., between Asp and Glu) at rapidly evolving and functionally unconstrained sites at the protein surface. Also reconstructed by maximum likelihood were ancestral sequences using the neighbor-joining tree. Based on their positions in the trees we expect reconstructed enzymes at node ACL1 to be efficient ACLs, at EST3 to be esterases with weak promiscuous acetone cyanohydrin cleavage activities, and at EST1 and EST2 to be specialist esterases.
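The construction of a maximum-likelihood sequence and an alternative sequence from per-site posterior probabilities can be sketched in Python as follows. The posterior values are hypothetical, and treating a site as "uncertain" when its top posterior is at or below 0.90 is an assumption made for illustration; the source does not state the exact cutoff used to pick uncertain sites.

```python
# Hypothetical per-site posterior probabilities at one ancestral node.
site_posteriors = [
    {"S": 0.99, "T": 0.01},
    {"D": 0.55, "E": 0.45},
    {"L": 0.95, "I": 0.05},
    {"K": 0.60, "R": 0.40},
]

THRESHOLD = 0.90  # assumed confidence cutoff


def ranked(site):
    """Amino acids at one site, most probable first."""
    return sorted(site, key=site.get, reverse=True)


def ml_sequence(sites):
    """Most probable amino acid at every site."""
    return "".join(ranked(s)[0] for s in sites)


def alt_sequence(sites, threshold=THRESHOLD):
    """Second most probable amino acid at every uncertain site, top state elsewhere."""
    out = []
    for s in sites:
        order = ranked(s)
        uncertain = s[order[0]] <= threshold and len(order) > 1
        out.append(order[1] if uncertain else order[0])
    return "".join(out)


def fraction_confident(sites, threshold=THRESHOLD):
    """Fraction of sites whose top posterior exceeds the threshold."""
    return sum(max(s.values()) > threshold for s in sites) / len(sites)


ml = ml_sequence(site_posteriors)
alt = alt_sequence(site_posteriors)
confident = fraction_confident(site_posteriors)
print(ml, alt, confident)
```

Because the alternative sequence swaps every uncertain site at once, it is a deliberately pessimistic variant: if it behaves like the maximum-likelihood sequence, the functional inference is unlikely to hinge on any single ambiguous site.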
Activities
Six-His-tagged enzymes, both ancient and modern, were expressed and purified to homogeneity and their activities determined toward acetone cyanohydrin and four esters. Acetone cyanohydrin is the natural substrate for this cluster of ACLs (Butler 1965). The natural substrates for the esterases are not known, except for RsEST which is a polyneuridine esterase (Mattern-Dogru 2002) and NtEST which hydrolyzes methyl salicylate in a plant-defense-signaling pathway (Forouhar 2005). In order to focus on esterase activity rather than substrate specificity, we surveyed hydrolysis of four esters that differ in the sizes of their acyl and alcohol moieties.
As expected, RsEST and NtEST catalyze hydrolysis of all four esters and do not catalyze acetone cyanohydrin cleavage. From its position in the phylogenetic tree, we anticipated that the previously uncharacterized RcEST would be an esterase. Indeed, RcEST is an efficient esterase that lacks acetone cyanohydrin cleavage activity (table 1).
Table 1.
Activities of Modern and Resurrected Ancestral Enzymes.
Enzyme | Acetone Cyanohydrin (hydroxynitrile)^a | Methyl Salicylate (ester)^b | 1-Naphthyl Acetate (ester)^b | 2-Naphthyl Acetate (ester)^b | Methyl Pentanoate (ester)^b |
---|---|---|---|---|---|
HbACL | 2,400^c | 0.066 | NDA | NDA | NDA |
MeACL | 12,600 | NDA | NDA | NDA | NDA |
ACL1-ML | 720 | 0.66 | 0.054 | 0.13 | 0.41 |
ACL1-ALT | 880 | 0.048 | 0.016 | 0.024 | NDA |
ACL1-NJ | 350 | 0.03 | 0.0090 | 0.0054 | 0.94 |
EST3-ML | 0.078 | NDA | 0.0036 | 0.0084 | NDA |
EST3-ALT | NDA | NDA | NDA | 0.0024 | 0.60 |
EST3-NJ | NDA | 0.13 | 0.28 | 0.46 | 42 |
EST1-ML | NDA | 26 | 0.096 | 2.2 | 130 |
EST2-ML | NDA | 14 | 0.084 | 1.5 | 310 |
RcEST | NDA | 7.5 | 0.12 | 0.18 | 2.6 |
RsEST | NDA | 0.042 | 0.042 | 0.012 | 6.5 |
SABP2 | NDA | 0.52 | 0.012 | 0.018 | 180 |
AtEST5 | NDA | NDA | 1.7 | 0.038 | 0.66 |
NDA, no detectable activity.
^a Activities are kcat (min−1).
^b Activities are min−1. Rates measured at one substrate concentration given in Materials and Methods.
^c Standard errors <10% of estimates.
AtEST5 is an esterase. AtEST5 is unusual because it cleaves mandelonitrile (Andexer et al. 2007) in addition to hydrolyzing esters (Koo et al. 2013). Its natural function is not cleaving hydroxynitriles, however, because 1) the family Brassicaceae, of which A. thaliana is a member, contains no cyanogenic glucosides (Olson-Manning et al. 2013); 2) acetone cyanohydrin, the natural substrate for this class of ACLs (Butler 1965), is not a substrate for AtEST5; and 3) AtEST5 hydrolyzes esters at rates comparable to those of known esterases (table 1). AtEST5 and other sequences in this cluster are, like their ancestors, esterases, although their natural substrates have yet to be identified.
Nested deep within the esterases, ancient enzymes at EST1 and EST2 hydrolyze all four esters efficiently and lack acetone cyanohydrin cleavage activity. The maximum-likelihood ancestor at EST3 is an esterase with weak acetone cyanohydrin cleavage activity. The alternative enzymes at this node are esterases, although they vary somewhat in substrate specificity. All three reconstructed enzymes at ACL1 have good acetone cyanohydrin cleavage activities and retain relatively high esterase activities. These enzymes are catalytically promiscuous. HbACL and MeACL are highly specific ACLs, although HbACL can hydrolyze methyl salicylate, albeit slowly.
That esterase activities increase from EST3 to ACL1 is not surprising. Most amino acid substitutions compromise protein structures. Those that cause functional changes are no exception. Compensatory mutations following the acquisition of acetone cyanohydrin cleavage activity evidently raised the ancestral esterase activity. Importantly, the specificity of EST3-ML toward acetone cyanohydrin versus 1-naphthyl acetate is 0.078/0.0036 = 21.7, whereas that for ACL1-ML is 720/0.054 = 13,333.3. In other words, the specificity toward acetone cyanohydrin cleavage has increased more than 600-fold. This represents a decisive shift away from esterase activity toward acetone cyanohydrin cleavage specialization.
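The arithmetic behind the fold change in specificity can be checked with a short Python sketch. The rates are copied from table 1; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Rates (min^-1) taken from table 1 for the maximum-likelihood ancestors.
RATES = {
    "EST3-ML": {"acetone_cyanohydrin": 0.078, "1_naphthyl_acetate": 0.0036},
    "ACL1-ML": {"acetone_cyanohydrin": 720.0, "1_naphthyl_acetate": 0.054},
}


def specificity(enzyme: str) -> float:
    """Ratio of the lyase rate to the esterase rate for one enzyme."""
    rates = RATES[enzyme]
    return rates["acetone_cyanohydrin"] / rates["1_naphthyl_acetate"]


est3 = specificity("EST3-ML")
acl1 = specificity("ACL1-ML")
fold_change = acl1 / est3

print(f"EST3-ML specificity: {est3:.1f}")
print(f"ACL1-ML specificity: {acl1:.1f}")
print(f"Fold change: {fold_change:.0f}")
```

The ratio is taken against 1-naphthyl acetate because it is an ester for which both enzymes show detectable activity; a different reference ester would give a different fold change.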
Discussion
For two reasons, our results support the innovation–amplification–divergence hypothesis (Bergthorsson et al. 2007; Näsvall et al. 2012) rather than the escape from adaptive conflict hypothesis (Hughes 1994). First, modern specialist ACLs evolved from an equally specialized ancestor (EST2; table 1 and fig. 2). Second, catalytic promiscuity increased during the transition, from EST2 through EST3 to ACL1, before it decreased in the modern ACLs. These observations are incompatible with escape from adaptive conflict where an inefficient generalist ancestral enzyme steadily loses promiscuity throughout the transition to a highly active specialized modern enzyme.
Several studies have shown that, like their modern descendants, reconstructed ancestral enzymes and proteins display substrate and ligand promiscuity (Eick et al. 2012; Voordeckers et al. 2012). Both modern and ancestral plant esterases and ACLs also show considerable substrate promiscuity (Guterl et al. 2009; Andexer et al. 2012), yet only rarely are they catalytically promiscuous. This is because the two catalytic mechanisms must stabilize very different transition states (fig. 1; Gruber et al. 2004). The esterases follow the canonical two-step serine hydrolase mechanism, with its acyl enzyme intermediate, whereas the lyase mechanisms use a concerted mechanism involving only binding and general acid-base chemistry (fig. 1). Moreover, stabilizing two different transition states is made all the more difficult as the enzyme must hold the substrates’ reactive groups in opposing positions. Hence, chemistry predicts that 1) modern plant esterases and ACLs, as well as their ancestors, are rarely promiscuous for both reactions; 2) promiscuous enzymes heavily favor one reaction over the other; 3) the mutational origin of acetone cyanohydrin cleavage is a rare event with all modern ACLs descended from an ancestral esterase with an inefficient promiscuous ACL activity; and 4) esterase activity was lost rapidly during divergence to modern ACLs, not only because the transition states of the reactions are so very different but also because the enzyme needs to hold the reactive carbonyl in opposite orientations in the two reactions.
All four predictions are confirmed (table 1). First, neither modern esterases nor the ancestral esterases at nodes EST1 and EST2 have detectable acetone cyanohydrin cleavage activities. By extrapolation, most ancestral esterases in this family never had them either. Modern ACLs are also highly specific enzymes, although HbACL has a weak methyl salicylate esterase activity. This exception suggests the intriguing possibility that HbACL moonlights in the plant-defense-signaling pathway involving methyl salicylate. Second, promiscuous enzymes either heavily favor ester hydrolysis (ancestors at node EST3) or cyanohydrin cleavage (HbACL and ancestors at ACL1). The only exception is AtEST5 which, like the ACLs, efficiently cleaves the related hydroxynitrile mandelonitrile, yet with opposite stereospecificity (Andexer et al. 2012). The catalytic mechanism and mode of substrate binding to AtEST5 remain speculative (Andexer et al. 2012; Zhu et al. 2015). Third, all ACLs derive from ancestral esterase EST3, the only esterase with an inefficient promiscuous ACL activity. This observation suggests that the selective cause favoring the evolution of acetone cyanohydrin cleavage may have arisen long before the adaptive response was made possible by a rare mutational novelty at EST3. Fourth, catalytic trade-offs forced a rapid loss of esterase activity during divergence to the modern ACLs.
The substrates themselves require different reactions; an ester cannot undergo an elimination reaction and a cyanohydrin cannot undergo hydrolysis. Enabling cyanohydrin cleavage within an esterase active site requires as few as two amino acid replacements (Padhi et al. 2010). First, esterases contain a Gly in the oxyanion loop, whereas ACLs contain a threonine, which physically blocks access to the oxyanion hole. Second, in esterases a hydrophobic residue (methionine in SABP2) usually forms part of the alcohol binding site, whereas in ACLs a lysine in this location creates a polar site for the cyanide. These two amino acid replacements in SABP2 (Gly12Thr, Met239Lys) switched catalysis from esterase activity to ACL activity. However, the corresponding substitutions in a more distantly related esterase did not create hydroxynitrile lyase activity, indicating that other unknown features also contribute to this activity. Ancestral enzyme ACL1 contains both key residues: a threonine to block the oxyanion hole and a lysine in the alcohol binding site.
In contrast, EST3 appears to be an esterase with a novel ACL activity. As in AtEST5, both EST3-ML and EST3-MIX have replaced the Gly in the oxyanion loop with an Asn. Unlike Thr, Asn is not branched at the Cβ and might still permit the substrate’s carbonyl oxygen to access the oxyanion hole. Thus, the three enzymes each retain considerable esterase activity. Alternatively, the Asn δ-amide might act as a hydrogen bond donor to stabilize the oxyanion intermediates during ester hydrolysis. For the other replacement, EST3-MIX contains a lysine typical of ACLs whereas EST3-ML still retains the methionine typical of esterases. One would expect that of the two ancestral reconstructions, EST3-MIX would have higher acetone cyanohydrin cleavage activity. Yet the opposite is true: EST3-ML is the better lyase. Clearly, the details by which successive amino acid replacements, including the role of epistasis, changed a highly specific esterase into a highly specific ACL remain to be determined.
Not all catalytic mechanisms differ as dramatically as those of plant esterases and ACLs. Work by Gerlt and collaborators (Gerlt et al. 2005; Glasner et al. 2006) has established the existence of enzyme superfamilies whose members catalyze different reactions by stabilizing a common transition state. For example, members of the enolase superfamily use a base to abstract a proton alpha to a carboxylate to produce an enediolate, followed by protonation either on the opposite face for an isomerization (e.g., mandelate racemase) or of the leaving group for an elimination (e.g., in enolase, a dehydratase). We conjecture that enzymes that stabilize similar transition states have a greater potential for catalytic promiscuity and could well have arisen through escape from adaptive conflict. Unfortunately, so divergent are the members of the enolase and many other superfamilies that reconstructing ancestors becomes a highly unreliable exercise. Testing our conjecture at the limits of detectable sequence similarity will prove difficult, if not impossible.
In summary, we have shown that the highly specific ancestral plant esterase (EST2) that gave rise to modern ACLs first acquired a weak promiscuous acetone cyanohydrin cleavage activity (EST3), and that the subsequent increase in acetone cyanohydrin cleavage activity (ACL1 and modern ACLs) is associated with a rapid loss of esterase activity. This pattern of functional evolution is both consistent with the innovation–amplification–divergence hypothesis (Bergthorsson et al. 2007; Näsvall et al. 2012), and predictable based on different active site chemistry of the two reactions.
Materials and Methods
Phylogenetics
Five thousand protein sequences, 150–600 amino acids long and sharing a minimum 30% sequence identity with Hevea brasiliensis ACL (GI: 1223884), were obtained from the NCBI protein sequence database. Identical copies, mutant peptides, and all patents were removed and the remaining sequences aligned using Muscle (Edgar 2004) in SeaView (Gouy et al. 2010). A preliminary neighbor-joining (Saitou and Nei 1987) tree was used to identify a cluster of 1,285 sequences between 30% and 99% identical, and that included the ACLs and salicylic acid binding protein 2 from Nicotiana tabacum, for further analysis.
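The length window and removal of identical copies described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name and the toy records are hypothetical, and the 30% identity cutoff and the removal of mutant peptides and patent sequences are not implemented here.

```python
def filter_sequences(records, min_len=150, max_len=600):
    """Keep the first copy of each distinct sequence within the length window.

    `records` is an iterable of (name, sequence) pairs.
    """
    seen = set()
    kept = []
    for name, seq in records:
        if not (min_len <= len(seq) <= max_len):
            continue  # outside the 150-600 residue window
        if seq in seen:
            continue  # identical copy of a sequence already kept
        seen.add(seq)
        kept.append((name, seq))
    return kept


# Toy records (hypothetical sequences, for illustration only).
records = [
    ("seq_a", "M" * 200),
    ("seq_b", "M" * 200),  # identical to seq_a
    ("seq_c", "M" * 100),  # too short
    ("seq_d", "A" * 300),
    ("seq_e", "A" * 700),  # too long
]

kept = filter_sequences(records)
print([name for name, _ in kept])
```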
The alignment of 1,285 amino acid sequences was adjusted manually, guided by superposed protein structures (H. brasiliensis 1QJ4, Manihot esculenta 1E8D, Arabidopsis thaliana 3DQZ, Nicotiana tabacum 1XKL) obtained with DeepView/Swiss PDB-Viewer (Guex and Peitsch 1997), to move insertions and deletions into surface loops. The final alignment of 1,285 sequences is available upon request from the authors.
Bootstrapped (1,000 replicates) maximum-likelihood trees were obtained using RAxML (Stamatakis 2006) using the ML + Bootstrap + GAMMAPROT + LG settings. Maximum likelihood uses a probabilistic model of sequence evolution to construct a tree from a given alignment. The GAMMAPROT setting allows some sites to evolve faster than others and the LG (Le and Gascuel 2008) setting provides an empirical amino acid exchange matrix to accommodate differences in the rates of exchange between different amino acids. The most likely tree is one with topology, branch lengths and other parameters that maximize the likelihood of the observed alignment. An advantage of the maximum-likelihood approach is that it provides a natural means to test alternative hypotheses about branch lengths, topologies, and ancestral states. Similar trees were obtained using FastML with GAMMAPROT and a JTT (Jones, Taylor, and Thornton) amino acid exchange matrix (Jones et al. 1992) in RAxML with SH support (Shimodaira and Hasegawa 1999) and bootstrapped neighbor joining with a Poisson correction in SeaView.
Ancestral sequences were inferred using maximum likelihood as implemented in RAxML using the ML + Bootstrap + GAMMAPROT + LG tree obtained above. Maximum likelihood calculates the most likely amino acid at each site at each node using an empirical Bayes approach (Yang et al. 1995). The likelihood of observing a particular amino acid, x, at a particular site at a particular node is given as

$$P(x \mid t, m, \theta) = \frac{p_x \, L(x \mid t, m, \theta)}{\sum_{a} p_a \, L(a \mid t, m, \theta)}$$

where a is one of 20 amino acids at the focal site, t is the topology of the phylogenetic tree, m is an evolutionary model, θ represents the various model parameters (rates of amino acid exchange, variable rates across sites, etc.), and px is the prior probability of observing x. Maximum likelihood resolves ambiguities in favor of the most likely model. For example, in the case of tree (((Leu,Leu),(Met,Met)),Arg) the most likely ancestor of sequences 1–4 had a Met because exchanges between Met and Leu and between Met and Arg are commonplace whereas those between Leu and Arg are rare. Ancestral sequences were also inferred for the neighbor-joining tree using maximum likelihood.
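The empirical Bayes step described above (prior times likelihood, normalized over the candidate amino acids) can be sketched in Python. The priors and likelihoods below are invented for illustration and restricted to three states rather than 20; in practice they would come from the tree, the exchange matrix, and the fitted model parameters.

```python
def posterior(prior: dict, likelihood: dict) -> dict:
    """Posterior probability of each state: prior * likelihood, normalized."""
    weights = {a: prior[a] * likelihood[a] for a in likelihood}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical values for three candidate ancestral states at one site.
prior = {"Leu": 0.10, "Met": 0.02, "Arg": 0.05}
likelihood = {"Leu": 1e-6, "Met": 2e-5, "Arg": 1e-7}

post = posterior(prior, likelihood)
best = max(post, key=post.get)

for state, p in sorted(post.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {p:.3f}")
print("most likely ancestral state:", best)
```

With these made-up numbers Met has a low prior but wins because its likelihood is much higher, which mirrors the Leu/Met/Arg example in the text.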
Gene Synthesis and Cloning
Genes for the ancestral enzymes were synthesized by GenScript (Piscataway, NJ) and subcloned into a pET21a(+) vector at NdeI and XhoI restriction sites resulting in an upstream T7 promoter and lac operator and an in frame C-terminal six His-tag. DNA sequencing of the entire gene confirmed the fidelity of cloning.
Protein Expression and Purification
Lysogeny broth media containing 100 µg/ml ampicillin (LB-amp, 5 ml) was inoculated with a single colony (Escherichia coli BL21 transformed with the appropriate plasmid) from an agar plate and incubated in an orbital shaker at 37 °C and 200 rpm for 15 h. This culture was used to inoculate terrific broth-amp media (500 ml), which was incubated at 37 °C and 250 rpm until the absorbance at 600 nm reached 1.0 (approximately 3–4 h). This culture was cooled at 17 °C and 200 rpm for 1 h, then isopropyl β-d-1-thiogalactopyranoside (1 mM) was added to induce protein expression and the cultivation continued for 20 h. The cells were harvested by centrifugation (8,000 rpm, 10 min at 4 °C) and suspended in buffer A (20 mM imidazole, 50 mM NaH2PO4, 300 mM NaCl, pH 7.2, 10 ml). The cells were disrupted by sonication (400 W, 40% amplitude for 5 min) and centrifuged (4 °C, 12,000 rpm, 45 min). The supernatant was loaded onto a column containing Ni-NTA resin (1 ml, Qiagen) pre-equilibrated with buffer A (30 ml). The column was washed with buffer A (50 ml) followed by buffer B (50 mM imidazole, 50 mM NaH2PO4, 300 mM NaCl, pH 7.2, 50 ml). The His-tagged protein was eluted with elution buffer (150 mM imidazole, 50 mM NaH2PO4, 300 mM NaCl, pH 7.2, 15 ml) and concentrated to approximately 1 ml with an Amicon ultrafiltration centrifuge tube (10 kD cutoff). The imidazole buffer was exchanged by four successive additions of BES buffer (5 mM, pH 7.0, 10 ml) followed by concentration to approximately 1 ml with the centrifuge tube.
Cleavage of Acetone Cyanohydrin
The release of cyanide from acetone cyanohydrin was measured using a modified König reaction (Selmar 1987; Andexer 2006). Enzyme (up to 8 µM in 5 mM citrate phosphate buffer, pH 5) was added to a solution of acetone cyanohydrin (1.28–10 mM in 0.1 M citric acid). After up to 20 min, the reaction was quenched by addition of aqueous N-chlorosuccinimide (2 mM, 62.5 µl, also containing 20 mM succinimide). After 2 min, barbituric acid (230 mM in 30% pyridine, 12.5 µl) was added and after 10 min, the absorbance at 580 nm was measured and compared with a calibration curve constructed using K2[Zn(CN)4] with concentrations of HCN ranging from 2.5 to 100 µM.
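Converting an absorbance reading into a rate through a calibration curve can be sketched in Python. This is an illustration under several assumptions: the calibration is taken to be linear over the 2.5 to 100 µM range, the standard absorbances and the sample values are invented, and blank subtraction and dilution by the assay reagents are ignored.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rate_per_min(absorbance, slope, intercept, enzyme_uM, minutes):
    """Cyanide released (µM) per µM enzyme per minute."""
    cyanide_uM = (absorbance - intercept) / slope
    return cyanide_uM / (enzyme_uM * minutes)


# Hypothetical calibration standards: HCN concentration (µM) vs absorbance at 580 nm.
standards_uM = [2.5, 10.0, 25.0, 50.0, 100.0]
standards_abs = [0.045, 0.12, 0.27, 0.52, 1.02]

slope, intercept = fit_line(standards_uM, standards_abs)

# Hypothetical sample: absorbance 0.42 after 20 min with 0.5 µM enzyme.
rate = rate_per_min(0.42, slope, intercept, enzyme_uM=0.5, minutes=20)
print(f"slope={slope:.4f} intercept={intercept:.4f} rate={rate:.2f} min^-1")
```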
Esterase Activity
Esterase activity was measured with methyl salicylate (500 µM) or with methyl pentanoate, 1-naphthyl acetate, or 2-naphthyl acetate (2 mM) in 5 mM BES buffer, pH 7, containing up to 50 µM enzyme. Control reactions were performed by adding equal volumes of buffer that had been removed from the enzyme by centrifugal filtration. Reactions were shaken at room temperature for up to 18 h before the enzyme was removed by centrifugation using Amicon Ultra 0.5 ml regenerated cellulose 10,000 NMWL centrifugal filters. Reactions with methyl salicylate, 1-naphthyl acetate, and 2-naphthyl acetate were analyzed by HPLC on an Agilent Eclipse Plus C18 column eluted with methanol:water + 0.1% formic acid (80:20) at 1.0 ml/min using a photodiode array detector. Methyl pentanoate hydrolysis was quantified using p-nitrophenol as a pH indicator as described previously (Janes 1998).
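One way to use the control reactions is to subtract the product formed without enzyme before normalizing by enzyme concentration and time. The sketch below assumes that product formation is linear over the incubation and that product concentrations have already been obtained from the HPLC or indicator data. The function name and the numbers are illustrative assumptions, not values from this study.

```python
def net_turnovers_per_hour(product_with_enzyme_uM, product_in_control_uM,
                           enzyme_uM, hours):
    """Average background-corrected turnovers per enzyme molecule per hour."""
    if enzyme_uM <= 0 or hours <= 0:
        raise ValueError("enzyme concentration and time must be positive")
    # Subtract spontaneous (non-enzymatic) hydrolysis seen in the control.
    net_product_uM = product_with_enzyme_uM - product_in_control_uM
    return net_product_uM / (enzyme_uM * hours)


# Hypothetical example: 120 µM product with enzyme, 30 µM in the control,
# 50 µM enzyme, 18 h incubation.
rate = net_turnovers_per_hour(120.0, 30.0, 50.0, 18.0)
print(f"Net rate: {rate:.2f} turnovers per hour")
```

A value near zero indicates that hydrolysis in the enzyme-containing reaction is indistinguishable from the spontaneous background.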
Acknowledgments
This work was supported by National Institutes of Health (NIH) grant 5R01GM102205 and National Science Foundation (NSF) grant 1152804 to A.M.D. and R.J.K.
References
- Andexer J, Guterl J-K, Pohl M, Eggert T. 2006. A high-throughput screening assay for acetone cyanohydrin lyase activity. Chem Commun. 40:4201–4203.
- Andexer JN, Staunig N, Eggert T, Kratky C, Pohl M, Gruber K. 2012. Hydroxynitrile lyases with α/β-hydrolase fold: two enzymes with almost identical 3D structures but opposite enantioselectivities and different reaction mechanisms. ChemBioChem 13:1932–1939.
- Andexer J, von Langermann J, Mell A, Bocola M, Kragl U, Eggert T, Pohl M. 2007. An R-selective hydroxynitrile lyase from Arabidopsis thaliana with an α/β-hydrolase fold. Angew Chem Int Ed Engl. 46:8679–8681.
- Auldridge ME, Guo Y, Austin MB, Ramsey J, Fridman E, Pichersky E, Noel JP. 2012. Emergent decarboxylase activity and attenuation of α/β-hydrolase activity during the evolution of methylketone biosynthesis in tomato. Plant Cell 24:1596–1607.
- Baas B-J, Zandvoort E, Geertsema EM, Poelarends GJ. 2013. Recent advances in the study of enzyme promiscuity in the tautomerase superfamily. ChemBioChem 14:917–926.
- Baier F, Tokuriki N. 2014. Connectivity between catalytic landscapes of the metallo-β-lactamase superfamily. J Mol Biol. 426:2442–2456.
- Beetham JK, Tian T, Hammock BD. 1993. cDNA cloning and expression of a soluble epoxide hydrolase from human liver. Arch Biochem Biophys. 305:197–201.
- Bergthorsson U, Andersson DI, Roth JR. 2007. Ohno’s dilemma: evolution of new genes under continuous selection. Proc Natl Acad Sci U S A. 104:17004–17009.
- Butler GW. 1965. The distribution of the cyanoglucosides linamarin and lotaustralin in higher plants. Phytochemistry 4:127–131.
- Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
- Eick GN, Colucci JK, Harms MJ, Ortlund EA, Thornton JW. 2012. Evolution of minimal specificity and promiscuity in steroid hormone receptors. PLoS Genet. 8:e1003072.
- Felsenstein J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 17:368–376.
- Forouhar F, Yang Y, Kumar D, Chen Y, Fridman E, Park SW, Chiang Y, Acton TB, Montelione GT, Pichersky E, et al. 2005. Structural and biochemical studies identify tobacco SABP2 as a methyl salicylate esterase and implicate it in plant innate immunity. Proc Natl Acad Sci U S A. 102:1773–1778.
- Gerlt JA, Babbitt PC, Rayment I. 2005. Divergent evolution in the enolase superfamily: the interplay of mechanism and specificity. Arch Biochem Biophys. 433:59–70.
- Glasner ME, Gerlt JA, Babbitt PC. 2006. Evolution of enzyme superfamilies. Curr Opin Chem Biol. 10:492–497.
- Gomez GA, Morisseau C, Hammock BD, Christianson DW. 2006. Human soluble epoxide hydrolase: structural basis of inhibition by 4-(3-cyclohexylureido)-carboxylic acids. Protein Sci. 15:58–64.
- Gouy M, Guindon S, Gascuel O. 2010. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol. 27:221–224.
- Gruber K, Gartler G, Krammer B, Schwab H, Kratky C. 2004. Reaction mechanism of acetone cyanohydrin lyases of the α/β-hydrolase superfamily. J Biol Chem. 279:20501–20510.
- Gruber K, Gugganig M, Wagner UG, Kratky C. 1999. Atomic resolution crystal structure of hydroxynitrile lyase from Hevea brasiliensis. J Biol Chem. 380:993–1000.
- Guex N, Peitsch MC. 1997. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18:2714–2723.
- Guterl JK, Andexer JN, Sehl T, von Langermann J, Frindi-Wosch I, Rosenkranz T, Fitter J, Gruber K, Kragl U, Eggert T, et al. 2009. Uneven twins: comparison of two enantiocomplementary hydroxynitrile lyases with alpha/beta-hydrolase fold. J Biotechnol. 14:166–173.
- Harms MJ, Thornton JW. 2013. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat Rev Genet. 14:559–571.
- Holmquist M. 2000. Alpha/beta-hydrolase fold enzymes: structures, functions and mechanisms. Curr Protein Pept Sci. 1:209–235.
- Horsman GP, Bhowmik S, Seah SY, Kumar P, Bolin JT, Eltis LD. 2007. The tautomeric half-reaction of BphD, a C-C bond hydrolase. Kinetic and structural evidence supporting a key role for histidine 265 of the catalytic triad. J Biol Chem. 282:19894–19904.
- Hughes AL. 1994. The evolution of functionally novel proteins after gene duplication. Proc Biol Sci. 256:119–124.
- Hughes J, Carvalho FJ, Hughes MA. 1994. Purification, characterization, and cloning of α-hydroxynitrile lyase from cassava (Manihot esculenta Crantz). Arch Biochem Biophys. 311:496–502.
- Humble MS, Berglund P. 2011. Biocatalytic promiscuity. Eur J Org Chem. 2011:3391–3401.
- Janes LE, Löwendahl AC, Kazlauskas RJ. 1998. Quantitative screening of hydrolase libraries using pH indicators: identifying active and enantioselective hydrolases. Chem Eur J. 4:2324–2331.
- Jones DT, Taylor WR, Thornton JM. 1992. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 8:275–282.
- Koo YJ, Yoon ES, Seo JS, Kim J-K, Choi YDJ. 2013. Characterization of a methyl jasmonate specific esterase in Arabidopsis. J Korean Soc Appl Biol Chem. 56:27–33.
- Kourist R, Jochens H, Bartsch S, Kuipers R, Padhi SK, Gall MG, Böttcher D, Joosten H-J, Bornscheuer UT. 2010. The α/β-hydrolase fold 3DM database (ABHDB) as a tool for protein engineering. ChemBioChem 11:1635–1643.
- Le SQ, Gascuel O. 2008. LG: an improved, general amino-acid replacement matrix. Mol Biol Evol. 25:1307–1320.
- Lenfant N, Hotelier T, Velluet E, Bourne Y, Marchot P, Chatonnet A. 2013. ESTHER, the database of the α/β-hydrolase fold superfamily of proteins: tools to explore diversity of functions. Nucleic Acids Res. 41:D423–D429.
- Mattern-Dogru E, Ma X, Hartmann J, Decker H, Stöckigt J. 2002. Potential active-site residues in polyneuridine aldehyde esterase, a central enzyme of indole alkaloid biosynthesis, by modelling and site-directed mutagenesis. Eur J Biochem. 269:2889–2896.
- Nakano S, Dadashipour M, Asano Y. 2014. Structural and functional analysis of hydroxynitrile lyase from Baliospermum montanum with crystal structure, molecular dynamics and enzyme kinetics. Biochim Biophys Acta. 1844:2059–2067.
- Nardini M, Dijkstra BW. 1999. α/β-Hydrolase fold enzymes: the family keeps growing. Curr Opin Struct Biol. 9:732–737.
- Näsvall J, Sun L, Roth JR, Andersson DI. 2012. Real-time evolution of new genes by innovation, amplification, and divergence. Science 338:384–387.
- O’Brien PJ, Herschlag D. 1999. Catalytic promiscuity and the evolution of new enzymatic activities. Chem Biol. 6:R91–R105.
- Okai M, Ohtsuka J, Imai LF, Mase T, Moriuchi R, Tsuda M, Nagata K, Nagata Y, Tanokura M. 2013. Crystal structure and site-directed mutagenesis analyses of haloalkane dehalogenase LinB from Sphingobium sp. strain MI1205. J Bacteriol. 195:2642–2651.
- Ollis D, Cheah E, Cygler M, Dijkstra BW, Frolow F, Franken SM, Harel M, Remington S, Silman I, Schrag JD. 1992. The α/β hydrolase fold. Protein Eng Des Sel. 5:197–211.
- Olson-Manning CF, Lee CR, Rausher MD, Mitchell-Olds T. 2013. Evolution of flux control in the glucosinolate pathway in Arabidopsis thaliana. Mol Biol Evol. 30:14–23.
- Padhi SK, Fujii R, Legatt GA, Fossum SL, Berchtold R, Kazlauskas RJ. 2010. Switching from an esterase to an acetone cyanohydrin lyase mechanism requires only two amino acid substitutions. Chem Biol. 17:863–871.
- Poulton JE. 1990. Cyanogenesis in plants. Plant Physiol. 94:401–440.
- Ringia EAT, Garrett JB, Thoden J, Holden HM, Rayment I, Gerlt JA. 2004. Evolution of enzymatic activity in the enolase superfamily: functional studies of the promiscuous o-succinylbenzoate synthase from Amycolatopsis. Biochemistry 43:224–229.
- Risso V, Gavira J, Mejia-Carmona DF, Gaucher E, Sanchez-Ruiz JM. 2013. Hyperstability and substrate promiscuity in laboratory resurrections of Precambrian β-lactamases. J Am Chem Soc. 135:2899–2902.
- Saitou N, Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 4:406–425.
- Selmar D, Carvalho FJP, Conn EE. 1987. A colorimetric assay for α-acetone cyanohydrin lyase. Anal Biochem. 166:208–211.
- Selmar D, Lieberei R, Biehl B, Conn EE. 1989. α-hydroxynitrile lyase in Hevea brasiliensis and its significance for rapid cyanogenesis. Physiol Plant. 75:97–101.
- Shimodaira H, Hasegawa M. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol. 16:1114–1116.
- Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.
- Thalji RK, McAtee JJ, Belyanskaya S, Brandt M, Brown GD, Costell MH, Ding Y, Dodson JW, Eisennagel SH, Fries RE, et al. 2013. Discovery of 1-(1,3,5-triazin-2-yl)piperidine-4-carboxamides as inhibitors of soluble epoxide hydrolase. Bioorg Med Chem Lett. 23:3584–3588.
- Thomson JM, Gaucher EA, Burgan MF, De Kee DW, Li T, Aris JP, Benner SA. 2005. Resurrecting ancestral alcohol dehydrogenases from yeast. Nat Genet. 37:630–635.
- Thornton JW. 2004. Resurrecting ancient genes: experimental analysis of extinct molecules. Nat Rev Genet. 5:366–375.
- Voordeckers K, Vanneste K, van der Zande E, Voet A, Maere S, Verstrepen KJ. 2012. Reconstruction of ancestral metabolic enzymes reveals molecular mechanisms underlying evolutionary innovation through gene duplication. PLoS Biol. 10:e1001446.
- Yang Z, Kumar S, Nei M. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641–1650.
- Yu G, Nguyen TT, Guo Y, Schauvinhold I, Auldridge ME, Bhuiyan N, Ben-Israel I, Iijima Y, Fridman E, Noel JP, et al. 2010. Enzymatic functions of wild tomato methylketone synthases 1 and 2. Plant Physiol. 154:67–77.
- Zhu W, Liu Y, Zhang R. 2015. A QM/MM study of the reaction mechanism of (R)-hydroxynitrile lyases from Arabidopsis thaliana (AtHNL). Proteins 83:66–77.