Significance
By applying phylogenetic tools, we infer the ancestral origins of a nonribosomal peptide synthetase (NRPS) condensation (C) domain responsible for β-lactam (antibiotic) synthesis. Subsequent in vitro experiments suggest a mechanism for how this unique synthetic function likely evolved. β-Lactam synthesizing ability appears to have emerged along with an adjacent epimerization (E) domain deletion and in parallel with other C domains whose catalysis presumably shares intermediates with β-lactam synthesis, including ᴅ-amino acid and dehydroamino acid synthesis. The implied mechanistic link between β-lactam formation and the latter two C-domain functions can be experimentally tested in the future to discern how evolutionary differences dictate synthetic outcome. These insights may constructively guide protein engineering efforts to introduce potentially valuable structural elements into NRPS products.
Keywords: nonribosomal peptide synthetase, condensation domain, evolution, β-lactam, nocardicin
Abstract
Nonribosomal peptide synthetases (NRPSs) are large, multidomain biosynthetic enzymes involved in the assembly-line–like synthesis of numerous peptide natural products. Among these are clinically useful antibiotics including three classes of β-lactams: the penicillins/cephalosporins, the monobactams, and the monocyclic nocardicins, as well as the vancomycin family of glycopeptides and the depsipeptide daptomycin. During NRPS synthesis, peptide bond formation is catalyzed by condensation (C) domains, which couple the nascent peptide with the next programmed amino acid of the sequence. A growing number of additional functions are linked to the activity of C domains. In the biosynthesis of the nocardicins, a specialized C domain prepares the embedded β-lactam ring from a serine residue. Here, we examine the evolutionary descent of this unique β-lactam–synthesizing C domain. Guided by its ancestry, we predict and demonstrate in vitro that this C domain alternatively performs peptide bond formation when a single stereochemical change is introduced into its peptide starting material. Remarkably, the function of the downstream thioesterase (TE) domain also changes. Natively, the TE directs C terminus epimerization prior to hydrolysis when the β-lactam is made but catalyzes immediate release of the alternative peptide. In addition, we investigate the roles of C-domain histidine residues in light of clade-specific sequence motifs, refining earlier mechanistic proposals of both β-lactam formation and canonical peptide synthesis. Finally, expanded phylogenetic analysis reveals unifying connections between β-lactam synthesis and allied C domains associated with the appearance of ᴅ-amino acid and dehydroamino acid residues in other NRPS-derived natural products.
Biochemical innovation in natural product biosynthesis is an important source and inspiration for new medicines. One large class of such naturally occurring metabolites are peptides derived from nonribosomal peptide synthetases (NRPSs), a superfamily of multidomain proteins. These molecules are associated with many well-known pharmacophores including siderophores like enterobactin and the pyoverdines, the glycopeptide antibiotics such as vancomycin, and the β-lactams including penicillins, monobactams, and the nocardicins. To access these diverse chemical structures, NRPSs utilize an assembly-line–like architecture consisting of enzymatic domains organized in a linear sequence (1, 2). Within each assembly line, domains are grouped into functional modules, which are individually responsible for incorporation of a single amino acid into the final product. Although each module consists principally of the same set of enzymatic domains, evolution has diversified NRPS systems to select for distinct amino acids and in some cases, to carry out specialized synthetic tasks, as is addressed here.
NRPS synthesis begins with the activity of an adenylation (A) domain, which selects and activates the carboxylate of specific monomers from the cellular pool through turnover of their cosubstrate ATP. All 20 proteinogenic and hundreds of nonproteinogenic amino acids are incorporated by A domains (3). Activated monomers are then trapped as thioesters by the phosphopantetheine cofactor of an adjacent peptidyl carrier protein (PCP) and covalently tethered to the NRPS. Subsequently, condensation (C) domains catalyze intermodular peptide bond formation between the thioester of an upstream PCP-bound donor and the amine of its intramodular PCP-bound amino acid acceptor. C, A, and PCP domains constitute the minimal NRPS module, which in a series of modules catalyze N- to C-peptide elongation (4). Fully mature peptides are typically released by hydrolysis or macrocyclization in a terminal thioesterase (TE) domain.
Additionally, several accessory domains can be utilized to tailor the structures and properties of nonribosomal peptides. The most common of these is the epimerization domain (E), which facilitates ʟ→ᴅ epimerization of amino acids from the cellular pool. To accomplish this stereochemical change, E domains are precisely inserted between PCP and C domains to facilitate inversion prior to condensation. Interestingly, in vitro studies have demonstrated that E-domain–catalyzed stereochemical inversion is thermodynamically controlled and that an equilibrium mixture of l- and ᴅ-isomers is formed bound to donor PCPs, only slightly favoring the ᴅ-antipode (5). To compensate for partial racemization, downstream C domains exhibit strict stereopreference and only facilitate peptide bond formation with ᴅ-amino acyl and peptidyl donors (6). This gatekeeping stereoselectivity is manifested in evolutionary history by C-domain divergence into predominantly two phylogenetic groups, denoted LCL in the usual untailored l-donor context and DCL in the context of ᴅ-amino acid incorporation. Several smaller clades have emerged as well to accommodate other substrates and modifications at C-domain donor sites (7).
In line with their functional complementarity, E and DCL domains are almost always paired in NRPSs. Their co-occurrence suggests a strong selective pressure maintains the E–DCL interface throughout evolutionary history despite myriad genomic duplication, insertion, deletion, and recombination events. With this generalization in mind, we present here a detailed evolutionary analysis of the nocardicin biosynthetic NRPS (Fig. 1) and highlight the unexpected correlation between an E-domain deletion and the emergence of β-lactam synthesis ability in the last C domain of NocB (C5). On the phylogenetic level, we determine that C5 descended from DCL domains and further identify sequence motifs in upstream PCP4 that reinforce the historical loss of an E domain. Evidence for similar E-domain deletions is found in only a small number of other NRPS systems, most notably those associated with the glycopeptide antibiotics (7–9). Importantly, however, the surviving DCL domains connected to these deletion events exhibit eroded stereochemical selectivity, appearing functionally LCL-like. Given these data, we test in vitro the specificity of C5 and demonstrate that it is a functional DCL catalyst and strikingly retains preference for ᴅ-donors, despite also catalyzing the conversion of l-serine into a β-lactam. Serendipitously, our results also implicate the embedded nocardicin β-lactam ring as a key determinant of the NocTE expanded epimerase function but not of TE acylation and hydrolysis. Using site-directed mutagenesis, we elaborate the mechanisms of NocB-C5 catalyzed β-lactam formation and DCL condensation in parallel and support divergent roles for conserved histidine residues in both reactions. Finally, using phylogeny, we infer the historical component of C-domain diversification and highlight how shared lineage with DCL domains provides an important framework for several expanded C-domain functions, including β-lactam synthesis in NocB-C5.
Fig. 1.
Proposed nonribosomal assembly of the nocardicins by NocA/NocB. Domain subscripts indicate module association. Red and blue dots at chiral centers highlight l- and ᴅ-configurations, respectively. Through the conventional activity of modules 1 through 4, the tetrapeptidyl intermediate l-Hpg-l-Arg-ᴅ-Hpg-l-Ser∼∼PCP4 is formed. Next, M5-C5 catalyzes integrated β-lactam formation proceeding through dehydration of the l-seryl donor, followed by β-addition of the acceptor amine of l-Hpg from M5, and finally 4-exo-trig cyclization to transfer the growing peptide to PCP5. The resulting PCP5-bound pentapeptidyl β-lactam is transferred to the terminal TE, where it undergoes complete epimerization prior to hydrolysis. Proteolysis of pronocardicin G (1) by cellular proteases yields the simplest isolable nocardicin β-lactam, nocardicin G (2).
Results and Discussion
Evolutionary History of the Nocardicin β-Lactam Synthesizing Module.
Contemporaneous with our recent discovery of the β-lactam synthesizing function of NocB-C5, we also observed that the core sequence motifs in C5 closely resemble those of DCL domains (10). From an organizational perspective, this observation greatly interested us for two reasons. First, in vitro reconstitution experiments with NocB-M5 established that the native donor for C5, and precursor of the nocardicin β-lactam, is an ʟ-seryl tetrapeptide (10). Second, DCL domains are thought to be rigidly stereospecific for ᴅ-peptidyl donors to ensure isomeric purity during peptide synthesis (6).
Motivated by these seemingly conflicting points, we sought to establish more precisely the evolutionary origins of C5 through phylogenetic reconstruction. To accomplish this goal, we compared NocB-C5 to 189 C, E, and heterocyclization (Cyc) domains from a curated collection of 23 biosynthetic gene clusters (BGCs). The BGCs were selected so that every major catalytic class of C domain would be represented (SI Appendix, Table S1) (7, 11). Corroborating our initial observations, we found that C5 does cluster with the DCL clade (Fig. 2 and Dataset S3). Several reconstruction methods and evolutionary models were tested, and they all point to the same relationship between C5 and DCL domains.
Fig. 2.
Unrooted phylogenetic tree of the C-domain superfamily with bootstrap support values computed from 100 bootstrap replicates. Phylogeny was reconstructed with PhyML 3.0, using the LG substitution matrix with the decorations +Γ+I+F. A total of 190 domains and 23 BGCs are represented in the tree. Colors indicate distinct functional clades, including the following: Epimerization (E, yellow), Heterocyclization (Cyc, blue), Starter (StarterC, purple), Dehydroamino acid associated (modAAC, light blue), Dual epimerization/condensation (DualC, orange), LCL (green), and DCL (red). Yellow star labels the tip for NocB-C5, the β-lactam–synthesizing domain. The full phylogenetic tree with tip labels is available in standard Newick format in Dataset S3. Associated alignment in FASTA format is available in Dataset S1.
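The bootstrap support values mentioned in the caption rest on resampling alignment columns with replacement. The following Python sketch illustrates only that resampling step, under stated assumptions: the function name `bootstrap_columns` and the toy sequences are hypothetical, the tree inference performed on each replicate (done with PhyML 3.0 in this work) is omitted, and the code is not the procedure PhyML itself runs internally.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    Columns are drawn with replacement, and the same column indices are
    applied to every sequence so that each resampled column stays intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


# Toy alignment with hypothetical sequences (not taken from Dataset S1).
toy_alignment = {
    "domain_a": "FHHAVLDGW",
    "domain_b": "YHHSIMDGW",
    "domain_c": "AHH-LTDGF",
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "replicates; first:", replicates[0])
```

In a full analysis, a tree would be inferred from each replicate, and the support for a branch would be the fraction of replicate trees that recover it.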
Interestingly, it is known that PCPs also differ based on their precise roles as donors for l- or ᴅ-amino acid–incorporating modules. The most notable variation between PCPs from each context is located in the CoreT motif (12). This motif includes the all-important serine residue, which becomes phosphopantetheinylated to facilitate peptidyl transfer and assembly. In general, when a PCP is a donor for an LCL domain, it is denoted a PCPC and has the consensus motif [GGHSL]. Likewise, when a PCP is a donor for an E and DCL domain, it is denoted a PCPE and has the consensus motif [GGDSI]. Histidine versus aspartic acid adjacent to the central serine was found to be the most important difference between these core motifs as swapping this residue in the ᴅ-amino acid–incorporating modules greatly hampered both E-domain epimerization and downstream DCL-domain condensation reactions (12). However, swapping the motif’s C-terminal aliphatic residue (I versus L) was inconsequential. Analyzing the PCPs of the nocardicin NRPS, we found that PCP4, the donor for C5, has the CoreT motif [GGDSV], classifying it as a PCPE (SI Appendix, Fig. S1).
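The CoreT classification rule described above can be written as a short Python sketch. It is illustrative only: the function name `classify_core_t` is hypothetical, and keying on the single residue immediately before the serine is a simplification of the consensus motifs [GGHSL] and [GGDSI] motivated by the residue-swap result in ref. 12, not a published classifier.

```python
def classify_core_t(motif):
    """Classify a 5-residue CoreT motif as PCP_C, PCP_E, or unknown.

    Assumption: the motif is written as in the text (e.g. GGHSL), with the
    phosphopantetheinylated serine at the fourth position. Only the residue
    immediately before that serine (H versus D) is used as the key.
    """
    motif = motif.strip().upper()
    if len(motif) != 5 or motif[3] != "S":
        raise ValueError(f"expected a 5-residue motif with Ser at position 4: {motif!r}")
    residue_before_ser = motif[2]
    if residue_before_ser == "H":
        return "PCP_C"  # donor for an LCL domain
    if residue_before_ser == "D":
        return "PCP_E"  # donor for an E and DCL domain
    return "unknown"


for name, motif in [
    ("LCL-context consensus", "GGHSL"),
    ("E/DCL-context consensus", "GGDSI"),
    ("NocB PCP4", "GGDSV"),
]:
    print(name, motif, classify_core_t(motif))
```

Ignoring the fifth position mirrors the observation that the I versus L swap was inconsequential, which is why [GGDSV] is still assigned to the PCPE class.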
Taken together, PCP4 and C5 contain sequence signatures inherited from the ᴅ-amino acid–incorporating context. These observations coupled with the absence of an intervening E domain suggest that an E-domain deletion immediately upstream of module 5 (M5) was a likely evolutionary event in the history of the nocardicin β-lactam synthesizing module.
Characterization of Module 5 Inferred Ancestral Function.
Having observed the sequence scars of an E-domain deletion, we became interested in determining if M5 was active in its putative ancestral context. In other words, would C5 act as a competent DCL catalyst and form peptide bonds when provided ᴅ-amino acyl or peptidyl donors bound to PCP4? A number of methods have been developed to directly probe activity of internal C domains (13, 14). However, it was found previously that only the method utilizing PCP-bound substrates (not small molecule substrate mimics) could successfully reconstitute the activity of M5 (10). Modeled after these earlier successes, we utilized the promiscuous activity of the phosphopantetheinyl transferase Sfp to load PCP4 with synthetic peptidyl–coenzyme A (CoA) thioesters. The native substrate for M5 is the PCP-bound tetrapeptide l-Hpg-l-Arg-ᴅ-Hpg-l-Ser∼∼PCP4 (3) [abbreviated native l-seryl donor; ∼∼ symbolizes the phosphopantetheine cofactor, and Hpg = (p-hydroxyphenyl)glycine] (Fig. 3A). Therefore, to probe DCL activity in M5-C5, we envisioned testing the ᴅ-seryl tetrapeptidyl donor, l-Hpg-l-Arg-ᴅ-Hpg-ᴅ-Ser∼∼PCP4 (4) (abbreviated ᴅ-seryl donor) (Fig. 3B). The corresponding ᴅ-seryl tetrapeptidyl CoA was synthesized and used to load PCP4 (SI Appendix, Compound S11). When the ᴅ-seryl donor 4 was reconstituted with M5, we were heartened to find the efficient accumulation of a different product, which comparison to a synthetic standard showed to be the expected pentapeptide from DCL coupling, l-Hpg-l-Arg-ᴅ-Hpg-ᴅ-Ser-ʟ-Hpg (5) (abbreviated ʟʟᴅᴅʟ) (Fig. 3C). The yield of the expected ʟʟᴅᴅʟ-pentapeptide 5 was nearly identical to that of pronocardicin G by 4 h under the conditions tested. These results indicate that C5 is a fully functional DCL catalyst, consistent with its inferred evolutionary origins.
Fig. 3.
M5 in vitro reconstitution. (A) Conceptual diagram of M5 reconstitution with its native l-seryl donor 3 to form pronocardicin G (1). (B) Conceptual diagram of M5 reconstitution with the nonnative ᴅ-seryl donor 4 forming ʟʟᴅᴅʟ-pentapeptide 5. (C) Product analysis of ᴅ-seryl reconstitution reactions. The product of the reconstitution reaction overlaps with a synthetic standard of the expected ʟʟᴅᴅʟ-pentapeptide 5, highlighted by blue boxes. (D) Accumulation of the ʟʟᴅᴅʟ-pentapeptide 5 is abolished in the M5-TE mutant S1779A, indicating the NocTE mediates product release.
TE-Mediated Release and Product Stereochemistry.
In addition to demonstrating DCL activity for M5-C5, accumulation of the ʟʟᴅᴅʟ-pentapeptide 5 also indicated the direct ᴅ-seryl condensation product tethered to PCP5 (l-Hpg-l-Arg-ᴅ-Hpg-ᴅ-Ser-l-Hpg∼∼PCP5, abbreviated ʟʟᴅᴅʟ∼∼PCP5) is readily hydrolyzed with apparent stereochemical retention at the C-terminal (p-hydroxyphenyl)glycine (Hpg). Previous substrate profiling experiments of the NocTE with pantetheine (pant)-derived peptides provided us with the expectation that the TE was highly sensitive to the identity of its substrates and might not tolerate linear peptides. Gaudelli et al. found that incubation of the NocTE with the single epimer substrate mimic, ʟʟᴅʟl-pant, resulted in slow hydrolysis equivalent to the control reaction at 3 h (15). In stark contrast, the near-native β-lactam–containing mimic epi-nocardicin G-pant, was completely epimerized and hydrolyzed under identical conditions. Therefore, to rule out spontaneous offloading, we mutated the TE nucleophilic serine to alanine within M5 (S1779A) and attempted to reconstitute M5 activity. Importantly, ʟʟᴅᴅʟ-pentapeptide 5 production was abolished in M5-S1779A in keeping with the TE-mediated hydrolytic release of 5 (Fig. 3D).
Due to the evident involvement of the TE in hydrolytic product release, we questioned the stereochemical assignment of the ʟʟᴅᴅʟ-pentapeptide 5. In its native context, the NocTE is known to catalyze near-complete inversion (≥100:1, ʟ→ᴅ) of the C-terminal Hpg of its β-lactam–containing substrate, epi-pronocardicin G, prior to hydrolytic release (15). The ʟʟᴅᴅʟ-pentapeptide 5 contains an identical C-terminal Hpg residue, and therefore, we expected that this residue would undergo TE-mediated epimerization as well. Therefore, we aimed to confirm by a secondary method, other than high-performance liquid chromatography (HPLC)/UV/ESI-MS comparison to an authentic standard, that the C-terminal stereochemistry of our putative ʟʟᴅᴅʟ-pentapeptide (5) product was indeed the unchanged C-terminal l-isomer. To achieve this aim, we repeated our in vitro reconstitution of M5 with l- and ᴅ-seryl donors in buffered D2O. While the pronocardicin G (1) produced by the l-seryl donor 3 was accompanied by a 2 Da shift as shown previously (one deuterium incorporation results from Hpg epimerization), the ʟʟᴅᴅʟ-pentapeptide 5 exhibited no deuterium incorporation, indicating that the C-terminal Hpg never loses its l-stereochemistry (SI Appendix, Fig. S2) (16). From this striking comparison, we conclude the product of M5 reconstitution with the ᴅ-seryl donor 4 is accurately defined as ʟʟᴅᴅʟ in accordance with well-established NRPS behavior.
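The reasoning behind the D2O experiment can be made concrete with a small calculation relating a mass shift to the number of incorporated deuterium atoms. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the hydrogen and deuterium masses are standard monoisotopic values, and the shifts passed in are the nominal 2 Da and 0 Da values quoted in the text rather than measured spectra.

```python
# Monoisotopic masses in Da (standard values).
MASS_H = 1.007825
MASS_D = 2.014102
SHIFT_PER_DEUTERIUM = MASS_D - MASS_H  # about 1.006 Da per H-to-D exchange


def expected_shift(n_deuterium):
    """Mass shift (Da) expected when n hydrogens are replaced by deuterium."""
    if n_deuterium < 0:
        raise ValueError("number of deuterium atoms cannot be negative")
    return n_deuterium * SHIFT_PER_DEUTERIUM


def estimated_deuterium_count(observed_shift):
    """Nearest whole number of incorporated deuterium atoms for an observed shift."""
    return round(observed_shift / SHIFT_PER_DEUTERIUM)


# Nominal shifts as described in the text (not raw instrument data).
for label, shift in [("pronocardicin G (1)", 2.0), ("LLDDL-pentapeptide 5", 0.0)]:
    print(label, "->", estimated_deuterium_count(shift), "deuterium atom(s)")
```

A shift of zero for pentapeptide 5 corresponds to no exchange at the C-terminal Hpg α-position, which is the basis for assigning its retained l-configuration.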
The release of ʟʟᴅᴅʟ 5 in the absence of epimerization highlights the critical importance of the nocardicin β-lactam in redirecting TE function to C-terminal epimerization prior to hydrolysis. A recent structure of NocTE with an accurate phosphonate mimic bound revealed the native β-lactam–containing peptide in a postepimerization configuration, with the phosphonate mimicking the tetrahedral intermediate of the final hydrolysis step (17). The active site appears to include an unoccupied pocket, which has been hypothesized to accommodate the l-Hpg isomer prior to epimerization. Mutagenesis of possible catalytic residues in the NocTE suggested that only the catalytic triad H1901 and nucleophilic S1779 were necessary for the epimerization and hydrolysis half reactions. Building upon the hypothesis that the unoccupied pocket in the TE accommodates l-Hpg prior to epimerization, it was proposed that the catalytic triad H1901 is dually responsible as a general base for the epimerization and hydrolysis half reactions. In support of this mechanism, the structure revealed H1901 is a short 4.5 Å from the C-α position of ᴅ-Hpg and exhibits the appropriate geometry for α-H abstraction in a model of the unepimerized peptide. More distant from the site of epimerization and hydrolysis, the nocardicin β-lactam exhibits well-resolved interactions with the TE surface. Since the ʟʟᴅᴅʟ-pentapeptide 5 is hydrolyzed in the absence of epimerization, we assume the protein–substrate interactions between the TE and the rigid β-lactam moiety are crucial for organizing the native peptide precisely for epimerization prior to hydrolysis.
Several valuable parallels can be drawn to the macrocyclizing TEs from tyrocidine (TycC-TE) and surfactin (Srf-TE) biosynthesis, which have been subjected to rigorous substrate profiling (18–21). These investigations revealed that the TycC- and Srf-TEs could accommodate many changes to the substrate backbone, but they were acutely sensitive to residue changes near the C terminus. Relative to the C terminus, TycC-TE required l-Orn at the second position and the Srf-TE required l-Leu and ᴅ-Leu at its first and second positions, respectively, for significant TE macrocyclization to occur (19, 21). Consistent with these observations, the eventual crystal structure of the Srf-TE with a dipeptidyl boronate inhibitor bound revealed binding pockets for both C-terminal Leu residues (21). Although the TycC- and Srf-TEs are functionally distinct from the NocTE, they may exemplify the parallel mechanistic principle that extended protein–substrate interactions are critical for substrate positioning and controlling competing reaction rates of hydrolysis or macrocyclization.
M5 Donor Substrate Preference.
In order to judge the efficiency of native and ancestral catalysis by M5, we sought to compare the rates of pronocardicin G (1) and ʟʟᴅᴅʟ-pentapeptide 5 formation in parallel time-course experiments. Assuming both products have nearly identical extinction coefficients of absorption (both contain three chromophoric Hpg units), we used peak integration at 230 nm as a measure of product formation. Interestingly, the net tandem efficiency of turnover in M5 (C5 turnover + TE hydrolytic release) for the nonnative ᴅ-seryl donor 4 producing ʟʟᴅᴅʟ 5 proceeded with an ∼twofold greater rate than turnover of the native l-seryl donor 3 to produce pronocardicin G (1) (Fig. 4 A and B). Initial rates were calculated from the first hour of product formation. Although overall turnover of the ᴅ-seryl donor 4 by M5 constitutes a simpler reaction pathway than the native reaction, we believe the substantially increased product accumulation for the nonnative ᴅ-seryl donor is consistent with M5 having a preevolved substrate preference for ᴅ-donors.
Fig. 4.
Comparative time course of pronocardicin G (1) versus ʟʟᴅᴅʟ-pentapeptide 5 formation. (A) Representative HPLC time-course analysis of the rate of product formation. Red and blue boxes highlight the formation of 1 and 5, respectively. Reference also SI Appendix, Figs. S5 and S6. (B) Product peak area at 230 nm over time. Dashed line represents the regression line for the first hour of data (slope ± SEM). The difference in slope for each line indicates that the ʟʟᴅᴅʟ-pentapeptide 5 forms ∼twofold faster than the native product pronocardicin G (1).
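The initial-rate comparison described here amounts to fitting a line to the first hour of peak-area data for each product and comparing the slopes. The Python sketch below shows one way to obtain a slope and its standard error by ordinary least squares. The function name `slope_with_sem` is hypothetical, and the time points and peak areas are invented placeholder values chosen only to illustrate an approximately twofold difference; they are not the measurements behind Fig. 4.

```python
import math


def slope_with_sem(x, y):
    """Ordinary least-squares slope and its standard error."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sem = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, sem


# Placeholder data: time in minutes and peak area at 230 nm (arbitrary units).
time_min = [0, 15, 30, 45, 60]
area_product_1 = [0, 10, 21, 29, 40]  # hypothetical values for pronocardicin G (1)
area_product_5 = [0, 21, 39, 61, 80]  # hypothetical values for pentapeptide 5

slope_1, sem_1 = slope_with_sem(time_min, area_product_1)
slope_5, sem_5 = slope_with_sem(time_min, area_product_5)
print(f"product 1: {slope_1:.3f} +/- {sem_1:.3f} area units per min")
print(f"product 5: {slope_5:.3f} +/- {sem_5:.3f} area units per min")
print(f"rate ratio (5 over 1): {slope_5 / slope_1:.2f}")
```

Comparing slopes of raw peak areas is meaningful only under the assumption stated in the text, that both products have nearly identical extinction coefficients at 230 nm.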
Under unoptimized kinetic conditions with the native l-seryl donor 3 (10:1 l-seryl donor:M5), a minor coproduct peak was noted that overlaps with the ʟʟᴅᴅʟ-pentapeptide 5 in our analysis method (SI Appendix, Fig. S3). To ensure that a substrate stereochemical impurity would not interfere with our rate determination, ʟʟᴅʟ- and ʟʟᴅᴅ-CoAs (SI Appendix, Compounds S10 and S11) were purified by extensive HPLC, demonstrating baseline separation of the two stereoisomers prior to our time-course study (SI Appendix, Fig. S4). Interestingly, the minor peak formed irrespective of our additional step to ensure stereochemical purity. After obtaining an exact mass for this peak, we discovered it corresponded to a mixture of a pentapeptide and nocardicin G (2) (the major wild-type (WT) product pronocardicin G (1) minus its N-terminal dipeptide). Presumably, trace cellular protease(s) copurify with M5 despite multiple purification steps. Empirically, we found this secondary product formation was minimized when the M5 concentration was kept low (1/100 the concentration of donor) (SI Appendix, Fig. S5). Ultimately, these optimized conditions were used for the time-course experiments so that aberrant consumption of l-seryl donor 3 or product pronocardicin G (1) would not significantly impact the initial rate determinations.
Reexamination of the Role of Histidine in C5-Catalyzed β-Lactam Formation.
Motivated by our results framing the global sequence and functional context from which β-lactam synthesis in C5 evolved, we sought to reexamine the mechanism proposed previously for integrated β-lactam formation, particularly the roles of active site histidine residues (10, 16). Throughout the greater C-domain superfamily there exists the highly conserved active site motif HHxxxDG, where the second histidine (H) is generally accepted to be essential for catalysis. Mutation of this residue in many C domains leads to near abolishment of C-domain function (22–24). In view of these results, different mechanistic roles for this residue have been proposed. The oldest proposal, by analogy to chloramphenicol acyltransferase, suggests that this conserved histidine participates in general base catalysis by deprotonating the acceptor amine for nucleophilic addition into the donor thioester (22, 25). More recently, alternative proposals have asserted the conserved histidine helps stabilize the zwitterionic transition state formed during nucleophilic addition or merely positions the acceptor substrate optimally for catalysis (26, 27). Nevertheless, structures of C domains with accurate mimics of both substrates in peptide bond formation have yet to be obtained, leaving the precise positioning of substrates relative to catalytic residues and, therefore, the interpretation of catalytic roles difficult to state definitively. The nocardicin β-lactam synthesizing domain C5 has an extended histidine motif (H790H791H792xxxDG; numbers indicate residue positions in NocB). Mutagenesis of all three histidine residues in M5-C5 to alanine or glutamine was massively detrimental to pronocardicin G (1) production in vitro, where the H790 and H792 mutants resulted in complete loss of activity (10, 16).
Consistent with these results and an array of mechanistic experiments, it was proposed that H790 and H792 both function as general bases to facilitate E1cb dehydration of the l-seryl donor 3, β-addition of the acceptor amine, and 4-exo-trig cyclization to form the nocardicin azetidinone and facilitate intramodular transfer of the growing peptide. Both E1cb elimination and the intermediacy of dehydroalanine as chemical steps in C5-catalyzed β-lactam formation were strongly supported by experiments with carefully designed synthetic substrates (10, 16). However, the precise roles of these His residues in β-lactam formation, as in conventional peptide bond formation, have been examined only preliminarily by mutagenesis and require further support.
Having established an evolutionary and functional connection between C5 and DCL domains, we were interested to see how the extended core His motif in C5 relates to its ancestry. We measured the frequency of all residues upstream of the canonical His motif in the C domains from our curated collection of 23 BGCs. Across 147 C domains, we found that histidine occurred with a frequency of 5.4% (SI Appendix, Fig. S7). However, when looking only at DCL domains, we were struck to find the frequency of histidine increased to 22.9% (35 domains). We also observed that the top three most frequently encountered residues at this position in the DCL family were F > H > Y—all aromatic residues (SI Appendix, Fig. S7). Together, these three aromatic residues accounted for three quarters of the residues immediately upstream of the His-motif in the DCL domains of our dataset. Naturally, we were intrigued to know what importance the presence of an aromatic ring at this position played in the DCL family and perhaps if this chemical feature was more important than the base properties of His for integrated β-lactam formation in C5. In addition, we were interested in exploring the constraints on DCL catalysis, as we assumed this context was significant to the emergence of β-lactam synthesis. All three histidine residues in the M5-C5 core His-motif were mutated to glutamine, phenylalanine, and tyrosine. Then each mutant was reconstituted with l- and ᴅ-seryl donors in parallel with WT-M5.
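The residue tally described in this paragraph can be sketched in Python as a motif search followed by a count. This is a minimal illustration, not the analysis pipeline used for SI Appendix, Fig. S7: the function names are hypothetical, the sequences are invented toy strings, and the sketch assumes ungapped sequences in which the canonical motif can be located with the pattern `HH...DG`.

```python
import re
from collections import Counter

HIS_MOTIF = re.compile(r"HH...DG")  # HHxxxDG, where x is any residue


def upstream_residue(sequence):
    """Residue immediately upstream of the first HHxxxDG match, or None."""
    match = HIS_MOTIF.search(sequence)
    if match is None or match.start() == 0:
        return None
    return sequence[match.start() - 1]


def upstream_frequencies(sequences):
    """Fraction of sequences carrying each residue upstream of the His motif."""
    residues = [r for r in map(upstream_residue, sequences) if r is not None]
    total = len(residues)
    counts = Counter(residues)
    return {res: n / total for res, n in counts.items()} if total else {}


# Toy sequences (hypothetical; not from the curated set of 23 BGCs).
toy_sequences = [
    "AAFHHAAADGLL",  # F upstream of the motif
    "MKTHHHAVGDGA",  # extended HHH motif, as in C5: H upstream
    "GSYHHLLVDGTT",  # Y upstream
    "PLAHHGGSDGKK",  # A upstream
]

for residue, fraction in sorted(upstream_frequencies(toy_sequences).items()):
    print(residue, f"{fraction:.2f}")
```

For an extended motif such as HHHxxxDG, the pattern matches starting at the second histidine, so the reported upstream residue is the first histidine, which corresponds to position 790 in the C5 numbering.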
While the β-lactam–containing pronocardicin G (1) synthesis ceased in all H790 and H792 mutants tested previously, we were astonished to see the accumulation of a small amount of the wild-type product from the H790Y mutant (Fig. 5A). Judging that tyrosine is not a basic residue, we think it unlikely that His at this position is acting as a general base to promote the initial E1cb dehydration of the l-seryl donor, as previously proposed (10, 16). Consequently, we speculate that H792 may be the more plausible candidate involved in this crucial C-α deprotonation and subsequent β-elimination step, as this residue remained immutable in our assays (Fig. 5C) (SI Appendix, Fig. S8). The only other mutants that produced detectable pronocardicin G (1) were the mutants of H791, which mirror our previous reports (Fig. 5B) (10, 16). This position is generally thought to be conserved for structural reasons, but no catalytic role is expected.
Fig. 5.
Effect of M5-C5 His-motif [H790H791H792xxxDG] mutagenesis on C5-catalyzed β-lactam synthesis and DCL condensation in parallel. The effects of mutations are displayed by residue position: (A) H790, (B) H791, (C) H792. Red and blue boxes highlight the reaction products pronocardicin G (1) and ʟʟᴅᴅʟ-pentapeptide 5, respectively.
Interestingly, C5-catalyzed DCL condensation reactions were far more amenable to mutational change as detectable ʟʟᴅᴅʟ-pentapeptide 5 formation was measured in nearly all mutants assayed. Notably, mutation of H790 to aromatic residues saw equal or even improved production of the ʟʟᴅᴅʟ-pentapeptide 5 (Fig. 5A). On the other hand, the nonaromatic replacement, glutamine, was slightly detrimental at H790. Mutation of H791 was also broadly tolerated. We note that despite multiple purification steps, the aromatic mutants of H791 appeared to exhibit degradation prior to assay initiation (SI Appendix, Fig. S10). Nonetheless, even with degradation and a reduced fraction of active protein, the H791 aromatic replacements exhibited activity comparable to the glutamine mutant (Fig. 5B). Mutations of H792 yielded the opposite trend as H790, where glutamine at this position yielded near-WT production of ʟʟᴅᴅʟ-pentapeptide 5, but phenylalanine and tyrosine yielded reduced to nearly undetectable pentapeptide synthesis (Fig. 5C). Remarkably, significant amounts of ʟʟᴅᴅʟ-pentapeptide 5 were formed even in H792Q/F mutants, consistent with an active site base not being needed for C-domain catalysis of peptide bond formation. The second His-motif histidine [HHxxxDG] has been mutated with minimal detrimental effect on peptide bond formation in a few other systems, but to our knowledge only in the StarterC and LCL subtypes (28, 29). Our observation that this residue is similarly mutable in C5-catalyzed peptide bond formation directly addresses the peptide bond-forming step in DCL domains in internal modules.
Although C5-H792 is unnecessary for peptide bond formation, the consistent importance of this residue for β-lactam formation is a key point of discussion. The current mechanism supported for β-lactam synthesis in C5 requires base catalysis for an E1cb mechanism (SI Appendix, Fig. S8) (16). While we cannot definitively define H792 as a catalytic base in the proposed mechanism, many parallels can be drawn to epimerization (E) domains that support this idea. While considered one of the more distant members of the C-domain family, E domains are structurally similar to C domains and share the consensus His-motif [HHxxxDG]. Furthermore, E domains share an intimate evolutionary history with DCL domains, as they are almost universally juxtaposed in NRPS assembly lines. The proposed rationale for juxtaposition is that E and DCL domains evolved from a domain duplication event and over time manifested complementary function and substrate specificities. An analogous duplication event was characterized in vibriobactin biosynthesis, which exhibits tandem Cyc domains (29). Furthermore, duplication events are well-known drivers of functional diversification (30). Recently, it was shown that small differences in the crossover regions between N- and C-terminal lobes characteristic of the C-domain superfamily appear to block in E domains what would be their acceptor site, rendering them incapable of peptide bond formation (31, 32). Studies in the gramicidin S synthetase initiation module (GrsA) encompass most of what is mechanistically known about E domains. It is proposed that the second histidine of the conserved His-motif (GrsA-H753) acts as a general base for C-α deprotonation, resulting in an achiral enolate. A precisely positioned glutamate (GrsA-E892) on the opposite face is thought to compete for reprotonation to effectively racemize the C-α stereochemistry.
Using tritium-labeled donors, a set of mutants was screened in GrsA, and it was shown that both the H753 and E892 were critical for epimerization (5). However, of the two residues, only H753 was necessary for 3H-label washout, implicating H753 as a catalytic base. Complementing these data, the recent PCPE–E didomain structure of GrsA revealed the donor-site interactions of a phosphopantetheine cofactor with a C-domain superfamily member (31). It is striking that the phosphopantetheine in this structure is presented such that the canonical active site motif histidine would be positioned optimally for C-α hydrogen abstraction, fully consistent with its proposed role and mutagenesis.
Reexamining earlier data on NocB-M5 turnover of deuterium-labeled donors, we see that deuterium washout by NocB-C5-H792 is fully consistent with the outcomes of GrsA-H753 tritium-label washout (16). Considering C5-H790 was not essential for β-lactam formation but H792 was, the involvement of H792 as a base for C-α deprotonation during integrated β-lactam formation is probable.
Discussion
Enzyme Model for the Emergence of β-Lactam Synthesis.
In our assessment, PCPE–E–DCL interfaces appear to be nearly universally conserved in bacteria. Therefore, it is striking that the E-domain deletion upstream of NocB-M5 survived evolutionary history and natural selection such that we could observe inherited sequence scars of the deletion event and yet characterize the ancestral function of the surviving domains. Certainly, the coupled emergence of β-lactam synthesis ability by NocB-C5 provides an evolutionary advantage that would be selected for positively. However, it appears mechanistically problematic that the upstream E-domain deletion would have been maintained long enough for C5 to evolve beneficial function. By strict estimations, an E-domain deletion in an NRPS assembly line should be severely detrimental to NRPS function, as DCL domains are widely held to exhibit specificity for ᴅ-donors, but the cellular pool consists primarily of l-amino acids (6). Therefore, we suggest that C5 likely exhibited trace levels of its modern-day function prior to E-domain deletion. This phenomenon, known generally as promiscuity or moonlighting, is increasingly recognized as a driving force for enzyme functional evolution. For example, the innovation–amplification–divergence model of evolution presumes that new enzyme functions are weakly present in ancestral genes prior to their functional divergence and optimization (30). What is often required to manifest new function is a change in the context in which the weakly multifunctional enzyme is found. The fact that NocB-C5 exhibits both β-lactam synthesis and DCL peptide bond formation in the same active site supports the presumption that moonlighting β-lactam synthesis is compatible with DCL catalysis and could have existed prior to E-domain deletion. However, given a functional mechanism for donor epimerization, DCL peptide bond formation would have likely dominated, as modern C5 is a functional DCL catalyst.
Ancestral Framework for C-Domain Innovation.
It is increasingly appreciated that C domains are a functionally diverse superfamily of enzymes involved in more than just peptide bond formation. While NocB-C5 is the only known β-lactam–synthesizing C domain, several distinct functional clades are thought to catalyze related chemical transformations. For example, the modAAC- and DualC-domain families strongly correlate with the incorporation of dehydroamino acids when β-hydroxy precursors are presented on their upstream donor, especially serine and threonine residues (11, 33–35). In relation to the mechanism for β-lactam synthesis in NocB-C5, which includes a well-supported dehydroalanyl intermediate derived from serine, the integrated synthesis of dehydroamino acids by C domains is probable (10). Additional evidence can be found in a recent study conducted by Patteson et al. in which the biosynthesis of l-2-amino-4-methoxy-trans-3-butenoic acid (AMB) was reconstituted in vitro (35). Using deuterium-labeled amino acid precursors and chemical capture methods, it was deduced that a transient dehydroamino acid intermediate is generated by a modAAC domain on pathway to AMB.
From these data, we predict that dehydroamino acid synthesis ability in NRPS biosynthesis is rightly attributed to C domains and that at least three distinct C-domain subtypes have convergently evolved this function: the modAAC, DualC, and now DCL domains (notably NocB-C5). We were fascinated to find that all three dehydroamino acid–associated subtypes grouped closely to one another in our phylogenetic analysis (Fig. 2). In the unrooted tree, however, it is unclear if this distance is indicative of shared evolutionary history, as unrooted trees do not show lineal descent (36). As an exercise, we attempted to root our earlier phylogeny in order to determine the precise relationship between the convergent syntheses of dehydroamino acids associated with C domains in NRPS assembly lines. We empirically tested members of the CoA-acyltransferase clan (Pfam: CL0149) to which C domains belong and identified the wax ester synthase/diacylglycerol O-acyltransferase (WES/DGAT, Pfam: PF03007) and the polyketide-associated protein A5 (PapA5, Pfam: PF16911) enzyme families as suitable outgroups to root our phylogeny. These two enzyme families show high structural and functional similarity to C domains and are prevalent in bacteria, where the majority of NRPSs are found (SI Appendix, Fig. S11) (37, 38). However, to our knowledge, neither family is linked to NRPS biosynthetic pathways. Instead, the WES/DGAT and PapA5 enzyme classes assist bacterial lipid metabolism and virulence through transesterification of CoA-esters with alcohols to produce lipid esters. Sampling 100 sequences from the WES/DGAT- and PapA5-domain families in the phyla Actinobacteria and Proteobacteria, we reconstructed our C-domain phylogeny in the presence of an outgroup to infer lineage (Fig. 6 and Dataset S4).
Fig. 6.
Rooted phylogeny of the C-domain superfamily with bootstrap support values computed from 100 bootstrap replicates. All C domains are monophyletic. However, two superclades are distinguishable, relating to early divergence in C-domain function. These clades are denoted l-clade and ᴅ-clade, after their primary members, LCL and DCL domains, respectively. The ᴅ-clade could represent an extended lineage of C domains that coevolved with E-domain function, as all ᴅ-clade members purportedly exhibit some control over the timing of condensation with l-donors. In contrast, the l-clade could represent those C domains that evolved independently. Subsequent E-domain deletions would have manifested the modAAC and DualC domains, as well as NocB-C5, the β-lactam synthesizing domain (family association indicated by yellow star). The full phylogenetic tree with tip labels is available in standard Newick format in Dataset S4. Associated alignment in FASTA format is available in Dataset S2.
In the resulting phylogenetic tree, we were gratified to see that all three dehydroamino acid–associated subtypes form a monophyletic group (denoted the ᴅ-clade). Given this ordered link between the dehydroamino acid–associated domains, further experiments are needed to determine whether each of these families evolved parallel mechanisms of catalysis. Such mechanistic parallels are common in mechanistically diverse enzyme superfamilies, especially when divergent function is derived from a shared intermediate (39). In such cases, key features inherited from the clade’s common ancestor for the generation of this intermediate are often preserved. From this logic, we imagine that the specialized synthetic tasks in the ᴅ-clade could be reconciled through generation of an enolate intermediate, which could be easily partitioned to achieve ᴅ-amino acid synthesis, dehydroamino acid synthesis, and β-lactam synthesis (Fig. 7). The role of this versatile anion is supported in E domains and NocB-C5 catalysis.
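The notion of monophyly invoked here can be made concrete with a small sketch. The toy rooted tree below is purely illustrative: its branching order is an assumption invented for this example and is not the topology of Fig. 6 or Dataset S4, and the helper names (`tips`, `is_monophyletic`) are hypothetical. A set of tips is treated as monophyletic when some node of the rooted tree has exactly those tips as its descendants, a test that is only meaningful once an outgroup has fixed the root.

```python
# Toy rooted tree written as nested tuples; leaves are strings.
# ASSUMPTION: this branching order is invented for illustration only
# and does not reproduce the phylogeny reported in the article.
TOY_TREE = (
    "outgroup",
    (
        ("LCL", "StarterC"),
        ("E", ("DCL", ("modAAC", "DualC"))),
    ),
)


def tips(node):
    """Return the set of leaf labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    labels = set()
    for child in node:
        labels |= tips(child)
    return labels


def is_monophyletic(tree, group):
    """True if some node's leaf set equals `group` exactly."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


print(is_monophyletic(TOY_TREE, {"DCL", "modAAC", "DualC"}))  # True
print(is_monophyletic(TOY_TREE, {"LCL", "DCL"}))              # False
```

In this toy topology the three dehydroamino acid–associated labels share a node that excludes everything else, whereas LCL and DCL do not; the same check applied to a real tree would require reading the Newick file with a phylogenetics library.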
Fig. 7.
Possible routes to the divergent functions of the ᴅ-clade. (A–D) These steps outline mechanistic features that could underlie the formation of the proposed common intermediate (E), a donor enolate. Importantly, (B) emphasizes the donor discrimination of DCL domains when a ternary complex is formed. (F) Protonation of the common enolate from the face opposite to proton abstraction followed by 1,2-addition (G) would result in ᴅ-amino acid incorporation, as in DualC domains. (H) When a β-hydroxy amino acid is presented as the donor, the resulting enolate can undergo dehydration to a dehydroamino acid. (I) Subsequent 1,4-addition followed by 4-exo-trig cyclization results in β-lactam synthesis, as observed in NocB-C5. (J) Alternatively, 1,2-addition results in dehydroamino acid incorporation, as proposed for modAAC and DualC domains.
Strong precedent for parallel evolution of mechanisms in the ᴅ-clade can be found in the earliest study of DualC domains. This work of Balibar et al. interrogated the timing of epimerization in DualC domains and found that epimerization necessarily precedes condensation (33). Therefore, it was inferred with respect to peptide bond formation that DualC domains are DCL catalysts (33). On the molecular level, it is unknown how DualC or DCL domains discriminate against condensation with l-donors. However, the appearance of identical stereochemical control in these domain families, coupled with their common ancestry, reinforces the parallel evolution hypothesis. Furthermore, Balibar et al. showed that initiation of the epimerization–condensation sequence by DualC domains is only accessible when the downstream amino acyl acceptor is present and primed for condensation in a ternary complex (33). Ternary complex formation is necessary for all C-domain–catalyzed peptide bond-forming reactions (Fig. 7). However, in the context of a multistep sequence, it is noteworthy that preformation of the complex was required for the prior epimerization step. Analogously, initiation of NocB-C5–catalyzed β-lactam formation was also found to be dependent on initial ternary complex formation. Evidence for this precondition was demonstrated in a previous study in which E1cb dehydration of the native l-seryl donor and subsequent β-lactam synthesis were absent in NocB-M5 reconstitution reactions lacking ATP and l-Hpg (16). These components are necessary for priming the amino acyl acceptor by the intramodular A domain. The timing of dehydroamino acid synthesis associated with modAAC domains has not been investigated. However, given that donor site stringency enhances the fidelity of NRPS products broadly throughout the ᴅ-clade, we predict that l-donor coupling is regulated in modAAC domains such that donor dehydration precedes condensation. Further work is needed to explicitly address this assumption.
The evolution of stereochemical control by C domains is a feature that has long been associated with the general function of E domains. Because E domains effectively racemize C-α stereochemistry, the complementary specificity for ᴅ-donors in DCL-domain coupling ensures isomeric purity during product assembly. From this framework, it is interesting to see the manifestations of DCL function in other members of the ᴅ-clade that are not functionally coupled to E domains. We can infer that DCL control is an inherited trait from the clade’s most recent common ancestor, as loss of chirality or ʟ→ᴅ epimerization is broadly associated with couplings specific to the ᴅ-clade. Given that E domains diverged before C-domain specialization in our rooted phylogeny, it is possible that the entire ᴅ-clade coevolved alongside E domains (Fig. 6). Subsequent deletion events could have then manifested the functional divergence of modAAC and DualC domains, similar to the inferred history of β-lactam synthesis ability in NocB-C5. From our analysis, modAAC domains are the only other domain class, outside of E and DCL domains, that inherits association with PCPE donors (SI Appendix, Fig. S7). Therefore, a distinct deletion event would be coupled to DualC divergence.
In sum, these observations, coupled with the phylogeny explored here, provide a preliminary model to explain the differentiation of function among C domains. Future work will need to further probe the catalytic mechanisms of ᴅ-clade C domains specifically, to determine the extent to which the principle of parallel evolution is supported. Comparison of motif differences between C-domain clades along the evolutionary trajectory may provide clues about the amino acid changes that control synthetic outcome. Such information could then be applied in protein engineering efforts to incorporate β-lactams, ᴅ-amino acids, dehydroamino acids, and possibly even derivatives thereof into innovative NRPS products. Furthermore, this model explains in part how C-domain donor site stereochemical control and perhaps familial foundations for anion synthesis support the unexpected emergence of β-lactam synthesis ability in NocB-C5 from DCL domains.
Materials and Methods
Synthesis of Substrates and Standards.
All compounds used in this study (S1-S11, 5) were synthesized from commercially available precursors (SI Appendix, Synthetic Methods).
Sequence and Phylogenetic Analysis.
All BGC data used in this study were obtained from the Minimum Information about a BGC repository (Version 2.0) (40). With a select set of BGCs, we identified the subsequences for C-domain superfamily members with the program hmmsearch from the frequently used software package HMMER3 (SI Appendix, Table S1). hmmsearch utilizes a Profile Hidden Markov Model (pHMM) to identify domain start and end sites (41). We supplied the pHMM PF00668 from the Pfam database, which recognizes all C-domain superfamily members (42). Additional details about the refinement of hits from hmmsearch and the padding of partially recognized domains are included in SI Appendix, Sequence and Phylogenetic Analysis. Additional details are also included on the collection of outgroup sequences referenced in our tree-rooting process. After acquiring our sequences, we implemented the MUSCLE algorithm integrated into the bioinformatics application Jalview to align all C domains (43, 44). The alignment was then manually trimmed to remove extraneous and poorly aligned sequence at the termini. The C-domain alignments used in this study are available for download in standard FASTA format (Datasets S1 and S2). The phylogenetic trees displayed in this study are maximum-likelihood trees reconstructed with the software PhyML 3.0 (45). We employed the Le-Gascuel (LG) substitution matrix with the decorations +Γ+I+F for tree reconstruction. This model was identified as the best model by the Smart Model Selection algorithm using the Akaike Information Criterion (AIC) (46). Topology support was computed using 100 bootstrap replicates. Phylogenetic trees were visualized and annotated with the R language packages ggtree and treeio (47, 48). Both rooted and unrooted phylogenetic trees reconstructed in this study are available for download in standard Newick format (Datasets S3 and S4).
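The domain-extraction step described above can be sketched as follows, assuming hmmsearch was run with its `--domtblout` option so that one whitespace-delimited row is written per domain hit. The parser reads the envelope coordinates of each hit and slices the corresponding subsequence. The example row, the sequence name `toy_protein`, the toy model length, and the helper names are all hypothetical; the refinement and padding rules actually applied are described only in SI Appendix and are not reproduced here.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    target: str      # name of the searched protein sequence
    query: str       # name of the profile HMM
    qlen: int        # length of the profile HMM
    i_evalue: float  # independent E-value of this domain
    hmm_from: int    # first matched profile position (1-based)
    hmm_to: int      # last matched profile position
    env_from: int    # envelope start on the target (1-based, inclusive)
    env_to: int      # envelope end on the target (inclusive)


def parse_domtblout(text):
    """Parse the per-domain table written by `hmmsearch --domtblout`."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        f = line.split()
        hits.append(DomainHit(
            target=f[0], query=f[3], qlen=int(f[5]),
            i_evalue=float(f[12]),
            hmm_from=int(f[15]), hmm_to=int(f[16]),
            env_from=int(f[19]), env_to=int(f[20]),
        ))
    return hits


def extract(sequence, hit):
    """Slice the envelope region (1-based inclusive) out of a sequence."""
    return sequence[hit.env_from - 1:hit.env_to]


def hmm_coverage(hit):
    """Fraction of the profile matched; low values flag partial domains."""
    return (hit.hmm_to - hit.hmm_from + 1) / hit.qlen


# HYPOTHETICAL row and sequence, invented to exercise the parser.
EXAMPLE_ROW = (
    "toy_protein - 40 Condensation PF00668 20 1e-50 170.0 0.1 "
    "1 1 1e-53 1e-50 170.0 0.1 1 20 6 25 4 28 0.95 invented example"
)
TOY_SEQ = "M" * 3 + "A" * 25 + "G" * 12  # 40 residues

hit = parse_domtblout(EXAMPLE_ROW)[0]
print(extract(TOY_SEQ, hit), hmm_coverage(hit))
```

Hits whose `hmm_coverage` falls well below 1 would be the candidates for the padding step mentioned in the text; the threshold to use is left open here because the source does not state one.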
General Protocol for In Vitro Reconstitution of NocB-M5 Activity.
All proteins used in our assays were expressed and purified to homogeneity prior to assay initiation (SI Appendix, Enzyme Preparations and Figs. S9 and S10). Several mutant enzymes were cloned by Gibson Assembly using mutagenic PCR primers (SI Appendix, Enzyme Preparations and Table S2). Each in vitro reaction was carried out in assay buffer (50 mM HEPES, pH 7.5, 25 mM NaCl). The only exception was for deuterium incorporation assays, where deuterated buffer was used instead (50 mM HEPES, pD 7.5, 25 mM NaCl) (49). In one pot, 200 μM apo-PCP4 was incubated with 10 mM MgCl2, 2 μM Sfp, and 250 μM tetrapeptidyl-CoA (ʟʟᴅʟ-CoA S10 or ʟʟᴅᴅ-CoA S11 for l- or ᴅ-seryl donor reactions, respectively). In a separate pot, 20 μM M5 was incubated with 10 mM MgCl2, 2 μM Sfp, and 70 μM CoA. Both incubations were carried out at room temperature (RT) for 1 h. Complete apo- to holo-tetrapeptidyl conversion of PCP4 was then confirmed by liquid chromatography coupled high-resolution mass spectrometry (LC-MS) prior to proceeding further. Once fully modified, excess CoAs were removed by four iterations of diluting each reaction to two to three times its initial volume and then centrifugally filtering at 4 °C (10 kDa Amicon Ultra-0.5). After the final iteration of filtration, M5 was resuspended to 20 μM in assay buffer supplemented with 5 mM ATP and 2 mM l-(p-hydroxyphenyl)glycine (Hpg) and incubated for at least 5 min. Meanwhile, tetrapeptidyl∼∼PCP4 was resuspended to 200 μM with assay buffer. Finally, M5 reconstitution was initiated by mixing equal parts tetrapeptidyl∼∼PCP4 and preactivated M5, resulting in the following final concentrations: 100 μM tetrapeptidyl∼∼PCP4, 10 μM M5, 2.5 mM ATP, and 1 mM l-Hpg in assay buffer. Reconstitution reactions were incubated at RT for 4 h. To analyze each assay, reactions were stopped by centrifugal filtration at 16,000 × g for 15 min (10 kDa Amicon Ultra-0.5). Subsequently, 50 μL of filtrate was directly analyzed by HPLC method 4 (SI Appendix, General Methods).
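The final concentrations quoted in this protocol follow from mixing equal volumes of the two pots, and the efficiency of CoA removal can be bounded with the same kind of arithmetic. The sketch below is a minimal illustration with hypothetical function names. The CoA estimate rests on two assumptions that the protocol does not state: each filtration cycle returns the sample to its initial volume, and free CoA passes the 10-kDa membrane without retention.

```python
def mix_equal_parts(pot_a, pot_b):
    """Concentrations after combining equal volumes of two solutions.

    Each pot maps a component name to its concentration; a component
    absent from one pot contributes zero from that pot.
    """
    names = set(pot_a) | set(pot_b)
    return {n: (pot_a.get(n, 0.0) + pot_b.get(n, 0.0)) / 2 for n in names}


def residual_fraction(fold_dilution, cycles):
    """Fraction of a freely filterable solute left after repeated
    dilute-and-reconcentrate cycles (ASSUMES reconcentration back to
    the initial volume and no membrane retention)."""
    return (1.0 / fold_dilution) ** cycles


# Stock concentrations taken from the protocol text.
pcp4_pot = {"tetrapeptidyl-PCP4 (uM)": 200.0}
m5_pot = {"M5 (uM)": 20.0, "ATP (mM)": 5.0, "l-Hpg (mM)": 2.0}

final = mix_equal_parts(pcp4_pot, m5_pot)
for name in sorted(final):
    print(name, final[name])

# Four cycles at 2-fold and at 3-fold dilution bracket the CoA carryover.
print(residual_fraction(2, 4), residual_fraction(3, 4))
```

Under these assumptions, four cycles leave between roughly 1/81 and 1/16 of the original free CoA, and the mixed reaction reproduces the stated 100 μM tetrapeptidyl-PCP4, 10 μM M5, 2.5 mM ATP, and 1 mM l-Hpg.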
Modified Reconstitution Protocol for In Vitro Time Course of NocB-M5 Activity.
The general protocol for in vitro reconstitution of M5 activity was modified in the following ways. First, ʟʟᴅʟ- and ʟʟᴅᴅ-CoAs were purified by an additional polishing HPLC step to achieve increased stereochemical purity for the time-course reactions (SI Appendix, General Methods and Fig. S4). Second, M5 concentration was reduced to 2 μM for incubation and preactivation steps and 1 μM in the final reconstitution reaction. The concentrations of all components in the final time-course assays were as follows: 100 μM tetrapeptidyl∼∼PCP4, 1 μM M5, 2.5 mM ATP, and 1 mM l-Hpg in assay buffer. Third, initiated reconstitution reactions were incubated in a 25.0 °C water bath. Aliquots were taken of each reaction at the appropriate timepoint and quenched by immediate centrifugal filtration at 16,000 × g for 15 min (10 kDa Amicon Ultra-0.5). Subsequently, 50 μL of filtrate was directly analyzed by HPLC method 4 (SI Appendix, General Methods).
Acknowledgments
We thank Dr. N. M. Gaudelli for providing nocardicin G reference data (SI Appendix) and Drs. I. P. Mortimer and J. A. Tang of the Department of Chemistry for their help acquiring LC-MS and NMR data. We are pleased to acknowledge support of this work by the NIH (Grants R01 AI121072 to C.A.T. and T32 GM080189 to M.J.W.).
Footnotes
The authors declare no competing interest.
This article is a PNAS Direct Submission. G.D.W. is a guest editor invited by the Editorial Board.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2026017118/-/DCSupplemental.
Data Availability
All study data are included in the article and/or supporting information.
References
- 1. Fischbach M. A., Walsh C. T., Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: Logic, machinery, and mechanisms. Chem. Rev. 106, 3468–3496 (2006).
- 2. Marahiel M. A., A structural model for multimodular NRPS assembly lines. Nat. Prod. Rep. 33, 136–140 (2016).
- 3. Walsh C. T., O’Brien R. V., Khosla C., Nonproteinogenic amino acid building blocks for nonribosomal peptide and hybrid polyketide scaffolds. Angew. Chem. Int. Ed. Engl. 52, 7098–7124 (2013).
- 4. Drake E. J., et al., Structures of two distinct conformations of holo-non-ribosomal peptide synthetases. Nature 529, 235–238 (2016).
- 5. Stachelhaus T., Walsh C. T., Mutational analysis of the epimerization domain in the initiation module PheATE of gramicidin S synthetase. Biochemistry 39, 5775–5787 (2000).
- 6. Clugston S. L., Sieber S. A., Marahiel M. A., Walsh C. T., Chirality of peptide bond-forming condensation domains in nonribosomal peptide synthetases: The C5 domain of tyrocidine synthetase is a (D)C(L) catalyst. Biochemistry 42, 12095–12104 (2003).
- 7. Rausch C., Hoof I., Weber T., Wohlleben W., Huson D. H., Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evol. Biol. 7, 78 (2007).
- 8. Reitz Z. L., Hardy C. D., Suk J., Bouvet J., Butler A., Genomic analysis of siderophore β-hydroxylases reveals divergent stereocontrol and expands the condensation domain family. Proc. Natl. Acad. Sci. U.S.A. 116, 19805–19814 (2019).
- 9. Schoppet M., et al., The biosynthetic implications of late-stage condensation domain selectivity during glycopeptide antibiotic biosynthesis. Chem. Sci. (Camb.) 10, 118–133 (2018).
- 10. Gaudelli N. M., Long D. H., Townsend C. A., β-Lactam formation by a non-ribosomal peptide synthetase during antibiotic biosynthesis. Nature 520, 383–387 (2015).
- 11. Ziemert N., et al., The natural product domain seeker NaPDoS: A phylogeny based bioinformatic tool to classify secondary metabolite gene diversity. PLoS One 7, e34064 (2012).
- 12. Linne U., Doekel S., Marahiel M. A., Portability of epimerization domain and role of peptidyl carrier protein on epimerization activity in nonribosomal peptide synthetases. Biochemistry 40, 15824–15834 (2001).
- 13. Belshaw P. J., Walsh C. T., Stachelhaus T., Aminoacyl-CoAs as probes of condensation domain selectivity in nonribosomal peptide synthesis. Science 284, 486–489 (1999).
- 14. Ehmann D. E., Trauger J. W., Stachelhaus T., Walsh C. T., Aminoacyl-SNACs as small-molecule substrates for the condensation domains of nonribosomal peptide synthetases. Chem. Biol. 7, 765–772 (2000).
- 15. Gaudelli N. M., Townsend C. A., Epimerization and substrate gating by a TE domain in β-lactam antibiotic biosynthesis. Nat. Chem. Biol. 10, 251–258 (2014).
- 16. Long D. H., Townsend C. A., Mechanism of integrated β-lactam formation by a nonribosomal peptide synthetase during antibiotic synthesis. Biochemistry 57, 3353–3358 (2018).
- 17. Patel K. D., et al., Structure of a bound peptide phosphonate reveals the mechanism of nocardicin bifunctional thioesterase epimerase-hydrolase half-reactions. Nat. Commun. 10, 3868 (2019).
- 18. Kohli R. M., Trauger J. W., Schwarzer D., Marahiel M. A., Walsh C. T., Generality of peptide cyclization catalyzed by isolated thioesterase domains of nonribosomal peptide synthetases. Biochemistry 40, 7099–7108 (2001).
- 19. Trauger J. W., Kohli R. M., Mootz H. D., Marahiel M. A., Walsh C. T., Peptide cyclization catalysed by the thioesterase domain of tyrocidine synthetase. Nature 407, 215–218 (2000).
- 20. Trauger J. W., Kohli R. M., Walsh C. T., Cyclization of backbone-substituted peptides catalyzed by the thioesterase domain from the tyrocidine nonribosomal peptide synthetase. Biochemistry 40, 7092–7098 (2001).
- 21. Tseng C. C., et al., Characterization of the surfactin synthetase C-terminal thioesterase domain as a cyclic depsipeptide synthase. Biochemistry 41, 13350–13359 (2002).
- 22. Bergendahl V., Linne U., Marahiel M. A., Mutational analysis of the C-domain in nonribosomal peptide synthesis. Eur. J. Biochem. 269, 620–629 (2002).
- 23. Bloudoff K., Rodionov D., Schmeing T. M., Crystal structures of the first condensation domain of CDA synthetase suggest conformational changes during the synthetic cycle of nonribosomal peptide synthetases. J. Mol. Biol. 425, 3137–3150 (2013).
- 24. Roche E. D., Walsh C. T., Dissection of the EntF condensation domain boundary and active site residues in nonribosomal peptide synthesis. Biochemistry 42, 1334–1344 (2003).
- 25. De Crécy-Lagard V., Marlière P., Saurin W., Multienzymatic non ribosomal peptide biosynthesis: Identification of the functional domains catalysing peptide elongation and epimerisation. C. R. Acad. Sci. III 318, 927–936 (1995).
- 26. Bloudoff K., Alonzo D. A., Schmeing T. M., Chemical probes allow structural insight into the condensation reaction of nonribosomal peptide synthetases. Cell Chem. Biol. 23, 331–339 (2016).
- 27. Samel S. A., Schoenafinger G., Knappe T. A., Marahiel M. A., Essen L.-O., Structural and functional insights into a peptide bond-forming bidomain from a nonribosomal peptide synthetase. Structure 15, 781–792 (2007).
- 28. Keating T. A., Marshall C. G., Walsh C. T., Keating A. E., The structure of VibH represents nonribosomal peptide synthetase condensation, cyclization and epimerization domains. Nat. Struct. Biol. 9, 522–526 (2002).
- 29. Marshall C. G., Hillson N. J., Walsh C. T., Catalytic mapping of the vibriobactin biosynthetic enzyme VibF. Biochemistry 41, 244–250 (2002).
- 30. Andersson D. I., Jerlström-Hultqvist J., Näsvall J., Evolution of new functions de novo and from preexisting genes. Cold Spring Harb. Perspect. Biol. 7, a017996 (2015).
- 31. Chen W.-H., Li K., Guntaka N. S., Bruner S. D., Interdomain and intermodule organization in epimerization domain containing nonribosomal peptide synthetases. ACS Chem. Biol. 11, 2293–2303 (2016).
- 32. Samel S. A., Czodrowski P., Essen L.-O., Structure of the epimerization domain of tyrocidine synthetase A. Acta Crystallogr. D Biol. Crystallogr. 70, 1442–1452 (2014).
- 33. Balibar C. J., Vaillancourt F. H., Walsh C. T., Generation of D amino acid residues in assembly of arthrofactin by dual condensation/epimerization domains. Chem. Biol. 12, 1189–1200 (2005).
- 34. Christiansen G., Fastner J., Erhard M., Börner T., Dittmann E., Microcystin biosynthesis in Planktothrix: Genes, evolution, and manipulation. J. Bacteriol. 185, 564–572 (2003).
- 35. Patteson J. B., Dunn Z. D., Li B., In vitro biosynthesis of the nonproteinogenic amino acid methoxyvinylglycine. Angew. Chem. Int. Ed. Engl. 57, 6780–6785 (2018).
- 36. Kinene T., Wainaina J., Maina S., Boykin L. M., “Rooting trees, methods for” in Encyclopedia of Evolutionary Biology, Kliman R. M., Ed. (Academic Press, Oxford, 2016), pp. 489–493.
- 37. Buglino J., Onwueme K. C., Ferreras J. A., Quadri L. E. N., Lima C. D., Crystal structure of PapA5, a phthiocerol dimycocerosyl transferase from Mycobacterium tuberculosis. J. Biol. Chem. 279, 30634–30642 (2004).
- 38. Petronikolou N., Nair S. K., Structural and biochemical studies of a biocatalyst for the enzymatic production of wax esters. ACS Catal. 8, 6334–6344 (2018).
- 39. Gerlt J. A., Babbitt P. C., Divergent evolution of enzymatic function: Mechanistically diverse superfamilies and functionally distinct suprafamilies. Annu. Rev. Biochem. 70, 209–246 (2001).
- 40. Kautsar S. A., et al., MIBiG 2.0: A repository for biosynthetic gene clusters of known function. Nucleic Acids Res. 48, D454–D458 (2020).
- 41. Eddy S. R., Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
- 42. El-Gebali S., et al., The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432 (2019).
- 43. Edgar R. C., MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
- 44. Waterhouse A. M., Procter J. B., Martin D. M. A., Clamp M., Barton G. J., Jalview Version 2–A multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–1191 (2009).
- 45. Guindon S., et al., New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
- 46. Lefort V., Longueville J.-E., Gascuel O., SMS: Smart model selection in PhyML. Mol. Biol. Evol. 34, 2422–2424 (2017).
- 47. Wang L.-G., et al., Treeio: An R package for phylogenetic tree input and output with richly annotated and associated data. Mol. Biol. Evol. 37, 599–603 (2020).
- 48. Yu G., Using ggtree to visualize data on tree-like structures. Curr. Protoc. Bioinformatics 69, e96 (2020).
- 49. Glasoe P. K., Long F. A., Use of glass electrodes to measure acidities in deuterium oxide. J. Phys. Chem. 64, 188–190 (1960).