Author manuscript; available in PMC: 2023 Jun 27.
Published in final edited form as: Chem Biol. 2012 Nov 21;19(11):1411–1422. doi: 10.1016/j.chembiol.2012.09.012

Structures of cyanobactin maturation enzymes define a new family of transamidating proteases

Vinayak Agarwal 1,2, Elizabeth Pierce 3, John McIntosh 3, Eric W Schmidt 3,*, Satish K Nair 1,2,4,*
PMCID: PMC10294700  NIHMSID: NIHMS1907216  PMID: 23177196

Summary

Cyanobactins, a class of ribosomally encoded macrocyclic natural products, are biosynthesized through the proteolytic processing and subsequent N-C macrocyclization of ribosomal peptide precursors. This macrocyclization occurs through a two-step process in which the first protease (PatA) removes the amino terminal flanking sequence from the precursor to yield a free N-terminus of the precursor peptide, and the second protease (PatG) removes the C-terminal flanking sequence and then catalyzes the transamidation reaction to yield an N-C cyclized product. Here, we present the crystal structures of the protease domains of PatA and PatG from the patellamide cluster and of PagA from the prenylagaramide cluster. A comparative structural and biochemical analysis of the transamidating PatG protease reveals the presence of a unique structural element distinct from canonical subtilisin proteases, which may facilitate the N-C macrocyclization of the peptide substrate.

Introduction

Small molecule peptide natural products mined from microbial and plant sources continue to represent one of the major avenues towards the development of new antibiotics and anti-proliferatives. Many of these natural products exhibit potent biological activities against a number of therapeutic targets and the range of schemes for the biosynthesis of such molecules gives rise to a diversity of structures (Fischbach and Walsh, 2009; Newman and Cragg, 2007). While significant research efforts have illustrated a vast array of non-ribosomally synthesized peptide scaffolds, the structural diversity of ribosomally synthesized peptides has only begun to be appreciated with the realization that such peptides can undergo a range of post-translational modifications.

One modification that is prevalent in ribosomally synthesized peptide natural products is macrocyclization, which brings distant parts of an otherwise elongated molecule into close proximity. Examples of such ribosomally synthesized and posttranslationally modified peptides (RiPPs) that contain macrocycles include lantibiotics that contain (methyl)lanthionine-containing thioethers (Figure 1A) (Knerr and van der Donk, 2012), thiopeptides that are constrained through pyridine rings generated by Diels-Alder type condensations (Figure 1B) (Walsh et al., 2010), and thuricin and subtilosin bearing sulfur-α carbon thioethers (Fluhe et al., 2012; Rea et al., 2010) (Figure 1). Cyclic peptides are widely sought in pharmaceutical discovery and development because their constrained structures often provide better target specificity and improved pharmacological and stability properties in comparison to their linear relatives. Macrocyclization also provides rigidity to the structure of the molecules, and can confer thermostability and chemical resistance to the cyclized natural product (Garg et al., 2012).

Figure 1: Chemical structures of representative macrocyclized natural products.

(A) Lantibiotic nisin; (B) thiopeptide thiocillin I; and cyanobactins (C) patellamide A, (D) trunkamide and (E) prenylagaramide B. The macrocyclizing covalent bonds are highlighted in red. Five-membered heterocycles in patellamide A and trunkamide are highlighted. Prenylagaramide is devoid of heterocycles. (F) Alignment of the PatE, TruE and PagE substrate peptide primary sequences. The hypervariable cyanobactin cassette residues (shown by dashed line above the sequence) are denoted by X. The cyanobactin cassettes are flanked by conserved sequences at the N- and C-termini (shaded green and yellow respectively). Note that the last residue of the cyanobactin coding cassette (in bold and marked by ▲ under the sequence) is a cysteine residue (for PatE and TruE), which is subsequently heterocyclized, or a heterocycle-mimicking proline residue (for PagE).

Cyanobactins (Figure 1C–E) are a class of RiPPs that are found in more than 30% of all cyanobacteria. In contrast to many other macrocyclic natural products, cyanobactins are macrocyclized via the peptide backbone and not via amino acid side-chains. All cyanobactins are synthesized as precursor peptides (“PatE” and relatives) that are matured by two proteases, exemplified by PatA and PatG in the patellamide biosynthetic pathway (Figure 2) (Lee et al., 2009; Schmidt et al., 2005). In addition to macrocyclization, patellamides and related cyanobactins may contain Ser, Thr and Cys residues, which are heterocyclized with the main chain atoms to yield (methyl)oxazoline and thiazoline five-membered rings (McIntosh et al., 2010a; McIntosh and Schmidt, 2010). Further oxidation of these rings, presumably by the oxidase domain within the patG ORF, yields (methyl)oxazole and thiazole rings. Other posttranslational modifications include prenylation at Ser/Thr or Tyr/Trp residues catalyzed by the PatF class of prenyltransferases (McIntosh et al., 2011). Metagenomic studies have revealed the widespread presence of homologs of patellamide biosynthetic genes in the marine microbiota, such as those for the trunkamides, prenylagaramides and arthrospiramides (Donia and Schmidt, 2011; Schmidt and Donia, 2009; Sivonen et al., 2010). The biosynthetic route for cyanobactins is one of the most substrate-tolerant RiPP pathways so far characterized, and represents an ideal starting point for engineering and genome mining approaches to peptide macrocyclization.

Figure 2: The proposed enzymatic scheme for the biosynthesis of patellamide and related cyanobactins.

For clarity, only one substrate cassette for PatE is shown. The conserved amino acids flanking the amino and carboxy termini of the Pat cassettes are colored as in Figure 1, with the coding cassette hypervariable amino acids shown as spheres. Note that the terminal coding cassette residue is a cysteine, which is heterocyclized and oxidized to a thiazole. A proline residue is present in PagE at this position, and a proline can substitute for cysteine in PatE.

Intriguingly, despite being orthologs that are highly similar in sequence (greater than 40% similarity between PatA and PatG), these proteases catalyze biochemically distinct reactions. PatA recognizes a five-amino acid sequence on the PatE coding cassette as a site for cleavage, synthesizing short, linear peptides. In turn, these peptides are substrates for PatG, which recognizes a three-amino acid C-terminal sequence (Lee et al., 2009). PatG cleaves this sequence from the core peptide and macrocyclizes the product. In contrast to other RiPPs, the core regions of the Pat cassettes are hypervariable, while the flanking regions are highly conserved, leading to sequence-diverse macrocycles. The presence of diverse posttranslational modifications, paired with the hypervariable core peptide, has resulted in >200 known cyanobactin natural products and >70 widely sequence-diverse, engineered derivatives that we have previously reported (Donia et al., 2008; Donia and Schmidt, 2011; Tianero et al., 2012).

PatA and PatG are orthologs that consist of multiple domains, including a subtilisin-like serine protease domain and a domain of unknown function (DUF) (Figure 3). In addition, PatG and its relatives often include other domains, such as thiazoline oxidase or methyltransferase domains. We previously showed that just the protease domains of PatA and PatG are required to recapitulate activity in vitro (Lee et al., 2009), and that the complete proteins or DUF domains do not accelerate the reactions in comparison to protease domain alone. As discerned from the many known cyanobactin gene clusters, PatA, PatG, and their homologues fall into a single family (clade) within the large subtilisin protease group (Donia and Schmidt, 2011). By carefully manipulating substrates, we showed that PatG follows the normal subtilisin-like proteolytic route, where substrates are cleaved and become covalently bound to the active-site Ser (McIntosh et al., 2010b). Subsequently, displacement by the internal amine nucleophile leads to transamidation and macrocyclization. We also observed the alternative route, hydrolysis to linear peptides, with some substrates. The recent structure determination of the PatG protease domain provided strong evidence in support of this mechanistic hypothesis (Koehnke et al., 2012).

Figure 3: Domain organization of the cyanobactin protease ORFs and homology between the PatA and PatG protease domains.

(A) The PatA homologs are organized as an N-terminal protease domain (colored yellow) followed by a linker region and a C-terminal DUF domain (colored blue). Primary sequence numbering for PatA is shown. (B) The PatG homologs are organized as a protease domain, followed by the C-terminal DUF domain. Variability is found in the N-terminal organization of PatG homolog sequences, with the PatG sequence harboring an additional DUF domain, followed by a flavin-dependent oxidase domain. In the ThcG sequence, an additional S-adenosyl-methionine dependent methyltransferase (SMT) domain is present between the protease and DUF domains. Primary sequence numbering for PatG is shown. (C) Sequence alignment of PatA and PatG identifies homology between the protease and DUF domains. The amino acids corresponding to the capping helices in PatG are boxed. The catalytic triad residues are shown with (▲) under the residues.

The PatG protease domain was capable of cyclizing 6–12 amino acid peptides with many different sequences (McIntosh et al., 2010b). The only requirements were the maintenance of the C-terminal Ala-Tyr-Asp recognition motif, as well as the presence of a heterocycle (Pro, thiazole, or thiazoline) at the last residue of the core peptide. Without these elements, both hydrolysis and macrocyclization activities were abolished. By contrast, the only required element for PatA protease domain is an N-terminal Gly-Leu-Glu-Ala-Ser/Gly-Val-Glu-Pro-Ser sequence (Lee et al., 2009). In addition, there are many other homologous cyanobactin pathways that have different recognition elements and lead to different products (Donia et al., 2006; Donia et al., 2008; Donia and Schmidt, 2011). For example, in the case of prenylagaramides, heterocyclization does not occur, and PagA, PagG and a prenyltransferase are the only enzymes required to mature the precursor peptide, PagE (Donia and Schmidt, 2011). The flanking recognition elements, Gly-Leu-Thr-Pro-His and Phe-Ala-Gly, are very different from their homologous Gly-Leu-Glu-Ala-Ser and Ala-Tyr-Asp sequences in the patellamide pathway (Figure 1F).
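The recognition rules described above can be illustrated with a short pattern-matching sketch. It is an illustration under stated assumptions, not a model of the enzymes: the precursor string is a toy sequence assembled for the example, the function name `find_core_peptides` is hypothetical, and a terminal Cys is treated as a stand-in for the thiazoline or thiazole it would become after heterocyclization.

```python
import re

# Recognition elements quoted in the text (one-letter amino acid codes).
PATA_SITES = ("GLEAS", "GVEPS")  # N-terminal motifs required by PatA
PATG_SITE = "AYD"                # C-terminal motif required by PatG


def find_core_peptides(precursor: str, min_len: int = 6, max_len: int = 12) -> list[str]:
    """Return core peptides flanked by a PatA site and the PatG site.

    Assumption: the last core residue must be Cys (later heterocyclized)
    or Pro, and the core is 6 to 12 residues long, as stated in the text.
    """
    n_sites = "|".join(PATA_SITES)
    pattern = (
        rf"(?:{n_sites})"                                  # PatA recognition motif
        rf"([A-Z]{{{min_len - 1},{max_len - 1}}}?[CP])"    # core ending in Cys/Pro
        rf"{PATG_SITE}"                                    # PatG recognition motif
    )
    return re.findall(pattern, precursor)


# Toy precursor with two cassettes (assembled for illustration only).
toy_precursor = "MKKGLEASVTACITFCAYDGVEPSITVCISVCAYDGE"
cores = find_core_peptides(toy_precursor)
print(cores)
```

Each returned string corresponds to the segment that PatA would free at its N-terminus and that PatG would release from the C-terminal motif and join head to tail; a cassette lacking the terminal Cys or Pro is simply not reported.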

Here, we have biochemically determined the domain boundaries of the protease domains of both of the patellamide processing enzymes, as well as the first protease from the prenylagaramide cluster. We also report high-resolution crystal structures of PatA, PagA, and PatG that elucidate the key features responsible for the different substrate recognition and reactivities of the enzymes. A comparison of the structures reveals the presence of an unexpected insert in the PatG protease domain, which may provide a molecular rationale for the macrocyclization reaction. The structures serve to explain how highly related orthologs promote such different chemical reactions, leading to a broad array of RiPP natural products with therapeutic potential.

Results and Discussion

Determination of cyanobactin protease domain boundaries

Metagenomic approaches have led to the identification of several widely distributed clusters that code for the biosynthesis of patellamide-like hetero- and macro-cyclized peptides, including the trunkamides and the prenylagaramides (Donia and Schmidt, 2011). Each of these gene clusters is characterized by the presence of two distinct ORFs that encode the processing proteases. The proteases do not typically exist as independent units but are rather domains of 300–360 residues that are embedded within the context of much larger ORFs (Lee et al., 2009). For homologs of patA, the protease domain is localized at the amino terminus of the corresponding ORF, followed by a long linker region and a domain of unknown function (DUF3) at the carboxy terminus. This domain organization is conserved among all patA homologs that have been sequenced from homologous gene clusters (Figure 3A). For homologs of patG, the protease domain can either be localized to the amino terminus followed by a DUF domain, or be sandwiched between an amino terminal flavin-dependent oxidase domain and the carboxy terminal DUF domain (Figure 3B). In protease-containing ORFs that do not have the amino-terminal oxidase domain, the oxidase may be present as a stand-alone ORF (Thc cluster), or be totally absent (Pag cluster) (Donia and Schmidt, 2011).

Both the cyanobactin processing proteases have been experimentally demonstrated to possess extremely broad substrate specificity, which we have exploited for the in vivo synthesis of un-natural circular cyanobactin derivatives of diverse chemical structures (Lee et al., 2009; McIntosh et al., 2010b). The only conserved and required feature for both proteases is the flanking recognition sequence at either terminus of the Pat cassette. PatA and its homologs recognize the amino terminal flanking sequence for substrate engagement, while the PatG homologs utilize the carboxy terminus flanking sequence. Within the context of these two requirements, the enzymes can be utilized to work on extremely diverse peptide cassettes and catalyze their macrocyclization. It should also be noted that the in vitro activity for both the PatA protease domain and the PatG protease domain structures described in this study has been experimentally established (McIntosh et al., 2011; McIntosh et al., 2010b).

Attempts at heterologous production and purification of full-length ORFs encompassing the PatA and PatG proteases (as well as other homologs) in E. coli yielded only impure and proteolytically degraded protein samples. It should be noted that this is not a universal phenomenon, as the whole TruG protein was stably expressed and was catalytically active (Lee et al., 2009). This degradation was not the result of autoprocessing, as expression strategies utilizing the catalytically inactive variants of both proteases (active site Ser→Ala mutants) also resulted in degraded samples, although the enzymes were no longer competent as proteases. Consequently, further attempts at protein production focused solely on the respective protease domains, the domain boundaries of which were determined by a combination of sequence analysis and biochemical experiments. Sequence analysis identifies a putative PatA protease recognition motif (Ser299-Val300-Glu301-Ala302-Ser303), which precedes the proline rich sequence presumed to be the linker that connects the protease and DUF domains. We speculated that this sequence likely constitutes the carboxy terminus of the protease domain, and generated an expression construct spanning residues Met1 through Ser303 of PatA. Heterologous expression of this domain in E. coli yielded soluble and stable protein samples that were catalytically active. Further refinement of the domain boundaries using mass spectrometric analysis of limit digests identified a stable domain from which the carboxy terminal 20 residues were removed (i.e. Met1-Ala284). Although diffraction quality crystals could only be obtained for the larger construct, continuous electron density could be observed only up to Ala284, validating residues Met1-Ala284 as the PatA protease domain boundary.
The domain boundaries for the PagA homolog from the prenylagaramide cluster were similarly established and a construct encompassing residues Met1 through Ser303 was used for further studies. Alignment of the PatG primary sequence with those of PatA and PagA demonstrates that sequence similarity extends to PatG residues encompassing Lys514 through Gly850. A slightly larger construct that extends through to Thr866 at the carboxy terminus was also generated. The protease domains of both PatA and PatG were demonstrated to be catalytically proficient at levels comparable to those of the corresponding wild-type enzymes (McIntosh et al., 2011; McIntosh et al., 2010b).

Crystal structure of the PatA and PagA protease domains

Crystals of the PatA protease domain (Met1 through Ser303; hereafter PatA) diffract to 1.7 Å resolution (1 molecule in the crystallographic asymmetric unit) at an insertion device synchrotron source (Station 21 ID-D; Argonne National Labs, IL). Crystallographic phases were determined by single wavelength anomalous dispersion methods from data collected on crystals grown from selenomethionine labeled protein, which grew in a different, but related, crystal form with 2 molecules in the crystallographic asymmetric unit. The protein behaves as a monomer in solution and in the crystal (Supplementary Figure S1). Crystals of the PagA protease domain (Met1 through Ser303; hereafter PagA) diffract to 2.45 Å resolution and crystallographic phases were determined by molecular replacement using the refined coordinates of PatA (66% identity) as a search probe.

The overall architectures of PatA and PagA are that of the canonical α/β protease fold first observed in the structure of subtilisin (Siezen and Leunissen, 1997) (Figure 4A, 4C). As the structures of PatA and PagA are essentially identical, further discussion will be based on the higher resolution PatA structure. The overall topology of the PatA protease domain consists of a seven-membered central β sheet decorated on either side by α helices. The last strand of this central sheet terminates into two additional β strands, followed by a 22-residue α helix (helix α6; numbered consecutively from the N-terminus to the C-terminus). This long helix, which harbors the catalytically requisite Ser218 nucleophile, is rigidly held in place by numerous hydrophobic interactions with the central β sheet, as well as with the helices around the β sheet. Additional interactions are provided by helix α6 residues Ser225 and Glu234, which form a hydrogen bond with Asn117 and a salt bridge interaction with Lys284, respectively.

Figure 4: Crystal structure of the PatA and PagA protease domains.

(A) The α helices of the PatA enzyme are colored cyan, with the central beta sheet colored blue. The loop region comprising residues 45–57, which connects the PatA β sheet 2 and α helix 1, is shown in pink with the amino acids numbered. Note that the amino acids 46–50 are missing in the final model. The catalytic triad residues are shown in stick-ball representation with the carbon atoms colored yellow. (B) Zoomed in view of the PatA catalytic triad residue side chains. The residues are labeled. Superimposed is a difference Fourier electron density map (contoured at 2.0σ over background in blue) calculated with coefficients |Fobs| - |Fcalc| and phases from the final refined model with the coordinates of the catalytic triad residue side chain atoms deleted prior to one round of refinement. Note that the positioning of the His58 side chain precludes hydrogen bonding with Asp23 and Ser218 side chains. (C) The α helices of the PagA enzyme are colored cyan, with the central beta sheet colored green. The loop region comprising residues 45–60 is shown in pink with the amino acids numbered. Note that the amino acids 46–55 are missing in the final model. The catalytic triad residues are shown in stick-ball representation with the carbon atoms colored yellow. In contrast to PatA residues 51–57, PagA residues 56–60 point away from the PagA catalytic triad. (D) Zoomed in view of the PagA catalytic triad residue side chains. The residues are labeled. Superimposed is a difference Fourier electron density map (contoured at 2.0σ over background in blue) calculated with coefficients |Fobs| - |Fcalc| and phases from the final refined model with the coordinates of the catalytic triad residue side chain atoms deleted prior to one round of refinement. Note that the positioning of the His61 side chain allows for hydrogen bonding with Asp26 and Ser221 side chains.

The existence of numerous contacts used to position helix α6 likely represents a thermodynamic requirement for enzyme activity, as the catalytically requisite Ser218 resides at the amino terminus of this helix. The dipole moment of helix α6, in addition to the serine protease catalytic triad, likely aids in the deprotonation of Ser218 to generate the nucleophilic serine alkoxide for catalysis. Two disulfide bonds, between Cys156 and Cys158, and between Cys258 and Cys269, respectively, interconnect the loop emanating from the central sheet β6, and the loop joining the carboxy terminus of helix α7 and the amino terminus of helix α8. Both β6 and α7 contact helix α6, which harbors Ser218. Due to the absence of suitable electron density, a five-amino acid flexible loop, comprising residues Ala45 through Ser51, that connects β2 and α1 is not modeled in the final structure. This loop is disordered in three crystallographically independent structures of PatA, specifically in native PatA (1 monomer in the crystallographic asymmetric unit), and SeMet PatA (2 copies in the asymmetric unit).

A structure-based DALI search against the Protein Data Bank identifies subtilisin and related proteases as the closest structural homologs. The strongest homology is with subtilisin Carlsberg (PDB code 1SCA; Z-score=33.4, RMSD=1.7 Å over 234 aligned Cα atoms) and subtilisin BPN’ (PDB code 2ST1; Z-score=33.5, RMSD=1.6 Å over 234 aligned Cα atoms), as well as other related proteins such as the subtilisin-like virulence associated protease AprV2 (PDB code 3LPA; Z-score=31.0, RMSD=1.8 Å over 248 aligned Cα atoms).
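For readers unfamiliar with the RMSD values quoted here, the following sketch shows how a root-mean-square deviation over paired Cα positions is computed once two structures have already been aligned and superimposed. It is not the DALI procedure, which performs its own alignment and scoring; the function name `rmsd` and the coordinates are hypothetical and chosen only to illustrate the metric.

```python
import math

Coord = tuple[float, float, float]


def rmsd(a: list[Coord], b: list[Coord]) -> float:
    """Root-mean-square deviation (in the input units, e.g. Å) between
    two equally long lists of already-superimposed atom positions."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


# Two made-up Cα traces of three atoms each (illustrative values only).
trace_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
trace_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(f"RMSD = {rmsd(trace_a, trace_b):.2f} Å")
```

Because the value is averaged over the aligned atoms only, an RMSD is meaningful only together with the number of aligned Cα atoms, which is why both are reported for each structural homolog.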

An alignment of the PatA protease domain structure with various forms of a highly homologous serine protease subtilisin Carlsberg (Neidhart and Petsko, 1988) (RMSD 1.7 Å over 232 aligned residues) provides insights into the positioning of the amino acids proximal to the PatA active site and the catalytic triad. The PatA loop bearing residues Val40 through His58 (and containing the disordered residues Ala45 through Ser51) points inward towards the active site, and the positioning of this loop in PatA is identical in all 3 crystallographically independent copies of the model described above (Figure 4A). By analogy, the equivalent loop in the structures of subtilisin Carlsberg apo structure (PDB ID: 1SCA) (Fitzpatrick et al., 1993) and subtilisin Carlsberg structure in complex with the Eglin C inhibitor (PDB code 1CSE) (Bode et al., 1987) is distally situated and is directed away from the active site. A number of interactions stabilize the position of the Val40-His58 loop near the PatA active site, including hydrogen bonds between Ser56 and Asp23, between the carbonyl oxygen of Gly53 and main chain amide of Gly121, and van der Waals contact between Phe54 and Met55. The functional implication of positioning the Val40-His58 loop in PatA is not immediately apparent but the net result is deformation of the catalytic triad in PatA (Figure 4B), as compared to other subtilisin-like proteases, including the homologous PagA protease.

Active site architecture of the PatA/PagA protease

By sequence analogy to other serine proteases, the catalytic triad in PatA protease domain is comprised of residues Asp23-His58-Ser218 (Figure 4B), and the PagA triad is composed of Asp26-His61-Ser221 (Figure 4D) (analogous to Asp32-His64-Ser221 in subtilisin Carlsberg). While the PagA triad superimposes nearly perfectly with that of subtilisin, the catalytic triad in the PatA structures is imperfectly positioned. In particular, the imidazole side chain of PatA His58, which should be positioned in between, and within hydrogen bonding distance of, both Asp23 and Ser218, points away from the enzyme active site (Figure 4B). In PatA, Ser56 occupies the site of the catalytic triad histidine, suggesting that the PatA protease domain is not pre-organized for efficient catalysis (Figure 4A). In marked contrast, the equivalent loop in PagA is situated away from the active site (Figure 4C), and, as a result, His61 in PagA is positioned within hydrogen bonding distance to both Asp26 and Ser221 to form a competently organized catalytic triad (Figure 4D).

The non-canonical position of His58 in PatA is a consequence of the orientation of the Val40-His58 loop in the proximity of the active site. In the structures of PatA, this loop spans 19 residues while the equivalent loop is 12 residues in PagA and 13 residues in subtilisin Carlsberg. Consequently, major conformational changes must be necessary for the PatA active site to adopt a catalytically proficient state. These conformational changes may be conferred by binding of the substrate, or by the absent C-terminal DUF domain, and should result in the displacement of loop 51–58 away from the active site. Nonetheless, this protein was fully catalytically active, similar to the wild-type protein that was previously expressed (Lee et al., 2009). This construct was completely stable even after extensive enzymatic reactions, whereas the previously constructed PatA protein was active but unstable.

Crystal structure of the PatG protease domain

The 1.9 Å crystal structure of the PatG protease domain (residues Lys514-Thr866; hereafter PatG) was determined by the molecular replacement method. The high degree of conservation between PatA and PatG, as indicated by the nearly 40% identity between their primary sequences, is reflected in the similarities in their overall structures (Figure 5A). Almost all of the PatA secondary structural elements can be superimposed directly upon the PatG structure, with even loop residues aligning in near identical fashion. As with PatA, the structure of PatG belongs to the canonical subtilisin fold, composed of a central β sheet with α helices flanking the two faces of the sheet. The catalytic nucleophile Ser783 is analogously positioned at the amino terminus of the long α helix, with Asp548 and His618 completing the PatG catalytic triad (Figure 5B).

Figure 5: Crystal structure of the PatG protease domain.

(A) The α/β core of the enzyme is colored cyan, with the capping helices colored brown. The capping helices are numbered. The catalytic triad residues are shown in stick-ball representation with the carbon atoms colored yellow. (B) Zoomed in view of the PatG catalytic triad residue side chains. The residues are labeled. Superimposed is a difference Fourier electron density map (contoured at 2.0σ over background in blue) calculated with coefficients |Fobs| - |Fcalc| and phases from the final refined model with the coordinates of the catalytic triad residue side chain atoms deleted prior to one round of refinement.

A superposition of the PatG structure with PatA (and other homologous subtilisin-like proteases) identifies the presence of two additional α helices that link the fifth and sixth strands of the central β sheet of PatG. These two helices are located directly above the catalytic Ser783 and have hence been termed the “capping helices” (Figure 5A). These helices are made up of residues Pro579 to Val605, and are conspicuously absent in the primary sequence of PatA. Moreover, a structure-based comparison failed to identify the capping helices in the structures of any other subtilisin-like enzyme. Hence, the presence of the capping helices is unique to PatG and homologous proteases that can presumably catalyze N-C transamidation reactions during the biosynthesis of other cyanobactins. It should be noted that the PatG protease domain construct described here has been experimentally demonstrated to be sufficient to catalyze the macrocyclizing transamidation reaction in cyanobactin biosynthesis (McIntosh et al., 2010b).

The first helix of the capping domain (PatG residues Pro579-Gln591) is rigidly held in place, relative to other secondary structure elements, by a number of hydrogen bonds and hydrophobic interactions. Primary among these is the salt bridge interaction between the side chains of Arg589 and Asp617. Hydrogen bonds also exist between the side chains of Tyr582 and Gln586 and the side chains of Asn616 and Gln770, respectively. The Gln586 side chain amide is also hydrogen bonded to the main chain carbonyl oxygen atom of Pro771. The side chain of Phe585 makes hydrophobic van der Waals contacts with the side chains of Leu602, Ile613 and Val614. It should be noted that these residues lie on the catalytic His618-bearing α helix. Hence it is conceivable that the first capping helix is rigidly positioned relative to the enzyme active site and should be relatively immobile. The second helix of the PatG protease-capping domain (residues Lys596-Val605) does not display as many interactions with the enzyme core domain as are seen for the first helix of the capping domain. A solitary salt bridge exists between the side chains of Glu603 and Lys610. This likely provides for mobility in the positioning of this helix, which is also reflected in the relatively higher thermal (B) factor values for the residues in this helix.

Probable mechanism for PatG catalyzed transamidation

Formation of amide bonds during the biosynthesis of small molecule natural products is driven by the activation of an amino acid at the carboxy terminus by either phosphorylation (Blasiak and Clardy, 2010; Hollenhorst et al., 2009) or adenylation (Kadi et al., 2007). The activation of the carboxy terminus results in conjugation of the carboxyl with a suitable leaving group, which then facilitates nucleophilic attack by a deprotonated backbone primary amine of the incoming amino acid. Another example of an amide bond forming reaction utilizing activated amino acids is that catalyzed by non-ribosomal peptide synthetases, in which an amino acid is first adenylated before being covalently tethered via a labile thioester to a phosphopantetheinylated peptidyl carrier protein (PCP). Lastly, covalent tethering of the carboxy terminus to tRNA molecules via an ester linkage has also been demonstrated for amide bond generation in small molecule natural product biosynthesis (Vetting et al., 2010; Zhang et al., 2011). For each of these cases, activation of the carboxy terminus is achieved by hydrolysis of an ATP molecule; hence such amide bond forming reactions are cofactor-dependent.

In contrast, amide bond formation by PatG is not driven by ATP hydrolysis, or by tethering to a PCP or tRNA molecule. Instead, the carboxy terminus of the Pat cassette is linked by an amide bond to highly conserved flanking residues Ala-Tyr-Asp-Gly-Glu, and is activated by virtue of formation of an acyl-enzyme intermediate with the catalytic Ser783 of the PatG protease. For all of the cases listed above, there are two mechanistic requirements that need to be satisfied. The first is the spatial positioning of the modified carboxy terminus in close physical proximity to the incoming primary amine within the enzyme active site. The second is that the incoming primary amine should be deprotonated by a catalytic base, so as to generate the nucleophile for attack at the modified carboxy terminus. As the pKa of an aliphatic non-conjugated primary amine is ~10.5, the only available amino acid side chains capable of deprotonating the amine are tyrosine (pKa 10.07), lysine (pKa 10.53) or arginine (pKa 12.48). The enzyme active site can also modulate the pKa of the primary amine by positioning it in a hydrophobic or a positively charged cavity.
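To make the pKa argument concrete, the fraction of a primary amine present in its neutral, nucleophilic form at a given pH can be estimated from the Henderson-Hasselbalch relation. The sketch below is an illustration only: the pH of 7.0 and the shifted pKa of 8.5 are assumed values chosen for the example, not measurements from this study, and the function name is hypothetical.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an amine in the neutral (deprotonated) form at a given pH,
    from the Henderson-Hasselbalch relation: [B]/([B] + [BH+])."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PH = 7.0  # assumption: near-neutral solution

cases = [
    ("free aliphatic amine, pKa 10.5 (value from the text)", 10.5),
    ("amine in a pKa-lowering environment, pKa 8.5 (hypothetical)", 8.5),
]
for label, pka in cases:
    fraction = fraction_deprotonated(pka, ASSUMED_PH)
    print(f"{label}: {fraction:.4%} deprotonated at pH {ASSUMED_PH}")
```

Under these assumptions, only a small fraction of a free amine is deprotonated near neutral pH, and a two-unit drop in pKa raises that fraction roughly a hundredfold, which is the kind of effect a hydrophobic or positively charged cavity could provide.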

Within the PatG active site, spatial positioning of the carboxy terminus to the incoming primary amine may be mediated by the “capping helices” that are unique to this enzyme. Following the installation of the heterocycles in the precursor by PatD, the substrate peptide wraps around in the interior of the active site in the pocket formed by the presence of the capping helices. The primary determinants for binding would be mediated by the backbone carbonyl oxygen atoms of the substrate peptide with the positively charged interior electrostatic surface of the PatG capping helices. After engaging the peptidic substrate, the catalytic Ser783 of PatG would attack the carbonyl carbon of the scissile amide bond to generate a tetrahedral intermediate. The intermediate would be resolved by the departure of the C-terminal flanking peptide (Ala-Tyr-Asp-Gly-Glu) of the Pat cassette to yield the covalent acyl-enzyme complex.

The observed conformational flexibility of the second helix of the capping domain would allow for the substrate to approach the active site. The subsequent closed state of the helix would then bring the amino terminus of the Pat cassette in close proximity to the carboxy terminus of the esterified intermediate that is covalently attached to the catalytic serine residue. The positively charged interior surface of the PatG capping helices could lower the pKa of the Pat cassette amino terminus primary amine, and either of two appropriately positioned Lys side chains, Lys594 or Lys598, could act as a general base.

The second capping helix of the PatG capping domain would then undergo a small conformational change, to position the deprotonated amino terminus of the Pat cassette in close proximity to the ester bond of the catalytic serine residue to undergo a second nucleophilic addition at the carbonyl carbon, to yield a second tetrahedral intermediate. This intermediate would be resolved by the departure of the serine side chain to yield an amide bond (Figure 6). In order to facilitate the second nucleophilic addition by the amino terminus of the Pat cassette, the PatG active site would need to be desolvated to protect the ester intermediate from nucleophilic attack by a water molecule, which would release a linear Pat cassette as the product. The four heterocycles in the PatE substrate peptide may help to restrict conformational flexibility and offset the entropic cost of binding an otherwise flexible peptide within the constraints of the PatG active site. Notably, the TruG protease (82% sequence identity, 91% sequence similarity with PatG) functions on a peptide substrate that contains only a single heterocycle, and the PatG protease can macrocyclize a variety of peptidic and non-peptidic substrates (McIntosh et al., 2010b). These data argue that prearrangement of the substrate in a near cyclical conformation is not a strict requisite for macrocyclization.

Figure 6: Proposed mode of substrate binding and reaction mechanism for the macrocyclizing amide bond formation by the PatG protease domain.

The patellamide cassette wraps in the interior of the capping helices, which also serve to deprotonate the amino terminus of the cyanobactin coding cassette generated by the proteolytic action of PatA (highlighted in blue). The PatG catalytic triad accomplishes the first half reaction, causing the scission of the conserved carboxy terminus flanking region (highlighted in yellow), and generating the acyl-enzyme intermediate. The capping helices maintain the active site in a desolvated state and position the amino terminus in close proximity to the acyl-enzyme intermediate, in order to bias the second half reaction towards aminolysis, rather than hydrolysis. Xn denotes the amino acids between the first (R1) and last (Rn+2) amino acids of the cyanobactin coding cassette.

The poor catalytic activity of the PatG enzyme (Lee et al., 2009; McIntosh et al., 2010b) precludes determination of the kinetic parameters for the wild type enzyme and the Lys594/Lys598→Ala mutants. The double mutant was constructed and superficially appears to be very similar to the wild-type enzyme. A closer inspection of the time course of macrocyclization of the substrate peptide Lys-Lys-Pro-Tyr-Ile-Leu-Pro-Ala-Tyr-Asp-Gly-Glu revealed the formation of cyclic product after only 3 hours with the PatG 513–866 wild type enzyme (Supplementary Figure S2), while no product could be observed with the PatG 513–866 Lys594→Ala/Lys598→Ala double mutant enzyme during this time (Supplementary Figure S2B,C). A longer incubation of up to 48 hours converted nearly all of the substrate to the cyclic product by both enzymes, which leads us to postulate that mutation of Lys594→Ala/Lys598→Ala slows the rate of the reaction but does not alter the overall catalytic mechanism. This analysis is confounded by the non-quantitative nature of MALDI-MS, and a full kinetic analysis needs to be performed to definitively quantitate the contribution of the Lys594 and Lys598 residues towards catalysis. However, due to the poor activity of the enzyme, only a semi-quantitative approach could be used to characterize site-specific variants. Extensive trials with various substrates, which are reported here and before (McIntosh et al., 2010b), have not led to appreciable improvements in the activity of the enzyme. We have tried many different buffers, including those recently reported (Tianero et al., 2012). The relative rate was not greatly improved for the best substrates, leading us to believe that acceleration may be due at least in part to buffer effects on substrates rather than directly on the enzyme.

The poor activity of the wild-type enzyme may also be attributed to the absence of additional cis-acting domains or trans interaction of the protease domain with other proteins in the patellamide biosynthetic pathway. The possibility that the macrocyclase catalyst has been adapted to produce only modest levels of cyclized cyanobactin natural products under physiological conditions also cannot be discounted, since our optimal in vivo production levels require 5 days of fermentation in E. coli. The poor activity of the catalyst cannot, however, be attributed to the absence of the four heterocycles, based upon our previous biochemical studies on the substrate promiscuity of the PatG protease domain, and the fact that the highly similar TruG and PatG enzymes employ physiological substrates which have one or no heterocycles present, respectively. Lastly, prior experiments have characterized PatG and TruG constructs containing the DUF and protease domains, and they were not markedly faster than the protease-only constructs (Lee et al., 2009).

Catalytic profiles of an engineered PatG subtiligase-type variant

The cofactor independent peptide bond formation activity of the PatG protease domains may also be considered in the context of decades of prior studies using engineered variants of subtilisin to catalyze peptide ligation (Braisted et al., 1997). Formation of a peptide bond by PatG relies on minimizing hydrolysis (by solvent) and promoting aminolysis (by the substrate peptide α-amine) of the covalent acyl-enzyme intermediate. Hydrolysis of the intermediate yields linear Pat cassettes as products, while aminolysis yields macrocyclized cyanobactins. Both linear and macrocyclized products can be observed in in vitro experiments with PatG (Figures 7 and 8) (McIntosh et al., 2010b). PatG likely promotes aminolysis by deprotonating and positioning the Pat cassette α-amine in close proximity to the active site Ser, and by suitable desolvation of the active site. Previous studies have shown that the equilibrium between hydrolysis and aminolysis can be shifted by changing the chemical nature of the covalent enzyme intermediate in proteases such as subtilisin (Shih-Hsi Chu, 1966). Thioester intermediates, generated by the replacement of active site Ser221 in subtilisin by Cys, are more than 600-fold more efficient for aminolysis over hydrolysis (Nakatsuka et al., 1987). Selenol esters, containing an active site Se-Cys, were 14,000-fold more efficient for aminolysis (Wu and Hilvert, 1989). The molecular crowding in the active site created by the replacement of the wild type alcohol Ser side chain by the bulkier thiol side chain (in thiolsubtilisin) or selenol side chain (in selenolsubtilisin) could be relieved by an additional Pro225→Ala mutation (Abrahmsen et al., 1991). This mutation results in a slight shift of the α helix that bears the active site nucleophile, and this shift can accommodate the longer carbon-sulfur bond length (~1.8 Å) in a thioester intermediate compared to the ~1.4 Å carbon-oxygen bond length of an ester intermediate (Abrahmsen et al., 1991). Thus Ser221→Cys/Pro225→Ala subtilisin (termed subtiligase) could be utilized to generate entire functional proteins and enzymes by sequential ligation of peptide fragments (Chang et al., 1994; Jackson et al., 1994). This pioneering approach, which provided an alternate strategy for incorporation of unnatural amino acids in proteins (Jackson et al., 1994), was limited by its requirement for activated carboxy terminal esters. Studies into the catalysis of peptide macrocyclization using subtiligase established that peptides greater than 12 amino acids in length could be cyclized by this engineered catalyst (Jackson et al., 1995).

Figure 7: Analysis of products generated by the wild type PatG, and PatGsubtiligase for the KKPYILPAYDGE peptide in the presence of Ala-Gly di-peptide.

Top panel in black shows the MALDI-MS spectrum for the substrate peptide without the addition of the catalysts. Middle panel in red shows the reaction of the substrate peptide with wild type PatG, while the bottom panel in blue shows the reaction of the substrate peptide with PatGsubtiligase. The relevant peaks are labeled. Cyclic peptides are labeled as c(peptide). Note that the peaks corresponding to the cyclic products in the bottom panel are also present in the middle panel, but the peaks corresponding to linear products or linear Ala-Gly adducts in the middle panel are absent in the bottom panel. Note also that the reaction has proceeded to completion, as evidenced by the complete disappearance of the substrate peaks.

Figure 8: Analysis of products generated by the wild type PatG protease domain, and PatGsubtiligase for the (Ahx)RGDWPAYDGE peptide.

Top panel in black shows the MALDI-MS spectrum for the substrate peptide without the addition of the catalysts. Middle panel in red shows the reaction of the substrate peptide with wild type PatG, while the bottom panel in blue shows the reaction of the substrate peptide with PatGsubtiligase. The relevant peaks are labeled. Cyclic peptides are labeled as c(peptide). Note that the peak corresponding to the cyclic product in the bottom panel is also present in the middle panel, but the peak corresponding to the linear product in the middle panel is absent in the bottom panel. Also note that the reaction has not proceeded to completion.

In order to explore whether similar substitutions within PatG could offer an increase in the catalytic efficiency for macrocyclization, we generated the PatG protease Ser783→Cys/Pro787→Ala variant (hereafter PatGsubtiligase) and carried out in vitro mass spectrometric assays of this variant with different substrates. The crystal structure of PatG establishes that Pro787 is positioned analogously to Pro225 in subtilisin. Both wild-type PatG and PatGsubtiligase were first assayed with the substrate Lys-Lys-Pro-Tyr-Ile-Leu-Pro-Ala-Tyr-Asp-Gly-Glu. In order to probe the competing reactions of hydrolysis and aminolysis, the enzymatic reactions of both proteins were challenged using the primary amines of small dipeptides. In the presence of the Ala-Gly dipeptide, the wild type enzyme yields a mixture of products, which includes linear (Lys)-Lys-Pro-Tyr-Ile-Leu-Pro peptides (note that the first Lys residue of the substrate peptide is hydrolyzed non-enzymatically) (McIntosh et al., 2010b), cyclized peptides, and most interestingly, linear adducts of (Lys)-Lys-Pro-Tyr-Ile-Leu-Pro with the Ala-Gly dipeptide (Figure 7). This result is consistent with our hypothesis that the wild type enzyme can deacylate the ester intermediate by hydrolysis or aminolysis. Further, the aminolysis step can also accept ‘trans’ primary amines, as provided by the Ala-Gly dipeptide. This result has been further corroborated by the presence of Gly-Gly adducts with the same substrate peptide under similar assay conditions (Supplementary Figure S3).
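The product classes distinguished in these mass spectrometric assays differ by simple, predictable mass increments: head-to-tail cyclization removes one water relative to the linear peptide, and a trans aminolysis adduct adds the residues of the attacking dipeptide. The sketch below illustrates this bookkeeping using standard monoisotopic residue masses. Reporting the species as singly protonated [M+H]+ ions is an assumption made for illustration (MALDI spectra may also show sodium or potassium adducts), and none of the printed values are taken from the figures.

```python
# Standard monoisotopic residue masses (Da) for the amino acids used here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "P": 97.05276, "L": 113.08406,
    "I": 113.08406, "D": 115.02694, "K": 128.09496, "E": 129.04259,
    "Y": 163.06333,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass added on forming an [M+H]+ ion


def linear_mass(seq: str) -> float:
    """Neutral monoisotopic mass of a linear peptide with free termini."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def cyclic_mass(seq: str) -> float:
    """Neutral mass of the head-to-tail (N-C) cyclized peptide.

    Closing the ring forms one extra amide bond and releases one water.
    """
    return linear_mass(seq) - WATER


def mh(mass: float) -> float:
    """m/z of the singly protonated ion [M+H]+ (assumed ion type)."""
    return mass + PROTON


if __name__ == "__main__":
    core = "KKPYILP"  # cassette remaining after loss of the AYDGE flank
    species = {
        "substrate KKPYILPAYDGE": linear_mass(core + "AYDGE"),
        "linear KKPYILP": linear_mass(core),
        "c(KKPYILP)": cyclic_mass(core),
        "linear KKPYILP-AG adduct": linear_mass(core + "AG"),
    }
    for name, mass in species.items():
        print(f"{name:26s} [M+H]+ = {mh(mass):.3f}")
```

The same functions can be applied to the cassette lacking the first Lys ("KPYILP") to account for the non-enzymatic loss of that residue noted above.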

When the PatGsubtiligase reaction was challenged with the Ala-Gly dipeptide (Figure 7) or the Gly-Gly dipeptide (Supplementary Figure S3), the predominant products were the cyclized peptides, and no Ala-Gly/Gly-Gly adducts could be observed. Our preliminary qualitative data also suggest that PatGsubtiligase is a more competent catalyst for cyclization via aminolysis, as compared to the wild type enzyme, in the absence of competing primary amines (Supplementary Figure S4). These data are consistent with the idea that the rate-limiting step of the PatG catalytic scheme, as shown in Figure 6, is the deacylation. When the aminolysis of the acyl-enzyme intermediate is hastened in PatGsubtiligase, only cyclic peptides are observed as products. The slower aminolytic deacylation of the wild type enzyme allows for competition from water, as well as small molecule primary amines, for deacylation, which results in the formation of linear products and dipeptide adducts, respectively.
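The competition argument can be illustrated with a simple partitioning calculation. The sketch below assumes that each deacylation route (intramolecular aminolysis, hydrolysis, and attack by an added dipeptide) behaves as an irreversible pseudo-first-order step acting on a common acyl-enzyme intermediate, so that the product distribution depends only on the ratio of rate constants. This is a deliberate simplification, and the rate constants are hypothetical relative values; no rate constants were measured in this work.

```python
def product_fractions(rates: dict[str, float]) -> dict[str, float]:
    """Split an acyl-enzyme intermediate among competing deacylation routes.

    Each route is treated as an irreversible (pseudo-)first-order step, so the
    fraction of product from a route is its rate divided by the summed rate.
    """
    total = sum(rates.values())
    if total <= 0:
        raise ValueError("at least one rate constant must be positive")
    return {route: k / total for route, k in rates.items()}


if __name__ == "__main__":
    # Hypothetical relative rate constants (arbitrary units), illustration only.
    slow_aminolysis = {"cyclization": 1.0, "hydrolysis": 1.0, "dipeptide adduct": 1.0}
    # Same competing routes, with intramolecular aminolysis made 100-fold faster.
    fast_aminolysis = {**slow_aminolysis, "cyclization": 100.0}

    for label, rates in (("slow", slow_aminolysis), ("fast", fast_aminolysis)):
        fractions = product_fractions(rates)
        summary = ", ".join(f"{r}: {f:.1%}" for r, f in fractions.items())
        print(f"{label} aminolysis -> {summary}")
```

In this toy model, accelerating only the intramolecular aminolysis step is sufficient to make the cyclic product dominate, without any change in the rates of the competing routes.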

It should be noted that Lys-Lys-Pro-Tyr-Ile-Leu-Pro-Ala-Tyr-Asp-Gly-Glu is one of the best substrates for the PatG protease domain that we identified in our previous extensive substrate scope studies. We sought to explore whether PatGsubtiligase would bias the reaction towards cyclization for poorer substrates that are processed very slowly by the enzyme. We assayed the activity of the wild type PatG and the subtiligase variant in the presence of (Ahx)Arg-Gly-Asp-Trp-Pro-Ala-Tyr-Asp-Gly-Glu (Ahx = aminohexanoic acid). For this substrate, cyclization was demonstrated to occur via the aminohexanoic moiety at the N-terminus of the peptide. While the wild type enzyme yielded a mixture of cyclic and linear products, PatGsubtiligase exclusively produced the cyclized product (Figure 8).

These engineering experiments, analysis of which is compromised by the poor catalytic efficiency of the enzyme, lay the framework for further development of PatG as a general, promiscuous, cofactor-independent catalyst for the synthesis of circularized peptides. The range of products that can be generated using wild type PatG and PatGsubtiligase may be of broad interest for pharmaceutical and biotechnological applications.

Significance

Biosynthesis of microbial genetically encoded small molecule natural products continues to provide numerous examples in which novel enzymatic chemistry can be achieved through subtle changes in the active sites of enzymes with well-established mechanisms. One such example is the cofactor independent amide bond formation activity of the cyanobactin maturation proteases, which is utilized to generate a macrocyclic product from a linear substrate. Here, we describe the crystal structures of the subtilisin-like proteases requisite for cyanobactin maturation. A comparison of the structures reveals the presence of novel structural motifs within the PatG protease that favor transamidation (rather than hydrolysis) of the acyl-substrate complex. We also present biochemical details of an engineered PatG variant with improved macrocyclization activity, while preserving the broad substrate specificity of the wild-type enzyme.

Experimental procedures

The cloning, expression, purification and crystallization of the various cyanobactin protease domains are described in detail in the Supplementary Materials and Methods.

Phasing and structure determination

X-ray diffraction data were collected at Life Sciences Collaborative Access Team (LS-CAT), Sector-21, Argonne National Laboratory. Initial crystallographic phases for the PatA protease domain were determined by single wavelength anomalous diffraction utilizing anomalous scattering from crystals of selenomethionine labeled PatA protease domain. A nine-fold redundant data set was collected at the selenium absorption edge, to a limiting resolution of 1.9 Å (overall Rmerge = 0.077, I/σ(I) = 5 in the highest resolution shell) utilizing a Mar 300 CCD detector (LS-CAT, Sector 21 ID-D, Advanced Photon Source, Argonne, IL, USA). Data were indexed and scaled using the HKL2000 package (Otwinowski et al., 2003). A total of 15 selenium sites were identified using HySS (Grosse-Kunstleve and Adams, 2003), and heavy atom parameters were refined using Phaser as implemented in the Phenix suite of programs to yield a figure of merit of 0.418 (acentric/centric = 0.452/0.110). The resultant electron density map was of exceptional quality and permitted most of the main-chain and more than half of the side-chain residues to be automatically built using ARP/wARP. The remainder of the model was fitted using Coot (Emsley and Cowtan, 2004) and further improved through iterative rounds of refinement with REFMAC5 (Murshudov et al., 1997) interspersed with manual rebuilding. Subsequent rounds of model building and crystallographic refinement utilized data from a native crystal of PatA protease domain grown under similar conditions that diffracted to 1.7 Å resolution. Cross validation, using 5% of the data for the calculation of the free R factor, was utilized throughout the model-building process in order to monitor building bias (Kleywegt and Brunger, 1996).
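The merging and cross-validation statistics quoted above follow standard definitions, which can be written out explicitly. The sketch below is not the code used by HKL2000 or REFMAC5; it is a minimal illustration, on made-up toy numbers, of how a merging R value, a crystallographic R-factor (using the common form with an absolute value around each difference), and a random 5% test set for the free R could be computed. All function names and data are hypothetical.

```python
import random


def r_sym(observations: dict[str, list[float]]) -> float:
    """R_sym = sum|I_i - <I>| / sum I_i over repeated measurements.

    `observations` maps a unique reflection label to its measured intensities.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs: list[float], f_calc: list[float], k: float = 1.0) -> float:
    """R = sum| |Fobs| - k|Fcalc| | / sum|Fobs| for paired amplitudes."""
    numerator = sum(abs(abs(fo) - k * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(abs(fo) for fo in f_obs)


def split_free_set(n_reflections: int, fraction: float = 0.05, seed: int = 0):
    """Return (work, free) index lists; `free` is a random test set."""
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)
    n_free = round(n_reflections * fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


if __name__ == "__main__":
    # Toy intensities for two reflections, each measured more than once.
    toy = {"(1 0 0)": [100.0, 110.0, 90.0], "(1 1 0)": [50.0, 50.0]}
    print(f"R_sym = {r_sym(toy):.3f}")

    # Toy amplitudes; the free set is excluded from refinement in practice.
    f_obs = [10.0 + i for i in range(40)]
    f_calc = [fo * (1.05 if i % 2 else 0.95) for i, fo in enumerate(f_obs)]
    work, free = split_free_set(len(f_obs))
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test-set reflections are never used to adjust the model, a free R that tracks the working R during rebuilding indicates that improvements are not simply the result of overfitting.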

X-ray diffraction data from crystals of the PagA and PatG protease domains were collected and processed in a similar manner. The structure of each of these protease domains was determined by the molecular replacement method using the refined coordinates of the PatA protease domain as a search model (McCoy et al., 2007). The resultant solutions were subjected to initial rounds of automated model building using ARP/wARP, followed by subsequent rounds of manual rebuilding using XtalView, interspersed with crystallographic refinement using REFMAC5.

Supplementary Material

supplemental figures

Table 1.

Data collection, phasing and refinement statistics

| | PatA | SeMet PatA | PagA | PatG |
|---|---|---|---|---|
| Data collection | | | | |
| Space group | P4₁ | P4 | I2₁2₁2₁ | C2 |
| a, b, c (Å); β (°) | 83.7, 83.7, 42.3; 90 | 117.4, 117.4, 42.6; 90 | 49.7, 145.9, 193.5 | 133.9, 66.9, 105.6; 112.2 |
| Resolution (Å)¹ | 50–1.7 (1.76–1.7) | 50–1.9 (1.97–1.9) | 50–2.45 (2.54–2.45) | 50–2.0 (2.07–2.0) |
| Rsym (%)² | 8.3 (66.1) | 7.7 (38.6) | 8.2 (49.1) | 8.0 (44.3) |
| I/σ(I) | 43.5 (1.5) | 43.2 (5.0) | 17.2 (1.7) | 22.6 (2.3) |
| Completeness (%) | 97.4 (75.6) | 100 (100) | 93.2 (67.2) | 96.5 (76.9) |
| Redundancy | 12.2 (3.5) | 9.3 (8.3) | 6.2 (5.3) | 7.1 (4.6) |
| Phasing | | | | |
| FOM/DM FOM³ | | 0.350/0.646 | | |
| Refinement | | | | |
| Resolution (Å) | 25.0–1.7 | 25.0–1.9 | 25.0–2.45 | 25.0–2.00 |
| No. reflections | 30,147 | 43,969 | 23,255 | 53,640 |
| Rwork / Rfree⁴ | 20.9/24.4 | 21.3/24.9 | 18.9/23.1 | 20.0/24.3 |
| Number of atoms | | | | |
| Protein | 1,992 | 3,949 | 4,052 | 4,541 |
| Water | 237 | 236 | 156 | 442 |
| B-factors | | | | |
| Protein | 27.0 | 28.0 | 47.8 | 32.4 |
| Water | 40.9 | 35.5 | 44.3 | 42.9 |
| R.m.s. deviations | | | | |
| Bond lengths (Å) | 0.009 | 0.010 | 0.007 | 0.009 |
| Bond angles (°) | 1.24 | 1.33 | 1.27 | 1.16 |

1. Highest resolution shell is shown in parentheses.

2. Rsym = Σ |Ii − <I>| / Σ Ii, where Ii = intensity of the ith reflection and <I> = mean intensity.

3. Mean figure of merit before and after density modification.

4. R-factor = Σ (|Fobs| − k|Fcalc|) / Σ |Fobs|, and Rfree is the R value for a test set of reflections consisting of a random 5% of the diffraction data not used in refinement.

Highlights.

  • Boundaries for cyanobactin maturation protease domains determined.

  • Structures of protease domains for cyanobactin maturation enzymes determined.

  • Novel structural motif in PatG assists in cyanobactin cyclizing transamidation.

  • Rational mutagenesis of active site leads to altered catalytic profiles for PatG.

Acknowledgements

The authors are grateful to Drs. Keith Brister, Joseph Brunzelle and the staff at Life Sciences Collaborative Access Team (LS-CAT, Argonne National Laboratory, IL) for facilitating data collection. We thank Drs. Wilfred A. van der Donk (University of Illinois) and James Wells (University of California at San Francisco) for useful discussions. The authors declare no competing financial interests. Atomic coordinates have been deposited with the Protein Data Bank (www.rcsb.org) with the following accession codes: PatA (4H6V), PagA (4H6W), and PatG (4H6X).

References

  1. Abrahmsen L, Tom J, Burnier J, Butcher KA, Kossiakoff A, and Wells JA (1991). Engineering subtilisin and its substrates for efficient ligation of peptide bonds in aqueous solution. Biochemistry 30, 4151–4159.
  2. Blasiak LC, and Clardy J (2010). Discovery of 3-formyl-tyrosine metabolites from Pseudoalteromonas tunicata through heterologous expression. J Am Chem Soc 132, 926–927.
  3. Bode W, Papamokos E, and Musil D (1987). The high-resolution X-ray crystal structure of the complex formed between subtilisin Carlsberg and eglin c, an elastase inhibitor from the leech Hirudo medicinalis. Structural analysis, subtilisin structure and interface geometry. Eur J Biochem 166, 673–692.
  4. Braisted AC, Judice JK, and Wells JA (1997). Synthesis of proteins by subtiligase. Methods Enzymol 289, 298–313.
  5. Chang TK, Jackson DY, Burnier JP, and Wells JA (1994). Subtiligase: a tool for semisynthesis of proteins. Proc Natl Acad Sci U S A 91, 12544–12548.
  6. Donia MS, Hathaway BJ, Sudek S, Haygood MG, Rosovitz MJ, Ravel J, and Schmidt EW (2006). Natural combinatorial peptide libraries in cyanobacterial symbionts of marine ascidians. Nat Chem Biol 2, 729–735.
  7. Donia MS, Ravel J, and Schmidt EW (2008). A global assembly line for cyanobactins. Nat Chem Biol 4, 341–343.
  8. Donia MS, and Schmidt EW (2011). Linking chemistry and genetics in the growing cyanobactin natural products family. Chem Biol 18, 508–519.
  9. Emsley P, and Cowtan K (2004). Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60, 2126–2132.
  10. Fischbach MA, and Walsh CT (2009). Antibiotics for emerging pathogens. Science 325, 1089–1093.
  11. Fitzpatrick PA, Steinmetz AC, Ringe D, and Klibanov AM (1993). Enzyme crystal structure in a neat organic solvent. Proc Natl Acad Sci U S A 90, 8653–8657.
  12. Fluhe L, Knappe TA, Gattner MJ, Schafer A, Burghaus O, Linne U, and Marahiel MA (2012). The radical SAM enzyme AlbA catalyzes thioether bond formation in subtilosin A. Nat Chem Biol 8, 350–357.
  13. Garg N, Tang W, Goto Y, Nair SK, and van der Donk WA (2012). Lantibiotics from Geobacillus thermodenitrificans. Proc Natl Acad Sci U S A 109, 5241–5246.
  14. Grosse-Kunstleve RW, and Adams PD (2003). Substructure search procedures for macromolecular structures. Acta Crystallogr D Biol Crystallogr 59, 1966–1973.
  15. Hollenhorst MA, Clardy J, and Walsh CT (2009). The ATP-dependent amide ligases DdaG and DdaF assemble the fumaramoyl-dipeptide scaffold of the dapdiamide antibiotics. Biochemistry 48, 10467–10472.
  16. Jackson DY, Burnier J, Quan C, Stanley M, Tom J, and Wells JA (1994). A designed peptide ligase for total synthesis of ribonuclease A with unnatural catalytic residues. Science 266, 243–247.
  17. Jackson DY, Burnier JP, and Wells JA (1995). Enzymic Cyclization of Linear Peptide Esters Using Subtiligase. Journal of the American Chemical Society 117, 819–820.
  18. Kadi N, Oves-Costales D, Barona-Gomez F, and Challis GL (2007). A new family of ATP-dependent oligomerization-macrocyclization biocatalysts. Nat Chem Biol 3, 652–656.
  19. Kleywegt GJ, and Brunger AT (1996). Checking your imagination: applications of the free R value. Structure 4, 897–904.
  20. Knerr PJ, and van der Donk WA (2012). Discovery, biosynthesis, and engineering of lantipeptides. Annu Rev Biochem 81, 479–505.
  21. Koehnke J, Bent A, Houssen WE, Zollman D, Morawitz F, Shirran S, Vendome J, Nneoyiegbe AF, Trembleau L, Botting CH, et al. (2012). The mechanism of patellamide macrocyclization revealed by the characterization of the PatG macrocyclase domain. Nat Struct Mol Biol.
  22. Lee J, McIntosh J, Hathaway BJ, and Schmidt EW (2009). Using marine natural products to discover a protease that catalyzes peptide macrocyclization of diverse substrates. J Am Chem Soc 131, 2122–2124.
  23. McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, and Read RJ (2007). Phaser crystallographic software. J Appl Crystallogr 40, 658–674.
  24. McIntosh JA, Donia MS, Nair SK, and Schmidt EW (2011). Enzymatic basis of ribosomal peptide prenylation in cyanobacteria. J Am Chem Soc 133, 13698–13705.
  25. McIntosh JA, Donia MS, and Schmidt EW (2010a). Insights into heterocyclization from two highly similar enzymes. J Am Chem Soc 132, 4089–4091.
  26. McIntosh JA, Robertson CR, Agarwal V, Nair SK, Bulaj GW, and Schmidt EW (2010b). Circular logic: nonribosomal peptide-like macrocyclization with a ribosomal peptide catalyst. J Am Chem Soc 132, 15499–15501.
  27. McIntosh JA, and Schmidt EW (2010). Marine molecular machines: heterocyclization in cyanobactin biosynthesis. Chembiochem 11, 1413–1421.
  28. Murshudov GN, Vagin AA, and Dodson EJ (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr 53, 240–255.
  29. Nakatsuka T, Sasaki T, and Kaiser ET (1987). Peptide segment synthesis catalyzed by the semisynthetic enzyme thiolsubtilisin. Journal of the American Chemical Society 109, 3808–3810.
  30. Neidhart DJ, and Petsko GA (1988). The refined crystal structure of subtilisin Carlsberg at 2.5 A resolution. Protein Eng 2, 271–276.
  31. Newman DJ, and Cragg GM (2007). Natural products as sources of new drugs over the last 25 years. J Nat Prod 70, 461–477.
  32. Otwinowski Z, Borek D, Majewski W, and Minor W (2003). Multiparametric scaling of diffraction intensities. Acta Crystallogr A 59, 228–234.
  33. Rea MC, Sit CS, Clayton E, O’Connor PM, Whittal RM, Zheng J, Vederas JC, Ross RP, and Hill C (2010). Thuricin CD, a posttranslationally modified bacteriocin with a narrow spectrum of activity against Clostridium difficile. Proc Natl Acad Sci U S A 107, 9352–9357.
  34. Schmidt EW, and Donia MS (2009). Chapter 23. Cyanobactin ribosomally synthesized peptides--a case of deep metagenome mining. Methods Enzymol 458, 575–596.
  35. Schmidt EW, Nelson JT, Rasko DA, Sudek S, Eisen JA, Haygood MG, and Ravel J (2005). Patellamide A and C biosynthesis by a microcin-like pathway in Prochloron didemni, the cyanobacterial symbiont of Lissoclinum patella. Proc Natl Acad Sci U S A 102, 7315–7320.
  36. Shih-Hsi Chu HGM (1966). Analogs of Neuroeffectors. V. Neighboring-Group Effects in the Reactions of Esters, Thiolesters, and Selenolesters. The Hydrolysis and Aminolysis of Benzoylcholine, Benzoylthiolcholine, Benzoylselenolcholine, and of Their Dimethylamino Analogs. The Journal of Organic Chemistry 31, 308–312.
  37. Siezen RJ, and Leunissen JA (1997). Subtilases: the superfamily of subtilisin-like serine proteases. Protein Sci 6, 501–523.
  38. Sivonen K, Leikoski N, Fewer DP, and Jokela J (2010). Cyanobactins-ribosomal cyclic peptides produced by cyanobacteria. Appl Microbiol Biotechnol 86, 1213–1225.
  39. Tianero MD, Donia MS, Young TS, Schultz PG, and Schmidt EW (2012). Ribosomal route to small-molecule diversity. J Am Chem Soc 134, 418–425.
  40. Vetting MW, Hegde SS, and Blanchard JS (2010). The structure and mechanism of the Mycobacterium tuberculosis cyclodityrosine synthetase. Nat Chem Biol 6, 797–799.
  41. Walsh CT, Acker MG, and Bowers AA (2010). Thiazolyl peptide antibiotic biosynthesis: a cascade of post-translational modifications on ribosomal nascent proteins. J Biol Chem 285, 27525–27531.
  42. Wu ZP, and Hilvert D (1989). Conversion of a protease into an acyl transferase: selenolsubtilisin. Journal of the American Chemical Society 111, 4513–4514.
  43. Zhang W, Ntai I, Kelleher NL, and Walsh CT (2011). tRNA-dependent peptide bond formation by the transferase PacB in biosynthesis of the pacidamycin group of pentapeptidyl nucleoside antibiotics. Proc Natl Acad Sci U S A 108, 12249–12253.
