Abstract
Modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) enzymatic assembly lines are large and dynamic protein machines that generally effect a linear sequence of catalytic cycles. Here we report the heterologous reconstitution and comprehensive characterization of two hybrid NRPS-PKS assembly lines that defy many standard rules of assembly line biosynthesis to generate a large combinatorial library of cyclic lipodepsipeptide protease inhibitors called thalassospiramides. We generate a series of precise domain-inactivating mutations in thalassospiramide assembly lines and present evidence for an unprecedented biosynthetic model that invokes inter-module substrate activation and tailoring, module skipping, and pass-back chain extension, whereby the ability to pass the growing chain back to a preceding module is flexible and substrate-driven. Expanding bidirectional inter-module domain interactions could represent a viable mechanism for generating chemical diversity without increasing the size of biosynthetic assembly lines and challenges our understanding of the potential elasticity of multi-modular megaenzymes.
Modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) enzymes are molecular-scale assembly lines that construct complex polymeric products, many of which are useful to humans as life-saving drugs. The first characterized assembly lines exhibited an elegant co-linear biosynthetic logic, whereby the linear arrangement of functional units, called modules, along an NRPS or PKS polypeptide directly correlates to the chemical structure of the product1. The PKS giving rise to the antibiotic erythromycin2 and the NRPS producing the antibiotic daptomycin3 are two such examples. The core components of an NRPS or PKS assembly line elongation module include, respectively, condensation (C) or ketosynthase (KS) domains catalyzing chain extension, adenylation (A) or acyltransferase (AT) domains for substrate selection, and thiolation (T) domains for covalent substrate tethering. Optional tailoring domains such as methyltransferase (MT), ketoreductase (KR), or dehydratase (DH) domains, if present, chemically modify building blocks or chain-extension intermediates. This one-to-one correlation between product moieties and assembly line modules with the requisite catalytic domains is one feature that makes NRPS/PKS enzymes among the largest proteins found in nature.
However, it is now clear that many assembly lines do not strictly abide by the rules of co-linearity. A phylogenetically distinct class of modular PKSs are trans-AT PKSs, which do not directly encode AT domains within modules but instead as stand-alone enzymes that act in trans4. A separate type of NRPS, referred to as a nonlinear NRPS, deviates from the standard core domain arrangement of C-A-T and reuses a single domain more than once5. Several nonlinear NRPSs possess modules missing A domains and are presumably loaded by A domains from upstream modules6-10. Leveraging domain activities in trans or from different modules reduces the size of biosynthetic assembly lines and thus may represent a mechanism for minimizing modular assembly lines without sacrificing product complexity.
A particularly intriguing set of hybrid NRPS-PKS assembly lines that are both nonlinear and trans-AT are those responsible for the biosynthesis of the thalassospiramides, a large group of immunosuppressive cyclic lipodepsipeptides11-13. Thalassospiramide NRPS-PKS genes have been identified in several marine Rhodospirillaceae bacteria and exhibit distinct architectures that range in domain and module “completeness”13. While it is still unclear whether all configurations are functional, thalassospiramide assembly lines representing the most “complete” and “incomplete” architectures are both capable of producing a large combination of lipodepsipeptides that vary in fatty acid and amino acid composition, order, and length12. Previous work identifying these NRPS-PKS genes and structurally characterizing their numerous and diverse chemical products11-13 led us to hypothesize that these assembly lines must operate with an unprecedented degree of nonlinearity. Furthermore, we posited that in order to generate their chemical products, thalassospiramide assembly lines must catalyze one or two rounds of pass-back chain extension, where the chain-extension intermediate is passed from a downstream module back to an upstream module within the same polypeptide.
Here, we report a comprehensive characterization of the thalassospiramide biosynthetic machinery from the α-proteobacteria Thalassospira sp. CNJ-328 and Tistrella mobilis KA081020-065, which represent the most “complete” and “incomplete” assembly line architectures, respectively13. We present an experimentally supported and mechanistically novel biosynthetic model that invokes inter-module substrate activation and tailoring, module skipping, and pass-back chain extension, whereby the ability to pass the growing chain forward or backward is flexible and influenced by the identity and chain length of the chemical intermediate. These newly described features accentuate the potential bidirectionality and flexibility of multi-modular megaenzymes and reveal new engineering opportunities and structural considerations.
Results
Cloning and heterologous expression of ttc and ttm.
Close inspection of the thalassospiramide ttc and ttm gene clusters revealed that the NRPS/PKS genes from Thalassospira and Tistrella, respectively, occupy completely different genomic contexts (Fig. 1a,b, Supplementary Table 1). The ttc cluster also includes a gene encoding a 4’-phosphopantetheinyl transferase (PPTase), TtcD, for which there is no homolog in ttm. Notably absent from the genomic vicinity of both pathways are any genes encoding stand-alone AT or A domains (Supplementary Table 1), although both assembly lines possess a trans-AT PKS module and Ttm contains two NRPS modules without A domains. We chose to clone a broader genomic range for ttc and a more limited range for ttm (Fig. 1a,b).
Fig. 1 ∣. Heterologous reconstitution of thalassospiramide biosynthetic gene clusters in a P. putida host.
a,b, Annotated genomic loci encompassing the thalassospiramide assembly line genes from Thalassospira sp. CNJ-328 (a) and Tistrella mobilis KA081020-065 (b) targeted for cloning and heterologous expression. (c) LC-MS analysis of extracts from an empty P. putida EM383 host and hosts with genomically integrated ttc and ttm pathways compared against an authentic thalassospiramide A (1) standard. Extracted ion chromatograms (EIC) of m/z 958.5496. See Supplementary Fig. 4 for associated MS/MS spectra.
To directly clone these gene clusters, a new bacterial artificial chromosome (BAC)-based transformation-associated recombination (TAR) cloning vector, pCAP-BAC (pCB), was designed and constructed to enable stable maintenance of large constructs in Escherichia coli (Supplementary Fig. 1). pCB lacks host-specific integration elements, which can be introduced after cloning, to make it easier to retrofit cloned pathways with integration elements for different hosts. Following successful cloning of ttc, heterologous expression was attempted but never achieved in E. coli, despite efforts to perform promoter refactoring, stabilize protein expression, and co-express the pathway with various promiscuous PPTases (Supplementary Fig. 2). Thus, we constructed a Pseudomonas integration cassette containing the Int-B13 site-specific recombinase14 and introduced it into the vector backbone to generate pCB-ttc-int (Supplementary Fig. 1). This construct was successfully integrated into the genome of Pseudomonas putida EM38315 (Supplementary Fig. 3). The same procedure was used for ttm, and expression of both gene clusters in P. putida was successful as evidenced by detection of the representative product thalassospiramide A (1) (Fig. 1c, Supplementary Fig. 4). This validated that the Ttc and Ttm assembly lines do not require additional pathway-specific enzymatic components, beyond what was transferred to the host and supplied through primary metabolism. To our knowledge, this is the first report of successful heterologous expression of a trans-AT pathway without co-transfer of a cognate AT. Our current hypothesis as to why expression was successful in P. putida but not in E. coli is that the P. putida primary metabolic AT is capable of interfacing with the trans-AT PKS modules, while the AT from E. coli is not.
Reconstitution of thalassospiramide structural diversity.
We next explored whether the heterologously expressed Ttc and Ttm clusters could reproduce the full suite of thalassospiramide chemical diversity. Thalassospiramide cyclic lipodepsipeptides can be grouped into four categories based on their chemical structures and biosynthetic origin11-13 (Fig. 2). Products of Ttc can incorporate serine, phenylalanine, or tyrosine as the first amino acid residue (colored blue in Fig. 2), which can be extended by a single ketide unit to generate “B-like” as opposed to “A1-like” thalassospiramides. Alternatively, Ttm incorporates serine or valine in the first position, then includes or omits a valine residue (colored red in Fig. 2) to generate “A4-like” or “E-like” thalassospiramides, respectively. Both assembly lines produce “A-like” thalassospiramides at greater relative abundance than their “B-like” or “E-like” counterparts.
Fig. 2 ∣. Ttc and Ttm assembly lines and structures of associated cyclic lipodepsipeptide products.
(a) Ttc assembly line and structures of a representative set of associated chemical products. (b) Ttm assembly line and representative associated chemical products. Analogs not previously reported are underlined; see Supplementary Table 2 for HR-MS data. Analogs detected by LC-MS from the native producer but not the heterologous host are marked with an asterisk. C, condensation; A, adenylation; T, thiolation; KS, ketosynthase; AT, acyltransferase; DH, dehydratase; KR, ketoreductase; MT, methyltransferase; TE, thioesterase.
Additional elements that contribute to structural diversity include the N-terminal fatty acid, which is predominantly an atypical C10:1(Δ3) fatty acid, and the pattern of N-methylation, which is predominantly limited to the final tyrosine residue of the cyclic peptide core but can also extend to the adjacent valine residue for products of Ttc and further to valine within the linear peptide for Ttm (Fig. 2). Finally, the number of linear Ser-C2-Val units (4-amino-3,5-dihydroxy-N-pentanyl-valine, colored green in Fig. 2) can be 0, 1, or 2 for both assembly lines, which presumably arises from passage of chain-extension intermediates from module 4 back to module 2 or from module 5 back to module 3. Based on the different possible combinations of these variables, each assembly line can theoretically generate well over 100 compound analogs; several dozen compounds are routinely detected from small (50 mL) cultures of producing organisms using mass spectrometry.
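The combinatorial arithmetic behind this estimate can be sketched as a simple enumeration. The snippet below is only an illustration for Ttc: the first-residue, ketide-extension, linear-unit, and N-methylation options are taken from the description above, whereas the fatty acid list is a hypothetical placeholder (the text does not enumerate the fatty acids), and the sketch assumes that every combination is biosynthetically accessible, which has not been established.

```python
from itertools import product

# Options named in the text for the Ttc assembly line.
FIRST_RESIDUE = ["Ser", "Phe", "Tyr"]
KETIDE_EXTENSION = [False, True]   # False = "A1-like", True = "B-like"
LINEAR_UNITS = [0, 1, 2]           # number of linear Ser-C2-Val units
N_METHYLATION = ["Tyr only", "Tyr + adjacent Val"]

# ASSUMPTION: placeholder labels only; the real fatty acid set is not listed here.
HYPOTHETICAL_FATTY_ACIDS = ["FA-1", "FA-2", "FA-3", "FA-4"]


def enumerate_analogs(fatty_acids=HYPOTHETICAL_FATTY_ACIDS):
    """Return every combination of the structural variables as a tuple."""
    return list(product(fatty_acids, FIRST_RESIDUE, KETIDE_EXTENSION,
                        LINEAR_UNITS, N_METHYLATION))


analogs = enumerate_analogs()
print(len(analogs), "theoretical combinations for", len(HYPOTHETICAL_FATTY_ACIDS),
      "placeholder fatty acids")
```

Under these assumptions, each fatty acid contributes 3 × 2 × 3 × 2 = 36 combinations, so three or more fatty acid variants would already exceed 100 theoretical analogs.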
LC-MS/MS analysis revealed that nearly all analogs detected from the native producers are also produced by the heterologous host, with some subtle differences in relative production levels (Supplementary Fig. 5). Overall titers of most analogs are comparable between the native strains and the host and in some cases greater in the host (Supplementary Fig. 5).
Characterization of non-assembly line genes.
To determine whether co-transferred genes beyond the core NRPS/PKS genes affect thalassospiramide titer or product distribution, we performed targeted deletions of all non-assembly line genes within ttc. Previous work using the broad-host-range vector, pCAP05 (ref. 16), demonstrated that no upstream genes (−7 through −1) are essential, although heterologous expression using this vector produced low yields (Supplementary Fig. 6a) and proved to be genetically unstable over time. Targeted deletion of ttc −1, +1, +2, +3, and +4 in the stable pCB-ttc-int expression construct had no impact on thalassospiramide production; however, deletion of the gene encoding the putative PPTase TtcD resulted in an approximately five-fold reduction in thalassospiramide A (1), which was restored upon genetic complementation of ttcD (Supplementary Figs. 6 and 7). This observation suggests that the single native P. putida PPTase is capable of activating Ttc carrier proteins, albeit not as effectively as TtcD. The P. putida PPTase is clearly capable of activating Ttm carrier proteins, as no cognate PPTase is encoded within ttm, and thalassospiramides are still produced in the heterologous host. However, co-expression of ttcD with ttm resulted in an approximately two-fold increase in levels of thalassospiramide A (1) (Supplementary Figs. 6 and 7). Quantitative analysis suggests that the PPTase TtcD favorably biases production of analogs that incorporate one or more linear Ser-C2-Val units (n≥1), particularly for Ttm, as ttcD co-expression actually decreases production of C2 (6) and E4 (25) (n=0) (Supplementary Fig. 7). This result suggests that TtcD-catalyzed phosphopantetheinylation of TtmA carrier proteins predisposes TtmA to perform pass-back chain extension at the expense of linear assembly through an unknown mechanism.
Inactivation and testing of assembly line domains.
Taken together, the results of the heterologous expression and gene deletion experiments strongly suggest that thalassospiramide structural diversity is generated directly from the multi-modular assembly line itself and does not involve accessory enzymes beyond what is provided from primary metabolism. Although certain modules appear to be missing domains based on retro-biosynthetic analysis of thalassospiramide chemical structures, all necessary core and tailoring domains are present somewhere along the assembly line, including a DH domain in module 4 of both systems that was not previously annotated (Fig. 2, Supplementary Fig. 8). Thus, we set out to investigate the predicted inter-module activity of assembly line domains.
Our initial approach leveraged gene deletion and complementation tools, focusing first on the smaller ttcC that encodes the last four domains of terminal module 6. Although TtcC harbors the lone assembly line MT domain, some thalassospiramides, such as A8 (5), are unusual in containing a ‘misplaced’ penultimate N-methylated valine residue in addition to the conserved terminal N-methylated tyrosine. Although MTs are usually positioned at the C-terminus of interrupted A domains17, MT6 is positioned at the N-terminus of A6 (Supplementary Fig. 8d). To explore whether MT6 can methylate amino acids activated by both the A5 and A6 adenylation domains, we deleted ttcC, resulting in complete loss of thalassospiramide production. Complementation with wild-type ttcC restored thalassospiramide production, while complementation with a mutant encoding TtcC-G234D, in which MT6 has been selectively inactivated, resulted in dramatic reduction and complete loss of thalassospiramides A (1) and A8 (5), respectively, and concomitant formation of a new product with HR-MS and MS/MS spectra consistent with desmethyl thalassospiramide A, or thalassospiramide A15 (26) (Supplementary Table 2, Supplementary Fig. 9). The same result was observed for all analogs, resulting in the formation of a series of desmethyl cyclic lipodepsipeptides. This confirms our hypothesis that MT6 can act within the upstream module 5 of TtcB. Furthermore, it suggests that pass-back chain extension occurs between modules 4 and 2 as opposed to 5 and 3, as promiscuous methylation is confined to the cyclic valine residue for Ttc. If linear valine residues were installed by module 5, promiscuous methylation should extend to these positions; however, we never observe this for products of Ttc. As module 5 of Ttm does not possess an A domain and presumably borrows the activity of A2, this could explain why promiscuous methylation can extend to upstream valine residues for Ttm but not Ttc.
We attempted to use the same approach to characterize other thalassospiramide domains; however, efforts to perform PCR mutagenesis in ttcA, which is over 6.5 kb, and ttcB and ttmA, which are both over 15.5 kb, were ultimately unsuccessful. Therefore, we established new methodology combining oligo recombineering with CRISPR-Cas9 counter-selection for facile introduction of point mutations to large DNA constructs cloned into the pCB vector backbone (Supplementary Fig. 10). Using this method, we selectively inactivated a series of assembly line domains (Supplementary Fig. 11) to directly interrogate their role in thalassospiramide biosynthesis.
Consistent with our annotation of C1a of TtcA as a starter C domain responsible for fatty acylation of the first amino acid residue18, C1a inactivation completely abolished production of all thalassospiramides (Fig. 3a). We did not detect any masses corresponding to core peptides lacking an N-terminal fatty acid, suggesting that the assembly line can only generate lipopeptide products. In contrast, inactivation of A1a using two different point mutations, TtcA-G631D and TtcA-K972A, resulted in essentially complete loss of all thalassospiramides incorporating phenylalanine or tyrosine as the first residue but maintained or enhanced production of analogs incorporating serine in the first position (Fig. 3a). Inactivation of A3 using the analogous lysine to alanine mutation (TtcB-K2045A) resulted in complete loss of thalassospiramide production, as did selective inactivation of T1a (Fig. 3a). These results suggest that the serine residue adenylated by A3 is directly loaded onto T1a during biosynthesis of analogs that incorporate serine as the first amino acid residue.
Fig. 3 ∣. Selective inactivation of assembly line enzymatic domains alters product formation.
a,b, Changes in production level of thalassospiramide analogs from Ttc (a) and Ttm (b) assembly lines upon selective inactivation of specific enzymatic domains. Domain and precise amino acid mutations are listed in the first two columns. Fold-change in MS ion intensity is indicated by number and color intensity, while white boxes indicate analogs were not detected (n.d.) from the mutant. Statistical significance was calculated using a two-tailed Student’s t-test; n=3 biologically independent samples, *p<0.05, **p<0.005. (c) Structures and EICs of new thalassospiramide analogs produced upon TtmA C2 inactivation; see Supplementary Table 2 for HR-MS data.
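As a minimal sketch of the statistics named in this legend, the snippet below computes a fold-change in ion intensity and a two-tailed Student’s t-test for two groups of n=3 using only the Python standard library. The function names and the intensity values are hypothetical and are not taken from the study. With 3 + 3 samples there are 4 degrees of freedom, for which the t-distribution has a closed-form cumulative distribution function; the sketch assumes the equal-variance (pooled) form of the test.

```python
import math
from statistics import mean, variance


def fold_change(mutant, wild_type):
    """Ratio of mean ion intensities (mutant / wild type)."""
    return mean(mutant) / mean(wild_type)


def t_test_n3(a, b):
    """Two-tailed Student's t-test (pooled variance) for two groups of n=3."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("this sketch only handles n=3 per group")
    pooled_var = (variance(a) + variance(b)) / 2   # valid for equal group sizes
    if pooled_var == 0:
        raise ValueError("zero variance: t statistic is undefined")
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / 3 + 1 / 3))
    # Closed-form CDF of the t-distribution with 4 degrees of freedom.
    w = 1 + t * t / 4
    cdf = 0.5 + 0.375 * (t / math.sqrt(w)) * (1 - (t * t / w) / 12)
    p = 2 * min(cdf, 1 - cdf)                      # two-tailed p-value
    return t, p


def stars(p):
    """Significance markers using the thresholds given in the legend."""
    if p < 0.005:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# HYPOTHETICAL ion intensities for one analog (three biological replicates each).
wild_type = [1.00e6, 1.10e6, 0.90e6]
mutant = [2.0e5, 2.5e5, 1.5e5]

fc = fold_change(mutant, wild_type)
t, p = t_test_n3(mutant, wild_type)
print(f"fold change = {fc:.2f}, t = {t:.2f}, p = {p:.4f} {stars(p)}")
```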
We hypothesized that “B-like” thalassospiramides from Ttc arise through PKS module 1b and, correspondingly, that “A1-like” thalassospiramides arise through module 1b skipping. Consistent with that hypothesis, mutation of the active site cysteine of KS1b to alanine resulted in complete loss of “B-like” analogs but maintained production of “A1-like” analogs, although some yields were slightly reduced (Fig. 3a). Surprisingly, although AT1b inactivation (TtcA-S1728A) reduced production of “B-like” analogs, almost all could still be detected by LC-MS at ~8-24% of wild-type production levels (Fig. 3a). As the AT1b mutation is expected to abolish enzymatic activity and the assembly line has no other AT domains, we propose that the module 4-interacting trans-AT can partially complement AT1b.
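The qualitative logic of these Ttc inactivation experiments can be summarized as a small rule table. The sketch below is a toy model only: the function name and the labels "abolished", "reduced", and "maintained" are illustrative simplifications of the results described above, not measured values, and cases the text does not report are returned as "not reported".

```python
def predict_effect(inactivated_domain, first_residue, b_like):
    """Qualitative effect of inactivating one Ttc domain on one analog class.

    first_residue: "Ser", "Phe", or "Tyr"
    b_like: True for "B-like" analogs (extended by module 1b), False for "A1-like"
    """
    if inactivated_domain in {"C1a", "T1a", "A3"}:
        return "abolished"          # all thalassospiramides were lost
    if inactivated_domain == "A1a":
        # Phe/Tyr depend on A1a; Ser is supplied to T1a by the downstream A3.
        return "abolished" if first_residue in {"Phe", "Tyr"} else "maintained"
    if inactivated_domain == "KS1b":
        # Module 1b is skipped for "A1-like" analogs, so they do not need KS1b.
        return "abolished" if b_like else "maintained"
    if inactivated_domain == "AT1b":
        # "B-like" analogs persisted at reduced levels; the effect on
        # "A1-like" analogs is not stated in the text.
        return "reduced" if b_like else "not reported"
    raise ValueError(f"no rule for domain {inactivated_domain!r}")


for domain in ["C1a", "A1a", "A3", "T1a", "KS1b", "AT1b"]:
    print(domain, predict_effect(domain, "Ser", b_like=False),
          predict_effect(domain, "Phe", b_like=True))
```

A table of this kind makes the key inference explicit: serine-first analogs survive A1a inactivation but not A3 or T1a inactivation, which is what points to A3 loading T1a.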
We previously hypothesized that the tandem T domains in module 4 might be important for determining whether chain-extension intermediates are passed forward or backward12, although literature precedents indicated that multiple T domains do not change product identity but instead increase flux or yield19. We attempted to use the same oligo recombineering/CRISPR-Cas9 method to selectively inactivate T4a and T4b. However, CRISPR-Cas9 targeting did not result in oligo incorporation but instead recombination across the two very similar T domain sequences to generate an in-frame deletion of residues 3511 to 3596 in TtcB. The resultant mutant protein contained only a single chimeric T domain composed of 46% of the N-terminus of T4a and 54% of the C-terminus of T4b (Supplementary Fig. 11). Transfer of this construct to the heterologous host revealed that all thalassospiramide analogs could still be produced, including those incorporating linear Ser-C2-Val units arising from pass-back chain extension, albeit in decreased yields (Fig. 3a). This result demonstrates that tandem T domains are not essential for bidirectional chain extension. Moreover, production of all analogs was affected equally, supporting the flux hypothesis.
Finally, we predicted that “E-like” thalassospiramides produced by Ttm arise from module 2 skipping, analogous to module 1b skipping in Ttc. Thus, if pass-back chain extension occurs through modules 5 and 3, C2 inactivation in Ttm would preserve production of all “E-like” analogs, as C2 is completely skipped in this model. While C1 inactivation abolished production of all thalassospiramides, C2 inactivation dramatically reduced and completely abolished production of thalassospiramides E (23) and E1 (24), respectively (Fig. 3b). This result confirms our hypothesis that most pass-back events occur between modules 4 and 2, as levels of “E-like” analogs were not preserved. Furthermore, we observed a greater than 100-fold increase in levels of thalassospiramide E4 (25), suggesting that the inability to pass growing chains back via C2 forces the assembly line to pass intermediates forward, resulting in enhanced production of the “premature” termination product E4 (25) (Fig. 3b). However, we also observed the formation of two new compounds not previously detected with HR-MS and MS/MS spectra consistent with the structures of thalassospiramides E5 (27) and E6 (28), indicating that pass-back between modules 5 and 3 can occur but correlates with additional substrate dehydration by module 4 (Fig. 3c, Supplementary Table 2).
Models for thalassospiramide biosynthesis.
The results of Ttm C2 inactivation provide a clear mechanistic insight into the flexibility and control of pass-back chain extension during thalassospiramide biosynthesis, as illustrated in Figure 4. Selective inactivation of C2 does not affect early stages of “E-like” thalassospiramide biosynthesis, during which A2 loads T1 with valine and then module 2 is skipped following appendage of the N-terminal fatty acid. Assembly then proceeds through modules 3 and 4, but the chain-extension intermediate does not undergo immediate dehydration and may not be directly accessible to DH4. Now, the assembly line would normally pass the chain-extension intermediate from module 4 back to module 2 via C2, which is favored based on product distribution, as thalassospiramide E4 (25) is normally produced at very low abundance. We propose that the assembly line C domains play an important role in “measuring” intermediate chain length, promoting donation of the module 4 intermediate backward to module 2 instead of forward to module 5. However, C2 inactivation forces the chain-extension intermediate forward, making it accessible to DH4 before it enters module 5. Once within module 5, the intermediate would normally proceed directly to module 6, resulting in formation of thalassospiramide E4 (25). Consistent with this proposal, C2 inactivation drives a substantial increase in levels of E4 (25) compared to wild-type. However, chain length can also be “measured” at the donor site of C6 (as C6 usually accepts donor substrates of longer chain length), prompting the assembly line to catalyze pass-back of a subset of intermediates that have already undergone dehydration from module 5 back to module 3, resulting in eventual formation of new products E5 (27) and E6 (28). 
This result provides additional evidence that the substrate becomes accessible to DH4 just as it is passed forward to module 5 and dehydration is a passive result of forward chain extension, since a model that invokes DH4 gating of module 5 entry would not be consistent with the observed increase in levels of E4 (25) or formation of E5 (27) and E6 (28). Furthermore, it suggests that biosynthesis is flexible and controlled by a mechanism that measures intermediate chain length. If we assume that dehydration of the hydroxyl group within the linear unit is a signature for module 5 progression, there is evidence that pass-back between modules 5 and 3 occurs at low frequency under normal conditions, perhaps as an additional checkpoint, as we can detect thalassospiramides E7 (29), E8 (30), and E9 (31) by LC-MS (Supplementary Table 2), which are all produced by wild-type Ttm and have undergone additional rounds of “premature” dehydration.
Fig. 4 ∣. Model for thalassospiramide biosynthesis by Ttm C2 inactivation mutant.
Valine is loaded onto T1 by A2 and C1 catalyzes addition of an activated fatty acid (FA) bound to an acyl carrier protein (ACP) or coenzyme A. Module 2 is skipped, and the fatty valine is extended directly onto serine-loaded T3 via C3. Chain extension proceeds normally to module 4, where the substrate is not immediately dehydrated and would normally be passed back to module 2 via C2, perhaps based on chain length. However, C2 inactivation forces the intermediate forward, where it becomes transiently accessible to DH4 and is dehydrated before extension onto T5, which is loaded with valine by A2. Direct progression through module 6 results in formation of thalassospiramide E4 (25), which is substantially increased as a result of C2 inactivation. Alternatively, passage from module 5 back to module 3 results in generation of new thalassospiramide analogs E5 (27) and E6 (28), which undergo one or two additional rounds of chain extension, respectively, through modules 3-5.
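The routing described in this model can be written out as a short sketch. The function below is a hypothetical illustration, not an established tool: it lists the order in which modules are visited in the Ttm C2-inactivated mutant for 0, 1, or 2 pass-back events from module 5 to module 3, and it counts one DH4 exposure for each forward transfer from module 4 to module 5, which is an assumption drawn from the model above.

```python
def ttm_c2_mutant_route(passbacks_5_to_3):
    """Module visit order and product for the Ttm C2-inactivated mutant (toy model)."""
    products = {0: "E4 (25)", 1: "E5 (27)", 2: "E6 (28)"}
    if passbacks_5_to_3 not in products:
        raise ValueError("the model describes 0, 1, or 2 pass-back events")
    route = [1]                       # Val on T1 (loaded by A2), fatty acid added by C1
    # Module 2 is skipped; each cycle runs through modules 3, 4, and 5.
    for _ in range(passbacks_5_to_3 + 1):
        route += [3, 4, 5]
    route.append(6)                   # final extension and release via module 6
    # ASSUMPTION: every forward 4 -> 5 transfer exposes the intermediate to DH4.
    dh4_events = sum(1 for a, b in zip(route, route[1:]) if (a, b) == (4, 5))
    return route, dh4_events, products[passbacks_5_to_3]


for n in range(3):
    route, dh4_events, product_name = ttm_c2_mutant_route(n)
    print(product_name, "->".join(map(str, route)), "DH4 events:", dh4_events)
```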
We can also propose a full model for thalassospiramide A (1) biosynthesis via Ttc (Fig. 5). T1a is preferentially adenylated with serine through the downstream A3 domain. To our knowledge, this is the first report of an A domain activating a carrier protein within an upstream module that already possesses its own active A domain12. Subsequently, module 1b is skipped during formation of “A1-like” analogs. Normal chain extension proceeds from modules 2 to 4, at which point we hypothesize that the substrate is sequestered from DH4 activity and module 5 entry based on chain length. We propose that during the formation of “B-like” thalassospiramides, ketoreduction to generate the upstream statine-like amino acid residue occurs at this time, analogous to dual ketoreduction of two disparate positions catalyzed by KR3 of PksJ during bacillaene biosynthesis20. The chain-extension intermediate within module 4 is then passed back to module 2 via C2, where it undergoes another round of linear chain extension through modules 2 to 4 (Fig. 5). Now, the intermediate has reached “sufficient” chain length and becomes accessible to DH4 before transfer to module 5 via C5. MT6 promiscuously methylates valine residues installed by module 5 to generate analogs such as thalassospiramide A8 (5). Finally, normal progression through modules 5 and 6 results in the formation of thalassospiramide A (1), the most abundant product of both Ttc and Ttm.
Fig. 5 ∣. Model for thalassospiramide A biosynthesis by Ttc.
Serine is loaded onto T1a by A3, and C1a catalyzes addition of an activated fatty acid (FA) bound to an ACP or coenzyme A. Module 1b is skipped, and the fatty serine is passed directly to valine-loaded T2 via C2. Chain extension proceeds normally to module 4, where the substrate is sequestered from DH4, perhaps within an enzyme binding pocket. The intermediate is passed from module 4 back to module 2 via C2, where it undergoes another round of chain extension through modules 3 and 4. Upon return to module 4, the intermediate is now acted upon by DH4, perhaps due to a conformational change associated with longer chain length, and extended forward through modules 5 and 6 to generate thalassospiramide A (1).
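For comparison with the wild-type Ttc model, the module order can be sketched in the same way. This is a hypothetical illustration under the assumptions stated in the model: pass-back occurs from module 4 to module 2, the number of pass-back events (0, 1, or 2) equals the number of linear Ser-C2-Val units, and module 1b is visited only for “B-like” analogs. The function and parameter names are illustrative.

```python
def ttc_route(n_passbacks=1, use_module_1b=False):
    """Module visit order for the Ttc assembly line (toy model)."""
    if n_passbacks not in (0, 1, 2):
        raise ValueError("the model describes 0, 1, or 2 pass-back events")
    route = ["1a"]                    # first residue on T1a, fatty acid added by C1a
    if use_module_1b:
        route.append("1b")            # ketide extension gives "B-like" analogs
    # Each cycle runs through modules 2-4; a pass-back returns from 4 to 2 via C2.
    for _ in range(n_passbacks + 1):
        route += ["2", "3", "4"]
    route += ["5", "6"]               # DH4 acts before forward transfer to module 5
    analog_class = "B-like" if use_module_1b else "A1-like"
    return route, analog_class, n_passbacks   # last value = linear Ser-C2-Val units


# Thalassospiramide A (1) in the model above: module 1b skipped, one pass-back.
route, analog_class, linear_units = ttc_route(n_passbacks=1, use_module_1b=False)
print(analog_class, "->".join(route), "linear units:", linear_units)
```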
Discussion
Modular assembly lines are large and dynamic enzymes that undergo dramatic domain conformational rearrangements during a single catalytic cycle21-24. However, whether multi-modular enzymes adopt a rigid, supermodular architecture25 or a more flexible configuration26 remains in debate. In this work, we demonstrate the ability of multi-modular assembly lines to catalyze bidirectional and nonlinear passage of chain-extension intermediates, favoring a more flexible arrangement.
Several new features of multi-modular NRPS/PKS biosynthesis are described in this work. Thalassospiramide assembly lines catalyze several instances of inter-module substrate activation and tailoring. While A domain supplementation has been previously reported6-10, prior examples have been limited to upstream domains supplementing downstream modules, often encoded on separate proteins, and only within modules that lack their own A domains. For Ttc, A3 adenylates T1a with serine more frequently than A1a does with phenylalanine or tyrosine, although A1a is active and participates in biosynthesis of numerous thalassospiramide analogs. For Ttm, both A2 and A3 deliver substrates to T1, but only A2 adenylates T5. Furthermore, only A2 loads T1 if module 2 is skipped, as all “E-like” analogs incorporate valine as the first amino acid, perhaps due to the specificity of C327. MT6 promiscuously methylates valine residues activated by A5 of Ttc and A2 of Ttm. Finally, the trans-AT interacting with module 4 can partially supplement lost AT1b activity in Ttc. Complementation of lost cis-AT activity with cis-ATs from other modules or non-cognate trans-ATs has been previously observed in the 6-deoxyerythronolide B synthase28.
Thalassospiramide assembly lines also catalyze programmed module skipping. Forward module skipping has been previously observed in PKS29-31 and NRPS32-34 systems, both naturally and as a byproduct of engineering. Ttc and Ttm catalyze analogous skipping of PKS module 1b and NRPS module 2, respectively, although NRPS module 2 does not sit at an enzyme junction but is embedded within a very large polypeptide. Perhaps as a result, skipping is favored in Ttc but disfavored in Ttm, based on product distribution.
Finally, thalassospiramide assembly lines catalyze pass-back chain extension. To our knowledge, this mechanism has not been previously reported in modular assembly line systems, although it is analogous to “iteration” observed in the fungal beauvericin and bassianolide synthetases35, where intermediates are passed back and forth between adjacent modules. Consistent with previous findings, tandem T domains in thalassospiramide assembly lines increase flux19 but are not mechanistic determinants of pass-back chain extension. This result is also consistent with the observation that the thalassospiramide assembly line from Oceanibaculum pacificum contains only a single T domain in module 4 and can still produce thalassospiramide A (1)13. There is no evidence that additional T domains promote “stalling” to allow for additional tailoring reactions, as having a single T domain in module 4 does not change product distribution but decreases levels of all analogs equally.
Under normal conditions, thalassospiramide intermediates are passed from module 4 back to module 2 via C2. Upon C2 inactivation, we observe pass-back chain extension from module 5 back to module 3 via C3. This suggests that the assembly line is flexible and can respond to perturbation, trying to “correct” for aberrant intermediate chain length caused by C2 inactivation by passing back through C3, which, like C6, is specific for accepting intermediates with C-terminal valine residues. Our results are consistent with recent findings that multi-modular NRPSs can be “mixed and matched” if the specificity and relative position of downstream C domains are maintained27. It also suggests that assembly lines must possess some symmetry for pass-back chain extension to occur. While all NRPS modules possess mechanisms to control the timing of chain extension to prevent misinitiation36,37, thalassospiramide modules have the added ability to accept longer intermediates from nonsequential modules.
Perhaps more importantly, how physical proximity between upstream C domains and downstream T domains is achieved during pass-back chain extension remains unclear. We cannot predict the plasticity of the hybrid thalassospiramide assembly line proteins based on their primary amino acid sequence alone (Supplementary Fig. 8). We can assume that TtcA, TtcB, and TtmA are homodimeric due to the dimeric nature of KS and many linker domains, but we do not know whether oligomerization impacts thalassospiramide assembly. It is nonetheless tempting to suggest that higher-order architecture helps facilitate nonlinear transfer.
It is curious that although the early stages of thalassospiramide biosynthesis are flexible, resulting in production of lipopeptides with a high degree of N-terminal structural diversity, the final stages are rather fixed. The C-terminal cyclic depsipeptide core of all thalassospiramide analogs is highly conserved, particularly the 12-membered ring and the α,β-unsaturated carbonyl moiety that together form the pharmacophore responsible for calpain protease inhibition38. Thus, the assembly line constructs a series of chemical products in which the structural elements that confer activity are maintained, while accessory elements such as the fatty acid and linear chain composition and length, which may confer target specificity, are variable. It has been speculated that biosynthetic promiscuity resulting in chemical diversity may be evolutionarily advantageous39,40. Furthermore, the type of combinatorial biosynthesis observed in this study expands the portfolio of small molecules produced without introducing new assembly line modules or domains, or even additional tailoring enzymes, thus representing a means of expanding chemical diversity while minimizing genomic space.
Our work was facilitated by new methodology combining oligo recombineering with CRISPR-Cas9 counter selection for facile editing of large DNA constructs. Although our method was used solely for domain inactivation in this study, it can be easily applied to perform other forms of multi-modular assembly line engineering, for example to alter A domain specificity41,42. Future efforts to understand the specific elements that enable and control assembly line flexibility will hopefully enhance efforts to engineer NRPS/PKS proteins and possibly expand their biosynthetic repertoire.
Methods
General methods.
A complete list of the primers, plasmids, and strains used in this study can be found in Supplementary Table 3. DNA fragments larger than 3 kb were amplified with PrimeSTAR Max (Clontech Laboratories, Inc.); all other PCR products were amplified with PrimeSTAR HS DNA polymerase (Clontech Laboratories, Inc.). DNA isolations and manipulations were carried out using standard protocols. Thalassospira sp. CNJ-328 and T. mobilis KA081020-065 were grown in GYP media (glucose 10 g/L, yeast extract 4 g/L, peptone 2 g/L, sea salt 25 g/L). S. cerevisiae VL6-48N was grown in YPDA media (yeast extract 10 g/L, peptone 20 g/L, dextrose 20 g/L, adenine 100 mg/L) or selective histidine drop-out media containing 5-FOA (yeast nitrogen base without amino acids and ammonium sulfate 1.7 g/L, yeast synthetic dropout medium without histidine 1.9 g/L, sorbitol 182 g/L, dextrose 20 g/L, ammonium sulfate 5 g/L, adenine 100 mg/L, 5-FOA 1 g/L). E. coli and P. putida strains were grown in LB. E. coli TOP10 and DH5α λpir were used for standard cloning procedures. E. coli BW25113/pIJ790 was used for λ Red PCR targeting, and E. coli HME68 was used for oligo recombineering and CRISPR-Cas9 counter selection. P. putida EM383 was used for heterologous expression. All strains were grown at 30 °C except TOP10 and DH5α λpir, which were grown at 37 °C. Liquid cultures were grown with shaking at 220 r.p.m. When necessary, E. coli (and P. putida) cultures were supplemented with the following antibiotics: 50 μg/mL kanamycin (150 μg/mL for P. putida), 10 μg/mL gentamicin (30 μg/mL for P. putida), 50 μg/mL apramycin, 100 μg/mL ampicillin, 25 μg/mL chloramphenicol.
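The per-litre recipes above scale linearly with batch volume. The following is a minimal Python sketch of that arithmetic, using the GYP and YPDA compositions as stated; the function name, the dictionary layout, and the 500 mL batch size are illustrative assumptions and not part of the published protocol.

```python
# Per-litre recipes (g/L) copied from the media descriptions above.
# Adenine is listed as 100 mg/L in the text, i.e. 0.1 g/L.
RECIPES_G_PER_L = {
    "GYP": {"glucose": 10.0, "yeast extract": 4.0, "peptone": 2.0, "sea salt": 25.0},
    "YPDA": {"yeast extract": 10.0, "peptone": 20.0, "dextrose": 20.0, "adenine": 0.1},
}


def grams_needed(medium, volume_ml):
    """Scale a per-litre recipe to volume_ml and return grams per component."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    recipe = RECIPES_G_PER_L[medium]
    return {name: g_per_l * volume_ml / 1000.0 for name, g_per_l in recipe.items()}


# Hypothetical batch size, chosen only for illustration
gyp_500 = grams_needed("GYP", 500)
```

For the hypothetical 500 mL batch, `gyp_500` holds 5 g glucose, 2 g yeast extract, 1 g peptone, and 12.5 g sea salt.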
Cloning and heterologous expression of ttc and ttm.
The overall workflow for genetic manipulation and heterologous expression of the ttc and ttm pathways is outlined in Supplementary Fig. 1. Biosynthetic gene clusters were cloned from genomic DNA using a TAR cloning protocol described previously43. Cluster-specific capture vectors were generated through a one-step PCR amplification of pCAP-BAC (pCB) using primers pCB-ttcCV_F/R for ttc and pCB-ttmCV_F/R for ttm. Yeast clones were screened, and PCR-positive constructs were purified and transferred to E. coli TOP10 for verification by restriction digestion. pCB was miniprepped from at least 25 mL of VL6-48N culture and at least 10 mL of TOP10 culture. Expression was attempted but never achieved in E. coli strains, including BL21(DE3) and BAP144 (Supplementary Fig. 2). pJZ001, containing the intB13 cassette, was assembled by Gibson Assembly using four PCR fragments (amplified using primers CEN6/ARS4_608F/R, intB13_2330F/R, aacC1_1257F/R, and pADH_597F/R) and the pACYCDuet-1 vector backbone, linearized using HindIII and XhoI. Fully assembled pJZ001 was digested using HindIII and XhoI, and the 5155 bp fragment was gel purified and used for λ Red recombination45 to knock the intB13 cassette into the pCB vector backbone, replacing several yeast genes that were no longer necessary. Retrofitted constructs were then transferred to P. putida by electroporation as described previously16 and selected for using kanamycin and gentamicin. IntB13-mediated integration of ttc into the genome of P. putida was characterized by PCR and chemical analysis (Supplementary Fig. 3). Edited constructs were similarly introduced to P. putida and then tested for heterologous production of lipopeptide products.
Extraction and LC-MS analysis.
Precultures were inoculated with colonies picked from plates and grown overnight before being inoculated into full 50 mL cultures in 250 mL Erlenmeyer flasks (in triplicate). Full cultures were grown for 5 hours before addition of 1.5 g of autoclaved XAD7HP resin per 50 mL of culture. After 24 hours of additional incubation, culture OD600 values were measured and recorded, and the supernatant and cells were decanted. Resin was washed three times with Milli-Q water before extraction with 20 mL of ethyl acetate. Extracts were dried under nitrogen, resuspended in 200 μL of methanol, and filtered through a 0.22 μm filter prior to LC-MS/MS analysis.
An Agilent 1100 series HPLC system (Palo Alto, CA, U.S.A.) was coupled to a Bruker Impact II Q-TOF mass spectrometer (Billerica, MA, U.S.A.) for LC-MS analysis. An Agilent ZORBAX 300SB-C18 LC column (300 Å, 5 μm, 150 × 0.5 mm) was used for LC separation. Mobile phase A was H2O with 0.1% formic acid (FA) and mobile phase B was acetonitrile (ACN) with 0.1% FA. The LC gradient was: t=0.00 min, 70%A; t=3.00 min, 70%A; t=5.00 min, 57%A; t=35.00 min, 57%A; t=49.00 min, 20%A; t=51.00 min, 0%A; t=52.00 min, 0%A; t=57.00 min, 70%A; t=60.00 min, 70%A. A post time of 10 min was set to re-equilibrate the column. For shorter runs, the LC gradient was: t=0.00 min, 70%A; t=3.00 min, 70%A; t=23.00 min, 20%A; t=24.00 min, 0%A; t=27.00 min, 70%A; t=30.00 min, 70%A. A post time of 3 min was set to re-equilibrate the column. Flow rate was 20 μL/min. Sample injection volume was 2 μL.
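The two gradient programs can be read as piecewise-linear functions of time. The following is a minimal Python sketch that encodes the breakpoints listed above and returns the mobile phase composition at any time point. It assumes linear ramps between consecutive breakpoints, which is the usual pump behavior but is not stated explicitly in the text; the names `LONG_GRADIENT`, `SHORT_GRADIENT`, and `percent_a` are illustrative.

```python
from bisect import bisect_right

# (time in min, % mobile phase A) breakpoints copied from the text above
LONG_GRADIENT = [
    (0.0, 70.0), (3.0, 70.0), (5.0, 57.0), (35.0, 57.0), (49.0, 20.0),
    (51.0, 0.0), (52.0, 0.0), (57.0, 70.0), (60.0, 70.0),
]
SHORT_GRADIENT = [
    (0.0, 70.0), (3.0, 70.0), (23.0, 20.0), (24.0, 0.0), (27.0, 70.0), (30.0, 70.0),
]


def percent_a(gradient, t):
    """Return %A at time t (min), assuming linear ramps between breakpoints."""
    times = [point[0] for point in gradient]
    if t < times[0] or t > times[-1]:
        raise ValueError("time outside programmed gradient")
    i = bisect_right(times, t)
    if i == len(times):
        # t is exactly the final breakpoint
        return gradient[-1][1]
    t0, a0 = gradient[i - 1]
    t1, a1 = gradient[i]
    return a0 + (a1 - a0) * (t - t0) / (t1 - t0)


def percent_b(gradient, t):
    """%B is the remainder of the binary mixture."""
    return 100.0 - percent_a(gradient, t)
```

Under the linear-ramp assumption, `percent_a(LONG_GRADIENT, 4.0)` evaluates to 63.5, halfway through the ramp from 70%A to 57%A.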
MS conditions for MS/MS spectra generation were set as follows: capillary voltage, 4500 V; nebulizer gas flow, 0.8 bar; dry gas, 5.0 L/min at 180 °C; funnel 1 RF, 150 Vpp; funnel 2 RF, 300 Vpp; isCID energy, 0 eV; hexapole RF, 50 Vpp; quadrupole ion energy, 4 eV; low mass, 50 m/z; collision cell energy, 20–50 eV; pre-pulse storage, 5.0 μs; collision RF, ramp from 350 to 800 Vpp; transfer time, ramp from 50 to 100 μs; detection mass range, 25 to 1000 m/z; MS/MS spectra collection rate, 2.0 Hz.
All samples that were compared with one another were run at the same time and under the same conditions. Values were normalized by culture ODs and compared only for peaks with identical MS spectra and retention time. HR-MS data for thalassospiramide analogs analyzed in this study are provided in Supplementary Table 2 and Supplementary Figs. 12–24.
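The OD normalization described above can be sketched in a few lines of Python. This is an illustration only: it assumes that the compared "values" are integrated peak areas and that each replicate is divided by its own culture's OD600. The function names and the triplicate numbers are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def normalize_by_od(peak_areas, od600_values):
    """Divide each replicate's peak area by that culture's OD600."""
    if len(peak_areas) != len(od600_values):
        raise ValueError("need one OD600 reading per replicate")
    if any(od <= 0 for od in od600_values):
        raise ValueError("OD600 must be positive")
    return [area / od for area, od in zip(peak_areas, od600_values)]


def summarize(normalized):
    """Mean and sample standard deviation across replicates."""
    return mean(normalized), stdev(normalized)


# Hypothetical triplicate values, for illustration only (not data from the study)
areas = [1200.0, 1500.0, 900.0]
ods = [2.0, 2.5, 1.5]
normalized = normalize_by_od(areas, ods)
avg, sd = summarize(normalized)
```

In this made-up example the raw areas differ by up to 1.7-fold, but all three replicates normalize to the same value, which shows why comparing raw areas across cultures of different density could be misleading.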
Gene deletion and complementation experiments.
Gene deletions were made using λ Red PCR targeting as described previously46. Primers used to amplify the aac(3)IV cassette and confirm gene deletions are listed in Supplementary Table 3. Deletions using this cassette were made after addition of the intB13 cassette, which contains a gentamicin resistance gene, as the apramycin resistance gene aac(3)IV confers resistance to gentamicin in E. coli. For complementation, ttcD and ttcC were amplified using primers Tn7-ttcD_F/R and Tn7-ttcC_F/R, respectively, and cloned into the mini-Tn7 vector pUC18R6K-mini-Tn7T-Gm47. Cloned constructs were introduced to P. putida by electroporation16 along with the helper plasmid pTNS147 and selected for using gentamicin. Complemented P. putida clones were then made electrocompetent and pCB constructs were transferred by electroporation and selected for using kanamycin and gentamicin. For complementation of TtcC-G234D (MT6 inactivation), the mutation was generated by amplification of pTn7::ttcC using primers Tn7-ttcC-g702a_F/R and confirmed by sequencing. Although a ttcA deletion construct was generated and a mini-Tn7 vector containing ttcA was prepared, the latter ultimately could not be transferred to P. putida for complementation, as no clones were obtained even after multiple attempts, likely due to the large size of the gene (>6.5 kb).
Inactivation and testing of assembly line enzymatic domains.
Assembly line domain active sites are shown in Supplementary Fig. 8. Motifs and active sites were identified by sequence alignment against annotated NRPS/PKS domains22-24,48-55. pJZ002, an ampicillin-resistant version of the pCas956 vector, was constructed as follows. First, the BsaI restriction site was removed from the ampicillin resistance gene bla via PCR amplification of pKD20 using primers pKD20-g848a_F/R. The resulting construct was PCR amplified using primers ts-repA101_F/R and combined with a fragment amplified from pCas9 using primers pCas9_5058F/R by Gibson assembly. Spacer sequences were cloned into pJZ002 as described previously56. Spacer sequences and targeting oligos used to target specific domains are listed in Supplementary Table 3. Oligo recombination and CRISPR-Cas9 counter selection were performed as described previously56,57, with several modifications. The general workflow is shown in Supplementary Fig. 10. pCB constructs were first transferred to E. coli HME68 by electroporation and selected for using kanamycin. A transformant was picked and grown at 30 °C to OD600 0.4-0.5 and heat shocked for 15 minutes at 42 °C in a shaking water bath before being chilled on ice for 10 minutes. The cells were pelleted and washed with ice-cold water before being resuspended in a small volume of ice-cold water. 100 ng of pJZ002 containing the appropriate spacer was mixed with 100 ng of targeting oligo and the DNA mixture was added to the cells prior to electroporation at 2.5 kV in a 2 mm gap electroporation cuvette. Cells were recovered for 2 hours at 30 °C with shaking and plated on LB with kanamycin and ampicillin. Four clones of each mutant were picked, miniprepped, and screened by sequencing. If no correct mutant was identified, a new spacer sequence was designed and cloned into pJZ002 and the method was retried.
Very subtle mutations could be recovered efficiently using an effective spacer sequence, although the effectiveness of the spacer could only be determined empirically. In total, four of 12 spacer sequences were redesigned to achieve successful editing (Supplementary Table 3). Correctly edited constructs were transferred to TOP10 and confirmed by restriction digestion (Supplementary Fig. 11) before being transferred to P. putida for heterologous expression.
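Redesigning a spacer amounts to picking a different protospacer near the intended edit. The following Python sketch lists candidate protospacers on the forward strand only, assuming the NGG PAM recognized by S. pyogenes Cas9. The spacer length, the search window, the function name, and the toy sequence are hypothetical illustrations and do not reflect the design rules or sequences used in this study; as noted above, spacer effectiveness could only be determined empirically.

```python
import re


def find_protospacers(seq, edit_pos, spacer_len=20, window=30):
    """List forward-strand protospacers followed by an NGG PAM near edit_pos.

    seq: DNA string; edit_pos: 0-based index of the intended mutation.
    spacer_len and window are illustrative defaults, not values from the study.
    Returns (protospacer_start, protospacer, pam) tuples.
    """
    seq = seq.upper()
    hits = []
    # A lookahead lets overlapping PAMs (e.g. within "AGGG") all be reported
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - spacer_len
        if start < 0:
            # Not enough upstream sequence for a full-length protospacer
            continue
        if abs(pam_start - edit_pos) <= window:
            hits.append((start, seq[start:pam_start], seq[pam_start:pam_start + 3]))
    return hits


# Made-up toy sequence: 20 nt of repeat, one TGG PAM, then filler
toy = "ACGT" * 5 + "TGG" + "CCCC"
candidates = find_protospacers(toy, edit_pos=22)
```

A fuller search would also scan the reverse complement and check that the edit actually disrupts the protospacer or PAM, so that unedited constructs remain targets for Cas9 cleavage.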
Data availability
The ttc and ttm biosynthetic gene cluster sequences are available in the MIBiG database (accession BGC0001050 and BGC0001876). Plasmids pCAP-BAC (#120229), pJZ001 (#120230), and pJZ002 (#120231) are available at Addgene.
Acknowledgements
The authors thank J.R. van der Meer (University of Lausanne) for providing plasmid pRMR6K-Gm, V. de Lorenzo (National Center for Biotechnology-CSIC) for providing strain P. putida EM383, and D.L. Court (National Cancer Institute, NIH) for providing strain E. coli HME68. We are grateful to A. Edlund (J. Craig Venter Institute), P.Y. Qian (Hong Kong University of Science and Technology), H. Xia (Shanghai Institutes for Biological Sciences, CAS), and W. Fenical and P.R. Jensen (Scripps Institution of Oceanography, UCSD) for facilitating access to equipment, chemical standards, and bacterial strains. We also thank Y. Kudo, P.A. Jordan, J.R. Chekan, L.T. Hoang, and S. Carreto for helpful discussion and technical assistance. The research reported has been supported by National Institutes of Health grants F31-AI129299 to J.J.Z. and R01-GM085770 to B.S.M.
Footnotes
Competing financial interests
The authors declare no competing financial interests.
References
- 1. Weissman KJ. The structural biology of biosynthetic megaenzymes. Nat. Chem. Biol. 11, 660–670, doi: 10.1038/nchembio.1883 (2015).
- 2. Cane DE. Programming of erythromycin biosynthesis by a modular polyketide synthase. J. Biol. Chem. 285, 27517–27523, doi: 10.1074/jbc.R110.144618 (2010).
- 3. Robbel L & Marahiel MA. Daptomycin, a bacterial lipopeptide synthesized by a nonribosomal machinery. J. Biol. Chem. 285, 27501–27508, doi: 10.1074/jbc.R110.128181 (2010).
- 4. Helfrich EJ & Piel J. Biosynthesis of polyketides by trans-AT polyketide synthases. Nat. Prod. Rep. 33, 231–316, doi: 10.1039/c5np00125k (2016).
- 5. Sussmuth RD & Mainz A. Nonribosomal peptide synthesis-principles and prospects. Angew. Chem. Int. Ed. Engl. 56, 3770–3821, doi: 10.1002/anie.201609079 (2017).
- 6. Magarvey NA, Haltli B, He M, Greenstein M & Hucul JA. Biosynthetic pathway for mannopeptimycins, lipoglycopeptide antibiotics active against drug-resistant gram-positive pathogens. Antimicrob. Agents Chemother. 50, 2167–2177, doi: 10.1128/AAC.01545-05 (2006).
- 7. Felnagle EA, Rondon MR, Berti AD, Crosby HA & Thomas MG. Identification of the biosynthetic gene cluster and an additional gene for resistance to the antituberculosis drug capreomycin. Appl. Environ. Microbiol. 73, 4162–4170, doi: 10.1128/AEM.00485-07 (2007).
- 8. Thomas MG, Chan YA & Ozanick SG. Deciphering tuberactinomycin biosynthesis: isolation, sequencing, and annotation of the viomycin biosynthetic gene cluster. Antimicrob. Agents Chemother. 47, 2823–2830 (2003).
- 9. Du L, Sanchez C, Chen M, Edwards DJ & Shen B. The biosynthetic gene cluster for the antitumor drug bleomycin from Streptomyces verticillus ATCC15003 supporting functional interactions between nonribosomal peptide synthetases and a polyketide synthase. Chem. Biol. 7, 623–642 (2000).
- 10. Gehring AM et al. Iron acquisition in plague: modular logic in enzymatic biogenesis of yersiniabactin by Yersinia pestis. Chem. Biol. 5, 573–586 (1998).
- 11. Oh DC, Strangman WK, Kauffman CA, Jensen PR & Fenical W. Thalassospiramides A and B, immunosuppressive peptides from the marine bacterium Thalassospira sp. Org. Lett. 9, 1525–1528, doi: 10.1021/ol070294u (2007).
- 12. Ross AC et al. Biosynthetic multitasking facilitates thalassospiramide structural diversity in marine bacteria. J. Am. Chem. Soc. 135, 1155–1162, doi: 10.1021/ja3119674 (2013).
- 13. Zhang W et al. Family-wide structural characterization and genomic comparisons decode the diversity-oriented biosynthesis of thalassospiramides by marine Proteobacteria. J. Biol. Chem. 291, 27228–27238, doi: 10.1074/jbc.M116.756858 (2016).
- 14. Miyazaki R & van der Meer JR. A new large-DNA-fragment delivery system based on integrase activity from an integrative and conjugative element. Appl. Environ. Microbiol. 79, 4440–4447, doi: 10.1128/AEM.00711-13 (2013).
- 15. Martinez-Garcia E, Nikel PI, Aparicio T & de Lorenzo V. Pseudomonas 2.0: genetic upgrading of P. putida KT2440 as an enhanced host for heterologous gene expression. Microb. Cell Fact. 13, 159, doi: 10.1186/s12934-014-0159-3 (2014).
- 16. Zhang JJ, Tang X, Zhang M, Nguyen D & Moore BS. Broad-host-range expression reveals native and host regulatory elements that influence heterologous antibiotic production in Gram-negative bacteria. MBio 8, e01291, doi: 10.1128/mBio.01291-17 (2017).
- 17. Labby KJ, Watsula SG & Garneau-Tsodikova S. Interrupted adenylation domains: unique bifunctional enzymes involved in nonribosomal peptide biosynthesis. Nat. Prod. Rep. 32, 641–653, doi: 10.1039/c4np00120f (2015).
- 18. Rausch C, Hoof I, Weber T, Wohlleben W & Huson DH. Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evol. Biol. 7, 78, doi: 10.1186/1471-2148-7-78 (2007).
- 19. Crosby J & Crump MP. The structural role of the carrier protein--active controller or passive carrier. Nat. Prod. Rep. 29, 1111–1137, doi: 10.1039/c2np20062g (2012).
- 20. Calderone CT, Bumpus SB, Kelleher NL, Walsh CT & Magarvey NA. A ketoreductase domain in the PksJ protein of the bacillaene assembly line carries out both alpha- and beta-ketone reduction during chain growth. Proc. Natl. Acad. Sci. U S A 105, 12809–12814, doi: 10.1073/pnas.0806305105 (2008).
- 21. Whicher JR et al. Structural rearrangements of a polyketide synthase module during its catalytic cycle. Nature 510, 560–564, doi: 10.1038/nature13409 (2014).
- 22. Drake EJ et al. Structures of two distinct conformations of holo-non-ribosomal peptide synthetases. Nature 529, 235–238, doi: 10.1038/nature16163 (2016).
- 23. Reimer JM, Aloise MN, Harrison PM & Schmeing TM. Synthetic cycle of the initiation module of a formylating nonribosomal peptide synthetase. Nature 529, 239–242, doi: 10.1038/nature16503 (2016).
- 24. Miller BR, Drake EJ, Shi C, Aldrich CC & Gulick AM. Structures of a Nonribosomal Peptide Synthetase Module Bound to MbtH-like Proteins Support a Highly Dynamic Domain Architecture. J. Biol. Chem. 291, 22559–22571, doi: 10.1074/jbc.M116.746297 (2016).
- 25. Marahiel MA. A structural model for multimodular NRPS assembly lines. Nat. Prod. Rep. 33, 136–140, doi: 10.1039/c5np00082c (2016).
- 26. Tarry MJ, Haque AS, Bui KH & Schmeing TM. X-ray crystallography and electron microscopy of cross- and multi-module nonribosomal peptide synthetase proteins reveal a flexible architecture. Structure 25, 783–793 e784, doi: 10.1016/j.str.2017.03.014 (2017).
- 27. Bozhuyuk KAJ et al. De novo design and engineering of non-ribosomal peptide synthetases. Nat. Chem. 10, 275–281, doi: 10.1038/nchem.2890 (2018).
- 28. Dunn BJ, Watts KR, Robbins T, Cane DE & Khosla C. Comparative analysis of the substrate specificity of trans- versus cis-acyltransferases of assembly line polyketide synthases. Biochemistry 53, 3796–3806, doi: 10.1021/bi5004316 (2014).
- 29. Awakawa T et al. Salinipyrone and pacificanone are biosynthetic by-products of the rosamicin polyketide synthase. Chembiochem 16, 1443–1447, doi: 10.1002/cbic.201500177 (2015).
- 30. Moss SJ, Martin CJ & Wilkinson B. Loss of co-linearity by modular polyketide synthases: a mechanism for the evolution of chemical diversity. Nat. Prod. Rep. 21, 575–593, doi: 10.1039/b315020h (2004).
- 31. Thomas I, Martin CJ, Wilkinson CJ, Staunton J & Leadlay PF. Skipping in a hybrid polyketide synthase. Evidence for ACP-to-ACP chain transfer. Chem. Biol. 9, 781–787 (2002).
- 32. Wenzel SC, Meiser P, Binz TM, Mahmud T & Muller R. Nonribosomal peptide biosynthesis: point mutations and module skipping lead to chemical diversity. Angew. Chem. Int. Ed. Engl. 45, 2296–2301, doi: 10.1002/anie.200503737 (2006).
- 33. Mootz HD et al. Decreasing the ring size of a cyclic nonribosomal peptide antibiotic by in-frame module deletion in the biosynthetic genes. J. Am. Chem. Soc. 124, 10980–10981 (2002).
- 34. Gao L et al. Module and individual domain deletions of NRPS to produce plipastatin derivatives in Bacillus subtilis. Microb. Cell Fact. 17, 84, doi: 10.1186/s12934-018-0929-4 (2018).
- 35. Yu D, Xu F, Zhang S & Zhan J. Decoding and reprogramming fungal iterative nonribosomal peptide synthetases. Nat. Commun. 8, 15349, doi: 10.1038/ncomms15349 (2017).
- 36. Linne U & Marahiel MA. Control of directionality in nonribosomal peptide synthesis: role of the condensation domain in preventing misinitiation and timing of epimerization. Biochemistry 39, 10439–10447 (2000).
- 37. Belshaw PJ, Walsh CT & Stachelhaus T. Aminoacyl-CoAs as probes of condensation domain selectivity in nonribosomal peptide synthesis. Science 284, 486–489 (1999).
- 38. Lu L et al. Mechanism of action of thalassospiramides, a new class of calpain inhibitors. Sci. Rep. 5, 8783, doi: 10.1038/srep08783 (2015).
- 39. Fischbach MA & Clardy J. One pathway, many products. Nat. Chem. Biol. 3, 353–355, doi: 10.1038/nchembio0707-353 (2007).
- 40. Firn RD & Jones CG. Natural products--a simple model to explain chemical diversity. Nat. Prod. Rep. 20, 382–391 (2003).
- 41. Davidsen JM & Townsend CA. In vivo characterization of nonribosomal peptide synthetases NocA and NocB in the biosynthesis of nocardicin A. Chem. Biol. 19, 297–306, doi: 10.1016/j.chembiol.2011.10.020 (2012).
- 42. Thirlway J et al. Introduction of a non-natural amino acid into a nonribosomal peptide antibiotic by modification of adenylation domain specificity. Angew. Chem. Int. Ed. Engl. 51, 7181–7184, doi: 10.1002/anie.201202043 (2012).
- 43. Zhang JJ, Yamanaka K, Tang X & Moore BS. Direct cloning and heterologous expression of natural product biosynthetic gene clusters by transformation-associated recombination. Methods Enzymol. 621, 87–110, doi: 10.1016/bs.mie.2019.02.026 (2019).
- 44. Zhang H, Fang L, Osburne MS & Pfeifer BA. The Continuing Development of E. coli as a Heterologous Host for Complex Natural Product Biosynthesis. Methods Mol. Biol. 1401, 121–134, doi: 10.1007/978-1-4939-3375-4_8 (2016).
- 45. Datsenko KA & Wanner BL. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. U S A 97, 6640–6645, doi: 10.1073/pnas.120163297 (2000).
- 46. Tang X et al. Identification of thiotetronic acid antibiotic biosynthetic pathways by target-directed genome mining. ACS Chem. Biol. 10, 2841–2849, doi: 10.1021/acschembio.5b00658 (2015).
- 47. Choi KH & Schweizer HP. mini-Tn7 insertion in bacteria with single attTn7 sites: example Pseudomonas aeruginosa. Nat. Protoc. 1, 153–161, doi: 10.1038/nprot.2006.24 (2006).
- 48. Davison J et al. Insights into the function of trans-acyl transferase polyketide synthases from the SAXS structure of a complete module. Chemical Science 5, 3081–3095, doi: 10.1039/c3sc53511h (2014).
- 49. Tanovic A, Samel SA, Essen LO & Marahiel MA. Crystal structure of the termination module of a nonribosomal peptide synthetase. Science 321, 659–663, doi: 10.1126/science.1159850 (2008).
- 50. Dutta S et al. Structure of a modular polyketide synthase. Nature 510, 512–517, doi: 10.1038/nature13423 (2014).
- 51. Edwards AL, Matsui T, Weiss TM & Khosla C. Architectures of whole-module and bimodular proteins from the 6-deoxyerythronolide B synthase. J. Mol. Biol. 426, 2229–2245, doi: 10.1016/j.jmb.2014.03.015 (2014).
- 52. Keatinge-Clay A. Crystal structure of the erythromycin polyketide synthase dehydratase. J. Mol. Biol. 384, 941–953, doi: 10.1016/j.jmb.2008.09.084 (2008).
- 53. Akey DL et al. Crystal structures of dehydratase domains from the curacin polyketide biosynthetic pathway. Structure 18, 94–105, doi: 10.1016/j.str.2009.10.018 (2010).
- 54. Al-Mestarihi AH et al. Adenylation and S-methylation of cysteine by the bifunctional enzyme TioN in thiocoraline biosynthesis. J. Am. Chem. Soc. 136, 17350–17354, doi: 10.1021/ja510489j (2014).
- 55. Miller BR, Sundlov JA, Drake EJ, Makin TA & Gulick AM. Analysis of the linker region joining the adenylation and carrier protein domains of the modular nonribosomal peptide synthetases. Proteins 82, 2691–2702, doi: 10.1002/prot.24635 (2014).
- 56. Jiang W, Bikard D, Cox D, Zhang F & Marraffini LA. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233–239, doi: 10.1038/nbt.2508 (2013).
- 57. Sawitzke JA et al. Recombineering: highly efficient in vivo genetic engineering using single-strand oligos. Methods Enzymol. 533, 157–177, doi: 10.1016/B978-0-12-420067-8.00010-6 (2013).