Abstract
Many natural products contain the hexahydropyrrolo[2, 3-b]indole (HPI) framework. HPI-containing compounds exhibit various biological activities and distinctive structural arrangements. This structural complexity renders chemical synthesis very challenging. Here, through investigating the biosynthesis of a naturally occurring C3-aryl HPI, naseseazine C (NAS-C), we identify a P450 enzyme (NascB) and reveal that NascB catalyzes a radical cascade reaction to form intramolecular and intermolecular carbon–carbon bonds with both regio- and stereo-specificity. Surprisingly, this specificity allows limited freedom, so that four types of C3-aryl HPI scaffolds are generated, two of which were not previously observed. By incorporating NascB into an engineered strain of E. coli, we develop a whole-cell biocatalysis system for efficient production of NAS-C and 30 NAS analogs. Interestingly, we find that some of these analogs exhibit potent neuroprotective properties. Thus, our biocatalytic methodology offers an efficient and simple route to generate difficult HPI framework-containing chemicals.
The hexahydropyrrolo[2, 3-b]indole (HPI) framework is found in many natural products. Here, the authors discover a P450 enzyme and develop a whole-cell biocatalysis system that produces the HPI naseseazine C (NAS-C) and 30 NAS-C analogs, several of which show neuroprotective properties.
Introduction
A common heterocyclic motif observed in a large number of alkaloids and synthetic compounds is hexahydropyrrolo[2, 3-b]indole (HPI), usually referred to as pyrroloindoline. Pyrroloindoline-containing natural products exhibit a broad array of biological properties, ranging from anticancer and antibacterial activities to the inhibition of cholinesterase1. Naturally occurring C3-aryl pyrroloindolines are mostly found in the fungal-sourced, tryptophan-based homodimeric diketopiperazines (DKPs), in which two pyrroloindoline units are joined through a C3–C3′ bond (Fig. 1)1–3. Unlike the homodimeric DKPs, which are abundant (>100 members)1–3, only five heterodimeric DKPs have been identified so far, three of which are naseseazine A, B, and C (NAS-A, B, and C) produced by a bacterial system (Fig. 1)4–6. NAS-A is identical to NAS-B except that the L-Ala in NAS-A is replaced by L-Pro in NAS-B. In NAS-B and C, the identical pyrroloindoline and DKP moieties are connected in two different ways: (i) the C3-aryl pyrroloindoline framework is formed through a C3–C7′ bond with 2R-3S stereo-configuration (NAS-B); (ii) the connection is formed through a C3–C6′ bond with 2S-3R chirality (NAS-C).
The characteristic molecular architecture and promising medicinal value of these products have garnered extensive interest, particularly with respect to efforts to develop a variety of chemical methods for enantioselective synthesis of pyrroloindoline-containing products7. However, the regio- and stereo-specificity in the densely functionalized frameworks of NASs, especially at the quaternary stereocenter at the C3 position that includes an aryl substituent, requires tremendous effort in chemical preparation through organic synthesis8,9. This feature severely impedes an assessment of the structural diversity and associated biological activities of NASs. In the chemical synthesis of NASs, regio-specificity was achieved by pre-installation of directing groups in both the pyrroloindoline and DKP moieties, resulting in lengthy synthetic routes and very low yields7,9. The only stereo-configuration accomplished is C2R-C3S, which is induced by the C11 stereocenter (derived from Cα of tryptophan) (Fig. 1)7. However, this induction from the C11 stereocenter is not able to generate the unfavored C2S-C3R configuration in NAS-C. So far, no successful strategies have been reported to chemically synthesize NAS-C. As a practical alternative to chemical synthesis of NASs, two Streptomyces strains have been reported to produce NAS-A, -B, and -C4–6. The biosynthesis of NAS-A, -B, and -C thus may provide an enzyme-catalyzed route for the generation of NASs through a stereo- and regio-chemically defined reaction.
Here, we unveil the biosynthesis of NAS-C and discover a key P450 enzyme, which catalyzes a highly regio- and stereo-selective C3-aryl bond-forming step to generate NAS-C. We further incorporate this P450 enzyme into a highly efficient whole-cell biocatalysis system. This engineered whole-cell factory is fed with synthetic cyclodipeptides to produce 30 heterodimeric C3-aryl pyrroloindolines (NAS analogs). Finally, some of those NAS analogs are found to have potent neuroprotective properties.
Results
Deciphering the biosynthesis of NASs
Fijian marine-sourced Streptomyces sp. CMB-MQ030 (MQ030 strain) was previously reported to produce only NAS-A/B4,6, so our initial aim was to identify the biosynthetic gene clusters of NAS-A/B. We hypothesized that NAS-A/B were assembled by two molecules of cyclodipeptides (2,5-diketopiperazine) through a radical mechanism. In order to locate the biosynthetic gene clusters, we sequenced and assembled the draft genome of Streptomyces sp. CMB-MQ030 (8,454,906 bp). Analysis of the draft genome of the MQ030 strain revealed that, neighbored by several genes for regulation and exportation, each of three distinct loci (locus-1, 2, and 3) contains one tRNA-dependent cyclodipeptide synthase (CDPS) gene and one adjacent P450 gene, which are functionally competent for a hypothesized biosynthetic route (Fig. 2a). Locus-1 and locus-2 share high similarity to each other: 61% identity between two CDPS genes and 68% identity between the two P450 genes; locus-3 harbors additional albonoursin biosynthetic genes10 (CDO-A and B) and is therefore excluded from further consideration.
To validate that locus-1 or -2 is the biosynthetic gene cluster of NAS-A/B, heterologous expression of locus-1 and -2 was performed in Streptomyces albus J1074: no obvious metabolite was detected in the recombinant strain with locus-2, whereas the recombinant strain with locus-1 produced a single metabolite with a molecular weight identical to that of NAS-B (Fig. 2b, trace IV). However, the high performance liquid chromatography (HPLC) retention time of this metabolite differs from that of standard NAS-B, indicating that this metabolite is not NAS-B (Fig. 2b, trace V). This metabolite was then determined to be NAS-C by comparing its nuclear magnetic resonance (NMR) data with the reported data5. Surprisingly, the production of NAS-C in the MQ030 strain was not detected (Fig. 2b, trace III), though both NAS-C and NAS-A/B are produced by Australian marine-sourced Streptomyces sp. USC-6365. As locus-1 only produces NAS-C, we infer that NAS-A/B must be encoded by a different biosynthetic pathway, i.e., locus-2, though heterologous expression of locus-2 in the host S. albus failed to produce NAS-A/B.
With decoupling from CDPS, the in vitro activity of the P450 enzymes in locus-1 and -2 was also assayed to verify that the P450 works on the products of CDPS to produce NASs. It is well-characterized that CDPS catalyzes the formation of cyclic dipeptides (cyclodipeptides), enabling us to propose that those cyclodipeptides would be substrates of the P450. Therefore, we directly fed synthetic cyclodipeptides into the P450-catalyzed reaction to assay the activity of the P450. Neither of the P450 genes from locus-1 and -2 could be expressed in Escherichia coli BL21 (DE3) until their codons were optimized for E. coli usage. P450-1 (P450-NAS-C or NascB) is partially soluble in the supernatant and can be successfully purified, while P450-2 forms inclusion bodies and was thus not characterized further. In the presence of E. coli-sourced flavodoxin (Fdx), flavodoxin reductase (FdxR) and an NADPH recycling system (NADP, glucose, and glucose dehydrogenase), P450-NAS-C (NascB) can convert the synthetic cyclo-(l-tryptophan-l-proline) (s4, cWL-PL) into NAS-C (Fig. 3a, trace III). No NAS-B was produced in this reaction, consistent with the in vivo result, which confirms that this enzyme is responsible only for NAS-C biosynthesis.
Delineating the mechanism of the P450 reaction
Typical P450 reactions require the substrates to enter the active site of the P450, but not to bind directly to the FeIII ion11,12. To investigate whether the substrate binds to NascB and thereby changes the electronic environment of the enzyme ferric heme, X-band continuous wave electron paramagnetic resonance (CW EPR) was used to measure the ferric heme signal. The spectrum recorded at 15 K for the substrate-free enzyme (Fig. 3b, top traces) was very similar to that previously reported11, showing the enzyme predominantly in the low-spin (LS) state due to axial water coordination. The spectrum comprises a number of EPR components, as indicated by the simulation (Fig. 3b, red trace), due to small differences in the orientation of the coordinated water molecule. Upon addition of cWL-PL, the majority of the ferric heme iron remains LS (Supplementary Figure 1), but only a single species is observed with shifted g-values, confirming a modification of the heme iron electronic environment due to the substrate cWL-PL binding to the protein.
While P450 can catalyze either oxygenation or radical-mediated coupling reactions, we hypothesize that the mechanism of forming NAS-C is a radical-mediated coupling reaction. This is supported by our result that the reaction is progressively inhibited by increasing concentrations of the radical scavenger TEMPO (Supplementary Figure 2). Similar to NAS-C, the biosynthesis of (-)-dibrevianamide F (homodimeric DKP) from the fungus Aspergillus flavus (A. flavus) was also assumed to proceed through a radical mechanism13. Instead of CDPS, A. flavus uses nonribosomal peptide synthetase (NRPS) to synthesize the cyclodipeptide substrate and a P450 enzyme, DtpC, to catalyze the dimerization. DtpC is proposed to initiate the reaction by abstracting a hydrogen from N10 or N1′ of the cWL-PL substrate to form the N10• radical13 or the N1′• radical14. The N10 radical (N10•) undergoes intramolecular addition to C2 to form the pyrroloindoline C3 radical. Two pyrroloindoline C3 radicals react with each other to yield (-)-dibrevianamide F, the major homodimeric product (Supplementary Figure 3)13. For the two minor heterodimeric DKPs, the N1′ radical (N1′•) can either directly couple with the pyrroloindoline C3 radical to form the C3–N1′ bond, or migrate to C7′ first and then couple with the pyrroloindoline C3 radical to generate NAS-A/B (Supplementary Figure 3)14.
Using density functional theory (DFT) calculations (Supplementary Methods), the structures of the N10• and N1• radicals were optimized (Supplementary Figure 4). The unpaired spin density in N1• is more delocalized than in N10• (Supplementary Figure 4, Supplementary Dataset). Consequently, N1• is more stabilized and has a free energy (ΔG) 16 kcal mol−1 lower than N10•, implying that P450 enzymes most likely prefer hydrogen abstraction from the cWL-PL N1 atom (instead of N10) to form N1•. H-atom abstraction, rather than single electron transfer, is also supported by thiolate ligation in P450, which significantly decreases the heme reduction potential and elevates the pKa of compound II15,16. To further support this mechanism, we prepared an oxo-mimic of cWL-PL (Oxo-cWL-PL) (s30 in Supplementary Figure 5), in which the HN1 was replaced by an oxygen. A biochemical assay revealed that this substrate cannot be converted by NascB, suggesting that HN1 is critical for the enzymatic reaction. As assumed in the biosynthesis of communesin17, calycanthine, and chimonanthine18,19, N1• could migrate to C3, followed by a Mannich-type reaction occurring between N10 and the imine bond of N1–C2. In addition, a similar mechanism has been recently proposed in the chemical oxidation of tryptophan20. Collectively, we conclude that the N1•-mediated intramolecular Mannich reaction most probably results in the formation of the pyrroloindoline C3 radical (Fig. 3c). Afterwards, the C3 radical could follow two possible routes: (1) the radical adds to C6′ of another cWL-PL, followed by elimination, to generate NAS-C; (2) the C3 radical turns into a C3 cation, which attacks C6′ of another cWL-PL to form NAS-C under a Friedel–Crafts scheme. We rule out the second route, a Friedel–Crafts-type electrophilic aromatic substitution, because NascB-catalyzed reaction rates are not affected by the strongly electron-withdrawing F group on 7-F substituted cWL-PL (Fig. 3c, NAS-27 in Fig. 4, and Supplementary Figure 6).
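To give a sense of scale for the 16 kcal mol−1 free-energy gap between N1• and N10• reported in this paragraph, the following minimal sketch converts it into a Boltzmann population ratio. It rests on assumptions that are not in the source: that the two radicals could equilibrate, and that the temperature is 298.15 K. The function name `boltzmann_ratio` is illustrative only.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL_PER_MOL_K = 1.987204e-3


def boltzmann_ratio(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Population ratio (more stable : less stable) for a free-energy gap.

    Assumes thermal equilibrium between the two species, which is an
    illustrative assumption and not a claim made in the source.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# 16 kcal/mol is the DFT free-energy difference quoted in the text.
ratio = boltzmann_ratio(16.0)
print(f"N1* : N10* population ratio at 298.15 K ~ {ratio:.1e}")
```

Under these assumptions the ratio is on the order of 10^11, which is consistent with the qualitative statement that abstraction from N1 is strongly preferred. The actual selectivity in the enzyme is a kinetic question that this estimate does not address.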
NAS-C bears a distinct C3–C6′ bond. Given that the proposed C6′ radical cannot be generated by migration from N1′, the C3–C6′ bond is more likely formed through an intermolecular radical addition, in which the nascent pyrroloindoline C3 radical directly attacks C6′ of the second cyclodipeptide molecule to form the bond (Fig. 3c). To confirm this mechanism, we synthesized and assayed a variety of cyclodipeptide substrates to seek the co-generation of products with different regio-selectivity. NascB transformed the substrate cyclo-(l-tryptophan-l-valine) (cWL-VL) (s3, Supplementary Figure 5) into products with both a C3–C6′ bond (NAS-18) and a C3–C7′ bond (NAS-17). Similar outcomes were also observed in NAS-4/3, 6/5, 20/19, 22/21, and 24/23 (Fig. 4). This variation in the observed bonds cannot be generated by radical coupling, but rather arises from the addition of a pyrroloindoline C3 radical to either the C6′ or the C7′ position. Based on these results, we conclude that the formation of NAS-C involves a mechanism of radical-mediated intramolecular cyclization and intermolecular addition (Fig. 3c), which is very efficient for the synthesis of complex natural products.
Developing an E. coli-based whole-cell biocatalysis system
The in vitro catalytic system requires tedious purification of multiple enzymes and supplementation with the expensive cofactor NADPH, and it is not suitable for large-scale preparation, so we decided to develop an E. coli-based whole-cell biocatalysis system for NAS synthesis. The NAS-C reaction depends on electron transport systems and, therefore, we first evaluated different pairs of electron transport proteins to optimize this reaction. As NAS-C can be produced by heterologous expression in S. albus, the endogenous ferredoxin (Fd) and ferredoxin reductase (FdR) of S. albus could be competent for this reaction. Therefore, all four FdR and three Fd genes from S. albus were amplified and cloned into pET28a for expression in E. coli. Unfortunately, none of the Fd and FdR combinations showed activity, and they were thus excluded from further study. Although we could use the E. coli-sourced Fdx and FdxR, the commercial spinach Fd and FdR were tested and found to provide much better activity than the E. coli-sourced Fdx and FdxR (~50-fold based on the conversion yield, Fig. 3a, trace IV).
Both spinach fd and fdr genes were synthesized with codons optimized for E. coli and constructed into a variety of plasmids for evaluation of their expression yield in E. coli. When fused at the N-terminus with a TRX and an MBP tag, respectively, the Fd and FdR proteins are expressed well, and the fusion proteins are fully competent without a need to remove the tags. Finally, we cloned trx-fd and mbp-fdr into a pRSFduet vector and nascB into pET21a to achieve the co-expression of Fd, FdR, and NascB in E. coli.
Unexpectedly, co-expression of these genes (fd, fdr, and nascB) in E. coli BL21 (DE3) showed a highly toxic effect, as cells rapidly underwent self-lysis after expression was induced by IPTG. As cells survive the individual expression of Fd, FdR, and NascB, the toxicity must derive from NascB activated in the presence of both Fd and FdR. We screened several commercial E. coli strains, including E. coli Rosetta (DE3), BL21 (DE3)-pLysE, C41 (DE3), and C43 (DE3), which are widely used and some of which are claimed to be particularly suitable for toxic protein expression, but none of those strains survived the activated NascB.
All of the E. coli strains tested above are derivatives of E. coli BL21 (DE3), a B type strain. The B type strains are often used for protein expression, while K type strains are mostly used for DNA cloning but also for protein expression, such as Shuffle T7 competent E. coli from New England Biolabs. Considering the difference between these two types, a K type strain may tolerate the toxicity of activated NascB. We further screened two different K strains, i.e., the commercial strain E. coli JM109 (DE3) and a homemade strain E. coli GB05dir-T7. E. coli GB05dir is a derivative of DH10B generated by integrating the RecET genes into the genome21. To meet the requirement of expressing Fd, FdR, and NascB under the T7 promoter, a T7 polymerase-coding gene was integrated into the lac operon, yielding E. coli GB05dir-T7. Surprisingly, the resulting strain E. coli GB05dir-T7 is very robust for expression of the complete P450 catalytic system, while E. coli JM109 (DE3)-T7 died rapidly, as the B type strains did. Both E. coli JM109 (DE3) and E. coli GB05dir are derived from the prototype strain K12, so their genotypes are similar. The most obvious difference is the deficiency of the Lon protease in JM109 (DE3). As in the B type (DE3) strains, the lon gene was knocked out because Lon causes protein degradation. However, Lon protease is also required for stress-induced developmental changes and survival from DNA damage22,23. Because activated P450 can generate radical species, which may cause DNA damage, we assume that the lon deficiency in these E. coli strains could be the major reason for cell death. Because Lon can cause protein degradation, we compared the expression yield of every single protein in GB05dir-T7 and BL21 (DE3); the effect of Lon on protein expression is minor, as the protein yield in GB05dir-T7 is only slightly lower (~20%) than in BL21 (DE3). Therefore, this strain is still efficient for protein expression and could be suitable for many other toxic P450 systems.
Using this E. coli GB05dir-T7-based whole-cell system, complete conversion of cWL-PL into NAS-C can be achieved by an overnight incubation (Supplementary Figure 6a). Considering that the P450 reaction requires NADPH, we further tried co-expression of glucose dehydrogenase (GDH) with Fd, FdR, and NascB in GB05dir-T7, but the catalytic activity of NascB did not improve, suggesting that the endogenous NADPH supply in the E. coli is indeed sufficient.
Generation of structural varieties through biocatalysis
After establishing the whole-cell biocatalysis system, we set out to generate NAS varieties by feeding E. coli cells with 20 chemically synthesized, L-Trp-containing cyclodipeptides (Supplementary Figure 5), i.e., cWL-XL, where XL denotes one of the 20 natural L-amino acids. To evaluate the substrate specificity of NascB, these cyclodipeptides, except for the natural cWL-PL, were individually fed to the recombinant E. coli (GB05dir-T7) containing the nascB, trx-fd, and mbp-fdr genes (GB05dir-T7-NascB). Three products were generated in high yield (Fig. 4, Supplementary Table 1, and Supplementary Figure 6b, c), including a cWL-AL dimerization product (NAS-1) and two cWL-VL dimerization products (NAS-17 and NAS-18). Interestingly, NAS-1 and NAS-18 have the same connection and stereo-configuration as NAS-C (C3–C6′ and 2S-3R), while NAS-17 resembles NAS-A/B (C3–C7′ and 2R-3S). The production of NAS-17 suggested that NascB also has a relaxed regio- and stereo-specificity in addition to its broad spectrum of substrates.
The efficient generation of NAS-C, NAS-1, 17, and 18 suggests that the substrates cWL-PL, cWL-AL, and cWL-VL are readily accepted by NascB. Considering that each of the pyrroloindolines is formed from two units of cyclodipeptides, we were interested in combining these three substrates with other cyclodipeptides to generate hetero-pyrroloindolines. To this end, each of these three substrates was individually co-fed with one of the remaining 17 cyclodipeptides into the recombinant E. coli GB05dir-T7-NascB. To our gratification, besides the production of NAS-C, -1, -17, and -18 (homo-dimerization), this co-feeding of two different cyclodipeptides at one time resulted in an additional 23 products (hetero-dimerization, Fig. 4, Supplementary Tables 2–4 and Supplementary Figures 7–9): 8 cWL-AL-derived hetero-pyrroloindolines (NAS-2 to NAS-9), 7 cWL-PL-derived hetero-pyrroloindolines (NAS-10 to NAS-16), and 8 cWL-VL-derived hetero-pyrroloindolines (NAS-19 to NAS-26). Interestingly, the selectivity for the upper cyclodipeptide is much stricter than that for the lower pyrroloindoline moiety; only cWL-PL, cWL-AL, and cWL-VL can be accepted as the upper moiety. Furthermore, when co-feeding cWL-PL and cWL-AL or cWL-VL, cWL-PL was accepted as the upper moiety (NAS-10 and 11); when co-feeding cWL-AL and cWL-VL, cWL-AL was accepted as the upper moiety (NAS-2), suggesting that cWL-PL, cWL-AL, and cWL-VL are the most, second, and least favored substrates, respectively, for the upper moiety. Unlike the upper moiety, the specificity for the lower pyrroloindoline moiety is more flexible, and in total eight tryptophan-containing cyclodipeptides can be accepted: cWL-AL, cWL-PL, cWL-VL, cWL-IL, cWL-LL, cWL-ML, cWL-FL, and cWL-YL.
Besides its capability of taking various substrates, NascB also shows tolerance in the regio- and stereo-specificity of the connection and the C2–C3 stereo-configuration. The tolerance of these specificities increases in the order of cWL-PL < cWL-AL < cWL-VL containing products: (i) In the cWL-PL containing products, both the regio- and stereo-specificities are conserved and the same as those observed in NAS-C. (ii) In cWL-AL-containing products, six products (NAS-1, 4, and 6–9) share the same regio- and stereo-specificity as NAS-C. Two products (NAS-2 and NAS-5) show the same specificity as NAS-A/B (C3–C7′ bond, 2R-3S), and one product shows a combination of bond and stereo-configuration not previously observed (NAS-3, C3–C7′, and 2S-3R). (iii) In cWL-VL-containing products, only four products (NAS-18, 20, 22, and 25) retain the specificity of NAS-C and three (NAS-17, 23, and 26) have the specificity of NAS-A/B. The three remaining products show a combination of C3–C6′/2R-3S (NAS-24) or C3–C7′/2S-3R (NAS-19 and 21), neither of which has previously been discovered. This substrate tolerance enables a single enzyme transformation to produce analogs with different specificities, which is very efficient for generating structural varieties. Interestingly, in some products, the sulfur of the methionine moiety was oxidized to a sulfoxide (NAS-7 and 14) or sulfone (NAS-8) group. This spontaneous or enzymatic oxidation (by endogenous enzymes) further increases the structural varieties of the NAS scaffold.
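The scaffold assignments listed in the paragraph above can be tabulated as a small lookup, which makes it easy to check the per-substrate counts. This is a minimal sketch that only re-encodes the assignments stated in that paragraph for the cWL-AL- and cWL-VL-containing products; the names `SCAFFOLDS` and `family` and the mapping of product numbers to substrate families by ID range are illustrative assumptions.

```python
# (bond, C2-C3 configuration) -> NAS product numbers, as listed in the
# paragraph above for cWL-AL- and cWL-VL-containing products only.
SCAFFOLDS = {
    ("C3-C6'", "2S-3R"): [1, 4, 6, 7, 8, 9, 18, 20, 22, 25],  # NAS-C type
    ("C3-C7'", "2R-3S"): [2, 5, 17, 23, 26],                   # NAS-A/B type
    ("C3-C7'", "2S-3R"): [3, 19, 21],                          # not previously observed
    ("C3-C6'", "2R-3S"): [24],                                 # not previously observed
}


def family(nas_id: int) -> str:
    """Substrate family by product number (assumed grouping: 1-9 AL, 17-26 VL)."""
    if 1 <= nas_id <= 9:
        return "cWL-AL"
    if 17 <= nas_id <= 26:
        return "cWL-VL"
    raise ValueError(f"NAS-{nas_id} is not covered by this table")


# Count products per (substrate family, scaffold type).
counts: dict[tuple[str, tuple[str, str]], int] = {}
for scaffold, ids in SCAFFOLDS.items():
    for nas_id in ids:
        key = (family(nas_id), scaffold)
        counts[key] = counts.get(key, 0) + 1

for (fam, (bond, config)), n in sorted(counts.items()):
    print(f"{fam}: {bond} / {config}: {n} product(s)")
```

The tally reproduces the counts given in the text: six, two, and one products for cWL-AL, and four, three, one, and two for cWL-VL, covering all four bond and configuration combinations.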
Since the enzyme shows very good substrate flexibility, we were next interested in its activity on halogenated and D-residue containing substrates. We chemically synthesized six halogenated substrates including 7F-, 5Cl-, 6Cl-, 7Cl-, 8Cl-, 6Br-cWL-PL, and three D-residue containing cyclodipeptides including cWD-PD, cWD-PL, and cWL-PD (s21–s29, Supplementary Figure 5). Then, we individually fed each of the six halogenated and three D-amino acid containing cyclodipeptides to the recombinant E. coli GB05dir-T7-NascB. The enzyme only recognizes halogens in the 7-position, and both 7F- and 7Cl-cWL-PL can be efficiently converted into their corresponding products (NAS-27 and 28, Fig. 4, Supplementary Figure 6d, e, Supplementary Tables 1–4). Interestingly, these two halogenated products differ from each other in their C2–C3 stereo-configuration, although they have the same C3–C6′ bond. The D-residue containing substrates are not able to form homo-coupled products, while, in another co-feeding experiment (Supplementary Tables 2–4), they can be efficiently coupled with cWL-VL to form hetero-pyrroloindolines (NAS-29 and 30). This is surprising, as natural C3-aryl pyrroloindolines containing D-amino acids are very rare, and this result indicates the potential for expanding the structural variety through the incorporation of D-amino acids.
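As a bookkeeping check on the 30 analogs described in this section, the sketch below groups the product numbers by how the text says they were obtained and confirms that the groups cover NAS-1 to NAS-30 exactly once. The group labels and the name `PRODUCT_GROUPS` are illustrative; the number ranges are taken from the text.

```python
# Product numbers grouped by origin, as described in the Results text.
PRODUCT_GROUPS = {
    "homodimers other than NAS-C": [1, 17, 18],
    "cWL-AL-derived heterodimers": list(range(2, 10)),    # NAS-2 to NAS-9
    "cWL-PL-derived heterodimers": list(range(10, 17)),   # NAS-10 to NAS-16
    "cWL-VL-derived heterodimers": list(range(19, 27)),   # NAS-19 to NAS-26
    "halogenated products": [27, 28],
    "D-residue-containing heterodimers": [29, 30],
}

all_ids = sorted(n for ids in PRODUCT_GROUPS.values() for n in ids)
heterodimers = sum(
    len(ids) for name, ids in PRODUCT_GROUPS.items()
    if name.endswith("-derived heterodimers")
)

print(f"co-feeding heterodimers: {heterodimers}")
print(f"total analogs: {len(all_ids)}")
```

The three co-feeding groups sum to the 23 hetero-dimerization products quoted earlier (8 + 7 + 8), and the full set gives the 30 analogs stated in the abstract.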
Bioactivity assessment
The bioactivity of NASs is rarely reported: (i) NAS-A/B have no cytotoxicity to arrays of bacteria, fungi, or cancer cell lines4. (ii) NAS-C shows only a trivial activity against the chloroquine-sensitive malaria parasite, Plasmodium falciparum5. Considering that many alkaloids are active on neuronal systems and NAS analogs are structurally similar to alkaloids, we were curious to know whether NAS analogs are also effective on neuronal systems. To evaluate this potential bioactivity, both the Aβ25-35-induced and l-glutamic acid-induced PC-12 cell lines were prepared and used as models of Alzheimer's disease and nerve injury, respectively. Compounds NAS-1 to -28 were tested in these two models. The results indicated that these products have no activity in the Alzheimer model (Supplementary Table 5), while most products exhibited an obvious protective activity against glutamate-induced PC-12 damage (Supplementary Table 6). In particular, NAS-12, 27, 10, and 11 show a potent activity, which is even better than that of the control nimodipine (Table 1), a clinical drug used for nerve protection. Interestingly, these products all bear a proline in the upper moiety and a C3–C6′ bond, suggesting that both the upper moiety and regio-specificity are critical for bioactivity, while the groups in the lower pyrroloindoline moiety are less important.
Table 1.
Compound | Concentration (μM) | Inhibition rate (%) |
---|---|---|
NAS-12 | 10 | 29.01 ± 1.67 |
NAS-27 | 10 | 29.88 ± 2.87 |
NAS-10 | 10 | 32.73 ± 2.83 |
NAS-11 | 10 | 33.00 ± 4.92 |
Nimodipine (positive control) | 20 | 33.30 ± 2.79 |
Negative control | 0 | 42.04 ± 4.86 |
Discussion
Biocatalysis is often more efficient and cost-effective than chemical routes. The selectivity of reactions is principally induced by the microenvironment in the catalytic cavity of an enzyme, so enzymatic conversions can often generate chemically unfavored stereo- and regio-selectivity in very high degree. These merits make biocatalysis a very appealing route for the generation of pyrroloindoline-containing compounds with unfavored stereo and regio-selectivity.
The results presented above show that NascB is able to generate all four combinations of two regio-specific and two stereo-specific reactions. Its unusual substrate-binding mechanism is interesting and significant for further engineering of specificity. NascB is assumed to contain two independent pockets: an upper pocket that accommodates the upper cyclodipeptide and a lower pocket that accommodates the lower pyrroloindoline. To generate both the C3R and C3S products, the heme-containing lower pocket must grant substrates the freedom to expose either the Re or the Si face of C3 to the other substrate molecule in the upper pocket. In addition, this pocket is highly tolerant to structural variations in the C15-substituted groups of the pyrroloindoline moieties (Fig. 1), as bulky residues such as methionine and phenylalanine can be accepted. The failure to take polar residues (except for tyrosine) suggests that the chemical environment in this pocket is very hydrophobic, which could be optimized by protein engineering to further broaden its substrate scope to include polar residues or other unnatural groups. In contrast to the lower pocket, the upper pocket is apparently more crowded and hydrophobic, as only three small cyclodipeptides (cWL-PL, cWL-AL, and cWL-VL) can be accepted.
Analysis of the conformation of each product revealed that the upper cyclodipeptide units are all perpendicular to the pyrroloindoline moiety and that the small (non-Trp) residue points away from the pyrroloindoline (Supplementary Figure 10). Interestingly, both the scaffolds C3–C7′/2S-3R (NAS-5) and C3–C6′/2R-3S (NAS-24), particularly the latter, can cause severe steric interaction between the upper and lower small residues (Supplementary Figure 10), and are thus thermodynamically unfavored. Chemical synthesis of the scaffolds with 2S-3R is very difficult, as these normally depend on the induction of a particular cyclodipeptide conformation7, while this biosynthetic approach provides a practical and efficient way to access them. To allow NascB to accept more diverse structures in the upper pocket, further protein engineering may be employed to increase the available space. Moreover, reducing the size of the upper moiety, for example by using smaller phenylalanine-containing cyclodipeptides or even indole groups, is also a promising avenue to expand pyrroloindoline diversity.
The biosynthesis of many natural products features C–C or C–N bond-coupling reactions13,14,17,24–31. These reactions are often radical-mediated and catalyzed by flavin-dependent, SAM-dependent, or metal-dependent enzymes, or by P450s. Due to their high reactivity, radicals can accomplish very difficult chemical transformations. However, their instability, destructive effect, and short lifetime under normal conditions make it very challenging to tame their reactivity. Unlike other radical reactions, the NascB-catalyzed reaction presents a rare radical cascade reaction and involves a three-step conversion: radical generation and migration, Mannich reaction, and radical addition. Such a multistep radical-involving transformation is very efficient, and our whole-cell biocatalysis system makes NascB ideal for practical application.
The absence of an efficient production approach restricts the exploration of biological activities of pyrroloindoline-containing compounds with this characteristic molecular architecture. So far, only trivial bioactivity has been known through testing a limited set of molecules. With our enzyme-based and efficient biocatalysis platform, it is now possible to significantly expand the bioactive space of heterodimeric C3-aryl pyrroloindolines.
Methods
Heterologous expression of biosynthetic gene clusters
We extracted the genomic DNA of the NAS-producing strain Streptomyces sp. CMB-MQ030 using the method developed by Nikodinovic et al.32 with one minor modification: an additional round of chloroform treatment before isopropanol precipitation of the genomic DNA. To sequence the genome of the NAS-producing strain, two SMRT cells were employed at the UQ Centre for Clinical Genomics (UQCCG) to generate 114,142 reads with a mean read length of 14,421 bp, which provided an average of 194.7× coverage across the genome reference. The finished genome was assembled with HGAP233. The gene clusters were amplified by PCR (Supplementary Methods) and inserted into the vector pIB139 under the control of the constitutive promoter ermE* for heterologous expression in the model strain Streptomyces albus J1074.
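The quoted coverage can be reproduced from the read statistics above together with the draft genome size reported in the Results (8,454,906 bp). The sketch below uses the usual definition of mean coverage as total sequenced bases divided by genome length; the function name `mean_coverage` is illustrative, and treating the draft genome size as the reference length is an assumption.

```python
def mean_coverage(n_reads: int, mean_read_length_bp: float, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases / genome length."""
    return n_reads * mean_read_length_bp / genome_size_bp


# Read statistics from the Methods; genome size from the Results section.
coverage = mean_coverage(
    n_reads=114_142,
    mean_read_length_bp=14_421,
    genome_size_bp=8_454_906,
)
print(f"mean coverage ~ {coverage:.1f}x")
```

With these inputs the result rounds to 194.7, matching the stated coverage.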
Protein expression, purification, and enzyme assay
P450 genes with and without codon optimization for E. coli were cloned into pET28a and overexpressed in E. coli BL21 (DE3). NascB activity against cWL-PL was assayed by incubating purified NascB (0.1 µM) and cWL-PL (1 mM) at 18 °C with 1 µM E. coli flavodoxin (Fdx), flavodoxin reductase (FdxR), or 1 µM spinach ferredoxin (Fd), ferredoxin reductase (FdR), 2 mM NADP+ (Sigma-Aldrich), 2 mM glucose, and 2 mM glucose dehydrogenase (GDH) in 50 mM HEPES buffer, 100 mM NaCl, at pH 7.5. The reaction was left at 18 °C for 24 h. After 24 h, a twofold volume of ethyl acetate was added to the reaction solution, followed by sonication for 5 min. After separation of the aqueous and organic phases, the upper ethyl acetate layer was transferred to a rotary evaporator and dried, and the residue was re-dissolved in HPLC-grade methanol, with a small amount of DMSO added if solubility was poor. The resulting solution was filtered through a 0.45 µm membrane and analyzed by HPLC. A Diamonsil column (C18, 5 μm, 250 × 4.6 mm, Dikma Technologies Inc.) was used at a flow rate of 1 mL min−1 with a PDA detector over a 40 min gradient program with water (eluent A) and acetonitrile (eluent B): T = 0 min, 5% B; T = 30 min, 100% B; T = 33 min, 100% B; T = 34 min, 5% B; T = 40 min, 5% B.
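The gradient program above is given only as time and composition setpoints. The sketch below encodes those setpoints and returns the eluent B percentage at any time, assuming linear ramps between consecutive setpoints; that interpolation is a common convention but is not stated in the source. The names `GRADIENT` and `percent_b` are illustrative.

```python
from bisect import bisect_right

# (time in min, % eluent B) setpoints from the HPLC method described above.
GRADIENT = [(0.0, 5.0), (30.0, 100.0), (33.0, 100.0), (34.0, 5.0), (40.0, 5.0)]


def percent_b(t_min: float) -> float:
    """% acetonitrile (eluent B) at time t, assuming linear ramps between setpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-40 min program")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1          # index of the segment start
    if i == len(GRADIENT) - 1:                  # exactly at the final setpoint
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 15, 30, 33.5, 40):
    print(f"t = {t:>4} min: {percent_b(t):.1f}% B, {100 - percent_b(t):.1f}% A")
```

Under the linear-ramp assumption, the composition at 15 min is halfway up the 5% to 100% B ramp, and the column is re-equilibrated at 5% B from 34 to 40 min.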
Electron paramagnetic resonance spectroscopy
X-band CW EPR spectra were recorded at 15 K on a Bruker Elexsys E500 spectrometer fitted with a super high Q Bruker cavity, a liquid helium cryostat (Oxford Instruments), and a microwave frequency counter (Bruker ER049X). Spectra were measured with a microwave power of 2 mW (non-saturating conditions), a modulation amplitude of 0.3 mT, and a modulation frequency of 100 kHz. The magnetic field was calibrated with a Tesla meter. Frozen solutions of the substrate-free and substrate-bound enzyme in 50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 10% glycerol were analyzed in 4 mm quartz tubes. For the substrate-bound sample, 80 µL of 0.23 mM NascB solution was mixed with 0.368 µL of 100 mM cWL-PL solution in DMSO to obtain a twofold molar excess of substrate in the final mix. For the substrate-free sample, the same amount of NascB enzyme solution was mixed with 0.368 µL of DMSO.
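The stoichiometry of the EPR sample can be checked with a short calculation. This sketch assumes that volumes are additive on mixing and uses the identity µL × mM = nmol; the function name and the returned keys are hypothetical.

```python
def epr_sample(enzyme_ul: float, enzyme_mM: float, substrate_ul: float, substrate_mM: float) -> dict:
    """Summarize an enzyme/substrate mix, assuming additive volumes."""
    enzyme_nmol = enzyme_ul * enzyme_mM          # µL x mM = nmol
    substrate_nmol = substrate_ul * substrate_mM
    total_ul = enzyme_ul + substrate_ul
    return {
        "molar_ratio": substrate_nmol / enzyme_nmol,
        "enzyme_final_mM": enzyme_nmol / total_ul,
        "substrate_final_mM": substrate_nmol / total_ul,
        # The substrate stock is in DMSO, so its volume is the DMSO volume.
        "dmso_percent_v_v": 100 * substrate_ul / total_ul,
    }


if __name__ == "__main__":
    # Values from the text: 80 µL of 0.23 mM NascB plus 0.368 µL of 100 mM cWL-PL.
    for key, value in epr_sample(80, 0.23, 0.368, 100).items():
        print(f"{key}: {value:.3f}")
```

With the values in the text, the substrate-to-enzyme molar ratio comes out at 2, consistent with the stated twofold excess, and the DMSO content stays below 1% (v/v).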
Biotransformation and purifications
The recombinant strains were grown at 37 °C to an OD600 of 0.8–1.0, at which point expression was induced by adding 100 μM IPTG (isopropyl-β-d-thiogalactopyranoside), 0.4 mM δ-aminolevulinic acid (ALA), and 0.2 mM (NH4)2Fe(SO4)2. After expression at 18 °C and 220 r.p.m. for 20 h, cells from 50 mL of culture were harvested by centrifugation at 2000×g at 4 °C, washed twice, and resuspended in 2 mL of M9 medium. Cyclodipeptide substrates were then added to the M9 medium. After incubation at 18 °C for 48 h, the reaction mixture was extracted with 4 mL of ethyl acetate under sonication for 10 min. The organic phase was transferred and dried under vacuum at low temperature. Metabolites were then re-dissolved in methanol with an appropriate amount of DMSO and filtered through a 0.45 μm membrane to remove particles before HPLC or HPLC-MS (mass spectrometry) analysis.
NMR spectroscopy
The NMR spectra were recorded on a Bruker Avance III spectrometer, equipped with a TCI cryoprobe, at a 1H frequency of 400, 700, or 900 MHz. Lyophilized samples (1–7 mg) were dissolved in 280 µL DMSO-d6 (Cambridge Isotope) and all spectra were recorded at 25 °C (298 K). 1H and 13C resonances were assigned through analysis of 1D 1H, 1D 13C, 2D 1H–1H rotating frame Overhauser effect spectroscopy (ROESY), 2D 1H–13C heteronuclear single quantum correlation (HSQC), and 2D 1H–13C heteronuclear multiple bond correlation (HMBC, optimized for long-range heteronuclear couplings of 6 Hz) spectra. 1H and 13C chemical shifts were calibrated with reference to the DMSO solvent signal (2.50 and 39.5 p.p.m. for 1H and 13C, respectively). NMR data were processed with Bruker TopSpin (version 3.57) and analyzed with Mnova software.
Acknowledgements
We thank Prof. Rob Capon and Dr. Zeinab Khalil at The University of Queensland (UQ) for their assistance and for providing the Streptomyces sp. CMB-MQ030 strain and NAS-A and -B standards, and Professor Youming Zhang at Shandong University for the gift of E. coli GB05dir. This work was supported in part by the NSFC (31570057 and 31770063 to X.Q.) and UQ (UQ Postdoctoral Research Fellowship 2015–2017 to X.J., Development Fellowship 2017–2019 to M.M.). We thank the Queensland NMR Network for access to the NMR spectrometer and the Centre for Advanced Imaging for access to the EPR spectrometer.
Author contributions
X.Q., X.J., and Z.D. designed the research; W.T., C.S., X.J., M.Z., Y.Z., H.P., J.R.H., M.M., and D.Z. performed the experiments; M.Y. and S.-L.C. performed the DFT calculations; X.Q., W.T., and C.S. analyzed the data; and X.Q., W.T., C.S., and X.J. wrote the paper.
Data Availability
The nasc sequence reported in this paper has been deposited in GenBank under accession number MH201515 (https://www.ncbi.nlm.nih.gov/nuccore/mh201515); the record is currently on hold and will be released upon publication. All other relevant data are available from the corresponding authors.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Wenya Tian, Chenghai Sun.
Contributor Information
Xinying Jia, Email: x.jia1@uq.edu.au.
Xudong Qu, Email: quxd@whu.edu.cn.
Electronic supplementary material
Supplementary Information accompanies this paper at 10.1038/s41467-018-06528-z.
References
- 1. Ruiz-Sanchis P, Savina SA, Albericio F, Alvarez M. Structure, bioactivity and synthesis of natural products with hexahydropyrrolo[2,3-b]indole. Chem. Eur. J. 2011;17:1388–1408. doi: 10.1002/chem.201001451.
- 2. Ma YM, Liang XA, Kong Y, Jia B. Structural diversity and biological activities of indole diketopiperazine alkaloids from fungi. J. Agric. Food Chem. 2016;64:6659–6671. doi: 10.1021/acs.jafc.6b01772.
- 3. Belin P, et al. The nonribosomal synthesis of diketopiperazines in tRNA-dependent cyclodipeptide synthase pathways. Nat. Prod. Rep. 2012;29:961–979. doi: 10.1039/c2np20010d.
- 4. Raju R, et al. Naseseazines A and B: a new dimeric diketopiperazine framework from a marine-derived actinomycete, Streptomyces sp. Org. Lett. 2009;11:3862–3865. doi: 10.1021/ol901466r.
- 5. Buedenbender L, et al. Naseseazine C, a new anti-plasmodial dimeric diketopiperazine from a marine sediment derived Streptomyces sp. Tetrahedron Lett. 2016;57:5893–5895. doi: 10.1016/j.tetlet.2016.11.071.
- 6. Kim J, Movassaghi M. Concise total synthesis and stereochemical revision of (+)-Naseseazines A and B: regioselective arylative dimerization of diketopiperazine alkaloids. J. Am. Chem. Soc. 2011;133:14940–14943. doi: 10.1021/ja206743v.
- 7. Kim J, Movassaghi M. Biogenetically-inspired total synthesis of epidithiodiketopiperazines and related alkaloids. Acc. Chem. Res. 2015;48:1159–1171. doi: 10.1021/ar500454v.
- 8. Kim J, Ashenhurst JA, Movassaghi M. Total synthesis of (+)-11,11’-dideoxyverticillin A. Science. 2009;324:238–241. doi: 10.1126/science.1170777.
- 9. Repka LM, Reisman SE. Recent developments in the catalytic, asymmetric construction of pyrroloindolines bearing all-carbon quaternary stereocenters. J. Org. Chem. 2013;78:12314–12320. doi: 10.1021/jo4017953.
- 10. Lautru S, Gondry M, Genet R, Pernodet JL. The Albonoursin gene cluster of S. noursei. Chem. Biol. 2002;9:1355–1364. doi: 10.1016/S1074-5521(02)00285-5.
- 11. Goldfarb D, Bernardo M, Thomann H, Kroneck PMH, Ullrich V. Study of water binding to low-spin Fe(III) in cytochrome P450 by pulsed ENDOR and four-pulse ESEEM spectroscopies. J. Am. Chem. Soc. 1996;118:2686–2693. doi: 10.1021/ja951307e.
- 12. Krest CM, et al. Reactive intermediates in cytochrome P450 catalysis. J. Biol. Chem. 2013;288:17074–17081. doi: 10.1074/jbc.R113.473108.
- 13. Saruwatari T, et al. Cytochrome P450 as dimerization catalyst in diketopiperazine alkaloid biosynthesis. Chembiochem. 2014;15:656–659. doi: 10.1002/cbic.201300751.
- 14. Kishimoto S, Sato M, Tsunematsu Y, Watanabe K. Evaluation of biosynthetic pathway and engineered biosynthesis of alkaloids. Molecules. 2016;21:1078. doi: 10.3390/molecules21081078.
- 15. Green MT. C–H bond activation in heme proteins: the role of thiolate ligation in cytochrome P450. Curr. Opin. Chem. Biol. 2009;13:84–88. doi: 10.1016/j.cbpa.2009.02.028.
- 16. Green MT, Dawson JH, Gray HB. Oxoiron (IV) in chloroperoxidase compound II is basic: implications for P450 chemistry. Science. 2004;304:1653–1656. doi: 10.1126/science.1096897.
- 17. Lin HC, et al. P450-mediated coupling of indole fragments to forge communesin and unnatural isomers. J. Am. Chem. Soc. 2016;138:4002–4005.
- 18. Woodward RB, et al. Calycanthine: the structure of the alkaloid and its degradation product, calycanine. Proc. Chem. Soc. 1960;1960:76–78.
- 19. Robinson R, Teuber HJ. Reactions with nitrosodisulfonate. IV. Calycanthine and calycanthidine. Chem. Ind. 1954;1954:783–784.
- 20. Gentry EC, Rono LJ, Hale ME, Matsuura R, Knowles RR. Enantioselective synthesis of pyrroloindolines via noncovalent stabilization of indole radical cations and applications to the synthesis of alkaloid natural products. J. Am. Chem. Soc. 2018;140:3394–3402. doi: 10.1021/jacs.7b13616.
- 21. Fu J, et al. Full-length RecE enhances linear-linear homologous recombination and facilitates direct cloning for bioprospecting. Nat. Biotech. 2012;30:440. doi: 10.1038/nbt.2183.
- 22. Bota DA, Davies KJA. Mitochondrial Lon protease in human disease and aging: including an etiologic classification of Lon-related diseases and disorders. Free Radic. Biol. Med. 2016;100:188–198. doi: 10.1016/j.freeradbiomed.2016.06.031.
- 23. Karlowicz A, et al. Defining the crucial domain and amino acid residues in bacterial Lon protease for DNA binding and processing of DNA-interacting substrates. J. Biol. Chem. 2017;292:7507–7518. doi: 10.1074/jbc.M116.766709.
- 24. Belin P, et al. Identification and structural basis of the reaction catalyzed by CYP121, an essential cytochrome P450 in Mycobacterium tuberculosis. Proc. Natl Acad. Sci. USA. 2009;106:7426–7431. doi: 10.1073/pnas.0812191106.
- 25. Makino M, et al. Crystal structures and catalytic mechanism of cytochrome P450 StaP that produces the indolocarbazole skeleton. Proc. Natl Acad. Sci. USA. 2007;104:11591–11596. doi: 10.1073/pnas.0702946104.
- 26. Ikezawa N, Iwasa K, Sato F. Molecular cloning and characterization of CYP80G2, a cytochrome P450 that catalyzes an intramolecular C–C phenol coupling of (S)-reticuline in magnoflorine biosynthesis, from cultured Coptis japonica cells. J. Biol. Chem. 2008;283:8810–8821. doi: 10.1074/jbc.M705082200.
- 27. Baunach M, Ding L, Bruhn T, Bringmann G, Hertweck C. Regiodivergent N-C and N-N aryl coupling reactions of indoloterpenes and cycloether formation mediated by a single bacterial flavoenzyme. Angew. Chem. Int. Ed. Engl. 2013;52:9040–9043. doi: 10.1002/anie.201303733.
- 28. Zhang Q, et al. Characterization of the flavoenzyme XiaK as an N-hydroxylase and implications in indolosesquiterpene diversification. Chem. Sci. 2017;8:5067–5077. doi: 10.1039/C7SC01182B.
- 29. Präg A, et al. Regio- and stereoselective intermolecular oxidative phenol coupling in Streptomyces. J. Am. Chem. Soc. 2014;136:6195–6198. doi: 10.1021/ja501630w.
- 30. Teufel R, Agarwal V, Moore BS. Unusual flavoenzyme catalysis in marine bacteria. Curr. Opin. Chem. Biol. 2016;31:31–39. doi: 10.1016/j.cbpa.2016.01.001.
- 31. Tang MC, Zou Y, Watanabe K, Walsh CT, Tang Y. Oxidative cyclization in natural product biosynthesis. Chem. Rev. 2017;117:5226–5333. doi: 10.1021/acs.chemrev.6b00478.
- 32. Nikodinovic J, Barrow KD, Chuck JA. High yield preparation of genomic DNA from Streptomyces. Biotechniques. 2003;35:932–936. doi: 10.2144/03355bm05.
- 33. Chin CS, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods. 2013;10:563. doi: 10.1038/nmeth.2474.
- 34. Rittle J, Green MT. Cytochrome P450 compound I: capture, characterization, and C-H bond activation kinetics. Science. 2010;330:933–937. doi: 10.1126/science.1193478.
- 35. Conner KP, et al. 1,2,3-triazole−heme interactions in cytochrome P450: functionally competent triazole−water−heme complexes. Biochemistry. 2012;51:6441–6457. doi: 10.1021/bi300744z.