Abstract
Siderophores are iron-chelating molecules produced by microorganisms and plants to acquire exogenous iron. Siderophore biosynthetic enzymology often produces elaborate and unique molecular structures through unusual reactions to enable specific recognition by the producing organisms. Herein, we report the structure of two siderophore analogs from Agrobacterium fabrum strain C58, which we named fabrubactin (FBN) A and FBN B. Additionally, we characterized the substrate specificities of the NRPS and PKS components. The structures suggest unique Favorskii-like rearrangements of the molecular backbone that we propose are catalyzed by the flavin-dependent monooxygenase, FbnE. FBN A and B contain a 1,1-dimethyl-3-amino-1,2,3,4-tetrahydro-7,8-dihydroxy-quinolin (Dmaq) moiety previously seen only in the anachelin cyanobacterial siderophores. We provide evidence that Dmaq is derived from L-DOPA and propose a mechanism for the formation of the mature Dmaq moiety. Our bioinformatic analyses suggest that FBN A and B and the anachelins belong to a large and diverse siderophore family widespread throughout the Rhizobium/Agrobacterium group, α-proteobacteria, and cyanobacteria.
Keywords: Siderophore, nonribosomal peptide synthetase, polyketide synthase, natural product, Agrobacterium, monooxygenase, peroxidase, anachelin
INTRODUCTION
Siderophores are high-affinity iron-chelating small molecules made by plants, fungi, and prokaryotes that are used to acquire exogenous iron. Biosynthesis of siderophores comes at a significant metabolic cost to the producing organism, which is further exacerbated by competitors capturing the ferric-siderophore before it can be absorbed by the producer. As a result, siderophore biosynthesis is an evolutionary battleground of enzymology that leads to a high diversity of elaborate structures1–4. The pyoverdines, made by fluorescent pseudomonads, are one such example. Apart from a characteristic dihydroxyquinoline chromophore present in all of these molecules, pyoverdine biosynthetic enzymology produces remarkable structural diversity by varying the number, identities, and cyclization of amino acids incorporated into these natural products5. To date there are more than 60 structurally unique pyoverdines6. Thus, exploring siderophore biosynthesis can facilitate the discovery of unusual natural product enzymology and provide insights into how Nature generates natural product diversity.
We have previously reported that Agrobacterium fabrum strain C58, formerly known as A. tumefaciens strain C587, 8, produces a siderophore assembled by a hybrid nonribosomal peptide synthetase (NRPS)/polyketide synthase (PKS)9. This conclusion was based on the associated biosynthetic gene cluster (BGC) encoding two PKS modules, one fatty acyl-CoA ligase module, and seven NRPS modules. This enzymology did not align with the structure of any known siderophore from Agrobacterium spp. or related bacteria. Interestingly, some of the enzymes encoded by this BGC have homologs in the cyanobacterium Anabaena sp. strain PCC71209, and it was subsequently shown that the BGC from this cyanobacterium encodes enzymes that assemble a siderophore of unknown structure10. More recent bioinformatic analyses of BGCs in cyanobacteria suggest the BGC in PCC7120 is involved in the production of an anachelin analog11.
In our previous study, we were able to purify the siderophore, but did not perform any structural analyses. Efforts to use biochemical characterization of the biosynthetic enzymes to guide structural analysis were thwarted by our inability to obtain soluble NRPS components when heterologously produced in Escherichia coli. Since our initial work, we, and others, discovered that small proteins belonging to the MbtH-like protein (MLP) superfamily dramatically influence the solubility and activity of associated NRPSs12–14. Genes encoding MLPs are commonly found in association with genes encoding NRPSs and hybrid NRPS/PKSs15, as is the case with the A. fabrum strain C58 siderophore BGC.
Herein we report the structure of two siderophore analogs from A. fabrum strain C58 that we refer to as fabrubactin (FBN) A and B and the biochemical characterization of the NRPS and PKS components that produce the siderophore backbones. These siderophores contain a lipid tail at one end and are capped by a 1,1-dimethyl-3-amino-1,2,3,4-tetrahydro-7,8-dihydroxy-quinolin (Dmaq) at the other (Figure 1). This Dmaq moiety is rare in natural products and has previously only been found in three analogs of the cyanobacterial siderophore anachelin16, 17. We provide biochemical evidence that levodopa (L-DOPA) is a precursor for the formation of the Dmaq moiety and is produced by a heme peroxidase. We hypothesize that this heme peroxidase is bifunctional and catalyzes L-DOPA formation and aza-annulation to form the Dmaq moiety. FBN A and B are unusual in that they contain two backbone quaternary carbons each with a free hydroxyl and carboxyl group. We propose these moieties are formed by two Favorskii-like rearrangements catalyzed by a flavin-dependent monooxygenase. Finally, the formation of the two analogs suggests the potential for an unusual NRPS module that first selectively activates β-Alanine (β-Ala), covalently tethering the amino acid to its thiolation domain, and then activates either another β-Ala (FBN A) or captures an activated L-Ala from a neighboring NRPS for amide bond formation (FBN B), generating a dipeptide on the thiolation domain. The structure of the FBNs and the unusual enzymology provide more support for siderophores and their biosyntheses being rich sources of novel chemistry and enzymes. Additionally, bioinformatics suggests the FBNs and anachelins are a large class of Dmaq-containing siderophores found in many bacteria.
Figure 1. Structures of fabrubactin (FBN) analogs A and B.

Unique partial structures 2-(2-aminoethyl)-2-hydroxy-propanedioic acid (Ahpa; shown in blue), 2-C-carboxy-2,4-dihydroxy-undecanoic acid (Cdua; shown in red), and 1,1-dimethyl-3-amino-1,2,3,4-tetrahydro-7,8-dihydroxy-quinolin (Dmaq; shown in green) are indicated. Only atoms unique to FBN B are numbered on the structure of this analog.
RESULTS AND DISCUSSION
Structure elucidation of FBN A and B.
Growth conditions in Tris-based minimal medium containing the iron chelator dipyridyl were optimized for siderophore production. The siderophores were purified as described in the Experimental Procedures. Fractions containing the siderophore were identified by reactivity with chrome azurol S reagent18, 19. During growth and purification, a parallel culture of an A. fabrum strain C58 disrupted in siderophore production (an fbnN deletion) was used as a negative control. From eight liters of culture, five mg of FBN A and B were obtained for structural studies.
The structures of the two siderophores, FBN A and B (Figure 1), were determined using a combination of MS and NMR. The purified compounds had an m/z of 1008.4334 (M+), showing the appropriate M+1, M+2, and M+3 isotopic patterns (Figure S1A). A metabolite with this m/z was not observed in the negative control. The molecular formula of FBN A and B was established as C44H66N9O16S based on high-resolution electrospray ionization mass spectrometry (HR-ESI-MS) data. The two analogs are identical except for β-Ala in FBN A and Ala in FBN B corresponding to backbone atoms numbered 39–42. Therefore, FBNs will henceforth be used to refer to both analogs unless otherwise indicated. Additionally, ultra-high-resolution Fourier transform ion cyclotron resonance (FT-ICR) MS was employed to obtain exact molecular formula information from the isotopic fine structure, which clearly indicated the presence of one sulfur (Figure S1B, S1C).
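The agreement between the reported formula and the measured m/z can be checked with simple arithmetic. The following is a minimal sketch, not part of the reported analysis: it assumes the ion is the intrinsic cation C44H66N9O16S+ (so one electron mass is subtracted), uses standard monoisotopic masses, and the function names are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "S": 31.9720707}
ELECTRON = 0.00054858  # electron mass in u

def cation_mz(formula, charge=1):
    """m/z of a cation whose elemental composition is given as {element: count}."""
    atoms = sum(MONO[el] * n for el, n in formula.items())
    return (atoms - charge * ELECTRON) / charge

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

# Formula reported in the text; treating it as a singly charged cation is an assumption.
FBN = {"C": 44, "H": 66, "N": 9, "O": 16, "S": 1}

theoretical = cation_mz(FBN)
error = ppm_error(1008.4334, theoretical)
print(f"calculated m/z {theoretical:.4f}, error {error:+.2f} ppm")
```

Under these assumptions the calculated value lies within about 1 ppm of the observed 1008.4334, as expected for HR-ESI-MS data.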
Analysis of the 13C-13C COSY data provided most of the backbone carbons of the structure (i.e., without configurational assignment) (Figure 2, Table S1, and Table S2). Analysis of the 1D NMR, COSY, and HMBC data of FBN A identified three standard amino acids including one Ala and two glycines (Gly), and one β-Ala. Similarly, two Ala and two Gly amino acids were identified in FBN B. HMBC correlations from H-29 (s, δH 7.88) to C-25 (δC 148.8) and C-27 (δC 174.5) suggested the presence of a 2-substituted thiazole (Thz) ring20. The connectivity between C-27 and C-30 and between C-30 and C-31 completed the Ala-Thz fragment (Figure 2, Table S1, and Table S2).
Figure 2. Select 2D NMR analysis for FBN A. 2D NMR correlations to determine FBN A (β-Ala) and FBN B (Ala) structures.

The HMBC correlations from the proton signals of the two N-methyl groups (H3-11 and H3-12) to C-2 and C-10 enabled a linkage of C-2 and C-10 via a nitrogen atom. C-7 and C-8 were assigned as two phenolic carbons based on the chemical shift and were also confirmed by NOESY correlations between OH-8 and H-9 and between H-6 and OH-7. Therefore, one of the remaining unassigned moieties was associated with the Dmaq unit (Figure 2, Table S1, and Table S2)16, 17. The counterion for the Dmaq moiety was trifluoroacetate, used in the HPLC purification. The catecholate in the Dmaq moiety of anachelin is one of the functional groups involved in binding iron17. The presence of Dmaq in the FBNs provides structural support for their role as siderophores. Mixing a solution of FeCl3 and siderophore gave a peak at 1043 m/z corresponding to a 1:1 complex as [M+Fe3+−3H−H2O]+, determined by ESI-MS (Figure S2).
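The assignment of the 1043 m/z peak to a 1:1 ferric complex can be reproduced by bookkeeping atoms and charges. This is an illustrative sketch only: it assumes the siderophore is the C44H66N9O16S+ cation, that the iron is 56Fe, and that the three lost hydrogens leave as protons; the helper names are hypothetical.

```python
# Monoisotopic masses (u); Fe is the most abundant isotope, 56Fe.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "S": 31.9720707, "Fe": 55.9349363}
ELECTRON = 0.00054858  # electron mass in u

def combine(base, changes):
    """Add (count > 0) or remove (count < 0) sub-formulas from a base formula."""
    out = dict(base)
    for formula, count in changes:
        for el, n in formula.items():
            out[el] = out.get(el, 0) + count * n
    return out

FBN = {"C": 44, "H": 66, "N": 9, "O": 16, "S": 1}  # M+, charge +1

# [M + Fe3+ - 3H - H2O]+ as written in the text
complex_formula = combine(FBN, [({"Fe": 1}, 1),
                                ({"H": 1}, -3),
                                ({"H": 2, "O": 1}, -1)])
charge = 1 + 3 - 3  # M+ plus Fe3+ minus three protons

atoms = sum(MONO[el] * n for el, n in complex_formula.items())
complex_mz = (atoms - charge * ELECTRON) / charge
print(complex_formula, f"m/z {complex_mz:.3f}")
```

With these assumptions the net charge is +1 and the calculated value rounds to the nominal 1043 m/z reported for the complex.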
Assembly of the last partial structures, Ahpa and 2-C-carboxy-2,4-dihydroxy-undecanoic acid (Cdua), was accomplished by 1H-1H COSY and HMBC. Since both C-34 (δC 78.0) and C-44 (δC 77.6) were sp3 quaternary carbons, one hydroxy and one carboxylic acid group were attached to each to satisfy the molecular formula (Figure 2, Table S1, and Table S2). The presence of two carboxylic acid groups was further confirmed by ESI-MS2 and FT-MS2 data (Figure S3A), showing two significant peaks [M−COOH]+ and [M−2*COOH]+ at 964.4371 m/z and 920.4474 m/z, respectively.
FT-MS2 experiments to confirm the connectivity of the aforementioned partial structures were conducted concurrently with the NMR-based structure elucidation. Accurate mass measurements and fine structure isotopic patterns resulted in the assignment of exact molecular formulas of fragments (Figure S3B)21, 22. The fragment ion at m/z 548.2283 could be assigned the elemental composition C24H34N7O6S+, corresponding to Dmaq−Gly1−Ala−Gly2−Thz-Ala. Detection of the ion at m/z 764.3029 (C32H46N9O11S+) resulted in the sequence assignment of a β-Ala−Ahpa after the Ala-Thz residue. The neutral loss between m/z 764.3029 and the parent ion was produced through cleavage of the amide bond of the Cdua residue (Figure S3B). Observation of HMBC correlations from NH-42 and H2-45 to C-43 suggested the connectivity between Cdua and β-Ala, thus completing the structure of the FBNs without configurational assignment.
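The fragment assignments above can be cross-checked by subtracting elemental compositions. The sketch below is an illustration under stated assumptions rather than the analysis used in this work: the fragment formulas are taken from the text, all ions are treated as singly charged cations, and the residue formulas named in the comments (Cdua minus H2O as C12H20O5; β-Ala plus Ahpa residues as C8H12N2O5) are derived here from the compound names, not reported values.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "S": 31.9720707}
ELECTRON = 0.00054858  # electron mass in u

def mass(formula):
    """Monoisotopic mass of a neutral composition {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())

def cation_mz(formula):
    """m/z of the corresponding singly charged cation."""
    return mass(formula) - ELECTRON

def subtract(a, b):
    """Elemental difference a - b, dropping elements that cancel."""
    diff = {el: a.get(el, 0) - b.get(el, 0) for el in set(a) | set(b)}
    return {el: n for el, n in diff.items() if n != 0}

PARENT = {"C": 44, "H": 66, "N": 9, "O": 16, "S": 1}    # m/z 1008.4334
FRAG_764 = {"C": 32, "H": 46, "N": 9, "O": 11, "S": 1}  # m/z 764.3029
FRAG_548 = {"C": 24, "H": 34, "N": 7, "O": 6, "S": 1}   # m/z 548.2283

# Neutral loss from the parent ion: composition expected for a Cdua acyl unit.
cdua_loss = subtract(PARENT, FRAG_764)
cdua_loss_mass = mass(cdua_loss)
observed_loss = 1008.4334 - 764.3029

# Difference between the two fragments: composition expected for beta-Ala + Ahpa residues.
bridge = subtract(FRAG_764, FRAG_548)

print(cdua_loss, f"{cdua_loss_mass:.4f} vs observed {observed_loss:.4f}")
print(bridge)
```

Both differences come out as whole residues, which is the pattern expected when fragmentation occurs at amide bonds.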
Substrate specificity of the initiation module, FbnG.
The unusual structures of the FBNs raised questions about how these siderophores are biosynthesized. We previously identified a large portion of the FBN BGC9. Here we extended the BGC to additional ORFs to account for the biosynthesis of necessary precursors, transcriptional regulation, FBN transport across the inner and outer membranes, and the FBN biosynthetic enzymes (Table S3). At the core of FBN biosynthesis is an NRPS/PKS hybrid megasynthase (Figure 3A), and understanding the substrate specificities of these enzymes would provide insights into how these siderophores are assembled. Each of the adenylation (A) domain and acyltransferase (AT) domain-containing modules was overproduced in E. coli and purified as either an N- or C-terminal histidine-tag fusion and assayed for substrate specificity. The likely initiating module was predicted to be FbnG based on it containing an N-terminal acyl-CoA-ligase domain (AL). The purified, histidine-tagged version of this protein was assayed for fatty acid recognition using standard ATP/PPi exchange assays. FbnG was found to preferentially activate decanoate over octanoate, nonanoate, and 3-hydroxydecanoate (Figure S4A). This result was unexpected since the FBN structures suggest that a hydroxylated fatty acid would be the initial substrate. The BGC codes for a flavin-dependent monooxygenase (FMO, FbnE) that may catalyze the hydroxylation of the thioesterified decanoate; however, as discussed below, we hypothesize that FbnE catalyzes the C2 hydroxylation of thioesterified intermediates on the PKS T domains during Favorskii-like rearrangement reactions. Currently, it is not clear if FbnE is involved in modification of decanoate while tethered to FbnG or whether FbnG has a different substrate specificity in vivo.
Figure 3. Biosynthetic scheme of the NRPS and PKS components.

A) Hybrid polyketide synthase/nonribosomal peptide synthetase encoded by the fbn BGC. Substrates incorporated by each module are displayed as the numbered chemical moiety that is added to the preceding structure (left to right) and thioesterified to the protein. B) Proposed mechanism of putative iterative NRPS FbnD leading to FBN A formation. C) Proposed mechanism of capture of L-Ala thioesterified to the T domain of either FbnO or FbnP leading to FBN B formation. Domains are abbreviated as follows: AL, acyl-CoA ligase; T, thiolation; KS, ketosynthase; AT, acyltransferase; KR, ketoreductase; FMO, flavin monooxygenase; C, condensation; A, adenylation; Cy, cyclase; OX, oxidase; E, epimerase; Rd, reductase; MLP, MbtH-like protein. R is conditional upon the presence of the β-carbon (shown in parentheses as 0 or 1): if 1, R = H (FBN A); if 0, R = CH3 (FBN B).
Substrate specificities of the adenylation domains.
All the NRPS modules were insoluble when heterologously overproduced in E. coli unless they were co-overproduced with the MLP FbnK; thus, we conclude every A domain is MLP-dependent. Each of the purified modules was assayed for amino acid activation. According to the solved FBN structures, the amino acids that must be accounted for in the biosynthesis are: two glycine residues, one L-cysteine (L-Cys) residue, two (FBN A) or three L-Ala (FBN B) residues, one (FBN B) or potentially two (FBN A) β-Ala residues, and likely an L-DOPA residue (Figure 1). Interestingly, the megasynthase only contains seven NRPS modules for eight amino acids, with one of the remaining modules (FbnQ) lacking an A domain (Figure 3A).
FbnH was hypothesized to be the terminal module since it contains a C-terminal reductase (Rd) domain to reductively release the product from the megasynthase. Based on the tetrahydroquinolone termini of the FBNs, FbnH was predicted to activate L-DOPA. FbnH was assayed with all 20 proteinogenic amino acids and L-DOPA, with L-DOPA being the only amino acid that stimulated significant ATP/PPi exchange (Figure S4B); thus, FbnH is the terminal module of the megasynthase.
FbnO is a two-module NRPS subunit. Screening the A domain of the first module for amino acid activation using the standard 20 proteinogenic amino acids detected the highest activation with L-Ala, but also activation of L-Cys, Gly, and L-Ser (Figure S4C). To determine the preferred substrate, the specificity constant (Vmax/Km) was determined for each amino acid (Table S4). These data support L-Ala as the preferred substrate for the first module of FbnO. The second module of FbnO contains both a cyclization (Cy) domain and an oxidase (Ox) domain. This domain organization suggested the second module of FbnO would be involved in forming the thiazole rings of the FBNs. Consistent with this hypothesis, the A domain of this module was specific for L-Cys (Figure S4D).
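The comparison of specificity constants amounts to ranking substrates by Vmax/Km. The following minimal sketch shows that calculation; the substrate labels and kinetic values are placeholders invented purely for illustration and are not the values in Table S4.

```python
def specificity_constant(vmax, km):
    """Return Vmax/Km; units follow from the inputs (e.g. rate per concentration)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return vmax / km

def rank_substrates(params):
    """Sort substrates from highest to lowest Vmax/Km.

    params maps a substrate name to a (Vmax, Km) tuple.
    """
    scored = [(name, specificity_constant(vmax, km))
              for name, (vmax, km) in params.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# PLACEHOLDER numbers, not experimental data.
example = {
    "substrate_1": (10.0, 0.5),
    "substrate_2": (8.0, 4.0),
    "substrate_3": (2.0, 0.2),
}

ranked = rank_substrates(example)
for name, value in ranked:
    print(f"{name}: Vmax/Km = {value:.1f}")
```

Note that a substrate with a lower Vmax can still rank above one with a higher Vmax if its Km is sufficiently small, which is why the ratio rather than either parameter alone is used to identify the preferred substrate.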
FbnP has two modules, each containing an A domain. We were unable to generate a construct that enabled soluble production of the N-terminal module, but we could overproduce and purify both the full-length FbnP and the C-terminal module. The C-terminal A domain was specific for L-Ala, while the full-length FbnP activated both L-Ala and Gly (Figure S4E, S4F). The presence of a C-terminal epimerase (E) domain suggests the stereochemistry of this residue in the FBNs will be D-Ala. These data are consistent with FbnP functioning immediately after FbnO to form the central region of the FBNs.
FbnD was the only A domain-containing module remaining. Of the 20 proteinogenic amino acids tested, FbnD was found to weakly activate only Gly. The BGC codes for a homolog of aspartate α-decarboxylase (FbnA) that converts L-Asp to β-Ala, and the FBNs contain β-Ala moieties. Based on these observations, it is reasonable to propose that FbnD activates β-Ala. ATP/PPi assays using Gly, L-Ala, D-Ala, β-Ala, and β-Ala-Gly determined the A domain was specific for β-Ala activation (Figure S4G).
Based on the combined ATP/PPi assay results and our structural analysis, it was reasonable to place FbnQ, which lacks an A domain, as the module between FbnP and the terminal module, FbnH. The structures of the FBNs suggest the penultimate NRPS module would introduce Gly. Since the N-terminal module of FbnP is the only one specific for Gly, we propose that it loads Gly onto the T domain of FbnQ in trans (Figure 3A).
FbnD is an unusual NRPS.
The distinction between FBN A and B is the presence of a β-Ala or L-Ala linking the Cdua and Ahpa portions of the siderophores (Figure 1). As discussed below, the Ahpa moiety is likely to be derived from β-Ala and malonyl-CoA loaded onto FbnD and FbnN, respectively. The functions of all the other modules are accounted for, so the introduction of the linking β-Ala or L-Ala must come from an unusual mechanism. β-Ala may be introduced by an iterative function of FbnD. Briefly, β-Ala would be activated and tethered onto the T domain of FbnD, forming a β-Ala-S-FbnD intermediate. The A domain of this module would then work a second time to form a β-Ala-AMP intermediate, but since there is only a single T domain, amide bond formation would occur when the amino group of the thioesterified β-Ala attacks the activated carbonyl of the β-Ala-AMP, thereby forming β-Ala-β-Ala-S-FbnD (Figure 3B). Our proposed mechanism is analogous to streptothricin biosynthesis wherein a β-lysyl polymer is formed by an iterative module23, with the exception that two distinct A domains are used; one forms the thioesterified β-lysyl unit and a second activates the remaining β-lysyl-AMP intermediates for capture by the growing thioesterified β-lysyl polymer. Here, we propose a single A domain will perform both activities. This is a reasonable hypothesis based on the observations that a single NRPS module can have both the thioesterified amino acid and an amino acyl-AMP intermediate present simultaneously24, 25. However, at this time we cannot eliminate the possibility that a second FbnD module provides the second β-Ala in trans.
The formation of the L-Ala bridge observed in FBN B is more challenging to address. We tested three hypotheses. First, we assessed whether FbnD was able to activate L-Ala after forming a β-Ala-S-FbnD intermediate. This was tested by first forming the β-Ala-S-FbnD intermediate then assaying whether the A domain gained the ability to activate L-Ala. The results from this experiment were negative, with L-Ala activation by β-Ala-S-FbnD not being statistically different from the control reactions (Figure S5). Next, we investigated whether β-Ala-S-FbnD was able to capture L-Ala from either L-Ala-AMP or L-Ala thioesterified to a T domain from either FbnO or FbnP (Figure 3C), the two modules that activate L-Ala. To test this we formed β-Ala-S-FbnD, added [3H]-L-Ala, ATP, and either apo- or holo-FbnO or FbnP, and investigated whether [3H]-L-Ala-β-Ala-S-FbnD was formed by scintillation counting the FbnD band excised from an SDS-PAGE gel. No radiolabel was detected on β-Ala-S-FbnD in either experiment nor in control experiments where FbnO and FbnP were absent. The lack of detectable radiolabel on β-Ala-S-FbnD in the control experiments also eliminated the possibility that L-Ala was activated by FbnD at some low level and then tethered to the β-Ala-S-FbnD. This series of negative results suggests that further investigation of the relationship between FbnD, FbnO, and FbnP will be required to decipher how FBN A and B are assembled.
Substrate specificities of the PKSs FbnF and FbnN.
The PKSs were incubated with [14C-C2]-malonyl-CoA or [14C-C2]-(R,S)-methylmalonyl-CoA and subjected to SDS-PAGE followed by phosphorimaging to check for [14C]-labeling. Both FbnF and FbnN were radiolabeled when incubated with [14C-C2]-malonyl-CoA but not when they were incubated with [14C-C2]-(R,S)-methylmalonyl-CoA, indicating that malonyl-CoA is the preferred substrate of both (Figure S6).
Given the gene organization in the BGC, we predict FbnF would function after FbnG and FbnN would function after FbnD (Figure 3A). However, it is unclear how the available enzymology would account for the quaternary carbons, C-34 and C-44, or the lack of hydroxyl groups on C-35 and C-45, which would be expected due to the presence of ketoreductase (KR) domains in FbnN and FbnF. Additionally, assuming canonical NRPS-PKS enzymology, the number of backbone carbons in both Cdua and Ahpa, assembled in part by the PKSs, is one less than expected.
FbnE is a putative flavin-dependent monooxygenase.
Taken together, the aforementioned data suggest the FBNs undergo two Favorskii-like rearrangements to excise two carbons, C-37 and C-54, from the backbone as carboxylic acids. We propose in each case, the rearrangement is catalyzed by the putative flavin-dependent monooxygenase (FMO), FbnE, stimulated by a double oxidation of the α-carbons C-34 and C-44 from the acyl groups introduced by FbnF and FbnN (Figure 4A, 4B). Currently, it is unclear what role the KR domains play since both the solved FBN structures and our proposed mechanism for backbone contraction by Favorskii-like rearrangements are incompatible with established KR function. To reveal the origins of the excised carbons, we attempted 13C-acetate feeding studies with C-1, C-2, and both carbons labeled; however, the addition of acetate to the growth media dramatically decreased FBN production and, as a result, the degree of 13C incorporation was too low to be interpretable. Repeated attempts to find the appropriate time for 13C-acetate addition were unsuccessful, with disruption of FBN production occurring whenever acetate was added to the growth medium.
Figure 4. Proposed Favorskii rearrangement mechanisms of FBNs.

Favorskii rearrangements are catalyzed by flavin monooxygenase FbnE when intermediates are thioesterified to FbnF (A) or FbnN (B). R1 is conditional upon the presence of the β-carbon (shown in parentheses as 0 or 1): if 1, R1 = H (FBN A); if 0, R1 = CH3 (FBN B).
Favorskii-like rearrangements are commonly observed in dinoflagellate natural products, such as the brevetoxins, and structural evidence of similar enzymology is seen in fungal aspyrones26, 27. In bacteria, there are two known FMOs involved in Favorskii-like rearrangements of polyketides, EncM and AmbI from enterocin and ambruticin biosynthesis, respectively28–30. Additionally, the structure of myxoprincomide from Myxococcus xanthus DK1622 suggested to us the FMO domain associated with MXAN_3779 is likely to catalyze a Favorskii-like rearrangement31.
FbnE is a homolog of the MXAN_3779 FMO domain (43% identical), yet it is not homologous to either EncM or AmbI, which in turn are also not homologous to each other. One key difference in the reactions catalyzed by these enzymes is the fate of the excised backbone carbon. In enterocin, it is retained via intra-molecular esterification, while in ambruticin and myxoprincomide, it is lost via decarboxylation. Based on the FBN structures, we propose the reactions catalyzed by FbnE retain the backbone carbon as a carboxylic acid thereby creating the unusual quaternary carbons C-34 and C-44 that are likely to be essential for iron chelation (Figure 4A, 4B).
Formation of L-DOPA.
We established FbnH is selective for L-DOPA (Figure S4B). We hypothesized that L-DOPA is made from L-Tyr by FbnM prior to loading onto the megasynthase. FbnM is weakly homologous to SfmD (28% identical), a member of a recently discovered group of heme-containing peroxidases that catalyze the conversion of 3-methyltyrosine to 3-hydroxy-5-methyltyrosine in the tetrahydroisoquinoline family of natural products32. Additionally, FbnM contains the HXXXC motif found in SfmD thought to be required for heme binding. Initial attempts to overproduce soluble FbnM in E. coli were unsuccessful. We noticed FbnM shows homology to only the C-terminal half of SfmD. We hypothesized that the gene immediately upstream of fbnM may code for a protein that complexes with FbnM based on its genomic location and our finding of FbnL/FbnM homolog fusions (e.g., WP_096621997.1 from Microchaete diplosiphon).
FbnL and FbnM were found to be soluble when the encoding genes were co-expressed and the proteins co-purified (Figure 5A). The UV-Vis spectrum of the FbnL/M pair is consistent with those of other heme-containing enzymes (Figure 5B)33. Additionally, the presence of both genes on a single plasmid resulted in production of a brown pigment that was not present when the genes were expressed independently (Figure S7). Connor et al. reported dark brown pigmentation when overproducing Orf13, an unrelated heme peroxidase confirmed to catalyze the ortho-hydroxylation of L-Tyr to L-DOPA34. It is the subsequent polymerization of L-DOPA, through a dopaquinone intermediate, that produces the pigment35. Thus, it is reasonable that one role of FbnL/M is to hydroxylate L-Tyr, using H2O2 as the oxidant, thereby forming L-DOPA, which is loaded onto the NRPS FbnH.
Figure 5. FbnL/M is a heme peroxidase that converts L-Tyr to L-DOPA.

A) SDS-PAGE/Coomassie Blue staining of co-overproduced and Ni-NTA-purified FbnL/M. Imidazole concentrations used for protein elution are noted above each lane. B) UV-vis spectrum of 10 μg of purified FbnL/M complex in 120 μL of buffer with (solid line) or without (dashed line) the addition of 100 μM DTT. The Soret band typical of heme B shifts from 404 nm to 420 nm, signifying the ferric to ferrous iron conversion. C) Detection of L-DOPA in culture supernatants. Samples were induced with 100 μM IPTG where indicated. The averages of three independent reactions with standard deviations are shown. D) Three representative HPLC chromatograms of reactions (top to bottom): no hydrogen peroxide added, L-DOPA added rather than L-Tyr, and L-Tyr plus hydrogen peroxide added.
We tested this hypothesis using in vivo and in vitro approaches. We employed a colorimetric assay based on an L-DOPA dioxygenase from Mirabilis jalapa (mjDOD)36 to determine whether fbnLM-expressing E. coli were exporting excess L-DOPA. MjDOD converts L-DOPA to betalamic acid, which spontaneously reacts with amino acids to form yellow pigmented betaxanthins. A greater betaxanthin signal was detected when E. coli BL21(DE3) contained the fbnLM overexpression vector compared to the empty vector control (Figure 5C). The L-DOPA concentration also increased when fbnLM expression was induced with IPTG. We also detected L-DOPA formation from L-Tyr by purified FbnL/M using HPLC (Figure 5D). The low level of turnover detected is not unexpected based on the well-established understanding of how heme peroxidases self-inactivate during catalysis with H2O237. The detection of L-DOPA in these experiments confirms the heme peroxidase FbnL/M can convert L-Tyr to L-DOPA and is the likely source of L-DOPA for incorporation into the FBNs by FbnH.
Proposed biosynthetic mechanism of the Dmaq heterocycle.
To form the Dmaq moiety, FbnL/M hydroxylates L-Tyr (1) forming L-DOPA (2). The L-DOPA is loaded onto the NRPS FbnH (3) and condensed with the upstream backbone of FBN (4) donated by FbnQ (Figure 6). To set up the aza-annulation, the FBN backbone must be reductively released from FbnH to form the semi-aldehyde (5) by the reductase domain of FbnH. An amino group is then transferred to the carbonyl carbon by FbnJ, a homolog of glutamate-1-semialdehyde amino transferases (HemL superfamily, COG0001) to form 6. It has been proposed that Dmaq formation in the anachelins occurs similarly to cyclo-DOPA, i.e. through a tyrosinase-catalyzed oxidation of either L-Tyr or L-DOPA to dopaquinone, which enables the aza-annulation38. However, a tyrosinase is not encoded in the genome of A. fabrum C58, thus the ring closure to form FBN must be accomplished by other means. We propose the FbnL/M pair performs a second function by catalyzing the oxidation of the catechol to a quinone (7). In the following non-enzymatic steps, as have been previously proposed38, an intra-molecular Michael addition yields 8, which tautomerizes resulting in the aza-annulation product (9). This is followed by double N-methylation by FbnI (pfam 13649, class I SAM-dependent methyltransferase) to form the mature Dmaq moiety (10), although it is possible this step precedes quinone formation (Figure 6).
Figure 6. Proposed mechanism for the biosynthesis of the Dmaq moiety.

Fabrubactins and anachelins belong to a large siderophore family.
The Dmaq moiety has only been observed in the anachelins and the FBNs. The biosynthetic formation of Dmaq is defined by the presence of FbnH-M homologs, with the occasional absence of an FbnK homolog (the MbtH-like protein). The organisms encoding homologs of this proposed Dmaq biosynthetic “cassette” are widespread in the Rhizobium/Agrobacterium group, α-proteobacteria, and cyanobacteria (Figure 7). To identify BGCs likely to be involved in the production of the Dmaq family of siderophores, we focused our initial attention on finding FbnL and FbnM homologs since these are the most distinct proteins in FBN biosynthesis. We also searched for FbnH-K homologs, NRPS and PKS enzymology, and proteins involved in siderophore transport (e.g. TonB, outer-membrane siderophore transporters). We did not fully analyze partial BGCs from incompletely assembled (meta)genomes; their inclusion would at least double the number of BGCs of interest.
Figure 7. BGCs for the production of Dmaq-containing siderophores.

Schematic representations of the BGCs proposed to be involved in the production of Dmaq-containing siderophores. Arrow colors represent genes coding for the following: Green (NRPS or PKS enzymology); Blue (Favorskii-like rearrangement FMOs); Purple (L-DOPA-recognizing NRPS involved in Dmaq formation); Purple with Red asterisk (L-DOPA-recognizing NRPS with additional domains prior to reductase domain); Red (Dmaq-forming enzymes); Orange (ORFs coding for other functions). a, representative of BGCs in A. fabrum (strains 12D13, 1D132, ATCC 31749, ARqua1, and Arqua), A. tumefaciens (strains 15955, 15-174, CFBP5877, CFBP5499, LBA4404, F4), and Rhizobium species (NFIX01, NFIX02, UGM030330-04); b, representative of BGCs in A. vitis (strains SZ1, AB3, AB4, 15-172, 1D1609), A. tumefaciens strains 15-174 and 1D1609, and R. rhizosphaerae MH17. Identifications of the locus tags for the 5’ and 3’ genes for each BGC are listed in Table S5.
From this analysis we identified 40 BGCs, in three groups, that we predict to be involved in the production of a Dmaq-containing siderophore (Figure 7, Table S5). The largest group (Group I in Figure 7) includes the FBN BGC, many members of the Rhizobium/Agrobacterium group, and other α-proteobacteria. Based on the similarity of these BGCs, we anticipate all will produce FBN analogs. Furthermore, homology between NRPS components provided clear indications of where insertions or deletions of modules have occurred. Group II includes three cyanobacteria BGCs, one of which is the putative anachelin BGC from Anabaena cylindrica PCC712211. These do not show the same similarity in NRPS/PKS components shared in Group I and only Nostoc sp. ‘Lobaria pulmonaria’ codes for a homolog of FbnE; thus, these will likely share the Dmaq moiety, but will be structurally distinct from each other. Group III BGCs have the necessary components for generating a Dmaq moiety, but the NRPSs associated with L-DOPA activation and reductive release of the siderophore backbone are split onto two distinct proteins. Interestingly, five of the six members of this group have a significant modification to the NRPS component for Dmaq formation. These five have the reductase domain on a separate polypeptide from the C-A-T domains, and these proteins contain an N-terminal non-heme β-hydroxylase domain (CmlA_N, pfam18456), a central UlaG-like domain (COG2220), and a TubC-N-terminal NRPS docking domain (pfam18563). This domain organization suggests the Dmaq moiety itself will be structurally distinct, potentially with a β-hydroxylated L-DOPA incorporated, resulting in a hydroxyl group on the C4 of the Dmaq moiety. The sixth member of this Dmaq siderophore Group III is found in a Planctomycetaceae isolate and while it has a split FbnH homolog, it lacks the additional domains seen in the other five members of the group and likely will form a standard Dmaq moiety.
CONCLUSIONS
We have solved the structures of two siderophores from A. fabrum C58, FBN A and B, and characterized the specificities of the core enzymes of the NRPS/PKS megasynthase previously linked to the production of these molecules9. Taken together, our data suggest the involvement of unusual enzymology in the assembly of the FBNs. We identified an unusual NRPS module, FbnD, that first activates and tethers β-Ala to its T domain, and then captures a second molecule of β-Ala from itself (FBN A) or L-Ala from some other source (FBN B) to form the bridge between the CduA and Ahpa components of the FBNs. We determined that the FBN siderophores contain an unusual C-terminal Dmaq moiety previously seen only in the anachelin siderophores16, 17. We have provided biochemical evidence that a two-protein heme peroxidase (FbnL/M) catalyzes the conversion of L-Tyr to L-DOPA, and we propose that this protein complex also catalyzes the formation of a dopaquinone intermediate that enables aza-annulation to form one of the rings of Dmaq. Due to its high affinity for iron, Dmaq, when synthetically conjugated to a polyethylene glycol, has been shown to bind surface-adhered metal oxides and display protease-resistant properties39. This property could be exploited to attach a desired natural product to a metal-coated surface by using a combinatorial biosynthesis approach to selectively add Dmaq to the molecule. A better understanding of the Dmaq-producing suite of biosynthetic enzymes is needed to facilitate such combinatorial approaches.
The structures of the FBNs suggest the involvement of a Favorskii-like rearrangement-catalyzing FMO, FbnE, distinct from the other two known bacterial FMOs that carry out similar chemistry28, 29. Favorskii rearrangements in bacterial NRPS- and PKS-made natural products are rare and little is known about the substrate selectivity of the FMOs. A deeper understanding of these enzymes could allow for unprecedented backbone-carbon manipulations of nonribosomal peptides or polyketides. We propose FbnE is responsible for the formation of the two quaternary carbons, C-34 and C-44 (Figure 4), that make the FBNs unique among bacterial polyketide-associated natural products. We have identified related hybrid NRPS/PKS-encoding BGCs in multiple rhizobacterial species that we propose also make FBN analogs containing the Dmaq moiety and quaternary backbone carbons (Figure 7). Finally, we provide bioinformatic evidence that the enzymology for Dmaq formation is found in many bacteria, suggesting there are still many members of this siderophore family yet to be discovered.
METHODS
Details of Experimental Methods are provided in the Supporting Information.
ACKNOWLEDGEMENTS
This work was supported by the National Science Foundation (NSF 1716594 to M.G.T.) and the National Institutes of Health (GM100346 and AI065850 to M.G.T.). V.V. was supported in part by the Jerome J. Stefaniak Predoctoral Fellowship. F.Z. was supported by the National Institutes of Health grants R01GM104192, U19AI109673 and U19AI142720. We thank E. Schmidt (University of Utah) for insightful discussions regarding the potential Favorskii-like rearrangements catalyzed by the FMO FbnE. We thank P. Romero and M. Politz (UW-Madison) for discussions and conditions for the colorimetric L-DOPA assay. We thank A. Podevels for technical assistance in the cloning of fbnD and fbnG. The NMR study made use of the National Magnetic Resonance Facility at Madison, which is supported by NIH grant P41GM103399 (NIGMS), old number: P41RR002301. Equipment was purchased with funds from the University of Wisconsin-Madison, the NIH (P41GM103399, S10RR02781, S10RR08438, S10RR023438, S10RR025062, S10RR029220), the NSF (DMB-8415048, OIA-9977486, BIR-9214394), and the USDA. The authors thank the Analytical Instrumentation Center (AIC) at the University of Wisconsin-Madison School of Pharmacy for the facilities to acquire spectroscopic data, especially HRESIMS data. The authors would like to acknowledge the UW-Madison Human Proteomics Program Mass Spectrometry Facility (initially funded by the Wisconsin partnership funds) for support in obtaining mass spectrometry data (FT-ICR MS) and NIH S10OD018475 for the acquisition of the ultra-high resolution mass spectrometer.
Footnotes
SUPPORTING INFORMATION
The Supporting Information is available free of charge at http://pubs.acs.org. The contents of the Supporting Information include detailed experimental procedures, MS analyses, results from A and AT domain biochemical assays, NMR tables, annotation of the ORFs in the fbn gene cluster, related siderophore biosynthetic gene cluster locus tags, and lists of primers, strains, and plasmids used in this study.
DECLARATION OF INTERESTS
The authors declare no competing interests.
References
- [1] Crosa JH, and Walsh CT (2002) Genetics and assembly line enzymology of siderophore biosynthesis in bacteria, Microbiol. Mol. Biol. Rev 66, 223–249.
- [2] Neilands JB (1995) Siderophores: structure and function of microbial iron transport compounds, J. Biol. Chem 270, 26723–26726.
- [3] Winkelmann G (2002) Microbial siderophore-mediated transport, Biochem. Soc. Trans 30, 691–696.
- [4] Winkelmann G (1991) CRC Handbook of Microbial Iron Chelates, CRC Press, Boca Raton, FL.
- [5] Meyer JM (2000) Pyoverdines: pigments, siderophores and potential taxonomic markers of fluorescent Pseudomonas species, Arch. Microbiol 174, 135–142.
- [6] Meyer JM, Gruffaz C, Raharinosy V, Bezverbnaya I, Schäfer M, and Budzikiewicz H (2008) Siderotyping of fluorescent Pseudomonas: molecular mass determination by mass spectrometry as a powerful pyoverdine siderotyping method, Biometals 21, 259–271.
- [7] Young JM, Kuykendall LD, Martínez-Romero E, Kerr A, and Sawada H (2001) A revision of Rhizobium Frank 1889, with an emended description of the genus, and the inclusion of all species of Agrobacterium Conn 1942 and Allorhizobium undicola de Lajudie et al. 1998 as new combinations: Rhizobium radiobacter, R. rhizogenes, R. rubi, R. undicola and R. vitis, Int. J. Syst. Evol. Microbiol 51, 89–103.
- [8] Young JM, Kuykendall LD, Martinez-Romero E, Kerr A, and Sawada H (2003) Classification and nomenclature of Agrobacterium and Rhizobium – a reply to Farrand et al., Int. J. Syst. Evol. Microbiol 53, 1689–1695.
- [9] Rondon MR, Ballering KS, and Thomas MG (2004) Identification and analysis of a siderophore biosynthetic gene cluster from Agrobacterium tumefaciens C58, Microbiology 150, 3857–3866.
- [10] Jeanjean R, Talla E, Latifi A, Havaux M, Janicki A, and Zhang CC (2008) A large gene cluster encoding peptide synthetases and polyketide synthases is involved in production of siderophores and oxidative stress response in the cyanobacterium Anabaena sp. strain PCC 7120, Environ. Microbiol 10, 2574–2585.
- [11] Calteau A, Fewer DP, Latifi A, Coursin T, Laurent T, Jokela J, Kerfeld CA, Sivonen K, Piel J, and Gugger M (2014) Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria, BMC Genomics 15, 977.
- [12] Felnagle EA, Barkei JJ, Park H, Podevels AM, McMahon MD, Drott DW, and Thomas MG (2010) MbtH-like proteins as integral components of bacterial nonribosomal peptide synthetases, Biochemistry 49, 8815–8817.
- [13] Imker HJ, Krahn D, Clerc J, Kaiser M, and Walsh CT (2010) N-acylation during glidobactin biosynthesis by the tridomain nonribosomal peptide synthetase module GlbF, Chem. Biol 17, 1077–1083.
- [14] Zhang W, Heemstra JR, Walsh CT, and Imker HJ (2010) Activation of the pacidamycin PacL adenylation domain by MbtH-like proteins, Biochemistry 49, 9946–9947.
- [15] Baltz RH (2011) Function of MbtH homologs in nonribosomal peptide biosynthesis and applications in secondary metabolite discovery, J. Ind. Microbiol. Biotechnol 38, 1747–1760.
- [16] Beiderbeck H, Taraz K, Budzikiewicz H, and Walsby AE (2000) Anachelin, the siderophore of the cyanobacterium Anabaena cylindrica CCAP 1403/2A, Z. Naturforsch. C 55, 681–687.
- [17] Itou Y, Okada S, and Murakami M (2001) Two structural isomeric siderophores from the freshwater cyanobacterium Anabaena cylindrica (NIES-19), Tetrahedron 57, 9093–9099.
- [18] Alexander DB, and Zuberer DA (1991) Use of the chrome azurol S reagents to evaluate siderophore production by rhizosphere bacteria, Biol. Fertil. Soils 12, 39–45.
- [19] Schwyn B, and Neilands JB (1987) Universal chemical assay for the detection and determination of siderophores, Anal. Biochem 160, 47–56.
- [20] Dalisay DS, Rogers EW, Edison AS, and Molinski TF (2009) Structure elucidation at the nanomole scale. 1. Trisoxazole macrolides and thiazole-containing cyclic peptides from the nudibranch Hexabranchus sanguineus, J. Nat. Prod 72, 732–738.
- [21] McDonald LA, Barbieri LR, Carter GT, Kruppa G, Feng X, Lotvin JA, and Siegel MM (2003) FTMS structure elucidation of natural products: Application to muraymycin antibiotics using ESI Multi-CHEF SORI-CID FTMSn, the top-down/bottom-up approach, and HPLC ESI capillary-skimmer CID FTMS, Anal. Chem 75, 2730–2739.
- [22] Laird DW, LaBarbera DV, Feng X, Bugni TS, Harper MK, and Ireland CM (2007) Halogenated cyclic peptides isolated from the sponge Corticium sp., J. Nat. Prod 70, 741–746.
- [23] Maruyama C, Toyoda J, Kato Y, Izumikawa M, Takagi M, Shin-ya K, Katano H, Utagawa T, and Hamano Y (2012) A stand-alone adenylation domain forms amide bonds in streptothricin biosynthesis, Nat. Chem. Biol 8, 791–797.
- [24] Gevers W, Kleinkauf H, and Lipmann F (1969) Peptidyl transfers in gramicidin S biosynthesis from enzyme-bound thioester intermediates, Proc. Natl. Acad. Sci. USA 63, 1335–1342.
- [25] Kleinkauf H, Gevers W, and Lipmann F (1969) Interrelation between activation and polymerization in gramicidin S biosynthesis, Proc. Natl. Acad. Sci. USA 62, 226–233.
- [26] Simpson TJ, and Holker JSE (1975) The biosynthesis of a pyrone metabolite of Aspergillus melleus an application of long-range 13C-13C coupling constants, Tetrahedron Letters 16, 4693–4696.
- [27] Lee MS, Repeta DJ, Nakanishi K, and Zagorski MG (1986) Biosynthetic origins and assignments of carbon 13 NMR peaks of brevetoxin B, J. Am. Chem. Soc 108, 7855–7856.
- [28] Julien B, Tian ZQ, Reid R, and Reeves CD (2006) Analysis of the ambruticin and jerangolid gene clusters of Sorangium cellulosum reveals unusual mechanisms of polyketide biosynthesis, Chem. Biol 13, 1277–1286.
- [29] Teufel R, Miyanaga A, Michaudel Q, Stull F, Louie G, Noel JP, Baran PS, Palfey B, and Moore BS (2013) Flavin-mediated dual oxidation controls an enzymatic Favorskii-type rearrangement, Nature 503, 552–556.
- [30] Piel J, Hoang K, and Moore B (2000) Natural metabolic diversity encoded by the enterocin biosynthesis gene cluster, J. Am. Chem. Soc 122, 5415–5416.
- [31] Cortina NS, Krug D, Plaza A, Revermann O, and Müller R (2012) Myxoprincomide: a natural product from Myxococcus xanthus discovered by comprehensive analysis of the secondary metabolome, Angew. Chem. Int. Ed. Engl 51, 811–816.
- [32] Tang MC, Fu CY, and Tang GL (2012) Characterization of SfmD as a Heme peroxidase that catalyzes the regioselective hydroxylation of 3-methyltyrosine to 3-hydroxy-5-methyltyrosine in saframycin A biosynthesis, J. Biol. Chem 287, 5112–5121.
- [33] Sugano Y (2009) DyP-type peroxidases comprise a novel heme peroxidase family, Cell. Mol. Life Sci 66, 1387–1403.
- [34] Connor KL, Colabroy KL, and Gerratana B (2011) A heme peroxidase with a functional role as an L-tyrosine hydroxylase in the biosynthesis of anthramycin, Biochemistry 50, 8926–8936.
- [35] Morris F (1950) Non-enzymatic oxidation of tyrosine and dopa, Proc. Natl. Acad. Sci. USA 36, 606–611.
- [36] Savitskaya J, Protzko RJ, Li FZ, Arkin AP, and Dueber JE (2019) Iterative screening methodology enables isolation of strains with improved properties for a FACS-based screen and increased L-DOPA production, Sci. Rep 9, 5815.
- [37] Valderrama B, Ayala M, and Vazquez-Duhalt R (2002) Suicide inactivation of peroxidases and the challenge of engineering more robust enzymes, Chem. Biol 9, 555–565.
- [38] Gademann K (2005) Mechanistic studies on the tyrosinase-catalyzed formation of the anachelin chromophore, ChemBioChem 6, 913–919.
- [39] Zurcher S, Wackerlin D, Bethuel Y, Malisova B, Textor M, Tosatti S, and Gademann K (2006) Biomimetic surface modifications based on the cyanobacterial iron chelator anachelin, J. Am. Chem. Soc 128, 1064–1065.
