Applied and Environmental Microbiology. 2011 Nov;77(22):8034–8040. doi: 10.1128/AEM.05993-11

Nostophycin Biosynthesis Is Directed by a Hybrid Polyketide Synthase-Nonribosomal Peptide Synthetase in the Toxic Cyanobacterium Nostoc sp. Strain 152

David P Fewer 1, Julia Österholm 1, Leo Rouhiainen 1, Jouni Jokela 1, Matti Wahlsten 1, Kaarina Sivonen 1,*
PMCID: PMC3208980  PMID: 21948844

Abstract

Cyanobacteria are a rich source of natural products with interesting pharmaceutical properties. Here, we report the identification, sequencing, annotation, and biochemical analysis of the nostophycin (npn) biosynthetic gene cluster. The npn gene cluster spans 45.1 kb and consists of three open reading frames encoding a polyketide synthase, a mixed polyketide nonribosomal peptide synthetase, and a nonribosomal peptide synthetase. The genetic architecture and catalytic domain organization of the proteins are colinear with the putative order of the biosynthetic assembly of the cyclic heptapeptide. NpnB contains an embedded monooxygenase domain that links nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) catalytic domains and is predicted here to hydroxylate nostophycin during assembly. Expression of the adenylation domains and subsequent substrate specificity assays support the involvement of this cluster in nostophycin biosynthesis. Biochemical analyses suggest that the loading substrate of NpnA is likely to be a phenylpropanoic acid, necessitating deletion of a carbon atom to explain the biosynthesis of nostophycin. The biosyntheses of nostophycin and microcystin resemble each other, but the phylogenetic analyses suggest that the two pathways are only distantly related.

INTRODUCTION

Cyanobacteria are a prolific source of natural products with interesting biological activities and potential as drug leads (2, 25, 27). Many of the bioactive compounds have a peptidic portion and contain a diverse array of proteinogenic and nonproteinogenic amino acids (27). They can be linear or cyclic, with versatile chemical structures, and are often heavily modified (2, 27). Common modifications are epimerization and N-methylation but may include other derivatizations, such as glycosylation, N-formylation, halogenation, or heterocyclization (19, 27). Many are the end products of nonribosomal peptide synthetases (NRPS), which can work together with polyketide synthases (PKS) to yield complex biosynthetic pathways capable of producing a range of novel peptide metabolites with exotic chemistries.

Nostophycin is comprised of seven amino acids that form a ring structure (Fig. 1). The novel β-amino acid (2S,3R,5R)-3-amino-2,5-dihydroxy-8-phenyloctanoic acid (Ahoa) is found only in nostophycin (9). Nostophycin contains a rare stereoisomer of isoleucine and has been reported from a single strain of the genus Nostoc (9). Nostophycin showed weak cytotoxic activity against lymphocytic mouse leukemia (9). Nostoc sp. strain 152 also produces microcystins, which are potent inhibitors of eukaryotic protein phosphatases 1 and 2A. Nostophycin and microcystins are both cyclic heptapeptides and share similar β-amino acids and stereochemistry of some amino acids and are suggested to be related to one another (9). However, the biosynthetic origins of nostophycin are unknown.

Fig. 1.


The chemical structure of nostophycin A produced by Nostoc sp. strain 152. The hydroxyl on the α-carbon of the Ahoa β-amino acid is highlighted in gray.

Hybrid NRPSs and PKSs form modular enzyme assembly lines that synthesize an array of complex molecules, including many important pharmaceuticals. The chemical structure of nostophycin suggests that it is the product of a mixed PKS-NRPS pathway. Interestingly, PKS enzymes typically involve β-carbonyl-processing catalytic domains, while nostophycin contains an α-carbon hydroxyl group on the C-13 carbon (9). Here, we show that nostophycin is synthesized on a large multienzyme mixed nonribosomal peptide synthetase and polyketide synthase complex which includes rare monooxygenase functionality. The structural similarities between microcystins and nostophycin reported in the literature (9) are the result of analogous loading mechanisms and similarities in the organization of the catalytic domains in their respective gene clusters.

MATERIALS AND METHODS

Genomic DNA.

High-molecular-weight genomic DNA was prepared from a 21-day-old culture of Nostoc sp. strain 152 growing in Z8 medium lacking a source of combined nitrogen under continuous light with a photon irradiance of 5 to 12 μmol m−2 s−1 at 20 to 25°C. Filaments were collected by centrifugation at 8,000 × g for 15 min and washed with 50 mM Tris (pH 8.0) and 100 mM EDTA (pH 8.0). The filaments were resuspended in 10 ml of the same buffer containing lysozyme (10 mg ml−1) and incubated at 37°C for 1 h. Proteinase K and sodium sarcosyl were added to 200 μg ml−1 and 0.5%, respectively. The mixture was incubated at 50°C for 90 min. The sample of lysed cells was adjusted to 100 mM NaCl and then gently extracted sequentially with phenol and chloroform according to standard procedures. The DNA was precipitated with 2 volumes of 95% ethanol, centrifuged for 15 min at 12,000 × g, washed with 70% ethanol, dried, and dissolved in TE (10 mM Tris and 1 mM EDTA, pH 8.0). RNase (20 μg μl−1) was added to the solution and incubated at 37°C for 30 min. NaCl was added to a 0.1 M concentration, and the extraction with phenol and chloroform was repeated followed with ethanol precipitation and washing as previously described. The DNA pellet was left to dry and then dissolved in TE.

A domain library.

The nostophycin gene cluster was identified by screening for the presence of specific NRPS catalytic domains. Adenylation (A) domains were amplified using degenerate PCR primers based on the A2 (KAGGAY) and A10 (NGKID) A domain conserved motifs (16). The COF-1 (5′-AACTCGAGAARGCWGGIGGNGCNTA-3′) and COR-1 (5′-CAGGATCCTCDATYTTNCCRTT-3′) primer pair was used to amplify approximately 1.2-kb fragments by PCR. Reaction conditions for PCR were as follows: one cycle of 95°C for 2 min, 55°C for 40 s, and 72°C for 1 min; 25 cycles of 95°C for 30 s, 57°C for 30 s, and 72°C for 1 min; followed by a final extension step at 72°C for 5 min. DyNAzyme II polymerase (Finnzymes; 0.2 U) was used in a 20-μl reaction volume with buffer (10 mM Tris-HCl [pH 8.8], 1.5 mM MgCl2, 50 mM KCl, and 0.1% Triton X-100), 0.2 mM each nucleotide, 0.5 mM primers, and 10 to 50 ng of DNA. The resulting PCR products were purified with the PCR purification kit (Edge Bio Systems), digested with XhoI and BamHI restriction enzymes (Promega), and ligated to pBluescript SK+ vector opened with the same enzymes. The constructed clone library was transformed into Escherichia coli DH5α. Six clones were sequenced using the BigDye Terminator cycle sequencing kit (Applied Biosystems) and the ABI 310 Genetic Analyzer and yielded three distinct A domain sequences. One of the three A domains was predicted to activate l-Leu, which is found in microcystins produced by this strain (23, 24) but not in nostophycin (9). The remaining A domains were predicted to activate l-Gln and l-Phe, which are both found in nostophycin but neither of which is found in microcystin (9). We designed primers specific for the A domain predicted to recognize and activate l-Phe.
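The degeneracy of the two primers can be worked out directly from their IUPAC codes. The sketch below is illustrative only: it assumes that the "I" in COF-1 denotes inosine and counts it as a single, non-degenerate base, and the function name is ours rather than part of any published tool.

```python
from math import prod

# IUPAC nucleotide codes. "I" (assumed here to be inosine) is counted as one base,
# because it is a single synthesized nucleotide rather than a mixture.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "I": "I",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

COF_1 = "AACTCGAGAARGCWGGIGGNGCNTA"  # forward primer, from the text
COR_1 = "CAGGATCCTCDATYTTNCCRTT"     # reverse primer, from the text


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


if __name__ == "__main__":
    for name, seq in (("COF-1", COF_1), ("COR-1", COR_1)):
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

The 5′ ends of the primers carry the XhoI (CTCGAG) and BamHI (GGATCC) recognition sites, which is consistent with the restriction digest used for cloning.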

Fosmid library.

The high-molecular-weight DNA was sheared using a 22-gauge syringe, size selected (∼40 kb), end repaired, and ligated into the copy control fosmid vector pCC1FOS according to instructions provided with the CopyControl fosmid library production kit (Epicentre, Madison, WI). The titer from this method was very low, and the entire packaging extract was used to generate a 2,200-clone library. Fosmid DNA isolation was carried out using commercially available kits (Qiagen, Valencia, CA). DNA manipulations, such as restriction digests, ligations, and transformations, were carried out using standard methods.

We used a three-step PCR screening system to identify the nostophycin gene cluster in the fosmid library. In the first step, 500 fosmid clones on one plate served as the unit. Fosmid DNA was extracted from 4 pools of 500 clones and used as a template in PCR using primers specific for the l-Phe-activating A domain, LEU1F (5′-TCGGGTCAGGAGCAACAC-3′) and LEU1R (5′-CCCGTCTGGCAAATAACG-3′). The PCR was performed for 30 cycles consisting of 94°C for 30 s, 56°C for 30 s, and 72°C for 60 s, with a final extension at 72°C for 10 min. Pools positive for fosmids bearing the l-Phe-activating A domain gene were plated and used to construct 5 pools of 100 clones. Fosmid DNA was extracted and used as a template in PCR using the l-Phe-activating A domain primers. Pools positive for fosmids bearing the l-Phe-activating A domain were plated and used to construct a third set of pools of 10 fosmid clones. Pools positive for the l-Phe-activating A domain were plated and screened directly by colony PCR to identify fosmid clones bearing the l-Phe-activating A domain. These fosmid clones were end sequenced, and two overlapping fosmid clones were identified and targeted for shotgun sequencing.
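The logic of the pooled screen can be sketched as a small simulation. The example below is hypothetical: the 2,000-clone library, the positive clone numbers, and the `is_positive` callback (a stand-in for a PCR result) are assumptions made for illustration, not data from this study.

```python
def chunk(items, size):
    """Split a list into consecutive pools of at most `size` clones."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def screen(clones, is_positive, pool_sizes=(500, 100, 10)):
    """Hierarchical pool screening followed by colony PCR on single clones.

    Returns the positive clones and the number of PCRs that were needed.
    """
    reactions = 0
    candidates = [list(clones)]
    for size in pool_sizes:
        positive_pools = []
        for group in candidates:
            for pool in chunk(group, size):
                reactions += 1  # one PCR per pooled DNA preparation
                if any(is_positive(clone) for clone in pool):
                    positive_pools.append(pool)
        candidates = positive_pools
    hits = []
    for pool in candidates:
        for clone in pool:
            reactions += 1  # colony PCR on each remaining clone
            if is_positive(clone):
                hits.append(clone)
    return hits, reactions


if __name__ == "__main__":
    library = list(range(2000))   # hypothetical clone numbers
    positives = {137, 1420}       # hypothetical clones carrying the target A domain
    hits, reactions = screen(library, lambda clone: clone in positives)
    print(hits, reactions)
```

With two positive clones in separate first-round pools, this scheme needs a few dozen PCRs, compared with 2,000 reactions if every clone were tested individually.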

Gene cluster annotation.

Protein-encoding open reading frames (ORFs) were predicted using Glimmer 2.0 (5). The predicted proteins were used in a BLAST search (1) to assign putative functions. Catalytic domain prediction was carried out using conserved motifs characteristic of NRPS and PKS proteins (16) in combination with BLAST searches and the CDD database. Prediction of A domain substrate specificity was carried out using the NRPS predictor (18).

Cloning, overexpression, and the substrate specificity of the A domains.

Nostoc sp. strain 152 proved intractable to genetic manipulation, and we undertook biochemical analysis of A domain substrate specificity in order to verify that the 45.1-kb npn gene cluster directs nostophycin biosynthesis. Three A domains of the putative nostophycin gene cluster were amplified by PCR as previously described (8) using primers listed in the supplemental material. The annealing temperature was modified to 58°C and the extension temperature to 72°C. The PCR products were purified using the QuickStep 2 PCR purification kit (Edge Biosystems), ligated to pET101/D-TOPO vector (Invitrogen), and transformed into Escherichia coli TOP10 according to the manufacturer's instructions. The correct orientation and the end sequences of the inserts were verified by sequencing with the BigDye Terminator cycle sequencing kit and ABI 310 DNA autosequencer (Applied Biosystems). The E. coli BL21 Star (DE3) strain (Invitrogen) was used in the overexpression of A domains, and the expressed proteins were purified as previously described (8, 19). Soluble protein could be generated when cells were grown at 37°C for 4.5 h (NpnC3) or at 24°C for 16 to 18 h (NpnA1, NpnC2) with IPTG (isopropyl-β-d-thiogalactopyranoside) induction. The ATP-pyrophosphate (PPi) exchange assay was performed as previously described (20).

Gene cluster organization.

Filaments were collected from an 11-day-old culture of Nostoc sp. strain 152 by centrifugation at 8,000 × g for 8 min. The pellet was resuspended in 1 ml PMR1 solution (MO BIO Laboratories Inc.), and the cells were disrupted mechanically with a FastPrep FP120 bead beater (Thermo Electron Corporation) for 40 s at a speed of 5 m s−1. RNA extraction was performed using an Ultraclean plant RNA kit by following the manufacturer's instructions (Mo Bio Laboratories). RNA was treated twice with DNase I according to the manufacturer's instructions (Promega). The DNase-treated RNA was phenol-chloroform extracted to inactivate and remove the enzyme, ethanol precipitated, and diluted to a 40-μl final volume with water. The quantity and quality of the RNA were measured with a NanoDrop-1000 spectrophotometer (Nanodrop Technologies). The RNA was reverse transcribed to cDNA, using 7 μl of RNA as a template, with an iScript cDNA synthesis kit (Bio-Rad) according to the manufacturer's instructions. A control reaction lacking the reverse transcriptase (RT) enzyme was included. The quality of the RNA extraction and cDNA synthesis was checked using positive and negative controls. PCR was performed with a set of specific oligonucleotide primers (see the supplemental material) for 30 cycles consisting of 94°C for 30 s, 56°C for 30 s, and 72°C for 60 s, with a final extension at 72°C for 10 min, as described above.

Chemical analysis.

An Agilent 1100 series modular high-performance liquid chromatograph (HPLC) containing a diode array detector interfaced with an Agilent Ion Trap XCT Plus mass spectrometer (Agilent Technologies, Palo Alto, CA) was used for the liquid chromatography mass spectrometry (LC-MS) analyses. Initial extractions of Nostoc sp. strain 152 were carried out as previously described (14). Nostophycin variants were separated on a Luna C8 (2) column (150 by 2 mm, 5-μm particle size; Phenomenex) eluted at 40°C with a 42-min gradient from a 95/5 ratio of 0.1% HCOOH-isopropanol (+0.1% HCOOH) to a ratio of 25/75 in 35 min and then directly to a ratio of 0/100. Three nostophycin variants were identified with LC-MS by using electrospray ionization in positive-ion mode. The nebulizer gas (N2) pressure was 30 lb/in2 (207 kPa), the drying gas flow rate was 8 liters min−1, the temperature was 350°C, the capillary voltage was 5,000 V, the end plate offset was −500 V, the skimmer potential was 66 V, and the trap drive value was 73. Spectra were recorded using a scan range from m/z 50 to m/z 1,200. The identification of nostophycin variants was based on the ion masses and the assigned fragment ions of tandem MS (MS-MS) spectra.

Liquid cultures of Nostoc sp. strain 152 were grown in 72 liters of Z8 medium as described previously (9) in order to extract nostophycin A. Freeze-dried biomass (16.6 g) was subjected to 24 h of extraction with 200 ml of 85% acetonitrile (ACN) in water using a magnetic stirrer. The solution was centrifuged for 5 min at 10,000 × g, and the supernatant was extracted with a Strata C18-E (5 g/20 ml, 55 μm, 70 Å; Phenomenex) cartridge preconditioned with 85% ACN. The effluent was evaporated with a rotary evaporator, and the residue was dissolved in 20% ACN. Nostophycin A was isolated from the solution by nine sequential 1.4-ml injections into a Luna C18 column (250 by 10 mm; Phenomenex) eluted at 5 ml min−1 at ambient temperature with 10 mM ammonium acetate-35% ACN. Nostophycin A precipitated from the pooled fractions, probably because of the high concentration and the spontaneous evaporation of ACN from the solution.

Isolated nostophycin A (100 μg) was dried in a 300-μl glass vial which was then inserted into a 4-ml glass vial containing 1 ml of 6 M HCl. Prior to being closed, the vial was flushed with argon. Acid hydrolysis was performed by incubating overnight at 110°C. After hydrolysis, the inner vial was dried for 30 min with a vacuum centrifuge. Amino acids in the dry residue and reference amino acids were derivatized by the Marfey method using the l-FDAA [(1-fluoro-2,4-dinitrophenyl-5)-l-alaninamide] or d-FDLA [(1-fluoro-2,4-dinitrophenyl-5)-d-leucinamide] reagent (9). Reaction mixtures were analyzed with a Luna C18 column (150 by 2 mm, 5 μm; Phenomenex) eluted with 0.01% trifluoroacetic acid (TFA) (A) and ACN (B). For l-FDAA derivatives, the gradient was from 30% B to 80% B in 25 min, and for d-FDLA derivatives, the gradient was from 30% B to 35% B in 25 min and then 35 min at the reached level. The detection wavelength for the Marfey derivatives was 340 nm.

Phylogenetic analysis.

The A domains encoded in the microcystin (McyG), nodularin (NdaC), nostophycin (NpnA), barbamide (BarE), spumigin (SpuA), cryptophycin (CrpA, CrpD), aeruginosin (AerA), cereulide (CesA), and hectochlorin (HctE, HctF) gene clusters were aligned with CLUSTAL X and adjusted manually. Ambiguous regions and gaps were excluded, and 508 positions were subjected to phylogenetic analysis. A maximum likelihood tree was constructed using ProtML with the JTT-F model of amino acid substitution and 10 random sequence addition searches with global rearrangements (6). The tree was midpoint rooted. One thousand likelihood bootstrap replicates were performed under a JTT and uniform rate model with 5 random sequence additions per replicate and global rearrangements. A user-defined tree that forced NpnA, NdaC, and McyG into a monophyletic clade was constructed. The Templeton test (26) as implemented in the PROTPARS program of the PHYLIP package was used to test this competing hypothesis.
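The bootstrap step resamples alignment columns with replacement before each tree search. The sketch below illustrates only that resampling step, not the likelihood analysis itself; the short sequences are invented placeholders rather than the real A domain alignment, and the function name is ours.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    # Hypothetical toy alignment; the real analysis used 508 positions.
    toy = {
        "NpnA": "VGVWVAASGK",
        "McyG": "VGIWVAASGK",
        "NdaC": "VGIWVAGSGK",
    }
    rng = random.Random(1)  # fixed seed so the replicates are reproducible
    replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
    print(replicates[0])
```

Each replicate would then be passed to the tree-building program, and clade support is the fraction of replicate trees that recover a given grouping.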

Nucleotide sequence accession number.

The entire sequence of the gene cluster for nostophycin production in Nostoc sp. strain 152 has been deposited in GenBank under accession number JF430079.

RESULTS

Chemical variation of nostophycins.

Chemical analysis (LC-MS) demonstrated that Nostoc sp. strain 152 produces small amounts of two nostophycins in addition to the originally reported nostophycin (9). The original nostophycin variant is renamed nostophycin A, and the two new nostophycins are named B and C here (Table 1). Quantitative analyses demonstrated that nostophycin A accounts for 97% of the nostophycin variants produced by Nostoc sp. strain 152, while the remaining two variants account for 3% together. Nostophycin B has a mass of 874 Da and contains l-Val in place of l-allo-Ile, while nostophycin C has a mass of 872 Da and lacks one of the hydroxyl groups on the C-13 of Ahoa. The assigned product ions used for the identification of nostophycin variants are presented in the supplemental material.

Table 1.

Nostophycin variants identified from Nostoc sp. strain 152 by LC-MSa

Nostophycin | MH+ | Rt (min) | Amino acid at position 1 | 2 | 3 | 4 | 5 | 6 | 7
A | 889 | 24.9 | Ahoa | d-Gln | Gly | l-Pro | l-Phe | d-allo-Ile | l-Pro
B | 875 | 24.2 | Ahoa | d-Gln | Gly | l-Pro | l-Phe | Val | l-Pro
C | 873 | 27.3 | doAhoa | d-Gln | Gly | l-Pro | l-Phe | d-allo-Ile | l-Pro

a MH+, monoisotopic mass of protonated molecular ion; Rt, retention time; Ahoa, 3-amino-2,5-dihydroxy-8-phenyloctanoic acid; doAhoa, 3-amino-hydroxy-8-phenyloctanoic acid.
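The mass differences between the variants follow from simple arithmetic on the protonated ion masses in Table 1. The sketch below uses nominal (integer) masses only, which is a simplification made for illustration; the variable names are ours.

```python
# Protonated molecular ion masses (MH+) from Table 1.
MH_PLUS = {"A": 889, "B": 875, "C": 873}

# Nominal atomic masses; an approximation used only for this check.
NOMINAL = {"C": 12, "H": 1, "O": 16}
PROTON = NOMINAL["H"]

# Neutral mass is the protonated ion mass minus one proton.
neutral = {variant: mh - PROTON for variant, mh in MH_PLUS.items()}

# A versus B: 14 Da, one CH2 unit, as expected when Val replaces allo-Ile.
delta_ab = MH_PLUS["A"] - MH_PLUS["B"]

# A versus C: 16 Da, one oxygen atom, as expected for the loss of a hydroxyl group.
delta_ac = MH_PLUS["A"] - MH_PLUS["C"]

if __name__ == "__main__":
    print(neutral, delta_ab, delta_ac)
```

The neutral masses obtained this way agree with the values of 874 Da and 872 Da quoted in the text for nostophycins B and C.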

Cloning and identification of the nostophycin biosynthetic gene cluster.

The chemical structure of nostophycin (Fig. 1) suggested that it was derived from a mixed PKS-NRPS pathway. The npn gene cluster was identified in two overlapping fosmid clones by fosmid clone pooling and PCR screening, and the clones were assembled into a 55.5-kb contiguous region that contained the 45.1-kb npn gene cluster (Fig. 2). Pools containing fosmid clones of interest were plated and rescreened in three successive rounds until the fosmid of interest was isolated. The npn gene cluster consists of three large ORFs encoding three proteins, NpnA, NpnB, and NpnC, and has a low G+C content (43%), which is typical of the G+C content encountered in genomes and secondary metabolite gene clusters from cyanobacteria (3, 21). There were 8 open reading frames flanking the npn gene cluster (Table 2). These encode housekeeping proteins and a variety of hypothetical proteins which seem unlikely to be involved in the biosynthesis of nostophycin (Table 2).

Fig. 2.


Map of the npn gene cluster and proposed biosynthetic pathway of nostophycin. (A) Structure and genetic organization of the 45.1-kb npn biosynthetic gene cluster from Nostoc sp. strain 152. Abbreviations: A, adenylation; ACP, acyl carrier protein; C, condensation; PCP, peptidyl carrier protein; KR, keto reductase domain; E, epimerase domain; KS, β-ketoacyl-ACP synthase; AT, acyl transferase; DH, β-hydroxy-acyl-ACP dehydratase; ER, enoyl reductase; AMT, aminotransferase; MonoOx, monooxygenase domain; Te, thioesterase.

Table 2.

Proposed function of genes

Protein | Length (aa) | Proposed function | Top BLAST hit | Organism | % identity | Accession no.
ORF1 | 463 | Cytochrome c biosynthesis | ResB-like protein | Anabaena variabilis ATCC 29413 | 87 | YP_324324
ORF2 | 363 | ABC transporter | Phosphate ABC transporter | Nostoc punctiforme PCC 73102 | 74 | YP_001868921
ORF3 | 558 | Unknown | Hypothetical protein PBPRB0532 | Photobacterium profundum SS9 | 42 | YP_132205
ORF4 | 136 | GTP cyclohydrolase | 7-Cyano-7-deazaguanine reductase | Anabaena variabilis ATCC 29413 | 91 | YP_324909
ORF5 | 139 | Unknown | Hypothetical protein all1163 | Nostoc sp. PCC 7120 | 79 | NP_485206
ORF6 | 126 | Unknown | Hypothetical protein all1164 | Nostoc sp. PCC 7120 | 84 | NP_485207
ORF7 | 201 | Putative RNA ligase | Hypothetical protein alr1166 | Nostoc sp. PCC 7120 | 71 | NP_485209
NpnA | 5,124 | PKS | JamL | Lyngbya majuscula | 49 | AAS98783
NpnB | 4,768 | Mixed NRPS-PKS | MicA | Planktothrix rubescens NIVA-CYA 98 | 62 | CAQ48259
NpnC | 5,072 | NRPS | NcpA | Nostoc sp. ATCC 53789 | 62 | AAO23333
ORF11 | 766 | Unknown | Hypothetical protein Npun_F0919 | Nostoc punctiforme PCC 73102 | 77 | YP_001864599

Organization of the nostophycin biosynthetic gene cluster.

The npnA, npnB, and npnC genes share the same orientation (Fig. 2), with 10 bp separating npnA and npnB and 2 bp separating npnB and npnC. It seemed likely, given the small intergenic regions between the genes, that a single promoter resides in the 912-nucleotide (nt) noncoding region upstream of npnA and leads to transcription of the three genes on a single transcript. We designed primers to amplify mRNA across the boundaries of the three genes to confirm that the core npn genes are transcribed as a large polycistronic mRNA. RNA from 24-day-old cultures of Nostoc sp. strain 152 grown in Z8 medium was subjected to RT-PCR, which yielded products of the sizes predicted for amplification across the junctions of adjacent npn genes, npnA to npnB and npnB to npnC (see the supplemental material). This suggested that the contiguous genes from npnA to npnC are transcribed as a single polycistronic transcript. We identified a potential RNA polymerase recognition site (TTGAAA) and a Pribnow box (TTAAATT). Sequence analysis also identified a Shine-Dalgarno-like sequence (AGAAGG) 12 bp upstream of the ATG start site.

NpnA is a PKS protein with a length of 5,124 amino acids (aa) and a molecular mass of 565.94 kDa (Table 2). The npnA gene encodes a loading module and two PKS modules (Fig. 2). NpnB is a mixed PKS-NRPS protein with a length of 4,768 amino acids and a molecular mass of 528.44 kDa (Fig. 2). The NpnB protein begins with a type I PKS module followed by an aminotransferase domain (Fig. 2) predicted to convert the β-keto group of the growing chain into an amino group, which eventually serves as the site of cyclization and subsequent release of the final nostophycin (Fig. 2). The aminotransferase domain is followed by a 387-aa-long monooxygenase domain (Fig. 2). NpnC is an NRPS protein with a length of 5,072 amino acids and a molecular mass of 567.39 kDa (Table 2). The npnC gene encodes four NRPS modules which are predicted to recognize and activate l-Pro, l-Phe, l-allo-Ile, and l-Pro (Table 3).

Table 3.

Proposed and the main activated substrates, the 10 amino acid residues predicted to line the substrate binding pocket of the NpnA, NpnB, and NpnC adenylation domains, and predicted substrates determined through precedence in other NRPS proteins (17)a

NRPS module | Proposed substrate | Activated substrate | % match | Residue 235 | 236 | 239 | 278 | 299 | 301 | 322 | 330 | 331 | 517
NpnA1 | PheAc-CoA | HCIN | 100 | V | G | V | W | V | A | A | S | G | K
NpnB1 | Gln | Gln | 100 | D | A | W | L | F | G | L | I | D | K
NpnB2 | Gly | Gly | 100 | D | I | L | A | L | G | L | I | W | K
NpnC1 | Pro | Pro | 100 | D | V | Q | F | I | A | H | V | V | K
NpnC2 | Phe | Phe | 100 | D | A | W | T | I | A | A | V | C | K
NpnC3 | Ile | l-allo-Ile | 100 | D | A | F | F | L | G | V | T | F | K
NpnC4 | Pro | Pro | 100 | D | V | Q | F | I | A | H | V | V | K

a PheAc-CoA, phenylacetate coenzyme A; HCIN, hydrocinnamic acid.
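The binding-pocket signatures in Table 3 can be compared programmatically. The sketch below transcribes the ten residues per module from the table and scores pairwise identity; it is an illustration of the comparison, not the NRPS predictor used in this study, and the function names are ours.

```python
# Binding-pocket residues transcribed from Table 3, in the order of these positions.
POSITIONS = (235, 236, 239, 278, 299, 301, 322, 330, 331, 517)

# module -> (proposed substrate, 10-residue signature)
SIGNATURES = {
    "NpnA1": ("PheAc-CoA", "VGVWVAASGK"),
    "NpnB1": ("Gln", "DAWLFGLIDK"),
    "NpnB2": ("Gly", "DILALGLIWK"),
    "NpnC1": ("Pro", "DVQFIAHVVK"),
    "NpnC2": ("Phe", "DAWTIAAVCK"),
    "NpnC3": ("Ile", "DAFFLGVTFK"),
    "NpnC4": ("Pro", "DVQFIAHVVK"),
}


def percent_match(sig_a: str, sig_b: str) -> float:
    """Percentage of identical residues between two signatures of equal length."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return 100.0 * sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def closest(module: str):
    """Most similar other module in the table and its percent match."""
    query = SIGNATURES[module][1]
    score, name = max(
        (percent_match(query, sig), other)
        for other, (_, sig) in SIGNATURES.items()
        if other != module
    )
    return name, score


if __name__ == "__main__":
    print(closest("NpnC1"))
    # Colinearity: NRPS modules read in gene order give positions 2 to 7 of the peptide.
    order = ("NpnB1", "NpnB2", "NpnC1", "NpnC2", "NpnC3", "NpnC4")
    print([SIGNATURES[m][0] for m in order])
```

The two proline-activating modules share an identical signature, and NpnA1 is the only module with Val rather than Asp at position 235.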

Cloning, expression, and functional analysis of adenylation domains from the nostophycin biosynthetic gene cluster.

The substrate specificity of each adenylation domain from NpnA, NpnB, and NpnC had exact precedents in characterized NRPS gene clusters (Table 3). Biochemical analyses of the A domains from NpnA1, NpnC2, and NpnC3, predicted to activate phenylacetate, l-Phe, and l-allo-Ile, respectively, were performed using the ATP-PPi exchange assay. This assay measures the reversible formation of an aminoacyl-AMP derivative and allows determination of amino acid selectivity. The relative activations of the various amino acid substrates in the ATP-PPi exchange assay were determined (Fig. 3).

Fig. 3.


Substrate specificities of adenylation domains of NpnA1, NpnC2, and NpnC3 determined by the ATP-pyrophosphate exchange assay. The chosen substrates were those found in nostophycin and the chemically related compounds. HCIN, hydrocinnamic acid; CIN, cinnamic acid; PP, phenylpyruvate; l-PLAC, l-phenyllactic acid; d-PLAC, d-phenyllactic acid; COU, p-coumaric acid; HPL, p-hydroxyphenyllactic acid; Phe-Ac, phenylacetic acid.

Phylogenetic analyses.

Maximum parsimony, maximum likelihood, and neighbor-joining analyses suggest a distant relationship between microcystin and nostophycin loading modules (Fig. 4). The Templeton test, in which a sister-taxon relationship between microcystin and nostophycin loading modules was enforced, yielded significantly worse trees (P = 0.05). The loading modules of NpnA and McyG have identical predicted substrate specificities (Table 3). However, phylogenetic analysis demonstrates that the McyG and NpnA adenylation domains do not cluster together (Fig. 4).

Fig. 4.


A maximum likelihood tree showing the relationship between the NpnA and McyG adenylation domains. These proteins do not form a single monophyletic clade. Other adenylation domains involved in loading or extension modules were identified through BLAST similarity searches.

DISCUSSION

The npn gene cluster (45.1 kb) involved in nostophycin biosynthesis identified in this study encodes a hybrid polyketide synthase nonribosomal peptide synthetase. The NpnA loading module initiates biosynthesis of nostophycin. Analysis of the NpnA A domain binding pocket residues that confer substrate specificity revealed that the conserved Asp235 involved in ionic interaction with the amino group of the substrate amino acid was replaced by Val235 (Table 3). Surprisingly, the NpnA A domain was found to activate different derivatives of phenylpropionic acid (hydrocinnamic acid, cinnamic acid, phenylpyruvate, l-phenyllactic acid, and d-phenyllactic acid) to an appreciable level (Fig. 3). The nostophycin (NpnA), microcystin (McyG), and cryptophycin (CrpA) loading modules share the same predicted substrate specificity (11, 15). In each case, the ATP-PPi exchange assay does not provide evidence for the activation of l-PheAc as might have been expected from the chemical structure of nostophycins, microcystins, and cryptophycins (11, 15). The loading module specificity necessitates deletion of one carbon atom of the phenylpropanoic acid starter unit in order to generate the expected polyketide chain. This phenomenon has been observed independently in the biosynthesis of microcystin (11), cryptophycin (15), and barbamide (4). However, the precise mechanism behind this phenomenon remains a mystery.

The NpnA protein contains a starter module with an A domain followed by an embedded KR domain similar to that reported from hectochlorin (17), spumigin (8), cryptophycin (15), and aeruginoside (13) gene clusters. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the β-carbonyl of a polyketide to a hydroxyl group. The role of this domain in the biosynthesis of nostophycin is unclear. The phenylpropanoic acid initiator of nostophycin biosynthesis is next proposed to undergo two rounds of ketide extension by PKS modules encoded by NpnA to form the Ahoa intermediate. The starter module is followed by two type I PKS modules predicted to elaborate the polyketide portion of the nostophycin (Fig. 2).

NpnB contains an unusual monooxygenase domain which is rare in nonribosomal peptide synthetases but not unprecedented. Embedded monooxygenase domains are found, for example, in the MtaG protein of Stigmatella aurantiaca DW4/3-1, the MelG protein of Melittangium lichenicola, and the CtaG protein of Cystobacter fuscus (7). Interestingly, BLAST analyses demonstrated that the MicA protein of Planktothrix rubescens NIVA-CYA 98 (21) also contains this domain. The MicA, MtaG, MelG, and CtaG proteins participate in the biosynthesis of oscillaginin, myxothiazols, melithiazols, and cystothiazole, respectively. It should be noted that the monooxygenase domain is not annotated in the putative oscillaginin gene cluster (21). However, the domain is found directly after the aminotransferase domain in MicA as well and has 78% sequence identity to NpnB (Table 2). We propose that the monooxygenase domain within NpnB acts upon the Ahoa polyketide following transfer to the peptidyl carrier protein (PCP) and oxidizes the intermediate (Fig. 2). Monooxygenases belong to the oxidoreductase family and are capable of oxidizing a methylene group (CH2) through the addition of a single oxygen atom from molecular oxygen and the reduction of the second oxygen atom to water. The carbon backbone of an amino acid added to the myxothiazol acid is proposed to be removed via a monooxygenase module within MtaG (22). However, the extent of domain involvement, timing of α-hydroxylation, and biochemical mechanism await further characterization.

The monooxygenase domain of NpnB is followed by two NRPS modules that are proposed to activate l-Gln and Gly (Table 3). Each A domain finds an exact substrate specificity precedent in characterized NRPS gene clusters (18). The first NpnB A domain yielded a substrate prediction for l-Gln and found exact precedents in the anabaenopeptolide (19) and tyrocidine gene clusters. The second NpnB A domain gave a substrate specificity prediction for Gly and found exact precedents in the nostopeptolide, actinomycin, and bacillibactin gene clusters among others (18). The first of the two NRPS modules contains an epimerase domain for the epimerization of activated amino acids, which is consistent with the structure of nostophycin (9).

The NpnC gene encodes four NRPS modules which are predicted to recognize and activate l-Pro, l-Phe, l-allo-Ile, and l-Pro (Table 3). Each of the four A domains finds an exact precedent in characterized NRPS gene clusters (18). The ATP-PPi exchange assay demonstrated that the NpnC2 A domain activated l-Phe to the greatest extent, which is consistent with the presence of l-Phe at this position in all nostophycin variants (Fig. 3). The NpnC3 activated l-allo-Ile (Fig. 3), which is consistent with the presence of d-allo-Ile and d-Val at this position in the nostophycin variants. An E domain is found in this module (Fig. 2), and so the expected substrate is l-Ile. However, the NpnC3 adenylation domain also activated l-allo-Ile and l-Leu to an appreciable level. Nostophycin B contains Val at this position. However, no variants containing l-Leu were detected. Chromatographic analysis of the acid hydrolysate of the isolated nostophycin A revealed only d-allo-Ile. No traces of d-Ile, d-Leu (see the supplemental material), or l-isomers of allo-Ile, Ile and Leu, which have clearly distinct retention times to corresponding d-isomers (9), were observed.

The NpnC protein ends in a thioesterase domain which is presumably responsible for the release and cyclization of the heptapeptide. allo-Isoleucine is a diastereoisomer of isoleucine and is produced as a by-product of isoleucine transamination. It is presumably activated directly by the adenylation domain. However, analysis of the substrate specificity code found exact precedents in both the anabaenopeptolide (19) and nostopeptolide (12) gene clusters. Each of these compounds contains l-Ile. It is not clear at present how the difference in stereochemistry is achieved, and the biosynthetic origins of this unusual isoleucine diastereomer remain uncertain. Finally, an embedded type I thioesterase that possesses the conserved GXSXG motif is present at the C-terminal end of NpnC and is likely to catalyze product release and cyclization.

The unusual Ahoa β-amino acid is found only in nostophycin. However, similar β-amino acids have been found in several peptides isolated from cyanobacteria (e.g., reference 10). Nostophycin bears considerable structural similarity to microcystin (9). However, BLAST searches using each catalytic domain encoded in the npn gene cluster as a query did not return microcystin synthetase genes among the top hits (see the supplemental material). In each case, the top hits were to other peptide synthetases or polyketide synthases (see the supplemental material). The biosyntheses of nostophycin and microcystins share a number of features. However, an additional two-carbon unit is present in Adda, which is reflected in the presence of an additional PKS module in the microcystin synthetase gene cluster. The domain compositions of the microcystin and nostophycin PKS modules differ in each module, and this is reflected in the degree to which the β-amino acids are reduced (Fig. 1). In order to examine this relationship further, we constructed phylogenetic trees based on the loading module A domain of NpnA. Adenylation domains were identified through BLAST sequence similarity searches using NpnA to query the nonredundant database at NCBI. The search yielded A domains from a selection of cyanobacterial NRPS gene clusters, some of which initiate pathways and others of which are midstream in a biosynthetic assembly. The structural similarities between nostophycins and microcystins may be explained by an analogous loading mechanism. This loading mechanism seems to be common in cyanobacteria. It can be found in the gene clusters encoding aeruginosins (13), spumigins (8), microcystins (11), and cryptophycin (15), among others, and allows the incorporation of unusual starter units into the nonribosomal peptides they produce.

Among the nonproteinogenic amino acids, polyhydroxylated amino acids are interesting due to their presence in natural and synthetic compounds with diverse biological activities. Here, we report the nostophycin gene cluster, which encodes an unusual loading mechanism and monooxygenase functionality.

ACKNOWLEDGMENTS

We thank Lyudmila Saari for her valuable help in handling the culture. Hanna Sipari is acknowledged for help in making cDNA.

This work was supported by grants from the European Union PEPCY (QLK4-CT-2002-02634) to K.S. and grants from the Academy of Finland to D.P.F. (1212943) and K.S. (53305, 118637, and 214457).

Footnotes

Supplemental material for this article may be found at http://aem.asm.org/.

Published ahead of print on 23 September 2011.

REFERENCES

  • 1. Altschul S. F., et al. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402
  • 2. Burja A. M., Banaigs B., Abou-Mansour E., Burgess J. G., Wright P. C. 2001. Marine cyanobacteria—a prolific source of natural products. Tetrahedron 57:9347–9377
  • 3. Cadel-Six S., et al. 2008. Halogenase genes in nonribosomal peptide synthetase gene clusters of Microcystis (cyanobacteria): sporadic distribution and evolution. Mol. Biol. Evol. 9:2031–2041
  • 4. Chang Z., et al. 2002. The barbamide biosynthetic gene cluster: a novel marine cyanobacterial system of mixed polyketide synthase (PKS)-nonribosomal peptide synthetase (NRPS) origin involving an unusual trichloroleucyl starter unit. Gene 296:235–247
  • 5. Delcher A. L., Harmon D., Kasif S., White O., Salzberg S. L. 1999. Improved microbial gene identification with GLIMMER. Nucleic Acids Res. 27:4636–4641
  • 6. Felsenstein J. 1989. PHYLIP 3.2 manual. University of California Herbarium, Berkeley, CA
  • 7. Feng Z., et al. 2005. Construction of a bacterial artificial chromosome library for a myxobacterium of the genus Cystobacter and characterization of an antibiotic biosynthetic gene cluster. Biosci. Biotechnol. Biochem. 69:1372–1380
  • 8. Fewer D. P., et al. 2009. The nonribosomal assembly and frequent occurrence of the protease inhibitors spumigins in the bloom-forming cyanobacterium Nodularia spumigena. Mol. Microbiol. 73:924–937
  • 9. Fujii K., Sivonen K., Kashiwagi T., Hirayama K., Harada K. 1999. Nostophycin, a novel cyclic peptide from the toxic cyanobacterium Nostoc sp. 152. J. Org. Chem. 64:5777–5782
  • 10. Helms G. L., et al. 1988. Scytonemin A, a novel calcium antagonist from a blue-green alga. J. Org. Chem. 53:1298–1307
  • 11. Hicks L. M., Moffitt M. C., Beer L. L., Moore B. S., Kelleher N. L. 2006. Structural characterization of in vitro and in vivo intermediates on the loading module of microcystin synthetase. ACS Chem. Biol. 1:93–102
  • 12. Hoffmann D., Hevel J. M., Moore R. E., Moore B. S. 2003. Sequence analysis and biochemical characterization of the nostopeptolide A biosynthetic gene cluster from Nostoc sp. GSV224. Gene 311:171–180
  • 13. Ishida K., et al. 2007. Biosynthesis and structure of aeruginoside 126A and 126B, cyanobacterial peptide glycosides bearing a 2-carboxy-6-hydroxyoctahydroindole moiety. Chem. Biol. 14:565–576
  • 14. Leikoski N., et al. 2010. Highly diverse cyanobactins in strains of the genus Anabaena. Appl. Environ. Microbiol. 76:701–709
  • 15. Magarvey N. A., et al. 2006. Biosynthetic characterization and chemoenzymatic assembly of the cryptophycins. Potent anticancer agents from Nostoc cyanobionts. ACS Chem. Biol. 1:766–779
  • 16. Marahiel M. A., Stachelhaus T., Mootz H. D. 1997. Modular peptide synthetases involved in nonribosomal peptide synthesis. Chem. Rev. 97:2651–2674
  • 17. Ramaswamy A. V., Sorrels C. M., Gerwick W. H. 2007. Cloning and biochemical characterization of the hectochlorin biosynthetic gene cluster from the marine cyanobacterium Lyngbya majuscula. J. Nat. Prod. 70:1977–1986
  • 18. Rausch C., Weber T., Kohlbacher O., Wohlleben W., Huson D. H. 2005. Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Res. 33:5799–5808
  • 19. Rouhiainen L., et al. 2000. Genes encoding synthetases of cyclic depsipeptides, anabaenopeptilides, in Anabaena strain 90. Mol. Microbiol. 37:156–167
  • 20. Rouhiainen L., Jokela J., Fewer D. P., Urmann M., Sivonen K. 2010. Two alternative starter modules for the nonribosomal biosynthesis of specific anabaenopeptin variants in Anabaena (Cyanobacteria). Chem. Biol. 17:265–273
  • 21. Rounge T. B., Rohrlack T., Nederbragt A. J., Kristensen T., Jakobsen K. S. 2009. A genome-wide analysis of nonribosomal peptide synthetase gene clusters and their peptides in a Planktothrix rubescens strain. BMC Genomics 10:396.
  • 22. Silakowski B., et al. 1999. New lessons for combinatorial biosynthesis from myxobacteria. The myxothiazol biosynthetic gene cluster of Stigmatella aurantiaca DW4/3-1. J. Biol. Chem. 274:37391–37399
  • 23. Sivonen K., et al. 1990. Isolation and characterization of hepatotoxic microcystin homologs from the filamentous freshwater cyanobacterium Nostoc sp. strain 152. Appl. Environ. Microbiol. 56:2650–2657
  • 24. Sivonen K., et al. 1992. Three new microcystins, cyclic heptapeptide hepatotoxins, from Nostoc sp. strain 152. Chem. Res. Toxicol. 5:464–469
  • 25. Sivonen K., Leikoski N., Fewer D. P., Jokela J. 2010. Cyanobactins—ribosomal cyclic peptides produced by cyanobacteria. Appl. Microbiol. Biotech. 86:1213–1225
  • 26. Templeton A. R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the humans and apes. Evolution 37:221–244
  • 27. Welker M., von Döhren H. 2006. Cyanobacterial peptides—nature's own combinatorial biosynthesis. FEMS Microbiol. Rev. 30:530–563
