Author manuscript; available in PMC: 2024 Mar 1.
Published in final edited form as: Chembiochem. 2023 Jul 26;24(17):e202300342. doi: 10.1002/cbic.202300342

Biosynthesis and Genome Mining Potentials of Nucleoside Natural Products

Yanan Du a, Anyarat Thanapipatsiri a, Kenichi Yokoyama a,b
PMCID: PMC10530009  NIHMSID: NIHMS1922487  PMID: 37357819

Abstract

Nucleoside natural products show diverse biological activities and serve as leads for various application purposes, including human and veterinary medicine and agriculture. Studies in the past decade revealed that these nucleosides are biosynthesized through divergent mechanisms, in which early steps of the pathways can be classified into two types (C5’ oxidation and C5’ radical extension), while the structural diversity is created by downstream tailoring enzymes. Based on this biosynthetic logic, we investigated the genome mining discovery potentials of these nucleosides using the two enzymes representing the two types of C5’ modifications: LipL-type α-ketoglutarate (α-KG) and Fe-dependent oxygenases and NikJ-type radical S-adenosyl-L-methionine (SAM) enzymes. The results suggest that this approach allows discovery of putative nucleoside biosynthetic gene clusters (BGCs) and the prediction of the core nucleoside structures. The results also revealed the distribution of these pathways in nature and implied the possibility of future genome mining discovery of novel nucleoside natural products.

Keywords: natural products, biosynthesis, nucleosides, antibiotics, antifungal agents

Graphical Abstract:

Nucleoside natural products are diverse in their structures and biological activities. Many of these compounds are produced through divergent biosynthetic pathways. In this perspective, we review the biosynthetic logic of the two most abundant groups of nucleoside natural products and investigate their potential for future genome mining discovery.

Introduction

Nucleoside natural products constitute an important class of natural products with a wide range of biological activities such as antibacterial, antiviral, and antifungal activities.[1] In general, the nucleoside or nucleobase moiety of these compounds mimics the native nucleobase or nucleoside/nucleotide and binds the ribosome or nucleoside/nucleotide-metabolizing enzymes.[2] Therefore, the nucleoside moiety of these natural products may be considered a pharmacophore. These nucleosides are frequently modified with amino acids, fatty acids, and/or carbohydrates (Figure 1),[1] which determine the specificity for the target enzyme or ribosome.[2] Since these structural variations are created through biosynthetic pathways,[3] understanding their biosynthetic logic is important to understand the structural diversity of nucleoside natural products in nature and to potentially predict the structures of uncharacterized metabolites based on genome sequence information. In this concept, we focus on the nucleosides with modifications in the sugar moieties based on their important biological activities and unifiable biosynthetic logic. More comprehensive reviews of nucleoside natural products and their biosynthesis can be found elsewhere.[3,4]

Figure 1.

Nucleoside natural products with hexopyranose or modified furanose.

Structure-Activity Relationships

Most nucleoside natural products have modified sugar moieties with substantial structural variations (Figure 1).[1] Many of these sugars are larger than ribose and are called high-carbon sugar nucleosides. Structurally, they can be divided into two types. One type is nucleosides with a hexopyranose moiety (hexopyranosyl nucleosides) represented by blasticidin, gougerotin, and amicetin (Figure 1a). These nucleosides exhibit antifungal, antibacterial, and anticancer activities by inhibiting prokaryotic and eukaryotic protein synthesis.[1b,5] In the reported crystal structures of blasticidin S bound to the 70S ribosome-tRNA complex, the cytosine base of blasticidin S replaced cytidine 75 (C75) of the tRNA bound to the P-site of the ribosome and formed a base pair with guanosine 2251 of the 50S ribosomal subunit.[2a] Consequently, C75 of the tRNA was flipped out, and the 3’-terminus of the tRNA was bent towards the A-site, which was proposed to impede protein synthesis, likely by inhibiting the hydrolysis of the peptidyl tRNA during termination. Therefore, the cytosine base of blasticidin S mimics C75 of tRNA, and the remaining parts of the molecule provide affinity to the surrounding region of the 50S ribosomal subunit. Other hexopyranosyl nucleosides are thought to have the same or a similar mechanism of action.[6] Thus, these nucleosides likely evolved specifically to inhibit protein synthesis.

Many other nucleosides have a furanose with various numbers of carbons (from five to as many as 11; furanosyl nucleosides). While the hexopyranose moiety of hexopyranosyl nucleosides serves as an anchor and does not mimic the sugar structures of nucleosides/nucleotides, the furanose structures of the furanosyl nucleosides allow these compounds to serve as nucleoside/nucleotide mimics. Recent structural studies on the antibacterial and antifungal nucleosides provided molecular insights into the mechanism by which these compounds exhibit their biological activities. For example, muraymycin D2 exhibits its antibiotic activity by inhibiting the bacterial peptidoglycan biosynthetic enzyme MraY.[1a] The X-ray crystal structures of MraY in complex with muraymycin D2 suggested that the binding site for the nucleoside moiety of muraymycin D2 (red highlight in Figure 1b) overlaps with that for the nucleoside moiety of UDP-MurNAc-pentapeptide, the substrate of MraY (Figure 1c). On the other hand, the aminoribosyl moiety of muraymycin D2 (blue highlight in Figure 1b) binds to a region of MraY that is not involved in substrate binding.[2c] Nikkomycin Z (Figure 1d) shows antifungal activity by inhibiting chitin synthase (CHS), which is responsible for fungal cell wall biosynthesis.[7] The recent cryo-EM structure of CHS from the pathogenic fungus Candida albicans in complex with nikkomycin Z suggested that the nucleoside moiety of nikkomycin Z (red highlight in Figure 1d) binds to the binding site for the nucleotide moiety of the sugar donor substrate (UDP-GlcNAc, Figure 1e). On the other hand, the pyridinol moiety of nikkomycin Z (blue highlight in Figure 1d) extends beyond the sugar donor binding site and towards the sugar acceptor (nascent polysaccharide chain) binding site. Therefore, in both cases, the nucleoside moiety of muraymycin D2 and nikkomycin Z occupies the nucleotide binding site of the target enzyme, while the other parts of the molecules form unique interactions to achieve substantially tighter affinity than the substrates. Importantly, for both MraY[2c] and chitin synthase,[2b] divalent metals are essential for their catalytic functions, but were not observed in their structures in complex with the inhibitors. Therefore, while part of these compounds serves as a nucleotide mimic, the functional groups decorating the nucleoside moieties allow these compounds to form interactions distinct from those of the substrate.

While the MraY and CHS inhibitors are the best-characterized examples, there are many other furanosyl nucleosides that target other enzymes and show different biological activities. For example, the antibiotic activity of pseudouridimycin is derived from the inhibition of bacterial RNA polymerase,[8] and the antifungal activity of jawsamycin is caused by the inhibition of fungal phosphatidylinositol N-acetylglucosaminyltransferase responsible for protein GPI anchoring.[9] This diversity in modes of action and biological activities is in sharp contrast to the relatively focused mechanism of action of hexopyranosyl nucleosides that target the ribosome P-site. Therefore, the conjugation of the furanosyl nucleosides with varied functional groups is likely a versatile approach that nature chose to evolve nucleoside natural products targeting different nucleotide/nucleoside-metabolizing enzymes.

Overview of the Biosynthesis of Hexopyranosyl and Furanosyl Nucleosides

Hexopyranosyl and furanosyl nucleosides are biosynthesized through distinct mechanisms. In hexopyranosyl nucleoside biosynthesis, the hexose moiety is biosynthesized as a sugar nucleotide through deoxysugar biosynthetic pathways (Figure 2a). The nucleobase is provided by hydrolysis of a nucleoside 5’-monophosphate by a nucleotide hydrolase (BlsM in blasticidin biosynthesis).[11] Then, the resulting nucleobase and nucleotide sugar are coupled by a glycosyltransferase (MilC in mildiomycin biosynthesis).[10] This approach provides an advantage in that these pathways can use deoxysugar biosynthetic enzymes widely distributed in various natural product biosynthetic pathways[26] to produce sugar donor substrates and couple the deoxysugar with a separately synthesized nucleobase. Thus, these pathways may be viewed as convergent biosynthesis and could be used for combinatorial biosynthesis of structurally diverse hexopyranosyl nucleosides.

Figure 2.

Generalized biosynthetic pathways for the hexopyranosyl and 5’-modified furanosyl nucleosides. (a) Hexopyranosyl nucleoside biosynthesis. The scheme is based on the work on mildiomycin,[10] blasticidin,[11] and gougerotin[12] biosyntheses and characterization of the BlsM nucleotide hydrolase[11] and the MilC glycosyltransferase.[10] (b) Biosynthesis of furanosyl nucleosides through a 5’-aldehyde intermediate. The scheme is based on the characterization of the C5’ oxygenase (LipL,[13] Cpr19,[14] Jaw7[15]), C5’ oxidase (PacK/Pac11[16]), transaldolase (LipK[17]), PLP-dependent oxygenase (Cap15[18]), C5’ aminotransferase (Jaw8,[15] Pac5,[16] and LipO[19]), and GlyU glycosyltransferase (LipN[19]). CarU glycosyltransferase (Cpr24 and homologs) has not been functionally characterized yet and the function was proposed based on the isotope incorporation study[20] and sequence information.[21] (c) Biosynthesis of furanosyl nucleosides through C5’ modification by radical mechanisms. The scheme is based on the characterization of NikO enolpyruvyltransferase,[22] NikJ/PolH radical SAM cyclase,[23] NikI and MalI oxygenases,[24] and TunB radical SAM enzyme.[25] The number of carbon atoms on the carbohydrate moiety of the nucleoside metabolites is shown in bold blue.

In contrast, most furanosyl nucleosides with modified sugar moieties are biosynthesized from ribonucleosides/ribonucleotides through the modification or extension of the carbon chain at the C5’ position. The mechanisms of C5’ modification can be categorized into two major types. The most widely employed approach is the oxidation of C5’ into an aldehyde, which is then modified by a pyridoxal phosphate (PLP)-dependent aminotransferase or transaldolase (Figure 2b). As detailed below, many previously reported antimicrobial nucleosides, especially those with antibacterial activities, are biosynthesized through this 5’-aldehyde mechanism. On the other hand, radical-mediated C5’ extension has been reported for other high-carbon furanosyl nucleosides (Figure 2c). These modifications are catalyzed by radical SAM enzymes and proceed through the formation of a C5’ radical on the nucleoside or nucleotide precursor that adds to an sp2 carbon of the acceptor substrate. Compared to the nucleosides biosynthesized through the 5’-aldehyde intermediate, fewer compounds can be categorized in this radical C5’ modification group. However, our analysis described below suggests the presence of many unreported nucleosides biosynthesized through this mechanism.

In the subsequent two sections, we discuss the biosynthetic logic of the furanosyl nucleosides and their potential distribution in nature based on bioinformatic analysis. Since diverse structures and biological activities have been reported for these furanosyl nucleosides, understanding the natural evolution and distribution of these pathways is important for the future genome mining discovery of nucleoside natural products with potentially novel biological activities.

Nucleosides Biosynthesized through Nucleophilic C5’ Modification

The nucleosides biosynthesized through the 5’-aldehyde intermediate are represented by the MraY-targeting antibacterial antibiotics. The C5’ oxidation is catalyzed by either α-KG and Fe-dependent oxygenase (e.g. liposidomycin[13] and caprazamycin,[14] Figure 1b), or flavin-dependent oxidase (pacidamycin[16] and pseudouridimycin[27]). Of the two mechanisms of C5’ oxidation, more pathways employ the α-KG and Fe-dependent oxygenase. The representative member of these enzymes is LipL in the liposidomycin-type A-90289 biosynthesis,[13] which catalyzes a 5’-hydroxylation of UMP to initially generate a phospho-hemiacetal that collapses into aldehyde by elimination of the phosphate.[14] On the other hand, the flavin-dependent oxidase, such as PacK in pacidamycin biosynthesis,[16] uses nucleosides as substrates and catalyzes the oxidation of alcohol into aldehyde.

The pathways diverge from the aldehyde intermediate between nucleosides with a pentose (C5-sugar nucleosides) and those with hexose and heptose (C6- and C7-sugar nucleosides). For the C5-sugar nucleoside biosynthesis, the aldehyde intermediate is transaminated to yield 5’-amino-5’-deoxy-nucleosides by an action of a PLP-dependent aminotransferase, such as Jaw8[15] in the jawsamycin biosynthesis. The 5’-amino group is then acylated to yield antimicrobial nucleosides, such as jawsamycin,[15] pseudouridimycin,[28] and pacidamycin.[29]

The C6- and C7-sugar nucleosides are biosynthesized through an addition of a glycine unit from l-threonine using a PLP-dependent transaldolase, such as LipK in A-90289 biosynthesis.[17] The resulting 5’-C-glycyluridine (GlyU) is used as the core nucleoside structure of liposidomycin and caprazamycin-type C7-sugar nucleosides. For the biosynthesis of the C6-sugar nucleosides, such as capuramycin, GlyU is further processed through the oxidative decarboxylation into uridine-5’-carboxamide (CarU) by PLP-dependent oxygenase such as Cap15 in capuramycin-type nucleoside biosynthesis.[18] Therefore, the sugar size of these nucleosides is determined by the combination of the LipK-type transaldolase, the Jaw8-type aminotransferase, and Cap15-type PLP-dependent oxygenase.

After the formation of the respective nucleosides, they receive various modifications such as glycosylation, amidation, and lipidation. One characteristic modification is 5’-aminodeoxyribosylation (ADR) of GlyU, which generates the ADR-GlyU core structure conserved among many of the MraY inhibitors including muraymycin, sphaerimicin A, and liposidomycins. As described above, the ADR moiety of ADR-GlyU forms a unique interaction with MraY and likely contributes to the specific binding to MraY. ADR is biosynthesized from uridine 5’-aldehyde through transamination by LipO, phosphorolysis by LipP, and nucleotidylation by LipM to yield the sugar donor, UDP 5’-aminodeoxyribose (UDP-ADR), followed by glycosylation by LipN.[19] In contrast, in capuramycin biosynthesis, the corresponding 5’-OH group of CarU is modified with a hexose. In the reported BGCs for capuramycin-type compounds, Cpr24 homologs were the only glycosyltransferase (GT).[21] Since an earlier isotope incorporation study suggested mannose as a precursor of the hexose,[20] Cpr24 homologs were proposed to catalyze a transfer of mannose or its derivative using a GDP-activated donor substrate.[21b] LipN and Cpr24 are weakly homologous to each other (~30% identity) but catalyze distinct GT reactions. While LipN uses a UDP riboside as the donor substrate and catalyzes the reaction with inversion of configuration at the anomeric center, Cpr24 uses a GDP-hexose and catalyzes the reaction with retention of configuration. Consequently, Cpr24 is homologous to GTs in the GT4 family,[30] while LipN does not show apparent homology to any of the GT family members and likely forms a unique family. So far, the Carbohydrate Active enZYme (CAZy) website,[30,31] which classifies glycosyltransferases based on their amino acid sequence and catalytic and structural properties, does not classify either of these GTs. Further functional and structural characterizations are needed for both of these GTs, especially for Cpr24 whose substrate remains ambiguous.[3]

The general biosynthetic logic described above helps correlate the genetic information and the structures of their metabolites and forms the critical foundation for the future genome-mining discovery of related nucleoside natural products. To investigate the genome mining discovery potentials, we searched for LipL homologs in the UniProt genome sequence database using the BLAST search function of the Enzyme Function Initiative Enzyme Similarity Tool (EFI-EST).[32] We then used the results to generate a sequence similarity network (SSN) of proteins that showed an E-value of 5 or better with LipL. The resulting SSN was tailored by adjusting the edge threshold by the alignment score. At the alignment score of 45, LipL homologs in the reported nucleoside BGCs were found in a cluster with 38 nodes, which we defined as LipL homologs. Their phylogenetic analysis revealed clustering of LipL homologs from pathways that produce structurally related nucleosides (Figure 3), consistent with their close evolutionary relationships.
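The edge-threshold step can be illustrated with a minimal sketch. It assumes the pairwise alignment scores are already available as (protein, protein, score) tuples; the function name `ssn_cluster`, the node names other than LipL, and all scores are hypothetical placeholders, and this is not the EFI-EST implementation.

```python
from collections import defaultdict, deque


def ssn_cluster(edges, query, min_score):
    """Return the nodes connected to `query` through edges scoring >= min_score."""
    graph = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score:  # keep only edges that pass the threshold
            graph[a].add(b)
            graph[b].add(a)
    seen = {query}
    queue = deque([query])
    while queue:  # breadth-first search from the query node
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical alignment scores, for illustration only.
toy_edges = [
    ("LipL", "homolog_A", 120),
    ("homolog_A", "homolog_B", 60),
    ("homolog_B", "distant_oxygenase", 30),
    ("distant_oxygenase", "unrelated_protein", 80),
]
cluster_strict = ssn_cluster(toy_edges, "LipL", 45)
cluster_loose = ssn_cluster(toy_edges, "LipL", 20)
```

In this toy network, raising the threshold from 20 to 45 removes the weak edge that links the query cluster to distantly related sequences, which is the effect the alignment-score cutoff is meant to have.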

Figure 3.

Analysis of putative BGCs containing LipL homologs. The phylogenetic tree represents evolutionary relationships of LipL homologs among the organisms in the first column of the table. Each branch in the phylogenetic tree is labeled with bootstrap values. The table summarizes the presence of nucleoside core biosynthetic genes (gray) and the modification enzymes with following predicted functions: LipN and Cpr24 glycosyltransferase homologs (columns labeled as LipN and Cpr24); glycosyltransferase other than LipN or Cpr24 (GT); pyrimidine nucleoside phosphorylase (LipP); deoxysugar biosynthetic enzymes (Sugar); lipid biosynthesis or ligation such as lipase, CoA-ligase, PKS, or acyltransferase (Lipid); peptide bond formation such as NRPS or ATP grasp amide ligase (Peptide). Also shown are the predicted or reported structure of the nucleoside core (Core column) and the reported metabolites (Metabolite column). The BGCs for sphaerimicin A,[36] A-90289,[37] caprazamycin,[38] liposidomycin,[39] jawsamycin,[15] muraymycin,[40] A-102395,[21b] A-503083,[21c] and capuramycin[41] have been reported.

More than half of the LipL homologs identified in this analysis have not been functionally characterized. Still, all these genes were found in operons containing either or both of LipO and LipK homologs (>33% amino acid identity; Figure S1). Other genes in these operons were also homologous to enzymes in secondary metabolism, suggesting that these genomic regions likely code for nucleoside natural product BGCs. Many of the BGCs identified in this search are conserved in multiple organisms while others are unique singletons. The BGC conservation agreed very well with the phylogenetic relationships in Figure 3, suggesting that LipL and these pathways likely co-evolved. The presence of the homologous pathways in multiple organisms likely reflects the abundance of those metabolites in nature. In support of this, many of the reported compounds (such as caprazamycin, liposidomycin, and capuramycin; Figures 3 and S1) are produced by these conserved pathways. However, it is unknown if there is a correlation between pathway conservation and biological activities because some of the singleton pathways were reported to produce bioactive compounds such as sphaerimicin and muraymycin (Figures 3 and S1).

We further analyzed these BGCs to identify the types of nucleosides that they may produce (Figure 3). The sugar size of the nucleoside core can be predicted by the presence of LipK (transaldolase), LipO (aminotransferase), and Cap15 (PLP-dependent oxygenase) homologs (gray highlights in the table of Figure 3). BGCs with LipO-type aminotransferase without LipK-type transaldolase most likely produce C5-sugar nucleoside (5’-amino-5’-deoxyuridine). BGCs with LipK-type transaldolase without Cap15-type PLP-dependent oxygenase likely produce C7 nucleoside (GlyU). BGCs with both LipK-type transaldolase and Cap15-type PLP-dependent oxygenase are consistent with C6 nucleoside (CarU). Therefore, these three genes likely serve as genetic markers to predict the sugar size of the nucleoside core.
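The marker-gene rules in this paragraph can be written as a small decision function. This is a sketch under the assumption that homolog presence has already been determined for a BGC that contains a LipL homolog; the function name and the "unassigned" fallback are illustrative choices, not part of the reported analysis.

```python
def predict_aldehyde_pathway_core(has_lipk, has_lipo, has_cap15):
    """Predict the nucleoside core of a LipL-containing BGC from marker genes."""
    if has_lipk and has_cap15:
        # Transaldolase plus PLP-dependent oxygenase: C6 core
        return "C6 (CarU)"
    if has_lipk:
        # Transaldolase without Cap15-type oxygenase: C7 core
        return "C7 (GlyU)"
    if has_lipo:
        # Aminotransferase without transaldolase: C5 core
        return "C5 (5'-amino-5'-deoxyuridine)"
    # Combinations not covered by the rules in the text
    return "unassigned"


capuramycin_like = predict_aldehyde_pathway_core(True, False, True)
liposidomycin_like = predict_aldehyde_pathway_core(True, True, False)
jawsamycin_like = predict_aldehyde_pathway_core(False, True, False)
```

Note that a LipO homolog may also be present in C7 pathways, where it contributes to UDP-ADR biosynthesis, so the sketch checks for LipK before LipO.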

GlyU and CarU can be modified by 5’-glycosylation by either LipN or Cpr24 GT homolog. Since LipN homologs likely form their own GT family, it is difficult to define them by any sequence motif. Therefore, we used overall sequence identity. However, LipN and Cpr24 are weakly homologous to each other (~30–35%), and some of the GTs in these nucleoside BGCs showed similar levels of overall sequence identity to LipN and Cpr24. Therefore, we defined LipN homologs to be the enzymes that show sequence identity of >30% without any detectable motifs for the defined GT family. On the other hand, Cpr24 homologs were defined as enzymes with >30% sequence identity to Cpr24 and have a motif for the GT4 family. Phylogenetic analysis of Cpr24 and LipN homologs (Figure S2) showed all the Cpr24 homologs in a single clade, supporting the functional annotation. On the other hand, the putative LipN homologs were found in multiple clades and some of them are in clades without any homologs in the putative nucleoside BGCs (Figure S2). Therefore, our assignment of LipN homologs is tentative and requires experimental verification. Using this definition, we annotated 10 BGCs with Cpr24 homologs, and 14 BGCs with LipN homologs. Most of the BGCs with LipN homologs encoded the UDP-ADR biosynthetic enzymes (LipO, LipP, and LipM; see “LipP” column in Figure 3), consistent with the production of ADR-GlyU. Cpr24 homologs were found in BGCs with deoxysugar biosynthetic enzymes (see “Sugar” column in Figure 3). Since all of these BGCs with Cpr24 homologs encode Cap15 PLP-dependent oxygenase, they likely produce the Gly-CarU core found in capuramycin-type compounds.
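The working definitions of LipN and Cpr24 homologs used here can be summarized as a simple classifier. The sketch assumes that percent identities and a GT family motif assignment (a string such as "GT4", or None when no motif is detected) have been computed elsewhere; the function name and argument names are hypothetical.

```python
from typing import Optional


def classify_gt(identity_to_lipn: float,
                identity_to_cpr24: float,
                gt_family_motif: Optional[str],
                cutoff: float = 30.0) -> str:
    """Assign a GT to the Cpr24 or LipN group using the >30% identity rules."""
    if gt_family_motif == "GT4" and identity_to_cpr24 > cutoff:
        return "Cpr24 homolog"
    if gt_family_motif is None and identity_to_lipn > cutoff:
        # No detectable motif for any defined GT family
        return "LipN homolog (tentative)"
    return "other GT"


# Hypothetical identity values, for illustration only.
example_lipn = classify_gt(35.0, 31.0, None)
example_cpr24 = classify_gt(32.0, 40.0, "GT4")
example_other = classify_gt(20.0, 25.0, "GT2")
```

Because the two enzymes share similar overall identity, the motif check rather than the identity value is what separates the two groups in this scheme, which is why the LipN assignment is labeled tentative.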

We also noticed that many of the BGCs lacking LipN or Cpr24 GT homologs harbor another GT gene (see “GT” column in Figure 3). Intriguingly, many of these GT-containing BGCs do not carry LipK transaldolase but carry LipP pyrimidine nucleoside phosphorylase and LipM nucleotidyltransferase. These features are inconsistent with the rules described above and suggest the presence of either a novel biosynthetic mechanism of high-carbon sugar nucleosides or glycosylation at a position other than 5’-OH. Regardless, these BGCs likely produce structurally uncharacterized nucleosides and require future characterization.

Many of the reported furanosyl nucleosides are highly decorated by polyketides, lipids, amines, and amino acids. While the prediction of the exact structures of these modifications purely by sequence information is challenging, the types of modification are predictable based on our current understanding of the characterized pathways. For example, the lipid-like hydrocarbons found in jawsamycins are biosynthesized through polyketide synthase (PKS).[14,33] In liposidomycins, their lipid moieties are biosynthesized through a LipT/Cpz23-type acyltransferase that shows sequence homologies to lipases.[34] Therefore, BGCs with these enzymes (see “Lipid” column in Figure 3) could produce nucleosides with lipid-like functional groups. On the other hand, amide linkages between a nucleoside and an amine or an amino acid are biosynthesized through nonribosomal peptide synthetase (NRPS),[29] CapW-type serine protease family enzyme,[21c] or ATP grasp amide ligase.[28,35] Therefore, BGCs with these enzymes (see “Peptide” column in Figure 3) likely produce peptidyl nucleosides.

These analyses revealed that more than half of these BGCs have features distinct from the previously characterized BGCs. Therefore, they are promising targets for future genome mining discoveries of structurally novel nucleoside natural products. In particular, our analysis revealed many BGCs encoding LipL and LipO homologs in the absence of LipK homologs, suggesting that they have the potentials to produce C5-sugar nucleosides. So far, jawsamycin is the only reported C5-sugar nucleoside produced through LipL-type mechanism. Many of the putative BGCs without LipK do not harbor PKS and a radical SAM enzyme (Jaw5) responsible for the formation of cyclopropane-containing lipid in the jawsamycin biosynthesis.[14,33] Instead, many of these BGCs harbor GT and other lipid and peptide biosynthetic enzymes. Thus, they are likely responsible for production of glycosylated lipopeptidylnucleosides with biological activities distinct from jawsamycins. Many other BGCs also have genes for C6- and C7-nucleoside core biosynthesis with different combinations of lipid or peptide biosynthetic enzymes. These pathways likely produce ADR-GlyU or Gly-CarU nucleoside core structures with structurally diverse modifications. Since these nucleoside core structures are important for the MraY inhibition, genome mining studies of these pathways may reveal novel MraY inhibitors. Overall, these analyses suggested the presence of many uncharacterized nucleoside biosynthetic pathways.

Nucleosides Biosynthesized through Radical-Mediated C5’ Extension

A distinct mechanism of C5’ extension has been reported for antifungal nucleoside natural products. These pathways involve radical SAM enzymes that catalyze various radical-mediated reactions by generating a highly reactive radical species, 5’-deoxyadenosyl radical (5’-dA·), from SAM.[42] The radical-mediated C5’ extension has been demonstrated for NikJ in nikkomycin biosynthesis[23] and has been proposed for TunB in tunicamycin biosynthesis[25] (Figure 2c). The function of TunB, however, has been proposed based only on a gene disruption study and has not been demonstrated in vitro.[25] Consequently, significant ambiguity remains about its function and the subsequent steps in the pathway.

On the other hand, significant progress has been made in our understanding of the nikkomycin/polyoxin pathways. In these pathways, UMP is first coupled with phosphoenolpyruvate (PEP) by NikO[22] to yield enolpyruvyl UMP (EP-UMP), on which NikJ generates a radical at C5’ and catalyzes a radical cyclization reaction to yield octosyl acid 5’-phosphate (5’-OAP).[23b] The resulting 5’-OAP is then converted to a C8-sugar nucleotide intermediate, 5’-amino-6’-hydroxy-octosyl acid 2’-phosphate (AHOAP). During this process, the 2’-OH of 5’-OAP is phosphorylated and the 5’-phosphate is hydrolyzed, a process designated as cryptic phosphorylation.[35b] Similar phosphorylation of biosynthetic intermediates was also found in some of the LipL-containing pathways, such as muraymycin[43] and caprazamycin[34] biosynthesis, and has been proposed as a self-resistance mechanism against the toxic effects of the biosynthetic intermediates. Although the role of 2’-phosphorylation in the nikkomycin pathway is unclear, it is essential for the activities of the downstream enzymes and conserved in the nikkomycin, polyoxin, and malayamycin pathways.[24a,35a] Since kinases with unknown functions are frequently found in many other nucleoside BGCs,[35b] similar cryptic phosphorylation is likely operating in other nucleoside pathways.

In the nikkomycin and polyoxin pathways, the C8-sugar nucleotide, AHOAP, is converted into a C6-sugar nucleotide, aminohexuronic acid 2’-phosphate (AHAP), by an α-KG and Fe-dependent oxygenase (NikI). The characteristic bicyclic structure of AHOAP was also found in several other antifungal nucleosides, such as ezomycins[1b] and malayamycin A.[46] Malayamycin A is biosynthesized through a pathway analogous to nikkomycin Z and the difference in the sugar size derives from the function of the α-KG and Fe-dependent oxygenase (MalI, Figure 2c).[24] MalI is homologous to NikI in nikkomycin biosynthesis (31% identity), but instead of the oxidative C–C bond cleavage catalyzed by NikI,[35b] MalI catalyzes the oxidation of 5’-OH of AHOAP into ketone.[24a] Subsequently, MalB dehydrogenase catalyzes decarboxylation and keto-reduction to yield a C7-sugar nucleotide (5’-amino-6’-hydroxyhepturonic acid 2’-phosphate; AHHP).[24a] The BGC for C8-sugar nucleosides, ezomycins, has not been reported, but it is likely that their nucleoside core is biosynthesized through the same mechanism without the final tailoring step by the NikI/MalI homolog. In fact, all the reported ezomycins have AHOAP as the nucleoside core structure.[1b] Therefore, the functions of the α-KG and Fe-dependent oxygenases, NikI and MalI, or their absence likely determines the sugar size of the final nucleoside metabolites.

To investigate the distribution of nucleoside natural products biosynthesized through the radical-mediated C5’ extension, we created an SSN of NikJ homologs in a way similar to that used for the LipL homologs described above. The initial SSN was generated using BLAST search results of NikJ with an E-value of 5 or better. With the edge threshold at the alignment score of 100, the NikJ-containing cluster had 48 sequences, all of which were co-clustered with NikO homologs. Most of these putative BGCs also encode the other nucleoside biosynthetic enzymes, NikN, NikL, NikM, and NikK, suggesting they likely produce nucleosides with 5’-amino-6’-hydroxyoctosyl acid (AHOA), aminohexuronic acid (AHA), or 5’-amino-6’-hydroxyhepturonic acid (AHH). The potential BGCs for C6 (AHA), C7 (AHH), and C8 (AHOA) sugar nucleosides can be identified based on the presence or absence of a NikI or MalI homolog. BGCs with a NikI homolog likely produce nucleosides with AHA as the core structure. BGCs with a MalI homolog likely produce nucleosides with AHH, and those without a NikI or MalI homolog likely produce C8-sugar nucleosides.
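The NikI/MalI rule for core prediction can be sketched as follows. The sketch assumes the BGC has already been shown to encode NikJ and NikO homologs; the function name and the "ambiguous" label for a BGC carrying both oxygenase types are illustrative additions, since the text does not describe that case.

```python
def predict_radical_pathway_core(has_niki, has_mali):
    """Predict the sugar core of a NikJ-containing BGC from its oxygenase genes."""
    if has_niki and has_mali:
        # Not covered by the rules in the text
        return "ambiguous"
    if has_niki:
        return "C6 (AHA)"
    if has_mali:
        return "C7 (AHH)"
    # No tailoring oxygenase: the C8 bicyclic core is retained
    return "C8 (AHOA)"


nikkomycin_like = predict_radical_pathway_core(True, False)
malayamycin_like = predict_radical_pathway_core(False, True)
ezomycin_like = predict_radical_pathway_core(False, False)
```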

Analogous to the LipL-containing BGCs, the conservation of the genome neighborhood of nikJ homologs showed very good agreement with the NikJ phylogenetic analysis (compare Figures 4 and S3). For example, NikJ and MalJ have close homologs based on the phylogenetic tree in Figure 4, which are associated with nikkomycin and malayamycin-like pathways, respectively, as identified in the genome neighborhood analysis in Figure S3. Several other pathways are conserved within the genus Xenorhabdus or Photorhabdus, or among several genera of actinobacteria (Figure S3), suggesting the presence of relatively conserved nucleoside natural product biosynthetic pathways that have not been characterized yet. Overall, the comparison of the NikJ phylogenetic tree and the genome neighborhood analysis suggests that NikJ and these nucleoside biosynthetic pathways likely co-evolved and that NikJ serves as an excellent genetic probe to identify putative nucleoside BGCs.

Figure 4.

Analysis of putative BGCs containing NikJ homologs. The phylogenetic tree represents evolutionary relationships of NikJ homologs among the organisms in the first column of the table. Each branch of the phylogenetic tree is labeled with bootstrap values. The table summarizes the presence of nucleoside core biosynthetic genes and the modification enzymes. In the “NikI/MalI” column, NikI homologs are shown in cyan, and MalI homologs are in orange. Also shown are the predicted or reported structure of the nucleoside core (Core column) and the reported metabolites (Metabolite column). The BGCs for nikkomycins,[24b,44] polyoxins,[45] and malayamycin A[24b] have been reported.

Many of these putative BGCs harbor one or more copies of an ATP grasp amide ligase homologous to NikS (see “NikS” column in Figure 4). Since NikS catalyzes the ligation of an amino acid to the 5’-amine of AHAP,[35] these BGCs with NikS homologs likely produce peptidyl nucleosides. Interestingly, some pathways with a MalI homolog or without a NikI/MalI homolog also harbor a NikS homolog. Since the malayamycin pathway does not have a NikS homolog, these pathways likely produce structurally novel C7- or C8-sugar peptidylnucleosides. Also, substantial diversity was found in the NikI-containing BGCs. While structural prediction of their putative metabolites is currently not possible due to the poor understanding of these pathways, they likely produce nucleoside and peptidylnucleoside natural products structurally distinct from nikkomycins and polyoxins.

Finally, these nucleosides likely have variations in the nucleobase structure. Most characteristic is the presence of the pseudouridine structure reported for malayamycin A.[46] In malayamycin biosynthesis, the pseudouridine nucleoside has been proposed to be formed by TruD,[24b] which is homologous to an E. coli tRNA pseudouridine synthase (PS)[47] and to the PS (PumJ) encoded in the BGC for another pseudouridine-containing nucleoside natural product, pseudouridimycin.[28] While characterized PSs act on tRNA or rRNA,[47a] TruD[24b] in malayamycin biosynthesis and PumJ[28] were proposed to act on a uridine mononucleotide. Many of the putative BGCs in Figure 4 contain PS homologs, suggesting that they may produce pseudouridine nucleosides. The target of malayamycin is unknown, and therefore the role of the pseudouridine base is unclear. Nevertheless, the presence of these pseudouridine pathways creates additional structural diversity in these natural products.
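The qualitative reasoning in the two preceding paragraphs can be summarized as a small rule set. The sketch below is only an illustration under stated assumptions: the record type ClusterAnnotation, its field names, and the function annotate are hypothetical and are not part of the analysis reported here, and the presence or absence of each homolog is assumed to have been determined beforehand (for example, from the table in Figure 4). The rules restate the tentative inferences in the text and do not predict final metabolite structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterAnnotation:
    """Hypothetical record of homologs found in one putative BGC."""
    has_nikj: bool    # NikJ-type radical SAM enzyme homolog
    has_niki: bool    # NikI homolog
    has_mali: bool    # MalI homolog
    niks_copies: int  # number of NikS (ATP-grasp ligase) homologs
    has_ps: bool      # pseudouridine synthase (TruD/PumJ-like) homolog


def annotate(cluster: ClusterAnnotation) -> List[str]:
    """Return tentative, text-derived notes for one putative BGC."""
    if not cluster.has_nikj:
        # NikJ is the genetic probe; clusters without it are out of scope.
        return ["no NikJ homolog: not covered by this rule set"]

    notes = ["putative nucleoside BGC (NikJ homolog present)"]

    if cluster.niks_copies > 0:
        notes.append("likely peptidyl nucleoside (NikS homolog present)")
        # NikS together with MalI, or with neither NikI nor MalI, is the
        # combination the text associates with novel C7-sugar products.
        lacks_both = not (cluster.has_niki or cluster.has_mali)
        if cluster.has_mali or lacks_both:
            notes.append("possibly a novel C7-sugar peptidyl nucleoside")

    if cluster.has_ps:
        notes.append("may contain a pseudouridine base (PS homolog present)")

    return notes


# Example: a hypothetical cluster with MalI, one NikS copy, and a PS homolog.
example = ClusterAnnotation(
    has_nikj=True, has_niki=False, has_mali=True, niks_copies=1, has_ps=True
)
example_notes = annotate(example)
```

Each note is deliberately worded as a possibility, since the text infers these features from gene content alone and experimental validation is still required.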

While the current list of reported nucleosides biosynthesized through radical-mediated C5’ extension is short, the number of organisms with BGCs encoding NikJ homologs was greater than the number with LipL homologs. This observation suggests that many more nucleosides with the AHA, AHH, or AHOA core structure are produced in nature than is currently appreciated. Since these molecules have not been discovered so far, the BGCs are most likely silent under laboratory culture conditions. Alternatively, the molecules may have biological activities that are not commonly screened for, such as quorum sensing, communication with plant hosts, or selective antimicrobial activities against phytopathogens. Therefore, isolating these compounds and characterizing their biological functions could provide novel chemical tools to modulate these interactions.

Summary and Outlook

The analyses described here demonstrate how biosynthetic understanding can help predict the structures of putative metabolites from genome sequence information. The comparison of the phylogenetic and genome neighborhood analyses supported the feasibility of identifying putative nucleoside BGCs using the lipL and nikJ genes. The extensive characterization of the known biosynthetic pathways allowed logical prediction of the nucleoside core structures. However, the analysis does not allow structural prediction of the final metabolites because of the diverse nature of nucleoside natural products. Also, while the predicted nucleoside natural products likely exhibit biological activities by perturbing the functions of enzymes that use nucleotide substrates, the prediction of their biological activity is currently not possible. Therefore, future experimental validation is necessary to elucidate the structures and biological activities of the nucleoside natural products produced by the proposed BGCs. Such studies will provide a more comprehensive picture of the diversity of nucleoside natural products and could reveal novel bioactive nucleoside natural products.

Supplementary Material

Du et al. SI

Acknowledgements

This work was supported by the Duke University School of Medicine and National Institute of General Medical Sciences R01 GM115729 (to K.Y.).

Biographies

Yanan Du obtained her Ph.D. degree from the Shanghai Institute of Organic Chemistry at the Chinese Academy of Sciences in 2021, where she studied under the mentorship of Prof. Wen Liu. Her doctoral research delved into the biosynthesis of thiopeptides. Currently, she is a Postdoctoral Associate at Duke University under the supervision of Prof. Kenichi Yokoyama. Her current research aims to elucidate the mechanisms of enzymes involved in the biosynthesis of nucleoside antifungal antibiotics.

Anyarat Thanapipatsiri received her Ph.D. degree from Kasetsart University under the supervision of Prof. Arinthip Thamchaipenet in 2016. During her Ph.D. program, she performed her research at the John Innes Centre under the supervision of Prof. Mervyn Bibb, focusing on the expression of actinobacterial Type III polyketide synthases for the discovery of novel polyketide natural products. She then joined Prof. Kenichi Yokoyama’s group at Duke University School of Medicine to pursue postdoctoral research on the biosynthesis and genome mining of nucleoside natural products.

Kenichi Yokoyama received his Ph.D. degree from the Tokyo Institute of Technology under the supervision of Prof. Tadashi Eguchi, studying the biosynthesis of aminoglycoside antibiotics. He conducted his postdoctoral research with Prof. JoAnne Stubbe at MIT on the mechanism of long-range radical propagation in ribonucleotide reductase. He is now an independent researcher at Duke University, where he combines his expertise in mechanistic enzymology and natural product biochemistry. One of the foci of his lab is the biosynthesis and mode of action of antifungal natural products. His lab also studies the radical S-adenosyl-L-methionine enzymes in natural product and cofactor biosynthesis.

Footnotes

Conflict of Interests

The authors declare no conflict of interest.

Supporting information for this article is available on the WWW under https://doi.org/10.1002/cbic.202300342

Supporting Information

Figures S1 and S2 are available in the Supporting Information.

References
