Abstract
Natural products (NPs) of microbial origin are highly valued for their diverse bioactive properties. Among bacteria, Streptomyces stands out as a prolific source of NPs with applications in medicine and agriculture. Recent advances in metabolomics and bioinformatics, as well as the abundance of genomic data, have revolutionized the study of NPs, enabling the rapid connection of biosynthetic pathways and metabolites. However, discovering novel compounds from large pools of genomes and strains is cumbersome. Metabolo-genomics approaches are promising strategies that can save time and resources at initial stages of the natural product discovery pipeline by rapidly linking molecules and their biosynthetic genes. Here, we present the genomic characterization and metabolomic profiling of Streptomyces sp. KL110A, a strain isolated from the rainforest soils of Calakmul, Campeche, in Mexico. Using genome mining tools and LC-MS/MS metabolomics, we identified and characterized known biosynthetic gene clusters (BGCs) and proposed a mechanism for the biosynthesis of the benz(a)anthraquinone tetrangulol. Our findings underscore the relevance of integrating genomic and metabolomic approaches in elucidating novel biosynthetic pathways, positively contributing to the field of natural product research.
Supplementary Information
The online version contains supplementary material available at 10.1007/s11274-025-04298-7.
Keywords: Streptomyces, Genomics, Metabolomics, Genome mining, Natural products, Biosynthetic gene cluster, Tetrangulol
Introduction
Natural products (NPs) are bioactive metabolites produced by plants, animals, fungi, and bacteria. Within bacteria, the Streptomyces genus is the best-known source of NPs, as underlined by the many Streptomyces-derived molecules used to treat human diseases, safeguard animal health, and protect agricultural crops (Katz and Baltz 2016). The growing advances in genomics, metabolomics and bioinformatics have facilitated the study of NPs (Licona-Cassani et al. 2015; Gavriilidou et al. 2022). These developments not only allow us to predict the biosynthetic basis of NPs but also to identify novel classes and even to optimize their expression through genetic engineering approaches (Liu et al. 2018).
In recent years, the discovery of novel NPs has mostly relied on extensive sampling of rare environments and the use of genome mining workflows (Cruz-Morales et al. 2017; Rodriguez-Sanchez et al. 2023; Undabarrena et al. 2016; Ziemert et al. 2014). Through high-throughput microbial genome sequencing and advanced bioinformatics tools (Blin et al. 2023; Lee et al. 2020; Zdouc et al. 2024), it is now possible to rapidly identify and annotate the biosynthetic gene clusters (BGCs) and the molecules they produce (Jørgensen et al. 2024; Lee et al. 2020). An unsought consequence of the increasing number of available genomes and strain collections is the repeated discovery of known compounds, a problem that drains resources and overall limits the progress in the discovery of novel compounds (Aguilar et al. 2024; Chevrette and Handelsman 2021; Lu et al. 2021; Sidebottom and Carlson 2015).
While genome mining tools allow for the identification and prediction of BGC products, direct characterization of NPs is achieved through MS2-based metabolomics. The correlation of genomics and metabolomics datasets has emerged as a promising approach for dereplication, that is, to eliminate known and redundant compounds and select for new NPs (Lee et al. 2022; Ma et al. 2003; Mohimani et al. 2017; Yurekten et al. 2024; Zuffa et al. 2024). The use of this approach has recently increased the discovery of novel NPs, as supported by numerous studies showcasing the versatility and power of genome mining approaches (Rateb et al. 2015; Duncan et al. 2015; Lu et al. 2021). These findings underscore the potential of integrating omics approaches to uncover novel bioactive compounds with significant biotechnological and therapeutic applications.
Here, we report the genomic characterization and metabolomic profile of Streptomyces sp. KL110A, a strain isolated from rainforest soils of Calakmul, Campeche. Through the integration of bioinformatics for genome analysis, and liquid chromatography coupled to mass spectrometry (LC-MS/MS) for metabolomic analysis, we rapidly dereplicated BGCs and identified the candidate BGC potentially responsible for the biosynthesis of tetrangulol. Our study highlights the value of the correlation between genomic data and metabolomics to identify novel biosynthetic routes for known metabolites and for the discovery of new bioactive compounds.
Materials and methods
Strain isolation
Streptomyces sp. KL110A was isolated from wetlands at the Calakmul Biosphere Reserve, Campeche, México (18°36′43″ N, 89°32′53″ W). Samples were collected from 15 cm depth using a soil sampler. Pure cultures were obtained by serial dilutions using the International Streptomyces Project liquid medium 2 (ISP-2) supplemented with nalidixic acid (30 mg/mL) and cycloheximide (100 mg/mL) as previously described (Trejo-Alarcon et al. 2025).
Genome sequencing, assembly, and annotation
For genomic DNA extraction, 15 mL of ISP-2 liquid medium was inoculated with 20 µL of glycerol stocks with a spore concentration of approximately 1 × 10^6 spores/mL. Genomic DNA was obtained by liquid nitrogen lysis and phenol:chloroform:isoamyl alcohol (25:24:1, v/v) extraction as reported previously (González-Salazar et al. 2023). Streptomyces sp. KL110A was sequenced using a hybrid approach combining short-read and long-read sequencing technologies. Short-read sequencing was performed on the Illumina MiSeq platform with 2 × 300 bp paired-end reads. The DNA libraries were prepared using the TruSeq Nano DNA library kit (Illumina, San Diego, CA, USA) following the manufacturer’s instructions. Long-read sequencing was performed with Oxford Nanopore Technologies (ONT), using the ONT rapid sequencing kit (catalog number SQK-RAD003) and an R9.4 flow cell (FLO-MIN106) as previously described (Gallegos-Lopez et al. 2020).
We used FastQC version 0.74 (Andrew 2010) to assess the quality of the sequencing reads and Trimmomatic (Bolger et al. 2014) to remove adapters and low-quality bases, excluding reads shorter than 200 bp. The genome was assembled using Unicycler version 0.4.8 implemented in the BV-BRC server version 3.93.3 (Wattam et al. 2018; Wick et al. 2017). The assembly was annotated using the NCBI PGAP pipeline version 6.9 (Tatusova et al. 2016) and the BV-BRC server (version 3.93.3), which is based on the RAST tool (Wattam et al. 2018). Annotation of the regions containing biosynthetic gene clusters (BGCs) was performed using antiSMASH version 7.0 (Blin et al. 2023) and the MIBiG database 4.0 (Zdouc et al. 2024). The completeness of the annotations was then assessed using BiG-FAM version 1.0.0 (Kautsar et al. 2021).
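As an illustration of the read-length criterion described above, the following minimal sketch discards FASTQ records shorter than a threshold. It is not the Trimmomatic implementation used in this work; the function names `parse_fastq` and `filter_min_length` and the toy reads are hypothetical, and only the 200 bp cutoff comes from the text.

```python
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality string)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    cleaned = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(cleaned) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")
    for i in range(0, len(cleaned), 4):
        header, seq, _plus, qual = cleaned[i:i + 4]
        yield header, seq, qual


def filter_min_length(records: Iterable[FastqRecord],
                      min_len: int = 200) -> List[FastqRecord]:
    """Keep only reads whose sequence is at least min_len bases long."""
    return [rec for rec in records if len(rec[1]) >= min_len]


# Toy example (hypothetical reads): one 250 bp read and one 150 bp read.
toy_fastq = [
    "@read_1\n", "A" * 250 + "\n", "+\n", "I" * 250 + "\n",
    "@read_2\n", "A" * 150 + "\n", "+\n", "I" * 150 + "\n",
]
kept_reads = filter_min_length(parse_fastq(toy_fastq), min_len=200)
print([header for header, _, _ in kept_reads])
```

In practice the same criterion would be applied by the trimming tool itself after adapter and quality trimming, so that the length check acts on the trimmed read rather than the raw one.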
Phylogeny and taxonomic characterization
To infer the taxonomic placement of Streptomyces sp. KL110A, we compiled a dataset of 66 reference genomes, from which we obtained a set of 320 core proteins using Usearch v10.0.240 (Edgar 2010) implemented in BPGA (Chaudhari et al. 2016). The protein sequences were aligned, trimmed, and concatenated to obtain a partitioned matrix, which was used to calculate a maximum likelihood phylogenetic tree with IQ-TREE (version 2.0.7). The optimal substitution model for each partition was selected automatically using ModelFinder (Kalyaanamoorthy et al. 2017). Branch support was assessed with 10,000 ultrafast bootstrap replicates (Minh et al. 2020), and the tree was visualized using iTOL version 5 (Letunic and Bork 2021). Taxonomic characterization was obtained using the TYGS server (Meier-Kolthoff and Göker 2019).
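The concatenation step can be sketched as follows. This is a hedged illustration rather than the pipeline used here: the function `concatenate_alignments`, the gene names, and the toy sequences are hypothetical, and the sketch assumes each per-protein alignment has already been aligned and trimmed. It returns the concatenated matrix together with the 1-based coordinate range of each partition, which is the information a partitioned analysis needs.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence


def concatenate_alignments(
    alignments: Dict[str, Alignment],
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments and record partition boundaries.

    Partition coordinates are 1-based and inclusive.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"Sequences in {gene} differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon missing from one alignment is padded with gaps.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy example with two hypothetical core proteins and two taxa.
toy_alignments = {
    "protein_A": {"KL110A": "MKT-", "reference_1": "MKTA"},
    "protein_B": {"KL110A": "GGS", "reference_1": "GAS"},
}
supermatrix, partition_table = concatenate_alignments(toy_alignments)
for gene, first, last in partition_table:
    print(f"{gene} = {first}-{last}")
```

Keeping the partition table alongside the matrix is what allows a separate substitution model to be selected for each protein instead of a single model for the whole alignment.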
Metabolomics analysis
For metabolite extraction, 500 mL of BS medium (Trejo-Alarcon et al. 2025) in shake flasks was inoculated with 200 µL of glycerol stocks with a spore concentration of approximately 1 × 10^6 spores/mL. The cultures were incubated at 30 °C for 7 days and centrifuged at 14,000 g for 15 min. The secreted metabolites were extracted twice from the supernatants with ethyl acetate (EtOAc) 1:1. The extract was concentrated using the Buchi R-215 rotavapor®.
The extracts were analyzed using LC-MS/MS on a Vanquish Duo UHPLC binary system (Thermo Fisher Scientific, Waltham, MA, USA) coupled to an IDX-Orbitrap Mass Spectrometer (Thermo Fisher Scientific). Separation was performed on a Waters ACQUITY BEH C18 column (10 cm × 2.1 mm, 1.7 μm) (Waters™, Milford, MA, USA) equipped with an ACQUITY BEH C18 guard column and kept at 40 °C. The mobile phase consisted of MilliQ water + 0.1% formic acid (v/v) (A) and acetonitrile + 0.1% formic acid (v/v) (B) (both sourced from HiPerSolv CHROMANORM®, HPLC and LC-MS grade, VWR Chemicals BDH®). The mobile phase gradient composition was as follows: 0–0.8 min 2% B, 0.8–3.3 min 2–5% B, 3.3–10 min 5–100% B, 10–11 min 100% B. The column was then re-equilibrated at 2% B for 2.7 min. The flow rate was set at 0.35 mL/min. The injection volume was 1 µL.
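The gradient program above can be expressed as a piecewise-linear function of time. The sketch below returns the nominal programmed percentage of solvent B during the 11 min gradient. It assumes linear ramps between the stated breakpoints and ignores instrument dwell volume and the re-equilibration step; the names `GRADIENT_BREAKPOINTS` and `percent_b` are hypothetical.

```python
# (time in min, % B) breakpoints taken from the gradient described in the text.
GRADIENT_BREAKPOINTS = [
    (0.0, 2.0),
    (0.8, 2.0),
    (3.3, 5.0),
    (10.0, 100.0),
    (11.0, 100.0),
]


def percent_b(t_min: float) -> float:
    """Nominal % B at time t_min, assuming linear ramps between breakpoints."""
    first_time = GRADIENT_BREAKPOINTS[0][0]
    last_time = GRADIENT_BREAKPOINTS[-1][0]
    if not first_time <= t_min <= last_time:
        raise ValueError("time is outside the programmed gradient")
    segments = zip(GRADIENT_BREAKPOINTS, GRADIENT_BREAKPOINTS[1:])
    for (t0, b0), (t1, b1) in segments:
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise RuntimeError("unreachable for a sorted breakpoint list")


for t in (0.5, 3.3, 6.65, 10.5):
    print(f"{t:5.2f} min -> {percent_b(t):6.2f} % B")
```

Such a function is mainly useful for relating a retention time to the approximate solvent composition at elution, keeping in mind that the composition reaching the column lags the programmed value.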
The MS measurements were performed using heated electrospray ionization (HESI) in positive and negative ion modes as previously described (Radi et al. 2024). The voltage was set to 3500 V in positive mode and 2500 V in negative mode, and full MS and MS/MS spectra were acquired by data-dependent acquisition (DDA) in the mass range of 70–1000 Da. The mass resolution was set to 120,000 for full scan MS and 30,000 for MS/MS events. Precursor ions were fragmented by stepped higher-energy collisional dissociation (HCD) using collision energies of 20, 40, and 55. The automatic gain control (AGC) target value was set at 4E05 for the full MS and 5E04 for the MS/MS spectral acquisition.
The MS/MS data were converted to mzXML format using the MSConvert software, which is part of ProteoWizard (Palo Alto, CA, USA). Feature detection was performed with MZmine3 version 3.6.0 (Schmid et al. 2023). The resulting features were exported to Sirius 6.0.7 (Dührkop et al. 2019) for MS/MS dereplication and feature annotation.
To enhance metabolite annotation, the SNAP-MS tool (Morehouse et al. 2023) was used to create molecular networks. The analysis was performed using the following parameters: reference database set to full NP Atlas, adduct [M + H]+, PPM error of 5, minimum GNPS node size of 2, maximum GNPS node size of 5000, minimum Atlas node size of 3, minimum group size of 2, maximum result nodes of 500, maximum result edges of 10,000, and duplicate removal enabled.
Results
Genomic characterization and identification of the biosynthetic potential of Streptomyces sp. KL110A
The genome sequencing of Streptomyces sp. KL110A yielded 77.8 million high-quality paired-end reads and 38,319 ONT reads with average lengths of 73.5 and 2,745.9 bp, respectively, which were assembled into a genome of 8,705,038 bp (73.36% GC) in a total of 32 contigs (130× depth of coverage, N50 of 895,028 bp). The Streptomyces sp. KL110A genome encodes 7,482 proteins (4,164 on the forward strand and 3,318 on the reverse strand), 24 rRNA genes (8 complete sets of 5S, 16S, and 23S), 72 tRNA genes, and 2 ncRNA genes (Fig. 1a). A total of 4.10% of the genome encodes hypothetical proteins (320). Digital DNA-DNA Hybridization (dDDH) analysis indicated that Streptomyces sp. KL110A has 89.2% similarity compared to Streptomyces hydrogenans JCM 4771 (G + C content difference = 0.13%, δ = 0.142). Our phylogenetic analysis showed that Streptomyces sp. KL110A is closely related to S. purpureus KA281 (GCF_000384175), confirming that our strain can be clearly classified as Streptomyces.
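For readers less familiar with the assembly statistics reported above, the following sketch shows how N50 and GC content are conventionally computed. The contig lengths and the short sequence are hypothetical toy values, not the KL110A assembly, and the function names `n50` and `gc_percent` are illustrative.

```python
from typing import Iterable, List


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which the running sum of contigs, sorted from
    longest to shortest, first reaches half of the total assembly size."""
    lengths: List[int] = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    raise RuntimeError("unreachable for a non-empty list")


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases among all bases in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Toy values (hypothetical): six contigs and one short sequence.
toy_contig_lengths = [80, 70, 50, 40, 30, 20]
print("N50:", n50(toy_contig_lengths))
print("GC%:", round(gc_percent("GGCCAT"), 2))
```

With the toy lengths, the total is 290, and the two longest contigs (80 + 70 = 150) are enough to cover half of it, so the second contig length is reported.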
Fig. 1.
Genomic and phylogenetic analysis of Streptomyces sp. KL110A. (a) Graphical representation of the genome of Streptomyces sp. KL110A. From the outside to the center ring, number of contigs (dark blue), genes on the forward strand (light green), genes on the reverse strand (dark purple), GC content (green peaks), and putative biosynthetic gene clusters (BGCs) in diverse colors: NRPS in red, PKS in dark purple, RiPPs in dark green, terpenes in teal, siderophore clusters in yellow, melanin in deep maroon, ectoine in dark navy blue, butyrolactone in bright green, unknown clusters in pink and fragmented clusters in gray. In the center, the macroscopic morphology in MS medium is shown. (b) Identification and comparison of the desferrioxamine BGC. A high conservation of gene cluster architecture is displayed by the comparative analysis of the BGC identified in Streptomyces sp. KL110A (contig 13) with BGCs previously annotated (BGC0001453 and BGC0000940). On the bottom, the chemical structure of desferrioxamine. (c) Genomic-based phylogenetic analysis of Streptomyces sp. KL110A. The clade containing Streptomyces sp. KL110A (highlighted in red) is shown in yellow. Bootstrap values are indicated in circles. The scale bar indicates evolutionary distance, with branch lengths proportional to the number of substitutions per site
The genome of Streptomyces sp. KL110A encodes ∼ 23 BGCs, including three non-ribosomal peptide synthetase (NRPS) clusters, four polyketide synthase (PKS) clusters, five terpene clusters, ribosomally synthesized and post-translationally modified peptide (RiPP), siderophore, melanin, ectoine, butyrolactone, and hybrid NRPS-PKS clusters, and two unknown clusters distributed within the genome. We also identified two clusters located on contig edges, specifically a type II PKS (lugdunomycin, contig 3.1) and a terpene (2-methylisoborneol, contig 13.1) (Fig. 1a; Table 1). Using the MIBiG database 4.0 (Zdouc et al. 2024), we identified seven commonly found BGCs (80–100% similarity), namely, spore pigment (BGC0000271.5), class IV lanthipeptide/SflA (BGC0002337.2), isorenieratene (BGC0000664.5), geosmin (BGC0001181.4), 2-methylisoborneol (BGC0000658.3), desferrioxamin B/desferrioxamine E (BGC0000940.5) and ectoine (BGC0002052.3) (Table 1). Using BiG-SCAPE (Navarro-Muñoz et al. 2020), we clustered these BGCs into related families based on sequence similarity and domain architecture. Interestingly, the desferrioxamin B/desferrioxamine E BGC showed close clustering with those from Streptomyces argillaceus (BGC0001453), Streptomyces coelicolor A3(2) (BGC0000940) and another strain from our collection, Streptomyces sp. CC302I (Trejo-Alarcon et al. 2025) (Fig. 1b). Given that most of the BGCs could not be linked to known molecules using state-of-the-art bioinformatic tools, we reasoned that this strain may produce new compounds. Thus, we generated extracts for metabolomics analysis using LC-MS/MS, hoping to identify potentially new natural products.
Table 1.
Loci with biosynthetic gene clusters identified in Streptomyces sp. KL110A
Contig | Family | Size (nt) | Predicted product* | Similarity | MIBiG BGC-ID** |
---|---|---|---|---|---|
Complete BGCs | | | | | |
3.2 | NRPS | 105,275 | CDA1b/CDA2a/CDA2b/CDA3a, b/CDA4a, b | 15% | none |
3.5 | NRPS | 53,774 | corynecin III/corynecin I/corynecin II | 13% | none |
4.2 | NRPS | 64,917 | BE-43547A1/A2, BE-43547B1/B2/B3, BE-43547C1/C2 | 10% | none |
1.5 | T1PKS | 60,535 | lavendiol | 32% | none |
1.7 | T1PKS | 43,230 | amycolamycin A/ amycolamycin B | 4% | none |
3.3 | T2PKS | 72,512 | spore pigment | 83% | BGC0000271.5 |
1.1 | T3PKS | 41,130 | flaviolin/1,3,6,8-tetrahydroxynaphthalene | 66% | BGC0002127.3 |
1.4 | RiPP | 11,391 | / | 0% | none |
3.4 | RiPP | 22,351 | class IV lanthipeptide/SflA | 100% | BGC0002337.2 |
12.2 | RiPP | 10,827 | / | 0% | none |
1.8 | terpene | 26,440 | hopene | 76% | BGC0000663.5 |
3.6 | terpene | 24,895 | isorenieratene | 100% | BGC0000664.5 |
7.1 | terpene | 21,139 | kosinostatin | 3% | none |
12.3 | terpene | 22,246 | geosmin | 100% | BGC0001181.4 |
1.2 | siderophore | 32,643 | kinamycin | 16% | none |
4.1 | siderophore | 29,787 | desferrioxamin B/desferrioxamine E | 100% | BGC0000940.5 |
8.1 | melanin | 10,395 | istamycin | 8% | none |
12.1 | ectoine | 10,404 | ectoine | 100% | BGC0002052.3 |
1.6 | butyrolactone | 10,944 | griseoviridin/fijimycin A | 8% | none |
1.3 | hybrid (NRPS + PKS) | 58,217 | leucomycin | 23% | none |
5.1 | hybrid (PKS + RiPP) | 109,596 | BE-7585 A | 38% | none |
Fragmented BGCs | | | | | |
3.1 | T2PKS | 55,728 | lugdunomycin | 62% | BGC0002016.2 |
13.1 | terpene | 17,279 | 2-methylisoborneol | 100% | BGC0000658.3 |
* Clusters identified by antiSMASH through KnownClusterBlast
** Clusters identified by MIBiG comparison in antiSMASH. Some sequences are present in the MIBiG database but were not reported by antiSMASH due to low similarity percentages
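The dereplication criterion applied to Table 1 (80–100% similarity to a known cluster) can be sketched as a simple filter. The rows below are transcribed from Table 1; the variable names and the function `dereplicated` are hypothetical, and the sketch is an illustration of the thresholding logic only, not the antiSMASH or MIBiG comparison itself.

```python
# (contig, family, predicted product, similarity in %) transcribed from Table 1.
TABLE_1 = [
    ("3.2", "NRPS", "CDA", 15),
    ("3.5", "NRPS", "corynecin", 13),
    ("4.2", "NRPS", "BE-43547", 10),
    ("1.5", "T1PKS", "lavendiol", 32),
    ("1.7", "T1PKS", "amycolamycin", 4),
    ("3.3", "T2PKS", "spore pigment", 83),
    ("1.1", "T3PKS", "flaviolin/1,3,6,8-tetrahydroxynaphthalene", 66),
    ("1.4", "RiPP", "/", 0),
    ("3.4", "RiPP", "class IV lanthipeptide/SflA", 100),
    ("12.2", "RiPP", "/", 0),
    ("1.8", "terpene", "hopene", 76),
    ("3.6", "terpene", "isorenieratene", 100),
    ("7.1", "terpene", "kosinostatin", 3),
    ("12.3", "terpene", "geosmin", 100),
    ("1.2", "siderophore", "kinamycin", 16),
    ("4.1", "siderophore", "desferrioxamin B/desferrioxamine E", 100),
    ("8.1", "melanin", "istamycin", 8),
    ("12.1", "ectoine", "ectoine", 100),
    ("1.6", "butyrolactone", "griseoviridin/fijimycin A", 8),
    ("1.3", "hybrid (NRPS + PKS)", "leucomycin", 23),
    ("5.1", "hybrid (PKS + RiPP)", "BE-7585 A", 38),
    ("3.1", "T2PKS", "lugdunomycin", 62),
    ("13.1", "terpene", "2-methylisoborneol", 100),
]


def dereplicated(rows, min_similarity=80):
    """Return rows whose similarity meets the dereplication threshold."""
    return [row for row in rows if row[3] >= min_similarity]


known_bgcs = dereplicated(TABLE_1)
candidate_bgcs = [row for row in TABLE_1 if row not in known_bgcs]
print(len(known_bgcs), "known;", len(candidate_bgcs), "left for further study")
for contig, family, product, similarity in known_bgcs:
    print(f"contig {contig}: {product} ({similarity}%)")
```

The rows that fall below the threshold, including the fragmented type II PKS cluster on contig 3.1, are the ones that remain candidates for linking to metabolites detected by LC-MS/MS.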
Metabolomics profile of Streptomyces sp. KL110A in BS medium
For dereplication and metabolite annotation, the spectral data were initially analyzed on the GNPS platform, but no molecules could be annotated in the KL110A metabolome, underscoring the likelihood that this strain encodes new molecules. We then proceeded to use Sirius 6.0.7. Most detected molecules could not be annotated with high confidence (≥ 0.85). Nevertheless, we annotated tetrangulol with a similarity score of 72.89%. This observation was supported by the presence of a peak in the extracted ion chromatogram at a retention time of 7.67 min with m/z 305.0814 (mass error: 0.05 ppm), as well as by the analysis of the fragmentation spectrum corresponding to this parent ion (Fig. 2a and b and Fig. S1). Furthermore, our analysis revealed the presence of tetrangomycin, an intermediate in tetrangulol biosynthesis (Fig. 3b), at 7.88 min with [M + H]+ = 323.09 Da (Fig. S2), a Tanimoto similarity of 98.26%, and a CSI:FingerID score of -28.62. We used SNAP-MS networks to explore the potential connections between both peaks. We obtained a network that included tetrangulol along with closely related compounds such as tetrangomycin and boshramycinones (Fig. 2c). This result provides a comprehensive visualization of potential structural and biosynthetic relationships with the angucyclines and suggests a broader context for the association of tetrangulol within the metabolome.
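The theoretical m/z values behind these annotations can be checked with a short calculation. The sketch below computes the monoisotopic mass and the [M + H]+ m/z from an elemental formula and defines the usual ppm mass-error formula. The tetrangulol formula (C19H12O4) follows from the [C19H13O4]+ ion in Fig. 2; the tetrangomycin formula (C19H14O5) is an assumption chosen to be consistent with the exact mass of 322.084 Da reported for it. Function and variable names are hypothetical.

```python
from typing import Dict

# Monoisotopic masses of the most abundant isotopes (Da).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON_MASS = 1.00727646688  # Da


def neutral_mass(formula: Dict[str, int]) -> float:
    """Monoisotopic mass of a neutral molecule given its element counts."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


def mz_protonated(formula: Dict[str, int]) -> float:
    """Theoretical m/z of the singly charged [M + H]+ ion."""
    return neutral_mass(formula) + PROTON_MASS


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


TETRANGULOL = {"C": 19, "H": 12, "O": 4}
TETRANGOMYCIN = {"C": 19, "H": 14, "O": 5}  # assumed formula, see lead-in

for name, formula in (("tetrangulol", TETRANGULOL),
                      ("tetrangomycin", TETRANGOMYCIN)):
    print(f"{name}: M = {neutral_mass(formula):.4f} Da, "
          f"[M + H]+ = {mz_protonated(formula):.4f}")
```

A ppm error computed from an m/z value rounded to four decimals will generally not reproduce the error derived from the raw instrument data, so `ppm_error` is best applied to the unrounded measured m/z.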
Fig. 2.
LC-MS/MS analysis of the crude extract from Streptomyces sp. KL110A, annotation of tetrangulol, and molecular network analysis. (a) Extracted ion chromatogram of m/z 305.0814 [C19H13O4]+, showing data for the control and the crude extract from Streptomyces sp. KL110A. (b) MS/MS spectrum of m/z 305.0814 [C19H13O4]+ with proposed molecular formulas for the fragments. (c) Molecular network generated using SNAP-MS, showing the connections between tetrangulol, tetrangomycin, and related angucycline compounds
Fig. 3.
Comparison of tetrangulol-related BGCs in Streptomyces sp. KL110A. (a) CORASON analysis of the BGC in contig 3.1 of Streptomyces sp. KL110A. The phylogenetic tree shows the relationships between the BGC found in contig 3.1 of Streptomyces sp. KL110A and other previously characterized angucycline-producing BGCs. (b) Proposed tetrangulol biosynthetic route. Tetrangulol biosynthesis follows the type II PKS route, starting with the condensation of acetyl-CoA and nine malonyl-CoA units catalyzed by two ketosynthases (1,2) and an acyl carrier protein (3), producing a linear polyketide chain. This chain is folded and cyclized by polyketide cyclases (4,6), forming the aromatic core characteristic of benz(a)anthraquinone derivatives. Monooxygenases hydroxylate the aromatic nucleus, while NADPH- and FAD-dependent reductases modify it, generating intermediates such as tetrangomycin. A final dehydrogenation step converts tetrangomycin into tetrangulol
Tetrangulol: proposed biosynthetic route in Streptomyces sp. KL110A
After identifying tetrangulol in our metabolomics dataset, we mined the KL110A genome for its BGC using an optimized version (Huang et al. 2023) of the CORASON pipeline (Navarro-Muñoz et al. 2020), whose code is freely available at https://github.com/WeMakeMolecules/myCORASON. The analysis allowed us to identify a BGC in contig 3.1 (annotated as lugdunomycin, 62% similarity) with a biosynthetic mechanism similar to that of other angucyclines, such as kinanthraquinone produced by Streptomyces sp. SN 5934 and lugdunomycin produced by Streptomyces sp. QL37 (Fig. 3a). We therefore hypothesized that the same biosynthetic pathway, with minor variations, could lead to the production of tetrangulol.
We propose that tetrangulol biosynthesis follows a canonical type II PKS route. It begins with the condensation of one acetyl-CoA and nine malonyl-CoA units, catalyzed by two ketosynthases (1,2) and an acyl carrier protein (3), which generates a linear polyketide chain. This chain is then processed by polyketide cyclases (4,6), which facilitate folding and cyclization to form the characteristic aromatic core of benz(a)anthraquinone derivatives. Subsequently, the aromatic nucleus is modified by monooxygenases that hydroxylate it and by NADPH- and FAD-dependent reductases that reduce it, generating intermediates such as tetrangomycin. Finally, a dehydrogenation step converts this intermediate into tetrangulol (Fig. 3b).
Discussion
In this work, we present the isolation and a metabolomic-guided genome mining analysis of Streptomyces sp. KL110A, an environmental strain from wetlands of the Calakmul Biosphere Reserve in Mexico. While preliminary genome mining analysis allowed us to dereplicate commonly found biosynthetic gene clusters, using metabolomics we detected the presence of tetrangulol, providing a logical explanation for its biosynthesis. Our approach underscores the importance of using combined omics datasets for advancing the discovery of novel biosynthetic pathways of natural products.
It is a common belief that the remaining novel microbial metabolites are mostly produced by “rare” actinomycetes (i.e., all actinomycete genera except for Streptomyces) (Ding et al. 2019). Our study demonstrates that novel environments harbor novel biosynthetic pathways, even in over-sampled taxa. This aligns with the general understanding that specialized metabolites have evolved in response to environmental cues. In our study, we first used antiSMASH (Blin et al. 2023) to identify ∼ 23 BGCs. As expected, we rapidly dereplicated BGCs that are redundant in Streptomyces, such as those for desferrioxamine, geosmin and ectoine.
Metabolomic analysis performed using LC-MS/MS allowed us to go beyond genomic predictions. The analysis revealed the presence of compounds related to angucyclines in the extract from Streptomyces sp. KL110A. Tetrangulol was annotated based on its molecular ion [M + H]+ = 305.081 Da, which corresponds to the molecular formula C19H12O4, and on its predicted fragmentation pattern according to Sirius 6.0.7 (Fig. S1). However, the 73% Tanimoto similarity score highlights a limitation in its definitive identification due to the lack of MS/MS data for tetrangulol in publicly available databases. Interestingly, we also identified tetrangomycin with a molecular ion [M + H]+ = 323.09 Da, corresponding to its exact mass of 322.084 Da, which showed a high similarity score of 98%. This compound is well documented in the literature as a precursor in the biosynthetic pathway leading to tetrangulol (Hong et al. 1997). Previous studies, such as those performed on Streptomyces cyanogenus S-136, have reported tetrangomycin and tetrangulol as compounds structurally related to other angucyclines (Kharel et al. 2012). This observation is consistent with the biosynthetic pathway proposed in Fig. 3.
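To make the similarity scores quoted above more concrete, the following sketch computes a Tanimoto coefficient between two molecular fingerprints represented as sets of "on" bit positions. It illustrates the general definition only and does not reproduce how Sirius or CSI:FingerID derive or compare fingerprints; the function name `tanimoto` and the toy bit sets are hypothetical.

```python
from typing import Set


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto coefficient: shared 'on' bits divided by all 'on' bits.

    Returns 0.0 when both fingerprints are empty (a convention chosen here).
    """
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)


# Toy fingerprints (hypothetical bit positions).
predicted_fingerprint = {1, 2, 3, 4}
candidate_fingerprint = {2, 3, 4, 5}
score = tanimoto(predicted_fingerprint, candidate_fingerprint)
print(f"Tanimoto similarity: {score:.2%}")
```

A score of 1 means the two fingerprints set exactly the same bits, so a value near 73% indicates partial agreement between the predicted fingerprint and the candidate structure rather than a confirmed match.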
To complement our MS/MS-based analysis, we used SNAP-MS to construct a molecular network including tetrangulol, tetrangomycin, and other angucyclines such as boshramycinones and fujianmycin A (Fig. 2c). This network highlights the chemical diversity within the metabolome of this strain and supports the biosynthetic relationships between these angucycline-related compounds.
Although these findings highlight the biosynthetic capabilities of Streptomyces sp. KL110A to produce novel molecules, we face certain limitations in our current analysis. While annotation tools alone may not suffice for unequivocal and complete structural elucidation of tetrangulol, integrating genomic and metabolomic data provides an additional layer of confidence, supporting our proposed structure. Future studies employing authentic standards or complementary techniques, such as NMR spectroscopy, will be crucial to validate these findings.
Overall, our findings underscore the importance of complementing genomic analysis with metabolomic tools to identify metabolites not predictable by current databases. This challenge highlights the complexity and potential of Streptomyces sp. KL110A to produce novel molecules, as evidenced by the low number of annotated compounds, which are not common in other, more redundant Streptomyces strains (Mohite et al. 2024).
In conclusion, our work presents the identification of the putative BGC responsible for producing tetrangulol by Streptomyces sp. KL110A. This study highlights the importance of integrating high quality omics datasets such as metabolomics and genomics to advance the field of microbial natural products research and emphasizes the need to update and expand spectral databases. Such efforts will facilitate future dereplications and a more comprehensive characterization of the biosynthetic capabilities of unexplored strains.
Acknowledgements
We thank the FEMSA Biotechnology Center of Tecnologico de Monterrey and the Novo Nordisk Foundation Center for Biosustainability of DTU for financial support. The National Council for Humanities, Science and Technology (CONAHCYT) provided a PhD scholarship to LMTA. We thank Dr. Karina Verdel-Aranda from the Instituto Tecnológico de Chiná-TecNM for providing access to the Calakmul Biosphere Reserve through the permit. The work by PCM, LA, DR, ACC and CCP was funded by the Novo Nordisk Foundation with Grant NNF20CC0035580.
Author contributions
LMTA performed isolation, genome sequencing and genomic characterization, and wrote the original draft and the final version of the manuscript. CPC and CCA performed genome mining analysis, analyzed the metabolomics data and revised the final version of the manuscript. RD and LA generated the metabolomics dataset. PCM and CLC designed and supervised the experimental work, wrote the original draft, revised the final version of the manuscript and provided financial support.
Funding
This work was partially supported by the FEMSA Biotechnology Center from Tecnologico de Monterrey and the Novo Nordisk Foundation Center for Biosustainability from DTU through the Research program TEC-DTU 2024. LMTA thanks the National Council for Humanities, Science and Technology (CONAHCYT) for a PhD scholarship (living allowance) and Tecnológico de Monterrey for a tuition fee scholarship. CLC and PCM received funding from UCMEXUS Grant No. CN-18-10. The work by PCM, ACC and CCP was funded by the Novo Nordisk Foundation with Grant NNF20CC0035580.
Data availability
The Streptomyces sp. KL110A genome sequence has been deposited at DDBJ/ENA/GenBank under accession JBJPFI000000000. The MS1 and MS2 data of tetrangulol derived from Streptomyces sp. KL110A have been deposited in the MassIVE repository under accession number MSV000096653.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Pablo Cruz-Morales, Email: pcruzm@biosustain.dtu.dk.
Cuauhtémoc Licona-Cassani, Email: clicona@tec.mx.
References
- Aguilar C, Alwali A, Mair M, Rodriguez-Orduña L, Contreras-Peruyero H, Modi R, Roberts C, Sélem-Mojica N, Licona-Cassani C, Parkinson EI (2024) Actinomycetota bioprospecting from ore-forming environments. Microb Genomics 10(5):001253. 10.1099/mgen.0.001253 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Andrew S (2010) FastQC: A Quality Control tool for High Throughput Sequence Data. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- Blin K, Shaw S, Augustijn HE, Reitz ZL, Biermann F, Alanjary M, Fetter A, Terlouw BR, Metcalf WW, Helfrich EJN, van Wezel GP, Medema MH, Weber T (2023) antiSMASH 7.0: New and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res 51(W1):W46–W50. 10.1093/nar/gkad344 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinf (Oxford England) 30(15):2114–2120. 10.1093/bioinformatics/btu170 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chaudhari NM, Gupta VK, Dutta C (2016) BPGA- an ultra-fast pan-genome analysis pipeline. Sci Rep 6(1):24373. 10.1038/srep24373 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chevrette G, M., Handelsman J (2021) Needles in haystacks: reevaluating old paradigms for the discovery of bacterial secondary metabolites. Nat Prod Rep 38(11):2083–2099. 10.1039/D1NP00044F [DOI] [PubMed] [Google Scholar]
- Cruz-Morales P, Ramos-Aboites HE, Licona-Cassani C, Selem-Mójica N, Mejía-Ponce PM, Souza-Saldívar V, Barona-Gómez F (2017) Actinobacteria phylogenomics, selective isolation from an iron oligotrophic environment and siderophore functional characterization, unveil new desferrioxamine traits. FEMS Microbiol Ecol 93(9):fix086. 10.1093/femsec/fix086 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dührkop K, Fleischauer M, Ludwig M, Aksenov AA, Melnik AV, Meusel M, Dorrestein PC, Rousu J, Böcker S (2019) SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information. Nat Methods 16(4):299–302. 10.1038/s41592-019-0344-8 [DOI] [PubMed] [Google Scholar]
- Ding T, Yang L-J, Zhang W-D, Shen Y-H (2019) The secondary metabolites of rare actinomycetes: Chemistry and bioactivity. RSC Adv 9(38):21964–21988. 10.1039/C9RA03579F [DOI] [PMC free article] [PubMed] [Google Scholar]
- Duncan KR, Crüsemann M, Lechner A, Sarkar A, Li J, Ziemert N, Wang M, Bandeira N, Moore BS, Dorrestein PC, Jensen PR (2015) Molecular networking and pattern-based genome mining improves Discovery of Biosynthetic Gene clusters and their products from Salinispora Species. Chem Biol 22(4):460–471. 10.1016/j.chembiol.2015.03.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Edgar RC (2010) Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26(19):2460–2461. 10.1093/bioinformatics/btq461 [DOI] [PubMed] [Google Scholar]
- Gallegos-Lopez S, Mejia-Ponce PM, Gonzalez-Salazar LA, Rodriguez-Orduña L, Souza-Saldivar V, Licona-Cassani C (2020) Draft genome sequence of Streptomyces sp. Strain C8S0, isolated from a highly oligotrophic sediment. Microbiol Resource Announcements 9(14). 10.1128/mra.01441-19 [DOI] [PMC free article] [PubMed]
- Gavriilidou A, Kautsar SA, Zaburannyi N, Krug D, Müller R, Medema MH, Ziemert N (2022) Compendium of specialized metabolite biosynthetic diversity encoded in bacterial genomes. Nat Microbiol 7(5):726–735. 10.1038/s41564-022-01110-2
- González-Salazar LA, Quezada M, Rodríguez-Orduña L, Ramos-Aboites H, Capon RJ, Souza-Saldívar V, Barona-Gomez F, Licona-Cassani C (2023) Biosynthetic novelty index reveals the metabolic potential of rare actinobacteria isolated from highly oligotrophic sediments. Microb Genomics 9(1):mgen000921. 10.1099/mgen.0.000921
- Hong ST, Carney JR, Gould SJ (1997) Cloning and heterologous expression of the entire gene clusters for PD 116740 from Streptomyces strain WP 4669 and tetrangulol and tetrangomycin from Streptomyces rimosus NRRL 3016. J Bacteriol 179(2):470–476. 10.1128/jb.179.2.470-476.1997
- Huang J, Quest A, Cruz-Morales P, Deng K, Pereira JH, Van Cura D, Kakumanu R, Baidoo EEK, Dan Q, Chen Y, Petzold CJ, Northen TR, Adams PD, Clark DS, Balskus EP, Hartwig JF, Mukhopadhyay A, Keasling JD (2023) Complete integration of carbene transfer chemistry into biosynthesis. Nature 617(7960):403–408. 10.1038/s41586-023-06027-2
- Jørgensen TS, Mohite OS, Sterndorff EB, Alvarez-Arevalo M, Blin K, Booth TJ, Charusanti P, Faurdal D, Hansen TØ, Nuhamunada M, Mourched A-S, Palsson BØ, Weber T (2024) A treasure trove of 1034 actinomycete genomes. Nucleic Acids Res 52(13):7487–7503. 10.1093/nar/gkae523
- Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS (2017) ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14(6):587–589. 10.1038/nmeth.4285
- Katz L, Baltz RH (2016) Natural product discovery: past, present, and future. J Ind Microbiol Biotechnol 43(2–3):155–176. 10.1007/s10295-015-1723-5
- Kautsar SA, Blin K, Shaw S, Weber T, Medema MH (2021) BiG-FAM: the biosynthetic gene cluster families database. Nucleic Acids Res 49(D1):D490–D497. 10.1093/nar/gkaa812
- Kharel MK, Pahari P, Shepherd MD, Tibrewal N, Nybo SE, Shaaban A, Rohr J (2012) Angucyclines: biosynthesis, mode-of-action, new natural products, and synthesis. Nat Prod Rep 29(2):264–325. 10.1039/c1np00068c
- Lee N, Hwang S, Kim J, Cho S, Palsson B, Cho B-K (2020) Mini review: genome mining approaches for the identification of secondary metabolite biosynthetic gene clusters in Streptomyces. Comput Struct Biotechnol J 18:1548–1556. 10.1016/j.csbj.2020.06.024
- Lee S, van Santen JA, Farzaneh N, Liu DY, Pye CR, Baumeister TUH, Wong WR, Linington RG (2022) NP Analyst: an open online platform for compound activity mapping. ACS Cent Sci 8(2):223–234. 10.1021/acscentsci.1c01108
- Letunic I, Bork P (2021) Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res 49(W1):W293–W296. 10.1093/nar/gkab301
- Licona-Cassani C, Cruz-Morales P, Manteca A, Barona-Gomez F, Nielsen LK, Marcellin E (2015) Systems biology approaches to understand natural products biosynthesis. Front Bioeng Biotechnol 3:199. 10.3389/fbioe.2015.00199
- Liu R, Deng Z, Liu T (2018) Streptomyces species: ideal chassis for natural product discovery and overproduction. Metab Eng 50:74–84. 10.1016/j.ymben.2018.05.015
- Lu Q-P, Huang Y-M, Liu S-W, Wu G, Yang Q, Liu L-F, Zhang H-T, Qi Y, Wang T, Jiang Z-K, Li J-J, Cai H, Liu X-J, Luo H, Sun C-H (2021) Metabolomics tools assisting classic screening methods in discovering new antibiotics from mangrove Actinomycetia in Leizhou Peninsula. Mar Drugs 19(12):688. 10.3390/md19120688
- Ma B, Zhang K, Hendrie C, Liang C, Li M, Doherty-Kirby A, Lajoie G (2003) PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom 17(20):2337–2342. 10.1002/rcm.1196
- Meier-Kolthoff JP, Göker M (2019) TYGS is an automated high-throughput platform for state-of-the-art genome-based taxonomy. Nat Commun 10(1):2182. 10.1038/s41467-019-10210-3
- Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R (2020) IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol 37(5):1530–1534. 10.1093/molbev/msaa015
- Mohimani H, Gurevich A, Mikheenko A, Garg N, Nothias L-F, Ninomiya A, Takada K, Dorrestein PC, Pevzner PA (2017) Dereplication of peptidic natural products through database search of mass spectra. Nat Chem Biol 13(1):30–37. 10.1038/nchembio.2219
- Mohite OS, Jørgensen TS, Booth T, Charusanti P, Phaneuf PV, Weber T, Palsson BO (2024) Pangenome mining of the Streptomyces genus redefines their biosynthetic potential. bioRxiv 2024.02.20.581055. 10.1101/2024.02.20.581055
- Morehouse NJ, Clark TN, McMann EJ, van Santen JA, Haeckl FPJ, Gray CA, Linington RG (2023) Annotation of natural product compound families using molecular networking topology and structural similarity fingerprinting. Nat Commun 14(1):308. 10.1038/s41467-022-35734-z
- Navarro-Muñoz JC, Selem-Mojica N, Mullowney MW, Kautsar SA, Tryon JH, Parkinson EI, De Los Santos ELC, Yeong M, Cruz-Morales P, Abubucker S, Roeters A, Lokhorst W, Fernandez-Guerra A, Cappelini LTD, Goering AW, Thomson RJ, Metcalf WW, Kelleher NL, Barona-Gomez F, Medema MH (2020) A computational framework to explore large-scale biosynthetic diversity. Nat Chem Biol 16(1):60–68. 10.1038/s41589-019-0400-9
- Radi MS, Munro LJ, Rago D, Kell DB (2024) An untargeted metabolomics strategy to identify substrates of known and orphan E. coli transporters. Membranes 14(3):70. 10.3390/membranes14030070
- Rodriguez-Sanchez AC, González-Salazar LA, Rodriguez-Orduña L, Cumsille Á, Undabarrena A, Camara B, Sélem-Mojica N, Licona-Cassani C (2023) Phylogenetic classification of natural product biosynthetic gene clusters based on regulatory mechanisms. Front Microbiol 14:1290473. 10.3389/fmicb.2023.1290473
- Rateb ME, Zhai Y, Ehrner E, Rath CM, Wang X, Tabudravu J, Ebel R, Bibb M, Kyeremeh K, Dorrestein PC, Hong K, Jaspars M, Deng H (2015) Legonaridin, a new member of linaridin RiPP from a Ghanaian Streptomyces isolate. Org Biomol Chem 13(37):9585–9592. 10.1039/C5OB01269D
- Schmid R, Heuckeroth S, Korf A, Smirnov A, Myers O, Dyrlund TS, Bushuiev R, Murray KJ, Hoffmann N, Lu M, Sarvepalli A, Zhang Z, Fleischauer M, Dührkop K, Wesner M, Hoogstra SJ, Rudt E, Mokshyna O, Brungs C, Pluskal T (2023) Integrative analysis of multimodal mass spectrometry data in MZmine 3. Nat Biotechnol 41(4):447–449. 10.1038/s41587-023-01690-2
- Sidebottom AM, Carlson EE (2015) A reinvigorated era of bacterial secondary metabolite discovery. Curr Opin Chem Biol 24:104–111. 10.1016/j.cbpa.2014.10.014
- Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J (2016) NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44(14):6614–6624. 10.1093/nar/gkw569
- Trejo-Alarcon LM, Cruz-Morales P, Licona-Cassani C (2025) Draft genome sequence of Streptomyces sp. CC302I with non-canonical biosynthetic gene clusters for codon-readthrough activity. Microbiol Resour Announc. 10.1128/mra.01109-24
- Undabarrena A, Beltrametti F, Claverías FP, González M, Moore ERB, Seeger M, Cámara B (2016) Exploring the diversity and antimicrobial potential of marine Actinobacteria from the Comau Fjord in Northern Patagonia, Chile. Front Microbiol 7:1135. 10.3389/fmicb.2016.01135
- Wattam AR, Brettin T, Davis JJ, Gerdes S, Kenyon R, Machi D, Mao C, Olson R, Overbeek R, Pusch GD, Shukla MP, Stevens R, Vonstein V, Warren A, Xia F, Yoo H (2018) Assembly, annotation, and comparative genomics in PATRIC, the all bacterial bioinformatics resource center. Methods Mol Biol 1704:79–101. 10.1007/978-1-4939-7463-4_4
- Wick RR, Judd LM, Gorrie CL, Holt KE (2017) Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13(6):e1005595. 10.1371/journal.pcbi.1005595
- Yurekten O, Payne T, Tejera N, Amaladoss FX, Martin C, Williams M, O’Donovan C (2024) MetaboLights: open data repository for metabolomics. Nucleic Acids Res 52(D1):D640–D646. 10.1093/nar/gkad1045
- Zdouc MM, Blin K, Louwen NLL, Navarro J, Loureiro C, Bader CD, Bailey CB, Barra L, Booth TJ, Bozhüyük KAJ, Cediel-Becerra JDD, Charlop-Powers Z, Chevrette MG, Chooi YH, D’Agostino PM, de Rond T, Del Pup E, Duncan KR, Gu W, Medema MH (2024) MIBiG 4.0: advancing biosynthetic gene cluster curation through global collaboration. Nucleic Acids Res gkae1115. 10.1093/nar/gkae1115
- Ziemert N, Lechner A, Wietz M, Millán-Aguiñaga N, Chavarria KL, Jensen PR (2014) Diversity and evolution of secondary metabolism in the marine actinomycete genus Salinispora. Proc Natl Acad Sci USA 111(12):E1130–E1139. 10.1073/pnas.1324161111
- Zuffa S, Schmid R, Bauermeister A, Gomes PWP, Caraballo-Rodriguez AM, El Abiead Y, Aron AT, Gentry EC, Zemlin J, Meehan MJ, Avalon NE, Cichewicz RH, Buzun E, Terrazas MC, Hsu C-Y, Oles R, Ayala AV, Zhao J, Chu H, Dorrestein PC (2024) microbeMASST: a taxonomically informed mass spectrometry search tool for microbial metabolomics data. Nat Microbiol 9(2):336–345. 10.1038/s41564-023-01575-9
Data Availability Statement
The Streptomyces sp. KL110A genome sequence has been deposited at DDBJ/ENA/GenBank under accession JBJPFI000000000. The MS1 and MS2 data of tetrangulol derived from Streptomyces sp. KL110A have been deposited in the MassIVE repository under accession number MSV000096653.