Rhodococcus comparative genomics reveals a phylogenomic-dependent non-ribosomal peptide synthetase distribution: insights into biosynthetic gene cluster connection to an orphan metabolite

Agustina Undabarrena; Ricardo Valencia; Andrés Cumsille; Leonardo Zamora-Leiva; Eduardo Castro-Nallar; Francisco Barona-Gomez; Beatriz Cámara

doi:10.1099/mgen.0.000621

2021 Jul 9;7(7):000621.

PMCID: PMC8477407 PMID: 34241590

Abstract

Natural products (NPs) are synthesized by biosynthetic gene clusters (BGCs), whose genes are involved in producing one or a family of chemically related metabolites. Advances in comparative genomics have been favourable for exploiting huge amounts of data and discovering previously unknown BGCs. Nonetheless, studying distribution patterns of novel BGCs and elucidating the biosynthesis of orphan metabolites remains a challenge. To fill this knowledge gap, our study developed a pipeline for high-quality comparative genomics for the actinomycete genus Rhodococcus, which is metabolically versatile, yet understudied in terms of NPs, leading to a total of 110 genomes, 1891 BGCs and 717 non-ribosomal peptide synthetases (NRPSs). Phylogenomic inferences showed four major clades retrieved from strains of several ecological habitats. BiG-SCAPE sequence similarity BGC networking revealed 44 unidentified gene cluster families (GCFs) for NRPS, which presented a phylogenomic-dependent evolution pattern, supporting the hypothesis of vertical gene transfer. As a proof of concept, we analysed in-depth one of our marine strains, Rhodococcus sp. H-CA8f, which revealed a unique BGC distribution within its phylogenomic clade, involved in producing a chloramphenicol-related compound. While this BGC is part of the most abundant and widely distributed NRPS GCF, corason analysis unveiled major differences regarding its genetic context, co-occurrence patterns and modularity. This BGC is composed of three sections, two well-conserved right/left arms flanking a very variable middle section, composed of nrps genes. The presence of two non-canonical domains in H-CA8f’s BGC may contribute to adding chemical diversity to this family of NPs. Liquid chromatography-high resolution MS and dereplication efforts retrieved a set of related orphan metabolites, the corynecins, which to our knowledge are reported here for the first time in Rhodococcus. 
Overall, our data provide insights to connect BGC uniqueness with orphan metabolites, by revealing key comparative genomic features supported by models of BGC distribution along phylogeny.

Keywords: biosynthetic gene clusters, comparative genomics, non-ribosomal peptide synthetase evolution, orphan metabolites, Rhodococcus

Data Summary

All supporting data and protocols have been provided within the article or through supplementary data files or Figshare repositories (https://doi.org/10.6084/m9.figshare.13158086.v2). Public genome data were retrieved from the National Center for Biotechnology Information GenBank (Table S1, available in the online version of this article). Code scripts are available as Jupyter notebooks in a GitHub repository (https://github.com/rvalenciaaz/rhodococcus-bgc). All supplementary material can be found on Figshare (https://doi.org/10.6084/m9.figshare.13158086.v2).

Impact Statement

Biosynthetic gene clusters (BGCs) harbour genetic information to build a myriad of natural products (NPs). Actinomycete NPs provide an unsurpassed resource in drug discovery to face multi-resistant pathogenic bacteria. Although researchers have been describing how BGCs play a role in their biosynthesis, little is known regarding the patterns modelling BGC structure and distribution. Understanding these patterns is important for linking the vast amount of genomic information with the production of NPs, especially orphan metabolites. This study performed a comparative genomics analysis of the underexplored genus Rhodococcus, using Rhodococcus sp. H-CA8f, one of our Chilean fjord-derived marine strains, as a model to perform an in-depth analysis of BGC distribution patterns. A BGC network revealed that the main category comprised non-ribosomal peptide synthetase (NRPS) pathways, retrieving 44 gene cluster families (GCFs). Our results support a strong correlation with phylogeny, revealing clade-specific GCFs. A deeper analysis of an NRPS in Rhodococcus sp. H-CA8f, likely to be producing an orphan chloramphenicol-related compound, revealed that its BGC distribution is unique within its phylogenomic clade. This study contributes to unveiling unique BGCs, understanding their distribution among clades and proposing their involvement in the production of an orphan metabolite never described before in Rhodococcus.

Introduction

Natural products (NPs) are commonly synthesized by complex specialized metabolic pathways, whose genes are physically grouped together in biosynthetic gene clusters (BGCs) [1, 2]. The advances in sequencing technologies and bioinformatics tools of the genomic era have played an essential role in the discovery of BGCs through genome mining [3, 4]. Thousands of sequences have become available, containing an even larger number of BGCs with overwhelming diversity, making a roadmap for their characterization necessary [5]. For instance, classifying BGCs into gene cluster families (GCFs) allows further prioritization based on the similarities shared between NP structural scaffolds [6, 7]. However, there are knowledge gaps regarding BGC linkage with NPs, leaving a vast abundance of orphan metabolites. Moreover, understanding BGC diversity, maintenance and distribution patterns, to ultimately decipher how these contribute to environmental adaptations, remains a challenge [8]. In this sense, comparative genomics allows a comprehensive exploration of BGCs based on high-throughput mining, providing the much-needed evidence to target certain BGCs, augmenting the knowledge to empower the genomic-guided bioprospection for NPs.

Actinomycetes have been in the spotlight as a renowned source of NPs, due to their ability to produce a myriad of structurally rich bioactive compounds [9–11]. Although focus has been historically placed on the soil-derived genus Streptomyces [12, 13], bioprospecting underexploited environments with strong selective pressures such as the ocean [14], along with the study of other genera – rather than Streptomyces [15] – has proven to be a successful strategy to enrich screening collections [16]. As actinomycete genome sequencing increases, a correlation between genome size and BGC abundance was evidenced for the genera Actinomadura, Gordonia, Micromonospora, Nocardia, Nocardiopsis and Rhodococcus, which were demonstrated to harbour an unexplored reservoir for unique BGCs [17]. Historically, the genus Rhodococcus has been largely explored for its extensive catabolic versatility, including bioremediation, biotransformation and biocatalysis applications [18–20]. In contrast, scarce knowledge is available regarding comparative BGC analysis, although Rhodococcus genomes currently add up to ~500 in the National Center for Biotechnology Information (NCBI) database.

Comparative studies of the genus Rhodococcus have mainly focused on defining phylogeny, determining the core genome and functionally analysing catabolic potential and stress responses [21, 22]. Notably, a prior study of 20 Rhodococcus genomes showed a mostly uncharacterized BGC repertoire, revealing certain strain-specific GCFs [23]. However, NP BGCs and the connection with the roles of their metabolites are mostly unknown [24]. A few studies connect non-ribosomal peptide synthetase (NRPS) pathways to their products, mostly with siderophores, such as heterobactin [25], rhequichelin [26] and rhodochelin [27], and also to a lipopeptide surfactant [28]. Other efforts have yielded humimycins, a synthetic NRPS-inspired NP [29], which no doubt validates the use of genome mining of BGCs. Still, little is known regarding NPs with antibiotic activity in the genus Rhodococcus. The main compounds known to date are the cyclic tetrapeptide rhodopeptins [30], the cyclic lasso peptides lariatins [31] and quinoline aurachins [32, 33], although none are reported from marine-derived Rhodococcus. Thus, there are still open questions about the main mechanisms underlying BGC distribution in rhodococci, and about their connection to specialized metabolites.

In this work, we aim to augment the knowledge of NRPS distribution across phylogeny, by performing an in-depth BGC comparative genomics analysis of the genus Rhodococcus. We developed a bioinformatics pipeline to address the selection of high-quality data, phylogenomics (corason) and sequence similarity BGC networking analyses to reveal patterns that model BGC diversity and structure. Moreover, the bioprospection of orphan metabolites was explored by using one of our bioactive strains, the marine-derived Rhodococcus sp. H-CA8f [34], as a proof of concept. Complementing high-throughput comparative genomics with phylogenomic and GCF network analysis sustains BGC correlations, thus enhancing genome mining predictions. Our results ultimately bear potential connections through biosynthesis, evolution and ecological implications of the genus Rhodococcus.

Methods

Comparative genomics pipeline

Rhodococcus genomes were downloaded from the NCBI RefSeq FTP server (306 entries as of 12th September 2018). Additionally, Rhodococcus sp. H-CA8f was selected from our culture collection, since it bears unique genomic features [34] and displayed antibacterial activity against both Gram-negative and Gram-positive target pathogens [35]. A comparative genomics pipeline was developed to comprehensively analyse high-throughput genome datasets. A schematic representation of the bioinformatic and biological criteria used to filter non-informative data is presented in Fig. S1.

Multiple data filtering criteria were applied in the pipeline on three levels: genomes (Fig. S1, blue box); BGCs (Fig. S1, red box); and NRPSs (Fig. S1, green box). Briefly, ‘Green Yes boxes’ indicate that data fulfil the defined criteria and, thus, can be analysed downstream. ‘Yellow No boxes’ indicate that data conditionally fulfilled the criteria and, hence, another filter was applied. ‘Red No boxes’ indicate that data did not fulfil the criteria and, thus, were discarded from further analyses. In the first level (Fig. S1, blue box), genomes with <200 contigs were selected, and analysed for completeness (>98 %) and contamination (<5 %), as implemented in CheckM v1.0.12 [36]. Although a rigorous completeness filter was applied and excessive fragmentation was avoided, some BGCs were predicted on contig edges, and those were still maintained for further analyses. Additionally, a manual bibliographical filter was performed to remove redundant genomes, checking for: (i) synonym strains – the same strains with different culture collection numbers; (ii) synonym genomes – with different entry names due to genome assembly improvements; or (iii) mutant strains – checked using culture collection databases and bibliography [37]. ANIb (average nucleotide identity by blast alignments) between genomes was calculated using the pyANI package [38], to identify and discard highly similar genomes (>98 %). This threshold has been used for the dereplication of genomes and metagenome-assembled genomes in BGC biodiversity studies [39] and environmental microbial genomics [40]. Finally, if two or more Rhodococcus entries were redundant, only one genome was selected considering the following criteria: (i) fewer contigs; (ii) total assembly length in base pairs; and (iii) a recent year of entry publication at the NCBI database.
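The genome-level filter described above can be illustrated with a minimal Python sketch. It is not the authors' workflow (that is in their GitHub repository); the `Genome` record, the single-linkage grouping of genomes with ANIb >98 %, and the preference for longer assemblies under criterion (ii) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    # Hypothetical record; field names are illustrative, not from the paper.
    name: str
    contigs: int
    completeness: float   # percent, e.g. as reported by CheckM
    contamination: float  # percent
    length_bp: int
    year: int


def passes_quality(g):
    """Thresholds from the text: <200 contigs, >98 % complete, <5 % contaminated."""
    return g.contigs < 200 and g.completeness > 98.0 and g.contamination < 5.0


def preference_key(g):
    # Fewer contigs first, then longer assembly (assumed), then more recent entry.
    return (g.contigs, -g.length_bp, -g.year)


def dereplicate(genomes, ani, threshold=98.0):
    """Keep one genome per group of genomes linked by ANIb > threshold.

    `ani` maps frozenset({name_a, name_b}) to an ANIb percentage.
    Groups are built by single linkage (an assumption of this sketch).
    """
    parent = {g.name: g.name for g in genomes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair, value in ani.items():
        a, b = tuple(pair)
        if a in parent and b in parent and value > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g.name), []).append(g)
    best = [min(members, key=preference_key) for members in groups.values()]
    return sorted(best, key=lambda g: g.name)
```

A genome set would first be reduced with `passes_quality` and then passed to `dereplicate` together with the pairwise ANIb values.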

In the second level (Fig. S1, green box), selected genomes were submitted to standalone antiSMASH v4.1.0 [41] for BGC prediction, and BiG-SCAPE v.20181005 [42] was used to obtain cluster similarities. Similarly, redundant clusters were filtered using genomic ANIb (≥98 %) and BiG-SCAPE raw distance (≤10⁻³). A Python workflow was constructed to select BGCs that were composed of at least one biosynthetic plus one non-biosynthetic gene (https://github.com/rvalenciaaz/rhodococcus-bgc). Finally, at the third level (Fig. S1, red box), NRPS BGCs were manually corroborated for presenting two or more adenylation domains by using antiSMASH v4.1.0 [41].
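The BGC-level and NRPS-level criteria can be expressed as three small predicates. This is a hedged sketch rather than the published workflow: the gene labels, the one-letter domain codes and the reading that a redundant pair must satisfy both the ANIb and the raw-distance thresholds are assumptions.

```python
def keep_bgc(gene_kinds):
    """True if a BGC has at least one biosynthetic and one non-biosynthetic gene.

    `gene_kinds` is a hypothetical list of labels, one per gene,
    where the label "biosynthetic" marks a biosynthetic gene.
    """
    has_biosynthetic = any(kind == "biosynthetic" for kind in gene_kinds)
    has_other = any(kind != "biosynthetic" for kind in gene_kinds)
    return has_biosynthetic and has_other


def is_redundant(ani_percent, raw_distance):
    """Redundant pair: genomic ANIb >= 98 % and BiG-SCAPE raw distance <= 1e-3.

    Combining the two thresholds with a logical AND is an assumption.
    """
    return ani_percent >= 98.0 and raw_distance <= 1e-3


def has_enough_a_domains(domains, minimum=2):
    """True if an NRPS BGC carries at least `minimum` adenylation ("A") domains."""
    return sum(1 for d in domains if d == "A") >= minimum
```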

Phylogenomic analysis

A phylogenomic tree (Fig. 1) was inferred with Orthofinder v2.2.7 [43] using the selected Rhodococcus genomes (Fig. S1). The diamond aligner was used for orthogroup retrieval [44], mafft [45] for multiple sequence alignment and FastTree [46] for approximate maximum-likelihood (ML) tree inference. Additionally, a phylogenomic method involving multilocus sequence analysis (MLSA) based on 100 highly conserved single-copy genes using Automated Multi-locus Species Tree (AutoMLST) (http://automlst.ziemertlab.com/) was performed for Rhodococcus strains comprising subclade 4b, considering AutoMLST strain upload limitations [47]. For phylogeny inference, de novo mode was used with the option of concatenated alignment under the following configuration parameters: (i) strains from subclade 4b were manually selected from the AutoMLST in-house database, with the addition of three strains: Rhodococcus sp. NACPA4, Rhodococcus sp. H-CA8f and Rhodococcus sp. AQ5-07; (ii) iq-tree Ultrafast Bootstrap analysis was performed with 1000 replicates [48]; (iii) ModelFinder was used to find the best-fit model for tree reconstruction; (iv) inconsistent MLST genes were filtered (i.e. genes with greatest topology differences); and (v) fast alignment mode was activated. The final tree was modified with Dendroscope 3.6.2 [49] and megax [50] (Fig. S2). Table S2 lists the 100 conserved single-copy genes, of which 88 were selected based on neutral dN/dS values, applying software default parameters [47]. To complement tree topology, a Bayesian multilocus phylogeny (BY) was inferred (Fig. S3) using mafft [45] and the concatenated nucleotide alignment of the genes gyrB, rpoB, rpoC, secY and recA [21]. Tree inference was accomplished with MrBayes v.3.2.7 [51, 52] using one million generations and two runs, while PartitionFinder2 [53] was used for fitting substitution models. Orthofinder and Bayesian trees were compared using Robinson–Foulds [54] and SPR metrics in R, using the phangorn package [55]. The quartet distance [56], which considers tree similarity using small taxa groups, between the ML and BY trees was calculated using the TreeCmp webserver [57]. The ‘prune trees’ option was used to compare common taxa. The metric was normalized with respect to the mean value for random trees generated with the Yule and uniform model, respectively. To investigate putative ecological relationships, the isolation source of each strain was obtained from the NCBI and Joint Genome Institute (JGI) online servers and depicted in both trees with a colour legend next to each strain.
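To make the Robinson–Foulds comparison concrete, the sketch below counts the non-trivial bipartitions that occur in only one of two trees. The study itself used the phangorn package in R; this Python version is only illustrative, represents trees as nested tuples of taxon names (an assumption), and does not cover the SPR or quartet metrics.

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of the tree, treated as unrooted.

    Each bipartition is stored as the side that does not contain a fixed
    anchor taxon, so equivalent splits compare equal.
    """
    all_leaves = leaves(tree)
    anchor = min(all_leaves)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - below if anchor in below else below
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    walk(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same taxa (prune them first)")
    return len(splits(tree_a) ^ splits(tree_b))
```

The taxon check mirrors the pruning step mentioned above: both trees must be reduced to their common taxa before the distance is meaningful.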

BGC networking and GCF analysis of NRPS

Selected Rhodococcus genomes (see Table S2) were uploaded to the antiSMASH v4.1.0 tool [41] to identify BGCs (Table S3) and into BiG-SCAPE v.20181005 [42] to calculate raw distances between clusters, from which a BGC network was constructed (Fig. 2). For this analysis, only the sequence of Rhodococcus sp. H-CA8f’s chromosome (GenBank accession no. CP023720) was used [34], and detailed genome mining is presented in Table S4. For network construction, several raw distance cut-offs were tested, ranging from 0 to 1 (0 being the most restrictive scenario) with a step of 0.1, where 0.6 was finally selected, aiming for a balanced connectivity of the overall network. The final graph layout was obtained using a combination of Fruchterman–Reingold [58] and Yifan Hu [59] algorithms, adjusting the balance between node sparsity and agglomeration. Visualization of the networks was performed in Gephi v0.9.2 [60]. A reduced classification of BGC categories is presented, based on the following modifications: (i) ‘PKS I’ was grouped together with ‘PKS’; (ii) ‘Other hybrids’ was created to contain any hybrid combination; (iii) ‘ectoine’ and ‘butyrolactone’ were dropped from ‘Others’ and annotated as individual separate categories. Furthermore, an NRPS network was generated as a subgraph of the whole BGC network (Fig. 3), coloured by GCFs (Fig. 3a) and the phylogenomic clades (Fig. 3b). To group NRPSs into GCFs, the Louvain algorithm for community detection was applied with a default resolution parameter value of 1 [61, 62]. Unconnected nodes were excluded from the GCF definition. Manual inspection of selected GCFs was performed by uploading all of their BGCs into antiSMASH v4.1.0.
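The cut-off scan can be sketched as follows: for each threshold from 0 to 1 in steps of 0.1, edges are kept when the raw distance is at or below the threshold, and the resulting connectivity is summarised. This is an illustrative sketch only; the input dictionary is a hypothetical stand-in for BiG-SCAPE output, and connected components are used as a simple connectivity measure, not as a replacement for the Louvain community detection used to define GCFs.

```python
def network_at_cutoff(nodes, distances, cutoff):
    """Return (edge_count, components) keeping edges with distance <= cutoff.

    `distances` maps (bgc_a, bgc_b) pairs to a raw distance in [0, 1].
    """
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edge_count = 0
    for (a, b), d in distances.items():
        if d <= cutoff:
            edge_count += 1
            parent[find(a)] = find(b)

    members = {}
    for n in nodes:
        members.setdefault(find(n), set()).add(n)
    components = sorted(members.values(), key=lambda c: (-len(c), sorted(c)))
    return edge_count, components


def scan_cutoffs(nodes, distances):
    """Map each cut-off (0.0, 0.1, ..., 1.0) to (edges, components, singletons)."""
    summary = {}
    for step in range(11):
        cutoff = step / 10
        edges, comps = network_at_cutoff(nodes, distances, cutoff)
        singletons = sum(1 for c in comps if len(c) == 1)
        summary[cutoff] = (edges, len(comps), singletons)
    return summary
```

Comparing the number of singletons against the size of the largest component across thresholds is one way to judge what a "balanced connectivity" looks like for a given dataset.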

Fig. 2. — *Rhodococcus* BGC networking. The distance network was constructed using BiG-SCAPE based on the filtered dataset of *Rhodococcus* genomes, leading to a total of 1891 BGCs grouped by different categories. Each node represents one BGC, connected by edges when sharing a raw distance ≤0.6. *Rhodococcus* sp. H-CA8f BGCs, shown in black bold font, are set apart from the main group of nodes while maintaining their connections. Colours represent BGC categories used in this study (slightly modified, see Table S3) depicted as follows: blue, NRPS; orange, polyketide synthase (PKS); pink, other hybrids; brown, ribosomally synthesized and post-translationally modified peptides (RiPPs); purple, terpenes; dark green, ectoine; turquoise, butyrolactone; and green, other.

Fig. 3. — NRPS BGC network. NRPS nodes (n=717) were retrieved from the full BGC network (see Fig. 2). *Rhodococcus* sp. H-CA8f BGCs are depicted as NRPS 1–6 (for details, see Table S4) with black labels. (a) Colours depict the GCFs' pattern of distribution, formed by ≥10 BGCs. The remaining GCFs are shown in grey. (b) Colours depict the phylogenomic distribution, correlated with the subclade colours from the phylogenomic tree from Fig. 1.

Phylogenomic-dependent patterns of NRPS GCFs

Presence/absence matrix patterns of each NRPS GCF were determined with a binary set in R, using the pheatmap v1.0.10 package [63]. Filled squares denote the presence of a certain NRPS GCF in a Rhodococcus genome (Fig. 4). A hierarchical clustering of the presence/absence map of the NRPS GCFs is shown as a dendrogram alongside the vertical axis. The horizontal axis considers the clades from the phylogenomic tree, maintaining the respective clade colour as depicted in Fig. 1. GCF-1/GCF-5 are highlighted in their respective colours for better visualization. NRPS GCF rarefaction curves (Fig. S4) were generated using the GCF presence/absence matrix plotted against the surveyed genomes. Richness calculations were performed using the iNEXT package in R [64]. NRPS GCF richness was considered for the diversity index, and default bootstrap iterations (n=50) with 95 % confidence intervals were used in the run. Interpolation and extrapolation data were inferred by iNEXT. GCF presence/absence pattern similarity within and between clades was assessed with a non-parametric multivariate statistical test [65]. Since GCF presence/absence is a binary trait, the Jaccard distance was employed to generate a distance matrix. Then, we performed a permutational multivariate analysis of variance (PERMANOVA) [66] for 999 permutations, considering the clades as groups, in the vegan package of R (https://CRAN.R-project.org/package=vegan).
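The statistical test above can be outlined in plain Python: Jaccard distances between binary GCF profiles, followed by a PERMANOVA pseudo-F statistic and a permutation p-value. The study used the vegan package in R; this sketch is a simplified stand-in with one grouping factor, and treating two all-absent profiles as distance 0 is an assumption.

```python
import random


def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary presence/absence profiles."""
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    shared = sum(1 for x, y in zip(a, b) if x and y)
    return 1.0 - shared / union


def distance_matrix(profiles):
    n = len(profiles)
    return [[jaccard_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def pseudo_f(dist, groups):
    """PERMANOVA pseudo-F for one factor, from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    if ss_within == 0:
        return float("inf")
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value) by shuffling group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

Here each profile would be one genome's row of the GCF presence/absence matrix, and `groups` would hold the clade label of each genome in the same order.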

Fig. 4. — Hierarchical clustering of NRPS GCFs. NRPS GCFs (n=44, right side) considering the presence/absence barcoding depicted in *Rhodococcus* genomes represented according to phylogenomic inference and including isolation source, according to Fig. 1. Presence of a GCF in a *Rhodococcus* genome is represented by a filled square, while its absence is represented by an empty square. Related GCF-1 (light orange) and GCF-5 (green) are highlighted for better visualization. *Rhodococcus* sp. H-CA8f is shown in bold font within subclade 4b, and its GCF representatives are highlighted in yellow.

Evolutionary relationships of GCF-1/GCF-5 NRPS

GCF-1/GCF-5 genetic context across clade 4 was evaluated with CORe Analysis of Syntenic Orthologs to prioritize NP-biosynthetic gene clusters (corason) [42] (Fig. 5), using every gene from H-CA8f’s NRPS #5 as the query (Fig. 5a). The level of gene conservation was analysed by three criteria: (i) gene co-occurrence pattern across the phylogenomic clade, (ii) putative function based on blastp annotation and (iii) genetic organization (i.e. whether genes are in the same position and encoded in the same direction). According to this, genes were grouped into blocks and represented with symbols when presenting a co-occurrence pattern. If at least one gene from the block is missing, the symbol is depicted empty. Otherwise, filled symbols represent the presence of the same genes shown for NRPS #5 (Fig. 5a). The final construction was manually edited to maintain schematic representation according to the phylogenomic subclades 4a (Fig. 5b) and 4b (Fig. 5c). For NRPS modularity analysis (Fig. S5), domain prediction was conducted using antiSMASH v5.2.0 [67], which incorporates NRPSpredictor3 [68], latent semantic indexing (LSI) based A-domain function predictor [69] and NRPSsp [70]. Additionally, Prediction Informatics for Secondary Metabolomes (PRISM 3) [71] was used for the detection of non-canonical domains.

Fig. 5. — Gene distribution patterns of GCF-1/GCF-5. (a) Genetic representation of NRPS #5 BGC belonging to GCF-1 from *Rhodococcus* sp. H-CA8f. NRPS #5 is grouped into three regions: left arm (green); middle section (blue); and right arm (purple). Genes grouped into blocks are represented with the following symbols: (i) left arm region – hexagon, *la1–la2*; cross, *la3–la4*; heart, *la5–la9*; and inverted triangle, *la10–la11*; (ii) right arm region – diamond, *ra1–ra2*; rectangle, *ra3–ra5*; circle, *ra6*; and triangle, *ra7–ra10*. For detailed predicted functions of genes, see Table 1. In each section, genes are drawn according to the size bar. In the middle section, letters within genes represent special domains: C*, starter condensation domain in *nrps I*; A*, non-classical adenylation domain in *nrps III*. (b) and (c) Genomic context comparison using corason of the GCF-1/GCF-5 BGC distribution from phylogenomic subclades 4a and 4b, respectively. Every gene comprising NRPS #5 of strain H-CA8f (shown in black bold font within subclade 4b and highlighted in yellow) was used as a query. Gene orientation and genetic organization are depicted similarly to NRPS #5 of *Rhodococcus* sp. H-CA8f, unless otherwise indicated. Filled symbols represent the presence of all genes constituting a block, whereas empty symbols indicate that at least one gene of that block is missing. Symbol size is not representative of gene size, and intergenic spaces are not to scale. Parallel lines indicate that genes are present elsewhere in the genome. Other genes – different from those previously mentioned – are represented as follows: black arrows, tRNAs; white arrows, hypothetical proteins.
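The filled/empty symbol rule from the Fig. 5 legend can be written as a short lookup. Block membership is taken from the legend, with gene ranges such as *la5–la9* expanded to individual names; the function name and the input format are hypothetical, and the sketch ignores gene position and orientation.

```python
# Gene blocks and their symbols, as listed in the Fig. 5 legend.
BLOCKS = {
    "hexagon": ["la1", "la2"],
    "cross": ["la3", "la4"],
    "heart": ["la5", "la6", "la7", "la8", "la9"],
    "inverted triangle": ["la10", "la11"],
    "diamond": ["ra1", "ra2"],
    "rectangle": ["ra3", "ra4", "ra5"],
    "circle": ["ra6"],
    "triangle": ["ra7", "ra8", "ra9", "ra10"],
}


def block_symbols(genes_present):
    """Return 'filled' if every gene of a block is present, otherwise 'empty'."""
    present = set(genes_present)
    return {
        symbol: "filled" if all(g in present for g in genes) else "empty"
        for symbol, genes in BLOCKS.items()
    }
```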

NP extraction and assessment of antibacterial activity

Five culture media were used to test varying culture conditions for Rhodococcus sp. H-CA8f: ISP1 (5 g tryptone l⁻¹, 3 g yeast extract l⁻¹), ISP2 (10 g malt extract l⁻¹, 4 g yeast extract l⁻¹, 4 g glucose l⁻¹), R5A [72], SM19 [73] and a modified-SG medium, with soytone peptone – instead of soytone – and with no added glucose. ISP1 and ISP2 media were prepared with artificial sea water (ASW) (i.e. ISP1-ASW and ISP2-ASW). Fermentations were performed in 250 ml Erlenmeyer flasks containing 50 ml culture medium, in a rotary shaker at 200 r.p.m. at 30 °C for 10 days. Afterwards, cells were separated from the supernatant by centrifugation at 5000 r.p.m. for 10 min. Supernatants were extracted twice in a decantation funnel using ethyl acetate (EtOAc) in a 1 : 1 (v/v) ratio. The recovered organic phase was almost completely evaporated with a speed vacuum. Crude extracts were dissolved in methanol:water (HPLC-grade MeOH:MQ-H₂O, 1 : 1) to a final concentration of 5 mg ml⁻¹, and subsequently stored at −20 °C until further use.

The antibacterial activity of crude extracts was assessed as previously described [74], with minor modifications. In this study, seven model bacteria were used to test susceptibility: Staphylococcus aureus NBRC 100910^T (STAU), Listeria monocytogenes 07PF0776 (LIMO), Salmonella enterica subsp. enterica LT2^T (SAEN), Escherichia coli FAP1 (ESCO), Pseudomonas aeruginosa DSM 50071^T (PSAU), Clavibacter michiganensis subsp. michiganensis VL493 (CLMI), a phytopathogenic strain isolated from an infected tomato plant obtained from Limache, Chile [75], and Micrococcus luteus H-CD9b (MILU), an actinomycete previously isolated by our group from the Northern Chilean Patagonia [35]. Model bacteria were grown overnight in a 5 ml LB culture at either 37 °C (PSAU, SAEN, ESCO and STAU) or 30 °C (MILU, LIMO and CLMI). The inoculum was adjusted to a final OD₆₀₀ of 0.2. Subsequently, model bacteria were streaked as a fine lawn on LB agar plates and 10 µl extract was placed on top. Inhibition zones were observed after overnight incubation. Results are shown in Table S5. Extractions of the media were also tested for antibacterial activity, and methanol:water was used as a negative control. Extracts with antibacterial activity were selected for further chemical dereplication.

Chemical dereplication of NPs

Chemical dereplication was accomplished using liquid chromatography-high resolution MS (LC-HRMS) performed by Fundación MEDINA (Fig. 6). Experiments were carried out using an HPLC 1200 Rapid Resolution system (Agilent) coupled to a high-resolution mass spectrometer, maXis (Bruker). For separation, a SB-C8 Zorbax column (2.1×30 mm, 3.5 µm) was used with a flow rate of 0.3 ml min⁻¹. The mobile phase consisted of solvent A, H₂O:acetonitrile (AcN) (90 : 10), and solvent B, H₂O:AcN (10 : 90), both with 13 mM ammonium formate and 0.01 % trifluoroacetic acid (TFA). The gradient started with a linear decrease of solvent A from 90 to 0 %, and a linear increase of solvent B from 10 to 100 %, in 8 min. The initial conditions (90 % solvent A and 10 % solvent B) were then maintained for the following 2 min. MS was operated in positive mode (ESI+) with a spray voltage of 4 kV, 11 l N₂ min⁻¹ at 200 °C capillary temperature and 280 kPa of nebulizer pressure. Absorbance was measured at 210 nm wavelength. Data analysis for NP identification was performed concerning: (i) retention time; (ii) UV absorbance spectrum; and (iii) accurate masses, obtained for every peak from the crude extract chromatogram profile (Fig. 6a). These criteria were used for comparison with MEDINA’s in-house database along with the Dictionary of Natural Products (DNP) database of Chapman and Hall, where molecules were searched for their identification (Fig. 6b).
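As a worked illustration of the gradient arithmetic, the sketch below computes the solvent B fraction and the resulting overall acetonitrile content at a given time. It assumes an instantaneous return to the initial conditions at 8 min, held until 10 min, which is one reading of the description; the function names are hypothetical.

```python
def solvent_b_percent(t_min):
    """Percent solvent B at time t (minutes) in the 10 min run.

    Linear ramp from 10 % to 100 % B over 0-8 min, then 10 % B for 8-10 min
    (the step back to initial conditions is an assumption of this sketch).
    """
    if not 0 <= t_min <= 10:
        raise ValueError("time must lie within the 10 min run")
    if t_min <= 8:
        return 10.0 + 90.0 * t_min / 8.0
    return 10.0


def acetonitrile_percent(t_min):
    """Overall acetonitrile content of the mobile phase at time t.

    Solvent A is H2O:AcN 90:10 and solvent B is H2O:AcN 10:90.
    """
    b = solvent_b_percent(t_min)
    a = 100.0 - b
    return (10.0 * a + 90.0 * b) / 100.0
```

Under these assumptions the run starts at 18 % acetonitrile (90 % A, 10 % B) and reaches 90 % acetonitrile at 8 min, when the mobile phase is 100 % solvent B.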