Abstract
With the rise of antibiotic resistance and the increasing emergence of infectious diseases, new therapeutic agents are urgently needed. Bacillus is one of the most widely distributed genera of bacteria, has immense applications in biotechnology, medicine, and environmental protection, and is known to produce a vast range of secondary metabolites (SMs). Using several bioinformatics approaches along with phylogenetic and genomic comparisons, biosynthetic gene clusters (BGCs) were characterized from publicly available Bacillus reference genome sequences to obtain new insights into the diversity and distribution of these BGCs in Bacillus for the discovery of unexplored SMs. Average nucleotide identity (ANI) analysis of 87 species showed more than 70% similarity among them, whereas phylogenetic analysis showed diversification of the species into 4 clades based on the presence of 1–2 BGC types. High diversity for SM production was observed in most of the species; the majority of BGC types were terpene, nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), hybrid gene clusters, and ribosomally synthesized and post-translationally modified peptide (RiPP) clusters encoding different characterized secondary metabolites, along with some unique metabolites characterized based on similarity confidence value. Many uncharacterized BGCs were found distributed within each species, giving new insights into novel compound discovery. These findings will open the door to the identification of species-specific pathways for SM development and contribute to creating new insights into therapeutic systems.
Keywords: Secondary metabolite (SM), Biosynthetic gene cluster (BGC), Genome mining, Bacillus, Bioinformatics
1. Introduction
Secondary metabolites (SMs) are low-molecular-weight natural products (NPs) synthesized mainly by bacteria, fungi, and plants during their stationary phase of growth. These compounds are not directly involved in growth-related processes; rather, they increase the survival of the organisms by defending them from biotic (e.g., bacteria, fungi, amoebae, plants, insects, and large animals) and abiotic stresses (e.g., nutrient depletion, cold, extreme temperatures, salinity, etc.)1. Different microbes, such as bacteria (Bacillus, Pseudomonas), and fungi (Aspergillus) are known to produce a wide range of secondary metabolites, which have vast applications in pharmaceutical, agricultural, cosmetics, and food industries2, 3, 4, 5.
With rapid advances in medical science and an increase in various diseases, a number of life-saving drugs have been developed from secondary metabolites produced by microorganisms, particularly bacteria. However, according to WHO, the rise of antibiotic resistance together with the decline in discoveries of new treatment processes and methods has caused previously discovered antibiotics to lose their effectiveness, resulting in an antibiotic crisis that poses a threat to human health6. Thus, new antibiotics and therapeutic options to fight emerging infectious diseases are urgently needed.
The biosynthesis of microbial secondary metabolites is controlled by a group of genes clustered together in a region of the microbial genome to form the biosynthetic gene clusters (BGCs)7. These BGCs enable the co-expression of biosynthetic enzymes, regulators, and transporters responsible for the biosynthesis, assembly, and modification of the compound scaffold, as well as resistance, export, and regulation for a particular secondary metabolite. So, if one of the genes involved in the biosynthesis of a secondary metabolite can be demonstrated, the full pathway can be easily identified.
Bacillus, a Gram-positive and rod-shaped bacterium, is one of the most widely distributed genera of bacteria. It has more than 266 species and is commonly found in soil and water. Most of its members produce stress-resistant spores which are environment-friendly and are useful in pesticide production, thus considered as a biological resource for microbial pesticide development8, 9. Besides, it produces a vast set of secondary metabolites, including antibiotics, antifungals, siderophores, surfactins, fengycins, various polyketide-derived nonribosomal peptides, linear lipopeptides, enzymes, δ-endotoxins, etc.10. Thus, these bacterial species represent a new and rich source of secondary metabolites that require further exploration.
Genomics is a platform for describing how genome-scale technologies are developed and how they are used in all fields of biological research. With the advancement of DNA sequencing technology and reduction in sequencing cost, DNA sequence data from a wide range of organisms have been deposited in publicly accessible databases. Utilization of such data may help to acquire new knowledge in different areas associated with medicinal chemistry, creating a base for new drug designing. Genome mining, a terminology used in various fields in genomics, uses computational technology and modern bioinformatics to explore the genome sequence for the discovery of new processes, targets, and products. It recognizes specific functional genes or gene clusters from genome sequences, thus acting as a promising strategy for novel secondary metabolite discovery11.
A large collection of Bacillus genome sequences is publicly available in NCBI datasets and can act as a significant source for genome mining. So, by utilizing these genome sequences, a bacterium’s biosynthetic potential can be validated. For this study, the complete reference genome sequences of 87 species under the genus Bacillus were scanned using different bioinformatics tools to decipher and screen the distribution of known and uncharacterized BGCs, their evolutionary diversity, and the NP-coding potential of these genomes to characterize important secondary metabolites. Compiling the BGCs obtained from different bioinformatics tools may help in selecting targets for new drug design purposes.
2. Methodology
2.1. Retrieval of sequences from public databases
Reference sequences under the genus Bacillus were retrieved from the National Center for Biotechnology Information (NCBI) database. Applying the keyword ‘Bacillus’ under the ‘Genome’ database, a total of 220 search results were obtained. Among them, species carrying the genus name ‘Bacillus’ were retained, and 105 FASTA sequences were downloaded, while the rest were excluded. These sequences were screened again, and a final set of 87 complete reference sequences was selected. Besides sequence collection, additional information, including isolation source, genome size, and gene number of each Bacillus species, was recorded from the GenBank format of each sequence. A flowchart of the overall workflow of the study is shown in Fig. 1.
Fig. 1.
Overall view of the workflow of the study.
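The two screening steps of Section 2.1 can be read as successive filters on the search results. The following is a minimal sketch under stated assumptions: the record fields (`organism`, `assembly_level`), the label "Complete Genome" as the completeness criterion, and the example hits are all hypothetical illustrations, not the NCBI schema or the exact criteria used in the study.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GenomeHit:
    organism: str        # hypothetical field: species name of the search hit
    assembly_level: str  # hypothetical field: e.g. "Complete Genome", "Scaffold"


def genus_of(name: str) -> str:
    """Return the first word of a binomial name, or '' for an empty name."""
    parts = name.split()
    return parts[0] if parts else ""


def select_references(hits: List[GenomeHit]) -> Tuple[List[GenomeHit], List[GenomeHit]]:
    """Step 1: keep hits whose genus is exactly 'Bacillus'.
    Step 2: of those, keep hits flagged as complete (assumed criterion)."""
    genus_filtered = [h for h in hits if genus_of(h.organism) == "Bacillus"]
    complete = [h for h in genus_filtered if h.assembly_level == "Complete Genome"]
    return genus_filtered, complete


# Hypothetical search results, for illustration only.
hits = [
    GenomeHit("Bacillus subtilis", "Complete Genome"),
    GenomeHit("Paenibacillus polymyxa", "Complete Genome"),  # other genus, dropped
    GenomeHit("Bacillus cereus", "Scaffold"),                # incomplete, dropped in step 2
    GenomeHit("Geobacillus stearothermophilus", "Complete Genome"),
]
genus_filtered, complete = select_references(hits)
```

Matching on the first word of the name, rather than on the substring "Bacillus", is what excludes genera such as Paenibacillus and Geobacillus that a keyword search also returns.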
2.2. Phylogenetic analysis
The phylogenetic tree and average nucleotide identity (ANI) analysis were generated using the open-source web service Integrated Prokaryotes Genome and pan-genome Analysis (IPGA v1.09). IPGA (https://nmdc.cn/ipga/) is a one-stop web service to analyze, compare, and visualize pan-genomes as well as individual genomes, which spares users from installing any tools. Genomes are grouped or clustered based on whole-genome SNP-based phylogenetic inference. To obtain a circular phylogenetic tree, the tool iTOL (Interactive Tree Of Life) was used (https://itol.embl.de/).
2.3. Identification of BGCs using bioinformatics tools
For each species, a single genome reference sequence was used for analysis. Three bioinformatics tools, namely antiSMASH, PRISM4, and BAGEL4 were used to explore each sequence. The FASTA format (.fna) of each sequence was input into each tool, and after a waiting period of 5–30 min, results were obtained. The output from each tool generates an HTML file, which is a visually interactive web-report of the predicted BGCs, their locations, descriptions, chemical structures, along with a few other interpretations. The number of each class of BGCs in each genome, along with some other important information, was extracted from the HTML files. The results can also be downloaded in several formats (i.e., GenBank, JSON, text, summary, etc.) based on one’s requirement. A brief on the three tools is given below:
2.3.1. antiSMASH
antiSMASH (antibiotics and secondary metabolites analysis shell) is a platform that allows fast identification, annotation, and analysis of secondary metabolite biosynthetic gene clusters (BGCs) in complete or draft genomes of bacteria, fungi, and, to some extent, plants. It integrates and cross-links with many in silico secondary metabolite analysis tools and is regarded as the gold standard for the prediction of existing and undiscovered BGCs. The present version, antiSMASH 6.1.1 (https://antismash.secondarymetabolites.org/#!/start), is an open-source web server and can be easily accessed. It can accurately identify the gene clusters encoding secondary metabolites of all known broad chemical classes and detect potential unexplored forms of BGCs based on comparisons to existing BGCs and final chemical product information obtained from the MIBiG (Minimum Information about a Biosynthetic Gene cluster) database12, 13. This tool automatically compares clusters against known, experimentally validated BGCs in the MIBiG database and shows the similarity confidence (the qualitative similarity of the closest known compound within that region) and the most similar known cluster (the closest compound from the MIBiG database, along with its type), providing insights about the type of BGC found in each species. High similarity (>70 %) indicates a BGC that is likely already known, whereas moderate similarity (30–70 %) indicates a possible variant BGC. Low similarity (<30 %) or no match points to a novel BGC candidate, thus giving an initial screening for novel compounds.
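The similarity thresholds described above can be expressed as a small screening rule. This is only a sketch of the thresholds as stated in the text; the function name and the category labels are hypothetical, and treating the boundary values 30 % and 70 % as "moderate" is an assumption based on the stated 30–70 % range.

```python
from typing import Optional


def similarity_category(similarity_percent: Optional[float]) -> str:
    """Bin a region by its similarity to the closest known cluster.

    None means that no known cluster matched the region at all.
    """
    if similarity_percent is None:
        return "novel candidate"
    if not 0 <= similarity_percent <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_percent > 70:
        return "likely known"       # high similarity
    if similarity_percent >= 30:
        return "possible variant"   # moderate similarity, 30-70 %
    return "novel candidate"        # low similarity


# Values taken from similarities quoted in the Results section.
examples = {s: similarity_category(s) for s in (100, 83, 50, 20, 5)}
```

Applied to the values reported later for terpene clusters, a carotenoid hit at 83 % would be binned as likely known, whereas the locillomycin (14 %) and ulleungmycin (5 %) hits would be flagged as novel candidates.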
2.3.2. PRISM4
PRISM (PRediction Informatics for Secondary Metabolites) is a computational resource for the chemical structure prediction of genetically encoded secondary metabolites based on microbial genomes14. It identifies biosynthetic gene clusters, predicts genetically encoded nonribosomal peptides and type I and II polyketides, and separates known natural products by the process of dereplication (biologically and chemically). The present version PRISM 4 (4.4.5), is an open-source, user-friendly web server and can be accessed from https://prism.adapsyn.com/.
2.3.3. BAGEL4
BAGEL is a web server that allows the identification and visualization of gene clusters in prokaryotic DNA involved in the biosynthesis of Ribosomally synthesized and Post-translationally modified Peptides (RiPPs) and (unmodified) bacteriocins15. It displays an overview table of all detected potential clusters, termed Areas of Interest (AOIs), responsible for RiPP and bacteriocin production. It provides as much information as possible on the identified AOIs and improves annotation of novel bacterial genome sequences. The present version, BAGEL4, is a user-friendly web server providing fast, reliable, and convenient mining specifically of RiPPs and bacteriocins and can be easily accessed from the website https://bagel4.molgenrug.nl/index.php.
3. Results
3.1. Phylogenetic analysis within Bacillus species
From the phylogenetic analysis, four major clades were obtained based on whole-genome single nucleotide polymorphism (SNP) inference (Fig. 2). Examining the distribution of BGCs (obtained from antiSMASH 6.1.1) among the Bacillus species within the clades (Supplementary file 2) showed that Type-III polyketide synthase (T3PKS) and hybrid clusters were frequent within clade 1. In clade 2, T3PKS was frequent, whereas hybrid clusters were less frequent than in clade 1 species. In clade 3, RiPP-like and betalactone BGCs were present in every species. Hybrid and NRPS clusters were also frequent within clade 3 species, occurring in more than one copy, whereas a single copy of siderophore and lassopeptide clusters was found in most of the species. In clade 4, the frequency of RiPP-like and betalactone clusters decreased, whereas the frequency of hybrid and NRPS clusters increased compared to clade 3. T3PKS clusters were found in every species in clade 4. Terpene was the only cluster type found in all clades. Thus, certain clades specialize in particular BGC types. In addition, average nucleotide identity (ANI) analysis showed more than 70 % similarity among the species (Supplementary file 1).
Fig. 2.
A circular phylogenetic tree of 87 Bacillus spp. based on single nucleotide polymorphism. Each color represents a clade.
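Statements such as "present in every species of clade 3" correspond to a set intersection over the BGC types found in each member genome. The sketch below illustrates that check; the species labels and type sets are hypothetical toy data, not values from Supplementary file 2.

```python
from typing import Dict, Set


def types_shared_by_all(genomes: Dict[str, Set[str]]) -> Set[str]:
    """Return the BGC types present in every genome of a clade."""
    type_sets = list(genomes.values())
    if not type_sets:
        return set()
    return set.intersection(*type_sets)


# Hypothetical clade membership and BGC types, for illustration only.
toy_clade = {
    "species_a": {"terpene", "RiPP-like", "betalactone", "NRPS"},
    "species_b": {"terpene", "RiPP-like", "betalactone", "siderophore"},
    "species_c": {"terpene", "RiPP-like", "betalactone"},
}
shared = types_shared_by_all(toy_clade)
```

In this toy clade, NRPS and siderophore each occur in only one genome and therefore drop out of the shared set.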
3.2. Prediction of Evident biosynthetic gene clusters (BGCs) in Bacillus species genomes using antiSMASH
From the analysis of 87 Bacillus species genome sequences using antiSMASH 6.1.1, 737 BGCs were detected. Among them, a total of 32 major non-redundant BGC types were identified and grouped into 6 classes based on the antiSMASH HTML output (Table 1). The classes are ribosomally synthesized and post-translationally modified peptides (RiPP), polyketide synthase (PKS), non-ribosomal peptide synthetase (NRPS), terpene, others, and hybrids. The “others” class comprises clusters containing a secondary metabolite-related protein that does not fit into any other category.
Table 1.
Major BGCs obtained from antiSMASH.
| Class | BGC types |
|---|---|
| RiPPs | cyclic-lactone-autoinducer, Epipeptide, Glycocin, lanthipeptide-class-i, lanthipeptide-class-ii, lanthipeptide-class-iii, lanthipeptide-class-iv, LAP, Lassopeptide, Ranthipeptide, redox-cofactor, RiPP-like, RRE-containing, Sactipeptide, Thiopeptide |
| PKS | PKS-like, T3PKS, transAT-PKS, transAT-PKS-like |
| NRPS | CDPS, NRP-like, NRPS, thioamide-NRP |
| Others | Arylpolyene, Betalactone, Ectoine, Ladderane, Other, Phosphonate, Siderophore |
| Terpene | Terpene |
| Hybrids | Hybrid |
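The grouping in Table 1 can be applied programmatically when tallying regions per genome. The sketch below is illustrative only: the rule that a region carrying more than one distinct type counts as a hybrid is an assumption about how the tally was made, and the example regions are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Class membership as listed in Table 1.
CLASS_MEMBERS: Dict[str, List[str]] = {
    "RiPP": ["cyclic-lactone-autoinducer", "epipeptide", "glycocin",
             "lanthipeptide-class-i", "lanthipeptide-class-ii",
             "lanthipeptide-class-iii", "lanthipeptide-class-iv", "LAP",
             "lassopeptide", "ranthipeptide", "redox-cofactor", "RiPP-like",
             "RRE-containing", "sactipeptide", "thiopeptide"],
    "PKS": ["PKS-like", "T3PKS", "transAT-PKS", "transAT-PKS-like"],
    "NRPS": ["CDPS", "NRP-like", "NRPS", "thioamide-NRP"],
    "others": ["arylpolyene", "betalactone", "ectoine", "ladderane", "other",
               "phosphonate", "siderophore"],
    "terpene": ["terpene"],
}
# Case-insensitive lookup from BGC type to class.
TYPE_TO_CLASS = {t.lower(): cls for cls, types in CLASS_MEMBERS.items() for t in types}


def classify_region(products: List[str]) -> str:
    """Assign one region to a Table 1 class from its list of BGC types."""
    if not products:
        raise ValueError("a region needs at least one BGC type")
    distinct = {p.lower() for p in products}
    if len(distinct) > 1:
        return "hybrid"  # assumption: several distinct types in one region
    return TYPE_TO_CLASS[distinct.pop()]  # KeyError for a type not in Table 1


# Hypothetical regions from one genome, for illustration only.
regions = [["terpene"], ["T3PKS"], ["NRPS", "betalactone"], ["siderophore"], ["Lassopeptide"]]
tally = Counter(classify_region(r) for r in regions)
```

Counting the hybrid class plus the 31 named types reproduces the 32 major non-redundant BGC types reported above.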
The distribution of BGCs within Bacillus species is shown according to the isolation source recorded in NCBI: environmental, marine, plant, and undefined (Fig. 3). Among species from environmental sources (Fig. 3.A), terpene BGCs were the most frequent and occurred in nearly all species. Most of the terpene clusters showed similarity to stress-protecting and growth-enhancing compounds, mostly carotenoids (50–83 % similarity) but also molybdenum cofactor (17 % similarity) and pseudomonine (20 % similarity), and to antimicrobial compounds such as locillomycin (14 % similarity) and ulleungmycin (5 % similarity). In addition, the species in this group possessed a significant number of BGCs of the RiPP class, among which RiPP-like and lassopeptide were predominant. Betalactone, siderophore, and ectoine were predominant within the class “others”. Among the modular BGC classes (PKS and NRPS), T3PKS and NRPS were abundant. Each species also contained hybrid clusters, which may provide a variety of metabolites for survival in different environments. An individual species may have multiple copies of the same BGC, e.g., B. alkalicellulosilyticus, B. couhaluiensis, B. halotolerans, B. inaquosorum, B. mesophilum, B. salipaludis, and B. zhangzhouensis contained two copies of terpene BGCs, while B. cihuensis, B. tamaricis, and B. wudalianchiensis had three copies (Supplementary file 2). A similar distribution of these BGCs was also found in Bacillus spp. from marine, plant, and undefined sources (Fig. 3.B-D).
Fig. 3.
Distribution of BGCs among different Bacillus spp. A) Environmental sources B) Marine source and C) Plant source, and D) Undefined or unknown source.
Some clusters were less common, including lassopeptide, CDPS, sactipeptide, RRE-containing, phosphonate, lanthipeptide-class-ii, NRP-like, and other. Rare BGCs, for example thiopeptide, ranthipeptide, cyclic-lactone-autoinducer, ladderane, ectoine, transAT-PKS-like, PKS-like, lanthipeptide-class-iii, lanthipeptide-class-iv, redox-cofactor, glycocin, thioamide-NRP, lanthipeptide-class-i, LAP, and arylpolyene, were found in only 1–2 species.
A total of 116 hybrid BGCs were found among all these species. Hybrid clusters consist of two or more different biosynthetic classes, or arise where one class is used to biosynthesize a precursor for a second one. These hybrid BGCs are a combination of 2–3 major BGCs, a major BGC with a subtype BGC, or a combination of 2–3 subtype BGCs. To understand hybrid clusters, the concepts of core, neighbourhood, protocluster, and candidate cluster need to be noted. The core is the minimum area containing one or more genes that code for enzymes of a single BGC type. The neighbourhood is the distance upstream and downstream of the cluster core that is searched for tailoring genes or enzymes. Both are determined by manually curated detection rules. A protocluster contains a core with neighbourhoods on both sides of it. Each protocluster always has a single product type, e.g., a protocluster coding for NRPS will produce only NRPS, and a single protocluster may overlap partially or completely with other protoclusters. Lastly, candidate clusters are groupings containing one or more protoclusters. Candidate clusters can be of four types: single, neighbouring, interleaved, and chemical hybrid (Fig. 4). Single candidate clusters contain a single protocluster and have no overlaps with other protoclusters (Fig. 4A). Neighbouring hybrids contain protoclusters that transitively overlap in their neighbourhoods (Fig. 4B). Interleaved hybrids contain protoclusters that do not share cluster-type-defining coding sequences, but their core locations overlap (Fig. 4C). Chemical hybrids contain at least two protoclusters that share at least one gene coding for enzymes of two or more separate BGC types (Fig. 4D).
Fig. 4.
Types of hybrid clusters A. Single cluster B. Neighbouring cluster C. Interleaved cluster D. Chemical hybrid.
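The four candidate cluster types defined above can be illustrated with a simplified pairwise rule on genomic intervals. This is a sketch read from the definitions in the text, not the antiSMASH implementation; the class, the field names, the gene identifiers, and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Interval = Tuple[int, int]  # (start, end) in base pairs, inclusive


@dataclass(frozen=True)
class Protocluster:
    product: str                # single product type, e.g. "NRPS"
    core: Interval              # span of the type-defining core genes
    neighbourhood: int          # bases added on each side of the core
    core_genes: FrozenSet[str]  # hypothetical IDs of the type-defining genes

    @property
    def extent(self) -> Interval:
        """Core plus the neighbourhood on both sides."""
        return (self.core[0] - self.neighbourhood,
                self.core[1] + self.neighbourhood)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def relation(p: Protocluster, q: Protocluster) -> str:
    """Classify how two protoclusters relate, most specific rule first."""
    if p.core_genes & q.core_genes:
        return "chemical hybrid"  # a type-defining gene is shared
    if overlaps(p.core, q.core):
        return "interleaved"      # cores overlap, no shared defining gene
    if overlaps(p.extent, q.extent):
        return "neighbouring"     # only the neighbourhoods overlap
    return "single"               # no overlap: each stands alone


# Hypothetical protoclusters, for illustration only.
nrps = Protocluster("NRPS", (1000, 5000), 2000, frozenset({"g1", "g2"}))
pks = Protocluster("T1PKS", (4000, 9000), 2000, frozenset({"g2", "g3"}))
beta = Protocluster("betalactone", (4500, 8000), 1000, frozenset({"g9"}))
terp = Protocluster("terpene", (8000, 9000), 2000, frozenset({"g7"}))
sid = Protocluster("siderophore", (20000, 22000), 2000, frozenset({"g8"}))
```

With these toy coordinates, `relation(nrps, pks)` gives a chemical hybrid because gene `g2` is shared, `relation(nrps, beta)` is interleaved, `relation(nrps, terp)` is neighbouring, and `relation(nrps, sid)` is single. The text describes neighbouring overlap as transitive, which a pairwise check does not capture; chaining the pairwise results would be needed for a full grouping.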
The data showed hybrid clusters of the neighboring (most common), chemical, and interleaved types. Hybrid clusters consisting of PKS and NRPS modules were the most frequent among the species. This modular characteristic allows reshuffling, removal, or fusion of different BGCs, giving variations of new metabolites to adapt to new environments. Besides PKS-NRPS hybrids, RiPP-fused hybrid clusters were also found. The hybrid clusters are listed by candidate cluster type in Table 2. Twenty-five species lacked hybrid BGCs (Supplementary file 2). Bacillus swezeyi had the greatest number of hybrids (5 hybrids in the genome). Some hybrid clusters showed 100 % similarity to a wide range of antimicrobial agents, such as bacillaene, sporulation killing factor, bacilysin, and bacillibactin, and high similarity to fengycin, teichuronic acid, and plantazolicin, and were limited to some species (Supplementary File 2). In addition, some hybrid BGCs showed low similarity to clusters for other compounds, such as butirosin A / butirosin B, bacitracin, difficidin, heme D1, lactocin S, and zwittermicin A, giving some evidence of the presence of novel BGCs.
Table 2.
Distribution of hybrid BGCs based on their candidate cluster type.
| Neighboring | Chemical hybrid | Interleaved |
|---|---|---|
| Betalactone-lassopeptide | Betalactone-NRPS | LAP-RiPP-like |
| betalactone-NRPS-RRE-containing | NRPS-betalactone-transAT-PKS | NRPS-T1PKS |
| betalactone-NRPS-transAT-PKS | NRPS-T1PKS | NRPS-transAT-PKS-T3PKS-PKS-like |
| betalactone-RiPP-like | NRPS-transAT-PKS-betalactone | ranthipeptide-sactipeptide |
| cyclic-lactone-autoinducer-lanthipeptide-class-ii | NRPS-betalactone | T3PKS-terpene |
| ladderane-NRPS | ranthipeptide-LAP-thiopeptide-RiPP-like | thiopeptide-RiPP-like |
| lassopeptide-RRE-containing | transAT-PKS-NRPS-T3PKS | transAT-PKS-PKS-like-NRPS |
| LAP-RRE-containing | thiopeptide-LAP | |
| LAP-RiPP-like-betalactone | thiopeptide-RiPP-like | |
| LAP-RiPP-like-betalactone | T1PKS-NRPS | |
| lanthipeptide-class-i-T1PKS-NRPS | transAT-PKS-NRPS-prodigiosin | |
| NRPS-betalactone-transAT-PKS | transAT-PKS-T3PKS-NRPS | |
| NRPS-NRPS-like | ||
| NRPS-NRPS-like-transAT-PKS | ||
| NRPS-T1PKS-ladderane | ||
| NRPS-RiPP-like | ||
| NRPS-transAT-PKS-PKS-like, T3PKS | ||
| NRPS-like-NRPS | ||
| phosphonate-NRPS-T1PKS | ||
| RiPP-like-NRPS | ||
| RiPP-like-NRPS-like-NRPS-T1PKS | ||
| siderophore-terpene | ||
| sactipeptide-other | ||
| siderophore-LAP-RRE-containing | ||
| terpene-siderophore | ||
| transAT-PKS-PKS-like-T3PKS-NRPS | ||
| transAT-PKS-T1PKS-NRPS-like-NRPS |
In the antiSMASH analysis, of the 737 BGCs obtained from 87 species, 311 BGCs showed similarity to known clusters and were found to code for a diverse range of antimicrobial products as well as other important compounds (Supplementary file 2). Fengycin was predicted to be produced by 59 Bacillus species (around 67.8 %). Other prevalent compounds, such as bacillibactin, surfactin, bacilysin, carotenoid, lichenysin, molybdenum cofactor, subtilosin A, and zwittermicin A, were also predicted to be produced by several species. Compounds such as acinetobactin (Bacillus cytotoxicus), bicereucin (Bacillus cecembensis), iturin (Bacillus alveayuensis), rhizomide A (Bacillus rugosus), thuricin (Bacillus thuringiensis), pseudomycoicidin (Bacillus pseudomycoides), sibiromycin (Bacillus ndiopicus), and ulleungmycin (Bacillus salacetis) were predicted only in specific Bacillus species. However, the production of these compounds was not necessarily dependent on the isolation source of the bacteria. In addition, the antiSMASH HTML output contained 426 (57.8 %) unknown clusters for which no known homologous or similar BGCs could be identified. An overview of the products, their number of incidences, and the type of BGC encoding each is given in Table 3 (detailed information is shown in Supplementary File 2).
Table 3.
List of products obtained from BGCs of 87 Bacillus species.
| Product | No. of incidences of the product among the species | Type of BGCs encoding the product | Product | No. of incidences of the product among the species | Type of BGCs encoding the product |
|---|---|---|---|---|---|
| aurantinin B / aurantinin C / aurantinin D | 1 | transAT-PKS,PKS-like,T3PKS,NRPS | nostopeptolide A2 | 1 | RiPP-like, NRPS-like, NRPS, T1PKS |
| acinetobactin | 1 | NRPS-like | octapeptin C4 | 1 | NRPS-like |
| atratumycin | 1 | ladderane, NRPS | paeninodin | 17 | *Lassopeptide; RRE-containing; lassopeptide,RRE-containing |
| amipurimycin | 1 | ladderane | petrobactin | 13 | Siderophore |
| bacilysin | 16 | *Other; sactipeptide,other | pelgipeptin | 2 | NRPS |
| bottromycin A2 | 3 | betalactone | pseudomonine | 2 | Terpene, NRPS |
| bacillibactin | 40 | *NRPS; RiPP-like,NRPS; NRPS-like,NRPS | plantazolicin | 3 | LAP,RRE-containing; siderophore,LAP,RRE-containing |
| bacillomycin D | 2 | NRPS | plipastatin | 7 | NRPS |
| bicereucin | 1 | lanthipeptide-class-ii | paenilamicin | 2 | NRPS,T1PKS |
| bacillaene | 10 | transAT-PKS, NRPS,T3PKS; transAT-PKS,PKS-like,T3PKS,NRPS; NRPS,transAT-PKS,T3PKS,PKS-like; transAT-PKS,PKS-like,NRPS | paenilarvins | 2 | ladderane,NRPS |
| butirosin A / butirosin B | 8 | PKS-like; *thiopeptide,RiPP-like | pseudomycoicidin | 1 | lanthipeptide-class-ii |
| bacitracin | 5 | NRPS,T1PKS; NRPS | pyoverdin | 1 | NRPS |
| bogorol A | 1 | NRPS | pacidamycin 1 / pacidamycin 2 / pacidamycin 3 / pacidamycin 4 / pacidamycin 5 / pacidamycin 6 / pacidamycin 7 / pacidamycin D | 1 | NRPS |
| BE-7585A | 1 | T3PKS | paenicidin A | 1 | lanthipeptide-class-i |
| carotenoid | 16 | *Terpene; terpene,siderophore | putrebactin / avaroferrin | 1 | Siderophore |
| colistin A / colistin B | 1 | NRPS | puwainaphycin A / puwainaphycin B / puwainaphycin C / puwainaphycin D | 1 | NRPS-like,NRPS |
| cichofactin A / cichofactin B | 1 | NRPS-like,NRPS | rhizocticin A | 4 | NRPS |
| chejuenolide A / chejuenolide B | 1 | NRPS | rhizomide A / rhizomide B / rhizomide C | 1 | NRPS-like |
| cerecidin / cerecidin A1 / cerecidin A2 / cerecidin A3 / cerecidin A4 / cerecidin A5 / cerecidin A6 / cerecidin A7 | 1 | lanthipeptide-class-ii | Subtilin | 3 | lanthipeptide-class-i |
| difficidin | 4 | *transAT-PKS; transAT-PKS-like | Sibiromycin | 1 | Phosphonate |
| ectoine | 3 | ectoine | staphylobactin | 5 | Siderophore; thioamide-NRP |
| fengycin | 62 | NRPS | salecan | 1 | NRPS |
| hygrocin A / hygrocin B | 2 | siderophore | surfactin | 21 | NRPS |
| haloduracin β / haloduracin α | 1 | lanthipeptide-class-ii | subtilosin A | 9 | Sactipeptide |
| heme D1 | 1 | transAT-PKS,T1PKS,NRPS-like,NRPS | staphylococcin C55 α / staphylococcin C55 β | 1 | lanthipeptide-class-ii |
| iturin | 1 | T3PKS | sporulation killing factor | 4 | Sactipeptide; ranthipeptide,sactipeptide |
| lichenysin | 10 | NRPS | sublancin 168 | 2 | Glycocin |
| locillomycin | 2 | terpene | salivaricin A | 2 | lanthipeptide-class-ii |
| lactocin S | 2 | NRP; NRPS,NRPS-like | thailanstatin A | 3 | Epipeptide |
| lankacidin C | 2 | redox-cofactor | thaxteramide C | 1 | T3PKS |
| lichenicidin VK21 A1 / lichenicidin VK21 A2 | 1 | lanthipeptide-class-ii | teichuronic acid | 2 | NRPS,T1PKS |
| macrolactin H | 2 | transAT-PKS | thuricin | 1 | lanthipeptide-class-ii |
| molybdenum cofactor | 10 | Terpene | thurincin H | 1 | RRE-containing |
| mycosubtilin | 3 | NRPS; NRPS, betalactone, transAT-PKS; transAT-PKS, NRPS,prodigiosin | teichuronic acid | 2 | NRPS,T1PKS |
| mersacidin | 2 | lanthipeptide-class-ii | ulleungmycin | 1 | Terpene |
| micrococcin P1 | 2 | RiPP-like | zwittermicin A | 9 | *NRPS,T1PKS; NRPS,T1PKS,ladderane;phosphonate,NRPS,T1PKS |
| nostocyclopeptide A2 | 1 | NRPS | 7-deoxypactamycin | 2 | T3PKS |
An asterisk (*) marks the BGC type that most frequently encodes the product when more than one BGC type encodes it; a semicolon (;) separates distinct BGCs.
3.3. Prediction of Evident biosynthetic gene clusters (BGCs) in Bacillus species genomes using PRISM 4
From PRISM 4 analysis, a total of 411 BGCs were obtained (Supplementary File 2). Among them, 144 nonribosomal peptide (NRP), 5 polyketide (PK), and 39 hybrid NRP-PK modular BGCs were distributed among the different species. Several species contained multiple copies of NRP BGCs, e.g., Bacillus pseudomycoides has 12 NRP BGCs, and Bacillus gaemokensis contains 11 NRP BGCs. Besides these, some RiPPs, bacteriocins, others, and unclassified BGCs were found (Table 4). Among these, BGCs coding for kasugamycin family aminoglycoside, melanin, and autoinducing peptide were unique to Bacillus pakistanensis, Bacillus thuringiensis, and Bacillus xiapuensis, respectively. Bacillus rugosus had the highest number of BGCs, a total of 16, of which 9 were NRP BGCs. In eight species (Bacillus acidicola, Bacillus cihuensis, Bacillus fonticola, Bacillus infantis, Bacillus renqingensis, Bacillus salacetis, Bacillus salipaludis, Bacillus tepidiphilus), no BGCs were found using PRISM 4.
Table 4.
List of other BGCs found from PRISM 4 analysis.
| RiPPs | Bacteriocin | Others | Unclassified |
|---|---|---|---|
| lasso peptide | Class II/III confident bacteriocin | ectoine | unknown thiotemplated cluster type |
| class II lantipeptide | bacterial head-to-tail cyclized peptide | phosphonate | Melanin |
| class III/IV lantipeptide | class II/III Possible off-target bacteriocin | NRPS-independent siderophore synthase | cyclodipeptide (NYH family) |
| sactipeptide | APA-derived phosphonate | cyclopeptide (NYH family) | |
| glycocin | kasugamycin family aminoglycoside | ||
| CoX | autoinducing peptide | ||
| bacilysin | |||
| linear azol(in)e-containing peptide (LAP) | |||
| thiopeptide | |||
| class I lantipeptide | |||
3.4. Prediction of Evident biosynthetic gene clusters (BGCs) in Bacillus species using BAGEL 4
From BAGEL4 analysis, 238 BGCs were identified (Supplementary File 2). BAGEL4 specifically analyzes potential clusters responsible for the production of RiPPs (mainly) and bacteriocins. Bacteriocins are generally classified into three types: Type 1 (RiPPs), Type 2 (small unmodified bacteriocins), and Type 3 (>10 kDa antimicrobial proteins). Most of the BGCs obtained from the analysis fell under the Type 1 bacteriocin (RiPPs) category. These Type 1 bacteriocins were further categorized into subclasses, namely Lanthipeptide A, Lanthipeptide B, Lanthipeptide, Glycocin, ComX, Sactipeptide, Head-to-tail cyclized peptides, LAP, Lasso peptide, Microcin, and No subclass (BGCs not falling under these subclasses) (Table 5). Some clusters coding for Type 2 bacteriocins, namely UviB, Subtilosin_(SboX), Lacticin_Z (D), Lacticin_Q (D), Propionicin_SM1, and LCI, and for Type 3 bacteriocins, namely Colicin_E9, Zoocin_A, and Closticin_574, were also observed. Around 1–5 RiPP BGCs were distributed among the species. Bacillus swezeyi had the most RiPP BGCs (eight in total). Four species, Bacillus fonticola, Bacillus licheniformis, Bacillus mesophilus, and Bacillus tuaregi, showed no RiPP BGCs in the BAGEL4 analysis.
Table 5.
Categorization of RiPPs (Type 1 bacteriocin) BGCs into subclasses from Bagel4 analysis.
| Lanthipeptide A | ComX | LAP |
|---|---|---|
| BacCH91 | ComX4 | Plantazolicin |
| Paenibacillin | ComX1 | |
| Subtilin | ComX3 | Lasso peptide |
| Competence pheromone | Paeninodin | |
| Lanthipeptide B | Sactipeptide | Microcin |
| LichenicidinVK21A1_(Lichenicidin_A1) | Subtilosin_A | Plantathiazolicin_(Plantazolicin) |
| SalivaricinA | Sporulation-killingfactor_skfA | |
| Haloduracin_beta | Thurincin_H_(thuricin17) | No subclass |
| Cerecidin (CerA1) | ||
| Mersacidin | Head-to-tailcyclized peptides | Thiopeptide |
| Bicereucin_BsjA1 | Enterocin_Nkr-5-3B | Bottromycin |
| Staphylococcins_C55a_SacaA | Amylocyclicin | Lanthipeptide_class_I |
| Salivaricin_A4 | Carnocyclin-A | Lanthipeptide_class_II |
| Cytolysin_ClyLl | Enterocin_AS-48 | Lanthipeptide_class_IV |
| Macedovicin | Pumilarin | Sactipeptides |
| Lasso peptide | ||
| Lanthipeptide | Glycocin | LAPs |
| Subtilomycin | Sublancin_168 | |
| Sonorensin | ||
| Thurandacin | ||
3.5. Comparative analysis of BGCs obtained from different genome mining tools
The total BGC hits for each species from antiSMASH, PRISM4, and BAGEL4, along with source, genome size, and gene number, are summarized in Supplementary file 1. antiSMASH generally predicts all types of possible BGCs found in an organism’s genome. Overall, Bacillus genomes carry 2–22 BGCs each (mean = 8.47, s.d. = 3.95). Among the 87 species, B. kwashiorkori, the bacterium with the smallest genome (2.8 Mbp), contained two BGCs, whereas B. alveayuensis, with the largest genome (6.7 Mbp), had 7 BGCs. B. gaemokensis and B. rugosus, with the most BGCs (22 and 21, respectively), have genome sizes of 5.6 Mbp and 4.5 Mbp. The genomes of B. fonticola and B. kwashiorkori contain at least 2 BGCs. The most commonly distributed BGCs include terpene (95.4 % of genomes), non-ribosomal peptide synthetase (NRPS, 82.8 % of genomes), type III polyketide synthase (T3PKS, 90.8 % of genomes), ribosomally synthesized and post-translationally modified peptide-like (RiPP-like, 47.1 % of genomes), and hybrid clusters (71.3 % of genomes) (Supplementary file 2). These five types of BGCs accounted for more than two-thirds of all the BGCs detected in a genome.
PRISM4 primarily predicts non-ribosomal peptide synthetase (NRPS) and type I and type II polyketide synthase (PKS) gene clusters. Across the genomes, the number of NRPS and PKS clusters varied among species. A Bacillus species may carry 1–12 NRPS, 1–2 PKS, and 1–12 PKS-NRPS hybrid BGCs. Of the 87 genomes, 46 contain NRPS, 5 contain PKS, and 26 contain PKS-NRPS hybrid BGCs. Other types of BGCs may also be found along with these clusters. Eight species, namely Bacillus acidicola, Bacillus cihuensis, Bacillus fonticola, Bacillus infantis, Bacillus renqingensis, Bacillus salacetis, Bacillus salipaludis, and Bacillus tepidiphilus, were devoid of any NRPS/PKS BGCs.
BAGEL4 predicts different classes of bacteriocins. The genomes of the 87 Bacillus species were found to carry 1–8 bacteriocin-producing BGCs (mean = 2.72, s.d. = 1.69). Bacillus swezeyi, with a genome size of 4.6 Mbp and a total of 4,735 genes, had the highest number (eight) of bacteriocin BGCs. A graphical visualization of the results obtained from the three genome mining tools is shown in Fig. 5.
Fig. 5.
Graphical representation of the results obtained from antiSMASH, PRISM4, and BAGEL4.
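The per-tool summaries reported in this section (range, mean, s.d.) could be derived from a per-species table such as Supplementary file 1. The sketch below is illustrative only: the `ToolHits` record and the `summarize` helper are hypothetical names, only the counts explicitly stated in the text are filled in, and the use of the sample standard deviation is an assumption because the text does not say which form was used.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Optional


@dataclass
class ToolHits:
    """Per-species BGC counts from each tool; None means not stated in the text."""
    species: str
    antismash: Optional[int] = None
    prism4: Optional[int] = None
    bagel4: Optional[int] = None


def summarize(records, tool):
    """Summarize the available counts for one tool, or return None if there are none."""
    values = [getattr(r, tool) for r in records if getattr(r, tool) is not None]
    if not values:
        return None
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        # Sample standard deviation (an assumption); undefined for one value.
        "sd": stdev(values) if len(values) > 1 else 0.0,
    }


# Only the counts quoted in this section; the full table covers 87 species.
records = [
    ToolHits("B. kwashiorkori", antismash=2),
    ToolHits("B. alveayuensis", antismash=7),
    ToolHits("B. gaemokensis", antismash=22),
    ToolHits("B. rugosus", antismash=21),
    ToolHits("B. swezeyi", bagel4=8),
]

for tool in ("antismash", "prism4", "bagel4"):
    print(tool, summarize(records, tool))
```

With the complete table loaded into `records`, the same helper would give the range, mean, and s.d. for each tool; with only the handful of species above, the output does not match the values reported for all 87 genomes.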
4. Discussion
Bacillus species are suitable candidates for screening secondary metabolite potential because of three properties: the ability to produce numerous antimicrobial compounds, the capacity to form highly viable spores that can be formulated, and their availability in soil16. For this study, three genome mining tools, namely antiSMASH, PRISM4, and BAGEL4, were chosen to explore the genomes of different Bacillus species. Of these, antiSMASH served as the baseline genome-wide screening tool: it detects more than 50 BGC types, including NRPS, PKS (types I, II, and III), RiPPs and their subclasses (lanthipeptides, thiopeptides, etc.), terpenes, sactipeptides, siderophores, and hybrid clusters13, 17. BAGEL4 specializes in predicting bacteriocins and RiPPs20, 15, 18. PRISM4 focuses mainly on predicting the chemical structures of the secondary metabolites encoded by BGCs and can predict some major classes (PKS, NRPS, RiPPs, hybrid clusters, and a few RiPP subclasses), but its BGC coverage is not as exhaustive as that of antiSMASH13, 14, 19. Other genome mining tools, such as RODEO and RiPPMiner-Genome, are also available for detecting RiPPs. However, RODEO focuses mostly on specific RiPP classes (e.g., lasso peptides, thiopeptides), whereas RiPPMiner-Genome is a specialized tool for predicting and classifying RiPPs that does not offer broad coverage and is often less integrated with genome-wide analysis20, 21, 22. Used together, antiSMASH, PRISM4, and BAGEL4 maximize the discovery potential across both known and novel RiPPs, cover a broad range of BGCs, and provide complementary insights (gene clusters, core peptides, structures), making them the tools of choice for modern genome mining projects.
In the distribution of BGCs among the species, each clade specializes in a particular BGC type, suggesting that orthologous BGCs within a clade share ancestry23. Ecological pressures, such as nutrient competition, osmotic stress, desiccation, or host association, tend to drive the retention or acquisition of specialized BGCs24. Evolutionary forces such as horizontal gene transfer, on the other hand, contribute to functional diversity by moving BGCs within and between species24. For instance, to adapt to their microenvironment and niche, species in clade 3 have an abundance of RiPP-like, betalactone, siderophore, lasso peptide, NRPS, and hybrid BGCs, which helps them produce a wide range of antimicrobial metabolites. The findings of this study show that the genus Bacillus is highly capable of producing secondary metabolites, with the most prevalent BGCs being terpene, NRPS, PKS, RiPP, hybrid, and other undefined BGCs that form highly diversified compounds.
Studies state that terpene BGCs are more highly conserved in phytobiomes than in the soil microbiome, and that their abundance is species-specific25, 26. Our findings showed that they were widely distributed among the Bacillus species, regardless of environment and isolation source, although their abundance varied in some species. Other studies show that some Bacillus species use carotenoids primarily for their antioxidant and photoprotective properties, while others use them as pigments27, 28, 29. The terpene BGCs from this study showed high to moderate similarity to carotenoid clusters, or may encode carotenoid variants, which is consistent with previous studies. Because BGCs have a modular organization, they can be reshuffled, truncated, or fused over time, giving rise to hybrid BGCs that contain genes encoding more than one type of scaffold-synthesizing enzyme27, 30, 31. Some hybrid clusters exhibited 100 % similarity to clusters for a wide range of antimicrobial agents, including bacillaene, sporulation killing factor, bacilysin, and bacillibactin, as well as high similarity to fengycin, teichuronic acid, and plantazolicin in some species, suggesting that these clusters are limited to specific species. Although the origin and specific functions of hybrid clusters are unknown, the mosaic evolution that leads to such combinations allows for significant structural and chemical modifications to the major BGC classes, thereby enhancing the capacity of these bacterial species to produce medically important derivatives of a compound and to generate new metabolite diversity32, 33.
Besides terpene BGCs, the distribution of BGCs shows that Bacillus species are rich in NRPS, PKS, and RiPP BGCs and their subcategories. Studies state that NRPS, PKS, or NRPS/PKS hybrid BGCs contain the information needed to synthesize a definite metabolite, including regulatory elements, resistance genes, and transporters34, 35. However, where the same compound is predicted in multiple species with different similarity confidence values, the clusters might yield functional variants of that compound or give rise to new compounds. Some species had RiPP BGCs that produce compounds found only in a particular species. Bacillus pakistanensis contained a BGC that codes for kasugamycin, an aminoglycoside antibiotic that was isolated from Streptomyces kasugaensis and is used to control fungal and bacterial growth in rice fields36. Similarly, Bacillus xiapuensis contains a BGC coding for autoinducing peptides, which are generally found in the Gram-positive bacterium Staphylococcus aureus and are involved in cell-to-cell communication37, 38, 39. Some of the RiPP BGCs obtained code for different types of compounds under the subcategories lanthipeptide A, ComX, LAP, lanthipeptide B, lanthipeptide, glycocin, sactipeptide, microcin, head-to-tail cyclized peptides, and other undefined categories, which is consistent with previous studies40. These RiPPs and bacteriocins show not only a high degree of structural variation but also a wide range of bioactivities.
More than 50 % of the BGCs identified were of unknown type; according to antiSMASH, these clusters share no similarity with any known cluster and do not fit into any other category. This indicates that a substantial number of the BGCs are orphan or cryptic and may be considered potential biosynthetic gene clusters for novel antibiotics and secondary metabolites.
Based on the comparative analysis of BGCs obtained from the different genome mining tools, the number of BGCs per genome in Bacillus species is moderately associated with genome size. Some studies show that, in most species of different genera, the number of BGCs increases linearly with genome size: the larger the genome, the more BGCs41. Bacillus species, however, showed some exceptions, as some species had a greater number of BGCs despite their small genome size. Such variation in BGC content is not random; rather, it is shaped by evolutionary lineage and ecological factors. Different environments impose distinct selective pressures that influence the need for secondary metabolites, which are encoded by BGCs and help the organism survive stress conditions. Some soil-dwelling species, e.g., Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis, Bacillus pumilus, and Bacillus cereus, tend to produce a wide variety of antimicrobial compounds, such as surfactin, fengycin, bacillomycin D, bacitracin, lichenysin, and zwittermicin A, to compete with other microorganisms for space and nutrients and to establish dominance in their habitat42, 43, 44, 45. Other species, e.g., Bacillus subtilis and Bacillus amyloliquefaciens, colonize plant roots to establish niches in the rhizosphere and act as plant growth-promoting rhizobacteria (PGPR), promoting plant growth by protecting roots from pathogens46, 47. In microbial populations, horizontal gene transfer also plays a significant role in BGC diversity. The modular organization of BGCs and the presence of mobile genetic elements such as transposons and plasmids help transfer BGCs across strains or species, leading to rapid BGC acquisition and diversification48, 49, 50.
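The relationship between genome size and BGC count discussed above can be quantified with a correlation coefficient. The sketch below is a hedged illustration, not the analysis performed in this study: `pearson_r` is a hypothetical helper, and the input is limited to the four genome size and antiSMASH BGC count pairs quoted in Section 3.5, so the result does not represent the full 87-genome dataset in Supplementary file 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Genome size (Mbp) and antiSMASH BGC count for the four species named in
# Section 3.5: B. kwashiorkori, B. alveayuensis, B. gaemokensis, B. rugosus.
genome_size_mbp = [2.8, 6.7, 5.6, 4.5]
bgc_count = [2, 7, 22, 21]

r = pearson_r(genome_size_mbp, bgc_count)
print(f"Pearson r for the four quoted species: {r:.2f}")
```

Applied to all 87 genome size and BGC count pairs, the same function would give a single number to support or qualify the "moderate" association described in the text; a rank-based measure such as Spearman's coefficient may be preferable if the relationship is not linear.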
5. Conclusion
Numerous species of Bacillus have been shown to be a repository of various types of BGCs (terpene, NRPS, PKS, RiPP, hybrids, and other subcategories) and have the potential to produce a wide range of secondary metabolites. Genome mining tools not only made it easier to find BGCs and to screen and characterize known compounds, but also helped identify uncharacterized BGCs, avoiding the laborious methods previously used to screen and purify compounds for antimicrobial activity against suitable targets. A substantial number (>50 %) of orphan or cryptic BGCs were also obtained, increasing the scope for exploring biosynthetic gene clusters for novel antibiotics and secondary metabolites. Thus, the data obtained from this study might serve as a repository for developing approaches not only to unlock the full potential of the Bacillus chemical repertoire but also to activate these BGCs and initiate the synthesis of their bioactive compounds, leading to the discovery of novel biologically important compounds. Although multiple novel gene clusters for potential antimicrobials were discovered, their roles remain unknown and the clusters have yet to be characterized.
Funding sources
No financial support was provided to the author(s) or their institution(s) to cover salaries, consulting fees, and/or other expenses related to this study.
CRediT authorship contribution statement
Showti Raheel Naser: Writing – review & editing, Writing – original draft, Methodology, Data curation. Sanjana Fatema Chowdhury: Writing – review & editing, Software, Formal analysis, Data curation. Md. Murshed Hasan Sarkar: Writing – review & editing, Supervision, Project administration, Formal analysis, Conceptualization. Md. Ahashan Habib: Writing – review & editing. Shahina Akter: Writing – review & editing. Tanjina Akhtar Banu: Writing – review & editing. Barna Goswami: Writing – review & editing. Md. Salim Khan: Writing – review & editing, Supervision.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.jgeb.2025.100595.
Contributor Information
Md. Murshed Hasan Sarkar, Email: murshed-sarkar@bcsir.gov.bd.
Md. Salim Khan, Email: k2salim@bcsir.gov.bd.
References
- 1.Demain A.L., Fang A. The natural functions of secondary metabolites. History of Modern Biotechnology I. 2000:1–39. (book chapter).
- 2.Pham J.V., Yilma M.A., Feliz A., et al. A review of the microbial production of bioactive natural products and biologics. Front. Microbiol. 2019;10:1404. doi: 10.3389/fmicb.2019.01404.
- 3.Newman D.J., Cragg G.M. Natural products as sources of new drugs over the nearly four decades from 01/1981 to 09/2019. J. Nat. Prod. 2020;83:770–803. doi: 10.1021/acs.jnatprod.9b01285.
- 4.Manikprabhu D., Lingappa K. γ Actinorhodin a natural and attorney source for synthetic dye to detect acid production of fungi. Saudi J. Biol. Sci. 2013;20:163–168. doi: 10.1016/j.sjbs.2013.01.004.
- 5.Narsing Rao M.P., Xiao M., Li W.J. Fungal and bacterial pigments: secondary metabolites with wide applications. Front. Microbiol. 2017;8:1113. doi: 10.3389/fmicb.2017.01113.
- 6.World Health Organization. Global antimicrobial resistance surveillance system (GLASS) report: early implementation 2020. Accessed 25 July 2022. https://www.who.int/publications/i/item/9789240005587/; 2022.
- 7.Osbourn A. Secondary metabolic gene clusters: Evolutionary toolkits for chemical innovation. Trends Genet. 2010;26:449–457. doi: 10.1016/j.tig.2010.07.001.
- 8.Fan H., Ru J., Zhang Y., Wang Q., Li Y. Fengycin produced by Bacillus subtilis 9407 plays a major role in the biocontrol of apple ring rot disease. Microbiol. Res. 2017;199:89–97. doi: 10.1016/j.micres.2017.03.004.
- 9.Su Z., Chen X., Liu X., et al. Genome mining and UHPLC–QTOF–MS/MS to identify the potential antimicrobial compounds and determine the specificity of biosynthetic gene clusters in Bacillus subtilis NCD-2. BMC Genom. 2020;21:1–6. doi: 10.1186/s12864-020-07160-2.
- 10.Turnbull P.C., Kramer J.M., Melling J. Bacillus. Medical Microbiology. 4th ed. 1996.
- 11.Rutledge P.J., Challis G.L. Discovery of microbial natural products by activation of silent biosynthetic gene clusters. Nat. Rev. Microbiol. 2015;13:509–523. doi: 10.1038/nrmicro3496.
- 12.Blin K., Shaw S., Steinke K., et al. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 2019;47(W1):W81–W87. doi: 10.1093/nar/gkz310.
- 13.Blin K., Shaw S., Kloosterman A.M., et al. antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res. 2021;49(W1):W29–W35. doi: 10.1093/nar/gkab335.
- 14.Skinnider M.A., Merwin N.J., Johnston C.W., Magarvey N.A. PRISM 3: expanded prediction of natural product chemical structures from microbial genomes. Nucleic Acids Res. 2017;45(W1):W49–W54. doi: 10.1093/nar/gkx320.
- 15.van Heel A.J., de Jong A., Song C., Viel J.H., Kok J., Kuipers O.P. BAGEL4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins. Nucleic Acids Res. 2018;46(W1):W278–W281. doi: 10.1093/nar/gky383.
- 16.Wulff E.G., Mguni C.M., Mansfeld-Giese K., Fels J., Lübeck M., Hockenhull J. Biochemical and molecular characterization of Bacillus amyloliquefaciens, B. subtilis and B. pumilus isolates with distinct antagonistic potential against Xanthomonas campestris pv. campestris. Plant Pathology. 2002;51:574–584. doi: 10.1046/j.1365-3059.2002.00753.x.
- 17.Blin K., Kim H.U., Medema M.H., Weber T. Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters. Brief. Bioinform. 2019;20(4):1103–1113. doi: 10.1093/bib/bbx146.
- 18.de Jong A., van Hijum S.A., Bijlsma J.J., Kok J., Kuipers O.P. BAGEL: a web-based bacteriocin genome mining tool. Nucleic Acids Res. 2006;34(suppl_2):W273–W279. doi: 10.1093/nar/gkl237.
- 19.Skinnider M.A., Dejong C.A., Rees P.N., et al. Genomes to natural products prediction informatics for secondary metabolomes (PRISM). Nucleic Acids Res. 2015;43(20):9645–9662. doi: 10.1093/nar/gkv1012.
- 20.Tietz J.I., Schwalen C.J., Patel P.S., et al. A new genome-mining tool redefines the lasso peptide biosynthetic landscape. Nat. Chem. Biol. 2017;13(5):470–478. doi: 10.1038/nchembio.2319.
- 21.Schwalen C.J., Mitchell D. Discovery of Antibiotic Peptides from Novelty‐Prioritized Natural Product Genome Mining. FASEB J. 2017;31:939–1938. doi: 10.1096/fasebj.31.1_supplement.939.8.
- 22.Agrawal P., Amir S., Barua D., Mohanty D. RiPPMiner-Genome: a web resource for automated prediction of crosslinked chemical structures of RiPPs by genome mining. J. Mol. Biol. 2021;433(11):166887. doi: 10.1016/j.jmb.2021.166887.
- 23.Emms D.M., Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20(1):238. doi: 10.1186/s13059-019-1832-y.
- 24.Salamzade R., Kalan L.R. Context matters: assessing the impacts of genomic background and ecology on microbial biosynthetic gene cluster evolution. mSystems. 2025;10(3):e01538-24. doi: 10.1128/msystems.01538-24.
- 25.Mukherjee A., Tikariha H., Bandla A., Pavagadhi S., Swarup S. Global analyses of biosynthetic gene clusters in phytobiomes reveal strong phylogenetic conservation of terpenes and aryl polyenes. mSystems. 2023;8(4):e00387-23. doi: 10.1128/msystems.00387-23.
- 26.Yin Q.J., Ying T.T., Zhou Z.Y., et al. Species-specificity of the secondary biosynthetic potential in Bacillus. Front. Microbiol. 2023;14:1271418. doi: 10.3389/fmicb.2023.1271418.
- 27.Khaneja R., Perez-Fons L., Fakhry S., et al. Carotenoids found in Bacillus. J. Appl. Microbiol. 2010;108(6):1889–1902. doi: 10.1111/j.1365-2672.2009.04590.x.
- 28.Soni N., Dhandhukia P., Thakker J.N. Carotenoid from marine Bacillus infantis: production, extraction, partial characterization, and its biological activity. Arch. Microbiol. 2023;205(5):161. doi: 10.1007/s00203-023-03505-z.
- 29.Patkar S., Shinde Y., Chindarkar P., Chakraborty P. Evaluation of antioxidant potential of pigments extracted from Bacillus spp. and Halomonas spp. isolated from mangrove rhizosphere. Biotechnologia. 2021;102(2):157–169. doi: 10.5114/bta.2021.106522.
- 30.Zotchev S.B. Evolutionary Biology: Genome Evolution, Speciation, Coevolution and Origin of Life. Springer International Publishing; Cham: 2014. Genomics-based insights into the evolution of secondary metabolite biosynthesis in actinomycete bacteria; pp. 35–45.
- 31.Cimermancic P., Medema M.H., Claesen J., et al. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell. 2014;158:412–421. doi: 10.1016/j.cell.2014.06.034.
- 32.Gallagher K.A., Jensen P.R. Genomic insights into the evolution of hybrid isoprenoid biosynthetic gene clusters in the MAR4 marine streptomycete clade. BMC Genom. 2015;16:1–3. doi: 10.1186/s12864-015-2110-3.
- 33.Khaldi N., Collemare J., Lebrun M.H., Wolfe K.H. Evidence for horizontal transfer of a secondary metabolite gene cluster between fungi. Genome Biol. 2008;9:1. doi: 10.1186/gb-2008-9-1-r18.
- 34.Strieker M., Tanović A., Marahiel M.A. Nonribosomal peptide synthetases: structures and dynamics. Curr. Opin. Struct. Biol. 2010;20:234–240. doi: 10.1016/j.sbi.2010.01.009.
- 35.Shen B. Polyketide biosynthesis beyond the type I, II and III polyketide synthase paradigms. Curr. Opin. Chem. Biol. 2003;7:285–295. doi: 10.1016/S1367-5931(03)00020-6.
- 36.Kasuga K., Sasaki A., Matsuo T., et al. Heterologous production of kasugamycin, an aminoglycoside antibiotic from Streptomyces kasugaensis, in Streptomyces lividans and Rhodococcus erythropolis L-88 by constitutive expression of the biosynthetic gene cluster. Appl. Microbiol. Biotechnol. 2017;101(10):4259–4268. doi: 10.1007/s00253-017-8189-5.
- 37.Sturme M.H., Kleerebezem M., Nakayama J., Akkermans A.D., Vaughan E.E., De Vos W.M. Cell to cell communication by autoinducing peptides in gram-positive bacteria. Antonie Van Leeuwenhoek. 2002;81(1):233–243. doi: 10.1023/A:1020522919555.
- 38.Ji G., Beavis R., Novick R.P. Bacterial interference caused by autoinducing peptide variants. Science. 1997;276(5321):2027–2030. doi: 10.1126/science.276.5321.2027.
- 39.Thoendel M., Horswill A.R. Identification of Staphylococcus aureus AgrD residues required for autoinducing peptide biosynthesis. J. Biol. Chem. 2009;284(33):21828–21838. doi: 10.1074/jbc.M109.031757.
- 40.Zhao X., Kuipers O.P. Identification and classification of known and putative antimicrobial compounds produced by a wide variety of Bacillales species. BMC Genom. 2016;17(1):882. doi: 10.1186/s12864-016-3224-y.
- 41.Shi Y.M., Hirschmann M., Shi Y.N., et al. Global analysis of biosynthetic gene clusters reveals conserved and unique natural products in entomopathogenic nematode-symbiotic bacteria. Nat. Chem. 2022;14:701–712. doi: 10.1038/s41557-022-00923-2.
- 42.Stein T. Bacillus subtilis antibiotics: structures, syntheses and specific functions. Mol. Microbiol. 2005;56(4):845–857. doi: 10.1111/j.1365-2958.2005.04587.x.
- 43.Chen X.H., Koumoutsi A., Scholz R., et al. Comparative analysis of the complete genome sequence of the plant growth–promoting bacterium Bacillus amyloliquefaciens FZB42. Nat. Biotechnol. 2007;25(9):1007–1014. doi: 10.1038/nbt1325.
- 44.Stabb E.V., Jacobson L.M., Handelsman J.O. Zwittermicin A-producing strains of Bacillus cereus from diverse soils. Appl. Environ. Microbiol. 1994;60(12):4404–4412. doi: 10.1128/aem.60.12.4404-4412.1994.
- 45.Bizani D.A., Brandelli A. Characterization of a bacteriocin produced by a newly isolated Bacillus sp. Strain 8 a. J. Appl. Microbiol. 2002;93(3):512–519. doi: 10.1046/j.1365-2672.2002.01720.x.
- 46.Blake C., Christensen M.N., Kovács Á.T. Molecular aspects of plant growth promotion and protection by Bacillus subtilis. Mol. Plant-Microbe Interact. 2021;34(1):15–25. doi: 10.1094/MPMI-08-20-0225-CR.
- 47.Chowdhury S.P., Hartmann A., Gao X., Borriss R. Biocontrol mechanism by root-associated Bacillus amyloliquefaciens FZB42–a review. Front. Microbiol. 2015;6:780. doi: 10.3389/fmicb.2015.00780.
- 48.Biology insights. Biosynthetic Gene Clusters: New Metabolic Insights. Accessed 25 August 2025. https://www.biologyinsights.com/biosynthetic-gene-clusters-new-metabolic-insights/.
- 49.Chase A.B., Sweeney D., Muskat M.N., Guillén-Matus D.G., Jensen P.R. Vertical inheritance facilitates interspecies diversification in biosynthetic gene clusters and specialized metabolites. mBio. 2021;12(6):e02700-21. doi: 10.1128/mBio.02700-21.
- 50.Park C.J., Smith J.T., Andam C.P. Horizontal Gene Transfer: Breaking Borders between Living Kingdoms. Springer International Publishing; Cham: 2019. Horizontal gene transfer and genome evolution in the phylum Actinobacteria; pp. 155–174.