Abstract
Background
Microbial interactions are critical for maintaining the stability of food fermentation microbiomes, and mobile genetic elements (MGEs) significantly influence these interactions through horizontal gene transfer. Although MGEs are known to facilitate horizontal gene transfer, their distribution among microorganisms and their specific effects on microbial interactions remain poorly understood.
Results
We analyzed 590 metagenomic and 42 metatranscriptomic samples from food fermentations, recovering 1133 metagenome-assembled genomes (MAGs). Our analysis revealed that MGEs were widely distributed in food fermentation microbiomes, with higher occurrence rates in Firmicutes (Bacillota: 0.71 ~ 11.85%) and Proteobacteria (Pseudomonadota: 0.47 ~ 11.05%). MGEs tended to be located adjacent to functional genes, particularly biosynthetic gene clusters (BGCs), with co-occurrence rates ranging from 9.41 to 23.99%. Furthermore, the transcriptional activity of BGCs was significantly correlated with the number of MGEs that were co-located with BGCs, which might enhance the competitiveness of strains. Variability in the diversity of MGEs that were co-located with BGCs was also evident at the strain level. Using Lactiplantibacillus plantarum as a case, we revealed that the strain-level differences in MGEs that were co-located with BGCs are positively correlated with the transcription of BGCs and competitiveness of strains within the species.
Conclusions
This study highlighted the role of MGEs in enhancing transcription of BGCs and facilitating strain competitiveness, providing new insights into how MGEs enhance the adaptability of microbial communities.
Supplementary Information
The online version contains supplementary material available at 10.1186/s40168-025-02180-0.
Keywords: Biosynthetic gene cluster, Microbial interaction, Mobile genetic element, Multi-omics, Secondary metabolites, Transcription
Introduction
Microorganisms are the most abundant and diverse organisms in nature [1]. Microbial communities establish intricate networks where species interact in complex ways, shaping the structure and stability of ecosystems [2]. Recent studies highlight the crucial role of microbial interactions in driving community resilience and ecological dynamics [3, 4]. Within these communities, microorganisms continuously compete for limited resources [5], utilizing various strategies such as metabolic cross-feeding [6] and competitive behaviors [7]. In food fermentations, microbial interactions are particularly crucial, as they not only contribute to the stability of fermentations [8], but also affect the flavor and safety of the final product [9]. However, the mechanisms enabling microorganisms to adapt and survive in food fermentations remain poorly understood.
Mobile genetic elements (MGEs) are prevalent in prokaryotic genomes [10]. A recent study revealed that the average number of MGEs varies among phyla: Proteobacteria and Firmicutes are enriched in phages, while Bacteroidetes and Firmicutes are enriched in conjugative elements [11]. Additionally, MGEs are pivotal in bacterial adaptation because they enable the transfer of various genes [12–15], and gene mobility can drive microbial genomic variation [16, 17]. MGEs integrate into the host genome through mechanisms such as insertion and recombination, causing effects such as gene mutations and changes in the transcriptional landscape [18]. Nonetheless, the impact of MGEs on microbial adaptation within communities, as well as the underlying mechanisms, is still poorly understood. Understanding these mechanisms is vital for deciphering microbial community assembly and interactions.
Secondary metabolites are crucial drivers of microbial interactions [19]. Although not essential for microbial growth, these compounds play vital roles in communication, signaling, and competition, thereby mediating microbial interactions [20]. For instance, signaling molecules such as quorum sensing peptides enable communication between different species to coordinate behaviors such as biofilm formation and virulence factor expression [21]. Genes involved in the biosynthesis of secondary metabolites are often clustered together in bacterial genomes, forming biosynthetic gene clusters (BGCs) [22]. One study found that 119 biosynthetic units encoding polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS) are clustered in genomic islands enriched with mobile genetic elements in Salinispora [23], indicating that MGEs can also be co-located with BGCs. However, it remains unclear whether MGEs exhibit preferences for specific BGCs or how they precisely influence BGC transcription [24]. To address this gap, we proposed the following research questions: (1) Do MGEs exhibit preference in their occurrence, and how frequently are they co-located with BGCs? (2) Do MGEs affect the transcription of BGCs? (3) Can MGEs regulate strain adaptation within microbial communities by modulating BGC transcription?
To answer these questions, we performed multi-omics analysis in combination with co-culture experiments. We recovered 1133 metagenome-assembled genomes (MAGs) from 590 metagenomes to investigate the occurrence of MGEs in genomes and especially in BGCs. By examining the relationship between MGEs that were co-located with BGCs and the relative abundance of MAGs, we assessed how MGEs influence strain adaptation. Additionally, we explored the effects of MGEs on BGC transcription through metatranscriptomic analysis. Finally, pairwise co-culture experiments were performed to confirm the adaptation of strains with MGEs that were co-located with BGCs and their impact on adaptation within species, particularly in relation to enhancing the transcription of biosynthetic gene clusters. This highlights the fundamental roles of MGEs and BGCs in shaping microbial ecology and competitive interactions in food fermentations.
Methods
Metagenomic and metatranscriptomic data collection
In July 2023, we collected 223 food fermentation metagenomes and 42 corresponding metatranscriptomes from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA), together with 653 food fermentation bacterial MAGs and 367 corresponding metagenomes from our previous analysis [25]. Together, these datasets covered 25 fermented food types (Supplementary Table S1).
Metagenome assembly and binning
All SRA data were separated into paired-end or single-end data using fastq-dump (v. 3.0.2) from sratoolkit (v. 3.0.2) (https://github.com/ncbi/sra-tools) with the options “--split-3 --gzip”. All raw reads were quality-controlled using Trim Galore (v. 0.6.10) (https://github.com/FelixKrueger/TrimGalore) with default parameters to remove adapters and low-quality reads (q < 20), and read quality was then checked with FastQC (v. 0.12.1) (https://github.com/s-andrews/FastQC). For paired-end data, the clean reads of each sample were individually assembled using metaSPAdes (v. 3.15.5) [26] with default parameters, and the clean reads from the same type of fermented food in the same study were coassembled with MEGAHIT (v. 1.2.9) [27] with the option “--presets meta-sensitive”. For single-end data, both the individual assembly of each sample and the coassembly of samples from the same type of fermented food in the same study were performed using MEGAHIT (v. 1.2.9) with the option “--presets meta-sensitive”.
Contigs shorter than 1500 bp were removed using seqtk (v. 1.4) (https://github.com/lh3/seqtk), and the remaining contigs were used for metagenomic binning. For metagenomic binning analysis, the clean reads were mapped to the corresponding contigs using Bowtie2 (v. 2.2.5) [28]. The mapping results were converted into BAM format, sorted, and indexed using Samtools (v. 1.7.0) [29]. The sorted BAM files were used for metagenomic binning based on sequence characteristics and coverage depth using CONCOCT (v. 1.1.0) [30], MetaBAT 2 (v. 2.12.1) [31] and MaxBin 2 (v. 2.2.7) [32]. MAGs generated by the different methods were integrated using DAS Tool (v. 1.1.6) [33]. The completeness and contamination of all MAGs were estimated using CheckM (v. 1.2.2) [34] with the lineage_wf workflow and CheckM2 (v. 1.0.1) [35] with the option “--allmodels”. MAGs of medium and high quality (completeness ≥ 50% and contamination ≤ 10%) [36] were retained. The MAGs from the same fermented food type were further dereplicated into 1133 nonredundant MAGs at 99% ANI using dRep (v. 2.2.4) [37] with the options “--ignoreGenomeQuality -pa 0.95 -sa 0.99 --S_algorithm fastANI”. The relative abundance of MAGs in each sample was quantified using CoverM (v. 0.6.1) (https://github.com/wwood/CoverM) in genome mode.
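The quality thresholds used to retain MAGs can be illustrated with a minimal sketch. It assumes that completeness and contamination are already available as percentages (for example, parsed from CheckM output); the function names and the example records are hypothetical and are not part of the original pipeline.

```python
def classify_mag(completeness: float, contamination: float) -> str:
    """Return 'high', 'medium', or 'low' quality from percentage estimates.

    Thresholds follow the text: high quality is > 90% completeness and
    < 5% contamination; medium quality is >= 50% completeness and
    <= 10% contamination.
    """
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination <= 10:
        return "medium"
    return "low"


def retain_mags(mags):
    """Keep only MAGs of medium or high quality."""
    return [
        m for m in mags
        if classify_mag(m["completeness"], m["contamination"]) != "low"
    ]


# Hypothetical records; the values are illustrative, not from the study.
example = [
    {"name": "bin_1", "completeness": 97.2, "contamination": 1.1},
    {"name": "bin_2", "completeness": 63.0, "contamination": 8.4},
    {"name": "bin_3", "completeness": 41.5, "contamination": 2.0},
    {"name": "bin_4", "completeness": 88.0, "contamination": 12.3},
]
kept = [m["name"] for m in retain_mags(example)]
print(kept)
```

In this sketch, a MAG that is highly complete but exceeds 10% contamination is discarded, which matches the retention rule stated above.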
Taxonomic annotation and phylogenetic analysis
All MAGs were classified using GTDB-Tk (v. 2.1.1) [38] with the classify_wf workflow and default parameters based on the Genome Taxonomy Database (http://gtdb.ecogenomic.org/). The aligned protein sequences were generated using GTDB-Tk (v. 2.1.1) [38] and edited using BMGE (v. 1.12) [39] to retain sequence regions with phylogenetic information. The phylogenetic tree was inferred using FastTree (v. 2.1.11) [40] with default parameters. The phylogenetic tree was visualized and edited using the interactive Tree of Life (iTOL) (v. 6) [41].
Biosynthetic gene cluster analysis and mobile genetic elements annotation
BGCs of MAGs were predicted using antiSMASH (v. 6.1.0) [42] with the parameters: --taxon bacteria, --genefinding-tool prodigal, --cb-knownclusters, --cc-mibig, --cb-general, --cb-subclusters, and --fullhmmer. Clustering analysis was performed on BGCs using BiG-SCAPE (v. 1.1.5) [43] with the parameters: --cutoffs 0.3, --include_singletons, --mode auto, --verbose. The nucleotide sequence of each BGC was extracted from the gbk file (output of antiSMASH). The extracted nucleotide sequences were used as input to identify antibiotic resistance genes (ARGs) in BGCs using Resistance Gene Identifier (v. 5.1.1) [44]. The command line tool (cluster_function_prediction.py) was run with default parameters. The BGCs (gbk format) and ARGs (txt format) were used to predict the bioactivities of the secondary metabolites produced by BGCs using a previously described machine learning model [45]. All BGCs (gbk format) were converted into bed files using python scripts, and the 5000-bp sequences upstream and downstream of the BGCs were extracted using the flank and getfasta tools of bedtools (v. 2.26.0) [46]. Putative protein-coding sequences (CDSs) were predicted using Prodigal (v. 2.6.3) [47]. MGEs were predicted using Diamond (v. 2.1.4.158) [48] with the options “-k 1 -e 1e-5 --query-cover 60 --id 70” against the mobileOG-db database [49]. The tool for collating MGE prediction results (mobileOGs-pl-kyanite.py) (https://github.com/clb21565/mobileOG-db/tree/main/mobileOG-pl) was run with default parameters. We modified the script getElementClassifications.R (https://github.com/clb21565/mobileOG-db/blob/main/scripts/getElementClassifications.R) and applied it to the beatrix-v1.6 metadata file to classify MGEs into different categories.
MGEs were subclassified into element categories of plasmid (sequences derived from COMPASS or NCBI Plasmid RefSeq), phage (sequences derived from GPD or pVOG), transposable element (sequences derived from ISfinder), integrative element (sequences derived from ICEberg and integration/excision category proteins not included in ISfinder), conjugative element (sequences with the transfer major mobileOG category and conjugation minor category) or others (sequences derived from ACLAME and not classified as plasmid, phage, transposable element, integrative element and conjugative element) [50]. The MGEs predicted from BGCs and the upstream and downstream 5000 bp [51–53] sequences of BGCs were defined as BGC-related MGEs. The BGCs and the upstream and downstream 5000 bp sequences of BGCs containing MGEs were defined as MGE-related BGCs [54, 55]. Potential plasmid-derived MGEs were cross-referenced using Diamond (v. v2.1.4.158) [48] with option “-k 1 -f 6 -b6 -c1” based on the COMPASS [56] and NCBI Plasmid RefSeq [57] database. Plasmids were predicted using viralverify [58] with default parameters.
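The definition of BGC-related MGEs (MGEs inside a BGC or within 5000 bp upstream or downstream of it) amounts to an interval-overlap test. The following is a minimal sketch under stated assumptions: coordinates are 0-based and half-open as in BED files, any overlap with the extended window counts as co-location, and the gene positions in the example are invented for illustration. The function names are hypothetical and do not come from the authors' scripts.

```python
FLANK_BP = 5000  # flank size stated in the text


def bgc_window(start, end, contig_length, flank=FLANK_BP):
    """Extend a BGC interval by `flank` bp on each side, clipped to the contig."""
    return max(0, start - flank), min(contig_length, end + flank)


def bgc_related_mges(bgc, mges, contig_length, flank=FLANK_BP):
    """Return MGEs on the same contig that overlap the BGC or its flanks.

    Assumption: an MGE counts as co-located if it overlaps the extended
    window by at least one base.
    """
    w_start, w_end = bgc_window(bgc["start"], bgc["end"], contig_length, flank)
    return [
        m for m in mges
        if m["contig"] == bgc["contig"]
        and m["start"] < w_end
        and m["end"] > w_start
    ]


# Hypothetical coordinates for illustration only.
bgc = {"contig": "c1", "start": 20000, "end": 45000}
mges = [
    {"name": "lexA", "contig": "c1", "start": 16000, "end": 16600},  # in upstream flank
    {"name": "nusA", "contig": "c1", "start": 30000, "end": 31200},  # inside the BGC
    {"name": "polC", "contig": "c1", "start": 52000, "end": 56000},  # beyond the flank
    {"name": "rarA", "contig": "c2", "start": 30000, "end": 31000},  # different contig
]
hits = [m["name"] for m in bgc_related_mges(bgc, mges, contig_length=60000)]
print(hits)
```

Clipping the window to the contig boundaries mirrors what bedtools flank does when it is given a genome file, so BGCs near contig ends simply have shorter flanks.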
Metatranscriptome quality control
All SRA data were separated into paired-end or single-end data using fastq-dump (v. 3.0.10) from sratoolkit (v. 3.0.10) with the options “--split-3 --gzip”. All raw reads were quality-controlled using Trim Galore (v. 0.6.10) with default parameters to remove adapters and low-quality reads (q < 20), and read quality was then checked with FastQC (v. 0.12.1). The rRNAs in the quality-filtered metatranscriptomic reads were removed by comparison with rRNA sequences in the Rfam and Silva databases using SortMeRNA (v. 4.2.0) [59]. Each BGC was treated as a long gene [55, 60], and clean reads were mapped to the nucleotide sequence of each BGC to generate read counts using Salmon (v. 1.10.3) [61]. Gene expression levels were estimated as transcripts per million (TPM).
Collection of different Lactiplantibacillus plantarum
L. plantarum was isolated from food fermentation samples. Five grams of each fermented sample were combined with 100 mL of sterile saline solution, followed by serial dilution. These dilutions were plated onto de Man, Rogosa and Sharpe (MRS) agar plates supplemented with 1% calcium carbonate and incubated at 37 °C. After 24 h of incubation, colonies with distinct calcium-dissolving halos were identified on the agar plates. A total of 300 colonies were picked and preliminarily screened by polymerase chain reaction (PCR) using the L. plantarum-specific primers LPF (5′-CAAAACGGATTATCGCCAAC-3′) and LPR (5′-CGGTGTAAACGACATCATGC-3′) [62]. Finally, 18 colonies were confirmed to be L. plantarum by 16S rRNA gene identification using the general PCR primers 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 1492R (5′-GGTTACCTTGTTACGACTT-3′). Next, we determined the growth curves of these colonies. They were inoculated into 10 mL of chemically defined medium [63] for 24 h, and then inoculated into 2 mL of chemically defined medium at a concentration of 1 × 10⁶ colony forming units (CFU)/mL in 96-well plates for 48 h. During the incubation, samples were taken at 4-h intervals to determine biomass. At the 24-h time point, cells were harvested for DNA and RNA extraction. Each culture setup had three biological replicates. The biomass of all samples was determined by measuring the optical density at 600 nm (OD600) using a BioTek Synergy microplate reader (BioTek, Winooski, VT, USA).
DNA extraction, whole-genome sequencing, and analysis
The genomic DNA of 18 L. plantarum colonies was extracted using a Rapid Bacterial Genomic DNA Isolation Kit (Sangon Biotech, Shanghai, China). Then, a sequencing library was prepared with the high-molecular-weight DNA using a Rapid Sequencing Kit (SQK-LSK109 connection kit). Sequencing was performed on an Oxford Nanopore Technologies (ONT) platform, leveraging single-molecule real-time electrical signal sequencing. The raw reads of genomes were selected based on size and quality using Filtlong (v. 0.2.1) (https://github.com/rrwick/Filtlong), followed by checking quality with FastQC (v. 0.12.1) (https://github.com/s-andrews/FastQC). The obtained reads were assembled de novo using Canu (v. 2.2) [64]. Genomes were circularized using Circlator (v. 1.5.5) [65]. Putative protein coding sequences (CDSs) were predicted using Prodigal (v. 2.6.3) [47]. Average nucleotide identity (ANI) value was calculated using FastANI (v. 1.33) [66].
RNA extraction, sequencing, and analysis
Total RNA was extracted using an RNA extraction kit (Tiangen, Beijing, China), rRNA was depleted with the Ribo-Zero™ rRNA Removal Kit (Epicenter Biotechnologies, Madison, WI, USA), and sequencing libraries were prepared using the NEBNext® Ultra II Directional RNA Library Prep Kit (New England Biolabs, Ipswich, MA, USA). All raw reads were quality-controlled using Trimmomatic (v. 0.39) [67] to remove adapters and low-quality reads with the options “-phred33 ILLUMINACLIP:adapters.fa:2:30:10 SLIDINGWINDOW:4:15 MINLEN:75”, and read quality was then checked with FastQC (v. 0.12.1). The rRNAs in the quality-filtered transcriptomic reads were removed by comparison with rRNA sequences in the Rfam and Silva databases using SortMeRNA (v. 4.2.0) [59]. Clean reads were aligned to the whole genomes using Bowtie2 (v. 2.2.5) [28], reads were sorted by name using Samtools (v. 1.7.0), and HTSeq (v. 2.0) [68] was used to quantify each CDS in the genome. Gene expression levels were estimated as TPM.
Abundance and transcriptional activities calculation of biosynthetic gene clusters
The clean data of metagenomes, metatranscriptomes, and transcriptomes of 18 colonies of L. plantarum were mapped to the nucleotide sequence of BGCs to generate reads using Bowtie2 (v. 2.2.5); reads were sorted by name using Samtools (v. 1.7.0) and counted using Salmon (v. 1.10.3). The clean data of genomes of 18 L. plantarum were mapped to the nucleotide sequence of BGCs to generate reads using Minimap2 (v. 2.28) [69]; reads were sorted by name using Samtools (v. 1.7.0) and counted using Salmon (v. 1.10.3). Genes per million (GPM) was used as a proxy for BGC abundance in metagenomes, and TPM was used as a proxy for BGC abundance in metatranscriptomes. Transcriptional activity of BGCs was measured as TPM/GPM [70].
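The TPM/GPM ratio can be illustrated with a short worked sketch. It assumes that both TPM and GPM follow the usual length-normalized per-million formula applied to RNA and DNA read counts, respectively, and that normalization is done over the set of features supplied; Salmon additionally uses effective lengths, which this simplification ignores. All names and numbers are hypothetical.

```python
def per_million(counts, lengths):
    """Length-normalize read counts and scale them to sum to one million.

    Applied to RNA counts this gives TPM; applied to DNA counts it gives
    the analogous GPM value.
    """
    rates = {k: counts[k] / lengths[k] for k in counts}
    total = sum(rates.values())
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: r / total * 1e6 for k, r in rates.items()}


def transcriptional_activity(rna_counts, dna_counts, lengths):
    """Return TPM/GPM per feature; None where the feature has no DNA coverage."""
    tpm = per_million(rna_counts, lengths)
    gpm = per_million(dna_counts, lengths)
    return {k: (tpm[k] / gpm[k] if gpm[k] > 0 else None) for k in lengths}


# Hypothetical BGC lengths (bp) and read counts, for illustration only.
lengths = {"bgc_a": 10000, "bgc_b": 20000}
dna_counts = {"bgc_a": 100, "bgc_b": 200}   # equal coverage per base
rna_counts = {"bgc_a": 300, "bgc_b": 200}   # bgc_a is transcribed more per base

activity = transcriptional_activity(rna_counts, dna_counts, lengths)
print(activity)
```

Here both BGCs are equally abundant in the metagenome, so the ratio above 1 for bgc_a reflects higher transcription per gene copy rather than a larger population of the host.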
Mono-culture and co-culture experiments
Eighteen L. plantarum strains were cultured in chemically defined medium for both mono-culture and pairwise co-culture in transwell plates (12-well polystyrene plates with 0.4-μm pore size, Corning, NY, USA), which separate the bacterial cells while allowing the medium to pass through the membrane pores. For the mono-cultures, strains were inoculated into both the upper and lower chambers of the transwell plate, with an initial total concentration of 1 × 10⁶ CFU/mL. For pairwise co-cultures, each tested strain was inoculated in the lower chamber, and the corresponding partner was inoculated in the upper chamber. Each strain was inoculated at an initial concentration of 5 × 10⁵ CFU/mL. All cultures were maintained under static conditions at 37 °C for 24 h. To keep the culture conditions consistent, we collected only the final bacterial suspension in the lower chamber to determine biomass. Therefore, for each co-culture pair, we performed two co-culture series so that each strain was cultured once in the lower chamber. The biomass was determined by measuring the optical density at 600 nm (OD600) using a BioTek Synergy microplate reader (BioTek, Winooski, VT, USA). Each culture had three biological replicates. The competitive index (Ci) value was used to mitigate differences in growth rates among strains. The Ci value was defined by the following equation:
ODcA and ODcB represent the average biomass of colony A and B in co-cultures, respectively; ODmA and ODmB represent the average biomass of colony A and B in mono-cultures, respectively; ODcdm represents the average biomass of initial chemically defined medium.
Statistical analysis
Independent sample t-tests were employed to evaluate differences between the occurrence rates of MGEs in BGCs and across the genome using SPSS Statistics 27 (IBM, Armonk, NY, USA). To investigate the relationship between microbial competitiveness and MGE-related BGCs, linear regression models were developed based on the relative abundance of MAGs and the abundance of MGE-related BGCs in each metagenomic sample, utilizing Origin (OriginLab, Northampton, MA, USA). Furthermore, to assess the correlation between microbial competitiveness and MGE quantity, nonlinear regression models were constructed with the relative abundance of MAGs and the number of BGC-related MGEs in each metagenomic sample, also using Origin.
To evaluate the impact of the number of BGC-related MGEs on transcriptional activity and strain competitiveness, Spearman’s correlations were calculated between the number of BGC-related MGEs, the transcriptional activity of BGCs, and the relative abundance of MAGs, using SPSS Statistics 27. Independent sample t-tests were additionally used to compare the relative abundance of MAGs between those containing BGCs with lexA, nusA, and polC, and MAGs containing BGCs with lexA and nusA.
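Spearman's correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The study used SPSS for this step; the sketch below is only a dependency-free illustration of the coefficient itself, it does not compute a P-value, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation coefficient between two equal-length lists."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: number of BGC-related MGEs and BGC transcriptional activity.
mge_counts = [0, 1, 1, 2, 3, 4]
activity = [0.8, 1.1, 0.9, 1.6, 2.4, 5.0]
rho = spearman(mge_counts, activity)
print(round(rho, 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to the strongly skewed distributions that are typical of abundance and expression data.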
To compare the differences of BGC-related MGEs across taxonomic levels, the Shannon index was used to represent MGE diversity in a MAG, and we then calculated the difference in MGE at different taxonomic levels for MAGs containing MGE-related BGCs. Next, we used the average of the relative abundance of MAGs across all metagenomes to calculate strain fitness differences within a species. We established a linear regression model between MGE differences and strain fitness differences by calculating the variance of MGE diversity and the variance of relative abundance among all MAGs within a species. Additionally, the Wilcoxon rank-sum test was applied to analyze differences in transcription ratios between BGCs with varying amounts of MGE using SPSS Statistics 27. P-value < 0.05 was considered statistically significant.
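The diversity calculation can be sketched as follows, assuming the Shannon index is computed with the natural logarithm over the counts of each MGE category in a MAG and that the within-species spread is a sample variance; the text does not specify either choice, so both are assumptions. The MAG names and counts are hypothetical.

```python
import math
from statistics import variance


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over categories with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical counts of BGC-related MGEs per category (plasmid, phage,
# transposable, integrative, conjugative, others) for three MAGs of one species.
mags = {
    "mag_1": [4, 3, 1, 0, 0, 0],
    "mag_2": [8, 0, 0, 0, 0, 0],
    "mag_3": [2, 2, 2, 2, 0, 0],
}

diversity = {name: shannon_index(counts) for name, counts in mags.items()}

# Within-species variance of MGE diversity (sample variance, an assumption).
diversity_variance = variance(list(diversity.values()))
print(diversity, diversity_variance)
```

A MAG whose BGC-related MGEs all belong to one category has a diversity of zero, while an even spread over four categories gives ln 4, so the variance grows as strains of the same species diverge in their MGE composition.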
Results
The landscape of mobilome and BGCs in food fermentation microbiomes
We analyzed 590 metagenomic datasets representing 25 food fermentation types (Supplementary Table S1). Samples were subjected to standard quality control, assembly, and genomic reconstruction of single taxa. A total of 22,143 MAGs were generated, of which 4184 were high-quality (> 90% completeness and < 5% contamination) and 3024 were medium-quality (50% ≤ completeness ≤ 90% and 5% ≤ contamination ≤ 10%) [36]. From the MAGs of at least medium quality, 1133 non-redundant prokaryotic MAGs were identified based on 99% ANI (Supplementary Table S2). These MAGs were classified into 14 bacterial phyla, with Firmicutes and Proteobacteria accounting for 52.87% and 33.01% of the MAGs, respectively (Fig. 1a). Within Firmicutes, lactic acid bacteria were the predominant group, including two of the most frequently reconstructed families: Lactobacillaceae (323 MAGs) and Streptococcaceae (98 MAGs). These families are known for their crucial roles in food fermentation. In addition, 189 MAGs did not align with any reference genomes (ANI < 95%) and were identified as unknown genomes at the species level. Among these, 5 MAGs could not be assigned to any known genera and were defined as novel genera. These 1133 MAGs underscore the diversity of microbial taxa present in food fermentations.
Fig. 1.
1133 MAGs from food fermentations and their genomic characteristics. a A phylogenetic tree of 1133 nonredundant bacterial MAGs. From the inside to the outside, Layer A: Phylum classification of the 1133 MAGs. Layer B: Heat map showing the number of BGCs in each MAG. Layer C: Average number of BGC-related MGEs in each MAG. A scale of 0~1 means that 0 … b Density plot of the proportion of MGEs in the genome for different phyla. c The occurrence of MGEs in BGCs is universal. The width of the ribbons represents the proportion of MGEs in different MAGs in the middle and the co-occurrence proportion of MGEs in different BGC types on the right.
To explore the mobilome landscape, we performed a comprehensive analysis of MGEs across 1133 non-redundant MAGs from food fermentation microbiomes. A total of 89,293 MGEs were identified across all MAGs (Supplementary Table S3), with each MAG containing at least two types of MGE. Most MGEs belonged to plasmids (68.01%), followed by phages (25.64%). Plasmids and phages were present in 99.82% and 92.23% of the MAGs, respectively, and conjugative elements (CEs) and transposable elements (TEs) were found in 83.41% and 55.16% of the MAGs, respectively. Furthermore, we cross-referenced these potential plasmid-derived MGEs by blasting their nucleotide sequences against the complete plasmid sequence database. In total, 50,338 sequences (81.62%) showed significant alignment to plasmid sequences (Fig. S1), which further confirmed the reliability of our findings.
These MGEs were classified into five functional groups based on the major categories of mobileOG-db: integration/excision-related (22.01%), phage-related (24.37%), replication/recombination/repair-related (32.04%), stability/transfer/defense-related (9.12%) and transfer-related (12.46%). Clusters of orthologous groups (COG) annotation of MGEs revealed that 47.35% of the MGEs were annotated with functions related to replication, recombination, and repair, indicating that these MGEs play key roles in maintaining genomic stability and facilitating genetic diversity. Additionally, 16.20% of the MGEs had unknown functions, highlighting the potential for further investigation into their roles in microbial physiology or ecology (Fig. S2). Although MGEs were widespread among food fermentation microbiomes, MGE occurrence rates varied significantly across different phyla, ranging from 0.04 to 11.85%. Firmicutes (0.71% ~ 11.85%) and Proteobacteria (0.47% ~ 11.05%) exhibited a higher propensity for MGE occurrence compared to other phyla (Fig. 1b). These results revealed that MGEs are common and diverse in food fermentation microbiomes.
COG annotation of genes that were co-located with MGEs revealed that genes related to replication, recombination, and repair were the most frequent (44.60%). This finding highlighted the critical role of MGEs in genomic rearrangement and evolutionary adaptation. Surprisingly, genes associated with secondary metabolite synthesis were the second most prevalent, with 41.91% linked to MGE co-occurrence (Fig. S3). We further investigated the biosynthetic potential of secondary metabolites in the food fermentation microbiomes. In total, 4059 BGCs were detected in 86.32% of MAGs. The number of BGCs ranged from 0 to 62 in different MAGs (Fig. 1a). Only 31 BGCs (0.76%) were derived from plasmids, and the remaining BGCs were located on chromosomes. A total of 1277 MGEs were co-located with BGCs, and 660 of all BGCs (16.26%) were co-located with MGEs. More than 20% of the BGCs in three classes were co-located with MGEs: ribosomally synthesized and post-translationally modified peptides (RiPPs, 23.99%), PKSother (23.18%) and PKS-NRP_Hybrids (20.41%) (Fig. 1c). The co-occurrence rates of MGEs in BGCs (9.41 ~ 23.99%) were significantly higher than the genome-wide occurrence rates of MGEs (0.04 ~ 11.85%) (P < 0.001). These findings suggested a preferential co-occurrence of MGEs adjacent to functional genes, particularly BGCs.
The role of BGCs and MGE-related BGCs in strain competitiveness
A total of 4059 BGCs were clustered into 2398 gene cluster families (GCFs). Prediction of antimicrobial activity indicated that 1026 GCFs were associated with secondary metabolites that had potential antibacterial activity (predicted probability of antibacterial activity ≥ 50%). Among these 1026 GCFs, 216 GCFs were found in MAGs with high relative abundance (average relative abundance ≥ 1%) (Fig. S4). The majority of the 216 GCFs belonged to the RiPP-like (48 GCFs) and type III polyketide synthases (T3PKS, 36 GCFs) families, responsible for producing antimicrobial peptides and other bioactive compounds, respectively (Fig. S5). These results suggested that secondary metabolites produced by BGCs play a crucial role in enhancing microbial competitiveness.
Although a substantial proportion of MGEs were co-located with BGCs, their specific physiological roles remain largely unclear. The relationship between the abundance of MGE-related BGCs and strain competitiveness was investigated. A weak correlation was observed between the relative abundance of MAGs and the abundance of MGE-related BGCs (R² = 0.195, P < 0.001, Fig. 2a), indicating that MGE-related BGCs might contribute to strain competitiveness in food fermentation microbiomes. This finding suggested a potential link between BGC-related MGEs and the relative abundance of MAGs (Fig. 2b).
Fig. 2.
Characterization of the mobilome co-located with biosynthetic gene clusters. a Correlation between abundance of MGE-related BGCs and relative abundance of MAGs. b Correlation between average number of BGC-related MGEs and relative abundance of MAGs. c Network of MGEs co-occurring with BGCs. Nodes represent MGEs, colored by their category. The size of each node is proportional to the number of co-occurrences, and the edge thickness indicates the frequency of co-occurrence. Edge colors indicate the average relative abundance: red edges > 1, 0.5 ≤ blue edges ≤ 1, 0.1 ≤ green edges < 0.5 and gray edges < 0.1. NA1 represents an unidentified gene
To investigate the relationship between MGEs and BGCs, we analyzed the co-occurrence patterns of different BGC-related MGEs. Certain genes, such as lexA, were restricted to specific BGC types such as PKSother (Fig. S6). In addition, certain gene combinations also tended to be co-located with specific BGC types; for example, nusA and lexA appeared exclusively in PKSother BGCs, and ihfB and rarA were found only in RiPPs (Fig. S7). A co-occurrence network analysis of MGEs in BGCs was conducted to explore the relationship between MGE co-occurrence frequency and the relative abundance of MAGs. Three pairs, lexA and nusA, rarA and ihfB, and rarA and NA1, were often co-located with BGCs (co-occurrence count > 10, Fig. 2c). The average relative abundance of MAGs containing BGCs with each of these three pairs was higher than 0.5% (median relative abundance of all MAGs = 0.33%, Fig. S8). Prediction of antimicrobial activity revealed that BGCs containing lexA and nusA, rarA and ihfB, and rarA and NA1 exhibited average antimicrobial activities of 50.10%, 65.15%, and 69.11%, respectively. These findings suggested that the antimicrobial compounds produced by these BGCs may contribute to the higher relative abundance of specific MAGs. The lexA and nusA pair was found only in Firmicutes, whereas the rarA and ihfB pair and the rarA and NA1 pair were found only in Proteobacteria (Fig. S9). These data suggested that different MGE combinations are biased toward distinct BGC families and microbial taxa.
We also analyzed the co-occurrence of three MGEs. The co-occurrence count dropped sharply (co-occurrence count < 10), with ruvA, ruvB, and tag being the most frequently co-occurring combination (co-occurrence count = 7), followed by lexA, nusA, and polC (co-occurrence count = 6, Fig. S10a). The average relative abundances of MAGs containing BGCs with ruvA, ruvB, and tag, and with lexA, nusA, and polC, were both higher than 0.5% (Fig. S10b). Additionally, the relative abundance of MAGs containing BGCs with lexA, nusA, and polC was significantly higher than that of MAGs containing BGCs with only lexA and nusA (P < 0.05), suggesting that polC may work in conjunction with lexA and nusA (Fig. S11). These results suggested that there may be synergistic interactions among multiple MGEs.
MGE-related BGCs exhibited enhanced transcriptional activity
We investigated the impact of MGE on BGC transcription. The average number of BGC-related MGEs showed a significant positive correlation with the transcriptional activity of BGCs (Spearman’s R = 0.195, P < 0.001, Fig. 3a). This suggested that a higher average number of BGC-related MGEs might enhance the transcription of BGCs. Additionally, the transcriptional activity of BGCs was significantly associated with the relative abundance of MAGs (Spearman’s R = 0.130, P < 0.001, Fig. 3a). This relationship suggested that MAGs with more BGC-related MGEs may exhibit higher transcriptional activity of BGCs and competitiveness.
Fig. 3.
Mobilome impact on BGCs and strain competitiveness. a Correlations between average number of BGC-related MGEs and transcriptional activity of BGCs, and between relative abundance of MAGs and transcriptional activity of BGCs. b Relationship between average number of BGC-related MGEs and transcriptional activity of BGCs. The transcriptional activity of BGCs significantly increases from 0 to 4 MGEs (Wilcoxon rank-sum test, *** P <0.001, ** P <0.01). c Specific gene combinations and their impact on transcriptional activity of BGCs (Wilcoxon rank-sum test, *** P <0.001)
We hypothesized that MGEs might enhance microbial competitiveness through increased transcriptional activity of BGCs. We therefore compared the transcriptional activity of BGCs with and without MGE co-occurrence. MGE-related BGCs had higher transcriptional activity than BGCs without MGE co-occurrence (P < 0.001, Fig. 3b).
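A two-group comparison of this kind can be illustrated with a rank-sum (Mann-Whitney U) statistic. The sketch below is a simplified plain-Python version under stated assumptions: it uses a normal approximation without tie or continuity correction, the function name `rank_sum_test` is hypothetical, and the values are toy numbers rather than study data.

```python
from math import erfc, sqrt

def rank_sum_test(group_a, group_b):
    """Mann-Whitney U for group_a vs group_b with a normal-approximation p-value.

    Simplifications (assumptions of this sketch): no tie correction and no
    continuity correction, so p-values are only rough for small samples.
    """
    n_a, n_b = len(group_a), len(group_b)
    # U counts pairs where the group_a value exceeds the group_b value;
    # ties contribute one half.
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u, p_two_sided

# Hypothetical toy transcription values, not data from the study.
with_mge = [2.1, 3.4, 5.0, 4.2, 6.3, 2.8]
without_mge = [0.4, 1.1, 0.9, 1.5, 0.7, 1.9]
u_stat, p_value = rank_sum_test(with_mge, without_mge)
print(u_stat, p_value)
```

For real analyses, an established statistics package with exact or tie-corrected P values would be preferable to this approximation.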
BGCs with lexA and nusA (average transcriptional activity = 7.41, P < 0.001), and rarA and ihfB (average transcriptional activity = 11.29, P < 0.001) exhibited higher transcriptional activity compared to other BGCs in the same GCFs (average transcriptional activity = 1.02 and 1.23, respectively, Fig. 3c). To investigate whether MGEs synergistically upregulate the transcription of BGCs, we analyzed a GCF containing either lexA or nusA, excluding the influence of other genes. BGCs with both lexA and nusA exhibited higher transcriptional activity than those with only lexA or nusA (94.29% synergistic transcriptional activity > 1), suggesting that MGEs may synergistically upregulate transcription of BGCs (Fig. S12).
Intraspecies interactions were related to BGC-related MGEs
We further explored the diversity of BGC-related MGEs across different taxonomic levels. Average MGE diversity (based on the Shannon index) did not differ significantly across taxonomic levels, but the variance of MGE diversity among taxonomic units gradually increased from the phylum to the strain level (Fig. 4a). The strain-level evolutionary variation in MGEs implied that MGEs might play a role in microbial competitiveness even at the strain level. Additionally, species that contained at least five MAGs with MGE-related BGCs were selected, and we performed both interspecies and intraspecies comparisons of MGE diversity. Our results showed significant differences in MGE diversity between species (Adonis: R = 0.30, P < 0.01) and also highlighted notable intraspecies variation (Fig. S13). Therefore, we analyzed the correlation between MGE diversity and strain competitiveness and found that the variance of MGE diversity was significantly correlated with the variance of relative abundance of MAGs within species (Spearman’s R = 0.590, P < 0.05, Fig. S14). This suggested that disparities in the diversity of BGC-related MGEs at the strain level may contribute to strain competitiveness within species.
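As an illustration of the two quantities used here, the sketch below computes a Shannon index per MAG and then the variance of that index within one species. It rests on assumptions: the MAG names and gene lists are hypothetical toy data, the natural logarithm is used, and the population variance is chosen arbitrarily because the text does not state which variance estimator was applied.

```python
from collections import Counter
from math import log
from statistics import pvariance

def shannon_index(counts):
    """Shannon index H = sum(-p_i * ln p_i) over categories with nonzero count."""
    total = sum(counts)
    return sum(-(c / total) * log(c / total) for c in counts if c > 0)

# Hypothetical toy input: MGE genes found next to BGCs in three MAGs of one species.
mag_to_mges = {
    "mag_1": ["lexA", "nusA", "polC", "ruvA"],
    "mag_2": ["lexA", "lexA", "lexA", "nusA"],
    "mag_3": ["ruvB"],
}

# One diversity value per MAG, computed from its MGE gene counts.
diversity = {
    mag: shannon_index(list(Counter(mges).values()))
    for mag, mges in mag_to_mges.items()
}

# Spread of MGE diversity among MAGs of the same species (population variance).
within_species_variance = pvariance(list(diversity.values()))
print(diversity, within_species_variance)
```

A MAG carrying a single kind of MGE has an index of zero, while an even mix of four kinds reaches ln 4, so the within-species variance grows when strains differ strongly in their MGE repertoires.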
Fig. 4.
The BGC-related MGEs varied at different taxonomic levels. a Variance of MGE number and average MGE diversity (Shannon index) across different taxonomic levels. b~c Correlation between abundance of BGC-related MGEs and relative abundance of MAGs in Lactiplantibacillus plantarum (b) and Streptococcus thermophilus (c). d~e Phylogenetic trees of L. plantarum (d) and S. thermophilus (e) MAGs. The circular network plots illustrate the relationships among MAGs in terms of relative abundance; only MAGs in which both the abundance of MGE-related BGCs and the relative abundance were high are connected. The outer circles represent the abundance of MGE-related BGCs across different MAGs, with the color gradient from blue to red indicating low to high abundance levels. The bar graph represents the average number of MGE-related BGCs, and the number is annotated below each bar
The observed correlation between MGE diversity and strain competitiveness at the strain level implied potential interactions among MAGs within the same species. To explore these intraspecies interactions, we employed a co-occurrence network to analyze interactions of all MAGs. The intraspecies interactions accounted for 5.1% of all interactions (Fig. S15), with their frequency significantly surpassing that of interactions between species within genera (P < 0.001, Fig. S16). These findings indicated the prevalence and significance of intraspecies interactions.
To further elucidate the impact of BGC-related MGEs on strain competitiveness within species, we focused on species with more than 10 MAGs, and a total of 10 species were selected for analysis (Fig. S17). Five of these species displayed a significant correlation between MGE-related BGC abundance and relative abundance of the strain: Lactiplantibacillus plantarum (Spearman’s R = 0.346, P < 0.05, Fig. 4b), Streptococcus thermophilus (Spearman’s R = 0.444, P < 0.05, Fig. 4c), Lactococcus lactis (Spearman’s R = 0.251, P < 0.01, Fig. S18a), Lactococcus cremoris (Spearman’s R = 0.144, P < 0.05, Fig. S18b), and Levilactobacillus brevis (Spearman’s R = 0.270, P < 0.05, Fig. S18c). These results suggested that MGE-related BGCs may be related to intraspecies interactions.
We further investigated whether MGEs play a regulatory role in intraspecies interactions among these 5 species. Initially, we assessed whether the average number of BGC-related MGEs in one MAG affects strain competitiveness within species. The relative abundance of MAGs was significantly related to the average number of BGC-related MGEs of L. plantarum (Spearman’s R = 0.146, P < 0.001), S. thermophilus (Spearman’s R = 0.280, P < 0.001) and L. lactis (Spearman’s R = 0.111, P < 0.001) (Fig. S19), indicating that an increase in the average number of BGC-related MGEs might be associated with a corresponding rise in the relative abundance of MAGs.
Next, we examined whether BGC-related MGEs also upregulate transcriptional activity of BGCs. There was a significant correlation between the average number of BGC-related MGEs and the transcriptional activity of BGCs in L. plantarum and S. thermophilus (Spearman’s R = 0.402, P < 0.01 and Spearman’s R = 0.423, P < 0.001, respectively, Fig. S20), indicating that a higher average number of BGC-related MGEs may upregulate transcription of BGCs in these 2 species. In addition, MAGs with higher competitiveness contained more BGC-related MGEs in L. plantarum (Fig. 4d) and S. thermophilus (Fig. 4e). Overall, these findings suggested that the average number of BGC-related MGEs was related to the transcription of BGCs in MAGs within species, thereby potentially improving the competitiveness of MAGs within species.
MGE-related BGCs are related to species competitiveness in pairwise co-culture
To examine whether MGEs facilitate strain competitiveness within species through enhanced transcription, we focused on L. plantarum because its BGC transcriptional activity varied more than that of S. thermophilus (L. plantarum: variance value = 73.28, S. thermophilus: variance value = 0.06). Eighteen colonies of L. plantarum were isolated from food fermentation samples to investigate the influence of MGEs on intraspecies competition (Fig. 5a). These 18 colonies were divided into 7 clusters (based on 99% ANI, Supplementary Table S4). Initially, we established growth curves for all 18 colonies, with most reaching a stable growth phase by 24 h (Fig. S21). The genome sizes of these 18 colonies ranged from 2.99 to 5.73 Mbp, with the number of BGCs varying from 3 to 9. The compositions of BGC-related MGEs in these 18 colonies also differed (variance value = 27.70, Supplementary Table S4). Therefore, we used these 18 colonies as representatives to analyze and verify the effect of BGC-related MGEs on intraspecies interaction.
Fig. 5.
Mobilome’s role in transcription of BGCs and strain competitiveness in Lactiplantibacillus plantarum. a Experimental design. Colonies of L. plantarum were screened and isolated from food fermentation samples. DNA and RNA sequencing were performed on these strains. Mono-culture and pairwise co-culture experiments were set up to investigate the competitiveness of each L. plantarum strain with three biological replicates. Competitiveness was evaluated by measuring OD600 in lower chambers of a transwell plate, where each well contained a different strain combination. b Correlation between number of BGC-related MGEs and competitive index (Ci). c Correlation between transcriptional activity of BGC and Ci. d Heatmap of Ci for strains with and without lexA, nusA and polC. The heatmap shows the log10(Ci) for all pairwise co-culture combinations. Strains containing BGCs with lexA, nusA and polC are marked with stars, while those without lexA, nusA and polC are marked with squares. e The violin plot compares the log10(Ci) between strains with and without lexA, nusA and polC (t-test, *** P < 0.001)
Pairwise co-culture experiments of these 18 strains were conducted in transwell plates to assess contact-independent intraspecific interactions, and a total of 306 co-culture pairs were obtained (Fig. 5a). To mitigate differences in growth rates among strains, we employed the competitive index (Ci), comparing the growth of strains in co-culture, to characterize strain growth advantages. In pairwise cultures, Ci values were significantly correlated with differences in the average number of BGC-related MGEs between strains (Spearman’s R = 0.300, P < 0.001, Fig. 5b), confirming that strains with a higher average number of BGC-related MGEs were more competitive. To ensure that the observed differences in co-culture growth were not confounded by genomic differences, we also compared 7 strains (clustered together by ANI ≥ 99%) in terms of their gene content, genome size, and BGC number (Supplementary Table S4). The results revealed no significant differences in these characteristics across these 7 strains (ANOVA: P = 0.9997). Our results showed a significant correlation between Ci values and the average number of BGC-related MGEs across the strains (Spearman’s R = 0.262, P < 0.05, Fig. S22). This finding suggested that strains with a higher average number of BGC-related MGEs tended to exhibit greater competitiveness in co-culture, supporting our hypothesis that MGEs play a role in microbial competitiveness.
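The figure of 306 co-culture pairs matches the number of ordered pairs among 18 strains (18 × 17). The short sketch below illustrates that count, assuming each pair was measured once per focal strain; this reading is an assumption, and the strain labels are hypothetical.

```python
from itertools import permutations

# Hypothetical strain labels; the study reports 18 isolates.
strains = [f"strain_{i:02d}" for i in range(1, 19)]

# Each ordered pair (focal, competitor) is one co-culture measurement:
# the focal strain's growth is read while the competitor shares the medium.
ordered_pairs = list(permutations(strains, 2))
print(len(ordered_pairs))  # 18 * 17 = 306
```

Under this reading, every unordered strain combination appears twice, once with each strain as the focal strain, which allows the growth of both partners to be compared.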
As BGC transcription typically occurs during the logarithmic to stationary growth phases [59], we conducted transcriptome sequencing at 24 h. Transcriptional activity of BGCs was significantly associated with Ci values (Spearman’s R = 0.348, P < 0.001, Fig. 5c), supporting our hypothesis that increased BGC transcription contributed to strain competitiveness.
To elucidate the specific contributions of different MGEs, we analyzed the presence of various BGC-related MGEs (Fig. S23). Genes lexA and nusA were identified in 18.18% of BGCs (Fig. S24a), and the Ci values of strains containing BGCs with lexA and nusA were significantly higher than those of strains containing BGCs without lexA and nusA (P < 0.001, Fig. S24b). We further investigated the combined effect of lexA, nusA, and polC (Fig. 5d). When two strains that both contained BGCs with lexA, nusA, and polC were co-cultured, their Ci values were not significantly different. However, strains containing BGCs with lexA, nusA, and polC had a higher average Ci value when co-cultured with strains containing BGCs without lexA, nusA, and polC (Fig. 5e). These results further supported the role of lexA, nusA, and polC in enhancing BGC transcription and, consequently, strain competitiveness within species.
Discussion
In this study, we revealed the distribution of MGEs among microorganisms in food fermentations, emphasizing their tendency to be located adjacent to functional genes, particularly BGCs. This propensity may enhance strain competitiveness by increasing the transcription of BGCs, thereby enhancing the potential production of secondary metabolites [71], such as antimicrobial peptides [72] and siderophores [73]. These secondary metabolites, typically produced by BGCs [19], might play key roles in mediating microbial interactions [74]. Although the correlation between the abundance of MGE-related BGCs and the relative abundance of MAGs was weak in this study, its high significance (P < 0.001) indicated that MGE-related BGCs may contribute to strain competitiveness. Moreover, most of the secondary metabolites produced by BGCs cannot yet be identified by existing bioinformatic analyses. The functions of these secondary metabolites need to be further elucidated, which will enhance our understanding of microbial interactions.
In addition, a weak correlation between the average number of BGC-related MGEs and the transcriptional activity of BGCs was observed. The transcriptional activity of BGCs is regulated by various mechanisms, such as quorum sensing [75], which were not fully resolved in our analysis. These complex regulatory mechanisms, along with the physiological costs associated with genes transferred by MGEs [76], may explain the weak correlation between BGC-related MGEs and the transcriptional activity of BGCs. Despite the weak correlation, the high significance (P < 0.001) indicated that BGC-related MGEs may contribute to regulating BGC transcription. Furthermore, MGEs typically do not directly carry highly expressed genes, as each transfer event entails potential costs [77]. However, MGEs could carry regulatory factors or promoters capable of inducing high gene expression, potentially exerting global regulatory effects [17]. For instance, lexA has been identified as a global transcriptional regulator that modulates the transcription of beneficial genes, thereby improving strain adaptability [78]. Gene nusA, a transcriptional anti-terminator, regulates the transcription and translation of entire gene clusters [79, 80]. This may explain why these two MGEs enhance BGC transcription. However, the molecular mechanisms underlying MGE synergism and their regulation of functional gene transcription remain unclear and require further investigation.
MGEs can introduce new genetic variations or traits, facilitating the adaptation of microorganisms to their environments [81]. Thus, MGEs are key factors driving microbial evolution [82]. MGEs enhance strain competitiveness by transferring new functional genes and altering the genome [83]. In this study, we found that MGEs may provide novel regulatory functions, promoting the transcription of specific functional genes. We have started exploring the impact of MGEs on adjacent genes. However, their broader influence on the entire genome, particularly on global transcriptional effects, remains a complex challenge. The relationship between these influences and microbial evolution requires further investigation.
This study provided novel insights into the role of MGEs in regulating BGC transcriptional activity and species competitiveness. Protein ortholog-based methods may lead to ambiguous classifications during MGE annotation, and a universally applicable method that resolves this challenge is not yet available [84]. By combining metagenomic and metatranscriptomic analysis with cultivation experiments, we demonstrated that MGEs influence the regulation of BGC transcription. These findings elucidated the role of MGEs in intraspecies competition in food fermentations and provided valuable insights into the ecological functions of MGEs in microorganisms.
Conclusion
This work systematically revealed the widespread prevalence of MGEs in food fermentation microbiomes. Strain competitiveness correlated positively with the abundance of MGE-related BGCs. Specifically, strains enriched in MGE-related BGCs exhibited higher transcriptional activity, which was associated with greater competitiveness. Overall, this study demonstrated the critical role of the mobilome in enhancing strain competitiveness by promoting the transcription of BGCs in food fermentation microbiomes.
Supplementary Information
Supplementary Material 1: Figure S1. The similarity between these potential plasmid derived MGEs and the MGEs in plasmid databases. Figure S2. Functional classification of mobile genetic elements. Figure S3. Functional classification of mobile genetic element (MGE)-related genes. Figure S4. Overview of biological function of biosynthetic gene clusters (BGC). Figure S5. Classification of biosynthetic gene clusters (BGCs) in 216 gene cluster families (GCFs). Figure S6. The occurrence of different MGEs in different BGC types. Figure S7. The co-occurrence of two MGEs in different BGC types. Figure S8. Relative abundance of MAGs containing BGCs with lexA and nusA, ihfB and rarA, and rarA and NA1. Figure S9. The co-occurrence of different MGEs in different phyla. Figure S10. The co-occurrence of three MGEs. Figure S11. The relative abundance of MAGs containing BGCs with lexA and nusA, and lexA, nusA and polC. Figure S12. Mobile genetic elements (MGEs) synergistically increased the transcriptional activity of biosynthetic gene clusters (BGCs). Figure S13. Interspecies and intraspecies comparisons of MGE diversity. Figure S14. Correlation between variance of MGE diversity and variance of relative abundance of MAG at the strain level. Figure S15. Microbial co-occurrence network. Figure S16. Interactions between strains within the same species and between species within the same genera. Figure S17. The number of MAGs in different species. Figure S18. Correlation between abundance of MGE-related BGCs and relative abundance of MAGs in Lactococcus lactis (a), Lactococcus cremoris (b) and Levilactobacillus brevis (c). Figure S19. Correlation between average number of BGC-related MGEs and relative abundance of MAGs in Lactiplantibacillus plantarum (a), Streptococcus thermophilus (b) and Lactococcus lactis (c). Figure S20. Correlation between average number of BGC-related MGEs and transcriptional activity of BGCs in Lactiplantibacillus plantarum (a) and Streptococcus thermophilus (b). Figure S21. Growth curves of 18 colonies of Lactiplantibacillus plantarum. Figure S22. Correlation between number of BGC-related MGE and competitive index (Ci). Figure S23. The presence of various MGEs within BGCs in 18 Lactiplantibacillus plantarum strains. Figure S24. The profile of strains having BGCs with and without lexA and nusA.
Supplementary Material 2. Table S1. The information of 590 food fermentation metagenomes and 42 corresponding metatranscriptomes.
Supplementary Material 3. Table S2. The information of 1133 metagenome-assembled genomes.
Supplementary Material 4. Table S3. Mobile genetic elements identified in 1133 metagenome-assembled genomes.
Supplementary Material 5. Table S4. The information of 18 colonies of Lactiplantibacillus plantarum.
Acknowledgements
We gratefully acknowledge support from the high-performance cluster platform of the School of Biotechnology, Jiangnan University.
Authors’ contributions
Lei Xu: Data curation; software; investigation; formal analysis; methodology; validation; visualization; writing—original draft; Writing—review & editing. Jian-Yu Jiao: Writing—review & editing; investigation. Chen Ling: Data curation, software. Ru-Bing Du: Software; writing—review & editing. Qun Wu: Conceptualization; funding acquisition; resources; supervision; writing—review & editing. Yan Xu: Funding acquisition; supervision. Wen-Jun Li: Writing—review & editing; supervision.
Funding
This work was supported by the National Natural Science Foundation of China (32172175), the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the 111 Project (No. 111-2-06).
Data availability
The metagenomic data used in this study were downloaded from NCBI, and a summary of their accessions is provided in Supplementary Data 1. Six hundred and fifty-three MAGs from food fermentation were downloaded from the GitHub repository (https://github.com/durubing-jn/food-fermentation-mategenome). The genomes and transcriptomes of the 18 L. plantarum strains were submitted to NCBI under accession number PRJNA1187877; detailed information and the genomes are available at the GitHub repository (https://github.com/SSxlei/The-role-of-mobile-genetic-element). The data produced in this study, including 1133 MAGs, 4059 BGCs and the MGE annotation file, as well as the Python and R scripts for data analysis, have been deposited and are available at the GitHub repository (https://github.com/SSxlei/The-role-of-mobile-genetic-element).
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Lei Xu and Jian-Yu Jiao contributed equally to this study.
Contributor Information
Qun Wu, Email: wuq@jiangnan.edu.cn.
Wen-Jun Li, Email: liwenjun3@mail.sysu.edu.cn.
References
- 1.Shoemaker WR, Locey KJ, Lennon JT. A macroecological theory of microbial biodiversity. Nat Ecol Evol. 2017;1:0107. [DOI] [PubMed] [Google Scholar]
- 2.Giordano N, et al. Genome-scale community modelling reveals conserved metabolic cross-feedings in epipelagic bacterioplankton communities. Nat Commun. 2024;15:2721. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Qian JJ, Akcay E. The balance of interaction types determines the assembly and stability of ecological communities. Nat Ecol Evol. 2020;4:356–65. [DOI] [PubMed] [Google Scholar]
- 4.Barber JN, et al. Species interactions constrain adaptation and preserve ecological stability in an experimental microbial community. ISME J. 2022;16:1442–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Ghoul M, Mitri S. The ecology and evolution of microbial competition. Trends Microbiol. 2016;24:833–45. [DOI] [PubMed] [Google Scholar]
- 6.Kost C, Patil KR, Friedman J, Garcia SL, Ralser M. Metabolic exchanges are ubiquitous in natural microbial communities. Nat Microbiol. 2023;8:2244–52. [DOI] [PubMed] [Google Scholar]
- 7.Hansen MH, et al. Resurrecting ancestral antibiotics: unveiling the origins of modern lipid II targeting glycopeptides. Nat Commun. 2023;14:7842. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Ban S, et al. Community dynamics and assembly is driven by environmental microbiota mediated by spatiotemporal distribution: the case of Daqu fermentation. Int J Food Microbiol. 2025;426: 110933. [DOI] [PubMed] [Google Scholar]
- 9.Cheng T, et al. The complex world of kefir: Structural insights and symbiotic relationships. Compr Rev Food Sci Food Saf. 2024;23:e13364. [DOI] [PubMed] [Google Scholar]
- 10.Newton IL, Bordenstein SR. Correlations between bacterial ecology and mobile DNA. Curr Microbiol. 2011;62:198–208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Khedkar S, et al. Landscape of mobile genetic elements and their antibiotic resistance cargo in prokaryotic genomes. Nucleic Acids Res. 2022;50:3155–68. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Rodriguez-Beltran J, DelaFuente J, Leon-Sampedro R, MacLean RC, San MA. Beyond horizontal gene transfer: the role of plasmids in bacterial evolution. Nat Rev Microbiol. 2021;19:347–59. [DOI] [PubMed] [Google Scholar]
- 13.Partridge SR, Kwong TM, Firth N, Jensen SO. Mobile genetic elements associated with antimicrobial resistance. Clin Microbiol Rev. 2018;31:e00088-00017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Saati-Santamaría Z. Global map of specialized metabolites encoded in prokaryotic plasmids. Microbiol Spectrum. 2023;11:e01523-01523. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Hemme CL, et al. Lateral gene transfer in a heavy metal-contaminated-groundwater microbial community. mBio. 2016;7:e02234-02215. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Frost LS, Leplae R, Summers AO, Toussaint A. Mobile genetic elements: the agents of open source evolution. Nat Rev Microbiol. 2005;3:722–32. [DOI] [PubMed] [Google Scholar]
- 17.Tuller T, et al. Association between translation efficiency and horizontal gene transfer within microbial communities. Nucleic Acids Res. 2011;39:4743–55. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Durrant MG, Li MM, Siranosian BA, Montgomery SB, Bhatt AS. A bioinformatic analysis of integrative mobile genetic elements highlights their role in bacterial adaptation. Cell Host Microbe. 2020;27:140–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Scherlach K, Hertweck C. Mining and unearthing hidden biosynthetic potential. Nat Commun. 2021;12:3864. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Gavriilidou A, et al. Compendium of specialized metabolite biosynthetic diversity encoded in bacterial genomes. Nat Microbiol. 2022;7:726–35. [DOI] [PubMed] [Google Scholar]
- 21.Rebuffat S. Ribosomally synthesized peptides, foreground players in microbial interactions: recent developments and unanswered questions. Nat Prod Rep. 2022;39:273–310. [DOI] [PubMed] [Google Scholar]
- 22.Chevrette MG, et al. Evolutionary dynamics of natural product biosynthesis in bacteria. Nat Prod Rep. 2020;37:566–99. [DOI] [PubMed] [Google Scholar]
- 23.Ziemert N, et al. Diversity and evolution of secondary metabolism in the marine actinomycete genus Salinispora. Proc Natl Acad Sci U S A. 2014;111:E1130-1139. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Baunach M, Chowdhury S, Stallforth P, Dittmann E. The landscape of recombination events that create nonribosomal peptide diversity. Mol Biol Evol. 2021;38:2116–30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Du R, Xiong W, Xu L, Xu Y, Wu Q. Metagenomics reveals the habitat specificity of biosynthetic potential of secondary metabolites in global food fermentations. Microbiome. 2023;11:115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017;27:824–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Li D, Liu CM, Luo R, Sadakane K, Lam TW. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31:1674–6. [DOI] [PubMed] [Google Scholar]
- 28.Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Danecek P, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10:giab008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Alneberg J, et al. Binning metagenomic contigs by coverage and composition. Nat Methods. 2014;11:1144–6. [DOI] [PubMed] [Google Scholar]
- 31.Kang DD, et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ. 2019;7:e7359. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Wu YW, Simmons BA, Singer SW. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics. 2016;32:605–7. [DOI] [PubMed] [Google Scholar]
- 33.Sieber CMK, et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat Microbiol. 2018;3:836–43. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25:1043–55. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Chklovski A, Parks DH, Woodcroft BJ, Tyson GW. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat Methods. 2023;20:1203–12. [DOI] [PubMed] [Google Scholar]
- 36.Bowers RM, et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat Biotechnol. 2017;35:725–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Olm MR, Brown CT, Brooks B, Banfield JF. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 2017;11:2864–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Parks DH, et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol. 2018;36:996–1004. [DOI] [PubMed] [Google Scholar]
- 39.Criscuolo A, Gribaldo S. BMGE (block mapping and gathering with entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol. 2010;10:210. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;26:1641–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49:W293–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Blin K, et al. AntiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res. 2021;49:W29–35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Navarro-Munoz JC, et al. A computational framework to explore large-scale biosynthetic diversity. Nat Chem Biol. 2020;16:60–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Jia B, et al. CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res. 2017;45:D566–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Walker AS, Clardy J. A machine learning bioinformatics method to predict biological activity from biosynthetic gene clusters. J Chem Inf Model. 2021;61:2560–71 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Hyatt D, et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinform. 2010;11:119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Buchfink B, Reuter K, Drost HG. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 2021;18:366–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Anonymous. mobileOG-db: a manually curated database of protein families mediating the life cycle of bacterial mobile genetic elements. Appl Environ Microbiol. 2023;88:e0099122. [DOI] [PMC free article] [PubMed]
- 50.Brown CL, et al. Selection and horizontal gene transfer underlie microdiversity-level heterogeneity in resistance gene fate during wastewater treatment. Nat Commun. 2024;15:5412. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Chen B, et al. Antimicrobial peptides in the global microbiome: biosynthetic genes and resistance determinants. Environ Sci Technol. 2023;57:7698–708. [DOI] [PubMed] [Google Scholar]
- 52.Ellabaan MMH, Munck C, Porse A, Imamovic L, Sommer MOA. Forecasting the dissemination of antibiotic resistance genes across bacterial genomes. Nat Commun. 2021;12:2435. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Che Y, et al. Reply to Partridge et al.: Complementary bioinformatics and experimental approaches to investigate the transfer of AMR genes. Proc Natl Acad Sci U S A. 2021;118:e2108995118. [DOI] [PMC free article] [PubMed]
- 54.Yang Y, et al. Pet cats may shape the antibiotic resistome of their owner’s gut and living environment. Microbiome. 2023;11:235. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Gao Y, Zhong Z, Zhang D, Zhang J, Li YX. Exploring the roles of ribosomal peptides in prokaryote-phage interactions through deep learning-enabled metagenome mining. Microbiome. 2024;12:94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Douarre PE, Mallet L, Radomski N, Felten A, Mistou MY. Analysis of COMPASS, a new comprehensive plasmid database revealed prevalence of multireplicon and extensive diversity of IncF plasmids. Front Microbiol. 2020;11:483. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.O’Leary NA, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016;44:D733-745. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Antipov D, Raiko M, Lapidus A, Pevzner PA. Metaviral SPAdes: assembly of viruses from metagenomic data. Bioinformatics. 2020;36:4126–9. [DOI] [PubMed] [Google Scholar]
- 59.Kopylova E, Noe L, Touzet H. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. 2012;28:3211–7. [DOI] [PubMed] [Google Scholar]
- 60.Zhang J-W, et al. Novel gene clusters for natural product synthesis are abundant in the mangrove swamp microbiome. Appl Environ Microbiol. 2023;89:e0010223. [DOI] [PMC free article] [PubMed]
- 61.Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14:417–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Du R, Wang S, Wu Q, Xu Y. LSQP-DB: a species-specific quantitative PCR primer database for 307 Lactobacillaceae species. Syst Microbiol Biomanuf. 2022;3:593–601. [Google Scholar]
- 63.Zhang G, Mills DA, Block DE. Development of chemically defined media supporting high-cell-density growth of lactococci, enterococci, and streptococci. Appl Environ Microbiol. 2009;75:1080–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Koren S, et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27:722–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Hunt M, et al. Circlator: automated circularization of genome assemblies using long sequencing reads. Genome Biol. 2015;16:294. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Jain C, Rodriguez RL, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9:5114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Putri GH, Anders S, Pyl PT, Pimanda JE, Zanini F. Analysing high-throughput sequencing data in Python with HTSeq 2.0. Bioinformatics. 2022;38:2943–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Abu-Ali GS, et al. Metatranscriptome of human faecal microbial communities in a cohort of adult men. Nat Microbiol. 2018;3:356–66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Hibbing ME, Fuqua C, Parsek MR, Peterson SB. Bacterial competition: surviving and thriving in the microbial jungle. Nat Rev Microbiol. 2010;8:15–25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Yi Y, et al. Current status and potentiality of class II bacteriocins from lactic acid bacteria: structure, mode of action and applications in the food industry. Trends Food Sci Technol. 2022;120:387–401. [Google Scholar]
- 73.Galdino ACM, et al. Siderophores promote cooperative interspecies and intraspecies cross-protection against antibiotics in vitro. Nat Microbiol. 2024;9:631–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Chevrette MG, et al. Microbiome composition modulates secondary metabolism in a multispecies bacterial community. Proc Natl Acad Sci U S A. 2022;119:e2212930119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Meng F, et al. Acetate activates Lactobacillus bacteriocin synthesis by controlling quorum sensing. Appl Environ Microbiol. 2021;87:e0072021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Baltrus DA. Exploring the costs of horizontal gene transfer. Trends Ecol Evol. 2013;28:489–95. [DOI] [PubMed] [Google Scholar]
- 77.Park C, Zhang J. High expression hampers horizontal gene transfer. Genome Biol Evol. 2012;4:523–32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Fornelos N, Browning DF, Butala M. The use and abuse of LexA by mobile genetic elements. Trends Microbiol. 2016;24:391–401. [DOI] [PubMed] [Google Scholar]
- 79.O’Reilly FJ, et al. In-cell architecture of an actively transcribing-translating expressome. Science. 2020;369:554–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Tran NT, Le TBK. Control of a gene transfer agent cluster in Caulobacter crescentus by transcriptional activation and anti-termination. Nat Commun. 2024;15:4749. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Werren JH. Selfish genetic elements, genetic conflict, and evolutionary innovation. Proc Natl Acad Sci U S A. 2011;108(Suppl 2):10863–70. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Brockhurst MA, et al. The ecology and evolution of pangenomes. Curr Biol. 2019;29:R1094–103. [DOI] [PubMed] [Google Scholar]
- 83.Soucy SM, Huang J, Gogarten JP. Horizontal gene transfer: building the web of life. Nat Rev Genet. 2015;16:472–82. [DOI] [PubMed] [Google Scholar]
- 84.Zhang XB, Oualline G, Shaw J, Yu YW. skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements. Bioinformatics. 2024;40:ii155–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Supplementary Material 1. Figure S1. The similarity between the potential plasmid-derived MGEs and the MGEs in plasmid databases. Figure S2. Functional classification of mobile genetic elements. Figure S3. Functional classification of mobile genetic element (MGE)-related genes. Figure S4. Overview of biological function of biosynthetic gene clusters (BGCs). Figure S5. Classification of biosynthetic gene clusters (BGCs) in 216 gene cluster families (GCFs). Figure S6. The occurrence of different MGEs in different BGC types. Figure S7. The co-occurrence of two MGEs in different BGC types. Figure S8. Relative abundance of MAGs containing BGCs with lexA and nusA, ihfB and rarA, and rarA and NA1. Figure S9. The co-occurrence of different MGEs in different phyla. Figure S10. The co-occurrence of three MGEs. Figure S11. The relative abundance of MAGs containing BGCs with lexA and nusA, and lexA, nusA and polC. Figure S12. Mobile genetic elements (MGEs) synergistically increased the transcriptional activity of biosynthetic gene clusters (BGCs). Figure S13. Interspecies and intraspecies comparisons of MGE diversity. Figure S14. Correlation between variance of MGE diversity and variance of relative abundance of MAGs at the strain level. Figure S15. Microbial co-occurrence network. Figure S16. Interactions between strains within the same species and between species within the same genera. Figure S17. The number of MAGs in different species. Figure S18. Correlation between abundance of MGE-related BGCs and relative abundance of MAGs in Lactococcus lactis (a), Lactococcus cremoris (b) and Levilactobacillus brevis (c). Figure S19. Correlation between average number of BGC-related MGEs and relative abundance of MAGs in Lactiplantibacillus plantarum (a), Streptococcus thermophilus (b) and Lactococcus lactis (c). Figure S20. Correlation between average number of BGC-related MGEs and transcriptional activity of BGCs in Lactiplantibacillus plantarum (a) and Streptococcus thermophilus (b). Figure S21. Growth curves of 18 colonies of Lactiplantibacillus plantarum. Figure S22. Correlation between number of BGC-related MGEs and competitive index (Ci). Figure S23. The presence of various MGEs within BGCs in 18 Lactiplantibacillus plantarum strains. Figure S24. The profile of strains having BGCs with and without lexA and nusA.
Supplementary Material 2. Table S1. The information of 590 food fermentation metagenomes and 42 corresponding metatranscriptomes.
Supplementary Material 3. Table S2. The information of 1133 metagenome-assembled genomes.
Supplementary Material 4. Table S3. Mobile genetic elements identified in 1133 metagenome-assembled genomes.
Supplementary Material 5. Table S4. The information of 18 colonies of Lactiplantibacillus plantarum.
Data Availability Statement
The metagenomic data used in this study were downloaded from NCBI, and a summary of their accessions is provided in Supplementary Data 1. Six hundred and fifty-three MAGs from food fermentations were downloaded from the GitHub repository (https://github.com/durubing-jn/food-fermentation-mategenome). The genomes and transcriptomes of the 18 L. plantarum strains were submitted to NCBI under accession number PRJNA1187877; detailed information on these strains and their genomes is available at the GitHub repository (https://github.com/SSxlei/The-role-of-mobile-genetic-element). The data produced in this study, including the 1133 MAGs, 4059 BGCs, and the MGE annotation file, as well as the Python and R scripts used for data analysis, have been deposited and are available at the same GitHub repository (https://github.com/SSxlei/The-role-of-mobile-genetic-element).




