mBio. 2023 Dec 20;15(1):e01911-23. doi: 10.1128/mbio.01911-23

Toxinome—the bacterial protein toxin database

Aleks Danov 1,#, Ofir Segev 1,#, Avi Bograd 1,#, Yedidya Ben Eliyahu 2, Noam Dotan 3, Tommy Kaplan 4,5, Asaf Levy 1
Editor: Victor J Torres 6
PMCID: PMC10790787  PMID: 38117054

ABSTRACT

Protein toxins are key molecular weapons in biology that are used to attack neighboring cells. Bacteria use protein toxins to kill or inhibit the growth of prokaryotic and eukaryotic cells using various modes of action that target essential cellular components. The toxins are responsible for shaping microbiomes in different habitats, for abortive phage infection, and for severe infectious diseases in animals and plants. Although several toxin databases have been developed, each one is devoted to a specific toxin family, and they encompass a relatively small number of toxins. Antimicrobial toxins are often accompanied by antitoxins (or immunity proteins) that neutralize the cognate toxins. Here, we combined toxins and antitoxins from many resources and created Toxinome, a comprehensive and updated bacterial protein toxin database. The Toxinome includes a total of 1,483,028 toxins and 491,345 antitoxins encoded in 59,475 bacterial genomes across the tree of life. We identified a depletion of toxin and antitoxin genes in bacteria that dwell in extreme temperatures. We defined 5,161 unique Toxin Islands within phylogenetically diverse bacterial genomes, which are loci dense in toxin and antitoxin genes. By focusing on the unannotated genes within these islands, we characterized a number of these genes as toxins or antitoxins. Finally, we developed an interactive Toxinome website (http://toxinome.pythonanywhere.com) that allows searching and downloading of our database. The Toxinome resource will be useful to the large research community interested in bacterial toxins and can guide toxin discovery and function elucidation, and infectious disease diagnosis and treatment.

IMPORTANCE

Microbes use protein toxins as important tools to attack neighboring cells, microbial or eukaryotic, and for self-killing when attacked by viruses. These toxins work through different mechanisms to inhibit cell growth or kill cells. Microbes also use antitoxin proteins to neutralize the toxin activities. Here, we developed a comprehensive database called Toxinome of nearly two million toxins and antitoxins that are encoded in 59,475 bacterial genomes. We described the distribution of bacterial toxins and found that they are depleted in bacteria that live in hot and cold temperatures. We found 5,161 cases in which toxins and antitoxins are densely clustered in bacterial genomes and termed these areas “Toxin Islands.” The Toxinome database is a useful resource for anyone interested in toxin biology and evolution, and it can guide the discovery of new toxins.

KEYWORDS: bacterial toxins, microbial toxins, toxins, effectors, database, protein toxins, toxin-antitoxin

INTRODUCTION

Bacteria employ protein toxins to harm surrounding eukaryotic or prokaryotic cells. There is a large variety of bacterial protein toxins. These can be coarsely divided into several large classes: (i) effector proteins that are translocated into target cells via different membrane-bound or extracellular bacterial secretion systems, such as the Type VI secretion system, contact-dependent inhibition systems, Tc toxins, or the extracellular contractile injection system (1–7), (ii) toxins that are released from the attacking microbe and enter the target cell through specific receptors or through insertion into target cell membranes, e.g., bacteriocins, AB toxins, or MARTX toxins (8–13), and (iii) toxins that inhibit the growth of the producing cell itself in response to phage infection or antibiotic persistence, such as toxin-antitoxin systems (14–16). Toxin-antitoxin systems are currently divided into eight types and are implicated in the impairment of DNA replication, translation, cell envelope, and cytoskeleton integrity, and they can induce metabolic stress (14, 17). Most of these toxins are studied in model bacterial genomes that represent only a minor portion of the over 600,000 bacterial genomes that have been sequenced in recent years (18). A more comprehensive view of all bacterial protein toxins will facilitate a general understanding of toxin evolution and inference of correlations between different toxin families and the genome organization of toxins. In addition, since in bacteria gene proximity often implies a functional relationship between neighboring genes (e.g., genes that are co-transcribed as part of an operon), an extensive toxin database can enable the discovery of new toxins and toxin-associated genes based on their genomic neighborhood. Toxin-associated genes located next to toxins can be involved in toxin production, maturation, or secretion, in immunity against the toxin, or in lateral transfer of the toxin gene between bacteria.

There have been previous efforts to construct protein toxin databases. However, these databases often focus on one toxin class or do not cover a large collection of bacterial protein toxins. The Toxin Exposome Database (T3DB) focuses mostly on drugs, industrial toxins, and pollutants and contains only a few bacterial protein toxins (19). The SecRet6 database is a comprehensive database of microbial T6SS effector proteins only, which mostly serve as antimicrobial toxins (20). Similarly, TADB is an excellent resource for type II toxin-antitoxin systems, mostly used for self-growth inhibition of bacteria to achieve plasmid addiction or dormancy (21). BACTIBASE and BAGEL4 focus on bacterial bacteriocins (22–24). DBETH includes exotoxins from human pathogens (22) and DBAASP is an extensive updated database of mostly synthetic antimicrobial peptides (25). A recent database, the prokaryotic antimicrobial toxin database (PAT), is more inclusive than the aforementioned databases and contains 441 bacterial and archaeal toxins from seven classes, 64% of which are bacteriocins and T6SS effectors of proteobacteria (26). PAT also contains 6,064 predicted antimicrobial toxins in prokaryotic reference genomes. Nevertheless, the vast majority of microbial protein toxins are currently not present in any toxin database and appear only in large general protein databases, such as UniProt. The microbiology research community therefore lacks an integrated updated database that focuses on microbial toxins from various classes and their mapping to tens of thousands of microbial genomes. There are also recent reports showing that the same bacterial toxins can serve in one organism as part of self-inhibiting toxin-antitoxin modules, whereas in another organism they have evolved into toxin effectors that are injected into target cells (5, 27, 28); hence, there is little benefit in separating toxins into different databases based on their specific biological function.
Moreover, numerous toxin proteins are likely encoded in the tens of thousands of bacterial genomes, many of which are poorly studied non-model or non-pathogenic bacteria. These toxin genes are currently not reported anywhere and they can aid in studying toxin function and evolution.

We developed “Toxinome”: The Bacterial Protein Toxin Database, which includes 1,483,028 toxins, and 491,345 immunity genes, from many toxin families that are encoded in 59,475 bacterial genomes. We developed an interactive interface that allows online querying and full downloading of the Toxinome database. Our analysis indicated that toxins are depleted from bacteria that dwell in extreme temperatures. Finally, using the Toxinome database we defined a large collection of 5,161 genomic regions that are highly rich in toxin and antitoxin genes which we define as genomic Toxin Islands. The Toxin Islands can be used to identify novel toxin and toxin-associated genes that are currently functionally unannotated. The resource we developed will be valuable to the large research community that studies different classes of bacterial toxins. It can be used for toxin and antitoxin annotations of genomes of interest, and it can be mined in an effort to discover new antimicrobial proteins, anticancer proteins, or biopesticides of clinical, biotechnological, and environmental importance.

MATERIALS AND METHODS

Assembly of a toxin protein database

We initiated Toxinome by combining toxin proteins gathered from four databases: SecRet6 (20), BACTIBASE (20, 23), TADB (21), and BAGEL4 (24), which were downloaded in October 2020. In addition, we collected a large number of toxin proteins from the UniProt database (29) using keyword searches (e.g., “toxic” and “toxin”). At this point, we removed proteins with keywords such as “anti-toxin” and “immunity.” These antitoxins were later added based on the Pfam database (see below). In total, 175,573 toxin protein sequences were included in the combined database. We clustered the 175,573 collected toxins using the CD-HIT 4.8.1 software (threshold = 0.7, alignment coverage = 0.92) (30). The proteins were clustered into 70,867 groups. We then mapped the toxin genes to microbial genomes downloaded from the Integrated Microbial Genomes (IMG) database (31); the representative genes of all clusters were mapped to 1,078,446 genes from 59,475 genomes (Table S1). The mapping was performed using DIAMOND 2.0.4.142 software (alignment coverage = 0.95, e-value = 0.05, threshold identity >30%) (Fig. 1) (32). To increase the number of toxins and immunity proteins in our database, we used protein domain information. We downloaded 219 toxin and 94 antitoxin domains from the Pfam database (33) in June 2021, manually curated them based on their functional description, and added them to the resulting toxin gene set. We mapped these domains to genes from the IMG genome database using HMMER3.0 as described before (34). Altogether, we found an additional 895,927 protein-coding genes that include the toxin or the immunity protein domains. The resulting data set was then manually curated for quality assurance, and toxin or antitoxin genes erroneously included were removed.
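The mapping cutoffs quoted above (identity >30%, alignment coverage 0.95, e-value 0.05) can be illustrated with a small post-filter over alignment hits. This is a minimal sketch, not the authors' pipeline: the `Hit` record, its field names, and the choice to measure coverage on the query are all assumptions made for illustration, and the example hits are invented.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str       # cluster representative (hypothetical field names)
    target: str      # gene identifier in the genome database
    pident: float    # percent identity of the alignment
    aln_len: int     # aligned length on the query, in residues
    query_len: int   # full query length, in residues
    evalue: float


def passes_mapping_filter(hit, min_identity=30.0, min_coverage=0.95, max_evalue=0.05):
    """Return True if a hit meets the identity, coverage, and e-value cutoffs."""
    # Assumption: coverage is measured as aligned length over query length.
    coverage = hit.aln_len / hit.query_len
    return (hit.pident > min_identity
            and coverage >= min_coverage
            and hit.evalue <= max_evalue)


# Invented example hits, for illustration only.
hits = [
    Hit("rep1", "geneA", 45.0, 290, 300, 1e-30),  # passes all three cutoffs
    Hit("rep1", "geneB", 28.0, 295, 300, 1e-5),   # identity too low
    Hit("rep2", "geneC", 60.0, 150, 300, 1e-20),  # coverage too low
]
kept = [h.target for h in hits if passes_mapping_filter(h)]
```

In practice, an aligner such as DIAMOND can apply comparable cutoffs itself at search time; the filter is written out here only to make the three criteria explicit.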

FIG 1.

Construction of the Toxinome database. The toxin proteins from the SecRet6, BACTIBASE, TADB, BAGEL4, and UniProt databases were joined, clustered to remove redundancy, and mapped to microbial genomes from the Integrated Microbial Genomes (IMG) database. In addition, we manually curated and retrieved toxin and immunity domains from Pfam and mapped them to proteins and genes from IMG. The union of these sets resulted in Toxinome.

Toxin island prediction

We systematically searched DNA scaffolds from bacterial genomes in the IMG database for segments where toxins and antitoxins from the Toxinome were located close to each other. We tested different values, ranging from 10 kb to 60 kb, for the parameter that sets the maximum distance between two adjacent toxins or antitoxins. This analysis generated multiple DNA segments that are rich in toxins and antitoxins.
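The segment search described above can be sketched as a single pass over the toxin and antitoxin genes of one scaffold, starting a new segment whenever the gap to the previous gene exceeds the maximum distance. This is an illustrative sketch only: the function name is hypothetical, and measuring the gap from the end of one gene to the start of the next is an assumption, since the text does not say how distance was measured.

```python
def find_segments(gene_positions, max_gap_bp):
    """Group toxin/antitoxin genes on one scaffold into candidate segments.

    gene_positions: list of (start, end) coordinates in bp.
    max_gap_bp: maximum allowed distance between adjacent genes in a segment.
    Returns a list of segments, each a list of (start, end) tuples.
    """
    segments = []
    current = []
    for start, end in sorted(gene_positions):
        # Assumption: distance = start of this gene minus end of the previous one.
        if current and start - current[-1][1] > max_gap_bp:
            segments.append(current)
            current = []
        current.append((start, end))
    if current:
        segments.append(current)
    return segments


# Invented coordinates: two nearby genes and one distant gene.
example = [(100_000, 101_000), (0, 900), (5_000, 6_000)]
segments = find_segments(example, max_gap_bp=10_000)
```

Running the scan once per tested distance (10 kb to 60 kb) would yield the family of candidate segments that the scoring step then evaluates.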

We employed two scoring methods to assess the segments, one based on the Poisson distribution and another using a custom score. To evaluate the background density of Toxinome genes within genome segments, we assumed a Poisson distribution: $P(X = k) = \frac{e^{-\lambda}\lambda^{k}}{k!}$. The overall number of toxins and antitoxins per unit length in the genome was multiplied by the segment’s length to obtain the expected value, $\lambda$; $k$ is the observed number of toxins and antitoxins within the segment. This analysis generated a P-value, which quantifies the cumulative probability of observing a grouping of $k$ or more toxins and antitoxins in the genome. In addition, we assigned a score to each hit, considering the number of toxins and antitoxins ($T$), the total gene count ($G$), and the length of the segment ($L$). Several parameter combinations were tested to establish an accurate measure of a Toxin Island, resulting in the formula $\frac{T^{4}}{G \cdot L} \cdot 1000$. A high score represents a DNA segment enriched with toxins and antitoxins.
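The two measures described above can be written out in a few lines. The sketch below is hedged on two points: the score is read as T to the fourth power divided by the product of G and L, times 1000 (the formula is damaged in the extracted text, so this reading is an assumption), and L is assumed to be in base pairs. Function names are hypothetical.

```python
import math


def expected_count(genome_hits, genome_len_bp, segment_len_bp):
    """Expected number of toxins/antitoxins in a segment (lambda)."""
    return genome_hits / genome_len_bp * segment_len_bp


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    term = math.exp(-lam)      # P(X = 0)
    below = 0.0
    for i in range(k):
        below += term          # accumulate P(X = i)
        term *= lam / (i + 1)  # recurrence: P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - below)


def island_score(t, g, length_bp):
    """Assumed reading of the score: T^4 / (G * L) * 1000, with L in bp."""
    return t ** 4 / (g * length_bp) * 1000


# Invented numbers: 200 toxinome genes in a 4 Mb genome, 20 kb segment.
lam = expected_count(200, 4_000_000, 20_000)
p_value = poisson_tail(6, lam)
score = island_score(t=6, g=15, length_bp=20_000)
```

The term-by-term recurrence avoids computing large factorials directly, which keeps the tail probability stable for segments with many genes.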

After scoring all the segments, we ensured the reliability of our findings by establishing thresholds for characterizing Toxin Islands: a score of >4, a Poisson P-value ≤ 0.05, and a minimum of four toxins within a segment. The threshold of a score >4 was chosen as it showed the best improvement in the average Poisson P-value of our hits (Fig. S1). This process resulted in 23,025 segments defined as “Toxin Islands.” Subsequently, we eliminated redundant segments using MMseqs2 (35) (min-seq-id = 0.99, seed-sub-mat = 3, cluster-mode = 2) and chose the best-scoring representative of each cluster, to obtain a final set of 5,161 unique “Toxin Islands” derived from 4,240 distinct bacterial genomes.
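The filtering and de-duplication steps above reduce to a threshold test followed by picking the top-scoring member of each redundancy cluster. The sketch below assumes the clustering has already been done by an external tool and is supplied as a mapping from cluster ID to member segments; the data layout and function names are hypothetical.

```python
def is_toxin_island(score, p_value, n_toxins):
    """Apply the three thresholds stated in the text."""
    return score > 4 and p_value <= 0.05 and n_toxins >= 4


def pick_representatives(clusters):
    """Keep the best-scoring segment of each redundancy cluster.

    clusters: dict mapping cluster ID -> list of dicts with 'id' and 'score'.
    Returns a dict mapping cluster ID -> ID of the representative segment.
    """
    return {
        cluster_id: max(members, key=lambda seg: seg["score"])["id"]
        for cluster_id, members in clusters.items()
    }


# Invented example: two clusters of near-identical segments.
clusters = {
    "c1": [{"id": "segA", "score": 5.2}, {"id": "segB", "score": 9.1}],
    "c2": [{"id": "segC", "score": 4.5}],
}
representatives = pick_representatives(clusters)
```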

Annotation of hypothetical proteins as toxins and antitoxins

We employed a two-step approach to predict the structures and functionally characterize unannotated proteins within the Toxin Islands. First, we used ColabFold (36), a deep learning-based method, to predict the structures of these proteins. ColabFold outputs ranked models as Protein Data Bank (PDB) format files, and we selected the highest-ranked PDB file from the output.

Subsequently, for the functional characterization of these proteins, we employed Foldseek (37) as our tool of choice. Foldseek enables structural alignment and identification of similar proteins, providing valuable insights into their structural characteristics and potential functional roles. To achieve this, we conducted a search of the proteins against the “AlphaFold/UniProt50 v4” and “PDB100 2201222” databases, utilizing Foldseek’s built-in search functionality. To ensure the selection of highly relevant and reliable matches, we implemented specific thresholds for filtering the results obtained from Foldseek. The applied thresholds included an e-value lower than 0.003, a TM-score higher than 0.6 (TM-score is a metric for assessing the topological similarity of protein structures), and a minimum coverage of at least 90% of the query protein.
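For readers unfamiliar with the TM-score threshold used above, the standard definition from the structural alignment literature is reproduced below. It is given as background and is not taken from this study; the symbols $L_{\mathrm{ref}}$ (length of the reference structure), $L_{\mathrm{ali}}$ (number of aligned residue pairs), and $d_i$ (distance between the $i$-th aligned pair after superposition) are notation introduced here.

```latex
\mathrm{TM\text{-}score} = \max\left[\frac{1}{L_{\mathrm{ref}}}
  \sum_{i=1}^{L_{\mathrm{ali}}} \frac{1}{1 + \left(d_i / d_0\right)^{2}}\right],
\qquad
d_0 = 1.24\,\sqrt[3]{L_{\mathrm{ref}} - 15} - 1.8
```

The maximum is taken over superpositions of the two structures. Scores lie between 0 and 1, and values above roughly 0.5 are generally taken to indicate the same overall fold, which is why a cutoff of 0.6 is a fairly conservative choice.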

Design of the Toxinome website

The Toxinome website was constructed using the Django framework (https://www.djangoproject.com) for full-stack web development in Python. The SQLite database engine was used for storing the collected data, and DIAMOND software (38) was used for online homology-based search, as part of the backend of the Toxinome website. The data are organized in five SQL tables: a genome table, a gene table, a cluster table, a Pfam domain table, and a Pfams-in-genes table. Each table stores the relevant information about the collected toxins and antitoxins. The tables are linked through common fields, which enables retrieval of different intersections of the data and their presentation with simple frontend code. HTML, CSS, JavaScript, and the Bootstrap toolkit (https://getbootstrap.com/) were used for frontend development. Finally, the website was deployed on the PythonAnywhere server and can be accessed at http://toxinome.pythonanywhere.com. The website was tested and found to work in the Chrome, Internet Explorer, Firefox, and Safari browsers.
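To make the five-table layout concrete, the sketch below builds a toy version in an in-memory SQLite database and runs one join across the linked tables. Only the five table roles come from the text; every table name, column name, and inserted value (including the placeholder accession `PF00000`) is hypothetical and does not reflect the actual Toxinome schema.

```python
import sqlite3

# Hypothetical schema: one table per role named in the text.
SCHEMA = """
CREATE TABLE genome (genome_id INTEGER PRIMARY KEY, organism TEXT);
CREATE TABLE cluster (cluster_id INTEGER PRIMARY KEY);
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    genome_id INTEGER REFERENCES genome(genome_id),
    cluster_id INTEGER REFERENCES cluster(cluster_id),
    product TEXT,
    role TEXT CHECK (role IN ('toxin', 'antitoxin'))
);
CREATE TABLE pfam_domain (pfam_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE pfam_in_gene (
    gene_id INTEGER REFERENCES gene(gene_id),
    pfam_id TEXT REFERENCES pfam_domain(pfam_id),
    dom_start INTEGER,
    dom_end INTEGER
);
"""


def build_demo():
    """Create an in-memory database holding one placeholder record per table."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.execute("INSERT INTO genome VALUES (1, 'Example organism')")
    con.execute("INSERT INTO cluster VALUES (10)")
    con.execute("INSERT INTO gene VALUES (100, 1, 10, 'example toxin', 'toxin')")
    con.execute("INSERT INTO pfam_domain VALUES ('PF00000', 'Example_domain')")
    con.execute("INSERT INTO pfam_in_gene VALUES (100, 'PF00000', 5, 90)")
    return con


def genes_with_pfam(con, pfam_name):
    """Return (gene_id, organism) for all genes carrying the named Pfam domain."""
    query = """
        SELECT g.gene_id, gm.organism
        FROM gene AS g
        JOIN genome AS gm ON gm.genome_id = g.genome_id
        JOIN pfam_in_gene AS pg ON pg.gene_id = g.gene_id
        JOIN pfam_domain AS p ON p.pfam_id = pg.pfam_id
        WHERE p.name = ?
    """
    return con.execute(query, (pfam_name,)).fetchall()


rows = genes_with_pfam(build_demo(), "Example_domain")
```

A junction table such as `pfam_in_gene` is the usual way to model a many-to-many relation, here between genes and the domains they contain, and it is what makes a "browse by Pfam" style query a plain join.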

Phylogenetic analysis

To represent the prevalence of toxin and antitoxin genes across all bacterial classes, we built a phylogenetic tree using a random representative genome from each taxonomic class (each leaf represents a taxonomic class). As a basis for comparison, we used universal marker genes. Specifically, we used 29 of the 102 COGs that correspond to ribosomal proteins (39). The COGs used are COG0048, COG0049, COG0051, COG0052, COG0080, COG0081, COG0087, COG0088, COG0089, COG0090, COG0091, COG0092, COG0093, COG0094, COG0096, COG0097, COG0098, COG0099, COG0100, COG0102, COG0103, COG0185, COG0186, COG0197, COG0198, COG0200, COG0244, COG0256, and COG0522. We aligned each COG protein from each representative genome using Clustal Omega version 1.2.4 (40). We then concatenated all the COG protein alignments for each genome, filling missing COGs with gaps. We used this concatenated alignment of 29 COG proteins (single-copy marker proteins) as input to FastTree2 with default parameters (41) to create the phylogenetic tree. The R packages ggtree v2.4.2 (42) and ggtreeExtra v1.0.4 (43) were used to plot the tree.
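The concatenation step, including gap filling for genomes that lack a given COG, can be sketched as follows. The input layout (a dictionary of per-COG alignments keyed by genome name) and the function name are assumptions made for illustration; each per-COG alignment is assumed to be non-empty and to have rows of equal length, as an aligner's output would.

```python
def concatenate_alignments(alignments, genomes):
    """Concatenate per-COG alignments into one row per genome.

    alignments: dict mapping COG ID -> dict mapping genome -> aligned sequence.
    genomes: list of genome names to include in the output.
    A genome missing from a COG alignment gets a run of '-' of that
    alignment's width, so every output row has the same total length.
    """
    rows = {genome: [] for genome in genomes}
    for cog_id in sorted(alignments):  # fixed COG order for every genome
        aligned = alignments[cog_id]
        width = len(next(iter(aligned.values())))
        for genome in genomes:
            rows[genome].append(aligned.get(genome, "-" * width))
    return {genome: "".join(parts) for genome, parts in rows.items()}


# Invented toy alignments: genome B lacks COG0049.
toy = {
    "COG0048": {"A": "MK-", "B": "MKL"},
    "COG0049": {"A": "TT"},
}
supermatrix = concatenate_alignments(toy, ["A", "B"])
```

Iterating over the COGs in a fixed order matters: every genome must contribute its columns in the same sequence for the concatenated rows to remain a valid alignment.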

Toxin and antitoxin correlation with microbial environment temperature

Metadata of temperature range (mesophile, thermophile, hyperthermophile, and psychrophile) of different microbes that are included in Toxinome were downloaded from the IMG database. For each genome, we also calculated the proportion of toxin and antitoxin genes as part of the total gene number.

RESULTS

Toxinome content and website usage

We downloaded 175,573 toxin sequences of various types from five databases, with UniProt being the main one, and mapped them to 59,475 bacterial genomes to find all their orthologous and paralogous proteins (Materials and Methods). In addition, we manually curated toxin and immunity (antitoxin) protein domains and mapped them to genes in our genome data set (Materials and Methods). This process resulted in the Toxinome database. The proteins were further clustered into 70,867 clusters (threshold = 0.7, alignment coverage = 0.92) using CD-HIT software. Next, we developed a user-friendly website, http://toxinome.pythonanywhere.com, to store the toxin information online and facilitate querying our toxin/antitoxin database (Fig. 2). When browsing by organism, the user can find an alphabetically ordered list of all toxins and antitoxins from a selected organism. The information is displayed in a table that includes the IMG gene ID (44), product (protein) name, DNA scaffold, gene positions on the scaffold, protein length in amino acids, Pfam domains within the protein, functionality (toxin/antitoxin), information source, and cluster identifier. Clicking the Pfam link on the organism information page redirects to a page describing the Pfam domains that are encoded within the gene. Each Pfam has its Pfam ID, Pfam name, domain classification (Tox/Anti-Tox), coordinates on the protein, length in amino acids, and graphic representation. Pfam ID is an active link to the record on the UniProt website (45). Clicking the cluster number link on the organism information page redirects to the cluster page, which shows all the genes in the specific cluster with their information. The user can retrieve all the genes that have a specific Pfam name by choosing the Pfam name from the Pfam name list under the “Browse By Pfam” button.
The “Perform Advanced Search” functionality allows searching for records using free-text search in one of the four most commonly used fields: product name, organism name, Pfam name, and Pfam ID. These fields can also be combined to retrieve the records that match all of the searched fields. “Homologous Protein Search” allows a search based on amino acid sequence similarity to a query protein sequence. The amino acid query sequence is aligned to the protein database using DIAMOND (32) with the following parameters: query cover of 90%, reference protein cover of 60%, minimum identity of 40%, and e-value of 0.001. The best 25 hits are presented with all their genetic information. Finally, the user can download the entire database as a comma-separated file with all the database information, a FASTA file with all the protein sequences, and Toxin Island information.

FIG 2.

Schematic representation of the functionalities available on the Toxinome website. The homepage [1] serves as the starting point for accessing the database. The user can browse the database by organism names organized in alphabetical order [2.a]. Selecting an organism name from the list [2.b] provides information on toxins and immunity genes encoded in the genome [2.c]. Each protein is associated with internal Pfam domains [3], and genes are part of protein clusters [4]. The database provides access to the toxin islands of the organism [5]. A more advanced search [6] allows users to search for a protein domain of interest. Advanced queries for toxin islands can be made based on Phylum, Class, Genus, and Integrated Microbial Genomes (IMG) genome ID [7]. The database can be filtered by Pfam name using a list of existing domains [8]. Sequence-based search [9] enables the detection of homologous sequences. Links associated with Pfam ID and gene ID on an organism page redirect to external databases (https://www.ebi.ac.uk/interpro) [10] and (https://img.jgi.doe.gov/) [11]. Clicking on the length link provides access to the protein sequence on the IMG website [12]. The user can download the entire database, the toxin islands table, and the detailed user guide [13].

Distribution of toxin and antitoxin across prokaryotes

We analyzed the toxin and antitoxin content of bacteria and archaea by calculating the average number of toxin genes per genome in each class (number of toxins in class/number of genomes in class). The results are presented on a phylogenetic tree that was constructed based on single-copy marker genes (Materials and Methods). As expected, there is a high correlation between toxin and antitoxin content (R = 0.6581, P-value = 7.59 × 10^-13). The classes that are highly rich in toxins are Actinobacteria (Actinobacteria phylum), Gloeobacteria (Cyanobacteria), Caldilineae (Chloroflexota), Gammaproteobacteria, and Acidithiobacillia (Pseudomonadota, previously Proteobacteria). Compared with bacteria, archaeal classes (“hours” 3–4 in Fig. 3) have a relatively low content of toxins and antitoxins. Interestingly, we noted that the bacterial classes depleted of toxins and antitoxins include many thermophilic and hyperthermophilic classes from eight different bacterial phyla. These classes include Aquificae (Aquificota phylum), Thermotogae (Thermotogota phylum), Dictyoglomia (Dictyoglomota phylum), Caldisericia (Caldisericota phylum), Thermomicrobia (Thermomicrobiota phylum), Calditrichae (Calditrichota phylum), Coprothermobacteria (Coprothermobacterota phylum), and Methylacidiphilae (Verrucomicrobiota phylum). The thermophilic class Caldilineae is an exception, as it is toxin-rich. Other classes that are depleted in toxins and antitoxins are Dehalococcoidia, Endomicrobia, Chthonomonadetes, Chitinivibrionia, Chlamydia, and Deferribacteres.

FIG 3.

Toxin and antitoxin content in bacterial and archaeal genomes. Toxinome genes were counted in each class and were divided by the number of genomes per class.

To further test whether toxin and antitoxin gene frequency is correlated with bacterial adaptation to temperature, we used 11,748 publicly available bacterial and archaeal genomes (44) for which temperature preference metadata are available, namely, whether the microbes are mesophiles, thermophiles, or psychrophiles. We found that mesophiles have a significantly higher content of toxins and antitoxins than thermophilic and psychrophilic bacteria. Statistical significance was calculated by an analysis of variance test followed by pairwise comparisons with Tukey adjustment. The results are summarized in Tables 1 to 4.

TABLE 1.

Distribution of bacterial antitoxins

Group 1 Group 2 P-value
Mesophiles Psychrophiles 2.8e-8
Mesophiles Thermophiles 8.7e-11
Thermophiles Psychrophiles 0.09

TABLE 2.

Distribution of bacterial toxins

Group 1 Group 2 P-value
Mesophiles Psychrophiles 9.6e-14
Mesophiles Thermophiles 8.9e-14
Thermophiles Psychrophiles 0.76

TABLE 3.

Distribution of archaeal antitoxins

Group 1 Group 2 P-value
Mesophiles Psychrophiles 0.045
Mesophiles Thermophiles 0.45
Thermophiles Psychrophiles 0.15

TABLE 4.

Distribution of archaeal toxins

Group 1 Group 2 P-value
Mesophiles Psychrophiles 0.61
Mesophiles Thermophiles 0.01
Thermophiles Psychrophiles 0.97

As can be seen from the above, bacterial mesophiles have significantly more toxins and antitoxins than bacterial psychrophiles and thermophiles when normalized to genome size (Fig. 4). For archaea, the difference between the mesophile and psychrophile groups has a borderline P-value for both toxin (P-value ≤ 0.0138) and antitoxin (P-value ≤ 0.0455) counts (Fig. 4).

FIG 4.

Toxin and antitoxin genes are enriched in mesophilic bacteria. Toxin and antitoxin counts were normalized by genome size in bacterial and archaeal genomes.

Identification of Toxin Islands

Genomic pathogenicity islands have been described as genomic regions of 10–200 kb that are enriched in virulence genes, such as toxins, adhesins, invasins, iron uptake systems, and protein secretion systems, and that are enriched in pathogens (46–49). These regions often have a GC content different from the rest of the genome, are rich in mobile genetic elements and repetitive sequences, and are frequently horizontally transferred and integrated next to tRNA genes (46, 50). Erwinia amylovora, for instance, has been reported to have a number of pathogenicity islands, mostly those associated with the type II and III secretion systems (T2SS and T3SS) (51). The Salmonella genus contains multiple pathogenicity islands that encode T3SS and its effector genes, as well as genes required for survival inside macrophages (52, 53), and the pathogenicity islands of Pseudomonas aeruginosa are required for virulence in animals and plants (54). A few years ago, Bacteroidales species in the human gut were reported to encode dynamic genomic islands of immunity genes that protect against T6SS antibacterial effectors of co-resident microorganisms (55).

Our observation of toxins and antitoxins frequently grouped together led us to propose the existence of “Toxin Islands” which are genomic islands enriched in toxins and antitoxins within bacterial species. We hypothesized that some pathogenicity islands are enriched in toxin genes and therefore would be more accurately functionally annotated as “Toxin Islands.” In addition, Toxin Islands can have a non-virulence function when they overlap with antiphage defense islands which are rich with toxin-antitoxin genes (56, 57) or enriched in an antibacterial arsenal used for outcompeting microbes.

To test this hypothesis, we developed a statistical, score-based computational algorithm capable of identifying regions in a genome that exhibit high densities of toxins and antitoxins. The algorithm systematically scans each genome, searching for segments where adjacent toxins or antitoxins are within a specified maximum distance threshold (Materials and Methods). Subsequently, we applied stringent filtering using carefully selected thresholds to delineate the regions that meet the criteria for classification as “Toxin Islands” (Materials and Methods, Fig. S1). Using this method, we identified 23,025 Toxin Islands, of which 5,161 are unique islands originating from 4,240 different genomes (Table S2). The majority of the islands are shorter than 15 kb (Fig. S2), indicating a prevalence of relatively compact Toxin Islands. The phylogenetic distribution of these Toxin Islands is highly diverse and spans 50 different classes. They are most abundant in the classes Gammaproteobacteria, Bacilli, and Actinobacteria (Fig. 5), but when normalizing by the number of genomes analyzed, we find an enrichment in the Delta- and Betaproteobacteria classes (Fig. S3).

FIG 5.

Number of unique toxin islands identified per class. This plot illustrates the distribution of unique toxin islands discovered across different classes. Each slice represents a specific class, and the size of each slice corresponds to the number of distinct Toxin Islands identified within that class. The term “other” refers to classes that either exhibit low toxin island counts or genomes that lack taxonomy annotation.

To characterize the identified Toxin Islands across different classes, we conducted a count of toxins and antitoxins for each of the abundant classes in our analysis (Fig. 6). Interestingly, we observed a consistent trend where the majority of Toxin Islands contained a higher number of toxins compared to antitoxins. This trend was particularly pronounced in the class Bacilli, where over half of the identified islands exclusively contained toxins without any associated antitoxins. Furthermore, we found that Toxin Islands often tend to contain homologous toxins or antitoxins within the same island, suggesting the occurrence of local gene duplication events (Fig. S4).

FIG 6.

Toxin and antitoxin counts within Toxin Islands per class. The plot represents the distribution of toxins and antitoxins within each identified Toxin Island. The number of toxins is depicted by the red box, while the number of antitoxins is represented by the blue box. The Y-axis indicates the count of toxins or antitoxins, while the X-axis corresponds to the classes that are rich in Toxin Islands. Each number on the X-axis corresponds to a specific class as follows: 1 = Actinobacteria, 2 = Alphaproteobacteria, 3 = Bacilli, 4 = Betaproteobacteria, 5 = Clostridia, 6 = Deltaproteobacteria, and 7 = Gammaproteobacteria.

One example of a predicted Toxin Island is found in the denitrifying bacterium Thauera phenylacetica B4P (Fig. 7A) (58). This specific island spans 15,000 bp and contains nine toxins and nine antitoxins. Notably, among other toxins and antitoxins, this Toxin Island contains the toxin YoeB accompanied by the antitoxin YefM (Fig. 7A). In addition to the toxins and antitoxins, other genes present in this island include genes that encode a type IV pilus assembly protein, a transposase, a glycosyltransferase, and hypothetical proteins (Fig. 7A).

FIG 7.

Toxin Islands occasionally encode unidentified toxins and antitoxins. (A) Identified Toxin Island within Thauera phenylacetica B4P. The figure displays a Toxin Island identified within the genome of Thauera phenylacetica B4P, highlighting specific proteins and their respective functions. Different colors are used to distinguish toxins, antitoxins, functionally annotated proteins, and hypothetical proteins. Proteins for which we performed computational functional characterization are indicated by colored frames. The corresponding structures for these characterized proteins are shown in Fig. 7C, labeled as 1 and 2. (B) Distribution of protein annotations in Toxin Islands for non-toxin/non-immunity genes. Each bar represents a frequently observed product annotation associated with genes found in Toxin Islands, excluding toxins and antitoxins. The y-axis represents the counts presented in logarithmic scale. (C) Protein structure alignment of uncharacterized proteins within the toxin island: Uncharacterized proteins are colored in blue, while known proteins from the aligned database are colored in yellow. Protein numbers 1 and 2 displayed similarity to the crystal structure of HEPN and MNT toxin-antitoxin system (PDB Structures ID: 7AE8 and 7BXO), respectively, identified in Shewanella oneidensis MR-1.

We hypothesized that toxin or antitoxin genes can be identified by analyzing proteins with unknown activity that are densely clustered in Toxin Islands. In our analysis of non-toxin and non-antitoxin proteins within these islands, we observed that a significant proportion of protein-coding genes lack any Pfam domain annotation and are annotated as “hypothetical proteins” (Fig. 7B). These genes may encode novel toxins and antitoxins, which cluster genomically with the known toxins and antitoxins. To test this suggestion, we investigated the potential functions of the unannotated proteins within the specific island depicted in Fig. 7A. Using Foldseek (37), we conducted a structural comparison of these proteins against protein structure databases. Remarkably, we found that two hypothetical protein-coding genes, labeled as 1 and 2, displayed a high degree of similarity to known toxin and antitoxin systems (Fig. 7C). Specifically, protein numbers 1 and 2 exhibited similarity to the crystal structure of the HEPN/MNT toxin-antitoxin system found in Shewanella oneidensis MR-1. The MntA antitoxin (MNT-domain protein) acts as an adenylyltransferase and chemically modifies the HepT toxin (HEPN-domain protein) to block its toxicity as an RNase (59). These findings provide compelling evidence supporting the hypothesis that novel toxin and antitoxin genes are present within these islands, and they highlight the genomic clustering phenomenon. Another group of genes that is abundant within Toxin Islands comprises transposases and integrases (Fig. 7B), which indicates that the islands can be transferred horizontally between bacteria.

DISCUSSION

Toxinome is a unique and extensive database of most known toxins and antitoxins in 59,475 bacterial genomes. The toxins include classic exotoxins such as the Anthrax and Pertussis toxins, toxin effectors of different secretion systems, toxin-antitoxin systems, bacteriocins, and others, along with their cognate antitoxins. We expect that the database will serve as a valuable resource for the large research community interested in the biology of bacterial protein toxins that are used for virulence, abortive phage infection, and inter-microbial competition. The Toxinome database is presented through a user-friendly web interface that makes it easy to navigate, query, and download. A wide range of follow-up studies can be conducted using the data collected, from the discovery of new toxins, antitoxins, and toxin delivery systems to the association of observed phenotypes with genes from Toxinome. Microbiologists can retrieve a full profile of all known toxins encoded by the microbe they study.

By analyzing the Toxinome database, we identified prokaryotic classes that are rich and poor in toxins and antitoxins. It would be interesting to determine whether the observed depletion in thermophilic and psychrophilic bacteria is the result of purifying selection against toxin and antitoxin gene presence in these clades. Such selection could result from bacteria that dwell in extreme environments having fewer microbial competitors or fewer hosts, which would make toxins unnecessary, energetically expensive molecular weapons. It was reported that when temperature increases, both microbial species diversity and Pfam diversity steadily decline, and hence this may also affect toxin and antitoxin gene content, particularly those used in inter-microbial competition in polymicrobial communities (60). We speculate that since most animals and plants dwell in mesophilic conditions, there are fewer pathogens equipped with anti-eukaryotic toxins living in extreme temperatures. Alternatively, this finding might be the result of a bias in toxin and antitoxin functional annotation. Namely, most known toxins and antitoxins may have been discovered and studied in the organisms where they are supposedly enriched, such as human pathogenic proteobacteria, whereas biochemically poorly characterized clades may have independent sets of toxins and antitoxins that remain unidentified and could lead to the description of new phenotypes in these microbes. Indeed, classic toxin-antitoxin systems were not studied in many extremophiles until quite recently (61), and novel toxin-antitoxin systems were predicted to be specific to thermophiles (15). It would also be interesting to test whether the relatively low incidence of toxins and antitoxins is common to other extremophilic bacteria, for example those that live in acidic or alkaline environments. Notably, certain hyperthermophilic and halophilic Archaea were described to produce bacteriocins called sulfolobicins and halocins, respectively (62).

Bacteria tend to co-localize genes of similar functions within short DNA segments, such as genes for antibiotic resistance (63, 64), heavy metal resistance (65), phage defense (56, 57), and pathogenicity (66). We used the Toxinome database to define the concept of genomic Toxin Islands. These islands contain a high density of toxin and antitoxin genes within a short DNA stretch. We observed that the number of toxins surpasses the number of antitoxins in these islands, mostly in Bacilli, despite the fact that toxins and antitoxins are typically coupled together in the genome to safeguard bacteria from their own harmful weaponry. One possible explanation for this phenomenon is the presence of toxins that specifically target eukaryotes, which leaves the producing bacteria inherently immune to their own toxins. Alternatively, it could be due to the lack of annotation of antitoxins, as Toxinome content is biased with threefold more toxins than antitoxins (1,483,028 versus 491,345). In addition to their toxin and antitoxin content, these Toxin Islands are characterized by the presence of numerous genes, a majority of which have no functional annotation (Fig. 7B). Moreover, these islands are abundant in mobile genetic elements, suggesting the horizontal transfer of genes into the islands or the lateral transfer of entire islands. It is plausible that the multitude of “hypothetical protein” genes found within Toxin Islands encode novel toxins and antitoxins that may target prokaryotic or eukaryotic cells, as we have shown in a Toxin Island from Thauera phenylacetica (Fig. 7C). Consequently, annotating the Toxin Islands can greatly facilitate the discovery of gene functions, including functions of genes that lack sequence similarity to known genes yet play a pivotal role in shaping microbial interactions with hosts, phages, and other microbes. We anticipate that additional genes involved in toxin production, maturation, and secretion will also be localized in Toxin Islands.

Exploring the cellular phenotypes that result from expressing hypothetical protein-encoding genes from the Toxin Islands of certain archaeal, thermophilic, and psychrophilic groups that we identified as having low toxin and antitoxin content could yield intriguing insights and novel types of toxins. Therefore, we believe that experimental testing of such candidate toxins is of significant interest.
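The idea of a Toxin Island as a short DNA stretch that is dense in toxin and antitoxin genes can be illustrated with a minimal Python sketch. It chains neighboring toxin and antitoxin genes on the same contig and reports clusters together with their toxin and antitoxin counts. The `Gene` record, the 10 kb maximum gap, and the minimum of three genes per cluster are hypothetical values chosen for illustration; they are not the island definition or thresholds used to build Toxinome, and the example coordinates are invented.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    contig: str
    start: int
    end: int
    kind: str  # "toxin", "antitoxin", or "other"


def find_islands(genes, max_gap=10_000, min_hits=3):
    """Chain toxin/antitoxin genes on the same contig whose neighbors lie
    within max_gap bp; keep chains with at least min_hits genes.
    Both thresholds are illustrative assumptions."""
    hits = sorted(
        (g for g in genes if g.kind in ("toxin", "antitoxin")),
        key=lambda g: (g.contig, g.start),
    )
    islands, current = [], []
    for gene in hits:
        if current and (
            gene.contig != current[-1].contig
            or gene.start - current[-1].end > max_gap
        ):
            # The chain is broken: keep it only if it is dense enough.
            if len(current) >= min_hits:
                islands.append(current)
            current = []
        current.append(gene)
    if len(current) >= min_hits:
        islands.append(current)
    return islands


def summarize(island):
    """Report island coordinates and its toxin/antitoxin counts."""
    return {
        "contig": island[0].contig,
        "start": island[0].start,
        "end": max(g.end for g in island),
        "toxins": sum(g.kind == "toxin" for g in island),
        "antitoxins": sum(g.kind == "antitoxin" for g in island),
    }


# Invented coordinates for illustration only.
EXAMPLE_GENES = [
    Gene("c1", 1000, 2000, "toxin"),
    Gene("c1", 2100, 2500, "antitoxin"),
    Gene("c1", 2600, 3000, "other"),
    Gene("c1", 3500, 4500, "toxin"),
    Gene("c1", 40000, 41000, "toxin"),
    Gene("c2", 100, 900, "toxin"),
    Gene("c2", 1000, 1300, "antitoxin"),
]

if __name__ == "__main__":
    for island in find_islands(EXAMPLE_GENES):
        print(summarize(island))
```

Genes of kind "other" that fall inside a reported island's coordinates, such as hypothetical proteins, would be the natural candidates for follow-up structural or experimental analysis.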

ACKNOWLEDGMENTS

A.L. is supported by the Israeli Science Foundation (Grant Nos. 1535/20, 3300/20, and 3062/20), Alon Fellowship of the Israeli council of higher education, and the Volkswagen Foundation (Grant ZN 4041).

Contributor Information

Tommy Kaplan, Email: tommy@cs.huji.ac.il.

Asaf Levy, Email: alevy@mail.huji.ac.il.

Victor J. Torres, St. Jude Children's Research Hospital, Memphis, Tennessee, USA

SUPPLEMENTAL MATERIAL

The following material is available online at https://doi.org/10.1128/mbio.01911-23.

Supplemental Figures. mbio.01911-23-s0001.pdf.

Figures S1-S4.

DOI: 10.1128/mbio.01911-23.SuF1

Table S1. mbio.01911-23-s0002.csv.

Additional data.

DOI: 10.1128/mbio.01911-23.SuF2

Table S2. mbio.01911-23-s0003.csv.

Toxin islands.

DOI: 10.1128/mbio.01911-23.SuF3

ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.

REFERENCES

  • 1. Russell AB, Peterson SB, Mougous JD. 2014. Type VI secretion system effectors: poisons with a purpose. Nat Rev Microbiol 12:137–148. doi: 10.1038/nrmicro3185
  • 2. Hayes CS, Koskiniemi S, Ruhe ZC, Poole SJ, Low DA. 2014. Mechanisms and biological roles of contact-dependent growth inhibition systems. Cold Spring Harb Perspect Med 4:a010025. doi: 10.1101/cshperspect.a010025
  • 3. Leidreiter F, Roderer D, Meusch D, Gatsogiannis C, Benz R, Raunser S. 2019. Common architecture of Tc toxins from human and insect pathogenic bacteria. Sci Adv 5:eaax6497. doi: 10.1126/sciadv.aax6497
  • 4. Vlisidou I, Hapeshi A, Healey JRJ, Smart K, Yang G, Waterfield NR. 2019. The Photorhabdus asymbiotica virulence cassettes deliver protein effectors directly into target eukaryotic cells. eLife 8:e46259. doi: 10.7554/eLife.46259
  • 5. Harms A, Liesch M, Körner J, Québatte M, Engel P, Dehio C, Viollier PH. 2017. A bacterial toxin-antitoxin module is the origin of inter-bacterial and inter-kingdom effectors of Bartonella. PLoS Genet 13:e1007077. doi: 10.1371/journal.pgen.1007077
  • 6. Ahmad S, Wang B, Walker MD, Tran H-K, Stogios PJ, Savchenko A, Grant RA, McArthur AG, Laub MT, Whitney JC. 2019. An interbacterial toxin inhibits target cell growth by synthesizing (P)ppApp. Nature 575:674–678. doi: 10.1038/s41586-019-1735-9
  • 7. Bernal P, Allsopp LP, Filloux A, Llamas MA. 2017. The Pseudomonas putida T6SS is a plant warden against phytopathogens. ISME J 11:972–987. doi: 10.1038/ismej.2016.169
  • 8. Odumosu O, Nicholas D, Yano H, Langridge W. 2010. AB toxins: a paradigm switch from deadly to desirable. Toxins 2:1612–1645. doi: 10.3390/toxins2071612
  • 9. Soltani S, Hammami R, Cotter PD, Rebuffat S, Said LB, Gaudreau H, Bédard F, Biron E, Drider D, Fliss I. 2021. Bacteriocins as a new generation of antimicrobials: toxicity aspects and regulations. FEMS Microbiol Rev 45. doi: 10.1093/femsre/fuaa039
  • 10. Wilson BA, Ho M. 2012. Pasteurella multocida toxin interaction with host cells: entry and cellular effects. Curr Top Microbiol Immunol 361:93–111. doi: 10.1007/82_2012_219
  • 11. Woida PJ, Satchell KJF. 2020. The MARTX toxin silences the inflammatory response to cytoskeletal damage before inducing actin cytoskeleton collapse. Sci Signal 13:eaaw9447. doi: 10.1126/scisignal.aaw9447
  • 12. Clemons NC, Bannai Y, Haywood EE, Xu Y, Buschbach JD, Ho M, Wilson BA. 2018. Cytosolic delivery of multidomain cargos by the N terminus of Pasteurella multocida toxin. Infect Immun 86:e00596-18. doi: 10.1128/IAI.00596-18
  • 13. Ibler AEM, ElGhazaly M, Naylor KL, Bulgakova NA, El-Khamisy SF, Humphreys D. 2019. Typhoid toxin exhausts the RPA response to DNA replication stress driving senescence and Salmonella infection. Nat Commun 10:4040. doi: 10.1038/s41467-019-12064-1
  • 14. Jurėnas D, Fraikin N, Goormaghtigh F, Van Melderen L. 2022. Biology and evolution of bacterial toxin-antitoxin systems. Nat Rev Microbiol 20:335–350. doi: 10.1038/s41579-021-00661-1
  • 15. Makarova KS, Wolf YI, Koonin EV. 2009. Comprehensive comparative-genomic analysis of type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes. Biol Direct 4:19. doi: 10.1186/1745-6150-4-19
  • 16. LeRoux M, Srikant S, Teodoro GIC, Zhang T, Littlehale ML, Doron S, Badiee M, Leung AKL, Sorek R, Laub MT. 2022. The darTG toxin-antitoxin system provides phage defence by ADP-ribosylating viral DNA. Nat Microbiol 7:1028–1040. doi: 10.1038/s41564-022-01153-5
  • 17. Harms A, Brodersen DE, Mitarai N, Gerdes K. 2018. Toxins, targets, and triggers: an overview of toxin-antitoxin biology. Mol Cell 70:768–784. doi: 10.1016/j.molcel.2018.01.003
  • 18. Blackwell GA, Hunt M, Malone KM, Lima L, Horesh G, Alako BTF, Thomson NR, Iqbal Z. 2021. Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences. PLoS Biol 19:e3001421. doi: 10.1371/journal.pbio.3001421
  • 19. Wishart D, Arndt D, Pon A, Sajed T, Guo AC, Djoumbou Y, Knox C, Wilson M, Liang Y, Grant J, Liu Y, Goldansaz SA, Rappaport SM. 2015. T3DB: the toxic exposome database. Nucleic Acids Res 43:D928–34. doi: 10.1093/nar/gku1004
  • 20. Li J, Yao Y, Xu HH, Hao L, Deng Z, Rajakumar K, Ou H-Y. 2015. SecReT6: a web-based resource for type VI secretion systems found in bacteria. Environ Microbiol 17:2196–2202.
  • 21. Shao Y, Harrison EM, Bi D, Tai C, He X, Ou H-Y, Rajakumar K, Deng Z. 2011. TADB: a web-based resource for type 2 toxin-antitoxin loci in bacteria and archaea. Nucleic Acids Res 39:D606–11. doi: 10.1093/nar/gkq908
  • 22. Chakraborty A, Ghosh S, Chowdhary G, Maulik U, Chakrabarti S. 2012. DBETH: a database of bacterial exotoxins for human. Nucleic Acids Res 40:D615–20. doi: 10.1093/nar/gkr942
  • 23. Hammami R, Zouhir A, Ben Hamida J, Fliss I. 2007. BACTIBASE: a new web-accessible database for bacteriocin characterization. BMC Microbiol 7:89. doi: 10.1186/1471-2180-7-89
  • 24. van Heel AJ, de Jong A, Song C, Viel JH, Kok J, Kuipers OP. 2018. BAGEL 4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins. Nucleic Acids Res 46:W278–W281. doi: 10.1093/nar/gky383
  • 25. Pirtskhalava M, Amstrong AA, Grigolava M, Chubinidze M, Alimbarashvili E, Vishnepolsky B, Gabrielian A, Rosenthal A, Hurt DE, Tartakovsky M. 2021. DBAASP v3: database of antimicrobial/cytotoxic activity and structure of peptides as a resource for development of new therapeutics. Nucleic Acids Res 49:D288–D297. doi: 10.1093/nar/gkaa991
  • 26. Liu Y, Liu S, Pan Z, Ren Y, Jiang Y, Wang F, Li D, Li Y, Zhang Z. 2023. PAT: a comprehensive database of prokaryotic antimicrobial toxins. Nucleic Acids Res 51:D452–D459. doi: 10.1093/nar/gkac879
  • 27. Yadav SK, Magotra A, Ghosh S, Krishnan A, Pradhan A, Kumar R, Das J, Sharma M, Jha G. 2021. Immunity proteins of dual nuclease T6SS effectors function as transcriptional repressors. EMBO Rep 22. doi: 10.15252/embr.202051857
  • 28. Nachmias N, Dotan N, Fraenkel R, Campos Rocha M, Kluzek M, Shalom M, Rivitz A, Shamash-Halevy N, Cahana I, Deouell N, Klein J, Schlezinger N, Tzarum N, Oppenheimer-Shaanan Y, Levy A. 2022. Systematic discovery of antibacterial and antifungal bacterial toxins. bioRxiv. doi: 10.1101/2021.10.19.465003
  • 29. Bateman A, Martin M-J, Orchard S, Magrane M, Agivetova R, Ahmad S, Alpi E, Bowler-Barnett EH, Britto R, Bursteinas B, Bye-A-Jee H, Coetzee R, Cukura A, Da Silva A, Denny P, Dogan T, Ebenezer T, Fan J, Castro LG, Garmiri P, Georghiou G, Gonzales L, Hatton-Ellis E, Hussein A, Ignatchenko A, et al., The UniProt Consortium. 2021. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res 49:D480–D489. doi: 10.1093/nar/gkaa1100
  • 30. Li W, Godzik A. 2006. CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22:1658–1659. doi: 10.1093/bioinformatics/btl158
  • 31. Chen I-M, Chu K, Palaniappan K, Pillay M, Ratner A, Huang J, Huntemann M, Varghese N, White JR, Seshadri R, Smirnova T, Kirton E, Jungbluth SP, Woyke T, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. 2019. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res 47:D666–D677. doi: 10.1093/nar/gky901
  • 32. Buchfink B, Xie C, Huson DH. 2015. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12:59–60. doi: 10.1038/nmeth.3176
  • 33. Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, Tosatto SCE, Paladin L, Raj S, Richardson LJ, Finn RD, Bateman A. 2021. Pfam: the protein families database in 2021. Nucleic Acids Res 49:D412–D419. doi: 10.1093/nar/gkaa913
  • 34. Huntemann M, Ivanova NN, Mavromatis K, Tripp HJ, Paez-Espino D, Palaniappan K, Szeto E, Pillay M, Chen I-M, Pati A, Nielsen T, Markowitz VM, Kyrpides NC. 2015. The standard operating procedure of the DOE-JGI microbial genome annotation pipeline (MGAP v.4). Stand Genomic Sci 10:86. doi: 10.1186/s40793-015-0077-y
  • 35. Steinegger M, Söding J. 2017. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 35:1026–1028. doi: 10.1038/nbt.3988
  • 36. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. 2022. ColabFold: making protein folding accessible to all. Nat Methods 19:679–682. doi: 10.1038/s41592-022-01488-1
  • 37. van Kempen M, Kim SS, Tumescheit C, Mirdita M, Lee J, Gilchrist CLM, Söding J, Steinegger M. 2023. Fast and accurate protein structure search with Foldseek. Nat Biotechnol. doi: 10.1038/s41587-023-01773-0
  • 38. Buchfink B, Reuter K, Drost H-G. 2021. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods 18:366–368. doi: 10.1038/s41592-021-01101-x
  • 39. Puigbò P, Wolf YI, Koonin EV. 2009. Search for a 'tree of life' in the thicket of the phylogenetic forest. J Biol 8:59. doi: 10.1186/jbiol159
  • 40. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG. 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539. doi: 10.1038/msb.2011.75
  • 41. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490. doi: 10.1371/journal.pone.0009490
  • 42. Yu G. 2020. Using ggtree to visualize data on tree-like structures. Curr Protoc Bioinform 69:e96. doi: 10.1002/cpbi.96
  • 43. Xu S, Dai Z, Guo P, Fu X, Liu S, Zhou L, Tang W, Feng T, Chen M, Zhan L, Wu T, Hu E, Jiang Y, Bo X, Yu G, Tamura K. 2021. ggtreeExtra: compact visualization of richly annotated phylogenetic data. Mol Biol Evol 38:4039–4042. doi: 10.1093/molbev/msab166
  • 44. Chen I-MA, Chu K, Palaniappan K, Ratner A, Huang J, Huntemann M, Hajek P, Ritter S, Varghese N, Seshadri R, Roux S, Woyke T, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. 2021. The IMG/M data management and analysis system v.6.0: new tools and advanced capabilities. Nucleic Acids Res 49:D751–D763. doi: 10.1093/nar/gkaa939
  • 45. Paysan-Lafosse T, Blum M, Chuguransky S, Grego T, Pinto BL, Salazar GA, Bileschi ML, Bork P, Bridge A, Colwell L, Gough J, Haft DH, Letunić I, Marchler-Bauer A, Mi H, Natale DA. 2023. InterPro in 2022. Nucleic Acids Res 51:D418–D427. doi: 10.1093/nar/gkac993
  • 46. Dobrindt U, Hochhut B, Hentschel U, Hacker J. 2004. Genomic islands in pathogenic and environmental microorganisms. Nat Rev Microbiol 2:414–424. doi: 10.1038/nrmicro884
  • 47. Schmidt H, Hensel M. 2004. Pathogenicity islands in bacterial pathogenesis. Clin Microbiol Rev 17:14–56. doi: 10.1128/CMR.17.1.14-56.2004
  • 48. Dobrindt U, Janke B, Piechaczek K, Nagy G, Ziebuhr W, Fischer G, Schierhorn A, Hecker M, Blum-Oehler G, Hacker J. 2000. Toxin genes on pathogenicity islands: impact for microbial evolution. Int J Med Microbiol 290:307–311. doi: 10.1016/S1438-4221(00)80028-4
  • 49. Lindsay JA, Ruzin A, Ross HF, Kurepina N, Novick RP. 1998. The gene for toxic shock toxin is carried by a family of mobile pathogenicity islands in Staphylococcus aureus. Mol Microbiol 29:527–543. doi: 10.1046/j.1365-2958.1998.00947.x
  • 50. Gal-Mor O, Finlay BB. 2006. Pathogenicity islands: a molecular toolbox for bacterial virulence. Cell Microbiol 8:1707–1719. doi: 10.1111/j.1462-5822.2006.00794.x
  • 51. Zhao Y, Sundin GW, Wang D. 2009. Construction and analysis of pathogenicity island deletion mutants of Erwinia amylovora. Can J Microbiol 55:457–464. doi: 10.1139/w08-147
  • 52. Marcus SL, Brumell JH, Pfeifer CG, Finlay BB. 2000. Salmonella pathogenicity islands: big virulence in small packages. Microbes Infect 2:145–156. doi: 10.1016/s1286-4579(00)00273-2
  • 53. Galán JE. 1996. Molecular genetic bases of Salmonella entry into host cells. Mol Microbiol 20:263–271. doi: 10.1111/j.1365-2958.1996.tb02615.x
  • 54. He J, Baldini RL, Déziel E, Saucier M, Zhang Q, Liberati NT, Lee D, Urbach J, Goodman HM, Rahme LG. 2004. The broad host range pathogen Pseudomonas aeruginosa strain PA14 carries two pathogenicity islands harboring plant and animal virulence genes. Proc Natl Acad Sci U S A 101:2530–2535. doi: 10.1073/pnas.0304622101
  • 55. Ross BD, Verster AJ, Radey MC, Schmidtke DT, Pope CE, Hoffman LR, Hajjar AM, Peterson SB, Borenstein E, Mougous JD. 2019. Human gut bacteria contain acquired interbacterial defence systems. Nature 575:224–228. doi: 10.1038/s41586-019-1708-z
  • 56. Doron S, Melamed S, Ofir G, Leavitt A, Lopatina A, Keren M, Amitai G, Sorek R. 2018. Systematic discovery of antiphage defense systems in the microbial pangenome. Science 359:eaar4120. doi: 10.1126/science.aar4120
  • 57. Makarova KS, Wolf YI, Snir S, Koonin EV. 2011. Defense islands in bacterial and archaeal genomes and prediction of novel defense systems. J Bacteriol 193:6039–6056. doi: 10.1128/JB.05535-11
  • 58. Mechichi T, Stackebrandt E, Gad’on N, Fuchs G. 2002. Phylogenetic and metabolic diversity of bacteria degrading aromatic compounds under denitrifying conditions, and description of Thauera phenylacetica sp. nov., Thauera aminoaromatica sp. nov., and Azoarcus buckelii sp. nov. Arch Microbiol 178:26–35. doi: 10.1007/s00203-002-0422-6
  • 59. Yao J, Zhen X, Tang K, Liu T, Xu X, Chen Z, Guo Y, Liu X, Wood TK, Ouyang S, Wang X. 2020. Novel polyadenylylation-dependent neutralization mechanism of the HEPN/MNT toxin/antitoxin system. Nucleic Acids Res 48:11054–11067. doi: 10.1093/nar/gkaa855
  • 60. Ruhl IA, Sheremet A, Smirnova AV, Sharp CE, Grasby SE, Strous M, Dunfield PF. 2022. Microbial functional diversity correlates with species diversity along a temperature gradient. mSystems 7:e0099121. doi: 10.1128/msystems.00991-21
  • 61. Fan Y, Hoshino T, Nakamura A. 2017. Identification of a vapBC toxin-antitoxin system in a thermophilic bacterium Thermus thermophilus HB27. Extremophiles 21:153–161. doi: 10.1007/s00792-016-0891-1
  • 62. O’Connor EM, Shand RF. 2002. Halocins and sulfolobicins: the emerging story of archaeal protein and peptide antibiotics. J Ind Microbiol Biotechnol 28:23–31. doi: 10.1038/sj/jim/7000190
  • 63. Qin S, Wang Y, Zhang Q, Chen X, Shen Z, Deng F, Wu C, Shen J. 2012. Identification of a novel genomic island conferring resistance to multiple aminoglycoside antibiotics in Campylobacter coli. Antimicrob Agents Chemother 56:5332–5339. doi: 10.1128/AAC.00809-12
  • 64. Johnson TA, Stedtfeld RD, Wang Q, Cole JR, Hashsham SA, Looft T, Zhu Y-G, Tiedje JM, Gillings M, Davies JE. 2016. Clusters of antibiotic resistance genes enriched together stay together in swine agriculture. mBio 7:e02214-15. doi: 10.1128/mBio.02214-15
  • 65. Klonowska A, Moulin L, Ardley JK, Braun F, Gollagher MM, Zandberg JD, Marinova DV, Huntemann M, Reddy TBK, Varghese NJ, Woyke T, Ivanova N, Seshadri R, Kyrpides N, Reeve WG. 2020. Novel heavy metal resistance gene clusters are present in the genome of Cupriavidus neocaledonicus STM 6070, a new species of Mimosa pudica microsymbiont isolated from heavy-metal-rich mining site soil. BMC Genomics 21:214. doi: 10.1186/s12864-020-6623-z
  • 66. Schmidt H, Hensel M. 2004. Pathogenicity islands in bacterial pathogenesis. Clin Microbiol Rev 17:14–56. doi: 10.1128/CMR.17.1.14-56.2004

Articles from mBio are provided here courtesy of American Society for Microbiology (ASM)