Nucleic Acids Research. 2019 Oct 17;48(D1):D590–D598. doi: 10.1093/nar/gkz916

PADS Arsenal: a database of prokaryotic defense systems related genes

Yadong Zhang 1,2,3,4, Zhewen Zhang 1,2,3, Hao Zhang 1,2,3,4, Yongbing Zhao 5, Zaichao Zhang 6, Jingfa Xiao 1,2,3,4
PMCID: PMC7145686  PMID: 31620779

Abstract

Defense systems are vital weapons that allow prokaryotes to resist heterologous DNA and survive constant invasion by viruses, and they are widely used in biochemical research and antimicrobial drug research. Numerous types of defense systems have been discovered so far, but there is no comprehensive defense systems database to organize prokaryotic defense gene datasets. To fill this gap, we present the prokaryotic antiviral defense system (PADS) Arsenal (https://bigd.big.ac.cn/padsarsenal), a public database dedicated to gathering, storing, analyzing and visualizing prokaryotic defense gene datasets. The initial version of PADS Arsenal integrates 18 distinct categories of defense systems with the annotation of 6 600 264 genes retrieved from 63 701 genomes across 33 390 species of archaea and bacteria. PADS Arsenal provides several ways to retrieve information on defense systems related genes and to visualize it in multiple display modes. Moreover, an online analysis pipeline is integrated into PADS Arsenal to facilitate annotation and evolutionary analysis of defense genes. PADS Arsenal can also visualize the dynamic variation of defense genes derived from pan-genome analysis. Overall, PADS Arsenal is an open, comprehensive resource to accelerate research on prokaryotic defense systems.

INTRODUCTION

According to the Red Queen hypothesis, the ongoing and competitive arms race is one of the most powerful driving factors in the co-evolution of prokaryotic organisms and viruses (1–3). As a consequence, prokaryotes have evolved numerous diverse and elaborate defense systems to protect themselves against viruses (4). Based on their modes of action, the defense systems can be divided into two major groups: immunity, and dormancy induction or programmed cell death (5,6). The immunity group contains the restriction-modification (RM) system (7,8), the DNA phosphorothioation system (known as the DND system) (9–11), the defense island system associated with restriction-modification (DISARM) (12), the bacteriophage exclusion (BREX) system (13), the prokaryotic Argonautes (pAgos) system (14,15), and the clustered regularly interspaced short palindromic repeats and adjacent cas genes (CRISPR-Cas) system (16–19). The dormancy induction or programmed cell death group includes the toxin-antitoxin (TA) system (20–22) and the abortive infection (ABI) system (23). Recently, several new types of defense systems have been discovered, such as DRUANTIA, GABIJA, and ZORYA (24). These defense systems not only prevent the introduction of heterologous DNA from plasmids or viruses, but are also widely applied in multiple fields: the ABI and RM systems are used to avoid phage contamination in the fermentation industry (23,25,26), the CRISPR-Cas system for precise genetic editing in biochemistry (27,28), and the TA system for clone selection and single-protein expression in living bacterial cells (29).

Several databases have been developed to integrate different defense systems. CRISPRdb and CRISPRone collect data on spacers and repeats and provide tools to search and display CRISPR-associated genes (30,31); REBASE is centered on the RM system, covering restriction enzymes, methylases, and methylation specificity (32); TADB integrates information on type II toxin-antitoxin loci and their genetic features and provides similarity search, genome context browsing, and phylogenetic tools (33). However, each of the databases or platforms mentioned above focuses on a single defense system or subtype. Given the ever-increasing volume of prokaryotic genomic data and the rapid discovery of new defense systems, an integrated database with an in-depth analysis platform for multiple defense systems is urgently needed. To fill this gap, we present PADS Arsenal, a comprehensive database of prokaryotic defense systems related genes. With a large collection of prokaryotic genomic data from public databases, PADS Arsenal is dedicated to gathering, storing, analyzing and visualizing prokaryotic defense system gene datasets for over 33 000 species.

DATABASE IMPLEMENTATION

For data collection, all prokaryotic genomic data were retrieved from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/) (34). Defense systems related genes were identified in three steps. First, we extracted defense systems related genes as seed sequences through literature curation (12,13,19,24); to expand the seed dataset, we also downloaded protein families/sequences from the COG (35), Pfam (36), REBASE (32), TIGRFAMs (37) and TADB (33) databases. Second, PSI-BLAST (38) was used for homology searches of defense systems related genes, and sequences with an identity value ≥30% were selected as putative defense systems related genes for further analyses (39). Third, all putative defense systems related genes were confirmed by checking for conserved domains using InterProScan (40). For quality control, we also randomly selected some strains from eight species (Pseudomonas aeruginosa, Bacillus cytotoxicus, Listeria ivanovii, Listeria monocytogenes, Neisseria meningitidis, Streptococcus pyogenes, Escherichia coli and Mycoplasma pneumoniae) in PADS Arsenal. The CRISPR-Cas systems related genes identified in these strains were compared with the results of a well-known CRISPR-Cas systems identification tool, CRISPRCasFinder (41). About 96% of the cas genes detected by CRISPRCasFinder were archived in PADS Arsenal. The small number of missing genes was due to the slightly lower coverage of our seed datasets. We will integrate more seed sequences in the next version of PADS Arsenal to achieve a higher defense gene detection rate. Prokka was employed for genome annotation (42), Roary for orthologous clustering of defense system genes (43), ComplexHeatmap for constructing heatmaps of defense system genes (44), and MAFFT for multiple sequence alignment (45).

For database construction, we used PHP, HTML5, CSS, Bootstrap and jQuery for front-end rendering and implementation of interactive events. ECharts, D3, circosJs, MSAViewer (46) and phylotree.js (47) were adopted for building interactive graphs. DataTables and Bootstrap Table were used to render data tables. On the back end, MySQL was employed to store data, and the bioinformatics applications were implemented with PSI-BLAST (38), MAFFT (45), PhyML (48) and Python.
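The identity filter in the homology-search step can be sketched as follows. This is a minimal illustration, assuming the PSI-BLAST hits were exported in the standard tab-separated tabular format, in which the first three columns are query ID, subject ID and percent identity. The function name `filter_hits` and the sample rows are hypothetical and are not taken from the PADS Arsenal code base.

```python
import csv
import io

IDENTITY_THRESHOLD = 30.0  # percent identity cutoff stated in the text (>= 30%)


def filter_hits(tabular_text, min_identity=IDENTITY_THRESHOLD):
    """Return (query, subject, identity) tuples for hits at or above the cutoff.

    Assumes tab-separated BLAST output: qseqid, sseqid, pident, ...
    """
    kept = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, identity = row[0], row[1], float(row[2])
        if identity >= min_identity:
            kept.append((query, subject, identity))
    return kept


# Hypothetical example rows (made-up identifiers and scores).
example = (
    "seed_cas6\tgenomeA_0001\t45.2\t210\t100\t5\t1\t210\t3\t212\t1e-50\t180\n"
    "seed_cas6\tgenomeA_0417\t22.9\t150\t110\t6\t4\t153\t9\t158\t2e-04\t40\n"
    "seed_cas6\tgenomeB_0033\t30.0\t198\t130\t8\t2\t199\t1\t198\t3e-20\t95\n"
)
hits = filter_hits(example)
for query, subject, identity in hits:
    print(query, subject, identity)
```

Hits that pass this filter are only putative; in the workflow described above they are still subject to the conserved-domain check with InterProScan.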

DATABASE CONTENT AND USAGE

In the current version (1.0), we have annotated 6 600 264 genes from 18 distinct categories of defense systems. These genes were retrieved from 63 701 genomes, representing a total of 33 390 species across archaea and bacteria (Table 1). PADS Arsenal provides not only a user-friendly interface but also rich analysis functions, offering flexible ways to retrieve and present defense systems related genes as well as a dynamic, interactive annotation pipeline.

Table 1.

The statistics of annotated genes of each defense system in PADS Arsenal

| Defense system | Archaea (1043 species) | Bacteria (32 347 species) |
|---|---|---|
| Abortive infection/phage exclusion systems (ABI) | 909 | 70 595 |
| Bacteriophage Exclusion (BREX) | 9648 | 465 752 |
| Clustered regularly interspaced short palindromic repeats with cas genes (CRISPR-CAS) | 10 836 | 143 345 |
| Defence island system associated with restriction–modification (DISARM) | 6461 | 414 500 |
| DNA phosphorothioation (DND) | 1866 | 99 150 |
| DRUANTIA | 8642 | 726 041 |
| GABIJA | 2728 | 321 625 |
| HACHIMAN | 9675 | 713 810 |
| KIWA | 56 | 4780 |
| LAMASSU | 1026 | 127 721 |
| Prokaryotic Argonautes (PAGOS) | 90 | 1539 |
| Restriction-Modification (RM) | 13 276 | 1 016 565 |
| SEPTU | 990 | 151 706 |
| SHEDU | 16 | 2658 |
| Toxin–Antitoxin (TA) | 19 997 | 1 227 160 |
| THOERIS | 663 | 45 056 |
| WADJET | 2311 | 98 485 |
| ZORYA | 7456 | 873 130 |
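As a quick consistency check, the per-system counts in Table 1 can be summed and compared with the total of 6 600 264 annotated genes quoted in the text. The short script below only re-adds the published numbers; the dictionary name `TABLE_1` is an illustrative choice.

```python
# Gene counts per defense system from Table 1: (archaea, bacteria).
TABLE_1 = {
    "ABI": (909, 70_595),
    "BREX": (9_648, 465_752),
    "CRISPR-CAS": (10_836, 143_345),
    "DISARM": (6_461, 414_500),
    "DND": (1_866, 99_150),
    "DRUANTIA": (8_642, 726_041),
    "GABIJA": (2_728, 321_625),
    "HACHIMAN": (9_675, 713_810),
    "KIWA": (56, 4_780),
    "LAMASSU": (1_026, 127_721),
    "PAGOS": (90, 1_539),
    "RM": (13_276, 1_016_565),
    "SEPTU": (990, 151_706),
    "SHEDU": (16, 2_658),
    "TA": (19_997, 1_227_160),
    "THOERIS": (663, 45_056),
    "WADJET": (2_311, 98_485),
    "ZORYA": (7_456, 873_130),
}

archaea_total = sum(archaea for archaea, _ in TABLE_1.values())
bacteria_total = sum(bacteria for _, bacteria in TABLE_1.values())
grand_total = archaea_total + bacteria_total

print(f"categories: {len(TABLE_1)}")
print(f"archaea:    {archaea_total}")
print(f"bacteria:   {bacteria_total}")
print(f"total:      {grand_total}")  # matches the 6 600 264 genes reported
```

The 18 rows add up to 96 646 archaeal and 6 503 618 bacterial genes, which together give the reported total.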

In the browse module, all the completed prokaryotic genomes can be browsed at different taxonomic levels. Users select a taxonomy label, type some characters in the input box and click the matching taxonomic group. For example, searching for E. coli (Figure 1) brings up a table of all related strains, each with a color bar. Users can intuitively observe the composition of defense systems related genes in each strain and how that composition varies between strains. Each colored block can be clicked to show the details of all genes in that defense system. In addition, the final thumbnail link opens a Circos graph showing the loci of the defense systems related genes together with the GC content and GC skew of the genome. Strips of different colors represent different types of defense systems related genes, and each strip can be clicked for further information. Users can estimate the regions of defense islands by combining the arrangement of defense systems related genes across the genome, the GC skew, and the difference in GC content relative to the genome average.
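The two sequence statistics mentioned here can be computed per window as GC content = (G + C) / window length and GC skew = (G − C) / (G + C). The sketch below is a minimal illustration of that calculation and of flagging windows whose GC content departs from the genome average. The window size, the deviation threshold and the function names are assumptions chosen for the example; the parameters PADS Arsenal uses for its Circos graphs are not stated in the text.

```python
def gc_windows(sequence, window=1000, step=1000):
    """Return a list of (start, gc_content, gc_skew) for each full window."""
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        # GC skew is undefined when a window has no G or C; report 0.0 then.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results


def flag_atypical_windows(windows, genome_gc, min_delta=0.05):
    """Return start positions of windows whose GC content differs from the
    genome-wide average by at least min_delta (an assumed threshold)."""
    return [start for start, gc, _ in windows if abs(gc - genome_gc) >= min_delta]


# Toy sequence: balanced flanks around a G-rich block (not real genome data).
toy = "ATGC" * 5 + "GGGGG" * 4 + "ATGC" * 5
windows = gc_windows(toy, window=20, step=20)
genome_gc = (toy.count("G") + toy.count("C")) / len(toy)
flagged = flag_atypical_windows(windows, genome_gc, min_delta=0.2)
print(windows)
print(flagged)
```

On a real genome, flagged windows that also carry clustered defense genes would be candidate defense-island regions to inspect by eye in the Circos view.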

Figure 1.

Screenshots of the browse page. (A) The E. coli search table based on the species label on the browse page. (B) The ZORYA defense system gene table of E. coli BL21(DE3), shown by clicking the colored block. (C) Circos graphs of E. coli BL21(DE3), shown by clicking the shortcut link. (D) Detailed information about a ZORYA defense system gene, shown by clicking the strip (only partially shown).

To make the database easier to search and explore, we provide four search approaches (Figure 2). System-based and gene-based searches can be used when users are interested in a certain defense system or in a certain gene of a defense system, respectively. Species-based and assembly accession-based searches are provided for users looking for a species or an assembly accession ID. All four approaches return the same fields in their results, such as defense system category, defense system subtype, and gene symbol.
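The four search approaches can be pictured as four ways of filtering one and the same record type, which is why every approach returns the same fields. The sketch below is purely illustrative: the `GeneRecord` fields, the `search` helper and all record values are hypothetical and do not reflect the actual PADS Arsenal schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    system: str    # defense system category
    subtype: str   # defense system subtype
    gene: str      # gene symbol
    species: str
    assembly: str  # assembly accession ID


# Made-up records for illustration only.
RECORDS = [
    GeneRecord("CRISPR-CAS", "subtype_x", "cas6", "Species one", "GCA_000000001.1"),
    GeneRecord("ABI", "subtype_y", "gene_a", "Species one", "GCA_000000001.1"),
    GeneRecord("CRISPR-CAS", "subtype_x", "cas6", "Species two", "GCA_000000002.1"),
]

SEARCHABLE_FIELDS = {"system", "gene", "species", "assembly"}


def search(records, field, value):
    """Return all records whose `field` equals `value` (case-insensitive)."""
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"unsupported search field: {field}")
    wanted = value.lower()
    return [record for record in records if getattr(record, field).lower() == wanted]


by_gene = search(RECORDS, "gene", "cas6")
by_system = search(RECORDS, "system", "abi")
by_species = search(RECORDS, "species", "Species one")
by_assembly = search(RECORDS, "assembly", "GCA_000000002.1")
print(len(by_gene), len(by_system), len(by_species), len(by_assembly))
```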

Figure 2.

Screenshots of search page. (A) Species-based search results with Acidianus hospitalis. (B) Assembly accession-based search results for ‘GCA_900248165.1’. (C) System-based search for ABI defense system. (D) Gene-based search for the cas6 gene of CRISPR–Cas system.

An interactive online pipeline for annotating defense systems related genes is integrated in the analysis module, combining sequence homology search, multiple sequence alignment, and phylogenetic analysis. Users can upload a protein sequence for a sequence similarity search. The resulting hits can then be filtered by BLAST identity value, and users can select seed sequences of interest for multiple sequence alignment. Users can also construct a phylogenetic tree to further annotate their uploaded sequence. As an example (Figure 3), we present sample sequences from the DND and BREX systems and show the corresponding results of the homologous sequence search, multiple sequence alignment, and phylogenetic analysis.

Figure 3.

Screenshots of the annotation page. (A) Sequence upload, program selection and parameter settings. (B) The preliminary annotation results and the setting of the filtering threshold. (C) Results filtered by the threshold, with the parameters for multiple sequence alignment and for building an evolutionary tree. (D) The result of multiple sequence alignment (only partially shown due to limited space). (E) The constructed evolutionary tree.

Gene conservation is an important characteristic for understanding the mechanisms of defense systems. To visualize the dynamic variation of defense systems related genes across species, a static presence-absence variation (PAV) analysis function is integrated in PADS Arsenal. In PAV analysis, users select a species of interest to view the heatmap of the PAV analysis result, from which they can choose a defense system to view the variation of its genes at the species level from a pan-genome perspective. All defense system gene families (core, shared, unique) are listed in a table. For example, the results for Chlamydia muridarum with the DISARM defense system selected are shown in Figure 4. The heatmap of C. muridarum suggests that genes associated with the DISARM system are highly conserved. In addition, the orthologous clustering of defense system genes identified in PAV analysis paves the way for downstream analyses.
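The split of gene families into core, shared and unique can be sketched from a presence-absence matrix. The definitions used below are the usual pan-genome conventions and are an assumption here, since the text does not define the three terms: core families are present in every strain, unique families in exactly one strain, and shared families in more than one strain but not all. The matrix values and family names are made up.

```python
def classify_gene_families(pav):
    """Classify each gene family from a presence-absence matrix.

    pav maps family name -> {strain name: 1 if present, 0 if absent}.
    Returns a dict mapping family name -> "core", "shared", "unique" or "absent".
    """
    categories = {}
    for family, row in pav.items():
        n_strains = len(row)
        n_present = sum(1 for present in row.values() if present)
        if n_present == 0:
            categories[family] = "absent"
        elif n_present == n_strains:
            categories[family] = "core"
        elif n_present == 1:
            categories[family] = "unique"
        else:
            categories[family] = "shared"
    return categories


# Hypothetical presence-absence matrix for three strains of one species.
pav_matrix = {
    "family_1": {"strain_A": 1, "strain_B": 1, "strain_C": 1},
    "family_2": {"strain_A": 1, "strain_B": 0, "strain_C": 1},
    "family_3": {"strain_A": 0, "strain_B": 1, "strain_C": 0},
}
categories = classify_gene_families(pav_matrix)
for family, category in categories.items():
    print(family, category)
```

A system whose families are all classified as core across the strains of a species would correspond to the highly conserved pattern described for DISARM in C. muridarum.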

Figure 4.

Screenshots of PAV analysis page. (A) The heatmap of defense system genes distribution for C. muridarum. (B) The detailed information of DISARM defense system orthologous gene clusters based on the heatmap. (C) The results of multiple sequence alignment of an orthologous gene cluster by clicking the ‘MSA’ button.

In the statistics module, interactive charts are provided (Supplementary Figure S1). Two pie charts show the overall distribution of defense systems related genes in archaea and bacteria. The histogram offers two browsing modes (single/multiple) across multiple taxonomic levels (from phylum to genus), so that users can see the presence or absence of different defense systems related genes at each taxonomic level. For instance, ZORYA defense systems related genes are widespread across archaeal phyla, while ABI genes are more specific and are observed only in some archaeal genera. In addition, our statistics for four species (E. coli, S. enterica, S. pyogenes and M. pneumoniae) show that some defense systems (TA, RM and ZORYA) may include different numbers of defense genes in different strains of the same species (Supplementary Figure S2). In contrast, the numbers of defense genes in the GABIJA, LAMASSU and WADJET defense systems are relatively stable.

All the processed results for these 6 600 264 defense systems related genes are publicly available in the download section. We also provide the data tables retrieved from the browse page and the search page, as well as the results of PAV analysis and online annotation.

FUTURE DIRECTIONS

Over the last several decades, defense systems related genes have served as important editing, engineering and regulation tools owing to their natural and powerful enzymatic activities, and the development of these tools has gone through two generations to date (6). RM enzymes were used as key genetic engineering tools in the early stage (49–51). Recently, CRISPR–Cas systems have been widely used as genetic editing tools because of their functional diversity, which includes versatile mechanisms of crRNA guide processing, self/non-self discrimination, and target cleavage (48). Moreover, prokaryotic Argonaute proteins have been reported to mediate nucleic acid-guided cleavage of cognate DNA targets (52,53) or RNA targets (54,55) in vitro. This might lead to a new generation of genome-editing tools (56,57). In this study, we constructed PADS Arsenal to support a wide variety of applications, including displaying defense systems related genes at complete genome scale at different taxonomic levels, searching defense systems related genes, annotating and analyzing specific sequences with multiple tools, and depicting the dynamic variation of defense systems related genes across species. PADS Arsenal archives defense systems related genes rather than indicating complete defense systems. This is mainly because there are no definite descriptions of a complete or active defense system for some multi-gene systems (more than three genes in a system), such as DISARM, DND and DRUANTIA. Identifying the integrity of all 18 defense systems or their subtypes is a great challenge and is also the future development direction for PADS Arsenal. In the current version, PADS Arsenal helps users detect potential defense systems related genes as engineering tools, but none of these systems can be functional unless it is complete.

To assess defense system integrity, we counted the number of strains with or without complete systems; the results show that the RM and TA defense systems are complete in all analyzed strains of E. coli, S. enterica and S. pyogenes (Supplementary Figure S3). This implies that complete RM and TA defense systems might be essential for these species. However, the integrity of the HACHIMAN, KIWA and SEPTU defense systems varies between strains of the same species (E. coli and S. enterica). Some recent studies indicate that defense genes are the most evolutionarily dynamic functional class of genes and that gene loss is about three times more frequent than gene gain (57,58).
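Counting strains with and without a complete system can be sketched as a set-containment test. This is an illustration under a strong assumption: that a system counts as complete in a strain when every gene in a given required set is present. The required gene set, the gene names and the strain data below are hypothetical, and the criteria behind Supplementary Figure S3 are not described in the text.

```python
def count_complete_strains(strain_genes, required_genes):
    """Count strains that carry every required gene of a defense system.

    strain_genes maps strain name -> set of gene symbols found in that strain.
    required_genes is the (assumed) set of genes that defines a complete system.
    Returns (n_complete, n_incomplete).
    """
    required = set(required_genes)
    n_complete = sum(1 for genes in strain_genes.values() if required <= set(genes))
    return n_complete, len(strain_genes) - n_complete


# Hypothetical data: three strains of one species, one three-gene system.
required = {"gene_a", "gene_b", "gene_c"}
strains = {
    "strain_1": {"gene_a", "gene_b", "gene_c", "gene_x"},
    "strain_2": {"gene_a", "gene_c"},
    "strain_3": {"gene_a", "gene_b", "gene_c"},
}
complete, incomplete = count_complete_strains(strains, required)
print(f"complete: {complete}, incomplete: {incomplete}")
```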

Many new defense systems likely remain to be discovered (2,5). In the future, PADS Arsenal, as one of the important database resources of the BIG Data Center (59), will continue to collect and organize more types of defense systems and prokaryotic genomic data. Defense islands, formed by many physically clustered genes that are involved in archaeal and bacterial defense functions, provide a shortcut for discovering new defense systems (4,6,60). We will develop and integrate novel prediction methods to facilitate the identification of defense islands. In some defense systems, genomic modification plays a key role in self/non-self discrimination: in the BREX system, methylation at the fifth position of non-palindromic TAGGAG motifs guides self/non-self discrimination (13), and in the DISARM system, methylation at the second position of CCWGG motifs marks self DNA (12). Greater integration of motif and gene modification site information for self/non-self discrimination, through literature curation and deep mining of genome modification data, will be a welcome improvement.
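Locating the candidate modification sites named above amounts to a motif scan. The sketch below finds every TAGGAG and CCWGG occurrence on the given strand and reports the 0-based coordinate of the fifth and second base of each motif, respectively, with W expanded to A or T as in the IUPAC code. The function name, the dictionary layout and the test sequence are assumptions for illustration. The scan covers one strand only, so for the non-palindromic TAGGAG motif the reverse complement would have to be scanned separately.

```python
import re

# IUPAC codes needed for the two motifs in the text (W = A or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}

# Motif and 1-based position of the modified base within it, as given in the text.
MOTIFS = {
    "BREX": ("TAGGAG", 5),
    "DISARM": ("CCWGG", 2),
}


def motif_sites(sequence, motif, modified_position):
    """Return 0-based coordinates of the modified base for each motif match.

    A lookahead is used so that overlapping motif occurrences are all reported.
    """
    pattern = "".join(IUPAC[base] for base in motif)
    regex = re.compile(f"(?=({pattern}))")
    offset = modified_position - 1
    return [match.start() + offset for match in regex.finditer(sequence.upper())]


# Made-up test sequence containing one TAGGAG and two CCWGG motifs.
toy = "AATAGGAGTTCCAGGTTCCTGGA"
sites = {name: motif_sites(toy, motif, pos) for name, (motif, pos) in MOTIFS.items()}
print(sites)
```

For the toy sequence, the reported coordinates fall on the adenine of TAGGAG and on the second cytosine of each CCWGG.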

ACKNOWLEDGEMENTS

We thank Dr Jun Zhong and Members of the BIG Data Center for reporting bugs and sending comments. We also thank Dr Jiayan Wu for the proofreading.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

National Key Research Program of China [2016YFB0201702 to J.X.]; National Natural Science Foundation of China [31771465 and 31970634 to J.X.]; Promoting Big Data Development Project, the National Development and Reform Commission of China [2016-999999-65-01-000696-07 to J.X.]; International Partnership Program of the Chinese Academy of Sciences [153F11KYSB20160008]; The 13th Five-year Informatization Plan of Chinese Academy of Sciences [XXH13505-05 to J.X.]. Funding for open access charge: National Key Research Program of China [2016YFB0201702].

Conflict of interest statement. None declared.

REFERENCES

1. Valen L.V. A new evolutionary law. Evol. Theory. 1973; 1:1–30.
2. Stern A., Sorek R. The phage-host arms race: shaping the evolution of microbes. Bioessays. 2011; 33:43–51.
3. Koonin E.V., Wolf Y.I. Evolution of microbes and viruses: a paradigm shift in evolutionary biology. Front. Cell Infect. Microbiol. 2012; 2:119.
4. Makarova K.S., Wolf Y.I., Snir S., Koonin E.V. Defense islands in bacterial and archaeal genomes and prediction of novel defense systems. J. Bacteriol. 2011; 193:6039–6056.
5. Makarova K.S., Wolf Y.I., Koonin E.V. Comparative genomics of defense systems in archaea and bacteria. Nucleic Acids Res. 2013; 41:4360–4377.
6. Koonin E.V., Makarova K.S., Wolf Y.I. Evolutionary Genomics of Defense Systems in Archaea and Bacteria. Annu. Rev. Microbiol. 2017; 71:233–261.
7. Arber W., Linn S. DNA modification and restriction. Annu. Rev. Biochem. 1969; 38:467–500.
8. Ershova A.S., Rusinov I.S., Spirin S.A., Karyagina A.S., Alexeevski A.V. Role of restriction-modification systems in prokaryotic evolution and ecology. Biochemistry (Mosc.). 2015; 80:1373–1386.
9. Zhou X., Deng Z., Firmin J.L., Hopwood D.A., Kieser T. Site-specific degradation of Streptomyces lividans DNA during electrophoresis in buffers contaminated with ferrous iron. Nucleic Acids Res. 1988; 16:4341–4352.
10. Wang L., Chen S., Xu T., Taghizadeh K., Wishnok J.S., Zhou X., You D., Deng Z., Dedon P.C. Phosphorothioation of DNA in bacteria by dnd genes. Nat. Chem. Biol. 2007; 3:709–710.
11. Wang L., Chen S., Vergin K.L., Giovannoni S.J., Chan S.W., DeMott M.S., Taghizadeh K., Cordero O.X., Cutler M., Timberlake S. et al. DNA phosphorothioation is widespread and quantized in bacterial genomes. Proc. Natl. Acad. Sci. U.S.A. 2011; 108:2963–2968.
12. Ofir G., Melamed S., Sberro H., Mukamel Z., Silverman S., Yaakov G., Doron S., Sorek R. DISARM is a widespread bacterial defence system with broad anti-phage activities. Nat. Microbiol. 2018; 3:90–98.
13. Goldfarb T., Sberro H., Weinstock E., Cohen O., Doron S., Charpak-Amikam Y., Afik S., Ofir G., Sorek R. BREX is a novel phage resistance system widespread in microbial genomes. EMBO J. 2015; 34:169–183.
14. Hur J.K., Olovnikov I., Aravin A.A. Prokaryotic Argonautes defend genomes against invasive DNA. Trends Biochem. Sci. 2014; 39:257–259.
15. Swarts D.C., Makarova K., Wang Y., Nakanishi K., Ketting R.F., Koonin E.V., Patel D.J., van der Oost J. The evolutionary journey of Argonaute proteins. Nat. Struct. Mol. Biol. 2014; 21:743–753.
16. van der Oost J., Jore M.M., Westra E.R., Lundgren M., Brouns S.J. CRISPR-based adaptive and heritable immunity in prokaryotes. Trends Biochem. Sci. 2009; 34:401–407.
17. Garneau J.E., Dupuis M.E., Villion M., Romero D.A., Barrangou R., Boyaval P., Fremaux C., Horvath P., Magadan A.H., Moineau S. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature. 2010; 468:67–71.
18. Horvath P., Barrangou R. CRISPR/Cas, the immune system of bacteria and archaea. Science. 2010; 327:167–170.
19. Makarova K.S., Haft D.H., Barrangou R., Brouns S.J., Charpentier E., Horvath P., Moineau S., Mojica F.J., Wolf Y.I., Yakunin A.F. et al. Evolution and classification of the CRISPR-Cas systems. Nat. Rev. Microbiol. 2011; 9:467–477.
20. Gerdes K., Christensen S.K., Lobner-Olesen A. Prokaryotic toxin-antitoxin stress response loci. Nat. Rev. Microbiol. 2005; 3:371–382.
21. Yamaguchi Y., Park J.H., Inouye M. Toxin-antitoxin systems in bacteria and archaea. Annu. Rev. Genet. 2011; 45:61–79.
22. Page R., Peti W. Toxin-antitoxin systems in bacterial growth arrest and persistence. Nat. Chem. Biol. 2016; 12:208–214.
23. Chopin M.C., Chopin A., Bidnenko E. Phage abortive infection in lactococci: variations on a theme. Curr. Opin. Microbiol. 2005; 8:473–479.
24. Doron S., Melamed S., Ofir G., Leavitt A., Lopatina A., Keren M., Amitai G., Sorek R. Systematic discovery of antiphage defense systems in the microbial pangenome. Science. 2018; 359:eaar4120.
25. Allison G.E., Klaenhammer T.R. Phage resistance mechanisms in lactic acid bacteria. Int. Dairy J. 1998; 8:207–226.
26. Hansen E.B. Commercial bacterial starter cultures for fermented foods of the future. Int. J. Food Microbiol. 2002; 78:119–131.
27. Cong L., Ran F.A., Cox D., Lin S., Barretto R., Habib N., Hsu P.D., Wu X., Jiang W., Marraffini L.A. et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013; 339:819–823.
28. Hwang W.Y., Fu Y., Reyon D., Maeder M.L., Tsai S.Q., Sander J.D., Peterson R.T., Yeh J.R., Joung J.K. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat. Biotechnol. 2013; 31:227–229.
29. Unterholzner S.J., Poppenberger B., Rozhon W. Toxin-antitoxin systems: biology, identification, and application. Mob Genet. Elements. 2013; 3:e26219.
30. Grissa I., Vergnaud G., Pourcel C. The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinformatics. 2007; 8:172.
31. Zhang Q., Ye Y. Not all predicted CRISPR-Cas systems are equal: isolated cas genes and classes of CRISPR like elements. BMC Bioinformatics. 2017; 18:92.
32. Roberts R.J., Vincze T., Posfai J., Macelis D. REBASE–a database for DNA restriction and modification: enzymes, genes and genomes. Nucleic Acids Res. 2015; 43:D298–D299.
33. Xie Y., Wei Y., Shen Y., Li X., Zhou H., Tai C., Deng Z., Ou H.Y. TADB 2.0: an updated database of bacterial type II toxin-antitoxin loci. Nucleic Acids Res. 2018; 46:D749–D753.
34. Sayers E.W., Agarwala R., Bolton E.E., Brister J.R., Canese K., Clark K., Connor R., Fiorini N., Funk K., Hefferon T. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2019; 47:D23–D28.
35. Galperin M.Y., Makarova K.S., Wolf Y.I., Koonin E.V. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res. 2015; 43:D261–D269.
36. El-Gebali S., Mistry J., Bateman A., Eddy S.R., Luciani A., Potter S.C., Qureshi M., Richardson L.J., Salazar G.A., Smart A. et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2019; 47:D427–D432.
37. Haft D.H., Loftus B.J., Richardson D.L., Yang F., Eisen J.A., Paulsen I.T., White O. TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res. 2001; 29:41–43.
38. Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997; 25:3389–3402.
39. Kuzniar A., van Ham R.C., Pongor S., Leunissen J.A. The quest for orthologs: finding the corresponding gene across genomes. Trends Genet.: TIG. 2008; 24:539–551.
40. Hunter S., Jones P., Mitchell A., Apweiler R., Attwood T.K., Bateman A., Bernard T., Binns D., Bork P., Burge S. et al. InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res. 2012; 40:D306–D312.
41. Couvin D., Bernheim A., Toffano-Nioche C., Touchon M., Michalik J., Neron B., Rocha E.P.C., Vergnaud G., Gautheret D., Pourcel C. CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins. Nucleic Acids Res. 2018; 46:W246–W251.
42. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014; 30:2068–2069.
43. Page A.J., Cummins C.A., Hunt M., Wong V.K., Reuter S., Holden M.T., Fookes M., Falush D., Keane J.A., Parkhill J. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015; 31:3691–3693.
44. Gu Z., Eils R., Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016; 32:2847–2849.
45. Katoh K., Misawa K., Kuma K., Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002; 30:3059–3066.
46. Yachdav G., Wilzbach S., Rauscher B., Sheridan R., Sillitoe I., Procter J., Lewis S.E., Rost B., Goldberg T. MSAViewer: interactive JavaScript visualization of multiple sequence alignments. Bioinformatics. 2016; 32:3501–3503.
47. Shank S.D., Weaver S., Kosakovsky Pond S.L. phylotree.js - a JavaScript library for application development and interactive data visualization in phylogenetics. BMC Bioinformatics. 2018; 19:276.
48. Guindon S., Dufayard J.F., Lefort V., Anisimova M., Hordijk W., Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic Biol. 2010; 59:307–321.
49. Vasu K., Nagaraja V. Diverse functions of restriction-modification systems in addition to cellular defense. Microbiol. Mol. Biol. Rev.: MMBR. 2013; 77:53–72.
50. Roberts R.J. Restriction endonucleases. CRC Crit. Rev. Biochem. 1976; 4:123–164.
51. Williams R.J. Restriction endonucleases: classification, properties, and applications. Mol. Biotechnol. 2003; 23:225–243.
52. Swarts D.C., Hegge J.W., Hinojo I., Shiimori M., Ellis M.A., Dumrongkulraksa J., Terns R.M., Terns M.P., van der Oost J. Argonaute of the archaeon Pyrococcus furiosus is a DNA-guided nuclease that targets cognate DNA. Nucleic Acids Res. 2015; 43:5120–5129.
53. Swarts D.C., Jore M.M., Westra E.R., Zhu Y., Janssen J.H., Snijders A.P., Wang Y., Patel D.J., Berenguer J., Brouns S.J.J. et al. DNA-guided DNA interference by a prokaryotic Argonaute. Nature. 2014; 507:258–261.
54. Yuan Y.R., Pei Y., Ma J.B., Kuryavyi V., Zhadina M., Meister G., Chen H.Y., Dauter Z., Tuschl T., Patel D.J. Crystal structure of A. aeolicus argonaute, a site-specific DNA-guided endoribonuclease, provides insights into RISC-mediated mRNA cleavage. Mol. Cell. 2005; 19:405–419.
55. Kaya E., Doxzen K.W., Knoll K.R., Wilson R.C., Strutt S.C., Kranzusch P.J., Doudna J.A. A bacterial Argonaute with noncanonical guide RNA specificity. Proc. Natl. Acad. Sci. U.S.A. 2016; 113:4057–4062.
56. Hegge J.W., Swarts D.C., van der Oost J. Prokaryotic Argonaute proteins: novel genome-editing tools? Nat. Rev. Microbiol. 2018; 16:5–11.
57. Puigbo P., Makarova K.S., Kristensen D.M., Wolf Y.I., Koonin E.V. Reconstruction of the evolution of microbial defense systems. BMC Evol. Biol. 2017; 17:94.
58. Puigbo P., Lobkovsky A.E., Kristensen D.M., Wolf Y.I., Koonin E.V. Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes. BMC Biol. 2014; 12:66.
59. Members of BIG Data Center. Database Resources of the BIG Data Center in 2019. Nucleic Acids Res. 2019; 47:D8–D14.
60. Makarova K.S., Wolf Y.I., Forterre P., Prangishvili D., Krupovic M., Koonin E.V. Dark matter in archaeal genomes: a rich source of novel mobile elements, defense systems and secretory complexes. Extremophiles. 2014; 18:877–893.
