Abstract
ChlamDB is a comparative genomics database containing 277 genomes covering the entire Chlamydiae phylum as well as its closest relatives belonging to the Planctomycetes-Verrucomicrobiae-Chlamydiae (PVC) superphylum. Genomes can be compared, analyzed and retrieved using accession numbers of the most widely used databases including COG, KEGG ortholog, KEGG pathway, KEGG module, Pfam and InterPro. Gene annotations from multiple databases including UniProt (curated and automated protein annotations), KEGG (annotation of pathways), COG (orthology), TCDB (transporters), STRING (protein–protein interactions) and InterPro (domains and signatures) can be accessed in a comprehensive overview page. Candidate effectors of the Type III secretion system (T3SS) were identified using four in silico methods. The identification of orthologs among all PVC genomes allows users to perform large-scale comparative analyses and to identify orthologs of any protein in all genomes integrated in the database. Phylogenetic relationships of PVC proteins and their closest homologs in RefSeq, comparison of transmembrane domains and Pfam domains, conservation of gene neighborhood and taxonomic profiles can be visualized using dynamically generated graphs, available for download. As a central resource for researchers working on chlamydia, chlamydia-related bacteria, verrucomicrobia and planctomyces, ChlamDB facilitates access to comprehensive annotations, integrates multiple tools for comparative genomic analyses and is freely available at https://chlamdb.ch/. Database URL: https://chlamdb.ch/
INTRODUCTION
All known members of the phylum Chlamydiae are obligate intracellular bacteria exhibiting a unique life cycle. Described chlamydial species cause a broad range of diseases in various species of birds, fishes, reptiles, amphibians, marsupials and mammals (1), and include major human pathogens such as Chlamydia trachomatis, a leading cause of blindness and infertility (1,2). Chlamydiae are difficult to cultivate and genetic manipulation is only available for a few species, which drastically slows down the understanding of their fascinating biology. Other members of the Planctomycetes-Verrucomicrobiae-Chlamydiae (PVC) superphylum include the closest relatives of the Chlamydiae. The Planctomycetes are extremely attractive for the field of evolutionary cell biology given their peculiar intracellular compartments (3). Like Chlamydiae, they replicate using an FtsZ-independent mechanism but, in contrast to the Chlamydiae, Planctomycetales were shown to have a complete peptidoglycan cell wall (4–7). There is currently no database allowing easy access to, and comparison of, comprehensive genomic data for members of the PVC superphylum. A database focusing on the curation of chlamydial genome annotation was recently published (8), but it is limited to three species of the genus Chlamydia. A phylum-scale perspective including comparative data with the closest free-living relatives of the Chlamydiae would provide significant added value for the research community given the conserved intracellular lifestyle of these bacteria that were estimated to diverge over 700 million years ago (9). The PVCbase (10) provides updated automated protein annotations of forty-two PVC genomes, but only offers limited browsing capabilities and no comparative data. ChlamDB offers a centralized resource for genomic data and annotations of the entire PVC superphylum. Its simple search engine allows browsing protein annotations, identifying orthologs in PVC genomes and performing a variety of comparative analyses.
Genomic data and search
ChlamDB release 2.0 integrates data from 277 PVC genomes of 82 different species (Table 1), retrieved from GenBank (11) or RefSeq (12) (when GenBank records were not annotated). It includes all complete PVC genomes as well as draft genomes of the Chlamydiae phylum to increase the diversity of genera and species represented in the database. Draft genomes of the most studied Chlamydia species were discarded to reduce unnecessary redundancy in the database. Most genomes (n = 221) belong to the Chlamydiae phylum, including 86 C. trachomatis, 20 Chlamydia muridarum, 20 Chlamydia psittaci and 12 Chlamydophila pneumoniae genomes, thus allowing intra-species comparison for these important human pathogens. Strain-level diversity was shown to determine C. trachomatis tissue tropism, which illustrates the value of such comparisons for elucidating novel aspects of chlamydial lifestyle and pathogenesis. To allow for broader comparisons, this database also contains the genomes of 34 Verrucomicrobia, 20 Planctomycetes, 1 Lentisphaerae and 1 Kiritimatiellaeota. Among the 34 Verrucomicrobia, there are 23 Akkermansia muciniphila, a bacterium commonly found in the human gut (13).
Table 1.
Phylum | # genomes | # species |
---|---|---|
Chlamydiae | 221 | 48 |
Planctomycetes | 20 | 20 |
Verrucomicrobia | 34 | 12 |
Lentisphaerae | 1 | 1 |
Kiritimatiellaeota | 1 | 1 |
TOTAL | 277 | 82 |
The database provides various tools for comparing, analyzing and retrieving genomic data. A simple Boolean search interface allows querying the database for specific entries using NCBI protein accessions and locus tags or UniProt accessions. Accession numbers of widely used databases such as COG (14), KEGG ortholog (KO) (15), KEGG pathway (16), KEGG module, Pfam (17) and InterPro (18) are also recognized and can be used to search for proteins with specific annotations. The annotation of individual genomes can be browsed in tables of genes that are accessible directly from the front web page. In addition, sequence homology searches can be performed through a BLAST interface integrating the different BLAST flavours (BLASTp, BLASTn, tBLASTn and BLASTx) (19).
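As an illustration of how a search term could be routed to the right annotation type, the following minimal sketch classifies a query string by its accession format. The patterns reflect the public identifier formats of the databases named above; the function name and the fallback label are hypothetical and do not describe the actual ChlamDB search code.

```python
import re

# Public accession formats of the databases named in the text.
# This mapping is an illustrative assumption, not ChlamDB's own logic.
ACCESSION_PATTERNS = {
    "COG": re.compile(r"^COG\d{4}$"),
    "KEGG ortholog": re.compile(r"^K\d{5}$"),
    "KEGG pathway": re.compile(r"^(map|ko)\d{5}$"),
    "KEGG module": re.compile(r"^M\d{5}$"),
    "Pfam": re.compile(r"^PF\d{5}$"),
    "InterPro": re.compile(r"^IPR\d{6}$"),
}


def classify_accession(term: str) -> str:
    """Return the database a search term most likely refers to."""
    cleaned = term.strip()
    for database, pattern in ACCESSION_PATTERNS.items():
        if pattern.match(cleaned):
            return database
    # Anything else (locus tags, gene names, ...) is treated as free text.
    return "free text"


if __name__ == "__main__":
    for query in ("PF05150", "IPR004667", "K00001", "secY"):
        print(query, "->", classify_accession(query))
```

Terms that match none of the patterns, such as locus tags or gene names, fall through to a generic text search in this sketch.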
Individual protein annotation view
Searching for a protein gives access to a ‘locus’ page, which summarizes automated and imported functional annotations and provides comprehensive comparative data to facilitate the interpretation of annotations (Figure 1). It integrates annotations from multiple databases including UniProt (curated and automated protein annotations) (20), KEGG (annotation of pathways), COG (orthology), TCDB (transporters) (21), STRING (protein-protein interactions) (22) and InterPro (domains and signatures). The different tabs at the top of the page link to additional data such as the list of orthologs in other PVC genomes (Figure 1C), identified using OrthoFinder (23). Orthologs are listed in a table containing the locus tag, the gene name, the name of the organism, the product, the percentage of amino acid identity as compared to the reference locus and the UniProt annotation score. Orthologs that were reviewed in SwissProt are flagged so that manually curated annotations can be identified quickly. Additional tabs link to (i) a precomputed phylogeny of the orthologous group, (ii) a second phylogeny that includes the closest non-PVC RefSeq hits of each sequence of the orthogroup, which makes it possible to investigate the phylogenetic relationship of PVC proteins and their closest homologs available in public databases (Figures 1J and 2J), (iii, iv) precomputed homology searches against the RefSeq and SwissProt databases (200 top hits), (v) links to published literature based on text-mining from the STRING database (24) and PaperBLAST hits (25) and (vi) candidate functional interactors. Putative interactors were predicted in-house from genomic data alone using phylogenetic profiling and investigation of conserved gene neighborhood (see online methods) (Figure 1G). See (26) and (27) for the rationale behind these two approaches.
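The idea behind phylogenetic profiling can be sketched as follows: two orthogroups that are present in nearly the same set of genomes are candidate functional partners. The example below uses the Jaccard index and an arbitrary threshold purely as assumptions for illustration; the metric and cutoffs actually used for ChlamDB are described in the online methods and are not reproduced here.

```python
def jaccard_similarity(profile_a: set, profile_b: set) -> float:
    """Jaccard index between two sets of genomes (presence profiles)."""
    union = profile_a | profile_b
    if not union:
        return 0.0
    return len(profile_a & profile_b) / len(union)


def candidate_interactors(query: str, profiles: dict, threshold: float = 0.8) -> list:
    """Rank orthogroups whose presence profile resembles that of `query`.

    `profiles` maps an orthogroup name to the set of genomes encoding it.
    The default threshold is an illustrative assumption.
    """
    reference = profiles[query]
    scored = []
    for name, profile in profiles.items():
        if name == query:
            continue
        score = jaccard_similarity(reference, profile)
        if score >= threshold:
            scored.append((name, score))
    # Highest similarity first; ties broken by name for reproducibility.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    toy_profiles = {
        "og1": {"g1", "g2", "g3", "g4"},
        "og2": {"g1", "g2", "g3", "g4"},
        "og3": {"g1", "g2", "g3"},
        "og4": {"g5"},
    }
    print(candidate_interactors("og1", toy_profiles, threshold=0.7))
```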
We put a strong emphasis on the visual representation of the data (Figure 2). The pattern of presence/absence of orthologous groups within the PVC superphylum can be visualized with the help of an annotated reference phylogeny (Figures 1D and 2D). The reference phylogeny was reconstructed with FastTree (28) (default parameters, JTT+CAT model) based on the concatenated alignment of 32 single-copy orthologs conserved in at least 266 out of the 277 genomes.
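The selection of marker genes for such a reference phylogeny can be sketched as a simple filter on orthogroup copy numbers. The input structure (orthogroup name mapped to per-genome copy numbers) and the function name are assumptions made for this example; alignment, concatenation and tree inference are separate steps that are not shown.

```python
def select_marker_orthogroups(copy_numbers: dict, min_genomes: int) -> list:
    """Return orthogroups that are single-copy wherever present and
    found in at least `min_genomes` genomes.

    `copy_numbers` maps orthogroup -> {genome: number of copies}.
    """
    markers = []
    for orthogroup, per_genome in copy_numbers.items():
        present = [n for n in per_genome.values() if n > 0]
        is_single_copy = all(n == 1 for n in present)
        if is_single_copy and len(present) >= min_genomes:
            markers.append(orthogroup)
    return sorted(markers)


if __name__ == "__main__":
    toy_counts = {
        "og1": {"g1": 1, "g2": 1, "g3": 1},  # single copy everywhere
        "og2": {"g1": 2, "g2": 1, "g3": 1},  # paralogs in g1, rejected
        "og3": {"g1": 1, "g2": 0, "g3": 0},  # too rarely conserved
        "og4": {"g1": 1, "g2": 1, "g3": 0},  # missing from one genome
    }
    print(select_marker_orthogroups(toy_counts, min_genomes=2))
```

With the figures given in the text, the call would use `min_genomes=266` on a table covering all 277 genomes.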
The organization of transmembrane and Pfam domains in orthologs can be easily compared along the phylogeny of the orthologous group (Figures 1H and 2H). The conservation of proteins encoded in the direct neighborhood (23 kb upstream and downstream) of the protein of interest can also be visualized (Figures 1E and 2E).
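Retrieving the genes that flank a protein of interest amounts to an interval-overlap query. The sketch below assumes a linear replicon and a hypothetical `Gene` record; it does not handle windows that wrap around the origin of a circular chromosome, and the flank size is passed as a parameter rather than hard-coded.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Minimal gene record (hypothetical structure for this example)."""
    locus_tag: str
    start: int
    end: int


def genes_in_neighborhood(genes: List[Gene], target: Gene, flank_bp: int) -> List[str]:
    """Locus tags of genes overlapping the window around `target`.

    The window extends `flank_bp` bases upstream and downstream of the
    target gene. Linear coordinates are assumed.
    """
    window_start = target.start - flank_bp
    window_end = target.end + flank_bp
    return [
        gene.locus_tag
        for gene in genes
        if gene.locus_tag != target.locus_tag
        and gene.end >= window_start
        and gene.start <= window_end
    ]


if __name__ == "__main__":
    toy_genes = [
        Gene("a", 100, 500),
        Gene("b", 1000, 2000),
        Gene("c", 5000, 6000),
        Gene("d", 40000, 41000),
    ]
    print(genes_in_neighborhood(toy_genes, toy_genes[1], flank_bp=4000))
```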
The ‘orthogroup’ link (Figure 1K) provides an overview of the annotation of orthologs including gene name, product, COG annotation, KEGG annotation, InterPro annotations, number of transmembrane domains and sequence length. It allows verifying the consistency of annotations among putative orthologs and identifying wrongly grouped proteins (e.g. non-orthologous proteins sharing a domain).
Annotation of candidate type III secretion system effectors
Chlamydiae use a type III secretion system (T3SS) to deliver effector proteins that allow the bacterium to overcome eukaryotic host defenses and to manipulate host cells. Effectors are difficult to identify because they evolve quickly and are much less conserved than the components of the T3SS apparatus (29,30). Between 5 and 8% of Chlamydia spp. coding sequences (CDS) are estimated to be effectors (31). Candidate T3SS effectors were identified using four different machine-learning classifiers that were trained with known effector sequences: BPBAac (32), effectiveT3 (33), DeepT3 (34) and T3_MM (35). In addition, we tagged proteins harboring eukaryotic domains rarely found in bacterial genomes. Such domains are known to be frequently involved in bacteria–host interactions (36,37). The ADP/ATP transporter domain (InterPro accession IPR004667) is for instance frequently found in both bacteria (70.48%) and eukaryotes (29.52%) (Figure 1L). A dedicated page allows visualizing the taxonomic distribution of each COG and Pfam domain across 2,031 (for COG) and 6,677 (for Pfam) representative archaeal, bacterial, eukaryotic and viral genomes, respectively (Figures 1M and 2M). The detailed list of identified homologs can, for instance, be used to quickly determine whether a candidate effector protein harbors a domain predominantly identified in the genomes of eukaryotes and other intracellular bacterial parasites such as Rickettsia or Legionella.
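A taxonomic profile of the kind reported for IPR004667 is a simple proportion of hits per superkingdom. The following sketch shows the arithmetic on a toy list of labels; the function name and input format are assumptions for illustration, and the toy values are not ChlamDB data.

```python
from collections import Counter
from typing import Dict, Iterable


def taxonomic_profile(hit_superkingdoms: Iterable[str]) -> Dict[str, float]:
    """Percentage of hits assigned to each superkingdom.

    `hit_superkingdoms` holds one label per genome (or protein) in
    which the domain was detected.
    """
    counts = Counter(hit_superkingdoms)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        superkingdom: round(100 * n / total, 2)
        for superkingdom, n in counts.items()
    }


if __name__ == "__main__":
    toy_hits = ["Bacteria", "Bacteria", "Bacteria", "Eukaryota"]
    print(taxonomic_profile(toy_hits))
```

A domain whose profile is dominated by eukaryotic hits would be a natural candidate for the "eukaryote-like" tag described above, although the cutoff used to assign that tag is not specified in this text.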
Comparative genomics and data mining tools
Since the C. trachomatis genome became one of the first sequenced genomes (38), hundreds of Chlamydiae genomes have been sequenced. Comparisons of complete genomes of different strains and species can help identify genetic variations that may be involved in defining tissue tropism or host specificity (39), or identify genes essential to the unique intracellular lifestyle of Chlamydiae. ChlamDB allows users to perform various comparative analyses based on orthologous proteins to identify highly conserved and genome-specific or clade-specific orthologous groups (Figures 3.1 and 3.2). Whole genome comparisons can be visualized using interactive circular genome maps, Venn diagrams or heat maps (Figures 3.3, 3.4 and 3.5). In addition, ChlamDB enables the alignment of local genomic regions in two or more genomes (Figure 3.6).
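Identifying conserved or clade-specific orthologous groups reduces to set operations on per-genome orthogroup content. The sketch below is one way to express it, assuming a hypothetical mapping from genome name to the set of orthogroups it encodes; it is not taken from the ChlamDB code base.

```python
from typing import Dict, Iterable, Set


def clade_specific_orthogroups(
    presence: Dict[str, Set[str]],
    include: Iterable[str],
    exclude: Iterable[str] = (),
) -> Set[str]:
    """Orthogroups found in every `include` genome and in no `exclude` genome.

    With an empty `exclude`, the result is the set of orthogroups
    conserved across all `include` genomes.
    """
    include = list(include)
    if not include:
        return set()
    shared = set.intersection(*(presence[genome] for genome in include))
    elsewhere = set().union(*(presence[genome] for genome in exclude))
    return shared - elsewhere


if __name__ == "__main__":
    toy_presence = {
        "A": {"og1", "og2", "og3"},
        "B": {"og1", "og2"},
        "C": {"og2", "og4"},
    }
    print(clade_specific_orthogroups(toy_presence, ["A", "B"], ["C"]))
```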
Pfam domains, KEGG orthologs and InterPro entries can also be compared to identify clade-specific or highly conserved protein features (Figure 3.7). A simple form enables the user to compare the size of gene families or the frequency of domains/KEGG annotations in each genome, allowing the identification of large protein families or frequent domains. For instance, the polymorphic membrane protein family (Pmp), a family of proteins involved in adhesion identified in all sequenced Chlamydiaceae genomes (40), is present in up to 28 copies in the C. psittaci CP3 genome. Interestingly, the Pfam domain PF05150 (‘Legionella pneumophila major outer membrane protein domain’), a domain very rarely identified outside of the Legionella genus (see https://chlamdb.ch/pfam_profile/PF05150/phylum), is present in 219 copies within the PVC superphylum (https://chlamdb.ch/fam/PF05150/pfam). This domain is also the most frequent domain identified in the genome of Simkania negevensis (36 occurrences). Proteins harboring this domain were probably acquired through horizontal gene transfer by Chlamydiae, Legionella or both, and might share similar functions.
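Counting how often each domain occurs in a genome, as in the Simkania negevensis example, can be sketched with a tally over per-protein annotations. The input format (protein identifier mapped to a list of Pfam accessions) and the toy data are assumptions for this example only.

```python
from collections import Counter
from typing import Dict, List


def domain_frequencies(protein_domains: Dict[str, List[str]]) -> Counter:
    """Tally domain occurrences across all proteins of one genome.

    A domain annotated several times on the same protein is counted
    once per occurrence.
    """
    counts: Counter = Counter()
    for domains in protein_domains.values():
        counts.update(domains)
    return counts


if __name__ == "__main__":
    toy_genome = {
        "prot1": ["PF05150", "PF00005"],
        "prot2": ["PF05150"],
        "prot3": ["PF05150", "PF05150"],
        "prot4": [],
    }
    frequencies = domain_frequencies(toy_genome)
    # Most frequent domain first.
    print(frequencies.most_common(2))
```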
Annotations from the KEGG database were used to classify proteins into metabolic pathways and modules (16). Data for individual pathways and modules can be retrieved by searching KEGG accessions in the main search bar. In addition, KEGG annotations in various genomes can be compared as annotated phylogenies (Figure 4.1) and interactive bar charts, or accessed from summary tables available for each genome (Figure 4.2). Module and pathway pages detail the KEGG orthologs associated with a given entry (Figure 4.3) and report the list of orthologs identified in each PVC genome (Figure 4.4).
Implementation, methods and updates
The interface was developed using the Django framework (https://www.djangoproject.com/). Data are stored on a MySQL server and visualized with existing JavaScript libraries for drawing interactive plots and tables, such as jvenn.js (41), datatables.js (https://datatables.net), cytoscape.js (42) and feature-viewer.js (https://github.com/calipho-sib/feature-viewer) (43). The Python module GenomeDiagram is used to draw genome schematics, including alignments of multiple genomic locations (44). Circular representations of genomes and plasmids are made with Circos (45). The Ete3 Python module is used to draw phylogenetic trees with associated metadata (46). Some plots are also made using R (47), ggplot2 (48) and plotly (https://plot.ly). Annotations, phylogenetic trees and multiple sequence alignments can be downloaded from the website. A detailed description of the methods used to pre-compute functional and comparative analyses and to set up the database is available online (https://www.chlamdb.ch/docs/index.html). The source code of the website is freely available on GitHub and issues can be reported online (https://github.com/metagenlab/chlamdb). This database has been developed at the Centre for Research on Intracellular Bacteria (CRIB) in Lausanne and will be maintained and updated at least once a year.
CONCLUSION AND FUTURE DIRECTIONS
As the number of genome sequences quickly increases, there is a need for a centralized genomics resource providing updated annotations and extensive comparative genomics capabilities for the PVC superphylum. A superphylum-specific database has a significant added value with respect to large-scale genomic databases such as PATRIC (49) or MicroScope (50): ChlamDB greatly facilitates access to comprehensive annotations and comparative data meaningful to the Chlamydia and PVC research community, with an intuitive interface and a special focus on visual representations of comparative data. Easy access to precomputed homology searches and phylogenetic reconstructions will help researchers to investigate the function and evolutionary history of proteins encoded in PVC genomes. Annotations of proteins specific to intracellular life, such as predictions of type III secretion system effectors and identification of eukaryote-like domains, will also facilitate the identification of uncharacterized proteins that might be involved in chlamydia-host interactions.
Since the annotation of PVC genomes stored in GenBank is generally not up to date with the most recent research, ChlamDB could be extended to allow manual curation of annotations and tracking of protein annotation history. Indeed, successful examples of community-curated databases exist for major pathogens, such as the Pseudomonas Database (www.pseudomonas.com) (51). The inference of orthologous relationships could be used to propagate the annotation of characterized proteins to less studied members of the phylum.
ACKNOWLEDGEMENTS
We are grateful to Roland Sahli for his support and for providing the computational resources necessary for the development and setup of this database. We would like to thank Valentin Scherz for valuable discussions as well as Carole Kebbi-Beghdadi, Ludovic Pilloux, Silvia Ardissone, Sébastien Aeby, Marie de Barsy, Nicolas Jacquier, Firuza Bayramova, Aurélie Scherler, Alyce Taylor-Brown and all other members of the Center for Research on Intracellular Bacteria (CRIB) for their support and feedback during the development of ChlamDB.
FUNDING
This work results from the close interaction between the Laboratory of Genomics and Metagenomics and the Center for Research on Intracellular Bacteria, both led by Prof. Greub. Funding of research in genomics and on intracellular bacteria by Greub’s research team is supported by various grants, including grants from the Swiss National Science Foundation (Sinergia grant n° CRSII3-141837; SNSF n°3200BO-116445; SNSF n°310030-162603; SNSF n°10531C-170280), as well as by grants from Foundations (Jürg Tschopp award and Leenaards award).
Conflict of interest statement. None declared.
REFERENCES
- 1. Bachmann N.L., Polkinghorne A., Timms P. Chlamydia genomics: providing novel insights into chlamydial biology. Trends Microbiol. 2014; 22:464–472.
- 2. Leonard C.A., Borel N. Chronic chlamydial diseases: from atherosclerosis to urogenital infections. Curr. Clin. Microbiol. Rep. 2014; 1:61–72.
- 3. Rivas-Marín E., Devos D.P. The Paradigms They Are a-Changin’: past, present and future of PVC bacteria research. Antonie Van Leeuwenhoek. 2018; 111:785–799.
- 4. Jacquier N., Viollier P.H., Greub G. The role of peptidoglycan in chlamydial cell division: towards resolving the chlamydial anomaly. FEMS Microbiol. Rev. 2015; 39:262–275.
- 5. Jacquier N., Frandi A., Pillonel T., Viollier P., Greub G. Cell wall precursors are required to organize the chlamydial division septum. Nat. Commun. 2014; 5:3578.
- 6. Jeske O., Schüler M., Schumann P., Schneider A., Boedeker C., Jogler M., Bollschweiler D., Rohde M., Mayer C., Engelhardt H. et al. Planctomycetes do possess a peptidoglycan cell wall. Nat. Commun. 2015; 6:1–7.
- 7. van Teeseling M.C.F., Mesman R.J., Kuru E., Espaillat A., Cava F., Brun Y.V., VanNieuwenhze M.S., Kartal B., van Niftrik L. Anammox Planctomycetes have a peptidoglycan cell wall. Nat. Commun. 2015; 6:1–6.
- 8. Putman T., Hybiske K., Jow D., Afrasiabi C., Lelong S., Cano M.A., Stupp G.S., Waagmeester A., Good B.M., Wu C. et al. ChlamBase: a curated model organism database for the Chlamydia research community. Database. 2019; 2019:1–9.
- 9. Greub G., Raoult D. History of the ADP/ATP-translocase-encoding gene, a parasitism gene transferred from a Chlamydiales ancestor to plants 1 billion years ago. Appl. Environ. Microbiol. 2003; 69:5530–5535.
- 10. Bordin N., González-Sánchez J.C., Devos D.P. PVCbase: an integrated web resource for the PVC bacterial proteomes. Database. 2018; 2018:10–1.
- 11. Sayers E.W., Cavanaugh M., Clark K., Ostell J., Pruitt K.D., Karsch-Mizrachi I. GenBank. Nucleic Acids Res. 2019; 47:D94–D99.
- 12. Haft D.H., DiCuccio M., Badretdin A., Brover V., Chetvernin V., O’Neill K., Li W., Chitsaz F., Derbyshire M.K., Gonzales N.R. et al. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res. 2018; 46:D851–D860.
- 13. Derrien M., Collado M.C., Ben-Amor K., Salminen S., de Vos W.M. The Mucin degrader Akkermansia muciniphila is an abundant resident of the human intestinal tract. Appl. Environ. Microbiol. 2008; 74:1646–1648.
- 14. Galperin M.Y., Makarova K.S., Wolf Y.I., Koonin E.V. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res. 2015; 43:D261–D269.
- 15. Kanehisa M., Furumichi M., Tanabe M., Sato Y., Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017; 45:D353–D361.
- 16. Kanehisa M., Sato Y. KEGG Mapper for inferring cellular functions from protein sequences. Protein Sci. 2019; 1–8.
- 17. El-Gebali S., Mistry J., Bateman A., Eddy S.R., Luciani A., Potter S.C., Qureshi M., Richardson L.J., Salazar G.A., Smart A. et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2019; 47:D427–D432.
- 18. Mitchell A.L., Attwood T.K., Babbitt P.C., Blum M., Bork P., Bridge A., Brown S.D., Chang H.-Y., El-Gebali S., Fraser M.I. et al. InterPro in 2019: improving coverage, classification and access to protein sequence annotations. Nucleic Acids Res. 2019; 47:D351–D360.
- 19. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. BLAST+: architecture and applications. BMC Bioinformatics. 2009; 10:1–9.
- 20. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019; 47:D506–D515.
- 21. Saier M.H., Reddy V.S., Tsu B.V., Ahmed M.S., Li C., Moreno-Hagelsieb G. The Transporter Classification Database (TCDB): recent advances. Nucleic Acids Res. 2016; 44:D372–D379.
- 22. Szklarczyk D., Gable A.L., Lyon D., Junge A., Wyder S., Huerta-Cepas J., Simonovic M., Doncheva N.T., Morris J.H., Bork P. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019; 47:D607–D613.
- 23. Emms D.M., Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015; 16:1–14.
- 24. Szklarczyk D., Morris J.H., Cook H., Kuhn M., Wyder S., Simonovic M., Santos A., Doncheva N.T., Roth A., Bork P. et al. The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible. Nucleic Acids Res. 2017; 45:D362–D368.
- 25. Price M.N., Arkin A.P. PaperBLAST: Text Mining Papers for Information about Homologs. mSystems. 2017; 2:e00039-17.
- 26. Dandekar T., Snel B., Huynen M., Bork P. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci. 1998; 23:324–328.
- 27. Pellegrini M., Marcotte E.M., Thompson M.J., Eisenberg D., Yeates T.O. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl. Acad. Sci. U.S.A. 1999; 96:4285–4288.
- 28. Price M.N., Dehal P.S., Arkin A.P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One. 2010; 5:e9490.
- 29. Nogueira T., Touchon M., Rocha E.P.C. Rapid evolution of the sequences and gene repertoires of secreted proteins in bacteria. PLoS One. 2012; 7:e49403.
- 30. Dehoux P., Flores R., Dauga C., Zhong G., Subtil A. Multi-genome identification and characterization of chlamydiae-specific type III secretion substrates: the Inc proteins. BMC Genomics. 2011; 12:1–20.
- 31. Valdivia R.H. Chlamydia effector proteins and new insights into chlamydial cellular microbiology. Curr. Opin. Microbiol. 2008; 11:53–59.
- 32. Wang Y., Zhang Q., Sun M., Guo D. High-accuracy prediction of bacterial type III secreted effectors based on position-specific amino acid composition profiles. Bioinformatics. 2011; 27:777–784.
- 33. Jehl M.-A., Arnold R., Rattei T. Effective – a database of predicted secreted bacterial proteins. Nucleic Acids Res. 2011; 39:D591–D595.
- 34. Xue L., Tang B., Chen W., Luo J. DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence. Bioinformatics. 2019; 35:2051–2057.
- 35. Wang Y., Sun M., Bao H., White A.P. T3_MM: a Markov model effectively classifies bacterial type III secretion signals. PLoS One. 2013; 8:e58173.
- 36. Gimenez G., Bertelli C., Moliner C., Robert C., Raoult D., Fournier P.-E., Greub G. Insight into cross-talk between intra-amoebal pathogens. BMC Genomics. 2011; 12:1–14.
- 37. Ponting C.P., Aravind L., Schultz J., Bork P., Koonin E.V. Eukaryotic signalling domain homologues in archaea and bacteria. Ancient ancestry and horizontal gene transfer. J. Mol. Biol. 1999; 289:729–745.
- 38. Stephens R.S., Kalman S., Lammel C., Fan J., Marathe R., Aravind L., Mitchell W., Olinger L., Tatusov R.L., Zhao Q. et al. Genome sequence of an obligate intracellular pathogen of humans: Chlamydia trachomatis. Science. 1998; 282:754–759.
- 39. Read T.D., Brunham R.C., Shen C., Gill S.R., Heidelberg J.F., White O., Hickey E.K., Peterson J., Utterback T., Berry K. et al. Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res. 2000; 28:1397–1406.
- 40. Becker E., Hegemann J.H. All subtypes of the Pmp adhesin family are implicated in chlamydial virulence and show species-specific function. Microbiologyopen. 2014; 3:544–556.
- 41. Bardou P., Mariette J., Escudié F., Djemiel C., Klopp C. jvenn: an interactive Venn diagram viewer. BMC Bioinformatics. 2014; 15:1–7.
- 42. Franz M., Lopes C.T., Huck G., Dong Y., Sumer O., Bader G.D. Cytoscape.js: a graph theory library for visualisation and analysis. Bioinformatics. 2016; 32:309–311.
- 43. Gaudet P., Michel P.-A., Zahn-Zabal M., Britan A., Cusin I., Domagalski M., Duek P.D., Gateau A., Gleizes A., Hinard V. et al. The neXtProt knowledgebase on human proteins: 2017 update. Nucleic Acids Res. 2017; 45:D177–D182.
- 44. Pritchard L., White J.A., Birch P.R.J., Toth I.K. GenomeDiagram: a python package for the visualization of large-scale genomic data. Bioinformatics. 2006; 22:616–617.
- 45. Krzywinski M., Schein J., Birol İ., Connors J., Gascoyne R., Horsman D., Jones S.J., Marra M.A. Circos: an information aesthetic for comparative genomics. Genome Res. 2009; 19:1639–1645.
- 46. Huerta-Cepas J., Serra F., Bork P. ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Mol. Biol. Evol. 2016; 33:1635–1638.
- 47. R Core Team. R: A Language and Environment for Statistical Computing. 2016; Vienna: R Foundation for Statistical Computing; https://www.R-project.org/.
- 48. Wickham H. ggplot2: Elegant Graphics for Data Analysis. 2016; NY: Springer.
- 49. Wattam A.R., Davis J.J., Assaf R., Boisvert S., Brettin T., Bun C., Conrad N., Dietrich E.M., Disz T., Gabbard J.L. et al. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center. Nucleic Acids Res. 2017; 45:D535–D542.
- 50. Vallenet D., Calteau A., Cruveiller S., Gachet M., Lajus A., Josso A., Mercier J., Renaux A., Rollin J., Rouy Z. et al. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes. Nucleic Acids Res. 2017; 45:D517–D528.
- 51. Winsor G.L., Griffiths E.J., Lo R., Dhillon B.K., Shay J.A., Brinkman F.S.L. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database. Nucleic Acids Res. 2016; 44:D646–D653.