Genomes from bacteria associated with the canine oral cavity: A test case for automated genome-based taxonomic assignment

David A Coil; Guillaume Jospin; Aaron E Darling; Corrin Wallis; Ian J Davis; Stephen Harris; Jonathan A Eisen; Lucy J Holcombe; Ciaran O’Flynn

doi:10.1371/journal.pone.0214354

PLoS One. 2019 Jun 10;14(6):e0214354. doi: 10.1371/journal.pone.0214354

David A Coil ¹, Guillaume Jospin ¹, Aaron E Darling ², Corrin Wallis ³, Ian J Davis ³, Stephen Harris ³, Jonathan A Eisen ¹,⁴, Lucy J Holcombe ³, Ciaran O’Flynn ³,*

Editor: Gabriel Moreno-Hagelsieb⁵

¹Genome Center, University of California, Davis, CA, United States of America

²The Ithree Institute, University of Technology Sydney, Ultimo NSW, Australia

³The Waltham Centre for Pet Nutrition, Melton Mowbray, Leicestershire, United Kingdom

⁴Evolution and Ecology, Medical Microbiology and Immunology, University of California, Davis, Davis, CA, United States of America

⁵Wilfrid Laurier University, CANADA

Competing Interests: This work was funded by Mars Petcare UK, the employer of Corrin Wallis, Ian J. Davis, Stephen Harris, Lucy J. Holcombe and Ciaran O’Flynn. There are no products in development or marketed products to declare. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials.

* E-mail: ciaran.oflynn@effem.com

Roles

David A Coil: Formal analysis, Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Guillaume Jospin: Data curation, Formal analysis, Methodology, Software, Writing – review & editing

Aaron E Darling: Conceptualization, Data curation, Funding acquisition, Software, Validation, Writing – review & editing

Corrin Wallis: Conceptualization, Data curation, Funding acquisition, Investigation, Supervision, Writing – review & editing

Ian J Davis: Conceptualization, Data curation, Investigation, Supervision, Writing – review & editing

Stephen Harris: Conceptualization, Investigation, Supervision, Writing – review & editing

Jonathan A Eisen: Conceptualization, Formal analysis, Funding acquisition, Investigation, Project administration, Supervision, Writing – original draft, Writing – review & editing

Lucy J Holcombe: Conceptualization, Data curation, Formal analysis, Investigation, Project administration, Supervision, Writing – original draft, Writing – review & editing

Ciaran O’Flynn: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing

Gabriel Moreno-Hagelsieb: Editor

PMCID: PMC6557473 PMID: 31181071

Abstract

Taxonomy for bacterial isolates is commonly assigned via sequence analysis. However, the most common sequence-based approaches (e.g. 16S rRNA gene-based phylogeny or whole genome comparisons) are still labor intensive and subjective to varying degrees. Here we present a set of 33 bacterial genomes, isolated from the canine oral cavity. Taxonomy of these isolates was first assigned by PCR amplification of the 16S rRNA gene, Sanger sequencing, and taxonomy assignment using BLAST. After genome sequencing, taxonomy was revisited through a manual process using a combination of average nucleotide identity (ANI), concatenated marker gene phylogenies, and 16S rRNA gene phylogenies. This taxonomy was then compared to the automated taxonomic assignment given by the recently proposed Genome Taxonomy Database (GTDB). We found the results of all three methods to be similar (25 out of the 33 had matching genera), but the GTDB approach required fewer subjective decisions, and required far less labor. The primary differences in the non-identical taxonomic assignments involved cases where GTDB has proposed taxonomic revisions.

Introduction

With the ever-decreasing costs of DNA sequencing, it has become far easier and cheaper to sequence bacterial genomes than to analyze them. Understanding the gene content and metabolic pathways of a newly sequenced isolate is a time-consuming and knowledge-intensive process. Another, perhaps underappreciated, bottleneck is properly assigning taxonomy to a genome. This is most often seen with metagenome-assembled genomes (MAGs) which are unidentified prior to sequencing, but is even a problem with cultured isolates. Traditional morphological taxonomic assignment of bacterial isolates is tedious, and the more common approach of 16S rRNA gene PCR followed by Sanger sequencing is often uninformative beyond the genus level. Given the costs, some laboratories sequence isolate genomes directly, with no attempt at prior identification. While some have argued against the need for taxonomic assignment for many aspects of microbial genome analysis (e.g., [1][2]), there are many situations where taxonomy is considered valuable for making use of genomic information (e.g., see [3] and [4]).

There have been a number of proposed attempts to move to a genome-based taxonomy for bacteria and archaea, instead of relying on traditional chemotaxonomic/morphological characteristics of isolates [5][6][7]. These include the use of average nucleotide identity (ANI) [8][9][10][11], concatenated marker gene phylogenies (e.g. SILVA (unpublished) and GTDB [12]), and shared protein content [13]. Most of these approaches, however, rely on a provisional identification (at least to genus), followed by locating and downloading the genomes of close relatives for comparison.

In this work, we briefly describe the genome sequences of 33 bacterial isolates from the canine oral cavity. These isolates were collected as part of a larger project on canine oral health and had a preliminary taxonomy assigned through Sanger sequencing and a sequence identity comparison of the 16S rRNA gene [14,15]. After genome sequencing, we first assigned taxonomy to these isolates based on a manual examination of “whole genome” concatenated marker phylogenetic trees, average nucleotide identity (ANI), and 16S rRNA gene phylogenetic trees. We then compared this taxonomy to automated taxonomic assignments given by the recently proposed Genome Taxonomy Database (GTDB).

Genomes and taxonomy

Genome selection

The study design, bacterial isolation, DNA extraction, isolate identification and genome sequencing/assembly have been previously described in our work on the Porphyromonas genus [16]. Briefly, bacterial isolates from the canine oral cavity were grown on supplemented Columbia Blood Agar containing 5% defibrinated horse blood (CBA; Oxoid, UK) with or without the addition of 5 mg/L hemin (catalog no. H9039; Sigma) and 0.5 mg/L menadione (catalog no. M5625; Sigma) or Heart Infusion Agar containing 5% defibrinated horse blood (HIA; Oxoid, UK) (Table 1). Aerobes were incubated at 38°C under normal atmospheric conditions for 1–5 days. Microaerophilic and anaerobic strains were incubated at 38°C for 1–21 days in a MACS1000 anaerobic workstation (Don Whitley, UK) with gas levels at 5% oxygen, 10% carbon dioxide, and 85% nitrogen for microaerophiles, and 10% hydrogen, 10% carbon dioxide, and 80% nitrogen for anaerobes. Following DNA extraction, library preparation, and Illumina sequencing, the reads were assembled using the A5-miseq assembly pipeline [17].

Table 1. Comparative taxonomy of 33 bacterial strains by three different methods.

Isolate Number	Media	Aerobe/anaerobe	16S taxonomy (Sanger, BLAST)	Manual Taxonomy (S, A, or W)	GTDB Taxonomy	Comments	Accession
ATCC 29435	CBA H&M	Aerobe	Conchiformibius steedae_COT-280	Conchiformibius steedae (S)	Conchiformibius steedae	All species	RQYC00000000
OH1139	CBA	Microaerophile	Staphylococcus epidermidis	Staphylococcus epidermidis (A)	Staphylococcus epidermidis	All species	RQYG00000000
OH1877	CBA	Aerobe	Actinomyces hordeovulneris COT-415	Actinomyces hordeovulneris (A)	Actinomyces_D hordeovulneris	All species	RQYO00000000
OH2158	CBA H&M	Anaerobe	Campylobacter rectus	Campylobacter rectus (A)	Campylobacter_A rectus	All species	RQYP00000000
OH1206	HIA	Anaerobe	Bacillus licheniformis (100%)	Bacillus licheniformis (A)	Bacillus licheniformis	All species	RQYJ00000000
OH3297	CBA	Aerobe	Actinomyces hordeovulneris COT-415	Actinomyces hordeovulneris (A)	Actinomyces_D hordeovulneris	All species	RQYV00000000
OH4621	CBA	Aerobe	Streptococcus minor COT-116	Streptococcus minor (A)	Streptococcus minor	All species	RQZA00000000
OH953	CBA	Aerobe	Streptococcus sanguinis	Streptococcus sanguinis (A)	Streptococcus sanguinis	All species	RQZI00000000
OH1186	CBA H&M	Anaerobe	Desulfovibrio sp. COT-070	Desulfovibrio (S)	Desulfovibrio	All genus	RQYH00000000
OH1287	HIA	Aerobe	Leucobacter sp. COT-429	Leucobacter (S)	Leucobacter	All genus	RQYM00000000
OH2974	CBA	Aerobe	Leucobacter sp. COT-288	Leucobacter (S)	Leucobacter	All genus	RQYU00000000
OH3620	HIA	Anaerobe	Leptotrichia_COT-345	Leptotrichia (S)	Leptotrichia	All genus	RQYW00000000
OH937	HIA	Anaerobe	Prevotella sp. COT-195	Prevotella (S)	Prevotella	All genus	RQZH00000000
OH1220	CBA H&M	Microaerophile	Fretibacterium sp. COT-178	Fretibacterium (S)	Fretibacterium	All genus	RQYL00000000
OH1205	HIA	Anaerobe	Alloprevotella sp. COT-284	Alloprevotella (S)	F0040 [Alloprevotella]^*	All genus	RQYI00000000
OH2545	CBA	Aerobe	Ottowia sp. COT-014	Ottowia (W)	Ottowia	All genus	RQYR00000000
OH741	CBA H&M	Microaerophile	Erysipelotrichaceae [G-1] sp. COT-311	Erysipelotrichaceae (S)	Erysipelotrichaceae	All family	RQZE00000000
OH1209	CBA	Aerobe	Streptococcus sp. COT-279	Desulfovibrio (S)	Desulfovibrio	Mis-identification	RQYK00000000
OH4464	CBA H&M	Anaerobe	Propionibacterium sp. COT-324	Tessaracoccus (S)	Tessaracoccus	Related genus	RQYZ00000000
OH4692	CBA	Aerobe	Streptococcus pneumoniae COT-348	Streptococcus (S)	Streptococcus	16S species, manual & GTDB genus	RQZB00000000
OH2310	CBA	Aerobe	Xenophilus sp. COT-174	Lampropedia (W)	Lampropedia	Related genus	RQYQ00000000
OH3737	CBA	Aerobe	Xenophilus sp. COT-264	Lampropedia (W)	Lampropedia	Related genus	RQYX00000000
OH1047	CBA H&M	Anaerobe	Bacteroides heparinolyticus COT-310	Prevotella heparinolyticus (S)	Bacteroides	Official nomenclature change	RQYF00000000
OH1426	CBA H&M	Anaerobe	Tannerella forsythus COT-023	Tannerella forsythia (S)	Tannerella	GTDB genus, manual & 16S species	RQYN00000000
OH2617	CBA H&M	Anaerobe	Tannerella forsythus COT-023	Tannerella forsythia (S)	Tannerella	GTDB genus, manual & 16S species	RQYS00000000
OH4460	CBA H&M	Anaerobe	Fusobacterium canifelinum COT-188	Fusobacterium canifelinum (S)	Fusobacterium	GTDB genus, manual & 16S species	RQYY00000000
OH5050	CBA	Aerobe	Actinomyces bowdenii COT-413	Actinomyces bowdenii (S)	Actinomyces	GTDB genus, manual & 16S species	RQZC00000000
OH770	CBA	Aerobe	Actinomyces canis COT-409	Actinomyces canis (S)	Actinomyces_B	GTDB genus, manual & 16S species	RQZF00000000
OH5060	CBA H&M	Anaerobe	Fusobacterium sp. COT-236	Fusobacterium nucleatum (W)	Fusobacterium	GTDB & 16S genus, manual species	RQZD00000000
OH887	CBA H&M	Anaerobe	Propionibacterium sp. COT-365	Propionibacterium propionicum (S)	Pseudopropionibacterium	GTDB proposed taxonomic change	RQZG00000000
6824 SB	HIA	Anaerobe	Lachnospiraceae XIVa [G-6] sp. COT-073	Clostridiales (S)	Lachnospirales	GTDB proposed taxonomic change	RQYD00000000
OH1046	CBA H&M	Anaerobe	Atopobium parvulum	Coriobacterineae (W)	Atopobiaceae	GTDB proposed taxonomic change	RQYE00000000
OH2822	CBA H&M	Anaerobe	Propionibacterium sp. COT-296	Propionibacterium propionicum (S)	Pseudopropionibacterium	GTDB proposed taxonomic change	RQYT00000000

CBA: Columbia Blood Agar, HIA: Heart Infusion Agar, H&M: Hemin and Menadione.

*"F0040" is actually a strain name, used as a placeholder by GTDB when the authors believe that a genome belongs in a new genus for which no type representative is present. In the manual taxonomy column, (S) marks strains whose taxonomy was assigned using 16S rRNA gene trees, (A) those for which ANI was used, and (W) those for which a WGS tree was used. The comments column describes how the results of the three methods agree or differ.

The remaining non-Porphyromonas genomes were further screened by a combination of assembly metrics and CheckM [18] to estimate completeness and contamination. We chose for further study those that appeared to be the highest quality using admittedly somewhat arbitrary cutoffs of (a) fewer than 350 contigs in the assembly, (b) CheckM contamination score of <3%, and (c) CheckM completeness score of >90%. A subset of 33 genomes meeting these criteria was chosen to study in more detail.
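The three cutoffs above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `AssemblyQC` record, the function name, and the example values are hypothetical and are not taken from the study; in practice the contig counts and CheckM scores would be read from the assembler and CheckM outputs.

```python
from dataclasses import dataclass


@dataclass
class AssemblyQC:
    """Hypothetical per-genome summary of assembly and CheckM metrics."""
    name: str
    contigs: int            # number of contigs in the assembly
    completeness: float     # CheckM completeness, in percent
    contamination: float    # CheckM contamination, in percent


def passes_cutoffs(qc, max_contigs=350, max_contamination=3.0,
                   min_completeness=90.0):
    """Apply the cutoffs from the text: <350 contigs, <3% contamination,
    >90% completeness. All three comparisons are strict."""
    return (qc.contigs < max_contigs
            and qc.contamination < max_contamination
            and qc.completeness > min_completeness)


# Made-up example values, for illustration only.
candidates = [
    AssemblyQC("isolate_A", contigs=120, completeness=99.1, contamination=0.5),
    AssemblyQC("isolate_B", contigs=410, completeness=98.0, contamination=0.2),
    AssemblyQC("isolate_C", contigs=200, completeness=88.5, contamination=1.0),
    AssemblyQC("isolate_D", contigs=90, completeness=95.0, contamination=3.4),
]

selected = [qc.name for qc in candidates if passes_cutoffs(qc)]
print(selected)
```

Only `isolate_A` satisfies all three conditions here; each of the others fails exactly one cutoff.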

Preliminary taxonomic identification

All isolates were given a preliminary identification based on analysis of 16S rRNA gene sequences generated via Sanger sequencing (F24/Y36 (9-29F/1525-1541R) primers). The 16S rRNA gene sequences were searched using BLAST [19] against an in-house canine oral microbiome database. This database contained 460 published 16S rRNA gene sequences obtained from canine oral taxa (GenBank accession numbers JN713151-JN713566 & KF030193-KF030235) [15]. In addition, sequences were searched against the RDP 16S rRNA database v10_31 [20]. Taxonomic identifications were made using percent identity cutoffs of 98.5% (for species), 94% (for genus), and 92% (for family).
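The identity cutoffs can be read as a small lookup from percent identity to the deepest rank that may be assigned. The sketch below is illustrative only; the function name is hypothetical, and treating each cutoff as inclusive (greater than or equal to) is an assumption, since the text does not say whether the boundaries are inclusive.

```python
def rank_from_identity(percent_identity):
    """Return the deepest taxonomic rank supported by a 16S percent identity.

    Cutoffs are those given in the text: 98.5% (species), 94% (genus),
    92% (family). Inclusive boundaries are an assumption of this sketch.
    """
    cutoffs = [(98.5, "species"), (94.0, "genus"), (92.0, "family")]
    for threshold, rank in cutoffs:
        if percent_identity >= threshold:
            return rank
    return None  # below the family cutoff: no assignment


for identity in (99.2, 96.0, 92.5, 90.0):
    print(identity, rank_from_identity(identity))
```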

Traditional taxonomic identification

The contigs comprising the genomes were first uploaded to the RAST server [21] for annotation. The results archives were downloaded using the RAST API and then searched for full-length 16S rRNA gene sequences by scanning the annotations using the “SSU rRNA” tag and filtering for length greater than 1 kb. These sequences were uploaded to the Ribosomal Database Project (RDP) [20] and incorporated into their alignment. For first-pass taxonomic identification, an alignment and phylogenetic tree were generated using SSU-ALIGN [22] and FastTree 2 [23] of all ~12,000 type strain 16S rRNA gene sequences from RDP, along with our 33 sequences from RAST. This provided us with the general region of the tree for each isolate or group of isolates.
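The tag-and-length filter described above might look roughly like the following. This is a sketch, not the procedure actually used: it assumes the annotated features have already been exported as FASTA records whose header lines contain the “SSU rRNA” label, which is an assumption about the export format rather than something stated in the text. The function names and the toy records are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def full_length_ssu(records, tag="SSU rRNA", min_length=1000):
    """Keep records carrying the tag whose sequence is longer than 1 kb."""
    return [(h, s) for h, s in records if tag in h and len(s) > min_length]


# Toy input: one long tagged record, one short tagged record, one untagged.
example = [
    ">feature_1 SSU rRNA", "ACGT" * 380,
    ">feature_2 SSU rRNA", "ACGT" * 100,
    ">feature_3 LSU rRNA", "ACGT" * 700,
]
kept = full_length_ssu(read_fasta(example))
print([(header, len(seq)) for header, seq in kept])
```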

Next, we generated a series of concatenated “whole genome sequence” (WGS) marker trees for each isolate or group of isolates as described below. All available genome sequences from the genus or family of interest were downloaded from GenBank, except in cases where more than 500 genomes were available. In those genome-rich genera (Bacillus, Clostridium, Streptococcus, Staphylococcus, Campylobacter), the WGS trees were built with only a subset of genomes, selected from the type strain results. For example, Staphylococcus contains ~40 species, but only genomes from those species closest (by eye in the tree) to our isolate of interest were needed to build a useful WGS tree.

For all WGS trees, an outgroup genome(s) was chosen from nearby in the NCBI taxonomy (e.g. another genus in the same family). The file names and sequences were reformatted for easier visualization. The assemblies were then screened for 37 core marker genes [24] using PhyloSift [25] in the search and align mode using “isolate” and “besthit” flags. PhyloSift concatenates and aligns the hits of interest; the sequences were then extracted from the PhyloSift output files and added to a single file for tree-building. An approximately maximum-likelihood tree was then inferred using FastTree2 with default parameters [23].

For all isolates where the WGS tree indicated a possible placement into a well-defined clade with sequenced genomes, we downloaded the type strain genome sequences for every member of that genus/family from RefSeq at NCBI. These were used to create an ANI (average nucleotide identity) matrix using FastANI [26] for each group, and any isolate having an ANI >95% to a type strain was considered to belong to that species [8][27].
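As a rough sketch of the ANI rule, the snippet below scans FastANI-style tab-delimited output (query, reference, ANI, mapped fragments, total fragments) and keeps, for each query, the best type strain hit above 95%. The file names, the `type_strain_species` mapping, and the numeric values are made up for illustration; taking the single highest-scoring type strain when several exceed the threshold is also an assumption of this sketch.

```python
def best_species_hit(fastani_lines, type_strain_species, threshold=95.0):
    """Map each query genome to (species, ANI) for its best type-strain hit.

    Only hits with ANI strictly above the threshold are considered, and
    only references listed in type_strain_species count as type strains.
    """
    best = {}
    for line in fastani_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        query, reference, ani = fields[0], fields[1], float(fields[2])
        if reference not in type_strain_species:
            continue
        current_best = best.get(query, (None, threshold))[1]
        if ani > current_best:
            best[query] = (type_strain_species[reference], ani)
    return best


# Hypothetical reference mapping and illustrative (not measured) values.
type_strain_species = {
    "A_hordeovulneris_type.fasta": "Actinomyces hordeovulneris",
    "A_other_type.fasta": "Actinomyces sp. (other type strain)",
}
fastani_lines = [
    "OH3297.fasta\tA_hordeovulneris_type.fasta\t99.0\t900\t950",
    "OH3297.fasta\tA_other_type.fasta\t80.0\t300\t950",
    "OH770.fasta\tA_other_type.fasta\t82.5\t350\t980",
]
print(best_species_hit(fastani_lines, type_strain_species))
```

Queries with no type strain above the threshold, such as the second isolate in this toy input, are simply absent from the result and would fall through to the tree-based steps.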

For the majority of taxa, the WGS tree and ANI matrix were still inadequate to assign species-level taxonomy, due to a paucity of genome sequences for many groups. In those cases, we relied solely on the 16S rRNA gene for taxonomy, but using phylogeny instead of just sequence identity as in the first pass. This was again accomplished through RDP, by downloading all sequences for a given genus/family, inferring a phylogenetic tree, and looking for well-supported placement into a monophyletic clade. For all groups with fewer than 3000 sequences, the alignment was downloaded directly from RDP; for larger groups, the sequences were downloaded and the alignment generated with SSU-ALIGN. All alignments were cleaned to remove problematic characters in the headers using a custom script [28] and all trees were inferred using FastTree2 with default parameters.

Final taxonomic assignments were based on a taxa-dependent combination of the WGS trees, the ANI results, and the 16S rRNA gene trees (Table 1). First priority was given to the ANI results, then to placement within a well-supported WGS tree with monophyletic clades, and then finally to 16S rRNA gene-based results. As a result of inadequate mapping between phylogeny and taxonomy, some isolates were only assigned to the genus or family level, and in one case the order level. Examples of both informative and non-informative WGS/16S trees, along with a sample ANI matrix, can be found in the Supplemental Materials. Of the remaining WGS/16S trees and ANI results, those that were informative in determining taxonomy can be found at (doi: 10.25338/B8801B).
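The priority order described above (ANI first, then a well-supported WGS tree, then the 16S rRNA gene tree) can be written as a short fall-through. The function and argument names are hypothetical; each argument stands for the label that method produced, or `None` when that method was uninformative. The returned letter follows the (A)/(W)/(S) convention of Table 1.

```python
def final_assignment(ani_label=None, wgs_label=None, ssu_label=None):
    """Return (label, method code) from the highest-priority informative method.

    Priority: ANI ("A"), then WGS tree ("W"), then 16S rRNA gene tree ("S").
    """
    for label, method in ((ani_label, "A"), (wgs_label, "W"), (ssu_label, "S")):
        if label:
            return label, method
    return None, None


# ANI was informative, so it wins even though the trees also gave a label.
print(final_assignment(ani_label="Staphylococcus epidermidis",
                       wgs_label="Staphylococcus",
                       ssu_label="Staphylococcus"))
# Only the 16S tree was informative.
print(final_assignment(ssu_label="Leucobacter"))
```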

Genome taxonomy database

The Genome Taxonomy Database (GTDB) is a recently proposed system attempting to create a genomic-based, standardized bacterial taxonomy [12]. The system uses 120 single-copy conserved marker genes to generate a phylogenetic tree of 94,759 genomes, and analysis of the topology of this tree was then used to propose large-scale revisions to bacterial taxonomy. During our work on this project, a tool based on GTDB was released: “GTDB-Tk” [29] attempts to automate taxonomic assignment for genomes, based on the revised GTDB taxonomy. The tool uses a combination of phylogenetic tree topology, ANI values, and relative evolutionary divergence (RED). We screened our 33 isolates using this tool and compared the taxonomic assignments to the manual process above (Table 1). Note that an alphabetic suffix in the GTDB taxonomy (e.g. “Actinomyces_B”) indicates that, within the GTDB tree, the genus does not belong to a monophyletic group with the type species of that genus.
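When comparing GTDB genus names against names from other sources, it can help to separate the base name from the alphabetic suffix. The helper below is a hypothetical sketch: the regular expression assumes the suffix is an underscore followed by one or more capital letters, as in the examples in Table 1, and is not taken from the GTDB-Tk code.

```python
import re

# Assumed pattern: base name, underscore, capital-letter suffix (e.g. "Actinomyces_B").
_SUFFIXED_GENUS = re.compile(r"^(?P<base>[A-Za-z0-9]+)_(?P<suffix>[A-Z]+)$")


def split_gtdb_genus(genus):
    """Split a GTDB genus label into (base name, suffix or None)."""
    match = _SUFFIXED_GENUS.match(genus)
    if match:
        return match.group("base"), match.group("suffix")
    return genus, None


def same_base_genus(gtdb_genus, other_genus):
    """True when the names agree once any GTDB suffix is ignored."""
    return split_gtdb_genus(gtdb_genus)[0] == other_genus


print(split_gtdb_genus("Actinomyces_B"))
print(split_gtdb_genus("Streptococcus"))
print(same_base_genus("Campylobacter_A", "Campylobacter"))
```

Ignoring the suffix is a convenience for name matching only; as noted above, the suffix itself signals that GTDB does not place the group with the type species of the genus.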

Discussion

Here we present the genomes of 33 bacterial isolates from the canine oral cavity. Some belong to groups with members known to be involved in human and canine oral health (e.g. Actinomyces and Fusobacterium), whereas others have not previously been suggested to play such a role. A few of our isolates appear to be phylogenetically novel and potentially of interest for further study. For example, Canine Oral Taxon number 073 (COT073) and bacterial isolate number OH741 were identified only to the order and family level, respectively.

The difference in labor required by the two genomic methods of taxonomic assignment was noticeable, with the manual method having taken two weeks of daily downloads, alignments, ANI comparisons and tree-building whereas the GTDB automated method ran overnight. The latter also is much less subjective (for the user) than the former. A comparison of the three approaches to taxonomic assignments shows a very high degree of similarity (Table 1).

For 51.5% (17/33) of isolates the three methods gave identical results; for 18.2% (6/33) of isolates the 16S and manual methods gave the same results; and in one case the 16S and GTDB methods gave the same results. Finally, for 12.1% (4/33) of isolates all three methods gave different results from each other, i.e. different genera within the same family or different levels of assignment (granularity/resolution) within the same branch of the tree. Most of the differences between the GTDB taxonomy and the other approaches appear to be due to proposed taxonomic revisions by the GTDB group. For example, COT073 was placed within the Clostridiales order in the manual taxonomy, but that order was subdivided into several new groups in the GTDB taxonomy, based on calculations of relative evolutionary divergence (RED) within that group.
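The proportions quoted above follow directly from the counts. The short calculation below recomputes them from the tallies given in the text; the dictionary keys are descriptive labels chosen for this sketch. The final line shows how many of the 33 isolates are not covered by the four categories listed in that sentence.

```python
TOTAL_ISOLATES = 33

# Counts as stated in the text.
tallies = {
    "all three methods identical": 17,
    "16S and manual agree": 6,
    "16S and GTDB agree": 1,
    "all three differ": 4,
}

percentages = {
    label: round(100 * count / TOTAL_ISOLATES, 1)
    for label, count in tallies.items()
}
for label, pct in percentages.items():
    print(f"{label}: {tallies[label]}/{TOTAL_ISOLATES} = {pct}%")

# Isolates not itemized in the four categories above.
remainder = TOTAL_ISOLATES - sum(tallies.values())
print("not itemized:", remainder)
```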

Our results suggest that the use of the automated GTDB tool to assign taxonomy to unidentified bacterial isolates is less subjective and much faster than manual assignment, while giving very similar results (e.g. identical taxonomy, difference only in taxonomic rank, or proposed taxonomic changes). Both genomic methods offer improved taxonomic resolution relative to analysis of just 16S rRNA gene sequences, at the additional cost/burden of requiring the entire genome. Finally, we expect these genomes will be of use to researchers studying canine oral health since the vast majority of closely related isolates so far sequenced have been collected from humans.

Accession numbers

The whole-genome shotgun projects have been deposited at DDBJ/ENA/GenBank under the accession numbers RQYC00000000-RQZI00000000. The versions described in this paper are versions RQYC01000000-RQZI01000000. These strains were submitted to NCBI using our “manual taxonomy” assignments, since the GTDB taxonomy is not yet widely accepted. However, for some of the isolates, NCBI made their own minor taxonomic revisions.

Supporting information

S1 Fig. A taxonomically inconclusive WGS tree for isolate OH770.

The isolate is not found within a clade of other sequenced isolates with species-level taxonomy.

(PDF)

S2 Fig. A taxonomically inconclusive 16S rRNA gene tree for isolate OH1287.

Taxonomy is not congruent with phylogeny and many neighboring sequences are only identified to the genus level.

(PDF)

S3 Fig. A taxonomically informative 16S rRNA gene tree for isolate OH5050.

The isolate is found in a monophyletic clade and the name given to the closest relatives is not found elsewhere in the tree.

(PDF)

S1 Table. A sample ANI result for isolate OH3297.

The isolate is ~99% identical to both OH1877 and to Actinomyces hordeovulneris. The ANI drops off quite rapidly to other members of the genus.

(DOCX)

Acknowledgments

The authors would like to acknowledge the laboratory staff at WALTHAM Centre for Pet Nutrition for growing the bacterial isolates, extracting genomic DNA, and for initial species identification, and Zoe Lonsdale for figure creation and bioinformatics support. The authors would also like to acknowledge John Zhang from UC Davis for the sequencing library preparation of isolates. The sequencing was carried out by the DNA Technologies and Expression Analysis Cores at the UC Davis Genome Center, supported by NIH Shared Instrumentation Grant 1S10OD010786-01.

Data Availability

Funding Statement

This work was funded by Mars Petcare UK. The WALTHAM Centre for Pet Nutrition is the fundamental research center for Mars Petcare; it employs CW, ID, SH, LH and COF. WALTHAM was involved at all levels of this research including study design, data collection and analysis, decision to publish, and preparation of the manuscript. However, there are no conflicts of interest; this project represents an unbiased view (in the commercial sense) of the bioinformatic methods used for taxonomic assignment of genome data. There were no external funding sources for this study.

References

1. Sedlar K, Kupkova K, Provaznik I. Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics. Comput Struct Biotechnol J. 2017;15: 48–55. doi:10.1016/j.csbj.2016.11.005
2. Sun Y, Cai Y, Huse SM, Knight R, Farmerie WG, Wang X, et al. A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis. Brief Bioinform. 2012;13: 107–121. doi:10.1093/bib/bbr009
3. Chun J, Rainey FA. Integrating genomics into the taxonomy and systematics of the Bacteria and Archaea. Int J Syst Evol Microbiol. 2014;64: 316–324. doi:10.1099/ijs.0.054171-0
4. Chun J, Oren A, Ventosa A, Christensen H, Arahal DR, da Costa MS, et al. Proposed minimal standards for the use of genome data for the taxonomy of prokaryotes. Int J Syst Evol Microbiol. 2018;68: 461–466. doi:10.1099/ijsem.0.002516
5. Sentausa E, Fournier P-E. Advantages and limitations of genomics in prokaryotic taxonomy. Clin Microbiol Infect. 2013;19: 790–795. doi:10.1111/1469-0691.12181
6. Garrity GM. A New Genomics-Driven Taxonomy of Bacteria and Archaea: Are We There Yet? J Clin Microbiol. 2016;54: 1956–1963. doi:10.1128/JCM.00200-16
7. Klenk H-P, Göker M. En route to a genome-based classification of Archaea and Bacteria? Syst Appl Microbiol. 2010;33: 175–182. doi:10.1016/j.syapm.2010.03.003
8. Chan JZ-M, Halachev MR, Loman NJ, Constantinidou C, Pallen MJ. Defining bacterial species in the genomic era: insights from the genus Acinetobacter. BMC Microbiol. 2012;12: 302. doi:10.1186/1471-2180-12-302
9. Zhang W, Du P, Zheng H, Yu W, Wan L, Chen C. Whole-genome sequence comparison as a method for improving bacterial species definition. J Gen Appl Microbiol. 2014;60: 75–78.
10. Kim M, Oh H-S, Park S-C, Chun J. Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int J Syst Evol Microbiol. 2014;64: 346–351. doi:10.1099/ijs.0.059774-0
11. Varghese NJ, Mukherjee S, Ivanova N, Konstantinidis KT, Mavrommatis K, Kyrpides NC, et al. Microbial species delineation using whole genome sequences. Nucleic Acids Res. 2015;43: 6761–6771. doi:10.1093/nar/gkv657
12. Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil P-A, et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol. 2018;36: 996–1004. doi:10.1038/nbt.4229
13. Qin Q-L, Xie B-B, Zhang X-Y, Chen X-L, Zhou B-C, Zhou J, et al. A proposed genus boundary for the prokaryotes based on genomic insights. J Bacteriol. 2014;196: 2210–2215. doi:10.1128/JB.01688-14
14. Dewhirst FE, Klein EA, Thompson EC, Blanton JM, Chen T, Milella L, et al. The canine oral microbiome. PLoS One. 2012;7: e36067. doi:10.1371/journal.pone.0036067
15. Davis IJ, Wallis C, Deusch O, Colyer A, Milella L, Loman N, et al. A cross-sectional survey of bacterial species in plaque from client owned dogs with healthy gingiva, gingivitis or mild periodontitis. PLoS One. 2013;8: e83158. doi:10.1371/journal.pone.0083158
16. Coil DA, Alexiev A, Wallis C, O’Flynn C, Deusch O, Davis I, et al. Draft genome sequences of 26 porphyromonas strains isolated from the canine oral microbiome. Genome Announc. 2015;3. doi:10.1128/genomeA.00187-15
17. Coil D, Jospin G, Darling AE. A5-miseq: an updated pipeline to assemble microbial genomes from Illumina MiSeq data. Bioinformatics. 2015;31: 587–589. doi:10.1093/bioinformatics/btu661
18. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043–1055. doi:10.1101/gr.186072.114
19. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215: 403–410. doi:10.1016/S0022-2836(05)80360-2
20. Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM, Sun Y, et al. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res. 2014;42: D633–42. doi:10.1093/nar/gkt1244
21. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
22. Nawrocki EP. Structural RNA homology search and alignment using covariance models [Internet]. Eddy SR, editor. Washington University in St. Louis; 2009. Available: https://search.proquest.com/docview/305017419
23. Price MN, Dehal PS, Arkin AP. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5: e9490. doi:10.1371/journal.pone.0009490
24. Wu D, Jospin G, Eisen JA. Systematic Identification of Gene Families for Use as “Markers” for Phylogenetic and Phylogeny-Driven Ecological Studies of Bacteria and Archaea and Their Major Subgroups. PLoS One. Public Library of Science; 2013;8: e77033. doi:10.1371/journal.pone.0077033
25. Darling AE, Jospin G, Lowe E, Matsen FA 4th, Bik HM, Eisen JA. PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ. 2014;2: e243. doi:10.7717/peerj.243
26. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9: 5114. doi:10.1038/s41467-018-07641-9
27. Richter M, Rosselló-Móra R. Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci U S A. 2009;106: 19126–19131. doi:10.1073/pnas.0906412106
28. Dunitz MI, Lang JM, Jospin G, Darling AE, Eisen JA, Coil DA. Swabs to genomes: a comprehensive workflow. PeerJ. 2015;3: e960. doi:10.7717/peerj.960
29. Ecogenomics. Ecogenomics/GtdbTk. In: GitHub [Internet]. [cited 24 May 2018]. Available: https://github.com/Ecogenomics/GtdbTk

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 Fig. A taxonomically inconclusive WGS tree for isolate OH770.

The isolate is not found within a clade of other sequenced isolates with species-level taxonomy.

(PDF)

Click here for additional data file.^{(42.8KB, PDF)}

S2 Fig. A taxonomically inconclusive 16S rRNA gene tree for isolate OH1287.

Taxonomy is not congruent with phylogeny and many neighboring sequences are only identified to the genus level.

(PDF)

Click here for additional data file.^{(32.8KB, PDF)}

S3 Fig. A taxonomically informative 16S rRNA gene tree for isolate OH5050.

The isolate is found in a monophyletic clade and the name given to the closest relatives is not found elsewhere in the tree.

(PDF)

Click here for additional data file.^{(35KB, PDF)}

S1 Table. A sample ANI result for isolate OH3297.

The isolate is ~99% identical to both OH1877 and to Actinomyces hordeovulneris. The ANI drops off quite rapidly to other members of the genus.

(DOCX)

Click here for additional data file.^{(13.4KB, docx)}

