ISME J. 2011 Jun 30;6(1):94–103. doi: 10.1038/ismej.2011.82

Table 1. The training sets used for naïve Bayesian classification of bacterial 16S rRNA sequences.

| Training set | Abbreviation | Sequence database | Taxonomy mapping |
|---|---|---|---|
| RDP Training Set 6 | RDP TS6 | 8422 sequences (Cole et al., 2009)^a | Based on Bergey's taxonomy |
| SILVA bacteria subset distributed for Mothur | SILVA Subset | 14 956 bacterial sequences selected from an export of the SILVA database^b,c | SILVA taxonomy |
| Reduced SILVA subset, comparable in size to RDP TS6 | SILVA98.1 | 8572 bacterial sequences, >1.9% unique, from the SILVA subset | SILVA taxonomy |
| Greengenes bacteria subset of 99% similar sequences | GG99 | 127 741 bacterial sequences, >1% unique, from the Greengenes database^d | Greengenes taxonomy |
| Reduced Greengenes training set, comparable in size to RDP TS6 | GG91.3 | 8275 bacterial sequences, >8.7% unique, from the full Greengenes database | Greengenes taxonomy |