2020 Sep 9;33(4):e00053-19. doi: 10.1128/CMR.00053-19

TABLE 2.

Overview of some publicly and commercially available reference databases for bacterial identification using 16S rRNA gene sequence interpretation^a

| Database | DNA target(s) | No. of sequences | Curation | Alignment of clustered sequences | Link | Comment |
| --- | --- | --- | --- | --- | --- | --- |
| NCBI nt (GenBank, NCBI) | All | ≈21,000,000 | Limited | No | https://blast.ncbi.nlm.nih.gov/Blast.cgi (select the appropriate dataset in the menu to restrict and accelerate the search); for downloads, ftp://ftp.ncbi.nlm.nih.gov/blast/db/ | Hosts all published sequences; excellent coverage; frequent updates; many redundant entries; frequent erroneous entries; use for unusual or new species |
| Greengenes (consortium comprising Second Genome Inc., University of Colorado, and University of Queensland) | 16S rRNA genes | ≈1,200,000 | Yes; manual sequences >12,000 bp; taxonomy curation | Yes; sequence clusters at various similarity percentages | Searches, http://greengenes.lbl.gov/Download/Tutorial/Tutorial_19Dec05.pdf; downloads, https://greengenes.secondgenome.com/ | Includes several tools, from chromatogram analysis to alignments; latest version from 2013; update schedule unclear; some taxonomy information may be outdated |
| RDP (Michigan State University) | 16S rRNA | ≈3,200,000 | Yes; manual sequences >12,000 bp; taxonomy curation | Yes; aligned | Searches, https://rdp.cme.msu.edu/seqmatch/seqmatch_intro.jsp and https://rdp.cme.msu.edu/index.jsp | Updated manually and not regularly; last update was May 2015; various tools available to analyze user data further |
| SILVA (Max Planck Institute for Marine Microbiology) | | ≈5,000,000, includes small ribosomal subunit for eukaryotes | Yes; manual sequence quality; taxonomy curation | Yes; multiple cluster sets available | Search, http://www.arb-silva.de/aligner/; downloads, http://www.arb-silva.de/download/arb-files/ | Continually updated; tools available to analyze user data; genes other than 16S rRNA |
| Molzym SepsiTest | | ≈7,043 | Manual | No | | CE-IVD database; works with the kit but also with sequences generated otherwise |
| SmartGene IDNS Bacteria Module 3.9.x | 16S rRNA and rpoB | ≈800,000 16S, 358,000 centroid annotated | Yes; quality filters for sequence quality, centroid annotation for annotation qualification | Centroid annotation for most representative sequence per species | | CE-IVD; proprietary centroid annotation; quality filtered; continually updated; tools available to analyze user data; genes other than 16S rRNA |
| MicroSEQ 3.1 | 16S rRNA | 2,300 | Sequences of collection and type strains | No | | Compatible with the MicroSEQ sequencing kit of ThermoFisher; mainly for 500-bp sequencing |
^a Entries with gray shading represent publicly available databases, and those without gray shading represent commercially available databases. See reference 5.