. 2021 Mar 7;19:1497–1511. doi: 10.1016/j.csbj.2021.02.020

Table 3.

Summary of the most widely used reference databases for metataxonomics and metagenomics analysis.

| Database name | Reference | Description |
|---|---|---|
| Metataxonomics databases | | |
| GreenGenes | [77] | 16S rRNA database built from GenBank sequences, manually curated and modified by the user community. |
| SILVA | [78] | Database of small and large subunit rRNA sequences, including 16S rRNA sequences from the European Nucleotide Archive. |
| The Ribosomal Database Project (RDP) | [79] | Taxonomically annotated collection of 16S rRNA sequences from the INSDC database. |
| RefSeq Targeted Loci Project | https://www.ncbi.nlm.nih.gov/refseq/targetedloci/ | Marker-gene-specific BLAST databases for Bacteria (16S/23S) and Fungi (28S/18S), extracted and curated from GenBank sequences. |
| Metagenomics databases | | |
| nt/nr | [58] | Default database for BLAST sequence searches, including RefSeq RNA and GenBank sequences. |
| RefSeq | [75] | Non-redundant database, curated and annotated by NCBI, based on GenBank sequences. |
| GenBank | [76] | Main NCBI nucleotide database, with the largest collection of complete and draft microbial genome sequences. |
| Annotation and functional databases | | |
| Kyoto Encyclopedia of Genes and Genomes (KEGG) | [81] | Manually curated set of 18 databases for annotating cellular and organism-level functions from nucleotide sequences. |
| Integrated reference catalog of the human gut microbiome (IGC) | [86] | Gut-specific annotated microbial genes from KEGG functional databases. |
| Comprehensive Antibiotic Resistance Database (CARD) | [82] | Bioinformatic resources and database for annotating antimicrobial resistance (AMR) genes and mutations from genomic sequences. |
| DeepARG-DB | [87] | Antibiotic resistance gene (ARG) database generated by a deep-learning prediction algorithm trained on ARGs from other sequence collections. |
| MEGARes | [83] | Hand-curated database of AMR genes, optimized for use with high-throughput sequencing data. |
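
As a rough illustration of how the grouping in Table 3 might be used when setting up a pipeline, the sketch below encodes the table as a small lookup structure and lists the databases for a given analysis type. The names `ReferenceDatabase`, `TABLE_3`, and `databases_for` are hypothetical and chosen only for this example; the entries and category labels are taken from the table, and nothing here queries or downloads the databases themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceDatabase:
    """One row of Table 3 (hypothetical helper type for this sketch)."""
    name: str
    category: str
    reference: str


# Rows transcribed from Table 3; category labels are lower-cased table headings.
TABLE_3 = [
    ReferenceDatabase("GreenGenes", "metataxonomics", "[77]"),
    ReferenceDatabase("SILVA", "metataxonomics", "[78]"),
    ReferenceDatabase("RDP", "metataxonomics", "[79]"),
    ReferenceDatabase(
        "RefSeq Targeted Loci Project",
        "metataxonomics",
        "https://www.ncbi.nlm.nih.gov/refseq/targetedloci/",
    ),
    ReferenceDatabase("nt/nr", "metagenomics", "[58]"),
    ReferenceDatabase("RefSeq", "metagenomics", "[75]"),
    ReferenceDatabase("GenBank", "metagenomics", "[76]"),
    ReferenceDatabase("KEGG", "annotation and functional", "[81]"),
    ReferenceDatabase("IGC", "annotation and functional", "[86]"),
    ReferenceDatabase("CARD", "annotation and functional", "[82]"),
    ReferenceDatabase("DeepARG-DB", "annotation and functional", "[87]"),
    ReferenceDatabase("MEGARes", "annotation and functional", "[83]"),
]


def databases_for(category: str) -> list[str]:
    """Return the database names listed under a Table 3 category."""
    wanted = category.strip().lower()
    return [db.name for db in TABLE_3 if db.category == wanted]


if __name__ == "__main__":
    for label in ("metataxonomics", "metagenomics", "annotation and functional"):
        print(f"{label}: {', '.join(databases_for(label))}")
```

Keeping the category as a plain string mirrors the three headings of the table; an enumeration would be a reasonable alternative if the set of categories were fixed in a real project.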