Table 3.
Summary of the most widely used reference databases for metataxonomics and metagenomics analysis.
Database name | Reference | Description |
---|---|---|
Metataxonomics databases | ||
GreenGenes | [77] | 16S rRNA database from GenBank sequences, manually curated and modified by the user community. |
SILVA | [78] | Small and large rRNA subunits database including 16S rRNA sequences from the European Nucleotide Archive. |
The Ribosomal Database Project (RDP) | [79] | 16S rRNA taxonomically annotated sequence collection from the INSDC database. |
RefSeq Targeted Loci Project | (https://www.ncbi.nlm.nih.gov/refseq/targetedloci/) | BLAST specific marker gene databases for Bacteria (16S/23S) and Fungi (28S/18S) extracted and curated from GenBank sequences. |
Metagenomics databases | ||
nt/nr | [58] | Default database for BLAST sequence searches including RefSeq RNA and GenBank sequences. |
RefSeq | [75] | Non-redundant, NCBI-curated and annotated database based on GenBank sequences. |
GenBank | [76] | Main NCBI nucleotide database, holding the largest collection of complete and draft microbial genome sequences. |
Annotation and functional databases | ||
Kyoto Encyclopedia of Genes and Genomes (KEGG) | [81] | Manually curated set of 18 databases for annotating cellular and organism-level functions from nucleotide sequences. |
Integrated reference catalog of the human gut microbiome (IGC) | [86] | Gut-specific annotated microbial genes from KEGG functional databases. |
Comprehensive Antibiotic Resistance Database (CARD) | [82] | Bioinformatic resources and database for the annotation of antimicrobial resistance (AMR) genes and mutations from genomic sequences. |
DeepARG-DB | [87] | Antibiotic resistance gene (ARG) database generated by a deep-learning prediction algorithm trained on ARGs from other sequence collections. |
MEGARes | [83] | Hand-curated database containing AMR genes optimized for use with high-throughput sequencing data. |