2021 Dec 1;50(D1):D20–D26. doi: 10.1093/nar/gkab1112

Table 1. NCBI databases (as of 4 September 2021)

Database | Records | Description

Literature
PubMed | 33 027 761 | Scientific and medical abstracts/citations
PubMed Central | 7 325 415 | Full-text journal articles
NLM Catalog | 1 629 799 | Index of NLM collections
Bookshelf | 892 126 | Books and reports
MeSH | 348 370 | Ontology used for PubMed indexing

Genomes
Nucleotide | 476 054 019 | DNA and RNA sequences from GenBank and RefSeq
BioSample | 19 473 659 | Descriptions of biological source materials
SRA | 15 919 320 | High-throughput DNA/RNA sequence read archive
Taxonomy | 2 492 889 | Taxonomic classification and nomenclature catalog
Assembly | 1 083 900 | Genome assembly information
BioProject | 536 242 | Biological projects providing data to NCBI
Genome | 64 815 | Genome sequencing projects by organism
BioCollections | 8 468 | Museum, herbaria, and biorepository collections

Genes
GEO Profiles | 128 414 055 | Gene expression and molecular abundance profiles
Gene | 33 664 932 | Collected information about gene loci
GEO DataSets | 4 784 603 | Functional genomics studies
PopSet | 366 935 | Sequence sets from phylogenetic/population studies
HomoloGene | 141 268 | Homologous gene sets for selected organisms

Clinical
dbSNP | 1 076 992 604 | Short genetic variations
dbVar | 7 117 914 | Genome structural variation studies
ClinVar | 1 071 071 | Human variations of clinical significance
ClinicalTrials.gov | 388 717 | Registry of clinical studies and results database
MedGen | 335 277 | Medical genetics literature and links
GTR | 77 498 | Genetic testing registry
dbGaP | 1 405 | Genotype/phenotype interaction studies

Proteins
Protein | 968 236 913 | Protein sequences from GenBank and RefSeq
Identical Protein Groups | 448 096 579 | Protein sequences grouped by identity
Protein Clusters | 1 137 329 | Sequence similarity-based protein clusters
Structure | 181 772 | Experimentally-determined biomolecular structures
Protein Family Models | 179 133 | Conserved domain architectures, HMMs, and BlastRules
Conserved Domains | 62 852 | Conserved protein domains

Chemicals
PubChem Substance | 284 180 803 | Deposited substance and chemical information
PubChem Compound | 110 628 849 | Chemical information with structures, information and links
PubChem BioAssay | 1 391 308 | Bioactivity screening studies
BioSystems | 983 968 | Molecular pathways with links to genes, proteins and chemicals
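
The record counts in Table 1 use spaces as thousands separators (for example, 33 027 761), so they cannot be passed directly to most numeric parsers. The following is a minimal Python sketch of one way to convert them to integers for comparison. The helper name `parse_count` is hypothetical and not part of any NCBI tool; the three sample values are copied from the table.

```python
def parse_count(text: str) -> int:
    """Convert a space-grouped count such as '33 027 761' to an int.

    Hypothetical helper for illustration only.
    """
    # str.split() with no argument splits on any Unicode whitespace,
    # so regular and non-breaking spaces are both removed.
    digits = "".join(text.split())
    if not digits.isdecimal():
        raise ValueError(f"not a record count: {text!r}")
    return int(digits)


# A few record counts copied from Table 1.
counts = {
    "PubMed": parse_count("33 027 761"),
    "dbSNP": parse_count("1 076 992 604"),
    "dbGaP": parse_count("1 405"),
}

# Name of the database with the most records among the three sampled.
largest = max(counts, key=counts.get)
```

Rejecting non-numeric input with `ValueError` keeps category rows such as "Literature" from being silently read as counts.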