2020 Oct 23;49(D1):D10–D17. doi: 10.1093/nar/gkaa892

Table 1.

The Entrez Databases (as of 9 September 2020)

| Database | Records | Description |
|---|---|---|
| Literature | | |
| PubMed | 31 471 600 | scientific and medical abstracts/citations |
| PubMed Central | 6 447 271 | full-text journal articles |
| NLM catalog | 1 619 856 | index of NLM collections |
| Books | 825 385 | books and reports |
| MeSH | 300 500 | ontology used for PubMed indexing |
| Genomes | | |
| Nucleotide | 429 731 711 | DNA and RNA sequences |
| BioSample | 14 628 076 | descriptions of biological source materials |
| SRA | 11 807 161 | high-throughput DNA and RNA sequence read archive |
| Taxonomy | 2 401 136 | taxonomic classification and nomenclature catalog |
| Assembly | 837 406 | genome assembly information |
| BioProject | 458 893 | biological projects providing data to NCBI |
| Genome | 55 580 | genome sequencing projects by organism |
| BioCollections | 8 138 | museum, herbaria and other biorepository collections |
| Genes | | |
| GEO Profiles | 128 414 055 | gene expression and molecular abundance profiles |
| Gene | 28 377 759 | collected information about gene loci |
| GEO datasets | 4 002 373 | functional genomics studies |
| PopSet | 350 627 | sequence sets from phylogenetic and population studies |
| HomoloGene | 141 268 | homologous gene sets for selected organisms |
| Genetics | | |
| SNP | 720 643 623 | short genetic variations |
| dbVar | 6 030 887 | genome structural variation studies |
| ClinVar | 845 008 | human variations of clinical significance |
| MedGen | 335 277 | medical genetics literature and links |
| GTR | 76 814 | genetic testing registry |
| dbGaP | 1 397 | genotype/phenotype interaction studies |
| Proteins | | |
| Protein | 874 272 642 | protein sequences |
| Identical protein groups | 329 946 078 | protein sequences grouped by identity |
| Protein clusters | 1 137 329 | sequence similarity-based protein clusters |
| Structure | 167 650 | experimentally-determined biomolecular structures |
| Sparcle | 149 462 | conserved domain architectures |
| Conserved domains | 59 951 | conserved protein domains |
| Chemicals | | |
| PubChem substance | 285 048 146 | deposited substance and chemical information |
| PubChem compound | 111 325 418 | chemical information with structures, information and links |
| PubChem BioAssay | 1 229 071 | bioactivity screening studies |
| BioSystems | 983 968 | molecular pathways with links to genes, proteins and chemicals |
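
The record counts in Table 1 are written with spaces as thousands separators (for example, 31 471 600), which most numeric parsers will not accept directly. The following is a minimal sketch of how such counts might be converted to integers, assuming each row is available as plain text in the order name, count, description, separated by whitespace. The names `parse_count`, `parse_row`, and `ROW` are hypothetical helpers introduced here for illustration; they are not part of any NCBI tool.

```python
import re

# Assumed row layout: "<name> <space-grouped count> <description>".
# The name is matched lazily so that multi-word names such as
# "PubMed Central" end where the first digit group begins.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<count>\d{1,3}(?: \d{3})*)\s+(?P<desc>.+)$"
)


def parse_count(text):
    """Convert a space-grouped count such as '31 471 600' to an int."""
    return int("".join(text.split()))


def parse_row(line):
    """Split one table row into (name, count, description).

    Returns None for lines without a count, such as the category
    rows 'Literature' or 'Genomes'.
    """
    match = ROW.match(line.strip())
    if match is None:
        return None
    return (
        match.group("name"),
        parse_count(match.group("count")),
        match.group("desc"),
    )


if __name__ == "__main__":
    # Example rows copied from Table 1.
    for row in (
        "Literature",
        "PubMed Central 6 447 271 full-text journal articles",
        "dbGaP 1 397 genotype/phenotype interaction studies",
    ):
        print(parse_row(row))
```

This sketch relies on no database name or description in the table starting with a digit group; rows that break that assumption would need a stricter delimiter, such as tabs or explicit column separators.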