2014 Oct 9;2014:bau100. doi: 10.1093/database/bau100

Table 2.

Data integration database

| Name | Records | Description | Source |
| --- | --- | --- | --- |
| Chemical Entities of Biological Interest (ChEBI) | 38 580 | The database of chemical entities of biological interest | www.ebi.ac.uk/chebi |
| ChEBI ontology | 29 974 | | |
| Enzyme | 5 418 | Enzyme nomenclature | ca.expasy.org/enzyme |
| Gene | 8 927 911 | NCBI gene database | www.ncbi.nlm.nih.gov/gene |
| Functional association data/networks (GeneMania) | 21 084 | Gene associations database | genemania.org |
| GO | 34 940 | Gene ontology database | www.geneontology.org |
| GOA | 11 300 749 | Gene ontology annotation | www.ebi.ac.uk/GOA |
| HUGO Gene nomenclature | 35 795 | Human gene nomenclature | www.genenames.org |
| Human major histocompatibility complex | 6 939 | Human major histocompatibility complex (HLA) sequences | www.ebi.ac.uk/imgt/hla |
| Immunoglobulins and T-cell receptors nucleotide sequences | 156 529 | The international imMunoGeneTics information system | www.imgt.org/GeneInfoServlets/htdocs |
| Interpro | 21 749 | Protein sequence analysis and classification | www.ebi.ac.uk/interpro |
| KEGG module | 196 659 | Collection of functional units used for annotation and biological interpretation of sequenced genomes | www.genome.jp/kegg/module.html |
| KEGG pathway | 262 432 | Pathway maps of the molecular interaction and reaction networks for biological interpretation of higher-level systemic functions | www.genome.jp/kegg/pathway.html |
| KEGG ligand compound | 34 182 | Database of chemical substances and reactions that are relevant to life | www.genome.jp/kegg/ligand.htm |
| KEGG ligand enzyme | 6 118 | | |
| KEGG Ligand Glycan | 10 985 | | |
| KEGG Ligand Reaction | 9 400 | | |
| Oxford Human Mouse grid | 17 834 | Laboratory mouse genetic, genomic and biological data resources | www.informatics.jax.org |
| Pfam-A | 12 273 | Collection of protein families, each represented by multiple sequence alignments and hidden Markov models | pfam.sanger.ac.uk |
| Pfam-B | 233 174 | | |
| Pfam seed | 12 273 | | |
| PRINTS | 2 050 | Protein fingerprints, groups of conserved motifs used to characterize a protein family | bioinf.man.ac.uk/dbbrowser/PRINTS |
| Prosite | 2 247 | Documentation entries describing protein domains, families, functional sites and the associated patterns and profiles used to identify them | www.expasy.org/prosite/ |
| Prosite documentation | 1 621 | | |
| REBASE | 5 020 | The restriction enzyme database | rebase.neb.com/rebase/rebase.html |
| RefSeq | 18 236 994 | Set of reference sequences including genomic, transcript and protein | www.ncbi.nlm.nih.gov/refseq/ |
| UniProt/Swiss-Prot | 531 473 | Protein database, manually annotated and reviewed | www.uniprot.org/ |
| Taxonomy | 817 120 | Classification and nomenclature for all of the organisms in the public sequence databases | www.ncbi.nlm.nih.gov/taxonomy |
| UniProt/TrEMBL | 16 504 022 | Protein database, automatically annotated and not reviewed | www.uniprot.org |
| Unigene | 2 652 777 | NCBI database of the transcriptome | www.ncbi.nlm.nih.gov/unigene |
| Uniprot KB | 17 035 495 | Protein knowledgebase (Swiss-Prot + TrEMBL) | www.uniprot.org |
| Total records | 77 163 817 | | |
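
The total in the last row can be cross-checked against the individual record counts. The following is a minimal sketch, not part of the original article: the dictionary keys are shortened labels and `parse_count` is a hypothetical helper, both chosen here for illustration. It also shows that the Uniprot KB row equals the sum of the Swiss-Prot and TrEMBL rows, so the printed total counts those protein records twice.

```python
# Record counts copied from Table 2, with thousands separated by spaces as printed.
# Keys are shortened labels chosen for this sketch, not identifiers from the article.
TABLE_2_RECORDS = {
    "ChEBI": "38 580",
    "ChEBI ontology": "29 974",
    "Enzyme": "5 418",
    "Gene": "8 927 911",
    "GeneMania": "21 084",
    "GO": "34 940",
    "GOA": "11 300 749",
    "HUGO Gene nomenclature": "35 795",
    "Human MHC": "6 939",
    "IMGT nucleotide sequences": "156 529",
    "Interpro": "21 749",
    "KEGG module": "196 659",
    "KEGG pathway": "262 432",
    "KEGG ligand compound": "34 182",
    "KEGG ligand enzyme": "6 118",
    "KEGG Ligand Glycan": "10 985",
    "KEGG Ligand Reaction": "9 400",
    "Oxford Human Mouse grid": "17 834",
    "Pfam-A": "12 273",
    "Pfam-B": "233 174",
    "Pfam seed": "12 273",
    "PRINTS": "2 050",
    "Prosite": "2 247",
    "Prosite documentation": "1 621",
    "REBASE": "5 020",
    "RefSeq": "18 236 994",
    "UniProt/Swiss-Prot": "531 473",
    "Taxonomy": "817 120",
    "UniProt/TrEMBL": "16 504 022",
    "Unigene": "2 652 777",
    "Uniprot KB": "17 035 495",
}

REPORTED_TOTAL = "77 163 817"  # "Total records" row of Table 2


def parse_count(text: str) -> int:
    """Convert a space-grouped count such as '8 927 911' to an integer."""
    return int("".join(text.split()))


counts = {name: parse_count(value) for name, value in TABLE_2_RECORDS.items()}

# Sum of every row, including Uniprot KB.
total = sum(counts.values())
print(total == parse_count(REPORTED_TOTAL))  # True

# Uniprot KB is described as Swiss-Prot + TrEMBL; the counts agree with that.
swiss_plus_trembl = counts["UniProt/Swiss-Prot"] + counts["UniProt/TrEMBL"]
print(swiss_plus_trembl == counts["Uniprot KB"])  # True

# Total with the overlapping Uniprot KB row left out.
total_without_overlap = total - counts["Uniprot KB"]
print(total_without_overlap)  # 60128322
```

Whether the overlap-free figure is the more useful one depends on how the integrated database stores the UniProt entries, which the table does not specify.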