Abstract
The 20th annual Database Issue of Nucleic Acids Research includes 176 articles, half of which describe new online molecular biology databases, while the other half provide updates on databases previously featured in NAR and other journals. This year’s highlights include two databases of DNA repeat elements; several databases of transcription factors and transcription factor-binding sites; databases on various aspects of protein structure and protein–protein interactions; databases for metagenomic and rRNA sequence analysis; and four databases specifically dedicated to Escherichia coli. The increased emphasis on using genome data to improve human health is reflected in the development of the databases of genomic structural variation (NCBI’s dbVar and EBI’s DGVa), the NIH Genetic Testing Registry and several other databases centered on the genetic basis of human disease, potential drugs, their targets and the mechanisms of protein–ligand binding. Two new databases present genomic and RNAseq data for monkeys, providing a wealth of data on our closest relatives for comparative genomics purposes. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and currently lists 1512 online databases. The full content of the Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
NEW AND UPDATED DATABASES
This 1300-page virtual volume represents the 20th annual Database Issue of Nucleic Acids Research (NAR). It includes descriptions of 88 new online databases (Table 1), 77 update articles on databases that have been previously featured in the NAR Database Issue and 11 articles with updates on database resources whose descriptions have been previously published in other journals (Table 2).
Table 1.
Database name | URL | Brief description |
---|---|---|
APPRIS | http://appris.bioinfo.cnio.es/ | A system for annotating alternative splice isoforms |
BioLiP | http://zhanglab.ccmb.med.umich.edu/BioLiP/ | Biologically relevant ligand–protein interactions |
BSRD | http://kwanlab.bio.cuhk.edu.hk/BSRD | A repository of bacterial small regulatory RNA |
CellLineNavigator | http://www.medicalgenomics.org/celllinenavigator | Cell line expression profiles by microarray analysis |
ChIPBase | http://deepbase.sysu.edu.cn/chipbase/ | Transcriptional regulation of lncRNA and microRNA genes from ChIP-Seq data |
ChiTaRS | http://chimerasrch.bioinfo.cnio.es/ | Chimeric RNAs of two or more different transcripts |
CIL-CCDB | http://www.cellimagelibrary.org/ | Images, videos and animations of various cell types from diverse organisms |
CircaDB | http://bioinf.itmat.upenn.edu/circa/ | Circadian gene expression profiles in human and mouse |
CloneDB | http://www.ncbi.nlm.nih.gov/clone/ | Clones and libraries: sequence data, map positions and distributor information |
ClusterMine360 | http://www.sigma54.ca/microbialclusters/ | Microbial PKS/NRPS Biosynthesis |
Cyanolyase | http://cyanolyase.genouest.org/ | Sequences and motifs of the phycobilin lyase protein family |
D2P2 | http://d2p2.pro/ | Database of Disordered Protein Predictions |
dbVar | http://www.ncbi.nlm.nih.gov/dbvar | Structural variation in chromosomes: inversions, translocations, insertions and deletions |
DGVa | http://www.ebi.ac.uk/dgva/ | Structural variation in chromosomes: inversions, translocations, insertions and deletions |
dcGO | http://supfam.org/SUPERFAMILY/dcGO | domain-centric Gene Ontology |
Dfam | http://dfam.janelia.org | Human DNA repeat families |
DGA | http://dga.nubic.northwestern.edu/ | Disease and Gene Annotations database |
DIANA-LncBase | http://www.microrna.gr/LncBase | microRNA targets on long noncoding RNAs |
DoBISCUIT | http://www.bio.nite.go.jp/pks/ | Database Of BIoSynthesis clusters CUrated and InTegrated |
EBI Enzyme Portal | http://www.ebi.ac.uk/enzymeportal | Various kinds of information about enzymes: small-molecule chemistry, biochemical pathways and drug compounds |
ECMDB | http://www.ecmdb.ca/ | Escherichia coli Metabolome Database |
EENdb | http://eendb.zfgenetics.org/ | Engineered endonucleases: zinc finger nucleases and transcription activator-like effector nucleases |
eProS | http://bioservices.hs-mittweida.de/Epros/ | Energy profiles of protein structures |
Factorbook | http://www.factorbook.org/ | Human transcription factor-binding data from ChIP-seq |
G4LDB | http://www.g4ldb.org/ | G-quadruplex Ligands Database |
GDSC | http://www.cancerRxgene.org/ | Genomics of Drug Sensitivity in Cancer: Sensitivity for anti-cancer drugs in various cell lines |
GeneTack | http://topaz.gatech.edu/GeneTack/db.html | Genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences |
Genome3D | http://genome3d.eu/ | Domain structure predictions and 3D models for proteins from model genomes |
Glycan Fragment DB | http://www.glycanstructure.org/fragment-db | Database of glycan 3D structures |
H2DB | http://tga.nig.ac.jp/h2db/ | Heritability data with trait-associated genomic loci |
HBVdb | http://hbvdb.ibcp.fr/HBVdb/ | A knowledge database for the Hepatitis B Virus |
HemaExplorer | http://servers.binf.ku.dk/shs/ | Gene expression profiles in haematopoiesis |
HEXEvent | http://hertellab.mmg.uci.edu/cgi-bin/HEXEvent/HEXEventWEB.cgi | Human Exon Splicing Events |
HOCOMOCO | http://autosome.ru/HOCOMOCO, http://cbrc.kaust.edu.sa/hocomoco/ | HOmo sapiens COmprehensive MOdel COllection of hand-curated transcription factor-binding site models |
KIDFamMap | http://gemdock.life.nctu.edu.tw/KIDFamMap/ | Kinase-Inhibitor-Disease Family Map |
LAMP | http://www.llamp.net/ | Library of Apicomplexan Metabolic Pathways |
Lncipedia | http://www.lncipedia.org/ | Human lncRNA gene sequences and structures |
LncRNADisease | http://cmbi.bjmu.edu.cn/lncrnadisease | Long non-coding RNA-associated diseases |
LUCApedia | http://eeb.princeton.edu/lucapedia/ | Predicted genome of Last Universal Common Ancestor |
meta.MicrobesOnline | http://meta.MicrobesOnline.org/ | Comparative genomic tools for metagenome analysis |
MetaboLights | http://www.ebi.ac.uk/metabolights | Metabolomics experiments and associated metadata |
MetalPDB | http://metalweb.cerm.unifi.it/ | Metal-binding sites in macromolecular structures |
METscout | http://metscout.mpg.de | Spatial organization of metabolic reactions in the mouse |
MonarchBase | http://monarchbase.umassmed.edu/ | Genome biology of the monarch butterfly Danaus plexippus |
NetwoRx | http://ophid.utoronto.ca/networx/ | Chemogenomic experiments in yeast: connection of drug response to biological pathways, phenotypes, and networks |
NCBI Bookshelf | http://www.ncbi.nlm.nih.gov/books | Free online books on the NCBI website |
NHPRTR | http://nhprtr.org/ | Non-human Primate Reference Transcriptome Resource |
NIH Genetic Testing Registry | http://www.ncbi.nlm.nih.gov/gtr/ | Genetic tests and laboratories that perform them |
NPACT | http://crdd.osdd.net/raghava/npact/ | Naturally occurring Plant-based Anticancer Compound Targets |
OikoBase | http://oikoarrays.biology.uiowa.edu/Oiko/ | Genome expression database of Oikopleura dioica |
OrtholugeDB | http://www.pathogenomics.sfu.ca/ortholugedb/ | Microbial orthology resource |
OrysPSSP | http://genoportal.org/SPD/index.do | Small secreted proteins from rice |
Papillomavirus Episteme | http://PaVE.niaid.nih.gov/ | A database of Papillomaviridae family of viruses |
PGDD | http://chibba.agtec.uga.edu/duplication/ | Plant Genome Duplication Database |
PIECE | http://probes.pw.usda.gov/piece/ | Plant Intron Exon Comparison and Evolution |
PlantRNA | http://plantrna.ibmp.cnrs.fr/ | tRNAs of plants and algae |
PR2 | http://ssu-rrna.org/ | Protist Ribosomal reference database |
prePPI | http://bhapp.c2b2.columbia.edu/PrePPI | Predicted and experimentally determined protein–protein interactions for yeast and human |
PTMcode | http://ptmcode.embl.de/ | Functional associations between posttranslational modifications within proteins |
Quorumpeps | http://quorumpeps.ugent.be/ | A database of quorum-sensing peptides |
RhesusBase | http://www.rhesusbase.org/ | A Knowledgebase for the Monkey Research Community |
RiceFREND | http://ricefrend.dna.affrc.go.jp/ | Rice Functionally Related gene Expression Network Database |
RNApathwaysDB | http://rnb.genesilico.pl/ | A database of RNA processing pathways |
SecReT4 | http://db-mml.sjtu.edu.cn/SecReT4/ | Type IV Secretion system Resource |
SEVA | http://seva.cnb.csic.es/SEVA/ | Standard European Vector Architecture: a collection of plasmids to analyse complex prokaryotic phenotypes |
SIFTS | http://www.ebi.ac.uk/pdbe/docs/sifts/ | Structure Integration with Function, Taxonomy and Sequences |
SINEBase | http://sines.eimb.ru | A database of short interspersed elements (SINEs) |
SomamiR | http://compbio.uthsc.edu/SomamiR/ | Somatic mutations that impact microRNA targeting in cancer |
Spermatogenesis Online | http://mcg.ustc.edu.cn/sdap1/spermgenes/ | Spermatogenesis-related genes |
SpliceAid-F | http://mi.caspur.it/SpliceAidF/ | Human splicing factors and their RNA-binding sites |
Spliceosome Database | http://spliceosomedb.ucsc.edu/ | Spliceosome genes and proteins, splicing complexes |
StreptomeDB | http://streptomedb.pharmaceutical-bioinformatics.de | Antibiotic, anti-tumour and immunosuppressant drugs produced by Streptomyces spp. |
SwissBioisostere | http://www.swissbioisostere.ch/ | Molecular replacements for ligand design |
SwissSidechain | http://www.swisssidechain.ch/ | Non-natural amino acid sidechains for protein engineering |
SynSysNet | http://bioinformatics.charite.de/synsysnet/ | Synapse proteins, their structures and interactions |
TCMID | http://www.megabionet.org/tcmid/ | Traditional Chinese Medicine Integrated Database |
TFClass | http://www.edgar-wingender.de/huTF_classification.html | Human transcription factors classified according to their DNA-binding domains |
TissueNet | http://netbio.bgu.ac.il/tissuenet/ | Tissue distribution of protein–protein interactions |
TOPPR | http://iomics.ugent.be/toppr/ | The Online Protein Processing Resource |
TSGene | http://bioinfo.mc.vanderbilt.edu/TSGene/ | Tumor Suppressor Gene database |
UCNEbase | http://ccg.vital-it.ch/UCNEbase/ | Ultraconserved non-coding elements and gene regulatory blocks |
UUCD | http://uucd.biocuckoo.org/ | Ubiquitin and ubiquitin-like conjugation database |
ValidNESs | http://validness.ym.edu.tw/ | Validated nuclear export signals-containing proteins |
Voronoia4RNA | http://proteinformatics.charite.de/voronoia4rna/tools/v4rna/index | Packing of RNA molecules and complexes |
WDDD | http://so.qbic.riken.jp/wddd/ | Worm Developmental Dynamics Database |
WholeCellKB | http://wholecellkb.stanford.edu/ | Pathway and genome database of Mycoplasma genitalium for whole-cell modelling |
WormQTL | http://www.wormqtl.org | Natural variation data in Caenorhabditis spp. |
YM500 | http://ngs.ym.edu.tw/ym500/ | smRNA-seq database for miRNA research |
ZInC | http://research.nhgri.nih.gov/zinc | Zebrafish Insertions Collection |
Table 2.
Database name | URL | Previous article | Brief description |
---|---|---|---|
2P2Idb | http://dimr.cnrs-mrs.fr | 2010 | Structural data on protein–protein interactions and their inhibitors |
Allen Brain Atlas | http://www.brain-map.org | 2009 | Gene expression and neuroanatomical data on human and mouse brain |
BioGPS | http://biogps.org | 2009 | Gene annotation portal and a resource on gene and protein function |
DARNED | http://beamish.ucc.ie/ | 2010 | Database of RNA Editing |
DoriC | http://tubic.tju.edu.cn/doric/ | 2007 | Replication origin (oriC) regions in bacterial and archaeal genomes |
FlyAtlas | http://flyatlas.org/ | 2007 | Drosophila gene expression atlas |
GenColors | http://sgb.fli-leibniz.de/ | 2005 | Genome annotation and comparison database for small genomes |
Genomicus | http://www.dyogen.ens.fr/genomicus | 2010 | Syntenic relationships between eukaryote genomes |
InnateDB | http://www.innatedb.com/ | 2008 | A database of mammalian innate immune response |
MicroScope | http://www.genoscope.cns.fr/agc/microscope/ | 2009 | Microbial genome annotation and analysis platform |
NPIDB | http://npidb.belozersky.msu.ru/ | 2007 | Nucleic acids–protein interaction database |
At this point it might be instructive to look back at the origin and evolution of the NAR Database Issue. Its history started from two supplementary issues that were published in NAR in April of 1991 and in May of 1992 and consisted of 18 and 19 articles, respectively (see http://nar.oxfordjournals.org/content/19/supplement.toc and http://nar.oxfordjournals.org/content/20/supplement.toc). These articles offered descriptions of several nucleotide sequence databases, such as GenBank, the EMBL Data Library, compilations of small RNA, tRNA, and 5S, 16S, and 23S rRNA sequences (including the Ribosomal Database Project), DNA sequences from Escherichia coli and a human genome database (GDB). Those first issues also included descriptions of several protein databases, such as Swiss-Prot, PIR, Prosite, Restriction Enzyme Database (REBASE), Transcription Factors Database (TFD) and Histone database. There was also a medical genetics database, Haemophilia B, listing point mutations and indels in the coagulation factor IX (F9) gene that caused this blood clotting disorder, which has affected the royal families of several European countries.
The next issue, published on July 1, 1993, was the first one formally labelled as the Database Issue. It consisted of 24 articles, which added databases of RNA and protein structure and the Enzyme database. It was followed by NAR Database Issues in September 1994, then in January 1996, and each January after that.
In the past 20 years, the Database Issue has gradually grown in size before stabilizing at the level of ∼180 articles. However, despite the almost 10-fold increase in the number of published articles, the key topics of the current issue remain largely the same as 20 years ago. This issue again features articles from GenBank and the European Nucleotide Archive (formerly the EMBL Data Library), which, together with the DNA Data Bank of Japan, form the International Nucleotide Sequence Database collaboration, INSDC (1–4). Just as 20 years ago, there are updates from Swiss-Prot and PIR (now combined into UniProt) and Prosite (5,6).
Continuing the tradition of featuring well-curated databases of RNA sequences, this issue includes an update on SILVA, a widely used comprehensive database of bacterial, archaeal and eukaryotic 16S/18S and 23S/28S rRNA sequences (7), and a description of the Protist Ribosomal Reference database (PR2), a new database that catalogs small subunit rRNA sequences from unicellular eukaryotes (8). An update on the Ribosomal Database Project, a constant feature of the NAR Database Issue since 1991 (9), was last published in 2009 (10). Other RNA databases in this issue include an update on Rfam (11), the universally acclaimed database of RNA families, as well as several databases on long non-coding RNA, microRNA and their targets. An update of Modomics, a database on RNA modification, is now supplemented by RNApathwaysDB, a database of RNA maturation and decay pathways developed by the same group (12,13).
As before, this issue presents several transcription factor (TF) databases. Two of them cover TFs themselves: TFClass offers a classification of human TFs, while NPIDB presents structural information on DNA–protein and RNA–protein complexes (14,15). Several other databases collect information on the TF-binding sites. These include Factorbook, a database of TF-binding data from the ENCODE project; HOCOMOCO, a collection of human TF-binding sites; CTCFBSDB, a database of CCCTC-binding factor (CTCF)-binding sites; RegulonDB, a database of transcriptional regulation in E. coli; and SwissRegulon, a database of regulatory sites in human, mouse and yeast genomes and in model bacteria (16–20).
The structural databases featured in this issue all show a trend towards better integration and cross-referencing tools. This applies both to the updates of well-known databases, such as the RCSB Protein Data Bank (PDB), CATH and PDBTM, and to such databases as EBI’s SIFTS, a joint effort of UniProt and PDBe to provide a residue-level mapping of their entries and supplement it with annotation from other public databases; Genome3D, a recent collaborative project aiming to provide structural annotation from CATH and SCOP to the genomic sequences; and dcGO, which develops domain-centric ontologies to link protein domains with functions, phenotypes and diseases (21–23).
Likewise, with E. coli remaining the workhorse of molecular biology, this issue includes update articles on the EcoGene (the first one since 2000), EcoCyc and RegulonDB databases, as well as a description of the newly developed E. coli Metabolome Database (20,24–26).
HUMAN DISEASE GENOMICS—THE NEXT FRONTIER?
As discussed earlier (27), the original GDB did not survive the influx of the new data and multiple changes of ownership. Nevertheless, we now have a wide variety of databases that cover different aspects of the human genome and the genomes of model organisms. This issue features annual updates from the Ensembl and ENCODE projects and from the UCSC Genome Browser and the Japanese H-InvDB database (28–31). The model organism databases are represented by the updates to FlyBase, Mouse Genome database, Xenbase and ZFIN (32–35).
Two new databases, RhesusBase and NHPRTR, present extensive genome and RNAseq data for non-human primates, including great apes, old world monkeys, new world monkeys and prosimians (36,37). These data could go a long way towards establishing monkeys as model organisms for comparative genomics studies. One more database is dedicated to a more distant relative of humans, the urochordate Oikopleura dioica (38).
A potentially important development is the construction of two new databases of repetitive DNA elements, Dfam and SINEBase (39,40). Along with the industry standard Repbase Update (41,42) and monthly RepBase Reports (http://www.girinst.org/repbase/reports/), these databases promise to contribute to a better understanding of eukaryotic repeat elements.
With the abundance of databases providing valuable tools for genome analysis, there is a clear trend towards bringing genomics ‘from the bench to the bedside’, i.e. using genomic data for a better understanding and, hopefully, better treatment of human disease. A number of projects, including ClinSeq (http://www.genome.gov/20519355), DDD (http://www.ddduk.org/) and UK10K (http://www.uk10k.org/) are working towards these goals, and several databases featured in this issue represent important steps in this direction. Last year’s issue introduced the GWASdb database of human genetic variants identified by genome-wide association studies (43). GWAS Central, established in 2007 as HGVbaseG2P (44), has been revamped and now includes data from over 1000 studies. Now, a joint article from NCBI and EBI describes their databases of genomic structural variation, dbVar and DGVa (45). These databases cover diverse variation data including inversions, insertions and translocations that are >50 bp in length. NCBI is also developing ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/), a database of relationships between human gene variation and the observed health status (46). The task of streamlining the genetic tests that provide such information is taken up by the recently created NIH Genetic Testing Registry, a database of genetic tests and laboratories that perform them, with detailed information about what exactly is measured in each test and its analytic and clinical validity (47).
The impact of the genomic data on developing targeted approaches for fighting disease is particularly evident in the case of cancer. This issue features updates from three great databases, the UCSC Cancer Genome Browser (48), the Atlas of Genetics and Cytogenetics in Oncology and Haematology (49) and the TP53 website [(50), the first update of the database on tumor suppressor p53 mutations since 1997]. In addition, there are two new databases dedicated to studying cancer at the level of specific cell lines. The CellLineNavigator database provides gene expression profiles of different cancer cell lines in different pathological states (51), whereas the Genomics of Drug Sensitivity in Cancer (GDSC) collects the results of high-throughput studies examining the sensitivity to anti-cancer drugs in various cell lines (52).
CURATION OF THE NAR DATABASE COLLECTION
During the past 20 years, all databases featured in the NAR Database Issues were added to the NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/. With an annual attrition rate of <5%, this Collection has been steadily growing and, in 2012, exceeded 1400 database entries (53). It was clear that the list was due for a serious clean-up, and one of the authors (XMFS) devised and set in motion a semi-automated procedure to identify obsolete and non-responsive websites. Remarkably, >90% of the databases listed in last year’s release of the online Collection were found to be functional. Corresponding authors of close to a hundred non-responsive resources were contacted and 44 websites (∼3.2% of the total) have been approved for deletion. About 100 entries in the Collection have been updated with corrected URLs, summaries highlighting recent developments, or some other changes in the deposited data.
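The text does not describe how the semi-automated check was implemented. As a rough illustration only, a first automated pass over a list of database URLs might look like the sketch below. The function names (`fetch_status`, `classify`, `triage`), the three status labels and the `.example` URLs in the usage note are all assumptions made for this sketch, not details taken from the authors' procedure.

```python
from typing import Callable, Dict, Iterable, List
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for `url` (needs network access)."""
    request = Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code


def classify(url: str, fetch: Callable[[str], int] = fetch_status) -> str:
    """Label a URL as 'ok', 'error' or 'unreachable' (hypothetical labels)."""
    try:
        status = fetch(url)
    except (URLError, OSError, ValueError):
        # DNS failure, timeout, refused connection or malformed URL.
        return "unreachable"
    return "ok" if 200 <= status < 400 else "error"


def triage(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> Dict[str, List[str]]:
    """Group URLs by label so that only flagged entries need manual review."""
    report: Dict[str, List[str]] = {"ok": [], "error": [], "unreachable": []}
    for url in urls:
        report[classify(url, fetch)].append(url)
    return report
```

Calling `triage(["http://alive.example", "http://gone.example"])` would return a dictionary of URL lists keyed by label. Only the `error` and `unreachable` entries would then need the manual step the paragraph mentions, namely contacting the corresponding authors before any deletion is approved. Passing the `fetch` function as a parameter is a design choice of this sketch that lets the triage logic be exercised without network access.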
Although the deletion of these databases was well within the average drop-off rate and was hardly surprising, further analysis revealed that most of these resources were not lost. Instead, in the normal course of database evolution, they have been integrated into larger projects. For example, a couple of segmental duplications databases were merged into the Database of Genomic Variants (54), NAR Database Collection entry no. 655, while the NCBI’s Cancer Chromosomes database has been merged into dbVar [described in detail in this issue, (45)]. Further, improved annotation of the human genome made redundant a number of resources that covered specific areas of the genome (e.g. the IXDB with its physical maps of human chromosome X).
In one instance, the ExDom database of exon–intron structures of genes in seven eukaryotic genomes (55) had to be removed from the Collection, as it has taken the commercial route and no longer provides a free version, although the author’s company offered a discounted version for academic users. Unfortunately, the tightening budgets (56) might force other databases to follow the same path.
In total, the NAR online Molecular Biology Database Collection now includes 1512 databases sorted into 14 categories and 41 subcategories. Authors who wish to have databases that were published elsewhere included in the Collection are welcome to contact XMFS directly.
FUNDING
Intramural Research Program of the U.S. National Institutes of Health at the National Library of Medicine [to M.Y.G.]. Funding for open access charge: Waived by Oxford University Press.
Conflict of interest statement. The authors' opinions do not necessarily reflect the views of their respective institutions.
ACKNOWLEDGEMENTS
The authors thank Drs Javier Herrero and Michael Schuster for helpful comments and the Oxford University Press team led by Jennifer Boyd and Andrew Malvern for their help in compiling this issue.
REFERENCES
- 1. Benson D, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2013;41:D36–D42. doi: 10.1093/nar/gks1195.
- 2. Cochrane G, Alako B, Amid C, Bower L, Cerdeño-Tárraga A, Cleland I, Gibson R, Goodgame N, Jang M, Kay S, et al. Facing growth in the European Nucleotide Archive. Nucleic Acids Res. 2013;41:D30–D35. doi: 10.1093/nar/gks1175.
- 3. Ogasawara O, Mashima J, Kodama Y, Kaminuma E, Nakamura Y, Okubo K, Takagi T. DDBJ new system and service refactoring. Nucleic Acids Res. 2013;41:D25–D29. doi: 10.1093/nar/gks1152.
- 4. Nakamura Y, Cochrane G, Karsch-Mizrachi I. The International Nucleotide Sequence Database Collaboration. Nucleic Acids Res. 2013;41:D21–D24. doi: 10.1093/nar/gks1084.
- 5. The UniProt Consortium. Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res. 2013;41:D43–D47. doi: 10.1093/nar/gks1068.
- 6. Sigrist CJA, de Castro E, Cerutti L, Cuche BA, Hulo N, Bridge A, Bougueleret L, Xenarios I. New and continuing developments at PROSITE. Nucleic Acids Res. 2013;41:D344–D347. doi: 10.1093/nar/gks1067.
- 7. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:D590–D596. doi: 10.1093/nar/gks1219.
- 8. Guillou L, Bachar D, Audic S, Bass D, Berney C, Bittner L, Boutte C, Burgaud G, de Vargas C, Decelle J, et al. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote small subunit rRNA sequences with curated taxonomy. Nucleic Acids Res. 2013;41:D597–D604. doi: 10.1093/nar/gks1160.
- 9. Olsen GJ, Larsen N, Woese CR. The ribosomal RNA database project. Nucleic Acids Res. 1991;19:2017–2021. doi: 10.1093/nar/19.suppl.2017.
- 10. Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, et al. The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2009;37:D141–D145. doi: 10.1093/nar/gkn879.
- 11. Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, Eddy SR, Gardner PP, Bateman A. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2013;41:D226–D232. doi: 10.1093/nar/gks1005.
- 12. Machnicka MA, Milanowska K, Osman Oglou O, Purta E, Kurkowska M, Olchowik A, Januszewski W, Kalinowski S, Dunin-Horkawicz S, Rother KM, et al. MODOMICS: a database of RNA modification pathways–2012 update. Nucleic Acids Res. 2013;41:D262–D267. doi: 10.1093/nar/gks1007.
- 13. Milanowska K, Mikolajczak K, Lukasik A, Skorupski M, Balcer Z, Mika M, Rother KM, Bujnicki JM. RNApathwaysDB—a database of RNA maturation and decay pathways. Nucleic Acids Res. 2013;41:D268–D272. doi: 10.1093/nar/gks1052.
- 14. Wingender E, Schoeps T, Dönitz J. TFClass: an expandable classification of human transcription factors. Nucleic Acids Res. 2013;41:D165–D170. doi: 10.1093/nar/gks1123.
- 15. Kirsanov D, Zanegina O, Spirin S, Karyagina A, Alexeevski A. NPIDB: Nucleic acids – Protein Interaction DataBase. Nucleic Acids Res. 2013;41:D517–D523. doi: 10.1093/nar/gks1199.
- 16. Wang J, Zhuang J, Iyer S, Lin X, Greven M, Kim B, Moore J, Dong X, Virgil D, Birney E, et al. Factorbook.org: a wiki-based database for transcription factor binding data generated by the ENCODE consortium. Nucleic Acids Res. 2013;41:D171–D176. doi: 10.1093/nar/gks1221.
- 17. Kulakovskiy IV, Medvedeva YA, Schaefer U, Kasianov AS, Vorontsov IE, Bajic VB, Makeev VJ. HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Res. 2013;41:D195–D202. doi: 10.1093/nar/gks1089.
- 18. Ziebarth J, Bhattacharya A, Cui Y. CTCFBSDB 2.0: a database for CTCF binding sites and genome organization. Nucleic Acids Res. 2013;41:D188–D194. doi: 10.1093/nar/gks1165.
- 19. Pachkov M, Balwierz PJ, Arnold P, Ozonov E, van Nimwegen E. SwissRegulon, a database of genome-wide annotations of regulatory sites: recent updates. Nucleic Acids Res. 2013;41:D214–D220. doi: 10.1093/nar/gks1145.
- 20. Salgado H, Peralta M, Gama-Castro S, Santos-Zavaleta A, Muñiz-Rascado LJ, Garcia-Sotelo JS, Weiss V, Solano-Lira H, Martinez-Flores I, Medina-Rivera A, et al. RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards, and more. Nucleic Acids Res. 2013;41:D203–D213. doi: 10.1093/nar/gks1201.
- 21. Velankar S, Dana JM, Jacobsen J, van Ginkel G, Gane PJ, Luo J, Oldfield TJ, O'Donovan C, Martin M-J, Kleywegt GJ. SIFTS: Structure Integration with Function, Taxonomy and Sequences resource. Nucleic Acids Res. 2013;41:D483–D489. doi: 10.1093/nar/gks1258.
- 22. Lewis TE, Sillitoe I, Andreeva A, Blundell TL, Buchan D, Chothia C, Cuff A, Dana JM, Filippis I, Gough J, et al. Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains. Nucleic Acids Res. 2013;41:D499–D507. doi: 10.1093/nar/gks1266.
- 23. Fang H, Gough J. dcGO: database of domain-centric ontologies on functions, phenotypes, diseases and more. Nucleic Acids Res. 2013;41:D536–D544. doi: 10.1093/nar/gks1080.
- 24. Zhou J, Rudd KE. EcoGene 3.0. Nucleic Acids Res. 2013;41:D613–D624. doi: 10.1093/nar/gks1235.
- 25. Keseler IM, Mackie A, Peralta-Gil M, Santos-Zavaleta A, Gama-Castro S, Bonavides-Martinez C, Fulcher C, Huerta AM, Kothari A, Krummenacker M, et al. EcoCyc: fusing model-organism databases with systems biology. Nucleic Acids Res. 2013;41:D605–D612. doi: 10.1093/nar/gks1027.
- 26. Guo AC, Jewison T, Wilson M, Liu Y, Knox C, Djoumbou Y, Lo P, Mandal R, Krishnamurthy R, Wishart DS. ECMDB: the E. coli Metabolome Database. Nucleic Acids Res. 2013;41:D625–D630. doi: 10.1093/nar/gks992.
- 27.Galperin MY, Cochrane GR. Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009. Nucleic Acids Res. 2009;37:D1–D4. doi: 10.1093/nar/gkn942. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Flicek P, Ahmed I, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fairley S, et al. Ensembl 2013. Nucleic Acids Res. 2013;41:D48–D55. doi: 10.1093/nar/gks1236. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Rosenbloom KR, Sloan CA, Malladi VS, Dreszer TR, Learned K, Wong MC, Kirkup VM, Maddren M, Fang R, Heitner S, et al. ENCODE data in the UCSC Genome Browser: year 5 update. Nucleic Acids Res. 2013;41:D56–D63. doi: 10.1093/nar/gks1172. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Meyer LR, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead B, Raney BJ, et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 2013;41:D64–D69. doi: 10.1093/nar/gks1048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Takeda J, Yamasaki C, Murakami K, Nagai Y, Sera M, Hara Y, Obi N, Habara T, Imanishi T, Gojobori T. H-InvDB in 2013: an omics study platform for human functional gene and transcript discovery. Nucleic Acids Res. 2013;41:D915–D919. doi: 10.1093/nar/gks1245. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Marygold SJ, Leyland PC, Seal RL, Goodman JL, Thurmond J, Strelets VB, Wilson RJ. FlyBase: improvements to the bibliography. Nucleic Acids Res. 2013;41:D751–D757. doi: 10.1093/nar/gks1024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Bult CJ, Eppig JT, Blake JA, Kadin JA, Richardson JE, The Mouse Genome Database Group. The Mouse Genome Database (MGD): phenotype, function and models of human disease. Nucleic Acids Res. 2013;41:D885–D891. doi: 10.1093/nar/gks1115.
- 34.James-Zorn C, Ponferrada VG, Jarabek CJ, Burns K, Segerdell EJ, Lee J, Snyder K, Bhattacharyya B, Karpinka JB, Fortriede J, et al. Xenbase: expansion and updates of the Xenopus model organism database. Nucleic Acids Res. 2013;41:D865–D870. doi: 10.1093/nar/gks1025.
- 35.Howe DG, Bradford YM, Conlin T, Eagle AE, Fashena D, Frazer K, Knight J, Mani P, Martin R, Moxon SA, et al. ZFIN, the Zebrafish Model Organism Database: increased support for mutants and transgenics. Nucleic Acids Res. 2013;41:D854–D860. doi: 10.1093/nar/gks938.
- 36.Zhang SJ, Liu CJ, Shi M, Kong L, Chen JY, Zhou WZ, Zhu X, Yu P, Wang J, Yang X, et al. RhesusBase: a knowledgebase for the monkey research community. Nucleic Acids Res. 2013;41:D892–D905. doi: 10.1093/nar/gks835.
- 37.Pipes L, Li S, Bozinoski M, Palermo R, Peng X, Blood P, Kelly S, Weiss J, Thierry-Mieg J, Thierry-Mieg D, et al. The Nonhuman Primate Reference Transcriptome Resource (NHPRTR) for comparative functional genomics. Nucleic Acids Res. 2013;41:D906–D914. doi: 10.1093/nar/gks1268.
- 38.Danks G, Campsteijn C, Parida M, Butcher S, Doddapaneni H, Fu B, Petrin R, Metpally R, Lenhard B, Wincker P, et al. OikoBase: a genomics and developmental transcriptomics resource for the urochordate Oikopleura dioica. Nucleic Acids Res. 2013;41:D845–D853. doi: 10.1093/nar/gks1159.
- 39.Wheeler TJ, Clements J, Eddy SR, Hubley R, Jones TA, Jurka J, Smit AFA, Finn RD. Dfam: a database of repetitive DNA based on profile Hidden Markov Models. Nucleic Acids Res. 2013;41:D70–D82. doi: 10.1093/nar/gks1265.
- 40.Vassetzky N, Kramerov D. SINEBase: a database and tool for SINE analysis. Nucleic Acids Res. 2013;41:D83–D89. doi: 10.1093/nar/gks1263.
- 41.Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 2005;110:462–467. doi: 10.1159/000084979.
- 42.Kapitonov VV, Jurka J. A universal classification of eukaryotic transposable elements implemented in Repbase. Nat. Rev. Genet. 2008;9:411–412. doi: 10.1038/nrg2165-c1.
- 43.Li MJ, Wang P, Liu X, Lim EL, Wang Z, Yeager M, Wong MP, Sham PC, Chanock SJ, Wang J. GWASdb: a database for human genetic variants identified by genome-wide association studies. Nucleic Acids Res. 2012;40:D1047–D1054. doi: 10.1093/nar/gkr1182.
- 44.Thorisson GA, Lancaster O, Free RC, Hastings RK, Sarmah P, Dash D, Brahmachari SK, Brookes AJ. HGVbaseG2P: a central genetic association database. Nucleic Acids Res. 2009;37:D797–D802. doi: 10.1093/nar/gkn748.
- 45.Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding D, Chen C, Maguire M, Corbett M, Zhou Z, Paschall J, et al. dbVar and DGVa: Public archives for genomic structural variation. Nucleic Acids Res. 2013;41:D936–D941. doi: 10.1093/nar/gks1213.
- 46.NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2013;41:D8–D20. doi: 10.1093/nar/gks1189.
- 47.Rubinstein W, Maglott D, Lee J, Kattman B, Malheiro A, Ovetsky M, Hem V, Gorelenkov V, Song G, Wallin C, et al. The NIH Genetic Testing Registry: A new, centralized database of genetic tests to enable access to comprehensive information and improve transparency. Nucleic Acids Res. 2013;41:D925–D935. doi: 10.1093/nar/gks1173.
- 48.Goldman M, Craft B, Swatloski T, Ellrott K, Cline M, Diekhans M, Ma S, Wilks C, Stuart J, Haussler D, et al. The UCSC Cancer Genomics Browser: update 2013. Nucleic Acids Res. 2013;41:D949–D954. doi: 10.1093/nar/gks1008.
- 49.Huret JL, Ahmad M, Arsaban M, Bernheim A, Cigna J, Desangles F, Guignard J-C, Jacquemot-Perbal M-C, Labarussias M, Leberre V, et al. Atlas of Genetics and Cytogenetics in Oncology and Haematology in 2013. Nucleic Acids Res. 2013;41:D920–D924. doi: 10.1093/nar/gks1082.
- 50.Leroy B, Fournier JL, Ishioka C, Monti P, Inga A, Fronza G, Soussi T. The TP53 web site: an integrative resource centre for the TP53 mutation database and TP53 mutant analysis. Nucleic Acids Res. 2013;41:D962–D969. doi: 10.1093/nar/gks1033.
- 51.Krupp M, Itzel T, Maass T, Hildebrandt A, Galle PR, Teufel A. CellLineNavigator: a workbench for cancer cell line analysis. Nucleic Acids Res. 2013;41:D942–D948. doi: 10.1093/nar/gks1012.
- 52.Yang W, Soares J, Greninger P, Edelman EJ, Lightfoot H, Forbes S, Bindal N, Beare D, Smith JA, Thompson IR, et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2013;41:D955–D961. doi: 10.1093/nar/gks1111.
- 53.Galperin MY, Fernandez-Suarez XM. The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res. 2012;40:D1–D8. doi: 10.1093/nar/gkr1196.
- 54.Zhang J, Feuk L, Duggan GE, Khaja R, Scherer SW. Development of bioinformatics resources for display and analysis of copy number and other structural variants in the human genome. Cytogenet. Genome Res. 2006;115:205–214. doi: 10.1159/000095916.
- 55.Bhasi A, Philip P, Manikandan V, Senapathy P. ExDom: an integrated database for comparative analysis of the exon-intron structures of protein domains in eukaryotes. Nucleic Acids Res. 2009;37:D703–D711. doi: 10.1093/nar/gkn746.
- 56.Baker M. Databases fight funding cuts. Nature. 2012;489:19. doi: 10.1038/489019a.