Abstract
The 2023 Nucleic Acids Research Database Issue contains 178 papers ranging across biology and related fields. There are 90 papers reporting on new databases and 82 updates from resources previously published in the Issue. Six more papers are updates from databases most recently published elsewhere. Major nucleic acid databases reporting updates include GenBank, ENA, ChIPBase, JASPAR, mirDIP and the Issue's first Breakthrough Article, NACDDB for Circular Dichroism data. Updates from BMRB and RCSB cover experimental protein structural data while AlphaFold 2 computational structure predictions feature widely. STRING and REBASE are stand-out updates in the signalling and enzymes section. Immunology-related databases include CEDAR, the second Breakthrough Article, for cancer epitopes and receptors alongside returning IPD-IMGT/HLA and the new PGG.MHC. Genomics-related resources include Ensembl, GWAS Central and UCSC Genome Browser. Major returning databases for drugs and their targets include Open Targets, DrugCentral, CTD and PubChem. The EMPIAR image archive appears in the Issue for the first time. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, revisiting 463 entries, adding 92 new resources and eliminating 96 discontinued URLs, bringing the current total to 1764 databases. It is available at http://www.oxfordjournals.org/nar/database/c/.
NEW AND UPDATED DATABASES
In its 30th incarnation, the Nucleic Acids Research Database Issue once again ranges across biology with a total of 178 papers. Table 1 lists the 90 new databases included, a recent record number, and there are 82 update papers from resources previously covered by NAR. Finally, six databases most recently published elsewhere contribute updates (Table 2). As usual, updates from the major database providers at the European Bioinformatics Institute (EBI), the U.S. National Center for Biotechnology Information (NCBI), and the National Genomics Data Center (NGDC) in China (1–3) are placed first. The usual categorisation then follows: (i) nucleic acid sequence and structure, transcriptional regulation; (ii) protein sequence and structure; (iii) metabolic and signalling pathways, enzymes and networks; (iv) genomics of viruses, bacteria, protozoa and fungi; (v) genomics of human and model organisms plus comparative genomics; (vi) human genomic variation, diseases and drugs; (vii) plants and (viii) other topics, such as proteomics databases. Many papers are not easily placed in a single category so readers are well advised to browse the full list.
Table 1.
Descriptions of new databases in the 2023 NAR Database Issue
| Database name | URL | Short description |
|---|---|---|
| ABC Portal | http://abc.sklehabc.com | Single cell transcriptomics of blood cells |
| AgeAnno | https://relab.xidian.edu.cn/AgeAnno/#/ | Single cell annotation of aging in human |
| AmyloGraph | http://AmyloGraph.com/ | Amyloid-amyloid interactions |
| Animal-SNPAtlas | http://gong_lab.hzau.edu.cn/Animal_SNPAtlas/ | High-quality SNPs in 20 animal species |
| Antibody Registry | https://antibodyregistry.org | Antibodies, their antigens and catalogue numbers |
| Aquila | https://aquila.cheunglab.org | Spatial omics data |
| ASCancer Atlas | https://ngdc.cncb.ac.cn/ascancer/home | Alternative splicing in cancer |
| BIC | http://bic.jhlab.tw/ | Bacteria in Cancer |
| Brain Catalog | https://ngdc.cncb.ac.cn/braincatalog | Genetics of brain disorders and related phenotypes |
| BV-BRC | https://www.bv-brc.org/ | Bacterial and Viral Bioinformatics Resource Center |
| CEDAR | https://cedar.iedb.org/ | Cancer Epitope Database and Analysis Resource |
| Cell Taxonomy | https://ngdc.cncb.ac.cn/celltaxonomy | Cell types and markers across species, tissues and conditions |
| Cell Tracer | http://bio-bigdata.hrbmu.edu.cn/CellTracer | Multi-omics of cellular development trajectories |
| ChemFOnt | https://www.chemfont.ca | Chemical Functional Ontology |
| ChemPert | https://chempert.uni.lu/ | Transcriptomics responses to perturbagens in non-cancerous cells |
| CMDB | https://db.cngb.org/cmdb/ | Whole-genome sequencing data of 141,000 Chinese individuals |
| ChromLoops | https://3dgenomics.hzau.edu.cn/chromloops | Protein-mediated chromatin loops |
| CohesinDB | https://cohesindb.iqb.u-tokyo.ac.jp | Multi-omics data on cohesin functions |
| COMBATdb | https://db.combat.ox.ac.uk | COVID-19 Multi-omics Blood Atlas |
| CottonMD | http://yanglab.hzau.edu.cn/CottonMD/ | Multi-omics data on cotton |
| CovInter | https://idrblab.org/covinter | Interactions between coronavirus RNAs and host proteins |
| CRAMdb | http://www.ehbio.com/CRAMdb | Microbiome metagenomes across animals |
| CRdb | http://cr.liclab.net/crdb/ | Human chromatin regulators |
| CREAMMIST | https://creammist.mtms.dev | Cancer drug dose-response across cell lines |
| CRISPRbase | http://crisprbase.maolab.org | CRISPR Base Editing |
| DirectRMDB | http://www.rnamd.org/directRMDB/ | RNA modifications from direct RNA sequencing |
| DRESIS | https://idrblab.org/dresis/ | Drug resistance |
| DrugMAP | https://idrblab.org/drugmap/ | Molecular Atlas and Pharma-information of drugs |
| DupScan | https://dupscan.sysumeg.com/ | Vertebrate genome duplications |
| EDomics | http://edomics.qnlm.ac | Comparative multi-omics for animal Evo-Devo |
| EmAtlas | http://bioinfor.imu.edu.cn/ematlas | Spatiotemporal multi-omics of mammalian embryogenesis |
| FAVOR | https://favor.genohub.org | Functional Annotation of Variants - Online Resource |
| Fungal Names | https://nmdc.cn/fungalnames/ | Fungal taxonomy |
| G4Atlas | https://www.g4atlas.org/ | Experimentally determined RNA G-quadruplexes in transcriptomes |
| GAIA | https://gaia.cobius.usherbrooke.ca | Predicted G-quadruplexes in genomes and transcriptomes |
| GenomicKB | https://gkb.dcmb.med.umich.edu/ | Knowledge Graph for the Human Genome |
| GotEnzymes | https://metabolicatlas.org/gotenzymes | Predicted enzyme kinetic parameters |
| GPSAdb | https://www.gpsadb.com/ | Genetic Perturbation Similarity Analysis |
| HGD | https://ngdc.cncb.ac.cn/hgd | Homolog Gene Database – expression, traits, variants across species |
| HiChIPdb | http://health.tsinghua.edu.cn/hichipdb/ | HiChIP regulatory interactions |
| HProteome-BSite | https://galaxy.seoklab.org/hproteome-bsite/database | Predicted binding sites and ligands in the human 3D proteome |
| HTCA | https://www.htcatlas.org | Single cell transcriptomes and analytical tools |
| HUSCH | http://husch.comp-genomics.org | Human Universal Single Cell Hub |
| IAnimal | https://ianimal.pro/ | Multi-omics data for 21 animals |
| ipaQTL | http://bioinfo.szbl.ac.cn/iaQTL/ | Intronic polyadenylation quantitative trait loci |
| IEAtlas | http://bio-bigdata.hrbmu.edu.cn/IEAtlas | HLA-presented immune epitopes from non-coding regions etc |
| ImmCluster | http://bio-bigdata.hrbmu.edu.cn/ImmCluster | Immunology cell types in normal and cancer tissues |
| IntroVerse | https://rytenlab.com/browser/app/introverse | Introns and splicing errors/noise |
| isomiRdb | https://anathema.cs.uni-saarland.de/isomirdb/ | microRNA isoform expression |
| Lineage Landscape | http://data.iscr.ac.cn/lineage/#/home | Omics of lineage determination in animals |
| M6AREG | https://idrblab.org/m6areg/ | m6A vs disease and drug response |
| MeDBA | https://medba.ddtmlab.org | Metalloenzyme families, structures, ligands |
| MediaDive | https://mediadive.dsmz.de | Culture media |
| MHC Motif Atlas | http://mhcmotifatlas.org/ | MHC Binding Specificities and Ligands |
| microbioTA | http://bio-annotation.cn/microbiota | Cancer microbiomes |
| MiMeDB | https://mimedb.org | Human-microbe interactions and their metabolomes |
| NACDDB | https://genesilico.pl/nacddb/ | Nucleic Acid Circular Dichroism Database |
| NEMO Archive | https://nemoarchive.org | Neuroscience Multi-Omic Archive |
| NLRscape | https://nlrscape.biochim.ro/ | Plant Nod-like receptors |
| OrganoidDB | http://www.inbirg.com/organoid_db/ | Organoid transcriptomics |
| PAT | http://bioinfo.qd.sdu.edu.cn/PAT/ | Prokaryotic Antimicrobial Toxin database |
| PDCM Finder | www.cancermodels.org | Patient-Derived Cancer Models |
| PGG.MHC | https://pog.fudan.edu.cn/pggmhc | Population genetics of HLA genes |
| PGG.SV | https://www.biosino.org/pggsv | Human genome structural variants |
| PertOrg | http://www.inbirg.com/pertorg/ | Changes induced in model organisms by in vivo genetic perturbation |
| plantEXP | https://biotec.njau.edu.cn/plantExp | Plant gene (co-)expression and alternative splicing |
| ProPan | https://ngdc.cncb.ac.cn/propan | Prokaryotic pan-genomes |
| ProtCAD | http://dunbrack2.fccc.edu/protcad | Protein Common Assembly Database |
| qPTM | http://qptm.omicsbio.info | Quantitative Post-Translational Modifications |
| QUADRAtlas | https://rg4db.cibio.unitn.it | RNA G-Quadruplex and RG4-binding proteins |
| RABC | http://www.onethird-lab.com/RABC/ | Multi-omics of Rheumatoid Arthritis |
| RBPimage | http://rnabiology.ircm.qc.ca/RBPImage/ | Microscopy reporting subcellular distribution of human RNA binding proteins |
| Ribocentre | https://www.ribocentre.org | Ribozymes |
| RiboXYZ | https://ribosome.xyz | Ribosome structures |
| Ribo_uORF | http://rnainformatics.org.cn/RiboUORF | uORFs from ribosome profiling data |
| RLBase | https://gccri.bishop-lab.uthscsa.edu/shiny/rlbase/ | R-loops in the human genome |
| RM2Target | http://rm2target.canceromics.org/ | Targets of RNA modification proteins |
| SPASCER | https://ccsm.uth.edu/SPASCER | Spatial transcriptomics annotation at single-cell resolution |
| SPEED | http://speedatlas.net | Single-cell pan-species atlas |
| SUMMER | http://njmu-edu.cn:3838/SUMMER/ | Biomarkers, GWAS and cancer survival |
| tatDB | https://grigoriev-lab.camden.rutgers.edu/tatdb | Experimentally supported targets of tRNA-derived fragments |
| TEDD | https://TEDD.obg.cuhk.edu.hk/ | Temporal Expression of Development Database |
| TFSyntax | https://tfsyntax.zhaopage.com | Transcription Factor binding syntax |
| Thing Metabolome Repository family | http://metabolites.in/things | LC–MS metabolomics data |
| TIMEDB | https://timedb.deepomics.org | RNAseq data and the tumor immune micro-environment |
| TmAlphaFold | https://tmalphafold.ttk.hu/ | AlphaFold TM protein predictions, membrane-embedded and assessed |
| tModBase | https://www.tmodbase.com/ | tRNA modifications: enzymes, dynamics, diseases |
| TOXRIC | http://toxric.bioinforai.tech/ | Toxicological data and benchmarks |
| TWAS Atlas | https://ngdc.cncb.ac.cn/twas/ | Transcriptome-wide association studies |
| UFCG | https://ufcg.steineggerlab.com | Fungal core genes |
Table 2.
Updated descriptions of databases most recently published elsewhere
| Database name | URL | Short description |
|---|---|---|
| Chemical Probes Portal | https://www.chemicalprobes.org/ | Chemical probe assessment and selection |
| CIViC | https://civicdb.org/ | Clinical interpretation of variants in cancer |
| EMPIAR | https://empiar.org/ | The Electron Microscopy Public Image Archive |
| EpiFactors | http://epifactors.autosome.org | Human epigenetic factors and complexes |
| FerrDb | http://www.zhounan.org/ferrdb/ | Ferroptosis regulators |
| SulfAtlas | https://sulfatlas.sb-roscoff.fr/ | Sulfatase families |
The ‘Nucleic acid databases’ section contains the first of the Issue's Breakthrough Articles, reporting on the Nucleic Acid Circular Dichroism Database (NACDDB; (4)). CD data can give insights into the folding, stability, dynamics and interactions of nucleic acids. NACDDB archives and disseminates the experimental spectra for the first time alongside the metadata describing the experiment and any associated structure models. At this early stage, the database is keen to receive new data and feedback directly from the community. A trio of new nucleic acid quadruplex-related databases also feature. G4Atlas (5) focuses on experimentally determined RNA G-quadruplexes (rG4s) across transcriptomes, determined by a variety of experimental methods, and accompanied by their classification into canonical and other types. QUADRAtlas (6) similarly focuses on rG4s, covering both experimental and predicted structures and including information on rG4-binding proteins, while GAIA (7) surveys predicted quadruplexes in both genomes and transcriptomes across all three kingdoms. RNA modifications are also addressed by three new databases. tModBase (8) focuses on modifications of tRNA, their dynamics and biomedical implications, and the enzymes involved. RM2Target (9), a development of the earlier m6A2Target database (10), covers writers, erasers and readers of nine RNA modifications as well as diverse annotations and biomedical implications. The third of the trio, DirectRMDB (11), collects data from direct RNA sequencing that captures quantitative RNA modifications, annotating the results in an isoform-specific manner. Long ncRNA data are covered in update papers from three popular resources.
lncRNASNP (12) reports hugely expanded content and a variety of new analyses and annotations, many focusing on diseases, especially cancer, while LncTarD (13) for experimentally-supported lncRNA-target interactions more than doubles in size and introduces new features, again including some focused on cancer. The third, LncBook (14), a curated database of human lncRNAs, reports improved multi-omics annotations including, for the first time, any experimentally-supported small proteins that they encode. Two further databases focus on uORFs: uORFDB (15) returns and expands on its original literature focus to now cover sequences and sequence variants across 13 eukaryotes, while the new Ribo_uORF (16) provides rich annotations of uORFs identified by ribosome profiling across six animal species. Elsewhere, popular returning databases include AnimalTFDB which doubles its content to cover 183 animals (17); mirDIP, the aggregative database of microRNA–target interactions (18) and UTRdb, the database of richly annotated 5′ and 3′ untranslated regions of mRNA (19).
The reverberations of the AlphaFold 2 (AF2) earthquake (20,21) continue to be felt across the database community. In the protein section, a new database TmAlphaFold (22) addresses the fact that AF2 has no explicit knowledge of the position of the lipid bilayer in which many proteins are embedded. By predicting the membrane embedding of models in the AlphaFold Protein Structure Database (AFDB; (23)) the new resource allows valuable extra validation of transmembrane helical protein models. Another new database, HProteome-BSite (24), annotates predicted binding sites and candidate ligands across AFDB models of the human proteome. The returning AlloMAPS database (25) now encompasses AFDB entries and other new-generation structure predictions, offering improved insights into the impact of mutations on allostery and helping design of allosteric drugs. The RCSB Protein Data Bank (26) reports that structure predictions, including AFDB, are now available via its website, alongside its core experimentally determined structures which now number almost 200 000. The update from GPCRdb (27) reports state-specific AF2 models alongside other new features such as lists of ligands for each receptor, both endogenous and surrogate. Even MobiDB (28), focusing on intrinsically disordered proteins, benefits from two AF2-related predictions unforeseen by the original methods developers: predictions of disordered regions and potential interaction motifs contained within them. Other classes of proteins with special structural properties are covered by the new AmyloGraph (29) which curates information on amyloid-amyloid interactions and the returning PhaSepDB (30) for proteins that can participate in phase separation, now doubled in size and with much more detailed annotations.
Other notable updating databases include the Biological Magnetic Resonance Data Bank (31); the eggNOG resource for comparative genomics (32) which more than doubles the number of species covered; the InterPro protein family compilation (33) which benefits from an improved interface that includes features inspired by the now-retiring Pfam website (34); and UniProt (35) which also has a redesigned website. The UniProt paper features interesting updates on the parallel and complementary annotation activities centred respectively on community curation and automatic rules-based methods.
In the section for metabolism, signalling and enzymes, REBASE, the popular database of restriction and modification enzymes returns (36) with genome coverage expanded tenfold and systems based on methylation having seen particularly dramatic growth. Other returning enzyme-focused databases include dbCAN_seq (37) which expands to include microbiome-derived carbohydrate-active enzymes, their encoding in gene clusters, and the prediction of substrates. The SulfAtlas database appears in NAR for the first time (38) and - recognising the increasing pace of research in the area and having demonstrated the reliability of the approach - switches to HMM-based (sub-)family assignment from fully manual updating. Meanwhile, the new database MeDBA (39) offers a welcome compendium of information on metalloenzyme sequences, families, structures and interactions. Finally on enzymes, GotEnzymes (40) predicts turnover numbers for 25 million enzyme-substrate pairs using AI methods. Elsewhere, the cornerstone resource in pathway analysis KEGG (41) contributes an update reporting on new genome and taxonomy browsers, while the equally foundational database STRING (42) reports on new co-expression data sources, improved interaction confidence estimates and the ability to process whole new genomes. Other well-used returning databases include MIBiG, whose update paper (43) has an interesting focus on annotation, including online ‘annotathons’; and SIGNOR (44) which has new data, a new interface, and new links to related projects focusing on diseases, including COVID-19. Finally, the new CovInter database (45) captures data on interactions between Coronavirus RNAs and host proteins.
The section on microbial and viral genomics leads off with a paper from the newly formed Bacterial and Viral Bioinformatics Resource Center (46), the successor to no fewer than three individual databases familiar to NAR readers - PATRIC (47), IRD (48) and ViPR (49). Resources hitherto confined to either bacterial or viral databases have been made available to both communities. In a similar space, the IMG/M and IMG/VR databases, for microbes and viruses respectively, each contribute updates (50,51). New features of the former include an improved genome context viewer while the latter has fresh tools for detection of metagenome-derived viral genomes and prediction of their hosts. The update from the popular MGnify resource for metagenomics data (52), recognising the scale and continued growth of the database, reports interestingly on the adoption of Deep Learning methods to annotate protein sequences with Pfam families (53). The importance of the microbiome to the colonised host is demonstrated by two databases. The returning GutMDisorder database (54), focused on human and mouse gut microbiomes, associates microbes with phenotypes and therapeutic interventions, now including data-derived links as well as those curated from the literature; while the new arrival CRAMdb (55) compiles microbiome data for an impressive 500 different animal species and offers sophisticated facilities to compare between different body locations or between different animals. Elsewhere, fungi are covered by a new database of fungal core genes UFCG (56) enabling easy and automated phylogenetic analysis of the kingdom; and the Fungal Names resource for fungal taxonomy (57) which also includes information on specimens, culture collections, publications and so on. 
Finally, an intriguing new arrival is ProPan (58), a resource for prokaryotic pan-genomes that allows for inference of core and dispensable genes across isolates of 1500 prokaryotes with implications for understanding of environmental adaptation and genome dynamics.
In the human, model organism and comparative genomics section, the Issue's second Breakthrough Article describes CEDAR, the Cancer Epitope Database and Analysis Resource (59). A companion database to the hugely popular Immune Epitope Database (IEDB; (60)), and borrowing its carefully standardised protocols, CEDAR covers cancer epitope and receptor data curated from the literature. Notably, the CEDAR authors consulted cancer immunology experts before designing the new user interface and the new database will significantly support resurgent interest in antigen-based immunotherapies. Also in the immunology area, the new PGG.MHC database (61) majors on the population genetics of HLA, especially in Asia; while updates from the heavily used IPD-IMGT/HLA include a refreshed website (62). Other major resources updating include Ensembl (63), the UCSC Genome Browser (64) and GWAS Central (65). Ensembl is recognising the increasing availability of multiple high-quality sequences for model species and now offers its first human pangenome graphs, but has also doubled the number of genomes covered, reaching out across species lacking transcriptomics data by employing alternative tools. New tracks at the UCSC Genome Browser are focused particularly on clinical annotations and single cell RNA-seq data, while the SARS-CoV-2 browser continues to be regularly updated with new variants of concern. The new COMBATdb database (66) covers the blood multi-omics of COVID-19 patients compared to healthy controls and will aid in the identification of biomarkers and therapeutic targets. A number of new databases focus on single cell transcriptomics particularly for human. HTCA (67) and HUSCH (68) each offer data on millions of single cells and rich suites of analytic tools while ABC Portal (69) focuses on blood cells, with relevance to blood cancers, and AgeAnno (70) majors on aging and additionally integrates information on chromatin accessibility and transcription factor binding. 
The ambitious SPEED (71), in contrast, looks across 127 species and multiple modalities of single cell data, allowing for sophisticated comparative analyses. Cell identity and cell fate during development are covered by several resources. The popular returning CellMarker (72), for human and mouse, reports significantly expanded content, both in terms of number and type of marker but also encompassing new sequencing technologies; while the new Cell Taxonomy database (73) covers thousands of curated cell types across 34 species. Another new database TEDD (74) looks at temporal (co-) expression and chromatin accessibility during development in model organisms while Lineage Landscape (75) explores similar themes, additionally covering epigenomics data. Finally, interesting perspectives on genome evolution are offered by the returning HGTree (76), now covering horizontal gene transfer in eight times as many prokaryotic species, and the new DupScan (77) which provides detailed insights into whole genome duplications in vertebrates.
As usual, cancer has a strong presence in the section on human genomic variation, diseases and drugs. The returning canSAR database (78), which reports updates including druggability assessments using AF2 models, is joined by the ASCancer Atlas (79) that focuses on oncogenic splicing events, including annotation of upstream regulators and downstream impact; and by CREAMMIST (80), which offers an improved integrative understanding of cancer cell drug responses. Two new databases, BIC (81) and microbioTA (82), recognise the importance of the microbiota in cancer, profiling composition and abundance at different sites in comparison with controls. For drug development PubChem is a foundational resource and its update here (83) reports on 120 new data sources, with patent coverage particularly strengthened. In the same area ChemFOnt (84) brings a new hierarchical ontology describing functions of biologically important chemicals. Major returning resources for drugs and their targets include DrugCentral (85), which now covers veterinary drugs too and has new data on adverse drug events, and TCRD/Pharos (86) which has incorporated fresh data sources and new data visualisations such as clever circular treemaps to intuitively map expression onto ontologies. The toxicological properties of drugs and other chemicals are covered by the popular CTD (87) which introduces ‘CTD tetramers’, information blocks containing chemical, gene, phenotype and disease compiled from pairwise interactions; and the newcomer TOXRIC (88), which is especially notable for offering standardised benchmarks ready for use by Machine Learning methods. An interesting newcomer in the pharmacology area is DRESIS (89) bringing superbly comprehensive coverage of drug resistance mechanisms. Elsewhere, inferring the potentially pathogenic consequences of sequence variants remains a major preoccupation.
In this space FAVOR (90) is a significant new arrival, which comes accompanied by the FAVORannotator software, whose efficiency makes it an appealing option for cloud settings; while CIViC (91), publishing in NAR for the first time, focuses specifically on cancer and reports success in engaging with relevant communities for crucial ongoing curation alongside an appeal for more editors to step forward.
The final sections cover plants and then databases not comfortably accommodated elsewhere. CuGenDB (92) covers cucurbits, such as the important crops cucumber and melon, and contributes an update paper reporting a new focus on detection and annotation of genomic variants, as well as improved gene expression options. Also on crops, the new CottonMD (93) compiles multi-omics information including metabolomes, facilitating the identification of trait-linked features and speeding future plant breeding efforts. Nod-like receptors (NLRs) are key proteins in plant disease resistance and the new NLRscape (94) offers resources and analyses of sequences, families and structures (from AF2). A notable new arrival in NAR included in the last section is EMPIAR (95). Central to storing the raw data behind cryo-EM experiments, EMPIAR also archives volume EM and X-ray tomography images and provides visualisation tools for them. Further popular databases reporting updates this year include the Chemical Probes Portal (96) which has quintupled in size since first publication, and ProteomeXchange (97), the consortium of high-throughput proteomics resources, which celebrates 10 years. Finally, LitCovid reports an update (98) on its efforts to capture and curate the literature on COVID-19 with new emphasis on long COVID-19, variants and vaccines; while the collection of 3000 culture media for 40 000 microbial strains at MediaDive (99) will undoubtedly prove invaluable for microbiologists.
NAR ONLINE MOLECULAR BIOLOGY DATABASE COLLECTION
For this 30th release of the NAR online Molecular Biology Database Collection (as usual freely available at http://www.oxfordjournals.org/nar/database/c/), we updated 463 entries, added 92 new resources and removed 96 discontinued ones, bringing the total collection to 1764 databases. We appreciate feedback, since errors can persist in older entries: thanks to the community we identified a number of spurious entries, which have been corrected as part of our ongoing curation process. Our effort to keep the resource up to date also relies on continuous scanning of current entries to detect discontinued services, as well as scanning for new entries. We encourage authors to submit their updates to XMF at xose.m.fernandez@gmail.com in plain text, ideally according to the template found at http://www.oxfordjournals.org/nar/database/summary/1.
ACKNOWLEDGEMENTS
We thank Dr Martine Bernardes-Silva, especially, and the rest of the Oxford University Press team led by Joanna Ventikos for their help in compiling this Issue.
Contributor Information
Daniel J Rigden, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK.
Xosé M Fernández, IQVIA Ltd., The Point, 37 North Wharf Road, London W2 1AF, UK.
FUNDING
Funding for open access charge: Oxford University Press.
Conflict of interest statement. The authors' opinions do not necessarily reflect the views of their respective institutions.
REFERENCES
- 1. Thakur M., Bateman A., Brooksbank C., Freeberg M., Harrison M., Hartley M., Keane T., Kleywegt G., Leach A., Levchenko M.et al.. EMBL’s European Bioinformatics Institute (EMBL-EBI) in 2022. Nucleic Acids Res. 2022; 10.1093/nar/gkac1098. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2. Sayers E.W., Bolton E.E., Brister J.R., Canese K., Chan J., Comeau D.C., Farrell C.M., Feldgarden M., Fine A.M., Funk K.et al.. Database resources of the National Center for Biotechnology Information in 2023. Nucleic Acids Res. 2022; 10.1093/nar/gkac1032. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. CNCB-NGDC Members and Partners Database resources of the National Genomics Data Center, China National Center for Bioinformation in 2023. Nucleic Acids Res. 2022; 10.1093/nar/gkac1073. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Cappannini A., Mosca K., Mukherjee S., Moafinejad S.N., Sinden R.R., Arluison V., Bujnicki J., Wien F.. NACDDB: nucleic acid circular dichroism Database. Nucleic Acids Res. 2022; 10.1093/nar/gkac829. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Yu H., Qi Y., Yang B., Yang X., Ding Y.. G4Atlas: a comprehensive transcriptome-wide G-quadruplex database. Nucleic Acids Res. 2022; 10.1093/nar/gkac896. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Bourdon S., Herviou P., Dumas L., Destefanis E., Zen A., Cammas A., Millevoi S., Dassi E.. QUADRatlas: the RNA G-quadruplex and RG4-binding proteins database. Nucleic Acids Res. 2022; 10.1093/nar/gkac782. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Vannutelli A., Schell L.L.N., Perreault J.-P., Ouangraoua A.. GAIA: g-quadruplexes in alive creature database. Nucleic Acids Res. 2022; 10.1093/nar/gkac657. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Lei H.-T., Wang Z.-H., Li B., Sun Y., Mei S.-Q., Yang J.-H., Qu L.-H., Zheng L.-L. tModBase: deciphering the landscape of tRNA modifications and their dynamic changes from epitranscriptome data. Nucleic Acids Res. 2022; 10.1093/nar/gkac1087.
- 9. Bao X., Zhang Y., Li H., Teng Y., Ma L., Chen Z., Luo X., Zheng J., Zhao A., Ren J. et al. RM2Target: a comprehensive database for targets of writers, erasers and readers of RNA modifications. Nucleic Acids Res. 2022; 10.1093/nar/gkac945.
- 10. Deng S., Zhang H., Zhu K., Li X., Ye Y., Li R., Liu X., Lin D., Zuo Z., Zheng J. M6A2Target: a comprehensive database for targets of m6A writers, erasers and readers. Brief. Bioinf. 2021; 22:bbaa055.
- 11. Zhang Y., Jiang J., Ma J., Wei Z., Wang Y., Song B., Meng J., Jia G., de Magalhães J.P., Rigden D.J. et al. DirectRMDB: a database of post-transcriptional RNA modifications unveiled from direct RNA sequencing technology. Nucleic Acids Res. 2022; 10.1093/nar/gkac1061.
- 12. Yang Y., Wang D., Miao Y.-R., Wu X., Luo H., Cao W., Yang W., Yang J., Guo A.-Y., Gong J. lncRNASNP v3: an updated database for functional variants in long non-coding RNAs. Nucleic Acids Res. 2022; 10.1093/nar/gkac981.
- 13. Zhao H., Yin X., Xu H., Liu K., Liu W., Wang L., Zhang C., Bo L., Lan X., Lin S. et al. LncTarD 2.0: an updated comprehensive database for experimentally-supported functional lncRNA-target regulations in human diseases. Nucleic Acids Res. 2022; 10.1093/nar/gkac984.
- 14. Li Z., Liu L., Feng C., Qin Y., Xiao J., Zhang Z., Ma L. LncBook 2.0: integrating human long non-coding RNAs with multi-omics annotations. Nucleic Acids Res. 2022; 10.1093/nar/gkac999.
- 15. Manske F., Ogoniak L., Jürgens L., Grundmann N., Makałowski W., Wethmar K. The new uORFdb: integrating literature, sequence, and variation data in a central hub for uORF research. Nucleic Acids Res. 2022; 10.1093/nar/gkac899.
- 16. Liu Q., Peng X., Shen M., Qian Q., Xing J., Li C., Gregory R.I. Ribo-uORF: a comprehensive data resource of upstream open reading frames (uORFs) based on ribosome profiling. Nucleic Acids Res. 2022; 10.1093/nar/gkac1094.
- 17. Shen W.-K., Chen S.-Y., Gan Z.-Q., Zhang Y.-Z., Yue T., Chen M.-M., Xue Y., Hu H., Guo A.-Y. AnimalTFDB 4.0: a comprehensive animal transcription factor database updated with variation and expression annotations. Nucleic Acids Res. 2022; 10.1093/nar/gkac907.
- 18. Hauschild A.-C., Pastrello C., Ekaputeri G.K.A., Bethune-Waddell D., Abovsky M., Ahmed Z., Kotlyar M., Lu R., Jurisica I. MirDIP 5.2: tissue context annotation and novel microRNA curation. Nucleic Acids Res. 2022; 10.1093/nar/gkac1070.
- 19. Lo Giudice C., Zambelli F., Chiara M., Pavesi G., Tangaro M.A., Picardi E., Pesole G. UTRdb 2.0: a comprehensive, expert curated catalog of eukaryotic mRNAs untranslated regions. Nucleic Acids Res. 2022; 10.1093/nar/gkac1016.
- 20. Pereira J., Simpkin A.J., Hartmann M.D., Rigden D.J., Keegan R.M., Lupas A.N. High-accuracy protein structure prediction in CASP14. Proteins. 2021; 89:1687–1699.
- 21. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A. et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021; 596:583–589.
- 22. Dobson L., Szekeres L.I., Gerdán C., Langó T., Zeke A., Tusnády G.E. TmAlphaFold database: membrane localization and evaluation of AlphaFold2 predicted alpha-helical transmembrane protein structures. Nucleic Acids Res. 2022; 10.1093/nar/gkac928.
- 23. Varadi M., Anyango S., Deshpande M., Nair S., Natassia C., Yordanova G., Yuan D., Stroe O., Wood G., Laydon A. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022; 50:D439–D444.
- 24. Sim J., Kwon S., Seok C. HProteome-BSite: predicted binding sites and ligands in human 3D proteome. Nucleic Acids Res. 2022; 10.1093/nar/gkac873.
- 25. Tan Z.W., Tee W.-V., Guarnera E., Berezovsky I.N. AlloMAPS 2: allosteric fingerprints of the AlphaFold and Pfam-trRosetta predicted structures for engineering and design. Nucleic Acids Res. 2022; 10.1093/nar/gkac828.
- 26. Burley S.K., Bhikadiya C., Bi C., Bittrich S., Chao H., Chen L., Craig P.A., Crichlow G.V., Dalenberg K., Duarte J.M. et al. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res. 2022; 10.1093/nar/gkac1077.
- 27. Pándy-Szekeres G., Caroli J., Mamyrbekov A., Kermani A.A., Keserű G.M., Kooistra A.J., Gloriam D.E. GPCRdb in 2023: state-specific structure models using AlphaFold2 and new ligand resources. Nucleic Acids Res. 2022; 10.1093/nar/gkac1013.
- 28. Piovesan D., Del Conte A., Clementel D., Monzon A.M., Bevilacqua M., Aspromonte M.C., Iserte J.A., Orti F.E., Marino-Buslje C., Tosatto S.C.E. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res. 2022; 10.1093/nar/gkac1065.
- 29. Burdukiewicz M., Rafacz D., Barbach A., Hubicka K., Bąkała L., Lassota A., Stecko J., Szymańska N., Wojciechowski J.W., Kozakiewicz D. et al. AmyloGraph: a comprehensive database of amyloid-amyloid interactions. Nucleic Acids Res. 2022; 10.1093/nar/gkac882.
- 30. Hou C., Wang X., Xie H., Chen T., Zhu P., Xu X., You K., Li T. PhaSepDB in 2022: annotating phase separation-related proteins with droplet states, co-phase separation partners and other experimental information. Nucleic Acids Res. 2022; 10.1093/nar/gkac783.
- 31. Hoch J.C., Baskaran K., Burr H., Chin J., Eghbalnia H., Fujiwara T., Gryk M.R., Iwata T., Kojima C., Kurisu G. et al. Biological Magnetic Resonance Data Bank. Nucleic Acids Res. 2022; 10.1093/nar/gkac1050.
- 32. Hernández-Plaza A., Szklarczyk D., Botas J., Cantalapiedra C.P., Giner-Lamia J., Mende D.R., Kirsch R., Rattei T., Letunic I., Jensen L.J. et al. eggNOG 6.0: enabling comparative genomics across 12 535 organisms. Nucleic Acids Res. 2022; 10.1093/nar/gkac1022.
- 33. Paysan-Lafosse T., Blum M., Chuguransky S., Grego T., Pinto B.L., Salazar G.A., Bileschi M.L., Bork P., Bridge A., Colwell L. et al. InterPro in 2022. Nucleic Acids Res. 2022; 10.1093/nar/gkac993.
- 34. Mistry J., Chuguransky S., Williams L., Qureshi M., Salazar G.A., Sonnhammer E.L.L., Tosatto S.C.E., Paladin L., Raj S., Richardson L.J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 2021; 49:D412–D419.
- 35. The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 2022; 10.1093/nar/gkac1052.
- 36. Roberts R.J., Vincze T., Posfai J., Macelis D. REBASE: a database for DNA restriction and modification: enzymes, genes and genomes. Nucleic Acids Res. 2022; 10.1093/nar/gkac975.
- 37. Zheng J., Hu B., Zhang X., Ge Q., Yan Y., Akresi J., Piyush V., Huang L., Yin Y. dbCAN-seq update: CAZyme gene clusters and substrates in microbiomes. Nucleic Acids Res. 2022; 10.1093/nar/gkac1068.
- 38. Stam M., Lelièvre P., Hoebeke M., Corre E., Barbeyron T., Michel G. SulfAtlas, the sulfatase database: state of the art and new developments. Nucleic Acids Res. 2022; 10.1093/nar/gkac977.
- 39. Yu J.-L., Wu S., Zhou C., Dai Q.-Q., Schofield C.J., Li G.-B. MeDBA: the Metalloenzyme Data Bank and Analysis platform. Nucleic Acids Res. 2022; 10.1093/nar/gkac860.
- 40. Li F., Chen Y., Anton M., Nielsen J. GotEnzymes: an extensive database of enzyme parameter predictions. Nucleic Acids Res. 2022; 10.1093/nar/gkac831.
- 41. Kanehisa M., Furumichi M., Sato Y., Kawashima M., Ishiguro-Watanabe M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 2022; 10.1093/nar/gkac963.
- 42. Szklarczyk D., Kirsch R., Koutrouli M., Nastou K., Mehryary F., Hachilif R., Gable A.L., Fang T., Doncheva N.T., Pyysalo S. et al. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2022; 10.1093/nar/gkac1000.
- 43. Terlouw B.R., Blin K., Navarro-Muñoz J.C., Avalon N.E., Chevrette M.G., Egbert S., Lee S., Meijer D., Recchia M.J.J., Reitz Z.L. et al. MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters. Nucleic Acids Res. 2022; 10.1093/nar/gkac1049.
- 44. Lo Surdo P., Iannuccelli M., Contino S., Castagnoli L., Licata L., Cesareni G., Perfetto L. SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update. Nucleic Acids Res. 2022; 10.1093/nar/gkac883.
- 45. Amahong K., Zhang W., Zhou Y., Zhang S., Yin J., Li F., Xu H., Yan T., Yue Z., Liu Y. et al. CovInter: interaction data between coronavirus RNAs and host proteins. Nucleic Acids Res. 2022; 10.1093/nar/gkac834.
- 46. Olson R.D., Assaf R., Brettin T., Conrad N., Cucinell C., Davis J.J., Dempsey D.M., Dickerman A., Dietrich E.M., Kenyon R.W. et al. Introducing the Bacterial and Viral Bioinformatics Resource Center (BV-BRC): a resource combining PATRIC, IRD and ViPR. Nucleic Acids Res. 2022; 10.1093/nar/gkac1003.
- 47. Davis J.J., Wattam A.R., Aziz R.K., Brettin T., Butler R., Butler R.M., Chlenski P., Conrad N., Dickerman A., Dietrich E.M. et al. The PATRIC Bioinformatics Resource Center: expanding data and analysis capabilities. Nucleic Acids Res. 2020; 48:D606–D612.
- 48. Zhang Y., Aevermann B.D., Anderson T.K., Burke D.F., Dauphin G., Gu Z., He S., Kumar S., Larsen C.N., Lee A.J. et al. Influenza Research Database: an integrated bioinformatics resource for influenza virus research. Nucleic Acids Res. 2017; 45:D466–D474.
- 49. Pickett B.E., Sadat E.L., Zhang Y., Noronha J.M., Squires R.B., Hunt V., Liu M., Kumar S., Zaremba S., Gu Z. et al. ViPR: an open bioinformatics database and analysis resource for virology research. Nucleic Acids Res. 2012; 40:D593–D598.
- 50. Chen I.-M.A., Chu K., Palaniappan K., Ratner A., Huang J., Huntemann M., Hajek P., Ritter S.J., Webb C., Wu D. et al. The IMG/M data management and analysis system v.7: content updates and new features. Nucleic Acids Res. 2022; 10.1093/nar/gkac976.
- 51. Camargo A.P., Nayfach S., Chen I.-M.A., Palaniappan K., Ratner A., Chu K., Ritter S.J., Reddy T.B.K., Mukherjee S., Schulz F. et al. IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata. Nucleic Acids Res. 2022; 10.1093/nar/gkac1037.
- 52. Richardson L., Allen B., Baldi G., Beracochea M., Bileschi M.L., Burdett T., Burgin J., Caballero-Pérez J., Cochrane G., Colwell L.J. et al. MGnify – the microbiome sequence data analysis resource in 2023. Nucleic Acids Res. 2022; 10.1093/nar/gkac1080.
- 53. Bileschi M.L., Belanger D., Bryant D.H., Sanderson T., Carter B., Sculley D., Bateman A., DePristo M.A., Colwell L.J. Using deep learning to annotate the protein universe. Nat. Biotechnol. 2022; 40:932–937.
- 54. Qi C., Cai Y., Qian K., Li X., Ren J., Wang P., Fu T., Zhao T., Cheng L., Shi L. et al. gutMDisorder v2.0: a comprehensive database for dysbiosis of gut microbiota in phenotypes and interventions. Nucleic Acids Res. 2022; 10.1093/nar/gkac871.
- 55. Lei B., Xu Y., Lei Y., Li C., Zhou P., Wang L., Yang Q., Li X., Li F., Liu C. et al. CRAMdb: a comprehensive database for composition and roles of microbiome in animals. Nucleic Acids Res. 2022; 10.1093/nar/gkac973.
- 56. Kim D., Gilchrist C.L.M., Chun J., Steinegger M. UFCG: database of universal fungal core genes and pipeline for genome-wide phylogenetic analysis of fungi. Nucleic Acids Res. 2022; 10.1093/nar/gkac894.
- 57. Wang F., Wang K., Cai L., Zhao M., Kirk P.M., Fan G., Sun Q., Li B., Wang S., Yu Z. et al. Fungal names: a comprehensive nomenclatural repository and knowledge base for fungal taxonomy. Nucleic Acids Res. 2022; 10.1093/nar/gkac926.
- 58. Zhang Y., Zhang H., Zhang Z., Qian Q., Zhang Z., Xiao J. ProPan: a comprehensive database for profiling prokaryotic pan-genome dynamics. Nucleic Acids Res. 2022; 10.1093/nar/gkac832.
- 59. Koşaloğlu-Yalçın Z., Blazeska N., Vita R., Carter H., Nielsen M., Schoenberger S., Sette A., Peters B. The Cancer Epitope Database and Analysis Resource (CEDAR). Nucleic Acids Res. 2022; 10.1093/nar/gkac902.
- 60. Vita R., Mahajan S., Overton J.A., Dhanda S.K., Martini S., Cantrell J.R., Wheeler D.K., Sette A., Peters B. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 2019; 47:D339–D343.
- 61. Zhao X., Ma S., Wang B., Jiang X., Han100K Initiative, Xu S. PGG.MHC: toward understanding the diversity of major histocompatibility complexes in human populations. Nucleic Acids Res. 2022; 10.1093/nar/gkac997.
- 62. Barker D.J., Maccari G., Georgiou X., Cooper M.A., Flicek P., Robinson J., Marsh S.G.E. The IPD-IMGT/HLA Database. Nucleic Acids Res. 2022; 10.1093/nar/gkac1011.
- 63. Martin F.J., Amode M.R., Aneja A., Austine-Orimoloye O., Azov A.G., Barnes I., Becker A., Bennett R., Berry A., Bhai J. et al. Ensembl 2023. Nucleic Acids Res. 2022; 10.1093/nar/gkac958.
- 64. Nassar L.R., Barber G.P., Benet-Pagès A., Casper J., Clawson H., Diekhans M., Fischer C., Gonzalez J.N., Hinrichs A.S., Lee B.T. et al. The UCSC Genome Browser database: 2023 update. Nucleic Acids Res. 2022; 10.1093/nar/gkac1072.
- 65. Beck T., Rowlands T., Shorter T., Brookes A.J. GWAS Central: an expanding resource for finding and visualising genotype and phenotype data from genome-wide association studies. Nucleic Acids Res. 2022; 10.1093/nar/gkac1017.
- 66. Wang D., Kumar V., Burnham K.L., Mentzer A.J., Marsden B.D., Knight J.C. COMBATdb: a database for the COVID-19 Multi-Omics Blood ATlas. Nucleic Acids Res. 2022; 10.1093/nar/gkac1019.
- 67. Pan L., Shan S., Tremmel R., Li W., Liao Z., Shi H., Chen Q., Zhang X., Li X. HTCA: a database with an in-depth characterization of the single-cell human transcriptome. Nucleic Acids Res. 2022; 10.1093/nar/gkac791.
- 68. Shi X., Yu Z., Ren P., Dong X., Ding X., Song J., Zhang J., Li T., Wang C. HUSCH: an integrated single-cell transcriptome atlas for human tissue gene expression visualization and analyses. Nucleic Acids Res. 2022; 10.1093/nar/gkac1001.
- 69. Gao X., Hong F., Hu Z., Zhang Z., Lei Y., Li X., Cheng T. ABC portal: a single-cell database and web server for blood cells. Nucleic Acids Res. 2022; 10.1093/nar/gkac646.
- 70. Huang K., Gong H., Guan J., Zhang L., Hu C., Zhao W., Huang L., Zhang W., Kim P., Zhou X. AgeAnno: a knowledgebase of single-cell annotation of aging in human. Nucleic Acids Res. 2022; 10.1093/nar/gkac847.
- 71. Chen Y., Zhang X., Peng X., Jin Y., Ding P., Xiao J., Li C., Wang F., Chang A., Yue Q. et al. SPEED: single-cell pan-species atlas in the light of Ecology and Evolution for Development and diseases. Nucleic Acids Res. 2022; 10.1093/nar/gkac930.
- 72. Hu C., Li T., Xu Y., Zhang X., Li F., Bai J., Chen J., Jiang W., Yang K., Ou Q. et al. CellMarker 2.0: an updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data. Nucleic Acids Res. 2022; 10.1093/nar/gkac947.
- 73. Jiang S., Qian Q., Zhu T., Zong W., Shang Y., Jin T., Zhang Y., Chen M., Wu Z., Chu Y. et al. Cell Taxonomy: a curated repository of cell types with multifaceted characterization. Nucleic Acids Res. 2022; 10.1093/nar/gkac816.
- 74. Zhou Z., Tan C., Chau M.H.K., Jiang X., Ke Z., Chen X., Cao Y., Kwok Y.K., Bellgard M., Leung T.Y. et al. TEDD: a database of temporal gene expression patterns during multiple developmental periods in human and model organisms. Nucleic Acids Res. 2022; 10.1093/nar/gkac978.
- 75. Yan H., Wang R., Ma S., Huang D., Wang S., Ren J., Lu C., Chen X., Lu X., Zheng Z. et al. Lineage Landscape: a comprehensive database that records lineage commitment across species. Nucleic Acids Res. 2022; 10.1093/nar/gkac951.
- 76. Choi Y., Ahn S., Park M., Lee S., Cho S., Kim H. HGTree v2.0: a comprehensive database update for horizontal gene transfer (HGT) events detected by the tree-reconciliation method. Nucleic Acids Res. 2022; 10.1093/nar/gkac929.
- 77. Lu J., Huang P., Sun J., Liu J. DupScan: predicting and visualizing vertebrate genome duplication database. Nucleic Acids Res. 2022; 10.1093/nar/gkac718.
- 78. di Micco P., Antolin A.A., Mitsopoulos C., Villasclaras-Fernandez E., Sanfelice D., Dolciami D., Ramagiri P., Mica I.L., Tym J.E., Gingrich P.W. et al. canSAR: update to the cancer translational research and drug discovery knowledgebase. Nucleic Acids Res. 2022; 10.1093/nar/gkac1004.
- 79. Wu S., Huang Y., Zhang M., Gong Z., Wang G., Zheng X., Zong W., Zhao W., Xing P., Li R. et al. ASCancer Atlas: a comprehensive knowledgebase of alternative splicing in human cancers. Nucleic Acids Res. 2022; 10.1093/nar/gkac955.
- 80. Yingtaweesittikul H., Wu J., Mongia A., Peres R., Ko K., Nagarajan N., Suphavilai C. CREAMMIST: an integrative probabilistic database for cancer drug response prediction. Nucleic Acids Res. 2022; 10.1093/nar/gkac911.
- 81. Chen K.-P., Hsu C.-L., Oyang Y.-J., Huang H.-C., Juan H.-F. BIC: a database for the transcriptional landscape of bacteria in cancer. Nucleic Acids Res. 2022; 10.1093/nar/gkac891.
- 82. Wang P., Zhang S., He G., Du M., Qi C., Liu R., Zhang S., Cheng L., Shi L., Zhang X. microbioTA: an atlas of the microbiome in multiple disease tissues of Homo sapiens and Mus musculus. Nucleic Acids Res. 2022; 10.1093/nar/gkac851.
- 83. Kim S., Chen J., Cheng T., Gindulyte A., He J., He S., Li Q., Shoemaker B.A., Thiessen P.A., Yu B. et al. PubChem 2023 update. Nucleic Acids Res. 2022; 10.1093/nar/gkac956.
- 84. Wishart D.S., Girod S., Peters H., Oler E., Jovel J., Budinski Z., Milford R., Lui V.W., Sayeeda Z., Mah R. et al. ChemFOnt: the chemical functional ontology resource. Nucleic Acids Res. 2022; 10.1093/nar/gkac919.
- 85. Avram S., Wilson T.B., Curpan R., Halip L., Borota A., Bora A., Bologa C.G., Holmes J., Knockel J., Yang J.J. et al. DrugCentral 2023 extends human clinical data and integrates veterinary drugs. Nucleic Acids Res. 2022; 10.1093/nar/gkac1085.
- 86. Kelleher K.J., Sheils T.K., Mathias S.L., Yang J.J., Metzger V.T., Siramshetty V.B., Nguyen D.-T., Jensen L.J., Vidović D., Schürer S.C. et al. Pharos 2023: an integrated resource for the understudied human proteome. Nucleic Acids Res. 2022; 10.1093/nar/gkac1033.
- 87. Davis A.P., Wiegers T.C., Johnson R.J., Sciaky D., Wiegers J., Mattingly C.J. Comparative Toxicogenomics Database (CTD): update 2023. Nucleic Acids Res. 2022; 10.1093/nar/gkac833.
- 88. Wu L., Yan B., Han J., Li R., Xiao J., He S., Bo X. TOXRIC: a comprehensive database of toxicological data and benchmarks. Nucleic Acids Res. 2022; 10.1093/nar/gkac1074.
- 89. Sun X., Zhang Y., Li H., Zhou Y., Shi S., Chen Z., He X., Zhang H., Li F., Yin J. et al. DRESIS: the first comprehensive landscape of drug resistance information. Nucleic Acids Res. 2022; 10.1093/nar/gkac812.
- 90. Zhou H., Arapoglou T., Li X., Li Z., Zheng X., Moore J., Asok A., Kumar S., Blue E.E., Buyske S. et al. FAVOR: functional annotation of variants online resource and annotator for variation across the human genome. Nucleic Acids Res. 2022; 10.1093/nar/gkac966.
- 91. Krysiak K., Danos A.M., Saliba J., McMichael J.F., Coffman A.C., Kiwala S., Barnell E.K., Sheta L., Grisdale C.J., Kujan L. et al. CIViCdb 2022: evolution of an open-access cancer variant interpretation knowledgebase. Nucleic Acids Res. 2022; 10.1093/nar/gkac979.
- 92. Yu J., Wu S., Sun H., Wang X., Tang X., Guo S., Zhang Z., Huang S., Xu Y., Weng Y. et al. CuGenDBv2: an updated database for cucurbit genomics. Nucleic Acids Res. 2022; 10.1093/nar/gkac921.
- 93. Yang Z., Wang J., Huang Y., Wang S., Wei L., Liu D., Weng Y., Xiang J., Zhu Q., Yang Z. et al. CottonMD: a multi-omics database for cotton biological study. Nucleic Acids Res. 2022; 10.1093/nar/gkac863.
- 94. Martin E.C., Ion C.F., Ifrimescu F., Spiridon L., Bakker J., Goverse A., Petrescu A.-J. NLRscape: an atlas of plant NLR proteins. Nucleic Acids Res. 2022; 10.1093/nar/gkac1014.
- 95. Iudin A., Korir P.K., Somasundharam S., Weyand S., Cattavitello C., Fonseca N., Salih O., Kleywegt G.J., Patwardhan A. EMPIAR: the Electron Microscopy Public Image Archive. Nucleic Acids Res. 2022; 10.1093/nar/gkac1062.
- 96. Antolin A.A., Sanfelice D., Crisp A., Villasclaras Fernandez E., Mica I.L., Chen Y., Collins I., Edwards A., Müller S., Al-Lazikani B. et al. The Chemical Probes Portal: an expert review-based public resource to empower chemical probe assessment, selection and use. Nucleic Acids Res. 2022; 10.1093/nar/gkac909.
- 97. Deutsch E.W., Bandeira N., Perez-Riverol Y., Sharma V., Carver J.J., Mendoza L., Kundu D.J., Wang S., Bandla C., Kamatchinathan S. et al. The ProteomeXchange consortium at 10 years: 2023 update. Nucleic Acids Res. 2022; 10.1093/nar/gkac1040.
- 98. Chen Q., Allot A., Leaman R., Wei C.-H., Aghaarabi E., Guerrerio J.J., Xu L., Lu Z. LitCovid in 2022: an information resource for the COVID-19 literature. Nucleic Acids Res. 2022; 10.1093/nar/gkac1005.
- 99. Koblitz J., Halama P., Spring S., Thiel V., Baschien C., Hahnke R.L., Pester M., Overmann J., Reimer L.C. MediaDive: the expert-curated cultivation media database. Nucleic Acids Res. 2022; 10.1093/nar/gkac803.
