Table 1.
Data type | Database | Task type | Prediction tasks |
---|---|---|---|
General | Long Range Graph Benchmark (Dwivedi et al. 2022b) | Edge-level | Molecular bond |
General | Long Range Graph Benchmark (Dwivedi et al. 2022b) | Graph-level | Peptide function, peptide structure |
General | Open Biomedical Network Benchmark (Liu and Krishnan 2024) | Node-level | Protein function |
General | Open Biomedical Network Benchmark (Liu and Krishnan 2024) | Edge-level | Disease–gene association |
General | Open Graph Benchmark (Hu et al. 2020b) | Node-level | Protein function |
General | Open Graph Benchmark (Hu et al. 2020b) | Edge-level | Protein–protein association, drug–drug interaction, heterogeneous interaction, vessels in mouse brain |
General | Open Graph Benchmark (Hu et al. 2020b) | Graph-level | Molecular property, species-specific protein association |
General | SubGNN Benchmarks (Alsentzer et al. 2020) | Subgraph-level | Proteins associated with biological process, rare neurological disorders phenotype-based diagnosis, and rare metabolic disorders phenotype-based diagnosis |
General | Temporal Graph Benchmark (Huang et al. 2024) | Node-level | Dynamic node affinity prediction |
General | Temporal Graph Benchmark (Huang et al. 2024) | Edge-level | Dynamic link prediction |
Knowledge graph | PrimeKG (Chandak et al. 2023) | Node-level | Identity of protein/gene, disease, drug, biological process, pathway, phenotype, molecular function, cellular component, exposure, and anatomical region |
Knowledge graph | PrimeKG (Chandak et al. 2023) | Edge-level | Protein–protein interaction, disease–drug indication, disease–drug contraindication, disease–drug off-label use, disease–phenotype association, disease–disease association, disease–protein association, disease–exposure association, phenotype–protein association, pathway–gene association, etc. |
Knowledge graph | Phenotype Knowledge Translator (Callahan et al. 2024) | Node-level | Identity of tissue, cell, DNA, RNA, gene, miRNA, variant, protein, disease, biological process, pathway, phenotype, molecular function, cellular component, and chemical |
Knowledge graph | Phenotype Knowledge Translator (Callahan et al. 2024) | Edge-level | Tissue-/cell-specific gene expression, gene–variant association, variant–disease association, chemical–disease association, chemical–pathway association, etc. |
Molecular design | Protein sEquence undERstanding (Xu et al. 2022) | Edge-level | Protein–protein interaction, contact prediction |
Molecular design | Protein sEquence undERstanding (Xu et al. 2022) | Graph-level | Molecular property (e.g. fold classification, secondary structure prediction) |
Molecular design | Tasks Assessing Protein Embeddings (Rao et al. 2019) | Edge-level | Protein–protein interaction, contact prediction |
Molecular design | Tasks Assessing Protein Embeddings (Rao et al. 2019) | Graph-level | Molecular property (e.g. fold classification, secondary structure prediction) |
Molecular design | Graph Explainability Library (Agarwal et al. 2023) | Graph-level | Molecular mutagenic property, molecular functional group (e.g. benzene rings, fluoride carbonyl) |
Neurology | NeuroGraph (Said et al. 2023) | Graph-level | Donor demographics (age and gender), task states (emotion processing, gambling, language, motor, relational processing, social cognition, and working memory), cognitive traits (working memory, fluid intelligence) |
Therapeutic discovery | AVIDa-hIL6 (Tsuruta et al. 2024) | Edge-level | Antigen–antibody interaction |
Therapeutic discovery | Therapeutic Data Commons (Huang et al. 2021) | Edge-level | Drug–target interaction, drug–drug interaction, protein–protein interaction, disease–gene association, drug–response prediction, drug–synergy prediction, peptide–MHC binding, antibody–antigen affinity, miRNA–target prediction, catalyst prediction, TCR–epitope binding, and clinical trial outcomes |
Therapeutic discovery | Therapeutic Data Commons (Huang et al. 2021) | Graph-level | Molecular property (e.g. synthesizability, drug-likeness) |
Databases are categorized by data type. Rows are organized alphabetically by data type and database name.