Briefings in Bioinformatics. 2026 Jan 28;27(1):bbaf710. doi: 10.1093/bib/bbaf710

BiGvCL: bipartite graph-based cross-domain contrastive learning model for predicting drug-gene interactions

Shida He 1,2,3, Zixu Wang 4, Jing Li 5, Quan Zou 6, Feng Zhang 7,8
PMCID: PMC12848949  PMID: 41603649

Abstract

Drug-gene interactions (DGIs) influence the toxicity or ineffectiveness of the drug therapy and play an important role in elucidating drug mechanisms, predicting potential adverse effects, and facilitating precision medicine. Existing computational methods typically rely on chemical or genetic sequence features of drugs and genes, limiting their effectiveness for novel entities lacking explicit annotations. To address this, we propose BiGvCL, a framework that predicts DGIs exclusively based on network topology, requiring no explicit feature information for drugs or genes. BiGvCL introduces a lightweight graph attention mechanism (GATLite) to efficiently aggregate local neighborhood information. Additionally, we develop a gated graph convolutional network (GatedGCN) to explicitly learn high-order interactions between drugs and genes, further integrating contrastive learning to enhance the model’s generalizability. Comprehensive experiments on DrugBank and DGIdb datasets show that BiGvCL achieves competitive performance across all metrics compared with representative baselines. Cross-domain evaluations on OGB datasets further confirm its adaptability to heterogeneous biomedical networks. Ablation and hyperparameter analyses highlight the key contributions of contrastive and gated mechanisms, while case studies and molecular docking provide supporting evidence for the biological relevance of predictions. Collectively, while BiGvCL is constrained by its reliance on network topology and transductive learning paradigm, it demonstrates the potential of topology-based approaches for discovering novel drug-gene interactions, which may inform drug repurposing and precision medicine efforts.

Keywords: drug-gene interactions, graph neural networks, contrastive learning, cross-domain knowledge transfer, transductive learning

Introduction

Bioinformatics plays a pivotal role in interpreting vast amounts of data generated from transcriptomics, genomics, proteomics, and other omics fields, facilitating the understanding of biological processes and predicting gene functions [1–4]. Numerous bioinformatics tools have successfully elucidated disease mechanisms underlying infectious diseases and various cancers, including colon, gastric, bladder, prostate, and lung cancers [4–8]. Computational drug-target interaction (DTI) prediction was established through early work on supervised chemogenomic inference [9], bipartite local models [10], and statistical approaches [11], which provided the methodological basis for subsequent advances. In recent years, these bioinformatics methodologies have increasingly integrated artificial intelligence (AI), particularly deep learning (DL), contributing to advances in drug discovery by addressing the significant time, cost, and risk traditionally associated with drug development [12–21]. Techniques such as deep generative models, notably diffusion models, facilitate efficient exploration of chemical spaces, enabling rapid design of drug candidates with desired pharmacological profiles, thus enhancing diversity and efficacy in drug design [22–30]. In protein structure prediction, AI-driven models like AlphaFold [31] have resolved long-standing challenges in structural biology, accelerating target structure elucidation and structure-based drug design [32–34]. In biological sequence modeling, deep neural networks also show potential; for example, Zhao et al. [35] conducted sequence-based toxicity prediction using CNN + GRU with channel attention and a variational information bottleneck, and Le et al. [36] presented sequence-based identification of vesicular transport proteins using GRU with PSSM profiles and class-weighting for imbalance.
Moreover, DTI prediction has advanced significantly, with graph neural networks, including GraphDTA and GraphormerDTI, effectively integrating molecular structures and protein sequences to achieve competitive performance on benchmark datasets [37–41]. In particular, Aragh et al. [42] proposed MiRAGE-DTI, which integrates multiple similarity measures of drugs and targets and employs a Random Forest classifier to predict drug–target interactions. Gao et al. [43] developed HMT-DTI, a precomputed hierarchical meta-path learning framework that adopts a Transformer-based message-passing mechanism to assess the importance of neighboring nodes and adaptively aggregate meta-path information. Through a hierarchical knowledge extraction strategy, HMT-DTI evaluates the significance of multi-hop neighbors and diverse meta-path patterns, thereby capturing rich semantic representations of drugs and targets.

Despite progress in DTI models, they predominantly focus on direct binding relationships between drugs and protein targets, neglecting broader systemic effects within cellular gene regulatory networks. In contrast, drug-gene interactions (DGIs) offer a more comprehensive perspective by capturing drug impacts on gene expression, transcriptional regulation, and epigenetic modifications, thereby illuminating multi-target mechanisms, adverse effect formation, and individual variability in patient responses [44–50]. However, the inherent complexity and diversity of DGI networks, characterized by various interaction types (e.g. agonism, antagonism, upregulation, downregulation), render exhaustive experimental validation prohibitively expensive and time-consuming.

With advancements in graph representation learning, researchers increasingly leverage graph structure to predict DGIs [51–53]. Notably, the CoSMIG model employs communicative subgraph inference for inductive predictions in multi-relational drug-gene networks [54, 55], while MDTips [39] integrates multimodal data such as knowledge graphs, gene expression profiles, and molecular structures, demonstrating robustness across diverse data scenarios. Furthermore, HGDruG constructs multi-task prediction frameworks based on heterogeneous hypergraphs capturing micro-to-macro scale drug attributes [56]. Graph-based approaches have also excelled in drug repositioning, exemplified by the graph foundation model TxGNN, which improves accuracy in zero-shot drug repurposing across numerous disease contexts [57].

Despite such advancements, current AI-driven DGI prediction models face critical challenges [58–60]: (i) pronounced data sparsity limits generalization due to scant known interactions; (ii) models often address narrowly defined problems without comprehensive capabilities; and (iii) integrating heterogeneous and multimodal data introduces complexity, imbalanced data distributions, and long-tail phenomena, complicating precise predictions. Recent approaches, including dynamic hypergraph-based DGCL [61] and singular-value decomposition-enhanced SGCLDGA [62], have partly mitigated these issues but remain limited in scalability, generalizability, and predictive specificity. Other challenges also remain in the study of drug–gene interactions. For example, the graph diffusion network (GDNDGP) proposed by Wu et al. [63] aims to predict whether an association exists between drugs and genes in a heterogeneous biomedical graph; however, the model only outputs the probability of an interaction and cannot distinguish specific interaction types. He et al. [64] proposed a novel inductive learning-based model to predict unseen drug–gene interactions by constructing a multi-relational drug–disease–gene (DDG) graph. However, the model relies heavily on domain-specific knowledge and, similarly, only predicts the existence of interactions without identifying their specific types or underlying mechanisms.

To address these limitations, we propose BiGvCL, a bipartite graph-based cross-domain contrastive learning framework relying solely on network topology. Our approach captures both local and global interaction information without explicit feature annotations. The primary contributions of this study include:

  1. Introducing GATLite, a lightweight graph attention network for adaptive neighborhood aggregation that maintains inductive learning capabilities and computational efficiency while reducing parameter complexity.

  2. Developing the gated graph convolutional network (GatedGCN) module, which explicitly models interaction features via associative matrix-based propagation and adaptively integrates inherent node features with structural information, making it well suited to bipartite networks and complex nonlinear relationship modeling.

  3. Designing a contrastive learning framework to guide dual-graph structure extraction and enhance the discriminative capacity of drug and gene representations.

  4. Conducting comprehensive experiments on benchmark datasets (DrugBank [65], DGIdb [66], LINCS L1000 [67], OGB [68] drug-disease), including ablation and sparsity analyses, validating BiGvCL’s potential in DGI prediction tasks.

  5. Demonstrating BiGvCL’s practical utility through case studies confirming its capability to identify literature-supported novel drug-gene interactions, highlighting its role in drug discovery and repositioning.

Materials and methods

Datasets

This study employed four representative datasets covering drug-gene and drug-disease interactions to evaluate the model’s predictive performance and generalizability across multiple scenarios. Details are provided in Table 1. The DrugBank dataset contains drug-gene interactions involving 425 drugs and 11 284 genes, comprising a total of 80 924 interactions categorized into upregulation and downregulation. The DGIdb dataset includes 1185 drugs and 1664 genes, totaling 11 366 interactions spanning 14 complex relationship types (e.g. binder, inhibitor, agonist), which can assess the model’s capability to distinguish multiple pharmacological relationships. The LINCS L1000 dataset consists of 1878 drugs and 3769 genes, with a total of 29 610 interaction pairs, categorized as upregulation or downregulation of gene expression, serving as an independent external dataset to further evaluate the model’s cross-dataset generalization. Additionally, to assess the model’s performance in a cross-domain heterogeneous graph environment, we selected the OGB biokg knowledge graph dataset, which includes 686 drugs, 1507 diseases, and 10 294 drug-disease interactions simply categorized as ‘interaction’ or ‘no interaction.’

Table 1.

Summary of datasets used in this study.

| Dataset | Drugs | Genes (G) / Diseases (D) | Interactions | Classes |
|---|---|---|---|---|
| DrugBank | 425 | 11 284 (G) | 80 924 | 2 |
| DGIdb | 1185 | 1664 (G) | 11 366 | 14 |
| LINCS L1000 | 1878 | 3769 (G) | 29 610 | 2 |
| OGB biokg | 686 | 1507 (D) | 10 294 | 2 |

Note: DrugBank: Two interaction categories (up-regulation and down-regulation).

DGIdb: Fourteen interaction categories including binder, inhibitor, agonist, partial agonist, antibody, antagonist, potentiator, cofactor, positive modulator, activator, modulator, allosteric modulator, blocker, and ligand.

LINCS L1000: Two interaction categories (up-regulation and down-regulation).

OGB biokg: Two interaction categories (interaction, no interaction).

Data preprocessing and graph construction

For fair comparison, we adopted the drug-gene data splits provided by DGCL [61], maintaining an approximately 4:1 train-test ratio. Given sufficient data volume, we randomly sampled 10% of the training set as a validation set. All experiments were repeated across five independent runs with different random seeds to ensure statistical robustness. Notably, the LINCS L1000 dataset served exclusively as an external validation set to assess cross-dataset generalizability. Negative samples in the OGB dataset were generated using a frequency-smoothed sampling strategy. This approach preferentially selected low-frequency nodes while suppressing high-frequency ones, independently sampling node pairs from weighted distributions. Generated drug-disease pairs that did not overlap with positive samples and contained no duplicates were labeled as negative samples, which partly addresses the class imbalance and data sparsity inherent to interaction prediction tasks.
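The frequency-smoothed negative sampling described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not give the smoothing formula, so the weight `(count + 1) ** (-alpha)`, the exponent `alpha`, and the function name `sample_negatives` are hypothetical choices rather than the authors' implementation.

```python
import random

def sample_negatives(pos_pairs, n_drugs, n_diseases, n_samples, alpha=0.75, seed=0):
    """Draw unique negative (drug, disease) pairs, favoring low-frequency nodes.

    Assumption: a node's sampling weight is (count + 1) ** (-alpha), so nodes
    that appear in many positive pairs are drawn less often.
    """
    pos = set(map(tuple, pos_pairs))
    if n_samples > n_drugs * n_diseases - len(pos):
        raise ValueError("fewer candidate negatives than requested")

    # Count how often each node appears among the positive pairs.
    drug_cnt = [0] * n_drugs
    dis_cnt = [0] * n_diseases
    for d, s in pos:
        drug_cnt[d] += 1
        dis_cnt[s] += 1

    w_drug = [(c + 1) ** -alpha for c in drug_cnt]
    w_dis = [(c + 1) ** -alpha for c in dis_cnt]

    rng = random.Random(seed)
    negatives = set()  # a set removes duplicate pairs automatically
    while len(negatives) < n_samples:
        # Drug and disease are sampled independently from their own weights.
        d = rng.choices(range(n_drugs), weights=w_drug)[0]
        s = rng.choices(range(n_diseases), weights=w_dis)[0]
        if (d, s) not in pos:  # reject pairs that are known positives
            negatives.add((d, s))
    return sorted(negatives)

positives = [(0, 0), (0, 1), (0, 2), (1, 0)]
negatives = sample_negatives(positives, n_drugs=3, n_diseases=3, n_samples=4)
```

Rejection sampling keeps the sketch short; for very dense graphs, enumerating the candidate pairs and sampling without replacement would avoid long rejection loops.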

Due to the transductive learning paradigm, drug and gene nodes in the validation and test sets must appear in the training graph; however, no interaction edges (drug-gene pairs) overlap between training and evaluation splits, thereby preventing data leakage while enabling the model to leverage shared node representations. For baseline methods requiring explicit feature inputs (e.g. GRALS [69]), we collected molecular structural information for drugs from DrugBank and PubChem [70] databases and retrieved gene expression profiles from the GTEx [71] database. For drugs or genes lacking publicly available feature data, we employed randomly initialized embedding vectors as substitutes to ensure compatibility across all baseline comparisons.

Model architecture

To address challenges associated with DGI prediction, we propose the BiGvCL framework, as depicted in Fig. 1. This framework comprises three core components: (i) a GATLite designed to capture structural graph information; (ii) a GatedGCN that explicitly learns higher-order interaction patterns; (iii) a contrastive learning strategy to enhance the discriminative capability of learned representations. Furthermore, we integrate multiple model variants through an ensemble learning approach to improve prediction performance and stability.

Figure 1.

Overview of BiGvCL architecture. (A) Framework outline with embeddings and loss functions. (B) GatedGCN mechanism. (C) GATLite attention module.

Problem definition

We define the prediction of DGIs as a multi-label classification task. Formally, given a set of drugs $\mathcal{D}$, a set of genes $\mathcal{G}$, and known interactions $\mathcal{E} \subseteq \mathcal{D} \times \mathcal{G} \times \mathcal{R}$, where $\mathcal{R}$ denotes interaction types, our aim is to learn a predictive function to infer interaction types for unknown drug-gene pairs:

$$f: \mathcal{D} \times \mathcal{G} \rightarrow \mathcal{R} \tag{1}$$

To facilitate efficient graph-based modeling, we first unify drug and gene entities into a single node set and re-index them through a bijection to consecutive integers. Subsequently, an adjacency matrix $A \in \mathbb{R}^{N \times N}$, with $N = |\mathcal{D}| + |\mathcal{G}|$, representing the bipartite interactions is constructed:

$$A_{ij} = \begin{cases} 1, & \text{if nodes } i \text{ and } j \text{ are linked by a known interaction} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

To capture node-specific information and ensure stable propagation during graph convolution operations, we introduce self-loop connections, yielding an augmented adjacency matrix:

$$\tilde{A} = A + I \tag{3}$$

Finally, we apply symmetric normalization to mitigate degree imbalances among nodes:

$$\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} \tag{4}$$

Here, $\tilde{D}$ is a diagonal degree matrix derived from $\tilde{A}$, with $\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}$. This normalized adjacency matrix is then converted into a sparse tensor as input to graph neural networks.
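The graph construction steps (unified node indexing, self-loops, symmetric normalization) can be illustrated with a small NumPy sketch. It uses a dense matrix for readability, whereas the text describes a sparse tensor; the function name `normalized_adjacency` and the convention that gene indices follow drug indices are assumptions made for this example.

```python
import numpy as np

def normalized_adjacency(n_drugs, n_genes, edges):
    """Build the symmetrically normalized adjacency with self-loops.

    edges: iterable of (drug_index, gene_index) pairs.
    Assumption: in the unified node set, gene g is mapped to n_drugs + g.
    """
    n = n_drugs + n_genes
    A = np.zeros((n, n))
    for d, g in edges:
        j = n_drugs + g          # re-index the gene into the unified node set
        A[d, j] = 1.0
        A[j, d] = 1.0            # undirected bipartite edge
    A_tilde = A + np.eye(n)      # add self-loops
    deg = A_tilde.sum(axis=1)    # degrees are >= 1 thanks to the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Equivalent to D^{-1/2} A_tilde D^{-1/2} without forming D explicitly.
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

# Two drugs, two genes: drug 0 interacts with both genes, drug 1 with gene 1.
A_hat = normalized_adjacency(2, 2, [(0, 0), (0, 1), (1, 1)])
```

In this toy graph, drug 0 has degree 3 after the self-loop is added, so its diagonal entry is 1/3, and the entry linking drug 0 to gene 0 (degree 2) is 1/sqrt(6).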

Lightweight graph attention network

Graph attention networks (GATs) [72, 73] have been widely used for node representation but encounter limitations in computational efficiency and memory usage when applied to large-scale biological molecular networks. To address these issues, we introduce a lightweight variant termed GATLite. The core innovation of GATLite lies in decoupling feature transformation from attention computation through a shared linear projection matrix. Specifically, node features are first transformed through a linear mapping $W$ to generate intermediate node representations:

$$H' = HW \tag{5}$$

Next, for a node $i$ with neighbors $\mathcal{N}(i)$, we compute and normalize the attention coefficients:

$$e_{ij} = \mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[h'_i \,\Vert\, h'_j\right]\right) \tag{6}$$
$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})} \tag{7}$$

Here, $\mathbf{a}$ is the attention parameter vector, and $\Vert$ represents feature concatenation.

Finally, node representations are aggregated using the computed attention weights, followed by applying an ELU activation function and L2 normalization to enhance representation capability and stabilize training:

$$h_i^{\mathrm{out}} = \mathrm{L2Norm}\left(\mathrm{ELU}\left(\sum_{j \in \mathcal{N}(i)} \alpha_{ij} h'_j\right)\right) \tag{8}$$

The implementation leverages sparse matrix operations to ensure computational efficiency on large-scale networks, with attention dropout applied during training to prevent overfitting. This design maintains the expressive power of attention mechanisms while reducing computational overhead.
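A single attention layer of this kind can be sketched in NumPy as below. This is a dense, single-head illustration and not the authors' GATLite code: the LeakyReLU scoring nonlinearity follows the standard GAT formulation and is an assumption here, as are the function name `gat_lite_layer` and the omission of attention dropout and sparse operations.

```python
import numpy as np

def elu(x):
    # Clamp the exponent input so positive values cannot overflow.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_lite_layer(H, adj, W, a, eps=1e-12):
    """H: (n, f_in) features; adj: (n, n) neighbor mask (nonzero = neighbor),
    every row needs at least one neighbor (e.g. a self-loop);
    W: (f_in, f_out) shared projection; a: (2 * f_out,) attention vector."""
    Hp = H @ W                                  # shared linear projection
    f_out = Hp.shape[1]
    # a^T [h_i || h_j] splits into a source term plus a destination term.
    src = Hp @ a[:f_out]
    dst = Hp @ a[f_out:]
    e = leaky_relu(src[:, None] + dst[None, :])
    e = np.where(adj > 0, e, -np.inf)           # mask out non-neighbors
    e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    out = elu(alpha @ Hp)                       # attention-weighted aggregation
    norm = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norm, eps), alpha   # L2-normalized rows

rng = np.random.default_rng(0)
adj = np.array([[1, 0, 1, 1],
                [0, 1, 0, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 1]], dtype=float)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
a = rng.normal(size=4)
out, alpha = gat_lite_layer(H, adj, W, a)
```

Splitting the attention vector into source and destination halves avoids materializing the concatenated feature for every node pair, which is one common way to keep attention computation cheap.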

Gated graph convolutional network

Drug-gene interaction networks typically involve complex, higher-order relational patterns that are difficult to capture through simple neighborhood aggregation. To address this challenge, we designed the GatedGCN module to explicitly model multi-node, second-order dependencies within networks. The core idea of GatedGCN is to integrate gated mechanisms with graph message-passing processes, thus enhancing representation learning capabilities.

The GatedGCN module employs a multi-stage transformation process. First, node features undergo nonlinear transformations through multi-layer perceptrons (MLPs). Bidirectional information propagation is then performed: features flow from nodes to hyperedges, are aggregated and transformed, and subsequently propagate back to nodes. This design enables the capture of higher-order connectivity patterns that extend beyond direct node-to-node relationships.

The gating mechanism adaptively balances node-intrinsic features and structural information through a learnable fusion strategy:

$$g_i = \sigma\left(W_g\left[x_i \,\Vert\, m_i\right] + b_g\right) \tag{9}$$
$$h_i = g_i \odot x_i + (1 - g_i) \odot m_i \tag{10}$$

where $x_i$ denotes the original feature vector of node $i$, $m_i$ represents the aggregated higher-order information, $W_g$ is the learnable weight matrix for the gating mechanism, $b_g$ is the bias term, $\Vert$ denotes concatenation, $\odot$ represents element-wise multiplication, and $\sigma$ denotes the sigmoid activation function. This adaptive fusion allows the model to dynamically determine the contribution of structural patterns versus intrinsic node features for each entity.

The GatedGCN module complements GATLite, with the former capturing global higher-order relational patterns and the latter focusing on local neighborhood structures. The integration of both strategies enhances the model’s capacity to represent complex drug-gene interaction patterns.
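The node-to-hyperedge-to-node propagation and the gated fusion described in this section can be illustrated with the following NumPy sketch. Several details are assumptions, since the source does not specify them: the incidence matrix `B`, mean aggregation in both directions, the convex-combination form of the fusion, and the function names are all hypothetical, and the MLP transformations are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def propagate_via_hyperedges(X, B):
    """X: (n, d) node features; B: (n, m) 0/1 incidence matrix.
    Assumption: mean aggregation; every node and hyperedge has degree >= 1."""
    edge_deg = B.sum(axis=0)                      # nodes per hyperedge
    node_deg = B.sum(axis=1)                      # hyperedges per node
    edge_feat = (B.T @ X) / edge_deg[:, None]     # nodes -> hyperedges
    return (B @ edge_feat) / node_deg[:, None]    # hyperedges -> nodes

def gated_fusion(X, M, W_g, b_g):
    """Blend intrinsic features X with structural messages M.
    W_g: (2d, d), b_g: (d,). The gate is computed from [X || M]."""
    gate = sigmoid(np.concatenate([X, M], axis=1) @ W_g + b_g)
    return gate * X + (1.0 - gate) * M, gate

# Three nodes, two hyperedges: {0, 1} and {1, 2}.
B = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
X = np.array([[1.0], [3.0], [5.0]])
M = propagate_via_hyperedges(X, B)
fused, gate = gated_fusion(X, M, W_g=np.zeros((2, 1)), b_g=np.zeros(1))
```

With zero gate parameters the gate equals 0.5 everywhere, so the output is the plain average of `X` and `M`; training would move the gate toward whichever source is more informative for each node and feature.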

Contrastive learning framework

Contrastive learning has emerged as a paradigm in representation learning research, with the normalized temperature-scaled cross-entropy (NT-Xent) loss attracting particular attention due to its application in frameworks such as SimCLR [74]. However, the standard NT-Xent loss assumes a single positive sample for each anchor, restricting its potential applicability in multi-relational domains. In this study, we propose the Multi-Positive NT-Xent loss, an extension of the standard NT-Xent loss designed to effectively handle scenarios where multiple samples can be considered as positive pairs. The standard NT-Xent loss is formally defined as:

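$$\ell_i = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_{i^+})/\tau\right)}{\sum_{k \neq i} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)} \tag{11}$$

where $z_{i^+}$ is the single positive sample corresponding to anchor $z_i$, $\mathrm{sim}(\cdot,\cdot)$ denotes cosine similarity, and $\tau$ denotes the temperature parameter.

The proposed Multi-Positive NT-Xent loss generalizes this formulation as follows:

$$\ell_i^{\mathrm{MP}} = -\frac{1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp\left(\mathrm{sim}(z_i, z_p)/\tau\right)}{\sum_{k \neq i} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)} \tag{12}$$

where $P(i)$ denotes the set of samples that form positive pairs with anchor $z_i$.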

The Multi-Positive NT-Xent loss leverages the inherent structure in biological networks by recognizing that entities may have multiple valid representations. Through an adaptive masking mechanism, the framework identifies and aggregates contributions from all relevant positive associations. This approach enables the model to learn representations that capture the multifaceted nature of drug-gene interactions, where the same biological entity may manifest through different molecular profiles or experimental conditions. The temperature-scaled similarity computation ensures that the model maintains discriminative power while accommodating the natural variability within entity groups.
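A multi-positive variant of NT-Xent can be sketched as follows. The masking by shared group identifier mirrors the description above, but the exact aggregation is an assumption: this sketch averages the log-probabilities over each anchor's positives, and the function name `multi_positive_nt_xent` and the default temperature are hypothetical.

```python
import numpy as np

def multi_positive_nt_xent(Z, group_ids, tau=0.5):
    """Z: (n, d) embeddings with n >= 2; group_ids: (n,) integer labels.
    Samples sharing a group id (other than the anchor itself) are positives.
    Assumption: the loss averages log-probabilities over each anchor's positives."""
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)    # cosine similarity
    sim = (Z @ Z.T) / tau
    n = Z.shape[0]
    not_self = ~np.eye(n, dtype=bool)
    ids = np.asarray(group_ids)
    pos_mask = (ids[:, None] == ids[None, :]) & not_self

    # Log-softmax over all samples except the anchor itself.
    sim = np.where(not_self, sim, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    n_pos = pos_mask.sum(axis=1)
    has_pos = n_pos > 0                                  # skip anchors without positives
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / n_pos[has_pos]
    return per_anchor.mean()

Z = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss_aligned = multi_positive_nt_xent(Z, [0, 0, 1, 1], tau=1.0)
loss_misaligned = multi_positive_nt_xent(Z, [0, 1, 0, 1], tau=1.0)
```

When every anchor has exactly one positive, this reduces to the standard NT-Xent loss. In the toy example the loss is lower when the group labels agree with the embedding geometry than when they do not.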

Loss function

In training the BiGvCL model, we employed a composite loss function designed to enhance the model’s representational robustness and generalization. Specifically, the combined loss function integrates three components: regularization loss, contrastive learning loss, and cross-entropy loss.

The regularization loss helps prevent model overfitting by encouraging smaller weight values, defined as:

$$\mathcal{L}_{\mathrm{reg}} = \sum_{\theta \in \Theta} \lVert \theta \rVert_2^2 \tag{13}$$

where $\Theta$ represents the set of all model parameters, and $\lVert \theta \rVert_2^2$ denotes the squared L2 norm of parameter $\theta$.

The contrastive learning loss, implemented via MultiPosNCE, enhances the discriminative power of node representations by distinguishing between positive and negative samples:

$$\mathcal{L}_{\mathrm{CL}} = \frac{1}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \ell_i^{\mathrm{MP}} \tag{14}$$

where $\ell_i^{\mathrm{MP}}$ is the Multi-Positive NT-Xent loss for anchor $i$ and $\mathcal{B}$ is the set of anchors.

The cross-entropy loss ensures accurate prediction of drug-gene interaction labels:

$$\mathcal{L}_{\mathrm{CE}} = -\sum_{i} \sum_{c=1}^{C} y_{i,c} \log \hat{y}_{i,c} \tag{15}$$

where $y_{i,c}$ is the ground-truth label, $\hat{y}_{i,c}$ is the predicted probability of node $i$ belonging to class $c$, and $C$ is the number of classes.

The total loss function integrates these individual components using corresponding weighting coefficients $\lambda_{\mathrm{reg}}$, $\lambda_{\mathrm{CL}}$, and $\lambda_{\mathrm{CE}}$:

$$\mathcal{L} = \lambda_{\mathrm{reg}} \mathcal{L}_{\mathrm{reg}} + \lambda_{\mathrm{CL}} \mathcal{L}_{\mathrm{CL}} + \lambda_{\mathrm{CE}} \mathcal{L}_{\mathrm{CE}} \tag{16}$$

For optimization, we used the AdamW [75] optimizer paired with a cyclic learning rate [76] scheduler (CyclicLR), initialized at a learning rate of 0.001. This strategic combination effectively balances model convergence speed and training stability, contributing to the performance of BiGvCL in predicting drug-gene interactions.
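The way the three loss terms combine can be shown with a short NumPy sketch. The weighting coefficients below are placeholder values, not the ones used in the study, and the cross-entropy here is averaged over samples (a common convention that the source does not confirm); the function names are hypothetical.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy from raw logits (n, C) and integer labels (n,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # stable log-softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def l2_penalty(params):
    """Sum of squared L2 norms over all parameter arrays."""
    return sum(float((p ** 2).sum()) for p in params)

def total_loss(logits, labels, params, contrastive,
               w_reg=1e-4, w_cl=0.1, w_ce=1.0):
    """Weighted sum of regularization, contrastive, and cross-entropy terms.
    Assumption: the default weights are illustrative placeholders."""
    return (w_reg * l2_penalty(params)
            + w_cl * contrastive
            + w_ce * cross_entropy(logits, labels))

logits = np.zeros((2, 2))            # uniform predictions over two classes
labels = np.array([0, 1])
params = [np.array([1.0, 2.0])]      # squared L2 norm = 5
loss = total_loss(logits, labels, params, contrastive=2.0)
```

With uniform logits the cross-entropy term equals log 2, so the example evaluates to 1e-4 * 5 + 0.1 * 2.0 + log 2.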

Evaluation metrics

To evaluate the predictive effectiveness of the BiGvCL model, we utilized five standard performance metrics commonly employed in multi-label classification tasks: Accuracy (ACC), Macro-F1 score (Macro-F1), Area Under the Receiver Operating Characteristic Curve (AUROC), Area Under the Precision-Recall Curve (AUPR), and Matthews Correlation Coefficient (MCC).

Accuracy (ACC) quantifies the ratio of correctly predicted instances to the total number of instances, calculated as:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{17}$$

Macro-F1 computes the mean F1 score across all classes, giving every class equal weight regardless of its size, which makes it informative for datasets with class imbalance. The F1 score for class $c$ is given by:

$$F1_c = \frac{2 \cdot \mathrm{Precision}_c \cdot \mathrm{Recall}_c}{\mathrm{Precision}_c + \mathrm{Recall}_c} \tag{18}$$

where $\mathrm{Precision}_c = \frac{TP_c}{TP_c + FP_c}$ and $\mathrm{Recall}_c = \frac{TP_c}{TP_c + FN_c}$.

AUROC measures the model’s capability to distinguish between classes across all classification thresholds, specifically assessing the trade-off between true positive rate (TPR, sensitivity) and false positive rate (FPR, 1-specificity). AUPR evaluates the model’s performance by measuring the area under the precision-recall curve, emphasizing predictions for minority (positive) classes, and thus is particularly suited for imbalanced datasets.

Matthews Correlation Coefficient (MCC) provides a comprehensive evaluation metric that considers all aspects of the confusion matrix, defined as:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{19}$$

In the above definitions, $TP$ (true positives) refers to correctly identified positive instances; $TN$ (true negatives) represents correctly identified negative instances; $FP$ (false positives) denotes negative instances incorrectly identified as positives; and $FN$ (false negatives) indicates positive instances incorrectly classified as negatives. The combined use of these metrics provides a comprehensive assessment of the BiGvCL model’s predictive accuracy and generalization ability.
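The threshold-based metrics can be computed directly from a confusion matrix, as in the pure-Python sketch below. The function names are chosen for this example, and the MCC shown is the binary form; AUROC and AUPR are omitted because they require predicted scores rather than hard labels.

```python
import math

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts samples with true class t and predicted class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    total = sum(sum(row) for row in cm)
    return sum(cm[c][c] for c in range(len(cm))) / total

def macro_f1(cm):
    """Unweighted mean of per-class F1; F1 = 2TP / (2TP + FP + FN)."""
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp
        fn = sum(cm[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / k

def binary_mcc(cm):
    """MCC for a 2x2 confusion matrix with class 1 as the positive class."""
    (tn, fp), (fn, tp) = cm
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
cm = confusion_matrix(y_true, y_pred, n_classes=2)
```

For this toy example the confusion matrix is [[3, 1], [1, 3]], giving an accuracy of 0.75, a Macro-F1 of 0.75, and an MCC of 0.5. MCC is lower because it is a correlation-style score that is zero at chance level, whereas accuracy is 0.5 at chance for balanced classes.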

Results

Model performance and comparison with baselines

To validate the effectiveness of the proposed BiGvCL model, extensive performance evaluations were conducted on two benchmark datasets, DrugBank and DGIdb. Results were compared with several existing state-of-the-art methods. Tables 2 and 3 report the performance of each method across multiple metrics on both the validation and test sets after five independent repeated experiments.

Table 2.

Comparison of different methods on the DrugBank dataset.

| Method | Features | Split | Acc | Macro_f1 | AUROC | AUPR | MCC |
|---|---|---|---|---|---|---|---|
| MC | no | val | 0.597 ± 0.008 | 0.596 ± 0.008 | 0.629 ± 0.006 | 0.620 ± 0.006 | 0.193 ± 0.015 |
| GRALS | yes | val | 0.633 ± 0.016 | 0.632 ± 0.016 | 0.689 ± 0.018 | 0.664 ± 0.018 | 0.268 ± 0.032 |
| F-EAE | no | val | 0.591 ± 0.005 | 0.576 ± 0.008 | 0.636 ± 0.003 | 0.631 ± 0.002 | 0.197 ± 0.007 |
| GC-MC | yes | val | 0.648 ± 0.003 | 0.648 ± 0.003 | 0.678 ± 0.003 | 0.650 ± 0.003 | 0.296 ± 0.005 |
| sRGCNN | yes | val | 0.641 ± 0.002 | 0.640 ± 0.002 | 0.671 ± 0.002 | 0.649 ± 0.004 | 0.281 ± 0.004 |
| PinSage | yes | val | 0.648 ± 0.006 | 0.647 ± 0.006 | 0.716 ± 0.002 | 0.719 ± 0.002 | 0.297 ± 0.009 |
| IGMC | no | val | 0.643 ± 0.003 | 0.636 ± 0.009 | 0.715 ± 0.001 | 0.720 ± 0.002 | 0.297 ± 0.003 |
| CoSMIG | no | val | 0.670 ± 0.006 | 0.670 ± 0.006 | 0.740 ± 0.005 | 0.742 ± 0.005 | 0.340 ± 0.011 |
| DGCL | no | val | 0.684 ± 0.004 | 0.684 ± 0.004 | 0.737 ± 0.003 | 0.715 ± 0.003 | 0.368 ± 0.009 |
| BiGvCL | no | val | 0.711 ± 0.002 | 0.710 ± 0.002 | 0.772 ± 0.001 | 0.751 ± 0.002 | 0.421 ± 0.004 |
| MC | no | test | 0.623 ± 0.003 | 0.624 ± 0.003 | 0.685 ± 0.006 | 0.695 ± 0.002 | 0.247 ± 0.006 |
| GRALS | yes | test | 0.631 ± 0.014 | 0.630 ± 0.014 | 0.683 ± 0.014 | 0.655 ± 0.011 | 0.264 ± 0.028 |
| F-EAE | no | test | 0.615 ± 0.004 | 0.614 ± 0.004 | 0.642 ± 0.003 | 0.634 ± 0.001 | 0.234 ± 0.007 |
| GC-MC | yes | test | 0.646 ± 0.001 | 0.645 ± 0.001 | 0.648 ± 0.002 | 0.657 ± 0.004 | 0.291 ± 0.002 |
| sRGCNN | yes | test | 0.633 ± 0.004 | 0.633 ± 0.004 | 0.633 ± 0.004 | 0.650 ± 0.005 | 0.266 ± 0.007 |
| PinSage | yes | test | 0.644 ± 0.003 | 0.643 ± 0.004 | 0.710 ± 0.001 | 0.714 ± 0.002 | 0.288 ± 0.006 |
| IGMC | no | test | 0.645 ± 0.007 | 0.639 ± 0.012 | 0.723 ± 0.002 | 0.723 ± 0.002 | 0.304 ± 0.004 |
| CoSMIG | no | test | 0.667 ± 0.004 | 0.667 ± 0.004 | 0.742 ± 0.004 | 0.747 ± 0.005 | 0.334 ± 0.009 |
| DGCL | no | test | 0.679 ± 0.002 | 0.679 ± 0.002 | 0.737 ± 0.003 | 0.719 ± 0.002 | 0.359 ± 0.004 |
| BiGvCL | no | test | 0.708 ± 0.002 | 0.707 ± 0.003 | 0.776 ± 0.002 | 0.761 ± 0.003 | 0.415 ± 0.006 |

Table 3.

Comparison of different methods on the DGIdb dataset.

| Method | Features | Split | Acc | Macro_f1 | AUROC | AUPR | MCC |
|---|---|---|---|---|---|---|---|
| MC | no | val | 0.796 ± 0.047 | 0.592 ± 0.071 | 0.896 ± 0.030 | 0.620 ± 0.058 | 0.716 ± 0.061 |
| GRALS | yes | val | 0.862 ± 0.014 | 0.711 ± 0.050 | 0.947 ± 0.039 | 0.777 ± 0.074 | 0.812 ± 0.019 |
| F-EAE | no | val | 0.787 ± 0.024 | 0.313 ± 0.037 | 0.862 ± 0.026 | 0.439 ± 0.043 | 0.713 ± 0.033 |
| GC-MC | yes | val | 0.848 ± 0.007 | 0.522 ± 0.042 | 0.950 ± 0.007 | 0.575 ± 0.019 | 0.791 ± 0.012 |
| sRGCNN | yes | val | 0.910 ± 0.008 | 0.759 ± 0.068 | 0.939 ± 0.024 | 0.769 ± 0.032 | 0.877 ± 0.010 |
| PinSage | yes | val | 0.803 ± 0.011 | 0.378 ± 0.005 | 0.939 ± 0.009 | 0.486 ± 0.026 | 0.791 ± 0.004 |
| IGMC | no | val | 0.789 ± 0.005 | 0.515 ± 0.009 | 0.910 ± 0.001 | 0.502 ± 0.015 | 0.726 ± 0.006 |
| CoSMIG | no | val | 0.819 ± 0.001 | 0.616 ± 0.001 | 0.922 ± 0.001 | 0.622 ± 0.006 | 0.762 ± 0.001 |
| DGCL | no | val | 0.930 ± 0.004 | 0.849 ± 0.016 | 0.969 ± 0.005 | 0.863 ± 0.002 | 0.906 ± 0.006 |
| BiGvCL | no | val | 0.941 ± 0.002 | 0.872 ± 0.010 | 0.986 ± 0.008 | 0.894 ± 0.010 | 0.918 ± 0.002 |
| MC | no | test | 0.796 ± 0.047 | 0.572 ± 0.043 | 0.892 ± 0.025 | 0.892 ± 0.025 | 0.701 ± 0.067 |
| GRALS | yes | test | 0.855 ± 0.009 | 0.689 ± 0.014 | 0.946 ± 0.006 | 0.717 ± 0.036 | 0.803 ± 0.011 |
| F-EAE | no | test | 0.790 ± 0.020 | 0.345 ± 0.024 | 0.345 ± 0.024 | 0.466 ± 0.014 | 0.711 ± 0.027 |
| GC-MC | yes | test | 0.832 ± 0.009 | 0.492 ± 0.027 | 0.911 ± 0.006 | 0.543 ± 0.025 | 0.769 ± 0.012 |
| sRGCNN | yes | test | 0.893 ± 0.011 | 0.723 ± 0.059 | 0.959 ± 0.016 | 0.757 ± 0.034 | 0.853 ± 0.015 |
| PinSage | yes | test | 0.836 ± 0.009 | 0.375 ± 0.012 | 0.953 ± 0.010 | 0.487 ± 0.034 | 0.773 ± 0.013 |
| IGMC | no | test | 0.791 ± 0.005 | 0.511 ± 0.028 | 0.876 ± 0.001 | 0.493 ± 0.018 | 0.493 ± 0.018 |
| CoSMIG | no | test | 0.843 ± 0.001 | 0.638 ± 0.001 | 0.903 ± 0.003 | 0.610 ± 0.006 | 0.791 ± 0.001 |
| DGCL | no | test | 0.922 ± 0.002 | 0.826 ± 0.016 | 0.975 ± 0.002 | 0.842 ± 0.002 | 0.894 ± 0.003 |
| BiGvCL | no | test | 0.932 ± 0.001 | 0.851 ± 0.012 | 0.985 ± 0.003 | 0.865 ± 0.004 | 0.908 ± 0.001 |

Note:

MC [77]: Completes missing entries using convex optimization.

GRALS [69]: Integrates structural graph information into matrix factorization.

F-EAE [78]: Uses deep neural networks to predict interactions represented as tensors.

GC-MC [79]: Employs graph auto-encoders for message passing on bipartite interaction graphs.

sRGCNN [80]: Combines GCN and recurrent networks to capture local stationarity and reduce parameters.

PinSage [81]: Applies graph convolutions with random-walk-based neighbor sampling.

IGMC [82]: Extracts enclosing subgraphs for inductive matrix completion using GNNs.

CoSMIG [54]: Predicts relation types via communicative subgraph representation learning.

DGCL [61]: Uses dynamic hypergraph contrastive learning to extract local and global relationships.

As observed in Tables 2 and 3, graph neural network-based approaches generally outperform matrix factorization-based methods such as MC, GRALS, and F-EAE, suggesting that graph structures can effectively capture the complex patterns of drug-gene interactions. On the DrugBank dataset (Table 2), BiGvCL achieves a test accuracy of 0.708, compared to 0.679 for DGCL and 0.667 for CoSMIG. The AUROC reaches 0.776, while the MCC is 0.415. On the DGIdb dataset (Table 3), BiGvCL obtains a test accuracy of 0.932 and Macro_F1 of 0.851, compared to DGCL’s 0.922 and 0.826, respectively. The standard deviations across metrics remain relatively low.

Notably, some methods not relying on additional features (BiGvCL, DGCL, CoSMIG, IGMC) tend to show competitive or better performance compared to feature-based approaches. This can be attributed to their robustness in handling missing or noisy data. In real-world biomedical datasets, many entities such as antibodies, protein complexes, and certain compounds lack publicly available features. For such cases, we employ randomly initialized embeddings, allowing the model to learn representations directly from the graph structure, while feature-dependent methods struggle to effectively handle these data gaps.

Ablation study analysis

To understand the contributions of various components in the BiGvCL model, we conducted a series of ablation experiments. Figure 2 presents performance comparisons among different model variants on both validation and test sets of the DrugBank and DGIdb datasets. Four model variants were evaluated: removal of contrastive learning (w/o CL), removal of the voting ensemble mechanism (w/o Vote), simultaneous removal of both graph attention and contrastive learning (w/o GAT&CL), and simultaneous removal of both gated graph networks and contrastive learning (w/o GATE&CL).

Figure 2.

Ablation analysis of BiGvCL components.

The full BiGvCL model consistently achieves the best performance across all settings. Quantitatively, removing contrastive learning (w/o CL) degrades test accuracy by 0.61% on DrugBank (from 0.7046 to 0.6985) and 0.47% on DGIdb (from 0.9323 to 0.9276), underscoring its role in cross-domain representation learning. Removing the voting mechanism (w/o Vote) results in 0.63% and 0.47% drops on DrugBank and DGIdb, respectively, confirming the ensemble strategy’s contribution to stability. Removing GatedGCN and contrastive learning (w/o GATE&CL) causes the largest decline on DrugBank, while removing GATLite and contrastive learning (w/o GAT&CL) leads to 0.91% degradation on DrugBank and 0.26% on DGIdb. Notably, the component importance varies across datasets: on DrugBank, GatedGCN is the most critical (1.30% contribution when combined with CL removal), while on DGIdb, contrastive learning and voting mechanism contribute equally (both 0.47%), surpassing GatedGCN’s contribution (0.37%). This difference likely stems from DGIdb’s more complex multi-relational structure (14 interaction types versus 2 in DrugBank), where contrastive learning’s ability to discriminate fine-grained semantics becomes more crucial. The radar chart in Fig. 2 visually reinforces these findings, with the purple and green areas extending furthest along the BiGvCL axis. Overall, the ablation study demonstrates that all components are indispensable.

Hyperparameter analysis

To investigate the impact of critical hyperparameters on the BiGvCL model, we systematically analyzed four key parameters, as illustrated in Fig. 3. For the contrastive loss function (Fig. 3A), we compared four contrastive learning losses and found that MultiPosNCE, which treats entities sharing identical drug-gene identifiers as positive samples, achieved the best performance, outperforming Barlow by 1.54% and 0.10% on the DrugBank and DGIdb test sets, respectively. The embedding dimension analysis (Fig. 3B) revealed that accuracy on DrugBank increased progressively with higher dimensions, from 0.6962 (128d) to 0.7048 (1024d), a relative gain of 1.24%, while DGIdb performance saturated at 256 dimensions; we selected 1024 dimensions to balance performance and computational efficiency. Regarding aggregation strategies (Fig. 3C), Sum aggregation achieved the best results, outperforming Min aggregation by 1.58% and 0.63% on the DrugBank and DGIdb test sets, respectively, indicating that summation effectively captures complex drug-gene interaction features. The network depth analysis (Fig. 3D) showed that accuracy improved consistently with additional GNN layers, with the three-layer architecture achieving 1.19% higher test accuracy on DrugBank than the single-layer one. Based on this analysis, we determined the optimal configuration to be MultiPosNCE loss, 1024-dimensional embeddings, Sum aggregation, and a three-layer GNN architecture, achieving test accuracies of 0.7048 on DrugBank and 0.9323 on DGIdb. Given sufficient GPU resources, these settings or higher values are recommended to optimize the model’s predictive performance for drug-gene interaction tasks while balancing computational efficiency.
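To make the idea of a multi-positive contrastive loss concrete, the following is a minimal sketch of one common formulation (an InfoNCE loss averaged over all positives of each anchor, as in supervised contrastive learning). It is an assumption for illustration only: the exact MultiPosNCE definition, the temperature value, and the function name `multi_pos_nce` are not taken from the paper or its code.

```python
import math

def _normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def multi_pos_nce(embeddings, ids, temperature=0.5):
    """Multi-positive InfoNCE: samples sharing an identifier are positives.

    Hypothetical sketch; assumes non-zero embedding vectors.
    """
    z = [_normalize(v) for v in embeddings]
    n = len(z)
    losses = []
    for i in range(n):
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        positives = [j for j in logits if ids[j] == ids[i]]
        if not positives:
            continue  # anchors without any positive are skipped
        # Numerically stable log of the softmax denominator.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in logits.values()))
        # Average negative log-probability over all positives of this anchor.
        losses.append(-sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0

# Toy data: two embeddings per identifier (hypothetical values).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matched = multi_pos_nce(toy, ["d1", "d1", "d2", "d2"])
loss_mismatched = multi_pos_nce(toy, ["d1", "d2", "d1", "d2"])
print(loss_matched, loss_mismatched)
```

In this toy case the loss is lower when samples with the same identifier already lie close together, which is the behaviour such a loss is meant to encourage.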

Figure 3.

Impact of hyperparameters on BiGvCL accuracy.

The impact of key hyperparameters of the BiGvCL model on prediction accuracy.

Model performance analysis on sparse networks

Drug-gene interaction networks typically exhibit significant sparsity. We conducted a systematic analysis of the network sparsity characteristics. Figure 4A illustrates the distribution of drug-gene interaction frequencies in the DrugBank and DGIdb datasets. It is observed that both datasets follow a power-law distribution, characterized by a few drugs and genes involved in numerous interactions, whereas the majority of nodes maintain relatively few connections. Figure 4B further examines the top 20 most frequently interacting drugs and genes. Notably, the DGIdb dataset displays a more pronounced imbalance, with top-ranking drugs engaging in over 500 interactions, compared to ~100 interactions for the most active drugs in DrugBank.
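The frequency analysis described above can be sketched as follows: count the interactions of each drug and gene, then tabulate how many nodes share each count. The toy edge list and the helper names are hypothetical, and the sketch assumes each drug-gene pair appears once in the input.

```python
from collections import Counter

def interaction_counts(edges):
    """Count interactions per drug and per gene from (drug, gene) pairs."""
    drug_counts = Counter(drug for drug, _ in edges)
    gene_counts = Counter(gene for _, gene in edges)
    return drug_counts, gene_counts

def frequency_distribution(counts):
    """Map each interaction count to the number of nodes having that count."""
    return dict(sorted(Counter(counts.values()).items()))

# Hypothetical toy edge list; real inputs would be the DrugBank or DGIdb pairs.
toy_edges = [
    ("drugA", "g1"), ("drugA", "g2"), ("drugA", "g3"),
    ("drugB", "g1"), ("drugC", "g1"),
]
drug_counts, gene_counts = interaction_counts(toy_edges)
drug_distribution = frequency_distribution(drug_counts)
gene_distribution = frequency_distribution(gene_counts)
top_drugs = drug_counts.most_common(20)  # the ranking shown in Fig. 4B uses the top 20
print(drug_distribution, top_drugs)
```

On real data, plotting the distribution on log-log axes is the usual way to inspect whether it is heavy-tailed.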

Figure 4.

Model performance on sparse networks and top interactions.

Performance analysis of the model on sparse networks. (A) Distribution of drug-gene interaction frequencies. (B) Top 20 drugs and genes by interaction count. (C) Performance comparison on DrugBank and DGIdb datasets under varying sparsity conditions.

To evaluate BiGvCL’s adaptability across networks of varying sparsity, we designed controlled experiments with constrained node numbers, as presented in Fig. 4C. On the DrugBank dataset, as the network scales from a small size (drugs <20, genes <100, 5868 interactions) to a larger size (drugs <100, genes <1000, 16,330 interactions), the BiGvCL test accuracy improves from 0.565 to 0.603, consistently outperforming the DGCL benchmark. Similarly, on the DGIdb dataset, accuracy ranges from 0.871 (smallest scale: drugs <10, genes <10, 4089 interactions) to 0.930 (larger scale: drugs <25, genes <40, 7132 interactions), a larger improvement than on DrugBank. These results indicate that BiGvCL maintains its advantage across varying sparsity conditions. BiGvCL achieves an accuracy of 0.908 on the DGIdb dataset even at very limited node counts, highlighting its suitability for modeling real-world sparse drug-gene interaction networks.

External testing and cross-domain knowledge validation

To further validate the generalization capability of the BiGvCL model, we conducted additional evaluations using the LINCS L1000 dataset and the OGB drug-disease dataset. As depicted in Fig. 5, our proposed BiGvCL model demonstrates competitive performance across these two distinct types of datasets. On the LINCS L1000 external test dataset, BiGvCL achieves 0.6200 test accuracy, outperforming DGCL (0.6043) and CoSMIG (0.5980) by 1.57% and 2.20%, respectively. On the cross-domain OGB drug-disease dataset, BiGvCL achieves 0.9155 test accuracy, surpassing DGCL (0.9068) and CoSMIG (0.8796) by 0.87% and 3.59%. These results further confirm its generalization capabilities and cross-domain adaptability. Overall, the results from both datasets support the effectiveness of the BiGvCL method, highlighting its potential in relational learning tasks within biomedical knowledge networks.

Figure 5.

External and cross-domain validation of BiGvCL.

External test set and cross-domain knowledge verification.

Integrative analysis of drug-gene interactions based on BiGvCL

To demonstrate the practical utility of the BiGvCL model in drug repositioning and drug interaction studies, we conducted a comprehensive case study based on the model’s embedding representations, as illustrated in Fig. 6. Initially, we constructed a complete drug-gene interaction network (Fig. 6A) using training data, showing known interactions between drugs (orange nodes) and genes (blue nodes). Building upon this global network, we explored two applications.

Figure 6.

Integrated drug-gene interaction analysis and case studies.

Integrated analysis of drug-gene interactions based on BiGvCL. (A) Global drug-gene interaction network. (B) Predicted drug-gene associations validated by literature. (C) Discovery of drug–drug interactions based on drug similarity. (D) Identification of antibody drugs based on drug similarity. (E) Identification of GABA receptor-related drugs based on drug similarity.

The first application involved predicting and validating novel drug-gene interactions (Fig. 6B). Four biologically significant genes, VEGFA, PRKAA1, CYP3A4, and IL6, were selected, and interactions predicted by the model with probabilities exceeding 0.99 were identified as high-confidence candidates. These predicted interactions did not exist in either the training or test sets and therefore represent novel findings. Subsequent literature searches provided evidence supporting these high-confidence predictions. Specifically, we validated various predicted drugs regulating VEGFA through existing anti-angiogenesis studies (Table S1); revealed compounds influencing the cAMP signaling pathway related to PRKAA1 (Table S2); confirmed multiple potential substrates and inhibitors for the drug-metabolizing enzyme CYP3A4 (Table S3); and identified several potential modulators for the inflammatory cytokine IL6 (Table S4). These findings underscore BiGvCL’s capability not only to reproduce known interactions but also to predict previously undiscovered drug-gene relationships that are supported by the literature, demonstrating its potential in drug discovery applications.

The second application leveraged the embedding layer of the model, identifying the top 10 most similar drug pairs to investigate drug interactions (Figs 6C–E, Table S5). In Fig. 6C, we identified three drug pairs with distinct interaction patterns: toxic interactions between Lamotrigine and Escitalopram, synergistic effects between Quetiapine and Aripiprazole, and functional consistency between Rapastinel and Apimostinel. Figure 6D highlights antibody-type drugs identified by the model, including Vanticizumab and REGN421 (DLL4 inhibitors), illustrating the model’s generalization capability across diverse drug classes. Figure 6E presents various compounds associated with GABA receptors, including agonists like Bretazenil and antagonists like Flumazenil, indicating that the model captures specific receptor-associated drug patterns.
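Ranking drug pairs by embedding similarity can be sketched as below. Cosine similarity is assumed here because the text does not name the similarity measure, and the drug names and vectors are hypothetical placeholders rather than embeddings from the trained model.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_similar_pairs(embeddings, k=10):
    """Return the k most similar (drug, drug, score) pairs, best first."""
    scored = [
        (cosine(embeddings[a], embeddings[b]), a, b)
        for a, b in combinations(sorted(embeddings), 2)
    ]
    scored.sort(reverse=True)
    return [(a, b, score) for score, a, b in scored[:k]]

# Hypothetical 3-dimensional embeddings; the model in the text uses 1024 dimensions.
toy_embeddings = {
    "drug_a": [1.0, 0.0, 0.0],
    "drug_b": [0.9, 0.1, 0.0],
    "drug_c": [0.0, 1.0, 0.0],
    "drug_d": [0.0, 0.8, 0.6],
}
pairs = top_similar_pairs(toy_embeddings, k=2)
for first, second, score in pairs:
    print(first, second, round(score, 3))
```

Scoring all pairs is quadratic in the number of drugs; for large drug sets a matrix-based or approximate nearest-neighbour search would be the more practical choice.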

Collectively, these case studies suggest that the BiGvCL model, through its embedding representations, can predict novel drug-gene interactions, identify functional similarities among drugs, and uncover interaction patterns. Thus, BiGvCL provides a computational method for drug repositioning, combination therapy development, and target-based drug design. Relevant literature supporting these findings can be found in our supplementary materials.

Molecular docking analysis

To further validate the practical applicability of the drug–gene interactions predicted by our model, we conducted molecular docking experiments targeting the drug-metabolizing enzyme CYP3A4. Docking analyses were performed using the online docking platform CB-Dock2 [83]. We selected Lamotrigine and Hesperadin as representative candidate compounds, which were predicted with high confidence by the BiGvCL model to interact with CYP3A4. Two protein structures (PDB IDs: 1W0E and 5VC0) were used in the docking analysis. The 1W0E structure represents CYP3A4 co-crystallized with the enzyme inhibitor metyrapone, whereas 5VC0 corresponds to CYP3A4 complexed with the antiviral drug ritonavir. These two structures capture distinct conformations of the enzyme, providing diverse binding-site characteristics.

Docking results, shown in Fig. 7, indicate that Hesperadin exhibited favorable predicted binding affinity toward CYP3A4 (Vina scores: −9.3 with 5VC0 and −8.2 with 1W0E; more negative scores indicate stronger predicted binding), suggesting its potential as a CYP3A4 ligand. Lamotrigine showed weaker predicted binding (Vina score: −6.0 for both 5VC0 and 1W0E). These findings further highlight the potential of BiGvCL for virtual drug screening applications.

Figure 7.

Docking poses of Lamotrigine and Hesperadin with CYP3A4 variants.

Docking poses of Lamotrigine and Hesperadin in two CYP3A4 conformations. (A) Lamotrigine bound to CYP3A4-1W0E. (B) Hesperadin bound to CYP3A4-5VC0. (C) Lamotrigine bound to CYP3A4-5VC0. (D) Hesperadin bound to CYP3A4-1W0E.

Figure 8.

Functional enrichment analysis of PAZOPANIB-associated genes.

Drug-based functional annotation of predicted genes.

Drug-based functional annotation of predicted genes

We further explored whether BiGvCL predictions can support functional enrichment analysis, using PAZOPANIB as a case study. PAZOPANIB is a potent, multi-targeted tyrosine kinase inhibitor used as an anticancer drug. It limits tumor growth by inhibiting enzymes including vascular endothelial growth factor receptor, platelet-derived growth factor receptor, c-KIT, and FGFR, and has complex pharmacological mechanisms.

We performed KEGG and GO enrichment analysis on the top 100 genes that BiGvCL predicted to be most strongly associated with PAZOPANIB. Figure 8 was plotted from Metascape results on the CNSknowall platform (https://cnsknowall.com). The analysis showed that the predicted genes mainly participate in several key functional modules: the most significantly enriched term was ‘inward rectifier potassium channel activity’ (GO Molecular Functions, Rich factor = 0.25, P < 1 × 10⁻¹³), involving seven key genes, followed by the proteasome pathway (KEGG Pathway) and the gap junction pathway; further enriched modules included immune inflammatory response, protein kinase activity, and synaptic signaling.
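As a rough illustration of the quantities reported for an enriched term, the sketch below computes a rich factor and a hypergeometric over-representation P value. Several inputs are assumptions: the rich factor is taken to be hits divided by the number of genes annotated to the term (so seven hits at 0.25 imply 28 annotated genes), and the background of 20,000 genes is a placeholder. The sketch is not expected to reproduce the P value reported by Metascape, whose background set and statistics may differ.

```python
from math import comb

def rich_factor(hits: int, term_size: int) -> float:
    """Fraction of the term's annotated genes that appear in the query list."""
    return hits / term_size

def hypergeom_tail(hits: int, term_size: int, query_size: int, background_size: int) -> float:
    """P(X >= hits) when drawing query_size genes without replacement."""
    upper = min(term_size, query_size)
    favourable = sum(
        comb(term_size, x) * comb(background_size - term_size, query_size - x)
        for x in range(hits, upper + 1)
    )
    return favourable / comb(background_size, query_size)

hits = 7             # genes in the top term, from the text
query_size = 100     # top 100 predicted genes, from the text
term_size = 28       # assumption: implied by rich factor 0.25 with 7 hits
background = 20_000  # assumption: placeholder genome-wide background

factor = rich_factor(hits, term_size)
p_value = hypergeom_tail(hits, term_size, query_size, background)
print(factor, p_value)
```

Exact integer binomials keep the calculation free of rounding error until the final division, at the cost of large intermediate numbers.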

Conclusions

In this study, we introduced BiGvCL, a novel bipartite graph-based cross-domain contrastive learning framework designed for predicting DGIs. BiGvCL integrates a GATLite, a GatedGCN, and a contrastive learning strategy to capture complex interaction patterns. Evaluations on benchmark datasets (DrugBank, DGIdb, LINCS L1000, and OGB) highlighted BiGvCL’s stability and cross-domain generalization capability. Ablation studies further validated the contributions of the contrastive learning and gating mechanisms, while systematic hyperparameter analyses determined optimal configurations balancing predictive accuracy with computational efficiency. Additionally, BiGvCL demonstrated adaptability under sparse network conditions and predicted novel drug-gene interactions with supporting evidence from existing literature. Despite these strengths, our approach has inherent limitations, such as reliance solely on network topology without explicit molecular annotations, and the constraints imposed by its transductive learning strategy, limiting predictions for unseen entities. In the future, we will extend BiGvCL to inductive learning scenarios and integrate multimodal biomedical data to enhance its predictive performance and generalization capability.

Key Points

  • Developed a novel graph contrastive learning model to capture drug–gene relationships based exclusively on topology.

  • Achieved competitive metrics across DrugBank, DGIdb, LINCS L1000, and OGB biokg benchmarks.

  • Demonstrated consistent robustness in predicting interactions within sparse and small-scale networks.

  • Provided molecular docking validation supporting the biological plausibility of predicted interactions.

  • Enabled identification of potential drug repositioning targets without requiring external biochemical or genomic data.

Supplementary Material

bbaf710_Supplemental_Files

Acknowledgements

This work was supported by the National Science and Technology Major Project (2022ZD0117700), the National Natural Science Foundation of China (No. 62450002, 62303355), Zhejiang Provincial Natural Science Foundation of China (No. LD24F020004), and the Municipal Government of Quzhou (No. 2024D001).

Contributor Information

Shida He, The Joint Innovation Center for Engineering in Medicine, Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, No. 100, Minjiang Avenue, Kecheng District, Quzhou, Zhejiang, 324000, China; Department of Respiratory and Critical Care, Quzhou Affiliated Hospital of Wenzhou Medical University, No. 100, Minjiang Avenue, Kecheng District, Quzhou, Zhejiang, 324000, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No. 1 Chengdian Road, Kecheng District, Quzhou, Zhejiang, 324000, China.

Zixu Wang, College of Computer Science and Electronic Engineering, Hunan University, Lushan South Road, Yuelu District, Changsha, Hunan, 410082, China.

Jing Li, Department of Microbiology, University of Hong Kong, Block T, Queen Mary Hospital, Pok Fu Lam Road, Pok Fu Lam, Hong Kong, 000000, China.

Quan Zou, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No. 1 Chengdian Road, Kecheng District, Quzhou, Zhejiang, 324000, China.

Feng Zhang, The Joint Innovation Center for Engineering in Medicine, Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, No. 100, Minjiang Avenue, Kecheng District, Quzhou, Zhejiang, 324000, China; Department of Respiratory and Critical Care, Quzhou Affiliated Hospital of Wenzhou Medical University, No. 100, Minjiang Avenue, Kecheng District, Quzhou, Zhejiang, 324000, China.

Conflict of interest: The authors declare no competing interests.

Funding

This work was supported by the National Science and Technology Major Project (2022ZD0117700), the National Natural Science Foundation of China (No. 62450002, 62303355), Zhejiang Provincial Natural Science Foundation of China (No. LD24F020004), and the Municipal Government of Quzhou (No. 2024D001).

Data availability

The data and code in this study are openly accessible on GitHub (https://github.com/heshida01/BiGvCL).

References

  • 1. Zhang  S, Liu  K, Liu  Y, et al.  The role and application of bioinformatics techniques and tools in drug discovery. Front Pharmacol  2025;16:1547131. 10.3389/fphar.2025.1547131. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Khan  S, Imran  A, Khan  AA, et al.  Systems biology approaches for the prediction of possible role of chlamydia pneumoniae proteins in the etiology of lung cancer. PLoS One  2016;11:e0148530. 10.1371/journal.pone.0148530. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Wang  Y, Imran  A, Shami  A, et al.  Decipher the helicobacter pylori protein targeting in the nucleus of host cell and their implications in gallbladder cancer: an insilico approach. J Cancer  2021;12:7214–22. 10.7150/jca.63517. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Khan  S, Zaidi  S, Alouffi  AS, et al.  Computational proteome-wide study for the prediction of Escherichia coli protein targeting in host cell organelles and their implication in development of colon cancer. ACS omega  2020;5:7254–61. 10.1021/acsomega.9b04042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Bris  C, Goudenege  D, Desquiret-Dumas  V, et al.  Bioinformatics tools and databases to assess the pathogenicity of mitochondrial DNA variants in the field of next generation sequencing. Front Genet  2018;9:632. 10.3389/fgene.2018.00632. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Chen  R, Zhang  D, Chaudhary  AA, et al.  Deciphering the Withania somnifera alkaloids potential for cure of neurodegenerative disease: an in-silico study. AMB Express  2025;15:29. 10.1186/s13568-025-01826-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Khan  S, Zakariah  M, Rolfo  C, et al.  Prediction of mycoplasma hominis proteins targeting in mitochondria and cytoplasm of host cells and their implication in prostate cancer etiology. Oncotarget  2017;8:30830–43. 10.18632/oncotarget.8306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Gupta  KK, Sharma  KK, Chandra  H, et al.  The integrative bioinformatics approaches to predict the xanthohumol as anti-breast cancer molecule: targeting cancer cells signaling PI3K and AKT kinase pathway. Front Oncol  2022;12:950835. 10.3389/fonc.2022.950835. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Yamanishi  Y, Araki  M, Gutteridge  A, et al.  Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics  2008;24:i232–40. 10.1093/bioinformatics/btn162. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Bleakley  K, Yamanishi  Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics  2009;25:2397–403. 10.1093/bioinformatics/btp433. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Favorov  A, Mularoni  L, Cope  LM, et al.  Exploring massive, genome scale datasets with the GenometriCorr package. PLoS Comput Biol  2012;8:e1002529. 10.1371/journal.pcbi.1002529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Ren  Z, Zeng  X, Lao  Y, et al.  Predicting rare drug-drug interaction events with dual-granular structure-adaptive and pair variational representation. Nat Commun  2025;16:3997. 10.1038/s41467-025-59431-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Qian  Y, Wang  Y, Liu  J, et al.  A survey on multi-view fusion for predicting links in biomedical bipartite networks: methods and applications. Information Fusion  2025;117:102894. 10.1016/j.inffus.2024.102894. [DOI] [Google Scholar]
  • 14. Askr  H, Elgeldawi  E, Aboul Ella  H, et al.  Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev  2023;56:5975–6037. 10.1007/s10462-022-10306-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Chen  R, Li  C, Wang  L, et al.  Pretraining graph transformer for molecular representation with fusion of multimodal information. Information Fusion  2025;115:102784. 10.1016/j.inffus.2024.102784. [DOI] [Google Scholar]
  • 16. Li  H, Pang  Y, Liu  B. BioSeq-BLM: a platform for analyzing DNA, RNA, and protein sequences based on biological language models. Nucleic Acids Res  2021;49:e129. 10.1093/nar/gkab829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Li  H, Liu  B. BioSeq-diabolo: biological sequence similarity analysis using diabolo. PLoS Comput Biol  2023;19:e1011214. 10.1371/journal.pcbi.1011214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Tang  Y, Pang  Y, Liu  B. DeepIDP-2L: protein intrinsically disordered region prediction by combining convolutional attention network and hierarchical attention network. Bioinformatics  2022;38:1252–60. 10.1093/bioinformatics/btab810. [DOI] [PubMed] [Google Scholar]
  • 19. Xiang  H, Zeng  L, Hou  L, et al.  A molecular video-derived foundation model for scientific drug discovery. Nat Commun  2024;15:9696. 10.1038/s41467-024-53742-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Li  P, Zhang  K, Liu  T, et al.  A deep learning approach for rational ligand generation with toxicity control via reactive building blocks. Nature Computational Science  2024;4:1–14. [DOI] [PubMed] [Google Scholar]
  • 21. Li  T, Ren  X, Luo  X, et al.  A foundation model identifies broad-Spectrum antimicrobial peptides against drug-resistant bacterial infection. Nat Commun  2024;15:7538. 10.1038/s41467-024-51933-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Schneider  P, Walters  WP, Plowright  AT, et al.  Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov  2020;19:353–64. 10.1038/s41573-019-0050-3. [DOI] [PubMed] [Google Scholar]
  • 23. Wang  Z, Chen  Y, Ma  P, et al.  Image-based generation for molecule design with SketchMol. Nature Machine Intelligence  2025;7:1–12. [Google Scholar]
  • 24. Zhang  H, Saravanan  KM. Advances in deep learning assisted drug discovery methods: a self-review. Curr Bioinforma  2024;19:891–907. 10.2174/0115748936285690240101041704. [DOI] [Google Scholar]
  • 25. Mathivanan  JS, Dhayabaran  VV, David  MR, et al.  Application of deep learning neural networks in computer-aided drug discovery: a review. Curr Bioinforma  2024;19:851–8. 10.2174/0115748936276510231123121404. [DOI] [Google Scholar]
  • 26. Pang  C, Qiao  J, Zeng  X, et al.  Deep generative models in de novo drug molecule generation. J Chem Inf Model  2023;64:2174–94. 10.1021/acs.jcim.3c01496. [DOI] [PubMed] [Google Scholar]
  • 27. Ai  C, Yang  H, Liu  X, et al.  MTMol-GPT: De novo multi-target molecular generation with transformer-based generative adversarial imitation learning. PLoS Comput Biol  2024;20:e1012229. 10.1371/journal.pcbi.1012229. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Liu  M, Li  C, Chen  R, et al.  Geometric deep learning for drug discovery. Expert Syst Appl  2023;240:122498. 10.1016/j.eswa.2023.122498. [DOI] [Google Scholar]
  • 29. Tao  W, Lin  X, Liu  Y, et al.  Bridging chemical structure and conceptual knowledge enables accurate prediction of compound-protein interaction. BMC Biol  2024;22:248. 10.1186/s12915-024-02049-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Ren  X, Wei  J, Luo  X, et al.  HydrogelFinder: a foundation model for efficient self-assembling peptide discovery guided by non-Peptidal small molecules. Adv Sci  2024;11:2400829. 10.1002/advs.202400829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Jumper  J, Evans  R, Pritzel  A, et al.  Highly accurate protein structure prediction with AlphaFold. Nature  2021;596:583–9. 10.1038/s41586-021-03819-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Yang  Y, Gao  D, Xie  X, et al.  DeepIDC: a prediction framework of injectable drug combination based on heterogeneous information and deep learning. Clin Pharmacokinet  2022;61:1749–59. 10.1007/s40262-022-01180-9. [DOI] [PubMed] [Google Scholar]
  • 33. Mahapatra  M, Sahu  C, Mohapatra  S. Trends of artificial intelligence (AI) use in drug targets, discovery and development: current status and future perspectives. Curr Drug Targets  2024;26:221–42. 10.2174/0113894501322734241008163304. [DOI] [PubMed] [Google Scholar]
  • 34. Song  N, Dong  R, Pu  Y, et al.  Pmf-cpi: assessing drug selectivity with a pretrained multi-functional model for compound-protein interactions. J Chem  2023;15:97. 10.1186/s13321-023-00767-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Zhao  Z, Gui  J, Yao  A, et al.  Improved prediction model of protein and peptide toxicity by integrating channel attention into a convolutional neural network and gated recurrent units. ACS omega  2022;7:40569–77. 10.1021/acsomega.2c05881. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Le  NQK, Yapp  EKY, Nagasundaram  N, et al.  Computational identification of vesicular transport proteins from sequences using deep gated recurrent units architecture. Comput Struct Biotechnol J  2019;17:1245–54. 10.1016/j.csbj.2019.09.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Nguyen  T, le  H, Quinn  TP, et al.  GraphDTA: predicting drug–target binding affinity with graph neural networks. Bioinformatics  2021;37:1140–7. 10.1093/bioinformatics/btaa921. [DOI] [PubMed] [Google Scholar]
  • 38. Gao  M, Zhang  D, Chen  Y, et al.  Graphormerdti: a graph transformer-based approach for drug-target interaction prediction. Comput Biol Med  2024;173:108339. 10.1016/j.compbiomed.2024.108339. [DOI] [PubMed] [Google Scholar]
  • 39. Xia  X, Zhu  C, Zhong  F, et al.  MDTips: a multimodal-data-based drug–target interaction prediction system fusing knowledge, gene expression profile, and structural data. Bioinformatics  2023;39:btad411. 10.1093/bioinformatics/btad411. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Wang  Y, Zhai  Y, Ding  Y, et al.  SBSM-pro: support bio-sequence machine for proteins. Science China-Information Sciences  2024;67:212106. 10.1007/s11432-024-4171-9. [DOI] [Google Scholar]
  • 41. Yang  Z, Liu  J, Zhu  X, et al.  FragDPI: a novel drug-protein interaction prediction model based on fragment understanding and unified coding. Front Comp Sci  2023;17:175903. 10.1007/s11704-022-2163-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Aragh  AH, Moslemi Amirani  R, Givehchian  P, et al.  MiRAGE-DTI: a novel approach for drug–target interaction prediction by integrating drug and target similarity metrics. Comput Biol Med  2025;192:110249. 10.1016/j.compbiomed.2025.110249. [DOI] [PubMed] [Google Scholar]
  • 43. Gao  D, Zhu  F. HMT-DTI: hierarchical meta-path learning with transformer for drug-target interaction prediction. Neural Netw  2025;194:108093. 10.1016/j.neunet.2025.108093. [DOI] [PubMed] [Google Scholar]
  • 44. Farha  MA, French  S, Brown  ED. Systems-level chemical biology to accelerate antibiotic drug discovery. Acc Chem Res  2021;54:1909–20. 10.1021/acs.accounts.1c00011. [DOI] [PubMed] [Google Scholar]
  • 45. Tannenbaum  C, Sheehan  NL. Understanding and preventing drug–drug and drug–gene interactions. Expert Rev Clin Pharmacol  2014;7:533–44. 10.1586/17512433.2014.910111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Liu  T, Huang  J, Luo  D, et al.  Cm-siRPred: predicting chemically modified siRNA efficiency based on multi-view learning strategy. Int J Biol Macromol  2024;264:130638. 10.1016/j.ijbiomac.2024.130638. [DOI] [PubMed] [Google Scholar]
  • 47. Zhang  R, Zhu  B, Jiang  T, et al.  Enhancing drug-target binding affinity prediction through deep learning and protein secondary structure integration. Curr Bioinforma  2024;19:943–52. 10.2174/0115748936285519240110070209. [DOI] [Google Scholar]
  • 48. Xie  W, Xu  J, Zhao  C, et al.  Transformer-based named entity recognition for clinical cancer drug toxicity by positive-unlabeled learning and KL Regularizers. Curr Bioinforma  2024;19:738–51. 10.2174/0115748936278299231213045441. [DOI] [Google Scholar]
  • 49. Li  L, Zhao  T, Hu  Y, et al.  Mathematical modelling and bioinformatics analyses of drug resistance for cancer treatment. Curr Bioinforma  2024;19:211–21. 10.2174/1574893618666230512141427. [DOI] [Google Scholar]
  • 50. Qiao  J, Jin  J, Wang  D, et al.  A self-conformation-aware pre-training framework for molecular property prediction with substructure interpretability. Nat Commun  2025;16:1–16. 10.1038/s41467-025-59634-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Chen  L, Li  Y, Ma  Y, et al.  Multiscale graph equivariant diffusion model for 3D molecule design. Science. Advances  2025;11:eadv0778. 10.1126/sciadv.adv0778. [DOI] [PubMed] [Google Scholar]
  • 52. Zhu  H, Hao  H, Yu  L. Identification of microbe–disease signed associations via multi-scale variational graph autoencoder based on signed message propagation. BMC Biol  2024;22:172. 10.1186/s12915-024-01968-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Huang  Z, Guo  X, Qin  J, et al.  Accurate RNA velocity estimation based on multibatch network reveals complex lineage in batch scRNA-seq data. BMC Biol  2024;22:290. 10.1186/s12915-024-02085-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Rao  J, Zheng  S, Mai  S. et al.  Communicative subgraph representation learning for multi-relational inductive drug-gene interaction prediction. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 3919–25. Vienna, Austria: International Joint Conferenceson Artificial Intelligence, 2022.
  • 55. Ma  M, Lei  X, Zhang  Y. A review of drug-related associations prediction based on artificial intelligence methods. Curr Bioinforma  2024;19:530–50. 10.2174/1574893618666230707123817. [DOI] [Google Scholar]
  • 56. Jin  S, Hong  Y, Zeng  L, et al.  A general hypergraph learning algorithm for drug multi-task predictions in micro-to-macro biomedical networks. PLoS Comput Biol  2023;19:e1011597. 10.1371/journal.pcbi.1011597. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Huang  K, Chandak  P, Wang  Q, et al.  A foundation model for clinician-centered drug repurposing. Nat Med  2024;30:3601–13. 10.1038/s41591-024-03233-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Ghislat  G, Hernandez-Hernandez  S, Piyawajanusorn  C, et al.  Data-centric challenges with the application and adoption of artificial intelligence for drug discovery. Expert Opin Drug Discov  2024;19:1297–307. 10.1080/17460441.2024.2403639. [DOI] [PubMed] [Google Scholar]
  • 59. Hasselgren  C, Oprea  TI. Artificial intelligence for drug discovery: are we there yet?  Annu Rev Pharmacol Toxicol  2024;64:527–50. 10.1146/annurev-pharmtox-040323-040828. [DOI] [PubMed] [Google Scholar]
  • 60. Wei  L, He  W, Malik  A, et al.  Computational prediction and interpretation of cell-specific replication origin sites from multiple eukaryotes by exploiting stacking framework. Brief Bioinform  2020;22:bbaa275. 10.1093/bib/bbaa275. [DOI] [PubMed] [Google Scholar]
  • 61. Tao  W, Liu  Y, Lin  X, et al.  Prediction of multi-relational drug–gene interaction via dynamic hypergraph contrastive learning. Brief Bioinform  2023;24:bbad371. 10.1093/bib/bbad371. [DOI] [PubMed] [Google Scholar]
  • 62. Fan  Y, Zhang  C, Hu  X, et al.  SGCLDGA: unveiling drug–gene associations through simple graph contrastive learning. Brief Bioinform  2024;25:bbae231. 10.1093/bib/bbae231. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Wu  J, Gan  W, Yu  PS. Graph diffusion network for drug-gene prediction. arXiv preprint arXiv:2502.09335. 2025.
  • 64. He  J, Wu  Y, Yuan  L, et al.  An inductive learning-based method for predicting drug-gene interactions using a multi-relational drug-disease-gene graph. Journal of Pharmaceutical Analysis  2025;15:101347. 10.1016/j.jpha.2025.101347. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Knox  C, Wilson  M, Klinger  CM, et al.  DrugBank 6.0: the DrugBank knowledgebase for 2024. Nucleic Acids Res  2024;52:D1265–75. 10.1093/nar/gkad976. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Cannon  M, Stevenson  J, Stahl  K, et al.  DGIdb 5.0: rebuilding the drug–gene interaction database for precision medicine and drug discovery platforms. Nucleic Acids Res  2024;52:D1227–35. 10.1093/nar/gkad1040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Subramanian  A, Narayan  R, Corsello  SM, et al.  A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell  2017;171:1437–1452.e17e17. 10.1016/j.cell.2017.10.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Hu W, Fey M, Zitnik M, et al. Open graph benchmark: datasets for machine learning on graphs. Adv Neural Inf Proces Syst 2020;33:22118–33.
  • 69. Rao N, Yu H-F, Ravikumar PK, et al. Collaborative filtering with graph information: consistency and scalable methods. Adv Neural Inf Proces Syst 2015;28:2107–15.
  • 70. Kim S, Chen J, Cheng T, et al. PubChem 2023 update. Nucleic Acids Res 2023;51:D1373–80. 10.1093/nar/gkac956.
  • 71. Lonsdale J, Thomas J, Salvatore M, et al. The genotype-tissue expression (GTEx) project. Nat Genet 2013;45:580–5. 10.1038/ng.2653.
  • 72. Veličković P, Cucurull G, Casanova A, et al. Graph attention networks. In: 6th International Conference on Learning Representations, ICLR. Vancouver, BC, Canada, 2018.
  • 73. Zhang H-Q, Arif M, Thafar MA, et al. PMPred-AE: a computational model for the detection and interpretation of pathological myopia based on artificial intelligence. Front Med 2025;12:1529335. 10.3389/fmed.2025.1529335.
  • 74. Chen T, Kornblith S, Norouzi M, et al. A simple framework for contrastive learning of visual representations. In: Daumé III H, Singh A (eds.), Proceedings of the 37th International Conference on Machine Learning, vol. 119. PMLR, ELECTR NETWORK, 2020, 1597–1607.
  • 75. Loshchilov  I, Hutter  F. Decoupled weight decay regularization. In: 7th International Conference on Learning Representations, ICLR. New Orleans, LA, USA, 2019.
  • 76. Smith LN. Cyclical learning rates for training neural networks. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV). Santa Rosa, CA, USA: IEEE, 2017.
  • 77. Candes E, Recht B. Exact matrix completion via convex optimization. Commun ACM 2012;55:111–9. 10.1145/2184319.2184343.
  • 78. Hartford J, et al. Deep models of interactions across sets. In: The 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018, 1914–23.
  • 79. Berg  Rvd, Kipf  TN, Welling  M. Graph convolutional matrix completion. In: KDD Workshop on Deep Learning Day. London, UK, 2018.
  • 80. Monti F, Bronstein M, Bresson X. Geometric matrix completion with recurrent multi-graph neural networks. Adv Neural Inf Proces Syst 2017;30:3698–708.
  • 81. Ying R, He R, Chen K, et al. Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. London, UK: ACM, 2018, 974–83.
  • 82. Zhang  M, Chen  Y. Inductive matrix completion based on graph neural networks. In: 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR, 2020.
  • 83. Liu Y, Yang X, Gan J, et al. CB-Dock2: improved protein–ligand blind docking by integrating cavity detection, docking and homologous template fitting. Nucleic Acids Res 2022;50:W159–64. 10.1093/nar/gkac394.

Associated Data

Supplementary Materials

bbaf710_Supplemental_Files

Data Availability Statement

The data and code in this study are openly accessible on GitHub (https://github.com/heshida01/BiGvCL).


Articles from Briefings in Bioinformatics are provided here courtesy of Oxford University Press
