Abstract
Drug repurposing efficiently identifies new applications for already approved drugs at reduced time and cost. ConvAHKG, an action-based hybrid knowledge graph approach, is proposed to improve the prediction of drug-disease associations by leveraging biological relationships among drugs, proteins, and diseases. AHKG is designed to integrate both drug and disease features to provide a comprehensive framework. To represent these relationships, Word2Vec embeddings are used to capture the semantic similarities among entities, and a novel dual-channel 1D convolutional neural network (IDC_Conv1D) is introduced for the classification of drug-disease pairs. This architecture is specifically intended to handle the complexity and heterogeneity of biological data. Furthermore, to address the significant class imbalance present in drug–disease datasets, a weighted binary cross-entropy loss function was introduced that assigns higher penalties to minority-class misclassifications, resulting in improved predictive performance. ConvAHKG outperforms state-of-the-art models, with an AUC of 0.9836 and an AUPRC of 0.9686. To validate its practical utility, we applied ConvAHKG to study non-small cell lung cancer (NSCLC). The framework identified promising therapeutic candidates for NSCLC, including Trastuzumab, and molecular docking analyses demonstrated strong binding interactions for an additional predicted but experimentally unvalidated compound, further supporting its potential as a novel treatment option. All data and code used in this study are available at https://github.com/Marzieh-Khodadadi/ConvAHKG.
Subject terms: Cancer, Computational biology and bioinformatics, Drug discovery
Introduction
Traditional drug discovery and development involves several stages to identify a new drug1. While these steps are crucial for discovering new treatments, the process is often time-consuming, expensive, and has a high failure rate2. To address these challenges, drug repurposing (DR) has gained attention.
DR, which involves finding new therapeutic applications for existing drugs, presents a promising alternative to traditional drug development. This approach leverages the extensive safety data already available for these medications, potentially accelerating their progression through clinical trials. By repurposing drugs, researchers can shorten development timelines and reduce associated costs3. A notable example is thalidomide, which was originally prescribed as a sedative and for morning sickness but was withdrawn due to severe birth defects. Subsequently, thalidomide was repurposed for the treatment of multiple myeloma and leprosy, revealing significant new therapeutic benefits4.
DR can be approached through either experimental or computational methods. Experimental methods depend on clinical trial outcomes to identify drug effects and similarities in mechanisms of action (MoA)5,6. In contrast, computational methods leverage advanced algorithms and data analysis techniques to predict potential drug candidates. The effectiveness of these computational approaches has significantly increased due to the rapid expansion of large-scale biomedical data7. Computational drug repositioning studies can be classified into three categories based on data type: (i) drug-based strategies, where discovery originates from knowledge related to drugs, (ii) disease-based strategies, where discovery is based on information about diseases, and (iii) hybrid strategies, which utilize both drug and disease features for improved prediction accuracy. Computational approaches offer several benefits, such as quickly analyzing large amounts of data, saving time and cost compared to experimental DR methods, and discovering new drug candidates that might not be easily found through experiments8.
Computational methods for drug repurposing can be broadly categorized into two main approaches: feature-based methods and knowledge graph-based methods. Feature-based approaches focus on utilizing specific characteristics or features of drugs and diseases, such as chemical structure, gene expression profiles, or biological pathways, to identify potential repurposing opportunities. These features often come from text mining and biological databases. Computational techniques analyze features to recognize patterns and similarities, suggesting a drug’s potential for a new therapeutic target9. Aliper et al.10 employed deep neural networks (DNN) to classify various drugs into therapeutic categories based only on their transcriptional profiles and pathway activation scores as features. N. Jarada et al.11 proposed the SNF-NN framework, a hybrid approach for predicting drug-disease interactions by utilizing similarity measures. This method employed Similarity Network Fusion to merge these measures into comprehensive matrices. To predict drug response, the DeepDRK framework was developed to integrate multi-omics data and chemical properties of compounds as features. Wang et al.12 employed a kernel-based approach to create similarity matrices, which are then used to train DNNs for drug response prediction. Zhang et al.13 proposed a deep learning framework that learns directly from 3D molecular spatial visual information and combines it with traditional molecular descriptors to build multi-perspective drug representations, achieving improved performance on drug–drug, drug–protein, and drug–microRNA association prediction. Wei et al.14 presented DeepLPI, which utilizes raw sequences of drug molecules and proteins, employing pre-trained embeddings as input data. The model architecture combines 1D convolutional neural networks (1D-CNN) and bidirectional long short-term memory (biLSTM) layers. In another study, Amiri et al.15 introduced IDDI-DNN. In this method, multiple matrices representing drug and disease features are systematically prepared and combined using cosine similarity and similarity network fusion. This combined matrix is then used as input for training the CNN.
Feature-based methods in drug repurposing can be limited because they tend to ignore the intricate connections and interactions that exist between different biological components. Additionally, these methods often cannot easily adapt to new discoveries or changes in the data16. To address these limitations, recent advancements have introduced network-based methods that aim to overcome the issues17. These methods provide flexible platforms for repurposing drugs by using network biology to model interactions between various biological concepts, especially through knowledge graphs (KGs). KGs are effective for storing biomedical data because they can represent complex data structures. Formally, a KG can be described as a labeled multi-graph, which consists of entities often called nodes and relationships connecting these entities, referred to as edges18,19.
KG methods can be categorized into two classes: link prediction, which predicts potential connections between entities, and binary classification, which classifies relationships as either present or absent20. One important class of link prediction methods is recommender systems. These systems are particularly valuable in DR as they recommend top-ranked diseases for a given drug, or vice versa21.
The well-known methods used in KG recommender systems include matrix factorization, autoencoders, and Graph Neural Network (GNN) models. Sadeghi et al.22 introduced a collaborative filtering-based recommender system, utilizing a latent factor model. Their method applies matrix factorization to combine drug and disease similarities. Lakizadeh et al.23 introduced a drug repurposing method named DRSE. The method applies random walk with restart (RWR) for network diffusion and uses Diffusion Component Analysis (DCA) for dimensionality reduction. It then employs matrix factorization to predict drug-disease associations. Wei et al.24 introduced GraCMI, a multi-hop heterogeneous network model that integrates molecular attributes with global graph structure. It learns multi-level neighborhood representations using high- and low-order matrix factorization, enabling richer feature fusion and improved prediction accuracy. Shao et al.25 propose a recommendation system for prioritizing cancer drugs based on functional similarities derived from multi-omics data, which are mapped to KEGG pathways. Their approach uses single-sample gene set enrichment analysis (ssGSEA) to quantify pathway activity and applies a matrix factorization model to predict drug responses. Zhao et al.26 introduced DDAGDL, a geometric deep learning framework that models drug repurposing as learning on a heterogeneous biomedical network. It combines biological information with attention-based representation learning to generate informative embeddings. In another study, Ren et al.27 present a framework called DeepLGF, designed to merge local and global information derived from a KG. This method utilizes a BFGNN to capture biological function features. Furthermore, the authors implement an autoencoder-based strategy within the KG embedding process, and the resulting embeddings are subsequently input into a DNN for prediction.
Zhao et al.28 demonstrated that combining biological features with graph neighborhood structure in KG/HIN models produces richer node representations than co-occurrence embeddings, improving prediction accuracy and robustness.
Zeng et al.29 present the DeepDR model, which constructs heterogeneous networks and employs a multi-modal deep autoencoder (MDA) to integrate the structural representations derived from random walk-based probability matrices, creating a compact, low-dimensional feature space. For predicting drug-disease associations, they utilize a collective variational autoencoder (cVAE) that encodes and decodes both the features and the known interactions. Tayebi et al.30 propose EKGDR, an End-to-End KG-Based Method that utilizes GNNs. The method focuses on drug features exclusively, employing a GNN recommender system to capture complex relationships among entities in the KG. The HGTDR model operates as an end-to-end system. Gharizadeh et al.31 built a heterogeneous graph and employed a heterogeneous graph transformer (HGT) for extracting node features, utilizing attention mechanisms along the way. Zhao et al.32 proposed a fine-tuned MM/PBSA(GBSA) workflow to improve binding affinity prediction. Their approach uses docking analysis with a fast two-step re-ranking and scoring strategy that outperforms conventional docking scores. For drug repurposing, this provides a reliable post-screening refinement step to validate top drug–target candidates.
Link prediction and recommender systems in graph-based methods face several limitations that binary classification can help address:
Handling sparse data33: These methods often struggle with sparse data, which can lead to inaccurate predictions. Binary classification can effectively use labeled data, improving prediction accuracy.
Scalability issues34: As the graph size increases, scalability becomes a significant problem, causing computational complexity to rise. However, binary classification reduces computational complexity, making it more efficient and scalable for large-scale applications.
Sensitivity to noise and outliers: These methods are sensitive to noise and outliers in the data, which can negatively impact the quality of recommendations. In contrast, binary classification provides more robust predictions, being less sensitive to noise, especially when using clean and well-labeled data.
In this research, a binary classification method was utilized to predict drug-disease associations. A hybrid KG was constructed, integrating both drug and disease features to extract meaningful relationships.
To extract features from the KG, embedding methods can generally be categorized into shallow and deep approaches. Shallow methods are based on meta-paths and the complex structure of heterogeneous graphs, while deep methods leverage message-passing techniques35. Although these methods demonstrate strong performance, they often require significant computational resources and powerful systems, especially when handling a large number of relationships and high heterogeneity. In this study, the Word2Vec method was chosen as a simpler and faster alternative, initially proposed by Ghorbanali et al.36. This method, utilizing a ‘many-to-many’ configuration, is computationally efficient and capable of capturing complex relationships between entities and their associations.
In this project, a 1D-CNN was implemented to process 1D vectors obtained from Word2Vec embeddings of the KG. Inspired by architectures like InceptionNet (GoogLeNet)37 and AlexNet38, the model leverages CNNs to learn similarities between input pairs by comparing the features extracted from each channel. The advantages of using a 1D-CNN in this context include its ability to effectively capture local dependencies and patterns within the 1D vector data.
To evaluate our model, we compared it with several state-of-the-art approaches in drug repurposing and interaction prediction. KBMF39 employs Bayesian matrix factorization with kernel functions. DTINet40 uses a heterogeneous network to learn low-dimensional representations and predicts interactions via vector space projection. DeepDR41 integrates multiple networks through a multi-modal deep autoencoder and applies a variational autoencoder. EKGDR30 leverages graph neural networks on a knowledge graph to embed nodes and relations for end-to-end interaction probability prediction. DrugRep-HeSiaGraph36 combines a drug-disease knowledge graph with a siamese neural network to enhance embeddings in a unified latent space. Our model, which combines a novel 1D-CNN architecture with Word2Vec embeddings of the knowledge graph, outperformed all of these approaches. This design allowed the model to capture intricate relationships within the knowledge graph, resulting in significantly higher AUROC and AUPRC scores than all other approaches.
Drug repurposing offers a promising and cost-efficient alternative to conventional drug development; however, accurately identifying novel therapeutic indications for existing drugs remains a significant challenge, largely due to the complexity of biological systems and the scarcity of reliable negative samples. Previous studies have utilized knowledge graphs and machine learning techniques to integrate heterogeneous biomedical data, but these approaches often oversimplify molecular interactions or fail to incorporate critical biological context, thereby limiting their predictive accuracy. In this study, we address these limitations by constructing a hybrid knowledge graph that explicitly models drug–protein and disease–protein interactions using action-based edge labels. In addition, we incorporate drug side effects as negative training instances to enhance model robustness. The main contributions of our model are as follows:
A new hybrid knowledge graph was constructed using drug-protein and disease-protein actions as labels for the edges.
A novel 1D-CNN dual-channel architecture (IDC_Conv1D) was introduced to improve the model’s ability to capture complex relationships.
A weighted binary cross-entropy loss function was developed to address the challenges posed by unbalanced data.
Methods
In this study, a comprehensive dataset was constructed to investigate drug-disease interactions and to develop a network that captures the complex biological and chemical relationships between biomedical entities. This section details the construction of the dataset and its integration into a knowledge graph framework.
Knowledge graph
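The knowledge graph $\mathcal{G} = (\mathcal{E}, \mathcal{T}, \mathcal{R})$ is formally defined as a directed graph, where $\mathcal{E}$ is the set of entities (nodes), $\mathcal{T}$ is the set of directed edges, and $\mathcal{R}$ is the set of relation types. Each edge is a triple $(h, r, t) \in \mathcal{T}$, where $h, t \in \mathcal{E}$ represent the head and tail entities, and $r \in \mathcal{R}$ denotes the relation linking them42. For instance, a drug interacting with a disease via an inhibition mechanism would be represented as $(\text{drug}, \text{inhibits}, \text{disease})$.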
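As a minimal sketch of this definition, a knowledge graph can be held in Python as a set of (head, relation, tail) triples from which the entity and relation sets are derived. The identifiers below are hypothetical placeholders, not entries taken from AHKG, and the `outgoing` helper is introduced only for illustration.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    head: str
    relation: str
    tail: str

# Hypothetical triples for illustration only.
triples = {
    Triple("drug_A", "inhibits", "protein_X"),
    Triple("disease_B", "biomarker", "protein_X"),
    Triple("drug_A", "has_approved_interaction", "disease_B"),
}

# Entities are every head or tail; relation types are the edge labels.
entities = {t.head for t in triples} | {t.tail for t in triples}
relations = {t.relation for t in triples}

def outgoing(node):
    """Return the directed edges leaving `node`, sorted for stable output."""
    return sorted(t for t in triples if t.head == node)
```

Storing edges as triples keeps the graph directed and multi-relational: the same pair of nodes can be linked by several differently labeled edges.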
Drug repurposing via knowledge graph modeling
The objective of drug repurposing is to build a predictive model $f$ that accurately identifies true drug-disease associations from positive samples $\mathcal{P}$, while minimizing false positives from negative samples $\mathcal{N}$. Formally, the model is defined as $f: (c, d) \mapsto y \in \{0, 1\}$ for a compound $c$ and a disease $d$, where $y = 1$ indicates a valid association and $y = 0$ denotes a side-effect-based negative interaction43. RepoDB44 provides the positive pairs, while SIDER45 contributes the negative samples based on side effects.
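The labeling scheme can be sketched as follows, assuming the approved pairs and the side-effect pairs are already available as sets of (drug, disease) tuples. All identifiers are hypothetical, and `build_labeled_pairs` is an illustrative helper rather than part of the published code.

```python
# Hypothetical (drug, disease) pairs; real pairs would come from RepoDB and SIDER.
approved_pairs = {("DB_0001", "C_0001"), ("DB_0002", "C_0002")}
side_effect_pairs = {
    ("DB_0001", "C_0003"),
    ("DB_0003", "C_0002"),
    ("DB_0003", "C_0004"),
}

def build_labeled_pairs(positives, negatives):
    """Label approved pairs 1 and side-effect pairs 0."""
    labeled = [(drug, disease, 1) for drug, disease in sorted(positives)]
    labeled += [(drug, disease, 0) for drug, disease in sorted(negatives)]
    return labeled

dataset = build_labeled_pairs(approved_pairs, side_effect_pairs)
n_pos = sum(label for _, _, label in dataset)
n_neg = len(dataset) - n_pos
```

Counting `n_pos` and `n_neg` at this stage makes the class imbalance explicit before any model is trained.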
Action-based hybrid knowledge graph (AHKG)
AHKG, illustrated in Fig. 1, is a heterogeneous, multi-relational hybrid knowledge graph constructed to encapsulate complex relationships between compounds, diseases, proteins, and other biomedical entities. This unified representation integrates multiple sources, offering a detailed view of drug mechanisms, interactions, and classifications.
Fig. 1.
AHKG is a heterogeneous hybrid graph constructed for drug repurposing, integrating multiple biomedical databases to model complex relationships among compounds, proteins, and diseases.
Key features of AHKG include:
Table 1 provides a structured overview of the node types used in AHKG.
Compound–protein interactions: Sourced from DrugBank, these interactions are modeled using 38 interaction types such as inhibition, activation, and binding. Additionally, they are classified into four bond types: Target, Enzyme, Carrier, and Transporter bonds, reflecting functional relationships between drugs and proteins.
Disease–protein relations: Derived from DisGeNET, AHKG models six types of disease-protein relationships, such as biomarker associations and expression changes, to reflect the role of proteins in disease progression.
Anatomical therapeutic chemical (ATC) classification: The ATC hierarchy is obtained from DrugBank, and each compound is classified within its four levels, from anatomical group (Level 1) to chemical subgroup (Level 4) 46.
Drug categories: Also sourced from DrugBank, drugs are categorized by therapeutic function and structural class. For example, Levothyroxine is labeled as both an agent for hypothyroidism and as a peptide-based molecule 47.
Chemical substructures: Using RDKit, molecular substructures are extracted as 1024-bit Morgan fingerprints 48, which are modeled as nodes in the graph.
Compound–disease associations: This includes therapeutic links (e.g., has_approved_interaction) and adverse effects (e.g., has_side_effect) between drugs and diseases, compiled from RepoDB and SIDER, enabling the representation of both beneficial and harmful drug-disease relationships.
The Supplementary Materials summarize the key structural and semantic differences between RepoDB-centered drug–disease repurposing knowledge graphs (DeepDR, DrugRep-KG, DrugRep-HeSiaGraph, and EKGDR) and our proposed AHKG (ConvAHKG). While all frameworks are grounded in RepoDB supervision, AHKG provides richer biological context by incorporating an expanded set of compound feature nodes (e.g., ATC, type, substructure, and pathway) and explicit protein-centric components, including protein feature nodes, a PPI network, and protein–disease interactions. Most importantly, AHKG introduces substantially more expressive edge semantics with 59 relation types, including action-based compound–protein edges (38 types) and 6 distinct disease–protein interaction categories. Additional information, such as node sizes and source databases, is also provided in the Supplementary Materials.
Table 1.
Summary of datasets with node types, remarks, sizes, and database sources.
| Node Types | Remarks | Node Size | Database |
|---|---|---|---|
| Compound | | 1509 | RepoDB |
| Disease | | 1229 | RepoDB |
| Protein | Protein–protein interactions derived from STRING; also protein–disease associations extracted from DisGeNET | 2995 | DrugBank |
| Compound_Type | Includes small molecules and biotech products | 2 | DrugBank |
| Compound_Category | | 2722 | DrugBank |
| ATC Code | Four-level hierarchy: anatomical, therapeutic, pharmacological, chemical | 1389 | DrugBank |
| Side Effect | | 352 | SIDER |
| Pathway | | 650 | DrugBank |
| Sub_Structure | Morgan fingerprints with 1024 binary features | 1024 | RDKit |
| Bond Types | Enzyme, Transporter, Receptor, Target | 4 | DrugBank |
This study proposes a drug repurposing approach that uses a knowledge graph framework to capture the complex connections between compounds, proteins, diseases, and other biomedical entities. The approach was designed to tackle the difficulties of binary classification on imbalanced datasets. First, entities and their relationships are embedded into a unified latent space through feature extraction. These embeddings are then fed into a dual convolutional channel model that classifies drug-disease pairs. The method consists of three main parts:
Feature extraction: The Word2Vec algorithm is employed to extract embeddings from the knowledge graph, representing entities and their relationships in a high-dimensional vector space.
Binary classification task: A novel dual 1-dimensional convolutional channel architecture is implemented to classify drug-disease interactions.
Unbalanced data handling: To address the issue of class imbalance, a weighted binary cross-entropy loss (WBCE) function is introduced.
Feature extraction
In our approach, every triple in the knowledge graph, represented as (h, r, t), where h denotes the head entity, r signifies the relation, and t represents the tail entity, is regarded as a sequence of words36 that constructs a sentence:

$$\text{sentence} = [h,\ r,\ t] \tag{1}$$

For instance, using the triple (Disease_C0021670, Biomarker, protein_P42345), the sentence can be constructed as:

$$\text{sentence} = [\text{Disease\_C0021670},\ \text{Biomarker},\ \text{protein\_P42345}] \tag{2}$$
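The linearization step can be sketched in a few lines of Python. The first triple is the example from the text; the second triple and the helper name `triple_to_sentence` are hypothetical and serve only to show how a multi-sentence corpus and its vocabulary arise.

```python
def triple_to_sentence(head, relation, tail):
    """Linearize one (h, r, t) triple into a three-token sentence."""
    return [head, relation, tail]

triples = [
    ("Disease_C0021670", "Biomarker", "protein_P42345"),
    # Hypothetical second triple, included only to show a multi-sentence corpus.
    ("Compound_X", "inhibitor", "protein_P42345"),
]

corpus = [triple_to_sentence(*t) for t in triples]

# The vocabulary contains every distinct entity and relation token.
vocabulary = sorted({token for sentence in corpus for token in sentence})
```

A corpus in this list-of-token-lists form is the usual input shape for Word2Vec implementations, for example gensim's `Word2Vec(sentences=corpus, sg=0)` for CBOW; the text here does not state which implementation the authors used.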
The goal is to develop vector embeddings for each entity and relation in such a way that entities and relations that tend to appear together in triples have similar vector representations. Let $V$ be the vocabulary size, which includes all entities and relations in the graph, and let $d$ be the dimensionality of the embedding space. In the Word2Vec algorithm, the embedding matrices $W \in \mathbb{R}^{V \times d}$ and $W' \in \mathbb{R}^{d \times V}$ map each word (entity or relation) to a $d$-dimensional vector.
In this algorithm, the center word can be any of the entities or relations found within a triple represented as (h, r, t). For example, when the relation r serves as the center word, the head entity h and the tail entity t act as context words (Fig. 2–A). Our approach employs the Continuous Bag-of-Words (CBOW)49 model (Fig. 2–B) within the Word2Vec framework. The model aims to predict the center word based on information derived from its context. Importantly, any of the components—the head, relation, and tail entity—can function as the center word in various contexts49. This adaptability allows the model to learn diverse relationships within the graph.
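To make the center/context roles concrete, the sketch below enumerates the CBOW training examples produced by one triple sentence. The window size of 2 is an assumption chosen so that every token sees the other two tokens of its triple, and `cbow_examples` is an illustrative helper, not the authors' code.

```python
def cbow_examples(sentence, window=2):
    """Yield (context_tokens, center_token) pairs for one sentence."""
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        context = [sentence[j] for j in range(lo, hi) if j != i]
        yield context, center

sentence = ["Disease_C0021670", "Biomarker", "protein_P42345"]
examples = list(cbow_examples(sentence))
```

Each of the three tokens serves once as the center word, so a single triple contributes three prediction tasks: head from (relation, tail), relation from (head, tail), and tail from (head, relation).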
Fig. 2.
Overview of the pipeline. A: Biomedical triples from the knowledge graph are linearized into sentence-like sequences. B: CBOW is used to learn feature embeddings from sentences. C: The embeddings are processed by the IDC_Conv1D model, which integrates an Inception module with a series of 1D convolutional layers for multi-scale feature extraction and drug-disease prediction.
Binary classification task
The proposed model is designed to utilize convolutional neural networks (CNNs) to transform input features into more informative representations, followed by multi-layer perceptrons for the final classification task (Fig. 2–C). It consists of two main components:
Convolutional dual channels
Multi-perception layers
The convolutional dual channels function as parallel pathways that process drug and disease vectors, refining the initial features through a series of convolutional operations. After this transformation, the refined features are concatenated and passed into the multi-perception layers, which consist of fully connected layers that produce the final output.
Convolutional dual channels
A Convolutional Neural Network (CNN) is a type of deep learning architecture designed to automatically learn and extract relevant features from input data. Instead of manually defining features, a CNN identifies patterns by performing mathematical operations over the input in a structured and hierarchical manner. CNNs are especially effective when the input contains local or sequential relationships, such as in signals or embedding vectors, because a 1-dimensional convolution focuses on small segments of the sequence at a time, allowing the network to learn local dependencies and gradually build more complex feature representations layer by layer.
In this research, the convolutional dual channels are responsible for processing the input disease and drug vectors. Since the input vectors have the same dimensionality, $x_{\text{drug}} \in \mathbb{R}^{d}$ for drugs and $x_{\text{dis}} \in \mathbb{R}^{d}$ for diseases, a similar architecture is applied to both channels. Each channel consists of two main parts that follow one another: an Inception block and a series of convolutional layers. These components work together to transform the input features into more refined representations.
At the core of a CNN is the convolution filter (also known as a kernel). A convolution filter is a small matrix of learnable weights that moves across the input data. As it slides, it performs element-wise multiplication with the corresponding section of the input and sums the result to produce a single output value. This process allows the filter to detect specific local patterns such as repeated motifs, transitions, or feature combinations.
Definition of the convolutional filter: A one-dimensional convolutional layer is composed of $N$ learnable filters. Each filter, indexed by $n$, is represented by a small weight matrix with elements $w^{(n)}_{k,c}$. The term $K$ defines the number of adjacent positions in the input sequence that the filter examines at a given operation.

$$W^{(n)} = \left[ w^{(n)}_{k,c} \right] \in \mathbb{R}^{K \times C}, \quad k = 1, \dots, K, \quad c = 1, \dots, C \tag{3}$$

Within this matrix, the index $k$ corresponds to the position inside the kernel window, and each filter $W^{(n)}$ acts as a localized pattern detector that integrates information from all input channels within a window of length $K$.

Convolution of the n-th filter: The convolution operation for the n-th filter computes a weighted combination of the corresponding input values within its receptive window.

$$y^{(n)}_{i} = \sigma\left( \sum_{c=1}^{C} \sum_{k=1}^{K} w^{(n)}_{k,c}\, x_{i+k-1,\,c} + b^{(n)} \right) \tag{4}$$

where:

$x$: The input sequence with a size of $L \times C$, where $L$ is the length of the sequence (here, the length of the drug or disease embedding vector after reshaping).

$w^{(n)}_{k,c}$: Weight of the $k$-th element in the $n$-th filter.

$c$: Identifies the input channel.

$b^{(n)}$: Bias term for the $n$-th filter.

$\sigma$: Activation function.

1-Dimensional convolution operation: Let $\mathrm{Conv1D}^{[l]}_{f^{[l]},\,n}$ denote a one-dimensional convolution operation, where $f^{[l]}$ represents the kernel size of layer $l$ and the index $n$ denotes the filter number in the current layer.

$$\mathrm{Conv1D}^{[l]}_{f^{[l]},\,n}(x)_i = \sigma\left( \sum_{c=1}^{C} \sum_{k=1}^{f^{[l]}} w^{(n)}_{k,c}\, x_{i+k-1,\,c} + b^{(n)} \right) \tag{5}$$
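The sliding-window computation described above can be sketched in plain Python as a sum over kernel positions and input channels, plus a bias, passed through an activation. This is an unpadded ("valid") illustration with hypothetical toy filters, whereas the model itself is described as using padding; the function names are not taken from the authors' code.

```python
def relu(v):
    return max(0.0, v)

def conv1d(x, filters, biases, activation=relu):
    """Valid (unpadded) 1D convolution.

    x:       list of L positions, each a list of C channel values
    filters: list of N filters, each a K x C nested list of weights
    biases:  list of N bias terms
    Returns a list of L - K + 1 positions, each holding N output values.
    """
    length, kernel, channels = len(x), len(filters[0]), len(x[0])
    out = []
    for i in range(length - kernel + 1):
        row = []
        for w, b in zip(filters, biases):
            total = b
            for k in range(kernel):
                for c in range(channels):
                    total += w[k][c] * x[i + k][c]
            row.append(activation(total))
        out.append(row)
    return out

# Toy input: L = 5 positions, C = 1 channel.
x = [[1.0], [2.0], [3.0], [4.0], [5.0]]
# Two hypothetical filters with K = 3: a moving sum and a difference detector.
filters = [[[1.0], [1.0], [1.0]], [[-1.0], [0.0], [1.0]]]
y = conv1d(x, filters, biases=[0.0, -1.0])
```

Each output position carries one value per filter, so stacking more filters widens the channel dimension of the next layer's input.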
Inception block
The Inception Block is a fundamental component of convolutional neural networks (CNNs) that enables the model to learn multi-scale features effectively. Its core idea is to apply multiple convolutional operations with varying filter sizes in parallel, capturing diverse aspects of the input data50. Larger kernels capture broader contextual features, while smaller kernels retain fine-grained details, balancing the trade-off between global and local feature extraction51. Each convolutional operation integrates ReLU activation for enhanced non-linearity52 and uses padding to preserve input dimensions. The operations for both the compound vector $x_{\text{drug}}$ and the disease vector $x_{\text{dis}}$ are defined as follows. The Inception Block performs four parallel operations on the input vector:
![]() |
6 |
![]() |
7 |
![]() |
8 |
![]() |
9 |
![]() |
10 |
![]() |
11 |
- Concatenation: The outputs from these branches are concatenated to form the multi-scale feature representation:

12
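The parallel-branch idea can be illustrated with a small single-channel sketch. The branch composition used here (kernel sizes 1, 3, and 5 plus a max-pooling branch) is an assumption modeled on the original GoogLeNet Inception module, not the configuration reported for IDC_Conv1D, and all function names are hypothetical.

```python
def conv1d_same(x, kernel):
    """Single-channel 1D convolution with zero padding and ReLU.

    Assumes an odd kernel length so the output length equals len(x).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        max(0.0, sum(w * padded[i + k] for k, w in enumerate(kernel)))
        for i in range(len(x))
    ]

def maxpool1d_same(x, size=3):
    """Stride-1 max pooling that keeps the input length."""
    pad = size // 2
    padded = [float("-inf")] * pad + list(x) + [float("-inf")] * pad
    return [max(padded[i:i + size]) for i in range(len(x))]

def inception_block(x, kernels):
    """Run parallel branches on the same input and stack them as channels."""
    branches = [conv1d_same(x, k) for k in kernels]
    branches.append(maxpool1d_same(x))
    # Concatenate along the channel axis: one row per position, one column per branch.
    return [list(values) for values in zip(*branches)]

x = [1.0, -2.0, 3.0, 0.0]
# Hypothetical kernels of sizes 1, 3 and 5 (averaging filters for readability).
kernels = [[1.0], [1 / 3] * 3, [0.2] * 5]
features = inception_block(x, kernels)
```

Because every branch preserves the sequence length, the outputs line up position by position and can be concatenated without any resizing.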
Series of convolutional layers
Following the Inception Block, additional convolutional layers refine and compress the features. These layers apply three sequential 1-dimensional convolutions with kernel sizes of
,
, and
, capturing diverse global patterns and local dependencies within the data. Each layer adopts ReLU activation and uses padding to maintain spatial dimensions53,54.
The operations are defined as:
![]() |
13 |
To mitigate overfitting, a Dropout layer with a 0.1 rate is applied:
![]() |
14 |
Finally, a MaxPooling1D operation with a pool size of
reduces the feature map’s spatial dimensions.
![]() |
15 |
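The regularization and down-sampling steps can be sketched as follows. The dropout rate of 0.1 comes from the text; the pool size of 2, the inverted-dropout rescaling, and the helper names are assumptions made for illustration.

```python
import random

def dropout(values, rate=0.1, training=True, rng=random):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def maxpool1d(values, pool_size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    usable = len(values) - len(values) % pool_size
    return [max(values[i:i + pool_size]) for i in range(0, usable, pool_size)]

features = [0.2, 0.9, 0.4, 0.1, 0.7, 0.3, 0.5]
# At inference time dropout is a no-op, so only pooling changes the features.
pooled = maxpool1d(dropout(features, training=False))
```

Dropout is active only during training, while pooling always applies, which is why the two are kept as separate functions.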
Multi-perception layers
The concatenated outputs from the 1D convolutional dual-channel branches ($z_{\text{drug}}$ and $z_{\text{dis}}$) are fed into the Multi-Layer Perceptron (MLP). The layers are structured as follows:
![]() |
16 |
The network comprises three hidden layers: the first two hidden layers each contain 128 units with ReLU activation, while the third hidden layer has 16 units. The output layer applies a sigmoid activation, which generates the final prediction.
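A forward pass through a head of this shape can be sketched in plain Python. The layer widths (128, 128, 16, 1) and the sigmoid output follow the description above; the ReLU on the third hidden layer, the per-branch feature length of 8, and the random initialization are assumptions made only so the example runs.

```python
import math
import random

def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W x + b), with one weight row per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def mlp_forward(drug_features, disease_features, layers):
    """Concatenate both channel outputs and apply the 128-128-16-1 head."""
    h = list(drug_features) + list(disease_features)
    for weights, biases, activation in layers:
        h = dense(h, weights, biases, activation)
    return h[0]

rng = random.Random(0)
n_in = 2 * 8  # hypothetical channel output length of 8 per branch
sizes = [(n_in, 128, relu), (128, 128, relu), (128, 16, relu), (16, 1, sigmoid)]
layers = [(*init_layer(i, o, rng), act) for i, o, act in sizes]
score = mlp_forward([0.5] * 8, [0.1] * 8, layers)
```

The sigmoid keeps `score` strictly between 0 and 1, so it can be read as the predicted probability that the drug-disease pair is a true association.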
Unbalanced data handling
The loss function, Weighted Binary Cross-Entropy Loss, is designed to address class imbalance in binary classification tasks by incorporating class-specific weights for the positive and negative classes.
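Let $y \in \{0, 1\}$ denote the true labels, and $\hat{y} \in [0, 1]$ represent the predicted probabilities for the positive class. The standard binary cross-entropy (BCE) loss is given by:

$$\mathcal{L}_{\text{BCE}} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right] \tag{17}$$

Where:

$N$ is the number of data points (or samples).

$y_i$ and $\hat{y}_i$ represent the true and predicted values for the $i$-th data point, respectively.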
This summation applies the binary cross-entropy loss to all $N$ data points in the dataset. To account for class imbalance, the loss is weighted by class-specific factors. The positive class weight $w_p$ and the negative class weight $w_n$ are computed as follows:

$$w_p = \frac{N}{\sum_{i=1}^{N} y_i}, \qquad w_n = \frac{N}{\sum_{i=1}^{N} (1 - y_i)} \tag{18}$$
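where $N$ is the total number of samples, and the sums are taken over all data points. The final WBCE loss is computed as:

$$\mathcal{L}_{\text{WBCE}} = -\frac{1}{N} \sum_{i=1}^{N} \left[ w_p\, y_i \log \hat{y}_i + w_n\, (1 - y_i) \log (1 - \hat{y}_i) \right] \tag{19}$$

The loss function can be divided into two distinct cases depending on the value of $y_i$. If $y_i = 1$, the loss component for the positive class is utilized; conversely, if $y_i = 0$, the loss component for the negative class is employed. This allows the weighted BCE loss to be simplified as follows55:

$$\ell_i = \begin{cases} -w_p \log \hat{y}_i, & y_i = 1 \\ -w_n \log (1 - \hat{y}_i), & y_i = 0 \end{cases} \tag{20}$$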
This weighted loss function helps mitigate the bias toward the majority class and encourages the model to focus more on learning the minority class, thereby improving performance on imbalanced datasets.
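The weighting scheme can be sketched in plain Python. The inverse-frequency form of the class weights (total samples divided by the class count) is an assumption about the exact normalization, and the function names and toy labels are hypothetical.

```python
import math

def class_weights(labels):
    """Inverse-frequency weights (assumed form): N divided by each class count."""
    n = len(labels)
    n_pos = sum(labels)
    return n / n_pos, n / (n - n_pos)

def weighted_bce(labels, probs, w_pos, w_neg, eps=1e-7):
    """Mean weighted binary cross-entropy over all samples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        if y == 1:
            total += -w_pos * math.log(p)
        else:
            total += -w_neg * math.log(1.0 - p)
    return total / len(labels)

# Toy batch with one positive and three negatives.
labels = [1, 0, 0, 0]
probs = [0.6, 0.2, 0.1, 0.3]
w_pos, w_neg = class_weights(labels)
loss = weighted_bce(labels, probs, w_pos, w_neg)
unweighted = weighted_bce(labels, probs, 1.0, 1.0)
```

Under this assumed form, the 6,657 positive and 21,120 negative pairs reported for AHKG would give weights of roughly 4.2 and 1.3 if computed over all pairs, so an error on a positive pair would cost about three times as much as an error on a negative pair.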
Results and discussion
In this section, we start with an overview of the statistical characteristics of the dataset. Next, we assess the significance of each AHKG feature. Finally, we compare the performance of our model against other state-of-the-art models.
Statistical overview of the AHKG
The AHKG graph comprises 11,049 entities, 59 types of relations, and a total of 323,442 biomedical triplets. It includes 6,657 positive drug-disease pairs sourced from RepoDB approved drug-disease associations, as well as 21,120 negative pairs derived from the SIDER dataset. Furthermore, the dataset includes 1,509 unique drugs, identified by their DrugBank IDs, and 1,229 unique diseases, identified using UMLS IDs. The graph includes 10,156 compound–protein and 132,533 disease–protein edges, emphasizing the central mechanistic role of proteins. Several relations occur at very high frequencies, including disease–protein interactions labeled as Biomarker (65,101 edges), compound–substructure connections represented by the "has_sub_structure" label (58,763 edges), and protein–protein interaction links (42,356 edges).
Performance of imbalance handling techniques
The model’s performance was evaluated using a 90% training and 10% testing split. To prevent any form of data leakage, 10% of the positive edges and 10% of the negative edges were removed from the knowledge graph before Word2Vec training, ensuring that these held-out pairs were never seen during embedding learning. As a result, all embeddings were generated solely from training-accessible information, without any influence from test associations. To ensure a fair and unbiased assessment, we employed 10-fold cross-validation with stratified K-fold splitting, which addresses the class imbalance in the dataset by ensuring that each fold contains nearly the same class proportions. Additionally, all negative and positive edges of the test data were removed during every step of feature extraction and classification56.
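The leakage-prevention step can be sketched as a stratified hold-out followed by removal of the held-out drug-disease edges from the triples used for embedding. This shows a single split on hypothetical data; the 10-fold protocol would repeat it once per fold, and the helper names are illustrative rather than taken from the authors' code.

```python
import random

def stratified_holdout(pairs, test_fraction=0.1, seed=0):
    """Split (drug, disease, label) pairs so each class keeps the same test share."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [p for p in pairs if p[2] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test += group[:n_test]
        train += group[n_test:]
    return train, test

def remove_test_edges(triples, test_pairs):
    """Drop every drug-disease triple whose (head, tail) pair is held out."""
    held_out = {(drug, disease) for drug, disease, _ in test_pairs}
    return [t for t in triples if (t[0], t[2]) not in held_out]

# Hypothetical data: 20 positive and 60 negative pairs.
pairs = [(f"drug_{i}", f"dis_{i}", 1) for i in range(20)]
pairs += [(f"drug_{i}", f"se_{i}", 0) for i in range(60)]
relation = {1: "has_approved_interaction", 0: "has_side_effect"}
triples = [(d, relation[y], s) for d, s, y in pairs]

train, test = stratified_holdout(pairs)
embedding_triples = remove_test_edges(triples, test)
```

Only `embedding_triples` would be linearized into sentences for Word2Vec, so the embeddings of a held-out pair are never shaped by the edge that the classifier is later asked to predict.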
To address the class imbalance within our dataset, we compared WBCE with four techniques from the imbalanced-learn library57: Random Oversampling (ROS), Random Undersampling (RUS), Oversampling Positive Pairs using SMOTE (OSMOTE), and SMOTE with Random Undersampling (SORU). All methods were compared on the same training and test data. A comprehensive evaluation using various metrics (detailed in Table 2) showed WBCE to be the most effective strategy, outperforming the other approaches on every reported metric. ROS was the strongest of the resampling methods in AUC-ROC and AUC-PR, but WBCE’s overall performance, particularly in Precision and F1-Score, was clearly higher. RUS, OSMOTE, and SORU exhibited relatively lower performance across all metrics. These findings emphasize the critical role of selecting appropriate techniques to effectively handle class imbalance and improve model performance.
Table 2.
Performance metrics for imbalanced data handling techniques. WBCE achieved the best overall results.
| Method | AUC-ROC | AUC-PR | Accuracy | Precision | F1 Score |
|---|---|---|---|---|---|
| ROS | 0.9743 | 0.9511 | 0.9438 | 0.8947 | 0.8811 |
| RUS | 0.9715 | 0.9446 | 0.9388 | 0.8875 | 0.8698 |
| OSMOTE | 0.9691 | 0.9456 | 0.9410 | 0.8838 | 0.8758 |
| SORU | 0.9716 | 0.9491 | 0.9366 | 0.8592 | 0.8694 |
| WBCE | 0.9836 | 0.9686 | 0.9573 | 0.9440 | 0.9074 |
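To make the WBCE idea concrete, the following sketch computes a weighted binary cross-entropy in plain Python. The inverse-frequency weight (negatives divided by positives) is an assumption chosen for illustration; the text does not state the exact weights used, and the function name `weighted_bce` is hypothetical.

```python
import math


def weighted_bce(y_true, y_prob, pos_weight, neg_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with separate weights for each class."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= pos_weight * y * math.log(p)
        total -= neg_weight * (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Pair counts reported for the dataset.
n_pos, n_neg = 6657, 21120

# Assumption: weight the minority (positive) class by the imbalance ratio.
pos_weight = n_neg / n_pos

# A missed positive is penalized more than an equally confident false positive.
loss_missed_positive = weighted_bce([1], [0.1], pos_weight)
loss_false_positive = weighted_bce([0], [0.9], pos_weight)
```

Under this assumed weighting, each positive pair counts about 3.17 times as much as a negative pair, so no pairs need to be discarded or synthesized.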
Evaluation of Word2Vec against knowledge graph embeddings
To evaluate the effectiveness of our Word2Vec-based feature extraction method, we compared it against three widely used knowledge graph embedding models: ComplEx58, RotatE59, and TransE60. These models were selected due to their strong theoretical foundations and proven performance in relational learning tasks. All models were trained and tested on the same dataset, using 600-dimensional embeddings, and executed on the same computational system to ensure a fair and consistent comparison of both predictive performance and feature computation time.
The ComplEx model extends traditional embeddings into the complex number space to better capture asymmetric relationships. Its score function for a triplet (h, r, t) is defined as:
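$$f(h, r, t) = \mathrm{Re}\left(\langle \mathbf{h}, \mathbf{r}, \bar{\mathbf{t}} \rangle\right) = \mathrm{Re}\left(\sum_{i=1}^{d} h_i \, r_i \, \bar{t}_i\right) \tag{21}$$

where $\mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{C}^{d}$, and $\bar{\mathbf{t}}$ is the complex conjugate of the tail entity. This formulation enables the model to effectively represent complex relational patterns, such as asymmetry and hierarchy. RotatE, in contrast, models each relation as a rotation in complex space, using the following relation: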
$$\mathbf{t} = \mathbf{h} \circ \mathbf{r}, \qquad |r_i| = 1 \tag{22}$$
Here, each relation rotates the head entity on the complex unit circle to reach the tail, allowing the model to handle symmetry, antisymmetry, and inversion. TransE adopts a simpler and more intuitive translational approach in Euclidean space. It models relationships with the assumption:
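$$\mathbf{h} + \mathbf{r} \approx \mathbf{t} \tag{23}$$

and defines its score function as:

$$f(h, r, t) = -\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert_{p} \tag{24}$$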
where p is typically 1 or 2. TransE effectively captures basic relational patterns and remains a strong baseline in knowledge graph embeddings due to its efficiency and simplicity.
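The three scoring ideas can be compared side by side in a short sketch. It uses the commonly published forms of the ComplEx, RotatE, and TransE scores (real part of a trilinear product, negative distance after an element-wise rotation, negative translation distance); the function names and toy vectors are hypothetical, and this is not the training code used in the study.

```python
import cmath
import math


def complex_score(h, r, t):
    """ComplEx: real part of sum_i h_i * r_i * conj(t_i)."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real


def rotate_score(h, phases, t):
    """RotatE: negative summed modulus of (h_i * e^{i*phase_i} - t_i)."""
    distance = sum(
        abs(hi * cmath.exp(1j * phase) - ti) for hi, phase, ti in zip(h, phases, t)
    )
    return -distance


def transe_score(h, r, t, p=1):
    """TransE: negative L_p distance between h + r and t."""
    distance = sum(abs(hi + ri - ti) ** p for hi, ri, ti in zip(h, r, t)) ** (1.0 / p)
    return -distance


# ComplEx is asymmetric: swapping head and tail changes the score.
forward = complex_score([1 + 0j], [1j], [1j])
backward = complex_score([1j], [1j], [1 + 0j])

# RotatE: a tail that is exactly the rotated head has (near) zero distance.
head = [1 + 1j, 2 + 0j]
phases = [math.pi / 2, math.pi]
rotated_tail = [-1 + 1j, -2 + 0j]
rotation_fit = rotate_score(head, phases, rotated_tail)

# TransE: a tail that equals head + relation has zero distance.
translation_fit = transe_score([1.0, 2.0], [0.5, -1.0], [1.5, 1.0])
```

Higher (less negative) scores indicate more plausible triplets in all three functions.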
As shown in Table 3, Word2Vec achieved competitive performance across all metrics and outperformed the other methods in three key areas: accuracy, precision, and F1 score. Here, overall recall corresponds to the standard binary recall (TP / (TP + FN)) computed on the held-out test set using a fixed decision threshold of 0.5. Furthermore, Word2Vec had the lowest computation time (272 s), roughly 4.5 to 8.7 times faster than TransE (1224 s), ComplEx (2038 s), and RotatE (2354 s). While knowledge graph-based methods like ComplEx and TransE showed strength in AUC and recall metrics, Word2Vec delivered the best overall performance when considering both predictive accuracy and computational efficiency.
Table 3.
Performance comparison of different embedding methods. Word2Vec achieved the best balance of accuracy, precision, F1 score, and computational efficiency.
| Model | AUC-ROC | AUC-PR | Overall Recall | Accuracy | Precision | F1 Score | Time (s) |
|---|---|---|---|---|---|---|---|
| ComplEx | 0.9731 | 0.9499 | 0.8526 | 0.9460 | 0.9160 | 0.8832 | 2038 |
| RotatE | 0.9699 | 0.9434 | 0.8316 | 0.9445 | 0.9294 | 0.8778 | 2354 |
| TransE | 0.9770 | 0.9556 | 0.8436 | 0.9474 | 0.9303 | 0.8849 | 1224 |
| Word2Vec | 0.9750 | 0.9535 | 0.8423 | 0.9525 | 0.9541 | 0.8947 | 272 |
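The threshold-based metrics in Table 3 follow the usual confusion-matrix definitions. The sketch below computes them for a toy prediction vector at the 0.5 threshold mentioned in the text, and also recomputes the Word2Vec F1 score from the precision and recall reported in the table. The helper names and toy data are hypothetical.

```python
def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall, and F1 at a fixed decision threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels and scores).
toy = binary_metrics(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.05],
)

# Consistency check using the Word2Vec row of Table 3.
word2vec_f1 = f1_from(0.9541, 0.8423)
```

The recomputed value agrees with the reported F1 score of 0.8947 to four decimal places.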
Model optimization
The Word2Vec model’s hyperparameters, including vector size and iterations, were tuned using grid search. A 650-dimensional vector and 1500 iterations provided the best performance. The context window was set to 2 words on either side. The model was trained with a learning rate of 0.001. Due to the specific nature of the model, most adjustments were based on trial and error. The binary classifier was trained on an NVIDIA GeForce RTX 4090 with an optimized batch size of 128 and a learning rate of 0.0001. To prevent overfitting, EarlyStopping was applied with a patience of 10.
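The patience rule behind early stopping can be illustrated without any deep learning framework. The sketch below assumes the monitored quantity is a validation loss and that training stops once it has failed to improve for `patience` consecutive epochs; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=10, min_delta=0.0):
    """Return the 0-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical curve: the loss improves for five epochs, then plateaus.
losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.46] * 12
stop_epoch = early_stopping_epoch(losses, patience=10)
```

Here the best loss occurs at epoch 4, so with a patience of 10 training halts at epoch 14.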
Feature impact on AHKG performance
In our analysis, the knowledge graph (KG) features were categorized, with emphasis placed on compound-related attributes and protein-action relationships. The impact of these features was evaluated by systematically removing them from the graph, and the results of this comparison are presented in the supplementary material. The following paragraphs explain the key findings from these comparisons.
As shown in Fig. 3, removal of compound classification features (category, type, ATC code) results in a slight decrease in both precision and accuracy. Among the various compound features, classification attributes prove particularly effective in enhancing overall performance. Additionally, pathway substructures and compound–protein edges are foundational to the graph’s predictive capabilities. Their removal leads to a moderate decline in performance across all metrics, underscoring their critical role in maintaining the graph’s predictive strength.
Fig. 3.
Impact of feature ablation on model performance.
Regarding protein-action relationships, as illustrated in Fig. 4, the full graph, which captures specific protein-action interactions between compounds, proteins, and diseases, achieved the best performance. This highlights the importance of encoding action-specific relationships for accurate predictions. Replacing action-specific edges with a generalized relation causes a noticeable drop in performance, particularly in AUC-ROC and precision. Notably, removing protein nodes results in the most significant degradation, especially in precision and accuracy. This emphasizes the indispensable role of proteins as biological hubs that mediate functional relationships between compounds and diseases. Without proteins, the graph loses both biological coherence and predictive effectiveness.
Fig. 4.
Effect of action-specific protein relations on ConvAHKG performance.
To quantify the impact of action-aware relation semantics, triplet-based learning with explicit relation/action tokens was compared against a controlled single-relation baseline, where all interaction types were collapsed into one generic relation token while the Word2Vec/CBOW input format and all hyperparameters were kept unchanged. Consistent performance improvements were observed when action information was included, indicating that action-aware relations add meaningful signal beyond structural connectivity alone. The complete results table and the corresponding bar-chart visualization are provided in the Supplementary Excel file (Sheet S8).
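The difference between the action-aware input and the single-relation baseline can be shown with a small sketch. It assumes each triplet is serialized as a three-token sentence (head, relation, tail) before Word2Vec training; this serialization, the entity identifiers, and the generic token name `related_to` are hypothetical choices made for illustration.

```python
def triplets_to_sentences(triplets, collapse_relations=False, generic_token="related_to"):
    """Turn (head, relation, tail) triplets into token lists for embedding training."""
    sentences = []
    for head, relation, tail in triplets:
        token = generic_token if collapse_relations else relation
        sentences.append([head, token, tail])
    return sentences


# Hypothetical triplets with action-specific relation labels.
triplets = [
    ("drug_A", "inhibitor", "protein_X"),
    ("drug_B", "activator", "protein_X"),
    ("disease_Y", "biomarker", "protein_X"),
]

action_aware = triplets_to_sentences(triplets)
single_relation = triplets_to_sentences(triplets, collapse_relations=True)

relation_tokens_action = {sentence[1] for sentence in action_aware}
relation_tokens_single = {sentence[1] for sentence in single_relation}
```

In the collapsed variant, `drug_A` and `drug_B` appear in identical contexts, so the embedding can no longer tell an inhibitor from an activator of the same protein.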
Feature representation
t-SNE (t-distributed stochastic neighbor embedding) is a widely used dimensionality reduction method that facilitates the visualization of high-dimensional data in two or three dimensions. It operates by minimizing the divergence between the probability distributions that capture pairwise similarities in the high-dimensional space and in the low-dimensional embedding61.
As shown in Fig. 5, diseases, proteins, and compounds can be effectively distinguished and clustered. Compounds form two primary clusters—“biotech” and “small molecule”—indicating distinct properties or functional roles. Proteins are grouped according to their functional categories: target, transporter, carrier, and enzyme. This clustering reflects underlying biochemical behaviors and functional interactions, suggesting strong intra-group similarities relevant to their biological roles.
Fig. 5.
t-SNE visualization of diseases, compounds, and proteins in AHKG.
Evaluation
Table 4 illustrates the performance of our proposed model in comparison to several network-based drug repurposing models, including DrugRep-HeSiaGraph, EKGDR, DeepDR 29, DTINet 40, and KBMF 39.
Table 4.
Model performance comparison: Classification vs. recommender systems for drug–disease interaction prediction. Our model outperforms existing methods in both AUC and AUPRC.
| Method | AUC | AUPRC |
|---|---|---|
| ConvAHKG | 0.9836 | 0.9686 |
| DrugRep-HeSiaGraph | 0.9718 | 0.9503 |
| EKGDR | 0.9475 | 0.9490 |
| DeepDR | 0.9084 | 0.9248 |
| DTINet | 0.8620 | 0.8920 |
| KBMF | 0.7910 | 0.8260 |
DrugRep-HeSiaGraph, which employs a heterogeneous Siamese neural network, faces limitations due to its reliance on SORU for class imbalance handling and its small dataset scale. EKGDR, integrating graph neural networks with Relational Path-Aware Aggregation, lacks comprehensive disease information, reducing its contextual accuracy. DeepDR uses a multi-modal deep autoencoder but struggles to capture complex relationships owing to the absence of a knowledge graph62. DTINet, which relies on Diffusion Component Analysis (DCA) and network diffusion algorithms in a matrix completion setting, can suffer from scalability issues due to model complexity. KBMF, based on kernelized matrix factorization, has high computational demands and inefficiencies when applied to large-scale datasets.
Our model outperforms all of these approaches on the two primary evaluation metrics, AUC (area under the ROC curve) and AUPRC (area under the precision-recall curve), achieving an AUC of 0.9836 and an AUPRC of 0.9686. It thus compares favorably with methods spanning binary classification, graph neural network-based recommender systems, autoencoder methods, and kernelized recommender systems. This underscores its ability to effectively integrate heterogeneous graph structures and employ innovative techniques, overcoming the limitations faced by other models.
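For readers less familiar with the AUC metric, the sketch below computes the area under the ROC curve through its pairwise interpretation: the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair, with ties counted as one half. The function name and toy scores are hypothetical, and the quadratic loop is meant for illustration rather than large datasets.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy example: 2 positives, 3 negatives, one tie.
auc = roc_auc([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.3, 0.4])
```

In the toy example, 4.5 of the 6 positive-negative pairs are ranked correctly, giving an AUC of 0.75.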
Case study
Lung cancer represents about 13% of all cancer cases84, with approximately 2.1 million new diagnoses and 1.8 million deaths each year85. It ranks as the leading cause of cancer death among men and the second most prevalent cause for women86. This concerning trend is largely attributed to rising tobacco consumption, especially in Asia, as reported by the World Health Organization (WHO). The disease is categorized into two primary types: non-small cell lung cancer (NSCLC), which constitutes 85% of cases and includes subtypes like adenocarcinoma, squamous cell carcinoma, and large cell carcinoma; and small-cell lung cancer (SCLC), which accounts for the remaining 15% of cases87. NSCLC is still a major health challenge, and discovering new therapeutics is crucial for improving patient survival.
Table 5 presents the repurposed drugs for NSCLC. None of the top ten predicted drugs were included in the known NSCLC–drug interactions in our training data, so all of them are new and previously unseen predictions. Trastuzumab affects NSCLC by specifically targeting HER2, a protein from the epidermal growth factor receptor family that is overexpressed in 20% to 66% of NSCLC cases. By attaching to the extracellular domain of HER2, trastuzumab promotes its downregulation. Research has demonstrated that trastuzumab can enhance the effectiveness of chemotherapeutic agents like cisplatin, gemcitabine, and etoposide when used in combination, significantly improving their efficacy against NSCLC68.
Table 5.
Summary of the top 10 ranked compounds predicted for NSCLC. References indicate prior experimental or clinical support for drug repurposing potential.
| Rank | Compound | Reference | Score |
|---|---|---|---|
| 1 | Trastuzumab | 63–70 | 0.9998 |
| 2 | Obiltoxaximab | | 0.9997 |
| 3 | Raxibacumab | | 0.9997 |
| 4 | Sebelipase alfa | 71 | 0.9997 |
| 5 | Sinecatechins | | 0.9997 |
| 6 | Aflibercept | 72–74 | 0.9996 |
| 7 | Protamine sulfate | 75,76 | 0.9996 |
| 8 | Benzylpenicillin | | 0.9995 |
| 9 | Denileukin diftitox | 77 | 0.9995 |
| 10 | Dihydrotachysterol | 78–83 | 0.9995 |
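The selection of novel candidates in Table 5 amounts to ranking all drugs by predicted score for the disease and skipping those already linked to it in the training data. A minimal sketch of that filtering step follows; the function name and the entry `known_drug_X` are hypothetical, while the three other scores are taken from Table 5.

```python
def top_k_novel(scores, known_treatments, k=10):
    """Rank candidates by predicted score, skipping drugs already known for the disease."""
    novel = [
        (drug, score) for drug, score in scores.items() if drug not in known_treatments
    ]
    novel.sort(key=lambda item: item[1], reverse=True)
    return novel[:k]


scores = {
    "Trastuzumab": 0.9998,       # from Table 5
    "Aflibercept": 0.9996,       # from Table 5
    "Benzylpenicillin": 0.9995,  # from Table 5
    "known_drug_X": 0.9999,      # hypothetical drug already in the training pairs
}
known_treatments = {"known_drug_X"}

ranking = top_k_novel(scores, known_treatments, k=2)
```

The known drug is excluded even though it has the highest score, so only previously unseen associations are reported.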
Sebelipase alfa restores lysosomal acid lipase (LAL) activity, reducing PD-L1 and CSF1R expression in immunosuppressive CD11c+ cells. This reprograms their metabolism and weakens their tumor-promoting effects. In NSCLC, sebelipase alfa shows potential to enhance cancer immunotherapy. Aflibercept, another highly ranked compound, has undergone multiple phase II clinical evaluations in lung cancer, where its high-affinity binding to VEGF-A, VEGF-B, and PlGF has been shown to suppress tumor vascularization and inhibit tumor growth88.
Protamine sulfate (PS) enhances the treatment of lung cancer by optimizing the delivery of gene therapy through non-viral vectors for CRISPR/Cas9 plasmids. PS is utilized to disrupt the MTH1 gene. This strategy has been shown to inhibit tumor growth, reduce metastatic progression, and promote apoptosis in NSCLC models. Similarly, denileukin diftitox has demonstrated clinical activity in a phase II trial involving previously treated NSCLC patients, where its IL-2 receptor–directed mechanism induced targeted cytotoxicity in tumor cells75. Dihydrotachysterol is a synthetic hydrogenated analog of vitamin D. Vitamin D’s potential role in cancer prevention and therapy, especially for lung cancer, is supported by observational, experimental, and clinical research. Studies show that higher vitamin D levels are linked to improved survival and reduced cancer risk. Vitamin D’s effects include regulating tumor-related genes and offering antiangiogenic, anti-inflammatory, and antimetastatic benefits78.
DNA and topoisomerase II alpha inhibition: Benzylpenicillin as anti-cancer compound
Antibiotics are crucial in cancer treatment as they act as protective agents against infections in immunocompromised individuals and as effective anticancer drugs. Leveraging the evolutionary link between bacterial and human cells, certain antibiotics have been repurposed for cancer treatment. These drugs exploit similarities in cellular machinery, particularly in DNA and RNA processing, leading to inhibited cancer cell growth and apoptosis. Crucially, the dual role of certain antibiotics—fighting infections while showing anticancer properties—renders them especially beneficial for cancer patients, who are significantly susceptible to infections due to compromised immune systems89. The research focused on molecular docking studies targeting DNA and topoisomerase II alpha, which is over-expressed in NSCLC90,91.
Docking simulations were performed using AutoDock Vina 1.2.5 to analyze the interaction of Benzylpenicillin with DNA (1BNA PDB structure) and topoisomerase II alpha (4FM9 PDB structure). The 4FM9 PDB file was processed by removing water molecules, adding nonpolar hydrogens, and defining a grid box centered on the DNA-binding region within the enzyme’s active site. The docking analysis revealed that Benzylpenicillin exhibited a binding affinity of –7.9 kcal/mol at the active site of topoisomerase II alpha (Fig. 6–A). As depicted in Fig. 6–A, Benzylpenicillin effectively occupies the enzyme’s active site, thereby preventing DNA from binding. Since topoisomerase II is crucial for DNA replication and transcription, its inhibition by Benzylpenicillin may cause DNA damage and trigger apoptotic pathways in cancer cells92.
Fig. 6.
Computational docking results. A: Docked conformation of benzylpenicillin in the active site of topoisomerase, suggesting potential inhibition via active site occupancy. B: Computational docking image of benzylpenicillin (space-filling model) interacting with a DNA double helix, suggesting a potential mechanism of action. All molecular visualizations were generated using PyMOL.
Likewise, the 1BNA PDB structure was prepared by eliminating water molecules and merging nonpolar hydrogens. Docking studies demonstrated that Benzylpenicillin binds to the DNA structure with a binding affinity of –7.7 kcal/mol (Fig. 6-B). This interaction indicates that Benzylpenicillin disrupts DNA replication, impairing the uncontrolled growth of cancer cells. Notably, cancer cells, which divide more rapidly than normal cells, are particularly vulnerable to DNA damage, making this approach especially effective93. By targeting both DNA and topoisomerase II, Benzylpenicillin employs a dual mechanism of action that promotes cancer cell death.
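To give a rough sense of scale for the reported docking scores, a binding free energy can be converted to an approximate dissociation constant with the standard relation ΔG = RT ln(Kd). The sketch below applies it to the two reported affinities at an assumed temperature of 298.15 K. Docking scores are only coarse estimates of ΔG, so the resulting values are order-of-magnitude illustrations rather than measured constants, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal / (mol * K)


def affinity_to_kd(delta_g_kcal_per_mol, temperature_k=298.15):
    """Approximate dissociation constant (mol/L) from a binding free energy."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Docking scores reported in the text (kcal/mol).
kd_topoisomerase = affinity_to_kd(-7.9)
kd_dna = affinity_to_kd(-7.7)

print(f"Topoisomerase II alpha: ~{kd_topoisomerase * 1e6:.1f} micromolar")
print(f"DNA (1BNA): ~{kd_dna * 1e6:.1f} micromolar")
```

Both scores correspond to dissociation constants in the low micromolar range under this idealized conversion.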
Discussion
In many prior studies, drug–protein and disease–protein interactions have been represented using a single relation type, limiting the biological context that can be captured30,36. To overcome this limitation, the Action-based Hybrid Knowledge Graph (AHKG) was developed as a central methodological contribution of this work. The AHKG was designed to combine the benefits of hybrid knowledge-graph frameworks while explicitly preserving biologically meaningful relationships. This is achieved by assigning clear biological action labels, such as inhibition, activation, and biomarker association, to compound–protein and disease–protein links. Through this design, the underlying mechanisms by which drugs influence disease-related targets are represented more explicitly, rather than being reduced to broad or non-specific associations.
Although docking offers valuable mechanistic insight, it remains a computational approximation. Limitations such as rigid-body assumptions, simplified scoring functions, and the inability to capture full conformational dynamics mean that docking predictions require careful interpretation and experimental follow-up. To translate molecular docking predictions into real therapeutic value, we suggest a clear, stepwise experimental validation strategy for future work, combining in vitro mechanistic assays with in vivo translational studies. In vitro validation would confirm direct target engagement by assessing TOP2A inhibition using the decatenation assay and DNA binding through fluorescence spectroscopy, supported by UV–Vis absorption and DNA viscosity measurements to clarify binding mode. Cellular relevance would be evaluated in NSCLC cells using the ICE assay to detect TOP2A–DNA covalent complexes and the comet assay to quantify DNA damage, followed by assessment of anticancer phenotypes including proliferation, clonogenic survival, apoptosis, cell-cycle effects, and metabolic alterations. Finally, translational potential could be examined in NSCLC patient-derived xenograft models, focusing on Benzylpenicillin as a chemosensitizer in combination with cisplatin, with evaluation of tumor growth, survival, toxicity, histopathology, and PK/PD evidence of intratumoral TOP2A target engagement.
In future studies, integrating large language models (LLMs) with the knowledge graph can enrich semantic feature representations and provide an additional reasoning layer for validating high-confidence predictions. Moreover, the proposed framework can be further enhanced by incorporating advanced graph-based and geometric deep learning approaches to better capture higher-order structures in heterogeneous biomedical networks and improve prediction accuracy. In addition, generative and representation-aware learning strategies will be explored to address severe data imbalance and rare drug–disease associations, thereby increasing robustness and scalability.
The model could also be improved by incorporating more advanced relation-aware graph learning beyond Word2Vec26,28. Multi-hop modeling may help the model handle sparse and imbalanced data more robustly24. To improve generalizability, 3D molecular representations can be integrated with KG features13.
Conclusion
ConvAHKG is a framework designed to improve drug repurposing by using complex biological relations. It achieved an AUC of 0.9836 and an AUPRC of 0.9686. The key advantages of our model include:
- A novel action-based hybrid heterogeneous graph structure that captures the relationships between compounds, diseases, and proteins. Unlike traditional models that rely on a single relation type between nodes, action-specific relationships lead to more accurate predictions.
- A dual-channel 1D convolutional architecture, which efficiently extracts both local and global information from disease and compound features. This dual-channel design enhances the model’s ability to learn patterns.
- The use of compound–disease side-effect pairs as negative samples instead of traditional synthetic approaches, which makes the dataset more biologically relevant and enhances the model’s robustness and reliability.
- A weighted binary cross-entropy loss function that addresses data imbalance without undersampling or oversampling, using all available positive and negative pairs.
Trastuzumab was successfully recommended as a repurposed drug for non-small cell lung cancer (NSCLC), a prediction that was supported by experimental evidence confirming its effectiveness. Additionally, Benzylpenicillin (penicillin G) was identified as a potential anticancer agent, with molecular docking studies indicating strong binding affinity to DNA and topoisomerase II alpha, supporting its potential as a novel cancer therapeutic.
This study has several limitations. First, model optimization relied on a combination of trial and error and grid search rather than a fully systematic hyperparameter search. Given the extensive range of possible convolutional filter configurations, a comprehensive search was computationally impractical, resulting in a locally optimized architecture that may not represent the global optimum. Second, while molecular docking supported the predictions, it is a computational approximation, necessitating experimental validation to confirm the therapeutic potential of the predicted compounds. Finally, the framework was primarily trained and tested on the RepoDB dataset of known drug–disease pairs, which may limit its generalizability to other diseases or novel compounds, requiring further evaluation to establish broader applicability.
To address the study’s limitations, future research will focus on several enhancements. Experimental validation through in vitro and in vivo assays will be conducted to confirm the therapeutic potential of predicted compounds, complementing the computational molecular docking results. To enhance the framework’s generalizability, the training dataset will be expanded beyond the RepoDB dataset by incorporating diverse drug–disease pairs, with a focus on rare diseases, using data from other repositories such as Hetionet. Transfer learning and domain adaptation techniques will also be explored to enhance applicability to underrepresented or novel disease categories.
Author contributions
M.K.A. and S.G. conceived the data collection and computational approaches. M.K.A. and R.A. conducted the methods. M.K.A., R.Z., and A.F.S. analyzed the results. S.G. supervised the project and provided guidance throughout the study. All authors reviewed and approved the manuscript.
Funding
No funding.
Data availability
The compound–protein and disease–protein relations, along with the corresponding chart of the model, are provided in the supplementary material. All data and code used in this study are available at https://github.com/Marzieh-Khodadadi/ConvAHKG.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-026-38656-8.
References
- 1.Parvathaneni, V., Kulkarni, N. S., Muth, A. & Gupta, V. Drug repurposing: a promising tool to accelerate the drug discovery process. Drug Discov. Today24, 2076–2085 10.1016/j.drudis.2019.06.014 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Kyungsoo, P. A review of computational drug repurposing. Transl. Clin. Pharmacol.27, 59–63 10.12793/tcp.2019.27.2.59 (2019). [DOI] [PMC free article] [PubMed]
- 3.Kulkarni, V. S., Alagarsamy, V., Solomon, V. R., Jose, P. A. & Murugesan, S. Drug repurposing: An effective tool in modern drug discovery. Russ. J. Bioorg. Chem.49, 157–166 10.1134/S1068162023020139 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Sundaresan, L., Giri, S., Singh, H. & Chatterjee, S. Repurposing of thalidomide and its derivatives for the treatment of sars-cov-2 infections: Hints on molecular action. Br. J. Clin. Pharmacol.87, 3835–3850 10.1111/bcp.14792 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Ghorbanali, Z., Zare-Mirakabad, F., Akbari, M., Salehi, N. & Masoudi-Nejad, A. Drugrep-kg: Toward learning a unified latent space for drug repurposing using knowledge graphs. J. Chem. Inf. Model.63, 2532–2545 10.1021/acs.jcim.2c01291 (2023). [DOI] [PubMed] [Google Scholar]
- 6.Park, K. A review of computational drug repurposing. Transl. Clin. Pharmacol.27, 59–63 10.12793/tcp.2019.27.2.59 (2019). [DOI] [PMC free article] [PubMed]
- 7.Zhang, P., Agarwal, P. & Obradovic, Z. Computational Drug Repositioning by Ranking and Integrating Multiple Data Sources (Springer, Berlin Heidelberg, Berlin, Heidelberg, 2013). [Google Scholar]
- 8.Jarada, T. N., Rokne, J. G. & Alhajj, R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J. Cheminformatics12, 46 10.1186/s13321-020-00450-7 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Xiao, B. H. F. H. C. R. P. B. X. A review of current in silico methods for repositioning drugs and chemical compounds. Front. Oncol.11, 711225 10.3389/fonc.2021.711225 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Aliper, A. et al. Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm.13, 2524–2530 10.1021/acs.molpharmaceut.6b00248 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Jarada, T. N., Rokne, J. G. & Alhajj, R. Snf-nn: Computational method to predict drug-disease interactions using similarity network fusion and neural networks. BMC Bioinform.22, 28 10.1186/s12859-020-03950-3 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Wang, Y., Yang, Y., Chen, S. & Wang, J. Deepdrk: A deep learning framework for drug repurposing through kernel-based multi-omics integration. Brief. Bioinform.22, bbab048 10.1093/bib/bbab048 (2021). [DOI] [PubMed]
- 13.Zhang, Z. et al. Leveraging 3D molecular spatial visual information and multi-perspective representations for drug discovery. Adv. Sci.13, e12453 10.1002/advs.202512453 (2026). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Wei, B., Zhang, Y. & Gong, X. Deeplpi: A novel drug repurposing model based on ligand-protein interaction using deep learning. Open Forum Infect. Dis.9, ofac492.574 (2022). [DOI] [PMC free article] [PubMed]
- 15.Amiri, R., Razmara, J., Parvizpour, S. & Izadkhah, H. A novel efficient drug repurposing framework through drug-disease association data integration using convolutional neural networks. BMC Bioinform.24, 442 10.1186/s12859-023-05572-x (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Zhu, Y. et al. Knowledge-driven drug repurposing using a comprehensive drug knowledge graph. Health Inform. J.26, 2737–2750 10.1177/1460458220937101 (2020). [DOI] [PubMed] [Google Scholar]
- 17.Ahmed, F. et al. A comprehensive review of artificial intelligence and network-based approaches to drug repurposing in covid-19. Biomed. Pharmacother.153, 113350 10.1016/j.biopha.2022.113350 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Khobragade, A., Mahajan, R., Langi, H., Mundhe, R. & Ghumbre, S. Effective negative triplet sampling for knowledge graph embedding. J. Inf. Optim. Sci.43, 2075–2087 (2022). [Google Scholar]
- 19.MacLean, F. Knowledge graphs and their applications in drug discovery. Expert. Opin. on Drug Discov.16, 1057–1069 (2021). [DOI] [PubMed] [Google Scholar]
- 20.Nicholson, D. N. & Greene, C. S. Constructing knowledge graphs and their biomedical applications. Comput. Struct. Biotechnol. J.18, 1414–1428 10.1016/j.csbj.2020.05.017 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Luo, H. et al. Computational drug repositioning using low-rank matrix approximation and randomized algorithms. Bioinformatics34, 1904–1912 10.1093/bioinformatics/bty013 (2018). [DOI] [PubMed] [Google Scholar]
- 22.Sadeghi, S. S. & Keyvanpour, M. Rcdr: A recommender based method for computational drug repurposing. In 2019 5th Conference on Knowledge Based Engineering and Innovation (KBEI), 467–471 10.1109/KBEI.2019.8734933 (2019).
- 23.Lakizadeh, A. & Hassan Mir-Ashrafi, S. M. Drug repurposing improvement using a novel data integration framework based on the drug side effect. Inform. Med. Unlocked23, 100523 10.1016/j.imu.2021.100523 (2021). [Google Scholar]
- 24.Wei, M., Wang, L., Su, X., Zhao, B. & You, Z. Multi-hop graph structural modeling for cancer-related circRNA-miRNA interaction prediction. Pattern Recognit.170, 112078 10.1016/j.patcog.2025.112078 (2026). [Google Scholar]
- 25.Shao, M., Jiang, L., Meng, Z. & Xu, J. Computational drug repurposing based on a recommendation system and drug–drug functional pathway similarity. Molecules27, 10.3390/molecules27041404 (2022). [DOI] [PMC free article] [PubMed]
- 26.Zhao, B.-W. et al. A geometric deep learning framework for drug repositioning over heterogeneous information networks. Brief. Bioinform.23, bbac384 10.1093/bib/bbac384 (2022). [DOI] [PubMed]
- 27.Ren, Z.-H. et al. A biomedical knowledge graph-based method for drug–drug interactions prediction through combining local and global features with deep neural networks. Brief. Bioinform.23, bbac363 10.1093/bib/bbac363 (2022). [DOI] [PubMed]
- 28.Zhao, B.-W. et al. A heterogeneous information network learning model with neighborhood-level structural representation for predicting lncRNA-miRNA interactions. Comput. Struct. Biotechnol. J.23, 2924–2933 10.1016/j.csbj.2024.06.032 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Zeng, X. et al. deepdr: a network-based deep learning approach to in silico drug repositioning. Bioinformatics35, 5191–5198, 10.1093/bioinformatics/btz418 (2019). https://academic.oup.com/bioinformatics/article-pdf/35/24/5191/48978218/bioinformatics_35_24_5191.pdf. [DOI] [PMC free article] [PubMed]
- 30.Tayebi, J. & BabaAli, B. Ekgdr: An end-to-end knowledge graph-based method for computational drug repurposing. J. Chem. Inf. Model.64, 1868–1881 10.1021/acs.jcim.3c01925 (2024). [DOI] [PubMed] [Google Scholar]
- 31.Gharizadeh, A., Abbasi, K., Ghareyazi, A., Mofrad, M. R. K. & Rabiee, H. R. HGTDR: Advancing drug repurposing with heterogeneous graph transformers. Bioinformatics40, btae349 10.1093/bioinformatics/btae349 (2024). [DOI] [PMC free article] [PubMed]
- 32.Zhao, H. et al. Improving the predictive performance of binding affinities and poses for protein–cyclic peptide complexes through fine-tuned MM/PBSA(GBSA)-based methods. Brief. Bioinform.26, bbaf632 10.1093/bib/bbaf632 (2025). [DOI] [PMC free article] [PubMed]
- 33.Kumar, B. & Sharma, N. Approaches, issues and challenges in recommender systems: A systematic review. Indian J. Sci. Technol.9, 1–12 10.17485/ijst/2016/v9i47/94892 (2016).
- 34.Singh, M. Scalability and sparsity issues in recommender datasets: a survey. Knowl. Inf. Syst.62, 1–43 10.1007/s10115-018-1254-2 (2020). [Google Scholar]
- 35.Shi, C. Heterogeneous graph neural networks. In: Graph Neural Networks: Foundations, Frontiers, and Applications (Eds. Wu, L., Cui, P., Pei, J. & Zhao, L.), 351–369, 10.1007/978-981-16-6054-2_16 (Springer Nature Singapore, Singapore, 2022).
- 36.Ghorbanali, Z., Zare-Mirakabad, F., Salehi, N., Akbari, M. & Masoudi-Nejad, A. Drugrep-hesiagraph: when heterogeneous siamese neural network meets knowledge graphs for drug repurposing. BMC Bioinform.24, 374 10.1186/s12859-023-05479-7 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Szegedy, C. et al. Going deeper with convolutions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1–9, 10.1109/CVPR.2015.7298594 (2015).
- 38.Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (Eds. Pereira, F., Burges, C., Bottou, L. & Weinberger, K.), vol. 25 (Curran Associates, Inc., 2012).
- 39.Gönen, M., Khan, S. & Kaski, S. Kernelized bayesian matrix factorization. In International conference on machine learning, 864–872 (PMLR, 2013).
- 40.Luo, Y. et al. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun.8, 573 10.1038/s41467-017-00680-8 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Zeng, X. et al. deepDR: a network-based deep learning approach to in silico drug repositioning. Bioinformatics35, 5191–5198 10.1093/bioinformatics/btz418 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Bonner, S. et al. A review of biomedical datasets relating to drug discovery: a knowledge graph perspective. Brief. Bioinform.23, bbac404 10.1093/bib/bbac404 (2022). [DOI] [PubMed]
- 43.Hogan, A. et al. Knowledge Graphs. No. 22 in Synthesis Lectures on Data, Semantics, and Knowledge (Springer, 2021).
- 44.Brown, A. S. & Patel, C. J. A standard database for drug repositioning. Sci. Data4, 170029 10.1038/sdata.2017.29 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Kuhn, M., Letunic, I., Jensen, L. J. & Bork, P. The sider database of drugs and side effects. Nucleic Acids Res.44, D1075–D1079 10.1093/nar/gkv1075 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Chen, F.-S. & Jiang, Z.-R. Prediction of drug’s anatomical therapeutic chemical (atc) code by integrating drug–domain network. J. Biomed. Informatics58, 80–88 10.1016/j.jbi.2015.09.016 (2015). [DOI] [PubMed] [Google Scholar]
- 47.Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res.46, D1074–D1082 (2017). [DOI] [PMC free article] [PubMed]
- 48.Morgan, H. L. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Documentation5, 107–113 10.1021/c160017a018 (1965). [Google Scholar]
- 49.Mikolov, T., Chen, K., Corrado, G. S. & Dean, J. Efficient estimation of word representations in vector space. In International Conference on Learning Representations (2013).
- 50.Muhammad, W. et al. Multi-path deep cnn with residual inception network for single image super-resolution. Electronics10, 10.3390/electronics10161979 (2021).
- 51.Roy, A. M. An efficient multi-scale cnn model with intrinsic feature integration for motor imagery eeg subject classification in brain-machine interfaces. Biomed. Signal Process. Control.74, 103496 10.1016/j.bspc.2022.103496 (2022). [Google Scholar]
- 52.Lu, L., Zhang, C., Cao, K., Deng, T. & Yang, Q. A multichannel cnn-gru model for human activity recognition. IEEE Access10, 66797–66810 10.1109/ACCESS.2022.3185112 (2022). [Google Scholar]
- 53.Pham, T.-A., Lee, J.-H. & Park, C.-S. Mst-vae: Multi-scale temporal variational autoencoder for anomaly detection in multivariate time series. Appl. Sci.12, 10.3390/app121910078 (2022).
- 54.Fu, L., Zhang, L. & Tao, J. An improved deep convolutional neural network with multiscale convolution kernels for fault diagnosis of rolling bearing. IOP Conf. Series: Mater. Sci. Eng.1043, 052021 10.1088/1757-899X/1043/5/052021 (2021). [Google Scholar]
- 55.Ho, Y. & Wookey, S. The real-world-weight cross-entropy loss function: Modeling the costs of mislabeling. IEEE Access8, 4806–4813 10.1109/ACCESS.2019.2962617 (2020). [Google Scholar]
- 56.Widodo, S., Brawijaya, H. & Samudi, S. Stratified k-fold cross validation optimization on machine learning for prediction. Sinkron: jurnal dan penelitian teknik informatika6, 2407–2414 10.33395/sinkron.v7i4.11792 (2022).
- 57.Lemaître, G., Nogueira, F. & Aridas, C. Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning. 18, 10.48550/arXiv.1609.06570 (2016).
- 58.Trouillon, T., Welbl, J., Riedel, S., Gaussier, É. & Bouchard, G. Complex embeddings for simple link prediction. arXiv preprint arXiv:1606.06357 https://arxiv.org/abs/1606.06357 (2016).
- 59.Sun, Z., Deng, Z.-H., Nie, J.-Y. & Tang, J. RotatE: Knowledge graph embedding by relational rotation in complex space. arXiv preprint arXiv:1902.10197 https://arxiv.org/abs/1902.10197 (2019).
- 60.Bordes, A., Usunier, N., Garcia-Durán, A., Weston, J. & Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. (NeurIPS)26, 2787–2795 (2013). [Google Scholar]
- 61.van der Maaten, L. & Hinton, G. Visualizing data using t-sne. J. Mach. Learn. Res.9, 2579–2605 (2008). [Google Scholar]
- 62.Parvizi, P., Azuaje, F., Theodoratou, E. & Luz, S. A network-based embedding method for drug-target interaction prediction. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc.2020, 5304–5307 10.1109/EMBC44109.2020.9176165 (2020). [DOI] [PubMed]
- 63.Cappuzzo, F., Bemis, L. & Varella-Garcia, M. her2 mutation and response to trastuzumab therapy in non–small-cell lung cancer. New Engl. J. Medicine354, 2619–2621 10.1056/NEJMc060020 (2006). [DOI] [PubMed] [Google Scholar]
- 64.Li, B. T. et al. Trastuzumab deruxtecan in her2-mutant non–small-cell lung cancer. New Engl. J. Medicine386, 241–251 10.1056/NEJMoa2112431 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Langer, C. J., Stephenson, P., Thor, A., Vangel, M. & Johnson, D. H. Trastuzumab in the treatment of advanced non-small-cell lung cancer: Is there a role? focus on eastern cooperative oncology group study 2598. J. Clin. Oncol.22, 1180–1187 10.1200/JCO.2004.04.105 (2004). [DOI] [PubMed] [Google Scholar]
- 66.Hotta, K. et al. A phase ii study of trastuzumab emtansine in her2-positive non–small cell lung cancer. J. Thorac. Oncol.13, 273–279 10.1016/j.jtho.2017.10.032 (2018). [DOI] [PubMed] [Google Scholar]
- 67.Li, B. T. et al. Ado-trastuzumab emtansine for patients with her2-mutant lung cancers: Results from a phase ii basket trial. J. Clin. Oncol.36, 2532–2537 10.1200/JCO.2018.77.9777 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Ferrone, M. & Motl, S. E. Trastuzumab for the treatment of non–small-cell lung cancer. Annals Pharmacother.37, 1904–1908 10.1345/aph.1D101 (2003). PMID: 14632535. [DOI] [PubMed]
- 69.Randomized phase ii trial of gemcitabine–cisplatin with or without trastuzumab in her2-positive non-small-cell lung cancer. Annals Oncol.15, 19–27 10.1093/annonc/mdh031 (2004). [DOI] [PubMed]
- 70.Hirsch, F. R. & Langer, C. J. The role of her2/neu expression and trastuzumab in non-small cell lung cancer. Semin. Oncol.31, 75–82, 10.1053/j.seminoncol.2003.12.018 (2004). Current Perspectives and Novel Strategies in the Treatment of Patients With Lung Cancer. [DOI] [PubMed]
- 71.Zhao, T. et al. Lysosomal acid lipase, csf1r, and pd-l1 determine functions of cd11c+ myeloid-derived suppressor cells. JCI Insight7, e156623 10.1172/jci.insight.156623 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Leighl, N. B. et al. A multicenter, phase 2 study of vascular endothelial growth factor trap (aflibercept) in platinum- and erlotinib-resistant adenocarcinoma of the lung. J. Thorac. Oncol.5, 1054–1059 10.1097/JTO.0b013e3181e2f7fb (2010). [DOI] [PubMed] [Google Scholar]
- 73.Gaya, A. & Tse, V. A preclinical and clinical review of aflibercept for the management of cancer. Cancer Treat. Rev.38, 484–493 10.1016/j.ctrv.2011.12.008 (2012). [DOI] [PubMed] [Google Scholar]
- 74.Ciombor, K. K., Berlin, J. & Chan, E. Aflibercept. Clin. Cancer Res.19, 1920–1925 10.1158/1078-0432.CCR-12-2911 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Wang, Y. et al. A multifunctional non-viral vector for the delivery of mth1-targeted crispr/cas9 system for non-small cell lung cancer therapy. Acta Biomater.153, 481–493 10.1016/j.actbio.2022.09.046 (2022). [DOI] [PubMed]
- 76.Liu, M. et al. Protamine nanoparticles for improving shrna-mediated anti-cancer effects. Nanoscale Res. Lett.10, 134 10.1186/s11671-015-0845-z (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Gerena-Lewis, M. et al. A phase ii trial of denileukin diftitox in patients with previously treated advanced non-small cell lung cancer. Am. J. Clin. Oncol.32, 269–273 10.1097/COC.0b013e318187dd40 (2009). [DOI] [PubMed] [Google Scholar]
- 78.Norton, R. & O’Connell, M. A. Vitamin d: Potential in the prevention and treatment of lung cancer. Anticancer. Res.32, 211–221 https://ar.iiarjournals.org/content/32/1/211.full.pdf (2012). [PubMed]
- 79.Wang, W. et al. Vitamin d and lung cancer; association, prevention, and treatment. Nutr. Cancer73, 2188–2200 10.1080/01635581.2020.1844245 (2021). [DOI] [PubMed]
- 80.Ramnath, N., Kim, S. & Christensen, P. J. Vitamin d and lung cancer. Expert. Rev. Respir. Medicine5, 305–309 10.1586/ers.11.31 (2011). [DOI] [PubMed]
- 81.Kilkkinen, A. et al. Vitamin d status and the risk of lung cancer: A cohort study in finland. Cancer Epidemiol. Biomark. Prev.17, 3274–3278 10.1158/1055-9965.EPI-08-0199 (2008). [DOI] [PubMed] [Google Scholar]
- 82.Zhou, W. et al. Vitamin d is associated with improved survival in early-stage non–small cell lung cancer patients. Cancer Epidemiol. Biomark. Prev.14, 2303–2309 10.1158/1055-9965.EPI-05-0335 (2005). [DOI] [PubMed] [Google Scholar]
- 83.Zhang, L., Wang, S., Che, X. & Li, X. Vitamin d and lung cancer risk: A comprehensive review and meta-analysis. Cell. Physiol. Biochem.36, 299–305 10.1159/000374072 (2015). [DOI] [PubMed] [Google Scholar]
- 84.Araghi, M. et al. Recent advances in non-small cell lung cancer targeted therapy; an update review. Cancer Cell Int.23, 162 10.1186/s12935-023-02990-y (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Bray, F. et al. Global cancer statistics 2018: Globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin.68, 394–424 10.3322/caac.21492 (2018). [DOI] [PubMed] [Google Scholar]
- 86.Wu, Y.-L. et al. Pan-asian adapted clinical practice guidelines for the management of patients with metastatic non-small-cell lung cancer: a csco–esmo initiative endorsed by jsmo, ksmo, mos, sso and tos. Annals Oncol.30, 171–210 10.1093/annonc/mdy554 (2019). [DOI] [PubMed] [Google Scholar]
- 87.Alduais, Y., Zhang, H., Fan, F., Chen, J. & Chen, B. Non-small cell lung cancer (nsclc): A review of risk factors, diagnosis, and treatment. Medicine102, e32899 10.1097/MD.0000000000032899 (2023). [DOI] [PMC free article] [PubMed]
- 88.Neal, J. W. & Wakelee, H. A. Aflibercept in lung cancer. Expert Opin. Biol. Ther.13, 115–120 10.1517/14712598.2013.745847 (2013). [DOI] [PubMed] [Google Scholar]
- 89.Bano, N. et al. Drug repurposing of selected antibiotics: An emerging approach in cancer drug discovery. ACS Omega9, 26762–26779 10.1021/acsomega.4c00617 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Du, X., Xue, Z., Lv, J. & Wang, H. Expression of the topoisomerase ii alpha (top2a) gene in lung adenocarcinoma cells and the association with patient outcomes. Med. Sci. Monit.26, e929120 10.12659/MSM.929120 (2020). [DOI] [PMC free article] [PubMed]
- 91.Syahruddin, E. et al. Differential expression of dna topoisomerase ii alpha and ii beta genes between small cell and non-small cell lung cancer. Jpn. J. Cancer Res.89, 855–861 10.1111/j.1349-7006.1998.tb00640.x (1998). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Khaled, D. M. et al. A computational qsar, molecular docking and in vitro cytotoxicity study of novel thiouracil-based drugs with anticancer activity against human-dna topoisomerase ii. Int. J. Mol. Sci.23, 11799 10.3390/ijms231911799 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Magklaras, A.-D.C., Banti, C. N. & Hadjikakou, S. K. Antiproliferative activity of antibiotics through dna binding mechanism: Evaluation and molecular docking studies. Int. J. Mol. Sci.24, 2563 10.3390/ijms24032563 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
The compound–protein and disease–protein relations, along with the corresponding chart of the model, are provided in the supplementary materials. All data and code used in this study are available at https://github.com/Marzieh-Khodadadi/ConvAHKG.