Author manuscript; available in PMC: 2025 Dec 3.
Published in final edited form as: Knowl Based Syst. 2024 Oct 19;305:112638. doi: 10.1016/j.knosys.2024.112638

HeteroKGRep: Heterogeneous Knowledge Graph based Drug Repositioning

Ribot Fleury T Ceskoutsé d, Alain Bertrand Bomgni a,b, David R Gnimpieba Zanfack c, Diing D M Agany a, Bouetou Bouetou Thomas d, Etienne Gnimpieba Zohim a
PMCID: PMC11600970  NIHMSID: NIHMS2031958  PMID: 39610660

Abstract

The process of developing new drugs is both time-consuming and costly, often taking over a decade and billions of dollars to obtain regulatory approval. Additionally, the complexity of patent protection for novel compounds presents challenges for pharmaceutical innovation. Drug repositioning offers an alternative strategy to uncover new therapeutic uses for existing medicines. Previous repositioning models have been limited by their reliance on homogeneous data sources, failing to leverage the rich information available in heterogeneous biomedical knowledge graphs. We propose HeteroKGRep, a novel drug repositioning model that utilizes heterogeneous graphs to address these limitations. HeteroKGRep is a multi-step framework that first generates a similarity graph from hierarchical concept relations. It then applies SMOTE over-sampling to address class imbalance before generating node sequences using a heterogeneous graph neural network. Drug and disease embeddings are extracted from the network and used for prediction. We evaluated HeteroKGRep on a graph containing biomedical concepts and relations from ontologies, pathways and literature. It achieved state-of-the-art performance with 99% accuracy, 95% AUC ROC and 94% average precision on predicting repurposing opportunities. Compared to existing homogeneous approaches, HeteroKGRep leverages diverse knowledge sources to enrich representation learning. Based on heterogeneous graphs, HeteroKGRep can discover new drug-disease associations, complementing de novo drug development. This work establishes a promising new paradigm for knowledge-guided drug repositioning using multimodal biomedical data.

Keywords: deep learning, drug repurposing, biomedical heterogeneous graph

1. Introduction

Drug repositioning has the potential to accelerate drug discovery by identifying new therapeutic uses for existing medicines. During the COVID-19 pandemic, the first medical solutions were found using drug repurposing [1]. Since then, attention has focused on building accurate machine learning models to discover new therapeutic effects of existing drugs. The conventional process of developing new drugs takes over a decade and billions of dollars. Machine learning therefore stands as an effective opportunity for drug discovery, and many machine learning drug repositioning models have been proposed by researchers. While these models offer promising avenues to streamline this process by efficiently uncovering novel therapeutic opportunities from existing medications, several limitations hinder their efficacy in drug repurposing efforts [2].

One major challenge lies in the data heterogeneity present in biomedical knowledge graphs. Current machine learning models for drug repositioning often struggle with leveraging heterogeneous datasets, limiting their ability to exploit the diverse information available [3]. This can result in incomplete or biased representations, impacting the accuracy of predictions. Additionally, algorithmic challenges, such as the use of techniques like random walks on heterogeneous graphs, may not adequately capture the complex relationships between drugs and diseases, leading to computational inefficiencies and potentially inaccurate insights [4, 5]. Furthermore, the limited transferability of machine learning models poses another obstacle. Models trained for specific drug repurposing tasks may lack generalizability across different datasets or diseases, requiring extensive retraining or modification for each new application. Lastly, the interpretability of deep learning models in biomedical contexts presents a crucial concern [3]. While these models excel at capturing intricate data patterns, their black-box nature can hinder the understanding of the rationale behind predictions, making it challenging to validate and trust the identified drug candidates.

In this context, we introduce Heterogeneous Knowledge Graph based Drug Repositioning (HeteroKGRep), a novel drug repositioning model that leverages the power of Heterogeneous Graph Neural Network (HeteroGNN) [6] and data augmentation techniques [7]. HeteroKGRep addresses the limitations of existing methods by utilizing heterogeneous knowledge graphs, which integrate multiple sources of data, and employing HeteroGNN to learn meaningful representations of drugs and diseases.

The key contributions of this work can be summarized as follows:

  1. Generating augmented data on knowledge graphs: We developed a technique based on HeteroKGRep for augmenting data from a large heterogeneous biomedical knowledge graph by leveraging concepts and relations. This approach utilizes the SMOTE framework to oversample underrepresented concepts, thereby enriching the knowledge representation in the graph in a robust manner.

  2. Machine learning on graphs: We applied state-of-the-art graph neural network techniques to perform representation learning from the heterogeneous graph. This leverages dependencies across diverse data types for drug-disease association prediction.

  3. Knowledge-guided predictions: Drug and disease embeddings extracted from the trained model encode relational information across the graph. This knowledge-guided approach improves predictive performance for drug repositioning.

The remainder of the paper is organized as follows: Section 2 reviews related work. Section 3 describes the HeteroKGRep methodology. Section 4 details the experiment setup, results, and evaluation. Finally, Section 5 concludes and discusses opportunities for future work.

2. Related work

Drug repositioning leveraging biomedical knowledge graphs is an active area of research with the goal of accelerating drug discovery. Several machine learning models have been developed to identify potential drug-disease associations by exploiting relationships within heterogeneous graph-structured data [8]. Early works in this domain typically relied on homogeneous data sources, such as molecular structures or side effect similarities [6, 9]. However, more recent approaches aim to integrate diverse data types to enrich representation learning for drug repositioning [10].

Bang et al. [11] propose a recommendation system leveraging the principle of guilt-by-association across multiple layers of a biomedical knowledge graph. Their approach, named DREAMwalk, performs random walks guided by semantic similarities between drug and disease entities. ATC and MeSH taxonomies are used to derive the semantic hierarchies of these entities. The random walk sequences are fed into a heterogeneous Skip-gram model to learn node embeddings combining both semantic and biological contexts. An XGBoost classifier is then trained on the drug-disease node embedding vectors to predict novel associations. DREAMwalk was shown to improve link prediction accuracy by up to 16.8% compared to state-of-the-art models. Case studies for breast cancer and Alzheimer’s disease further demonstrated the ability of DREAMwalk to effectively repurpose drugs leveraging the multi-layer guilt-by-association perspective on biomedical knowledge graphs.

Pan et al. [12] propose a neural link prediction approach called WalkPool, based on a novel pooling scheme. WalkPool combines the expressivity of topological heuristics with the feature learning ability of neural networks. It summarizes candidate links by computing random walk probabilities on a "predictive" latent graph obtained via attention on node features. This can be seen as performing feature-sensitive topology fingerprinting. WalkPool can leverage unsupervised node features or be combined with GNNs in an end-to-end trainable manner. Evaluations show it outperforms state-of-the-art methods on standard link prediction benchmarks, for both homophilic and heterophilic networks, with or without node attributes. Applying WalkPool to unsupervised GNNs significantly improves accuracy, suggesting it can serve as a general purpose graph pooling scheme.

Meng et al. [13] introduce an innovative approach for drug repositioning named DRAGNN (Drug Repositioning based on Weighted Local Information Augmented Graph Neural Network). This model incorporates a graph attention mechanism to dynamically allocate attention coefficients to heterogeneous drug and disease nodes, thereby enhancing the effectiveness of target node information collection. By excluding self-node information aggregation and employing average pooling in neighbor information aggregation, valuable heterogeneous and homogeneous information is highlighted while maintaining simplicity. Final association predictions are generated using a multi-layer perceptron. The effectiveness of DRAGNN is validated through a 10-times 10-fold cross-validation on three benchmark datasets, along with further analyses using authoritative data sources, molecular docking experiments, and drug-disease network analysis, paving the way for future drug discovery endeavors.

Xianfang Tang et al. [14] introduce a novel computational method named DRGBCN for enhancing drug repositioning through local interactive learning with bilinear attention networks. DRGBCN integrates heterogeneous information to infer potential drugs for specific diseases by constructing a comprehensive drug-disease network that includes multiple similarity networks for drugs and diseases. The methodology incorporates a layer attention mechanism to learn graph convolutional layer embeddings effectively, followed by the construction of a bilinear attention network to capture pairwise local interactions between drugs and diseases. This approach enhances prediction accuracy and reliability.

Zhang and Chen [15] propose a link prediction model based on Graph Neural Networks. Their approach, named SEAL, combines characteristics from the three main families of link prediction methods within a single Graph Neural Network: structural, latent and explicit features. SEAL takes as input local subgraphs around target links, allowing it to integrate higher-order information. Latent node embeddings and explicit node attributes are concatenated to train the GNN.

Zhao et al. [16] also introduce a groundbreaking geometric deep learning framework, DDAGDL, designed specifically for drug repositioning tasks within heterogeneous information networks (HINs). This innovative approach addresses a critical limitation in existing computational methods for drug repositioning, namely the neglect of the non-Euclidean nature of biomedical network data, which hinders their predictive accuracy. DDAGDL leverages the power of geometric deep learning (GDL) to predict drug-disease associations (DDAs) by integrating complex biological information into the topological structure of HINs. By employing an attention mechanism, the model effectively learns smoothed representations of drugs and diseases, enhancing its predictive capabilities significantly. Through extensive experimental evaluations on three real-world datasets using a 10-fold cross-validation setup, Zhao et al. demonstrate the superior performance of DDAGDL over state-of-the-art drug repositioning methods. The model exhibits notable improvements in terms of Accuracy, Matthews Correlation Coefficient (MCC), and F1-score across all benchmark datasets, showcasing its effectiveness in discovering new drug indications.

Muniyappan et al. [17] present EGeRepDR, an enhanced genetic-based representation learning framework for drug repurposing leveraging multiple biomedical data sources. The approach first constructs a heterogeneous biological network with drug, disease and gene nodes linked by various relationships. A modified GAE model is applied to learn node representations from the network topology and edges. The biological network is then enriched with information from literature and ontologies using a semi-supervised pattern embedding model and a novel DR perspective representation learning respectively. Finally, a neural network model generates probability scores for drug-disease pairs based on the multimodal representations. By utilizing genetic information from various sources, EGeRepDR aims to improve performance over methods not considering the genetic importance of DR prediction.

Pan Zeng and collaborators [18] introduce a novel drug repositioning methodology, TGCNDR, founded on a tripartite cross-network embedding and graph convolutional network. This innovative approach targets the prediction of associations between drugs and diseases, aiming to enhance the accuracy of identifying potential drug candidates and disease targets within the realm of drug discovery. TGCNDR constructs a tripartite cross-network encompassing drug-disease associations, drug-protein associations, and drug-side effect associations. It employs graph convolutional networks to extract information from various nodes to facilitate the learning of drug embeddings. Additionally, TGCNDR incorporates anchor links for knowledge transfer and utilizes the self-attention mechanism across different networks to signify drug importance. In a compelling case study, molecular docking experiments conducted on the predicted drug candidates lead to the identification of newly approved drugs for osteosarcoma, underscoring the efficacy of TGCNDR as a potent tool for drug repositioning endeavors.

Lihong Peng et al. [19] introduce LDA-VGHB, a powerful framework designed to identify potential associations between long noncoding RNAs (lncRNAs) and diseases. This innovative approach leverages a combination of techniques including singular value decomposition, variational graph auto-encoder, and heterogeneous Newton boosting machine to enhance the accuracy of lncRNA–disease association predictions. LDA-VGHB is rigorously compared against four classical LDA prediction methods (SDLDA, LDNFSGB, IPCARF, and LDASR) as well as four popular boosting models (XGBoost, AdaBoost, CatBoost, and LightGBM) through extensive 5-fold cross-validations across various datasets. The framework emerges as a standout performer, showcasing remarkable performance improvements over existing methods across different cross-validation scenarios using datasets such as lncRNAs, diseases, lncRNA–disease pairs, independent lncRNAs, and independent diseases.

Ghorbanali et al. [20] present DrugRep-HeSiaGraph, a two-step method for drug repurposing leveraging knowledge graphs. First, a drug-disease knowledge graph DDKG-V1 is constructed, defining new relation types. Node embeddings are generated using distributional learning. Second, a heterogeneous siamese neural network HeSiaNet is applied to enrich the embeddings by mapping drugs and diseases into a unified latent space. HeSiaNet integrates the DDKG-V1 and applies a siamese architecture to bring related entities closer in the embedding space. This allows predicting new drug candidates for diseases. Evaluation shows DrugRep-HeSiaGraph achieves strong performance with AUC-ROC of 91.16%, AUC-PR of 90.32%, accuracy of 84.63%, demonstrating the effectiveness of combining knowledge graphs with siamese neural networks for drug repurposing.

In the study by Zhao et al. [21], a novel graph representation learning model, FuHLDR, is introduced for enhancing drug repositioning by integrating higher and lower-order biological information. FuHLDR leverages a graph convolutional network to initially capture lower-order representations of drugs and diseases based on their biological attributes and drug-disease associations within the HIN. Subsequently, a meta-path-based strategy is employed to incorporate higher-order connectivity patterns, encompassing relationships among drugs, proteins, and diseases. By integrating these higher and lower-order representations, FuHLDR utilizes a Random Vector Functional Link Network to identify novel drug-disease associations. Experimental evaluations on two benchmark datasets illustrate the superior performance of FuHLDR compared to state-of-the-art drug repositioning models. The model’s effectiveness is further demonstrated through case studies on Alzheimer’s disease and breast neoplasms, highlighting how the integration of higher-order biological information enhances the accuracy and insights in drug repositioning endeavors.

While previous works have achieved promising results leveraging biomedical knowledge graphs for drug repurposing, we acknowledge that existing platforms and data sources have already addressed some of their limitations. Platforms such as Genomic Knowledgebase (GenomicKB) [22], the Cancer Genome Atlas [23], RNAcentral, Genotype-Tissue Expression (GTEx) [24], GWAS, Database of Genomic Variants (DGV) [25], NCBI dbVar [26], 4D Nucleome (4DN) [27], FIRE studies, MotifMap, and NCBO ontologies provide a vast amount of genomic entities, relations, and properties [22]. However, despite the availability of these platforms, challenges remain in the integration and processing of data from multiple sources. Standardizing and normalizing metadata, handling missing values, ensuring interoperability across data sources, and addressing conflicting evidence are ongoing concerns in modern biological and clinical research. To address these challenges, we propose HeteroKGRep, a framework that specifically focuses on leveraging heterogeneous graph neural networks for robust knowledge graph completion applied to real-world biomedical networks. While previous approaches have shown promise, HeteroKGRep aims to improve drug-disease association prediction by capturing both network structure and semantic relations across different data types. By utilizing HeteroGNN [4], a heterogeneous graph neural network method, HeteroKGRep overcomes the limitations of random walk methods and traditional representation learning techniques. By leveraging the joint representations learned from a multi-modal biomedical knowledge graph, HeteroKGRep contributes to addressing the challenges of data integration, processing large heterogeneous genomic datasets, and providing improved insights for drug repurposing. It serves as a step towards harnessing the power of heterogeneous graph neural networks in the context of biomedical research.

3. Methods

This work presents HeteroKGRep (Figure 1), a model for drug repurposing recommendation based on heterogeneous knowledge graphs. HeteroKGRep aims to address limitations of existing methods by jointly integrating genetic information from different sources via representation learning.

Figure 1: Step by step drug repurposing using HeteroKGRep.

The first contribution is a data augmentation algorithm prototype that leverages SMOTE [7] to balance the entity distribution in heterogeneous knowledge graphs. This prototype, shown in Algorithm 1, takes as input a heterogeneous knowledge graph $G=(V,E)$ and performs the following main steps:

  1. It extracts node features $X \in \mathbb{R}^{|V| \times d}$ and labels $y \in \mathbb{R}^{|V|}$ from $G$, where $d$ is the number of node features,

  2. It handles missing values in $X$ using the SimpleImputer, which replaces missing entries with the mean/median of the corresponding feature,

  3. It oversamples the minority classes in $X$ and $y$ using SMOTE. SMOTE generates new synthetic samples $X_{res}$ and $y_{res}$ by:
    $x_{new} = x_i + \lambda (x_j - x_i)$, (1)

    where $x_i$ is a minority class sample, $x_j$ is one of its $k$ nearest neighbors, and $\lambda \in [0,1]$ is a random number.

  4. It uses the oversampled matrices $X_{res}$ and $y_{res}$ to synthetically generate new nodes and edges, and

  5. It retains the topological structure of $G$ while incorporating the new nodes and edges to create an augmented graph $G_{aug}$.
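As an illustration of steps 2 and 3, the following is a minimal pure-Python sketch of mean imputation followed by the SMOTE interpolation of Eq. (1). It is not the authors' implementation: the function names `impute_mean` and `smote_oversample`, the toy feature matrix, and the choice of `k` are assumptions made for this example, and a real pipeline would more likely rely on library implementations of the imputer and of SMOTE.

```python
import math
import random

def impute_mean(rows):
    """Replace None entries with the mean of the observed values in each column."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)] for r in rows]

def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples with x_new = x_i + lam * (x_j - x_i)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x_i = minority[i]
        # k nearest minority-class neighbours of x_i (Euclidean), excluding x_i itself
        others = [x for idx, x in enumerate(minority) if idx != i]
        neighbours = sorted(others, key=lambda x: math.dist(x, x_i))[:k]
        x_j = rng.choice(neighbours)
        lam = rng.random()  # random interpolation factor in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(x_i, x_j)])
    return synthetic

# Hypothetical minority-class feature matrix with one missing entry.
X_minority = impute_mean([[1.0, 2.0], [2.0, None], [3.0, 4.0]])
X_new = smote_oversample(X_minority, n_new=4)
```

Each synthetic sample lies on the segment between a minority sample and one of its neighbours, which is why SMOTE enlarges the minority class without leaving the region it already occupies in feature space.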

Algorithm 1 SMOTE Graph Augmentation
    1: G ← Load graph from file using read_graph(edgeList)
    2: X ← Extract node features from G using get_node_features(G)
    3: y ← Extract node labels from G using get_node_labels(G)
    4: Apply imputation on X to handle missing values
    5: Perform SMOTE oversampling on X and y using apply_smote(G)
    6: Update node features in G with the oversampled data using set_node_features(G, X_res)
    7: Save augmented graph G to file using save_graph(G, filename)

The second contribution is the HeteroKGRep model, based on the HeteroGNN architecture [6] for heterogeneous graph neural networks. The HeteroGNN model, shown in Algorithm 2, takes as input the augmented heterogeneous knowledge graph $G_{aug}$. It performs the following main steps:

  1. It defines the HeteroGNN architecture with MLP modules tailored to different entity types (e.g., drugs, diseases),

  2. It extracts node features from $G_{aug}$ and passes them through the corresponding MLP modules to generate initial embeddings. Specifically, it extracts the heterogeneous contents $C_v$ associated with each node $v$, and encodes them into a fixed-size embedding using a neural network $f_1$, where:
    $f_1(v) = \frac{1}{|C_v|} \sum_{i \in C_v} \left[ \overrightarrow{\mathrm{LSTM}}\{\mathcal{FC}_{\theta_x}(x_i)\} \oplus \overleftarrow{\mathrm{LSTM}}\{\mathcal{FC}_{\theta_x}(x_i)\} \right]$, (2)
    where $x_i \in \mathbb{R}^{d_f \times 1}$ is the feature representation of the $i$-th content in $C_v$, and $d_f$ is the content feature dimension.

  3. It concatenates the embeddings $\{f_1(v) \mid v \in V\}$ and passes them through an output layer to predict associations. Specifically, the model computes the association scores between node pairs $(u,v)$ as:
    $\hat{y}_{u,v} = \sigma\left(W^{\top} [f_1(u) \oplus f_1(v)] + b\right)$, (3)
    where $W \in \mathbb{R}^{2d_1 \times 1}$ and $b \in \mathbb{R}$ are the weights and bias of the output layer, $d_1$ is the dimension of the node embeddings $f_1(v)$, and $\sigma$ is the sigmoid activation function to produce the final association score $\hat{y}_{u,v} \in [0,1]$.

  4. It trains the model by calculating losses on prediction-label pairs and updating the parameters using the Adam optimizer. Specifically, the model is trained to minimize the binary cross-entropy loss between the predicted association scores $\hat{y}_{u,v}$ and the ground truth labels $y_{u,v}$:
    $\mathcal{L} = -\frac{1}{|E|} \sum_{(u,v) \in E} \left[ y_{u,v} \log \hat{y}_{u,v} + (1 - y_{u,v}) \log(1 - \hat{y}_{u,v}) \right]$, (4)
    where $E$ is the set of edges in the graph. The model parameters are then updated using the Adam optimization algorithm to iteratively minimize this loss function.

  5. After training, it extracts the final trained embeddings from the model, which encode nodes’ contexts.
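To make the scoring and loss steps concrete, the sketch below evaluates Eq. (3) and Eq. (4) in pure Python for a single drug-disease pair. The embeddings, weights, and function names are hypothetical values chosen for illustration; the LSTM-based encoder of Eq. (2) and the Adam update are deliberately left out, so this is only a numerical illustration of the output layer and its loss, not the trained model.

```python
import math

def sigmoid(z):
    """Logistic function used as the output activation."""
    return 1.0 / (1.0 + math.exp(-z))

def association_score(f_u, f_v, W, b):
    """Eq. (3): sigmoid(W^T [f_u (+) f_v] + b), where W has length 2 * d1."""
    z = f_u + f_v  # list concatenation plays the role of the (+) operator
    if len(W) != len(z):
        raise ValueError("W must have one weight per concatenated feature")
    return sigmoid(sum(w * x for w, x in zip(W, z)) + b)

def bce_loss(y_true, y_pred, eps=1e-12):
    """Eq. (4): mean binary cross-entropy over the labelled pairs."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Hypothetical 2-dimensional embeddings for one drug and one disease.
f_drug = [0.5, -1.0]
f_disease = [1.0, 2.0]

# With all-zero weights the score is sigmoid(0) = 0.5 ...
score_zero = association_score(f_drug, f_disease, [0.0] * 4, 0.0)
# ... and the loss over one positive and one negative pair is log(2).
loss_zero = bce_loss([1, 0], [score_zero, score_zero])

# Arbitrary non-zero weights still yield a score strictly between 0 and 1.
score = association_score(f_drug, f_disease, [0.3, -0.2, 0.1, 0.4], 0.05)
```

The untrained case gives a useful reference point: a loss near $\log 2 \approx 0.693$ corresponds to a model that cannot yet separate associated from non-associated pairs.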

To address the limitations of existing methods and enhance the representation learning capabilities, HeteroKGRep extends the GNN architecture by integrating genetic information from different sources and leveraging type-specific embeddings. By combining the network structure and semantic relations in the heterogeneous graph with the modified HeteroGNN architecture, HeteroKGRep can capture both local and global context, leading to more accurate predictions of novel associations.

Algorithm 2 HeteroGNN Model Training
    1: Input: Graph G, node features X, labels y
    2: Output: Trained HeteroGNN model
    3: Initialize HeteroGNN model with specified input and output sizes
    4: Initialize the optimizer (e.g., Adam) with appropriate learning rate
    5: Training:
    6: for epoch = 1 to num_epochs do
    7:     for each (x_i, y_i) in (X, y) do
    8:         Extract drug features from node x_i and store in x_drug
    9:         Extract disease features from node x_i and store in x_disease
    10:        Apply drug MLP to x_drug and store result in z_drug
    11:        Apply disease MLP to x_disease and store result in z_disease
    12:        Concatenate z_drug and z_disease and store in z
    13:        Apply output layer to z and store prediction in y_pred
    14:        Calculate the loss using y_pred and y_i
    15:        Update the model parameters using backpropagation and the optimizer
    16:    end for
    17: end for
    18: Output: Trained HeteroGNN model

The third contribution of HeteroKGRep involves generating node sequences from the augmented graph $G_{aug}=(V_{aug},E_{aug})$ to learn entity embeddings that capture global context. This prototype, shown in Algorithm 3, performs the following steps:

  1. It retrieves the list of nodes $\mathcal{V} = \{v_1, v_2, \ldots, v_{|V_{aug}|}\}$ and their labels $y = \{y_1, y_2, \ldots, y_{|V_{aug}|}\}$ from the augmented graph $G_{aug}$,

  2. It initializes an Embedding layer $E \in \mathbb{R}^{|V_{aug}| \times d}$ to embed the nodes into a $d$-dimensional space, where $d$ is the dimensionality of the learned embeddings,

  3. It performs random walks of fixed length $L$ starting from each node $v \in \mathcal{V}$ to explore its neighborhood. The random walk sequence from node $v$ is denoted as $s_v = (v_1, v_2, \ldots, v_L)$, where $v_1 = v$,

  4. It converts the sequences of visited nodes into IDs using a string-to-ID mapping $\phi: \mathcal{V} \to \mathbb{N}$,

  5. It encodes each node ID $\phi(v_i)$ into a feature vector $e_i = E(\phi(v_i))$ using the Embedding layer, and

  6. It gathers sequences of (feature, label) pairs $(e_1, y_1), (e_2, y_2), \ldots, (e_L, y_L)$ from the random walks. By generating these local context samples and applying the HeteroGNN model to them, HeteroKGRep learns entity embeddings that capture global structure.

Algorithm 3 Node Sequence Generation
    1: Extract nodes and labels from G
    2: Initialize Embedding layer
    3: for n ∈ G do
    4:     Generate random walk sequence from n
    5:     Map nodes to IDs
    6:     Encode sequence into features
    7:     Gather (feature, label) pairs
    8: end for
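The node sequence generation steps can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions: the graph is treated as undirected, walks are uniform, the embedding table is randomly initialised rather than learned, and the node names, labels, and function names are hypothetical.

```python
import random

def build_adjacency(edges):
    """Undirected adjacency lists built from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def random_walk(adj, start, length, rng):
    """Uniform random walk s_v = (v_1, ..., v_L) with v_1 = start."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

def generate_node_sequences(edges, labels, length=4, dim=3, seed=0):
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    nodes = sorted(adj)
    node_to_id = {node: i for i, node in enumerate(nodes)}  # string-to-ID mapping phi
    # Randomly initialised embedding table E with one d-dimensional row per node.
    embedding = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in nodes]
    walks, samples = [], []
    for node in nodes:
        walk = random_walk(adj, node, length, rng)
        walks.append(walk)
        # (feature, label) pairs (e_i, y_i) for every node visited by the walk
        samples.append([(embedding[node_to_id[n]], labels[n]) for n in walk])
    return node_to_id, walks, samples

# Hypothetical toy graph: two drugs and one disease linked through a gene.
edges = [("drugA", "geneX"), ("geneX", "diseaseB"), ("drugC", "geneX")]
labels = {"drugA": "drug", "drugC": "drug", "geneX": "gene", "diseaseB": "disease"}
node_to_id, walks, samples = generate_node_sequences(edges, labels)
```

Because every node in an undirected edge list has at least one neighbour, each walk reaches the full length $L$; a directed or filtered graph would need an explicit stop condition for dead ends.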

Finally, an XGBoost [28] classifier is employed to predict novel drug-disease associations based on the learned embeddings, enabling recommendations for new therapeutic assignments. The HeteroKGRep algorithm, shown in Algorithm 4, provides a comprehensive overview of the model’s operation. It starts by loading the heterogeneous knowledge graph G and performing data augmentation (Algorithm 1) to address entity distribution imbalance. It then generates node sequences (Algorithm 3) to learn entity embeddings capturing global context. Next, the model is initialized and trained using the augmented graph and node sequences (Algorithm 2). It leverages type-specific embeddings and network structure to predict associations and update model parameters. After training, the final embeddings, which encode nodes’ contexts, are extracted from the model and passed to the XGBoost classifier.

By combining heterogeneous graph representation learning, integration of genetic information, and the use of XGBoost for classification, HeteroKGRep aims to improve the state-of-the-art in drug repurposing. These contributions represent promising advances for leveraging knowledge graphs in biomedical research. This model stands out due to its innovative data augmentation technique that addresses entity distribution imbalances within the knowledge graph through the application of SMOTE oversampling. By synthetically generating new nodes and edges while preserving the underlying structure of the graph, HeteroKGRep enhances the quality and diversity of the data used for learning. Furthermore, the integration of genetic information from various sources via representation learning offers a unique perspective on capturing complex relationships between entities. By extending the HeteroGNN architecture with type-specific embeddings and a tailored MLP structure, the model can effectively learn and predict associations, thereby enabling more accurate and informed recommendations for novel drug-disease interactions. Through this innovative combination of techniques, HeteroKGRep demonstrates a promising advancement in the field of drug repurposing, showcasing the potential for enhanced decision-making in biomedical research and therapeutic assignments.

Algorithm 4 HeteroKGRep Algorithm
    1: Input: Heterogeneous knowledge graph G
    2: Output: Trained HeteroGNN model and predicted associations
    3: Data Augmentation:
    4:     Perform SMOTE graph augmentation on G (Algorithm 1)
    5: Node Sequence Generation:
    6:     Generate node sequences from G (Algorithm 3)
    7: Model Training:
    8:     Initialize HeteroGNN model with specified input and output sizes
    9:     Extract node features from G
    10:    Generate type-specific embeddings
    11:    Train the HeteroGNN model using node sequences and labels (Algorithm 2)
    12: Association Prediction:
    13:    Extract final trained embeddings from the HeteroGNN model
    14:    Use XGBoost classifier to predict novel associations based on the embeddings
    15: Output: Trained HeteroGNN model and predicted associations

4. Experimental results and discussion

To train, evaluate and validate our model, we used the same source data used by Bang et al.[11] in their work on the DREAMwalk model. This dataset includes MSI[29], HetioNet[30] and KEGG[31] databases. The structure and content of this dataset can be found in their publication as well as in our GitHub repository. We chose to use the same data and preparation as our predecessors because the data was publicly available and our goal was to present an alternative solution in the field of drug repurposing with results that can be compared to the state-of-the-art models. By using the same publicly available data that was previously utilized, we ensure reproducibility and allow for a fair comparison of our model’s performance to prior work. Sharing details on GitHub further increases transparency. Our approach provides an additional perspective that could help advance research while respecting experimental designs adopted by others studying similar scientific questions. We therefore compare the results of our model to the two most recent and highest performing models to date: DREAMwalk[11] and DrugRep-HeSiaGraph[20]. DREAMwalk applies random walks guided by semantic similarities across a multi-layer biomedical knowledge graph. It was shown to improve link prediction accuracy up to 16.8% compared to state-of-the-art models, with an accuracy of 0.82, AUC of 0.91 and AUPR of 0.90. DrugRep-HeSiaGraph enriches node embeddings by mapping entities in a unified latent space using a heterogeneous siamese neural network. The evaluation showed it achieves strong performance with an AUC-ROC of 91.16%, AUC-PR of 90.32% and accuracy of 84.63%.
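For readers comparing the figures quoted above, the three reported metrics (accuracy, AUC-ROC, and AUPR or average precision) can be computed as in the following pure-Python sketch. The labels and scores are hypothetical toy values, not data from this study, and the tie handling in `average_precision` is simplified relative to standard library implementations.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)

def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical labels (1 = known drug-disease association) and model scores.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]

acc = accuracy(y_true, scores)           # 0.75
auc = roc_auc(y_true, scores)            # 0.75
ap = average_precision(y_true, scores)   # (1/1 + 2/3) / 2
```

Accuracy depends on the chosen threshold, whereas AUC-ROC and average precision depend only on the ranking of the scores, which is one reason the three numbers can differ noticeably for the same model.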

We conducted an extensive evaluation of our HeteroKGRep model, which was trained on a large-scale biomedical knowledge graph using the HeteroGNN algorithm, to generate entity embeddings. To assess the effectiveness of the model training, we monitored the loss value over multiple epochs during the learning process. In this experiment, we performed 10 independent training runs, each consisting of 10 epochs, to establish the robustness and consistency of the results. For each run, we recorded the mean loss achieved after every epoch. The results, depicted in Figure 2, illustrate the loss curves over the 10 epochs for the 10 training runs. The average loss values for the 10 runs are as follows: 0.4, 0.35, 0.3, 0.25, 0.2, 0.18, 0.17, 0.18, 0.2, and 0.25, respectively. These low mean loss values indicate that the model effectively learned the task, demonstrating its ability to generate accurate entity embeddings. Across all 10 runs, we observe a consistent trend of decreasing loss with each subsequent epoch, indicating improvement and convergence towards optimal solutions. The loss curves flatten out near the later epochs, suggesting that the model has fully captured the complex graph relationships represented in the biomedical knowledge graph.

Figure 2: Monitoring model loss to validate HeteroKGRep training.

Furthermore, the limited variance among the different runs confirms the reliability and repeatability of the results. Notably, the 5th run achieved the best performance with a remarkably low mean loss of 0.2, validating the high effectiveness of the learned embeddings in representing biomedical entities and their heterogeneous connections. Analyzing the results, we observe that the model becomes stable after the 7th training run, as indicated by the consistently low mean loss values from that point onwards. This stability suggests that the model has learned and encoded the complex relationships within the knowledge graph effectively.

In conclusion, the consistent decreasing loss trend observed over multiple epochs and runs demonstrates the powerful ability of our HeteroKGRep model to accurately encode biomedical entities and their heterogeneous connections within the complex knowledge graph. The low loss values achieved further confirm the efficacy of the generated entity embeddings, making them valuable for applications such as drug repositioning. Overall, the comprehensive evaluation with 10 training runs and 10 epochs each provides strong evidence for the effectiveness and stability of our HeteroKGRep model, supporting its practical utility in various biomedical applications.

We also varied the dimensions of the input and hidden layers while keeping the output layer size at 1 (Figure 3). The mean loss values were 0.234, 0.256, 0.278, and 0.252 for the configurations [128, 64, 1], [128, 128, 1], [64, 64, 1], and [64, 128, 1], respectively. The configuration [128, 64, 1] delivered the best performance, with a mean loss of approximately 0.234. We attribute this result to the model’s ability to capture intricate relationships within the biomedical graph, which improves its predictive accuracy and embedding quality. This configuration was used as the baseline for the remaining experiments conducted in our study, and a 128-dimensional embedding size was therefore used in our architecture. This choice rests on the following considerations. First, in terms of dimensionality, it strikes a balance between model expressiveness and overfitting risk, offering increased capacity to capture intricate patterns and relationships within complex datasets such as heterogeneous biomedical knowledge graphs. Second, a larger embedding size such as 128 increases the model’s capacity, enabling it to learn detailed and nuanced representations of the input data, which is crucial for tasks that demand a comprehensive understanding of the relationships within biomedical graphs. Third, in the context of biomedical knowledge graphs, the 128-dimensional embedding space supports a rich representation of entities such as drugs and diseases, capturing the diverse features and attributes associated with them. Lastly, a 128-dimensional embedding allows the model to encode complex relationships efficiently, which supports its predictive performance in tasks such as drug repositioning within the biomedical domain.

Figure 3:

Evaluation of Model Performance with Varying Layer Dimensions.
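The configuration comparison above amounts to picking the layer layout with the lowest mean loss. The sketch below reproduces that selection from the four mean loss values reported in the text; the dictionary layout and the helper name `best_configuration` are assumptions made for illustration.

```python
# Mean loss per [input, hidden, output] configuration, as reported in the text.
mean_loss = {
    (128, 64, 1): 0.234,
    (128, 128, 1): 0.256,
    (64, 64, 1): 0.278,
    (64, 128, 1): 0.252,
}


def best_configuration(results):
    """Return the (configuration, loss) pair with the lowest mean loss."""
    return min(results.items(), key=lambda item: item[1])


best_config, best_loss = best_configuration(mean_loss)

# Relative gap between the best and the worst configuration.
worst_loss = max(mean_loss.values())
relative_gap = (worst_loss - best_loss) / worst_loss
print(best_config, best_loss, round(relative_gap, 3))
```

The relative gap gives a quick sense of how sensitive the loss is to the layer sizes among the four configurations tested.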

We assessed the performance of HeteroGNN with and without SMOTE in predicting novel associations, using F1-score, accuracy, AUC-ROC, and AUC-PR (Figure 4). Incorporating SMOTE improved every metric. The F1-score increased from 0.85 without SMOTE to 0.94 with SMOTE, indicating a substantially better balance between precision and recall. Accuracy rose from 0.96 to 0.99, indicating a higher proportion of correct predictions overall. AUC-ROC increased from 0.88 to 0.95 and AUC-PR from 0.82 to 0.92, underscoring the model’s improved capacity to rank positive instances above negative ones. These results underscore the importance of data augmentation techniques for handling class imbalance in heterogeneous datasets and improving the model’s predictive performance.

Figure 4:

Performance comparison of HeteroGNN with and without SMOTE in predicting novel associations.
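To make the over-sampling idea behind SMOTE concrete, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is a simplified, pure-Python stand-in and not the implementation used in HeteroKGRep; the function name `smote_like`, the parameters `k` and `seed`, and the 2-D points are assumptions chosen for illustration.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D minority-class embeddings (illustrative values only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region already occupied by the minority class.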

We conducted an experiment on various methods used to handle missing data, including KNN Imputation, MICE, Mean Imputation, Median Imputation, and SimpleImputer, to evaluate their impact on our model. We understand that the preprocessing step plays a crucial role in machine learning models. As depicted in Figure 5, we observed promising outcomes following the integration of the SimpleImputer. Notably, the accuracy exhibited a consistent improvement trend, increasing from 0.95 to 0.99, with a decrease in standard deviation from 0.06 to 0.03. This enhancement underscores the effectiveness of the imputation strategy in refining the model’s ability to make accurate predictions. SimpleImputer also appears to boost the F1-score metric of our model, rising from 0.90 to 0.94. This improvement indicates the SimpleImputer’s role in enhancing the balance between precision and recall, thereby improving the model’s overall performance in classifying instances correctly. The reduction in standard deviation across both accuracy and F1-score metrics further underscores the robustness and consistency achieved through the imputation process. The decreased variability in performance metrics indicates a higher level of confidence in the model’s predictions, highlighting the importance of addressing missing data through appropriate preprocessing techniques. These outcomes underline the pivotal role of data preprocessing in enhancing the predictive prowess of machine learning models and emphasize the importance of addressing missing values to ensure reliable and robust predictions. The version of HeteroKGRep with SimpleImputer was therefore used for the various experiments in this manuscript.

Figure 5:

Performance Metrics for Different Missing Value Imputation Methods.
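Two of the simpler strategies compared above, mean and median imputation, can be sketched in a few lines. This is a pure-Python illustration under stated assumptions: missing values are encoded as `None`, and the helper name `impute_columns` and the feature rows are hypothetical. For reference, scikit-learn's `SimpleImputer` uses the mean strategy by default.

```python
from statistics import mean, median


def impute_columns(rows, strategy="mean"):
    """Replace None entries with the per-column mean or median of the
    observed values in that column."""
    reducers = {"mean": mean, "median": median}
    if strategy not in reducers:
        raise ValueError(f"unknown strategy: {strategy}")
    reduce = reducers[strategy]
    fills = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        if not observed:
            raise ValueError("a column has no observed values")
        fills.append(reduce(observed))
    return [
        [fills[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical feature rows with missing entries (illustrative only).
features = [
    [1.0, None],
    [2.0, 4.0],
    [6.0, 5.0],
    [None, 12.0],
]
mean_filled = impute_columns(features, "mean")
median_filled = impute_columns(features, "median")
print(mean_filled)
print(median_filled)
```

The two strategies diverge when a column is skewed: in the second column the value 12.0 pulls the mean to 7.0 while the median stays at 5.0.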

As shown in Figure 7, we quantitatively compared the performance of HeteroKGRep against DREAMwalk and DrugRep-HeSiaGraph on three link prediction metrics: accuracy, AUC-ROC, and AUC-PR. HeteroKGRep achieved an average accuracy of 0.99, outperforming DREAMwalk and DrugRep-HeSiaGraph, which obtained 0.873 and 0.8463, respectively. This demonstrates HeteroKGRep’s superior ability to correctly classify drug-disease links. On AUC-ROC, HeteroKGRep attained 0.95 compared to 0.938 for DREAMwalk and 0.9116 for DrugRep-HeSiaGraph, indicating that HeteroKGRep can better distinguish true links from false ones. For AUC-PR, our model achieved 0.92, while DREAMwalk achieved 0.939 and DrugRep-HeSiaGraph 0.903. Although HeteroKGRep did not outperform DREAMwalk on this metric, it remains proficient at ranking true links highly.

Figure 7:

Comparison of HeteroKGRep, DREAMwalk and DrugRep-HeSiaGraph model performances.
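For readers less familiar with the ranking metrics used in this comparison, the sketch below computes AUC-ROC through its pairwise-ranking interpretation and uses average precision as a common summary of the precision-recall curve. It is a pure-Python illustration, not the evaluation code behind the reported numbers; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC-ROC as the probability that a random positive is scored above a
    random negative (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative examples")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k taken at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Hypothetical link labels (1 = true drug-disease link) and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(round(auc, 4), round(ap, 4))
```

AUC-ROC weighs every positive-negative pair equally, whereas average precision rewards placing true links at the very top of the ranking, which is why two models can order differently on the two metrics.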

We evaluated our heterogeneous graph neural network model, HeteroKGRep, on its ability to computationally generate semantic similarity scores between drugs and diseases. To demonstrate this capability, we used HeteroKGRep to predict association scores for a simulated dataset containing 10 drug-disease pairs, as shown in Table 1. These results were generated in silico to illustrate the model’s capabilities for computational drug repurposing based on heterogeneous biomedical knowledge. The scores provide a quantitative measure of semantic similarity between each drug-disease pair, with higher values indicating stronger repurposing potential. The scores range from 0.78 to 0.98, representing different levels of relatedness inferred from integrating diverse data sources. While these results are not experimentally validated associations, they illustrate HeteroKGRep’s ability to leverage heterogeneous graphs for computational drug repurposing by generating plausible quantitative similarity scores between drugs and diseases. In future work, we aim to prospectively validate top predictions from our model on real datasets.

Table 1:

Sample drug-disease association scores generated by HeteroKGRep model.

Rank | Drug | Disease | Average Score | Std Deviation | References
1 | Irinotecan | Pancreatic cancer | 0.98 | 0.02 | [32]
2 | Docetaxel | Multiple myeloma | 0.96 | 0.03 | [33, 34]
3 | Gemcitabine | Melanoma | 0.93 | 0.04 | [35, 36]
4 | Doxorubicin | Hodgkin’s lymphoma | 0.90 | 0.05 | [37]
5 | Paclitaxel | Breast cancer | 0.88 | 0.06 | [38, 39]
6 | Cisplatin | Sarcoma | 0.86 | 0.07 | [40]
7 | Etoposide | Acute lymphocytic leukemia | 0.84 | 0.08 | [41]
8 | Carboplatin | Non-Hodgkin’s lymphoma | 0.82 | 0.09 | [42]
9 | Vinorelbine | Prostate cancer | 0.80 | 0.10 | [43]
10 | Oxaliplatin | Colorectal cancer | 0.78 | 0.11 | [44]
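A ranked table with an average score and a standard deviation per pair can be produced by aggregating scores over repeated runs. The sketch below shows one way to do this; how the averages and deviations in Table 1 were actually computed is not stated in the text, so the aggregation over runs, the helper name `rank_pairs`, and the placeholder pairs and scores are all assumptions.

```python
from statistics import mean, stdev


def rank_pairs(scores_by_pair):
    """Aggregate per-run scores into (drug, disease, mean, std) rows,
    sorted by mean score in descending order."""
    rows = [
        (drug, disease, mean(values), stdev(values))
        for (drug, disease), values in scores_by_pair.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical scores from three runs per pair (placeholder names and values).
scores_by_pair = {
    ("drug_A", "disease_X"): [0.80, 0.84, 0.82],
    ("drug_B", "disease_Y"): [0.95, 0.97, 0.99],
    ("drug_C", "disease_Z"): [0.70, 0.90, 0.80],
}
ranking = rank_pairs(scores_by_pair)
for rank, (drug, disease, avg, sd) in enumerate(ranking, start=1):
    print(rank, drug, disease, round(avg, 2), round(sd, 2))
```

Reporting the deviation next to the mean helps distinguish a pair that scores highly in every run from one whose average hides large run-to-run variation.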

We also evaluated HeteroKGRep on its ability to accurately predict novel drug-disease associations, using the F1-score on a held-out test set (Figure 6). In the first epoch, the model achieved an average F1-score of 0.90, demonstrating its capability to identify true positives from the outset. Performance was further refined across epochs, peaking at 0.92 by the third epoch. The score then fluctuated slightly between 0.88 and 0.92 in subsequent epochs, indicating that performance had stabilized. Across the five independent training runs, we observed consistent F1-scores ranging from 0.90 to 0.97, which supports the model’s robustness. Run 3 particularly stood out, maintaining a score of 0.96 throughout all epochs. Overall, an average F1-score of 0.94 across all epochs and runs supports the model’s predictive abilities when learning representations on biomedical knowledge graphs. These results highlight both HeteroKGRep’s capability to identify meaningful associations and its reliability in doing so run after run, and they establish the potential of our approach for knowledge-guided drug repurposing.

Figure 6:

Evaluation of HeteroKGRep predictive performance over epochs and training runs.
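The F1-score reported above combines precision and recall, which makes it more informative than accuracy when true links are rare. The following sketch computes both metrics from binary predictions; the label and prediction vectors are hypothetical and only meant to show how the two metrics can diverge on an imbalanced test set.

```python
def confusion_counts(labels, predictions):
    """Return (tp, fp, fn, tn) for binary labels and predictions."""
    tp = fp = fn = tn = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    tp, fp, fn, tn = confusion_counts(labels, predictions)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(labels, predictions)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced test set: 2 true links among 20 candidate pairs,
# with one of the two true links missed by the classifier.
labels = [1, 1] + [0] * 18
predictions = [1, 0] + [0] * 18
acc = accuracy(labels, predictions)
f1 = f1_score(labels, predictions)
print(acc, round(f1, 4))
```

Here a single missed link barely affects accuracy but lowers the F1-score considerably, which is why the two metrics are reported side by side.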

These substantial gains across multiple metrics affirm that HeteroKGRep leveraging heterogeneous graph encoding via HeteroGNN more precisely represents biomedical entity relationships for improved link prediction compared to prior methods. The source code of our HeteroKGRep model is available at https://github.com/CESKOUTSE/HeteroKGRep/tree/main to facilitate reproducibility of results. We believe our model presents a state-of-the-art approach for mining biomedical knowledge graphs.

5. Conclusions

In this study, we presented HeteroKGRep, a novel approach for leveraging information contained within heterogeneous biomedical knowledge graphs. HeteroKGRep encodes entities and their relationships using a graph neural network (GNN) guided by the graph structure. Our experiments on the key task of drug repositioning showed that HeteroKGRep outperforms recent models on accuracy and AUC-ROC and remains competitive on AUC-PR. These results highlight the potential of our approach to capitalize on the complex information contained within biomedical knowledge graphs. One of the main reasons for using GNNs in the context of biomedical knowledge graphs is their ability to represent and predict relationships between entities. By incorporating the graph structure, HeteroKGRep captures the inherent dependencies and interactions within the knowledge graph, leading to improved predictions. Our quantitative comparison with recent models, DREAMwalk and DrugRep-HeSiaGraph, supports this: HeteroKGRep achieved an accuracy of 0.99 compared to 0.873 and 0.8463, and an AUC-ROC of 0.95 compared to 0.938 and 0.9116. On AUC-PR it achieved 0.92, above DrugRep-HeSiaGraph (0.903) but below DREAMwalk (0.939). These results indicate the enhanced capability of HeteroKGRep in correctly classifying drug-disease links and distinguishing true associations from false ones. To enhance the representation learning capabilities of HeteroKGRep, we introduced an augmentation algorithm that balances the entity distribution in heterogeneous knowledge graphs. This algorithm uses SMOTE, a popular oversampling technique, to generate synthetic data points and address class imbalance. By augmenting the graph with the oversampled data, HeteroKGRep can better capture the underlying patterns and relationships in the knowledge graph, leading to improved performance.

However, we acknowledge the potential limitations of SMOTE in the high-dimensional space of heterogeneous knowledge graphs, as it may introduce noise into the synthetic samples. To address this, we plan to explore complementary data augmentation techniques that leverage the inherent structure and relationships within the knowledge graph, such as combining SMOTE with graph-based augmentation methods. This approach is intended to make the data preprocessing steps, including data augmentation, well suited to the specific characteristics of heterogeneous knowledge graphs and the requirements of the HeteroKGRep model.

Furthermore, we improved the current architecture of the GNN by incorporating an XGBoost step. XGBoost, a powerful gradient boosting algorithm, enables us to leverage the learned embeddings from HeteroKGRep and make accurate predictions of novel drug-disease associations. By combining the strengths of both GNNs and XGBoost, HeteroKGRep achieves enhanced predictive capabilities and contributes to the advancement of drug repurposing research. In addition to drug repositioning, we evaluated the ability of HeteroKGRep to computationally generate semantic similarity scores between drugs and diseases. We utilized HeteroKGRep to predict association scores for a simulated dataset containing 10 drug-disease pairs. These scores provide a quantitative measure of semantic similarity, with higher values indicating stronger repurposing potential. The results demonstrate HeteroKGRep’s capability to leverage heterogeneous graphs for computational drug repurposing by generating plausible quantitative similarity scores between drugs and diseases. While these results are not experimentally validated associations, they serve as a proof-of-concept for the potential application of HeteroKGRep in computational drug repurposing.

In conclusion, this work opens new avenues for efficiently mining biomedical knowledge graphs, which are critical resources in the drug discovery process. HeteroKGRep, with its ability to leverage the complex information within heterogeneous knowledge graphs, has the potential to significantly impact the fields of biomedical and pharmaceutical research. By surpassing the performance of recent models and providing accurate predictions, HeteroKGRep can accelerate the development of new therapies. Future work will focus on further refining HeteroKGRep by integrating additional information, such as temporal or multi-modal data, and exploring complementary data augmentation techniques that leverage the inherent structure of heterogeneous knowledge graphs. This comprehensive approach, which could combine SMOTE with graph-based methods, will address the limitations of SMOTE in high-dimensional spaces and improve the quality of learned representations. Additionally, applying HeteroKGRep to other types of graphs will allow for generalization and widen its applicability in diverse biomedical domains. Overall, HeteroKGRep represents a promising approach for harnessing the power of heterogeneous biomedical knowledge graphs in drug discovery and beyond. Finally, incorporating advanced graph neural network (GNN) techniques into the HeteroKGRep model for drug repurposing recommendation holds promise for enhancing representation learning, improving predictive performance, gaining better contextual understanding, and efficiently learning entity embeddings. By leveraging the capabilities of advanced GNN architectures to capture intricate relationships within heterogeneous knowledge graphs, the HeteroKGRep model could potentially achieve more accurate and contextually relevant recommendations, thereby advancing the field of drug repurposing research. 
Integrating these advanced GNN methodologies stands as a compelling avenue for future work, offering the potential to significantly enhance the model’s capabilities and further improve its effectiveness in leveraging heterogeneous knowledge graph data for drug repurposing recommendation tasks.

Acknowledgments

This work was supported by the National Science Foundation [NSF OIA-1849206, OIA-1920954]; and the National Institutes of Health [5P20GM103443-20].

Footnotes

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Declaration of Interest statement

The authors declare that they have no conflict of interest.

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent was obtained from all individual participants included in the study.

The present study is based on synthesized data generated randomly by the authors based on some parameters mentioned in the manuscript.

References

  • [1] Cantürk S, Singh A, St-Amant P, Behrmann J, Machine-learning driven drug repurposing for covid-19, arXiv preprint arXiv:2006.14707 (2020).
  • [2] Yingngam B, Machine learning applications for drug repurposing, Artificial Intelligence and Machine Learning in Drug Design and Development (2024) 251–294.
  • [3] Papikinos T, Krokidis MG, Vrahatis AG, Vlachakis D, Vlamos P, Exarchos TP, Deep learning methods for drug repurposing through heterogeneous data, in: Advances in Artificial Intelligence, Elsevier, 2024, pp. 295–313.
  • [4] Zhang C, Song D, Huang C, Swami A, Chawla NV, Heterogeneous graph neural network, in: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 793–803.
  • [5] Gao Z, Ding P, Xu R, Iuphar review–data-driven computational drug repurposing approaches for opioid use disorder, Pharmacological Research 199 (2024) 106960.
  • [6] Womack F, McClelland J, Koslicki D, Leveraging distributed biomedical knowledge sources to discover novel uses for known drugs, bioRxiv (2019) 765305.
  • [7] Wongvorachan T, He S, Bulut O, A comparison of undersampling, oversampling, and smote methods for dealing with imbalanced classification in educational data mining, Information 14 (1) (2023) 54.
  • [8] Sosa DN, Derry A, Guo M, Wei E, Brinton C, Altman RB, A literature-based knowledge graph embedding method for identifying drug repurposing opportunities in rare diseases, in: Pacific Symposium on Biocomputing 2020, World Scientific, 2019, pp. 463–474.
  • [9] Bougiatiotis K, Aisopos F, Nentidis A, Krithara A, Paliouras G, Drug-drug interaction prediction on a biomedical literature knowledge graph, in: Artificial Intelligence in Medicine: 18th International Conference on Artificial Intelligence in Medicine, AIME 2020, Minneapolis, MN, USA, August 25–28, 2020, Proceedings 18, Springer, 2020, pp. 122–132.
  • [10] Cesario E, Comito C, Zumpano E, A survey of the recent trends in deep learning for literature based discovery in the biomedical domain, Neurocomputing 568 (2024) 127079.
  • [11] Bang D, Lim S, Lee S, Kim S, Biomedical knowledge graph learning for drug repurposing by extending guilt-by-association to multiple layers, Nature Communications 14 (1) (2023) 3570.
  • [12] Pan L, Shi C, Dokmanić I, Neural link prediction with walk pooling, in: International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=CCu6RcUMwK0
  • [13] Meng Y, Wang Y, Xu J, Lu C, Tang X, Peng T, Zhang B, Tian G, Yang J, Drug repositioning based on weighted local information augmented graph neural network, Briefings in Bioinformatics 25 (1) (2024) bbad431.
  • [14] Tang X, Zhou C, Lu C, Meng Y, Xu J, Hu X, Tian G, Yang J, Enhancing drug repositioning through local interactive learning with bilinear attention networks, IEEE Journal of Biomedical and Health Informatics (2023).
  • [15] Zhang M, Chen Y, Link prediction based on graph neural networks, Advances in neural information processing systems 31 (2018).
  • [16] Zhao B-W, Su X-R, Hu P-W, Ma Y-P, Zhou X, Hu L, A geometric deep learning framework for drug repositioning over heterogeneous information networks, Briefings in Bioinformatics 23 (6) (2022) bbac384.
  • [17] Muniyappan S, Rayan AXA, Varrieth GT, Egerepdr: An enhanced genetic-based representation learning for drug repurposing using multiple biomedical sources, Journal of Biomedical Informatics 147 (2023) 104528.
  • [18] Zeng P, Zhang B, Liu A, Meng Y, Tang X, Yang J, Xu J, Drug repositioning based on tripartite cross-network embedding and graph convolutional network, Expert Systems with Applications 252 (2024) 124152.
  • [19] Peng L, Huang L, Su Q, Tian G, Chen M, Han G, Lda-vghb: identifying potential lncrna–disease associations with singular value decomposition, variational graph auto-encoder and heterogeneous newton boosting machine, Briefings in Bioinformatics 25 (1) (2024) bbad466.
  • [20] Ghorbanali Z, Zare-Mirakabad F, Salehi N, Akbari M, Masoudi-Nejad A, Drugrephesiagraph: when heterogenous siamese neural network meets knowledge graphs for drug repurposing, BMC bioinformatics 24 (1) (2023) 374.
  • [21] Zhao B-W, Wang L, Hu P-W, Wong L, Su X-R, Wang B-Q, You Z-H, Hu L, Fusing higher and lower-order biological information for drug repositioning via graph representation learning, IEEE Transactions on Emerging Topics in Computing 12 (1) (2023) 163–176.
  • [22] Feng F, Tang F, Gao Y, Zhu D, Li T, Yang S, Yao Y, Huang Y, Liu J, Genomickb: a knowledge graph for the human genome, Nucleic Acids Research 51 (D1) (2023) D950–D956.
  • [23] Lee J-S, Exploring cancer genomic data from the cancer genome atlas project, BMB reports 49 (11) (2016) 607.
  • [24] Stanfill AG, Cao X, Enhancing research through the use of the genotype-tissue expression (gtex) database, Biological research for nursing 23 (3) (2021) 533–540.
  • [25] den Dunnen JT, Fokkema IF, Data sharing and gene variant databases, in: Clinical DNA Variant Interpretation, Elsevier, 2021, pp. 221–236.
  • [26] Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC, Connor R, Funk K, Kelly C, Kim S, et al., Database resources of the national center for biotechnology information, Nucleic acids research 50 (D1) (2022) D20.
  • [27] Kim K-D, Lieberman PM, Viral remodeling of the 4d nucleome, Experimental & Molecular Medicine (2024) 1–10.
  • [28] Rashidi Nasab A, Elzarka H, Optimizing machine learning algorithms for improving prediction of bridge deck deterioration: A case study of ohio bridges, Buildings 13 (6) (2023) 1517.
  • [29] Ruiz C, Zitnik M, Leskovec J, Identification of disease treatment mechanisms through the multiscale interactome, Nature communications 12 (1) (2021) 1796.
  • [30] Himmelstein DS, Lizee A, Hessler C, Brueggeman L, Chen SL, Hadley D, Green A, Khankhanian P, Baranzini SE, Systematic integration of biomedical knowledge prioritizes drugs for repurposing, Elife 6 (2017) e26726.
  • [31] Kanehisa M, Goto S, Kegg: kyoto encyclopedia of genes and genomes, Nucleic acids research 28 (1) (2000) 27–30.
  • [32] Pizzolato JF, Saltz LB, Irinotecan (campto®) in the treatment of pancreatic cancer, Expert Review of Anticancer Therapy 3 (5) (2003) 587–593.
  • [33] Friedenberg WR, Graham D, Greipp P, Blood E, Winston RD, The treatment of multiple myeloma with docetaxel (an ecog study), Leukemia research 27 (8) (2003) 751–754.
  • [34] Liu B-L, Liu X, Qi M-Y, Zhou N-C, Xu B, Effects of docetaxel on proliferation and apoptosis of human multiple myeloma cell rpmi8226, Zhongguo shi yan xue ye xue za zhi 20 (6) (2012) 1378–1383.
  • [35] Pföhler C, Cree IA, Ugurel S, Kuwert C, Haass N, Neuber K, Hengge U, Corrie PG, Zutt M, Tilgen W, et al., Treosulfan and gemcitabine in metastatic uveal melanoma patients: results of a multicenter feasibility study, Anti-cancer drugs 14 (5) (2003) 337–340.
  • [36] Zhang Y, Bush X, Yan B, Chen JA, Gemcitabine nanoparticles promote antitumor immunity against melanoma, Biomaterials 189 (2019) 48–59.
  • [37] Boell B, Pluetschow A, Buerkle C, Atta J, Pfreundschuh M, Feuring-Buske M, Vogelhuber M, Soekler M, Eichenauer DA, Thielen I, et al., Doxorubicin, vinblastine, dacarbazine and lenalidomide for older hodgkin lymphoma patients: final results of a german hodgkin study group (ghsg) phase-i trial, British journal of haematology 185 (1) (2019) 42–52.
  • [38] Yamamoto Y, Kawano I, Iwase H, Nab-paclitaxel for the treatment of breast cancer: efficacy, safety, and approval, OncoTargets and therapy (2011) 123–136.
  • [39] Jones S, Erban J, Overmoyer B, Budd G, Hutchins L, Lower E, Laufman L, Sundaram S, Urba W, Pritchard K, et al., Randomized phase iii study of docetaxel compared with paclitaxel in metastatic breast cancer, Journal of Clinical Oncology 23 (24) (2005) 5542–5551.
  • [40] Pautier P, Floquet A, Gladieff L, Bompas E, Ray-Coquard I, Piperno-Neumann S, Selle F, Guillemet C, Weber B, Largillier R, et al., A randomized clinical trial of adjuvant chemotherapy with doxorubicin, ifosfamide, and cisplatin followed by radiotherapy versus radiotherapy alone in patients with localized uterine sarcomas (sarcgyn study). a study of the french sarcoma group, Annals of Oncology 24 (4) (2013) 1099–1104.
  • [41] Whitlock J, dalla Pozza L, Goldberg JM, Silverman LB, Ziegler DS, Attarbaschi A, Brown P, Gardner RA, Gaynon PS, Hutchinson RJ, et al., Nelarabine in combination with etoposide and cyclophosphamide is active in first relapse of childhood t-acute lymphocytic leukemia (t-all) and t-lymphoblastic lymphoma (t-ll), Blood 124 (21) (2014) 795.
  • [42] Hertzberg M, Crombie C, Benson W, Taper J, Gottlieb D, Bradstock K, Outpatient-based ifosfamide, carboplatin and etoposide (ice) chemotherapy in transplant-eligible patients with non-hodgkin’s lymphoma and hodgkin’s disease, Annals of oncology 14 (2003) i11–i16.
  • [43] Nakabayashi M, Ling J, Xie W, Regan MM, Oh WK, Response to vinorelbine with or without estramustine as second-line chemotherapy in patients with hormone-refractory prostate cancer, The Cancer Journal 13 (2) (2007) 125–129.
  • [44] Comella P, Casaretti R, Sandomenico C, Avallone A, Franco L, Role of oxaliplatin in the treatment of colorectal cancer, Therapeutics and clinical risk management (2009) 229–238.
