2022 Mar 11;8(1):67–76. doi: 10.1007/s41066-022-00315-4

Enhancing drug–drug interaction prediction by three-way decision and knowledge graph embedding

Xinkun Hao 1,3, Qingfeng Chen 1,2,3, Haiming Pan 1,3, Jie Qiu 1,3, Yuxiao Zhang 1,3, Qian Yu 1,3, Zongzhao Han 1,3, Xiaojing Du 1,3
PMCID: PMC8913867  PMID: 38624759

Abstract

Drug–drug interaction (DDI) prediction is essential in pharmaceutical research and clinical application. Existing computational methods mainly extract data from multiple resources and treat the task as binary classification. However, binary classification cannot draw an unambiguous boundary between positive and negative samples, owing to the incompleteness and uncertainty of the derived data. A granular computing method called three-way decision has proved effective for decision-making under uncertainty, but it relies on supplementary information to make the delayed decision. Recently, biomedical knowledge graphs have been regarded as an important source of abundant supplementary information about drugs. This paper proposes a three-way decision-based method called 3WDDI, which uses knowledge graph embedding as a supplementary feature to enhance DDI prediction. Drug pairs are divided into positive, negative and boundary regions by a convolutional neural network (CNN) according to the drug chemical structure feature. A delayed decision is then made for objects in the boundary region by integrating the knowledge graph embedding feature, which improves the accuracy of decision-making. The empirical results show that 3WDDI yields up to 0.8922, 0.9614, 0.9582 and 0.8930 for Accuracy, AUPR, AUC and F1-score, respectively, and outperforms several baseline models.

Keywords: Drug–drug interactions, Three-way decision, Knowledge graph, Convolutional neural network

Introduction

Polypharmacy is increasingly common in real-world clinical settings, especially for elderly and cancer patients who suffer from multiple diseases and need complex drug treatments. However, unexpected drug–drug interactions (DDIs) frequently occur when several drugs are taken together, and lead to adverse drug reactions. Unexpected DDIs may threaten the patient’s health and give rise to re-hospitalization, which directly or indirectly places a high burden on the medical system. Therefore, it is vitally important to identify potential drug–drug interactions before drugs are put on the market for clinical use.

Traditional methods for DDI prediction mainly depend on in vivo and in vitro experiments, which are usually labor-intensive and limited in experimental scale. In recent years, with the development of computer technology and the establishment of various drug-related databases, a number of data-driven methods for DDI prediction have been proposed and have yielded considerable results. Machine learning and deep learning methods are notable for their good performance. They tend to extract multiple drug-related features from open-source databases, such as DrugBank (Law et al. 2013) and KEGG-DRUG (Kanehisa et al. 2015). The derived features usually consist of drug molecular fingerprints, drug target information, drug phenotype information and so on. However, obtaining comprehensive drug-related information from diverse data sources is labor-intensive, and the quality of the collected information is difficult to guarantee.

Recently, knowledge graphs have been widely studied and used in various research fields (Ji et al. 2021) for knowledge representation and knowledge integration. Several large-scale domain knowledge graphs have also been established in bioinformatics, such as Bio2RDF (Callahan et al. 2013) and DRKG (Ioannidis et al. 2020). Existing methods mainly utilize well-established knowledge graphs or integrate knowledge from multiple data sources to construct knowledge graphs (Abdelaziz et al. 2017; Karim et al. 2019; Dai et al. 2020). A knowledge graph embedding model (or knowledge representation learning model) is trained on the knowledge graph to generate embedding vectors for entities and relations. These vectors serve as external knowledge that can enhance traditional machine learning and deep learning methods for DDI prediction, and many studies have shown that knowledge graph embedding vectors are a powerful feature.

Whether they obtain drug features from multiple data sources or use knowledge graph embedding as the feature, existing methods ideally assume that detailed and complete information about a drug can be obtained all at once, so that an immediate decision can be made on the existence of a DDI. However, only the chemical structure information of a drug is easy to obtain, while other information, such as targets and pathways, requires further experiments to discover. As a result, it is not easy to decide whether two drugs interact based on incomplete information, and modeling the problem directly as binary classification may hinder the accuracy of DDI prediction due to the uncertainty and incompleteness of drug information. In recent years, a granular computing method called three-way decision, which aims to solve uncertain decision-making, has been proposed and widely applied in artificial intelligence, for example in image recognition (Huaxiong et al. 2016) and recommendation systems (Zhang et al. 2017a). The main idea of three-way decision is to divide the objects into three regions instead of the two regions of binary classification, and to delay the decision for an additional region, called the boundary region, until more information is obtained. To model the uncertain decision boundary and improve the accuracy of DDI prediction, we therefore introduce three-way decision for DDI prediction. The delayed decision of three-way decision relies on a supplementary feature. Here, we use the knowledge graph embedding vector, which contains global and abundant knowledge about drugs, as this additional feature.

In this paper, we treat the knowledge graph as auxiliary information and combine it with three-way decision to enhance drug–drug interaction prediction. The chemical structure information of drugs is collected to generate a drug structure feature for each drug pair in the training set. A CNN model with strong feature extraction ability is adopted to divide the training data into positive, negative and boundary regions based on the drug structure feature. Unlike binary classification, no immediate decision is made for the samples in the boundary region. To achieve a more accurate decision on drug–drug interactions in the boundary region, knowledge graph embedding is trained to represent a drug pair, and a further decision is made by an enhancing model. The knowledge graph embedding feature shows strong representation ability and is significantly different from the drug structure feature. The experimental results demonstrate that the delayed decision based on the knowledge graph embedding feature produces more confident predictions for drug pairs in the boundary region.

To summarize, this paper makes the following contributions:

  • We introduce three-way decision to solve the uncertain decision of DDI prediction and utilize the knowledge graph embedding as supplementary feature to enhance DDI prediction.

  • We propose a novel method called 3WDDI to predict the existence of drug–drug interaction.

  • We compare our method with several classical models and state-of-the-art works on a popular dataset. The results show the proposed 3WDDI outperforms the baselines.

The remainder of this paper is organized as follows. In Sect. 2, we briefly introduce the background and related work. In Sect. 3, the proposed method 3WDDI is introduced. We report the experimental results and analysis in Sect. 4. The conclusions are offered in Sect. 5.

Related work

This section introduces the research background and relevant works, including DDI prediction, knowledge graph embedding and three-way decision.

DDI prediction

In the past few years, many works have extracted drug-related information from existing databases to construct drug pair representations and trained machine learning models to predict drug–drug interactions. They mainly model the DDI prediction problem as binary classification: given a representation of a drug pair, the machine learning model calculates the probability that the drug interaction exists. We roughly classify current methods into two categories: traditional machine learning-based methods and deep learning-based methods. The former focus on effective drug pair representation. Gottlieb et al. (2012) constructed drug feature vectors based on seven types of drug–drug similarities to describe drug–drug pairs, and then applied a logistic regression model to predict DDIs. Cheng and Zhao (2014) proposed a heterogeneous network-assisted inference framework that integrates multiple similarity features and applies five classification models. Zhang et al. (2015) built a high-order similarity weight network and used a semi-supervised label propagation algorithm to predict drug–drug interactions. Ferdousi et al. (2017) predicted drug–drug interactions from the functional similarity features of drugs, including carriers, transporters, enzymes and targets (CTET). Zhang et al. (2017b) adopted an ensemble learning model to predict DDIs and evaluated the contribution of different features extracted from multiple data sources. Kastrin et al. (2018) introduced several statistical machine learning models to predict DDIs in terms of semantic similarity and topological similarity.

Traditional machine learning-based methods provide a comprehensive guideline for information extraction and feature construction for drugs. However, feature engineering is often time-consuming and labor-intensive. Recently, deep learning models have been widely studied in many fields due to their powerful ability to model complicated relations and extract high-order features. Ryu et al. (2018) proposed a deep learning model called DeepDDI, which treats the chemical drug substructure fingerprint (SSP) as the input feature to predict 86 types of DDIs. In the same way, Lee et al. (2019) integrated three types of drug similarity fingerprints, SSP, GSP and TSP, and introduced an autoencoder to mitigate the curse of dimensionality caused by the additional fingerprints. Rohani and Eslahchi (2019) constructed several drug similarity matrices based on substructure, target, side effect, off-label side effect, pathway, transporter, and indication data. A subset of those matrices is then selected by a heuristic method and integrated into an ensemble similarity matrix. Finally, the drug representation given by the ensemble similarity matrix is fed into a deep neural network to predict DDIs. Zhang et al. (2020) proposed DDI-MDAT, a multi-modality drug representation learning method based on an autoencoder, and adopted a positive-unlabeled (PU) learning setting for higher accuracy. Deng et al. (2020) designed a multi-modal deep learning framework and generated drug molecular structure, target, pathway and enzyme fingerprints to describe a drug pair.

The aforementioned methods have achieved remarkable results in DDI prediction. However, they usually predict DDIs with binary classification models, which cannot draw a clear decision boundary due to the uncertainty and incompleteness of drug information. Further, the process of feature construction is labor-intensive, and the information extracted from different data sources lacks consistency and a global view. In contrast, we account for the uncertainty of real-world drug information by introducing a three-way decision classification framework to model the uncertain decision boundary for DDI prediction, and we utilize knowledge graph embedding as the input feature of the delayed decision. In Sects. 2.2 and 2.3, we give a brief introduction to knowledge graph embedding in DDI prediction and to applications of three-way decision.

Knowledge graph embedding

Knowledge graphs, a form of structured human knowledge, have aroused widespread interest from both academia and industry (Ji et al. 2021). A knowledge graph is composed of fact triples $(h, r, t)$, where $h$ and $t$ are the head entity and tail entity, respectively, and $r$ denotes the relation between $h$ and $t$. From the perspective of graph theory, a knowledge graph is a heterogeneous graph, consisting of different types of nodes and edges corresponding to the entities and relations in the knowledge graph. To utilize the rich knowledge contained in a knowledge graph, we need knowledge graph representation learning, namely knowledge graph embedding, which aims to map entities and relations into low-dimensional vectors while capturing their semantic information. The score function is a key component of knowledge graph embedding models. Current models can be roughly classified into two categories according to the score function: translation distance-based models and semantic similarity-based models.

Translation distance-based models, such as TransE (Bordes et al. 2013), TransH (Wang et al. 2014) and TransR (Lin et al. 2015), measure the plausibility of fact triples by calculating the distance between entities, where the additive translation $h + r \approx t$ is widely used. Semantic similarity-based models measure the plausibility of fact triples by semantic matching. They usually adopt a multiplicative formulation, i.e., $h^{\top} M_r \approx t^{\top}$, to transform the head entity near the tail in the representation space. Typical models of this class include DistMult (Yang et al. 2014) and ComplEx (Trouillon et al. 2016). Knowledge graphs have been studied and applied in many fields, such as recommendation systems and question answering systems. There are also applications of knowledge graphs in bioinformatics (Lan et al. 2021b, 2022), and several works have introduced knowledge graph embedding to DDI prediction. Abdelaziz et al. (2017) constructed a knowledge graph from multiple data sources and proposed a framework named Tiresias that generates drug pair representations according to the cosine similarity of knowledge graph embedding vectors. Karim et al. (2019) applied a knowledge graph as a tool for data integration to overcome data skewness, and trained ComplEx over the knowledge graph to obtain embedding vectors; DDIs are then predicted by a convolutional-LSTM model whose only input is the knowledge graph embedding. Instead of applying a knowledge graph in feature generation, Dai et al. (2020) modeled multiple DDI data as a knowledge graph and designed a knowledge graph embedding model based on a Wasserstein adversarial autoencoder, treating DDI prediction as link prediction over the knowledge graph. The above works show that knowledge graphs are a promising tool for many tasks. Drug-related knowledge graphs are established from information derived from several different databases. They contain more detailed information about drugs and their interactions with other entities, such as genes and proteins, which enhances the performance of DDI prediction. The knowledge graph embedding feature can thus provide more consistent and global information about drugs.
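The two families of score functions described in this section can be illustrated with a minimal sketch. It assumes real-valued embeddings given as plain Python lists, uses the negative Euclidean distance as a TransE-style score, and uses the bilinear form $h^{\top} M_r t$ as the semantic matching score, with DistMult as the special case of a diagonal $M_r$. The function names and toy vectors are illustrative only and are not taken from any of the cited implementations.

```python
import math
from typing import Sequence


def transe_score(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """Translation distance score: closest to 0 when h + r lies on t."""
    return -math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h, r, t)))


def bilinear_score(h: Sequence[float], m_r: Sequence[Sequence[float]], t: Sequence[float]) -> float:
    """Semantic matching score h^T M_r t with a full relation matrix M_r."""
    return sum(h[i] * m_r[i][j] * t[j] for i in range(len(h)) for j in range(len(t)))


def distmult_score(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """DistMult restricts M_r to a diagonal matrix, stored here as the vector r."""
    return sum(a * b * c for a, b, c in zip(h, r, t))


# Toy 2-dimensional embeddings (hypothetical values).
h, t = [1.0, 2.0], [3.0, 4.0]
exact_translation = transe_score(h, [2.0, 2.0], t)   # h + r equals t
poor_translation = transe_score(h, [0.0, 0.0], t)    # h + r is far from t
diag_score = distmult_score(h, [2.0, 3.0], t)
full_score = bilinear_score(h, [[2.0, 0.0], [0.0, 3.0]], t)
```

With a diagonal relation matrix, the bilinear score and the DistMult score coincide, which is why DistMult needs only one vector per relation.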

Three-way decision

Three-way decision, a new interpretation of rules in rough set theory proposed by Yao (2010, 2021), aims to deal with complex and uncertain decision-making. It has been widely applied in several domains of artificial intelligence. Zhou et al. (2014) introduced three-way decision for e-mail spam filtering. Huaxiong et al. (2016) applied it to face recognition and designed a cost-sensitive sequential three-way decision framework. Zhang et al. (2017a) designed a regression-based three-way recommender system that minimizes the average cost by adjusting the thresholds for different behaviors. Li et al. (2017) developed a three-way decision model that handles the uncertain boundary to improve binary text classification performance, based on rough set techniques and a centroid solution. Zhang et al. (2019) introduced a three-way enhanced convolutional neural network model named 3W-CNN for sentence-level sentiment classification. Yu et al. (2020) proposed an active three-way clustering method to model uncertain relations between objects and improve the accuracy of high-dimensional multi-view clustering. Li and Huang (2020) integrated three-way decision into a fuzzy condition decision information system for credit card evaluation. Zhang et al. (2021) introduced three-way decision into traditional k-means clustering to combine knowledge of set-pair information granules. The above works enrich the theoretical foundation of three-way decision and indicate that it is capable of handling many practical decision problems. This paper introduces three-way decision to model the uncertain decision boundary of DDI prediction. Further, we treat the knowledge graph embedding feature as the delayed-decision feature of three-way decision for more accurate prediction.

Drug–drug interaction prediction using three-way decision

In this section, we first provide an overview of the proposed 3WDDI framework in Sect. 3.1. The construction of the drug chemical structure feature is presented in Sect. 3.2. The acquisition of knowledge graph embedding features is then described in Sect. 3.3. Finally, the two components of three-way decision, boundary division and delayed decision, are presented in Sects. 3.4 and 3.5, respectively.

Overview

The procedure of our proposed method is shown in Fig. 1. It takes the drug chemical structure feature as the primary feature and the knowledge graph embedding of drugs in the KG as the supplementary feature, and predicts the interaction value for a drug–drug pair using a three-way decision classification model. Our method consists of the following steps for DDI prediction:

  1. drug chemical structure construction;

  2. knowledge acquisition;

  3. three-way decision and boundary division;

  4. enhancing module.

In step 1, we collect the SMILES (Simplified Molecular-Input Line-Entry System) string of each drug from the dataset and compute the drug chemical structure feature. In step 2, we train a knowledge graph embedding model on a large-scale biomedical knowledge graph to generate embeddings for the candidate drugs. In step 3, based on the chemical structure features of the drug pairs, we divide the training samples into three regions: positive, negative and boundary. The boundary region contains samples that cannot be immediately classified as positive or negative. To handle the samples in the boundary region, in step 4 we feed them to an enhancing module that takes the knowledge graph embedding as its input feature and produces the final result. The final classification score is calculated from the results of step 3 and step 4. Next, we describe the proposed method in more detail.

Fig. 1. Overview of 3WDDI

Drug chemical structure feature construction

To train a deep learning model, we need to represent a drug pair as a continuous vector. Drug chemical structure information is critical and easy to obtain, and drug features based on chemical structure have already been used successfully in many DDI prediction studies (Cheng and Zhao 2014; Zhang et al. 2015; Ryu et al. 2018). Following Zhang et al. (2017b), we construct the drug feature based on 881 types of chemical substructures, also known as chemical fingerprints, defined in PubChem (Li et al. 2010). Each drug can be represented as an 881-dimensional bit vector in which the value 1 or 0 denotes the presence or absence, respectively, of the corresponding chemical substructure, such as –NH2 or –CH3.

We can obtain a high-dimensional and sparse feature vector for each candidate drug. To get a low-dimensional and dense drug feature vector, we calculate the pairwise drug–drug similarity from the bit vectors using the Jaccard similarity measure. Given drugs $d_i$ and $d_j$ in the candidate drug set $D$, and their feature vectors $V_i$ and $V_j$, the Jaccard similarity is defined as:

$$S(V_i, V_j) = \frac{M_{11}}{M_{01} + M_{10} + M_{11}}, \tag{1}$$

where $M_{11}$ is the number of chemical substructures shared by $d_i$ and $d_j$, i.e., the number of dimensions where $V_i$ and $V_j$ both have the value 1; $M_{01}$ is the number of substructures that $d_j$ has but $d_i$ does not; and $M_{10}$ is the number of substructures that $d_i$ has but $d_j$ does not.

From the Jaccard similarities, we obtain a $|D| \times |D|$ pairwise drug–drug similarity matrix $S$, where the element $S_{ij}$ is the similarity between $d_i$ and $d_j$. Each drug $d_i$ can then be represented by the $|D|$-dimensional dense row vector $F_i^s$ of matrix $S$. As a result, for each drug pair $(d_i, d_j)$ in the dataset, we concatenate $F_i^s$ and $F_j^s$ to get the drug pair representation $[F_i^s; F_j^s]$. This drug chemical structure feature is used as the input of the decision function in Sect. 3.4.
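The construction above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: fingerprints are plain 0/1 lists, the toy vectors have 6 bits instead of 881, two all-zero fingerprints are assigned similarity 0.0 (a case the text does not define), and the helper names `jaccard`, `similarity_matrix` and `pair_feature` are hypothetical.

```python
from typing import List, Sequence


def jaccard(v_i: Sequence[int], v_j: Sequence[int]) -> float:
    """Jaccard similarity of two equal-length 0/1 fingerprint vectors (Eq. 1)."""
    if len(v_i) != len(v_j):
        raise ValueError("fingerprints must have the same length")
    m11 = sum(1 for a, b in zip(v_i, v_j) if a == 1 and b == 1)
    m10 = sum(1 for a, b in zip(v_i, v_j) if a == 1 and b == 0)
    m01 = sum(1 for a, b in zip(v_i, v_j) if a == 0 and b == 1)
    denom = m01 + m10 + m11
    # Assumption: two all-zero fingerprints get similarity 0.0.
    return m11 / denom if denom else 0.0


def similarity_matrix(fingerprints: List[Sequence[int]]) -> List[List[float]]:
    """|D| x |D| matrix S whose row i is the dense feature of drug i."""
    return [[jaccard(a, b) for b in fingerprints] for a in fingerprints]


def pair_feature(sim: List[List[float]], i: int, j: int) -> List[float]:
    """Concatenate rows i and j of S to represent the drug pair (d_i, d_j)."""
    return list(sim[i]) + list(sim[j])


# Toy fingerprints for three hypothetical drugs.
toy = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1],
]
S = similarity_matrix(toy)
feature_01 = pair_feature(S, 0, 1)
```

For the first two toy drugs, two substructures are shared and one is unique to each, so their similarity is 2 / (1 + 1 + 2) = 0.5, and the pair feature has length 2|D| = 6.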

Knowledge acquisition

Compared with the drug structure feature in Sect. 3.2, knowledge graph embedding feature is a distinctive feature containing global information of drugs. This section introduces how to generate the knowledge graph embedding feature for each drug in dataset.

We download a large-scale biomedical knowledge graph, DRKG (Ioannidis et al. 2020), which covers all drugs in our dataset and other biological entities, such as genes, diseases and pathways, as well as the relationships among them. A popular knowledge graph embedding model, ComplEx (Trouillon et al. 2016), is used to generate embedding vectors for all entities and relations in DRKG. We denote the knowledge graph as $G = (N_e, N_r)$, where $N_e$ and $N_r$ are the set of entities and the set of relations, respectively. $G$ is composed of entity–relation–entity triples $T_i = (h_i, r_i, t_i)$, where $h_i, t_i \in N_e$ and $r_i \in N_r$. For each entity $e \in N_e$ and relation $r \in N_r$, a knowledge graph embedding model aims to generate embedding vectors $\mathbf{e}_e \in \mathbb{R}^{d_e}$ and $\mathbf{e}_r \in \mathbb{R}^{d_r}$, where $d_e$ and $d_r$ are the dimensions of $\mathbf{e}_e$ and $\mathbf{e}_r$, respectively ($\mathbf{e}_x$ denotes the embedding vector of object $x$). Each embedding model has a scoring function $f: N_e \times N_r \times N_e \rightarrow \mathbb{R}$ that assigns a score $f(h_i, r_i, t_i)$ to a possible triple $(h_i, r_i, t_i)$, indicating its plausibility. Models are trained such that, for every fact triple $(h_i, r_i, t_i) \in T$ and fake triple $(h_i', r_i', t_i') \notin T$, the assigned scores satisfy $f(h_i, r_i, t_i) > 0$ and $f(h_i', r_i', t_i') < 0$. A scoring function is generally a function of $\mathbf{e}_h, \mathbf{e}_r, \mathbf{e}_t$.

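In this paper, the popular semantic similarity-based model ComplEx is applied to train embedding vectors for entities and relations. ComplEx represents entities and relations in complex space to model multiple types of complicated relations. Given $h, t \in N_e$ and $r \in N_r$, it generates embedding vectors $\mathbf{e}_h, \mathbf{e}_r, \mathbf{e}_t \in \mathbb{C}^{d}$ and defines the scoring function as:

$$f(h, r, t) = \mathrm{Re}\left(\langle \mathbf{e}_h, \mathbf{e}_r, \bar{\mathbf{e}}_t \rangle\right) = \mathrm{Re}\left(\sum_{k=1}^{d} \mathbf{e}_h^{(k)} \mathbf{e}_r^{(k)} \bar{\mathbf{e}}_t^{(k)}\right), \tag{2}$$

where $f(h, r, t) > 0$ for fact triples and $f(h, r, t) < 0$ for fake triples, $\mathrm{Re}(\cdot)$ denotes the real part of a complex number, and $\bar{\mathbf{e}}_t$ is the complex conjugate of the vector $\mathbf{e}_t$.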
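As a minimal sketch of the scoring function in Eq. 2, the following Python function evaluates it on embeddings stored as lists of built-in complex numbers. The function name `complex_score` and the toy 2-dimensional vectors are hypothetical, and the sketch covers only the score itself, not the training procedure.

```python
from typing import Sequence


def complex_score(e_h: Sequence[complex],
                  e_r: Sequence[complex],
                  e_t: Sequence[complex]) -> float:
    """Real part of sum_k e_h[k] * e_r[k] * conj(e_t[k]), as in Eq. 2."""
    if not (len(e_h) == len(e_r) == len(e_t)):
        raise ValueError("all embeddings must have the same dimension")
    total = sum(h * r * t.conjugate() for h, r, t in zip(e_h, e_r, e_t))
    return complex(total).real


# Hypothetical 2-dimensional complex embeddings.
e_h = [1 + 1j, 0.5 - 1j]
e_r = [1j, 2 + 0j]
e_t = [1 - 1j, 1 + 0j]

forward = complex_score(e_h, e_r, e_t)   # score of (h, r, t)
reverse = complex_score(e_t, e_r, e_h)   # score of (t, r, h)
```

Because the tail embedding is conjugated, swapping head and tail generally changes the score when the relation embedding has a non-zero imaginary part. In this toy case the two directions score -1.0 and 3.0, so the function can treat a relation asymmetrically.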

With the ComplEx model, we obtain an embedding vector for each entity and relation in the biomedical knowledge graph. For each drug $d_i$ in the dataset, we can find a corresponding compound entity $e_i$ in the knowledge graph $G$, and we use the embedding vector $\mathbf{e}_i$ of entity $e_i$ as the representation of drug $d_i$. With the embeddings trained according to Eq. 2, for each drug pair $(d_i, d_j)$ we concatenate $\mathbf{e}_i$ and $\mathbf{e}_j$ to yield the drug pair representation $[\mathbf{e}_i; \mathbf{e}_j]$, which is regarded as a new feature for boundary samples and fed to the enhancing model for further decision. For consistency of notation, we denote the drug pair knowledge graph embedding feature $[\mathbf{e}_i; \mathbf{e}_j]$ as $[F_i^e; F_j^e]$.

Three-way decision and boundary division

This section describes the theory of three-way decision and discusses how to divide the decision boundary based on the drug chemical structure feature of Sect. 3.2.

Three-way decision is originally derived from rough set theory. In rough set theory, an object $x$ belongs to either set $C$ or $\neg C$, where $C \cup \neg C = U$ and $U$ is a finite non-empty set called the universe. For each object $x$, there are three decisions to choose from: assigning $x$ to the positive region POS (predicting $x \in C$), the negative region NEG (predicting $x \notin C$) or the boundary region BND (no prediction). According to the real label and the decision made for the object $x$, there are six decision actions and six corresponding costs (Table 1): $\lambda_{PP}$, $\lambda_{BP}$, $\lambda_{NP}$ and $\lambda_{PN}$, $\lambda_{BN}$, $\lambda_{NN}$. $\lambda_{PP}$, $\lambda_{BP}$ and $\lambda_{NP}$ denote the costs of assigning $x$ to POS, BND and NEG, respectively, when $x \in C$. $\lambda_{PN}$, $\lambda_{BN}$ and $\lambda_{NN}$ denote the costs of assigning $x$ to POS, BND and NEG, respectively, when $x \in \neg C$. In practical applications, the cost values are given by domain experts.

Table 1. Cost matrix of three-way decision

Actual class   POS(C)           BND(C)           NEG(C)
C              $\lambda_{PP}$   $\lambda_{BP}$   $\lambda_{NP}$
¬C             $\lambda_{PN}$   $\lambda_{BN}$   $\lambda_{NN}$

Generally, the inequalities $\lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $\lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$ should be satisfied. The first indicates that the cost of classifying an object $x$ that belongs to $C$ into the positive region POS(C) is less than or equal to the cost of classifying $x$ into the boundary region BND(C), and both of these costs are strictly less than the cost of classifying $x$ into the negative region NEG(C); the second states the analogous ordering for an object that belongs to $\neg C$. The goal of three-way decision is to make a proper decision at minimum cost. Based on the two inequalities, the Bayesian decision procedure suggests the following minimum-risk decision rules:

$$\text{decide } x \in \begin{cases} \mathrm{POS}(C), & \text{if } P(C \mid x) \ge \alpha \\ \mathrm{BND}(C), & \text{if } \beta < P(C \mid x) < \alpha \\ \mathrm{NEG}(C), & \text{if } P(C \mid x) \le \beta, \end{cases} \tag{3}$$

where $\alpha$ and $\beta$ are a pair of threshold parameters used to divide the decision boundary,

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \qquad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})},$$

with $0 \le \beta \le \alpha < 1$; $P(C \mid x) \in [0, 1]$, called the decision status value, is the predicted probability that $x$ belongs to $C$ and is calculated by a decision function. In applications of three-way decision to machine learning, the decision function is generally a machine learning model. For example, a Naive Bayes model is adopted as the decision function in the three-way spam filtering system of Zhou et al. (2014). Our method applies a CNN model as the decision function.
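The threshold formulas and the decision rule can be made concrete with a short sketch. The six cost values below are hypothetical (the paper leaves them to domain experts) and are chosen only so that they satisfy the two cost inequalities; the function names `thresholds` and `decide` are likewise illustrative.

```python
from typing import Tuple


def thresholds(l_pp: float, l_bp: float, l_np: float,
               l_pn: float, l_bn: float, l_nn: float) -> Tuple[float, float]:
    """Compute (alpha, beta) from the six costs of the cost matrix."""
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def decide(p: float, alpha: float, beta: float) -> str:
    """Apply the minimum-risk rule to a decision status value p = P(C | x)."""
    if p >= alpha:
        return "POS"
    if p <= beta:
        return "NEG"
    return "BND"


# Hypothetical costs: a correct decision costs 0, deferring costs 1,
# and a wrong decision costs 10.
alpha, beta = thresholds(l_pp=0, l_bp=1, l_np=10, l_pn=10, l_bn=1, l_nn=0)
regions = [decide(p, alpha, beta) for p in (0.95, 0.50, 0.03)]
```

With these costs, $\alpha = 9/10$ and $\beta = 1/10$, so the three example probabilities fall into POS, BND and NEG, respectively. Raising the cost of a wrong decision relative to the cost of deferring widens the boundary region.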

In our task, the object $x$ is a pair of drugs $(d_i, d_j)$. $C$ denotes the set of drug pairs in which the two drugs interact, and $\neg C$ the set of pairs in which they do not. In other words, $x \in C$ indicates that $d_i$ interacts with $d_j$, while $x \notin C$ indicates that $d_i$ does not interact with $d_j$. $P(C \mid x)$ is the probability that $d_i$ interacts with $d_j$. We introduce a CNN model as the decision function to predict the probability of $x \in C$. The drug substructure feature representation $[F_i^s; F_j^s]$ of the drug pair $(d_i, d_j)$ is the input feature of this CNN model, called SCNN. The forward propagation of SCNN is calculated as:

$$f_{scnn}([F_i^s; F_j^s]) = \mathrm{MLP}\left(\mathrm{vec}\left(\sigma\left([F_i^s; F_j^s] * \omega\right)\right)\right). \tag{4}$$

In the above formula, $*$ is the convolution operator and $\omega$ is the convolution kernel. $\sigma$ denotes the activation function and $\mathrm{vec}(\cdot)$ reshapes a tensor into a vector. MLP is a multi-layer perceptron whose output layer contains only one neuron activated by a sigmoid function. Therefore, the output value of SCNN, $f_{scnn}([F_i^s; F_j^s])$, which ranges from 0 to 1, is regarded as the decision status value of the three-way decision framework, by which we classify drug pairs $(d_i, d_j)$ into POS(C), BND(C) and NEG(C) with the two threshold parameters $\alpha$ and $\beta$. For drug pairs assigned to BND(C), the delayed decision is executed as described in Sect. 3.5.

Enhancing module

The enhancing module is an important component of the three-way decision classification model and is what allows it to outperform binary classification models. Following the work of Zhang et al. (2019), we choose the knowledge graph embedding of each drug from DRKG as the supplementary feature and use the same CNN classifier as in Sect. 3.4. We use the same model but different features in boundary division and the enhancing module, whereas Zhang et al. (2019) used the same object feature but totally different classification models to guarantee the complementary property between boundary division and the enhancing module. The global information contained in the knowledge graph embedding focuses on a drug's interactions with other entities, while the structure feature only considers local structural characteristics. In this way, using different drug features in boundary division and the enhancing module maintains the complementary property between them. Meanwhile, using the same deep learning model as the decision model in both boundary division and the enhancing module maintains a strong ability of learning and feature extraction in both stages.

For a drug pair $(d_i, d_j)$ in BND(C), we have obtained its knowledge graph embedding representation, denoted as $[F_i^e; F_j^e]$, in Sect. 3.3. Then, following Eq. 4, a second CNN called ECNN, which has the same network structure as SCNN, uses the feature $[F_i^e; F_j^e]$ to produce a new prediction probability $f_{ecnn}([F_i^e; F_j^e])$ for drug pairs in BND(C). Finally, each drug pair in BND(C) is classified into POS(C) or NEG(C) according to the final prediction probability scored by ECNN. Given a drug pair $(d_i, d_j)$, the pseudocode for predicting it as positive or negative is shown in Algorithm 1 below.

[Algorithm 1: pseudocode for predicting a drug pair as positive or negative with 3WDDI]
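The two-stage procedure described in this section can be sketched as follows. This is a hedged illustration rather than a transcription of Algorithm 1: the trained SCNN and ECNN are stood in for by arbitrary callables that return a probability, the default thresholds 0.9 and 0.1 are assumed values that respect the ordering $\beta \le \alpha$ of Sect. 3.4, and the 0.5 cutoff applied to the ECNN output is an assumption, since the text does not state how that probability is turned into a label.

```python
from typing import Callable, Sequence

Scorer = Callable[[Sequence[float]], float]


def predict_3wddi(structure_feature: Sequence[float],
                  kg_feature: Sequence[float],
                  scnn: Scorer,
                  ecnn: Scorer,
                  alpha: float = 0.9,
                  beta: float = 0.1,
                  cutoff: float = 0.5) -> int:
    """Return 1 (interaction) or 0 (no interaction) for one drug pair."""
    p = scnn(structure_feature)       # decision status value from SCNN
    if p >= alpha:                    # positive region: decide immediately
        return 1
    if p <= beta:                     # negative region: decide immediately
        return 0
    # Boundary region: delay the decision and consult the enhancing module.
    q = ecnn(kg_feature)
    return 1 if q >= cutoff else 0    # assumed cutoff for the final label


def mean_scorer(feature: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained CNN: the mean of the feature."""
    return sum(feature) / len(feature)


confident_pos = predict_3wddi([0.95, 0.95], [0.0], mean_scorer, mean_scorer)
confident_neg = predict_3wddi([0.05, 0.05], [0.9], mean_scorer, mean_scorer)
delayed_pos = predict_3wddi([0.5, 0.5], [0.8, 0.6], mean_scorer, mean_scorer)
delayed_neg = predict_3wddi([0.5, 0.5], [0.2, 0.2], mean_scorer, mean_scorer)
```

In the first two calls the structure-based score alone settles the label and the knowledge graph feature is never consulted; in the last two the pair falls in the boundary region and the second scorer decides.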

Experiment results

In this section, we describe the experiments and analyze the results. We first introduce the datasets in Sect. 4.1. Then, the baselines and parameter settings are detailed in Sect. 4.2. Finally, Sect. 4.3 discusses the results.

Datasets

DDI label data and large-scale biological knowledge graph data are needed for model training and evaluation. For the DDI data, we use a popular DDI dataset, zhangDDI (Zhang et al. 2017b), which contains 548 drugs and 48,548 known pairwise DDIs. However, since zhangDDI was collected 4 years ago, some DDIs found by subsequent studies might not be included. Thus, instead of directly treating drug pairs without known interactions as negative samples, we validate them against the latest DrugBank (Law et al. 2013), which covers millions of newly discovered DDIs. We remove the candidate negative pairs that appear as DDIs in DrugBank to update the zhangDDI dataset. Finally, we randomly select 48,548 negative samples from the validated negative sample set.
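The negative-sample construction can be sketched as below. The sketch assumes that drug pairs are unordered, that both the known DDIs and the later-reported DDIs are available as collections of drug-identifier pairs, and that a fixed random seed is acceptable; the function name, argument names and toy identifiers are hypothetical and do not reflect the actual zhangDDI or DrugBank file formats.

```python
import random
from itertools import combinations
from typing import Iterable, List, Sequence, Tuple

Pair = Tuple[str, str]


def build_negative_samples(drugs: Sequence[str],
                           known_positive: Iterable[Pair],
                           later_reported: Iterable[Pair],
                           n_samples: int,
                           seed: int = 0) -> List[Pair]:
    """Sample validated negative pairs: no known DDI and none reported later."""
    positives = {frozenset(p) for p in known_positive}
    reported = {frozenset(p) for p in later_reported}
    candidates = [
        pair for pair in combinations(sorted(drugs), 2)
        if frozenset(pair) not in positives and frozenset(pair) not in reported
    ]
    if n_samples > len(candidates):
        raise ValueError("not enough validated negative pairs to sample from")
    return random.Random(seed).sample(candidates, n_samples)


# Toy example with four hypothetical drug identifiers.
toy_drugs = ["A", "B", "C", "D"]
negatives = build_negative_samples(
    toy_drugs,
    known_positive=[("A", "B")],
    later_reported=[("C", "A")],   # order inside a pair does not matter
    n_samples=3,
)
```

Of the six unordered pairs among the four toy drugs, two are excluded, so three negatives are drawn from the remaining four candidates.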

For the knowledge graph, we choose DRKG, which was constructed for COVID-19 drug repurposing (Ioannidis et al. 2020) and contains a variety of drug-related biological entities and relations. DRKG is a comprehensive biological knowledge graph relating genes, compounds, diseases, biological processes, side effects and symptoms, extracted from six datasets including DrugBank, Hetionet, GNBR, String, IntAct and DGIdb. Statistical information about DRKG is shown in Table 2. The compound entities in DRKG cover all 548 candidate drugs of zhangDDI. Information on drug-related entities and relations (drugs correspond to compounds in DRKG) is preserved in the embedding vectors generated by ComplEx to promote the prediction of unexpected DDIs.

Table 2. Statistics of DRKG

Knowledge graph   Entities   Entity types   Relation types   Triples
DRKG              97,238     13             107              5,874,261

As mentioned in Sect. 3, the problem of DDI prediction is modeled as binary classification. Therefore, we apply several classification evaluation metrics to measure the performance of the prediction models, including accuracy (ACC), the area under the precision–recall curve (AUPR), the area under the ROC curve (AUC), and the F1 score. We take ACC as our primary evaluation metric.
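For reference, three of these metrics can be written out directly from their standard definitions. The sketch below is not the evaluation code used in the experiments: it uses plain Python lists, computes AUC as the fraction of positive–negative pairs ranked correctly (ties counted as one half), and omits AUPR. The toy labels and scores are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities for six drug pairs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

On this toy data, four of six labels are correct, precision and recall are both 2/3, and eight of the nine positive–negative pairs are ranked correctly.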

Baselines and parameter settings

We compare our model against a variety of baselines, covering two state-of-the-art DDI prediction methods, DeepDDI (Ryu et al. 2018) and KGDDI (Karim et al. 2019), and three classical machine learning classification models: logistic regression (LR), random forest (RF), and k-nearest neighbor (KNN). DeepDDI is a deep learning method that predicts multi-type DDIs from a dimension-reduced drug structure similarity feature. KGDDI takes drug embeddings from a knowledge graph and predicts DDIs with a conv-LSTM. Both DeepDDI and KGDDI are closely related to our model. In addition, to validate the effectiveness of the delayed decision in three-way decision, we remove the delayed decision from the proposed model (3WDDI-delay) and test its performance.

We implement DeepDDI and KGDDI according to the original studies (Ryu et al. 2018; Karim et al. 2019) and adjust the input and output layers to fit our experimental data. To train DeepDDI, the chemical structure feature representation of a drug pair [Fis;Fjs] is used as the input feature and the 0/1 labels as the expected prediction values. The knowledge graph embedding representation [Fie;Fje] is used to train KGDDI. We implement the other baseline methods with the scikit-learn Python package and adopt grid search to optimize their parameters. We finally set the regularization coefficient of LR to 100, the size of decision trees for RF to 100, and the number of neighbors of KNN to 95. Unlike the training of DeepDDI and KGDDI, we concatenate both the structure feature and the knowledge graph embedding as the final representation of a drug pair [Fis;Fjs;Fie;Fje] to train the other baselines.
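The three input layouts described above can be sketched as simple vector concatenation. This is only an illustration under stated assumptions: the helper `pair_representation`, the drug keys and the toy vectors are hypothetical, and real structure and embedding features would be much longer.

```python
from typing import Dict, List, Sequence


def pair_representation(drug_i: str,
                        drug_j: str,
                        structure: Dict[str, List[float]],
                        embedding: Dict[str, List[float]],
                        use: Sequence[str] = ("structure", "embedding")) -> List[float]:
    """Concatenate per-drug features in the order [Fis; Fjs; Fie; Fje]."""
    features: List[float] = []
    if "structure" in use:
        features += structure[drug_i] + structure[drug_j]
    if "embedding" in use:
        features += embedding[drug_i] + embedding[drug_j]
    return features


# Toy per-drug features (hypothetical values).
structure = {"drugA": [1.0, 0.0, 1.0], "drugB": [0.0, 1.0, 1.0]}
embedding = {"drugA": [0.2, -0.5], "drugB": [0.7, 0.1]}

deepddi_input = pair_representation("drugA", "drugB", structure, embedding, use=("structure",))
kgddi_input = pair_representation("drugA", "drugB", structure, embedding, use=("embedding",))
classical_input = pair_representation("drugA", "drugB", structure, embedding)
print(len(deepddi_input), len(kgddi_input), len(classical_input))
```

Keeping a fixed concatenation order matters because the classical baselines treat each position of the vector as a separate feature.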

For our proposed method, SCNN and ECNN share the same network structure except for the input layer. We design the CNN with one convolution layer and three fully connected layers; the convolution layer uses 64 filters with a 3×3 kernel, and the fully connected layers have 128, 64 and 32 neurons, respectively. As described in Eq. 4, the output layer consists of one neuron activated by a sigmoid function. The other hyperparameter settings are provided in Table 3. In addition, the two thresholds for boundary division are the only parameters of the three-way decision. After evaluating different threshold values, we find the best threshold parameters to be α = 0.1 and β = 0.9.
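The role of the two thresholds can be sketched as follows. This is a minimal illustration, not the exact decision rule of this work: it assumes that a structure-based score at or above β falls in the positive region, a score at or below α falls in the negative region, and everything in between is deferred to a second score from the embedding-based network, which is then cut at an assumed 0.5. The function names and the inclusive comparisons are hypothetical.

```python
def assign_region(p_structure: float, alpha: float = 0.1, beta: float = 0.9) -> str:
    """Place a drug pair in a region based on the structure-based score."""
    if p_structure >= beta:
        return "positive"
    if p_structure <= alpha:
        return "negative"
    return "boundary"


def three_way_predict(p_structure: float,
                      p_embedding: float,
                      alpha: float = 0.1,
                      beta: float = 0.9,
                      cutoff: float = 0.5) -> int:
    """Decide immediately outside the boundary region, otherwise delay the decision."""
    region = assign_region(p_structure, alpha, beta)
    if region == "positive":
        return 1
    if region == "negative":
        return 0
    # Delayed decision: fall back on the embedding-based score (assumed 0.5 cutoff).
    return int(p_embedding >= cutoff)


# Toy scores (hypothetical): (structure-based score, embedding-based score).
examples = [(0.95, 0.20), (0.05, 0.80), (0.60, 0.30), (0.40, 0.70)]
for p_s, p_e in examples:
    print(p_s, p_e, assign_region(p_s), three_way_predict(p_s, p_e))
```

With α = 0.1 and β = 0.9, only pairs scored very confidently by the structure-based network are decided directly; all other pairs receive a delayed decision.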

Table 3.

Hyperparameter settings of CNN

Hyperparameter Setting
Dropout 0.4
Batch normalization Yes
Learning rate 0.001
Optimizer Adam
Batch size 128
Activation function ReLU

Results and discussion

In this section, we compare the performance of our proposed model with the baselines and discuss the effectiveness of the delayed decision. Table 4 reports the ACC, AUPR, AUC and F1 score of the proposed model and the baselines, where all scores are averages over 5 runs and bold indicates the best performance on each metric. As shown in Table 4, the proposed model 3WDDI outperforms all baselines on every metric, and adding the three-way decision yields a clear gain on several metrics. Specifically, compared with the structure-based method DeepDDI, 3WDDI improves ACC by about 4.5 percentage points, AUPR by 3.0 and AUC by 3.1. Our method is also slightly better than the embedding-based method KGDDI, by about 1.0, 0.4 and 0.4 percentage points on ACC, AUPR and AUC, respectively. RF achieves the best performance among the three classical classification methods, possibly because an ensemble learning model tends to perform better than a single model. The precision–recall and ROC curves of 3WDDI and the compared methods are shown in Fig. 2.

To evaluate the contribution of the three-way decision, we remove its enhancing module (3WDDI-delay). 3WDDI-delay drops by about 0.7, 0.3 and 0.4 percentage points on ACC, AUPR and AUC, suggesting that a delayed decision with supplementary knowledge can improve the accuracy of DDI prediction. We attribute the better performance of 3WDDI to three factors: (i) compared with DeepDDI, which only uses the drug structure feature, we introduce knowledge graph embedding as an auxiliary feature to predict samples in the boundary region more accurately; (ii) compared with KGDDI, which only uses the embedding feature, we add structure information as another feature for DDI prediction; (iii) compared with classical binary decision, we adopt three-way decision classification, which delays the decision for uncertain DDIs.

Table 4.

Performance of the proposed model and comparative approaches

Method ACC AUPR AUC F1 score
DeepDDI (Ryu et al. 2018) 0.8472 0.9310 0.9270 0.8482
KGDDI (Karim et al. 2019) 0.8827 0.9575 0.9541 0.8827
LR 0.8720 0.9534 0.9485 0.8732
RF 0.8778 0.9547 0.9508 0.8780
KNN 0.8155 0.8954 0.8962 0.8236
3WDDI-delay 0.8850 0.9587 0.9547 0.8835
3WDDI 0.8922 0.9614 0.9582 0.8930
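The gains discussed in this section are absolute differences between rows of Table 4, expressed in percentage points. The short sketch below recomputes them from the tabulated values; the dictionary layout and the helper name `gain_in_points` are illustrative choices, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal
from typing import Dict

METRICS = ("ACC", "AUPR", "AUC", "F1")

# Values copied from Table 4.
TABLE4 = {
    "DeepDDI": ("0.8472", "0.9310", "0.9270", "0.8482"),
    "KGDDI": ("0.8827", "0.9575", "0.9541", "0.8827"),
    "LR": ("0.8720", "0.9534", "0.9485", "0.8732"),
    "RF": ("0.8778", "0.9547", "0.9508", "0.8780"),
    "KNN": ("0.8155", "0.8954", "0.8962", "0.8236"),
    "3WDDI-delay": ("0.8850", "0.9587", "0.9547", "0.8835"),
    "3WDDI": ("0.8922", "0.9614", "0.9582", "0.8930"),
}


def gain_in_points(method: str, reference: str = "3WDDI") -> Dict[str, Decimal]:
    """Difference (reference minus method) per metric, in percentage points."""
    ref, base = TABLE4[reference], TABLE4[method]
    return {m: (Decimal(r) - Decimal(b)) * 100 for m, r, b in zip(METRICS, ref, base)}


for name in ("DeepDDI", "KGDDI", "3WDDI-delay"):
    gains = gain_in_points(name)
    print(name, {metric: str(value) for metric, value in gains.items()})
```

The unrounded differences (for example 0.95 points on ACC against KGDDI) show that the figures quoted in the text are rounded to one decimal place.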

Fig. 2.

Performances of all models

Conclusion

This article proposes a novel method, 3WDDI, for drug–drug interaction prediction that combines three-way decision with knowledge graph embedding. It uses a simple CNN structure as both the decision function and the delay decision model. For 3WDDI, we prepare a drug sub-structure similarity feature and drug embeddings learned from the knowledge graph DRKG with ComplEx. The drug sub-structure similarity feature is used to identify the boundary region and to provide classification results for the remaining regions. The knowledge graph embedding of a drug pair is then treated by 3WDDI as a new supplementary feature to carry out the delay decision. We implement the proposed method and conduct comparison experiments on a widely used dataset. The experimental results show that 3WDDI outperforms the baseline DDI prediction models.

Several directions remain for future work to improve DDI prediction. We will consider a more effective strategy to divide the sample data into different regions. Further, diverse drug features and multi-omics data (Lan et al. 2020, 2021a; Chen et al. 2019, 2020) can be extracted to improve the performance of DDI prediction.

Acknowledgements

The work reported in this paper was partially supported by National Natural Science Foundation of China projects 61963004 and 62072124, a key project of the Natural Science Foundation of Guangxi (2017GXNSFDA198033), and a key research and development plan of Guangxi (AB17195055).

Data availability

Data sharing not applicable to this article as no datasets were generated.

Declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Xinkun Hao, Email: haoxinkun@st.gxu.edu.cn.

Qingfeng Chen, Email: qingfeng@gxu.edu.cn.

Haiming Pan, Email: 2574958437@qq.com.

Jie Qiu, Email: jgxyqj@126.com.

Yuxiao Zhang, Email: 2636213416@qq.com.

Qian Yu, Email: 823627043@qq.com.

Zongzhao Han, Email: 1051360288@qq.com.

Xiaojing Du, Email: 1779245288@qq.com.

References

  1. Abdelaziz I, Fokoue A, Hassanzadeh O, et al. Large-scale structural and textual similarity-based mining of knowledge graph to predict drug-drug interactions. J Web Semant. 2017;44:104–117. doi: 10.1016/j.websem.2017.06.002.
  2. Bordes A, Usunier N, García-Durán A, et al. Translating embeddings for modeling multi-relational data. Adv Neural Inf Process Syst. 2013;26:2787–2795.
  3. Callahan A, Cruz-Toledo J, Ansell P, et al. Extended semantic web conference. New York: Springer; 2013. Bio2rdf release 2: improved coverage, interoperability and provenance of life science linked data; pp. 200–212.
  4. Chen Q, Lai D, Lan W, et al. ILDMSF: inferring associations between long non-coding rna and disease based on multi-similarity fusion. IEEE/ACM Trans Comput Biol Bioinform. 2019;18(3):1106–1112. doi: 10.1109/TCBB.2019.2936476.
  5. Chen Q, Qiao Y, Hu F, et al. Community detection in complex network based on APT method. Pattern Recogn Lett. 2020;138:193–200. doi: 10.1016/j.patrec.2020.07.021.
  6. Cheng F, Zhao Z. Machine learning-based prediction of drug-drug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. J Am Med Inf Assoc. 2014;21(e2):e278–e286. doi: 10.1136/amiajnl-2013-002512.
  7. Dai Y, Guo C, Guo W, et al. Drug-drug interaction prediction with Wasserstein adversarial autoencoder-based knowledge graph embeddings. Brief Bioinform. 2020;22(4):bbaa256. doi: 10.1093/bib/bbaa256.
  8. Deng Y, Xu X, Qiu Y, et al. A multimodal deep learning framework for predicting drug-drug interaction events. Bioinformatics. 2020;36(15):4316–4322. doi: 10.1093/bioinformatics/btaa501.
  9. Ferdousi R, Safdari R, Omidi Y. Computational prediction of drug-drug interactions based on drugs functional similarities. J Biomed Inform. 2017;70:54–64. doi: 10.1016/j.jbi.2017.04.021.
  10. Gottlieb A, Stein GY, Oron Y, et al. INDI: a computational framework for inferring drug interactions and their associated recommendations. Mol Syst Biol. 2012;8(1):592. doi: 10.1038/msb.2012.26.
  11. Huaxiong L, Libo Z, Bing H, et al. Sequential three-way decision and granulation for cost-sensitive face recognition. Knowl-Based Syst. 2016;91:241–251. doi: 10.1016/j.knosys.2015.07.040.
  12. Ioannidis VN, Song X, Manchanda S et al. (2020) DRKG- drug repurposing knowledge graph for covid-19. https://github.com/gnn4dr/DRKG/
  13. Ji S, Pan S, Cambria E, et al. A survey on knowledge graphs: representation, acquisition, and applications. IEEE Trans Neural Netw Learn Syst. 2022;33(2):494–514. doi: 10.1109/TNNLS.2021.3070843.
  14. Kanehisa M, Sato Y, Kawashima M, et al. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2015;44(D1):D457–D462. doi: 10.1093/nar/gkv1070.
  15. Karim MR, Cochez M, Jares JB et al. (2019) Drug-drug interaction prediction based on knowledge graph embeddings and convolutional-lstm network. In: International conference on bioinformatics, computational biology and health informatics, pp 113–123
  16. Kastrin A, Ferk P, Leskošek B. Predicting potential drug-drug interactions on topological and semantic similarity features using statistical learning. PLoS One. 2018;13(5):1–23. doi: 10.1371/journal.pone.0196865.
  17. Lan W, Lai D, Chen Q, et al. LDICDL: Lncrna-disease association identification based on collaborative deep learning. IEEE/ACM Trans Comput Biol Bioinform. 2020 doi: 10.1109/TCBB.2020.3034910.
  18. Lan W, Dong Y, Chen Q, et al. IGNSCDA: predicting circrna-disease associations based on improved graph convolutional network and negative sampling. IEEE/ACM Trans Comput Biol Bioinform. 2021 doi: 10.1109/TCBB.2021.3111607.
  19. Lan W, Dong Y, Chen Q, et al. KGANCDA: predicting circRNA-disease associations based on knowledge graph attention network. Brief Bioinform. 2021;23(1):bbab494. doi: 10.1093/bib/bbab494.
  20. Lan W, Wu X, Chen Q, et al. GANLDA: Graph attention network for lncrna-disease associations prediction. Neurocomputing. 2022;469:384–393. doi: 10.1016/j.neucom.2020.09.094.
  21. Law V, Knox C, Djoumbou Y, et al. Drugbank 4.0: shedding new light on drug metabolism. Nucleic Acids Res. 2013;42(D1):D1091–D1097. doi: 10.1093/nar/gkt1068.
  22. Lee G, Park C, Ahn J. Novel deep learning model for more accurate prediction of drug-drug interaction effects. BMC Bioinform. 2019;20(1):1–8. doi: 10.1186/s12859-019-3013-0.
  23. Li Z, Huang D. A three-way decision method in a fuzzy condition decision information system and its application in credit card evaluation. Granul Comput. 2020;5(4):513–526. doi: 10.1007/s41066-019-00172-8.
  24. Li Q, Cheng T, Wang Y, et al. Pubchem as a public resource for drug discovery. Drug Discov Today. 2010;15(23):1052–1057. doi: 10.1016/j.drudis.2010.10.003.
  25. Li Y, Zhang L, Xu Y, et al. Enhancing binary classification by modeling uncertain boundary in three-way decisions. IEEE Trans Knowl Data Eng. 2017;29(7):1438–1451. doi: 10.1109/TKDE.2017.2681671.
  26. Lin Y, Liu Z, Sun M et al. (2015) Learning entity and relation embeddings for knowledge graph completion. In: Twenty-ninth AAAI conference on artificial intelligence, pp 2181–2187
  27. Rohani N, Eslahchi C. Drug-drug interaction predicting by neural network using integrated similarity. Sci Rep. 2019;9(1):1–11. doi: 10.1038/s41598-019-50121-3.
  28. Ryu JY, Kim HU, Lee SY. Deep learning improves prediction of drug–drug and drug–food interactions. Proc Natl Acad Sci USA. 2018;115(18):E4304–E4311. doi: 10.1073/pnas.1803294115.
  29. Trouillon T, Welbl J, Riedel S et al. (2016) Complex embeddings for simple link prediction. In: International conference on machine learning, pp 2071–2080
  30. Wang Z, Zhang J, Feng J, et al. (2014) Knowledge graph embedding by translating on hyperplanes. In: Twenty-Eighth AAAI conference on artificial intelligence, pp 1112–1119
  31. Yang B, Yih W, He X, et al. (2014) Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575
  32. Yao Y. Three-way decisions with probabilistic rough sets. Inf Sci. 2010;180(3):341–353. doi: 10.1016/j.ins.2009.09.021.
  33. Yao Y. Set-theoretic models of three-way decision. Granul Comput. 2021;6(1):133–148. doi: 10.1007/s41066-020-00211-9.
  34. Yu H, Wang X, Wang G, et al. An active three-way clustering method via low-rank matrices for multi-view data. Inf Sci. 2020;507:823–839. doi: 10.1016/j.ins.2018.03.009.
  35. Zhang P, Wang F, Hu J, et al. Label propagation prediction of drug-drug interactions based on clinical side effects. Sci Rep. 2015;5(1):1–10. doi: 10.1038/srep12339.
  36. Zhang HR, Min F, Shi B. Regression-based three-way recommendation. Inf Sci. 2017;378:444–461. doi: 10.1016/j.ins.2016.03.019.
  37. Zhang W, Chen Y, Liu F, et al. Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinform. 2017;18(1):1–12. doi: 10.1186/s12859-016-1415-9.
  38. Zhang Y, Zhang Z, Miao D, et al. Three-way enhanced convolutional neural networks for sentence-level sentiment classification. Inf Sci. 2019;477:55–64. doi: 10.1016/j.ins.2018.10.030.
  39. Zhang Y, Qiu Y, Cui Y, et al. Predicting drug-drug interactions using multi-modal deep auto-encoders based network embedding and positive-unlabeled learning. Methods. 2020;179:37–46. doi: 10.1016/j.ymeth.2020.05.007.
  40. Zhang C, Gao R, Qin H, et al. Three-way clustering method for incomplete information system based on set-pair analysis. Granul Comput. 2021;6(2):389–398. doi: 10.1007/s41066-019-00197-z.
  41. Zhou B, Yao Y, Luo J. Cost-sensitive three-way email spam filtering. J Intell Inf Syst. 2014;42(1):19–45. doi: 10.1007/s10844-013-0254-7.

Articles from Granular Computing are provided here courtesy of Nature Publishing Group