Drug repositioning based on individual bi-random walks on a heterogeneous network

Yuehui Wang; Maozu Guo; Yazhou Ren; Lianyin Jia; Guoxian Yu

doi:10.1186/s12859-019-3117-6

. 2019 Dec 24;20(Suppl 15):547. doi: 10.1186/s12859-019-3117-6

Drug repositioning based on individual bi-random walks on a heterogeneous network

Yuehui Wang ¹, Maozu Guo ^2,^✉, Yazhou Ren ³, Lianyin Jia ⁴, Guoxian Yu ^1,^5,^✉

PMCID: PMC6929332 PMID: 31874623

Abstract

Background

Traditional drug research and development is high cost, time-consuming and risky. Computationally identifying new indications for existing drugs, referred as drug repositioning, greatly reduces the cost and attracts ever-increasing research interests. Many network-based methods have been proposed for drug repositioning and most of them apply random walk on a heterogeneous network consisted with disease and drug nodes. However, these methods generally adopt the same walk-length for all nodes, and ignore the different contributions of different nodes.

Results

In this study, we propose a drug repositioning approach based on individual bi-random walks (DR-IBRW) on the heterogeneous network. DR-IBRW firstly quantifies the individual work-length of random walks for each node based on the network topology and knowledge that similar drugs tend to be associated with similar diseases. To account for the inner structural difference of the heterogeneous network, it performs bi-random walks with the quantified walk-lengths, and thus to identify new indications for approved drugs. Empirical study on public datasets shows that DR-IBRW achieves a much better drug repositioning performance than other related competitive methods.

Conclusions

Using individual random walk-lengths for different nodes of heterogeneous network indeed boosts the repositioning performance. DR-IBRW can be easily generalized to prioritize links between nodes of a network.

Keywords: Drug repositioning, Drug-disease heterogeneous network, Individual walk-length, Bi-random walks

Background

Traditional drug research and development depends on cell-based or target-based screening of chemical compounds to identify a small subset of ‘hits’. The identification process aims to further increase their affinity, efficacy and selectivity, before moving forward to animal tests and clinical trials [1]. Drug development in general is complicated, time-consuming and expensive with high-risk [2]. In light of these difficulties in traditional drug discovery, identifying new indications for existing drugs, also known as drug repositioning, has attracted increasing interests from both the pharmaceutical industry and research community [3]. Drug repositioning is much more economic compared with traditional approaches, it offers a promising alternative to reduce the cost and time, since the repositioned drug has already passed the required safety tests.

However, most successfully repositioned drugs up to date have been the consequence of incidental observations of unexpected efficacy and side effects in the development or on the market [4]. For example, Sildenafil was originally tested for angina, now is indicated for erectile dysfunction and pulmonary hypertension [2]; Minoxidil was originally tested for hypertension; now is indicated for hair loss [5]. With the influx of big biochemical and phenotypic data, drug repositioning holds great potential for precise medicine. It is profitable and promising to develop computational methods to predict new indications for approved drugs on large scale.

Some computational drug repositioning methods have been proposed and they can be roughly divided into two categories: focusing on the interactions between drugs and the targets; and focusing on exploiting the knowledge of diseases and drugs [6]. To name a few, Bleakley and Yamanishi [7] developed a bipartite local model (BLM) to predict target proteins of a given drug and target drugs of a give protein, and then combine these two predictions to give a final prediction for each candidate drug-target interaction. Cheng et al. [8] used a drug-target bipartite network topology similarity and a network based inference algorithm (NBI) to infer new targets for known drugs. Wang et al. [9] used known drug-target interactions as well as drug-drug and target-target similarities to construct a heterogeneous network, and then introduced a Heterogeneous Graph Based Inference (HGBI) method to iteratively update the strength between unlinked drug-target pairs based on all the paths in the network connecting them. These drug-target prediction methods can be readily adopted for drug repositioning.

Chiang et al. [10] attempted to predict novel associations between drugs and diseases based on the widely-adopted ‘guilt-by-association’ principle. This principle assumes that if a drug can treat one of two similar diseases, then it might treat the other also; alternatively a disease can be treated by two similar drugs. Following this principle, Gottlieb et al. [11] measured the similarity between the pertaining drug and disease of drug-disease pairs that are known to be associated based on multiple drug-drug sources and disease-disease similarity metrics, and then ranked the accumulative evidence for association using a logistic regression scheme to predict novel drug indications. Wang et al. [1] integrated omics data about diseases, drugs and drug targets to construct a heterogeneous network and then applied random walks on the network to replenish missing associations between drugs and diseases. Martinez et al. [6] integrated information on diseases, drugs and targets (proteins) to construct a heterogeneous network and then performed propagation flow on the network to prioritize candidate associations between diseases and drugs according to their interconnections in the network. Luo et al. [12] proposed MBiRW to predict drug-disease associations. MBiRW employs known drug-disease associations to improve the drug-drug and disease-disease similarity measures; and then integrates the similarity networks and drug-disease associations to build a drug-disease heterogeneous network; after that, it performs bi-random walk with restart on the network to predict novel potential drug-disease associations. Liu et al. [13] performed a drug-centric random walk and a disease-centric random walk to obtain the association confidence between the disease nodes and drug nodes of a heterogeneous network.

Most of these aforementioned methods in essence are random walk based solutions. Although they make use of the network topology from different perspectives, they ignore the different contributions of different nodes on transferring the information on the network and almost all adopt a fixed walk-length for all nodes. To overcome this issue, we propose a novel drug repositioning approach (called DR-IBRW) that performs bi-random walk with restart on a heterogeneous network with quantified individual walk-length for each node. DR-IBRW uses disease symptom information [14] and drug chemical fingerprints [15] to construct a composite disease-disease similarity network, drug-drug similarity network. It then quantifies the individual walk-length for each node based on the topology of known drug-disease association network. Next, it constructs a heterogeneous network based on these three networks. After that, it performs bi-random walks with the quantified walk-lengths to account for the structural differences of these networks and contribution differences of different nodes (including diseases and drugs), and to predict new associations between drugs and diseases, and thus to accomplish the drug repositioning. We evaluate and compare the performance of DR-IBRW on several public datasets. DR-IBRW obtains much better performance than other related comparing methods [7–9, 12, 16] in identifying new indications for existing drugs, and the quantified individual walk-length indeed contributes to an improved prediction performance. We want to remark that the proposed individual bi-random walk solution is different from existing personalized random walk solutions [17, 18] that mainly focus on setting different restart probabilities for different nodes.

Materials and methods

Dataset

The datasets used in this work include drug-disease associations, drug fingerprints and disease symptoms. We collected 4219 diseases from MeSH [19] and 322 symptoms for each disease from the supplementary material of [14]. The drug-disease association dataset was obtained from [20], it includes 3250 known drug-disease associations involving 799 drugs and 719 diseases. We also collected 881 fingerprints for each drug from PubChem [15]. Since only 525 diseases can find their relevant symptom information from the supplementary material of [14], the final processed dataset includes 525 diseases, 718 drugs and 2177 drug-disease associations. All these data were collected on November 1st, 2017.

Similarity measures

We separately apply a four-step measurement to quantify the inner-similarity between diseases and between drugs. The first three steps are based on the comprehensive similarity measurement used by Luo et al. [12]. In the fourth step, we use Gaussian interaction profile kernel similarity [21] to measure the similarity between drugs and diseases. Finally, we combine these similarities to form the composite similarity between diseases and between drugs. The four-step procedure of measuring the similarity between drugs is briefly introduced as follows.

Step 1: Based on the chemical fingerprints of the drug molecules, we can initially measure the similarity $S_{r}^{1} \in ℝ^{n_{r} \times n_{r}}$ between n_r drugs via the widely used Cosine similarity metric [22]. Let r_i and r_j be the vector forms of the chemical fingerprints of drug r_i and r_j, the chemical similarity $S_{r}^{1} (r_{i}, r_{j})$ between drug r_i and r_j is defined as:

S_{r}^{1} (r_{i}, r_{j}) = \frac{r_{i}^{T} r_{j}}{∥r_{i}∥ ∥r_{j}∥},

Step 2: Too small similarity provides little information for drug repositioning and can be transformed into zeros for accurate prediction [9, 12]. We partition $S_{r}^{1}$ into ten subranges ((0,0.1], (0.1,0.2], etc.) and calculate the average similarity of drug pairs with shared diseases for each subrange. We also randomly shuffle $S_{r}^{1}$ and repeat the partition and calculation process again. If the average of the non-shuffled subrange is smaller than that of the respective shuffled subrange, the drug similarities divided into this subrange are viewed as not informative; otherwise, they are informative. We then adopt a logistic function [23] to shrink these non-informative similarities to zero and to enlarge these informative similarities. The logistic function is defined as follows:

S_{r}^{2} (r_{i}, r_{j}) = \frac{1}{1 + e^{c S_{r}^{1} (r_{i}, r_{j}) + d}},

where c and d are the parameters can be tuned to control the adjustment of $S_{r}^{1}$ . c is the upper bound of the first subrange whose average similarity is smaller than that of the respective shuffled subrange, d=log(999). After that, we obtain a updated drug similarity matrix $S_{r}^{2}$ .

Step 3: Two drugs are more similar if they are grouped into the same cluster. To make use of this assumption, we first construct a new weighted drug sharing network with drugs as nodes and edge weight reflecting the number of common diseases by respective pair nodes. After that, we adopt a graph clustering method, ClusterONE [24], to identify potential drug clusters on the network. We then add the clustering cohesiveness of a cluster with $S_{r}^{2}$ if and only if the two drugs belong to that cluster.

f (C) = \frac{W_{in} (C)}{(W_{in} (C) + W_{bound} (C) + P (C))},

where $W_{in} (C)$ denotes the total weight of edges within a cluster of vertices, $W_{bound} (C)$ represents the total weight of edges connecting nodes of this cluster to nodes of other clusters, and $P (C)$ is the penalty term. Suppose that drug r_i and drug r_j locating in the same cluster $C$ , the comprehensive drug similarity $S_{r}^{3} (r_{i}, r_{j})$ between drug r_i and r_j is defined as $(1 + f (C)) * S_{r}^{2} (r_{i}, r_{j})$ . In this way, we obtain an improved drug similarity matrix $S_{r}^{3}$ .

Step 4: Based on the assumption that similar drugs tend to show similar interaction and non-interaction profiles with the diseases, we further use Gaussian interaction profile kernel similarity to measure the similarity between drugs [21, 25, 26]. The interaction profile IP(r_i) of drug r_i is defined as a binary vector encoding the presence or absence of the known associations between the drug and n_d diseases. The Gaussian interaction profile kernel similarity between two drugs (r_i and r_j) is computed as follows:

S_{r}^{KR} (r_{i}, r_{j}) = exp (- Υ_{r} ∥ IP (r_{i}) - IP (r_{j}) ∥^{2}),

Υ_{r} = {\tilde{Υ}}_{r} / (\frac{1}{n_{r}} \sum_{i = 1}^{n_{r}} ∥ IP (r_{i}) ∥^{2}),

where Υ_r is the kernel bandwidth, ${\tilde{Υ}}_{r}$ is the average number of associated diseases per drug.

To this end, we combine $S_{r}^{3}$ and $S_{r}^{KR}$ into the composite similarity matrix S_r between n_r drugs as follows:

S_{r} = (S_{r}^{3} + S_{r}^{KR}) / 2

Following the above four-step, we can also compute the composite similarity $S_{d} \in ℝ^{n_{d} \times n_{d}}$ between n_d diseases based on the symptom information of these diseases and drug-disease associations.

Quantifying individual walk-length

Network-based drug repositioning methods generally apply random walk on a network with a fixed walk-length for all nodes to explore the network topology [12, 27, 28]. They ignore the different contributions of different nodes to some extent. Given that, we introduce an individual walk-length measure and try to make better use of the topology of known drug-disease association bipartite network $W_{rd} \in ℝ^{n_{r} \times n_{d}}$ of n_r drugs and n_d diseases. W_rd(r_i,d_j)=1 if the association between the drug r_i and disease d_j is known; and 0 otherwise.

The walk-length of a node generally depends on its influence in the network [29]. We extend the Jaccard index measure introduced by Lu et al. [16] to quantify the individual walk-length of nodes. Suppose $N^{r} (r_{i})$ denote the set of neighbours of drug r_i and $N^{d} (d_{j})$ denote the set of neighbours of disease d_j, if r_i and d_j share many common neighbours, they will be more probably influenced with each other. For a randomly selected feature f of either r_i or d_j, traditional Jaccard index measures the probability that both r_i and d_j have that feature as follows [30]:

{JI}^{'} (r_{i}, d_{j}) = \frac{| N^{r} (r_{i}) \cap N^{d} (d_{j}) |}{| N^{r} (r_{i}) \cup N^{d} (d_{j}) |},

Since there is no relationship between diseases or between drugs in the drug-disease bipartite network, $N^{r} (r_{i}) \cap N^{d} (d_{j})$ is an empty set. For this reason, we have to modify the definition of Jaccard index for a bipartite graph. Particularly, we define ${\hat{N}}^{r} (r_{i}) = \cup_{c \in N^{r} (r_{i})} N^{d} (c)$ as the set of drugs associated with r_i’s neighbours. Then the Jaccard index on the bipartite network is defined as follows:

JI (r_{i}, d_{j}) = \frac{| N^{r} (r_{i}) \cap {\hat{N}}^{d} (d_{j}) |}{| N^{r} (r_{i}) \cup {\hat{N}}^{d} (d_{j}) |},

JI(r_i,d_i) represents the influence between drug-disease pair (r_i,d_j). We assume that a node with high quantified influence has more probability to interact with others during the random walk process, and this node should have larger walk-length. Based on this assumption, we can measure the walk-length of each node as follows:

L_{r} (r_{i}) = \sum_{j = 1}^{n_{d}} JI (r_{i}, d_{j}), L_{d} (d_{j}) = \sum_{i = 1}^{n_{r}} JI (r_{i}, d_{j}),

where $L_{r} \in ℝ^{n_{r}}$ and $L_{d} \in ℝ^{n_{d}}$ store the individual walk-lengths of n_r drugs and n_d diseases, respectively.

Individual bi-random walk

Based on the inner similarity network (defined by S^r) of drugs, the inner similarity network (defined by S^d) of diseases, and the drug-disease bipartite network initialized by known drug-disease associations, we can construct a heterogeneous network of drugs and diseases (see Fig. 1 for example). We adopt a bi-random walk with restart procedure [27] on the heterogeneous network. Compared with traditional random walk with restart, the bi-random walk with restart can separately propagate information in different subnetworks, instead of the global network [28]. For this reason, bi-random walk can separately account for the inner structure of disease similarity network and of drug similarity network, and also make use of associations between drugs and diseases.

Fig. 1 — A heterogeneous network consists of drug similarity network $S_{r} \in ℝ^{n_{r} \times n_{r}}$ with n_r drugs, disease similarity network $S_{d} \in ℝ^{n_{d} \times n_{d}}$ with n_d diseases, drug-disease association network $W_{rd} \in ℝ^{n_{r} \times n_{d}}$ between n_r drugs and n_d diseases. Each circle represents a drug, each hexagon represents a disease. In the drug (disease) similarity network, the solid edges describe the similarities of drug (disease) pairs. In the drug-disease association network, the solid edges indicate the known drug-disease associations, and the dashed edges indicate the potential associations between drugs and diseases, which are the new indications of drugs

A random walker can take a drug as the starting node, its associated diseases as intermediate nodes, and then traverse to other disease nodes. In this way, we can get probabilistic associations between the drug and new diseases, and thus identify potential new indications of the drug. To mimic this process, we perform random walk with restart starting from drug nodes and then traversing to disease nodes based on the quantified individual walk-length and the heterogeneous network topology as follows:

\begin{array}{lcr} F_{r}^{t} (r_{i}, d_{j}) = \{\begin{matrix} α \sum_{k = 1}^{n_{d}} {\tilde{S}}_{d} (d_{k}, d_{j}) F_{r}^{t - 1} (r_{i}, d_{k}) + \\ (1 - α) W_{rd} (r_{i}, d_{j}), if t \leq ⌊ L_{r} (r_{i}) ⌋ \\ F_{r}^{t - 1} (r_{i}, d_{j}), otherwise \end{matrix} \end{array}

where $F_{r}^{t} (r_{i}, d_{j})$ is the predicted relevance between drug r_i and disease d_j in the t-th iteration, $F_{r}^{0} = W_{rd}$ , α>0 controls the probability for a walker staying at the starting point, ${\tilde{S}}_{d} = D_{d}^{- \frac{1}{2}} * S_{d} * D_{d}^{- \frac{1}{2}}$ is the Laplacian normalized result of S_d and D_d is a diagonal matrix with $D_{d} (d_{j}, d_{j}) = Σ_{k = 1}^{n_{d}} S_{d} (d_{j}, d_{k})$ . If t>L_r(r_i), the random walker starting from r_i will not jump any more. We want to recomment that unlike traditional random walks and bi-random walks that adopt the same walk-length for all the nodes, the walk-length of a node in Eq. (10) is adaptively set based on its topology relationship with other nodes and is different from the walk-lengths of other nodes.

Similarly, a random walker can also start from a disease node and then traverse to drug nodes based on known drug-disease relationships and drug similarity network. In this way, we can obtain another probability between the disease and drug. To simulate this process, we perform random walk with restart from the disease node (d_j) as follows:

\begin{array}{lcr} F_{d}^{t} (r_{i}, d_{j}) = \{\begin{matrix} α \sum_{k = 1}^{n_{r}} {\tilde{S}}_{r} (r_{i}, r_{k}) F_{d}^{t - 1} (r_{k}, d_{j}) + \\ (1 - α) W_{rd} (r_{i}, d_{j}), if t \leq ⌊ L_{d} (d_{j}) ⌋ \\ F_{d}^{t - 1} (r_{i}, d_{j}), otherwise \end{matrix} \end{array}

where $F_{d}^{t} (r_{i}, d_{j})$ is the predicted relevance between drug r_i and disease d_j in the t-th iteration, and the same normalization procedure is applied to S_r to construct the normalization matrix ${\tilde{S}}_{r} = D_{r}^{- \frac{1}{2}} * S_{r} * D_{r}^{- \frac{1}{2}}$ , D_r is a diagonal matrix with $D_{r} (r_{i}, r_{i}) = Σ_{j = 1}^{n_{r}} S_{r} (r_{i}, r_{j})$ .

After iteratively applying Eqs. (10-11) with individual walk-lengths, we can obtain F_r and F_d, which separately reflect the association confidences between n_r drugs and n_d diseases from the perspective of the disease similarity network, and from the drug similarity network, along with the known drug-disease associations. To this end, we integrate them as follows:

F = \frac{F_{r} + F_{d}}{2}

Obviously, the larger the value of F(r_i,d_j), the larger the probability that drug r_i associated with disease d_j is. In this way, we can finally identify new indications for existing drugs. The whole procedure of DR-IBRW is described in Algorithm 1.

graphic file with name 12859_2019_3117_Figa_HTML.jpg

Results and discussion

Performance comparison with other methods

DR-IBRW is compared with five related and recent methods (MBiRW [12], BLM [7], JI (Jaccard Index) [16], HGBI [9] and NBI [8]) on the processed dataset. MBiRW, BLM, HGBI and NBI were introduced in the Introduction, the last four methods are originally developed for predicting drug-target interactions and can be directly adopted to predict drug-disease associations. Parameters of these comparing methods are set (or optimized) as the authors suggested (or provided) in their respective papers or codes. As to DR-IBRW, α for random walk restart probability is set to 0.1. To reach a comprehensive evaluation, we use six widely used metrics, namely AUROC, AUPR, Macro-F1, Micro-F1, Precision, Recall. These metrics are also used by those comparing methods [7–9, 12, 16]. The formal definitions of these metrics are omitted here, but interested readers than can find the formal definitions of these metrics in these references and references therein. All these methods follow ten fold cross-validation experimental protocol, and then report the average results and standard deviation in Table 1. In addition, we also plot the receiver operating characteristic (ROC) curve and precision recall (PR) curve, and the value of area under perspective curve in Fig. 2.

Table 1.

The results of DR-IBRW and five comparing methods

Methods	Micro-F1	Macro-F1	Precision	Recall
DR-IBRW	0.395 ±0.002	0.328±0.001	0.212 ±0.000	0.766±0.007
MBiRW	0.294 ±0.004	0.245 ±0.003	0.158 ±0.001	0.572 ±0.016
BLM	0.413±0.006	0.304 ±0.002	0.224±0.003	0.740 ±0.002
JI	0.229 ±0.001	0.188 ±0.001	0.123 ±0.000	0.427 ±0.002
HGBI	0.013 ±0.000	0.010 ±0.000	0.007 ±0.000	0.021 ±0.000
NBI	0.009 ±0.000	0.007 ±0.000	0.004 ±0.000	0.016 ±0.000

Open in a new tab

The entry in boldface represent the method perform best in this evaluation metric

Fig. 2 — The ROC and PR curves of DR-IBRW and comparison algorithms. AUROC and AUPR are the values of area under the ROC and PR curve, respectively

We can easily find that DR-IBRW achieves better performance than these comparing methods. Although both DR-IBRW and MBiRW utilize the drug similarity network, disease similarity network and drug-disease association network to construct a heterogeneous network, and then apply bi-random walks with restart to account for the structural difference of this network, DR-IBRW still performs significantly better than MBiRW. That is because DR-IBRW takes into account the different contributions of different nodes and applies individual walk-lengths for them, whereas MBiRW equally treats all nodes and applies the same walk-length. In addition, DR-IBRW uses the Gaussian interaction profile kernel similarity to strengthen the effect of known drug-disease associations.

HGBI also applies random walks with restart on the heterogeneous network, but it does not take into account structural difference between drug similarity network and disease similarity network. BLM tries to build a separate classifier for each drug and each drug, but it is still suffered from biased training data, since there are more negative samples than positive samples (known associations). In fact, a number of negative samples should be positive ones. For this reason, BLM has a high Precision and Recall but with a low AUPR value. JI takes into account the influence of a node in the bipartite network and uses common neighbours to predict drug-disease associations. NBI only utilizes known drug-disease associations to run a two-step diffusion model on the bipartite graph and it can not predict new associations for a drug without known associations. For these reasons, both JI and NBI are outperformed by DR-IBRW.

Individual walk-length analysis

To study the contribution of our proposed individual walk-lengths, we also test the performance of DR-IBRW with fixed walk-lengths for all the nodes by varying walk-length in the disease network and drug network from 0 to 10, respectively. Fig. 3 reveals the AUROC and AUPR of DR-IBRW under different combined configurations of L_r and L_d. From this figure, we can clearly see that the AUROC stops increasing when L_r and L_d are larger than 2, and the AUROC and AUPR values with a fixed walk-length are smaller than those of DR-IBRW with individual walk-lengths. This comparison further corroborates the effectiveness and rationality of individual walk-lengths.

Fig. 3 — The AUROC and AUPR values of DR-IBRW with different fixed walk-lengths. The blue star is the value of DR-IBRW with individual walk-lengths

Drug and disease similarity analysis

As introduced in Section 5, we measure the composite inner similarity between diseases and drugs in four steps. To investigate the impact of these four steps and the contribution of Gaussian interaction kernel profile similarity, we introduce three variants (DR-IBRW123, DR-IBRW124, DR-IBRW134) of DR-IBRW. Particularly, DR-IBRW123 only uses the first three steps (as done by Luo et al. [12]), or excludes the Gaussian interaction kernel profile similarity, to measure the inner similarity between diseases and between drugs. Similarly, DR-IBRW134 excludes the second step without shrinking low similarity and enlarging high similarity. DR-IBRW124 follows the same naming rule. The AUROC and AUPR values of DR-IBRW and its variants by ten fold cross-validations are shown in Fig. 4. Obviously, the AUROC and AUPR values of DR-IBRW123 are lower than those of other methods, which show the contribution of Gaussian interaction profile kernel similarity for drug repositioning. Another interesting observation is that DR-IBRW134 has a higher AUPR value than other variants and DR-IBRW. The cause is that AUPR and AUROC measure the performance from different perspectives and under varying thresholds. The second step may wrongly shrink low similarity and enlarge high similarity, and thus compromise the performance.

Fig. 4 — The AUROC and AUPR values of DR-IBRW and its variants

Experiments on another two datasets

We collected another two datasets to further study the performance of DR-IBRW. The first dataset (named ‘Gottlieb’s Dataset’), was obtained from [11]. This dataset contains 1933 known drug-disease associations involving 593 drugs registered in DrugBank and 313 diseases listed in the Online Mendelian Inheritance in Man (OMIM). The another dataset (‘Luo’s Dataset’) is obtained from [12], it includes 663 drugs registered in DrugBank, 409 diseases listed in OMIM database and 2352 known drug-disease associations. Table 2 reports the results of 10 fold cross-validation of DR-IBRW and comparing methods on these two datasets. The experimental setups are kept the same as in previous experiments. From these tables, we can also find that DR-IBRW again obtains much better performance than these comparing methods across different evaluation metrics.

Table 2.

The performance results of DR-IBRW and comparing methods on Gottlieb’s dataset [11] and Luo’s dataset [12]

	AUROC	AUPR	Micro-F1	Macro-F1	Precision	Recall
	Gottlieb’s dataset
DR-IBRW	0.955±0.000	0.499±0.174	0.613±0.006	0.513±0.005	0.332±0.002	0.880 ±0.000
MBiRW	0.933 ±0.000	0.213 ±0.028	0.294 ±0.004	0.244 ±0.003	0.256 ±0.001	0.906±0.000
BLM	0.865 ±0.000	0.298 ±0.003	0.583 ±0.001	0.479 ±0.001	0.315 ±0.000	0.891 ±0.000
JI	0.845 ±0.001	0.247 ±0.043	0.385 ±0.003	0.462 ±0.004	0.250 ±0.001	0.894 ±0.181
HGBI	0.811 ±0.000	0.016 ±0.000	0.187 ±0.001	0.157 ±0.001	0.101 ±0.000	0.367 ±0.007
NBI	0.503 ±0.000	0.000 ±0.000	0.022 ±0.000	0.018 ±0.000	0.012 ±0.000	0.039 ±0.001
	Luo’s dataset
DR-IBRW	0.964±0.000	0.529±0.167	0.537±0.006	0.452 ±0.004	0.294±0.002	0.895±0.002
MBiRW	0.945 ±0.000	0.285 ±0.042	0.431 ±0.004	0.363 ±0.003	0.236 ±0.001	0.835 ±0.013
BLM	0.892 ±0.000	0.424 ±0.017	0.527 ±0.003	0.463±0.004	0.278 ±0.001	0.843 ±0.000
JI	0.865 ±0.000	0.287 ±0.041	0.537 ±0.004	0.447 ±0.003	0.294 ±0.001	0.783 ±0.000
HGBI	0.848 ±0.000	0.037 ±0.001	0.170 ±0.001	0.141 ±0.001	0.093 ±0.000	0.318 ±0.005
NBI	0.479 ±0.000	0.000 ±0.000	0.020 ±0.000	0.016 ±0.000	0.011 ±0.000	0.032 ±0.000

Open in a new tab

The entry in boldface represent the method perform best in this evaluation metric

Case study

To further demonstrate that the drug-disease associations predicted by DR-IBRW can be confirmed by biological experiments, we apply DR-IBRW to prioritize potential drug-disease pairs. Here, we use all the collected drug-disease associations as training samples, and then select the top 10 drug-disease pairs with the largest association probabilities as the predicted drug-disease associations. After that, we manually check these associations by referring to the associations stored in Comparative Toxicogenomics Database(CTD) [31]. Particularly, we use the data of chemical-disease associations labeled with therapeutic downloaded from CTD. The label therapeutic represents a chemical that has a known or potential therapeutic role in a disease. For the predicted associations cannot find in the CTD, we further manually check them on PubMed and list the supportive PubMed IDs. We highlight the drug-disease associations supported by recent papers in PubMed but not included in CTD in boldface. The currently supported and un-supported associations are listed in Table 3.

Table 3.

DR-IBRW predicted drug-disease associations (top 10 in ranking list), and the corresponding evidence

Drug	Disease	Evidence (PMID)	Rank
Labetalol	Hypertension	6124264; 25692529	1
Irbesartan	Heart Failure	19001508	2
Enalapril	Hypertension	2994986	3
Flurandrenolide	Scalp Dermatoses	without evidence	4
Hydralazine	Hypertension	20687078; 22071816	5
Fenoldopam	Hypertension	8105829	6
Captopril	Hypertension	6754186; 3520132; 1747216; 23161035	7
Erythrityl Tetranitrate	Hypertension	without evidence	8
Nitroprusside	Hypertension	21272230; 9796241	9
Ranolazine	Hypertension	24464752; 26401256	10

Open in a new tab

The entries in boldface represent the drug-disease associations supported by recent papers in PubMed but not included in CTD

From Table 3, 6 out of top 10 predicted associations are supported by associations in CTD, the other two drug-disease pairs are supported by recent papers in PubMed but not included in CTD. For instance, Labetalol is an effective agent in essential hypertension as documented in open studies and controlled studies [32]. For another instance, Greminger et al. confirmed the high efficacy of captopril in treatment of severe hypertension refractory to conventional drugs [33]. Meanwhile, ranolazine therapy is safe and well tolerated in a pilot study involving pulmonary arterial hypertension [34]. Although we can not find the direct evidence for the associations of flurandrenolide and scalp dermatoses, flurandrenolide topical is used to treat the itching, redness, dryness, crusting, scaling, inflammation, and discomfort of various skin conditions [35].

These predicted results confirm the capability of DR-IBRW in identifying novel drug-disease associations with high confidence. We want to remark that the 2 unsupported associations should not be viewed as incorrect associations. As more experimental evidence becomes available, they maybe further supported.

We also report the top 10 repositioned examples made by other comparing methods, and then manually check these examples by referring to the associations stored in CTD. We further check the associations that cannot find in the CTD on PubMed and list the supportive PubMed IDs. We highlight the drug-disease associations supported by recent papers in PubMed but not included in CTD in boldface. Tables 4, 5, 6, 7 and 8 list the currently supported and un-supported associations for MBiRW, BLM, JI, HGBI and NBI, respectively.

Table 4.

MBiRW predicted drug-disease associations (top 10 in ranking list), and the corresponding evidence.

Drug	Disease	Evidence (PMID)	Rank
Echothiophate	Esotropia	13907355; 7166393	1
Cysteamine	Cystinosis	22532830;23651769	2
Ethanol	Complex Regional Pain Syndromes	without evidence	3
Ethanol	Warts	9557098	4
Foscarnet	Herpes Genitalis	without evidence	5
Ribavirin	Respiratory Syncytial Virus Infections	11781627	6
Foscarnet	Cytomegalovirus Infecti	11362300;11050094;10795660	7
Nitisinone	Tyrosinemias	11488774	8
Hydroxocobalamin	Alcoholic Neuropathy	without evidence	9
Methimazole	Goiter	14723259	10

Open in a new tab

The entries in boldface represent the drug-disease associations supported by recent papers in PubMed but not included in CTD

Table 5.

BLM predicted drug-disease associations (top 10 in ranking list), and the corresponding evidence

Drug	Disease	Evidence (PMID)	Rank
Cefixime	Streptococcal Infections	2041146	1
Cefdinir	Urinary Tract Infections	20573040	2
Ceftibuten	Soft Tissue Infections	without evidence	3
Ceftibuten	Klebsiella Infections	25813819	4
Ceftibuten	Urinary Tract Infections	11605809;2391749	5
Cefdinir	Soft Tissue Infections	15313534;16765555	6
Cefditoren	Urinary Tract Infections	20542206;8455334	7
Cefditoren	Escherichia coli Infections	without evidence	8
Cefprozil	Urinary Tract Infections	26391612	9
Aztreonam	Soft Tissue Infections	without evidence	10

Open in a new tab

The entries in boldface represent the drug-disease associations supported by recent papers in PubMed but not included in CTD

Table 6.

JI predicted drug-disease associations (top 10 in ranking list), and the corresponding evidence

Drug	Disease	Evidence (PMID)	Rank
Alprostadil	Tetralogy of Fallot	3543871	1
Alprostadil	Tricuspid Atresia	without evidence	2
Alprostadil	Hypoplastic Left Heart Syndrome	without evidence	3
Cefepime	Escherichia coli Infections	26815433	4
Cefepime	Urinary Tract Infections	26243291;1804010	5
Atorvastatin	Hypercholesterolemia	24593216;20946910;20135644	6
Cefotaxime	Escherichia coli Infections	without evidence	7
Clofibrate	Hypercholesterolemi	1175893;7080553;7157849	8
Fenofibrate	Hypercholesterolemi	2492189;24593216;2045526	9
Levofloxacin	Escherichia coli Infections	without evidence	10

Open in a new tab

The entries in boldface represent the drug-disease associations supported by recent papers in PubMed but not included in CTD

Table 7.

HGBI predicted drug-disease associations (top 10 in ranking list), and the corresponding evidence

Drug	Disease	Evidence (PMID)	Rank
Lithium	Conduct Disorder	7491395;7691178;7751258	1
Lithium	Depressive Disorder	27752079;2723135;21252007	2
Ertapenem	Pyelonephritis	22563210	3
Moxifloxacin	Pyelonephritis	without evidence	4
Gatifloxacin	Pyelonephritis	11911553;15037328	5
Methotrexate	Psoriasis	20178709;19626273;19323665	6
Levofloxacin	Pyelonephritis	without evidence	7
Cefamandole	Staphylococcal Infections	9419181	8
Cefprozil	Staphylococcal Infections	without evidence	9
Vinblastine	Kidney Neoplasms	8602639;11194540	10

Open in a new tab

The entries in boldface represent the drug-disease associations supported by recent papers in PubMed but not included in CTD

Table 8.

NBI predicted drug-disease associations (top 10 in ranking list), and the corresponding evidence

Drug	Disease	Evidence (PMID)	Rank
Flurandrenolide	Facial Dermatoses	without evidence	1
Levofloxacin	Urinary Tract Infections	25931244	2
Cefoperazone	Escherichia coli Infections	without evidence	3
Flurandrenolide	Scalp Dermatoses	without evidence	4
Ceftazidime	Proteus Infections	without evidence	5
Ceftizoxime	Escherichia coli Infections	24755996	6
Moxifloxacin	Streptococcal Infections	19188393;18818055;17562794	7
Cefpodoxime	Escherichia coli Infections	23537823	8
Ofloxacin	Streptococcal Infections	19856068	9
Ampicillin	Streptococcal Infections	2306432	10

Open in a new tab

The entries in boldface represent the drug-disease associations supported by recent papers in PubMed but not included in CTD

From Table 4, 5 out of top 10 predicted associations are supported by associations in CTD, the other two drug-disease pairs are supported by recent papers in PubMed but not include in CTD. From Table 5, we can clearly see that 1 out of top 10 predicted associations is supported by CTD and the other six associations are supported by recent papers in PubMed. From Table 6, JI totally finds 6 drug-disease pairs with evidence among the top 10 predicted associations. From Table 7, 5 out of top 10 predicted associations are supported by associations in CTD, the other two drug-disease pairs are supported by recent papers in PubMed but not include in CTD. From Table 8, NBI can find 6 associations with evidence. In summary, DR-IBRW can make more confident drug-disease repositioning than these comparing methods.

Quantified individual walk length is reasonable

The drug-disease association prediction task is frequently modeled as a link prediction problem in a heterogeneous graph [36–38]. The link prediction relies on calculating the similarity between nodes. The number of paths between nodes and walk lengths are regarded as effective similarity metrics in the social network and biological network [36, 39, 40]. The similarities between drugs and diseases can be measured based on the number of walks that connect drug nodes and disease nodes in the network. Integrating the number of walks and their lengths can more comprehensively quantify the potential association probability of the drug-disease pair. In addition, the contribution of different nodes in the heterogeneous network is different. In other words, the information carried by each node in the heterogeneous work is imbalanced. Therefore, it is an issue to adopt a fixed walk-length for all nodes in link prediction.

In order to answer why the choice of quantified individual walk length is reasonable, we calculate the shortest path for each drug and disease node, and measure the difference between shortest path and quantified individual walk length. We use the matrix SP(r_i,d_j) to represent the shortest path from the i−th drug to j−th disease, $SP \in ℝ^{(n_{r} + n_{d}) \times (n_{r} + n_{d})}$ . To calculate SP, we firstly construct an adjacency matrix W:

W = [\begin{array}{l} W_{rr} & W_{rd} \\ W_{dr} & W_{dd} \end{array}]

where $W_{rr} \in ℝ^{n_{r} \times n_{r}}$ contains the shortest path between each two drug nodes, $W_{dd} \in ℝ^{n_{d} \times n_{d}}$ contains the shortest path between each two disease nodes. W_rd is the drug-disease association matrix and W_dr is the transpose of W_rd. Then, we adopt the Dijkstra algorithm to compute the shortest path between two nodes in matrix W. $P_{r} = (r p_{1}, r p_{2}, \dots, r p_{i}, \dots, {rp}_{n_{r}})$ where rp_i represents the longest path in the shortest path between i-th drug and all the diseases. $P_{d} = (d p_{1}, d p_{2}, \dots, d p_{j}, \dots, {dp}_{n_{d}})$ where dp_j represents the longest path in the shortest path between j-th disease and all the drugs. In other words, rp_i is the maximum shortest path for drug i, which can include nearly all the path information with diseases. dp_j is the maximum shortest path for disease j and it can approximately represent the path between disease j and all the drugs. L_r and L_d store the quantified individual walk-lengths of n_r drugs and n_d diseases. After that, we calculate the margin between P_r and L_r for drugs, and that between P_d and L_d for diseases. The statistical results are shown in Fig. 5. We can find that nearly 60% nodes’ differences are no larger than one. It can explain that the quantified individual walk lengths of most nodes are inline with the shortest path between the respective nodes. However, the maximum shortest path can only partially represent the path information from a drug node to a disease node. L_r can give more emphasize on shorter path between diseases and drugs than maximum shortest path, and it generally has a smaller value than P_r. It is recognized that the shorter the distance between two nodes, the larger the similarity between them is. For these reasons, our random walk with individual walk achieves more prominent performance than random walk fixed walk length (as shown in Fig. 3)

Fig. 5 — The margin between P_r and L_r for drugs (Left), and the margin between P_d and L_d for diseases (Right)

We also perform the correlation analysis on drug similarity matrix S_r and drug shortest path matrix W_rr. We firstly partition S_r into ten subranges ((0, 0.1], (0.1, 0.2], etc.) and then partition W_rr into ten subranges to ensure that all the drug pairs in each subrange of S_r falling into the corresponding subrange of W_rr. Next, we calculate the average shortest path of each subrange for W_rr, and compute the correlation of average shortest paths and drug similarities between W_rr and S_r in each subrange. Similarly, we conduct the correlation analysis on disease similarity S_d and disease shortest path matrix W_dd in the same way and report the results in Fig. 6. We can clearly observe that the average shortest paths between drug pairs or disease pairs decrease as the increases of their similarities. This observation also differentiates the contribution of different walk lengths based on the assumption that nodes with shorter walk lengths contribute more to the similarity between two nodes.

Fig. 6 — The average shortest path in different drug similarity subranges (Left), and in different disease similarity subranges (Right)

Conclusion

In this paper, we proposed a computational drug repositioning approach that encodes the drug chemical structure information, disease symptom information and known drug-disease interactions information into a heterogeneous network. Our approach accounts for structural difference of subnetworks of the heterogeneous network by bi-random walk, and for the contribution differences of different nodes via specifying quantified individual walk-lengths to them. Experimental study demonstrates that our approach performs better than other related competitive methods and the individual walk lengths contribute to an improved performance. We want to remark that our proposed approach can be easily generalized to predict links between nodes of a heterogeneous network.

Acknowledgements

The two-page short paper of this work was an oral presentation in the 14th International Symposium on Bioinformatics Research and Applications (ISBRA 2018).

About this supplement

This article has been published as part of BMC Bioinformatics, Volume 20 Supplement 15, 2019: Selected articles from the 14th International Symposium on Bioinformatics Research and Applications (ISBRA-18): bioinformatics. The full contents of the supplement are available at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-20-supplement-15

Abbreviations

AUPR: The area under precision-recall curve
AUROC: The area under the receiver operating characteristic curve
CTD: Comparative Toxicogenomics Database
DR-IBRW: Drug repositioning approach based on individual bi-random walks
OMIM: Online Mendelian Inheritance in Man
PR: precision-recall
ROC: receiver operating characteristic

Authors’ contributions

YW and GY conceived, designed and carried out the experiments; GY and MG initialized and conceived of the whole program; YW and GY analyzed the results, drafted and finalized the manuscript; MG, YR and LJ were involved in revising the manuscript. All authors read and approved the final manuscript.

Funding

Publication cost is funded by Natural Science Foundation of China (61741214 and 61872300), Fundamental Research Funds for the Central Universities (XDJK2019B024), Natural Science Foundation of CQ CSTC (cstc2018jcyjAX0228).

Availability of data and material

The datasets used during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Maozu Guo, Email: guomaozu@bucea.edu.cn.

Guoxian Yu, Email: gxyu@swu.edu.cn.

References

1.Wang W, Yang S, Zhang X, Li J. Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics. 2014;30(20):2923–30. doi: 10.1093/bioinformatics/btu403. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Sardana D., Zhu C., Zhang M., Gudivada R. C., Yang L., Jegga A. G. Drug repositioning for orphan diseases. Brief Bioinform. 2011;12(4):346–56. doi: 10.1093/bib/bbr021. [DOI] [PubMed] [Google Scholar]
3.Hurle MR, Yang L, Xie Q, Rajpal DK, Sanseau P, Agarwal P. Computational drug repositioning: From data to therapeutics. Clin Pharmacol Ther. 2013;93(4):335–41. doi: 10.1038/clpt.2013.1. [DOI] [PubMed] [Google Scholar]
4.Ashburn TT, Thor KB. Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov. 2004;3(4):673–683. doi: 10.1038/nrd1468. [DOI] [PubMed] [Google Scholar]
5.Graul Al, Sorbera L, Pina P, Tell M, Cruces E, Rosa E. Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov. 2004;3(4):673–83. doi: 10.1038/nrd1468. [DOI] [PubMed] [Google Scholar]
6.Martínez V, Navarro C, Cano C, Fajardo W, Blanco A. Drugnet: Network-based drug–disease prioritization by integrating heterogeneous data. Artif Intell Med. 2015;63(1):41–9. doi: 10.1016/j.artmed.2014.11.003. [DOI] [PubMed] [Google Scholar]
7.Bleakley K., Yamanishi Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics. 2009;25(18):2397–403. doi: 10.1093/bioinformatics/btp433. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, Zhou W, Huang J, Tang Y. Prediction of drug-target interactions and drug repositioning via network-based inference. PLOS Computat Biol. 2012;8(5):1–12. doi: 10.1371/journal.pcbi.1002503. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Wang W, Yang S, Li J. Drug target predictions based on heterogeneous graph inference. In: Pacific Symposium on Biocomputing: 2013. p. 53–64. 10.1142/9789814447973_0006. [PMC free article] [PubMed]
10.Chiang AP, Butte AJ. Systematic evaluation of drug-disease relationships to identify leads for novel drug uses. Clin Pharmacol Ther. 2009;86(5):507–10. doi: 10.1038/clpt.2009.103. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Gottlieb A, Stein GY, Ruppin E, Sharan R. Predict: a method for inferring novel drug indications with application to personalized medicine. Mol Syst Biol. 2011; 7(1). [DOI] [PMC free article] [PubMed]
12.Luo H, Wang J, Li M, Luo J, Peng X, Wu F-X, Pan Y. Drug repositioning based on comprehensive similarity measures and bi-random walk algorithm. Bioinformatics. 2016;32(17):2664–71. doi: 10.1093/bioinformatics/btw228. [DOI] [PubMed] [Google Scholar]
13.Liu H, Song Y, Guan J, Luo L, Zhuang Z. Inferring new indications for approved drugs via random walk on drug-disease heterogenous networks. BMC Bioinformatics. 2016;17(17):539. doi: 10.1186/s12859-016-1336-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Zhou X., Menche J., Barabási A-L, Sharma A. Human symptoms-?disease network. Nature Communications. 2014;5:4212. doi: 10.1038/ncomms5212. [DOI] [PubMed] [Google Scholar]
15.Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, Wang J, Yu B, Zhang J, Bryant SH. Pubchem substance and compound databases. Nucleic Acids Res. 2016;44(D1):1202–13. doi: 10.1093/nar/gkv951. [DOI] [PMC free article] [PubMed] [Google Scholar]
16.Lu Y, Guo Y, Korhonen A. Link prediction in drug-target interactions network using similarity indices. BMC Bioinformatics. 2017;18(1):39. doi: 10.1186/s12859-017-1460-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
17.Tong H, Faloutsos C, Pan J-Y. Random walk with restart: fast solutions and applications. Knowl Inf Syst. 2008;14(3):327–46. [Google Scholar]
18.Lofgren P, Banerjee S, Goel A. Personalized pagerank estimation and search: A bidirectional approach. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining: 2016. p. 163–172. 10.1145/2835776.2835823.
19.Lipscomb CE. Medical subject headings (mesh) Bull Med Libr Assoc. 2000;88(3):265–6. [PMC free article] [PubMed] [Google Scholar]
20.Li J, Lu Z. A new method for computational drug repositioning using drug pairwise similarity. In: IEEE International Conference on Bioinformatics and Biomedicine: 2012. p. 1–4. 10.1109/bibm.2012.6392722. [DOI] [PMC free article] [PubMed]
21.van der Vaart A, van Zanten H. Information rates of nonparametric gaussian process methods. J Mach Learn Res. 2011;12(6):2095–119. [Google Scholar]
22.Zhang F, Gong T, Lee VE, Zhao G, Rong C, Qu G. Fast algorithms to evaluate collaborative filtering recommender systems. Knowl-Based Syst. 2016;96:96–103. [Google Scholar]
23.Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R. Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol. 2010;6(1):1000641. doi: 10.1371/journal.pcbi.1000641. [DOI] [PMC free article] [PubMed] [Google Scholar]
24.Nepusz T, Yu H, Paccanaro A. Detecting overlapping protein complexes in protein-protein interaction networks. Nat Methods. 2012;9:471. doi: 10.1038/nmeth.1938. [DOI] [PMC free article] [PubMed] [Google Scholar]
25.Wang J, Zuo R. Identification of geochemical anomalies through combined sequential gaussian simulation and grid-based local singularity analysis. Comput Geosci. 2018;118:52–64. [Google Scholar]
26.Yu S, Zhang A, Li H. A review of estimating the shape parameter of generalized gaussian distribution. J Comput Inf Syst. 2012;8(21):9055–64. [Google Scholar]
27.Xie M, Hwang T, Kuang R. Prioritizing disease genes by bi-random walk. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2012;7301 LNAI(PART 2):292–303. [Google Scholar]
28.Yu G, Fu G, Wang J, Zhao Y. Newgoa: Predicting new go annotations of proteins by bi-random walks on a hybrid graph. IEEE/ACM Trans Comput Biol Bioinforma. 2018;15(4):1390–402. doi: 10.1109/TCBB.2017.2715842. [DOI] [PubMed] [Google Scholar]
29.Estrada E. Generalized walks-based centrality measures for complex biological networks. J Theor Biol. 2010;263(4):556–65. doi: 10.1016/j.jtbi.2010.01.014. [DOI] [PubMed] [Google Scholar]
30.Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks. J Am Soc Inf Sci Technol. 2007;58(7):1019–31. [Google Scholar]
31.Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The comparative toxicogenomics database: update 2017. Nucleic Acids Res. 2016;45(D1):972–8. doi: 10.1093/nar/gkw838. [DOI] [PMC free article] [PubMed] [Google Scholar]
32.Breckenridge A, Orme M, Serlin M, Maciver M. Labetalol in essential hypertension. Br J Clin Pharmacol. 1982;13(S1):37–9. doi: 10.1111/j.1365-2125.1982.tb01887.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
33.Greminger P, Foerster E, Vetter H, Baumgart P, Vetter W. Minoxidil and captopril in severe hypertension. Klin Wochenschr. 1986;64(7):327–32. doi: 10.1007/BF01711952. [DOI] [PubMed] [Google Scholar]
34.Khan SS, Cuttica MJ, Beussink-Nelson L, Kozyleva A, Sanchez C, Mkrdichian H, Selvaraj S, Dematte JE, Lee DC, Shah SJ. Effects of ranolazine on exercise capacity, right ventricular indices, and hemodynamic characteristics in pulmonary arterial hypertension: A pilot study. Pulm Circ. 2015;5(3):547–56. doi: 10.1086/682427. [DOI] [PMC free article] [PubMed] [Google Scholar]
35.Setaluri V, Clark AR, Feldman SR. Transmittance properties of flurandrenolide tape for psoriasis: Helpful adjunct to phototherapy. J Cutan Med Surg. 2000;4(4):196–8. doi: 10.1177/120347540000400404. [DOI] [PubMed] [Google Scholar]
36.Chen X. Katzlda: Katz measure for the lncrna-disease association prediction. Sci Rep. 2015;5:16840. doi: 10.1038/srep16840. [DOI] [PMC free article] [PubMed] [Google Scholar]
37.Yang X, Gao L, Guo X, Shi X, Wu H, Song F, Wang B. A network based method for analysis of lncrna-disease associations and prediction of lncrnas implicated in diseases. PLOS ONE. 2014;9(1):1–10. doi: 10.1371/journal.pone.0087797. [DOI] [PMC free article] [PubMed] [Google Scholar]
38.Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18(1):39–43. [Google Scholar]
39.Krauthammer M, Kaufmann CA, Gilliam TC, Rzhetsky A. Molecular triangulation: Bridging linkage and molecular-network information for identifying candidate genes in alzheimer’s disease. Proc Nat Acad Sci. 2004;101(42):15148–53. doi: 10.1073/pnas.0404315101. [DOI] [PMC free article] [PubMed] [Google Scholar]
40.Radivojac P, Peng K, Clark WT, Peters BJ, Mohan A, Boyle SM, Mooney SD. An integrated approach to inferring gene-disease associations in humans. Proteins. 2008;72(3):1030–7. doi: 10.1002/prot.21989. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The datasets used during the current study are available from the corresponding author on reasonable request.

[CR1] 1.Wang W, Yang S, Zhang X, Li J. Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics. 2014;30(20):2923–30. doi: 10.1093/bioinformatics/btu403. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR2] 2.Sardana D., Zhu C., Zhang M., Gudivada R. C., Yang L., Jegga A. G. Drug repositioning for orphan diseases. Brief Bioinform. 2011;12(4):346–56. doi: 10.1093/bib/bbr021. [DOI] [PubMed] [Google Scholar]

[CR3] 3.Hurle MR, Yang L, Xie Q, Rajpal DK, Sanseau P, Agarwal P. Computational drug repositioning: From data to therapeutics. Clin Pharmacol Ther. 2013;93(4):335–41. doi: 10.1038/clpt.2013.1. [DOI] [PubMed] [Google Scholar]

[CR4] 4.Ashburn TT, Thor KB. Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov. 2004;3(4):673–683. doi: 10.1038/nrd1468. [DOI] [PubMed] [Google Scholar]

[CR5] 5.Graul Al, Sorbera L, Pina P, Tell M, Cruces E, Rosa E. Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov. 2004;3(4):673–83. doi: 10.1038/nrd1468. [DOI] [PubMed] [Google Scholar]

[CR6] 6.Martínez V, Navarro C, Cano C, Fajardo W, Blanco A. Drugnet: Network-based drug–disease prioritization by integrating heterogeneous data. Artif Intell Med. 2015;63(1):41–9. doi: 10.1016/j.artmed.2014.11.003. [DOI] [PubMed] [Google Scholar]

[CR7] 7.Bleakley K., Yamanishi Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics. 2009;25(18):2397–403. doi: 10.1093/bioinformatics/btp433. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR8] 8.Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, Zhou W, Huang J, Tang Y. Prediction of drug-target interactions and drug repositioning via network-based inference. PLOS Computat Biol. 2012;8(5):1–12. doi: 10.1371/journal.pcbi.1002503. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR9] 9.Wang W, Yang S, Li J. Drug target predictions based on heterogeneous graph inference. In: Pacific Symposium on Biocomputing: 2013. p. 53–64. 10.1142/9789814447973_0006. [PMC free article] [PubMed]

[CR10] 10.Chiang AP, Butte AJ. Systematic evaluation of drug-disease relationships to identify leads for novel drug uses. Clin Pharmacol Ther. 2009;86(5):507–10. doi: 10.1038/clpt.2009.103. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR11] 11.Gottlieb A, Stein GY, Ruppin E, Sharan R. Predict: a method for inferring novel drug indications with application to personalized medicine. Mol Syst Biol. 2011; 7(1). [DOI] [PMC free article] [PubMed]

[CR12] 12.Luo H, Wang J, Li M, Luo J, Peng X, Wu F-X, Pan Y. Drug repositioning based on comprehensive similarity measures and bi-random walk algorithm. Bioinformatics. 2016;32(17):2664–71. doi: 10.1093/bioinformatics/btw228. [DOI] [PubMed] [Google Scholar]

[CR13] 13.Liu H, Song Y, Guan J, Luo L, Zhuang Z. Inferring new indications for approved drugs via random walk on drug-disease heterogenous networks. BMC Bioinformatics. 2016;17(17):539. doi: 10.1186/s12859-016-1336-7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR14] 14.Zhou X., Menche J., Barabási A-L, Sharma A. Human symptoms-?disease network. Nature Communications. 2014;5:4212. doi: 10.1038/ncomms5212. [DOI] [PubMed] [Google Scholar]

[CR15] 15.Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, Wang J, Yu B, Zhang J, Bryant SH. Pubchem substance and compound databases. Nucleic Acids Res. 2016;44(D1):1202–13. doi: 10.1093/nar/gkv951. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR16] 16.Lu Y, Guo Y, Korhonen A. Link prediction in drug-target interactions network using similarity indices. BMC Bioinformatics. 2017;18(1):39. doi: 10.1186/s12859-017-1460-z. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR17] 17.Tong H, Faloutsos C, Pan J-Y. Random walk with restart: fast solutions and applications. Knowl Inf Syst. 2008;14(3):327–46. [Google Scholar]

[CR18] 18.Lofgren P, Banerjee S, Goel A. Personalized pagerank estimation and search: A bidirectional approach. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining: 2016. p. 163–172. 10.1145/2835776.2835823.

[CR19] 19.Lipscomb CE. Medical subject headings (mesh) Bull Med Libr Assoc. 2000;88(3):265–6. [PMC free article] [PubMed] [Google Scholar]

[CR20] 20.Li J, Lu Z. A new method for computational drug repositioning using drug pairwise similarity. In: IEEE International Conference on Bioinformatics and Biomedicine: 2012. p. 1–4. 10.1109/bibm.2012.6392722. [DOI] [PMC free article] [PubMed]

[CR21] 21.van der Vaart A, van Zanten H. Information rates of nonparametric gaussian process methods. J Mach Learn Res. 2011;12(6):2095–119. [Google Scholar]

[CR22] 22.Zhang F, Gong T, Lee VE, Zhao G, Rong C, Qu G. Fast algorithms to evaluate collaborative filtering recommender systems. Knowl-Based Syst. 2016;96:96–103. [Google Scholar]

[CR23] 23.Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R. Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol. 2010;6(1):1000641. doi: 10.1371/journal.pcbi.1000641. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR24] 24.Nepusz T, Yu H, Paccanaro A. Detecting overlapping protein complexes in protein-protein interaction networks. Nat Methods. 2012;9:471. doi: 10.1038/nmeth.1938. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR25] 25.Wang J, Zuo R. Identification of geochemical anomalies through combined sequential gaussian simulation and grid-based local singularity analysis. Comput Geosci. 2018;118:52–64. [Google Scholar]

[CR26] 26.Yu S, Zhang A, Li H. A review of estimating the shape parameter of generalized gaussian distribution. J Comput Inf Syst. 2012;8(21):9055–64. [Google Scholar]

[CR27] 27.Xie M, Hwang T, Kuang R. Prioritizing disease genes by bi-random walk. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2012;7301 LNAI(PART 2):292–303. [Google Scholar]

[CR28] 28.Yu G, Fu G, Wang J, Zhao Y. Newgoa: Predicting new go annotations of proteins by bi-random walks on a hybrid graph. IEEE/ACM Trans Comput Biol Bioinforma. 2018;15(4):1390–402. doi: 10.1109/TCBB.2017.2715842. [DOI] [PubMed] [Google Scholar]

[CR29] 29.Estrada E. Generalized walks-based centrality measures for complex biological networks. J Theor Biol. 2010;263(4):556–65. doi: 10.1016/j.jtbi.2010.01.014. [DOI] [PubMed] [Google Scholar]

[CR30] 30.Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks. J Am Soc Inf Sci Technol. 2007;58(7):1019–31. [Google Scholar]

[CR31] 31.Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The comparative toxicogenomics database: update 2017. Nucleic Acids Res. 2016;45(D1):972–8. doi: 10.1093/nar/gkw838. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR32] 32.Breckenridge A, Orme M, Serlin M, Maciver M. Labetalol in essential hypertension. Br J Clin Pharmacol. 1982;13(S1):37–9. doi: 10.1111/j.1365-2125.1982.tb01887.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR33] 33.Greminger P, Foerster E, Vetter H, Baumgart P, Vetter W. Minoxidil and captopril in severe hypertension. Klin Wochenschr. 1986;64(7):327–32. doi: 10.1007/BF01711952. [DOI] [PubMed] [Google Scholar]

[CR34] 34.Khan SS, Cuttica MJ, Beussink-Nelson L, Kozyleva A, Sanchez C, Mkrdichian H, Selvaraj S, Dematte JE, Lee DC, Shah SJ. Effects of ranolazine on exercise capacity, right ventricular indices, and hemodynamic characteristics in pulmonary arterial hypertension: A pilot study. Pulm Circ. 2015;5(3):547–56. doi: 10.1086/682427. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR35] 35.Setaluri V, Clark AR, Feldman SR. Transmittance properties of flurandrenolide tape for psoriasis: Helpful adjunct to phototherapy. J Cutan Med Surg. 2000;4(4):196–8. doi: 10.1177/120347540000400404. [DOI] [PubMed] [Google Scholar]

[CR36] 36.Chen X. Katzlda: Katz measure for the lncrna-disease association prediction. Sci Rep. 2015;5:16840. doi: 10.1038/srep16840. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR37] 37.Yang X, Gao L, Guo X, Shi X, Wu H, Song F, Wang B. A network based method for analysis of lncrna-disease associations and prediction of lncrnas implicated in diseases. PLOS ONE. 2014;9(1):1–10. doi: 10.1371/journal.pone.0087797. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR38] 38.Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18(1):39–43. [Google Scholar]

[CR39] 39.Krauthammer M, Kaufmann CA, Gilliam TC, Rzhetsky A. Molecular triangulation: Bridging linkage and molecular-network information for identifying candidate genes in alzheimer’s disease. Proc Nat Acad Sci. 2004;101(42):15148–53. doi: 10.1073/pnas.0404315101. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR40] 40.Radivojac P, Peng K, Clark WT, Peters BJ, Mohan A, Boyle SM, Mooney SD. An integrated approach to inferring gene-disease associations in humans. Proteins. 2008;72(3):1030–7. doi: 10.1002/prot.21989. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Drug repositioning based on individual bi-random walks on a heterogeneous network

Yuehui Wang

Maozu Guo

Yazhou Ren

Lianyin Jia

Guoxian Yu

Conference

Abstract

Background

Results

Conclusions

Background

Materials and methods

Dataset

Similarity measures

Quantifying individual walk-length

Individual bi-random walk

Fig. 1.

Results and discussion

Performance comparison with other methods

Table 1.

Fig. 2.

Individual walk-length analysis

Fig. 3.

Drug and disease similarity analysis

Fig. 4.

Experiments on another two datasets

Table 2.

Case study

Table 3.

Table 4.

Table 5.

Table 6.

Table 7.

Table 8.

Quantified individual walk length is reasonable

Fig. 5.

Fig. 6.

Conclusion

Acknowledgements

About this supplement

Abbreviations

Authors’ contributions

Funding

Availability of data and material

Ethics approval and consent to participate

Consent for publication

Competing interests

Footnotes

Contributor Information

References

Associated Data

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases