PLOS Computational Biology. 2020 Aug 7;16(8):e1008098. doi: 10.1371/journal.pcbi.1008098

A machine learning and network framework to discover new indications for small molecules

Coryandar Gilvary 1,2,3,4,#, Jamal Elkhader 1,2,3,4,#, Neel Madhukar 5, Claire Henchcliffe 6, Marcus D Goncalves 3,7, Olivier Elemento 1,2,3,4,5,8,*
Editor: Avner Schlessinger 9
PMCID: PMC7437923  PMID: 32764756

Abstract

Drug repurposing, identifying novel indications for drugs, bypasses common drug development pitfalls to ultimately deliver therapies to patients faster. However, most repurposing discoveries have been led by anecdotal observations (e.g. Viagra) or experimental-based repurposing screens, which are costly, time-consuming, and imprecise. Recently, more systematic computational approaches have been proposed; however, these rely on information about the diseases a drug is already approved to treat. This inherently limits the algorithms, making them unusable for investigational molecules. Here, we present a computational approach to drug repurposing, CATNIP, that requires only biological and chemical information of a molecule. CATNIP is trained with 2,576 diverse small molecules and uses 16 different drug similarity features, such as structural, target, or pathway based similarity. This model obtains significant predictive power (AUC = 0.841). Using our model, we created a repurposing network to identify broad scale repurposing opportunities between drug types. By exploiting this network, we identified literature-supported repurposing candidates, such as the use of systemic hormonal preparations for the treatment of respiratory illnesses. Furthermore, we demonstrated that we can use our approach to identify novel uses for defined drug classes. We found that adrenergic uptake inhibitors, specifically amitriptyline and trimipramine, could be potential therapies for Parkinson’s disease. Additionally, using CATNIP, we predicted the kinase inhibitor, vandetanib, as a possible treatment for Type 2 Diabetes. Overall, this systematic approach to drug repurposing lays the groundwork to streamline future drug development efforts.

Author summary

Currently, clinical approval of a drug is an arduous process that results in an overwhelming number of compounds failing due to safety or efficacy concerns, which leaves patients without novel, lifesaving treatments. The idea of drug repurposing is to take approved drugs, or compounds that were shelved for reasons other than safety, and identify new diseases for them to treat. This would allow drugs, if they are sufficiently effective, to go through the FDA approval process quickly and be available to patients sooner, which also cuts the ever-growing cost of novel compound research and development. Here, we introduce CATNIP, a computational model that can predict novel indications for specific drugs or entire drug classes. This approach analyzes drug similarity across a wide range of biological, chemical and clinical features, giving a complete picture of each drug’s mechanism and possible indications. Interestingly, CATNIP can be used not only for previously approved drugs but also for shelved compounds, which are often overlooked in previous repurposing analyses. Most importantly, CATNIP successfully identified novel treatments for both Parkinson’s disease and Type 2 Diabetes, which are currently undergoing pre-clinical validation.

Introduction

With over $800 million spent bringing a single drug to market over the course of 15 years, drug development has remained a costly and time-consuming affair[1]. In response, there has been an increase in interest in drug repurposing, the identification of novel indications for known, safe drugs. Successes in this area have been seen in the past, most notably in sildenafil (e.g. Viagra), which was originally intended to treat hypertension and angina pectoris but was later repurposed to treat erectile dysfunction. Other examples of compounds repurposed for new therapeutic applications include minoxidil[2] and raloxifene[3], which are now used to treat androgenic alopecia and osteoporosis, respectively. However, most of these repurposing opportunities were discovered through inefficient approaches including anecdotal observations or hypothesis-driven investigations, and a more efficient approach could lead to many more repurposing opportunities.

Computational approaches for repurposing drugs are appealing in that they can be systematically and quickly applied to many drugs at a low cost compared to their experimental counterparts. One computational approach that has proven to be invaluable in other areas of the drug development pipeline is machine learning. Machine learning is the use of computational algorithms to learn from available data to make novel predictions and gain new insight. Using this technique, one can create unbiased algorithms to match seemingly disparate drugs by comparing their common features[4], such as clinical indication, toxicity profile[5] or therapeutic target[6, 7]. Previously, our lab used a ‘similarity’ approach, leveraging the principle that similar drugs tend to have similar characteristics, to predict a drug’s target by investigating the known targets of other drugs that were predicted to be “similar” to the investigated drug based on shared features[6]. We found that DRD2, a dopamine receptor, was the predicted target for the compound ONC201. After identifying and experimentally validating this target, clinical trials were shifted to focus on gliomas, which are now successfully completing phase two trials at the time of this publication[8]. The approach of leveraging drug similarity could immensely aid drug repurposing efforts with the appropriate data.

Others have successfully used this ‘similarity’ approach to repurpose drugs and demonstrated high predictive power when tested against FDA approved drug-diseases[9]. However, these methods have primarily linked drugs together using a disease-centric approach instead of using features related to the drug itself (i.e. drug-centric). These repurposing opportunities are identified by predicting diseases similar to the diseases a drug is already known to treat. Disease similarities can be based on semantic, pathophysiological, or clinical similarities related to the drug’s clinical indication. For example, PREDICT, a repurposing method developed by Gottlieb et al.[10], exploits the semantic similarity of disease terms as a form of disease-disease similarity. Such approaches, while reliable, limit the scope of the repositioning effort in several ways. First, the vast majority of small molecules never reach clinical approval and would be overlooked in this type of analysis. Second, the use of a disease-centric approach biases repurposing predictions toward exclusively similar clinical diseases (e.g., cancer drugs to other cancer types) [11]. We postulated that using solely drug information, such as chemical and biological features, would be a more effective and broader approach to drug repurposing.

Here, we propose a novel approach to drug repurposing, implemented in a platform we call Creating A Translational Network for Indication Prediction (CATNIP). CATNIP is a machine-learning algorithm that learns to predict whether two molecules share an indication based solely on the drugs’ chemical and biological features, using 2,576 unique drugs. The systematic application of CATNIP to molecule pairs creates a network with ~4.6 million edges that can then be used to identify potential drug repurposing opportunities. By identifying feature importance through the use of chemical structure and target information to make broad scale predictions, CATNIP is able to effectively bridge between different therapeutic indications to advance methods of drug repurposing. In this report, we have identified various candidate drug classes that are predicted to have therapeutic activity outside of their intended indication in diseases such as Parkinson’s disease and Type 2 Diabetes.

Results

Variance in drug indication nomenclature can be standardized

We collected a wide variety of drugs (N = 3,066, including both approved and investigational molecules) with a diverse set of indications to ensure that our drug network covered a large portion of the known chemical space. A subset of these drugs (2,576 FDA approved drugs and 2,492 indications taken from DrugBank [12]) were used as a gold-standard of drug-indication associations in the training set for the model. Disease names are often not standardized, which can lead to many diverse names for the same disease. As a result, many drug pairs appear not to share an indication when they are in fact associated with two different names for the same disease. To address inconsistencies in nomenclature for drug indications, such as “prostate carcinoma” and “carcinoma of the prostate”, the MetaMap tool [13] was applied to map disease names to UMLS concepts (Methods). This standardization of medical terminologies allowed us to reconcile naming variations in the database and to confirm that drugs did, in fact, treat the same disease (examples of these variations and their mappings may be seen in Table 1). Using MetaMap, we clustered the 2,492 DrugBank indications into 1,042 standardized indications. A multitude of indication types were included in this standardization including, but not limited to, oncological, mental health, and neurological diseases (S1A Fig). Our rigorous standardization of drug indications ensured an accurate training set, allowing for the discovery and modeling of drug-indication relationships.

Table 1. Indication nomenclatures and their mappings.

MetaMap Mapped Indication | Indication (DrugBank) | Indication ID (DrugBank) | Number of unique drugs associated with Indication ID | Unique drugs associated with Indication ID
Prostate Carcinoma | Advanced Prostate Carcinoma | DBCOND0070333 | 2 | Cyproterone acetate, Esterified estrogens
 | Advanced carcinoma of the prostate | DBCOND0020265 | 1 | Goserelin
Acne Vulgaris | Severe Acne | DBCOND0077433 | 3 | Cyproterone acetate, Doxycycline, Tetracycline
 | Acne | DBCOND0019842 | 10 | Aloe Vera Leaf, Benzoyl peroxide, Chloramphenicol, Clioquinol, Glycolic acid, Linoleic acid, Octasulfur, Salicylic acid, Silver, Spironolactone
 | Moderate Acne vulgaris | DBCOND0022329 | 3 | Ethinylestradiol, Minocycline, Norgestimate, Tazarotene
Dementia, Vascular | Mild Vascular Dementia | DBCOND0022662 | 1 | Memantine
 | Dementia, Vascular | DBCOND0029264 | 1 | Donepezil
 | Dementias | DBCOND0060453 | 3 | Galantamine, Trazodone, Trifluoperazine
Idiopathic Pulmonary Fibrosis | Idiopathic Pulmonary Fibrosis (IPF) | DBCOND0031843 | 2 | Nintedanib, Prednisolone
 | Mild Idiopathic Pulmonary Fibrosis | DBCOND0093824 | 1 | Pirfenidone
Paget Disease | Paget’s Disease | DBCOND0038793 | 4 | Alendronic acid, Pamidronic acid, Risedronic acid, Zoledronic acid
 | Paget’s Disease of Bone | DBCOND0030189 | 1 | Etidronic acid
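
To illustrate why this standardization matters, the following minimal sketch groups a few DrugBank indication names from Table 1 under their mapped concept and checks whether two drugs share an indication before and after mapping. The dictionary is a hand-copied stand-in for MetaMap output, not a call to MetaMap itself, and the function names are hypothetical.

```python
# Hand-copied subset of Table 1: DrugBank indication name -> mapped concept.
# This dictionary stands in for MetaMap output; it does not call MetaMap.
MAPPED_CONCEPT = {
    "Advanced Prostate Carcinoma": "Prostate Carcinoma",
    "Advanced carcinoma of the prostate": "Prostate Carcinoma",
    "Paget's Disease": "Paget Disease",
    "Paget's Disease of Bone": "Paget Disease",
}

# Drug -> raw DrugBank indication names (subset, also taken from Table 1).
RAW_INDICATIONS = {
    "Cyproterone acetate": {"Advanced Prostate Carcinoma"},
    "Goserelin": {"Advanced carcinoma of the prostate"},
    "Alendronic acid": {"Paget's Disease"},
    "Etidronic acid": {"Paget's Disease of Bone"},
}


def standardize(names):
    """Replace each raw name with its mapped concept (unmapped names pass through)."""
    return {MAPPED_CONCEPT.get(name, name) for name in names}


def share_indication(drug_a, drug_b, standardized=True):
    """Return True if the two drugs have at least one indication in common."""
    a, b = RAW_INDICATIONS[drug_a], RAW_INDICATIONS[drug_b]
    if standardized:
        a, b = standardize(a), standardize(b)
    return bool(a & b)


# Raw names differ, so the pair looks unrelated until the names are mapped.
print(share_indication("Cyproterone acetate", "Goserelin", standardized=False))
print(share_indication("Cyproterone acetate", "Goserelin"))
```

Without the mapping step, this pair would be labeled as not sharing an indication, which is exactly the kind of label noise the standardization is meant to remove from the training set.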

Drug pairs sharing indications have other similar characteristics

We hypothesized that pairs of drugs that shared at least one indication would have other similar drug characteristics (S1 Table). To test this hypothesis, we integrated the similarity of two drugs across chemical and biological drug properties, and created a computational model to predict if two drugs will share an indication (Fig 1). All 16 of the drug similarity features (S1 Table) collected could significantly distinguish between drug pairs known to share an indication and those not known to share an indication (S2–S5 Figs). For example, we found that drug pairs with a shared clinical indication, according to their listed DrugBank indications, tended to have significant overlap in targets (D-statistic = 0.168, p-value < 0.001, S2A Fig). The feature which best discriminated between drug pairs that shared a clinical indication versus drug pairs that do not was the similarity between the KEGG pathways that each drug’s targets are involved in (D-statistic = 0.241, p < 0.001, S4C Fig). Pathway similarity was calculated as the Jaccard Index between the KEGG pathways that contain each drug’s gene targets (Methods). The difference in effect size between the target similarity and the pathway similarity (D-statistic = 0.168 vs 0.241, respectively) indicates that the drugs do not necessarily have to target the same exact genes, but rather the same biological pathway, in order to share a clinical indication. Additionally, we found that drug pairs that share an indication had a more similar chemical structure than drug pairs that did not share an indication (D-statistic = 0.105, p-value < 0.001, S5A Fig). A biological network containing both physical and non-physical interactions was curated, containing 22,399 protein-coding genes, 6,679 drugs, and 170 TFs. This curated network provided another feature for our model, allowing us to utilize previously established interactions between proteins to aid with distinguishing drug pairs that share an indication. Overall, these features seem to indicate sufficient power in differentiating drugs that share and do not share indications, which we hypothesized can then be leveraged to create a predictive model.
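
As a minimal sketch of the two quantities used in this section, the code below computes a Jaccard index between two pathway sets and a two-sample D-statistic between two groups of similarity values. It assumes the reported D-statistic is the two-sample Kolmogorov-Smirnov statistic, which the text above does not state explicitly; the pathway labels and similarity values are made up for illustration.

```python
from bisect import bisect_right


def jaccard(set_a, set_b):
    """Jaccard index: |A & B| / |A | B| (defined here as 0.0 when both sets are empty)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def ks_d_statistic(sample_x, sample_y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(sample_x), sorted(sample_y)
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for value in xs + ys:
        cdf_x = bisect_right(xs, value) / len(xs)
        cdf_y = bisect_right(ys, value) / len(ys)
        d = max(d, abs(cdf_x - cdf_y))
    return d


# Illustrative pathway sets for two hypothetical drugs.
pathways_a = {"pathway_1", "pathway_2", "pathway_3"}
pathways_b = {"pathway_2", "pathway_3", "pathway_4"}
print(jaccard(pathways_a, pathways_b))  # 2 shared out of 4 distinct pathways

# Made-up similarity values for pairs that do and do not share an indication.
shared = [0.6, 0.7, 0.8, 0.9]
not_shared = [0.1, 0.2, 0.3, 0.7]
print(ks_d_statistic(shared, not_shared))
```

A larger D means the two similarity distributions are further apart, which is how the pathway feature (0.241) can be read as more discriminating than the target feature (0.168).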

Fig 1. Schematic of CATNIP repurposing approach.

A) Drug similarity properties are used to predict whether two drugs will share an indication using a gradient boosting model, referred to as CATNIP. B) Schematic showing the use of CATNIP output scores to create a network, with the scores used as edge weights. The color of each drug represents its known disease, demonstrating how one could identify novel indications for drugs through the network.

Drug pairs that share indications can be predicted by model

Using these diverse drug properties as features, we trained a Gradient Boosting model to predict if two drugs share a clinical indication. A Gradient Boosting model showed superior results when compared with other algorithms (Methods, S2 Table). The model output is a drug similarity score (hereafter referred to as a “CATNIP score”), which allows us to classify drug pairs that share clinical indications. We performed a 5-fold cross-validation analysis and achieved significant predictive performance with an area under the receiver operating characteristic curve (AUC) of 0.841 (Fig 2A). Because of the class imbalance in our dataset between drug pairs that share indications and those that do not (23,840 Shared, 1,299,623 Not Shared), we also evaluated our model with a precision-recall curve (PRC). When compared to random predictions, our model showed significant improvement (0.189 vs 0.0184 area-under PRC, S6 Fig). We retained a low percent of false positive predictions at various cut-offs (15.5% false positives and 5.4% false positives at a model prediction probability of two drugs sharing an indication of 50% and 75%, respectively), providing extra confidence that our predictions can lead to strong repurposing candidates.
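
The two evaluation ideas above can be sketched in a few lines. The code below computes a ROC AUC as the probability that a positive pair outscores a negative pair, and derives the positive-class prevalence from the class counts reported in the text; for a random classifier the expected area under the precision-recall curve is roughly that prevalence, which is in line with the reported random baseline. The labels and scores are made up, and this is not the evaluation code used for CATNIP.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Class counts reported in the text.
n_shared, n_not_shared = 23_840, 1_299_623
prevalence = n_shared / (n_shared + n_not_shared)
print(round(prevalence, 4))  # expected PRC area of a random classifier, about 0.018

# Toy labels (1 = shares an indication) and model scores, made up for illustration.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(roc_auc(labels, scores))
```

With fewer than 2% positives, a ROC AUC alone can look favorable even when precision is poor, which is why the precision-recall comparison against the prevalence baseline is informative here.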

Fig 2. CATNIP model accurately predicts drugs that share an indication and can be used for repurposing.

A) Receiver-operating characteristic curve for CATNIP, the performance for drug pairs with high and low structural similarity is also shown. B) A network of all drug pairs with a CATNIP score higher than 7.4. Nodes (drugs) are colored based on ATC classification and a specific example of repurposing between ATC classifications is highlighted. C) A graph of all ATC classification and the median CATNIP score between the drugs belonging to each of them (only including drug pairs with > 7.4 CATNIP score). The edges between ATC Classifications with the highest median CATNIP scores are colored red.

We found that the predictive model greatly benefited from the addition of diverse data types. While structure similarity showed the highest feature importance of any single feature (S11 Fig), when used as a single feature within a gradient boosting model it only achieved an AUC of 0.596 (S12 Fig). Interestingly, a model supplied with only ontology features (a Jaccard index for the GO terms of the known targets of each drug within a drug pair) achieved an AUC of 0.776. However, even this value, the highest AUC of any single feature type, is significantly below the performance obtained when combining all feature types.

In certain cases, a high predictive performance is expected, such as when two drugs are structurally similar or share targets. It has been shown before that structurally similar drugs have a high probability of treating the same indication[14]. However, there continue to be drug pairs known to treat the same indication that are not structurally similar. For example, tamoxifen[15] and anastrozole[16] are structurally dissimilar compounds (Dice similarity = 0.372) that treat the same indication (Metathesaurus term: Cancer, Breast). We recalculated our performance metrics to evaluate how our model performed in classifying drug pairs that shared indications when only exposed to drug pairs with low structure similarity (Dice < 0.5). High performance was retained under this condition, with an AUC of 0.828 (Fig 2A). Additionally, we found that our model performed similarly well when only exposed to drug pairs that did not have any known shared targets (AUC = 0.813, Fig 2A). These performance metrics confirm that our model is robust enough to predict if a drug pair will share an indication even for more difficult prediction tasks.
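
For readers unfamiliar with the Dice cut-off used above, the following minimal sketch computes a Dice coefficient over two sets of "on" fingerprint bits and applies the 0.5 threshold from the text. The bit sets are hypothetical, and the fingerprint type used for CATNIP is not assumed here.

```python
def dice_similarity(bits_a, bits_b):
    """Dice coefficient: 2|A & B| / (|A| + |B|) over sets of 'on' fingerprint bits."""
    total = len(bits_a) + len(bits_b)
    return 2 * len(bits_a & bits_b) / total if total else 0.0


def is_low_structural_similarity(bits_a, bits_b, cutoff=0.5):
    """Apply the Dice < 0.5 filter described in the text."""
    return dice_similarity(bits_a, bits_b) < cutoff


# Hypothetical fingerprints: indices of bits set for two made-up molecules.
fp_a = {1, 4, 9, 16, 25}
fp_b = {4, 9, 36, 49, 64}

print(dice_similarity(fp_a, fp_b))               # 2 shared bits, 10 bits in total
print(is_low_structural_similarity(fp_a, fp_b))  # this pair would be kept in the subset
```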

Network clusters identify drugs with similar clinical characteristics

We constructed a repurposing network by calculating a CATNIP score for all possible drug pairs found within DrugBank, and assigning the drugs as nodes and the CATNIP score as the edge weight. We pruned the network using a cut-off value of 7.4 for the CATNIP scores (Fig 2B), which included 792 different drug pairs. This cut-off is equivalent to a predicted probability of >99% to share an indication and allowed for a balance between confidence within our predictions and drug diversity and availability.
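
The network construction described above can be sketched as a thresholded, weighted, undirected graph. The example below uses only the handful of CATNIP scores quoted in this article and the 7.4 cut-off; the helper names are hypothetical and the full network is of course far larger.

```python
from collections import defaultdict

CUTOFF = 7.4  # pruning threshold stated in the text

# CATNIP scores quoted in this article for a few drug pairs.
quoted_scores = {
    ("bezafibrate", "pioglitazone"): 7.42,
    ("rimexolone", "mometasone"): 8.31,
    ("prednisone", "triamcinolone"): 8.13,
    ("ciclesonide", "hydrocortisone"): 8.43,
    ("vandetanib", "gliclazide"): 6.39,
}


def prune(scores, cutoff=CUTOFF):
    """Keep only edges whose score exceeds the cutoff."""
    return {pair: s for pair, s in scores.items() if s > cutoff}


def adjacency(edges):
    """Undirected weighted adjacency map: drug -> {neighbor: score}."""
    graph = defaultdict(dict)
    for (a, b), s in edges.items():
        graph[a][b] = s
        graph[b][a] = s
    return dict(graph)


network = adjacency(prune(quoted_scores))
print(sorted(network))  # vandetanib and gliclazide fall below the cutoff
```

Note that a pair such as vandetanib and gliclazide (6.39) is excluded from the pruned network even though it is examined later as a class-level repurposing candidate; the cut-off trades coverage for confidence.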

We hypothesized that drugs sharing at least one indication would cluster together in our network. To confirm this theory, we classified each drug per its 1st order Anatomical Therapeutic Chemical (ATC) classification. This identification is a method of distinguishing the clinical use of a drug that is widely used in European and North American chemoinformatics databases[17]. Using ATC, we observed clearly defined clusters within the repurposing network (Fig 2B). Many clusters featured multiple ATC classifications, suggesting potential repurposing opportunities. For example, one cluster included the thiazolidinediones, rosiglitazone and pioglitazone (ATC classification: ‘Alimentary Tract and Metabolism’) and the fibrates, fenofibrate and bezafibrate (ATC classification: ‘Cardiovascular system’). These two clustered ATC classifications were connected by a high (7.42) CATNIP score between bezafibrate and pioglitazone, an antidiabetic drug; a relationship driven by the shared targeting of PPARa and PPARg resulting in the improvement of lipid and glucose metabolism. Bezafibrate has shown efficacy in the treatment of Type 2 Diabetes in numerous retrospective and pre-clinical studies, including Phase 2 trials[18–20], but it is still not an approved antidiabetic. The identification of bezafibrate as a potential diabetes treatment is a key example of how CATNIP can be used to identify repurposing opportunities.

We reasoned that the connections between ATC classifications across all the drug clusters could provide additional aid for drug repurposing purposes. Using the pruned network (CATNIP Score > 7.4), we collected all the scores between drugs of differing ATC classifications. From this collection, we were able to determine the median score associated between each pair of ATC classifications. The ATC classifications with the highest median CATNIP scores had literature support for numerous repurposing efforts between them (Table 2). For example, drugs with the ATC classifications of “Respiratory System” and “Systemic Hormonal Preparations, excluding sex hormones and insulins” were strongly connected to each other (7.97 median CATNIP score). This connection was driven by highly scored pairs of drugs including rimexolone to mometasone (8.31 CATNIP score) and prednisone to triamcinolone (8.13 CATNIP score). These connections are supported by the fact that hormonal agents like glucocorticoids and beta adrenergic agonists have been used for decades to relax the airway musculature in patients with reactive airways disease and chronic obstructive pulmonary disease[21]. Interestingly, our analysis identified glucagon, a peptide hormone that increases blood glucose levels, as a candidate for “Respiratory System” repurposing and this use already has clinical support[22],[23]. Additionally, drugs classified as “Respiratory System” and “Dermatological” were also observed to be highly associated because of interactions such as the one between ciclesonide and hydrocortisone (8.43 CATNIP score). Ciclesonide and hydrocortisone do in fact share a clinical indication, “Asthma Bronchial”, giving added confidence to our findings. These types of network observations are important in laying the groundwork for suggesting novel clinical repurposing strategies for FDA-approved drugs.
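
The class-level aggregation described above can be sketched as a group-by followed by a median. In the code below, the drug names, ATC labels, and scores are all hypothetical placeholders; only the 7.4 cut-off and the idea of taking the median score between pairs of differing ATC classifications come from the text.

```python
from collections import defaultdict
from statistics import median


def median_score_by_class_pair(pair_scores, atc_class, cutoff=7.4):
    """Median score between each pair of distinct ATC classes, using pruned edges only."""
    grouped = defaultdict(list)
    for (a, b), score in pair_scores.items():
        class_a, class_b = atc_class[a], atc_class[b]
        if score > cutoff and class_a != class_b:
            # Sort the class names so (X, Y) and (Y, X) land in the same group.
            grouped[tuple(sorted((class_a, class_b)))].append(score)
    return {classes: median(values) for classes, values in grouped.items()}


# Hypothetical drugs, classes, and scores for illustration only.
toy_classes = {
    "drug_1": "Respiratory System",
    "drug_2": "Dermatologicals",
    "drug_3": "Respiratory System",
    "drug_4": "Dermatologicals",
    "drug_5": "Respiratory System",
}
toy_scores = {
    ("drug_1", "drug_2"): 8.0,
    ("drug_3", "drug_4"): 7.6,
    ("drug_1", "drug_4"): 9.0,
    ("drug_1", "drug_3"): 8.5,  # same class, ignored
    ("drug_2", "drug_5"): 7.0,  # below the cutoff, ignored
}

print(median_score_by_class_pair(toy_scores, toy_classes))
```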

Table 2. Literature Support for ATC Repurposing Predictions.

ATC Code 1 | ATC Code 2 | Reference
Dermatologicals | Respiratory System | [24–28]
Alimentary Tract and Metabolism | Respiratory System | [29–32]
Sensory Organs | Respiratory System | [33–35]
Systemic Hormonal Preparations, Excluding Sex Hormones And Insulins | Respiratory System | [36, 37]
Sensory Organs | Alimentary Tract and Metabolism | [38–42]

CATNIP identifies novel disease areas for drug classes

We investigated the ability to leverage CATNIP scores to identify repurposing opportunities by evaluating specific drug classes. Drug classes are predefined in DrugBank. In order to identify actionable repurposing possibilities, we narrowed this list down to 50 classes containing inhibitors, antagonists, or agonists of specific gene or protein families. We focused our attention on specific disease areas that are attractive for drug repurposing opportunities, due to a lack of current treatments or high rates of acquired resistance. The specific disease areas were: “mental disorders”, “neurological diseases”, “diabetes”, and “cancer” (cancer was further divided into specific cancer types due to the large variance in disease pathology between types, Methods).

We hypothesized that CATNIP scores could be used to identify specific drug classes that would be efficacious for a new disease area. For each drug class and disease area, we found the statistical difference in the CATNIP score distribution between two sets of drug pairs. The first set included pairs that had one drug within the drug class and the other drug approved for the disease in question, while the other set included drug pairs that had one drug within the drug class and the other drug not approved for the disease in question (Methods). We compared the effect size, estimated by the Wilcoxon location shift, for all drug class-disease pairs that had a significant difference in distribution compared to drug class-non-disease pairs (FDR < 0.1, Supplementary Data). By using CATNIP scores, we found that many well-known drug class-diseases associations could be recovered. For example, “muscarinic antagonists” were highly ranked for “neurological diseases” and many such agents are FDA-approved for this indication[43]. In addition, we found that “kinase inhibitors” were closely associated with the treatment of cancer and “dopamine antagonists” for the treatment of “mental disorders”[44, 45] (Wilcoxon Location Shift = 0.711–0.945 for “kinase inhibitors” and select cancer types, Location Shift = 0.882 for “dopamine antagonists” and “mental disorders”, p-value < 0.001, S7 Fig). In fact, almost all drug class-disease associations contained at least one FDA-approved drug for the respective disease, giving us added confidence in our model. Of note, each drug was allowed to be categorized into numerous drug classes, leading to unexpected, yet easily explained, results; for example, “dopamine antagonists” appearing as a top drug class for “neurological diseases”. This is due to risperidone, a drug traditionally used for schizophrenia and mood disorders, also having a secondary indication of Alzheimer’s type severe dementia.
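
The effect size used above can be sketched as follows, under the assumption that the "Wilcoxon location shift" is the Hodges-Lehmann estimator, i.e. the median of all pairwise differences between the two groups of CATNIP scores (the estimate commonly reported alongside a Wilcoxon rank-sum test). The scores are made up and the function name is hypothetical.

```python
from statistics import median


def location_shift(scores_disease, scores_other):
    """Hodges-Lehmann estimate: median of all pairwise differences (disease - other)."""
    return median(x - y for x in scores_disease for y in scores_other)


# Made-up CATNIP scores for one drug class.
# Pairs where the partner drug is approved for the disease of interest:
with_disease_drugs = [2.0, 3.0, 4.0]
# Pairs where the partner drug is not approved for that disease:
with_other_drugs = [1.0, 2.0, 2.5]

print(location_shift(with_disease_drugs, with_other_drugs))
```

A positive shift means the drug class tends to score higher against drugs approved for the disease than against other drugs, which is the signal used to rank drug class-disease pairs.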

Our method reached significant levels of predictive power for predicting both drug class-disease associations and individual drug-disease associations. When predicting drug class-disease associations, under our most lenient conditions (calling cases where at least one drug within the class was known to treat the disease a true positive), our method achieved a sensitivity of greater than 0.75. However, this improved to a sensitivity of 1 when we implemented stricter cut-offs (i.e., only calling drug class-disease associations true positives if >15% of drugs within the class treated that disease, S10 Fig). We additionally compared our method’s ability to predict individual drug-disease associations to that of a previously highlighted method, Gottlieb et al.’s PREDICT[10]. We found our method had a slightly higher AUPRC (0.674 vs. 0.645) and higher sensitivity (0.6268 vs. 0.6203) (S4 Table, S1 Methods). While these results indicate modest improvements over PREDICT, it is important to note that unlike in PREDICT, disease information is not a required feature in CATNIP’s machine learning approach. This means that CATNIP can be applied towards investigational molecules with no previously known indications. Additionally, by not using disease information as a feature, repositioning of drugs with known indications using CATNIP is not directly biased by the associated disease indication and instead uses mechanistic features (chemical structure and properties, targets, etc.) as part of the repositioning strategy.

Next, we further interrogated the drug classes associated with “neurological diseases” and “diabetes”, specifically. CATNIP scores could correctly identify drug classes known to treat these diseases (Table 3). To identify possible repurposing candidates, we focused our attention on drug classes shown to have a large positive effect size with this CATNIP analysis but are not currently approved for treatment. For “neurological diseases”, the use of adrenergic uptake inhibitors, traditionally used as antidepressants, was the top repurposing candidate; for “diabetes”, alpha 1 antagonists and kinase inhibitors were identified as possible novel treatments (Table 3). We believe further investigation into these drug classes and diseases could lead to successful clinical applications.

Table 3. Top Predictions of Drug Class Repurposing Opportunities.

Disease Area | Predicted Drug Class | Rank
Diabetes | Alpha1 Antagonists | 1
 | Kinase Inhibitor | 2
 | Protein Kinase Inhibitors | 3
 | Protein Synthesis Inhibitors | 4
 | Cytochrome P450 CYP2E1 Inhibitors | 5
 | Monoamine Oxidase Inhibitors | 6
Neurological | Adrenergic Uptake Inhibitors | 1
 | Adrenergic alpha Agonists | 2
 | Protease Inhibitors | 3

CATNIP interpretability reveals reasoning for repurposing candidates

From our list of repurposing candidates, we chose two novel drug class-disease associations to further investigate.

Adrenergic uptake inhibitors applied to Parkinson’s disease

First, we evaluated the relationship between “neurological diseases” and “adrenergic uptake inhibitors”. We focused on the drug pairs with the highest CATNIP scores, i.e. those predicted with the highest confidence to share at least one indication (Fig 3A). Of all the adrenergic uptake inhibitors, we found that amitriptyline and trimipramine, two anti-depressants, had the highest CATNIP scores with the “neurological diseases” drugs. The drugs that shared the strongest connections with amitriptyline and trimipramine were drugs approved for Parkinson’s disease (PD). Specifically, metixene, atropine, pergolide and benzatropine were associated with amitriptyline, according to CATNIP, and trimipramine was associated to benzatropine and rotigotine. Trimipramine was also strongly connected with orphenadrine, which is sometimes used off label in PD, but will not be included in the following analyses.

Fig 3. CATNIP networks identify drug class repurposing opportunities.

A) The network of neurological drugs and adrenergic uptake inhibitors drug pairs with the highest CATNIP scores. B) The decrease in the CATNIP score when removing each feature for amitriptyline and select Parkinson’s Disease drugs. C) The network of anti-diabetes and kinase inhibitor drug pairs with the highest CATNIP scores. D) The decrease in the CATNIP score when removing each feature for the drug pair vandetanib and gliclazide.

Using the CATNIP model, we evaluated which features contributed towards the prediction that amitriptyline and trimipramine share an indication with PD drugs. We found that target, gene ontology, and pathway similarity all strongly contributed to the predictions for both amitriptyline and trimipramine (Fig 3B, S8 Fig). Since target similarity and distance between targets (in a protein-protein interaction network) were among the top contributing features, we investigated which gene targets were shared amongst these drug pairs. We found that amitriptyline targets three specific gene classes that are also targeted by at least one of the PD drugs: muscarinic acetylcholine receptors, G protein-coupled receptors (GPCRs), and alpha-adrenergic receptors. Trimipramine also targets muscarinic acetylcholine receptors, alpha-adrenergic receptors, and dopamine transporters, which is similar to benzatropine, a PD drug. All these receptors have well-defined relationships with PD and other neurological diseases[43, 46, 47], which adds support for repurposing amitriptyline and/or trimipramine.
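
The feature-removal analysis summarized in Fig 3B and 3D can be illustrated with a generic leave-one-feature-out sketch: score a drug pair, then re-score it with each feature neutralized in turn and record the drop. Everything below is an assumption made for illustration. The scorer is a toy weighted sum rather than the trained gradient boosting model, "removing" a feature is modeled as setting it to 0, and the feature names, weights, and values are hypothetical.

```python
def ablation_deltas(score_fn, features, neutral=0.0):
    """Score drop when each feature is replaced by a neutral value, one at a time."""
    baseline = score_fn(features)
    deltas = {}
    for name in features:
        ablated = dict(features)
        ablated[name] = neutral  # assumption: removal is modeled as a neutral value
        deltas[name] = baseline - score_fn(ablated)
    return deltas


# Hypothetical stand-in scorer: a weighted sum, NOT the trained CATNIP model.
TOY_WEIGHTS = {"target": 3.0, "pathway": 2.0, "structure": 1.0}


def toy_score(features):
    return sum(TOY_WEIGHTS[name] * value for name, value in features.items())


# Made-up similarity features for one drug pair.
pair_features = {"target": 0.5, "pathway": 0.5, "structure": 0.25}
deltas = ablation_deltas(toy_score, pair_features)

# Rank features by how much the score falls when each one is removed.
for name, drop in sorted(deltas.items(), key=lambda item: -item[1]):
    print(name, drop)
```

Features with the largest drop are the ones a prediction leans on most, which is the reading applied above to target, gene ontology, and pathway similarity.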

Amitriptyline may be an ideal candidate for use in PD patients. We evaluated the molecular function gene ontology terms shared between amitriptyline and all four PD drugs. GPCR activity was once again identified (S1–S4 Files). We then interrogated the biological pathways these drug targets are involved in and found many broad GPCR pathways overlapping between amitriptyline and the PD drugs (S9 Fig) including the Reactome pathway “GASTRIN_CREB_SIGNALLING PATHWAY VIA PKC AND MAPK”. Several recent studies support the link between gastrin-releasing peptide signaling and brain function[48]. Through CATNIP, we have identified “adrenergic uptake inhibitors” like amitriptyline and trimipramine as a possible treatment for PD.

Kinase inhibitors applied to diabetes

Our CATNIP analysis identified an opportunity to repurpose “kinase inhibitors” for the treatment of diabetes (Fig 3B). Of the drug pairs evaluated in this context, the link between vandetanib, a thyroid cancer drug, and gliclazide, a Type 2 diabetes drug (CATNIP Score = 6.39, Fig 3C) was the strongest. This association was driven by target similarity and similarity between KEGG pathways of the drug targets (Fig 3D). Vandetanib and gliclazide have an overlapping target, VEGFA. Several KEGG pathways are shared between vandetanib and gliclazide including the “Cytokine cytokine receptor interaction” pathway (Supplementary Data). This pathway contains VEGFA, the shared target, and the epidermal growth factor receptor (EGFR), another one of vandetanib’s targets. The similarity between these two drugs’ targets and pathway effects leads us to believe there is strong potential for vandetanib to be repurposed.

Discussion

Although considerable improvements have been made in drug repurposing efforts over the past decade, the use of previous disease associations will eventually curtail these improvements due to the imposed restriction of previous knowledge. Our new approach, CATNIP, could provide a highly effective aid to drug repurposing endeavors. Here, we accurately predicted drugs that shared an indication, while keeping high levels of both sensitivity and specificity. Leveraging our prediction metric enabled us to generate a repurposing network and to identify repurposing predictions across system-wide drug classes.

The CATNIP method allows for broad-scale drug repurposing opportunities to be readily identified. By identifying and interpreting our drug similarity features, we can investigate the possible mechanisms behind these repurposing candidates. The benefit of using drug similarity features is two-fold. First, these features are readily available for both approved and investigational drugs, which have been underserved by previous repurposing methods. The features utilized in our model have been limited to those that may be frequently available for both early stage compounds and investigational compounds that may have been previously shelved due to a variety of reasons. Second, the interpretability of the features allows us to identify possible mechanisms of action when we back engineer what contributed to high CATNIP scores.

We found strong support for repurposing amitriptyline and trimipramine, both of which are in clinical use as anti-depressants, for PD. In addition to being adrenergic uptake inhibitors, these drugs act as serotonin blockers and anticholinergics, and they share the mechanisms overlapping with current PD drugs described above. Movement Disorder Society guidelines found insufficient evidence to support the use of amitriptyline for depression in PD[49] and a published Practice Parameter found only level C evidence for its use[50]. However, amitriptyline has been commonly used not only for depression but also for other off-label indications in neurological disorders, including pain[51]. While clinical trials have been conducted on the effect of amitriptyline on depression in PD patients[52], there are currently no trials evaluating amitriptyline or trimipramine as a treatment for other symptoms and signs of PD. There have, however, been preclinical studies evaluating amitriptyline as a potential therapy for PD. In rodent models of PD, amitriptyline affects levels of neurotrophic factors including BDNF[53] and decreases dopamine cell loss[54, 55]. It has also been suggested to mitigate microglial inflammation[56]. Moreover, it has been suggested that amitriptyline may have shorter term symptomatic motor benefit and may enhance levodopa efficacy[57].

When we more closely evaluated trimipramine, we found compelling evidence this could be a potential PD therapeutic. Specifically, the targets of trimipramine make it a potentially strong therapeutic to combat loss of motor function amongst PD patients. This benefit is due to the dual targeting of DRD2 and alpha 2 adrenergic receptors, which is similar to piribedil, an investigational PD medication that was not included within our final CATNIP network due to a lack of available information. In a review of piribedil, it was highlighted that the agonistic D2/D3 activity combined with alpha 2 adrenergic antagonism can lead to preservation of motor function[58]. However, further research must be done to better understand the exact effects that trimipramine has on both dopamine and alpha 2 adrenergic receptors. Further research into trimipramine could quickly lead to a clinical trial for PD patients with specific motor function end points.

We also identified a repurposing opportunity with kinase inhibitors for the treatment of diabetes, due to the strong predicted connection between vandetanib, a thyroid cancer drug, and gliclazide. While there have been some preclinical animal studies investigating the use of kinase inhibitors in diabetes[59, 60], to our knowledge, there has yet to be an approved kinase inhibitor for the treatment of diabetes. Both vandetanib and gliclazide are known to target VEGFA, which has shown a clear connection to diabetes pathology[61] and treatment[62]. Additionally, Hagberg et al. published work suggesting that antagonism of VEGFB, a gene within the same pathway as VEGFA, improves insulin sensitivity and increases skeletal muscle glucose uptake in db/db mice[63]. Because vandetanib targets VEGFR1[64], the receptor VEGFB binds, it could also have insulin sensitizing effects. Further experimental work is required to verify this hypothesis[65].

Besides the targeting of VEGFA/VEGFR1, vandetanib’s target EGFR can also potentially help diabetes pathology. Inflammatory cytokines (including, but not limited to, IL-8 and TNF-α) have been shown to be associated with the progression of diabetic neuropathy[66]. In past work, the inhibition of EGFR through the use of a kinase inhibitor reduced the expression of both IL-8 and TNF-α in rats[67]. Therefore, we believe vandetanib could be considered as a potential diabetes treatment, due to its ability to target EGFR, leading to a possible decrease in inflammatory cytokine production.

In addition to the exciting predicted repurposing opportunities we have chosen to highlight, many other drug classes showed significant repurposing potential for mental disorders, neurological diseases, and several different cancer types. While diving into each of these opportunities is outside the scope of this paper, we hope that researchers take it upon themselves to further investigate these candidate drug class-disease associations.

It is important to acknowledge certain limitations to CATNIP, such as data availability and the application to rare diseases. Although this model does not rely on disease similarity information, it does require known molecular target information to obtain peak predictive power. This target information can frequently be unavailable for early stage compounds. Additionally, this method would have limited use when searching for drugs to be repurposed for diseases with very few or no clinically approved compounds. Disease information is not used as a feature of molecules in CATNIP, thus making it applicable to investigational compounds, which by definition do not have any approved indications. We note that CATNIP does nonetheless rely on other molecules having a previously known indication in the CATNIP repurposing network.

To our knowledge, CATNIP is the first method capable of predicting a novel indication for a drug without relying on disease similarities. Our method not only utilizes a variety of different ligand-based features, but also combines them in a way that aids broad-scale repurposing, an idea that has rarely been explored before. Many predictions gained from CATNIP have substantial preclinical research or mechanistic support, suggesting that other predictions may also provide valuable information for future investigations. We have provided an online tool available for download (www.github.com/coryandar/CATNIP) that researchers can use to investigate repurposing opportunities for drugs or diseases of interest. Due to its demonstrated ability to identify large scale drug repurposing opportunities, CATNIP will likely serve as a significant basis towards a bright future in drug repurposing efforts.

Methods

Indication mapping

Using a custom Python script with the Beautiful Soup package[68], we web-scraped DrugBank 5.0[69] for drug compound names and indication information, finding a total of 3066 drugs. DrugBank was web-scraped to ensure the most up-to-date information, since the online version is updated in real time, unlike the XML release. Both the “structured indications” and the “description” of indications were collected, as there were instances where indications would be missed because they were not classified as “structured indications” (S3 Table). Indication information was run through the Unified Medical Language System (UMLS) tool, MetaMap[13], to match DrugBank-assigned indications to MeSH IDs and UMLS Concept Unique Identifiers (CUIs). MetaMap is a computational approach that combines linguistic and natural language processing techniques to map biomedical texts to the UMLS Metathesaurus. MetaMap has previously been shown to exceed human mapping capabilities[70]. Using a custom Python script, we identified synonym candidates, based upon which CUIs were consistently matched together, to further improve indication semantics. A random subset of 100 indications was manually reviewed and found to map correctly to standardized terms with 95% accuracy. We then filtered our list of drugs to the 2576 drugs that shared at least one indication with another drug.
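The final filtering step can be illustrated with a minimal Python sketch. It is not the authors' script: the function name, the input layout (a mapping from drug name to a set of indication CUIs) and the placeholder CUI strings are assumptions made purely for illustration.

```python
from collections import defaultdict


def drugs_with_shared_indication(drug_to_cuis):
    """Return the drugs that share at least one indication CUI with another drug."""
    # Invert the mapping: indication CUI -> set of drugs annotated with it.
    cui_to_drugs = defaultdict(set)
    for drug, cuis in drug_to_cuis.items():
        for cui in cuis:
            cui_to_drugs[cui].add(drug)

    # Keep every drug that appears next to at least one other drug.
    kept = set()
    for drugs in cui_to_drugs.values():
        if len(drugs) >= 2:
            kept.update(drugs)
    return kept


# Toy data: the CUI strings are placeholders, not real UMLS identifiers.
example = {
    "drug_a": {"CUI_1", "CUI_2"},
    "drug_b": {"CUI_2"},
    "drug_c": {"CUI_3"},
}
shared = drugs_with_shared_indication(example)
```

In this toy input only drug_a and drug_b are retained, because drug_c has no indication in common with any other drug.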

Similarity feature collection

Compound features

Similarities between drugs were found by creating all possible pairs of the drugs with known indications. Multiple compound similarity features and drug target similarity features were collected. The drug targets listed within DrugBank 5.0[69] were used as our set of ‘known targets’ for each drug. Additionally, we collected genomic information about each drug target using MSigDB [71, 72]. The sources and methods of similarity measurement used were as follows:

  1. MSigDB: The following gene sets were collected from MSigDB: gene ontologies, transcription factor, KEGG pathways, Reactome pathways, canonical pathways, motif, microRNA, oncogenic signature, immunogenic signature and chemical perturbation. For each gene set collection, the similarity between two drugs was measured as the Jaccard index between the sets in which the first drug’s targets are involved and the sets in which the second drug’s targets are involved.

  2. DrugBank[69]: The Jaccard index between the targets listed for both drugs. Additionally, the SMILES of each drug were collected from DrugBank and the R package ChemmineR [73] was used to find the Dice similarity between both drugs’ structures.

  3. DepMap [74]: The essentiality, measured using the CERES score, of each drug’s targets was collected. The correlation between the essentiality of each drug’s targets was found and the average was used to arrive at a single similarity score.

  4. Protein-protein interaction network: The in-house network (described below) was used to find the minimum distance between the targets of each drug pair.

  5. PubChem[75]: Bioassays were collected from PubChem using the PubChem API. For each drug, all bioassays that had a result listed as “active” were collected. The Jaccard index between all active bioassays for a pair of compounds was calculated.
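Several of the features listed above reduce to set-overlap measures. The sketch below shows the Jaccard index and the Dice similarity on plain Python sets. It is illustrative only: the paper computed structural similarity with ChemmineR in R, the item names are placeholders, and returning 0.0 for two empty sets is an assumption, not something the text specifies.

```python
def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B| for two collections of hashable items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets are treated as dissimilar
    return len(a & b) / len(union)


def dice_similarity(a, b):
    """2 * |A ∩ B| / (|A| + |B|), e.g. on sets of fingerprint features."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # same assumption as above
    return 2 * len(a & b) / total


# Placeholder target sets for two hypothetical drugs.
targets_drug_1 = {"GENE_A", "GENE_B", "GENE_C"}
targets_drug_2 = {"GENE_A", "GENE_D"}

target_jaccard = jaccard_index(targets_drug_1, targets_drug_2)  # 1 shared / 4 total
target_dice = dice_similarity(targets_drug_1, targets_drug_2)   # 2 * 1 / (3 + 2)
```

The same `jaccard_index` function would apply unchanged to sets of pathway names or active bioassay identifiers.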

In cases where there was insufficient or missing information, features were imputed by using the median value for that feature in drug pairs with complete information.
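A minimal sketch of this imputation rule follows, assuming each drug pair is stored as a dictionary of feature values with `None` marking missing information; that layout and the feature names are hypothetical. The medians are taken only over pairs with complete information, as described above.

```python
from statistics import median


def impute_features(rows):
    """Fill missing (None) feature values with the per-feature median
    computed over rows that have complete information."""
    complete = [r for r in rows if all(v is not None for v in r.values())]
    features = list(rows[0].keys())
    medians = {f: median(r[f] for r in complete) for f in features}
    return [
        {f: (medians[f] if r[f] is None else r[f]) for f in features}
        for r in rows
    ]


# Toy drug-pair rows with hypothetical feature names.
pairs = [
    {"target": 0.2, "structure": 0.4},
    {"target": 0.6, "structure": 0.8},
    {"target": None, "structure": 0.1},  # missing target similarity
    {"target": 1.0, "structure": 0.6},
]
imputed = impute_features(pairs)
```

Here the missing target similarity is replaced by 0.6, the median of the three complete rows, while observed values are left untouched.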

Network features

We curated a biological network that contains 22,399 protein-coding genes, 6,679 drugs, and 170 TFs. The protein-protein interactions represent established interactions [76–78], which include both physical (protein-protein) and non-physical (phosphorylation, metabolic, signaling, and regulatory) interactions. The drug-protein interactions were curated from several drug target databases [78].
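The minimum network distance between the targets of a drug pair can be computed with a multi-source breadth-first search. The sketch below assumes an unweighted, undirected network stored as an adjacency dictionary; the node names and the choice to return `None` for disconnected targets are illustrative assumptions, not details of the in-house network.

```python
from collections import deque


def min_target_distance(graph, targets_a, targets_b):
    """Smallest number of edges separating any target of drug A from any
    target of drug B; 0 if they share a target, None if unreachable."""
    goal = set(targets_b)
    seen = set(targets_a)
    # Start the search from every target of drug A at distance 0.
    queue = deque((t, 0) for t in seen)
    while queue:
        node, dist = queue.popleft()
        if node in goal:
            return dist
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Toy chain P1 - P2 - P3 - P4 with placeholder protein names.
toy_graph = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3"},
}
far = min_target_distance(toy_graph, {"P1"}, {"P4"})
near = min_target_distance(toy_graph, {"P1", "P3"}, {"P4"})
```

Because breadth-first search visits nodes in order of increasing distance, the first target of drug B that is reached gives the minimum distance.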

Statistical analysis

For each similarity feature, a Kolmogorov-Smirnov (KS) test was performed between the set of drug pairs that shared an indication and those that did not share an indication. The KS test was chosen to identify non-linear predictive power. In addition, the Pearson correlation between all numeric features was calculated. These tests were performed using custom scripts in R statistical software [79].
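For readers unfamiliar with the test, the two-sample KS statistic is the largest gap between the empirical cumulative distribution functions of the two groups. The Python sketch below computes only that statistic (no p-value) and is not the authors' R code; the sample values are invented.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The supremum is attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Invented similarity values for pairs that do and do not share an indication.
shared_pairs = [1, 2, 3, 4]
non_shared_pairs = [3, 4, 5, 6]
d_stat = ks_statistic(shared_pairs, non_shared_pairs)
```

A statistic near 0 indicates nearly identical distributions, while a value of 1 indicates complete separation of the two groups.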

Model building

We trained a two-class classifier predictive model using the features described above. Our classes were determined as a binary of “shared” or “non-shared” indication. Drugs were only included if they shared an indication with at least one other drug. After careful model selection, a gradient boosting model evaluated with 5-fold cross-validation was chosen and implemented using the XGBoost package[80] within the R statistical software. Additional models that were tested and compared using the AUC and AUPRC of 5-fold cross-validation were: Support Vector Machine with a radial kernel, logistic regression with elastic net, and logistic regression with lasso, all using custom R scripts. A custom-made R script was used to implement a grid search to optimize the hyperparameters of our model. Our model objective was a logistic regression for binary classification and we output a score pre-logistic transformation. The class sizes of “shared” vs. “non-shared” were imbalanced; therefore, we applied downsampling to each fold of training via the R package Caret[81]. Feature importance was found using the built-in method within the XGBoost package[80].
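The interaction between cross-validation and downsampling can be sketched as follows. This is a simplified Python illustration rather than the R/Caret procedure used in the paper: the fold assignment, the random seeds, and the `shared` label key are assumptions, and the gradient boosting step itself is left out. The key point it shows is that only the training portion of each fold is balanced, while the held-out portion keeps its original class ratio.

```python
import random


def downsample(rows, label_key="shared", seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key]]
    neg = [r for r in rows if not r[label_key]]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return minority + rng.sample(majority, len(minority))


def cross_validation_folds(rows, k=5, seed=0):
    """Yield (balanced_training_rows, untouched_test_rows) for each of k folds."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = [rows[j] for j in folds[i]]
        train = [rows[j] for f, fold in enumerate(folds) if f != i for j in fold]
        yield downsample(train, seed=seed + i), test


# Toy drug pairs: 10 "shared" and 90 "non-shared".
toy_pairs = [{"id": i, "shared": i < 10} for i in range(100)]
splits = list(cross_validation_folds(toy_pairs, k=5, seed=1))
```

A classifier would then be fitted on the first element of each split and scored on the second.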

Classification evaluation

To evaluate model performance in predicting whether two drugs share an indication, receiver operating characteristic (ROC) and precision-recall (PRC) curves were created in R using the pROC[82] and precrec[83] packages, respectively. The raw-logistic values were normalized on a scale from 0–1 to enable easier interpretation and ROC/PRC calculation. Area-under-the-ROC curve (AUC) and area-under-the-PRC (AUPRC) scores were used to evaluate model performance.
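The sketch below illustrates the two operations in Python. Min-max scaling is assumed as the normalization, since the text does not name the exact method, and the AUC is computed from its rank interpretation (the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair) rather than with pROC. The labels and scores are invented.

```python
def min_max_normalize(scores):
    """Rescale raw scores linearly onto the interval [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly,
    counting ties as half."""
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented labels (1 = shared indication) and raw pre-logistic scores.
labels = [1, 1, 0, 0]
raw_scores = [2.0, -1.0, 0.5, -3.0]
normalized = min_max_normalize(raw_scores)
auc_raw = roc_auc(labels, raw_scores)
auc_normalized = roc_auc(labels, normalized)
```

Because min-max scaling preserves the ordering of the scores, the AUC is identical before and after normalization; the rescaling only affects interpretability.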

Drug similarity network

Network construction

We constructed a drug similarity network based upon our classifier results with drugs as nodes and our raw model output as the edge weight. This network was visualized using the visNetwork package[84] and used in analyses using the iGraph package[85] within R[79].

ATC repurposing analysis

The Anatomical Therapeutic Chemical (ATC) codes for all drugs were obtained from DrugBank[69], and the highest level code was assigned. A circular repurposing network was created with ATC codes as the nodes using the iGraph[85] and gGraph[86] packages within R[79]. The graph edge weights were based on the mean classifier output between all drugs of each ATC code category. Because some drugs have multiple ATC codes, when comparing two ATC codes every drug had to be associated with only one of the two codes. To reduce noise within the repurposing network, an initial cut-off retaining drug pairs with a classifier output of 7.4 and above was implemented, leaving 792 drug pairs to examine. Manual literature searches were used to validate repurposing opportunities.
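The edge-weight calculation can be sketched in Python as below. This is a simplified illustration under stated assumptions: every drug is given a single top-level ATC code (so the multi-code condition is not modeled), pairs within the same category are skipped, and the drug names, category letters and scores are invented.

```python
from collections import defaultdict
from statistics import mean


def atc_edge_weights(pair_scores, drug_to_atc, cutoff):
    """Mean classifier output between top-level ATC categories,
    using only drug pairs scoring at or above the cut-off."""
    buckets = defaultdict(list)
    for (drug_1, drug_2), score in pair_scores.items():
        if score < cutoff:
            continue  # noise filter applied before averaging
        code_1, code_2 = drug_to_atc[drug_1], drug_to_atc[drug_2]
        if code_1 == code_2:
            continue  # assumption: only cross-category edges are of interest
        buckets[tuple(sorted((code_1, code_2)))].append(score)
    return {edge: mean(scores) for edge, scores in buckets.items()}


# Invented drug pairs, scores and single-letter ATC categories.
toy_scores = {
    ("d1", "d2"): 8.0,
    ("d1", "d3"): 9.0,
    ("d2", "d3"): 5.0,   # below the cut-off
    ("d1", "d4"): 7.5,   # same category, skipped
}
toy_atc = {"d1": "A", "d2": "R", "d3": "R", "d4": "A"}
edges = atc_edge_weights(toy_scores, toy_atc, cutoff=7.4)
```

The resulting dictionary maps each pair of ATC categories to one edge weight, which could then be handed to a graph library for layout and drawing.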

Drug class repurposing analysis

Drug classes for all drugs were found in DrugBank[69] and were filtered to include only classes whose names contained the word “inhibitor”, “antagonist”, or “agonist” and that had at least 20 drugs, to ensure enough statistical power. Additionally, we identified four main disease areas of interest: “mental disorders”, “neurological diseases”, “diabetes”, and “cancer”. The UMLS[13] semantic codes “modb” and “neop” were used to identify indications falling within mental disorders and cancer, respectively. Cancer was further refined into different cancer types based on a keyword search in a custom Python script. All UMLS concept IDs containing the word “diabetes” were included within the diabetes category. For “neurological diseases”, we refined our list to only include Parkinson’s Disease, Alzheimer’s, Epilepsy, and Dementia, to balance specificity in disease type against having enough drugs for a statistically sufficient sample size.

Wilcoxon-Mann-Whitney tests were performed for all drug class-disease associations. Each test assessed whether the mean of the CATNIP scores of drug pairs with one drug being a member of the class of interest and the other being approved for the disease of interest was significantly different from the mean of the CATNIP scores of all drug pairs that included one drug within the class of interest and the other drug not being approved for the disease of interest. A positive location shift meant that drug class-disease pairs had significantly higher CATNIP scores than drug class-non-disease pairs. An FDR multiple hypothesis correction was applied.
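Two ingredients of this analysis can be sketched in Python. Both choices are assumptions, since the text does not specify them: the location shift is estimated as the median of all pairwise score differences (the Hodges-Lehmann estimate, which is what R's `wilcox.test` reports as the difference in location when `conf.int = TRUE`), and the FDR correction is implemented as the Benjamini-Hochberg procedure. The scores and p-values are invented, and the test's p-value computation is not reproduced.

```python
from statistics import median


def location_shift(disease_scores, non_disease_scores):
    """Hodges-Lehmann estimate: median of all pairwise differences."""
    return median(x - y for x in disease_scores for y in non_disease_scores)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented CATNIP scores for one drug class-disease association.
shift = location_shift([5, 6, 7], [1, 2, 3])

# Invented raw p-values for four associations.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

An association would be carried forward only if its shift is positive and its adjusted p-value falls below the chosen significance threshold.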

Sensitivity was measured using any drug class-disease associations that had a positive and significant Wilcoxon location shift. True positives were determined using a cut-off on the percentage of drugs within a drug class that were approved to treat the specific disease in the drug class-disease association.

CATNIP feature effect analysis

The effect of each feature on the CATNIP score for specific drug pairs was found by iteratively changing the feature value to the median value of that feature for all drug pairs. Since the clear majority of all drug pairs do not share an indication, this is the best approximation of that feature having no contribution to the CATNIP score. The difference between the new CATNIP score and the original CATNIP score was then measured.
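This perturbation procedure can be sketched in Python as follows. The scoring function here is a hypothetical linear stand-in for the trained model, and the feature names, weights and values are invented; only the replace-with-median logic mirrors the description above. Effects are reported as original score minus perturbed score, so a positive value means the feature was raising the score.

```python
from statistics import median


def feature_effects(score_fn, pair_features, all_pairs):
    """Drop in score when each feature, one at a time, is replaced by its
    median over all drug pairs."""
    baseline = score_fn(pair_features)
    effects = {}
    for name in pair_features:
        neutral = median(row[name] for row in all_pairs)
        perturbed = {**pair_features, name: neutral}
        effects[name] = baseline - score_fn(perturbed)
    return effects


def toy_score(features):
    """Hypothetical stand-in for the trained model's raw output."""
    return 4 * features["target"] + 2 * features["kegg"]


# Invented feature table; most pairs have low similarity.
all_pairs = [
    {"target": 0.0, "kegg": 0.25},
    {"target": 0.0, "kegg": 0.25},
    {"target": 0.5, "kegg": 0.75},
]
pair_of_interest = all_pairs[2]
effects = feature_effects(toy_score, pair_of_interest, all_pairs)
```

In this toy case the target feature accounts for a drop of 2.0 and the KEGG feature for a drop of 1.0, so target similarity would be read as the stronger driver of the prediction.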

Code availability

Model results and visualization tool available at www.github.com/coryandar/CATNIP. Other select pieces of code available upon request.

Supporting information

S1 Fig. MetaMap performs well in drug indication mapping.

A) The number of occurrences of different UMLS semantic types. B) The accuracy of mapping indications using MetaMap for indications categorized as “Structured” and those in the “Description” section.

(TIF)

S2 Fig. Target ontology similarity data types vary for drug pairs that share an indication and those that do not.

The violin plots of similarity distributions for the similarities of targets’ A) biological processes, B) cellular component, C) molecular function, D) chemical perturbation, E) oncological, F) immunogenic, G) micro-RNA, and H) transcription factor. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S3 Fig. Target similarity data types vary for drug pairs that share an indication and those that do not.

The violin plots of similarity distributions for the similarities of A) targets, B) the Protein-Protein Interaction network distance between targets and C) the correlation of target essentiality within cancer cell lines. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S4 Fig. Target pathway similarity data types vary for drug pairs that share an indication and those that do not.

The violin plots of similarity distributions for the similarities of the A) reactome pathways, B) all pathway types and C) KEGG pathways a drug’s target is known to be involved within. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S5 Fig. Structure similarity varies for drug pairs that share an indication and those that do not.

A) The violin plot of the Dice chemical fingerprint similarity, statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S6 Fig. CATNIP performs significantly better than random.

A) The Precision–Recall curve for classifying if two drugs share an indication using CATNIP and the random expectation.

(TIF)

S7 Fig. CATNIP scores are statistically higher between drugs of certain drug classes and drugs that treat associated diseases.

The distributions of CATNIP score between A) kinase inhibitors and drugs known to treat cancer and those that do not and B) dopamine antagonists and drugs known to treat mental illness and those that do not.

(TIF)

S8 Fig. Target features drive the prediction of trimipramine as a Parkinson’s Disease treatment.

A) The decrease in the CATNIP score when removing each feature for trimipramine and select Parkinson’s Disease drugs.

(TIF)

S9 Fig. Many pathways or gene ontology groups overlap, fueling CATNIP predictions.

The overlap between amitriptyline and select Parkinson’s Disease drugs for A) reactome pathways, B) KEGG pathways, and C) molecular function gene ontologies. The overlap between vandetanib and gliclazide for D) reactome pathways, E) KEGG pathways, and F) molecular function gene ontologies.

(TIF)

S10 Fig. Implementing stricter cut-off scores when predicting drug class-disease associations improves CATNIP’s sensitivity.

(TIF)

S11 Fig. Feature importance of individual features used in the CATNIP model.

(TIF)

S12 Fig. AUC curves of individual features used in the CATNIP model.

(TIF)

S1 Table. The drug similarity features used within CATNIP.

(XLSX)

S2 Table. Comparison of model performance using other model types.

(XLSX)

S3 Table. List of DrugBank drugs and indications, in which some indications may be missed if only examining structured indications.

(XLSX)

S4 Table. Comparison of model performance against PREDICT.

(XLSX)

S1 Methods. Comparison with PREDICT.

(DOCX)

S1 File. All pathways and gene ontologies that amitriptyline’s targets and the targets of select Parkinson’s Disease drugs are associated with.

(XLSX)

S2 File. All pathways and gene ontologies that trimipramine’s targets and the targets of select Parkinson’s Disease drugs are associated with.

(XLSX)

S3 File. All pathways and gene ontologies that vandetanib’s targets and gliclazide’s targets are associated with.

(XLSX)

S4 File. Location shifts calculated using Wilcoxon-Mann-Whitney tests for all CATNIP scores of drug class-disease drug pairs vs. drug class-non-disease drug pairs.

(XLSX)

Data Availability

Data is available at the following URL: www.github.com/coryandar/CATNIP.

Funding Statement

JE is supported by NLM of the National Institutes of Health under award number F31LM013058. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. OE and his laboratory are supported by NIH grants 1R01CA194547, 1U24CA210989, P50CA211024, UL1TR002384. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Adams CP, Brantner VV. Estimating The Cost Of New Drug Development: Is It Really $802 Million? Health Affairs. 2006;25(2):420–8. doi:10.1377/hlthaff.25.2.420
  • 2. Ishida J, Konishi M, Ebner N, Springer J. Repurposing of approved cardiovascular drugs. Journal of Translational Medicine. 2016;14(1):269. doi:10.1186/s12967-016-1031-5
  • 3. Goldstein SR, Duvernoy CS, Calaf J, Adachi JD, Mershon JL, Dowsett SA, et al. Raloxifene use in clinical practice: efficacy and safety. Menopause. 2009;16(2):413–21. doi:10.1097/gme.0b013e3181883dae
  • 4. Chiang AP, Butte AJ. Systematic evaluation of drug-disease relationships to identify leads for novel drug uses. Clinical pharmacology and therapeutics. 2009;86(5):507–10. Epub 2009/07/01. doi:10.1038/clpt.2009.103
  • 5. Gayvert KM, Madhukar NS, Elemento O. A Data-Driven Approach to Predicting Successes and Failures of Clinical Trials. Cell chemical biology. 2016;23(10):1294–301. Epub 2016/09/15. doi:10.1016/j.chembiol.2016.07.023
  • 6. Madhukar NS, Khade PK, Huang L, Gayvert K, Galletti G, Stogniew M, et al. A New Big-Data Paradigm for Target Identification and Drug Discovery. bioRxiv. 2017:134973. doi:10.1101/134973
  • 7. Madhukar NS, Gayvert K, Gilvary C, Elemento O. A Machine Learning Approach Predicts Tissue-Specific Drug Adverse Events. bioRxiv. 2018:288332. doi:10.1101/288332
  • 8. McCullough M. Cancer therapy shows promise for some brain tumors 2018.
  • 9. Dudley JT, Deshpande T, Butte AJ. Exploiting drug-disease relationships for computational drug repositioning. Briefings in bioinformatics. 2011;12(4):303–11. Epub 2011/06/20. doi:10.1093/bib/bbr013
  • 10. Gottlieb A, Stein G, Ruppin E, Sharan R. PREDICT: A method for inferring novel drug indications with application to personalized medicine 2011. 496 p.
  • 11. Luo H, Wang J, Luo J, Li M, Peng X, Wu F-X, et al. Drug repositioning based on comprehensive similarity measures and Bi-Random walk algorithm. Bioinformatics. 2016;32(17):2664–71. doi:10.1093/bioinformatics/btw228
  • 12. Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic acids research. 2008;36(Database issue):D901–D6. Epub 2007/11/29. doi:10.1093/nar/gkm958
  • 13. Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proceedings AMIA Symposium. 2001:17–21.
  • 14. Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ, et al. Predicting new molecular targets for known drugs. Nature. 2009;462(7270):175–81. Epub 2009/11/01. doi:10.1038/nature08506
  • 15. Craig Jordan V. The role of tamoxifen in the treatment and prevention of breast cancer. Current Problems in Cancer. 1992;16(3):134–76. doi:10.1016/0147-0272(92)90002-6
  • 16. Milani M, Jha G, Potter DA. Anastrozole Use in Early Stage Breast Cancer of Post-Menopausal Women. Clinical medicine Therapeutics. 2009;1:141–56. doi:10.4137/cmt.s9
  • 17. Chen L, Zeng W-M, Cai Y-D, Feng K-Y, Chou K-C. Predicting Anatomical Therapeutic Chemical (ATC) classification of drugs by integrating chemical-chemical interactions and similarities. PloS one. 2012;7(4):e35254–e. doi:10.1371/journal.pone.0035254
  • 18. Triolo M, Annema W, de Boer JF, Tietge UJ, Dullaart RP. Simvastatin and bezafibrate increase cholesterol efflux in men with type 2 diabetes. European journal of clinical investigation. 2014;44(3):240–8. doi:10.1111/eci.12226
  • 19. Teramoto T, Shirai K, Daida H, Yamada N. Effects of bezafibrate on lipid and glucose metabolism in dyslipidemic patients with diabetes: the J-BENEFIT study. Cardiovascular diabetology. 2012;11(1):29.
  • 20. Tenenbaum A, Motro M, Fisman EZ, Adler Y, Shemesh J, Tanne D, et al. Effect of bezafibrate on incidence of type 2 diabetes mellitus in obese patients. European heart journal. 2005;26(19):2032–8. doi:10.1093/eurheartj/ehi310
  • 21. Pujols L, Mullol J, Picado C. Alpha and beta glucocorticoid receptors: relevance in airway diseases. Current allergy and asthma reports. 2007;7(2):93–9. doi:10.1007/s11882-007-0005-3
  • 22. Cavallari JM, Jawaro TS, Awad NI, Bridgeman PJ. Glucagon for refractory asthma exacerbation. The American Journal of Emergency Medicine. 2017;35(1):144–5. doi:10.1016/j.ajem.2016.09.063
  • 23. Insuela DBR, Daleprane JB, Coelho LP, Silva AR, e Silva PMR, Martins MA, et al. Glucagon induces airway smooth muscle relaxation by nitric oxide and prostaglandin E2. Journal of Endocrinology. 2015;225(3):205–17. doi:10.1530/JOE-14-0648
  • 24. Carter NJ. Bilastine. Drugs. 2012;72(9):1257–69. doi:10.2165/11209310-000000000-00000
  • 25. Krause K, Spohr A, Zuberbier T, Church MK, Maurer M. Up-dosing with bilastine results in improved effectiveness in cold contact urticaria. Allergy. 2013;68(7):921–8. Epub 2013/06/06. doi:10.1111/all.12171
  • 26. Greaves MW. Antihistamines in Dermatology. Skin Pharmacology and Physiology. 2005;18(5):220–9. doi:10.1159/000086667
  • 27. Kuna P, Jurkiewicz D, Czarnecka-Operacz MM, Pawliczak R, Woroń J, Moniuszko M, et al. The role and choice criteria of antihistamines in allergy management—expert opinion. Postepy dermatologii i alergologii. 2016;33(6):397–410. Epub 2016/12/02. doi:10.5114/pdia.2016.63942
  • 28. La Rosa M, et al. A randomized, double-blind, placebo-controlled, crossover trial of systemic flunisolide in the treatment of children with severe atopic dermatitis. Current Therapeutic Research. 1995;56(7):720–6. doi:10.1016/0011-393X(95)85143-7
  • 29. Ekström T, Lindgren BR, Tibbling L. Effects of ranitidine treatment on patients with asthma and a history of gastro-oesophageal reflux: a double blind crossover study. Thorax. 1989;44(1):19–23. doi:10.1136/thx.44.1.19
  • 30. Dixon AE, Subramanian M, DeSarno M, Black K, Lane L, Holguin F. A pilot randomized controlled trial of pioglitazone for the treatment of poorly controlled asthma in obesity. Respiratory Research. 2015;16(1):143. doi:10.1186/s12931-015-0303-6
  • 31. Moore M, Stuart B, Coenen S, Butler CC, Goossens H, Verheij TJM, et al. Amoxicillin for acute lower respiratory tract infection in primary care: subgroup analysis of potential high-risk groups. The British journal of general practice: the journal of the Royal College of General Practitioners. 2014;64(619):e75–e80. doi:10.3399/bjgp14X677121
  • 32. Reznikov LR, Meyerholz DK, Abou Alaiwa M, Kuan S-P, Liao Y-SJ, Bormann NL, et al. The vagal ganglia transcriptome identifies candidate therapeutics for airway hyperreactivity. American Journal of Physiology-Lung Cellular and Molecular Physiology. 2018;315(2):L133–L48. doi:10.1152/ajplung.00557.2017
  • 33. Beigelman A, Chipps BE, Bacharier LB. Update on the utility of corticosteroids in acute pediatric respiratory disorders. Allergy and asthma proceedings. 2015;36(5):332–8. doi:10.2500/aap.2015.36.3865
  • 34. Hua F, Wang X, Zhu L. Terlipressin Decreases Vascular Endothelial Growth Factor Expression and Improves Oxygenation in Patients with Acute Respiratory Distress Syndrome and Shock. The Journal of Emergency Medicine. 2013;44(2):434–9. doi:10.1016/j.jemermed.2012.02.073
  • 35. Crestani B, Chapron J, Wallaert B, Bergot E, Delaval P, Israel-Biet D, et al. Octreotide treatment of idiopathic pulmonary fibrosis: a proof-of-concept study. European Respiratory Journal. 2012;39(3):772. doi:10.1183/09031936.00113011
  • 36. Abid S, Xie S, Bose M, Shaul PW, Terada LS, Brody SL, et al. 17β-estradiol dysregulates innate immune responses to Pseudomonas aeruginosa respiratory infection and is modulated by estrogen receptor antagonism. Infection and immunity. 2017;85(10):e00422–17. doi:10.1128/IAI.00422-17
  • 37. Kharkevich DA, Chizh BA, Kasparov SA. Stimulant effect of thyrotropin-releasing hormone and its analog, RGH 2202, on the diaphragm respiratory activity, and their antagonism with morphine: possible involvement of the N-methyl-D-aspartate receptors. Brain research. 1991;551(1–2):110–5. doi:10.1016/0006-8993(91)90920-q
  • 38. El-Haggar SM, Farrag WF, Kotkata FA. Effect of ketotifen in obese patients with type 2 diabetes mellitus. Journal of Diabetes and its Complications. 2015;29(3):427–32. doi:10.1016/j.jdiacomp.2015.01.013
  • 39. Manjunath S, Kugali SN, Deodurg PM. Effect of clonidine on blood glucose levels in euglycemic and alloxan-induced diabetic rats and its interaction with glibenclamide. Indian journal of pharmacology. 2009;41(5):218–20. doi:10.4103/0253-7613.58510
  • 40. Paul S, Wand M, Emerick GT, Richter JM. The role of latanoprost in an inflammatory bowel disease flare. Gastroenterology report. 2014;2(3):232–4. Epub 2014/07/26. doi:10.1093/gastro/gou044
  • 41. Kern TS, Miller CM, Du Y, Zheng L, Mohr S, Ball SL, et al. Topical Administration of Nepafenac Inhibits Diabetes-Induced Retinal Microvascular Disease and Underlying Abnormalities of Retinal Metabolism and Physiology. Diabetes. 2007;56(2):373. doi:10.2337/db05-1621
  • 42. Pereira Arias AM, Romijn JA, Corssmit EPM, Ackermans MT, Nijpels G, Endert E, et al. Indomethacin decreases insulin secretion in patients with type 2 diabetes mellitus. Metabolism. 2000;49(7):839–44. doi:10.1053/meta.2000.6748
  • 43. Langmead CJ, Watson J, Reavill C. Muscarinic acetylcholine receptors as CNS drug targets. Pharmacology & therapeutics. 2008;117(2):232–43.
  • 44. Laruelle M, Frankle WG, Narendran R, Kegeles LS, Abi-Dargham A. Mechanism of action of antipsychotic drugs: from dopamine D2 receptor antagonism to glutamate NMDA facilitation. Clinical therapeutics. 2005;27:S16–S24. doi:10.1016/j.clinthera.2005.07.017
  • 45. Zhang J, Yang PL, Gray NS. Targeting cancer with small molecule kinase inhibitors. Nature reviews cancer. 2009;9(1):28. doi:10.1038/nrc2559
  • 46. Perry E, Smith C, Perry R. Cholinergic nicotinic and muscarinic receptors in dementia of Alzheimer, Parkinson and Lewy body types. Journal of Neural Transmission-Parkinson’s Disease and Dementia Section. 1990;2(3):149–58. doi:10.1007/BF02257646
  • 47. Xu Y, Yan J, Zhou P, Li J, Gao H, Xia Y, et al. Neurotransmitter receptors and cognitive dysfunction in Alzheimer’s disease and Parkinson’s disease. Progress in neurobiology. 2012;97(1):1–13. doi:10.1016/j.pneurobio.2012.02.002
  • 48. Roesler R, Schwartsmann G. Gastrin-releasing peptide receptors in the central nervous system: role in brain function and as a drug target. Frontiers in endocrinology. 2012;3:159. doi:10.3389/fendo.2012.00159
  • 49. Seppi K, Weintraub D, Coelho M, Perez-Lloret S, Fox SH, Katzenschlager R, et al. The Movement Disorder Society evidence-based medicine review update: treatments for the non-motor symptoms of Parkinson’s disease. Movement Disorders. 2011;26(S3):S42–S80.
  • 50. Miyasaki J, Shannon K, Voon V, Ravina B, Kleiner-Fisman G, Anderson K, et al. Practice Parameter: Evaluation and treatment of depression, psychosis, and dementia in Parkinson disease (an evidence-based review):[RETIRED]: Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology. 2006;66(7):996–1002. doi:10.1212/01.wnl.0000215428.46057.3d
  • 51.Frost J, Okun S, Vaughan T, Heywood J, Wicks P. Patient-reported outcomes as a source of evidence in off-label prescribing: analysis of data from PatientsLikeMe. Journal of medical Internet research. 2011;13(1):e6 10.2196/jmir.1643 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Antonini A, Tesei S, Zecchinelli A, Barone P, De Gaspari D, Canesi M, et al. Randomized study of sertraline and low-dose amitriptyline in patients with Parkinson’s disease and depression: effect on quality of life. Movement disorders: official journal of the Movement Disorder Society. 2006;21(8):1119–22. [DOI] [PubMed] [Google Scholar]
  • 53.Paumier KL, Sortwell CE, Madhavan L, Terpstra B, Daley BF, Collier TJ. Tricyclic antidepressant treatment evokes regional changes in neurotrophic factors over time within the intact and degenerating nigrostriatal system. Experimental neurology. 2015;266:11–21. 10.1016/j.expneurol.2015.02.005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Paumier KL, Sortwell CE, Madhavan L, Terpstra B, Celano SL, Green JJ, et al. Chronic amitriptyline treatment attenuates nigrostriatal degeneration and significantly alters trophic support in a rat model of parkinsonism. Neuropsychopharmacology. 2015;40(4):874 10.1038/npp.2014.262 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Kandil EA, Abdelkader NF, El-Sayeh BM, Saleh S. Imipramine and amitriptyline ameliorate the rotenone model of Parkinson’s disease in rats. Neuroscience. 2016;332:26–37. 10.1016/j.neuroscience.2016.06.040 [DOI] [PubMed] [Google Scholar]
  • 56.Lauterbach EC. Repurposing psychiatric medicines to target activated microglia in anxious mild cognitive impairment and early Parkinson’s disease. American journal of neurodegenerative disease. 2016;5(1):29 [PMC free article] [PubMed] [Google Scholar]
  • 57.Kamińska K, Lenda T, Konieczny J, Wardas J, Lorenc-Koci E. Interactions of the tricyclic antidepressant drug amitriptyline with L-DOPA in the striatum and substantia nigra of unilaterally 6-OHDA-lesioned rats. Relevance to motor dysfunction in Parkinson’s disease. Neurochemistry international. 2018;121:125–39. 10.1016/j.neuint.2018.10.004 [DOI] [PubMed] [Google Scholar]
  • 58.Millan MJ. From the cell to the clinic: a comparative review of the partial D2/D3 receptor agonist and α2-adrenoceptor antagonist, piribedil, in the treatment of Parkinson’s disease. Pharmacology & therapeutics. 2010;128(2):229–73. [DOI] [PubMed] [Google Scholar]
  • 59.Louvet C, Szot GL, Lang J, Lee MR, Martinier N, Bollag G, et al. Tyrosine kinase inhibitors reverse type 1 diabetes in nonobese diabetic mice. Proceedings of the National Academy of Sciences. 2008;105(48):18895–900. [DOI] [PMC free article] [PubMed]
  • 60.Kikuchi Y, Yamada M, Imakiire T, Kushiyama T, Higashi K, Hyodo N, et al. A Rho-kinase inhibitor, fasudil, prevents development of diabetes and nephropathy in insulin-resistant diabetic rats. Journal of Endocrinology. 2007;192(3):595–603. 10.1677/JOE-06-0045 [DOI] [PubMed] [Google Scholar]
  • 61.Aiello LP, Avery RL, Arrigg PG, Keyt BA, Jampel HD, Shah ST, et al. Vascular endothelial growth factor in ocular fluid of patients with diabetic retinopathy and other retinal disorders. New England Journal of Medicine. 1994;331(22):1480–7. 10.1056/NEJM199412013312203 [DOI] [PubMed] [Google Scholar]
  • 62.Duh E, Aiello LP. Vascular endothelial growth factor and diabetes: the agonist versus antagonist paradox. Diabetes. 1999;48(10):1899–906. 10.2337/diabetes.48.10.1899 [DOI] [PubMed] [Google Scholar]
  • 63.Hagberg CE, Mehlem A, Falkevall A, Muhl L, Fam BC, Ortsäter H, et al. Targeting VEGF-B as a novel treatment for insulin resistance and type 2 diabetes. Nature. 2012;490(7420):426 10.1038/nature11464 [DOI] [PubMed] [Google Scholar]
  • 64.Bianco R, Rosa R, Damiano V, Daniele G, Gelardi T, Garofalo S, et al. Vascular endothelial growth factor receptor-1 contributes to resistance to anti–epidermal growth factor receptor drugs in human cancer cells. Clinical Cancer Research. 2008;14(16):5069–80. 10.1158/1078-0432.CCR-07-4905 [DOI] [PubMed] [Google Scholar]
  • 65.Robciuc MR, Kivelä R, Williams IM, de Boer JF, van Dijk TH, Elamaa H, et al. VEGFB/VEGFR1-induced expansion of adipose vasculature counteracts obesity and related metabolic complications. Cell metabolism. 2016;23(4):712–24. 10.1016/j.cmet.2016.03.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Navarro-Gonzalez JF, Mora-Fernandez C. The role of inflammatory cytokines in diabetic nephropathy. Journal of the American Society of Nephrology. 2008;19(3):433–42. 10.1681/ASN.2007091048 [DOI] [PubMed] [Google Scholar]
  • 67.Qu W-s, Tian D-s, Guo Z-b, Fang J, Zhang Q, Yu Z-y, et al. Inhibition of EGFR/MAPK signaling reduces microglial inflammatory response and the associated secondary damage in rats after spinal cord injury. Journal of neuroinflammation. 2012;9(1):178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Richardson L. Beautiful soup documentation. April 2007. [Google Scholar]
  • 69.Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Research. 2017;46(D1):D1074 10.1093/nar/gkx1037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Pratt W, Yetisgen-Yildiz M. A study of biomedical concept identification: MetaMap vs. people. AMIA Annual Symposium proceedings AMIA Symposium. 2003;2003:529–33. [PMC free article] [PubMed]
  • 71.Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences. 2005;102(43):15545. [DOI] [PMC free article] [PubMed]
  • 72.Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27(12):1739–40. 10.1093/bioinformatics/btr260 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Cao Y, Charisi A, Cheng L-C, Jiang T, Girke T. ChemmineR: a compound mining framework for R. Bioinformatics. 2008;24(15):1733–4. 10.1093/bioinformatics/btn307 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, et al. Defining a cancer dependency map. Cell. 2017;170(3):564–76. e16. 10.1016/j.cell.2017.06.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, et al. PubChem substance and compound databases. Nucleic acids research. 2015;44(D1):D1202–D13. 10.1093/nar/gkv951 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Das J, Yu H. HINT: High-quality protein interactomes and their applications in understanding human disease. BMC systems biology. 2012;6(1):92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Khurana E, Fu Y, Chen J, Gerstein M. Interpretation of genomic variants using a unified biological network approach. PLoS computational biology. 2013;9(3):e1002886 10.1371/journal.pcbi.1002886 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Aksoy BA, Gao J, Dresdner G, Wang W, Root A, Jing X, et al. PiHelper: an open source framework for drug-target and antibody-target data. Bioinformatics. 2013;29(16):2071–2. 10.1093/bioinformatics/btt345 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Retrieved from http://wwwr-projectorg/. Vienna, Austria2017. [Google Scholar]
  • 80.Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; San Francisco, California, USA. 2939785: ACM; 2016. p. 785–94.
  • 81.Kuhn M. Building Predictive Models in R Using the caret Package. 2008. 2008;28(5):26 Epub 2008-09-23. 10.18637/jss.v028.i05 [DOI] [Google Scholar]
  • 82.Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77-. 10.1186/1471-2105-12-77 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Saito T, Rehmsmeier M. Precrec: fast and accurate precision-recall and ROC curve calculations in R. Bioinformatics. 2017;33 (1):145–7. 10.1093/bioinformatics/btw570 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Almende BV, Thieurmel B, Robert T. visNetwork: Network Visualization using ‘vis.js’ Library. The R Journal. 2018;10(1):251–68. [Google Scholar]
  • 85.Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006;Complex Systems:1695-. [Google Scholar]
  • 86.Pedersen TL. ggraph: An Implementation of Grammar of Graphics for Graphs and Networks. 2018;33 (1):145–7. [Google Scholar]

Associated Data

Supplementary Materials

S1 Fig. MetaMap performs well in drug indication mapping.

A) The number of occurrences of different UMLS semantic types. B) The accuracy of mapping indications using MetaMap for indications categorized as “Structured” and those in the “Description” section.

(TIF)

S2 Fig. Target ontology similarity data types vary for drug pairs that share an indication and those that do not.

The violin plots of similarity distributions for the similarities of targets’ A) biological processes, B) cellular component, C) molecular function, D) chemical perturbation, E) oncological, F) immunogenic, G) micro-RNA, and H) transcription factor. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S3 Fig. Target similarity data types vary for drug pairs that share an indication and those that do not.

The violin plots of similarity distributions for the similarities of A) targets, B) the Protein-Protein Interaction network distance between targets, and C) the correlation of target essentiality within cancer cell lines. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S4 Fig. Target pathway similarity data types vary for drug pairs that share an indication and those that do not.

The violin plots of similarity distributions for the similarities of the A) Reactome pathways, B) all pathway types, and C) KEGG pathways in which a drug’s targets are known to be involved. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S5 Fig. Structure similarity varies for drug pairs that share an indication and those that do not.

A) The violin plot of the Dice chemical fingerprint similarity. Statistical significance found by Kolmogorov-Smirnov test.

(TIF)

S6 Fig. CATNIP performs significantly better than random.

A) The Precision–Recall curve for classifying whether two drugs share an indication using CATNIP, compared with the random expectation.

(TIF)

S7 Fig. CATNIP scores are statistically higher between drugs of certain drug classes and drugs that treat associated diseases.

The distributions of CATNIP scores between A) kinase inhibitors and drugs known to treat cancer versus those that do not, and B) dopamine antagonists and drugs known to treat mental illness versus those that do not.

(TIF)

S8 Fig. Target features drive the prediction of trimipramine as a Parkinson’s Disease treatment.

A) The decrease in the CATNIP score when removing each feature for trimipramine and select Parkinson’s Disease drugs.

(TIF)

S9 Fig. Many pathways or gene ontology groups overlap, fueling CATNIP predictions.

The overlap between amitriptyline and select Parkinson’s Disease drugs for A) Reactome pathways, B) KEGG pathways, and C) molecular function gene ontologies. The overlap between vandetanib and gliclazide for D) Reactome pathways, E) KEGG pathways, and F) molecular function gene ontologies.

(TIF)

S10 Fig. Implementing stricter cut-off scores when predicting drug class-disease associations improves CATNIP’s sensitivity.

(TIF)

S11 Fig. Feature importance of individual features used in the CATNIP model.

(TIF)

S12 Fig. AUC curves of individual features used in the CATNIP model.

(TIF)

S1 Table. The drug similarity features used within CATNIP.

(XLSX)

S2 Table. Comparison of model performance using other model types.

(XLSX)

S3 Table. List of DrugBank drugs and indications, in which some indications may be missed if only examining structured indications.

(XLSX)

S4 Table. Comparison of model performance against PREDICT.

(XLSX)

S1 Methods. Comparison with PREDICT.

(DOCX)

S1 File. All pathways and gene ontologies associated with amitriptyline’s targets and the targets of select Parkinson’s Disease drugs.

(XLSX)

S2 File. All pathways and gene ontologies associated with trimipramine’s targets and the targets of select Parkinson’s Disease drugs.

(XLSX)

S3 File. All pathways and gene ontologies associated with the targets of vandetanib and gliclazide.

(XLSX)

S4 File. Location shifts calculated using the Wilcoxon-Mann-Whitney test for all CATNIP scores of drug class-disease drug pairs vs. drug class-non-disease drug pairs.

(XLSX)

Data Availability Statement

Data is available at the following URL: www.github.com/coryandar/CATNIP.


Articles from PLoS Computational Biology are provided here courtesy of PLOS
