Human Genomics. 2025 Sep 30;19:110. doi: 10.1186/s40246-025-00808-8

Exploiting similarity in drug molecular effects for drug repurposing

Katie Huang 1,2, Panagiotis Nikolaos Lalagkas 2, Beftu Sultan 2, Rachel Melamed 2
PMCID: PMC12487472  PMID: 41029350

Abstract

Background

Using large data sets to propose new uses for drugs has the potential to rapidly prioritize new treatments for major diseases of public health importance. One comprehensive data set, the LINCS L1000 Connectivity Map, profiles gene expression changes associated with thousands of compounds, including many with known clinical uses. However, some recent studies have questioned the reliability of these data, and the best approach for using this resource for drug repositioning is not well established.

Methods

Here, we develop a novel generalizable approach by hypothesizing that new treatments for a disease should induce similar gene expression to existing treatments for a disease. Using the Drug Repurposing Hub compendium of known treatments, we formulate a combined logistic regression model to predict new drug indications, and we assess generalizability of our findings using independent clinical trials on experimental drug uses.

Results

We support the hypothesis that drugs sharing an indication induce more similar gene expression, and we additionally demonstrate that the simpler Spearman correlation (p = 7.71e-38) outperforms the popular Connectivity Score (p = 5.2e-6). Our final model, combining predicted drug indications across three diverse cell lines, generalizes to predict experimental clinical trials with an AUC of 0.708.

Conclusions

By developing a new approach to using LINCS L1000 data for drug repositioning, we both propose plausible new disease treatments and provide an interpretable rationale for predictions. Our findings not only put forward new drug repositioning candidates, browseable at https://bsultan.shinyapps.io/web-app, but they also provide guidelines for future researchers employing L1000 data for drug repurposing.

Supplementary Information

The online version contains supplementary material available at 10.1186/s40246-025-00808-8.

Keywords: Drug repositioning, Gene expression, Drug discovery

Background

Drug repositioning involves finding new uses for established compounds, including approved and preclinical drugs. With the growing availability of data on the molecular effects of drugs, data-driven drug repositioning has the potential to accelerate clinical advances. In particular, the LINCS Connectivity Map measures the gene expression changes induced by thousands of compounds, including approved drugs [1]. This resource profiles the effect of each drug on expression in a number of cancer cell lines derived from human tumors at different body sites. One of the project’s goals is to uncover new uses of drugs by modeling their biological effects in vitro.

To this end, previous work has used the LINCS Connectivity Map to predict drug indications based on the reversal of a disease profile’s gene expression [2–4]. However, it is challenging to determine the correct tissue and sample to represent a disease. Additionally, differences between the tissue context of a disease and the cell lines tested for a drug can complicate comparisons. Another wrinkle is that recent work has questioned the reliability of the LINCS data, finding that drug-induced gene expression is not consistent across different data sets, particularly when the transcriptional response of a cell to a drug is low [5, 6]. In addition, there is no established way to choose an appropriate cell line for testing drug response, or to merge results across cell lines [7, 8]. These issues complicate the use of the LINCS Connectivity Map for drug repurposing.

Here, we develop and rigorously evaluate an approach that does not require a disease gene expression profile, but instead makes use of known connections of a disease to drugs known to treat that disease. By basing our predictions on drugs known to treat an indication, we can gain insight into the likely mechanisms of action of the new drug, creating interpretable predictions of new drug uses. Our method uses LINCS L1000 Connectivity Map to identify new therapies similar to known treatments for a disease, under the hypothesis that drugs that induce similar changes in gene expression may treat similar diseases. First, we test this hypothesis, using the Drug Repurposing Hub [9] catalog of known drug indications. Then, we investigate how the signal is impacted by analytical choices including: (1) the definition of similarity in expression, and (2) considering the extent of a drug’s impact on gene expression in a cell line. This analysis can inform other predictive approaches using the LINCS data, enriching understanding of concerns about this data set raised by previous work. Finally, we use our results to develop a predictive model for new drug indications. We test this model on independent clinical trials data, representing new experimental uses of drugs. All code, tables, and drug predictions are made available for future research, including in an interactive web browser.

Methods

Evaluating similarity in drug-induced gene expression changes

To test our hypothesis that drugs treating the same disease induce similar gene expression changes, we evaluate two metrics: Spearman correlation and Connectivity Score, the latter being a widely used metric in Connectivity Map (CMap) analyses. The processed Level 5 LINCS data provide drug gene signatures, where a drug signature is defined as the drug-induced changes to expression of each gene. As well, each signature is accompanied by a Transcriptional Activity Score (TAS), a single number that summarizes the robustness and strength of a drug’s effect on expression [1]. The TAS combines (1) the number of genes differentially expressed; and (2) the correlation between replicate drug signatures. From this data, we select the drug signatures tested on one of the three most used cell lines, for drugs with recorded indications in the Drug Repurposing Hub. To avoid redundancy from multiple dosages and durations of exposure to the same drug, we retain only the signature with the highest TAS, similar to the suggestion from Lim and Pavlidis [6].
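The de-duplication step described above (one signature per drug and cell line, chosen by highest TAS) can be illustrated with a minimal sketch. The `Signature` record, its field names, and the toy values are assumptions made for illustration; they are not the LINCS metadata schema.

```python
from typing import NamedTuple

class Signature(NamedTuple):
    sig_id: str      # hypothetical signature identifier
    drug: str
    cell_line: str
    tas: float       # Transcriptional Activity Score

def best_signature_per_drug(signatures):
    """Keep, for each (drug, cell line), the signature with the highest TAS."""
    best = {}
    for sig in signatures:
        key = (sig.drug, sig.cell_line)
        if key not in best or sig.tas > best[key].tas:
            best[key] = sig
    return best

# Toy records: drugA has two MCF7 signatures (different doses or durations).
signatures = [
    Signature("s1", "drugA", "MCF7", 0.15),
    Signature("s2", "drugA", "MCF7", 0.42),
    Signature("s3", "drugA", "PC3", 0.30),
    Signature("s4", "drugB", "MCF7", 0.25),
]
selected = best_signature_per_drug(signatures)
```

Only one signature per drug and cell line survives, so later pairwise comparisons are not inflated by repeated measurements of the same compound.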

We compute both similarity metrics between all pairs of drug signatures from the same cell line. Spearman correlation is calculated using drug-induced gene expression changes across the 978 landmark genes in the LINCS data, while Connectivity Scores are obtained directly from the Clue.io platform. To compare the two similarity metrics, we focus on drug signatures with recorded values for both metrics. For each metric, we use the rank sum test to compare pairs of drugs that treat and do not treat the same disease, based on data from the Drug Repurposing Hub.
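As a minimal sketch of the Spearman comparison, the code below ranks two signatures over 978 genes and takes the Pearson correlation of the ranks. The signatures are synthetic random vectors, and the helper names are illustrative; in practice a library routine such as `scipy.stats.spearmanr` would serve the same purpose.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks within each group of tied values.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def spearman(sig_a, sig_b):
    """Spearman correlation: Pearson correlation of the rank-transformed signatures."""
    return float(np.corrcoef(average_ranks(sig_a), average_ranks(sig_b))[0, 1])

# Synthetic signatures over 978 genes (stand-ins for the landmark genes).
rng = np.random.default_rng(0)
n_genes = 978
drug_a = rng.normal(size=n_genes)
drug_b = drug_a + rng.normal(scale=0.5, size=n_genes)  # similar to drug_a
drug_c = rng.normal(size=n_genes)                      # unrelated to drug_a

rho_ab = spearman(drug_a, drug_b)
rho_ac = spearman(drug_a, drug_c)
```

Because the statistic depends only on ranks, any monotone rescaling of a signature leaves it unchanged, which is one reason it is robust to differences in the magnitude of response.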

To create our predictions for whether a drug treats an indication, we use the same metrics to compute the similarity of a drug to known treatments for an indication, again comparing signatures collected on the same cell lines. Then, we calculate the maximum similarity across all of these known treatments, under the assumption that if a new drug is similar to any known treatment, it may be a promising candidate. This similarity score is then used either on its own to calculate the AUC, or combined with the TAS in a logistic regression model, as described below.
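The maximum-similarity score can be sketched as follows, assuming a precomputed drug-by-drug similarity matrix for one cell line. The function name, the toy matrix, and the choice to skip the candidate drug when it is itself a listed treatment are illustrative assumptions.

```python
import numpy as np

def max_similarity(sim, drugs, candidate, known_treatments):
    """Return (most similar known treatment, similarity) for a candidate drug.

    The candidate itself is skipped if it appears among the known treatments,
    so a drug is never scored against its own signature.
    """
    index = {name: i for i, name in enumerate(drugs)}
    others = [t for t in known_treatments if t != candidate and t in index]
    if not others:
        return None, float("nan")
    row = sim[index[candidate], [index[t] for t in others]]
    best = int(np.argmax(row))
    return others[best], float(row[best])

# Toy symmetric similarity matrix for four hypothetical drugs.
drugs = ["drugA", "drugB", "drugC", "drugD"]
sim = np.array([
    [1.00, 0.10, 0.75, 0.30],
    [0.10, 1.00, 0.20, 0.05],
    [0.75, 0.20, 1.00, 0.40],
    [0.30, 0.05, 0.40, 1.00],
])
treatments = ["drugB", "drugC"]  # known treatments for one indication

best_treatment, score = max_similarity(sim, drugs, "drugA", treatments)
```

Returning the identity of the most similar treatment alongside the score is what makes each prediction interpretable.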

Exploring the impact of transcriptional activity score

We investigate whether restricting our analysis to signatures with stronger transcriptional response improves the ability of our approach to predict new drug indications. To assess this, we filter the signatures for each cell line based on TAS, evaluating thresholds ranging from 0.2 to 0.5, keeping only drugs meeting this threshold, and indications which have at least one treatment meeting the threshold. For each threshold, we determine the area under the receiver operating characteristic curve (AUC) for this filtered set of drug-indication pairs. We also calculate the percentage of unique drugs that remain available for prediction after filtering.
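A minimal sketch of the threshold sweep follows. It is simplified to filter drug-indication pairs by the TAS of the candidate drug only, whereas the analysis above also requires each indication to retain at least one treatment passing the cutoff. The data, function names, and cutoff grid are illustrative assumptions.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        return float("nan")
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def tas_sweep(drug_names, tas, scores, labels, cutoffs=(0.2, 0.3, 0.4, 0.5)):
    """For each TAS cutoff, return (cutoff, AUC, percent of unique drugs retained)."""
    drug_names = np.asarray(drug_names)
    tas = np.asarray(tas, dtype=float)
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_drugs = len(set(drug_names.tolist()))
    results = []
    for cutoff in cutoffs:
        keep = tas >= cutoff
        pct = 100.0 * len(set(drug_names[keep].tolist())) / n_drugs
        results.append((cutoff, auc(scores[keep], labels[keep]), pct))
    return results

# Toy drug-indication pairs: one positive and one negative pair per drug.
drug_names = ["d1", "d1", "d2", "d2", "d3", "d3", "d4", "d4"]
tas        = [0.60, 0.60, 0.45, 0.45, 0.25, 0.25, 0.10, 0.10]
scores     = [0.9, 0.2, 0.7, 0.3, 0.4, 0.5, 0.3, 0.6]   # max Spearman per pair
labels     = [1, 0, 1, 0, 1, 0, 1, 0]

results = tas_sweep(drug_names, tas, scores, labels)
```

In this toy data the AUC rises as the cutoff increases while the share of retained drugs falls, mirroring the trade-off the sweep is designed to expose.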

Generating the ensemble model

We train a logistic regression model for each of the cell lines to predict drug indications based on the highest Spearman correlation between the drug of interest and any other treatment for the indication, as annotated in the Drug Repurposing Hub. Then, we generate predictions combined across all three cell lines in an ensemble model. Because not all drug-indication pairs are tested on all cell lines, we combine these by taking the predicted probabilities $p_c$ from each cell line model $c$ as features for a second logistic regression model, trained on all drug-indication pairs present in all three cell lines. We obtain the coefficients from this logistic regression model and use these to generate the cell line weights $w_c$ in the ensemble model for each cell line $c$. Our ensemble model uses the weighted average to calculate the ensemble model’s predictions for drug-indication pairs, using all available cell lines that assess a given drug-indication probability:

$$p_{\mathrm{ensemble}} = \frac{\sum_{c \in A} w_c \, p_c}{\sum_{c \in A} w_c}$$

where $A$ is the set of cell lines in which the drug-indication pair is measured.
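The weighted average over available cell lines can be sketched as below. Missing cell lines are encoded as NaN, and the weights are renormalized over the cell lines that are present; both the encoding and the example weights are assumptions for illustration, not the fitted values.

```python
import numpy as np

def ensemble_probability(probs, weights):
    """Weighted average of per-cell-line probabilities, skipping missing cell lines.

    probs:   array of shape (n_pairs, n_cell_lines); NaN marks a cell line in
             which the drug-indication pair was not measured.
    weights: one non-negative weight per cell line (hypothetical values here).
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    available = ~np.isnan(probs)
    weighted_sum = (np.where(available, probs, 0.0) * weights).sum(axis=1)
    weight_total = (available * weights).sum(axis=1)
    return weighted_sum / weight_total

# Columns: MCF7, A375, PC3. The weights are made up for illustration.
weights = [0.5, 0.3, 0.2]
probs = [
    [0.9, 0.7, 0.8],        # measured in all three cell lines
    [0.6, np.nan, np.nan],  # measured in MCF7 only
    [np.nan, 0.2, 0.4],     # measured in A375 and PC3
]
ensemble = ensemble_probability(probs, weights)
```

Renormalizing by the weights of the available cell lines keeps the output on the probability scale even when a pair is measured in a single cell line.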

Evaluating the ensemble model on independent clinical trials data

To assess the ensemble model’s generalizability, we curate a separate data set sourced from the Aggregate Analysis of ClinicalTrials.gov (AACT) database [10] containing new experimental uses of drugs not present in the Drug Repurposing Hub. AACT is a publicly available, relational database that contains data, in a structured format, for all clinical studies registered in ClinicalTrials.gov. From the full set of clinical trials, we exclude those testing behavioral, device, diagnostic test, dietary supplement, procedure, radiation, or “other” interventions. Additionally, we filter out those that do not provide MeSH terms for both conditions and interventions tested. For the remaining clinical trials, we match the provided MeSH terms for both conditions and interventions to corresponding MeSH codes using the Unified Medical Language System (UMLS) database [11]. Then, for each intervention, we convert the MeSH code to a DrugBank [12] ID using the UMLS API (crosswalk function). In the case of drug combinations, we map their MeSH codes to RxNorm RxCUIs [13] using the UMLS API (“Retrieving Source-Asserted Relations”; vocabulary = RXNORM; relation label = has_part), obtain their active pharmaceutical ingredients (“Retrieving Source-Asserted Relations”; vocabulary = RXNORM; relation label = form_of), and then map them back to DrugBank IDs (crosswalk function). Finally, to enable mapping between the AACT and the Drug Repurposing Hub, we match the phenotypes in the Drug Repurposing Hub to MeSH codes using the UMLS API (“Searching the UMLS”; vocabulary = MSH; searchType = normalizedString).

The results of this normalization provide a test set for evaluating our model’s generalizability. For each drug-indication pair, we predict the probability of treatment, and for these new experimental drug indications we calculate the AUCs both for the individual cell line models and for the weighted average ensemble model. Specifically, we train our ensemble model using drug indications in the Drug Repurposing Hub as positive training examples, and a random set of drug-indication pairs not present in either the Drug Repurposing Hub or the AACT as negative training examples. Then, we test the performance on drug-indication pairs present only in the AACT (not present in the Drug Repurposing Hub) as positive testing examples, with a disjoint set of drug-indication pairs not present in either data set, and not used for training the model, as negative testing examples.
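The construction of disjoint training and testing sets can be sketched with plain set operations. The drug and indication identifiers, set sizes, and seeds below are hypothetical; only the disjointness logic follows the description above.

```python
import random
from itertools import product

def sample_negatives(drugs, indications, excluded, n, seed):
    """Randomly draw n drug-indication pairs that are not in the excluded set."""
    candidates = [pair for pair in product(sorted(drugs), sorted(indications))
                  if pair not in excluded]
    return set(random.Random(seed).sample(candidates, n))

# Hypothetical identifiers standing in for normalized drugs and indications.
drugs = ["d1", "d2", "d3", "d4"]
indications = ["i1", "i2", "i3"]
hub = {("d1", "i1"), ("d2", "i2")}                 # known indications
aact = {("d1", "i1"), ("d1", "i2"), ("d3", "i3")}  # pairs seen in trials

train_pos = hub
test_pos = aact - hub                              # trial-only pairs
train_neg = sample_negatives(drugs, indications, hub | aact, n=3, seed=0)
test_neg = sample_negatives(drugs, indications, hub | aact | train_neg, n=2, seed=1)
```

Excluding the training negatives when drawing the testing negatives is what keeps the out-of-sample evaluation free of pairs the model has already seen.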

To further explore the impact of TAS, we also re-fit our logistic regression models including not only Spearman correlation but also the TAS of both the drug of interest and the most correlated treatment for the indication. We then compare the AUCs of the weighted ensemble models to evaluate the influence of TAS on the model’s predictive performance.
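As a rough sketch of a per-cell-line model with TAS features, the code below fits a logistic regression on three predictors: the maximum Spearman correlation, the TAS of the drug of interest, and the TAS of the most correlated treatment. The data are synthetic, and the plain gradient-descent fit is a stand-in chosen to keep the example dependency-light; it is not the fitting procedure used in this study.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Fit an unregularized logistic regression by batch gradient descent."""
    X = np.column_stack([np.ones(len(X)), X])   # prepend an intercept column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)        # gradient of the mean log-loss
    return w

def predict_proba(w, X):
    """Predicted probability of the positive class for each row of X."""
    X = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-X @ w))

# Synthetic drug-indication pairs; the generating coefficients are arbitrary.
rng = np.random.default_rng(0)
n = 400
spearman = rng.uniform(-0.2, 0.9, size=n)
tas_drug = rng.uniform(0.0, 0.8, size=n)
tas_treatment = rng.uniform(0.0, 0.8, size=n)
true_logit = -3.0 + 5.0 * spearman + 1.0 * tas_drug
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X = np.column_stack([spearman, tas_drug, tas_treatment])
w = fit_logistic(X, y)   # w[0] intercept, w[1] Spearman, w[2:] TAS terms

p_low = predict_proba(w, np.array([[0.0, 0.4, 0.4]]))[0]
p_high = predict_proba(w, np.array([[0.8, 0.4, 0.4]]))[0]
```

Comparing the AUC of this model with that of a Spearman-only model is one way to quantify how much the TAS terms contribute.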

Results

Drugs known to share clinical and biological effects induce similar changes in expression

We propose that two drugs that treat the same disease induce similar gene expression changes. To assess this hypothesis, we compare the similarity in gene expression induced by drugs that share any indication, against the similarity in pairs of drugs that do not share any indication, as annotated in the Drug Repurposing Hub.

We investigate two metrics for quantifying similarity between the drug-induced changes in gene expression: the Spearman correlation and the Connectivity Score, an algorithm developed for and widely used in the Connectivity Map data [1, 14]. In the MCF7 cancer cell line, both the Spearman correlation and Connectivity Score were significantly higher among pairs of drugs sharing an indication (Fig. 1). However, the difference is more pronounced with Spearman correlations (p = 7.7e-38, rank sum test) than for Connectivity Scores (p = 5.2e-6, rank sum test). Similar findings were also true for A375 and PC3 cancer cell lines (Supplementary Figs. S1 and S2). This suggests that an approach using Spearman correlation may more successfully identify whether drugs share an indication based on similarity of gene expression.
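The rank sum comparison of similarity scores between the two groups of drug pairs can be sketched as below, using the normal approximation without a tie correction. The function name and the toy values are assumptions; a library routine such as `scipy.stats.mannwhitneyu` would normally be used.

```python
import math
import numpy as np

def rank_sum_test(x, y):
    """Mann-Whitney U statistic with a two-sided normal-approximation p-value.

    No tie correction is applied, so this is only a sketch for continuous scores.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    diff = x[:, None] - y[None, :]
    u = float(np.sum(diff > 0) + 0.5 * np.sum(diff == 0))
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, z, p_two_sided

# Toy similarity scores for pairs that do and do not share an indication.
shared = [0.5, 0.6, 0.7, 0.8]
not_shared = [0.1, 0.2, 0.3, 0.4]
u, z, p = rank_sum_test(shared, not_shared)
auc_equivalent = u / (len(shared) * len(not_shared))
```

Dividing U by the number of cross-group pairs gives the AUC, so the rank sum test and the AUC reported elsewhere measure the same separation on different scales.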

Fig. 1.

(A) Spearman correlation between the changes in gene expression for all pairs of drugs in the Drug Repurposing Hub tested on the MCF7 cell line with recorded Connectivity Scores. (B) Connectivity Scores between the changes in gene expression for pairs of drugs from the same set of drugs

We expect that the success of this approach is because the Spearman correlation distinguishes drugs with biologically similar effects. To test this hypothesis, we examine whether higher correlation between pairs of drugs reflects other data suggesting the drugs have similar biology. We observe that pairs of drugs that share more gene targets have higher Spearman correlation (Supplementary Fig. S7A). Similarly, when using the Anatomical Therapeutic Chemical (ATC) classification as a proxy for mechanism of action, we find that drug pairs classified under the same ATC level (2, 3, or 4) exhibit significantly higher Spearman correlations in their perturbation expression profiles compared to those from different ATC levels (Supplementary Fig. S7B). Together, these findings suggest that Spearman correlation captures underlying biological similarities between drugs.

Strength of transcriptional response impacts predictive utility of gene expression

The LINCS L1000 data set also includes a metric called the Transcriptional Activity Score (TAS) [1] describing the strength of the transcriptional response the treatment induces in a cell line. Drugs that exhibit a TAS above 0.2 induce a strong transcriptional response, meaning that they cause significant changes in expression of many genes. Therefore, we wanted to understand (1) how well similarity in gene expression changes can predict whether drugs treat the same indication and (2) whether filtering for drugs based on TAS improves this predictive signal.

To answer these questions, we filter drugs based on TAS, and we assess the ability of Spearman correlation to distinguish drugs that share an indication, as measured with area under the receiver operating characteristic curve (AUC). Without filtering on TAS, we find an AUC of 0.69 (Fig. 2B). Again, this supports that Spearman correlation in induced expression can distinguish drugs treating the same indication. Filtering for drugs at a higher TAS increases the AUC between drugs that share and do not share the indication (Fig. 2A). For instance, filtering for TAS above 0.5 improves the AUC from 0.69 to 0.80 (Fig. 2B). However, only 10% of drugs pass this stringent filter (Fig. 2B). In other words, filtering for drugs based on TAS results in a trade-off between the strength of the predictive signal and the number of drugs evaluated. Similar results were observed for the A375 and PC3 cancer cell lines (Supplementary Figs. S3 and S4).

Fig. 2.

(A) Spearman correlation between gene expression changes in pairs of drugs in the Drug Repurposing Hub tested on the MCF7 cell line with TAS above various cutoffs. (B) The percent of all drugs available (orange) and AUCs of models based on Spearman correlations (blue) after filtering for drugs above each TAS cutoff

Ensemble model generalizes to predict drug indications currently in trials

Next, we build a model to predict, for each combination of a drug and an indication, the probability that the drug treats the indication. Our predictions use the highest Spearman correlation between that drug and any treatment for that indication, as annotated in the Drug Repurposing Hub. We fit separate logistic regression models for each of the top three cancer cell lines used in the LINCS L1000 data (MCF7, A375, and PC3) (Fig. 3A). A k-fold stratified cross-validation (k = 3) was also performed using known drug indications from the Drug Repurposing Hub. The average AUC scores for the training and validation data sets across all TAS cutoffs were similar, suggesting that training logistic regression models based on the Spearman correlation between the most correlated drug and a known drug for each indication did not result in overfitting (Supplementary Fig. S5).

Fig. 3.

(A) ROC curves of the individual cell line models used to generate the weighted average ensemble model, and of the ensemble model, on experimental uses of drugs shared across all three cell lines. All models were trained on only the Spearman correlation between the drug of interest and existing treatments for an indication. (B) ROC curves of the ensemble model trained with and without including the TAS of the drugs on all experimental uses of drugs

Finally, we create an ensemble model by combining predicted probabilities from each cell line model, as described in the Methods, with an AUC of 0.854 for predicting indications in Drug Repurposing Hub. To assess the generalizability of this model, meaning its ability to predict new drug uses, we compile clinical trials data, excluding any drug indication pairs present in the Drug Repurposing Hub. Our model is able to predict these new experimental uses of 1,419 drug indications: all individual cell line models exhibited AUCs above 0.70 with the ensemble model performing similarly to the best performing cell line model (Fig. 3A).

Because not all drugs are measured across all cell lines, we also use our weighted model to generate predictions for all 19,562 drug indication pairs measured in at least one of the cell lines, finding an AUC of 0.708 (Fig. 3B). This suggests that our ensemble model can determine the best weights for each cell line and can predict new experimental uses of drugs, even with variable amounts of data.

Finally, we again explore how the transcriptional response may affect the model’s predictive ability for new indications. To test this, we repeat our analysis but include both Spearman correlation and TAS as predictors in the per-cell line models. Including the TAS as a feature of the ensemble model did slightly improve its performance on all experimental uses of drugs across all cell lines, with an AUC of 0.712 (Fig. 3B). Repeating this analysis with an ensemble model trained on drugs inducing a high transcriptional response resulted in a similar AUC (Supplementary Fig. S6B). These findings indicate that both filtering for drugs by TAS and including the TAS of the drugs when training the ensemble model slightly improve the predictive ability of the model.

Literature-based evaluation of the method

The analysis above provides an unbiased and objective evaluation of the likely success of thousands of drug repurposing candidates. To examine whether the drug repurposing suggestions highly ranked by our method are reasonable and supported by some data, we performed a literature review of the top 20 candidates. These candidates were selected by probability of success in our final combined model, and for each, the most correlated treatment for the disease, and the correlation itself, are presented (Table 1). We are able to find in vitro evidence, experimental clinical data, or linkage to closely related diseases for 17 out of the 20 drug repurposing candidates. For comparison, we also repeated our literature review for a set of 20 random drug repurposing pairs, and we were able to find similar support for only 4 out of 20 (Supplementary Table S1). In this small manually reviewed set, then, literature support was 4.25 times as frequent for our top-ranked drug repurposing candidates as for a random set of drug-indication pairs (17/20 versus 4/20, corresponding to an odds ratio of about 22.7).
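As a quick arithmetic check on the counts above (17 of 20 top candidates versus 4 of 20 random pairs with literature support), the sketch below computes both the ratio of proportions and the odds ratio. The function name is illustrative.

```python
def ratio_and_odds_ratio(a_yes, a_total, b_yes, b_total):
    """Return (ratio of proportions, odds ratio) for group A relative to group B."""
    proportion_ratio = (a_yes / a_total) / (b_yes / b_total)
    odds_a = a_yes / (a_total - a_yes)
    odds_b = b_yes / (b_total - b_yes)
    return proportion_ratio, odds_a / odds_b

# Top-20 candidates: 17 supported; random 20 pairs: 4 supported.
proportion_ratio, odds_ratio = ratio_and_odds_ratio(17, 20, 4, 20)
```

The ratio of proportions is (17/20)/(4/20) = 4.25, whereas the odds ratio is (17/3)/(4/16), roughly 22.7; the two summaries diverge sharply when the supported fraction is large.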

Table 1.

Top 20 predictions and literature-based evaluation of their likely success

| Drug | Disease | In trials | Predicted probability | Most correlated treatment for disease | Spearman correlation | Literature support |
|---|---|---|---|---|---|---|
| metronidazole | Hypertension | False | 0.946046 | lacidipine | 0.928048 | None |
| clofarabine | Ovarian Neoplasms | False | 0.917859 | gemcitabine | 0.827014 | Other cancers, in vitro evidence https://www.nature.com/articles/nrd2055 |
| atomoxetine | Gastritis | False | 0.913013 | rebamipide | 0.812971 | None |
| cladribine | Precursor Cell Lymphoblastic Leukemia-Lymphoma | True | 0.905612 | clofarabine | 0.836042 | In trials, approved for other leukemia |
| pitavastatin | Stroke | False | 0.901856 | atorvastatin | 0.860484 | Other statins used for prevention, support in https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0113766 |
| pitavastatin | Myocardial Infarction | False | 0.899704 | atorvastatin | 0.860484 | Other statins used for therapeutics, in trials https://www.ajconline.org/article/S0002-9149(11)02287-9/abstract |
| atorvastatin | Hyperlipidemias | True | 0.899704 | pitavastatin | 0.860484 | Known indication |
| vorinostat | Lymphoma, T-Cell, Peripheral | True | 0.896806 | belinostat | 0.778831 | Other histone deacetylase inhibitors are approved; this drug approved for cutaneous T-cell lymphoma |
| lacidipine | Leukemia, Myeloid, Acute | False | 0.894831 | midostaurin | 0.765895 | None |
| gefitinib | Melanoma | False | 0.891577 | vemurafenib | 0.758247 | Inhibits drivers of cancer https://www.sciencedirect.com/science/article/pii/S0304419X22000798; tested in Phase II clinical trial for melanoma https://journals.lww.com/melanomaresearch/abstract/2011/08000/a_phase_ii_study_of_gefitinib_in_patients_with.11.aspx |
| mitoxantrone | Multiple Myeloma | True | 0.891278 | doxorubicin | 0.757554 | Phase II clinical trial (terminated) (NCT00005987); active Phase I clinical trial (NCT05857982) |
| mitoxantrone | Precursor Cell Lymphoblastic Leukemia-Lymphoma | True | 0.891278 | doxorubicin | 0.757554 | In Phase III clinical trial and approved for another type of leukemia (acute nonlymphocytic leukemia) |
| mitoxantrone | Hodgkin Disease | True | 0.891278 | doxorubicin | 0.757554 | In Phase III clinical trial. Hodgkin disease categorized as an off-label use in DrugBank |
| doxorubicin | Prostatic Neoplasms | True | 0.891278 | mitoxantrone | 0.757554 | In trials |
| doxorubicin | Multiple Sclerosis | False | 0.891278 | mitoxantrone | 0.757554 | Shares mechanism of action with mitoxantrone (TOP2A). Both are cancer drugs but only mitoxantrone has been repurposed. |
| cladribine | Breast Neoplasms | False | 0.885898 | gemcitabine | 0.783145 | In vitro evidence of efficacy https://www.jstage.jst.go.jp/article/cpb/65/8/65_c17-00261/_html/-char/en |
| adapalene | Pancreatic Neoplasms | False | 0.885639 | gemcitabine | 0.744790 | Various anticancer effects reported https://www.sciencedirect.com/science/article/pii/S2211715624003904 |
| adapalene | Ovarian Neoplasms | False | 0.885639 | gemcitabine | 0.744790 | See above |
| amsacrine | Prostatic Neoplasms | False | 0.884674 | mitoxantrone | 0.742661 | Treats another cancer (AML) |
| clofarabine | Carcinoma, Non-Small-Cell Lung | False | 0.884219 | gemcitabine | 0.827014 | In vitro evidence of efficacy https://jitc.bmj.com/content/13/2/e010252 |

Discussion and conclusions

Here, we present a simple but powerful approach to predicting new drug indications. Largely, methods for finding new drugs for a disease have used the connectivity score, searching for drugs that reverse the genes differentially expressed in disease [3, 4, 15]. Our approach can complement methods that rely on disease gene expression profiles, by leveraging the similarity between a drug and other drugs known to treat the disease. This tactic bypasses the challenge of determining the appropriate cell or tissue for comparing a drug and disease’s gene expression. By avoiding the need to estimate the expression profile of a disease, our approach also allows us to prioritize drug indications for diseases with no available gene expression profile, as long as data exists on other treatments for the disease. While similarity between drugs, such as similarity in molecular structures, has been used to suggest drug repurposing, to our knowledge no method directly combines indications of known drugs with similarity in gene expression in a predictive model of new drug indications.

Our results also advance the use of the LINCS L1000 Connectivity Map for drug repurposing by showcasing its strengths and limitations. Some recent studies have questioned the reliability of this resource, and we also perform an analysis of the strengths and weaknesses of this data. First, we are able to support our overall hypothesis by demonstrating that drugs sharing indications induce more similar gene expression, supporting the direct use of the LINCS data for drug repurposing, without necessarily relying on disease gene expression profiles. We also show that the Spearman correlation, rather than the more commonly used Connectivity Score, is more able to pick out pairs of drugs known to share an indication. We propose that this may be because the Connectivity Score relies on pre-selecting differentially expressed genes using a cutoff, while the Spearman correlation takes advantage of all landmark gene measurements. Additionally, we find that the Transcriptional Activity Score (TAS) is a meaningful factor to consider when building a model to predict drug indications: the ability of the Spearman correlation to distinguish drug pairs sharing an indication increased with increasing TAS cutoffs. However, TAS filtering also excludes many drugs, which may explain why the ability to predict on experimental uses of drugs did not benefit from this filter. Future models may benefit from accounting for TAS while not employing a strict filter.

To allow researchers to investigate new treatments for a disease of interest, we have developed a browsable web tool containing recommended drugs for 170 common indications, available at https://bsultan.shinyapps.io/web-app. Although our stringent out-of-sample AUC is moderate at around 0.70, we emphasize that our results include a number of new drug indications that are highly plausible. For example, our model predicts that olaparib treats renal cell carcinoma, which aligns with clinical observations showing a 20% reduction in disease progression in a subset of patients with metastatic renal cell carcinoma treated with olaparib [16]. Similarly, our model predicts that pitavastatin can treat myocardial infarction and stroke, supported by studies showing that this drug results in a 35% reduction of major adverse cardiovascular events in HIV patients [17]. Moreover, we not only provide a predicted probability for each proposed new drug indication, but we also offer a rationale for that probability, highlighting which known treatment is most similar to that drug. In the case of olaparib, it is similar to a current renal cell carcinoma treatment, sunitinib: both drugs may have downstream effects on the DNA damage repair response, explaining the possible benefit [18]. For pitavastatin, the new indication for myocardial infarction is supported by the similarity in expression to a known treatment, atorvastatin. Statins are known to share many physiological effects, and it is logical that they may have similar effects on disease [19].

Our study does have some limitations. First, our final model predicts whether a given drug treats a given indication based solely on the highest correlation between a drug and any known treatment for a disease. While this simple approach avoids overfitting, it does not maximize the use of available data, ignoring information on any other treatments for an indication. It also does not leverage other measures of similarity in expression beyond global correlation. Second, some drugs or indications may be better modeled by particular cell lines. Our ensemble model assigns a fixed contribution to each cell line in the prediction, whereas more complex models could adaptively weight the predictions derived from each cell line. Third, the distribution of Spearman correlations between drug pairs sharing indications overlaps the distribution for drug pairs sharing no indication, suggesting that this metric does not perfectly distinguish drugs sharing indications. This is likely because drugs sharing similar targets and mechanisms of action may still cause distinct expression perturbation effects. However, our analysis shows that drugs with higher Spearman correlation not only share indications, but also share gene targets and pharmacological and chemical similarity (as annotated in ATC codes). Therefore, Spearman correlation reflects therapeutically relevant biological similarity between drugs.

Further analysis is needed before any one drug candidate can be supported for experimental or clinical follow-up, and we believe that our method can complement methods that use a list of differentially expressed genes to mine the Connectivity Map for repurposing. Future work could incorporate both methods into one shared model. As well, methods that exploit the similarity between drug in vitro effects for repurposing [20] could exploit this signal in more complex models. The strength of our analysis is its simplicity and interpretability. We have made all code and recommended new drug indications publicly browseable and reproducible. We expect that our work will help accelerate the use of the LINCS Connectivity Map for drug repurposing.


Acknowledgements

Not applicable.

Author contributions

KH co-wrote the main text and performed the analyses. PNL curated the data sets for drug indications and clinical trials and contributed to the methods. BS contributed the web tool. RDM conceptualized the approach and co-wrote the manuscript. All authors reviewed the manuscript.

Funding

This work was supported by the National Institute of General Medical Sciences (NIGMS R35 GM151001-01) to RDM. BS was supported by NSF Louis Stokes New STEM Pathways and Research Alliance: Urban Massachusetts LSAMP Award number 2308724. The funders had no role in the study design.

Data availability

No datasets were generated or analysed during the current study.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Subramanian A, et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell. 2017;171:1437–1452.e17.
2. Lamb J, et al. The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006;313:1929–35. doi: 10.1126/science.1132939.
3. Chen B, et al. Reversal of cancer gene expression correlates with drug efficacy and reveals therapeutic targets. Nat Commun. 2017;8:16022.
4. Sirota M, et al. Discovery and preclinical validation of drug indications using compendia of public gene expression data. Sci Transl Med. 2011;3:96ra77.
5. Iwata M, Sawada R, Iwata H, Kotera M, Yamanishi Y. Elucidating the modes of action for bioactive compounds in a cell-specific manner by large-scale chemically-induced transcriptomics. Sci Rep. 2017;7:40164.
6. Lim N, Pavlidis P. Evaluation of connectivity map shows limited reproducibility in drug repositioning. BioRxiv. 2019;845693. doi: 10.1101/845693.
7. Huang S, Hu P, Lakowski TM. Predicting breast cancer drug response using a multiple-layer cell line drug response network model. BMC Cancer. 2021;21:648.
8. Heinrich L, Kumbier K, Li L, Altschuler SJ, Wu LF. Selection of optimal cell lines for high-content phenotypic screening. ACS Chem Biol. 2023;18:679–85.
9. Corsello SM, et al. The drug repurposing hub: a next-generation drug library and information resource. Nat Med. 2017;23:405–8.
10. Clinical Trials Transformation Initiative (CTTI). Aggregate Analysis of ClinicalTrials.gov (AACT) Database. https://aact.ctti-clinicaltrials.org/
11. Bodenreider O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004;32:D267–270.
12. Wishart DS, et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006;34:D668–672.
13. Nelson SJ, Zeng K, Kilbourne J, Powell T, Moore R. Normalized names for clinical drugs: RxNorm at 6 years. J Am Med Inf Assoc. 2011;18:441–8.
14. Subramanian A, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci. 2005;102:15545–50.
15. Sun S, et al. Reversal gene expression assessment for drug repurposing, a case study of glioblastoma. J Transl Med. 2025;23:25.
16. Ged G, et al. ORCHID: a phase II study of olaparib in metastatic renal cell carcinoma patients harboring a BAP1 or other DNA repair gene mutations. Oncologist. 2023;28:S1.
17. Sugimoto H, et al. The long-term effects of pitavastatin on blood lipids and platelet activation markers in stroke patients: impact of the homocysteine level. PLoS ONE. 2014;9:e113766.
18. Yang X-D, et al. PARP inhibitor olaparib overcomes sorafenib resistance through reshaping the pluripotent transcriptome in hepatocellular carcinoma. Mol Cancer. 2021;20:20.
19. Moroi M, et al. Outcome of pitavastatin versus atorvastatin therapy in patients with hypercholesterolemia at high risk for atherosclerotic cardiovascular disease. Int J Cardiol. 2020;305:139–46.
20. Habib M, Lalagkas PN, Melamed RD. Mapping drug biology to disease genetics to discover drug impacts on the human phenome. Bioinf Adv. 2024;4:vbae038.

Articles from Human Genomics are provided here courtesy of BMC
