Author manuscript; available in PMC: 2025 Aug 1.
Published in final edited form as: Lancet Haematol. 2025 Feb;12(2):e94–e96. doi: 10.1016/S2352-3026(24)00278-3

Pioneering a new field of computational pharmacophenomics to unlock the life-saving potential of existing medicines

David C Fajgenbaum 1,2, Sally Nijim 3,4, Grant Mitchell 5, Matej Macak 6, Chris Bizon 7, Alexander Tropsha 8, David Koslicki 9
PMCID: PMC12090018  NIHMSID: NIHMS2063453  PMID: 39909662

All physicians, including haematologists, are faced with the challenge of what to do for patients who have diseases with no US Food and Drug Administration (FDA)-approved or European Medicines Agency (EMA)-approved therapies or who have run out of approved therapies for their diseases. What do you do when there are no more approved treatments left?

In many cases, medicines approved for another indication are repurposed for off-label use in these patients. Drugs like rituximab, tocilizumab, and thalidomide have been prescribed for many diseases other than those they were first approved for. For example, tocilizumab was first developed for idiopathic multicentric Castleman disease (iMCD) before it was approved or widely used off-label for rheumatoid arthritis, cytokine release syndrome, severe COVID-19, and other diseases. Drug repurposing is a promising approach for discovering novel treatments because many diseases share similar underlying mechanisms (eg, iMCD, rheumatoid arthritis, cytokine release syndrome, and severe COVID-19 can involve dysregulated interleukin-6 signalling) and, therefore, can be treated with the same drug (eg, tocilizumab, which is an interleukin-6 inhibitor). Further, many drugs can modulate multiple targets, each with potential roles across multiple diseases; for example, thalidomide depletes the proteins Ikaros and CK1α. In fact, on average, a drug molecule is estimated to bind to approximately 30 proteins.1 The average approved drug is indicated for 2–3 different diseases, and more than 60% of drugs have been considered for diseases for which they were not approved.2

Repurposing is facilitated by the fact that physicians can legally prescribe drugs off-label for diseases they are not approved for; in the USA, 20–32% of all prescriptions are off-label.3 Further, advances in data generation and analytics have enabled rapid repurposing discoveries.

Given the tremendous investment in and therapeutic potential of already-approved medicines, one would assume that drug repurposing would occur systematically – that drugs would be assessed and fully utilised for all diseases they can possibly treat. Unfortunately, repurposing is often pursued for a given disease or drug due to serendipity (eg, a researcher uncovers a new use, a doctor observes an unexpected effect). Of the approximately 18 500 diseases (this number varies widely depending on ontology and subclassification), only around 4000 are estimated to have approved treatments, and over 300 million people worldwide are estimated to have diseases with no approved therapies. While new drug development continues to identify treatments for these diseases, we should also address this treatment gap by identifying and utilising already-approved therapies in all additional diseases that could potentially benefit.

Several barriers prevent drug repurposing from being performed systematically. First, there are insufficient incentives to repurpose approved drugs, especially the approximately 80% of drugs that are already generic (and no longer highly profitable) and particularly for diseases that are rare or affect otherwise neglected populations. Second, while repurposing has been pursued sporadically, for specific diseases or drugs, no platform has centralised all available biomedical data on off-label uses or reviewed all known connections between all existing drugs and diseases. The data infrastructure, algorithms, and computing power were not sufficiently robust to perform analyses of all drugs’ biological potential across all diseases until recently. Third, no organisation has taken responsibility to use the world’s collective knowledge and data to systematically and continuously assess the most promising repurposing opportunities and ensure drugs are fully utilised for all diseases they can treat.

However, progress has recently been made to overcome these barriers and a new field is emerging. One leading effort is the National Center for Advancing Translational Sciences (NCATS) Biomedical Data Translator initiative,4 which launched in 2017 and has built crucial data infrastructure, ontologies, and models to integrate a vast array of biomedical knowledge and perform analyses that were not previously possible. Another leading effort comes from the Zitnik Lab at Harvard Medical School, which has built PrimeKG and TxGNN to help identify therapeutic opportunities for diseases with limited treatment options and minimal molecular understanding.5 It has also become possible to capture, systematically organise, and reason over millions of interconnected and semantically harmonised relationships between biological, biomedical, and chemical entities. Specifically, the Biomedical Data Translator has used knowledge graphs to integrate multiple disparate biomedical databases and map connections between biomedical entities using a standardised ontology. These biomedical concepts (eg, genes, proteins, cell types, phenotypic characteristics, diseases, and drugs) are represented as interconnected nodes with edges that represent the biological relationships between the nodes; a fundamental component of the graph is a semantic triple, ie, two nodes connected by an edge (eg, “mTOR is upregulated in iMCD” and “mTOR is inhibited by sirolimus”, with “mTOR”, “iMCD”, and “sirolimus” being nodes and “upregulated” and “inhibited” representing edges). A comprehensive knowledge graph representing our understanding of human biology can be a powerful tool to support drug repurposing.
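As a rough illustration of the semantic triples described above, the two example statements can be stored as (subject, predicate, object) records and traversed in two hops. This is only a minimal sketch: the predicate labels `upregulated_in` and `inhibited_by`, the `Triple` type, and the `candidate_drugs` helper are hypothetical names chosen for this example and are not taken from the Biomedical Data Translator ontology or from any Every Cure code.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One edge of a knowledge graph: subject node, edge label, object node."""
    subject: str
    predicate: str
    obj: str


# The two example triples from the text. Predicate names are illustrative
# assumptions, not terms from a real biomedical ontology.
TRIPLES = [
    Triple("mTOR", "upregulated_in", "iMCD"),
    Triple("mTOR", "inhibited_by", "sirolimus"),
]


def build_index(triples):
    """Group triples by subject so all edges leaving a node can be looked up."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.subject].append(triple)
    return index


def candidate_drugs(triples, disease):
    """Return drugs that inhibit any node recorded as upregulated in `disease`."""
    drugs = set()
    for edges in build_index(triples).values():
        is_upregulated = any(
            e.predicate == "upregulated_in" and e.obj == disease for e in edges
        )
        if is_upregulated:
            drugs.update(e.obj for e in edges if e.predicate == "inhibited_by")
    return sorted(drugs)


print(candidate_drugs(TRIPLES, "iMCD"))
```

A hand-written path rule like this only shows how triples chain together; the platforms discussed in this piece learn such patterns from data rather than hard-coding them.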

Following a decade of repurposing drugs for iMCD and other rare diseases, we launched a nonprofit organisation called Every Cure (www.everycure.org) in 2022 to build upon the Biomedical Data Translator initiative and take responsibility for building a drug repurposing platform to advance the most promising opportunities to patients worldwide, regardless of profit potential. In the process, we are pioneering a new field called computational pharmacophenomics, which involves systematic quantitative evaluation of the potential of all drugs (pharmaco-) as treatments against all diseases and even sub-phenotypes within a disease (-phenomics; Figure). This new paradigm emerged from identifying several repurposed treatments (including one that the lead author [DCF] discovered and that saved his life) and from the desire to do this systematically in a drug-agnostic and disease-agnostic manner.

Figure: Computational pharmacophenomics quantifies the likelihood of every approved drug to be able to treat every disease.

(A) Visualization of traditional drug repurposing approaches that begin with either a single drug or a single disease and then look for matches for that drug or disease. (B) Visualization of computational pharmacophenomics whereby systematic quantitative evaluation is performed for the potential of all drugs (pharmaco-) as treatments against all diseases and even sub-phenotypes within a disease (-phenomics). (C) A sample portion of a knowledge graph representing biomedical concepts (nodes) and the relationships between them (edges). ML models are trained on known “treats” relationships (thick lines) and applied to the rest of the graph to quantify the likelihood that a new drug–disease match would be an effective treatment (thick arrows). (D) The results of every drug–disease match are presented in a heatmap, which can be exported into a match rank list in order to prioritize matches for further research, generate pre-clinical and clinical evidence, and change clinical practice.

Computational pharmacophenomics differs from traditional drug repurposing approaches, which typically start with either a disease of interest (often done by academic researchers and disease organisations) or a specific drug of interest (often led by pharmaceutical companies) (Figure A). Instead of restricting repurposing searches to a predefined scope of diseases or drugs, computational pharmacophenomics quantifies the relative strength of evidence across all 74 million drug–disease matches (unique pairs formed by approximately 4000 drugs and 18 500 diseases; Figure B). This global scoring approach enhances repurposing success by utilising the entirety of biomedical knowledge to prioritise drug–disease pairs with the strongest evidence. Effectively, this increases the likelihood of success by searching for the lowest-hanging fruit not just on one tree but across the entire forest.
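The 74 million figure follows directly from pairing every drug with every disease. The short sketch below reproduces that arithmetic and contrasts it with the search space of a single-drug or single-disease project; the function and variable names are hypothetical, and the counts are the approximate values quoted in the text.

```python
def count_pairs(n_drugs: int, n_diseases: int) -> int:
    """Number of unique drug-disease pairs when every drug meets every disease."""
    return n_drugs * n_diseases


N_DRUGS = 4_000      # approximate number of drugs quoted in the text
N_DISEASES = 18_500  # approximate number of diseases quoted in the text

all_pairs = count_pairs(N_DRUGS, N_DISEASES)
one_disease_scope = count_pairs(N_DRUGS, 1)   # disease-first search
one_drug_scope = count_pairs(1, N_DISEASES)   # drug-first search

print(f"all drugs x all diseases: {all_pairs:,}")          # 74,000,000
print(f"one disease, all drugs:   {one_disease_scope:,}")  # 4,000
print(f"one drug, all diseases:   {one_drug_scope:,}")     # 18,500
```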

The Every Cure platform is built to integrate multiple knowledge sources, including knowledge graphs from the Biomedical Data Translator Initiative and beyond (eg, PrimeKG), and is constructed from dozens of biomedical data sources, real-world evidence, and other sources (eg, next generation sequencing data, high-throughput screening data).5–9 Every Cure is enhancing these knowledge graphs with proprietary data, research results, and new insights from large language models. Using these comprehensive graph-based representations of biomedical knowledge (eg, knowledge graphs) and non-graph-based models (eg, large language models), Every Cure’s platform trains multiple machine-learning algorithms to recognise patterns of connections between drugs, their mechanism of action, diseases they are known to treat, and other associated biomedical concepts (Figure C). One of these algorithms, KGML-xDTD, utilises random forest and reinforcement learning models to generate a normalised score between 0 and 1 for each drug–disease match based on the patterns of connections for known treatments and predicts a mechanism of action.10 Other approaches utilise additional machine learning models, graph-based learning, and language models. The scores that are generated can be displayed in a heatmap of all drugs versus all diseases and ranked based on the strengths of those connections in order to prioritize matches for further data generation and changes to clinical practice (Figure D). The benefits of this methodology include the ability to: (1) consider diverse and heterogeneous datasets simultaneously to predict relationships between biomedical concepts; (2) easily update data structures; (3) perform intuitive queries and evaluations of the data; and (4) scale evaluation of all drugs versus all diseases. This approach enables the identification and prioritisation of already established but currently under-pursued repurposing opportunities as well as the evaluation and comparison of complex patterns to identify new repurposing opportunities that might not be immediately obvious to researchers.
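The step from a heatmap of scores to a match rank list can be sketched as a simple sort. The example below is an assumption-laden toy: the drug and disease names are placeholders, the scores are invented purely for illustration, and `rank_matches` is a hypothetical helper rather than part of KGML-xDTD or the Every Cure platform. It assumes only what the text states, namely that each drug–disease match carries a normalised score between 0 and 1.

```python
def rank_matches(scores, known_treats=frozenset(), top_k=None):
    """Sort drug-disease matches by score, highest first.

    scores       -- dict mapping (drug, disease) to a score in [0, 1]
    known_treats -- pairs that are already established treatments; these are
                    excluded so the list contains only new candidates
    top_k        -- optional cap on the number of matches returned
    """
    if any(not 0.0 <= s <= 1.0 for s in scores.values()):
        raise ValueError("scores are expected to be normalised to [0, 1]")
    candidates = [
        (drug, disease, score)
        for (drug, disease), score in scores.items()
        if (drug, disease) not in known_treats
    ]
    # Ties are broken alphabetically so the ordering is deterministic.
    candidates.sort(key=lambda row: (-row[2], row[0], row[1]))
    return candidates if top_k is None else candidates[:top_k]


# Placeholder scores, invented for illustration only (not real predictions).
TOY_SCORES = {
    ("drug_A", "disease_X"): 0.91,
    ("drug_A", "disease_Y"): 0.12,
    ("drug_B", "disease_X"): 0.47,
    ("drug_B", "disease_Y"): 0.88,
}
KNOWN_TREATS = {("drug_A", "disease_X")}

for drug, disease, score in rank_matches(TOY_SCORES, KNOWN_TREATS):
    print(f"{score:.2f}  {drug} -> {disease}")
```

Excluding known treatments is a design choice of this sketch; a real ranking might instead keep them as a sanity check that established therapies score highly.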

In addition to identifying the drug repurposing opportunities most likely to be effective, the Every Cure platform selects the opportunities with the highest patient impact and employs a comprehensive validation framework that evaluates opportunities through multiple modalities (eg, in vitro, in vivo, real-world evidence, clinical trials), feeds outcomes of these studies back into the platform, and advances treatments towards full clinical adoption. Since these drugs are already approved and potentially being used off-label or to treat a co-morbidity, real-world studies can be performed to evaluate the predictions being made by the platform. A portal was recently created for physicians, researchers, and patients to contribute repurposing ideas at everycure.org/insights.

By early 2026, we will provide putative relative efficacy scores for all drug–disease matches on a public website. While the platform is primarily aimed at identifying top repurposing opportunities for advancement in preclinical or clinical studies by researchers at Every Cure and elsewhere, we have also identified matches that have already undergone sufficient clinical studies to prove efficacy but are underutilized and require guideline changes and greater awareness. Further, clinicians will be able to view all FDA-approved drugs, guideline-recommended treatments, and a rank-order list of research-grade predictions based on mechanism and clinical data. We believe this list of data-supported hypotheses could serve as a starting point for deeper analysis by clinicians and biomedical researchers.

We have already begun to pilot these processes. Recently, we learned of a patient with POEMS syndrome refractory to therapies who was preparing for hospice care. We reviewed scores in Every Cure’s platform for POEMS and recommended several high-scoring medications approved for multiple myeloma. These treatments saved the patient’s life, and he has been in remission for 9 months as of the time of the submission of this piece (unpublished data).

In summary, the field of computational pharmacophenomics, which has emerged recently but is described for the first time in this paper, offers a novel, systematic approach to rapidly uncover new treatments by developing a universal scoring system for assessing the likelihood of therapeutic effect across all 74 million possible drug–disease combinations. Recent progress has ushered in the powerful possibility of using these computational approaches to identify and advance the greatest opportunities to help patients. As more data are generated about drugs, diseases, targets, mechanisms of action, and real-world off-label use, this information will serve as fuel for predicting new treatments for patients in need.

Acknowledgements:

The authors wish to thank Daniel Korn, Tracey Sikora, and Ruxandra Draghia-Akli for their contributions to this work.

Funding:

This research was, in part, funded by the Advanced Research Projects Agency for Health (ARPA-H, Agreement # 140D042490001). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the United States Government. This research was also supported in part by the National Heart, Lung, and Blood Institute (R01HL141408), National Institute of General Medical Sciences (U24GM144330), and US Food & Drug Administration (R01FD007632) as well as the NCATS Biomedical Data Translator Initiative, Chan Zuckerberg Initiative, Lyda Hill Philanthropies, Arnold Ventures, Elevate Prize Foundation, and Carolyn Smith Foundation.

Footnotes

Competing interests: AT, DK, CB, and SN report receiving support for attending a meeting from Chan Zuckerberg Initiative. DF and GM report receiving support for attending a meeting from Chan Zuckerberg Initiative and the Elevate Prize Foundation.

Contributor Information

David C Fajgenbaum, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Every Cure, New York, NY, USA.

Sally Nijim, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Every Cure, New York, NY, USA.

Grant Mitchell, Every Cure, New York, NY, USA.

Matej Macak, Every Cure, New York, NY, USA.

Chris Bizon, Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC, USA.

Alexander Tropsha, Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC, USA.

David Koslicki, Department of Computer Science and Engineering, Biology, and The Huck Institute of the Life Sciences, Pennsylvania State University, State College, PA, USA.

References

  • (1) Chartier M, Morency L-P, Zylber MI, Najmanovich RJ. Large-scale detection of drug off-targets: hypotheses for drug repurposing and understanding side-effects. BMC Pharmacology and Toxicology 2017; 18: 18.
  • (2) Baker NC, Ekins S, Williams AJ, Tropsha A. A bibliometric review of drug repurposing. Drug Discov Today 2018; 23: 661–72.
  • (3) Van Norman GA. Off-Label Use vs Off-Label Marketing of Drugs. JACC Basic Transl Sci 2023; 8: 224–33.
  • (4) Biomedical Data Translator Consortium. Toward a universal biomedical data translator. Clinical and Translational Science 2019; 12: 86.
  • (5) Chandak P, Huang K, Zitnik M. Building a knowledge graph to enable precision medicine. Scientific Data 2023; 10: 67.
  • (6) Wood EC, Glen AK, Kvarfordt LG, et al. RTX-KG2: a system for building a semantically standardized knowledge graph for translational biomedicine. BMC Bioinformatics 2022; 23: 400.
  • (7) Bizon C, Cox S, Balhoff J, et al. ROBOKOP KG and KGB: Integrated Knowledge Graphs from Federated Sources. J Chem Inf Model 2019; 59: 4968–73.
  • (8) Morris JH, Soman K, Akbas RE, et al. The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information. Bioinformatics 2023; 39: btad080.
  • (9) Foksinska A, Crowder CM, Crouse AB, et al. The precision medicine process for treating rare disease using the artificial intelligence tool mediKanren. Front Artif Intell 2022; 5.
  • (10) Ma C, Zhou Z, Liu H, Koslicki D. KGML-xDTD: a knowledge graph–based machine learning framework for drug treatment prediction and mechanism description. GigaScience 2023; 12: 1–16.
