Abstract
Objective:
This study aims to investigate how machine learning (ML) contributes to drug repurposing efforts in oncology, considering the pharmaceutical industry’s mounting R&D inefficiencies and economic pressures. Through qualitative interviews with experts across artificial intelligence, oncology, and pharmaceutical development, this paper explores the real-world applications of ML in this field, the challenges to its implementation, and its future potential to streamline drug discovery.
Methods:
This study employed the “research onion” framework (Saunders et al., 2016), adopting an interpretivist philosophy and inductive approach to explore stakeholder perspectives on integrating ML into oncological drug repurposing. A multimethod strategy combined a narrative literature review with 13 semi-structured interviews, selected through purposive and snowball sampling. Data were thematically analyzed using Braun and Clarke’s six-step framework, supported by NVivo. Research trustworthiness was ensured via Lincoln and Guba’s criteria, and ethical approval was granted by Imperial College London.
Findings:
Three major thematic domains emerged: the technological, regulatory, and business landscapes. Technological challenges included poor data quality, limited accessibility to real-world datasets, and the need for robust infrastructure to support predictive modeling. Regulatory barriers centered on ethical concerns in data governance and the difficulty of securing exclusivity and market protection for repurposed drugs. From a business perspective, profitability concerns, generic competition, and fragmented data ownership underscored the need for more collaborative and economically sustainable models.
Conclusion:
ML offers potential for oncological drug repurposing, but realizing its benefits requires addressing key technological, regulatory, and economic challenges.
KEYWORDS: Drug repositioning, machine learning, neoplasms, pattern recognition
INTRODUCTION AND BACKGROUND
Global pharmaceutical spending is expected to reach $1.6 trillion by 2025, with oncology showing a 30% compound annual growth rate (CAGR) over the past decade.[1] However, the industry faces an R&D productivity crisis. Developing a new drug can take 10–15 years and cost over $2 billion,[2] with only 1 in 10–20 candidates gaining approval.[3] Pricing pressures and regulatory controls add further strain, as global demand for affordable healthcare rises. Governments and healthcare providers are pushing for lower prices,[4] while price controls in regions such as Europe and Japan impact profits and R&D reinvestment.[5] This creates a trade-off between innovation and accessibility.
Patents offer up to 20 years of exclusivity.[6] Once a patent expires, generic drugs can enter the market.[7] These generics contain the same active pharmaceutical ingredient as their branded counterparts, making them essentially identical in therapeutic effect.[7] This “genericization” severely impacts the profitability of the original company, as competitors can offer the same medication at significantly lower prices. Consequently, pharmaceutical companies fiercely guard their intellectual property (IP) and compete aggressively to generate substantial revenues during the exclusivity window before generic competition arises.
Drug repurposing, defined as finding new uses for existing drugs,[8] addresses these challenges. Around 30% of Food and Drug Administration-approved drugs have been repurposed.[9] Notable examples include aspirin’s use as an antiplatelet[10] and thalidomide for multiple myeloma.[11] Repurposing reduces development time, cost, and failure risk – cutting costs from $2 billion to $300 million and duration from 10–15 years to 6.5 years.[11]
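The scale of these savings is easier to compare as relative reductions. The short sketch below simply recomputes them from the figures quoted above; the variable and function names are illustrative and not drawn from any cited source.

```python
# Figures quoted in the text above (USD and years); names are illustrative.
DE_NOVO_COST_USD = 2_000_000_000
REPURPOSED_COST_USD = 300_000_000
DE_NOVO_YEARS = (10.0, 15.0)  # quoted range for developing a new drug
REPURPOSED_YEARS = 6.5


def relative_reduction(baseline: float, new: float) -> float:
    """Return the fractional reduction when moving from baseline to new."""
    return (baseline - new) / baseline


cost_saving = relative_reduction(DE_NOVO_COST_USD, REPURPOSED_COST_USD)
time_saving_low = relative_reduction(DE_NOVO_YEARS[0], REPURPOSED_YEARS)
time_saving_high = relative_reduction(DE_NOVO_YEARS[1], REPURPOSED_YEARS)

print(f"Cost reduction: {cost_saving:.0%}")
print(f"Time reduction: {time_saving_low:.0%} to {time_saving_high:.0%}")
```

On the quoted figures this corresponds to a cost reduction of 85% and a time reduction of roughly 35% to 57%, depending on which end of the 10–15 year range is used as the baseline.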
Two repurposing strategies exist: hard repurposing, which targets different diseases (e.g., propranolol for hemangioma[12]), and soft repurposing, which applies drugs within the same field (e.g., trastuzumab, initially for HER2-positive breast cancer but later for gastric cancer[13]). Identification methods include experimental (target- or drug-based[14]) and computational approaches such as molecular docking, signature mapping, genetic association studies, pathway mapping, and retrospective clinical analysis.[15]
Artificial intelligence (AI) and machine learning (ML) tools are now integral to this process. ML branches include supervised, unsupervised, semi-supervised, reinforcement, and deep learning.[16] These tools enable the analysis of large datasets, the prediction of molecular interactions, and the optimization of formulations, reducing time and cost.[15]
Cancer treatment complexity, due to drug resistance and tumor heterogeneity,[17] underscores the need for personalized approaches. With a one in two lifetime risk[18] and oncology spending set to exceed $260 billion by 2025,[1] repurposing offers an efficient alternative – especially as cancer drugs average 7.3 years and $648 million to develop.[19] The ReDO database lists 335 non-cancer drugs with anticancer potential,[20] and 190 ongoing late-stage oncology trials are investigating repurposed candidates.[21]
Therefore, this paper aims to answer the question, “How does machine learning contribute to drug repurposing efforts in oncology?” It examines the applications, challenges, and future scope of combining ML techniques with oncological drug repurposing and evaluates them as potential means to address the larger issues the pharmaceutical industry faces. Oncology has become a global focus, with pharmaceutical companies dedicating significant resources toward optimizing the drug repurposing process for cancer therapeutics. In addition, the versatility of ML tools holds promise for advancing oncological drug repurposing, potentially addressing obstacles faced not only by oncology but by the pharmaceutical industry as a whole. These factors form the basis of this report.
METHODOLOGY
This study followed the “research onion” framework proposed by Saunders et al.,[22] a conceptual model that outlines the layers involved in developing a sound research methodology. Each layer represents a distinct element of research design, ensuring a logical and structured progression from philosophical stance to data collection and analysis.
We adopted an interpretivist philosophy, which acknowledges the subjective nature of human experiences. This was appropriate given our aim to explore the challenges and potential of integrating ML into oncological drug repurposing from the perspectives of industry stakeholders. Interpretivism allowed us to gather in-depth insights into the realities, opinions, and lived experiences of experts working at the intersection of oncology, AI, and pharmaceutical development.
The study employed an inductive approach, enabling us to collect empirical qualitative data and build a descriptive framework to explain observed phenomena.[23] This suited our objective to uncover the factors that support or hinder the adoption of ML techniques in oncological drug repurposing, without testing a predefined hypothesis.
Aligned with the third and fourth layers of the research onion,[22] we adopted a multi-method strategy, combining findings from a narrative literature review (NLR) and semi-structured interviews (SSIs) [Appendix A]. This allowed us to integrate theoretical knowledge with real-world experiences. Quantitative methods were not used, as stakeholder perceptions and contextual challenges are best captured through qualitative inquiry. The cross-sectional time horizon of the study ensured that we captured insights relevant to the current state of ML integration in oncology.
At the core of the research onion[22] are the techniques and procedures used to collect and analyze data. Our qualitative data were collected through SSIs, which offered a balance between a consistent question framework and the flexibility to probe deeper into key topics. This method enabled us to explore complex themes, unlike focus groups, which risk “groupthink,”[24] or questionnaires, which often lack the depth required for exploratory studies.
We used purposive sampling to select participants based on their expertise and relevance to the research aims.[25] Our sampling framework [Figure 1] identified individuals across three domains: AI, oncology, and the pharmaceutical industry. Priority was given to those operating at the intersections of these fields, particularly professionals with pharmaceutical experience. Participants were recruited through multiple media outlets and professional networks, in line with our ethics approval checklist [Appendix B].
Figure 1.
A Venn diagram depicting the purposive sampling approach for interviewees
To complement purposive sampling, we also applied snowball sampling.[26] This involved asking participants to recommend others with relevant expertise. Snowballing was particularly effective given the niche nature of our topic and the limited accessibility of qualified professionals. We recognized the potential for community bias in this method, where early participants influence the direction of results.[27] To minimize this, we diversified our initial contacts by sourcing participants from various backgrounds and organizations, including small startups, large pharmaceutical companies, and academic institutions.
Sample size in qualitative research is typically small due to the depth of analysis involved. According to Hennink and Kaiser, 9–17 interviews are generally sufficient.[28] Our study reached theoretical saturation – the point at which no new themes emerged – after conducting 13 interviews with individuals, including data scientists, oncologists, and pharmaceutical industry leaders [Appendix C]. This sample size ensured a wide range of views were captured while remaining manageable for thorough thematic analysis.
SSIs were designed and conducted using McNamara’s 8-step framework[29] to ensure structure and consistency. Before each interview, we researched each participant’s background, allowing us to tailor questions and elicit detailed insights. Interview questions [Appendix D] were open-ended, enabling participants to express views freely and allowing us to follow up on areas of interest. The question sets were tailored to three broad interviewee categories based on professional background, ensuring relevance to the study aims.
All interviews were conducted virtually via Microsoft Teams to accommodate participants’ schedules and enable automatic audio transcription. Recording provided a trail of evidence and increased transparency, and it generated a transcript that we could revisit and analyze.[30] Verbal consent was obtained at the start of each session for recording and transcription. Interviews lasted approximately 45 min and began with informal conversation to build rapport. Two group members attended each session – one to conduct the interview and the other to ensure technical reliability. All interviews followed a consistent structure to support reliability and comparability of responses.
Following data collection, interview transcripts were analyzed using Braun and Clarke’s 6-step framework for thematic analysis.[31] Transcripts were cleaned to remove timestamps and repetitive speech, allowing for familiarization with the material. Three team members imported the transcripts into NVivo, a qualitative analysis tool provided by Imperial College London, to begin coding. Initial codes were generated based on relevance to study objectives, then grouped into broader categories and themes through iterative discussion and refinement among all group members.
This process resulted in the identification of three major themes and six sub-themes, which were reviewed and reorganized to ensure internal consistency (homogeneity) within themes and distinctiveness (heterogeneity) between them. A thematic map was developed to visually represent the findings and guide the discussion section of the report. The combination of NLR insights and interview data provided a comprehensive understanding of the challenges and opportunities in using ML for drug repurposing in oncology.
To ensure research trustworthiness, we followed Lincoln and Guba’s[32] four criteria:
Credibility was ensured through iterative questioning and clarification during interviews
Transferability was achieved through purposive sampling and the inclusion of diverse professional roles
Dependability was addressed by thorough documentation of the research process
Confirmability was reinforced through investigator triangulation, with two group members present for each interview.[25]
Finally, ethical approval was obtained from the Imperial College Research Ethics Committee on January 10, 2024 [Appendix B]. Participants gave informed consent for interview recording and data use, with all data handled in accordance with our ethical protocol.
Data management plan
Data Collection: This study involved qualitative data collection through SSIs with professionals in AI, oncology, and pharmaceutical development. Participants were identified via purposive sampling and contacted through professional networks and social media platforms. Interviews were conducted online
Consent and Ethical Procedures: Before participation, individuals were provided with a participant information sheet and consent form explaining the study’s aims, procedures, and data use. Written informed consent was obtained before interviews. It was made clear that participation was voluntary, and participants retained the right to withdraw at any point. Ethical approval was granted by Imperial College London Research Ethics Committee
Data Types and Format: Primary data consisted of audio recordings and their verbatim transcripts. These were saved in digital formats
Anonymity and Confidentiality: Transcripts were anonymized during transcription by removing any personally identifiable information, including names, institutions, and job titles. This ensured the confidentiality of participants in accordance with institutional ethics guidelines
Data Storage and Security: All data were stored on password-protected devices and secure cloud services accessible only to the researcher, in line with GDPR compliance and institutional data protection protocols
Data Sharing and Reuse: Due to the identifiable nature of the data and commitments to participant confidentiality, raw interview data will not be made publicly available. However, anonymized thematic findings may be shared upon reasonable request and subject to ethical approval
Data Retention and Disposal: Data will be securely stored for a period of 5 years in accordance with institutional policy and will be permanently deleted thereafter using approved secure disposal methods.
QUALITATIVE RESULTS
Technological landscape
The technological landscape for employing ML in oncological drug repurposing encompasses real-world data (RWD) analysis, Predictive Modeling, and Data Infrastructure. Data infrastructure can be further divided into Data Quality and Accessibility [Tables 1-3].
Table 1.
Thematic analysis: Technological landscape and data infrastructure in AI-driven drug repurposing. Source: Author’s qualitative interviews (2024)
| Technological Landscape | |
|---|---|
| RWD analysis and Predictive Modelling | “If we did this without AI, we would have to read 500,000 papers manually. And by the time we were done reading those, another 100,000 will have been published, probably. So, it’s just, it’s impossible to do manually”. ~ Interviewee 4 “We’re getting to a point now where you can feed an algorithm, a protein sequence, and you’ll be able to get back a 3D tertiary structure”. ~ Interviewee 7 “With the increased amount of data that we’re going to have access to as time progresses, machine learning is going to become even more useful”. ~ Interviewee 12 |
| Data Infrastructure | “We need very large databases of human samples that have accompanying electronic health records.” ~ Interviewee 2 “You need to review the actual paper. And there are two problems with that. One is you need to have access to that paper. And sometimes that’s not possible or it’s very expensive.” ~ Interviewee 6 “Whatever data you use to develop these models, needs to be up to date, essentially relevant for your study. It needs to be representative of the information out there, and it needs to not be biased. Machine learning models are notorious for that.” ~ Interviewee 7 “The biggest challenge is developing a dataset that is complete” ~ Interviewee 7 “Lot of really interesting data that that dates back, you know, 15-20-25 years ago that’s still locked away behind paywalls so that that’s an issue.” ~ Interviewee 6 |
Table 2.
Thematic analysis: Governance of data and regulatory challenges in drug repurposing. Source: Author’s qualitative interviews (2024)
| Regulatory Landscape | |
|---|---|
| Governance of Data | “The biggest problems are institutional and structural. For example, getting access to data, say, the ethics process is easily going to set you back nine months or so. Once it’s set you back you then need to persuade a trust to extract the data and hand that over. And sort of, it’s not enough to just pay for the data.” ~ Interviewee 7 “So, the key thing here is open data sets. There exist open data sets of cancer genome sequences, particularly through the Cancer Genome Atlas.” ~ Interviewee 7 “There’s going to need to be laws developed which do not currently exist, that protects the individual and protects their privacy.” ~ Interviewee 9 |
| Regulatory Challenges | “You’ve got to find a way that lets you get commercial exclusivity.” ~ Interviewee 5 “You need to evaluate the whole picture. It’s not just drug work. It’s what you have. Can you get market protections? Can you get patent protection?” ~ Interviewee 9 “However, repurposing generally needs to be driven by the company that holds the original patent, making it tough to advance without ownership.” ~ Interviewee 13 “The MHRA has supported the NHS England Drug Repurposing Programme, so that is, a cross-sectoral approach to support academics and companies that have a repurposing project.” ~ Interviewee 8 “The problem with which we have in other areas including oncology is that quite often we are looking for molecules that are off patent and then there’s very little reward.” ~ Interviewee 10 |
Table 3.
Thematic analysis: Business environment – profitability and industry collaboration in drug repurposing. Source: Author’s qualitative interviews (2024)
| Business Environment | |
|---|---|
| Profitability | “Generic drugs typically cost about 80 to 85% less than brand name drugs. They’re often widely available. So, the accessibility of these types of treatments is, obviously one of the advantages”. ~ Interviewee 1 “AI and machine learning makes it cheaper overall to do the target validation step and to be able to enter into the clinical side of things really much earlier.” ~ Interviewee 8 “If they use an old drug, it’s hard to generate any revenue from that.” ~ Interviewee 11 “The challenge is how to build a sustainable economic model to be able to produce drugs with the threat of generic substitution.” ~ Interviewee 8 |
| Industry collaboration | “Drug companies are a consolidation of former companies or acquisitions and actually pulling together. The information is incredibly hard. Either the people are still not there, or the systems don’t talk to each other.” ~ Interviewee 3 “There are huge amounts of relevant data locked away in the hard discs of pharm companies”. ~ Interviewee 6 “We need to have a complete reshaping of the culture so that it is a joint discovery effort.” ~ Interviewee 3 |
The RWD analysis and Predictive Modeling subtheme concerns screening databases to identify pertinent information for building a predictive model. This technique can significantly cut time and cost by simulating the interactions between drugs and cancer cells. Interviewees pointed out significant challenges, such as poor preclinical data quality and the difficulty of accessing large quantities of relevant data within pharmaceutical companies.
Data quality and accessibility are both important as they ensure that models are trained on accurate and relevant information, leading to more precise molecule identification for repurposing. Access to diverse datasets from various sources also enables a more reliable algorithm to be developed.
Together, these subthemes provide a comprehensive overview of the current technological landscape and its impact on advancing drug repurposing in oncology through ML.
Regulatory landscape
The regulatory landscape for utilizing ML in oncological drug repurposing contains two subthemes: Governance of Data and Regulatory Challenges.
Governance of Data is a major issue in repurposing, with privacy and ethics the main concerns. This subtheme covers the challenges of data access and consent and the methods companies have used to address privacy issues. Accessing and using patient data involves significant challenges related to obtaining informed consent and ensuring data security. Companies are adopting various strategies to address these issues, such as anonymizing patient data to maintain privacy. These measures are essential for balancing compliance with legal and privacy standards against the need to obtain health data for research.
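As a rough illustration of what such a privacy measure can look like in practice, the sketch below replaces direct identifiers with a keyed hash. This is pseudonymization, a weaker guarantee than full anonymization, and it is shown only as one possible approach; the field names, the key, and the example record are hypothetical and were not described by interviewees.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"
# Illustrative field names treated as direct identifiers (an assumption).
DIRECT_IDENTIFIERS = ("name", "patient_number")


def pseudonymize(record, key=SECRET_KEY):
    """Replace direct identifiers with a keyed hash and drop the originals."""
    source = "|".join(str(record[field]) for field in DIRECT_IDENTIFIERS)
    token = hmac.new(key, source.encode("utf-8"), hashlib.sha256).hexdigest()
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudonym"] = token[:16]  # shortened token, stable for a given key
    return cleaned


# Invented example record, not real patient data.
example_record = {
    "name": "Example Patient",
    "patient_number": "0000000000",
    "diagnosis": "type_1",
}
pseudonymized = pseudonymize(example_record)
print(sorted(pseudonymized))
```

Because the same key always yields the same pseudonym, records for one patient can still be linked across datasets, which is also why the key itself needs to be governed as carefully as the original identifiers.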
Regulatory Challenges focuses on obtaining commercial exclusivity and navigating patent protections, and on the difficulty of advancing repurposing without ownership of the original patent. Navigating the regulatory framework requires addressing complexities such as IP, a topic raised in our interviews.
Business environment
The business environment surrounding the integration of AI and ML in drug repurposing for oncology is multifaceted, as illuminated by various insights from industry experts.
Profitability emerges as a central concern, with interviewees highlighting the cost-effectiveness of AI and ML technologies. Generic drugs, renowned for their affordability and accessibility, present distinct advantages in the market, driving discussions on sustainable economic models amid the threat of generic substitution. Furthermore, industry collaboration plays a pivotal role in harnessing the vast data reservoirs held by pharmaceutical companies.
Collaborative partnerships, bridging expertise in data structuring, algorithmic coding, and domain-specific knowledge, are essential for navigating the intricate landscape of oncology drug development. However, challenges persist in consolidating disparate datasets and fostering a culture of joint discovery within the pharmaceutical industry. Amid such complexities, the business environment underscores the imperative of innovative strategies and collaborative frameworks to propel drug repurposing efforts forward in oncology.
DISCUSSION
Technological landscape
The current landscape of ML in drug repurposing is defined by two key aspects: the benefits, particularly its ability to support RWD analysis and predictive modeling, and the challenges, especially concerning data quality, quantity, and accessibility. These dimensions influence the technological landscape today and shape future developments.
Real-world data analysis
ML is increasingly pivotal in enabling the analysis of large-scale RWD. Both the literature review and SSIs affirmed ML’s ability to scan large datasets and biomedical literature, streamlining workflows and reducing costs. Senior pharmaceutical executives highlighted the value of data mining for improving resource allocation. ML’s ability to conduct literature-based text mining is increasingly important given the rapid growth in biomedical publications.
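To make the idea of literature-based text mining concrete, the following is a minimal sketch that counts how often a drug term and a disease term appear in the same sentence. It is a deliberately naive stand-in for the far richer language models used in practice; the vocabularies and abstracts are invented placeholders, and `cooccurrence_counts` is a hypothetical helper.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical vocabularies and abstracts, invented purely for illustration.
DRUGS = {"druga", "drugb"}
DISEASES = {"cancerx", "cancery"}
ABSTRACTS = [
    "DrugA reduced proliferation in CancerX cell lines. DrugB showed no effect.",
    "We observed that DrugA sensitized CancerX tumors to therapy.",
    "DrugB was evaluated in CancerY. DrugA was not tested.",
]


def cooccurrence_counts(abstracts, drugs, diseases):
    """Count sentences in which a drug term and a disease term both appear."""
    counts = Counter()
    for abstract in abstracts:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
            for drug, disease in product(drugs & tokens, diseases & tokens):
                counts[(drug, disease)] += 1
    return counts


pair_counts = cooccurrence_counts(ABSTRACTS, DRUGS, DISEASES)
for (drug, disease), n in pair_counts.most_common():
    print(drug, disease, n)
```

Frequent pairs are only candidate hypotheses: co-occurrence says nothing about whether the reported effect was beneficial, so any real pipeline would need to interpret the surrounding text and validate results experimentally.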
Another valuable data source is electronic health records (EHRs). EHRs provide longitudinal patient data, including history and treatments, and follow structured standards, making them useful for drug repurposing. These datasets reflect real patient outcomes, contrasting with the artificial conditions of clinical trials. EHR-based ML analyses can unlock novel repurposing insights, leveraging established safety profiles to expedite discovery. The NHS Long Term Plan supports this future by emphasizing digitalization and a paperless system.[33]
However, challenges remain in data access, standardization, and validation. For ML on RWD to reach its full potential, these issues must be addressed. As EHR adoption increases and quality improves, this type of evidence could transform the drug repurposing landscape.
Predictive modeling
Once data are extracted, ML is used to model them, producing predictions about drug targets and responses. Structure-based ML methods use computational chemistry to analyze three-dimensional (3D) protein structures, identifying binding sites and screening compounds through molecular docking. These approaches enhance the therapeutic value of repurposed drugs but demand high-quality data and significant computational resources.
Genetic association approaches combine genomics, transcriptomics, proteomics, and metabolomics, providing a complete picture of disease biology and enabling biomarker identification. Cell activity-based methods analyze observable traits of cells, generating mechanistic insights, especially relevant in oncology, given its phenotypic plasticity.
Interviewees noted that although these strategies are often seen in isolation, combining them significantly enhances ML’s impact. Integrating 3D structural data, omics profiles, and phenotypic data enables more precise predictions of drug efficacy and treatment responses. This integration could optimize repurposing strategies by identifying novel targets and personalizing therapies, though success depends on access to high-quality, well-structured datasets.
Future scope of real-world data and predictive modeling
The interviews revealed a clear shift toward personalized medicine. Using large datasets – genomic data, imaging, treatment history – ML can predict individual responses to treatments. This supports patient stratification and the design of tailored therapies, improving outcomes and minimizing side effects.
Interviewee 8 introduced the concept of basket studies: trials that assess drug efficacy across various cancers sharing a common biomarker.[34] These studies group patients by molecular profile, not cancer type, which aligns with repurposing goals. ML can analyze data from such studies to find drugs effective across multiple cancers with shared targets. These tumor-agnostic approaches, potentially leading to faster drug approvals, exemplify how ML can advance oncological drug repurposing beyond traditional disease categories.
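The grouping logic behind a basket design can be sketched in a few lines. The example below assigns patients to baskets by shared biomarker and then reports which tumor types each basket spans; the patient records, marker names, and helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical patient records; identifiers, tumor types and biomarkers are invented.
PATIENTS = [
    {"id": "P1", "tumor_type": "type_1", "biomarker": "marker_A"},
    {"id": "P2", "tumor_type": "type_2", "biomarker": "marker_A"},
    {"id": "P3", "tumor_type": "type_2", "biomarker": "marker_B"},
    {"id": "P4", "tumor_type": "type_3", "biomarker": "marker_A"},
]


def assign_baskets(patients):
    """Group patients by shared biomarker, ignoring tumor type."""
    baskets = defaultdict(list)
    for patient in patients:
        baskets[patient["biomarker"]].append(patient)
    return dict(baskets)


def tumor_types_per_basket(baskets):
    """Report which tumor types each biomarker basket spans."""
    return {
        marker: sorted({p["tumor_type"] for p in members})
        for marker, members in baskets.items()
    }


baskets = assign_baskets(PATIENTS)
print(tumor_types_per_basket(baskets))
```

A basket that spans several tumor types is the kind of cohort in which a model could look for responses that track the biomarker rather than the tissue of origin.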
Data infrastructure
The success of ML applications in RWD analysis and predictive modeling depends heavily on data infrastructure, particularly data quality and accessibility.
Data quality
ML models learn from training data, and poor data quality compromises their accuracy. Experts stressed that relevance, completeness, representativeness, and lack of bias are key metrics. If training data are biased or incomplete, models risk making incorrect predictions, missing promising candidates, or identifying misleading patterns.
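Two of these metrics, completeness and representativeness, lend themselves to simple checks before any model is trained. The sketch below is one possible way to compute them; the records, field names, and the 30% threshold are assumptions chosen only for illustration.

```python
from collections import Counter

# Hypothetical training records; None marks a missing value.
RECORDS = [
    {"cancer_type": "type_1", "stage": "II", "response": 1},
    {"cancer_type": "type_1", "stage": None, "response": 0},
    {"cancer_type": "type_1", "stage": "I", "response": 1},
    {"cancer_type": "type_2", "stage": "III", "response": None},
]


def completeness(records, fields):
    """Fraction of non-missing values for each field."""
    total = len(records)
    return {
        field: sum(r.get(field) is not None for r in records) / total
        for field in fields
    }


def underrepresented(records, field, min_share):
    """Return categories whose share of records falls below min_share."""
    counts = Counter(r[field] for r in records if r.get(field) is not None)
    total = sum(counts.values())
    return sorted(c for c, n in counts.items() if n / total < min_share)


print(completeness(RECORDS, ["cancer_type", "stage", "response"]))
# The 0.3 threshold is arbitrary and would need justifying for a real dataset.
print(underrepresented(RECORDS, "cancer_type", min_share=0.3))
```

Checks like these do not establish that a dataset is unbiased, but they can flag gaps and thinly represented groups early enough to influence data collection.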
“Data cleaning” is essential – refining datasets by removing errors or gaps before training. This is especially important in proteochemometrics and graphical modeling, where data flaws undermine model reliability. However, cleaning data can contribute to overfitting, where models learn the training data too well, including irrelevant noise, impairing generalizability to new data.[35]
This issue is acute in oncology, given the heterogeneity of cancers. Overfitted models may identify false-positive candidates or fail in diverse patient groups. Interviewees cited real-world examples where algorithms excluded underrepresented cancer populations, causing harm.[36] Preventing overfitting requires representative data, regularization, and cross-validation.
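Regularization and cross-validation can be shown together in a small, dependency-free sketch: a one-feature ridge regression whose penalty strength is chosen by k-fold cross-validation. The data are toy values and the helper names are hypothetical; real repurposing models would be far larger, but the selection principle is the same.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y = w * x (no intercept); lam is the L2 penalty."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def k_fold_indices(n, k):
    """Split range(n) into k disjoint folds by striding."""
    return [list(range(i, n, k)) for i in range(k)]


def cv_error(xs, ys, lam, k=3):
    """Mean squared error on held-out folds, averaged over the k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        mse = sum((ys[i] - w * xs[i]) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)


# Hypothetical toy data, invented for illustration.
XS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
YS = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

# Evaluate a few candidate penalties and keep the one with the lowest held-out error.
errors = {lam: cv_error(XS, YS, lam) for lam in (0.0, 1.0, 10.0)}
best_lam = min(errors, key=errors.get)
print(errors, best_lam)
```

The key design choice is that each penalty is judged only on data the model did not see during fitting, so a setting that merely memorizes the training folds is not rewarded. For heterogeneous cancer data, folds would also need to preserve representation of smaller patient groups, which this striding split does not attempt.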
Wider studies echo these concerns. NHS datasets suffer from documentation and curation issues, while European pharmaceutical efforts face hurdles in reusing health data. Rangineni noted that high-quality data boost ML accuracy by 25%, while poor data can reduce reliability by 30%.[36]
Limited quantity of patient-level data is also a barrier, especially in rare cancers. Legislation by the National Disease Registration Service aims to overcome this by collecting such data without explicit consent, expanding availability for research.[21]
Data accessibility
Beyond quality and quantity, accessibility poses a major hurdle. Successful ML repurposing relies on access to broad, integrated datasets on drug properties, molecular interactions, and patient outcomes.
Our SSIs revealed that companies and NHS bodies often struggle to access sufficient data. The NHS, despite hosting one of the largest patient datasets, is reluctant to share information. Interviewee 7 noted that even financial incentives were insufficient to persuade NHS Trusts to release data. Similarly, large pharmaceutical companies guard data closely – Interviewee 6 reported that firms often did not even know where certain datasets were stored. These structural barriers slow progress and lengthen development timelines.
The literature echoes this, describing limited data sharing as a widespread problem. Overcoming these issues requires not just technical improvements but structural changes in how data are governed and shared.
Regulatory landscape
Regulatory constraints are one of the most significant barriers to oncological drug repurposing. These include patient data privacy, IP protection, market exclusivity, and the need for trust-building with regulatory agencies.
Governance of data
Patient consent
UK law protects patient data under the 2018 Data Protection Act and the Common Law Duty of Confidentiality. ML development requires consent to use personal clinical data. Even with initiatives to collect data without consent, many patients remain unwilling to share due to privacy concerns or lack of understanding.
Our interviews confirmed this reluctance. Interviewee 7 noted that it is “not enough to just pay for the data.” The challenge is not only ethical and legal but practical – without consent, companies have limited access to the datasets ML models depend on.
Open datasets
To bypass privacy issues and regulatory hurdles, companies often turn to open datasets, such as the UK Biobank and The Cancer Genome Atlas. These contain rich, freely accessible data. The UK Biobank, for example, includes information on approximately 500,000 individuals.[37] These resources speed up the research process and lower costs.
However, access to some high-value datasets remains restricted or costly – Interviewee 6 noted that key data were often “locked away behind paywalls.” In this context, open datasets are a practical solution, especially for smaller firms and nonprofits. Moreover, patient willingness to share data may be improving: one study found increasing public support for using clinical data in research.[38]
Regulatory challenges
Patent protection
Patent law poses a barrier to repurposing. The original composition of matter patent grants exclusive rights to the first discoverer. New therapeutic uses require a “first medical use” patent, but this does not provide full commercial protection if the compound is not novel. This limits incentive for repurposing, as companies cannot block competitors once exclusivity ends.
Off-label prescribing
If parent companies do not pursue new indications, others may prescribe the drug off-label, without formal approval. While scientifically justified, off-label use raises safety, legal, and ethical concerns. Interviewees noted that this lack of endorsement reduces physician and public trust, decreasing uptake. Verbaanderd highlighted the gap between research and clinical implementation caused by these regulatory hesitations.[39]
Algorithmic intellectual property protection
ML algorithms used to identify repurposing candidates must also be protected. However, algorithms cannot be patented as a whole – each step must be claimed separately.[40] This complex process is slow, often taking a year or more,[41] during which the algorithm may become outdated. These delays reduce incentives for smaller companies to invest in algorithm development.
Regulatory support
Despite these challenges, recent regulatory support has emerged. NHS England’s Medicines Repurposing Programme, launched in 2021, aims to formalize repurposing and reduce reliance on off-label prescribing. Similarly, the MHRA was recognized by interviewees as crucial in increasing trust and uptake.
Streamlined approval pathways that rely on existing safety data can significantly shorten time to market, reducing costs and improving patient access – particularly in rare cancers where time is critical. Combined with ML-driven discovery models, such pathways could improve both speed and success rates.
Business environment
The financial and structural dynamics of the pharmaceutical industry play a pivotal role in the feasibility and uptake of ML for oncological drug repurposing. This section explores the key business factors, focusing on profitability and collaboration.
Profitability
Funding new drug development has become increasingly difficult, resulting in unmet clinical needs, especially in oncology, where drug prices and development costs are extremely high.[42] The literature reviewed in the NLR highlights ML’s potential to reduce these costs by accelerating candidate identification. For example, Sengupta, Singh, and Kumar demonstrated that ML models could predict anti-cancer compounds, streamlining the discovery of novel drug–gene interactions.[43]
Cost of repurposing
While repurposing theoretically saves time and resources due to existing safety and efficacy data, the practical reality is less straightforward. Our interviews revealed that commercial value outweighs scientific rationale for many large pharmaceutical companies. Several participants expressed skepticism about repurposing’s ability to consistently deliver strong financial returns. Interviewee 9 likened current repurposing efforts to “shooting randomly at the clouds, hoping one of them sticks.”
High costs persist, especially in oncology, where additional Phase 1 or 2 trials may be needed to determine new dosing regimens. Narrow therapeutic windows and toxicity risks demand further reformulation and testing. Computational methods such as molecular docking, often used alongside ML, also require extensive computing power, making them expensive to deploy at scale.
The costs associated with implementation and reformulation may offset the expected savings from bypassing early-stage trials. Furthermore, rare cancers represent small markets, offering limited return on investment. This lack of financial incentive was a recurring theme across interviews, especially for large companies that focus on profitability and shareholder value.
Despite these challenges, the drug repurposing market is growing, with a projected CAGR of 10.8% between 2018 and 2028.[44] Interviewee 5 also noted that altering a drug’s formulation or delivery route can lead to new patents, extending market exclusivity, and offering fresh commercial opportunities. Thus, while ML-based repurposing carries costs, it also opens new profit avenues, particularly in oncology.
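To make the growth figure concrete, the following minimal sketch converts the cited CAGR into an implied overall market multiple. It assumes the 10.8% rate compounds once per year over the ten years from 2018 to 2028; the function name `growth_multiple` is illustrative and no absolute market size is assumed.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by compounding `cagr` annually for `years` years."""
    return (1.0 + cagr) ** years


# Figures from the text: 10.8% CAGR between 2018 and 2028.
multiple = growth_multiple(0.108, 2028 - 2018)
print(f"Implied market growth 2018-2028: {multiple:.2f}x")  # prints 2.79x
```

Under these assumptions, the projection implies a market roughly 2.8 times its 2018 size by 2028.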
Translation into clinical practice
ML tools are widely used to screen compounds and identify candidates through structural matching. However, these models provide recommendations, not validation. The transition from promising compound to accepted therapy still requires investment in cost–benefit analysis, clinical validation, and regulatory approval – all of which are expensive.
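As a rough illustration of what screening by structural matching can look like, the sketch below ranks a small library of compounds by Tanimoto similarity to a known active compound. This is one common similarity measure, chosen here as an assumption rather than a method reported by the interviewees; the fingerprints (sets of feature indices) and the compound names are hypothetical toy values, not real data.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical fingerprints: each integer stands for a structural feature.
known_active = frozenset({1, 4, 7, 9, 12})
library = {
    "drug_A": frozenset({1, 4, 7, 9, 13}),
    "drug_B": frozenset({2, 3, 5}),
    "drug_C": frozenset({1, 4, 12}),
}

# Rank candidates from most to least structurally similar to the known active.
scores = {name: tanimoto(known_active, fp) for name, fp in library.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.2f}")
```

The ranking is only a recommendation: a high similarity score suggests a candidate worth investigating, and says nothing about clinical efficacy or safety.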
In oncology, where combination therapies are common, the cost of integrating a repurposed drug into treatment protocols increases. Additional layers of testing are often required to ensure compatibility and efficacy in combination with other agents.
Threat of generic drugs
Another financial challenge is the threat from generics. Once a drug’s patent expires, generics – identical in composition but cheaper – enter the market. This reduces the profitability of repurposing efforts, especially when original developers no longer have exclusive rights.
Interviewees expressed concern that investing in repurposing a drug nearing patent expiry is risky, as generics undercut prices and shift consumer demand. With narrow margins and increased competition, companies may struggle to sustain a viable business model around a repurposed drug.
From a patient perspective, repurposing has not significantly lowered treatment costs. Gonzalez-Fierro found little difference between the end prices of repurposed and traditionally discovered oncology drugs.[45] Reformulation and dosing adjustments may introduce additional regulatory and safety expenses, ultimately increasing end-user costs (interviewee 1). As a result, neither patients nor developers consistently benefit from lower prices.
In summary, while ML-based repurposing offers potential cost reductions, financial viability remains uncertain, especially in oncology. The business case depends on disease prevalence, trial requirements, and IP protection, all of which vary widely.
Industry collaboration
Collaboration is essential for ML-driven repurposing. Our interviews consistently highlighted the need for interdisciplinary partnerships involving data scientists, AI engineers, oncologists, and clinicians. Interviewee 8 emphasized that success depends on bringing together individuals with expertise in data structuring, coding, and oncology.
Clinicians, in particular, are critical. They understand real-world patient challenges and can drive clinical validation of repurposed drugs. Cross-disciplinary collaboration enables repurposing projects to progress from algorithmic insights to practical implementation.
Challenges with outsourcing
Many pharmaceutical companies lack internal data infrastructure and AI expertise, leading to reliance on external firms. Interviewees noted that data are often outsourced to third-party IT firms for cleaning, formatting, and analysis. This dependency stems from internal skills gaps and the technical complexity of ML tools.
However, outsourcing introduces inefficiencies. Companies can face hold-up risks, where IT partners delay deliverables or demand higher fees. Interviewees also flagged pricing opacity from AI vendors, making it difficult to assess return on investment. The evolving nature of AI further complicates long-term planning.
Despite these concerns, third-party AI investment in pharma has more than doubled annually,[46] suggesting ongoing confidence in these partnerships. Nonetheless, improving internal capability remains a strategic imperative for pharma companies wishing to reduce reliance on external firms.
Challenges in industry structure
The pharmaceutical industry is highly consolidated, with major players like Pfizer and Novartis controlling large datasets and significant R&D resources. In Q1 2024 alone, 430 M&A deals were announced, valued at $68.8 billion.[47] This consolidation enables large firms to monopolize data, creating barriers for smaller players.
Large companies often guard their proprietary datasets, limiting the diversity and quantity of data available to develop robust ML models. These constraints restrict smaller firms’ ability to compete or collaborate, despite often being more innovative and agile.
The dominance of large firms also limits cross-pollination of expertise. With limited collaboration across company boundaries, opportunities for shared learning and pooled resources are lost. This fragmentation hinders industry-wide progress in ML-driven repurposing.
Moreover, smaller companies face disproportionate challenges in navigating IP regulations, data governance, and compliance frameworks. These barriers reinforce the competitive advantages of large firms and restrict wider participation in repurposing initiatives.
To address this, more public–private partnerships, open-access data initiatives, and collaborative platforms are needed. Regulatory bodies can play a role by incentivizing transparency and supporting knowledge-sharing networks.
Limitations
Limitations of the study should be considered when interpreting the findings presented in this report. Our interview sample may be susceptible to self-selection bias, as participants who agreed to be interviewed likely possessed inherent qualities, such as elevated expertise or vested interests in the topic, that could have influenced their perspectives.[48] This raises concerns about the generalisability of conclusions derived from the interview data, which may not accurately reflect the views held by the wider population.
Our study relied solely on qualitative methods, without incorporating any quantitative techniques. This restricts our capacity to pinpoint statistically significant trends or patterns across the participant responses.[49] As a result, the quotes we selected and analyzed may reflect some degree of unintended selection bias, risking an imbalanced or one-sided depiction of the entire dataset. Identifying representative excerpts objectively becomes more challenging when qualitative approaches are not complemented by quantitative measures.
Furthermore, to maximize convenience and participation, all interviews were conducted virtually over Microsoft Teams. This virtual format introduces the limitation of potential data loss due to connectivity issues, as well as the loss of nonverbal cues and an increased risk of misinterpretation.[50]
CONCLUSION
This study aimed to understand the current landscape and future potential of using ML for drug repurposing in oncology. Through interviews with experts and a comprehensive literature review, we identified three key themes shaping this field: the technological capabilities and limitations, the regulatory landscape, and the business environment.
Our in-depth analysis revealed both the significant promise and substantial challenges involved. ML algorithms can rapidly screen thousands of existing drug compounds to identify potential candidates for repurposing against specific cancer targets. However, translating these computational findings into a clinical context requires rigorous testing and trials to ensure safety and efficacy. Economic incentives and IP concerns also hinder investment in repurposing older, generic drugs.
Based on our findings, we recommend:
- Fostering collaborative innovation between pharmaceutical companies and AI developers
- Establishing data protection and bias mitigation standards
- Governance of IP-protected and patented AI tools and databases
Overcoming these hurdles will be critical to fully utilizing ML to repurpose existing drugs and accelerate oncology therapeutic development.
While still an emerging field, ML for drug repurposing in oncology holds immense potential to transform drug discovery, reduce costs, and rapidly translate existing knowledge into new treatments for cancer patients. Looking forward, we should embrace this technology; however, the various challenges that it brings should not be ignored.
Ethics approval
This study was approved by the Imperial College Research Ethics Committee on 10th January 2024.
AUTHORS’ CONTRIBUTIONS
Adam Mann was responsible for developing the research concept, carrying out the interviews, conducting the analysis, and writing the majority of the report. Meer Shah provided key input into the literature review and helped interpret the interview findings. Srinivas Suresh supported the organization of the data and contributed to coding and identifying core themes. Jamie Wen assisted with transcription and background research. David Braudo reviewed the final draft and offered suggestions for clarity and structure. All authors read and approved the final version.
Conflicts of interest
There are no conflicts of interest.
Acknowledgments
We would like to express our deepest appreciation and gratitude to our supervisor, Professor James Barlow, whose exceptional support and guidance have been truly invaluable.
Finally, we are very grateful to all our industry experts for lending their time and providing us with the information to better our understanding of our research.
Full interview transcripts and additional supporting materials are available from the corresponding author upon reasonable request.
APPENDIXES
Appendix A
1. AI/ML Specialist + Drug Repurposing +/- Pharmaceutical Industry Knowledge
Table 1.
AI/ML in drug repurposing interview guide
| Section | Questions |
|---|---|
| Introduction | Can you please introduce yourself and provide a brief overview of your background in AI/ML and drug repurposing? |
| Understanding of Drug Repurposing | From your perspective, what is drug repurposing, and why is it considered an attractive strategy in pharmaceutical research? What are some examples of successful drug repurposing efforts you’ve been involved in or have observed? |
| Machine Learning Techniques in Drug Repurposing | How do you see the role of machine learning techniques in facilitating drug repurposing efforts? Can you provide insights into specific machine learning algorithms or approaches that have been effective? What data do you look at/extract to train the ML models? How do you ensure the quality and reliability of the data used in ML models for drug repurposing? |
| Challenges and Solutions | What are some of the key challenges encountered when applying AI/ML to drug repurposing, and how do you address them? How do you navigate issues related to data availability, quality, and bias? |
| Collaborative Efforts | To what extent do you collaborate with pharmaceutical experts or researchers in the field of oncology to advance drug repurposing efforts? Can you share examples of successful interdisciplinary collaborations or partnerships in this area? |
| Evaluation and Validation | How do you validate the effectiveness and reliability of machine learning models used in drug repurposing? What criteria do you use to evaluate the potential of repurposed drugs for oncology applications? |
| Future Directions | What do you envision as the future of drug repurposing with the continued integration of AI/ML technologies? Are there any emerging trends or advancements in AI/ML that you believe will significantly impact drug repurposing efforts in the near future? |
| Personal Insights and Recommendations | Based on your experience and expertise, what advice or recommendations would you offer to researchers or practitioners interested in leveraging AI/ML for drug repurposing in oncology? |
| Conclusion | Is there any additional information or insights you would like to share regarding the intersection of AI/ML and drug repurposing, particularly in the context of oncology? |
AI/ML: Artificial intelligence/machine learning
2. Drug Repurposing in Oncology
Table 2.
Oncology drug repurposing interview guide
| Section | Questions |
|---|---|
| Introduction | Can you describe your background and experience in drug repurposing specifically related to oncology? |
| Understanding of Drug Repurposing in Oncology | How do you define drug repurposing in the context of oncology, and why is it significant? Can you share examples of successful drug repurposing efforts specifically targeting oncology indications? |
| Unique Challenges in Oncology Drug Repurposing | What are some of the unique challenges encountered when repurposing drugs for oncology indications? How do these challenges differ from those encountered in drug repurposing for other therapeutic areas? |
| Target Identification and Validation | How do you identify potential targets or pathways for drug repurposing in oncology? What methods or approaches do you use to validate the relevance of these targets in the context of cancer treatment? |
| Collaborative Efforts | To what extent do you collaborate with AI/ML specialists or researchers in oncology to explore drug repurposing opportunities? Can you share examples of successful interdisciplinary collaborations or partnerships in this area? |
| Clinical Trial Design and Implementation | What considerations are important when designing clinical trials for repurposed drugs in oncology? How do you navigate issues related to patient recruitment, trial endpoints, and regulatory requirements? |
| Evaluation of Efficacy and Safety | How do you evaluate the efficacy and safety of repurposed drugs in oncology? What endpoints or biomarkers do you use to assess the therapeutic potential of repurposed drugs in cancer patients? |
| Regulatory and Ethical Considerations | What regulatory challenges or considerations arise when repurposing drugs for oncology indications? How do you address ethical concerns related to patient safety, informed consent, and access to experimental therapies? |
| Future Directions | What do you see as the future of drug repurposing in oncology, and how do you anticipate it evolving in the coming years? Are there any emerging trends or advancements in oncology drug repurposing that you believe will have a significant impact? |
| Personal Insights and Recommendations | Based on your experience and expertise, what advice or recommendations would you offer to researchers or practitioners interested in exploring drug repurposing opportunities specifically for oncology indications? |
| Conclusion | Is there any additional information or insights you would like to share regarding drug repurposing in oncology? |
AI/ML: Artificial intelligence/machine learning
3. People with Knowledge of Drug Repurposing (General)
Table 3.
General drug repurposing interview guide
| Section | Questions |
|---|---|
| Understanding of Drug Repurposing | How do you define drug repurposing, and what makes it an important strategy in pharmaceutical research in your opinion? Can you share examples of successful drug repurposing efforts you’ve observed? |
| Traditional Approaches vs. Drug Repurposing | What are the key differences or some potential advantages between traditional drug discovery methods and drug repurposing approaches? |
| Challenges and Solutions | What are some of the major challenges encountered in drug repurposing, and how do you overcome them? Do you see drug repurposing being more sustainable in the long term? |
| Collaborative Efforts | To what extent do you collaborate with AI/ML specialists or researchers in the field of oncology to explore drug repurposing opportunities? |
| Evaluation | How do you evaluate how effective repurposed drugs are? |
| Regulatory and Ethical Considerations | What regulatory/ethical challenges or considerations arise when repurposing drugs for new indications? How big is the issue of patents with regard to repurposing? Why? |
| Future Directions | How do you see drug repurposing evolving in the coming years? Are there any emerging trends in drug repurposing that you believe will have a significant impact on the field? |
| Personal Insights and Recommendations | Based on your experience and expertise, what advice or recommendations would you offer to researchers or practitioners interested in exploring drug repurposing opportunities? How will repurposing impact the whole drug discovery pathway/clinical trials? |
AI/ML: Artificial intelligence/machine learning
Appendix B
Appendix C
| Participant | Role | Associated Institution |
|---|---|---|
| Participant 1 | CEO | Non-profit healthtech startup that fast-tracks affordable oncology treatments |
| Participant 2 | Chief Computational Biologist | Drug repurposing company with 25+ years of experience |
| Participant 3 | CEO | Biotech company that utilises AI to discover new medicines |
| Participant 4 | Professor of Process Systems Engineering | Imperial College |
| Participant 5 | Ex-CEO | Extremely large British Life Science Investment Fund |
| Participant 6 | Program Director, Drug Repurposing | Large not-for-profit Cancer Fund |
| Participant 7 | ML specialist and NHS oncologist | Oxford University |
| Participant 8 | Director | The Association of British Pharmaceutical Industry |
| Participant 9 | Chief Development Officer | Private biotechnology & pharmaceutical company |
| Participant 10 | Senior Director of Clinical Pharmacology | Large Global Pharmaceutical Firm |
| Participant 10 | Professor in Clinical Pharmacology & Therapeutics | UCL |
| Participant 11 | Health Economist | Imperial College |
| Participant 12 | Student undertaking PhD on AI for Drug Discovery Programme | Imperial College |
| Participant 13 | Head of European R&D | Large Global Pharmaceutical Firm |
Funding Statement
Nil.
REFERENCES
- 1.IQVIA (2021) Global Medicine Spending and Usage Trends: Outlook to 2025. 2021. Available from: https://www.iqvia.com/insights/the-iqvia-institute/reports-and-publications/reports/global-medicine-spending-and-usage-trends-outlook-to-2025 . [Last accessed on 2024 May 19]
- 2.Nawrat A. Drug Repurposing: The industry’s all-Rounder Medicines 2019. Pharmaceutical Technology. 2019. Available from: https://www.pharmaceutical-technology.com/features/drug-repurposing-all-rounder/ . [Last accessed on 2024 May 18]
- 3.Yamaguchi S, Kaneko M, Narukawa M. Approval success rates of drug candidates based on target, action, modality, application, and their combinations. Clin Transl Sci. 2021;14:1113–22. doi: 10.1111/cts.12980. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Kakkar AK. Pharmaceutical price regulation and its impact on drug innovation: Mitigating the trade-offs. Expert Opin Ther Pat. 2021;31:189–92. doi: 10.1080/13543776.2021.1876029. [DOI] [PubMed] [Google Scholar]
- 5.PwC Pricing Pressures and Shrinking Margins. PwC. Available from: https://www.pwc.com/il/en/pharmaceuticals/pricing-pressures-shrinking-margins.html . [Last accessed 2024 May 18]
- 6.European Federation of Pharmaceutical Industries and Assocations Intellectual Property. Available from: https://www.efpia.eu/about-medicines/development-of-medicines/intellectual-property/ . [Last accessed 2024 May 18]
- 7.NHS (2023) Medicines Information. 2023. Available from: https://www.nhs.uk/conditions/medicines-information/ [Last accessed on 2024 May 18]
- 8.Jourdan JP, Bureau R, Rochais C, Dallemagne P. Drug repositioning: A brief overview. J Pharm Pharmacol. 2020;72:1145–51. doi: 10.1111/jphp.13273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Latif K, Ullah A, Shkodina AD, Boiko DI, Rafique Z, Alghamdi BS, et al. Drug reprofiling history and potential therapies against Parkinson’s disease. Front Pharmacol. 2022;13:1028356. doi: 10.3389/fphar.2022.1028356. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Europe D, Gunn S. Overview of Drug Repositioning. 2020. Available from: https://d4-pharma.com/overview-of-drug-repositioning/ . [Last accessed on 2024 May 18]
- 11.Latif T, Chauhan N, Khan R, Moran A, Usmani SZ. Thalidomide and its analogues in the treatment of multiple myeloma. Exp Hematol Oncol. 2012;1:27. doi: 10.1186/2162-3619-1-27. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Srinivasan AV. Propranolol: A 50-year historical perspective. Ann Indian Acad Neurol. 2019;22:21–6. doi: 10.4103/aian.AIAN_201_18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Phillips C. Changes to Treatment for HER2-Positive Gastric Cancer 2023. National Cancer Institute. 2023. Available from: https://www.cancer.gov/news-events/cancer-currents-blog/2023/fda-pembrolizumab-stomach-esophageal-her2-pdl . [Last accessed on 2024 May 18]
- 14.Weth FR, Hoggarth GB, Weth AF, Paterson E, White MPJ, Tan ST, et al. Unlocking hidden potential: Advancements, approaches, and obstacles in repurposing drugs for cancer therapy. Br J Cancer. 2024;130:703–15. doi: 10.1038/s41416-023-02502-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Pushpakom S. Drug Repurposing. London, United Kingdom: The Royal Society of Chemistry; 2022. Introduction and historical overview of drug repurposing opportunities; pp. 1–13. [Google Scholar]
- 16.Wakefield K. A Guide to the Types of Machine Learning Algorithms. 2023. Available from: https://www.sas.com/en_gb/insights/articles/analytics/machine-learning-algorithms.html . [Last accessed on 2024 May 19]
- 17.Dagogo-Jack I, Shaw AT. Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol. 2018;15:81–94. doi: 10.1038/nrclinonc.2017.166. [DOI] [PubMed] [Google Scholar]
- 18.Shaping the National Cancer Plan (no date) GOV.UK. Available from: https://www.gov.uk/government/calls-for-evidence/shaping-the-national-cancer-plan/shaping-the-national-cancer-plan . [Last accessed 2025 Aug 02]
- 19.Prasad V, Mailankody S. Research and Development Spending to Bring a Single Cancer Drug to Market and Revenues After Approval. JAMA Intern Med. 2017;177:1569–75. doi: 10.1001/jamainternmed.2017.3601.
- 20.Pantziarka P, Verbaanderd C, Sukhatme V, Rica Capistrano I, Crispino S, Gyawali B, et al. ReDO_DB: The repurposing drugs in oncology database. Ecancermedicalscience. 2018;12:886. doi: 10.3332/ecancer.2018.886.
- 21.Shah RR, Stonier PD. Repurposing old drugs in oncology: Opportunities with clinical and regulatory challenges ahead. J Clin Pharm Ther. 2019;44:6–22. doi: 10.1111/jcpt.12759.
- 22.Saunders M, Lewis P, Thornhill A. Harlow, England: Pearson Education Limited; 2016. Research Methods for Business Students.
- 23.Ellul DB. Exploring the Depths of Research Design: Revealing the Layers of the Research Onion. 2023. Available from: https://www.linkedin.com/pulse/exploring-depths-research-design-revealing-layers-onion-borg-ellul . [Last accessed on 2025 May 25]
- 24.MacDougall C, Baum F. The devil’s advocate: A strategy to avoid groupthink and stimulate discussion in focus groups. Qualitative Health Research. 1997;7:532–41.
- 25.Forero R, Nahidi S, De Costa J, Mohsin M, Fitzgerald G, Gibson N, et al. Application of four-dimension criteria to assess rigour of qualitative research in emergency medicine. BMC Health Serv Res. 2018;18:120. doi: 10.1186/s12913-018-2915-2.
- 26.Wohlin C. Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering. New York, NY, USA: ACM; 2014. Guidelines for Snowballing in Systematic Literature Studies and a Replication in Software Engineering.
- 27.Raina S. Establishing association. Indian J Med Res. 2015;141:127. doi: 10.4103/0971-5916.154519.
- 28.Hennink M, Kaiser BN. Sample sizes for saturation in qualitative research: A systematic review of empirical tests. Soc Sci Med. 2022;292:114523. doi: 10.1016/j.socscimed.2021.114523.
- 29.McNamara C. General Guidelines for Conducting Interviews. Available from: https://napequity.org/wp-content/uploads/10j-General-Guidelines-for-Conducting-Interviews.pdf . [Last accessed on 2024 May 26]
- 30.Wellard S, McKenna L. Turning tapes into text: Issues surrounding the transcription of interviews. Contemp Nurse. 2001;11:180–6. doi: 10.5172/conu.11.2-3.180.
- 31.Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3:77–101.
- 32.Lincoln YS, Guba EG, Pilotta JJ. Naturalistic inquiry. Int J Intercult Relat. 1985;9:438–9.
- 33.Asthana S, Jones R, Sheaff R. Why Does the NHS Struggle to Adopt eHealth Innovations? A Review of Macro, Meso and micro factors – BMC Health Services Research, BioMed Central. 2019. Available from: https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-019-4790-x . [Last accessed on 2025 Aug 03]
- 34.Cummings J, Montes A, Kamboj S, Cacho JF. The role of basket trials in drug development for neurodegenerative disorders. Alzheimers Res Ther. 2022;14:73. doi: 10.1186/s13195-022-01015-6.
- 35.Ying X. An overview of overfitting and its solutions. J Phys Conf Ser. 2019;1168:022022.
- 36.Rangineni S. An analysis of data quality requirements for machine learning development pipelines frameworks. Int J Comput Trends Technol. 2023;71:16–27.
- 37.Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–9. doi: 10.1038/s41586-018-0579-z.
- 38.Richter G, Borzikowsky C, Lieb W, Schreiber S, Krawczak M, Buyx A. Patient views on research use of clinical data without consent: Legal, but also acceptable? Eur J Hum Genet. 2019;27:841–7. doi: 10.1038/s41431-019-0340-6.
- 39.Verbaanderd C, Rooman I, Meheus L, Huys I. On-label or off-label? Overcoming regulatory and financial barriers to bring repurposed medicines to cancer patients. Front Pharmacol. 2019;10:1664. doi: 10.3389/fphar.2019.01664.
- 40.Rapacke A. Are Machine Learning Algorithms patentable? The Rapacke Law Group. 2020. Available from: https://arapackelaw.com/patents/softwaremobile-apps/are-machine-learning-algorithms-patentable/ . [Last accessed on 2024 May 27]
- 41.Brückner C. How Long Does It Take to Get a Patent, and How Can You Speed Up the Process? 2023. Available from: https://www.dennemeyer.com/ip-blog/news/getting-a-patent-how-long-does-it-take-and-can-ai-speed-things-up/ . [Last accessed on 2024 May 27]
- 42.Godman B, Bucsics A, Vella Bonanno P, Oortwijn W, Rothe CC, Ferrario A, et al. Barriers for access to new medicines: Searching for the balance between rising costs and limited budgets. Front Public Health. 2018;6:328. doi: 10.3389/fpubh.2018.00328.
- 43.Workman P, Draetta GF, Schellens JH, Bernards R. How much longer will we put up with $100,000 cancer drugs? Cell. 2017;168:579–83. doi: 10.1016/j.cell.2017.01.034.
- 44.Arcas A. Drug Repurposing, Real World Data and AI/ML: Perspectives and Opportunities. Clarivate. 2023. Available from: https://clarivate.com/blog/drug-repurposing-real-world-data-and-ai-ml-perspectives-and-opportunities/ . [Last accessed on 2024 May 27]
- 45.Gonzalez-Fierro A, Romo-Pérez A, Chávez-Blanco A, Dominguez-Gomez G, Duenas-Gonzalez A. Does therapeutic repurposing in cancer meet the expectations of having drugs at a lower price? Clin Drug Investig. 2023;43:227–39. doi: 10.1007/s40261-023-01251-0.
- 46.Ayers M, Jayatunga M, Goldader J, Meier C. Adopting AI in Drug Discovery. BCG Global. 2022. Available from: https://www.bcg.com/publications/2022/adopting-ai-in-pharmaceutical-discovery . [Last accessed on 2024 May 27]
- 47.Pharmaceutical Technology (2023) How did M&A Perform in Pharmaceutical in Q1 2024? Pharmaceutical Technology. 2023. Available from: https://www.pharmaceutical-technology.com/deals-dashboards/global-ma-activity-pharmaceutical-industry/ . [Last accessed on 2024 May 27]
- 48.McSweeney B. Fooling ourselves and others: Confirmation bias and the trustworthiness of qualitative research – Part 1 (the threats). J Organ Chang Manage. 2021;34:1063–75.
- 49.Reio TG Jr. The threat of common method variance bias to theory building. Hum Resour Dev Rev. 2010;9:405–11.
- 50.Kakilla C. Strengths and weaknesses of semi-structured interviews in qualitative research: A critical essay. Preprints. 2021. doi: 10.20944/preprints202106.0491.v1.
Associated Data
Data Availability Statement
Beyond quality and quantity, accessibility poses a major hurdle. Successful ML repurposing relies on access to broad, integrated datasets on drug properties, molecular interactions, and patient outcomes.
Our SSIs revealed that companies and NHS bodies often struggle to access sufficient data. The NHS, despite hosting one of the largest patient datasets, is reluctant to share information. Interviewee 7 noted that even financial incentives were insufficient to persuade NHS Trusts to release data. Similarly, large pharmaceutical companies guard their data closely; Interviewee 6 reported that firms often did not even know where certain datasets were stored. These structural barriers slow progress and lengthen development timelines.
The literature echoes this, describing limited data sharing as a widespread problem. Overcoming these issues requires not just technical improvements but also structural changes in how data are governed and shared.
