Inquiry: A Journal of Medical Care Organization, Provision and Financing. 2025 Jun 13;62:00469580251335805. doi: 10.1177/00469580251335805

Applying Machine Learning Techniques to Predict Drug-Related Side Effect: A Policy Brief

Esmaeel Toni, Haleh Ayatollahi
PMCID: PMC12166244  PMID: 40514209

Abstract

Drug safety is a critical aspect of public health, yet traditional detection methods may miss rare or long-term side effects. Recently, machine learning (ML) techniques have shown promise in predicting drug-related side effects earlier in the development pipeline. The objective of this policy brief was to propose evidence-based policy options for using ML techniques to predict drug-related side effects. The brief builds on a previously published scoping review of relevant studies; a secondary analysis of that review synthesized the key barriers and opportunities relevant to policy development. Key findings revealed challenges in data standardization, interpretability, and regulatory alignment. The results also highlighted the potential of explainable ML and cross-sector collaboration to improve prediction accuracy and fairness. Five policy recommendations were proposed: (1) establishing standardized data collection and secure data-sharing protocols; (2) funding ML model development and rigorous validation; (3) integrating ML into drug development pipelines; (4) increasing public awareness through targeted education; and (5) implementing fairness regulations to address bias. Putting these recommendations into practice requires joint efforts from governments, regulatory bodies, pharmaceutical firms, and academia. While ML offers transformative potential for drug safety, its real-world implementation faces ethical, regulatory, and technical hurdles. Policies must ensure model transparency, promote equity, and support infrastructure for ML adoption. Through interdisciplinary coordination and evidence-based policymaking, stakeholders can responsibly advance ML use in drug development to enhance patient outcomes.

Keywords: health policy, machine learning, drug-related side effects and adverse reactions, predictive models


Highlights.

● Drug safety is a critical aspect of public health.

● Machine learning (ML) techniques are regarded as a promising way to analyze vast datasets and uncover hidden patterns linked to drug-related side effects.

● However, integrating ML into drug safety prediction is challenging.

● A set of strategies is required to address these challenges, including standard data collection, rigorous model validation, seamless integration into regulatory pipelines, public awareness initiatives, and fairness-driven approaches to minimize bias.

Background

Ensuring drug safety is vital for public health, yet unexpected drug-related side effects remain a significant challenge in the pharmaceutical industry. Traditional methods, such as preclinical studies, clinical trials, and post-market surveillance, often fail to detect rare and long-term side effects. 1 Spontaneous reporting systems and observational studies can introduce biases and inconsistencies, further limiting predictive accuracy. 2 Machine learning (ML) techniques are regarded as a promising way to analyze vast datasets and uncover hidden patterns linked to drug-related side effects, enabling earlier and more precise risk identification. 3 ML’s versatility is evident in predicting the side effects of common medications, such as cardiovascular or liver complications, 4 as well as in mental health research. 5,6

However, integrating ML into drug safety prediction is not without hurdles. Data fragmentation, inconsistent reporting practices, and the absence of standard protocols across regulatory agencies impede model development and validation. 7 Additionally, the complexity of many ML algorithms raises concerns about interpretability, trust, and regulatory acceptance. 8 Recent studies have addressed these challenges by developing explainable AI techniques, refining feature selection, and incorporating diverse data sources to enhance model robustness and generalizability. 9 ML has been used to enhance clinical decision-making by identifying individuals at higher risk of specific drug side effects based on personal health factors. 10

Another critical challenge lies in the lack of policy frameworks equipped to handle ML-driven drug safety assessments. Current guidelines mainly focus on traditional statistical methods, which are less suited for evaluating the probabilistic nature of ML-based predictions. Issues such as data privacy, bias mitigation, and regulatory acceptance of algorithmic decision-making remain unresolved. 11 To fully unlock ML’s potential, policymakers must create frameworks that promote transparency, accountability, and equity while ensuring compliance with ethical and legal standards. 12

This policy brief advocates for a set of strategies to address these challenges, including standard data collection, rigorous model validation, seamless integration into regulatory pipelines, public awareness initiatives, and fairness-driven approaches to minimize bias. Implementing these measures will empower the pharmaceutical industry to harness ML’s full potential, enhancing patient safety, streamlining drug development, and reducing the societal burden of drug-related side effects.

Analysis

This policy brief was based on a scoping review 13 conducted using Arksey and O’Malley’s framework and the PRISMA-ScR guidelines to ensure methodological rigor. Seven major databases (Web of Science, PubMed, Ovid, Scopus, ProQuest, IEEE Xplore, and the Cochrane Library) were searched for studies published between January 1, 2013, and December 31, 2023, that examined ML techniques for predicting drug-related side effects. The search strategy focused on 3 main concepts, “drug-related side effects,” “machine learning,” and “prediction,” and combined MeSH terms, synonyms, and Boolean operators. Reference lists of the selected studies were also screened for additional relevant articles.

Inclusion criteria targeted original English-language research that applied ML techniques to predict side effects using chemical, biological, or phenotypical features. Non-English articles, reviews, letters, and studies unrelated to the use of ML were excluded. Two reviewers independently screened the records, with a third resolving any disagreements to minimize bias.

Data extraction followed a standard protocol, capturing study characteristics, ML techniques, data sources, and evaluation metrics such as accuracy, precision, recall, F1 score, and area under the curve. Due to heterogeneity in the study designs and outcomes, a meta-analysis was not feasible. Instead, findings were narratively synthesized to identify common patterns and themes that informed the policy recommendations in this brief.
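The evaluation metrics named above can be made concrete with a short calculation. The following is a minimal sketch in Python, assuming binary labels (1 = side effect observed, 0 = not observed) and a hypothetical decision threshold of 0.5; the function names and example data are illustrative and are not taken from the reviewed studies.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = side effect present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 score; undefined ratios are reported as 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and model scores for eight drug-patient pairs.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
    print(classification_metrics(y_true, y_pred))
    print("auc:", roc_auc(y_true, scores))
```

Accuracy, precision, recall, and F1 score depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason studies often report both.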

Policy Implications

Several policy options can be considered for using ML techniques to predict drug-related side effects. These options are summarized in Table 1.

Table 1.

Summary of policy options.

| Criteria | Standard data collection and sharing | Fostering model development and validation | Integrating ML techniques into development pipeline | Enhancing public awareness | Addressing bias and fairness |
| --- | --- | --- | --- | --- | --- |
| Main advantages | Fosters robust ML models, accelerates progress 14 | Accelerates development of reliable models, incentivizes adoption 15 | Early identification of side effects, faster development, safer drugs 16 | Builds trust in new drugs, encourages research funding 17 | Promotes equitable access to safe drugs 18 |
| Main disadvantages | Requires trust-building, addressing privacy concerns 14 | Risk of prioritizing speed over accuracy 19 | Regulatory approval needed, potential over-reliance on models 20 | Risk of overpromising benefits, needs clear communication 17 | Requires complex regulations and best practices 18 |
| Cost and feasibility of implementation | High upfront cost, requires industry & government collaboration | Requires funding & collaboration, regulatory hurdles | Training regulators & integrating models requires cost & effort | Cost of developing educational materials | Requires industry & government collaboration |
| Equity considerations | Ensures all populations included; data access challenges in developing countries | Funding initiatives consider diverse responses | Ensures diverse clinical testing is not replaced | Accessible campaigns for diverse audiences | Ensures diverse datasets and fair application |
| Stakeholders’ responsibilities | Government sets guidelines, industry adheres, research contributes data/expertise | Government funds & sets frameworks, research develops/validates models, industry implements | Regulatory agencies set guidelines, industry implements, research develops models | Government agencies, science institutions, patient advocacy groups develop/disseminate materials | Government sets regulations, industry implements fair practices, research develops unbiased models |

Option 1: Standard Data Collection and Sharing

Establishing standard protocols for data collection and sharing across the pharmaceutical industry and research institutions is crucial. This includes defining data formats, creating secure repositories, and developing access protocols. Standardized data foster the development of robust and generalizable ML models, while collaboration through data sharing accelerates progress in drug safety prediction. Building trust and addressing privacy concerns will be essential for successful implementation. Government agencies should play a central role in developing data standards, enforcing compliance, and facilitating secure data exchange between stakeholders. Data-sharing agreements should outline clear guidelines on data access, use, and ownership, ensuring transparency and protecting patient privacy.
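To illustrate what "defining data formats" could mean in practice, the sketch below validates incoming side-effect reports against a shared schema before they enter a repository. It is a minimal example under stated assumptions: the record type `SideEffectReport`, its field names, the allowed values, and the age range are all hypothetical and do not correspond to any existing reporting standard.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical schema: field names and allowed values are assumptions for illustration.
REQUIRED_FIELDS = ("drug_id", "side_effect_term", "age_years", "sex", "report_date")
ALLOWED_SEX = {"female", "male", "unknown"}


@dataclass(frozen=True)
class SideEffectReport:
    drug_id: str
    side_effect_term: str
    age_years: int
    sex: str
    report_date: date


def parse_report(raw: dict) -> SideEffectReport:
    """Validate one raw report and normalize it to the shared format.

    Raises ValueError if a field is missing or a value is out of range.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")

    age = int(raw["age_years"])
    if not 0 <= age <= 120:  # assumed plausible range
        raise ValueError(f"age_years out of range: {age}")

    sex = str(raw["sex"]).strip().lower()
    if sex not in ALLOWED_SEX:
        raise ValueError(f"unknown value for sex: {raw['sex']!r}")

    return SideEffectReport(
        drug_id=str(raw["drug_id"]).strip(),
        side_effect_term=str(raw["side_effect_term"]).strip().lower(),
        age_years=age,
        sex=sex,
        # ISO 8601 calendar dates (YYYY-MM-DD) keep reports comparable across sites.
        report_date=date.fromisoformat(str(raw["report_date"])),
    )


if __name__ == "__main__":
    example = {
        "drug_id": " D0001 ",
        "side_effect_term": "Nausea",
        "age_years": "54",
        "sex": "Female",
        "report_date": "2023-05-01",
    }
    print(parse_report(example))
```

Rejecting malformed records at the point of submission, rather than cleaning them later, is one way a shared standard can reduce the inconsistencies that hamper model development.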

Option 2: Fostering Model Development and Validation

Financial support and technical assistance through grants and programs can accelerate the development of advanced and reliable ML models. Additionally, regulatory incentives can encourage pharmaceutical companies to adopt these models in their drug development processes. However, prioritizing speed over model accuracy and generalizability is a potential pitfall. Ensuring robust validation is necessary for developing reliable models for real-world application. A collaborative effort among government agencies, research institutions, and pharmaceutical companies is key to achieving this. Governments can provide targeted funding for ML research, while academic institutions lead model development and validation efforts. Joint initiatives between pharmaceutical companies and regulatory bodies will ensure that ML models align with regulatory standards, facilitating smoother integration into existing workflows.

Option 3: Integrating ML Techniques into the Development Pipeline

Promoting the integration of ML techniques into various stages of drug development, from drug design to clinical trials, holds significant promise. Early identification of potential side effects can save time and resources and lead to safer and more effective drugs. However, regulatory bodies may require robust validation data before accepting ML techniques for drug development decisions. In addition, over-reliance on ML techniques could cause other important safety considerations to be overlooked. Striking a balance between leveraging the power of ML and ensuring comprehensive safety assessments is vital. Governments can facilitate pilot programs to test ML integration in collaboration with pharmaceutical companies and academic partners. Regulatory agencies should provide clear guidelines for ML adoption at each stage of drug development, ensuring alignment with safety protocols.

Option 4: Enhancing Public Awareness

Implementing public education campaigns is crucial to raise awareness about the use of ML in drug development, particularly its potential to improve drug safety and efficacy. An informed public is more likely to trust and accept new drugs developed using advanced technologies. However, it’s vital to communicate the limitations and potential biases of these models to avoid overpromising benefits. Developing effective communication strategies to reach diverse audiences is essential for building public trust in this evolving field. Differentiated outreach programs should target distinct groups: patients, medical professionals, and drug developers. For instance, patient-facing initiatives could include informational videos and social media campaigns, while professionals may benefit from webinars, workshops, and continuous medical education programs.

Option 5: Addressing Bias and Fairness

ML techniques can perpetuate existing biases in healthcare if not carefully designed and monitored. Developing regulations and best practices to address potential biases in data collection, model development, and model application is critical. This may involve requiring diverse datasets, establishing fairness metrics, and mandating human oversight in critical decisions. Ensuring fairness in these models is crucial for promoting equitable access to safe and effective medications for all populations. Government agencies should establish fairness benchmarks and enforce regular audits, while industry partners integrate these standards into model development. Academic institutions can play a role in designing algorithms that detect and mitigate bias.
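As an illustration of what "establishing fairness metrics" might involve, the sketch below computes two commonly used group-level measures: the gap in positive prediction rates between groups (often called the demographic parity difference) and the gap in true positive rates (the equal opportunity difference). This is a minimal example, not a recommended audit procedure; the group labels and data are hypothetical, and which metric and what gap size are acceptable would need to be set by the relevant regulators.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def group_rates(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]
) -> dict:
    """Per-group positive prediction rate and true positive rate (TPR).

    Labels and predictions are 0/1; TPR is NaN for a group with no actual positives.
    """
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "actual_pos": 0, "true_pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["pred_pos"] += p
        s["actual_pos"] += t
        s["true_pos"] += 1 if (t == 1 and p == 1) else 0

    rates = {}
    for g, s in stats.items():
        rates[g] = {
            "positive_rate": s["pred_pos"] / s["n"],
            "tpr": s["true_pos"] / s["actual_pos"] if s["actual_pos"] else float("nan"),
        }
    return rates


def max_gap(rates: dict, key: str) -> float:
    """Largest difference in the chosen rate between any two groups (NaN values skipped)."""
    values = [r[key] for r in rates.values() if r[key] == r[key]]  # NaN != NaN
    return max(values) - min(values) if values else 0.0


if __name__ == "__main__":
    # Hypothetical predictions for two patient groups, "A" and "B".
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
    rates = group_rates(y_true, y_pred, groups)
    print(rates)
    print("demographic parity difference:", max_gap(rates, "positive_rate"))
    print("equal opportunity difference:", max_gap(rates, "tpr"))
```

In this toy data the model flags side-effect risk far more often in group A and misses half of the true cases in group B, the kind of disparity a regular audit would be expected to surface for human review.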

Discussion

This policy brief highlighted the significant potential of ML techniques in predicting drug-related side effects and proposed a multifaceted approach built on standard data practices, robust model development, integration into drug development pipelines, public education, and fairness-focused policies. These strategies are essential not only for maximizing ML’s technical strengths, but also for ensuring ethical and practical applicability in real-world healthcare settings.

While ML techniques offer advanced capabilities in detecting complex patterns within large datasets, they also present limitations that warrant critical consideration. High-performance techniques such as deep neural networks often lack interpretability, posing challenges for clinical acceptance and regulatory oversight. In contrast, more interpretable models may not capture intricate relationships in the data. 21 The reliability of these models is also highly dependent on data quality, diversity, and completeness. Addressing these limitations requires regulatory guidance and the adoption of explainable AI techniques to enhance model transparency and trustworthiness. 9

The practical implementation of ML in drug development pipelines faces logistical and infrastructural barriers. Regulatory misalignment, a lack of technical capacity, and data silos impede model integration. 22 Establishing interoperable data platforms and regulatory sandbox environments can facilitate pilot testing and inform broader adoption strategies. In addition, ethical concerns must be central to policy design. 23 ML-based systems can inadvertently reinforce health disparities if trained on biased datasets. Ensuring fairness involves mandating diverse data representation, conducting equity audits, and embedding ethical oversight throughout the model lifecycle. 24

Future research should explore strategies to improve model generalizability, integrate causal inference, and employ privacy-preserving methods such as federated learning. Cross-disciplinary collaboration will be key to aligning technical innovation with public health priorities. Through evidence-based policies and continued research investment, ML can be responsibly used to reduce drug-related harm and advance equitable healthcare outcomes.
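For readers unfamiliar with federated learning, the sketch below shows the basic idea using federated averaging, one common scheme (the brief does not prescribe a particular algorithm). Each site trains on its own records and shares only model weights, which a coordinator averages in proportion to site size. The logistic regression model, learning rate, number of epochs, and toy data are all assumptions chosen for illustration.

```python
import math
from typing import Sequence

Vector = list[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def local_update(
    weights: Sequence[float],
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.1,
    epochs: int = 5,
) -> Vector:
    """Run a few gradient-descent epochs of logistic regression on one site's data.

    The raw records stay at the site; only the updated weights are returned.
    """
    w = list(weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def federated_average(site_weights: Sequence[Sequence[float]], site_sizes: Sequence[int]) -> Vector:
    """Average the sites' weight vectors, weighting each site by its number of records."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total for j in range(dim)]


def federated_round(global_weights: Sequence[float], sites) -> Vector:
    """One communication round: every site trains locally, then the coordinator averages."""
    updates = [local_update(global_weights, X, y) for X, y in sites]
    sizes = [len(y) for _, y in sites]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    # Two hypothetical sites; each row is [bias term, exposure indicator].
    site_a = ([[1.0, 1.0], [1.0, 0.0]], [1, 0])
    site_b = ([[1.0, 1.0], [1.0, 0.0], [1.0, 1.0]], [1, 0, 1])
    weights = [0.0, 0.0]
    for _ in range(10):
        weights = federated_round(weights, [site_a, site_b])
    print(weights)
```

Sharing weights rather than records reduces, but does not eliminate, privacy risk, so real deployments typically combine this approach with further safeguards.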

Footnotes

Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded and supported by Health Management and Economics Research Center, Health Management Research Institute, Iran University of Medical Sciences, Tehran, Iran (1402-2-113-26934).

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. French DD, Margo CE, Campbell RR. Enhancing postmarketing surveillance: continuing challenges. Br J Clin Pharmacol. 2015;80(4):615-617.
2. Alemayehu D, Cappelleri JC. Revisiting issues, drawbacks and opportunities with observational studies in comparative effectiveness research. J Eval Clin Pract. 2013;19(4):579-583.
3. Vamathevan J, Clark D, Czodrowski P, et al. Applications of machine learning in drug discovery and development. Nat Rev Drug Discov. 2019;18(6):463-477.
4. Zhao H, Zhong J, Liang X, Xie C, Wang S. Application of machine learning in drug side effect prediction: databases, methods, and challenges. Front Comput Sci. 2024;19(5):195902.
5. Nooripour R, Hosseinian S, Hussain AJ, et al. How resiliency and hope can predict stress of covid-19 by mediating role of spiritual well-being based on machine learning. J Relig Health. 2021;60(4):2306-2321.
6. Nooripour R, Nasershariati MA, Amirinia M, Ilanloo H, Habibi A, Chogani M. Investigating the effectiveness of group metacognitive therapy on internet addiction and cognitive emotion regulation among adolescents. Pract Clin Psychol. 2023;11(2):93-102.
7. Le NQK, Tran TX, Nguyen PA, Ho TT, Nguyen VN. Recent progress in machine learning approaches for predicting carcinogenicity in drug development. Expert Opin Drug Metab Toxicol. 2024;20(7):621-628.
8. Wawira Gichoya J, McCoy LG, Celi LA, Ghassemi M. Equity in essence: a call for operationalising fairness in machine learning for healthcare. BMJ Health Care Inform. 2021;28(1):100289.
9. Linardatos P, Papastefanopoulos V, Kotsiantis S. Explainable AI: a review of machine learning interpretability methods. Entropy. 2021;23(1):e23010018.
10. Hu Q, Chen Y, Zou D, He Z, Xu T. Predicting adverse drug event using machine learning based on electronic health records: a systematic review and meta-analysis. Front Pharmacol. 2024;15:1497397.
11. Dara S, Dhamercherla S, Jadav SS, Babu CM, Ahsan MJ. Machine learning in drug discovery: a review. Artif Intell Rev. 2022;55(3):1947-1999.
12. Díaz-Rodríguez N, Del Ser J, Coeckelbergh M, López de Prado M, Herrera-Viedma E, Herrera F. Connecting the dots in trustworthy artificial intelligence: from AI principles, ethics, and key requirements to responsible AI systems and regulation. Inf Fusion. 2023;99:101896.
13. Toni E, Ayatollahi H, Abbaszadeh R, Fotuhi Siahpirani A. Machine learning techniques for predicting drug-related side effects: a scoping review. Pharmaceuticals. 2024;17(6):795.
14. Mallon A-M, Häring DA, Dahlke F, et al. Advancing data science in drug development through an innovative computational framework for data sharing and statistical analysis. BMC Med Res Methodol. 2021;21(1):250.
15. Prada Gori DN, Alberca LN, Talevi A. Making the most effective use of available computational methods for drug repositioning. Expert Opin Drug Discov. 2023;18(5):495-503.
16. Ho TB, Le L, Thai DT, Taewijit S. Data-driven approach to detect and predict adverse drug reactions. Curr Pharm Des. 2016;22(23):3498-3526.
17. Chakraborty C, Bhattacharya M, Pal S, Lee S-S. From machine learning to deep learning: advances of the recent data-driven paradigm shift in medicine and healthcare. Curr Res Biotechnol. 2024;7:100164.
18. Niazi SK. The coming of age of AI/ML in drug discovery, development, clinical testing, and manufacturing: the FDA perspectives. Drug Des Devel Ther. 2023;17:2691-2725.
19. Diaz-Garelli F, Johnson TR, Rahbar MH, Bernstam EV. Exploring the hazards of scaling up clinical data analyses: a drug side effect discovery case report. AMIA Jt Summits Transl Sci Proc. 2021;2021:180-189.
20. Rajpoot K, Desai N, Koppisetti H, et al. In silico methods for the prediction of drug toxicity. In: Tekade RK, ed. Pharmacokinetics and Toxicokinetic Considerations. Academic Press; 2022:357-383.
21. Ennab M, Mcheick H. Enhancing interpretability and accuracy of AI models in healthcare: a comprehensive review on challenges and future directions. Front Rob AI. 2024;11:1444763.
22. Reker D. Practical considerations for active machine learning in drug discovery. Drug Discov Today Technol. 2019;32-33:73-79.
23. Largent EA, Karlawish J, Wexler A. From an idea to the marketplace: identifying and addressing ethical and regulatory considerations across the digital health product-development lifecycle. BMC Digit Health. 2024;2(1):41.
24. Abràmoff MD, Tarver ME, Loyo-Berrios N, et al. Considerations for addressing bias in artificial intelligence for health equity. NPJ Digit Med. 2023;6(1):170.

Articles from Inquiry: A Journal of Medical Care Organization, Provision and Financing are provided here courtesy of SAGE Publications
