Abstract
The field of computational medicinal chemistry has undergone significant advancements, transitioning from traditional methodologies to contemporary strategies powered by artificial intelligence, machine learning, and big data. Traditional approaches, such as molecular docking and QSAR modeling, have long been the foundation of drug discovery, offering reliable frameworks for target identification and lead optimization. However, contemporary methodologies, including AI-driven target identification, adaptive virtual screening, and generative models, are reshaping the landscape by increasing efficiency and expanding chemical space exploration. This article provides a comprehensive comparison between these two paradigms, highlighting their respective strengths, limitations, and the potential of their integration. By bridging traditional and contemporary approaches, researchers can establish innovative workflows to accelerate drug discovery, ultimately contributing to the development of safer and more effective therapeutics.
This review bridges traditional and AI-driven approaches in computational medicinal chemistry, showing how hybrid models, federated learning, and explainable generative AI are reshaping modern drug discovery workflows.
1. Introduction
Computational medicinal chemistry has become an essential tool in modern drug discovery, leveraging computational methods to address complex challenges in identifying, designing, and optimizing new therapeutic agents. Historically, the field has relied on well-established techniques, such as molecular docking, QSAR modeling, and pharmacophore mapping, which have contributed to the discovery of numerous successful drugs.1 These methods are rooted in physics-based and statistical approaches, offering reliable frameworks for analyzing molecular interactions and predicting biological activity.
In recent years, advances in computational power, artificial intelligence (AI), and data availability have given rise to a new era of contemporary approaches. These innovations include machine learning-driven predictions, generative models for de novo drug design, and the integration of big data analytics to uncover hidden patterns in biological systems.2 The emergence of platforms like AlphaFold for protein structure prediction3 and AI-based tools for virtual screening have demonstrated the transformative potential of these technologies. Fig. 1 illustrates the organization of key approaches in computational medicinal chemistry, highlighting traditional methodologies and contemporary innovations that are transforming the field.
Fig. 1. Overview of computational medicinal chemistry approaches. This diagram provides a structured comparison between traditional and contemporary approaches in computational medicinal chemistry.
Despite these advancements, traditional approaches remain foundational, providing a structured starting point for many computational workflows. However, their limitations, such as reliance on small, curated datasets and the need for iterative experimental validation, highlight the need for integration with contemporary methodologies. By combining the strengths of both paradigms, researchers can create more comprehensive and efficient pipelines for drug discovery, capable of addressing challenges such as drug resistance, rare diseases, and personalized medicine.
The integration of multimodal data, such as combining genomic, proteomic, and metabolomic datasets, is gaining traction in contemporary workflows. This approach enables the identification of novel drug targets and the prediction of their therapeutic relevance across multiple biological pathways.4 Such advances are particularly significant in addressing the complexity of diseases like cancer and neurodegenerative disorders, where single-target approaches often fall short.
Furthermore, the use of reinforcement learning in drug design has opened new avenues for exploring chemical space. Recent studies have demonstrated the ability of AI algorithms to generate highly potent compounds with optimized pharmacokinetic and pharmacodynamic properties, reducing the time required for lead optimization.5 These tools are being adopted by both academic and industrial research groups to accelerate the translation of computational predictions into viable drug candidates.
Finally, the application of federated learning frameworks is emerging as a solution to data-sharing challenges in the pharmaceutical industry. By allowing decentralized training of machine learning models across multiple institutions, this approach preserves data privacy while leveraging large-scale datasets to improve model accuracy.6 This paradigm shift is expected to facilitate collaborative research efforts and enhance the predictive power of computational methods.
Another critical trend is the incorporation of explainable AI (XAI) techniques, which address the “black-box” nature of many machine learning models. By providing insights into the decision-making processes of AI systems, XAI enhances the trust and interpretability of computational predictions, making them more accessible to experimental validation teams.7 This approach is particularly valuable in regulatory contexts, where understanding the rationale behind drug design decisions is essential.
High-throughput computational platforms, combining cloud computing and edge computing, are also reshaping the scalability of virtual screening and molecular simulations. These platforms allow researchers to process massive libraries of compounds efficiently, enabling faster identification of promising candidates. Cloud-based frameworks such as AWS and Google Cloud are increasingly integrated into academic and industrial pipelines to expand their capacity.8
Lastly, advances in omics technologies and their computational integration have bolstered precision medicine initiatives. By aligning genetic and epigenetic data with computational drug design tools, it is now possible to tailor therapies to individual patients or specific subpopulations, addressing variability in drug response and improving efficacy.9 These innovations represent a convergence of computational and experimental disciplines, paving the way for transformative breakthroughs in personalized medicine.
This article aims to provide a detailed comparison of traditional and contemporary approaches in computational medicinal chemistry, emphasizing their complementary nature.
2. Traditional approaches in computational medicinal chemistry
Traditional approaches in computational medicinal chemistry have laid the foundation for drug discovery workflows for decades. These methods, characterized by their systematic and physics-based nature, remain indispensable tools for researchers. They offer robust frameworks to analyze drug-target interactions, predict biological activity, and optimize chemical structures.
2.1. Historical context and development
The roots of computational medicinal chemistry can be traced back to the 1960s and 1970s, a period that marked the inception of molecular modeling as a scientific discipline. Early pioneers in the field utilized basic computational algorithms to study molecular structures, often limited by the computational power of the time.10 During this period, the emergence of programs such as CONGEN and CAMP, which focused on conformational analysis of molecules, set the stage for more advanced computational tools11 (Fig. 2).
Fig. 2. Timeline illustrating key milestones in the evolution of computational molecular modelling and drug design, from early algorithm development in the 1960s–1970s to contemporary QSAR approaches and high-throughput virtual screening. Each segment highlights a major methodological advance contributing to the field of computer-aided drug discovery.
In 1982, the development of DOCK, one of the first molecular docking programs, revolutionized the ability to predict ligand binding modes within a target's active site. This program laid the groundwork for subsequent tools like AutoDock and FlexX, which introduced more sophisticated scoring functions and flexible ligand docking capabilities.12 These advancements coincided with the increasing availability of protein crystal structures through the Protein Data Bank (PDB), enabling structure-based drug design (SBDD) to flourish.13
Throughout the 1980s and 1990s, advances in quantum mechanics (QM) and molecular mechanics (MM) methods facilitated the development of hybrid approaches such as QM/MM simulations. These methods allowed researchers to model complex chemical reactions in biological systems with unprecedented accuracy, leading to breakthroughs in understanding enzymatic catalysis and inhibitor binding.14 The computational cost of QM/MM simulations was high, but they provided invaluable insights into reaction mechanisms.
The rise of high-throughput virtual screening (HTVS) in the late 1990s represented another milestone. Platforms like Schrödinger Glide and MOE Dock enabled the rapid evaluation of large compound libraries, dramatically reducing the time required for hit identification. These tools were complemented by QSAR models, which employed statistical methods to predict biological activity based on molecular descriptors.15 Early implementations of QSAR, such as those by Hansch and Fujita, emphasized the relationship between hydrophobicity and biological activity, providing a quantitative framework for lead optimization.16
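For reference, the Hansch-type relationship between hydrophobicity and activity mentioned above is commonly written in the general form below. The symbols are the conventional ones and are not defined elsewhere in this article, so they should be read as assumptions: C is the molar concentration that produces a standard biological response, P is the octanol-water partition coefficient, σ is the Hammett electronic substituent constant, and k_1 to k_4 are regression coefficients fitted to a congeneric series.

```latex
\log\frac{1}{C} = -k_1 (\log P)^2 + k_2 \log P + k_3 \sigma + k_4
```

The quadratic term in log P reflects the empirical observation that activity often rises with lipophilicity up to an optimum and then declines.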
2.2. Advancements in ADMET prediction
ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties are critical determinants of a compound's success as a drug candidate. In the early days of computational medicinal chemistry, ADMET studies relied heavily on in vitro and in vivo experiments. However, the advent of computational tools in the 1990s began to shift the paradigm towards predictive modeling.17
One of the first significant tools for ADMET prediction was TOPKAT (Toxicity Prediction by Komputer Assisted Technology), which used QSAR-based models to estimate toxicity endpoints such as carcinogenicity and mutagenicity. This was followed by the development of platforms like ADMET Predictor, which combined machine learning algorithms with large datasets to improve accuracy.18 These tools provided researchers with a cost-effective means of prioritizing compounds with favorable pharmacokinetic profiles.
Key breakthroughs in ADMET modeling also included the integration of physicochemical properties, such as lipophilicity (log P) and molecular weight, into predictive frameworks. The Lipinski rule of five, introduced in 1997, established simple guidelines for evaluating drug-likeness based on ADMET considerations.19 This rule has since become a cornerstone of early-stage drug discovery (Fig. 3).
Fig. 3. Timeline of ADMET prediction advancements, from QSAR models and TOPKAT in the 1990s to modern AI and deep learning approaches.
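The rule of five referred to above can be expressed as a short filter. The sketch below is a minimal illustration, assuming the four descriptors have already been computed by a cheminformatics toolkit. The `Descriptors` container, the function names, and the example values are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float      # molecular weight in Da
    log_p: float           # octanol-water partition coefficient (log P)
    h_bond_donors: int     # number of N-H and O-H groups
    h_bond_acceptors: int  # number of N and O atoms


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Common convention: a compound is flagged only with two or more violations."""
    return lipinski_violations(d) <= max_violations


# Made-up descriptor values, for illustration only.
small_polar = Descriptors(mol_weight=310.4, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5)
large_greasy = Descriptors(mol_weight=812.0, log_p=6.3, h_bond_donors=3, h_bond_acceptors=9)
print(passes_rule_of_five(small_polar), passes_rule_of_five(large_greasy))
```

Tolerating a single violation is a design choice that mirrors common practice. Stricter or looser filters only require changing `max_violations`.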
Today, advanced ADMET modeling platforms leverage deep learning algorithms to predict complex endpoints such as blood–brain barrier permeability and cytochrome P450-mediated metabolism. These predictions are often validated against experimental data, creating iterative workflows that integrate computational and experimental methodologies.20
2.3. Core techniques in traditional workflows
Molecular target identification is the first and one of the most critical steps in drug discovery. It involves selecting biological targets such as enzymes, receptors, ion channels, or nucleic acids that are implicated in disease pathways. Computational methods utilize databases like UniProt and the Protein Data Bank (PDB) to analyze protein sequences and structures, identifying binding pockets and functional regions of potential interest. In addition, publicly available databases such as DrugBank, ZINC, and ChEMBL play a central role in computational medicinal chemistry, providing access to millions of compounds with annotated physicochemical and bioactivity data. These resources underpin both traditional and AI-driven pipelines by enabling virtual screening, QSAR model training, and the validation of drug–target interactions across multiple disease areas.21 Bioinformatics tools also enable comparative genomic studies to discover novel targets with disease specificity.21 Validation of molecular targets is performed through experimental techniques such as gene knockdown using RNA interference (RNAi) or CRISPR-Cas9 systems, complemented by computational methods that simulate molecular interactions and predict downstream effects using systems biology approaches. For instance, network pharmacology models integrate protein–protein interactions to predict the impact of target modulation on entire biological systems.22
Virtual screening (VS) is a computational method used to identify potential drug candidates from large compound libraries. Structure-based virtual screening (SBVS) uses molecular docking to predict the binding orientation and affinity of small molecules within a target's active site. Advanced docking algorithms like Glide and AutoDock enable flexible ligand docking, improving prediction accuracy. For example, SBVS has been instrumental in developing kinase inhibitors for cancer therapy.23 Ligand-based virtual screening (LBVS) relies on molecular similarity or machine learning models like QSAR to predict activity. Tools such as DeepChem and OpenEye utilize ligand-based methods to prioritize compounds whose physicochemical properties resemble those of known active molecules. VS often utilizes traditional chemical databases, such as ZINC and ChEMBL, which provide access to millions of commercially available compounds. Integration with high-performance computing frameworks enables high-throughput virtual screening (HTVS) workflows, screening libraries containing millions of molecules within days.
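The similarity-based ranking that underlies ligand-based virtual screening can be sketched in a few lines. The example below assumes each molecule has already been converted into a binary fingerprint, represented here as the set of its "on" bit indices, and ranks a library by Tanimoto similarity to a known active. The function names, compound labels, and bit sets are hypothetical. Real workflows would obtain fingerprints from a cheminformatics toolkit.

```python
def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as zero
    return len(fp_a & fp_b) / len(union)


def rank_by_similarity(query_fp: frozenset, library: dict) -> list:
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints (made-up bit indices, for illustration only).
known_active = frozenset({1, 4, 7, 9})
library = {
    "cmpd_A": frozenset({1, 4, 7, 8}),
    "cmpd_B": frozenset({2, 3, 5}),
    "cmpd_C": frozenset({1, 4, 7, 9, 12}),
}
for name, score in rank_by_similarity(known_active, library):
    print(name, round(score, 2))
```

Compounds above a chosen similarity threshold would then be passed on to docking or experimental testing. The threshold itself is a project-specific choice.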
Once promising candidates are identified, lead compound optimization focuses on improving their potency, selectivity, and pharmacokinetic properties. Techniques such as molecular dynamics simulations model the dynamic behavior of ligand-target complexes, revealing interaction stability and binding energy. For instance, MD studies using GROMACS have elucidated the conformational flexibility of GPCR-ligand systems, guiding the design of highly selective agonists.24 Quantum mechanics/molecular mechanics (QM/MM) methods provide detailed insights into electronic interactions and reaction mechanisms within active sites. These are particularly valuable in designing enzyme inhibitors.25 Predictive models estimate pharmacokinetic and toxicity profiles. Tools like ADMET Predictor and SwissADME evaluate properties such as blood–brain barrier permeability, metabolic stability, and hepatotoxicity. These predictions enable early identification of potential liabilities, reducing late-stage failures.26
Computational tools assist medicinal chemists in designing practical synthetic routes for optimized leads. Fragment-based drug design (FBDD) uses molecular fragments with high target affinity as building blocks for complex drug candidates. Tools like Schrödinger's synthetic accessibility score assess the feasibility of proposed synthesis pathways, enabling iterative refinement of designs. Retrosynthetic analysis tools like Chematica (acquired by Merck KGaA and now marketed as Synthia) automate the design of multi-step synthesis routes, identifying commercially available intermediates and optimizing reaction conditions.27
The goal of computational approaches is to guide experimental validation. Bioassays confirm the predicted activity of computationally optimized compounds against their molecular targets. Techniques such as surface plasmon resonance (SPR) and isothermal titration calorimetry (ITC) provide kinetic and thermodynamic data on ligand binding. These experimental results refine computational models, closing the iterative loop between theory and practice.
2.4. Applications in drug discovery
Historically, traditional computational methods have played a crucial role in the discovery and development of numerous therapeutic agents. Molecular docking, for instance, was fundamental in the design of HIV protease inhibitors, which are now a mainstay in antiretroviral therapy. This technique helped in understanding the binding interactions and designing inhibitors with high specificity.28 Quantitative structure–activity relationship (QSAR) modeling significantly advanced the development of selective COX-2 inhibitors, which enhanced the safety profile of anti-inflammatory medications by reducing gastrointestinal side effects associated with nonselective NSAIDs.29 Furthermore, pharmacophore modeling has been instrumental in the discovery of novel kinase inhibitors, critical in cancer treatment. The development of osimertinib, a third-generation EGFR inhibitor, is a prime example of how these methodologies can address drug resistance mechanisms effectively, offering significant clinical benefits in treating non-small cell lung cancer with specific EGFR mutations.30
Despite their limitations, traditional techniques in computational medicinal chemistry provide a robust foundation for drug discovery processes. Their integration with both experimental methods and advanced computational strategies addresses complex challenges within the field. As these traditional methods continue to evolve, they remain essential tools in the medicinal chemist's toolkit, driving forward innovations and therapeutic breakthroughs.
3. Contemporary approaches in computational medicinal chemistry
Recent advancements in computational medicinal chemistry have redefined the traditional workflows, introducing cutting-edge technologies that leverage artificial intelligence (AI), machine learning (ML), and high-throughput computational methods. Contemporary approaches aim to enhance the speed, accuracy, and scalability of drug discovery processes, addressing limitations inherent in traditional methodologies (Fig. 4).
Fig. 4. AI-driven drug discovery pipeline. Integration of omics data, target identification, compound screening, lead optimization, and clinical trials through machine learning models for bioactivity, ADMET, and DTI prediction.
3.1. Integration of AI in target identification
The integration of artificial intelligence (AI) into target identification represents a transformative shift in computational medicinal chemistry. AI-powered tools allow researchers to navigate the complexities of biological systems and uncover novel therapeutic targets that were previously beyond reach. By analyzing extensive datasets from omics studies, AI systems identify critical nodes within disease networks, shedding light on proteins, genes, and pathways essential for disease progression.31,32
One significant advancement in AI-driven target identification is the use of deep neural networks to process genomic and transcriptomic data. For example, convolutional neural networks (CNNs) have been employed to detect patterns in gene expression profiles associated with specific diseases, providing a roadmap for target prioritization. Additionally, reinforcement learning models continuously refine predictions by incorporating feedback from experimental validation, creating a dynamic discovery pipeline.33
AlphaFold, developed by DeepMind, has revolutionized the structural biology landscape by predicting protein structures with atomic-level accuracy. This breakthrough enables researchers to study previously unresolved targets, including membrane proteins and intrinsically disordered regions, which are notoriously difficult to characterize experimentally.3 The structural insights provided by AlphaFold enhance the accuracy of docking and virtual screening workflows.
AI has also facilitated the exploration of protein–protein interaction (PPI) networks. By employing graph neural networks (GNNs), researchers can analyze PPIs to identify key hubs and connectors that serve as druggable targets. This approach has been particularly impactful in oncology, where targeting PPIs involved in tumor progression offers new therapeutic avenues.34
Recent studies (2024–2025) have shown how graph neural networks (GNNs) can model PPI topologies at a systems level, capturing nonlinear dependencies between nodes and revealing emergent biological targets. In glioblastoma, GNN frameworks identified CDC42 and RAC1 as central network regulators, later validated experimentally as druggable proteins. These findings illustrate how GNN-based modeling extends beyond sequence similarity to dynamic network inference, offering a promising direction for AI-assisted target discovery.
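The idea of a network "hub" in the discussion above can be made concrete with a much simpler, classical measure than a GNN. The sketch below computes degree centrality on a toy interaction network and lists the most connected nodes. It is a stand-in for illustration, not the graph neural network approach the cited studies describe, and the protein labels and edges are placeholders rather than real interaction data.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly connected to."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def top_hubs(edges, k=3):
    """Return the k most connected nodes, breaking ties alphabetically."""
    centrality = degree_centrality(edges)
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:k]


# Placeholder network: P1..P5 are hypothetical proteins.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
print(top_hubs(toy_edges, k=2))
```

A GNN replaces this fixed, hand-defined statistic with node representations learned from the graph and its node features, which is what allows it to capture the nonlinear dependencies mentioned above.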
Incorporating natural language processing (NLP) algorithms into the target identification process has enabled the extraction of actionable insights from unstructured biomedical literature. NLP models like BioBERT can process vast corpora of research papers, patents, and clinical trial data, uncovering previously overlooked connections between targets and diseases. This capability accelerates hypothesis generation and guides experimental design.35
AI-driven multi-omics integration is another area of innovation. By combining data from genomics, proteomics, metabolomics, and epigenomics, researchers gain a holistic view of disease mechanisms. AI models identify correlations across these datasets, revealing synergistic relationships between targets that could inform combination therapy strategies.36
Furthermore, federated learning frameworks are addressing data privacy concerns in collaborative research. By enabling decentralized training of AI models across institutions, federated learning preserves patient confidentiality while leveraging diverse datasets to improve target identification accuracy.37 This approach is particularly valuable for rare diseases, where data scarcity is a significant challenge.
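The aggregation step at the heart of many federated learning schemes can be sketched as a sample-weighted average of locally trained parameters, often called federated averaging. The example below is a simplified illustration under stated assumptions: each institution returns a flat list of model weights together with its local sample count, and only those values, never the underlying records, reach the coordinating server. The function name and numbers are hypothetical.

```python
def federated_average(client_updates):
    """Combine client weights into a global model, weighted by sample count.

    client_updates: list of (weights, n_samples) tuples, where weights is a
    list of floats of identical length for every client.
    """
    total_samples = sum(n for _, n in client_updates)
    if total_samples == 0:
        raise ValueError("no training samples reported by any client")
    dim = len(client_updates[0][0])
    global_weights = [0.0] * dim
    for weights, n_samples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            global_weights[i] += w * n_samples / total_samples
    return global_weights


# Two hypothetical institutions with different dataset sizes.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
print(federated_average(updates))
```

Weighting by sample count lets larger datasets influence the shared model more. Production systems typically add secure aggregation or differential privacy on top of this step, which the sketch omits.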
Drug repurposing has also benefited from AI's predictive power. By analyzing existing drug–target interaction datasets, AI systems identify potential off-label uses for approved drugs. For instance, AI models predicted the efficacy of baricitinib, a rheumatoid arthritis drug, for treating COVID-19, a finding later validated in clinical trials.38
The integration of AI in target identification has significantly reduced the time and cost associated with early-stage drug discovery. Traditional methods often require years of experimental effort to validate a single target, whereas AI-powered workflows can generate and prioritize multiple high-confidence targets within weeks.39 This efficiency is particularly valuable in responding to public health emergencies, such as pandemics, where rapid therapeutic development is critical.
As AI technologies continue to evolve, their application in target identification will likely expand to include predictive models for complex diseases, such as neurodegenerative disorders and multifactorial cancers. By combining AI with experimental methods, researchers can create robust pipelines that not only identify novel targets but also elucidate their biological relevance, paving the way for innovative therapeutic solutions.40
AI has revolutionized target identification by enabling the analysis of massive datasets derived from genomics, proteomics, and transcriptomics. Tools such as DeepMind's AlphaFold have accurately predicted protein structures at an unprecedented scale, uncovering new targets for therapeutic intervention. Additionally, ML algorithms analyze biological networks to identify key proteins or pathways involved in disease progression, facilitating the discovery of novel targets.41
One notable example of AI-driven target identification is BenevolentAI's use of machine learning and knowledge graphs to propose baricitinib as a potential treatment for COVID-19 within 48 hours. The compound, originally approved for rheumatoid arthritis, was later validated in clinical trials, demonstrating how AI systems can accelerate repurposing through integrative literature mining and target prioritization.38
Additionally, researchers have applied graph neural networks (GNNs) to model protein–protein interaction networks in glioblastoma, identifying novel therapeutic targets not previously captured by differential expression analysis. This systems-level approach uncovered central network hubs, such as CDC42 and RAC1, that were later validated as druggable nodes in vitro.
3.2. Generative models for drug design
Generative models, such as generative adversarial networks (GANs) and variational autoencoders (VAEs), have revolutionized drug discovery by enabling the design of novel molecular structures with desired properties. These models are trained on large chemical libraries, learning to generate compounds that meet predefined criteria, such as potency, selectivity, and drug-likeness.42
The application of generative models begins with encoding molecular representations, such as SMILES strings or molecular graphs, into latent spaces. By manipulating these latent spaces, researchers can explore vast chemical spaces and design molecules with specific attributes. For example, GAN-based frameworks have been used to design kinase inhibitors with improved selectivity profiles, reducing off-target effects.43
VAEs have shown remarkable success in de novo drug design. These models encode molecular structures into continuous latent spaces, allowing for smooth interpolation between compounds. By decoding points in the latent space, VAEs generate novel molecules that retain the pharmacophoric features of the training set while introducing structural diversity. For instance, VAEs have been used to design inhibitors for bromodomain-containing proteins, a promising class of epigenetic targets.44
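The "smooth interpolation" between compounds described above amounts to walking along a path between two points in the latent space and decoding each intermediate point. The sketch below shows only the interpolation step, using plain linear interpolation. The latent vectors are made-up numbers, and the encoder and decoder that a trained VAE would provide are assumed to exist and are not implemented here.

```python
def interpolate_latent(z_start, z_end, steps):
    """Return `steps` evenly spaced latent vectors from z_start to z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for k in range(steps):
        t = k / (steps - 1)  # runs from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Hypothetical 2-D latent codes of two encoded molecules.
z_a = [0.0, 0.0]
z_b = [1.0, 2.0]
for z in interpolate_latent(z_a, z_b, steps=3):
    print(z)  # each point would be passed to the VAE decoder
```

In practice not every decoded point corresponds to a valid molecule, so decoded structures are usually checked for chemical validity before further scoring.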
Reinforcement learning (RL) has been integrated with generative models to optimize molecules for specific properties. In this approach, a reward function guides the generation process, favoring compounds that meet criteria such as high binding affinity or low toxicity. RL-augmented generative models have successfully produced lead compounds for G-protein-coupled receptors (GPCRs), highlighting their versatility.45
Generative models have also accelerated the identification of novel scaffolds. By focusing on underexplored regions of chemical space, these models propose molecular frameworks that differ from traditional small-molecule libraries. This approach is particularly valuable for tackling challenging targets, such as protein–protein interactions, where conventional screening methods often fall short.46
The integration of multi-objective optimization into generative models allows for the simultaneous consideration of multiple properties. For example, researchers have used these models to design molecules that balance potency, solubility, and metabolic stability. This holistic approach reduces the need for iterative optimization cycles, streamlining the drug discovery process.47
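One common way to handle several objectives at once, without collapsing them into a single weighted score, is to keep only the Pareto-optimal candidates: molecules for which no other candidate is at least as good on every property and strictly better on one. The sketch below illustrates that selection step on made-up scores. The molecule labels, the three objectives, and the assumption that all scores are scaled so that higher is better are illustrative choices, not details from the cited work.

```python
def dominates(a, b):
    """True if score tuple a is no worse than b everywhere and better somewhere."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return names of candidates not dominated by any other candidate.

    candidates: dict mapping name -> tuple of objective scores (higher is better).
    """
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (potency, solubility, metabolic stability) scores in [0, 1].
generated = {
    "mol_1": (0.9, 0.2, 0.5),
    "mol_2": (0.6, 0.8, 0.6),
    "mol_3": (0.5, 0.1, 0.4),
}
print(pareto_front(generated))
```

Here `mol_3` is dropped because `mol_1` is better on every property, while `mol_1` and `mol_2` both survive because each wins on a different objective. A generative model can use membership in this front as part of its selection or reward signal.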
Cloud-based platforms have democratized access to generative models, enabling researchers worldwide to leverage these tools without requiring extensive computational resources. Platforms like Insilico Medicine's Chemistry42 and BenevolentAI's generative pipeline provide user-friendly interfaces for molecule generation and property prediction, facilitating collaboration and innovation.48
One notable success story is the design of inhibitors for the SARS-CoV-2 main protease. Generative models rapidly proposed candidate molecules, which were subsequently validated through docking and experimental assays. This expedited workflow demonstrated the potential of generative models to address urgent global health challenges.49
Despite their successes, generative models face challenges related to model interpretability and bias. Efforts to incorporate explainable AI techniques are underway to provide insights into the molecular features driving model predictions. Additionally, expanding training datasets to include diverse chemical and biological data will enhance the generalizability of these models.50
As generative models continue to evolve, their integration with other computational and experimental techniques promises to further accelerate drug discovery. Combining these models with high-throughput screening, structural biology, and AI-driven ADMET predictions will create comprehensive workflows that address the complexity of modern drug development.
Generative adversarial networks (GANs) and variational autoencoders (VAEs) are transforming de novo drug design.51 These models generate novel molecular structures with desired properties by learning from existing chemical libraries. For example, platforms like Insilico Medicine's Chemistry42 use generative models to propose lead compounds, optimizing pharmacokinetic properties in silico.52
Generative AI demonstrated its power during the early stages of the COVID-19 pandemic. Insilico Medicine used its generative pipeline to design novel inhibitors for the SARS-CoV-2 main protease (Mpro) in less than 30 days. Several candidates showed promising docking scores and physicochemical properties, and the most potent compound progressed to synthesis and in vitro validation with nanomolar inhibition.46
In another case, researchers used variational autoencoders (VAEs) trained on bromodomain inhibitors to generate new molecules targeting BRD4. These candidates retained the key pharmacophoric elements but introduced novel scaffolds. Experimental assays confirmed high binding affinity, demonstrating how VAEs can bridge diversity with activity.
Despite their remarkable potential, generative and AI-based models face several challenges. Model performance often depends on the diversity and quality of training datasets, and bias in these data can limit generalizability. Moreover, the so-called “black-box” nature of deep neural networks raises concerns about interpretability, particularly in regulatory contexts. Synthetic accessibility is another critical limitation—many generated molecules may not be synthetically feasible or stable under physiological conditions. Addressing these challenges will require the integration of explainable AI (XAI) frameworks, continuous benchmarking with experimental data, and collaborative validation through federated learning initiatives that preserve data privacy while expanding training diversity.7,33
4. Integration and hybrid models
The convergence of traditional and contemporary methodologies in computational medicinal chemistry is redefining the paradigm of drug discovery. While traditional methods offer interpretability and validated frameworks, modern AI-based tools introduce scalability, automation, and pattern recognition beyond human capacity. Together, they form hybrid systems that are not only more efficient but also more intelligent and adaptive.
4.1. Bridging methodologies: a synergistic pipeline
Hybrid pipelines are now designed to start with physics-based simulations—such as molecular docking or molecular dynamics—to establish structure–activity baselines. These results are then fed into machine learning models that refine predictions, reprioritize compound libraries, or identify non-obvious interactions. The hybrid approach is illustrated in Fig. 5, which outlines the flow from physics-based simulations to machine learning integration and the resulting optimized compound selection.
Fig. 5. Hybrid drug discovery workflow combining traditional physics-based methods (e.g., molecular docking and dynamics) with AI-based machine learning models. The approach refines compound prioritization and supports efficacy validation through integrated predictive layers.
4.2. Layering intelligence: AI-augmented QSAR and docking
Instead of replacing legacy models, AI augments them. QSAR models are now enhanced using neural fingerprints and graph convolutions, extending their predictive power across broader chemical domains. Similarly, molecular docking results are rescored using ML algorithms that learn from past docking errors, crystal structures, and activity data.
Explainable AI (XAI) further strengthens this integration. Attribution tools such as SHAP and integrated gradients quantify how much individual molecular features or substructures contribute to a model's prediction, which increases transparency and supports medicinal chemists in rational lead optimization.53 This is particularly relevant in regulatory submissions, where a lack of mechanistic interpretability can undermine confidence in model-derived evidence.
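To make the idea of feature attribution concrete, the following Python sketch computes exact Shapley values, the quantity that SHAP approximates, for a toy potency model. It does not use the SHAP library; the `toy_potency_model` function, its three substructure flags, and its coefficients are hypothetical, and exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, features):
    """Exact Shapley attribution of a model's output to each feature.

    `model` takes a frozenset of present feature names and returns a number.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(features)
    values = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (model(s | {f}) - model(s))
        values[f] = total
    return values


def toy_potency_model(present):
    """Hypothetical pIC50 predictor over three substructure flags."""
    score = 5.0
    if "hinge_binder" in present:
        score += 1.5
    if "halogen" in present:
        score += 0.5
    if "hinge_binder" in present and "halogen" in present:
        score += 0.4  # synergy term, split evenly between the two features
    if "nitro" in present:
        score -= 0.8
    return score


features = ["hinge_binder", "halogen", "nitro"]
phi = shapley_values(toy_potency_model, features)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the prediction with all substructures present and the baseline with none, which is the property that makes such values convenient for ranking which structural changes to explore next.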
An example of a hybrid pipeline combining classical docking with AI re-ranking comes from the identification of PI3Kδ inhibitors. Researchers initially performed structure-based docking to select top candidates, which were then rescored using deep neural networks trained on transcriptomic response data. The hybrid selection process led to the identification of compounds with not only high binding affinity but also validated pathway inhibition in cancer cell lines.31
Beyond single examples, several large-scale collaborative efforts have demonstrated the power of hybrid AI-physics workflows. The MELLODDY consortium, for instance, connected ten major pharmaceutical companies to train shared models for ADMET prediction via federated learning, outperforming individual models while maintaining data confidentiality. Similarly, hybrid pipelines combining generative modeling and molecular dynamics have successfully designed BRD4 inhibitors with improved pharmacophoric diversity, and graph-based neural networks have been applied to protein–protein interaction networks in glioblastoma, identifying druggable hubs such as CDC42 and RAC1 previously inaccessible to classical analyses.
4.3. Federated learning in multi-site drug discovery
A major obstacle in global drug discovery is the persistence of data silos. Federated learning has emerged as a promising solution: it allows decentralized AI training across pharmaceutical companies, hospitals, and research institutes, with each site sharing model updates rather than raw data. This approach has been applied to predicting toxicity profiles and blood–brain barrier permeability using datasets distributed across continents.
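The core mechanism can be sketched in a few lines of Python. The example below uses federated averaging (FedAvg) as an assumed aggregation rule, since the text does not specify one, and fits a one-variable linear model on synthetic data held at three hypothetical sites. Production systems add secure aggregation, encryption, and far larger models; none of that is shown here.

```python
import random


def local_update(weights, data, lr=0.1, epochs=10):
    """Run a few epochs of gradient descent on one site's private data.

    The model is y = w0 + w1 * x; only the updated weights leave the site.
    """
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        g0 = sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return (w0, w1)


def federated_average(site_weights, site_sizes):
    """Average the site models, weighting each by its number of samples."""
    total = sum(site_sizes)
    w0 = sum(w[0] * n for w, n in zip(site_weights, site_sizes)) / total
    w1 = sum(w[1] * n for w, n in zip(site_weights, site_sizes)) / total
    return (w0, w1)


def make_site(n, rng, lo, hi):
    """Synthetic data from y = 1 + 2x plus small noise (illustration only)."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 1.0 + 2.0 * x + rng.gauss(0, 0.05)) for x in xs]


rng = random.Random(0)
# Three hypothetical sites with different sizes and different input ranges.
sites = [
    make_site(30, rng, 0.0, 1.0),
    make_site(60, rng, 0.5, 1.5),
    make_site(10, rng, -1.0, 0.0),
]

global_weights = (0.0, 0.0)
for _ in range(200):
    # Each site trains locally, starting from the current global model.
    updates = [local_update(global_weights, data) for data in sites]
    # The coordinator only ever sees weights, never the underlying records.
    global_weights = federated_average(updates, [len(d) for d in sites])

print(global_weights)
```

Because every site samples a different region of the input range, no single site constrains the model as well as the pooled data would, yet the averaged model should land close to the underlying relationship (intercept 1, slope 2) without any records leaving their site.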
4.4. From data to decision: The role of multimodal fusion
Modern hybrid pipelines are not only about combining tools; they are also about integrating data types. Combining omics datasets (genomics, proteomics, metabolomics), structural bioinformatics, and phenotypic screens enables a richer, systems-level understanding of biological targets.
These data are harmonized using AI models capable of multimodal learning, aligning molecular features with clinical outcomes or adverse events. This holistic approach is essential for tackling complex diseases like Alzheimer's or multi-drug-resistant infections, where single-modality analysis often falls short.
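One of the simplest forms of such harmonization is early fusion, in which each modality is standardized and the feature blocks are concatenated per sample before any model is trained. The Python sketch below illustrates this under stated assumptions: the `early_fusion` function, the modality names, and the values are hypothetical, and real multimodal models typically learn the alignment rather than relying on plain concatenation.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def early_fusion(modalities, sample_ids):
    """Concatenate standardized feature blocks for samples found in every modality."""
    shared = [s for s in sample_ids if all(s in m for m in modalities.values())]
    fused = {s: [] for s in shared}
    for name in sorted(modalities):  # fixed order keeps feature positions stable
        block = zscore_columns([modalities[name][s] for s in shared])
        for s, row in zip(shared, block):
            fused[s].extend(row)
    return fused


# Illustrative values only: two expression features and one protein abundance.
modalities = {
    "transcriptomics": {
        "s1": [2.0, 10.0],
        "s2": [4.0, 10.0],
        "s3": [6.0, 10.0],
        "s4": [8.0, 10.0],
    },
    "proteomics": {"s1": [0.1], "s2": [0.3], "s3": [0.5]},
}
fused = early_fusion(modalities, ["s1", "s2", "s3", "s4"])
print(fused)
```

Sample `s4` is dropped because it lacks a proteomics measurement, which highlights a practical cost of this design: only samples profiled in every modality survive, one reason why more flexible multimodal architectures are attractive.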
4.5. Next-generation workflows and future directions
Emerging trends point toward fully autonomous discovery platforms. These systems will integrate cloud-based virtual screening, real-time feedback from lab automation (robotic synthesis + bioassays), and reinforcement learning models that continuously optimize design choices.
Beyond current hybrid models, emerging paradigms such as quantum computing and digital twins are poised to further transform computational medicinal chemistry. Quantum algorithms promise to solve complex molecular optimization problems that remain intractable for classical computers, while digital twin systems will integrate real-time experimental feedback with in silico simulations to accelerate the design–synthesis–testing cycle. Together, these technologies point toward a future where autonomous, continuously learning discovery platforms bridge computation, robotics, and AI-driven decision-making.54
Moreover, edge computing is being explored to bring AI inference closer to lab equipment, enabling on-site decision-making in high-throughput screening facilities.
5. Conclusions
The evolution of computational medicinal chemistry marks a strategic shift from traditional physics-based and statistical methods to data-driven, AI-powered frameworks. While foundational techniques such as molecular docking, QSAR modeling, and pharmacophore analysis remain indispensable, their integration with contemporary technologies offers a more efficient and adaptive pathway for drug discovery.
This convergence is not merely beneficial—it is essential. Hybrid models that combine the interpretability of classical methods with the scalability and predictive power of machine learning enable deeper exploration of chemical space, faster candidate optimization, and improved targeting of complex biological systems. Innovations like federated learning, explainable AI, and reinforcement learning-based molecule generation are redefining the boundaries of therapeutic innovation.
Progress in this field depends not only on tool development but also on robust data interoperability, rigorous experimental validation, and interdisciplinary collaboration. By bridging traditional and modern approaches, computational medicinal chemistry is no longer just a support tool—it is becoming a driving force for personalized medicine, precision therapies, and rapid response to emerging health challenges.
Author contributions
Conceptualization, methodology, investigation, writing – original draft, writing – review & editing: Aldo Sena de Oliveira.
Conflicts of interest
There are no conflicts to declare.
Acknowledgments
The author would like to thank the Gulbenkian Institute for Molecular Medicine (GIMM) for institutional support. This work was supported by the Fundação para a Ciência e a Tecnologia (FCT, Portugal) through national funds within the scope of the Recovery and Resilience Plan (Plano de Recuperação e Resiliência – PRR), under the European Union's NextGenerationEU framework.
Biography
Aldo Sena de Oliveira.

Aldo Sena de Oliveira is a Computational Medicinal Chemist at the Gulbenkian Institute for Molecular Medicine (GIMM), University of Lisbon. He leads the Computational Medicinal Chemistry Unit (CMCU), focusing on the integration of artificial intelligence, molecular modelling, and multi-omics data in drug discovery. His research connects chemistry, data science, and biomedical innovation, contributing to projects on neurodegeneration, infectious diseases, and oncology.
Data availability
No new data were generated or analysed in this study.
Notes and references
- Kitchen D. B. Decornez H. Furr J. R. Bajorath J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nat. Rev. Drug Discovery. 2004;3:935–949. doi: 10.1038/nrd1549.
- Go-ahead for first peanut allergy drug, Nat. Biotechnol., 2020, 38, 254, 10.1038/s41587-020-0458-7
- Jumper J. Evans R. Pritzel A. et al., Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
- Jain R. Dubey S. K. Singhvi G. The Hedgehog pathway and its inhibitors: Emerging therapeutic approaches for basal cell carcinoma. Drug Discovery Today. 2022;27(4):1176–1183. doi: 10.1016/j.drudis.2021.12.005.
- Yang B. Li K. Zhong X. Zou J. Implementation of deep learning in drug design. MedComm: Future Med. 2022;1:1–17. doi: 10.1002/mef2.18.
- Rieke N. Hancox J. Li W. et al., The future of digital health with federated learning. NPJ Digit. Med. 2020;3:119. doi: 10.1038/s41746-020-00323-1.
- Barredo Arrieta A. Díaz-Rodríguez N. Del Ser J. Bennetot A. Tabik S. Barbado A. Garcia S. Gil-Lopez S. Molina D. Benjamins R. Chatila R. Herrera F. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion. 2020;58:82–115. doi: 10.1016/j.inffus.2019.12.012.
- Kunduru A. R. Machine learning in drug discovery: A comprehensive analysis of applications, challenges, and future directions. Int. J. Orange Technol. 2023;5:2615–7071.
- Tong L. Zhou W. Guo X. et al., Integrating multi-omics data with EHR for precision medicine using advanced artificial intelligence. IEEE Rev. Biomed. Eng. 2024;17:80–97. doi: 10.1109/RBME.2023.3324264.
- Leach A. R. and Gillet V. J., An Introduction to Chemoinformatics, Springer, Dordrecht, 2007, 10.1007/978-1-4020-6291-9
- Jorgensen W. L. The many roles of computation in drug discovery. Science. 2004;303(5665):1813–1818. doi: 10.1126/science.1096361.
- Kuntz I. D. Blaney J. M. Oatley S. J. Langridge R. Ferrin T. E. A geometric approach to macromolecule–ligand interactions. J. Mol. Biol. 1982;161:269–288. doi: 10.1016/0022-2836(82)90153-X.
- Berman H. M. Battistuz T. Bhat T. N. Bluhm W. F. Bourne P. E. Burkhardt K. Feng Z. Gilliland G. L. Iype L. Jain S. Fagan P. Marvin J. Padilla D. Ravichandran V. Schneider B. Thanki N. Weissig H. Westbrook J. D. Zardecki C. The Protein Data Bank. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002;58:899–907. doi: 10.1107/S0907444902003451.
- Gao J. Perspective on “Theoretical studies of enzymic reactions: dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme”. Theor. Chem. Acc. 2000;103:328–329.
- Shoichet B. K. McGovern S. L. Wei B. Irwin J. J. Lead discovery using molecular docking. Curr. Opin. Chem. Biol. 2002;6:439–446. doi: 10.1016/S1367-5931(02)00339-3.
- Hansch C. Fujita T. ρ–σ–π analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc. 1964;86:1616–1626. doi: 10.1021/ja01062a035.
- Van de Waterbeemd H. From in vivo to in vitro/in silico ADME: progress and challenges. Expert Opin. Drug Metab. Toxicol. 2005;1:1–4. doi: 10.1517/17425255.1.1.1.
- Valerio Jr L. G. In silico toxicology models and databases as FDA Critical Path Initiative toolkits. Hum. Genomics. 2011;5:200–207. doi: 10.1186/1479-7364-5-3-200.
- Lipinski C. A. Lombardo F. Dominy B. W. Feeney P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 2012;64(Suppl):4–17. doi: 10.1016/j.addr.2012.09.019.
- Zhang L. Tan J. Han D. Zhu H. From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discovery Today. 2017;22:1680–1685. doi: 10.1016/j.drudis.2017.08.010.
- Wishart D. S. Feunang Y. D. Guo A. C. Lo E. J. Marcu A. Grant J. R. Sajed T. Johnson D. Li C. Sayeeda Z. Assempour N. Iynkkaran I. Liu Y. Maciejewski A. Gale N. Wilson A. Chin L. Cummings R. Le D. Pon A. Knox C. Wilson M. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018;46(D1):D1074–D1082. doi: 10.1093/nar/gkx1037.
- Hopkins A. L. Network pharmacology: the next paradigm in drug discovery. Nat. Chem. Biol. 2008;4:682–690. doi: 10.1038/nchembio.118.
- Pagadala N. S. Syed K. Tuszynski J. Software for molecular docking: a review. Biophys. Rev. 2017;9:91–102. doi: 10.1007/s12551-016-0247-1.
- Abraham M. J. Murtola T. Schulz R. Páll S. Smith J. C. Hess B. Lindahl E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1–2:19–25. doi: 10.1016/j.softx.2015.06.001.
- Field M. J. Bash P. A. Karplus M. A combined quantum mechanical and molecular mechanical potential for molecular dynamics simulations. J. Comput. Chem. 1990;11:700–733. doi: 10.1002/jcc.540110605.
- Walters W. P. Stahl M. T. Murcko M. A. Virtual screening—an overview. Drug Discovery Today. 1998;3:160–178. doi: 10.1016/S1359-6446(97)01163-X.
- Grzybowski B. A. Szymkuć S. Gajewska E. P. Dittwald P. Wołos A. Klucznik T. Chematica: a story of computer code that started to think like a chemist. Chem. 2018;4:390–398.
- Lv Z. Chu Y. Wang Y. HIV protease inhibitors: a review of molecular selectivity and toxicity. HIV/AIDS. 2015;7:95–104.
- Halgren T. A. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J. Comput. Chem. 1996;17:490–519. doi: 10.1002/(SICI)1096-987X(199604)17:5/6<490::AID-JCC1>3.0.CO;2-P.
- Jänne P. A. Yang J. C.-H. Kim D.-W. Planchard D. Ohe Y. Ramalingam S. S. Ahn M.-J. et al., AZD9291 in EGFR inhibitor–resistant non–small-cell lung cancer. N. Engl. J. Med. 2015;372:1689–1699. doi: 10.1056/NEJMoa1411817.
- Stokes J. M. Yang K. Swanson K. Jin W. Cubillos-Ruiz A. Donghia N. M. MacNair C. R. French S. Carfrae L. A. Bloom-Ackermann Z. et al., A deep learning approach to antibiotic discovery. Cell. 2020;180:688–702.e13. doi: 10.1016/j.cell.2020.01.021.
- Gangwal A. Ansari A. Ahmad I. Azad A. K. Kumarasamy V. Subramaniyan V. Wong L. S. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front. Pharmacol. 2024;15:1–26.
- Alizadehsani R. Oyelere S. S. Hussain S. Calixto R. R. de Albuquerque V. H. C. Roshanzamir M. Rahouti M. Jagatheesaperumal S. K. Explainable artificial intelligence for drug discovery and development: a comprehensive survey. IEEE Access. 2024;12:35796–35812.
- Soleymani F. Paquet E. Viktor H. Michalowski W. Spinello D. Protein–protein interaction prediction with deep learning: a comprehensive review. Comput. Struct. Biotechnol. J. 2022;20:5316–5341. doi: 10.1016/j.csbj.2022.08.070.
- Lee J. Yoon W. Kim S. Kim D. Kim S. So C. H. Kang J. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36:1234–1240. doi: 10.1093/bioinformatics/btz682.
- Basu B. Gowtham N. H. Xiao Y. Kalidindi S. R. Leong K. W. Biomaterialomics: data science-driven pathways to develop fourth-generation biomaterials. Acta Biomater. 2022;143:1–25. doi: 10.1016/j.actbio.2022.02.027.
- Rieke N. Hancox J. Li W. Milletarì F. Roth H. R. Albarqouni S. Bakas S. Galtier M. N. Landman B. A. Maier-Hein K. Ourselin S. Sheller M. Summers R. M. Trask A. Xu D. Baust M. Cardoso M. J. The future of digital health with federated learning. NPJ Digit. Med. 2020;3:119. doi: 10.1038/s41746-020-00323-1.
- Richardson P. Griffin I. Tucker C. Smith D. Oechsle O. Phelan A. et al., Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. Lancet. 2020;395:e30–e31. doi: 10.1016/S0140-6736(20)30304-4.
- Mak K.-K. Pichika M. R. Artificial intelligence in drug development: present status and future prospects. Drug Discovery Today. 2019;24:773–780. doi: 10.1016/j.drudis.2018.11.014.
- Askr H. Elgeldawi E. Aboul Ella H. Elshaier Y. A. M. M. Gomaa M. M. Hassanien A. E. Deep learning in drug discovery: an integrative review and future challenges. Artif. Intell. Rev. 2023;56:5975–6037. doi: 10.1007/s10462-022-10306-1.
- Vişan A. I. Neguţ I. Integrating artificial intelligence for drug discovery in the context of revolutionizing drug delivery. Life. 2024;14(2):1–36. doi: 10.3390/life14020233.
- Klambauer G. Hochreiter S. Rarey M. Machine learning in drug discovery. J. Chem. Inf. Model. 2019;59:947–948. doi: 10.1021/acs.jcim.9b00136.
- Elton D. C. Boukouvalas Z. Fuge M. D. Chung P. W. Deep learning for molecular design—a review of the state of the art. Mol. Syst. Des. Eng. 2019;4:828–849. doi: 10.1039/C9ME00039A.
- Gómez-Bombarelli R. Wei J. N. Duvenaud D. Hernández-Lobato J. M. Sánchez-Lengeling B. Sheberla D. Aguilera-Iparraguirre J. Hirzel T. D. Adams R. P. Aspuru-Guzik A. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 2018;4(2):268–276. doi: 10.1021/acscentsci.7b00572.
- Olivecrona M. Blaschke T. Engkvist O. Chen H. Molecular de-novo design through deep reinforcement learning. J. Cheminf. 2017;9:48. doi: 10.1186/s13321-017-0235-x.
- Zhavoronkov A. Ivanenkov Y. A. Aliper A. Veselov M. S. Aladinskiy V. A. Aladinskaya A. V. Terentiev V. A. Polykovskiy D. A. Kuznetsov M. D. Asadulaev A. Volkov Y. Zholus A. Shayakhmetov R. R. Zhebrak A. Minaeva L. I. Zagribelnyy B. A. Lee L. H. Soll R. Madge D. Xing L. Guo T. Aspuru-Guzik A. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 2019;37:1038–1040. doi: 10.1038/s41587-019-0224-x.
- Ishida S. Aasawat T. Sumita M. Katouda M. Yoshizawa T. Yoshizoe K. Tsuda K. Terayama K. ChemTSv2: Functional molecular design using de novo molecule generator. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2023;13(6):e1680. doi: 10.1002/wcms.1680.
- Bai Q. Liu S. Tian Y. Xu T. Banegas-Luna A. J. Pérez-Sánchez H. Huang J. Liu H. Yao X. Application advances of deep learning methods for de novo drug design and molecular dynamics simulation. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2022;12(3):e1581. doi: 10.1002/wcms.1581.
- Andrianov A. M. Shuldau M. A. Furs K. V. Yushkevich A. M. Tuzikov A. V. AI-Driven De Novo Design and Molecular Modeling for Discovery of Small-Molecule Compounds as Potential Drug Candidates Targeting SARS-CoV-2 Main Protease. Int. J. Mol. Sci. 2023;24(9):8083. doi: 10.3390/ijms24098083.
- McCloskey K. Taly A. Monti F. Colwell L. J. Using attribution to decode binding mechanism in neural network models for chemistry. Proc. Natl. Acad. Sci. U. S. A. 2019;116(24):11624–11629. doi: 10.1073/pnas.1820657116.
- Pang C. Qiao J. Zeng X. Zou Q. Wei L. Deep Generative Models in De Novo Drug Molecule Generation. J. Chem. Inf. Model. 2024;64(7):2174–2194. doi: 10.1021/acs.jcim.3c01496.
- Bai Q., Ma J. and Xu T., AI Deep Learning Generative Models for Drug Discovery, in Applications of Generative AI, Springer Nature, 2024, pp. 461–475
- Wojtuch A. Jankowski R. Podlewska S. How can SHAP values help to shape metabolic stability of chemical compounds? J. Cheminf. 2021;13:74. doi: 10.1186/s13321-021-00551-8.
- Santagati R. Aspuru-Guzik A. Babbush R. Degroote M. González L. Kyoseva E. Moll N. Oppel M. Parrish R. M. Rubin N. C. Streif M. Tautermann C. S. Weiss H. Wiebe N. Utschig-Utschig C. Drug design on quantum computers. Nat. Phys. 2024;20:549–557.