Abstract
Natural products (NPs) have long been a cornerstone of pharmaceutical innovation, contributing to approximately 50% of FDA-approved drugs over the past four decades. However, traditional NP drug discovery faces significant hurdles, including laborious isolation processes, biodiversity constraints, and low hit rates in high-throughput screening. These hurdles often extend the development timelines to 10–15 years with costs exceeding $2 billion per drug. Artificial intelligence (AI) emerges as a transformative force, leveraging machine learning (ML), deep learning (DL), and generative models (Gen. AI) to expedite these processes. AI facilitates virtual screening of vast chemical libraries, predicts molecular interactions with unprecedented accuracy, and designs novel NP-inspired scaffolds, potentially reducing discovery time by up to 70%. This interdisciplinary approach not only addresses unmet medical needs but also aligns with global sustainability goals, potentially increasing success rates from <1% in traditional pipelines to over 10%. Ultimately, AI hints at revitalizing NP drug discovery, fostering innovative, eco-friendly therapeutics. This study reviews recent advancements in AI applications for NP drug discovery, including the challenges such as NPs representing only ~5% of screened compounds in many datasets, interpretability issues in “black-box” models, and ethical concerns over bioprospecting in biodiverse regions.
Keywords: artificial intelligence, natural products, drug discovery, machine learning, virtual screening, de novo design
1. Introduction
1.1. Natural Products (NPs): A Historical Foundation of Medicine
Natural products (NPs), also referred to as bioactive small molecules originating from sources such as plants, animals, fungi, and microorganisms, have constituted the cornerstone of medical practice for thousands of years [1]. Originating from traditional practices such as Ayurveda, Traditional Chinese Medicine, and ethnopharmacological traditions of indigenous communities, these biologically derived compounds have consistently yielded structurally diverse molecules with significant therapeutic potential [2]. The earliest recorded application of natural therapeutics can be traced to ancient Mesopotamia, where cuneiform tablets dated to approximately 2600 B.C. document the medicinal use of cypress and myrrh oil for the treatment of various disorders [3]. This longstanding empirical knowledge has been continuously integrated into modern pharmacology. Prior to the advent of combinatorial chemistry and high-throughput screening platforms, over 80% of clinically used drugs were either directly derived from NPs or structurally inspired by them [4]. Comprehensive studies reveal that from 1981 to 2019, approximately one-third of small-molecule drugs approved by the U.S. Food and Drug Administration were either natural products or their direct derivatives. This proportion increases to nearly 50% when including synthetic compounds designed based on natural product frameworks [1]. This historical dependence highlights the unparalleled potential embedded in nature’s chemical diversity, refined through an evolutionary process over time to interact selectively and effectively with human biological systems [5].
Several transformative discoveries illustrate the foundational role of NPs in shaping contemporary medicine. Morphine, an opioid alkaloid isolated from the opium poppy (Papaver somniferum) in 1803, is a prototypical example. The morphine scaffold has enabled the development of more than 70 structurally related therapeutics, including codeine, widely employed as an antitussive agent, and naltrexone, a key intervention for opioid dependence [6]. Likewise, the serendipitous discovery of penicillin by Alexander Fleming in 1928, originating from the Penicillium fungus, heralded the antibiotic era and irreversibly transformed medical practice [7]. Penicillin and, more importantly, its derivatives rapidly became indispensable for the treatment of previously fatal bacterial infections, reducing mortality from pneumonia and other respiratory illnesses while improving the safety of surgical interventions [7].
The contribution of NP-derived therapeutics further extends to oncology and infectious diseases. Paclitaxel (Taxol), a diterpenoid isolated in the 1960s from the bark of the Pacific yew tree, emerged as a paradigm-shifting anticancer agent [8]. By stabilizing microtubules and thereby blocking mitotic progression, paclitaxel provided an innovative therapeutic option for ovarian cancer, particularly in drug-resistant cases [9]. Similarly, the discovery of artemisinin from Artemisia annua in 1972 revolutionized malaria treatment [10]. Artemisinin’s potent antimalarial activity prompted the World Health Organization to endorse artemisinin-based combination therapies (ACTs) as the global standard of care against drug-resistant malaria. Collectively, these landmark advances underscore the intrinsic strengths of NPs, namely their remarkable structural diversity, pronounced biological activity, and novel modes of action, which enable modulation of multiple molecular pathways and provide unique scaffolds for drug discovery that are often absent from synthetic compound libraries [11].
Overall, new molecular entities (NMEs), structurally unique pharmaceutical compounds that are significantly different from their existing counterparts, have an exceedingly low approval rate by the U.S. Food and Drug Administration (FDA), with only approximately 0.01% of candidates achieving market authorization [12].
1.2. Bottlenecks of Traditional NP Drug Discovery
Despite the long-standing success of natural products (NPs) in drug discovery, the pharmaceutical sector experienced a pronounced decline in NP-driven research beginning in the 1990s [13]. This transition was not attributable to a lack of therapeutic promise but rather to a set of inherent technical and logistical challenges that rendered the conventional discovery process inefficient, costly, and often unpredictable. These obstacles diminished the competitiveness of NP research relative to the emerging high-throughput screening and combinatorial chemistry platforms, which at the time offered the prospect of rapid and scalable drug discovery [13,14].
Among the most formidable hurdles is the technical complexity associated with NP isolation and characterization [1]. Natural extracts, particularly from plants, represent chemically intricate matrices comprising diverse primary and secondary metabolites, which makes the purification of an individual bioactive compound a labor-intensive, time-demanding, and resource-intensive endeavor [14]. Moreover, the sophisticated architectures of many NPs, frequently comprising cyclic or semi-rigid scaffolds enriched with multiple stereogenic centers, pose significant challenges for structural elucidation as well as for reproducible large-scale synthesis [15]. Such structural intricacy often correlates with suboptimal physicochemical properties; many otherwise promising NPs exhibit poor solubility, chemical instability, and inadequate bioavailability, all of which present critical barriers to successful clinical translation [16]. These challenges are summarized in Figure 1.
Figure 1.
Challenges in the development of natural product-derived agents: from source complexity to safety and synthetic hurdles.
Another recurrent limitation in NP research is the need for dereplication, the early recognition of compounds already described in the scientific literature so that they are not inadvertently re-isolated. Such redundant re-isolation not only consumes substantial resources but also delays the identification of genuinely novel chemical entities [17]. Furthermore, while NPs represent a rich reservoir of pharmacologically active molecules, they may also manifest intrinsic toxicities. For instance, hepatotoxicity has been documented for comfrey, whereas Hypericum perforatum (St. John’s Wort) is associated with clinically significant drug–drug interactions [18]. Such risks necessitate rigorous toxicological evaluation, thereby introducing additional complexity, time, and cost into the development pipeline. Finally, the sustainability of NP-based drug discovery remains a pressing concern. The restricted availability of certain rare or endangered species, coupled with the typically low yields of bioactive constituents, presents profound ethical and ecological challenges, impeding the scalability and feasibility of clinical production [1].
1.3. AI: The Catalyst for a New Paradigm
The decline of natural product (NP)-oriented research in the late 20th century reflected a rational response to the prevailing economic and technological constraints of the pre-artificial intelligence (AI) era. However, the emergence of AI has fundamentally altered this landscape, acting as a powerful catalyst for the resurgence of interest in nature’s chemical space [19,20]. The analytical and predictive capabilities of AI, particularly through machine learning (ML) and deep learning (DL) frameworks, provide the computational infrastructure required to address the very barriers that previously hindered NP research [19]. By enabling rapid data processing, robust predictive modeling, and even the de novo design of novel molecular architectures, AI substantially accelerates the discovery process [20,21]. The integration of AI into NP drug discovery thus represents more than an incremental methodological refinement; it constitutes a transformative operational and conceptual paradigm shift that restores both the economic feasibility and scientific attractiveness of NP exploration [19,20]. The following table (Table 1) consolidates this central argument by mapping the principal historical challenges of NP drug discovery to contemporary AI-driven strategies and methodologies, thereby illustrating the stimulating potential of AI in revitalizing this domain.
Table 1.
Artificial intelligence-driven solutions to traditional bottlenecks in natural product drug discovery: key technologies and references.
| Traditional Bottleneck | AI-Driven Solution | Key AI Technology | Ref. |
|---|---|---|---|
| Time-consuming isolation and characterization | Spectral data analysis, automated workflow | Deep Neural Networks and Computer Vision | [19] |
| Dereplication, redundant discovery | AI-powered databases, classification and clustering | Unsupervised Learning (e.g., K-means) | [19] |
| Poor druggability (solubility and bioavailability) | In silico ADMET prediction | Graph Neural Networks, QSAR models | [22] |
| Limited supply and low yields from source | Biosynthetic engineering via in silico design | Reinforcement Learning, Generative Models, Variational Autoencoders | [23] |
| Inadequate understanding of mechanisms | Multi-Omics data integration and Network analysis | Deep (Reinforcement) Learning and Knowledge Graphs | [24] |
2. AI in Initial Discovery and Identification of NP Leads
2.1. Unearthing NPs Through Omics Mining and Textual Data
Artificial intelligence (AI) is reshaping the early stages of drug discovery by enabling systematic interrogation of the vast, largely unexplored chemical diversity encoded within natural sources. This transformation is driven primarily through two complementary strategies: omics-based mining and the computational analysis of traditional knowledge. The advantage of AI in this context extends well beyond acceleration; its core strength lies in the ability to integrate and interpret heterogeneous, multimodal datasets at scales unattainable by conventional human-driven analysis, thereby generating a more comprehensive and interconnected view of the natural product (NP) landscape. One of the most transformative applications has been the exploration of the so-called “microbial Pandora’s box” through multi-omics integration, encompassing genomics, proteomics, and metabolomics [25]. Within this domain, genome mining leverages AI to identify biosynthetic gene clusters (BGCs) embedded in genomic sequences, molecular blueprints that encode secondary metabolite biosynthesis. Advanced tools such as DeepBGC have significantly outperformed traditional rule-based methods, achieving predictive accuracies of approximately 80% compared with only 60% for earlier approaches [26]. The true power of AI emerges when these disparate omics layers are integrated. Computational platforms such as NPLinker and GNPS directly connect genomic BGCs with mass spectrometry (MS)-derived metabolomic profiles, effectively linking biosynthetic potential to empirically observed chemical outputs [27]. Such integrative analyses uncover previously inaccessible relationships between genetic capacity and metabolite production, representing a major leap beyond what was previously achievable.
Similarly, a team of researchers at Westlake University, China, has developed ShennongAlpha, an AI-driven, web-based molecule-sharing platform for the intelligent management, acquisition, and translation of compounds extracted or reported from natural medicine [28]. As of the last reported update, the platform holds 14,593 compound/phytochemical entries [29]. The platform includes Shennong Dialog and Shennong Nomenclature and uses NMT-CPT (Neural Machine Translation with Coreferential Principal Term) to enable standardized translation between Chinese and English, eliminating language barriers. It also generates standardized systematic names and automatically produces a graph that links molecules to their associated diseases [29].
By systematically extracting association rules between canonical biosynthetic pathways and their corresponding chemical structures from publicly available resources such as MIBiG, computational frameworks like antiSMASH have become indispensable in natural product research. To date, thousands of putative biosynthetic gene clusters (BGCs) have been detected across microbial genomes and subsequently cataloged in public repositories. The functional interpretation and novelty assessment of these predicted BGCs require comparative analysis against a reference set of experimentally validated clusters of known activity. To address this need, the Minimum Information about a Biosynthetic Gene cluster (MIBiG) standard and repository was launched in 2015 to provide a structured framework for the storage and curation of characterized BGCs [30]. The release of MIBiG 2.0 introduced significant improvements to its infrastructure, data content, and accessibility, incorporating 851 newly curated BGCs over the past five years. Furthermore, extensive expert-driven manual curation has substantially enhanced the accuracy of functional annotations, thereby enabling the development of comprehensive predictive pipelines that bridge gene sequences to their respective molecular products [31]. These advances underscore the capacity of machine learning approaches to integrate multi-omic datasets, including genomic, transcriptomic, and metabolomic information, for the identification and prioritization of novel drug targets.
In parallel, AI-driven natural language processing (NLP) and Large Language Models (LLMs) have emerged as transformative tools for extracting knowledge embedded within vast repositories of unstructured text [32]. These repositories encompass diverse sources, such as ancient manuscripts, ethnobotanical records, and contemporary scientific literature [33]. By parsing and structuring this heterogeneous data, NLP systems can systematically catalog medicinal plant species, their historical therapeutic uses, and reported pharmacological effects [34,35]. What was once a painstaking manual process has now become scalable and data-driven. Critically, when this textual knowledge is cross-referenced with chemical and omics datasets, AI enables prioritization of high-value NP candidates for experimental validation [23]. This creates a powerful bidirectional feedback system: insights from traditional knowledge direct AI-based exploration toward promising chemical space, while multi-omics validation provides mechanistic grounding by identifying and characterizing the bioactive compounds responsible for the reported effects.
Traditional Chinese Medicine (TCM), grounded in concepts such as “Qi” and the yin–yang equilibrium, has been practiced for centuries through modalities including acupuncture, herbal therapy, and dietary interventions [36,37]. The complexity of its multi-herb, multi-compound formulations has traditionally limited mechanistic understanding. Recent advances in artificial intelligence (AI) now enable systematic elucidation of bioactive constituents and therapeutic mechanisms using data mining, pattern recognition, and predictive modeling, thereby reframing TCM within the context of systems pharmacology and network medicine [38,39].
A key development in this field is the TCMBank database, established by CHEN Yuqian’s team, which integrates 9192 herbal medicines, 61,966 unique ingredients, 15,179 targets, and 32,529 diseases, transitioning TCM research from experience-based practice to data-driven discovery [40]. TCMBank addresses three major challenges: (i) a human–machine annotation system that improves curation efficiency 17-fold; (ii) multi-source heterogeneous data fusion via deep transfer learning to unify terminologies from classical texts; and (iii) AI-assisted models that reveal mechanisms of complex formulas and support applications such as drug–target prediction, lead compound design, safety evaluation of Chinese–Western medicine combinations, retrosynthetic analysis, and vaccine development [41,42].
The application of AI to many African, Amazonian, or Indigenous traditions highlights the fundamental challenges of data scarcity, shifting its primary role from high-level prediction to intelligent data curation and preservation [43]. Tools are being developed to systematically document plant use (e.g., the UmzimbaOmhle app for South African plants), with the aim of constructing structured, machine-readable datasets that can be used for future predictive discovery [44].
Discovery in the modern frontier of marine and microbial metabolites is fueled by dedicated omics databases; for example, the microbial database SBC and the marine antimicrobial database AntiMarin. AI models trained on these resources can be used to predict novel bioactive structures from genomic or metabolomic data, effectively mining the chemical innovations of entire ecosystems [45].
By integrating AI with TCM and linking traditional knowledge to multi-omics and systems pharmacology, this paradigm establishes a powerful translational framework for precision medicine and accelerates drug discovery.
2.2. Accelerating Characterization and Dereplication of NPs
Following the acquisition of a natural extract, artificial intelligence (AI) has markedly accelerated the traditionally laborious process of structural elucidation. Contemporary AI and machine learning (ML) algorithms are now seamlessly integrated with advanced analytical platforms, including nuclear magnetic resonance (NMR), high-performance liquid chromatography–mass spectrometry (HPLC–MS), and gas chromatography–mass spectrometry (GC–MS), thereby enhancing the speed, precision, and interpretability of experimental data [19,46,47,48]. This synergistic integration has transformed structural analysis into a significantly more efficient workflow, enabling the practical implementation of high-throughput screening pipelines for natural products (NPs) [19]. Notable progress has been demonstrated through the development of deep learning-based frameworks. For instance, DN-Unet, a deep neural network, has been shown to improve the signal-to-noise ratio of NMR spectra by over 200-fold, recovering weak spectral peaks that are otherwise masked by noise [49]. In parallel, the DP4-AI and DP5-AI platforms automate the analysis and assignment of raw NMR data, achieving processing speeds up to 60 times faster than conventional manual approaches while significantly reducing dependence on expert interpretation [50]. These advances enable the rapid structural identification of compounds within complex mixtures, thereby minimizing the necessity for extensive and resource-intensive purification procedures.
AI-driven virtual screening (AI-VS) strategies are generally classified into ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS). LBVS leverages structure–activity relationships to predict new bioactive compounds using graph neural networks for three-dimensional molecular feature extraction (AUC > 0.90), geometric deep learning to optimize pharmacophore models, and Transformer-based architectures for ADMET prediction. SBVS, in turn, utilizes three-dimensional target structures for precise molecular interaction modeling, with advanced docking algorithms such as DiffDock addressing limitations in conformational sampling [51,52].
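The ligand-based idea can be illustrated with a far simpler stand-in than the graph neural networks cited above: ranking a library by fingerprint similarity to a known active. The sketch below is only a minimal illustration under stated assumptions. The fingerprints are hypothetical toy sets of "on" bit indices, the compound names are invented, and Tanimoto similarity is swapped in here as a classical LBVS baseline rather than any method reported in the cited studies.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_library(
    query: Set[int], library: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Rank library compounds by similarity to a known active, best first."""
    scored = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints: each integer is the index of a set bit.
known_active = {1, 4, 7, 9, 12}
library = {
    "cmpd_A": {1, 4, 7, 9, 13},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 7, 9, 12},
}

ranking = rank_library(known_active, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the bit sets would come from a cheminformatics toolkit, and a learned model would typically replace the fixed similarity function; the ranking step itself stays the same.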
To facilitate natural product research, the HERB database (Ben Cao Zu Jian) was established as a large-scale TCM resource linking Chinese herbal medicines with modern pharmacology. Through the reanalysis of 6164 gene expression profiles from 1037 experiments and integration with CMap, HERB mapped TCM ingredients to 2837 modern drugs. Additionally, curated datasets linked 12,933 targets and 28,212 diseases to 7263 medicinal materials and 49,258 compounds across six types of pairwise relationships [53,54,55]. While HERB provides a powerful platform for TCM modernization and rational drug development, limitations remain in data coverage and novel toxicity prediction, highlighting the need for multi-omics integration and causal inference frameworks [54].
One of the most persistent bottlenecks in NP discovery is dereplication, the task of recognizing previously known molecules early enough to avoid their repeated re-isolation; when it fails, the result is considerable inefficiency and redundancy [56]. AI directly addresses this limitation through sophisticated pattern recognition, classification, and clustering methodologies that can rapidly compare newly generated spectral profiles against curated reference databases [56,57]. For example, unsupervised learning algorithms such as K-means clustering can organize structurally related compounds, facilitating a clearer assessment of chemical diversity within libraries and prioritizing molecules with genuine novelty for downstream investigation [58]. Complementary tools, such as NaturePred, employ natural language processing (NLP)-based approaches to predict NP classes with high accuracy, further optimizing dereplication and candidate prioritization [59].
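How K-means supports this kind of prioritization can be sketched in a few lines. The example below is a minimal, dependency-free version of Lloyd's algorithm applied to hypothetical two-dimensional descriptor vectors; the data points, the choice of initial centroids, and the `known_indices` set of compounds assumed to be present in a reference database are all illustrative assumptions, not values taken from the cited work.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points: Sequence[Point]) -> Point:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def kmeans(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    max_iter: int = 100,
) -> List[int]:
    """Lloyd's algorithm: return a cluster label for every point."""
    centroids = list(initial_centroids)
    labels: List[int] = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    return labels


# Hypothetical scaled descriptors for six isolated compounds.
compounds: List[Point] = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]
labels = kmeans(compounds, initial_centroids=[compounds[0], compounds[3]])

# Hypothetical: compounds 0-2 already match entries in a reference database.
known_indices = {0, 1, 2}
novel_clusters = set(labels) - {labels[i] for i in known_indices}
print(labels, novel_clusters)
```

Clusters that contain no known reference compound are the ones a dereplication workflow would forward for further study. Real pipelines usually rely on a tested library implementation and on spectral or fingerprint features rather than two toy descriptors.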
These breakthroughs in molecular semantic vectorization, 3D structure–activity modeling, and accurate free energy prediction have systematically increased success rates in active compound identification. By advancing beyond conventional library-matching techniques, AI establishes a more intelligent, scalable, and resource-efficient strategy for identifying structurally novel and pharmacologically promising natural product leads [60].
2.3. Translational Pathways and Regulatory Considerations for AI in NP Discovery
The translation of AI-driven NP discovery from academic research to a component of the regulated drug development process necessitates rigorous engagement with regulatory science, robust validation, and honest assessment of translational readiness. Regulatory bodies like the FDA and EMA currently provide guiding principles rather than prescriptive rules for AI usage in discovery, emphasizing transparency, scientific rigor, and robustness [61]. For instance, the FDA’s discussion paper on AI/ML usage in drug development highlights the importance of creating a “predetermined change control plan” for models that learn and adapt, which is directly relevant to active learning pipelines in NP optimization [62]. This aligns with the lifecycle approach to model validation advocated in guidelines like ICH Q9, which moves beyond one-time testing to ongoing performance monitoring and management [63].
Consequently, validation strategies must evolve. Beyond reporting cross-validation accuracy, models intended for decision support must undergo prospective validation by using a locked version on external, blinded datasets that simulate real-world use. Furthermore, documenting the model’s “applicability domain”—the chemical and biological space within which its predictions are reliable—is crucial to prevent spurious extrapolation to novel NP scaffolds outside the distribution of the training data. A practical framework for assessing maturity is the Technology Readiness Level (TRL). While most AI-for-NP tools reside at a TRL of 3–4 (experimental proof-of-concept), achieving a TRL 6–7 (prototype validation in a relevant industrial environment) requires demonstrating interoperability with existing lab informatics systems, reproducibility across batches, and a tangible impact on key metrics [64].
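One of the simplest ways to make an applicability domain explicit is a descriptor-range (bounding-box) check: a query compound is considered in-domain only if each of its descriptors falls within the range spanned by the training set. The sketch below assumes hypothetical descriptor values (molecular weight and logP) and an optional tolerance parameter introduced here for illustration; more refined definitions based on distance or density are common, and nothing in the text above prescribes this particular one.

```python
from typing import List, Sequence, Tuple


def descriptor_ranges(
    training: Sequence[Sequence[float]],
) -> List[Tuple[float, float]]:
    """Minimum and maximum of each descriptor across the training set."""
    return [(min(column), max(column)) for column in zip(*training)]


def in_domain(
    query: Sequence[float],
    ranges: Sequence[Tuple[float, float]],
    tolerance: float = 0.0,
) -> bool:
    """True if every descriptor lies within the (optionally widened) range.

    `tolerance` widens each range by that fraction of its width; it is an
    illustrative knob, not a standard parameter.
    """
    for value, (low, high) in zip(query, ranges):
        margin = tolerance * (high - low)
        if value < low - margin or value > high + margin:
            return False
    return True


# Hypothetical training descriptors: (molecular weight, logP).
training_set = [(250.0, 1.2), (310.0, 2.5), (420.0, 3.8)]
ranges = descriptor_ranges(training_set)

typical_query = (300.0, 2.0)      # inside the training ranges
macrocycle_query = (850.0, 6.1)   # far outside them

print(in_domain(typical_query, ranges))     # inside
print(in_domain(macrocycle_query, ranges))  # outside
```

A prediction for an out-of-domain scaffold would then be flagged as an extrapolation rather than silently reported alongside in-domain results.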
The journey from a published algorithm to an adopted tool is bridged by addressing these translational gaps. This includes developing standardized formats for NP-omics data, benchmarking challenges under controlled conditions, and fostering pre-competitive collaborations to validate tools on proprietary industry datasets. Ultimately, rather than constraints, these frameworks should be seen as catalysts for building trustworthy, impactful, and scalable AI solutions that can reliably contribute new natural product-based therapies [28]. Further details are provided in Figure 2.
Figure 2.
AI-powered strategies for addressing natural product drug development challenges and enabling rational molecule design.
3. Preclinical Development: From Target Engagement to Lead Optimization
3.1. Predicting Targets and Mechanisms of Action
In the preclinical phase, artificial intelligence (AI) is redefining drug discovery by transforming it from a predominantly empirical, trial-and-error process into a systematic, data-driven endeavor, wherein the interactions of candidate compounds with biological systems are computationally inferred. This stage, referred to as target deconvolution, presents particular challenges in the context of natural products (NPs), which frequently exhibit pleiotropic or multi-target activities as a consequence of their evolutionary adaptations [65]. AI methodologies are uniquely suited to address this complexity by integrating multi-omics datasets with network-based analytical frameworks to elucidate novel therapeutic targets and pathways [66,67]. This provides a more holistic characterization of a compound’s mechanism of action (MoA), extending beyond the identification of a single, discrete molecular target.
To facilitate this process, several AI-driven platforms have been developed. For instance, the SPiDER (self-organizing map-based prediction of drug equivalence relationships) algorithm can identify putative molecular targets by analyzing the physicochemical properties of a compound and mapping them against those of known drugs, even in cases lacking strong structural similarity [68]. Likewise, the STARFish (stacked ensemble target fishing) framework employs ensemble learning strategies to predict the interactions of small molecules with a broad spectrum of targets, with particular utility in NP target identification [69]. These computational approaches permit the generation of in silico hypotheses regarding a compound’s MoA, thereby streamlining the prioritization of candidates for subsequent experimental validation, which is often resource-intensive and time-consuming [70].
In one application, a multimodal machine learning framework was used to identify anti-Alzheimer’s disease (AD) compounds within complex TCM formulations. Four deep neural network (DNN) models, trained at both the disease and target levels (acetylcholinesterase, monoamine oxidase-A, and 5-HT6 receptors), successfully predicted candidate compounds, which were experimentally validated at the enzymatic, cellular, and animal levels. Molecules such as 2,4-di-tert-butylphenol and elemene exhibited strong inhibitory effects on AD targets, while compounds including α-asarone penetrated the blood–brain barrier and enhanced microglial β-amyloid clearance, confirming the therapeutic potential of AI-driven predictions [71,72,73,74,75,76,77]. Similarly, a team from the School of Pharmacy, Fudan University integrated geometry-aware deep learning with biological validation to screen over 300,000 natural products, identifying bifunctional compounds that simultaneously regulated lipid membranes and targeted Glut1. These were incorporated into a liposome-based delivery system, improving tumor targeting and therapeutic efficacy in preclinical models [78].
3.2. Assessing Bioactivity, ADMET, and Toxicity Profiles
One of the principal factors underlying the high cost and low success rates of conventional drug development is the substantial attrition of candidate molecules during both preclinical and clinical evaluations, largely attributable to unfavorable pharmacokinetic characteristics or toxicological liabilities [79]. Artificial intelligence (AI) is an efficient and cost-effective tool for addressing this issue, enabling the large-scale in silico screening of compound libraries for their Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) profiles; this substantially decreases the number of molecules that require synthesis and experimental testing, thereby streamlining and accelerating the overall drug development pipeline [80].
A representative example of this approach is ADMET-AI, a web-based platform that leverages the Chemprop-RDKit graph neural network architecture to predict 41 distinct ADMET parameters with high accuracy. This system allows thousands of candidate compounds to be evaluated rapidly, benchmarking their predicted pharmacokinetic and safety properties against those of approved drugs cataloged in resources such as DrugBank, and thus providing a contextual framework for assessing both safety and druggability [81,82,83].
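The benchmarking step described above can be thought of as a percentile calculation: a candidate's predicted value is placed within the distribution of the same property across a reference set of approved drugs. The sketch below illustrates that idea only. The function name, the reference values, and the candidate value are hypothetical, and the code is not the platform's actual implementation.

```python
from bisect import bisect_right
from typing import Sequence


def percentile_rank(value: float, reference: Sequence[float]) -> float:
    """Percentage of reference values that are less than or equal to `value`."""
    if not reference:
        raise ValueError("reference set must not be empty")
    ordered = sorted(reference)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical predicted values of one ADMET endpoint for ten approved drugs.
approved_drug_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# Hypothetical prediction for a natural product candidate.
candidate_value = 0.35

rank = percentile_rank(candidate_value, approved_drug_values)
print(f"Candidate sits at the {rank:.0f}th percentile of approved drugs")
```

Reporting a percentile alongside the raw prediction gives the contextual framing mentioned above: a value that looks extreme in isolation may be unremarkable among marketed drugs, and vice versa.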
Qi Yang et al. developed a machine learning model for hepatotoxicity prediction, validated with 56 chemical constituents of Gardenia jasminoides [84]. Their results revealed the dualistic nature of its hepatotoxic components, which exert therapeutic benefits at specific doses while inducing toxicity at others. This work highlights the pivotal role of artificial intelligence (AI) in ADMET prediction, as it enables focused experimentation, reduces clinical attrition rates, and lowers drug development costs [85].
Building on this, Prof. Cao Dongsheng’s team has made substantial contributions to small-molecule drug-likeness prediction and lead optimization, advancing ADMET modeling and drug discovery paradigms. A notable example is the ADMETlab platform, which has evolved from ADMETlab (2018) to ADMETlab 2.0 (2021) and, most recently, ADMETlab 3.0 (2024) [86,87,88]. The latest version integrates multi-task directed message-passing neural networks (DMPNNs) to predict 119 ADMET endpoints with enhanced accuracy and robustness, thereby facilitating early-stage screening. With over 3.95 million cumulative uses, the ADMETlab series is now the most widely adopted online platform for drug-likeness prediction [87,89].
In parallel, drug–drug interactions (DDIs) remain a major concern for clinical safety [86], yet existing databases are limited by incomplete coverage, inconsistent evidence hierarchies, and insufficient clinical decision support. To address these limitations, DDInter 2.0 was developed as a comprehensive upgrade to the original DDInter [89,90]. It expands interaction coverage; incorporates drug–food and drug–disease interactions; and, for the first time, integrates therapeutic duplication data. Its enhanced search capabilities and intuitive visualization tools improve the interpretability and applicability of complex interaction profiles, making it a valuable resource for clinicians and researchers that can support safer prescription practices and more informed drug development. Separately, the CSM (Cutoff Scanning Matrix) methodology is a computational approach to biological prediction that encodes the structural and chemical signatures of protein inter-residue distance patterns as feature vectors, which have been used to predict enzyme functions [91] and synergistic anticancer drug combinations.
Importantly, AI-driven tools such as ProTox-II also play a pivotal role in predicting the toxicological potential of natural products, which often display a dual nature as both pharmacologically active agents and potential toxicants [92,93]. By facilitating the early identification of compounds with favorable ADMET and toxicity profiles, AI enables the most promising candidates to be prioritized, ultimately enhancing the likelihood of success in downstream stages of drug development [94]. Similarly, several tools have proven to be helpful for predicting bioactivity and synergy between molecules and targets [95,96,97].
3.3. Lead Optimization and De Novo Design
Arguably the most innovative and transformative application of artificial intelligence (AI) in preclinical research lies in its capacity to design and optimize molecules de novo. This directly addresses long-standing challenges associated with natural products (NPs), including their limited availability and inherently low yields [98]. Rather than depending on nature to provide optimal bioactive scaffolds, AI-driven methodologies can generate entirely novel chemical entities with predefined characteristics, which can subsequently be optimized to enhance their efficacy, selectivity, and ADMET properties [99]. OptADMET is a web-based platform that can improve the ADMET properties of compounds through substructure modification; it contains 41,779 validated modification rules from 177,191 experimental datasets and an additional 146,450 rules from the molecular prescriptions of 239,194 datasets, covering 32 properties [100].
This advancement has been facilitated by generative AI frameworks such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). By learning the latent chemical rules embedded in existing molecular datasets, they can subsequently generate novel, synthetically feasible compounds [101,102,103]. Beyond initial generation, AI supports key optimization strategies, including “group modification,” in which small, targeted structural alterations are introduced, and “scaffold hopping,” where entirely new scaffolds are created while preserving the intended biological activity [104,105].
Reinforcement learning (RL) represents a particularly powerful paradigm within molecular design. In this framework, a generative agent is trained to propose novel chemical structures and is subsequently rewarded or penalized according to how closely the generated molecules satisfy pre-specified pharmacological or physicochemical criteria [106]. This closed-loop, iterative refinement enables the agent to systematically explore chemical space and progressively improve the quality of proposed candidates. Such an approach signifies a paradigm shift in preclinical development, replacing the conventional linear, trial-and-error methodology with an iterative, goal-oriented design cycle [107].
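The reward-driven loop described above can be illustrated with a deliberately small sketch. It is not the method of any cited study: the fragment names, their property contributions, the target value, and the training settings are all hypothetical, and a three-way softmax policy trained with the REINFORCE update stands in for a real generative model.

```python
import math
import random

# Hypothetical building blocks with a made-up property contribution each.
FRAGMENTS = {"A": 0.5, "B": 1.5, "C": 4.0}
TARGET = 3.0  # assumed desired property value for a two-fragment "molecule"


def reward(molecule):
    """Higher (closer to zero) when the summed property is near the target."""
    total = sum(FRAGMENTS[f] for f in molecule)
    return -abs(total - TARGET)


def softmax(logits):
    m = max(logits.values())
    exps = {k: math.exp(v - m) for k, v in logits.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}


def train(steps=2000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline over fragment choices."""
    rng = random.Random(seed)
    names = list(FRAGMENTS)
    logits = {n: 0.0 for n in names}
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        # The agent proposes a candidate by sampling two fragments.
        molecule = rng.choices(names, weights=[probs[n] for n in names], k=2)
        r = reward(molecule)
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)
        # Policy-gradient step: raise the log-probability of rewarded choices.
        for chosen in molecule:
            for n in names:
                grad = (1.0 if n == chosen else 0.0) - probs[n]
                logits[n] += lr * advantage * grad
    return softmax(logits)


if __name__ == "__main__":
    print(train())
```

In this toy setting only the pair B+B meets the target exactly, so the policy is expected to shift its probability mass toward fragment B as training proceeds. A practical system would replace the scalar reward with predicted potency, ADMET, and synthesizability scores.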
Stokes and colleagues employed a deep learning model, a directed message-passing neural network rather than a generative model, to identify the novel antibiotic Halicin, reportedly completing the workflow from virtual screening to in vitro validation within 48 h, nearly 100-fold faster than conventional approaches [108]. Halicin, structurally distinct from traditional antibiotics, exhibits broad-spectrum activity against multidrug-resistant pathogens, including Mycobacterium tuberculosis and carbapenem-resistant Enterobacteriaceae. It demonstrated therapeutic efficacy in murine models of Clostridioides difficile and extensively drug-resistant Acinetobacter baumannii infections. By training the model on antibacterial activity prediction, the team screened multiple chemical libraries and identified Halicin from the Drug Repurposing Hub. Further, from 107 million molecules in the ZINC15 database, their framework identified eight structurally novel antibacterial agents, underscoring the potential of deep learning to expand the antibiotic arsenal through the discovery of noncanonical scaffolds [109,110]. This case is seminal because it demonstrated a target-agnostic, phenotype-first AI approach, successfully predicting a novel, synthetic chemotype with a distinct mechanism from a relatively small training set. Although Halicin is not a natural product, this case is highly instructive for NP discovery as it validates the power of AI to identify entirely novel scaffolds against pressing challenges like antibiotic resistance.
In the domain of natural products, quercetin emerges as the phytochemical most frequently associated with AI-based applications [111,112]. As a bioactive flavonol, it possesses strong antioxidant and anti-inflammatory properties, with therapeutic relevance for cancer, AIDS, hypertension, and diabetes [113]. In combination with kaempferol, quercetin has also demonstrated antiviral activity against SARS-CoV-2 [114,115,116]. Current research leverages AI to optimize plant-based extraction processes, design novel quercetin analogs, and construct predictive models for evaluating its antioxidant and anticancer activities.
Together, these examples illustrate the progression of deep learning and generative modeling from isolated “point breakthroughs” to systematic strategies for rational drug design. By integrating multimodal datasets, dynamic optimization algorithms, and mechanism-informed modeling, AI is transitioning drug discovery from empirical “trial-and-error” toward rational construction, enabling concurrent optimization of efficacy, safety, and developability [117].
Looking ahead, a fully integrated pipeline is plausible in which virtual design, robotic synthesis, and experimental feedback are linked without human intervention, creating a seamless, automated design-make-test-analyze cycle that maximizes both efficiency and innovation [19,57]. Table 2 consolidates the artificial intelligence tools and their applications in drug development, illustrating the potential of AI to revitalize this domain.
Table 2.
Artificial intelligence tools and their applications in the preclinical drug development pipeline: algorithms, functions, and references.
| Stage of Preclinical Pipeline | AI Tool/Model | Underlying Algorithm(s) | Function/Application | Applicability Domain and Notes | Validation Level | Typical Data Requirements | Ref. |
|---|---|---|---|---|---|---|---|
| Target Prediction | SPiDER, STarFish, TiGER | Self-organizing maps, Ensemble methods (RF, k-NN) | Identifies innovative molecules and their targets; predicts drug side effects and repurposing options | General small molecules; not specific to natural products. | Primarily in silico; some tools have limited experimental validation. | Chemical structures, bioactivity databases, and omics data. | [68,69,70] |
| ADMET Screening | ADMET-AI, PiscesCSM, ProTox-II | Graph Neural Networks, Machine learning models | Predicts absorption, distribution, metabolism, excretion, and toxicity; filters vast chemical libraries for druggability | General chemical libraries; applicability may vary with chemical space. | Mostly in silico; some tools benchmarked with experimental datasets. | Molecular descriptors, SMILES strings, and historical ADMET data. | [81,85,86,87,88,90] |
| Bioactivity and Synergy Prediction | CCSynergy, SynAI, SynPred | Deep Neural Networks (DNNs), QSAR | Predicts and ranks the biological activities of compounds; predicts synergistic interactions for combination therapies | General drug pairs; limited validation for natural product combinations. | Predominantly in silico; few tools validated in cell-based assays. | Dose–response matrices, drug structures, genomic profiles. | [95,96,97,98] |
| Lead Optimization and De Novo Design | GANs, VAEs, OptADMET | Generative AI, Reinforcement Learning | Designs novel molecules with optimized properties; refines molecular synthesis paths through iterative learning | General de novo design; may require tuning for natural product-like chemical space. | Proof-of-concept in silico; experimental validation rare. | Chemical libraries, property labels (e.g., solubility, potency). | [95,96,114] |
3.4. AI and NP-Based Drug Delivery Systems
As illustrated in Figure 3, in the field of advanced drug delivery systems, artificial intelligence (AI)-enabled strategies are increasingly employed to refine nanoparticle engineering by systematically optimizing parameters such as particle size, surface functionalization, and release kinetics. These optimizations enhance drug bioavailability while simultaneously reducing off-target effects and systemic toxicity [118]. AI-driven methodologies not only accelerate the drug development pipeline but also support the progression toward precision and personalized medicine. Effective drug delivery remains indispensable for optimizing the pharmacokinetic (PK) and pharmacodynamic (PD) characteristics of therapeutic agents, thereby ensuring improved clinical outcomes. In light of the escalating complexity and cost of new molecular entity (NME) development, the relevance of advanced delivery platforms has expanded significantly [119]. AI technologies now underpin an integrated workflow encompassing the input of disease- and drug-specific data, molecular and physicochemical property-based screening, AI-mediated predictive modeling, and the identification of optimized carrier systems [120,121]. This convergence of computational and experimental approaches holds considerable potential to increase the efficiency, accuracy, and scalability of drug delivery system (DDS) design [79].
Figure 3.
AI-driven in silico ADMET screening: optimizing drug candidate attrition and accelerating development pipeline.
Traditional liposomal formulations often exhibit inadequate tumor-specific accumulation, constraining their therapeutic efficacy [122]. Recent studies suggest that natural products with dual capabilities of lipid bilayer modulation and tumor-targeting activity can significantly enhance liposomal performance [123]. However, the experimental evaluation of over 300,000 natural product candidates presents formidable challenges in terms of time and resources. Addressing this limitation, researchers from Fudan University’s School of Pharmacy and Shanghai Jiao Tong University established a bidirectional deep learning-integrated platform coupled with experimental validation. This platform successfully identified compounds capable of lipid membrane regulation and glucose transporter 1 (Glut1)-targeting, leading to the development of bifunctional liposomal systems. These advanced nanocarriers demonstrated improved tumor selectivity and therapeutic efficacy in murine models, establishing a paradigm for intelligent DDS engineering via AI-guided prediction and experimental validation [78].
While many natural products possess potent bioactivity, their translation into therapies can be hindered by poor pharmacokinetics, off-target effects, or inability to reach intracellular sites of action. Adeno-associated virus (AAV) vectors, particularly with engineered capsids, represent a promising delivery platform for NP-derived modalities. For instance, AAV has been used to deliver genes encoding engineered nanobodies that mimic the activity of plant-derived cytotoxins [124], or to express biosynthetic enzymes for the local production of therapeutic terpenoids in animal models [125]. These approaches directly address the delivery challenges of classical NP small molecules. Parallel advances in AAV capsid engineering can further enhance the precision of such strategies. Companies like Dyno Therapeutics employ AI to design capsids with improved tissue specificity and delivery efficiency [126], a methodology that involves generating diverse capsid libraries and using algorithms to predict optimized sequences based on functional maps [127]. Together, the integration of NP-inspired therapeutic cargo with these advanced delivery vectors highlights a promising direction for advancing NP-based therapies.
Fei Li and colleagues employed AI-driven strategies combining combinatorial hydrogel libraries with machine learning to design self-assembling peptide hydrogels exhibiting tunable mechanical properties such as stiffness and elasticity. These hydrogels effectively encapsulate a broad spectrum of therapeutics, including nucleic acids and small molecules, expanding their translational potential [128]. In a related effort, Safa and Samar Damiati utilized artificial neural networks (ANNs) to optimize the drug-loading efficiency of poly(lactic-co-glycolic acid) (PLGA) nanoparticles. The optimized system produced monodisperse PLGA particles encapsulating indomethacin (IND) with controlled morphology, high encapsulation efficiency, and sustained release profiles achieving up to 80% release, underscoring the role of AI in rational nanoparticle design [129].
AI-driven innovation is increasingly embedded across the DDS research pipeline. Pharmaceutical companies such as Medicilon have developed dedicated AI-enabled platforms, including AiLNP, AiRNA, and AiTEM, that streamline nucleic acid screening, DDS optimization, and lipid nanoparticle formulation. These platforms substantially reduce developmental timelines and associated costs, thereby expediting clinical translation [107]. Furthermore, Insilico Medicine has leveraged graph neural networks (GNNs) to design PD-L1-targeting nanobodies with enhanced receptor-binding affinity, achieving improved tumor-specific accumulation [130]. Complementarily, ETH Zurich has developed an AI-assisted navigation system that integrates real-time ultrasonic imaging to precisely guide magnetic microspheres toward pancreatic tumors. This approach overcomes physiological barriers and enables highly localized therapeutic delivery, representing a promising advance in AI-orchestrated nanomedicine [131].
The integration of nanobodies—single-domain antibody fragments—into natural product (NP) research addresses two fundamental bottlenecks in the field: mechanism elucidation and translational delivery. Their unique biophysical properties, including their small size, high stability, and deep tissue penetration, make them uniquely suited for this role [132].
Nanobodies provide a crucial bridge between the bioactivity of natural products (NPs) and modern biologic therapeutics, a strategy exemplified by the development of NP-derived nanomedicines. Their application follows two principal pathways: First, they can serve as highly specific targeting moieties to deliver potent, cytotoxic NP-derived payloads (e.g., maytansinoid conjugates) directly to diseased cells. This approach, central to several antibody–drug conjugates (ADCs), minimizes systemic toxicity by enhancing tumor-specific accumulation [133]. Second, their hypervariable loops can be engineered to mimic the essential pharmacophore of an NP, creating a stable, protein-based equivalent. This “biologization” strategy transforms NPs with inherently poor drug-like properties into viable therapeutic candidates with superior pharmacokinetics [134,135]. The integration of NP-inspired pharmacophores with the targeting prowess of nanobodies aligns with and advances the emerging paradigm of targeted NP–biologic conjugates for next-generation nanomedicines.
4. AI in Clinical Translation and Post-Market Surveillance for NPs
4.1. Drug Repurposing of NPs
The exorbitant cost and extended timelines associated with de novo drug discovery have positioned drug repurposing as a transformative strategy in contemporary biopharmaceutical research. This approach, which seeks to identify novel therapeutic indications for existing pharmacological agents, markedly reduces development time and financial risk by capitalizing on established safety and pharmacokinetic data [136]. Artificial intelligence (AI) has revolutionized this traditionally serendipitous practice, converting it into a systematic, highly efficient, and data-driven methodology [135].
AI achieves this by employing advanced machine learning and deep learning models capable of integrating and analyzing vast, heterogeneous datasets encompassing genomic, proteomic, pharmacological, and clinical information [66]. These algorithms can discern subtle patterns and latent associations between chemical entities and biological targets that remain imperceptible to human analysis [137]. For instance, natural language processing (NLP) methods can systematically mine the scientific literature and patent databases, extracting semantic relationships that reveal previously unrecognized drug–disease connections [138]. Through such data-driven strategies, AI enables the rapid identification of repurposing candidates at a scope and scale unattainable by conventional approaches [139].
Natural products (NPs) represent particularly compelling candidates for repurposing owing to their long-standing safety records derived from traditional use, alongside their inherent pleiotropic and multi-target pharmacology, features especially advantageous for addressing complex, multifactorial pathologies [140,141].
The bioactivity of many NPs is critically dependent on their absolute stereochemical configuration. However, commonly used 2D molecular fingerprints fail to distinguish between different stereoisomers, potentially leading to the oversight of active compounds. Furthermore, NPs, especially macrocycles, often exhibit significant conformational flexibility. Traditional molecular docking methods may fail to accurately identify the optimal bioactive conformation for target binding, resulting in erroneously low scores and false negatives [64].
NPs are characterized by highly complex and diverse three-dimensional scaffolds. Most molecular descriptors were developed for structurally “flatter” synthetic drug-like molecules and are often inadequate for accurately capturing the unique structural features of NPs [142]. This underscores the need to develop NP-tailored molecular representations or to adopt models like Graph Neural Networks, which can learn directly from molecular graphs.
Predominant VS compound libraries are primarily composed of synthetic molecules. Evaluating NPs within this context creates an unfair comparison and biases the screening process toward familiar chemical archetypes. Effective NP-focused VS requires the use of specialized, NP-centric databases [132]. However, these databases are typically smaller in scale and sparser in bioactivity annotations, presenting a significant bottleneck for training robust AI models. Transfer learning—pre-training models on large-scale synthetic data followed by fine-tuning with limited NP data—is currently an effective strategy to mitigate this data scarcity.
AI’s role extends beyond therapeutic compound optimization to deciphering the fundamental biosynthesis of complex NPs. A prime example is the study of saxitoxin, a potent marine neurotoxin. Researchers employed AI-driven comparative genomics to predict its biosynthetic gene clusters across dinoflagellate species and model the evolutionary trajectory of its pathway [143]. This case highlights AI’s power in transforming genomic data into testable hypotheses about NP origin and diversification, forming a knowledge foundation for future bioengineering and discovery. A multi-step AI workflow combining network pharmacology, deep learning-based docking, and molecular dynamics simulations identified active anti-fibrotic flavonoids from a Traditional Chinese Medicine formula, demonstrating its efficacy through a mechanistic framework [144]. This represents a systems pharmacology approach which links complex mixtures to molecular targets.
Notably, AI-based approaches have been employed to predict novel therapeutic applications for compounds such as quercetin, a plant-derived flavonol with well-documented antioxidant and anti-inflammatory properties, in the treatment of conditions such as COVID-19 and cancer. Moreover, AI can forecast potential synergistic interactions between natural compounds, thereby informing the rational design of combination therapies [90,145,146,147]. These cases confirm that AI is a versatile tool, but underscore that success depends on high-quality training data and efficient experimental validation. They illustrate different strategic applications (phenotypic screening, lead optimization, and systems pharmacology), which provide a roadmap for integrating AI into specific stages of NP-based drug discovery. Collectively, these advancements underscore the untapped potential of NPs as a fertile resource for AI-driven drug repurposing initiatives.
4.2. Personalized Phytotherapy and Precision Medicine
The overarching objective of artificial intelligence (AI) in medicine is to facilitate a paradigm shift from conventional “one-size-fits-all” therapeutic strategies toward individualized treatment approaches. Such a framework, commonly referred to as personalized or precision medicine, seeks to deliver more effective interventions with reduced adverse effects by accounting for a patient’s unique genetic background, clinical history, and lifestyle factors [148]. Within the domain of natural products (NPs), AI is catalyzing the emergence of a new paradigm of “personalized phytotherapy” [20,149].
AI achieves this by integrating and analyzing heterogeneous patient-derived datasets, including electronic health records (EHRs), genomic profiles, wearable sensor outputs, and multi-omics information [150,151]. Through advanced computational modeling, AI algorithms can delineate individual molecular signatures and forecast patient-specific therapeutic responses, thereby enabling optimization of dosage regimens and the development of targeted treatment strategies [78,152].
An illustrative application can be found in Traditional Chinese Medicine (TCM), which has long emphasized a multiparametric evaluation of patient conditions [149,153]. Contemporary AI systems are being deployed to enhance this process by systematically analyzing symptomatology, medical history, and biometric parameters to recommend individualized herbal formulations and acupoint prescriptions, thus aligning ancient therapeutic principles with modern data-driven precision frameworks. Moreover, the advent of decentralized AI methodologies, such as federated learning, is addressing critical challenges related to data privacy and security [154,155]. These approaches allow predictive models to be trained across large-scale, distributed datasets without necessitating the exchange of raw patient information, thereby safeguarding confidentiality while maintaining analytical robustness.
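The federated training scheme mentioned above can be sketched in a few lines. This is a minimal illustration of federated averaging on a one-parameter linear model; the client datasets, learning rate, and number of rounds are hypothetical, and a production system would add secure aggregation and far richer models.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one site's private data.

    data: list of (x, y) pairs that never leave the site.
    The model is y ~ w * x with a mean-squared-error loss.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(clients, rounds=50, w0=0.0):
    """Server loop: broadcast w, collect local weights, average by sample count."""
    w = w0
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        local = [(local_update(w, d), len(d)) for d in clients]
        # Only model parameters are shared with the server, never raw records.
        w = sum(wi * n for wi, n in local) / total
    return w


# Three hypothetical sites whose data all follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
print(round(federated_average(clients), 4))
```

Weighting each site by its sample count keeps large cohorts from being diluted by small ones, which is the usual design choice when patient populations differ in size across institutions.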
4.3. Quality Control and Standardization
Maintaining the purity, potency, and safety of natural products (NPs) remains a formidable challenge due to their intrinsic variability and vulnerability to adulteration [156]. Artificial intelligence (AI) has emerged as a pivotal tool in overcoming these limitations, enabling an unprecedented degree of precision in quality assurance and standardization.
AI augments conventional analytical methodologies, including spectroscopy and chromatography, by deploying machine learning algorithms to interrogate the extensive datasets they generate [157,158]. These algorithms are capable of extracting distinct chemical “fingerprints”, thereby facilitating accurate discrimination between authentic and adulterated herbal materials [142]. For example, the integration of hyperspectral imaging with machine learning classifiers has demonstrated over 98% accuracy in detecting adulteration in commodities such as honey [159]. In addition to chemical profiling, deep learning models trained on large-scale plant image repositories can visually authenticate botanical species, effectively distinguishing even closely related taxa with high precision and efficiency [160,161].
Beyond laboratory-based analyses, AI is being coupled with blockchain technology to establish secure, transparent, and tamper-proof supply chains for NPs. By generating an immutable digital ledger that traces a product from its point of origin to end-user delivery, AI-enabled verification at multiple checkpoints mitigates counterfeiting, ensures compliance with ethical sourcing practices, and preserves product integrity across the entire distribution network [162]. As shown in Figure 4, this integrated ecosystem, which interconnects plant sourcing with patient administration, leverages multiple layers of AI applications that collectively reinforce transparency, safety, and therapeutic efficacy. The ability to track the complete trajectory of a natural product, from harvest through patient delivery, is essential for advancing personalized phytotherapy, as it guarantees that individuals receive standardized, high-quality preparations tailored to their specific therapeutic requirements [163].
Figure 4.
AI-driven lifecycle management of marketed nanomedicines: from post-market surveillance to data-informed repurposing and optimization.
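The tamper-evidence property of the ledger described in Section 4.3 can be illustrated with a minimal hash chain. This sketch covers only the linking and verification of records; the record fields are hypothetical, and the consensus mechanism of a real blockchain and the AI-based checkpoint verification are outside its scope.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def make_block(record, prev_hash):
    """Bind a record to its predecessor by hashing both together."""
    payload = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"record": record, "prev_hash": prev_hash, "hash": digest}


def build_chain(records):
    chain, prev = [], GENESIS
    for record in records:
        block = make_block(record, prev)
        chain.append(block)
        prev = block["hash"]
    return chain


def verify_chain(chain):
    """Recompute every hash; any edited record breaks the chain from there on."""
    prev = GENESIS
    for block in chain:
        expected = make_block(block["record"], prev)["hash"]
        if block["prev_hash"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


# Hypothetical checkpoints for one herbal batch.
chain = build_chain([
    {"step": "harvest", "batch": "B-001"},
    {"step": "extraction", "batch": "B-001"},
    {"step": "dispensing", "batch": "B-001"},
])
print(verify_chain(chain))
```

Because each hash depends on the previous one, altering an early checkpoint invalidates every later block, which is what makes retrospective edits detectable.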
5. Challenges and Enabling Infrastructure
5.1. Data Ecosystem
The success of any AI system depends on the quality and availability of its data, and NP drug discovery is no exception. At present, the data landscape on NPs constitutes a major bottleneck to progress, as available resources remain highly fragmented, inconsistently curated, and dispersed across multiple, frequently siloed repositories [164]. These datasets are inherently multimodal, encompassing genomic sequences, metabolomic signatures, spectral readouts, and unstructured textual records, yet they lack standardized formats and harmonized ontologies [165]. The absence of standardization and interoperability severely restricts compatibility with contemporary deep learning architectures, which are optimized for processing clean, structured, and uniform data inputs [166,167].
Compounding this challenge is the scarcity of high-quality, annotated datasets specific to NPs, making effective model training more difficult [168]. This deficiency contributes to algorithmic pitfalls such as overfitting, whereby models demonstrate high performance on training data yet fail to generalize to novel or unseen inputs [169]. The resulting data fragmentation hinders the ability of AI frameworks to extract cross-modal patterns that connect structural, functional, and pharmacological dimensions, thereby constraining their ability to predict novel chemotypes or uncover emergent bioactivities [170]. Addressing this limitation will necessitate the development of a unified, interoperable repository capable of systematically linking and cross-referencing all modalities of NP-related data, from molecular structures and bioactivity profiles to ethnopharmacological knowledge [171] in order to unlock the full potential of AI-driven NP drug discovery.
5.2. Algorithmic and Methodological Hurdles
The obstacles facing artificial intelligence (AI) usage in natural product (NP) research extend beyond data-related limitations to encompass substantial algorithmic and methodological challenges [172]. A particularly critical issue is the “black box” nature of advanced deep learning models, wherein their internal decision-making processes remain opaque and lack interpretability [173]. This opacity is a significant barrier in a field that demands rigorous, transparent, and verifiable evidence to meet the stringent requirements of regulatory approval and clinical translation. In the absence of mechanistic interpretability, researchers are unable to fully trust model-generated predictions.
The practical hurdles of each paradigm are significant. Ligand-based approaches, while efficient, are notoriously prone to “scaffold bias”, often memorizing training set chemotypes rather than learning generalizable rules, leading to poor performance on novel NP scaffolds [174]. Structure-based methods like docking struggle with the conformational flexibility of many NPs and the inaccuracy of scoring functions for non-drug-like molecules, frequently resulting in false negatives for genuine binders. Although deep learning models (e.g., Graph Neural Networks) promise to overcome these issues, their “black box” nature and extreme sensitivity to hyperparameters and data splits raise major reproducibility concerns, making their predictions difficult to trust and validate prospectively. The field of de novo generation grapples with the synthetic intractability of its outputs; a landmark study found that a substantial fraction of AI-designed molecules were deemed unrealizable by expert chemists, highlighting a critical disconnect between computational optimization and practical synthesis. Even the promising paradigm of multi-omics integration is bottlenecked by the scarcity of large, paired, and standardized datasets (e.g., linking genomic clusters directly to isolated metabolites and their bioactivity), which are essential for training robust models [174].
As summarized in Table 3, different AI methodologies in drug discovery exhibit distinct key strengths and major limitations, which are comparatively analyzed in detail. For instance, virtual screening enables rapid analog or novel scaffold discovery but suffers from scaffold bias or scoring inaccuracies; de novo generation explores novel chemical space yet faces challenges in synthetic tractability; ADMET prediction supports early attrition risk assessment but is constrained by data quality; explainable AI enhances transparency while risking non-unique explanations; and integrated systems model complex biology but require heterogeneous data for validation.
Table 3.
Comparative analysis of AI methodologies in drug discovery.
| Methodology | Key Strengths | Major Limitations | Ref. |
|---|---|---|---|
| Virtual Screening | LBVS: Fast, efficient for analog discovery. SBVS: Target-agnostic, enables novel scaffold discovery. | LBVS: High scaffold bias, poor generalization. SBVS: Challenged by flexibility/scoring inaccuracies. | [14] |
| De Novo Generation | Explores novel chemical space; enables multi-property optimization. | Outputs often lack synthetic tractability; validation is complex. | [175] |
| ADMET Prediction | Enables early attrition risk assessment; cost-efficient. | Models limited by data quality/coverage; unreliable for novel chemotypes. | [176] |
| Explainable AI (XAI) | Increases trust and transparency; provides actionable insights for chemists. | Explanations can be non-unique; may reduce model performance. | [177] |
| Integrated Systems | Models complex biology; links molecular to phenotypic effects. | Requires heterogeneous data; complex to build and validate. | [178] |
Therefore, a critical appraisal reveals that no single AI methodology is a universal solution. The choice must be strategic, dictated by the specific research question, data availability, and the stage of the discovery pipeline. Acknowledging these methodological bottlenecks and rigorously testing against them, through practices such as scaffold-split validation, prospective experimental confirmation, and adherence to applicability domain boundaries, is paramount for advancing robust, reproducible, and impactful AI-driven NP research.
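Scaffold-split validation can be sketched as a grouped split in which every molecule sharing a scaffold is assigned to the same partition. The example below assumes scaffold labels have already been computed elsewhere (for instance with a cheminformatics toolkit); the compound identifiers, scaffold names, and greedy assignment rule are illustrative choices rather than a prescribed protocol.

```python
from collections import defaultdict


def scaffold_split(scaffold_of, test_fraction=0.2):
    """Split molecule IDs so that no scaffold appears in both partitions.

    scaffold_of: dict mapping molecule ID -> precomputed scaffold label.
    """
    groups = defaultdict(list)
    for mol_id, scaffold in scaffold_of.items():
        groups[scaffold].append(mol_id)
    # Largest scaffold families first; ties are broken by first member ID.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))
    n_total = len(scaffold_of)
    n_train_target = n_total - round(test_fraction * n_total)
    train, test = [], []
    for group in ordered:
        # A whole family goes to training only if it still fits the quota.
        if len(train) + len(group) <= n_train_target:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


# Hypothetical compound set with precomputed scaffold labels.
scaffold_of = {
    "m1": "flavone", "m2": "flavone", "m3": "flavone", "m4": "indole",
    "m5": "indole", "m6": "macrolide", "m7": "steroid", "m8": "flavone",
    "m9": "indole", "m10": "coumarin",
}
train_ids, test_ids = scaffold_split(scaffold_of)
print(sorted(test_ids))
```

Because held-out scaffolds are never seen during training, performance on the test partition is a more honest estimate of how a model will behave on novel NP chemotypes than a random split provides.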
Beyond individual tools, the field faces systemic challenges. Comparative methodological weaknesses are evident: ligand-based models suffer from scaffold bias and depend heavily on training set quality [14,179,180], while structure-based methods struggle with the flexibility of NPs. The foundational data itself is problematic; public bioactivity databases underrepresent NP chemotypes, while NP-specific resources often contain noisy or non-standardized data [175,176,177], creating a “garbage in, garbage out” risk. Furthermore, many studies only report optimistic internal validation metrics, neglecting the critical need for prospective external testing and clear definition of the model’s applicability domain—the chemical space where its predictions are reliable [178]. This over-reliance on convenient but flawed benchmarks, combined with a frequent disconnect between computational hit identification and practical experimental validation, forms a major translational gap that must be bridged for the field to mature.
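One common way to make an applicability domain explicit is a nearest-neighbor similarity check against the training set. The sketch below assumes molecules are represented as sets of substructure feature identifiers and uses Tanimoto similarity; the 0.4 threshold and the toy feature sets are arbitrary illustrative values, not recommendations from the cited studies.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def in_applicability_domain(query, training_set, threshold=0.4):
    """Flag a query as in-domain if it resembles at least one training molecule.

    Returns (is_in_domain, similarity_to_nearest_training_molecule).
    """
    nearest = max(tanimoto(query, ref) for ref in training_set)
    return nearest >= threshold, nearest


# Hypothetical fingerprints expressed as sets of feature identifiers.
training = [{1, 2, 3, 4}, {2, 3, 5}]
print(in_applicability_domain({1, 2, 3}, training))
print(in_applicability_domain({7, 8, 9}, training))
```

Predictions for queries that fall below the threshold can then be reported with an explicit warning or withheld, rather than presented with the same confidence as in-domain predictions.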
In addition, contemporary AI systems face intrinsic constraints in their capacity to extrapolate beyond known chemical and enzymatic landscapes [181]. While these models are highly effective at identifying patterns and relationships within established chemical space, their ability to predict genuinely novel chemistries or previously uncharacterized enzyme functions remains limited [182]. This limitation underscores the necessity of adopting human-in-the-loop strategies and hybrid frameworks that integrate AI-driven computational capabilities with the domain expertise, creativity, and critical reasoning of researchers. Such synergistic approaches are essential to overcome the boundaries of current methodologies and to advance AI-driven discovery in NP research.
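The applicability domain discussed above can be made concrete with a simple similarity check. The following is a minimal sketch, assuming that each compound has already been encoded as a set of fingerprint bit indices; the `Fingerprint` type, the toy bit sets, and the 0.4 threshold are illustrative assumptions rather than values taken from the cited studies. A query compound is treated as inside the domain only if its nearest training neighbor is sufficiently similar.

```python
from typing import FrozenSet, List

# Hypothetical encoding: indices of the "on" bits of a structural fingerprint.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity_to_training(query: Fingerprint, training: List[Fingerprint]) -> float:
    """Similarity of the query to its nearest neighbor in the training set."""
    return max((tanimoto(query, t) for t in training), default=0.0)


def in_applicability_domain(
    query: Fingerprint, training: List[Fingerprint], threshold: float = 0.4
) -> bool:
    """Flag a prediction as trustworthy only if a similar training compound exists.

    The threshold is an assumed value and should be calibrated per model.
    """
    return max_similarity_to_training(query, training) >= threshold


# Toy data (invented for illustration only).
training_set = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5, 8})]
near_query = frozenset({1, 2, 3, 9})  # shares three bits with the first training compound
far_query = frozenset({20, 21, 22})   # shares no bits with any training compound
```

Predictions for compounds such as `far_query` would be reported as outside the domain instead of being ranked alongside in-domain hits. Nearest-neighbor similarity is only one of several possible domain definitions.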
5.3. Critical Data Hurdles: Bias, Noise, and the Path to FAIR Data
The reliability of AI models in NP discovery is fundamentally constrained by the “garbage in, garbage out” principle, with data quality being a primary bottleneck. Systematic biases in public repositories (e.g., ChEMBL), which are dominated by synthetic compounds and single-target assays, lead to a severe underrepresentation of NP chemical space and polypharmacology. Furthermore, heterogeneous data from diverse sources—such as crude extracts, varied bioassays, and traditional records—suffer from a lack of standardized annotation, introducing noise and making machine learning integration profoundly challenging [183].
To build robust models, researchers in the field must prioritize data-centric solutions. Adopting the FAIR (Findable, Accessible, Interoperable, Reusable) principles is essential. Community-driven resources such as the MIBiG database for biosynthetic gene clusters demonstrate the value of enforced curation standards. Meticulous preprocessing of raw data is also critical; for example, optimized preprocessing reduced false positive rates in a cannabis provenance study from 21–27% to 11–14%. Ultimately, advancing AI in NP discovery requires a paradigm shift where investment in high-quality, standardized, and ethically curated data is recognized as the indispensable foundation of all computational progress [184].
5.4. Ethical, Regulatory, and Sustainability Considerations
As depicted in Figure 5, the integration of artificial intelligence (AI) with traditional knowledge and natural resources introduces a distinct set of ethical, regulatory, and sustainability challenges that diverge considerably from those encountered in conventional drug discovery. A central issue concerns the intellectual property rights of Indigenous communities, whose knowledge systems have been transmitted across generations [33]. The application of AI to mine such knowledge in the absence of a robust ethical framework risks both the exploitation and “digital marginalization” of traditional practitioners [33]. To mitigate this, it is imperative that AI-driven research be designed to empower these communities through inclusive practices and equitable benefit-sharing mechanisms.
Figure 5. Addressing core challenges in AI-driven natural product drug discovery: from critical infrastructure to interdisciplinary collaboration.
A cornerstone of this discussion is the Nagoya Protocol on Access and Benefit-Sharing (ABS). While it establishes a vital legal framework for Prior Informed Consent (PIC) and Mutually Agreed Terms (MATs), its practical implementation reveals significant complexities. A salient case study involves the alkaloids from Mitragyna speciosa (kratom), a plant with a long history of traditional use in Southeast Asia. The rapid global commercialization of kratom-derived products has largely occurred outside any formal ABS framework, triggering international regulatory disputes and raising critical questions about benefit and risk distribution in the commodification of traditionally managed species [185]. This case underscores the gap between international agreements and on-the-ground governance.
Robinson extends the debate to the digital realm, arguing that the use of digitized traditional knowledge (TK) in databases, a potential feedstock for AI, creates new obligations and necessitates “digital PIC” and traceability mechanisms that are absent in most platforms [186]. This analysis reveals a core tension: a profound knowledge asymmetry. AI models can efficiently mine bioactivity patterns from TK-associated compounds, yet the current data ecosystem is ill-equipped to recognize the value of knowledge holders or ensure justice for them, risking a form of digital bioprospecting.
Another critical consideration is the preservation of “living knowledge” held by herbalists and traditional healers. AI systems, which are typically limited to codified and structured data, may overlook subtle contextual factors and experiential safety insights embedded within traditional practices. Over-reliance on algorithmic outputs without adequate understanding of the original knowledge base could result in serious errors, such as neglecting warnings regarding the toxicity of botanicals when improperly prepared [187]. Thus, the integration of AI must complement practitioner expertise rather than replace it.
It is equally important to ensure that AI serves as an instrument for advancing environmental sustainability, rather than undermining it. AI technologies can be leveraged to address ecological challenges, including overharvesting and habitat loss, by analyzing genomic and ecological datasets to identify alternative, sustainable sources of natural products. Furthermore, AI-driven agricultural innovations, such as precision cultivation and controlled-environment farming, enable growth conditions to be optimized while minimizing land use, pesticide application, and water consumption [188]. Collectively, these strategies can ensure that the resurgence of NP-based drug discovery is not only technologically sophisticated but also ethically responsible and environmentally sustainable [189].
6. Conclusions and Future Course
6.1. Critical Appraisal: Limitations, Failures, and Unresolved Challenges
A fundamental challenge in this field stems from biases embedded within widely used bioactivity datasets, such as ChEMBL, which are dominated by synthetic compounds and high-throughput screening (HTS) data. This can lead to “analog bias,” where models learn to associate simple chemical fingerprints with activity labels rather than genuine structure–activity relationships. Consequently, they may over-predict the activity of molecules that have similar fingerprints to known actives in the training set, generating false positives. A systematic evaluation demonstrated that models achieving excellent performance in random cross-validation often see a dramatic drop in accuracy when assessed via more realistic temporal or scaffold splits, designed to simulate prospective prediction of novel chemotypes [181]. For NP discovery, where chemical space differs significantly from synthetic libraries, this bias is particularly acute. Models validated only by internal (random) metrics may provide a false sense of security, and their predictions for novel NP scaffolds require rigorous external validation.
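The difference between a random split and a scaffold split can be sketched in a few lines. The example below is a minimal illustration, assuming that a scaffold label (for instance, a Bemis-Murcko scaffold computed with a cheminformatics toolkit) is already available for every compound; the function name, compound identifiers, and scaffold labels are hypothetical. Whole scaffold groups are assigned to one side of the split, with the largest groups filling the training set first, so that the test set contains only chemotypes the model has never seen.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def scaffold_split(
    compound_ids: Sequence[str],
    scaffolds: Sequence[str],
    test_fraction: float = 0.2,
) -> Tuple[List[str], List[str]]:
    """Split compounds so that no scaffold appears in both train and test."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for cid, scaffold in zip(compound_ids, scaffolds):
        groups[scaffold].append(cid)

    # Largest scaffold groups first; ties are broken by label for determinism.
    ordered = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))

    train_cutoff = (1.0 - test_fraction) * len(compound_ids)
    train: List[str] = []
    test: List[str] = []
    for _, members in ordered:
        # A group goes to train only if it fits entirely under the cutoff.
        if len(train) + len(members) <= train_cutoff:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


# Toy data (invented): eight compounds spread over four scaffolds.
ids = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
labels = ["A", "A", "A", "B", "B", "B", "C", "D"]
train_ids, test_ids = scaffold_split(ids, labels, test_fraction=0.25)
```

In this toy case the two singleton scaffolds end up in the test set, which mimics the prospective situation in which rare NP chemotypes must be predicted from models trained on well-populated series.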
The ultimate test for an AI model in NP discovery is its performance on truly novel, structurally distinct natural scaffolds—precisely the “out-of-distribution” (OOD) data where many models falter. A study on antimicrobial activity prediction illustrated this gap: a model performed well on standard benchmark sets but exhibited significantly degraded performance (e.g., a drop in AUC-ROC) when applied to an independent test set of marine-derived NPs featuring complex macrocyclic and polyketide architectures [190]. This failure highlights a core paradox: the most therapeutically interesting NPs are often those farthest from the training data distribution. Many contemporary AI models excel at interpolation within known chemical space but struggle with extrapolation to the unique structural motifs characteristic of many NPs. This underscores the non-negotiable requirement for external validation on diverse, NP-centric compound sets as a minimum standard for assessing translational utility.
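The interpolation versus extrapolation gap described above is usually quantified by computing the same metric on an in-distribution benchmark and on an external NP set. The following is a minimal sketch, assuming binary activity labels and model scores are available for both sets; the numbers are invented for illustration and are not taken from the cited antimicrobial study. AUC-ROC is computed here from its pairwise definition, the probability that a randomly chosen active is scored above a randomly chosen inactive.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of active/inactive pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one active and one inactive compound")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy scores: a benchmark set and an external, structurally distinct set.
benchmark_labels = [1, 1, 0, 0]
benchmark_scores = [0.9, 0.8, 0.3, 0.1]
external_labels = [1, 1, 0, 0]
external_scores = [0.6, 0.2, 0.5, 0.1]

benchmark_auc = roc_auc(benchmark_labels, benchmark_scores)
external_auc = roc_auc(external_labels, external_scores)
generalization_gap = benchmark_auc - external_auc
```

Reporting the gap alongside the benchmark value makes the loss on out-of-distribution scaffolds explicit instead of leaving it implied.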
While generative models (e.g., GANs and VAEs) can optimize molecules for computationally driven objectives like predicted binding affinity or quantitative estimate of drug-likeness (QED), the generated structures often lack synthetic tractability. A landmark study found that a substantial fraction of AI-generated molecules were rated as difficult or impossible to synthesize by expert medicinal chemists [191]. For NP-like molecules, the challenge is compounded by complex stereochemistry and intricate ring systems. Optimizing for simplistic scores without embedding hard constraints from retrosynthetic analysis can yield molecules that are computationally elegant but practically unrealizable. This disconnect necessitates the tighter integration of generative AI with synthesis-aware algorithms, prioritizing synthetic feasibility and stereochemical soundness from the earliest design stages.
Many NPs exert their therapeutic effects through polypharmacology—modulating multiple targets within a biological network. However, most AI models are designed for single-target activity prediction and are ill-equipped to capture these synergistic, system-level effects. Attempts to predict multi-target profiles or downstream phenotypic outcomes often result in low accuracy and misleading associations [192]. This is a significant limitation of the prevailing reductionist AI approach when applied to NPs, whose value may lie in their network pharmacology. Predicting the nuanced, often beneficial side-effect profiles of NPs remains a formidable challenge, highlighting the need for novel AI paradigms that incorporate systems biology and phenotypic screening data.
6.2. Towards Rigorous Science: Reproducibility and Benchmarking
6.2.1. The Reproducibility Challenge
For AI to transition from a promising research tool to a trusted component of NP discovery pipelines, resolving its reproducibility crisis and establishing rigorous, standardized benchmarking are critical. Addressing these issues is fundamental to building a reliable and cumulative knowledge base.
The replication of published studies on AI-powered NP discovery is frequently impeded by several interconnected factors. A primary obstacle is the lack of standardized benchmarks. Many studies utilize proprietary, non-public, or inconsistently curated datasets, rendering direct comparison impossible. This problem is exacerbated by the prevalent non-disclosure of critical materials, including source code, exact compound structures, and the specific data splits used for model training and validation. Without access to these, independent verification is unfeasible. Furthermore, the performance of complex AI models is highly sensitive to hyperparameter configurations and to the random seeds used for model initialization and data partitioning, details that are often insufficiently reported. An insightful analysis demonstrated that the superior performance reported for novel models could often be replicated or surpassed by standard baselines through meticulous hyperparameter optimization alone, underscoring how incomplete reporting can distort perceived progress in the field [193]. In NP research, where data are inherently sparse and heterogeneous, these issues are magnified. The reported success of a model may be an artifact of a favorable data split that included chemically similar training and test compounds, rather than a true indicator of its ability to generalize to novel, structurally unique NP scaffolds. Collectively, these practices hinder the fair comparison and robust advancement of methodologies.
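Several of the reporting gaps listed above (seeds, data splits, and hyperparameters) can be closed with very little tooling. The sketch below shows one possible approach, assuming compounds are referenced by stable string identifiers; the function names, manifest fields, and hyperparameter values are hypothetical and do not correspond to any established reporting standard. The split is derived deterministically from a seed, and a hash of the split is stored so that other groups can confirm they are evaluating on exactly the same partition.

```python
import hashlib
import json
import random
from typing import Any, Dict, List, Sequence


def seeded_split(ids: Sequence[str], test_fraction: float, seed: int) -> Dict[str, List[str]]:
    """Deterministic random split: same ids, fraction, and seed give the same partition."""
    shuffled = sorted(ids)  # canonical order, so input ordering does not matter
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    return {"test": sorted(shuffled[:n_test]), "train": sorted(shuffled[n_test:])}


def split_fingerprint(split: Dict[str, List[str]]) -> str:
    """SHA-256 digest of the split, usable as a compact identity check."""
    payload = json.dumps(split, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_manifest(
    ids: Sequence[str], test_fraction: float, seed: int, hyperparameters: Dict[str, Any]
) -> Dict[str, Any]:
    """Collect everything a reader needs to regenerate the experiment's data split."""
    split = seeded_split(ids, test_fraction, seed)
    return {
        "seed": seed,
        "test_fraction": test_fraction,
        "hyperparameters": hyperparameters,
        "split_sha256": split_fingerprint(split),
        "split": split,
    }


# Hypothetical identifiers and settings, for illustration only.
compound_ids = [f"NP{i:03d}" for i in range(10)]
settings = {"learning_rate": 0.001, "n_layers": 3}
manifest = build_manifest(compound_ids, test_fraction=0.2, seed=42, hyperparameters=settings)
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
```

Publishing `manifest_json` next to the source code would let an independent group regenerate the partition and detect any silent change to it.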
6.2.2. The Imperative for Comparative Benchmarking
To substantiate claims of added value, AI approaches must be evaluated using head-to-head comparisons against established traditional discovery methods under stringent, transparent conditions. Assertions of “accelerated discovery” or “improved success rates” remain ambiguous without a definitive baseline. It is important to contextualize these reported accelerations; they often pertain to the in silico phase and do not account for downstream experimental timelines, which remain a major bottleneck.
Valid benchmarking requires comparison against established pillars of NP discovery, such as bioactivity-guided fractionation of crude extracts, high-throughput screening (HTS) of physical compound libraries, and structure-based virtual screening using molecular docking [175].
Evaluations must employ consistent, blinded metrics on shared, publicly accessible datasets. The field must progress beyond reliance on internal validation metrics and adopt prospective, temporal, or scaffold-split validation protocols that simulate real-world discovery scenarios targeting novel chemical entities. The improved metrics reported in the literature, while encouraging, are often achieved under optimized conditions on benchmark datasets, and their generalization across diverse target classes and chemical spaces requires further extensive validation.
Encouragingly, community-driven initiatives are emerging to set these standards. While general-purpose resources like the MoleculeNet benchmark suite provide a foundation for open comparisons when predicting molecular properties [2], the development of NP-specific benchmarking challenges is crucial. The establishment of curated, blinded NP datasets for community-wide algorithm testing would represent a significant step forward. The adoption of such frameworks, coupled with strong data and code sharing mandates from journals and funders, is an essential prerequisite for the field’s maturation.
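Of the validation protocols named in this subsection, the temporal split is the simplest to implement. The following is a minimal sketch, assuming each record carries the year in which the measurement was first reported; the record layout, identifiers, and cutoff year are hypothetical. Training uses only data that predate the cutoff, so the evaluation mimics a prospective prediction of compounds reported later.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (compound identifier, year the activity was reported).
Record = Tuple[str, int]


def temporal_split(records: Sequence[Record], cutoff_year: int) -> Tuple[List[str], List[str]]:
    """Train on records reported before the cutoff year and test on the rest."""
    train = [cid for cid, year in records if year < cutoff_year]
    test = [cid for cid, year in records if year >= cutoff_year]
    return train, test


# Invented toy records.
activity_records = [
    ("NP001", 2012),
    ("NP002", 2015),
    ("NP003", 2019),
    ("NP004", 2021),
    ("NP005", 2023),
]
past_ids, future_ids = temporal_split(activity_records, cutoff_year=2020)
```

A shared benchmark would fix the cutoff year in advance so that every submitted method is scored on the same later-reported compounds.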
6.2.3. Synthesis and Path Forward
Confronting reproducibility and benchmarking issues is central to demonstrating the scientific credibility of AI usage in NP discovery. Embracing a culture of open science and rigorous validation is paramount; only through such concerted efforts can the field evolve from publishing isolated demonstrations of potential to generating robust, reproducible, and comparative evidence. This will clearly delineate when and how AI can provide a definitive advantage over conventional methodologies in unlocking the therapeutic potential of natural products [194].
The integration of artificial intelligence (AI) has initiated a profound paradigm shift, reinvigorating the domain of natural product (NP) drug discovery and enabling researchers to surmount longstanding barriers that have historically impeded its progress [19]. As outlined throughout this review, AI provides advanced computational solutions that can accelerate every stage of the drug development continuum, from the initial identification of bioactive leads to clinical translation and post-market surveillance. By facilitating systematic, data-driven exploration of an expansive and chemically diverse molecular space, AI unlocks the vast therapeutic potential embedded within nature’s chemical repertoire. This transformative capacity not only reduces temporal and financial burdens but also heralds a new era of drug development that is more efficient, precise, and patient-centered.
Realizing this potential depends on four priorities. First, AI must be integrated with multi-omics systems. To move beyond models that rely solely on chemical structure, future systems must learn jointly from genomic (biosynthetic potential), metabolomic (compound profiles), and phenotypic data streams, enabling true end-to-end discovery pipelines. Second, adopting collaborative frameworks such as federated learning can overcome data silos by enabling model training across decentralized, proprietary datasets without sharing raw data, thus fostering pre-competitive collaboration while preserving intellectual property [195]. Third, establishing community-driven, NP-specific open-source benchmarks is fundamental for ensuring reproducibility and rigor, shifting the focus from algorithmic novelty to solving robust, real-world problems. Finally, ethical principles must be translated into practice through actionable technical tools, such as digital provenance tracking for traditional knowledge and frameworks for equitable benefit sharing, developed in collaboration with legal and Indigenous data sovereignty experts [196].
Ultimately, the greatest impact will stem from the strategic convergence of these pillars: building powerful, privacy-aware AI models trained on federated multi-omics data, rigorously validated against open benchmarks, and inherently designed to guide equitable and transparent discovery. By steering efforts toward these integrative, collaborative, rigorous, and ethically embedded pathways, the research community can ensure AI’s role as a powerful engine for discovering the next generation of sustainable and equitable natural product-inspired therapeutics.
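The federated learning priority named above rests on a simple aggregation step: each site trains locally and shares only model parameters, never raw compounds or assay data. The sketch below illustrates federated averaging (FedAvg), one common aggregation rule and not necessarily the scheme used in the cited work; the `ClientUpdate` layout, site names, weights, and sample counts are assumptions made for illustration. Each site's parameters are weighted by the number of local training examples.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (locally trained weight vector, number of local training samples).
ClientUpdate = Tuple[Sequence[float], int]


def federated_average(updates: Sequence[ClientUpdate]) -> List[float]:
    """Sample-size-weighted mean of client weight vectors (FedAvg aggregation step)."""
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("at least one client must contribute samples")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Invented example: two sites with proprietary NP datasets of different sizes.
site_a: ClientUpdate = ([1.0, 2.0], 100)
site_b: ClientUpdate = ([3.0, 4.0], 300)
global_weights = federated_average([site_a, site_b])
```

In a full system this step would be repeated over many rounds, with the averaged weights sent back to each site for further local training. Parameter sharing alone does not guarantee privacy, so additional safeguards are typically layered on top.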
6.3. Future Perspectives on AI-Driven Natural Product Discovery
In the future, AI-driven NP discovery is poised to show even greater advancements, catalyzed by innovations in computational modeling and interdisciplinary collaboration. Foundational models such as AlphaFold, which has revolutionized protein structure prediction, will serve as springboards for more specialized applications in chemistry and biology, democratizing access to cutting-edge tools and markedly reducing research costs [52]. In future developments, the establishment of fully integrated, closed-loop systems encompassing virtual molecular design, robotic synthesis, and automated experimental feedback is critical [41]. Guided by reinforcement learning, such workflows will enable the continuous, autonomous optimization of candidate compounds, thereby accelerating the drug discovery cycle to an unprecedented pace. Fully autonomous discovery remains a long-term goal, as current systems are best described as powerful decision-support tools that augment, rather than replace, expert intuition and experimental validation.
The convergence of AI with complementary emerging technologies, such as organ-on-chip platforms and quantum computing, further augments this vision [197,198,199,200]. These synergies promise enhanced accuracy in predicting drug behavior and more comprehensive modeling of biological complexity, significantly reducing false positives within the development pipeline [199]. To achieve widespread adoption and regulatory acceptance, however, it is imperative to address issues with the interpretability of current deep learning systems. Developing explainable AI (XAI) frameworks will provide insight into algorithmic decision-making, thereby fostering scientific trust and facilitating integration into stringent regulatory processes [201]. Equally important is the establishment of robust ethical frameworks that balance technological innovation with the protection of traditional knowledge systems, cultural heritage, and ecological sustainability [197].
Acknowledgments
We are thankful to Figdraw. Its platform made creating mechanism diagrams for this article easy, enhancing our research’s presentation.
Author Contributions
Conceptualization, Y.P.; methodology, W.Z.; software, D.D.; validation, A.A. and A.B.; formal analysis, N.Y.; investigation, A.B.; resources, A.A.; data curation, X.Q.; writing—original draft preparation, Y.P.; writing—review and editing, Y.P.; visualization, Y.W. and A.A.; supervision, A.B.; project administration, D.D.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Conflicts of Interest
The authors declare no conflict of interest.
Funding Statement
This research was funded by Tianshan Talents-Youth Science and Technology Innovation Talents Training Program of Xinjiang Autonomous Region, grant number 2022TSYCCX0035; Natural Science Foundation for Distinguished Young Scholars of Xinjiang Autonomous Region (2025D01E32); The “Fourteenth Five-Year Plan” Key Discipline Construction Project of Xinjiang Autonomous Region (2021); Xinjiang Key Laboratory of Natural Medicines Active Components and Drug Release Technology (XJDX1713); Xinjiang Key Laboratory of Biopharmaceuticals and Medical Devices (2023); Engineering Research Center of Xinjiang and Central Asian Medicine Resources, Ministry of Education (2023).
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
References
- 1. Newman D.J., Cragg G.M. Natural Products as Sources of New Drugs over the Nearly Four Decades from 01/1981 to 09/2019. J. Nat. Prod. 2020;83:770–803. doi: 10.1021/acs.jnatprod.9b01285.
- 2. Mullowney M.W., Duncan K.R., Elsayed S.S., Garg N., van der Hooft J.J.J., Martin N.I., Meijer D., Terlouw B.R., Biermann F., Blin K., et al. Artificial intelligence for natural product drug discovery. Nat. Rev. Drug Discov. 2023;22:895–916. doi: 10.1038/s41573-023-00774-7.
- 3. Dias D.A., Urban S., Roessner U. A Historical Overview of Natural Products in Drug Discovery. Metabolites. 2012;2:303–336. doi: 10.3390/metabo2020303.
- 4. Patwardhan B., Vaidya A.D. Natural products drug discovery: Accelerating the clinical candidate development using reverse pharmacology approaches. Indian J. Exp. Biol. 2010;48:220–227.
- 5. Chaachouay N., Zidane L. Plant-Derived Natural Products: A Source for Drug Discovery and Development. Drugs Drug Candidates. 2024;3:184–207. doi: 10.3390/ddc3010011.
- 6. Bharate S.B., Lindsley C.W. Natural Products Driven Medicinal Chemistry. J. Med. Chem. 2024;67:20723–20730. doi: 10.1021/acs.jmedchem.4c02736.
- 7. Gaynes R. The Discovery of Penicillin—New Insights After More Than 75 Years of Clinical Use. Emerg. Infect. Dis. J. 2017;23:849. doi: 10.3201/eid2305.161556.
- 8. Imani S., Moradi S., Faraj T.A., Hassanpoor P., Musapour N., Najmaldin S.K., Abdulhamd A.H., Mohammadi A.T., Taha C.H., Aminnezhad S. Nanoparticle technologies in precision oncology and personalized vaccine development: Challenges and advances. Int. J. Pharm. X. 2025;10:100353. doi: 10.1016/j.ijpx.2025.100353.
- 9. Smith E.R., Chen Z.-S., Xu X.-X. 11—Paclitaxel and cancer treatment: Non-mitotic mechanisms of paclitaxel action in cancer therapy. In: Swamy M.K., Pullaiah T., Chen Z.-S., editors. Paclitaxel. Academic Press; Cambridge, MA, USA: 2022. pp. 269–286.
- 10. Meshnick S.R., Thomas A., Ranz A., Xu C.M., Pan H.Z. Artemisinin (qinghaosu): The role of intracellular hemin in its mechanism of antimalarial action. Mol. Biochem. Parasitol. 1991;49:181–189. doi: 10.1016/0166-6851(91)90062-B.
- 11. Posadino A.M., Giordo R., Pintus G., Mohammed S.A., Orhan I.E., Fokou P.V.T., Sharopov F., Adetunji C.O., Gulsunoglu-Konuskan Z., Ydyrys A., et al. Medicinal and mechanistic overview of artemisinin in the treatment of human diseases. Biomed. Pharmacother. 2023;163:114866. doi: 10.1016/j.biopha.2023.114866.
- 12. Kinch M.S. 2015 in review: FDA approval of new drugs. Drug Discov. Today. 2016;21:1046–1050. doi: 10.1016/j.drudis.2016.04.008.
- 13. Domingo-Fernández D., Gadiya Y., Preto A.J., Krettler C.A., Mubeen S., Allen A., Healey D., Colluru V. Natural Products Have Increased Rates of Clinical Trial Success throughout the Drug Development Process. J. Nat. Prod. 2024;87:1844–1851. doi: 10.1021/acs.jnatprod.4c00581.
- 14. Atanasov A.G., Zotchev S.B., Dirsch V.M., International Natural Product Sciences Taskforce, Supuran C.T. Natural products in drug discovery: Advances and opportunities. Nat. Rev. Drug Discov. 2021;20:200–216. doi: 10.1038/s41573-020-00114-z.
- 15. Lewis K., Lee R.E., Brötz-Oesterhelt H., Hiller S., Rodnina M.V., Schneider T., Weingarth M., Wohlgemuth I. Sophisticated natural products as antibiotics. Nature. 2024;632:39–49. doi: 10.1038/s41586-024-07530-w.
- 16. Ciardiello J.J., Stewart H.L., Sore H.F., Galloway W.R.J.D., Spring D.R. A novel complexity-to-diversity strategy for the diversity-oriented synthesis of structurally diverse and complex macrocycles from quinine. Bioorganic Med. Chem. 2017;25:2825–2843. doi: 10.1016/j.bmc.2017.02.060.
- 17. Yonchev D., Dimova D., Stumpfe D., Vogt M., Bajorath J. Redundancy in two major compound databases. Drug Discov. Today. 2018;23:1183–1186. doi: 10.1016/j.drudis.2018.03.005.
- 18. Peron A.P., Mariucci R.G., de Almeida I.V., Düsman E., Mantovani M.S., Vicentini V.E. Evaluation of the cytotoxicity, mutagenicity and antimutagenicity of a natural antidepressant, Hypericum perforatum L. (St. John’s wort), on vegetal and animal test systems. BMC Complement. Altern. Med. 2013;13:97. doi: 10.1186/1472-6882-13-97.
- 19. Saldívar-González F.I., Aldas-Bulos V.D., Medina-Franco J.L., Plisson F. Natural product drug discovery in the artificial intelligence era. Chem. Sci. 2021;13:1526–1546. doi: 10.1039/D1SC04471K.
- 20. Basnet B.B., Zhou Z.Y., Wei B., Wang H. Advances in AI-based strategies and tools to facilitate natural product and drug development. Crit. Rev. Biotechnol. 2025;45:1527–1558. doi: 10.1080/07388551.2025.2478094.
- 21. Arora S., Chettri S., Percha V., Kumar D., Latwal M. Artifical intelligence: A virtual chemist for natural product drug discovery. J. Biomol. Struct. Dyn. 2024;42:3826–3835. doi: 10.1080/07391102.2023.2216295.
- 22. Ishikawa M., Hashimoto Y. Improvement in Aqueous Solubility in Small Molecule Drug Discovery Programs by Disruption of Molecular Planarity and Symmetry. J. Med. Chem. 2011;54:1539–1554. doi: 10.1021/jm101356p.
- 23. Meijer D., Beniddir M.A., Coley C.W., Mejri Y.M., Öztürk M., van der Hooft J.J.J., Medema M.H., Skiredj A. Empowering natural product science with AI: Leveraging multimodal data and knowledge graphs. Nat. Prod. Rep. 2025;42:654–662. doi: 10.1039/D4NP00008K.
- 24. Ocana A., Pandiella A., Privat C., Bravo I., Luengo-Oroz M., Amir E., Gyorffy B. Integrating artificial intelligence in drug discovery and early drug development: A transformative approach. Biomark. Res. 2025;13:45. doi: 10.1186/s40364-025-00758-2.
- 25. Sahayasheela V.J., Lankadasari M.B., Dan V.M., Dastager S.G., Pandian G.N., Sugiyama H. Artificial intelligence in microbial natural product drug discovery: Current and emerging role. Nat. Prod. Rep. 2022;39:2215–2230. doi: 10.1039/D2NP00035K.
- 26. Hannigan G.D., Prihoda D., Palicka A., Soukup J., Klempir O., Rampula L., Durcak J., Wurst M., Kotowski J., Chang D., et al. A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic Acids Res. 2019;47:e110. doi: 10.1093/nar/gkz654.
- 27. Hjörleifsson Eldjárn G., Ramsay A., van der Hooft J.J.J., Duncan K.R., Soldatou S., Rousu J., Daly R., Wandy J., Rogers S. Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions. PLoS Comput. Biol. 2021;17:e1008920. doi: 10.1371/journal.pcbi.1008920.
- 28. Yang Z., Yin Y., Kong C., Chi T., Tao W., Zhang Y., Xu T. ShennongAlpha: An AI-driven sharing and collaboration platform for intelligent curation, acquisition, and translation of natural medicinal material knowledge. Cell Discov. 2025;11:32. doi: 10.1038/s41421-025-00776-2.
- 29. ShennongAlpha. ShennongAlpha Knowledge. Available online: https://shennongalpha.westlake.edu.cn/ (accessed on 4 February 2026).
- 30. Kautsar S.A., Blin K., Shaw S., Weber T., Medema M.H. BiG-FAM: The biosynthetic gene cluster families database. Nucleic Acids Res. 2021;49:D490–D497. doi: 10.1093/nar/gkaa812.
- 31. Halper S.M., Cetnar D.P., Salis H.M. An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization. In: Jensen M.K., Keasling J.D., editors. Synthetic Metabolic Pathways: Methods and Protocols. Springer; New York, NY, USA: 2018. pp. 39–61.
- 32. Usuyama N., Wong C., Zhang S., Naumann T., Poon H. Biomedical Natural Language Processing in the Era of Large Language Models. Annu. Rev. Biomed. Data Sci. 2025;8:471–490. doi: 10.1146/annurev-biodatasci-103123-095406.
- 33. Directory S. AI and the Future of Herbal Medicine Authentication. 2025. Available online: https://prism.sustainability-directory.com/scenario/ai-and-the-future-of-herbal-medicine-authentication/ (accessed on 4 February 2026).
- 34. Laxmi Priya S., Jerlin Anusha P., Gowri Vidhya N. Streamlined Ayurvedic Species Detection Using VGG16 and Neural Networks Chatbot Integration; Proceedings of the 2024 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES); Chennai, India. 12–13 December 2024.
- 35. Cho H., Kim B., Choi W., Lee D., Lee H. Plant phenotype relationship corpus for biomedical relationships between plants and phenotypes. Sci. Data. 2022;9:235. doi: 10.1038/s41597-022-01350-1.
- 36. Tang J.-L., Liu B.-Y., Ma K.-W. Traditional Chinese medicine. Lancet. 2008;372:1938–1940. doi: 10.1016/S0140-6736(08)61354-9.
- 37. Cheng J.-T. Review: Drug Therapy in Chinese Traditional Medicine. J. Clin. Pharmacol. 2000;40:445–450. doi: 10.1177/00912700022009198.
- 38. Zhou E., Shen Q., Hou Y. Integrating artificial intelligence into the modernization of traditional Chinese medicine industry: A review. Front. Pharmacol. 2024;15:1181183. doi: 10.3389/fphar.2024.1181183.
- 39. Zhang P., Zhang D., Zhou W., Wang L., Wang B., Zhang T., Li S. Network pharmacology: Towards the artificial intelligence-based precision traditional Chinese medicine. Brief. Bioinform. 2023;25:bbad518. doi: 10.1093/bib/bbad518.
- 40. Lv Q., Chen G., He H., Yang Z., Zhao L., Zhang K., Chen C.Y. TCMBank-the largest TCM database provides deep learning-based Chinese-Western medicine exclusion prediction. Signal Transduct. Target. Ther. 2023;8:127. doi: 10.1038/s41392-023-01339-1.
- 41. Zhang N.D., Han T., Huang B.K., Rahman K., Jiang Y.P., Xu H.T., Qin L.P., Xin H.L., Zhang Q.Y., Li Y.M. Traditional Chinese medicine formulas for the treatment of osteoporosis: Implication for antiosteoporotic drug discovery. J. Ethnopharmacol. 2016;189:61–80. doi: 10.1016/j.jep.2016.05.025.
- 42. Lv Q., Chen G., He H., Yang Z., Zhao L., Chen H.Y., Chen C.Y. TCMBank: Bridges between the largest herbal medicines, chemical ingredients, target proteins, and associated diseases with intelligence text mining. Chem. Sci. 2023;14:10684–10701. doi: 10.1039/D3SC02139D.
- 43. Oxford University Africa Oxford Initiative (AUOI). AI and the Future of Work in Africa. Oxford University Africa Oxford Initiative (AUOI); Oxford, UK: 2024.
- 44. Luna-Viramontes N.I., Morlett-Paredes A., Ordoñez-Lozano I., de la Cruz-López F., Gonzalez-Chavez V.E., Vargas-Hernández G., Pérez-Pérez E.G., Méndez-Llaca R.E., Villanueva-Fierro I., Ortiz-Butron R., et al. Brain banks in Latin America: Infrastructure for diagnosis, research, and scientific equity in Mexico and the Caribbean. Alzheimer’s Dement. J. Alzheimer’s Assoc. 2025;21:e70819. doi: 10.1002/alz.70819.
- 45. Boer M.D., Melkonian C., Zafeiropoulos H., Haas A.F., Garza D.R., Dutilh B.E. Improving genome-scale metabolic models of incomplete genomes with deep learning. iScience. 2024;27:111349. doi: 10.1016/j.isci.2024.111349.
- 46. Kalpana D., Prasad S.K. Automation in Analytical Chemistry: The Role of AI in Chromatography. Int. J. Appl. Pharm. 2024;16:14–21. doi: 10.22159/ijap.2024v16i3.50290.
- 47. Ghosh K., Stuke A., Todorović M., Jørgensen P.B., Schmidt M.N., Vehtari A., Rinke P. Deep Learning Spectroscopy: Neural Networks for Molecular Excitation Spectra. Adv. Sci. 2019;6:1801367. doi: 10.1002/advs.201801367.
- 48. Avcu F.M. Theoretical and applied potential of artificial intelligence and machine learning in analysing molecular data. Turk. J. Anal. Chem. 2025;7:61–70. doi: 10.51435/turkjac.1607205.
- 49. Wu K., Luo J., Zeng Q., Dong X., Chen J., Zhan C., Chen Z., Lin Y. Improvement in Signal-to-Noise Ratio of Liquid-State NMR Spectroscopy via a Deep Neural Network DN-Unet. Anal. Chem. 2021;93:1377–1382. doi: 10.1021/acs.analchem.0c03087.
- 50. Howarth A., Goodman J.M. The DP5 probability, quantification and visualisation of structural uncertainty in single molecules. Chem. Sci. 2022;13:3507–3518. doi: 10.1039/D1SC04406K.
- 51.Dai J., Zhou Z., Zhao Y., Kong F., Zhai Z., Zhu Z., Cai J., Huang S., Xu Y., Sun T. Combined usage of ligand- and structure-based virtual screening in the artificial intelligence era. Eur. J. Med. Chem. 2025;283:117162. doi: 10.1016/j.ejmech.2024.117162. [DOI] [PubMed] [Google Scholar]
- 52.Singh N., Chaput L., Villoutreix B.O. Virtual screening web servers: Designing chemical probes and drug candidates in the cyberspace. Brief. Bioinform. 2020;22:1790–1818. doi: 10.1093/bib/bbaa034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Fang S., Dong L., Liu L., Guo J., Zhao L., Zhang J., Bu D., Liu X., Huo P., Cao W., et al. HERB: A high-throughput experiment- and reference-guided database of traditional Chinese medicine. Nucleic Acids Res. 2021;49:D1197–D1206. doi: 10.1093/nar/gkaa1063. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Gao K., Liu L., Lei S., Li Z., Huo P., Wang Z., Dong L., Deng W., Bu D., Zeng X., et al. HERB 2.0: An updated database integrating clinical and experimental evidence for traditional Chinese medicine. Nucleic Acids Res. 2025;53:D1404–D1414. doi: 10.1093/nar/gkae1037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Cheng W., Xia K., Wu S., Li Y. Herb-Drug Interactions and Their Impact on Pharmacokinetics: An Update. Curr. Drug Metab. 2023;24:28–69. doi: 10.2174/1389200224666230116113240. [DOI] [PubMed] [Google Scholar]
- 56.Roemer T., Xu D., Singh S.B., Parish C.A., Harris G., Wang H., Davies J.E., Bills G.F. Confronting the challenges of natural product-based antifungal discovery. Chem. Biol. 2011;18:148–164. doi: 10.1016/j.chembiol.2011.01.009. [DOI] [PubMed] [Google Scholar]
- 57.Gaudêncio S.P., Bayram E., Lukić Bilela L., Cueto M., Díaz-Marrero A.R., Haznedaroglu B.Z., Jimenez C., Mandalakis M., Pereira F., Reyes F., et al. Advanced Methods for Natural Products Discovery: Bioactivity Screening, Dereplication, Metabolomics Profiling, Genomic Sequencing, Databases and Informatic Tools, and Structure Elucidation. Mar. Drugs. 2023;21:308. doi: 10.3390/md21050308. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Backman T.W.H., Cao Y., Girke T. ChemMine tools: An online service for analyzing and clustering small molecules. Nucleic Acids Res. 2011;39:W486–W491. doi: 10.1093/nar/gkr320. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Madival S.D., Mishra D.C., Chaturvedi K.K., Sharma A., Budhlakoti N., Angadi U.B., Basavaraja P., Farooqi M.S., Srivastava S., Jha G.K. NaturePred: A Tool for Revolutionizing Natural Product Classification with Artificial Intelligence. Curr. Proteom. 2024;21:429–436. doi: 10.2174/0115701646322417241101055512. [DOI] [Google Scholar]
- 60.Li H., Sun X., Cui W., Xu M., Dong J., Ekundayo B.E., Ni D., Rao Z., Guo L., Stahlberg H., et al. Computational drug development for membrane protein targets. Nat. Biotechnol. 2024;42:229–242. doi: 10.1038/s41587-023-01987-2. [DOI] [PubMed] [Google Scholar]
- 61.Sheng J., Zhang T. Advancing drug development with “Fit-for-Purpose” modeling informed approaches. J. Pharmacokinet. Pharmacodyn. 2025;52:52. doi: 10.1007/s10928-025-09995-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Santra S., Kukreja P., Saxena K., Gandhi S., Singh O.V. Navigating regulatory and policy challenges for AI enabled combination devices. Front. Med. Technol. 2024;6:1473350. doi: 10.3389/fmedt.2024.1473350. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Waldron K., McFarland A., Baseman H., Jornitz M. A Risk Assessment and Risk-Based Approach Review of Pre-Use/Post-Sterilization Integrity Testing (PUPSIT) PDA J. Pharm. Sci. Technol. 2025;79:88–97. doi: 10.5731/pdajpst.2024-003038.1. [DOI] [PubMed] [Google Scholar]
- 64.Chen C., Yaari Z., Apfelbaum E., Grodzinski P., Shamay Y., Heller D.A. Merging data curation and machine learning to improve nanomedicines. Adv. Drug Deliv. Rev. 2022;183:114172. doi: 10.1016/j.addr.2022.114172. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Pan X., Jiang S., Zhang X., Wang Z., Wang X., Cao L., Xiao W. Recent strategies in target identification of natural products: Exploring applications in chronic inflammation and beyond. Br. J. Pharmacol. 2025;182:4841–4860. doi: 10.1111/bph.17356. [DOI] [PubMed] [Google Scholar]
- 66.Periyasamy M. AI-Driven Multi-Omics Integration for Enhanced Drug Discovery Pipelines; Proceedings of the 2025 International Conference on Multi-Agent Systems for Collaborative Intelligence (ICMSCI); Erode, India. 20–22 January 2025. [Google Scholar]
- 67.Ye Q., Guo N.L. Guo Inferencing Bulk Tumor and Single-Cell Multi-Omics Regulatory Networks for Discovery of Biomarkers and Therapeutic Targets. Cells. 2023;12:101. doi: 10.3390/cells12010101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Reker D., Rodrigues T., Schneider P., Schneider G. Identifying the macromolecular targets of de novo-designed chemical entities through self-organizing map consensus. Proc. Natl. Acad. Sci. USA. 2014;111:4067–4072. doi: 10.1073/pnas.1320001111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Cockroft N.T., Cheng X., Fuchs J.R. STarFish: A Stacked Ensemble Target Fishing Approach and its Application to Natural Products. J. Chem. Inf. Model. 2019;59:4906–4920. doi: 10.1021/acs.jcim.9b00489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Qiang B., Lai J., Jin H., Zhang L., Liu Z. Target Prediction Model for Natural Products Using Transfer Learning. Int. J. Mol. Sci. 2021;22:4632. doi: 10.3390/ijms22094632. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Knopman D.S., Amieva H., Petersen R.C., Chételat G., Holtzman D.M., Hyman B.T., Nixon R.A., Jones D.T. Alzheimer disease. Nature reviews. Dis. Primers. 2021;7:33. doi: 10.1038/s41572-021-00269-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Talesa V.N. Acetylcholinesterase in Alzheimer’s disease. Mech. Ageing Dev. 2001;122:1961–1969. doi: 10.1016/S0047-6374(01)00309-8. [DOI] [PubMed] [Google Scholar]
- 73.Manzoor S., Hoda N. A comprehensive review of monoamine oxidase inhibitors as Anti-Alzheimer’s disease agents: A review. Eur. J. Med. Chem. 2020;206:112787. doi: 10.1016/j.ejmech.2020.112787. [DOI] [PubMed] [Google Scholar]
- 74.Upton N., Chuang T.T., Hunter A.J., Virley D.J. 5-HT6 receptor antagonists as novel cognitive enhancing agents for Alzheimer’s disease. Neurother. J. Am. Soc. Exp. Neurother. 2008;5:458–469. doi: 10.1016/j.nurt.2008.05.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988;28:31–36. doi: 10.1021/ci00057a005. [DOI] [Google Scholar]
- 76.Ballard C., Creese B., Gatt A., Doherty P., Francis P.T., Corbett A., Aarsland D. Identifying novel candidates for re-purposing as potential therapeutic agents for Alzheimer’s disease. bioRxiv. 2019 doi: 10.1101/622308. [DOI] [Google Scholar]
- 77.Wu T., Lin R., Cui P., Yong J., Yu H., Li Z. Deep learning-based drug screening for the discovery of potential therapeutic agents for Alzheimer’s disease. J. Pharm. Anal. 2024;14:101022. doi: 10.1016/j.jpha.2024.101022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Xia J., Gan Z., Zhang J., Dong M., Liu S., Cui B., Guo P., Pang Z., Lu T., Gu N., et al. Geometric-aware deep learning enables discovery of bifunctional ligand-based liposomes for tumor targeting therapy. Nano Today. 2025;61:102668. doi: 10.1016/j.nantod.2025.102668. [DOI] [Google Scholar]
- 79.Vora L.K., Gholap A.D., Jetha K., Thakur R.R.S., Solanki H.K., Chavda V.P. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics. 2023;15:1916. doi: 10.3390/pharmaceutics15071916. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Gryniukova A., Kaiser F., Myziuk I., Alieksieieva D., Leberecht C., Heym P.P., Tarkhanova O.O., Moroz Y.S., Borysko P., Haupt V.J. AI-Powered Virtual Screening of Large Compound Libraries Leads to the Discovery of Novel Inhibitors of Sirtuin-1. J. Med. Chem. 2023;66:10241–10251. doi: 10.1021/acs.jmedchem.3c00128. [DOI] [PubMed] [Google Scholar]
- 81.Swanson K., Walther P., Leitz J., Mukherjee S., Wu J.C., Shivnaraine R.V., Zou J. ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. Bioinformatics. 2024;40:btae416. doi: 10.1093/bioinformatics/btae416. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Heid E., Greenman K.P., Chung Y., Li S.C., Graff D.E., Vermeire F.H., Wu H., Green W.H., McGill C.J. Chemprop: A Machine Learning Package for Chemical Property Prediction. J. Chem. Inf. Model. 2024;64:9–17. doi: 10.1021/acs.jcim.3c01250. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Wishart D.S., Knox C., Guo A.C., Shrivastava S., Hassanali M., Stothard P., Chang Z., Woolsey J. DrugBank: A comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006;34:D668–D672. doi: 10.1093/nar/gkj067. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Yang Q., Fan L., Hao E., Hou X., Deng J., Du Z., Xia Z. Construction of an explanatory model for predicting hepatotoxicity: A case study of the potentially hepatotoxic components of Gardenia jasminoides. Drug Chem. Toxicol. 2025;48:107–119. doi: 10.1080/01480545.2024.2364905. [DOI] [PubMed] [Google Scholar]
- 85.Moingeon P., Kuenemann M., Guedj M. Artificial intelligence-enhanced drug design and development: Toward a computational precision medicine. Drug Discov. Today. 2022;27:215–222. doi: 10.1016/j.drudis.2021.09.006. [DOI] [PubMed] [Google Scholar]
- 86.Dong J., Wang N.N., Yao Z.J., Zhang L., Cheng Y., Ouyang D., Lu A.P., Cao D.S. ADMETlab: A platform for systematic ADMET evaluation based on a comprehensively collected ADMET database. J. Cheminform. 2018;10:29. doi: 10.1186/s13321-018-0283-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Fu L., Shi S., Yi J., Wang N., He Y., Wu Z., Peng J., Deng Y., Wang W., Wu C., et al. ADMETlab 3.0: An updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support. Nucleic Acids Res. 2024;52:W422–W431. doi: 10.1093/nar/gkae236. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Xiong G., Wu Z., Yi J., Fu L., Yang Z., Hsieh C., Yin M., Zeng X., Wu C., Lu A., et al. ADMETlab 2.0: An integrated online platform for accurate and comprehensive predictions of ADMET properties. Nucleic Acids Res. 2021;49:W5–W14. doi: 10.1093/nar/gkab255. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Tian Y., Yi J., Wang N., Wu C., Peng J., Liu S., Yang G., Cao D. DDInter 2.0: An enhanced drug interaction resource with expanded data coverage, new interaction types, and improved user interface. Nucleic Acids Res. 2025;53:D1356–D1362. doi: 10.1093/nar/gkae726. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Xiong G., Yang Z., Yi J., Wang N., Wang L., Zhu H., Wu C., Lu A., Chen X., Liu S., et al. DDInter: An online drug-drug interaction database towards improving clinical decision-making and patient safety. Nucleic Acids Res. 2022;50:D1200–D1207. doi: 10.1093/nar/gkab880. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.AlJarf R., Rodrigues C.H.M., Myung Y., Pires D.E.V., Ascher D.B. piscesCSM: Prediction of anticancer synergistic drug combinations. J. Cheminform. 2024;16:81. doi: 10.1186/s13321-024-00859-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Nasnodkar S., Cinar B., Ness S. Artificial Intelligence in Toxicology and Pharmacology. J. Eng. Res. Rep. 2023;25:192–206. doi: 10.9734/jerr/2023/v25i7952. [DOI] [Google Scholar]
- 93.Banerjee P., Eckert A.O., Schrey A.K., Preissner R. ProTox-II: A webserver for the prediction of toxicity of chemicals. Nucleic Acids Res. 2018;46:W257–W263. doi: 10.1093/nar/gky318. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Tran T.T.V., Surya Wibowo A., Tayara H., Chong K.T. Artificial Intelligence in Drug Toxicity Prediction: Recent Advances, Challenges, and Future Perspectives. J. Chem. Inf. Model. 2023;63:2628–2643. doi: 10.1021/acs.jcim.3c00200. [DOI] [PubMed] [Google Scholar]
- 95.Hosseini S.-R., Zhou X. CCSynergy: An integrative deep-learning framework enabling context-aware prediction of anti-cancer drug synergy. Brief. Bioinform. 2023;24:bbac588. doi: 10.1093/bib/bbac588. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Preto A.J., Matos-Filipe P., Mourão J., Moreira I.S. SYNPRED: Prediction of drug combination effects in cancer using different synergy metrics and ensemble learning. GigaScience. 2022;11:giac087. doi: 10.1093/gigascience/giac087. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Yan K., Jia R., Guo S. SynAI: An AI-driven cancer drugs synergism prediction platform. Bioinform. Adv. 2023;3:vbad160. doi: 10.1093/bioadv/vbad160. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Jorgensen W.L. Efficient Drug Lead Discovery and Optimization. Acc. Chem. Res. 2009;42:724–733. doi: 10.1021/ar800236t. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Tong X., Liu X., Tan X., Li X., Jiang J., Xiong Z., Xu T., Jiang H., Qiao N., Zheng M. Generative Models for De Novo Drug Design. J. Med. Chem. 2021;64:14011–14027. doi: 10.1021/acs.jmedchem.1c00927. [DOI] [PubMed] [Google Scholar]
- 100.Yi J., Shi S., Fu L., Yang Z., Nie P., Lu A., Wu C., Deng Y., Hsieh C., Zeng X., et al. OptADMET: A web-based tool for substructure modifications to improve ADMET properties of lead compounds. Nat. Protoc. 2024;19:1105–1121. doi: 10.1038/s41596-023-00942-4. [DOI] [PubMed] [Google Scholar]
- 101.Lim J., Ryu S., Kim J.W., Kim W.Y. Molecular generative model based on conditional variational autoencoder for de novo molecular design. J. Cheminform. 2018;10:31. doi: 10.1186/s13321-018-0286-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Lee Y.J., Kahng H., Kim S.B. Generative Adversarial Networks for De Novo Molecular Design. Mol. Inform. 2021;40:2100045. doi: 10.1002/minf.202100045. [DOI] [PubMed] [Google Scholar]
- 103.Kadurin A., Nikolenko S., Khrabrov K., Aliper A., Zhavoronkov A. druGAN: An Advanced Generative Adversarial Autoencoder Model for de Novo Generation of New Molecules with Desired Molecular Properties in Silico. Mol. Pharm. 2017;14:3098–3104. doi: 10.1021/acs.molpharmaceut.7b00346. [DOI] [PubMed] [Google Scholar]
- 104.Lv Q., Zhou F., Liu X., Zhi L. Artificial intelligence in small molecule drug discovery from 2018 to 2023: Does it really work? Bioorganic Chem. 2023;141:106894. doi: 10.1016/j.bioorg.2023.106894. [DOI] [PubMed] [Google Scholar]
- 105.Duo L., Liu Y., Ren J., Tang B., Hirst J.D. Artificial intelligence for small molecule anticancer drug discovery. Expert Opin. Drug Discov. 2024;19:933–948. doi: 10.1080/17460441.2024.2367014. [DOI] [PubMed] [Google Scholar]
- 106.Yoshimori A., Kawasaki E., Kanai C., Tasaka T. Strategies for Design of Molecular Structures with a Desired Pharmacophore Using Deep Reinforcement Learning. Chem. Pharm. Bull. 2020;68:227–233. doi: 10.1248/cpb.c19-00625. [DOI] [PubMed] [Google Scholar]
- 107.Horwood J., Noutahi E. Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning. ACS Omega. 2020;5:32984–32994. doi: 10.1021/acsomega.0c04153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Stokes J.M., Yang K., Swanson K., Jin W., Cubillos-Ruiz A., Donghia N.M., MacNair C.R., French S., Carfrae L.A., Bloom-Ackermann Z., et al. A Deep Learning Approach to Antibiotic Discovery. Cell. 2020;180:688–702.e13. doi: 10.1016/j.cell.2020.01.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Sterling T., Irwin J.J. ZINC 15—Ligand Discovery for Everyone. J. Chem. Inf. Model. 2015;55:2324–2337. doi: 10.1021/acs.jcim.5b00559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Wong F., Zheng E.J., Valeri J.A., Donghia N.M., Anahtar M.N., Omori S., Li A., Cubillos-Ruiz A., Krishnan A., Jin W., et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature. 2024;626:177–185. doi: 10.1038/s41586-023-06887-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Xing J., Jiang L., Fu C., Liu J.Q., Zhou X., Zhang J., Zhang Y.L., He Y.C., Zhao W.D. A novel predictive model and therapeutic potential of quercetin derivatives in chronic kidney disease progression. Biochem. Pharmacol. 2025;239:117024. doi: 10.1016/j.bcp.2025.117024. [DOI] [PubMed] [Google Scholar]
- 112.Gori M., Giannitelli S.M., Zancla A., Mozetic P., Trombetta M., Merendino N., Rainer A. Quercetin and hydroxytyrosol as modulators of hepatic steatosis: A NAFLD-on-a-chip study. Biotechnol. Bioeng. 2021;118:142–152. doi: 10.1002/bit.27557. [DOI] [PubMed] [Google Scholar]
- 113.Hosseini A., Razavi B.M., Banach M., Hosseinzadeh H. Quercetin and metabolic syndrome: A review. Phytother. Res. PTR. 2021;35:5352–5364. doi: 10.1002/ptr.7144. [DOI] [PubMed] [Google Scholar]
- 114.Derosa G., Maffioli P., D’Angelo A., Di Pierro F. A role for quercetin in coronavirus disease 2019 (COVID-19) Phytother. Res. PTR. 2021;35:1230–1236. doi: 10.1002/ptr.6887. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Dabeek W.M., Marra M.V. Dietary Quercetin and Kaempferol: Bioavailability and Potential Cardiovascular-Related Bioactivity in Humans. Nutrients. 2019;11:2288. doi: 10.3390/nu11102288. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Periferakis A., Periferakis K., Badarau I.A., Petran E.M., Popa D.C., Caruntu A., Costache R.S., Scheau C., Caruntu C., Costache D.O. Kaempferol: Antimicrobial Properties, Sources, Clinical, and Traditional Applications. Int. J. Mol. Sci. 2022;23:15054. doi: 10.3390/ijms232315054. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Luo Y., Liu X.Y., Yang K., Huang K., Hong M., Zhang J., Wu Y., Nie Z. Toward Unified AI Drug Discovery with Multimodal Knowledge. Health Data Sci. 2024;4:0113. doi: 10.34133/hds.0113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Ye M., Bhat G., Johnston K.A., Tan H., Garnick M. Proprietary Rel-Ease drug delivery technology: Opportunity for sustained delivery of peptides, proteins and small molecules. Expert Opin. Drug Deliv. 2006;3:663–675. doi: 10.1517/17425247.3.5.663. [DOI] [PubMed] [Google Scholar]
- 119.Oku N. Innovations in Liposomal DDS Technology and Its Application for the Treatment of Various Diseases. Biol. Pharm. Bull. 2017;40:119–127. doi: 10.1248/bpb.b16-00857. [DOI] [PubMed] [Google Scholar]
- 120.Priyanka K.M., Varghese S.A., Naveen N.R. Revolutionizing Hyperlipidaemia Treatment: Magnetic Nanoparticle-Based Delivery Systems. Recent Adv. Drug Deliv. Formul. 2025 doi: 10.2174/0126673878383901250918195712. advance online publication . [DOI] [PubMed] [Google Scholar]
- 121.Hornick T., Mao C., Koynov A., Yawman P., Thool P., Salish K., Giles M., Nagapudi K., Zhang S. In silico formulation optimization and particle engineering of pharmaceutical products using a generative artificial intelligence structure synthesis method. Nat. Commun. 2024;15:9622. doi: 10.1038/s41467-024-54011-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Large D.E., Abdelmessih R.G., Fink E.A., Auguste D.T. Liposome composition in drug delivery design, synthesis, characterization, and clinical application. Adv. Drug Deliv. Rev. 2021;176:113851. doi: 10.1016/j.addr.2021.113851. [DOI] [PubMed] [Google Scholar]
- 123.Feng L., Wang X., Guo X., Shi L., Su S., Li X., Wang J., Tan N., Ma Y., Wang Z. Identification of Novel Target DCTPP1 for Colorectal Cancer Therapy with the Natural Small-Molecule Inhibitors Regulating Metabolic Reprogramming. Angew. Chem. 2024;63:e202402543. doi: 10.1002/anie.202402543. [DOI] [PubMed] [Google Scholar]
- 124.Qi Y., Ding L., Zhang S., Yao S., Ong J., Li Y., Wu H., Du P. A plant immune protein enables broad antitumor response by rescuing microRNA deficiency. Cell. 2022;185:1888–1904.e24. doi: 10.1016/j.cell.2022.04.030. [DOI] [PubMed] [Google Scholar]
- 125.Wu X., Yu Y., Wang M., Dai D., Yin J., Liu W., Kong D., Tang S., Meng M., Gao T., et al. AAV-delivered muscone-induced transgene system for treating chronic diseases in mice via inhalation. Nat. Commun. 2024;15:1122. doi: 10.1038/s41467-024-45383-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Dynotx National Standards Dynamic Network. [(accessed on 4 February 2026)]. Available online: https://www.dynotx.com/
- 127.Ogden P.J., Kelsic E.D., Sinai S., Church G.M. Comprehensive AAV capsid fitness landscape reveals a viral gene and enables machine-guided design. Science. 2019;366:1139–1143. doi: 10.1126/science.aaw2900. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Li F., Han J., Cao T., Lam W., Fan B., Tang W., Chen S., Fok K.L., Li L. Design of self-assembly dipeptide hydrogels and machine learning via their chemical features. Proc. Natl. Acad. Sci. USA. 2019;116:11259–11264. doi: 10.1073/pnas.1903376116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129.Damiati S.A., Damiati S. Microfluidic Synthesis of Indomethacin-Loaded PLGA Microparticles Optimized by Machine Learning. Front. Mol. Biosci. 2021;8:677547. doi: 10.3389/fmolb.2021.677547. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Pun F.W., Ozerov I.V., Zhavoronkov A. AI-powered therapeutic target discovery. Trends Pharmacol. Sci. 2023;44:561–572. doi: 10.1016/j.tips.2023.06.010. [DOI] [PubMed] [Google Scholar]
- 131.Delamarche E., Temiz Y., Lovchik R.D., Christiansen M.G., Schuerle S. Capillary Microfluidics for Monitoring Medication Adherence. Angew. Chem. 2021;60:17784–17796. doi: 10.1002/anie.202101316. [DOI] [PubMed] [Google Scholar]
- 132.Cao J., Lu W., Lv Y., Li M., Tang Y., Feng Y. Mitochondria-targeted nanocarriers for smart response delivery of natural products: A review. J. Drug Target. 2026:1–17. doi: 10.1080/1061186X.2025.2611940. advance online publication . [DOI] [PubMed] [Google Scholar]
- 133.Dumontet C., Reichert J.M., Senter P.D., Lambert J.M., Beck A. Antibody-drug conjugates come of age in oncology. Nat. Rev. Drug Discov. 2023;22:641–661. doi: 10.1038/s41573-023-00709-2. [DOI] [PubMed] [Google Scholar]
- 134.Phuna Z.X., Kumar P.A., Haroun E., Dutta D., Lim S.H. Antibody-drug conjugates: Principles and opportunities. Life Sci. 2024;347:122676. doi: 10.1016/j.lfs.2024.122676. [DOI] [PubMed] [Google Scholar]
- 135.Li X., Lai Y., Wan G., Zou J., He W., Yang P. Approved natural products-derived nanomedicines for disease treatment. Chin. J. Nat. Med. 2024;22:1100–1116. doi: 10.1016/S1875-5364(24)60726-0. [DOI] [PubMed] [Google Scholar]
- 136.Pinzi L., Bisi N., Rastelli G. How drug repurposing can advance drug discovery: Challenges and opportunities. Front. Drug Discov. 2024;4:1460100. doi: 10.3389/fddsv.2024.1460100. [DOI] [Google Scholar]
- 137.Bender A., Cortes-Ciriano I. Artificial intelligence in drug discovery: What is realistic, what are illusions? Part 2: A discussion of chemical and biological data. Drug Discov. Today. 2021;26:1040–1052. doi: 10.1016/j.drudis.2020.11.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Karaa W.B.A., Alkhammash E.H., Bchir A. Drug Disease Relation Extraction from Biomedical Literature Using NLP and Machine Learning. Mob. Inf. Syst. 2021;2021:9958410. doi: 10.1155/2021/9958410. [DOI] [Google Scholar]
- 139.Mohanty S., Harun Ai Rashid M., Mridul M., Mohanty C., Swayamsiddha S. Application of Artificial Intelligence in COVID-19 drug repurposing. Diabetes Metab. Syndr. 2020;14:1027–1031. doi: 10.1016/j.dsx.2020.06.068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Joshi C.P., Baldi A., Kumar N., Pradhan J. Harnessing network pharmacology in drug discovery: An integrated approach. Naunyn-Schmiedeberg’s Arch. Pharmacol. 2025;398:4689–4703. doi: 10.1007/s00210-024-03625-3. [DOI] [PubMed] [Google Scholar]
- 141.Prati F., Uliassi E., Bolognesi M.L. Two diseases, one approach: Multitarget drug discovery in Alzheimer’s and neglected tropical diseases. MedChemComm. 2014;5:853–861. doi: 10.1039/C4MD00069B. [DOI] [Google Scholar]
- 142.Guo P., Jiang M., Hu S., Jiang Q., Li L., Wu J., Ma Y., Wu Z. Advancing the modernization of traditional Chinese medicine through artificial intelligence and multimodal data integration. Chin. Med. 2026;21:54. doi: 10.1186/s13020-025-01194-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Muhammad B.L., Kim H.S., Aliyu I., Shehu H.A., Ki J.S. Artificial Intelligence (AI) in Saxitoxin Research: The Next Frontier for Understanding Marine Dinoflagellate Toxin Biosynthesis and Evolution. Toxins. 2026;18:26. doi: 10.3390/toxins18010026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 144.Wang A., Luo Q., Tan X., Yao Y., Peng X., Luo H., Hu Y. Development and application of artificial intelligence in traditional Chinese medicine research and development. Chin. Med. 2026;21:17. doi: 10.1186/s13020-025-01288-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Lin A., Che C., Jiang A., Qi C., Glaviano A., Zhao Z., Zhang Z., Liu Z., Zhou Z., Cheng Q., et al. Protein Spatial Structure Meets Artificial Intelligence: Revolutionizing Drug Synergy-Antagonism in Precision Medicine. Adv. Sci. 2025;12:e07764. doi: 10.1002/advs.202507764. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 146.Peng Z., Ding Y., Zhang P., Lv X., Li Z., Zhou X., Huang S. Artificial Intelligence Application for Anti-tumor Drug Synergy Prediction. Curr. Med. Chem. 2024;31:6572–6585. doi: 10.2174/0109298673290777240301071513. [DOI] [PubMed] [Google Scholar]
- 147.Rani P., Dutta K., Kumar V. Artificial intelligence techniques for prediction of drug synergy in malignant diseases: Past, present, and future. Comput. Biol. Med. 2022;144:105334. doi: 10.1016/j.compbiomed.2022.105334. [DOI] [PubMed] [Google Scholar]
- 148.Serrano D.R., Luciano F.C., Anaya B.J., Ongoren B., Kara A., Molina G., Ramirez B.I., Sánchez-Guirales S.A., Simon J.A., Tomietto G., et al. Artificial Intelligence (AI) Applications in Drug Discovery and Drug Delivery: Revolutionizing Personalized Medicine. Pharmaceutics. 2024;16:1328. doi: 10.3390/pharmaceutics16101328. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 149.Li W., Ge X., Liu S., Xu L., Zhai X., Yu L. Opportunities and challenges of traditional Chinese medicine doctors in the era of artificial intelligence. Front. Med. 2024;10:1336175. doi: 10.3389/fmed.2023.1336175. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 150.Panigrahi B.S., Dogra A., Sivakumar K., Divakar S., Bagal T.S., Upadbye M. Big data and AI in natural product drug discovery: Uncovering hidden medicinal chemistry gems; Proceedings of the 2024 9th International Conference on Science Technology Engineering and Mathematics (ICONSTEM); Chennai, India. 4–5 April 2024; New York, NY, USA: IEEE; 2024. p. 900. [DOI] [Google Scholar]
- 151.Qureshi R., Khan M.A., Hameed I.A. Artificial intelligence and biosensors in healthcare and its clinical relevance: A review. IEEE Access. 2023;11:61600–61620. doi: 10.1109/ACCESS.2023.3285596. [DOI] [Google Scholar]
- 152.Gayathri R., Sangeetha S.K.B., Sangeetha R., Mary G.L.R., Mathivanan S.K., Moorthy U. Dynamic AI-enhanced therapeutic framework for precision medicine using multi-modal data and patient-centric reinforcement learning. IEEE Access. 2025;13:77709–77733. doi: 10.1109/ACCESS.2025.3564971. [DOI] [Google Scholar]
- 153.Song Z., Chen G.-X., Chen C.Y.-C. AI empowering traditional Chinese medicine? Chem. Sci. 2024;15:16844–16886. doi: 10.1039/D4SC04107K. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 154.Lobo V.B., Chandra J.L. Convergence of blockchain and artificial intelligence to decentralize healthcare systems; Proceedings of the 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC); Erode, India. 11–13 March 2020; New York, NY, USA: IEEE; 2020. pp. 1140–1145. [DOI] [Google Scholar]
- 155.Garapati K., Maram S.S., Manikandan V.M., Ahmed S. A comprehensive approach for healthcare decision-making through integrated data mining and NLP-enhanced drug recommendation systems; Proceedings of the 2024 International Conference on Intelligent Computing and Emerging Communication Technologies (ICEC); Guntur, India. 23–25 November 2024; New York, NY, USA: IEEE; 2024. pp. 1–6. [DOI] [Google Scholar]
- 156.Rotake S.B., Hatwar P.R., Bakal R.L., Meshram S.I. The role of artificial intelligence in drug discovery. J. Drug Deliv. Ther. 2025;15:102–108. doi: 10.22270/jddt.v15i7.7251. [DOI] [Google Scholar]
- 157.Carreras-Puigvert J., Spjuth O. Artificial intelligence for high content imaging in drug discovery. Curr. Opin. Struct. Biol. 2024;87:102842. doi: 10.1016/j.sbi.2024.102842. [DOI] [PubMed] [Google Scholar]
- 158.Huanbutta K., Burapapadh K., Kraisit P., Sriamornsak P., Ganokratanaa T., Suwanpitak K., Sangnim T. Artificial intelligence-driven pharmaceutical industry: A paradigm shift in drug discovery, formulation development, manufacturing, quality control, and post-market surveillance. Eur. J. Pharm. Sci. Off. J. Eur. Fed. Pharm. Sci. 2024;203:106938. doi: 10.1016/j.ejps.2024.106938. [DOI] [PubMed] [Google Scholar]
- 159.Sharma S., Naman S., Baldi A. Recognition and quality mapping of traditional herbal drugs: Way forward towards artificial intelligence. Tradit. Med. Res. 2025;10:106938. doi: 10.53388/TMR20240416001. [DOI] [Google Scholar]
- 160.Scheeder C., Heigwer F., Boutros M. Machine learning and image-based profiling in drug discovery. Curr. Opin. Syst. Biol. 2018;10:43–52. doi: 10.1016/j.coisb.2018.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 161.Tang Q., Ratnayake R., Seabra G., Jiang Z., Fang R., Cui L., Ding Y., Kahveci T., Bian J., Li C., et al. Morphological profiling for drug discovery in the era of deep learning. Brief. Bioinform. 2024;25:bbae284. doi: 10.1093/bib/bbae284. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 162.Visan A.I., Negut I. Integrating Artificial Intelligence for Drug Discovery in the Context of Revolutionizing Drug Delivery. Life. 2024;14:233. doi: 10.3390/life14020233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 163.Álvarez-Machancoses Ó., Fernández-Martínez J.L. Using artificial intelligence methods to speed up drug discovery. Expert Opin. Drug Discov. 2019;14:769–777. doi: 10.1080/17460441.2019.1621284. [DOI] [PubMed] [Google Scholar]
- 164.Ruehle B. Natural language processing for automated workflow and knowledge graph generation in self-driving labs. Digit. Discov. 2025;4:1534–1543. doi: 10.1039/D5DD00063G. [DOI] [Google Scholar]
- 165.Behr A.S., Chernenko D., Koßmann D., Neyyathala A., Hanf S., Schunk S.A., Kockmann N. Generating knowledge graphs through text mining of catalysis research related literature. Catal. Sci. Technol. 2024;14:5699–5713. doi: 10.1039/D4CY00369A. [DOI] [Google Scholar]
- 166.Kelm J.M., Ferrer M., Bittner M.I., Lal-Nag M. Data standards in drug discovery: A long way to go. Drug Discov. Today. 2024;29:103879. doi: 10.1016/j.drudis.2024.103879. [DOI] [PubMed] [Google Scholar]
- 167.Ghosh S., Matsuoka Y., Kitano H. Connecting the dots: Role of standardization and technology sharing in biological simulation. Drug Discov. Today. 2010;15:1024–1031. doi: 10.1016/j.drudis.2010.10.001. [DOI] [PubMed] [Google Scholar]
- 168.Gold E.R., Cook-Deegan R. AI drug development’s data problem. Science. 2025;388:131. doi: 10.1126/science.adx0339. [DOI] [PubMed] [Google Scholar]
- 169.Gangwal A., Ansari A., Ahmad I., Azad A.K., Wan Sulaiman W.M.A. Current strategies to address data scarcity in artificial intelligence-based drug discovery: A comprehensive review. Comput. Biol. Med. 2024;179:108734. doi: 10.1016/j.compbiomed.2024.108734. [DOI] [PubMed] [Google Scholar]
- 170.Cahan E.M., Khatri P. Data Heterogeneity: The Enzyme to Catalyze Translational Bioinformatics? J. Med. Internet Res. 2020;22:e18044. doi: 10.2196/18044. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 171.Waller C.L., Shah A., Nolte M. Strategies to support drug discovery through integration of systems and data. Drug Discov. Today. 2007;12:634–639. doi: 10.1016/j.drudis.2007.06.007. [DOI] [PubMed] [Google Scholar]
- 172.Yang X., Wang Y., Byrne R., Schneider G., Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019;119:10520–10594. doi: 10.1021/acs.chemrev.8b00728. [DOI] [PubMed] [Google Scholar]
- 173.Chan A.W.E. The shape of artificial intelligence: Just a black box? Chem. Biol. Drug Des. 2020;96:882–885. doi: 10.1111/cbdd.13793. [DOI] [PubMed] [Google Scholar]
- 174.Haas C.M., Jasti N., Dosey A., Allen J.D., Gillespie R., McGowan J., Leaf E.M., Crispin M., DeForest C.A., Kanekiyo M., et al. From sequence to scaffold: Computational design of protein nanoparticle vaccines from AlphaFold2-predicted building blocks. Proc. Natl. Acad. Sci. USA. 2025;122:e2409566122. doi: 10.1073/pnas.2409566122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 175.Parekh V.A., Amina M., Islam M.L., Patil P.C., Ali M.A., Wabaidur S.M., Islam M.A. Identification of phosphodiesterase 10 A modulators for neurodegenerative and psychiatric disorders: Combination of physics-based virtual screening and machine learning approaches. Comput. Biol. Chem. 2026;121:108875. doi: 10.1016/j.compbiolchem.2025.108875. [DOI] [PubMed] [Google Scholar]
- 176.Marzouk M.A. Pyrimidine derivatives as multifaceted antidiabetic agents: A comprehensive review of structure-activity relationships, mechanisms, and clinical potential. Eur. J. Med. Chem. 2025;296:117859. doi: 10.1016/j.ejmech.2025.117859. [DOI] [PubMed] [Google Scholar]
- 177.Meng F., Martínez González M., Chuiko V., Tehrani A., Al Nabulsi A.R., Broscius A., Khaleel H., López-Pérez K., Miranda-Quintana R.A., Ayers P.W., et al. Selector: A General Python Library for Diverse Subset Selection. J. Chem. Inf. Model. 2026;66:1275–1285. doi: 10.1021/acs.jcim.5c01499. [DOI] [PubMed] [Google Scholar]
- 178.Ghislat G., Hernandez-Hernandez S., Piyawajanusorn C., Ballester P.J. Data-centric challenges with the application and adoption of artificial intelligence for drug discovery. Expert Opin. Drug Discov. 2024;19:1297–1307. doi: 10.1080/17460441.2024.2403639. [DOI] [PubMed] [Google Scholar]
- 179.Manan A., Baek E., Ilyas S., Lee D. Digital Alchemy: The Rise of Machine and Deep Learning in Small-Molecule Drug Discovery. Int. J. Mol. Sci. 2025;26:6807. doi: 10.3390/ijms26146807. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 180.Panchalingam S., Kasivelu G., Jayaraman M., Jeyaraman J. Machine learning guided structural dynamics identifies translation elongation factor 1 (EEF1A1) as an immunological biomarker and marine natural products as therapeutic leads for rheumatoid arthritis with major depressive disorder. Comput. Biol. Med. 2026;203:111480. doi: 10.1016/j.compbiomed.2026.111480. [DOI] [PubMed] [Google Scholar]
- 181.Lavecchia A. Navigating the frontier of drug-like chemical space with cutting-edge generative AI models. Drug Discov. Today. 2024;29:104133. doi: 10.1016/j.drudis.2024.104133. [DOI] [PubMed] [Google Scholar]
- 182.Hashmi O.K., Aghayeva S., Uddin R. Interpretable machine learning models for QSAR-based prediction of anti-Salmonella typhi activity. Future Med. Chem. 2026:1–15. doi: 10.1080/17568919.2026.2619464. Advance online publication. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 183.Blin K., Loureiro C., Louwen N.L.L., Navarro-Muñoz J.C., Gerstmans H., Robinson S.L., Rutz A., Reitz Z.L., Doering D.T., van der Hooft J.J.J., et al. Strategies for community-sourced biocuration in bioinformatics: A case study on MIBiG 4.0. Brief. Bioinform. 2025;26:bbaf659. doi: 10.1093/bib/bbaf659. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 184.Friso F., Mendive F., Soffiato M., Bombardelli V., Hesketh A., Heinrich M., Menghini L., Politi M. Implementation of Nagoya Protocol on access and benefit-sharing in Peru: Implications for researchers. J. Ethnopharmacol. 2020;259:112885. doi: 10.1016/j.jep.2020.112885. [DOI] [PubMed] [Google Scholar]
- 185.Houmenou C.T., Sokhna C., Fenollar F., Mediannikov O. Advancements and challenges in bioinformatics tools for microbial genomics in the last decade: Toward the smart integration of bioinformatics tools, digital resources, and emerging technologies for the analysis of complex biological data. Infect. Genet. Evol. 2025;136:105859. doi: 10.1016/j.meegid.2025.105859. [DOI] [PubMed] [Google Scholar]
- 186.Blankespoor J. Risks of Using AI in Herbalism. 2025. [(accessed on 4 February 2026)]. Available online: https://chestnutherbs.com/risks-of-using-ai-in-herbalism/
- 187.Ahmed S.A., Pinjari N.S. A Review of Sustainable Paradigms in Contemporary Drug Discovery and Development: Review Article. J. Pharma Insights Res. 2025;3:1–18. doi: 10.69613/a500k960. [DOI] [Google Scholar]
- 188.Warokar P., Lote S. Ethical and Regulatory Consideration in AI-Assisted Drug Development; Proceedings of the 2024 2nd DMIHER International Conference on Artificial Intelligence in Healthcare, Education and Industry (IDICAIEI); Wardha, India. 29–30 November 2024. [Google Scholar]
- 189.Zhang C., You J., Lin R., Ye Y., Cheng C., Wang H., Li D., Wang J., Chen S. Engineering Self-Assembled PEEK Scaffolds with Marine-Derived Exosomes and Bacteria-Targeting Aptamers for Enhanced Antibacterial Functions. J. Funct. Biomater. 2025;17:23. doi: 10.3390/jfb17010023. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 190.Kotkondawar R.R., Sutar S.R., Kiwelekar A.W., Kadam V.J., Jadhav S.M. A generative framework for enhancing drug target interaction prediction in drug discovery. Sci. Rep. 2025;15:35588. doi: 10.1038/s41598-025-01589-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 191.Qu J., Ju S., Zhang M., Yin R., Zhang L., Zhang J., Pan Y., Wang L., Liu Y. Research Hotspots and Trends of Artificial Intelligence in Drug Discovery: A Review and Bibliometric Analysis. Mini Rev. Med. Chem. 2026. doi: 10.2174/0113895575400393251007093333. Advance online publication. [DOI] [PubMed] [Google Scholar]
- 192.Carriel C.C., Halberg-Spencer S.A., Kotvanova M., Pyne S., Park S.C., Seo H.W., Schmidt A., Calise D.G., Ané J.M., Keller N.P., et al. A network-based model of Aspergillus fumigatus elucidates regulators of development and defensive natural products of an opportunistic pathogen. Nucleic Acids Res. 2026;54:gkaf1439. doi: 10.1093/nar/gkaf1439. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 193.Li X., Xing J., Zhang S., Zhou J. Editorial: Advancing drug discovery with AI: Drug-target interactions, mechanisms of action, and screening. Front. Pharmacol. 2025;16:1721323. doi: 10.3389/fphar.2025.1721323. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 194.Gupta R., Srivastava D., Sahu M., Tiwari S., Ambasta R.K., Kumar P. Artificial intelligence to deep learning: Machine intelligence approach for drug discovery. Mol. Divers. 2021;25:1315–1360. doi: 10.1007/s11030-021-10217-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 195.Niu T., Zhu Y., Mou M., Fu T., Yang H., Sun H., Liu Y., Zhu F., Zhang Y., Liu Y. Identification of natural product-based drug combination (NPDC) using artificial intelligence. Chin. J. Nat. Med. 2025;23:1377–1390. doi: 10.1016/S1875-5364(25)60942-3. [DOI] [PubMed] [Google Scholar]
- 196.Polini A., Moroni L. The convergence of high-tech emerging technologies into the next stage of organ-on-a-chips. Biomater. Biosyst. 2021;1:100012. doi: 10.1016/j.bbiosy.2021.100012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 197.De Chiara F., Ferret-Miñana A., Ramón-Azcón J. The Synergy between Organ-on-a-Chip and Artificial Intelligence for the Study of NAFLD: From Basic Science to Clinical Research. Biomedicines. 2021;9:248. doi: 10.3390/biomedicines9030248. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 198.Kandula S.K., Katam N., Kangari P.R., Hijmal A., Gurrala R., Mahmoud M. Quantum computing potentials for drug discovery; Proceedings of the 2023 International Conference on Computational Science and Computational Intelligence (CSCI); Las Vegas, NV, USA. 13–15 December 2023; New York, NY, USA: IEEE; 2023. pp. 1467–1473. [DOI] [Google Scholar]
- 199.Srivastava R. Quantum computing in drug discovery. Inf. Syst. Smart City. 2023;3:294. doi: 10.59400/issc.v3i1.294. [DOI] [Google Scholar]
- 200.Thomford N.E., Senthebane D.A., Rowe A., Munro D., Seele P., Maroyi A., Dzobo K. Natural Products for Drug Discovery in the 21st Century: Innovations for Novel Drug Discovery. Int. J. Mol. Sci. 2018;19:1578. doi: 10.3390/ijms19061578. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 201.Wang Y., Hu Z., Chang J., Yu B. Thinking on the Use of Artificial Intelligence in Drug Discovery. J. Med. Chem. 2025;68:4996–4999. doi: 10.1021/acs.jmedchem.5c00373. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.





