Abstract
Toxicity risk assessment plays a crucial role in determining the clinical success and market potential of drug candidates. Traditional animal-based testing is costly, time-consuming, and ethically controversial, which has led to the rapid development of computational toxicology. This review surveys over 20 ADMET prediction platforms, categorizing them into rule/statistical-based methods, machine learning (ML) methods, and graph-based methods. We also summarize major toxicological databases into four types: chemical toxicity, environmental toxicology, alternative toxicology, and biological toxin databases, highlighting their roles in model training and validation. Furthermore, we review recent advancements in ML and artificial intelligence (AI) applied to toxicity prediction, covering acute toxicity, organ-specific toxicities, and carcinogenicity. The field is transitioning from single-endpoint predictions to multi-endpoint joint modeling, incorporating multimodal features. We also explore the application of generative modeling techniques and interpretability frameworks to improve the accuracy and credibility of predictions. Additionally, we discuss the use of network toxicology in evaluating the safety of traditional Chinese medicines (TCMs) and the potential of large language models (LLMs) in literature mining, knowledge integration, and molecular toxicity prediction. Finally, we address current challenges, including data quality, model interpretability, and causal inference, and propose future directions such as multi-omics integration, interpretable AI models, and domain-specific LLMs, aiming to provide more efficient and precise technical support for preclinical toxicity assessments in drug development.
Keywords: drug discovery, ADMET prediction, computational toxicology, machine learning, toxin databases, large language models
Introduction
Drug discovery and development constitute a complex system engineering endeavor that integrates scientific rigor, economic viability, and societal implications, where the primary challenge is balancing therapeutic efficacy and safety thresholds of candidate compounds [1]. It has been reported that ~30% of preclinical candidate compounds (PCCs) fail due to toxicity issues, making adverse toxicological reactions the leading cause of drug withdrawal from the market [2, 3]. This reality underscores the strategic importance of toxicity assessment within the drug development pipeline. Toxicological evaluation serves as a pivotal link between fundamental research and clinical translation, significantly influencing not only development timelines and cost control but also public health safety and optimal allocation of healthcare resources [4]. Consequently, establishing efficient, accurate toxicity prediction methodologies has emerged as a global technological imperative in innovative drug discovery.
Traditional toxicity assessment paradigms rely heavily on in vivo animal experiments, typically employing sequential toxicity tests (acute, subacute, and chronic toxicity assays) to characterize the risk profiles of candidate compounds [5]. This approach has extensive historical data, but it no longer meets modern ethical and efficiency standards. On one hand, animal experiments are hindered by uncertainties in cross-species extrapolation, protracted testing durations (typically 6–24 months), and extremely high costs per compound (often exceeding millions of dollars) [6]. On the other hand, the widespread adoption of the “3Rs principle” (replacement, reduction, and refinement) places significant ethical pressure on traditional animal-based methodologies [7]. These conflicting demands have spurred the rapid emergence of computational toxicology, which integrates quantum chemical calculations, molecular dynamics simulations, machine learning (ML) algorithms, and multi-omics datasets to develop mechanism-based predictive models, thereby shifting from an “experience-driven” to a “data-driven” evaluation paradigm [8–11].
The theoretical advances underpinning computational toxicology have arisen from a deeper understanding of the multiscale mechanisms driving toxicological effects. Modern toxicological research has elucidated that drug toxicity is essentially an emergent property stemming from multiscale interactions between small molecules and biological systems: at the molecular level, metabolic activation, covalent modifications, and off-target interactions serve as initial triggers of toxicity; at the cellular level, mitochondrial dysfunction, oxidative stress, and aberrant activation of cell-death pathways amplify toxic phenotypes; and at the systemic level, disruptions of inter-organ metabolic networks and disturbances in the immune microenvironment ultimately manifest as clinically observable pathological outcomes [12, 13]. This hierarchical progression of toxic mechanisms necessitates predictive models with comprehensive, multidimensional information integration capabilities.
Currently, computational methods such as quantitative structure–activity relationship (QSAR) modeling, molecular docking, and systems toxicology have achieved significant predictive accuracy in critical toxicity evaluations, including hepatotoxicity and cardiotoxicity. When sufficient data are available, their predictive performance has approached or even surpassed that of traditional animal-based assays [14–16]. Meanwhile, the rapid advancement of artificial intelligence (AI) technologies has further enhanced the predictive capabilities of computational toxicology [17]. Deep learning algorithms, notably graph neural networks (GNNs), can automatically extract molecular structural features and identify latent relationships between molecular structures and toxicity profiles [18, 19]. Furthermore, transformer architectures effectively integrate multimodal data, including chemical structure, genomic perturbations, and pathological phenotypes, into end-to-end predictive pipelines, significantly improving model generalization [20]. Concurrently, the collaborative evolution of large-scale toxicity databases and cloud computing platforms has made virtual screening of millions of compounds feasible, improving screening efficiency by two to three orders of magnitude relative to traditional experimental approaches [21].
Despite these notable advances, computational toxicology continues to face substantial challenges. Current approaches are constrained by uneven data quality, insufficient dataset coverage, and limited model interpretability, and predictive accuracy is often suboptimal for novel or structurally complex multitarget compounds [22]. To address these bottlenecks, increasing research efforts have adopted multilayered, multidimensional integrated approaches, combining experimental data with network pharmacology and systems biology to construct more accurate and comprehensive toxicity prediction frameworks. Additionally, integrating predictive tools deeply with clinical drug data is essential to accurately identify potential toxicity risks during early drug discovery stages, thus providing reliable decision-making support for subsequent clinical development [23–25].
Given these considerations, this review systematically examines recent advancements in bioinformatics methods and technologies for drug toxicity research, focusing on applications of ML/AI methods. We begin by introducing the ADMET (absorption, distribution, metabolism, excretion, and toxicity) prediction system and prevalent computational platforms. Subsequently, we discuss key features and application scenarios of existing toxicology databases, critically analyzing the strengths and limitations of various toxicity task prediction algorithms. Furthermore, we provide an overview of network toxicology and its applications in assessing the safety of complex therapeutics, such as traditional Chinese medicine (TCM) formulations, and outline emerging potentials of large language models (LLMs) in toxicological research. Collectively, this review aims to provide theoretical and practical guidance for toxicity assessments in drug discovery and to inform the design of preclinical research and early-stage clinical trials.
Artificial intelligence in ADMET research
Framework of ADMET prediction methods
Adverse pharmacokinetic properties pose a significant threat to human health and environmental safety and represent one of the leading causes of drug development failure. Approximately 40% of preclinical candidate drugs fail due to inadequate ADMET profiles, while nearly 30% of marketed drugs are withdrawn due to unforeseen toxic reactions [26]. Early integration of ADMET factors into the evaluation of new chemical entities has been shown to significantly reduce attrition rates in drug discovery [27]. Therefore, it is crucial to predict and optimize the ADMET properties of candidate compounds in advance. ADMET evaluation encompasses the absorption, distribution, metabolism, excretion, and toxicity of drugs, providing a comprehensive assessment of their in vivo behavior and predicting their clinical efficacy and safety (Fig. 1).
Figure 1.
The five parts of ADMET and their related endpoints.
Toxicity is a critical component of drug safety assessment, with potential adverse effects including neurotoxicity, organ toxicity, genotoxicity, and carcinogenicity [28]. Acute toxicity is typically assessed through in vivo metrics such as LD50 (median lethal dose) and in vitro endpoints like IGC50 (50% growth inhibitory concentration). Hepatotoxicity, nephrotoxicity, and cardiotoxicity are common drug-induced toxicities. Hepatic damage is generally characterized by elevated alanine aminotransferase (ALT), aspartate aminotransferase (AST), and bilirubin levels, while nephrotoxicity can be detected in clinical or preclinical settings via serum creatinine and blood urea nitrogen measurements. Cardiotoxicity is associated with hERG channel inhibition, potentially leading to fatal arrhythmias [28]. Therefore, comprehensive toxicological evaluation integrating both in vitro and in vivo endpoints is essential for ensuring drug safety and minimizing clinical adverse reactions.
With the continuous advancements in ML/AI technologies, numerous ADMET prediction platforms based on these approaches have emerged. These platforms significantly enhance the efficiency of drug discovery and development, offering several key advantages. Firstly, they can rapidly process extensive datasets of chemical compounds, such as their molecular structures and physicochemical properties, substantially reducing experimental costs and time. Secondly, AI models can leverage large-scale historical ADMET data from previous experiments to deliver more accurate predictions. Lastly, these platforms facilitate drug performance analysis from multiple perspectives by simulating various physiological conditions and environmental factors, thereby enabling researchers to make more scientifically informed decisions.
The fundamental framework of an ADMET prediction platform constitutes a multilayered system encompassing the complete workflow from data input and model training to predictive output (Fig. 2). This framework uses robust computational methods, big data, and multidimensional information to improve prediction accuracy and reliability. Specifically, the ADMET prediction platform typically comprises the following critical components:
Figure 2.
The basic framework of ADMET prediction platforms.
Input component: the input component forms the foundation for the platform’s operation. It requires comprehensive chemical structural data and related molecular information, including molecular formulas, molecular weights, and molecular structures. Additionally, extensive ADMET experimental data, such as drug bioavailability, hepatic metabolic stability, and clearance rates, must be integrated. Experimental datasets are not limited to static data but also encompass reaction data under various conditions (e.g. different pH levels, physiological states). Moreover, literature-derived data is indispensable, enabling researchers to enrich the platform’s databases with previously published datasets and experimental outcomes, thereby ensuring the reliability and generalizability of platform predictions.
Tools/methods component: this component is the core of the platform and consists of two main submodules:
Physicochemical property calculation module: utilizing chemoinformatics software packages such as RDKit and Scopy, this module computes basic physicochemical properties of chemical compounds, including molecular weight, pKa, log P, TPSA, and hydrogen bond acceptors/donors. These fundamental physicochemical properties provide preliminary predictive information for various ADMET characteristics, such as drug absorption, distribution, and metabolism. The calculated results typically serve as foundational features for ML models.
ML/AI prediction module: based on substantial experimental data and computational chemical information, ML algorithms (e.g. support vector machines (SVMs), random forests (RFs), neural networks, gradient boosting trees) are applied to predict various ADMET properties. These models can be classified into regression and classification types, depending on specific prediction tasks. Regression models predict continuous ADMET parameters, such as in vivo drug t1/2 (half-life), VDss (volume of distribution at steady state), CL (clearance), and MRT (mean residence time). In contrast, classification models predict discrete ADMET indicators, such as in vitro BBB (blood–brain barrier) permeability assays and in vitro HLM (human liver microsome) stability metrics. These classification models identify potential risks and features by training on extensive datasets that encompass both in vitro and in vivo data. Furthermore, ML/AI models continuously improve prediction performance through techniques like feature selection and hyperparameter optimization, ensuring the platform’s adaptability across different drug types and both in vitro and in vivo ADMET characteristics.
Output component: the output component represents the final form of the platform’s predictions, generally referred to as “endpoints” indicating the predictive values or classification results corresponding to each ADMET characteristic. The number and type of these endpoints vary among platforms. However, comprehensive ADMET prediction platforms can evaluate over 100 different endpoints, encompassing both in vitro properties (such as HLM stability and PPB) and in vivo pharmacokinetic parameters (such as bioavailability (F), half-life (t1/2), and renal clearance (CLr)). These results are presented through intuitive data visualization and reporting tools, enabling researchers to swiftly acquire both in vitro and in vivo drug ADMET characteristics to support informed decision-making. The platform outputs extend beyond single predictive values and incorporate physicochemical properties with predictive outcomes to offer deeper analyses, such as directions for drug optimization, potential side effects, and optimal routes of administration [29]. Additionally, some platforms facilitate comparisons between candidate drugs and other compounds, assisting developers in drug screening and optimization processes [30].
In summary, the fundamental architecture of ADMET prediction platforms constitutes an integrated and intelligent system providing robust support for drug research and development through efficient data input, precise computational tools, and powerful ML algorithms. These platforms not only enhance development efficiency and reduce experimental costs but also accurately predict drug safety and efficacy in early development stages, thereby establishing a solid scientific foundation for successful new drug discovery.
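As a toy illustration of the physicochemical property calculation module described above, the following sketch derives one basic descriptor, molecular weight, directly from a molecular formula. It is a minimal example under stated assumptions: the function name `molecular_weight` and the small `ATOMIC_MASS` table are introduced here for illustration only, parentheses and isotopes are not handled, and a production platform would rely on a chemoinformatics toolkit such as RDKit rather than on this parser.

```python
import re

# Abridged standard atomic masses (g/mol) for a few common elements.
# This table is an illustrative subset, not a complete periodic table.
ATOMIC_MASS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}

def molecular_weight(formula: str) -> float:
    """Return the molecular weight of a simple formula such as 'C9H8O4'.

    Only flat formulas are supported (element symbols followed by
    optional counts); parentheses, charges, and isotopes are not.
    """
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"Unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"Unknown element: {symbol!r}")
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total

if __name__ == "__main__":
    # Aspirin has the formula C9H8O4.
    print(round(molecular_weight("C9H8O4"), 3))
```

Descriptors of this kind are typically collected into a feature vector that the ML/AI prediction module consumes.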
Overview of ADMET prediction platforms
Nowadays, numerous computational tools have been developed to predict various ADMET-related properties, ranging from broad-spectrum platforms to tools specialized in specific aspects (Table 1 and Supplementary Table 1). Broad-spectrum platforms, such as admetSAR 3.0 [31], ADMETlab 3.0 [32], vNN-ADMET [33], and ADMETboost [34], provide comprehensive coverage across all five ADMET dimensions. By integrating multiple predictive models, these platforms offer systematic assessments of compounds in terms of ADMET. In contrast, platforms like Swiss ADME [30], FAF-Drugs4.0 [35], and ADMET-AI [36] focus specifically on pharmacokinetic properties, emphasizing predictions related to absorption, distribution, metabolism, and excretion. Additionally, certain platforms concentrate solely on toxicity prediction; e.g. ProTox 3.0 [37] and VenomPred 2.0 [38] are tailored for evaluating toxicity endpoints such as hepatotoxicity, carcinogenicity, and mutagenicity. Furthermore, tools like CypRules [39], BioTransformer 3.0 [40], XenoSite [41], SOMP [42], and SMARTCyp 3.0 [43] specifically address interactions with cytochrome P450 enzymes (CYPs), playing critical roles in drug metabolism studies.
Table 1.
Overview of ADMET Prediction Platforms
| Platform | ADMET endpoint | ADME endpoint | T endpoint | Model |
|---|---|---|---|---|
| CypRules [39] | 5 | 5 | / | Rule-based C5.0 decision tree |
| Swiss ADME [30] | 37 | 9 | / | Rule-based method, SVM |
| FAF-Drugs4.0 [35] | 40 | / | 4 | Rule-based method |
| SMARTCyp 3.0 [43] | 3 | 3 | / | Rule-based method |
| SOMP [42] | 6 | 6 | / | Bayesian-based algorithm |
| XenoSite [41] | 9 | 9 | / | DNN |
| vNN-ADMET [33] | 15 | 9 | 6 | vNN |
| ProTox 3.0 [37] | 61 | / | 61 | RF, DNN |
| Virtual Rat [54] | 12 | 10 | 2 | RF, C5.0, SVM, CART |
| FP-ADMET [55] | 56 | 24 | 30 | RF |
| ICDrug [56] | 14 | 8 | 6 | RF |
| BioTransformer 3.0 [40] | 9 | 9 | / | ML |
| ADMETboost [34] | 29 | 18 | 4 | XGBoost |
| ADMET-AI [36] | 49 | 22 | 18 | GNN |
| VenomPred 2.0 [38] | 12 | / | 12 | RF, SVM, k-NN, MLP |
| AquaticTox [50] | 5 | / | 5 | Ensemble model |
| OptADMET [57] | 32 | 14 | 15 | QSAR |
| pkCSM [51] | 36 | 20 | 10 | Graph-based signatures |
| Interpretable-ADMET [52] | 49 | 20 | 29 | GCNN, GAT |
| HelixADMET [53] | 52 | 14 | 16 | GNN |
| admetSAR 3.0 [31] | 119 | 39 | 43 | CLMGraph |
| ADMETlab 3.0 [32] | 119 | 34 | 36 | DMPNN |
| Deep-PK [58] | 73 | 30 | 34 | DMPNN |
Note: The number in the “ADMET endpoint” column represents the total number of ADMET property endpoints the tool can predict; the number in the “ADME endpoint” column represents the number of ADME property endpoints; the number in the “T endpoint” column represents the number of Toxicity (T) endpoints. The symbol “/” indicates that the tool does not provide predictions for that particular category of properties.
According to the computational methodologies employed, existing ADMET prediction platforms can generally be classified into three categories: rule/statistical-based methods, ML-based methods, and graph-based methods (Fig. 3). The following subsections briefly examine the characteristics of each category and its applications in ADMET prediction.
Figure 3.
Classification of computational modeling approaches in ADMET prediction platforms: rule/statistical-based methods, ML-based methods, and graph-based methods.
Rule-based/statistical methods
Rule-based and statistical methods represent an earlier paradigm of ADMET prediction. These methods stand out due to their high computational efficiency and interpretability. They typically leverage chemical rule databases, experimental data, and statistical inference to perform rapid pharmacokinetic evaluations. For example, SMARTCyp 3.0 [43] and CypRules [39] specialize in predicting cytochrome P450 metabolism sites by combining chemical heuristics with quantum chemical computations, enabling quick identification of potential metabolism sites within a molecule [44]. Swiss ADME [30], as a comprehensive prediction tool, integrates physicochemical property calculations (e.g. log P, TPSA), drug-likeness rules (Lipinski, Veber rules), and structural toxicity alerts (such as PAINS), supplemented with SVM-optimized key endpoints (log S, blood–brain barrier permeability, P-gp substrate specificity), providing an intuitive and comprehensive ADMET evaluation for early-stage drug screening. The FAF-Drugs platform [35], currently in version 4.0, first calculates physicochemical descriptors for input molecules and then performs preliminary screening based on predefined thresholds and rules (e.g. Lipinski’s rules). Subsequently, it uses SMARTS pattern matching to identify potentially toxic or undesirable structures, relying entirely on expert-driven methodologies rather than ML algorithms. Its notable strengths include simplicity, efficiency, ease of customization, and scalability for large-scale screening efforts. Additionally, probabilistic approaches have been explored; for instance, the SOMP [42] tool generates all possible Sites of Labeled Atoms (SoLAs) from the 2D molecular structure and characterizes them using LMNA descriptors based on molecular neighborhoods. It then applies a Bayesian probabilistic model to rank and predict likely metabolic sites, effectively utilizing early-stage structural information.
However, these rule-based/statistical methods inherently depend heavily on established chemical rules and experimental data, often exhibiting limited predictive capability when encountering novel or structurally complex compounds.
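The threshold-based screening described above can be sketched in a few lines. The example below applies Lipinski's rule of five (molecular weight ≤ 500, log P ≤ 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) and Veber's rules (at most 10 rotatable bonds, TPSA ≤ 140 Å²) to precomputed descriptors. It is a minimal sketch, not the implementation of any platform discussed here: the `Descriptors` container, the function names, the tolerance of one Lipinski violation, and the two example molecules with their descriptor values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float   # molecular weight, g/mol
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors
    rot_bonds: int      # rotatable bonds
    tpsa: float         # topological polar surface area, Å^2

def lipinski_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])

def passes_veber(d: Descriptors) -> bool:
    """Veber's rules: limited flexibility and polar surface area."""
    return d.rot_bonds <= 10 and d.tpsa <= 140

def passes_screen(d: Descriptors, max_lipinski_violations: int = 1) -> bool:
    """Assumed policy: tolerate at most one Lipinski violation, require Veber."""
    return (lipinski_violations(d) <= max_lipinski_violations
            and passes_veber(d))

# Hypothetical descriptor values, chosen only to exercise both outcomes.
small = Descriptors(300.0, 2.5, 2, 5, 4, 80.0)
large = Descriptors(750.0, 6.2, 6, 12, 14, 190.0)

if __name__ == "__main__":
    for name, d in [("small", small), ("large", large)]:
        print(name, lipinski_violations(d), passes_screen(d))
```

Because each rule is an explicit threshold, a rejected compound can be traced to the specific criterion it violated, which is the interpretability advantage noted for this class of methods.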
Machine learning-based methods
ML-based methods leverage chemical descriptors and algorithms such as RF, SVM, k-Nearest Neighbors (k-NN), and Gradient Boosting Trees (GBT) to predict ADMET properties [45–48]. The primary strength of these approaches lies in their ability to discern complex patterns within large datasets, making them well-suited for handling diverse chemical spaces and structurally intricate molecules [49]. For example, vNN-ADMET [33] utilizes a variant of the k-NN approach, variable Nearest Neighbor (vNN), for pharmacokinetic predictions. ADMETboost [34] employs XGBoost, a gradient boosting tree-based ensemble algorithm optimized for structured data handling. Some platforms adopt combined or hybrid approaches tailored to specific prediction tasks; ProTox 3.0 [37], specialized in toxicity prediction, uses eight data-sampling methods alongside RF and deep neural networks (DNNs) to predict up to 61 distinct toxicity endpoints, including hepatotoxicity, carcinogenicity, and mutagenicity. AquaticTox [50], the first tool dedicated specifically to aquatic toxicity predictions, employs a stacked ensemble methodology comprising six ML models [RF, AdaBoost, Gradient Boosting, SVM, Fully Connected Networks (FCNs), and Graph Convolutional Neural Networks (GCNNs)]. VenomPred 2.0 [38] predicts multiple toxicity endpoints by combining three different chemical fingerprints (Morgan, RDKit, PubChem) with four distinct algorithms [RF, SVM, k-NN, and Multilayer Perceptron (MLP)]. Despite their substantial predictive capabilities, these ML methods are strongly reliant on high-quality training data; inadequate or biased training datasets may significantly impair performance. Furthermore, some models, particularly deep learning approaches, often lack interpretability regarding their underlying chemical rationale.
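To make the variable nearest neighbor idea concrete, the sketch below follows the commonly described vNN formulation: every training compound within a Tanimoto distance threshold contributes to the prediction, weighted by a Gaussian-like function of its distance, and no prediction is returned when no neighbor is close enough. This is an illustrative sketch rather than the vNN-ADMET implementation. Fingerprints are represented as plain sets of integer "on" bits, and the function names, the default threshold `d0`, the smoothing factor `h`, and the toy training data are assumptions.

```python
import math
from typing import FrozenSet, List, Optional, Tuple

Fingerprint = FrozenSet[int]

def tanimoto_distance(a: Fingerprint, b: Fingerprint) -> float:
    """1 - |A ∩ B| / |A ∪ B| for fingerprints stored as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union

def vnn_predict(
    query: Fingerprint,
    training: List[Tuple[Fingerprint, float]],
    d0: float = 0.6,   # assumed distance threshold
    h: float = 0.3,    # assumed smoothing factor
) -> Optional[float]:
    """Distance-weighted average over all neighbors with distance <= d0.

    Returns None when the query lies outside the applicability domain,
    i.e. when no training compound is within the threshold.
    """
    numerator = 0.0
    denominator = 0.0
    for fingerprint, label in training:
        d = tanimoto_distance(query, fingerprint)
        if d <= d0:
            weight = math.exp(-((d / h) ** 2))
            numerator += weight * label
            denominator += weight
    if denominator == 0.0:
        return None
    return numerator / denominator

# Toy training set: (fingerprint, toxicity label), values are hypothetical.
TRAINING = [
    (frozenset({1, 2, 3, 4}), 1.0),
    (frozenset({1, 2, 3, 5}), 0.0),
    (frozenset({10, 11}), 1.0),
]

if __name__ == "__main__":
    print(vnn_predict(frozenset({1, 2, 3, 4}), TRAINING))
    print(vnn_predict(frozenset({20, 21}), TRAINING))
```

Unlike standard k-NN, the number of contributing neighbors varies with the query, and the `None` result makes the limits of the training data explicit instead of forcing a prediction for a dissimilar compound.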
Graph-based methods
Graph-based computational methods represent a cutting-edge development in the ADMET prediction domain, characterized by their ability to analyze molecular graph structures in depth. By aggregating neighborhood information to update node representations, these deep learning models effectively capture complex relationships and structural features within molecules. Their primary advantage is superior performance in predicting complex molecular properties compared to traditional methods. For instance, pkCSM [51], introduced in 2015, represents molecules as molecular graphs, extracting interatomic distance and topological features using Graph-Based Signatures and subsequently leveraging these features for ML-based predictions of pharmacokinetic and toxicological properties. This approach offers significant flexibility and interpretability without relying on predefined structural fragments, providing a robust and accurate method for ADMET prediction.
With the proliferation of GNNs, several recent platforms have adopted these methodologies. For example, ADMET-AI [36] applies the Chemprop-RDKit GNN model, representing molecules as graphs and learning atomic-level features through message-passing neural networks, which are further enriched with physicochemical properties calculated by RDKit. This approach has demonstrated outstanding performance across 41 ADMET datasets. Similarly, Interpretable-ADMET [52] employs GCNN and Graph Attention Networks (GATs), incorporating Grad-CAM to explain predictions by identifying the molecular substructures that contribute most to specific ADMET properties, thus achieving both accuracy and interpretability. Recent research efforts, such as HelixADMET [53], have begun leveraging self-supervised pretraining strategies for GNN models on large compound datasets, transferring the learned knowledge to specific ADMET prediction tasks, resulting in robust and scalable prediction systems. ADMETlab 3.0 [32] combines multitask Directed Message Passing Neural Networks (DMPNNs, a GNN variant) with molecular descriptors, first pretraining the model to obtain general molecular features, then fine-tuning for multiple ADMET tasks, significantly expanding its applicability and performance. admetSAR 3.0 [31] also incorporates pretrained GNN models to extract molecular features, subsequently fine-tuning for specific ADMET properties, achieving notable performance in chemical exploration, prediction, and optimization tasks. Nevertheless, these graph-based methods are computationally demanding, particularly when dealing with large datasets, and may require extensive computational resources during training and inference. Moreover, GNN-based methods typically necessitate substantial training datasets to reach optimal performance; insufficient data can lead to overfitting or degraded predictive accuracy.
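The neighborhood aggregation underlying these models can be illustrated with a deliberately simplified sketch. The example below performs one round of atom-level message passing on the heavy-atom graph of ethanol (C–C–O): each atom's feature vector is updated by adding the feature vectors of its bonded neighbors, and a sum readout produces a molecule-level vector. This is not the architecture of any platform above. Real GNNs apply learned weight matrices and nonlinearities at each step, and Chemprop-style DMPNNs pass messages along directed bonds rather than atoms. The function names and the one-hot atom features are assumptions for illustration.

```python
from typing import List, Tuple

Vector = List[int]

def message_passing_round(
    features: List[Vector], bonds: List[Tuple[int, int]]
) -> List[Vector]:
    """One unweighted update: h_v <- h_v + sum of h_u over bonded neighbors u."""
    neighbors = {i: [] for i in range(len(features))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = []
    for v, h_v in enumerate(features):
        new_h = list(h_v)  # start from the atom's own features
        for u in neighbors[v]:
            for k, value in enumerate(features[u]):
                new_h[k] += value  # aggregate neighbor features by summation
        updated.append(new_h)
    return updated

def readout(features: List[Vector]) -> Vector:
    """Sum pooling: combine atom vectors into one molecule-level vector."""
    return [sum(column) for column in zip(*features)]

# Ethanol heavy atoms: index 0 = C, 1 = C, 2 = O.
# One-hot features [is_carbon, is_oxygen] (illustrative encoding).
ATOM_FEATURES = [[1, 0], [1, 0], [0, 1]]
BONDS = [(0, 1), (1, 2)]

if __name__ == "__main__":
    hidden = message_passing_round(ATOM_FEATURES, BONDS)
    print(hidden)
    print(readout(hidden))
```

After one round, the two carbon atoms, which started with identical features, receive different representations because only one of them is bonded to oxygen. Stacking further rounds extends each atom's receptive field by one bond per round.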
In the drug discovery process, ADMET prediction tools are complementary because they differ in functional emphasis, algorithmic architecture, and coverage of prediction endpoints. For early-stage compound screening and preliminary evaluation of multiparameter ADMET properties, integrated platforms such as ADMETlab 3.0 [32], admetSAR 3.0 [31], and Deep-PK [58] offer certain advantages. These platforms generally cover multiple endpoints, including physicochemical properties, pharmacokinetic characteristics, and toxicity, thereby supporting large-scale systematic virtual screening. If the research focuses on toxicity assessment, tools such as ProTox 3.0 [37] (for general chemical toxicity) or VenomPred 2.0 [38] (for peptide toxins) may be considered, as they often provide predictive models tailored to specific toxicity endpoints. For studies emphasizing prediction accuracy and model interpretability, tools based on advanced ML algorithms, such as ADMETboost [34] (XGBoost) and Interpretable-ADMET [52] (GNN-based), show promising potential. They not only deliver prediction results but also offer certain interpretative insights through methods like uncertainty estimation and atom contribution visualization. Furthermore, for specific subproblems in ADMET research, such as predicting compound metabolic pathways, BioTransformer 3.0 [40] may be more applicable, while SMARTCyp 3.0 [43] could be considered for identifying CYP metabolic sites.
In summary, when selecting ADMET prediction tools, it is advisable to align the choice with specific research needs: integrated platforms are often used for high-throughput preliminary screening, specialized tools are more suitable for in-depth analysis of specific endpoints, and advanced algorithmic tools hold certain value in scenarios requiring higher prediction accuracy, robustness, and mechanistic interpretation.
Artificial intelligence in toxicological research
Toxicological databases
Toxicological databases represent fundamental platforms for integrating, storing, and disseminating toxicity-related information, which are indispensable for chemical safety assessment and drug discovery. Toxins are substances capable of causing cellular damage or diseases following exposure via inhalation, ingestion, or dermal contact. Their toxicity arises from complex interactions between their chemical structures and biological systems [59]. Efficient and systematic management of toxicological data not only provides timely and accurate risk assessment tools for researchers and regulatory authorities but also significantly reduces clinical trial failures and developmental costs [60]. According to data content and application context, toxicological databases can be classified into four major categories: chemical toxicity databases, environmental toxicology databases, alternative toxicology databases, and biological toxin databases. The classification scheme is illustrated in Fig. 4, and each category’s main features and representative platforms are detailed below.
Figure 4.
Four types of toxin databases: chemical toxicity databases, environmental toxicology databases, alternative toxicology databases, and biological toxin databases.
Chemical toxicity databases
Chemical toxicity databases focus on elucidating the potential health hazards of various chemical compounds, especially pharmaceuticals, by aggregating multidimensional data such as cytotoxicity, hepatotoxicity, nephrotoxicity, teratogenicity, carcinogenicity, genotoxicity, and reproductive toxicity. These resources provide scientific support for early-stage risk assessment in drug development, enabling research teams to promptly identify safety issues, guide molecular optimization and dose selection, and inform clinical strategies. Representative databases and their key characteristics are summarized in Table 2. PubChem [61], ChEMBL [62], TOXRIC [63], DrugBank [64], SuperToxic [65], and CompTox Chemicals Dashboard [66] are comprehensive chemical toxicity databases. Specifically, PubChem [61], maintained by the National Library of Medicine (NLM) under the U.S. National Institutes of Health (NIH), consolidates chemical data from over 750 sources and makes them freely available to the public. ChemIDplus [67], HSDB [68], and CCRIS, originally sub-databases of TOXNET, have now been integrated into PubChem. Among these, HSDB includes toxicological information on ~5600 chemicals, covering pharmacological properties, environmental fate, emergency handling, and occupational health data, and is widely utilized globally. ChEMBL [62], an open-source bioactivity database maintained by the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL), provides extensive compound data tailored toward drug discovery and chemical biology research. Developed by the U.S. Environmental Protection Agency (EPA), the CompTox Chemicals Dashboard [66] collates extensive physicochemical, toxicity, and exposure data for numerous chemical substances, serving as a vital tool for environmental health research.
TOXRIC [63] contains standardized toxicological attributes, molecular representations, practical benchmarks, and intuitive visualization interfaces for diverse chemical substances. SuperToxic [65] offers a comprehensive collection of toxic substances from diverse sources (animal, plant, and synthetic origins, among others), enabling detailed investigations into correlations between their chemical, functional, and structural properties. DrugBank [64], another comprehensive resource, integrates extensive chemical, pharmacological, pharmacokinetic, side effect, and toxicological data and is widely utilized across drug research and toxicological studies. Specialized toxicity databases such as DILIrank [69], DILIst [70], LTKB [71], hERGCentral [72], and LCDB [73] focus specifically on hepatotoxicity, cardiotoxicity, or carcinogenicity. For instance, DILIrank [69], DILIst [70], and LTKB [71] provide critical insights into mechanisms and targets of drug-induced liver injury (DILI). The hERGCentral [72] database delivers extensive information on drug interactions with the hERG potassium channel, pivotal for cardiac toxicity assessment, while LCDB [73] provides robust carcinogenicity data from long-term experiments involving 1726 chemicals and 7745 experimental records.
Table 2.
List of chemical toxicity databases
| Database | Compounds | URL | Key features |
|---|---|---|---|
| PubChem [61] | >119 million | https://pubchem.ncbi.nlm.nih.gov/ | NIH-maintained, integrates three interlinked repositories (Substance, Compound, BioAssay) with rich, multi-dimensional data. |
| ChemIDplus [67] | >420,000 | Integrated into PubChem | Covers a vast number of compounds and offers structure-based visualization tools. |
| HSDB [68] | ~5600 | Integrated into PubChem | Focused on high-quality toxicology profiles; now includes nanomaterials and animal toxins. |
| CCRIS | >9000 | Integrated into PubChem | Specializes in carcinogenicity and mutagenicity data for chemical substances. |
| ChEMBL [62] | >2.1 million | https://www.ebi.ac.uk/chembl | EMBL-EBI-curated, aggregates and standardizes bioactivity data across millions of entries. |
| OCHEM [74] | ~4 million | http://www.ochem.eu | Hosts chemical and biological measurement data plus an integrated QSAR modeling framework. |
| ECHA | >360 000 | https://chem.echa.europa.eu/ | Authoritative REACH registry with 360 k substance dossiers linked directly to regulations. |
| TOXRIC [63] | 113 372 | https://toxric.bioinforai.tech/ | Open-source, ML-ready toxicology platform supporting multiple endpoint predictions. |
| SuperToxic [65] | ~60 000 | http://bioinformatics.charite.de/supertoxic | Multi-dimensional toxin assessments with target-prediction capability. |
| DrugBank [64] | >10 000 | https://go.drugbank.com | Integrates drug chemistry, pharmacology, ADMET, and target-interaction data. |
| DILIrank [69] | 1036 | https://www.fda.gov/science-research/liver-toxicity-knowledge-base-ltkb/dili-rank | Grades drugs by liver-injury severity and provides mechanistic annotations. |
| DILIst [70] | 1279 | https://www.fda.gov/science-research/liver-toxicity-knowledge-base-ltkb/drug-induced-liver-injury-severity-and-toxicity-dilist-dataset | Binary classification of DILI risk, aggregating data from multiple sources. |
| T3DB [75] | >3600 | http://www.t3db.ca/ | Includes NMR, MS/MS, and GC–MS spectra, and supports toxicity and target prediction. |
| eChemPortal | >1.44 million | https://www.echemportal.org/echemportal/ | Aggregates multiple regulatory databases, covers existing and new chemicals, and offers GHS hazard look-up. |
| LTKB [71] | 287 | https://www.fda.gov/science-research/bioinformatics-tools/liver-toxicity-knowledge-base-ltkb | Integrates multisource DILI research via systems-biology analyses to elucidate mechanisms. |
| ICE [76] | ~1 million | https://ice.ntp.niehs.nih.gov/ | Compiles acute (oral, dermal, inhalation) and chronic (developmental, carcinogenic, reproductive) endpoints with reference compounds and predictive models. |
| SIDER [77] | 1430 | http://sideeffects.embl.de/ | Focuses on marketed drug adverse reactions with standardized, visualized labels, and ontology links. |
| hERGCentral [72] | >300 000 | www.hergcentral.org | Extensive hERG channel inhibition assays with flexible query options. |
| LCDB [73] | 1726 | https://carcdb.lhasalimited.org/ | Contains GLP-compliant, long-term carcinogenicity bioassay data with high data integrity. |
| CompTox Chemicals Dashboard [66] | >1 218 248 | https://comptox.epa.gov/dashboard/ | Curated chemical data with properties, exposure, hazard, and risk information from multiple public and government sources. |
| OnSIDES [78] | 2783 | https://onsidesdb.org/ | Provides access to structured and standardized side effect data from drug labels. |
| VigiBase [79] | >35 000 000 | https://www.vigiaccess.org/ | VigiBase is the world’s largest drug safety database, containing reports of adverse drug reactions. |
| VAERS [80] | >2 000 000 | https://vaers.hhs.gov/ | VAERS provides a vast number of reports on adverse events following vaccination. |
| FAERS | 31 770 750 | https://www.fda.gov/drugs/drug-approvals-and-databases/fda-adverse-event-reporting-system-faers-database | U.S. system containing postmarketing adverse event and medication error reports for drugs and therapeutic biologics. |
Environmental toxicology databases
Environmental toxicology databases comprehensively document chemical behavior in environmental matrices (e.g. water, soil, and atmosphere) and their potential impacts on ecosystems. Beyond recording acute and chronic toxicity effects on diverse organisms (algae, benthic organisms, fish, and birds), these databases include parameters like environmental persistence, bioaccumulation, and transformation, thus serving as foundational tools for ecological risk assessments, pollution control strategies, and ecosystem conservation. In drug discovery contexts, these databases facilitate environmental risk evaluation of candidate drugs and their metabolites, promote green chemistry design, and inform postmarketing environmental risk management. Representative databases are listed in Table 3. The ECOTOX database [81], comprising decades of published ecological toxicological test data, elucidates cumulative distribution patterns of species, chemicals, and biological effects, supporting ecological risk assessment and ecosystem management decisions. Additionally, TOXNET [82] is extensively employed in environmental health research. EnviroTox [83] enables users to explore ecotoxicity patterns based on modes of action, analyze organism-specific sensitivities within chemical groups, and assess relative taxonomic sensitivity. Databases like AquaticTox [50] specialize in aquatic organism toxicity data, providing critical insights into pollution impacts on aquatic ecosystems. Collectively, these databases not only advance the understanding of environmental pollutants but also encourage public engagement in environmental protection, laying a solid foundation for ongoing environmental toxicological research.
Table 3.
List of Environmental Toxicology Databases
| Database | Compounds | URL | Key feature |
|---|---|---|---|
| ECOTOX [81] | >13 000 | http://www.epa.gov/ecotox | Comprehensive EPA repository with >1 million peer-reviewed aquatic & terrestrial toxicity records. |
| EnviroTox [83] | 4016 | http://www.EnviroToxdatabase.org | Quality-scored ecotoxicity dataset for ~4000 chemicals and 1500+ species, curated for QSAR development. |
| AquaticTox [50] | >1000 | https://chemyang.ccnu.edu.cn/ccb/server/AquaticTox/ | Ensemble-learning web server offering rapid multi-endpoint aquatic toxicity predictions (fish, daphnia, algae). |
| Pesticide Info | ~15 300 | https://www.pesticideinfo.org | Detailed pesticide active-ingredient profiles, including nontarget organism toxicity (e.g. bees, birds), and environmental fate. |
| PPDB [84] | ~1500 | https://sitem.herts.ac.uk/aeru/ppdb/ | AERU’s relational database of pesticide physicochemical, ecotoxicological, and human-health properties with source quality tags. |
Alternative toxicology databases
As ethical and animal welfare concerns increasingly constrain traditional animal-based toxicity testing, alternative toxicology databases have emerged as valuable resources leveraging in vitro high-throughput screening (HTS), high-content imaging, computational toxicology models, and multi-omics techniques. These databases typically integrate genomic, transcriptomic, metabolomic, and proteomic data with systems biology and ML approaches, thereby offering efficient, reproducible, and scalable solutions for chemical toxicity assessments. Early-stage drug development benefits from rapid screening and optimization of candidate molecules based on cell viability, gene expression profiles, and receptor activation data, dramatically reducing animal experimentation, associated costs, and timelines [85]. Key platforms are summarized in Table 4. The Tox21 initiative [86], jointly spearheaded by the U.S. EPA, National Toxicology Program (NTP), and NIH, utilizes high-throughput methodologies to drive toxicity assessments towards mechanism-based approaches. Tox21 compiles and publicly shares extensive in vitro screening data involving thousands of compounds tested across multiple biological pathways, including nuclear receptor signaling and cellular stress responses, providing invaluable resources for understanding chemical-biological interactions. Similarly, Open TG-GATEs [87] stores comprehensive toxicogenomic profiles for 170 compounds across various dosages and time points from in vivo (rats) and in vitro (rat and human primary hepatocytes) studies. ToxicoDB [88] integrates toxicogenomic data from Open TG-GATEs [87], DrugMatrix [89] and EMEXP2458 [90], facilitating queries and analyses of gene expression and signaling pathway perturbations induced by potential toxicants. The Comparative Toxicogenomics Database (CTD) [91] further enhances understanding of human health by integrating chemical, genetic, disease, and exposure information. 
These resources accelerate the transition from traditional animal-based studies toward more efficient and precise in vitro and computational toxicology methods, enhancing both environmental and public health protection.
Table 4.
List of Alternative Toxicology Databases
| Database | Compounds | URL | Key feature |
|---|---|---|---|
| ToxCast/Tox21 [86] | ~1.2 million | https://comptox.epa.gov/dashboard | Aggregates high-throughput in vitro screening data, covering thousands of chemicals across hundreds of bioassay endpoints. |
| ToxicoDB [88] | 231 | https://toxicodb.ca | Integrates three major in vitro toxicogenomic datasets with harmonized chemical annotations and interactive time- and dose-response gene-expression plots. |
| Open TG-GATEs [87] | 170 | https://toxico.nibiohn.go.jp/english/index.html | Contains transcriptomic, biochemical, histopathology, and cytotoxicity data for 170 compounds in both rat in vivo and primary rat/human hepatocyte in vitro models. |
| CTD [91] | >16 300 | http://ctdbase.org/ | Manually curates over 3.3 million chemical–gene interactions (covering >16 000 chemicals), integrated into chemical–gene–disease networks to illuminate exposure effects. |
| DrugMatrix [89] | >600 | https://norecopa.no/3r-guide/drugmatrix | Comprehensive toxicogenomic reference from the US NTP, offering in vivo and in vitro gene–expression and pathology data for >600 compounds. |
Biological toxin databases
Biological toxin databases (Table 5) specialize in collecting, annotating, and analyzing natural toxins derived from animals, plants, and microorganisms, playing a critical role in elucidating their toxicological mechanisms, pharmacological potential, and biodefense applications. These natural toxins exhibit distinctive bioactivities such as antihypertensive, analgesic, and antimicrobial effects, and some have transitioned into clinical or preclinical development [92]. Databases such as ToxinDB [93], the first comprehensive biological toxin database, which encompasses over 4836 toxins with associated molecular descriptors and ADMET properties, provide platforms for predicting toxin metabolites and developing detoxification enzymes. TPPT [94] catalogs 1586 plant toxins with ecological toxicological significance, providing extensive biological and chemical information, alongside computational property estimates. MycoCentral [95], with 904 mycotoxins and metabolites, integrates data on biosynthetic pathways, physicochemical properties, ADME predictions, and QSAR-derived medicinal chemistry parameters. ATDB [96] unifies structural and annotation data for animal toxins, standardizing functional annotations through a novel toxin ontology system. BioTD [92], the most comprehensive open-source biological toxin database to date, provides extensive annotations, sequence data, mutagenesis information, and biological activities derived from over 5220 publications and patents, spanning more than 900 species, underpinning toxin-based drug design and mechanistic studies.
Table 5.
List of Biological Toxin Databases
| Database | Compounds | URL | Key features |
|---|---|---|---|
| ToxinDB [93] | >4836 | http://www.rxnfinder.org/toxindb/ | Defines a unified chemical space of 4836 toxins and their potential metabolites, combining in silico predictions with experimental validation. |
| TPPT [94] | 1586 | https://www.agroscope.admin.ch/agroscope/en/home/publications/apps/tppt.html | Catalogs 1586 toxic plants and their phytotoxins, providing physicochemical properties and toxicity predictions for understudied compounds. |
| MycoCentral [95] | 904 | http://www.mycocentral.eu | Integrates data on 904 fungal toxins and metabolites, using seven open-source QSAR/ADMET tools to predict 147 endpoints alongside experimental data. |
| ATDB [96] | 3240 | http://protchem.hunnu.edu.cn/toxin | Stores chemical structures and annotations for 3240 animal toxins, introducing a “Toxin Ontology” for standardized functional annotation. |
| SCORPION2 [97] | >800 | http://sdmc.i2r.a-star.edu.sg/scorpion/ | Specializes in structure–function analysis of scorpion toxins and predicts toxin–ion channel binding modes. |
| ConoServer [98] | ~10 000 | http://www.conoserver.org | Provides sequences, structures, and functional data for ~10 000 conotoxins, with graphical visualization and receptor-targeted search. |
| MycotoxinDB [99] | 189 | http://www.mycotoxin-db.com/ | Contains 189 mycotoxins and their masked forms, offering data-driven tools to predict masked mycotoxins. |
| Toxinome [100] | 1 483 028 | http://toxinome.pythonanywhere.com/ | Comprehensive bacterial protein toxin database with >1.48 million entries and associated antitoxin information. |
| BioTD [92] | 8975 | http://biotoxin.net/ | Covers 8975 biotoxins from over 900 species and provides multi-endpoint bioactivity and toxicity data. |
Methodological workflow for drug toxicity prediction
In recent years, ML has garnered significant attention in the realm of computational drug toxicity prediction, establishing itself as a cutting-edge technique for toxicity assessment using computational models [23]. With the continuous expansion of large-scale toxicological databases and improvements in data quality, the predictive efficacy of ML models has seen substantial enhancement [101, 102]. The typical workflow for ML-driven drug toxicity prediction can be summarized into five core stages (Fig. 5) [3]:
Figure 5.
Workflow for AI/machine learning–based drug toxicity prediction.
(1) Data collection: integration of heterogeneous datasets from multiple sources, including molecular structural data (SMILES, InChI), in vitro assay outcomes (e.g. hERG inhibition activity, cytotoxicity), in vivo toxicological endpoints (e.g. LD50 values, organ pathological phenotypes), and clinical adverse drug reaction reports.
(2) Data preprocessing: this stage encompasses molecular feature engineering, such as generating molecular fingerprints (ECFP4, MACCS) and calculating physicochemical descriptors, data cleaning (removal of duplicate compounds and balancing positive/negative samples), and dimensionality reduction techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) [103]. For sparse datasets, methods such as transfer learning and generative adversarial networks (GANs) can be utilized to generate synthetic data to mitigate data insufficiency [104].
(3) Model construction and training: traditional ML methodologies mainly include RF and SVM, whereas deep learning models, such as graph convolutional networks (GCNs) and Transformers, have emerged prominently due to their capability for automatic hierarchical feature extraction [105, 106]. For instance, DNNs can simultaneously predict multiple toxicity endpoints through multitask learning frameworks, and GNNs effectively capture inter-atomic interactions via molecular graph representations [107].
(4) Model evaluation: common metrics for model evaluation include accuracy, area under the ROC curve (AUC), and the F1-score for classification tasks. In continuous regression tasks, commonly used evaluation metrics include mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R2). Cross-validation methods (e.g. 10-fold cross-validation) and external testing datasets are employed to assess generalizability. Interpretability techniques (e.g. SHAP and LIME) are often applied to identify key toxicity-related features, thus enhancing model transparency and credibility [108, 109]. In addition to understanding the model’s decision-making mechanism, practical applications also require attention to the model’s confidence scores, which quantify the model’s certainty in its individual predictions. This is particularly important in high-stakes decision-making scenarios that demand high reliability, such as medical diagnosis and autonomous driving. Common methods for generating confidence scores include probability-based approaches (e.g. leveraging the inherent probabilities from models like Logistic Regression, though these often require calibration via Platt Scaling or Isotonic Regression to be reliable), Bayesian methods for uncertainty quantification, specific ensemble learning techniques like Deep Ensembles which measure disagreement among multiple models to estimate certainty, uncertainty estimation techniques like Monte Carlo Dropout [110], as well as distance-based methods that assess the similarity of a new input to the training data. These techniques can be integrated into the decision-making process to ensure that predictions with low confidence are flagged for further human review or other appropriate safety measures, thereby enhancing the transparency and trustworthiness of the model [111].
(5) Algorithm application: the resulting predictions can guide drug design, facilitate early-stage toxicity screening, and support regulatory decision-making. Specifically, (i) they guide drug design by allowing medicinal chemists to prioritize or deprioritize specific compound series based on predicted ADMET profiles, thus focusing synthesis efforts on leads with higher probabilities of success; (ii) they facilitate early-stage toxicity screening by serving as a rapid, cost-effective virtual triage tool, highlighting high-risk molecules for in vitro or in vivo experimental validation and reducing the reliance on animal testing in initial phases; and (iii) they support regulatory decision-making by providing supplementary, evidence-based data that can be used to assess compound risk, potentially informing the design of further clinical trials or contributing to the weight of evidence in a regulatory submission.
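The confidence-scoring idea described in stage (4) can be illustrated with a minimal sketch. It assumes a hypothetical ensemble in which each member returns a toxicity probability for a compound; the mean serves as the prediction and the spread across members as a disagreement measure, in the spirit of Deep Ensembles. The function name `triage`, both thresholds, and the example probabilities are illustrative assumptions rather than values from any cited study.

```python
from statistics import mean, pstdev

def triage(member_probs, max_spread=0.15, margin=0.2):
    """Summarize ensemble outputs for one compound.

    member_probs: toxicity probabilities from each ensemble member (hypothetical).
    max_spread:   assumed cutoff on the standard deviation across members.
    margin:       assumed minimum distance of the mean probability from 0.5.
    """
    p = mean(member_probs)
    spread = pstdev(member_probs)  # disagreement among ensemble members
    confident = spread <= max_spread and abs(p - 0.5) >= margin
    return {
        "probability": p,
        "spread": spread,
        "label": "toxic" if p >= 0.5 else "non-toxic",
        "needs_review": not confident,  # flag for human or experimental follow-up
    }

# Illustrative inputs only: five ensemble members scoring three compounds.
predictions = {
    "compound_A": [0.91, 0.88, 0.93, 0.90, 0.89],  # members agree, far from 0.5
    "compound_B": [0.15, 0.80, 0.45, 0.70, 0.30],  # members disagree
    "compound_C": [0.52, 0.49, 0.55, 0.50, 0.48],  # members agree, but near 0.5
}

for name, probs in predictions.items():
    result = triage(probs)
    status = "review" if result["needs_review"] else "accept"
    print(name, result["label"], round(result["probability"], 3), status)
```

In this sketch a prediction is accepted only when the members agree and the mean probability is far from the decision boundary; in practice the probabilities would usually be calibrated first and the thresholds tuned on held-out data.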
ML/AI applications across diverse toxicity prediction tasks
In recent years, the landscape of toxicity prediction has evolved significantly, encompassing a broad spectrum of endpoints, from acute toxicity to multi-organ damage, as well as chronic effects like carcinogenicity and genotoxicity [2, 3]. Acute toxicity assessments focus on immediate, life-threatening effects resulting from short-term, high-dose exposures. Organ-specific models delve into cardiotoxicity (e.g. in vitro hERG channel blockade assays), hepatotoxicity (DILI), nephrotoxicity (tubular damage), and neurotoxicity. Carcinogenicity studies consider both in vivo tumor induction and in vitro genotoxic endpoints, while genotoxicity analyses target molecular mechanisms such as mutagenesis, chromosomal aberrations, and DNA damage [2, 3].
This review highlights recent cutting-edge developments and representative achievements applying ML to predict these diverse toxicity endpoints, particularly acute toxicity, cardiotoxicity, hepatotoxicity, nephrotoxicity, and carcinogenicity, as well as innovative case studies from Tox21 challenges (see Table 6). It aims to provide a comprehensive overview of how methods such as multitask learning, GNNs, and generative models are shaping the future of toxicity screening.
Table 6.
Methods for different Toxicity Prediction Tasks
| Method | Model | Endpoint/Task | Data source | Performance |
|---|---|---|---|---|
| Wang et al. [112] | ensemble learning, NB, SVM | hERG | WOMBAT-PK, literatures | AC = 84.7% (training set) AC = 82.1% (external test set) AC = 83.6% (hERG blockers in test set) AC = 78.2% (nonblockers in test set) |
| deephERG [113] | multitask DNN | hERG | ChEMBL, literatures | AUC = 0.944 (training set) AUC = 0.967 (validation set) |
| CToxPred [115] | GCN | hERG, Cav1.2, Nav1.5 | ChEMBL, ChEMBL, BindingDB, hERGCentral, patents, literatures | 1.hERG Prediction: AC = 81.4% (Eval-70 external set) AC = 71.2% (Eval-60 external set) SN = 86.7% (blockers in Eval-70) SP = 74.6% (nonblockers in Eval-70) 2.Nav1.5 Prediction: AC = 81.7% (Eval-70 external set) AC = 76.6% (Eval-60 external set) SN = 85.6% (blockers in Eval-70) SP = 73.3% (nonblockers in Eval-70) 3.Cav1.2 Prediction: AC = 86.4% (Eval-70 external set) AC = 69.4% (Eval-60 external set) SN = 96.2% (blockers in Eval-70) SP = 69.0% (nonblockers in Eval-70) |
| CardiOT [143] | GNN, KAN | hERG, Cav1.2, Nav1.5 | ChEMBL, ChEMBL, BindingDB, hERGCentral, patents, literatures | ACC = 74.7% F1 = 76.4% SEN = 82.2% SPE = 68.2% CCR = 75.2% MCC = 50.1% |
| CardioGenAI [116] | Transformer, GAT | hERG, Cav1.2, Nav1.5 | ChEMBL, GuacaMol v1, MOSES, BindingDB | 1.hERG Channel: AC = 83.5% SN = 86.2% SP = 80.3% F1 = 85.1% CCR = 83.2% MCC = 66.7% 2.Nav1.5 Channel: AC = 89.4% SN = 95.9% SP = 75.6% F1 = 92.5% CCR = 85.7% MCC = 75.1% 3.Cav1.2 Channel: AC = 91.4% SN = 96.2% SP = 82.8% F1 = 93.5% CCR = 89.5% MCC = 81.0% |
| InterDILI [118] | RF, LGBM, LR, attention | DILI | DILIrank, NCTR, literatures | AUROC = 0.88–0.97 AUPRC = 0.81–0.95 |
| StackDILI [117] | GA, Stacking Architecture | DILI | DILIrank, NCTR, literatures | DILIrank test set: AC = 92.7% SN = 96.2% SP = 90.3% precision = 87.2% F1 = 91.5% 10-fold cross-validation: AC = 79.2% SN = 81.4% SP = 76.9% precision = 79.0% F1 = 80.1% |
| pDILI_v1 [119] | LR, KNN, NB, RF, DT, QDA, MLP | DILI | DILIst | FDR = 0.053 FOR = 0.230 SN = 82.9% (training set) SN = 78.6% (test set) FDR = 0.240 FOR = 0.378 (test set) |
| DILIPredictor [120] | RF | Human hepatotoxicity, Animal hepatotoxicity A, Animal hepatotoxicity B, Preclinical hepatotoxicity, Diverse DILI A, Diverse DILI C, BSEP, Mitotox, Reactive Metabolite | DILIst, DILIrank, multi-Proxy-DILI Data Sets | AUC-ROC = 0.79 detection capability = 2.68 LR+ score (top 25 toxic compounds) |
| Jin et al. [121] | XGBoost-SHAP | pathway | Open TG-GATEs, DrugMatrix | Precision = 86% (49 TP, 8 FP, 57 predicted positives) SP = 71% (20 TN/32 predicted negatives) AC overall = 78% (69 correct/89 total) precision = 91%, SP = 89% (cutoff = 1.22) |
| Tox-GAN [122] | CGAN, WGAN | Gene activities and expression profiles | Open TG-GATEs | rge = 0.997 ± 0.002 re = 0.740 ± 0.08 |
| Shi et al. [123] | Consensus Modeling, SVM, RF, DT, ASNN, RFR, XGBoost | DIRI | SIDER | AUC = 0.93 (consensus model) Q = 86.24% (external validation) MCC = 0.82 (external validation) SE = 85.45% (external validation) SP = 87.04% (external validation) EF = 1.72% (external validation) |
| Gong et al. [124] | ANN, LightGBM, SVM, RF, DT, KNN, NB, XGBoost | DIRI | SIDER, DrugBank, ChEMBL | ANN_GraphFP (AUC = 0.870, ACC = 0.782, SE = 0.844) SVM_GraphFP (AUC = 0.856, ACC = 0.795, SE = 0.781) RF_GraphFP (AUC = 0.846, ACC = 0.808, SE = 0.719) LightGBM_GraphFP (AUC = 0.846, ACC = 0.782, SE = 0.719) LightGBM_KRFP (AUC = 0.812, ACC = 0.769, SE = 0.719) ANN_PubChemFP (AUC = 0.810, ACC = 0.769, SE = 0.750) |
| Mazumdar et al. [125] | DNN, XGBoost, Extra-tree | DIRI | literatures | ROC-AUC = 0.85–0.88 AC = 82% (DNN) ROC-AUC = 0.6 (Extra-tree) ROC-AUC = 0.7 (XG Boost) |
| Nguyen-Vo [126] | AdaBoost, XGBoost, GB, ERT, RF, KNN, SVM, LR | DIRI | literatures | Best models: AUC-ROC = 0.7583 ± 0.0189 AUC-PR = 0.8883 ± 0.0185 |
| Att-RethinkNet [127] | RNN | 8 kidney pathological findings | Open TG-GATEs | ACC = 89.4% SPE = 98.2% SEN = 94.2% F1 = 93.8% AUC = 0.993 (liver data) ACC = 97.5% SPE = 99.5% SEN = 99.1% AUC = 0.9949 F1 = 99.2% (kidney data) |
| Jain et al. [132] | MT-DNN, ST-DNN, Consensus Model | 59 different end points | ChemIDplus | average RMSE = 0.65 average R2 = 0.57 (consensus B models) |
| STopTox [131] | QSAR, RF | Skin sensitization, Skin irritation/corrosion, Eye irritation/corrosion, Acute dermal, Acute inhalation, Acute oral | ECHA, REACH, ICCVAM, ToxValDB, NICEATM, literatures | Skin sensitization: CCR = 0.70, Se = 0.66, Sp = 0.75, PPV = 0.71, NPV = 0.75, Coverage = 0.96 Skin irritation/corrosion: CCR = 0.72, Se = 0.77, Sp = 0.66, PPV = 0.69, NPV = 0.74, Coverage = 0.94 Eye irritation/corrosion: CCR = 0.72, Se = 0.72, Sp = 0.71, PPV = 0.71, NPV = 0.71, Coverage = 0.95 Acute dermal: CCR = 0.76, Se = 0.74, Sp = 0.78, PPV = 0.77, NPV = 0.75, Coverage = 0.93, Acute inhalation: CCR = 0.74, Se = 0.69, Sp = 0.80, PPV = 0.77, NPV = 0.72, Coverage = 0.95 Acute oral: CCR = 0.77, Se = 0.85, Sp = 0.70, PPV = 0.79, NPV = 0.78, Coverage = 0.95 |
| PredAOT [128] | RF | AOT | OCHEM, literatures | RMSE = 0.3806, R2 = 0.3557 (mice, toxic regressor) RMSE = 0.2923, R2 = 0.3881 (mice, nontoxic regressor) RMSE = 0.5323, R2 = 0.3065 (rats, toxic regressor) RMSE = 0.3863, R2 = 0.2702 (rats, nontoxic regressor) |
| DermalPred [130] | RF, SVM, XGBoost, LightGBM, GCN, GAT, Attentive FP | ADT | ChemIDplus, eChemPortal | AUC = 78.0% (species 1, 10-fold CV) AUC = 82.0% (species 2, 10-fold CV) |
| Wijeyesakere et al. [129] | QSAR, RF | AOT | NTP ICE portal | SN = 76.1% (GHS 1–2) SN = 76.6% (GHS 1–3) balanced AC = 73.7% |
| CapsCarcino [133] | Capsule network | Carcinogenicity | CPDB, ISSCAN | AC = 85.0% (external validation) |
| HNN-Cancer [144] | HNN, CNN | Carcinogenicity, pTD50 | MEG, TG230, NTP, IARC, JSOH, NIOSH, CPDB, CCRIS, Drugbank | AC = 74% (HNN-Cancer/RF/Bagging, binary classification) AUC ≈ 0.81 (binary classification, 7994 chemicals) SN = 79.5%, SP = 67.3% (HNN-Cancer, binary) AC = 70% (HNN-Cancer/RF/Bagging/AdaBoost, multiclass, 1618 chemicals) AUC = 0.7 (multiclass, 1618 chemicals) R ≈ 0.62 (HNN-Cancer/RF, regression) |
| CONCERTO [134] | GNN, transfer learning | Carcinogenicity | CPDB, CCRIS, Hansen | ROCAUC = 0.73 |
| DCAMCP [136] | Capsule network, graph attention | Carcinogenicity | CPDB, CCRIS, ISSCAN | ACC = 0.718 ± 0.009 SE = 0.721 ± 0.006 SP = 0.715 ± 0.014 AUC = 0.793 ± 0.012 ACC = 0.750, SE = 0.778, SP = 0.727, AUC = 0.811 (external validation, 100 compounds) |
| Metabokiller [137] | RF, MLP, KNN, SVM, SGD, LR, GCM, attentive FP, GCN, GAN | electrophilic properties, epigenetic modifications, genomic instability, oxidative stress, proliferative properties, anti-apoptotic properties | Literatures and databases: https://zenodo.org/records/6683106 | AUC = 0.87 AC = 0.82 Recall = 0.89 F1 = 0.82 Precision = 0.76 |
| DeepTox [139] | DNN | AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 | Tox21 | AUC: AhR = 0.923, AR = 0.778, AR-LBD = 0.825, ARE = 0.829, Aromatase = 0.804, ATAD5 = 0.775, ER = 0.791, ER-LBD = 0.811, HSE = 0.863, MMP = 0.930, p53 = 0.860, PPAR-gamma = 0.856 |
| CensNet [140] | GCN | AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 | Tox21, Lipophilicity, Cora, Citeseer, literatures | Tox21: Train PCT 60%: val set: AUC = 0.76 ± 0.00 test set: AUC = 0.77 ± 0.00 Train PCT 70%: val set: AUC = 0.76 ± 0.00 test set: AUC = 0.77 ± 0.00 Train PCT 80%: val set: AUC = 0.76 ± 0.00 test set: AUC = 0.78 ± 0.00 Train PCT 90%: val set: AUC = 0.78 ± 0.01 test set: AUC = 0.79 ± 0.01 Lipophilicity: Train PCT 60%: val set: RMSE = 0.94 ± 0.01 test set: RMSE = 0.97 ± 0.01 Train PCT 70%: val set: RMSE = 0.92 ± 0.01 test set: RMSE = 0.95 ± 0.01 Train PCT 80%: val set: RMSE = 0.96 ± 0.01 test set: RMSE = 0.93 ± 0.01 Train PCT 90%: val set: RMSE = 0.94 ± 0.02 test set: RMSE = 0.83 ± 0.02 |
| GMT [141] | GMT | AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 | HIV, Tox21, ToxCast, BBBP | HIV: ACC = 77.56 ± 1.25 Tox21: ACC = 77.30 ± 0.59 ToxCast: ACC = 65.44 ± 0.58 BBBP: ACC = 68.31 ± 1.62 |
| Meta-MGNN [142] | GNN, Meta learning | AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 | Tox21, SIDER | Tox21:1-Shot: average AUC = 76.87% 5-Shot: average AUC = 78.02% SIDER: 1-Shot: average AUC = 73.34% 6-shot: average AUC = 74.72% |
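For readers comparing rows of Table 6, the classification metrics can all be derived from the four confusion-matrix counts. The sketch below uses the standard definitions. Treating CCR as the mean of sensitivity and specificity is an assumption, although it agrees with the CardiOT row (SEN = 82.2%, SPE = 68.2%, CCR = 75.2%). The example counts are those listed for Jin et al. [121]: 49 TP, 8 FP, and 20 TN, which leaves 12 FN among the 32 predicted negatives.

```python
from math import sqrt

def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sn = tp / (tp + fn)                    # sensitivity (SN/SE/SEN), i.e. recall
    sp = tn / (tn + fp)                    # specificity (SP/SPE)
    precision = tp / (tp + fp)             # positive predictive value (PPV)
    ac = (tp + tn) / (tp + fp + tn + fn)   # accuracy (AC/ACC)
    f1 = 2 * precision * sn / (precision + sn)
    ccr = (sn + sp) / 2                    # balanced accuracy; assumed meaning of CCR
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sn, "SP": sp, "precision": precision,
            "AC": ac, "F1": f1, "CCR": ccr, "MCC": mcc}

# Counts taken from the Jin et al. [121] row of Table 6.
metrics = classification_metrics(tp=49, fp=8, tn=20, fn=12)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

With these counts the precision, specificity, and accuracy round to the 86%, 71%, and 78% reported in the table, which is a useful consistency check when a study reports counts alongside percentages.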
Cardiotoxicity
In cardiotoxicity prediction, hERG, Cav1.2, and Nav1.5 are three key targets typically measured through in vitro assays. Historically, due to limited bioactivity data for the latter two targets, most research concentrated on hERG channel blockade prediction. In 2016, Wang et al. [112] combined naïve Bayes (NB) with SVM, selecting optimal pharmacophore subsets via recursive partitioning (RP), and integrated multiple pharmacophore features through ensemble learning, developing classification models with high accuracy. Their SVM model demonstrated excellent external test performance, elucidating complex hERG blocker interactions. In 2019, Cai et al. [113] introduced deephERG, a multitask DNN-based model trained on 7889 structurally diverse compounds, achieving highly accurate predictions by simultaneously learning chemical features and hERG inhibitory activity. In 2020, Ryu et al. [114] proposed DeepHIT, an end-to-end deep learning approach that transforms molecules into extended connectivity fingerprints (ECFP4) and graph representations, extracting global and local toxicological features using GCNs, optimized via binary cross-entropy loss. This approach effectively predicted hERG toxicity without traditional reliance on manual feature engineering. With accumulating data, researchers have extended their predictions to joint modeling of hERG, Nav1.5, and Cav1.2 channels. Issar et al. [115] developed CToxPred, systematically evaluating fingerprints, descriptors, and graph-based numerical representations within a deep-learning framework, significantly improving multitarget predictive capabilities. Additionally, Kyro et al. [116] developed CardioGenAI, which combines autoregressive Transformer generative models and discriminative deep-learning models. In a case study involving the high-affinity hERG inhibitor pimozide, their method generated candidate drug fluspirilene with over 700-fold reduction in hERG affinity, while preserving therapeutic activity. 
Collectively, current ML- and deep learning-based cardiotoxicity prediction methods have evolved from single-target models to comprehensive, multitarget, multifeature assessment systems, substantially enhancing prediction accuracy and generalizability. Future advancements in high-quality cross-target activity data and model interpretability techniques will drive cardiotoxicity prediction towards greater automation and precision. In practice, traditional ML approaches (e.g. SVM, ensemble learning) remain robust and interpretable for single-target predictions with limited data, whereas deep learning models (e.g. GCNs, Transformers) are more advantageous in multitarget modeling and large-scale compound screening due to their ability to capture complex molecular features.
Hepatotoxicity
Prediction of drug-induced hepatotoxicity remains a critical bottleneck in drug development. Traditional serum biomarkers such as ALT and AST, which are measured in vivo, have limited sensitivity and specificity for detecting liver injury. Furthermore, discrepancies between in vitro assays and the complex in vivo metabolic environment, coupled with species-specific differences that limit the translational value of animal models, contribute significantly to the clinical failure of numerous drug candidates due to liver toxicity. Recent advancements in ML have markedly improved predictive accuracy by integrating multidimensional datasets. For instance, StackDILI [117] employs stacked ensemble learning to merge chemical structure and bioactivity data, effectively reducing bias inherent to single algorithms. InterDILI [118] combines permutation feature importance and attention mechanisms, achieving high prediction accuracy while identifying critical structural alerts, such as aniline derivatives, thereby providing explicit guidance for drug design. Additionally, pDILI_v1 [119] utilizes probabilistic modeling to analyze extensive drug-adverse reaction datasets, enabling automated extraction and quantification of liver toxicity risk from unstructured text data. Notably, DILIPredictor [120] integrates nine toxicity endpoints, including mitochondrial toxicity and bile acid transporter inhibition, along with chemical structure, pharmacokinetic parameters, and surrogate toxicity data. Using an RF model, it delivers superior predictive performance, discriminates cross-species toxicities, and captures a broader range of mechanisms than single-target models.
With the accumulation of toxicogenomics data, research has progressively shifted toward mechanism-based analyses rooted in gene expression and pathway perturbations. Jin et al. [121] proposed an entropy weight method (EWM) to quantify gene expression dispersion, facilitating the evaluation of pathway disruptions. This approach, combined with ML, effectively identifies key toxic pathways, such as ferroptosis during acetaminophen-induced liver injury, thus bridging predictive modeling with underlying biological processes. Moreover, in the realm of generative AI applications, the Tox-GAN model [122] leverages GANs to simulate drug-induced transcriptomic profiles, mitigating the scarcity of experimental data. This approach generates virtual toxicogenomic data, enabling the prediction of pathway activation patterns for unknown compounds and thereby offering an efficient strategy for early-stage hepatotoxicity screening. Collectively, these advances are transitioning hepatotoxicity prediction from traditional single-marker assessments toward integrative mechanism-driven models that combine chemical structure, genomic signatures, and multi-omics data. Such efforts promise to accelerate the development of safer therapeutics and reduce toxicity-related clinical attrition. For practical applications, ensemble learning remains a reliable choice for integrating heterogeneous data sources, whereas mechanism-driven models leveraging toxicogenomics and generative approaches are particularly recommended when exploring novel compounds or elucidating underlying biological pathways.
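To make the entropy weight idea concrete, the following minimal sketch implements the generic, textbook form of the EWM: each column (e.g. a gene) receives a weight that grows with the dispersion of its values across samples. It is an assumption that this generic form matches the spirit of the method in [121]; the exact formulation, preprocessing, and pathway scoring used there are not reproduced, and the input matrix is hypothetical.

```python
import math


def entropy_weights(matrix):
    """Return one entropy weight per column of a non-negative matrix.

    matrix: list of rows (samples); each row lists non-negative values,
    e.g. expression-derived scores for the genes of one pathway.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    if n_rows < 2:
        raise ValueError("need at least two samples")
    k = 1.0 / math.log(n_rows)  # scales the entropy into [0, 1]
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total == 0:
            # Assumption: an all-zero column carries no information.
            divergences.append(0.0)
            continue
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # 0 * ln(0) is taken as 0
                entropy -= p * math.log(p)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - k * entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        return [1.0 / n_cols] * n_cols  # all columns uniform: equal weights
    return [d / total_divergence for d in divergences]


# Hypothetical scores for three samples (rows) and two genes (columns).
scores = [[1.0, 0.0], [1.0, 0.0], [1.0, 9.0]]
print(entropy_weights(scores))
```

A column whose values are identical across samples has maximal entropy and therefore receives a weight near zero, while a column concentrated in a single sample receives most of the weight.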
Nephrotoxicity
Nephrotoxicity is typically assessed in vivo by measuring blood urea nitrogen and serum creatinine. However, similar to hepatotoxicity, predicting nephrotoxicity faces two major challenges: the limited translatability of findings from animal models to humans, and the suboptimal specificity and accuracy of available biomarkers. Research on nephrotoxicity prediction has been less abundant than research on cardiotoxicity and hepatotoxicity. Shi et al. [123] compiled a real-world dataset comprising 565 compounds (287 nephrotoxic drugs and 278 non-nephrotoxic drugs). They developed predictive models utilizing five conventional ML methods (e.g. RF, Extreme Gradient Boosting (XGBoost)) and five deep learning algorithms, such as convolutional neural fingerprint networks (CNFNs). An ensemble of the top three individual models was further employed as a consensus predictor. Additionally, the study identified 87 structural alerts based on Klekota–Roth fingerprints (KRFP), calculating substructure frequency (f-score) and positivity rates. Among these alerts, 16 exhibited specificity for nephrotoxic drugs, offering mechanistic insights through structural signatures. Gong et al. [124] constructed a dataset consisting of 777 drugs, including 125 TCM components. They employed nine molecular fingerprints (e.g. Atom Pair, MACCS) combined with eight ML algorithms, generating a total of 72 classification models. Through 10-fold cross-validation and external validation, an optimal model, SVM combined with CDK graph fingerprints, demonstrated robust generalizability for both TCM-derived and chemically synthesized drugs. Following OECD guidelines, the applicability domain was rigorously evaluated, and SARpy and information-gain methods identified eight potential nephrotoxicity-related structural alerts, including fluorinated benzene rings and polyamine derivatives, thus providing critical structural warnings for drug safety considerations.
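The structural-alert statistics described above can be sketched as follows. The positivity rate is the fraction of fragment-containing compounds that are toxic. For the frequency score, the sketch assumes a commonly used enrichment ratio, (N_fragment,toxic × N_total) / (N_fragment,total × N_toxic); whether this is the exact f-score definition used in [123] is an assumption. The fragment identifiers and compounds are hypothetical.

```python
def alert_statistics(compounds, fragment):
    """Positivity rate and enrichment ratio of one fragment.

    compounds: list of (set_of_fragment_ids, is_toxic) pairs.
    Returns None when the fragment or the toxic class is absent.
    """
    n_total = len(compounds)
    n_toxic = sum(1 for _, toxic in compounds if toxic)
    flags = [toxic for fragments, toxic in compounds if fragment in fragments]
    n_fragment = len(flags)
    n_fragment_toxic = sum(flags)
    if n_fragment == 0 or n_toxic == 0:
        return None
    return {
        "count": n_fragment,
        # Share of fragment-containing compounds that are toxic.
        "positivity_rate": n_fragment_toxic / n_fragment,
        # Assumed enrichment ratio; values above 1 indicate that the
        # fragment is over-represented among toxic compounds.
        "frequency_ratio": (n_fragment_toxic * n_total) / (n_fragment * n_toxic),
    }


# Hypothetical fragment annotations (identifiers are placeholders).
compounds = [
    ({"KRFP_A", "KRFP_B"}, True),
    ({"KRFP_A"}, True),
    ({"KRFP_A", "KRFP_C"}, True),
    ({"KRFP_B"}, False),
    ({"KRFP_C"}, False),
    ({"KRFP_A"}, False),
]
print(alert_statistics(compounds, "KRFP_A"))
```

In practice, a fragment would typically be retained as an alert only if it occurs in enough compounds and exceeds chosen thresholds on both statistics; those thresholds are study-specific and are not specified here.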
Mazumdar et al. [125] integrated deep learning and traditional ML methods, employing eight types of molecular fingerprints and RDKit descriptors to build 27 ML models and one DNN. Their DNN achieved optimal performance in five-fold cross-validation, whereas an Extra-tree model showed an accuracy of 82.1% on an independent test set. Innovatively, the study applied association rule mining, identifying 10 high-frequency substructures within nephrotoxic compounds, such as benzimidazole derivatives and fluorinated substituents, revealing that structural cooperativity significantly influences nephrotoxicity. Nguyen-Vo et al. [126] prioritized data quality optimization through meticulous data cleaning and class balancing, generating a dataset containing 604 positive and 228 negative samples. Utilizing eight algorithms (e.g. Extremely Randomized Trees (ERT), XGBoost) and three molecular representations (Mol2vec embeddings, RDKit descriptors, ECFP fingerprints), they developed 32 models. Their results revealed consistently superior performance of ERT models across different molecular representations, providing a reliable baseline for subsequent investigations.
Recent incorporation of toxicogenomic data has ushered nephrotoxicity prediction into a novel direction. Su et al. [127] introduced the Att-RethinkNet model, a multilabel learning framework based on gene expression data from the Open TG-GATEs database. By employing memory structures and attention mechanisms, the model effectively captures correlations between hepatic and renal pathological phenotypes. Furthermore, it integrates multidimensional parameters such as compound type, dosage, and administration duration, enabling simultaneous predictions of 20 renal/hepatic pathological phenotypes. Demonstrating robust performance in in vivo rat datasets, its attention mechanism also highlights key gene features, thereby offering interpretability at the gene-expression level and facilitating deeper mechanistic insights into nephrotoxicity. Together, these advances signify a progressive evolution in nephrotoxicity prediction: from traditional structure-based analyses toward multimodal data integration and interpretability-driven modeling, thus supporting early-stage toxicity risk assessment in drug development. In practice, ensemble or tree-based models (e.g. ERT, RF) remain reliable baselines for small to medium datasets, whereas attention-based multi-omics frameworks are particularly suitable when exploring mechanistic insights or cross-organ toxicities.
Acute toxicity
Acute toxicity refers to harmful effects caused by chemical exposure within a short period (usually less than 24 h) through single or multiple administrations. Quantitative measures such as median lethal dose (LD50), median lethal concentration (LC50), and minimal lethal dose (MLD), which are classic in vivo endpoints, are employed to evaluate the severity of lethal toxicity. Acute toxicity assessments encompass various endpoints depending on administration routes (oral, inhalation, dermal, etc.) and symptom profiles, among which acute oral toxicity prediction remains a primary research focus. In the domain of acute oral toxicity prediction, the computational framework PredAOT [128], based on multiple RF models trained using acute oral toxicity data in mice and rats from the OCHEM database and the literature, categorizes compounds as “toxic” or “nontoxic/low-toxicity”. Initially, an “AOT classifier” determines the compound toxicity class, followed by regression models for precise LD50 (in vivo) prediction. The synthetic minority oversampling technique (SMOTE) was used to manage class imbalance, achieving predictive performance comparable or superior to existing tools. Wijeyesakere et al. [129] developed a mechanistically driven QSAR model by analyzing the U.S. NTP’s rat acute toxicity database. This model assigned primary mechanisms of action using various mechanistic analysis tools, and further employed RF for acute oral LD50 prediction, optimizing results by structure-mechanism similarity. The method exhibited enhanced sensitivity and balanced accuracy in identifying highly toxic compounds, enabling toxicity predictions based on specific mechanisms, such as aconitase inhibition.
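The core step of SMOTE, as mentioned above, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only this interpolation step under simplifying assumptions (Euclidean distance, small in-memory data); the descriptor vectors are hypothetical, and this is not the implementation used in PredAOT.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic samples by interpolating minority-class points.

    minority: list of feature vectors (lists of floats) of the rare class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the base-neighbour segment
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Hypothetical two-descriptor vectors for the under-represented class.
toxic_minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 3.0]]
for point in smote(toxic_minority, n_synthetic=3):
    print(point)
```

Oversampling of this kind should be applied to the training folds only; applying it before the train/test split leaks information from test compounds into training.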
Regarding acute dermal toxicity prediction, DermalPred [130] exemplifies significant advancement. Integrating rabbit and rat experimental data, researchers constructed predictive models utilizing ML and deep learning algorithms. Structure alerts were extracted using tools such as SARpy, and multiple interpretation methods were combined to identify significant features and structural fragments correlated with acute dermal toxicity, culminating in an independent software tool for toxicity prediction. The optimal model demonstrated robust performance with high AUC scores in 10-fold cross-validation, effectively supporting safety assessments for pesticides, cosmetics, and pharmaceuticals. Moreover, several studies aim to predict multiple acute toxicity endpoints simultaneously. STopTox [131] integrated publicly available datasets to develop QSAR models targeting six acute endpoints—skin sensitization, skin irritation, eye irritation, acute oral (in vivo), acute inhalation (in vivo), and acute dermal toxicity—aggregated within a comprehensive online platform. Rigorously validated, these models effectively identify potentially toxic or nontoxic compounds across multiple endpoints. Jain et al. [132] compiled extensive public datasets and developed diverse single-task and multitask models using RFs and DNNs. They introduced consensus models derived from multiple multitask frameworks, demonstrating excellent predictive capability for 59 acute systemic toxicity endpoints, especially excelling at predicting less-represented endpoints. Collectively, acute toxicity prediction research continues to evolve, broadening from single-route acute oral predictions to multifaceted acute toxicity profiling, significantly contributing to drug development, environmental chemical screening, and regulatory decisions, thereby reducing reliance on animal testing and steering toxicological research toward enhanced accuracy and sustainability. 
In practice, QSAR and RF-based models remain suitable for rapid screening and regulatory purposes, whereas multitask and consensus deep learning frameworks are recommended for handling diverse endpoints and capturing underrepresented toxicity profiles.
Carcinogenicity
Accurate prediction of drug carcinogenicity is crucial for ensuring public medication safety. Traditional in vivo animal and in vitro cellular methods are constrained by high costs, prolonged durations, poor extrapolation accuracy to humans, and limited simulation of realistic physiological environments. Recent rapid advances in ML and deep learning have spurred the development of highly efficient carcinogenicity prediction models. For instance, CapsCarcino [133], leveraging the dynamic routing algorithm of capsule networks, achieved high accuracy in external validation datasets, demonstrating superior generalizability even in sparse data conditions. CONCERTO [134] combined graph Transformer architectures with molecular fingerprint representations, significantly enhancing prediction performance through iterative pretraining and transfer learning strategies. Limbu and Dakshanamurthy introduced the HNN-Cancer model [135], integrating convolutional neural networks (CNNs) and feedforward neural networks (FFNNs), using improved SMILES-based representations, and delivering robust results in binary, multiclass, and regression tasks across diverse chemical categories. Similarly, DCAMCP [136], incorporating capsule networks and attention mechanisms alongside various molecular fingerprints and graph-based descriptors, displayed impressive results in both cross-validation and external validation phases. Additionally, Metabokiller [137], integrating biochemical characteristics related to carcinogenicity and utilizing ensemble classification methods, attained high precision and recall rates when predicting carcinogenic potential for unknown compounds, with selected predictions experimentally validated.
Nonetheless, current carcinogenicity prediction models still encounter substantial hurdles, including inadequate quantity and quality of training data, limited interpretability, inconsistent cross-species predictive capability, elevated risks of overfitting, and challenges achieving widespread acceptance from regulatory authorities. From an application standpoint, capsule network and graph-transformer models show strong promise for sparse or heterogeneous datasets, while ensemble approaches that integrate biochemical descriptors remain a practical choice for improving robustness and regulatory acceptance.
Tox21 data challenge
In 2014, the Tox21 initiative launched the Tox21 data challenge [138], utilizing quantitative HTS data from in vitro assays based on nuclear receptor signaling pathways and cellular stress response pathways from the Tox21 10K chemical library. The dataset comprised 12 060 training samples and 647 test samples, covering 12 toxicity endpoints related to nuclear receptor activation and cellular stress responses. The global competition invited bioinformatics, data science, and ML experts to collaboratively develop and validate innovative toxicity prediction models, addressing the high-cost and low-efficiency bottlenecks of traditional toxicity testing methodologies. Results revealed diverse methodologies applied by participating teams, including classical algorithms such as RFs, SVM, k-nearest neighbors, and NB, alongside state-of-the-art deep learning methods [138]. Among these, deep learning approaches, due to their automatic extraction of hierarchical chemical feature representations, generally demonstrated superior performance. Mayr et al. [139] notably pioneered the application of deep learning models in toxicity prediction, establishing hierarchical chemical feature models that significantly outperformed traditional approaches.
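Multi-endpoint datasets of this kind are usually sparse, because not every compound is measured in every assay, and models are commonly scored with one ROC AUC per endpoint computed over the labeled compounds only. The sketch below illustrates that masked, per-endpoint evaluation with a simple rank-based AUC; the labels and scores are hypothetical toy values, `None` is an assumed marker for a missing measurement, and the official challenge scoring scripts are not reproduced.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC (Mann-Whitney statistic); ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        return None  # AUC is undefined without both classes
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def per_endpoint_auc(label_rows, score_rows, endpoints):
    """Compute one AUC per endpoint, skipping compounds with missing labels."""
    result = {}
    for j, name in enumerate(endpoints):
        pairs = [(y[j], s[j]) for y, s in zip(label_rows, score_rows)
                 if y[j] is not None]
        labels = [y for y, _ in pairs]
        scores = [s for _, s in pairs]
        result[name] = roc_auc(labels, scores)
    return result


# Toy example with two of the twelve endpoints; None marks "not measured".
endpoints = ["NR-AR", "SR-MMP"]
label_rows = [[1, 0], [0, None], [1, 1], [0, 0]]
score_rows = [[0.9, 0.2], [0.1, 0.7], [0.8, 0.6], [0.4, 0.6]]
print(per_endpoint_auc(label_rows, score_rows, endpoints))
```

Averaging the per-endpoint values gives a single summary figure, but endpoints with very few positives can dominate its variance, so per-endpoint reporting is generally more informative.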
As of March 2025, the Tox21 dataset has evolved into five benchmark tasks: Molecular Property Prediction, Drug Discovery, Graph Regression, Graph Classification, and Molecular Property Prediction (1-shot) (https://paperswithcode.com/dataset/tox21-1). Within these benchmarks, deep learning models, particularly GNNs, dominate in terms of performance. For example, Deep-CBN currently leads in molecular property prediction tasks, whereas CensNet [140] excels in graph regression tasks. GMT [141] demonstrates outstanding performance in graph classification tasks, and Meta-MGNN [142] dominates the single-shot molecular property prediction task, all based upon GNN architectures. Furthermore, within the drug discovery task, six out of the top 10 performing models leverage graph-based architectures, underscoring the unique strengths of GNN-based approaches in molecular representation learning.
Network toxicology and its application in toxicity evaluation of traditional Chinese medicine
TCM, characterized by a longstanding history and widespread clinical application, remains an essential component of China’s healthcare heritage. Western medicines, typically obtained through chemical synthesis or natural extraction, act on defined targets and provide rapid onset but usually address single symptoms or specific diseases; TCMs, by contrast, possess distinct advantages due to their multicomponent, multitarget nature [145]. TCM formulations consist of diverse bioactive ingredients that interact complexly to produce comprehensive therapeutic effects through multiple targets and pathways. Notably, during the COVID-19 pandemic, TCM demonstrated its irreplaceable role by intervening at multiple stages of viral infection and regulating immune responses, effectively leveraging its multicomponent, multitarget strategy [146]. Nevertheless, as TCM advances rapidly, increasing concerns regarding its potential toxicities have emerged [147–149], making TCM toxicity evaluation a critical area in contemporary research. However, the complex composition and multifaceted mechanisms of action of TCM pose considerable challenges for accurate safety assessment.
Network toxicology, an emerging approach derived from network pharmacology, has become a valuable tool for assessing drug toxicity. Liu and colleagues extended network modeling from “drug–target–efficacy” to “drug–target–adverse reaction,” transforming traditional pharmacological databases into specialized toxicological databases, thereby enhancing the precision of toxicological predictions [150]. Utilizing network analysis and prediction methodologies, network toxicology has successfully identified toxic constituents within TCMs and elucidated their molecular mechanisms of toxicity, providing novel insights into TCM safety evaluations [151].
The application of network toxicology in TCM toxicity evaluation generally comprises the following steps: (i) Data collection and curation: Initially, it is crucial to acquire detailed information about TCM constituents from literature reviews, databases (such as CTD [91] and TCMSP [152]), and experimental analyses. (ii) Target prediction and identification: Toxicity targets are predicted using computational tools (e.g. Hazard Expert [153], TOPKAT [154], and DEREK [155]), followed by comparative analyses with TCM-derived toxic compound targets, identifying potential toxic targets. (iii) Network construction and analysis: Toxic compounds and their targets serve as nodes within a network, visualized using tools such as Cytoscape [156], enabling topological analysis and identifying critical nodes. (iv) Toxicity mechanism and pathway analysis: The selected targets undergo Gene Ontology (GO) and KEGG enrichment analyses to clarify potential molecular mechanisms and signaling pathways underlying toxicity. (v) Molecular docking and experimental validation: Molecular docking simulations (e.g. using AutoDock [157]) assess compound-target binding interactions, while subsequent in vitro cellular or in vivo animal experiments validate predicted mechanisms and targets identified via network toxicology (Fig. 6).
Figure 6.
Workflow for network toxicology–based toxicity prediction.
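Two computational steps of this workflow, network analysis (iii) and enrichment analysis (iv), can be sketched in a few lines. The example counts how many compounds link to each target as a simple degree-based indicator of hub nodes, and computes an over-representation p-value with the hypergeometric test, which is a common basis for GO and KEGG enrichment. All compound and target names and all counts are hypothetical placeholders; dedicated tools such as Cytoscape provide far richer topological measures.

```python
from collections import Counter
from math import comb


def target_degrees(edges):
    """Count the distinct compounds linked to each target."""
    return Counter(target for _, target in set(edges))


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """Hypergeometric P(X >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   predicted toxicity targets
    n_overlap:    predicted targets that belong to the pathway
    """
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / comb(n_background, n_selected)


# Hypothetical compound-target edges (placeholder identifiers).
edges = [
    ("compound_A", "target_1"),
    ("compound_A", "target_2"),
    ("compound_B", "target_1"),
    ("compound_C", "target_1"),
    ("compound_B", "target_2"),
]
print(target_degrees(edges).most_common(1))
print(enrichment_p_value(n_background=10, n_pathway=4, n_selected=3, n_overlap=2))
```

When many pathways are tested, the resulting p-values need a multiple-testing correction (e.g. a false discovery rate procedure) before pathways are reported as enriched.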
Commonly utilized resources in network toxicology include toxicological databases, TCM-related constituent databases, predictive toxicological software, and visualization platforms. Reliable toxicological data and accurate TCM constituent information are vital. For example, TOXNET, developed by the U.S. NLM, provides extensive toxicological and chemical data relevant to environmental health and pharmacology. TCM databases such as TCMID [158] and TCMSP [152] (Table 7) facilitate access to constituent information. Predictive toxicology tools described previously (section 3.2) are extensively used for identifying toxic compounds, exogenous substance toxicity, carcinogenicity, and sensitization risks. Although visualization platforms like Cytoscape [156] and STRING [159] are ancillary, they significantly aid in the intuitive interpretation and analysis of complex biomolecular interaction networks.
Table 7.
Databases of TCM Ingredients
| Database | URL | Key features |
|---|---|---|
| TCMID [158] | http://www.megabionet.org/tcmid/ | Integrates herbs, compounds, prescriptions, targets, and diseases; supports network visualization of herb–target–disease relationships. |
| TCMSP [152] | http://tcmsp-e.com/ | Covers 499 official Chinese herbs with compounds, protein targets, and diseases; provides ADME properties and built-in compound–target–disease network construction. |
| TCM@Taiwan [166] | http://tcm.cmu.edu.tw/ | Hosts >20 000 pure compounds from 453 herbs with curated 2D/3D structures; supports virtual screening and molecular docking. |
| TCM-Mesh [167] | http://mesh.tcm.microbioinformatics.org/ | Builds herb–compound–target–disease networks including side-effect and toxicity annotations for holistic safety assessment. |
| TCMGeneDIT [168] | http://tcm.lifescience.ntu.edu.tw/ | Curates literature-mined associations among herbs, genes, and diseases; enables exploration of TCM’s molecular mechanisms. |
| HIM [169] | http://www.bioinformatics.org.cn/ | Integrates metabolomic, bioactivity, toxicity, and ADMET data for TCM compounds; supports multi-omics safety evaluation. |
| TCMSTD [170] | https://www.bic.ac.cn/TCMSTD/ | The first system-toxicology database for TCM, with sections on five major toxicities and standardized toxic-target annotations. |
| TCM Bank [171] | https://TCMBank.cn/ | Aggregates active ingredients, 3D structures, gene targets, pathways, and disease associations into a unified, queryable platform. |
| DCABM-TCM [172] | http://bionet.ncpsb.org.cn/dcabm-tcm/ | Provides active compounds and targets, pharmacodynamic mechanisms, plus integrated ADMET profiles for each ingredient. |
| BATMAN-TCM 2.0 [173] | http://bionet.ncpsb.org.cn/batman-tcm/ | Predicts herb-compound–protein interactions and constructs chemical–target–disease networks for candidate selection. |
Network toxicology continues to evolve, showcasing distinct advantages in TCM toxicity research. Researchers initially predict toxic targets and mechanisms using network toxicology, followed by empirical validations to build comprehensive safety evaluation frameworks for TCM. For instance, one study investigated the hepatotoxicity of Mesaconitine (MA), a constituent of Aconitum species, using online databases and network toxicology. It identified 31 crucial hepatotoxic targets and suggested that MA may induce hepatic injury via oxidative stress activation, inflammatory response initiation, and apoptosis induction, offering critical insights into the toxicity of aconite-based TCMs [160]. Another study by Xi et al. [161] analyzed 42 nephrotoxic TCM compounds using network toxicology, identifying alkaloids as the predominant toxic class, followed by terpenoids and phenolics, highlighting the necessity of vigilance regarding nephrotoxic risks associated with these compound classes. Lv et al. [162] employed network toxicology and molecular docking to explore acute toxicity mechanisms associated with palmitic acid, a common TCM component, identifying 117 potential cardiac-toxicity-related targets.
Integrating AI with network toxicology represents an innovative research frontier, significantly enhancing toxicity prediction, risk assessment, and drug development processes. Currently, AI predominantly supports molecular interaction prediction, target identification, and molecular feature modeling within network toxicology. CNNs, a principal deep learning architecture, effectively predict pharmacological activities and toxicities through structural deep learning, providing robust technological support for network toxicology [163]. Moreover, GNNs [164] facilitate the construction of drug-target-disease interaction networks, enabling predictions of drug-target affinities and potential toxicity profiles, thereby enriching scientific frameworks for safety evaluations. Tian et al. [165] developed an innovative approach, MHADTI, utilizing multi-view heterogeneous information network embeddings and hierarchical attention mechanisms to efficiently predict drug-target interactions, paving new avenues for mechanistic exploration in network toxicology. Although AI-driven methodologies in network toxicology still require extensive data accumulation and validation, their early successes underscore significant potential to accelerate TCM safety research and modernization efforts.
Drug toxicology research in the era of large language models
The rapid advancement of AI and natural language processing (NLP) technologies in recent years, exemplified by LLMs such as ChatGPT and DeepSeek, has profoundly impacted numerous scientific domains, becoming essential auxiliary tools for researchers [174–177]. Given the enormous, heterogeneous, and complex data involved in drug toxicology and pharmacokinetic safety evaluations, there is an urgent demand for automated analytical tools, where LLMs have demonstrated immense potential.
One of the most notable applications of LLMs in drug toxicity and safety assessment is automated literature analysis and knowledge integration. Traditionally, drug toxicity information scattered across extensive literature, clinical reports, and drug labeling documents requires laborious, subjective manual analyses. Silberg et al. [178] developed the UniTox platform employing the GPT-4 model to automatically extract and categorize toxicity information from drug labeling data of 2418 FDA-approved medications. This initiative successfully created a standardized toxicity database encompassing eight primary toxicological categories, including cardiotoxicity, hepatotoxicity, neurotoxicity, and nephrotoxicity, demonstrating LLMs’ advantages in rapidly and accurately extracting structured toxicity data, thus significantly enhancing toxicological data mining efficiency.
Additionally, LLMs contribute significantly to constructing and enhancing drug safety evaluation databases. Traditional toxicological and ADMET databases often suffer from insufficient scale, inconsistent annotation, and fragmented information, which limits their utility for deep learning model training. To overcome these challenges, Niu et al. [179] developed the PharmaBench platform, utilizing LLMs for automated extraction and integration of information from vast literature sources, drug labels, and public databases, thereby establishing a comprehensive ADMET benchmark dataset consisting of 11 sub-datasets totaling over 50 000 data entries. This initiative significantly improved traditional ADMET resources in both scope and quality, facilitating fair performance comparisons and evaluations of various predictive algorithms.
Furthermore, LLMs also demonstrate considerable potential in molecular toxicity prediction and toxicological mechanism exploration. Unlike conventional models relying on simplistic structural descriptors and statistical learning methods, LLMs leverage contextual understanding from textual molecular representations, yielding superior predictive generalizability. Yang et al. [180] benchmarked GPT-4 and its multimodal variant GPT-4o against conventional ML and deep learning approaches for molecular toxicity prediction. They found that GPT-4 outperformed all comparators across multiple evaluation metrics. Building on this, they integrated GPT-4 with molecular docking techniques to probe the potential cardiotoxicity of TCM compounds, successfully pinpointing several high-risk ingredients and their principal binding sites on cardiac targets. This pioneering work represents the first application of LLMs to molecular toxicity prediction, offering a streamlined, highly efficient workflow for early-stage drug safety screening and underscoring the immense potential of LLMs to accelerate drug development and enhance safety assessments.
Despite their strong text processing abilities, LLMs face challenges in accuracy and reliability due to noisy and inconsistently labeled data, as well as risks of data poisoning [181]. Moreover, their reasoning depends mainly on statistical patterns rather than true causal inference, which restricts their capacity to explain the detailed molecular mechanisms behind drug toxicity [182]. Furthermore, the inherent “black-box” nature and limited transparency of LLM decision-making processes fail to meet stringent interpretability and credibility demands required by drug safety assessments [183]. To mitigate these limitations and enhance trust, several strategies can be employed. These include utilizing post-hoc interpretation techniques (e.g. attention mechanism analysis, feature importance scoring, and counterfactual explanations) to decipher model predictions, and adopting inherently more interpretable model architectures or hybrid approaches that combine LLMs with knowledge graphs for explicit reasoning pathways [184]. Finally, general-purpose LLMs lack profound comprehension of specialized pharmacotoxicological terminology, necessitating domain-specific fine-tuning or the development of dedicated models to enhance prediction accuracy and domain understanding [185].
Looking ahead, LLMs are anticipated to continuously advance drug toxicology research in several key areas. In the domain of clinical applications and safety monitoring, real-time mining of electronic health records and adverse event reports by LLMs can facilitate early alerts and dynamic risk assessments of adverse drug reactions (Fig. 7). Regarding ADMET property prediction, multimodal foundation models integrating molecular structures, gene expression profiles, and metabolomics data are expected to significantly enhance prediction accuracy for complex compounds. For molecular toxicity mechanism exploration, the integration of knowledge graphs and causal inference frameworks will enable deeper elucidation of toxicity pathways. Additionally, automated agents utilizing LLMs can systematically mine and integrate emerging literature and experimental data, thus constructing continuously updated toxicology databases. Lastly, interpretability enhancement tools will improve the transparency and traceability of LLM-driven decision processes, further strengthening their credibility and applicability.
Figure 7.
Prospective applications of LLMs in drug safety evaluation and toxicology research.
Conclusion and perspectives
Computational toxicology is playing an increasingly pivotal role in drug development. Traditional toxicological assessments primarily rely on animal experiments, a methodology characterized by considerable time consumption, high cost, and significant ethical concerns regarding animal welfare. Computational toxicology, integrating advanced bioinformatics, cheminformatics, and ML techniques, facilitates rapid and efficient prediction of compound toxicities, substantially enhancing drug discovery efficiency and reinforcing drug safety [186–188]. In this review, we initially summarized 23 computational tools capable of predicting drug ADMET properties, covering various stages including data input, model training, and prediction output. These methodologies were categorized into three main approaches: rule-based/statistical models, ML-based models, and graph-based methods. Despite demonstrating powerful capabilities in pharmacokinetic prediction and toxicity evaluation, these approaches still face limitations such as inconsistent data quality, inadequate model transparency, and room for improvement in prediction accuracy.
Moreover, recent advances in toxicological databases and toxicity prediction tools were systematically discussed. Toxicological databases, classified according to their data types and application scenarios into chemical toxicity databases, environmental toxicology databases, alternative toxicology databases, and biological toxin databases, provide comprehensive data support for various predictive models. Concurrently, significant progress in ML methodologies has been achieved for toxicity prediction. Researchers have integrated heterogeneous multisource data to construct a series of high-performing models, accurately predicting multiple toxicity endpoints, including acute toxicity, organ-specific toxicity, carcinogenicity, and genetic toxicity. Furthermore, innovative technologies such as GANs have been utilized to generate virtual samples, effectively alleviating data scarcity. Overall, toxicity prediction is progressively evolving toward multimodal integration and generative AI: transitioning from single-endpoint to multi-endpoint joint modeling, expanding data dimensions from traditional chemical structures to clinical data and multi-omics information (e.g. genomics and metabolomics), and shifting from discriminative architectures towards generative network architectures such as GANs and diffusion models. This evolution offers novel paradigms for addressing toxicity evaluations in small sample sizes and complex biological systems [189, 190].
As an emerging paradigm, network toxicology exhibits distinct advantages in the safety assessment of TCM formulas. Researchers construct and analyze multidimensional “component-target-pathway” networks. This approach allows them to clarify the molecular mechanisms behind TCM toxicity, and further provides reliable evidence for safety evaluations. Notably, AI technologies show immense potential in network toxicology: intelligent reasoning based on knowledge graphs can automatically identify potential toxicity pathways, GNNs effectively capture nonlinear relationships within biological networks, and transfer learning offers promising solutions for overcoming modeling bottlenecks posed by limited TCM-specific data [191]. Additionally, this review has explored the prospective applications of LLMs in drug toxicology. Although LLMs have demonstrated substantial potential in automated literature mining, data integration, and toxicity prediction, further research is required to improve their capabilities in causal inference and model interpretability.
Reflecting on current research progress, toxicity prediction methodologies still face several significant challenges: (i) Existing databases are limited in both data quality and coverage, lacking reliable data for novel compounds such as multi-component TCMs or multi-target drugs. (ii) The multidimensional nature of toxicity mechanisms, including metabolic pathways, target interactions, and dynamic cellular responses, remains challenging to fully characterize using a single predictive model. (iii) The “black-box” nature of many ML models hinders the biological interpretability of their predictions, thereby limiting their application in clinical decision-making. Thus, future research urgently requires the integration of multidisciplinary techniques, including network toxicology, systems biology, and AI, to develop dynamic and interpretable toxicity prediction frameworks that more accurately reflect real biological systems. Specifically, future efforts should focus on the following directions:
Future development of prediction methods must prioritize rigorous, transparent, and continuous benchmarking against high-quality in vivo data. Building upon the foundation laid by the Tox21 initiative, we recommend the establishment of a community-driven, iterative evaluation framework. This framework would involve: (i) utilizing existing animal data not just for initial training but as a gold standard for ongoing model validation and refinement in an iterative loop; (ii) creating blinded challenge sets containing novel compounds to impartially assess the generalizability and predictive power of new algorithms against established benchmarks; and (iii) expanding the scope of toxicity endpoints beyond HTS outcomes to include more complex adverse outcomes, thus driving the development of models that can better predict human-relevant toxicity pathways.
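The blinded challenge-set idea in point (ii) can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed standard: the compound identifiers, labels, and the `score_submission` helper are hypothetical, and the Matthews correlation coefficient (MCC) is used here only as one common metric for imbalanced binary toxicity labels.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = toxic, 0 = non-toxic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def score_submission(hidden_labels, submission):
    """Score predictions against labels that only the organizers hold.

    hidden_labels: dict compound_id -> true label (never shared with participants)
    submission:    dict compound_id -> predicted label
    Raises ValueError if any challenge compound lacks a prediction.
    """
    missing = sorted(set(hidden_labels) - set(submission))
    if missing:
        raise ValueError(f"missing predictions for: {missing}")
    ids = sorted(hidden_labels)
    return mcc([hidden_labels[i] for i in ids], [submission[i] for i in ids])

# Hypothetical challenge set: in vivo outcomes withheld from all participants.
hidden = {"C1": 1, "C2": 0, "C3": 1, "C4": 0, "C5": 1, "C6": 0}
model_a = {"C1": 1, "C2": 0, "C3": 1, "C4": 0, "C5": 0, "C6": 0}
model_b = {"C1": 1, "C2": 1, "C3": 0, "C4": 0, "C5": 1, "C6": 1}

# Rank submissions from best to worst MCC.
leaderboard = sorted(
    {"model_a": score_submission(hidden, model_a),
     "model_b": score_submission(hidden, model_b)}.items(),
    key=lambda kv: kv[1],
    reverse=True,
)
```

Keeping the labels on the organizer side and rejecting incomplete submissions are the two design choices that make the comparison impartial; the metric itself could be swapped for any endpoint-appropriate measure.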
Multi-level modeling integrated with systems toxicology: Leveraging systems biology and experimental validation data, researchers should build comprehensive, multidimensional toxicity prediction models spanning molecular, cellular, organ, and organismal levels. For instance, single-cell sequencing combined with metabolomics could dynamically simulate drug distribution and metabolism in vivo, revealing fundamental mechanisms underpinning toxic reactions.
Deepening and innovating AI technologies: Natural language processing techniques can be employed to extract toxicological associations from unstructured literature, establishing dynamically updated toxicity knowledge networks to optimize predictive models. Furthermore, inverse design of low-toxicity molecules and toxicity avoidance strategies could be realized through molecular graph representations and GANs. Applying post hoc explanation methods such as SHAP and LIME, alongside inherently transparent model designs, will further enhance the biological interpretability of model predictions.
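To make the attribution idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy toxicity score by enumerating feature coalitions. It does not use the `shap` package; the `toy_toxicity_score` function, the feature names, and the all-zeros baseline are hypothetical choices made purely for illustration. Exact enumeration is feasible only for a handful of features, which is why practical tools rely on approximations.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows as 2**n, so this is only viable for a few features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return model(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

def toy_toxicity_score(feat):
    """Hypothetical score: two additive terms plus one interaction term."""
    return (0.5 * feat["nitro_group"]
            + 0.2 * feat["logp_high"]
            + 0.3 * feat["nitro_group"] * feat["aromatic_amine"])

x = {"nitro_group": 1, "logp_high": 1, "aromatic_amine": 1}
baseline = {"nitro_group": 0, "logp_high": 0, "aromatic_amine": 0}
phi = shapley_values(toy_toxicity_score, x, baseline)
```

In this toy case the interaction term is split evenly between the two interacting features, and the attributions sum to the difference between the score for `x` and the score for the baseline, which is the additivity property that makes such explanations easy to audit.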
Modernizing toxicity evaluation of TCM: Given the inherent complexity of multicomponent and multitarget TCM formulas, it is imperative to establish integrated evaluation frameworks based on network toxicology. These frameworks should incorporate multidimensional “component-target-pathway-phenotype” data, combining traditional empirical toxicity knowledge with advanced ML models to effectively resolve challenges in tracing toxicity sources and clarifying dose-effect relationships within TCM preparations.
Data standardization and cross-platform collaboration: Promoting standardized global mechanisms for collecting and sharing toxicological information, building comprehensive toxicity databases covering broader chemical spaces (e.g. natural products and nanomedicines), and utilizing transfer learning and federated learning techniques could significantly mitigate overfitting issues arising from limited sample sizes.
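As a rough illustration of how federated learning could let institutions pool modeling effort without exchanging raw toxicity records, the sketch below shows only the aggregation step of federated averaging (FedAvg), in which a server combines locally trained parameter vectors weighted by local sample counts. The site updates and sample counts are invented for the example; local training, secure communication, and privacy safeguards are omitted.

```python
def federated_average(local_updates):
    """Sample-size-weighted average of parameter vectors (FedAvg aggregation).

    local_updates: list of (weights, n_samples) pairs, one per site, where
    weights is a list of floats with the same length at every site.
    """
    if not local_updates:
        raise ValueError("no local updates to aggregate")
    dim = len(local_updates[0][0])
    if any(len(w) != dim for w, _ in local_updates):
        raise ValueError("all sites must report the same number of parameters")
    total = sum(n for _, n in local_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each parameter is averaged across sites, weighted by local data size.
    return [sum(w[j] * n for w, n in local_updates) / total for j in range(dim)]

# Hypothetical updates from three sites: (model parameters, local sample count).
site_updates = [
    ([0.10, -0.40, 0.25], 200),
    ([0.30, -0.20, 0.05], 600),
    ([0.20, -0.60, 0.45], 200),
]
global_weights = federated_average(site_updates)
```

Only parameter vectors and sample counts leave each site in this scheme, so the largest site dominates the average while its underlying records stay local.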
In conclusion, continued technological innovation and research are expected to overcome existing challenges and drive further development in computational toxicology. Such progress promises more efficient and precise toxicity evaluation tools for drug development, facilitating successful drug discovery and clinical application and moving drug research toward greater precision.
Declaration of generative AI and AI-assisted technologies in the writing process
During the preparation of this manuscript, the authors utilized DeepSeek R1 to enhance the language and readability. Following the use of this service, the authors thoroughly reviewed and revised the content as necessary, and take full responsibility for the final version of the publication.
Key Points
ADMET tools have improved with machine learning and graph neural networks. Future focus should be on multi-endpoint predictions integrating chemical, clinical, and multi-omics data, while addressing data quality and interpretability, particularly for complex multi-target drugs.
Toxicity prediction is moving towards multi-endpoint modeling and network toxicology. Future research should integrate genomics and metabolomics data to enhance understanding of toxicity mechanisms and improve model interpretability for regulatory and clinical applications.
LLMs have potential in literature mining, data integration, and molecular toxicity prediction. Future work should focus on optimizing LLMs for toxicology tasks and integrating them with multi-omics and network toxicology models to improve toxicity screening and safety evaluations.
Supplementary Material
Acknowledgements
We would like to thank the anonymous reviewers for valuable suggestions.
Contributor Information
Jiangyan Zhang, School of Pharmacy/School of Modern Chinese Medicine Industry, Chengdu University of Traditional Chinese Medicine, No. 1166, Liutai Avenue, Wenjiang District, Chengdu City, Sichuan Province, 611137, China.
Haolin Li, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, No. 1166, Liutai Avenue, Wenjiang District, Chengdu City, Sichuan Province, 611137, China.
Yuncong Zhang, Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai Institute of Translational Medicine, Zhuhai People's Hospital (The Affiliated Hospital of Beijing Institute of Technology, Zhuhai Clinical Medical College of Jinan University), No. 79, Kangning Road, Xiangzhou District, Zhuhai City, Guangdong Province, 519000, China.
Junyang Huang, Department of Ophthalmology, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, No. 32, Section 2, West Yihuan Road, Qingyang District, Chengdu, Sichuan Province, 610072, China.
Liping Ren, School of Healthcare Technology, Chengdu Neusoft University, No. 1, Neusoft Avenue, Qingchengshan Town, Dujiangyan City, Chengdu, Sichuan Province, 611844, China.
Chuantao Zhang, Department of Respiratory Medicine, Hospital of Chengdu University of Traditional Chinese Medicine, No. 39, Shi'erqiao Road, Jinniu District, Chengdu, Sichuan Province, 610072, China.
Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, High-tech Zone (West Zone), Chengdu, Sichuan Province, 611731, China.
Yang Zhang, Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, No. 1166, Liutai Avenue, Wenjiang District, Chengdu City, Sichuan Province, 611137, China.
Author contributions
Y.Z., Q.Z. and C.Z. conceived the manuscript and outlined it. J.Z., Y.C.Z. and H.L. conducted the literature search and wrote the draft. J.H. and L.R. reviewed and edited the draft. All authors have approved the final review and the submission.
Conflict of interest: None declared.
Funding
This work was supported by the National Natural Science Foundation of China (62471071, 62202069) and the Chengdu Health Commission-Chengdu University of Traditional Chinese Medicine Joint Research Fund (WXLH202402041).
Availability of data and materials
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
References
- 1. Pammolli F, Magazzini L, Riccaboni M. The productivity crisis in pharmaceutical R&D. Nat Rev Drug Discov 2011;10:428–38. 10.1038/nrd3405.
- 2. Dowden H, Munro J. Trends in clinical success rates and therapeutic focus. Nat Rev Drug Discov 2019;18:495–6. 10.1038/d41573-019-00074-z.
- 3. Tran TTV, Surya Wibowo A, Tayara H, et al. Artificial intelligence in drug toxicity prediction: recent advances, challenges, and future perspectives. J Chem Inf Model 2023;63:2628–43. 10.1021/acs.jcim.3c00200.
- 4. Parboosing R, Mzobe G, Chonco L, et al. Cell-based assays for assessing toxicity: A basic guide. Med Chem 2016;13:13–21. 10.2174/1573406412666160229150803.
- 5. Khabib MNH, Sivasanku Y, Lee HB, et al. Alternative animal models in predictive toxicology. Toxicology 2022;465:153053. 10.1016/j.tox.2021.153053.
- 6. Wang N, Li X, Xiao J, et al. Data-driven toxicity prediction in drug discovery: current status and future directions. Drug Discov Today 2024;29:104195. 10.1016/j.drudis.2024.104195.
- 7. Prior H, Sewell F, Stewart J. Overview of 3Rs opportunities in drug discovery and development using non-human primates. Drug Discov Today Dis Model 2017;23:11–6.
- 8. Ekins S. Progress in computational toxicology. J Pharmacol Toxicol Methods 2014;69:115–40. 10.1016/j.vascn.2013.12.003.
- 9. Wang Y, Zeng T, Tang D, et al. Integrated multi-omics analyses reveal lipid metabolic signature in osteoarthritis. J Mol Biol 2025;437:168888.
- 10. Meng X, Yan X, Zhang K, et al. The application of large language models in medicine: A scoping review. iScience 2024;27:109713.
- 11. Momanyi BM, Zhou YW, Grace-Mercure BK, et al. SAGESDA: multi-GraphSAGE networks for predicting SnoRNA-disease associations. Curr Res Struct Biol 2024;7:100122. 10.1016/j.crstbi.2023.100122.
- 12. Guengerich FP. Mechanisms of drug toxicity and relevance to pharmaceutical development. Drug Metab Pharmacokinet 2011;26:3–14. 10.2133/dmpk.DMPK-10-RV-062.
- 13. Liu Y, Li H, Zeng T, et al. Integrated bulk and single-cell transcriptomes reveal pyroptotic signature in prognosis and therapeutic options of hepatocellular carcinoma by combining deep learning. Brief Bioinform 2023;25:bbad487.
- 14. Mulliner D, Schmidt F, Stolte M, et al. Computational models for human and animal hepatotoxicity with a global application scope. Chem Res Toxicol 2016;29:757–67. 10.1021/acs.chemrestox.5b00465.
- 15. Khan MZI, Ren JN, Cao C, et al. Comprehensive hepatotoxicity prediction: ensemble model integrating machine learning and deep learning. Front Pharmacol 2024;15:1441587.
- 16. Liu J, Khan MKH, Guo W, et al. Machine learning and deep learning approaches for enhanced prediction of hERG blockade: A comprehensive QSAR modeling study. Expert Opin Drug Metab Toxicol 2024;20:665–84. 10.1080/17425255.2024.2377593.
- 17. Mahapatra M, Sahu C, Mohapatra S. Trends of artificial intelligence (AI) use in drug targets, discovery and development: current status and future perspectives. Curr Drug Targets 2025;26:221–42.
- 18. Zhang Y, Liu C, Liu M, et al. Attention is all you need: utilizing attention in AI-enabled drug discovery. Brief Bioinform 2023;25:bbad467.
- 19. Zulfiqar H, Guo Z, Ahmad RM, et al. Deep-STP: A deep learning-based approach to predict snake toxin proteins by using word embeddings. Front Med (Lausanne) 2024;10:1291352.
- 20. Zhang L, Zhang H, Ai H, et al. Applications of machine learning methods in drug toxicity prediction. Curr Top Med Chem 2018;18:987–97. 10.2174/1568026618666180727152557.
- 21. Wu Z, Jiang D, Wang J, et al. Mining toxicity information from large amounts of toxicity data. J Med Chem 2021;64:6924–36. 10.1021/acs.jmedchem.1c00421.
- 22. Wu Z, Chen J, Li Y, et al. From black boxes to actionable insights: A perspective on explainable artificial intelligence for scientific discovery. J Chem Inf Model 2023;63:7617–27. 10.1021/acs.jcim.3c01642.
- 23. Guo W, Liu J, Dong F, et al. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023;248:1952–73. 10.1177/15353702231209421.
- 24. Masarone S, Beckwith KV, Wilkinson MR, et al. Advancing predictive toxicology: overcoming hurdles and shaping the future. Dig Dis 2024;4:303–15.
- 25. Sharma B, Chenthamarakshan V, Dhurandhar A, et al. Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations. Sci Rep 2023;13:4908.
- 26. Wu P, Lin S, Cao G, et al. Absorption, distribution, metabolism, excretion and toxicity of microplastics in the human body and health implications. J Hazard Mater 2022;437:129361. 10.1016/j.jhazmat.2022.129361.
- 27. Waring MJ, Arrowsmith J, Leach AR, et al. An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nat Rev Drug Discov 2015;14:475–86. 10.1038/nrd4609.
- 28. Adamson RH. The acute lethal dose 50 (LD50) of caffeine in albino rats. Regul Toxicol Pharmacol 2016;80:274–6. 10.1016/j.yrtph.2016.07.011.
- 29. Yi J-C, Yang Z-Y, Zhao W-T, et al. ChemMORT: an automatic ADMET optimization platform using deep learning and multi-objective particle swarm optimization. Brief Bioinform 2024;25:bbae008.
- 30. Daina A, Michielin O, Zoete V. SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep 2017;7:42717.
- 31. Gu Y, Yu Z, Wang Y, et al. admetSAR3.0: A comprehensive platform for exploration, prediction and optimization of chemical ADMET properties. Nucleic Acids Res 2024;52:W432–W438. 10.1093/nar/gkae298.
- 32. Fu L, Shi S, Yi J, et al. ADMETlab 3.0: an updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support. Nucleic Acids Res 2024;52:W422–W431. 10.1093/nar/gkae236.
- 33. Schyman P, Liu R, Desai V, et al. vNN web server for ADMET predictions. Front Pharmacol 2017;8:889.
- 34. Tian H, Ketkar R, Tao P. ADMETboost: A web server for accurate ADMET prediction. J Mol Model 2022;28:408.
- 35. Lagorce D, Bouslama L, Becot J, et al. FAF-Drugs4: free ADME-tox filtering computations for chemical biology and early stages drug discovery. Bioinformatics 2017;33:3658–60. 10.1093/bioinformatics/btx491.
- 36. Swanson K, Walther P, Leitz J, et al. ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. Bioinformatics 2024;40:btae416. 10.1101/2023.12.28.573531.
- 37. Banerjee P, Kemmler E, Dunkel M, et al. ProTox 3.0: A webserver for the prediction of toxicity of chemicals. Nucleic Acids Res 2024;52:W513–W520. 10.1093/nar/gkae303.
- 38. Di Stefano M, Galati S, Piazza L, et al. VenomPred 2.0: A novel in silico platform for an extended and human interpretable toxicological profiling of small molecules. J Chem Inf Model 2024;64:2275–89. 10.1021/acs.jcim.3c00692.
- 39. Shao CY, Su BH, Tu YS, et al. CypRules: A rule-based P450 inhibition prediction server. Bioinformatics 2015;31:1869–71. 10.1093/bioinformatics/btv043.
- 40. Wishart DS, Tian S, Allen D, et al. BioTransformer 3.0-a web server for accurately predicting metabolic transformation products. Nucleic Acids Res 2022;50:W115–W123. 10.1093/nar/gkac313.
- 41. Matlock MK, Hughes TB, Swamidass SJ. XenoSite server: A web-available site of metabolism prediction tool. Bioinformatics 2015;31:1136–7. 10.1093/bioinformatics/btu761.
- 42. Rudik A, Dmitriev A, Lagunin A, et al. SOMP: web server for in silico prediction of sites of metabolism for drug-like compounds. Bioinformatics 2015;31:2046–8. 10.1093/bioinformatics/btv087.
- 43. Olsen L, Montefiori M, Tran KP, et al. SMARTCyp 3.0: enhanced cytochrome P450 site-of-metabolism prediction server. Bioinformatics 2019;35:3174–5. 10.1093/bioinformatics/btz037.
- 44. Zhang Y, Pan X, Shi T, et al. P450Rdb: A manually curated database of reactions catalyzed by cytochrome P450 enzymes. J Adv Res 2024;63:35–42. 10.1016/j.jare.2023.10.012.
- 45. Zou X, Ren L, Cai P, et al. Accurately identifying hemagglutinin using sequence information and machine learning methods. Front Med (Lausanne) 2023;10:1281880.
- 46. Manavalan B, Lee J. FRTpred: A novel approach for accurate prediction of protein folding rate and type. Comput Biol Med 2022;149:105911. 10.1016/j.compbiomed.2022.105911.
- 47. Basith S, Pham NT, Manavalan B, et al. SEP-AlgPro: an efficient allergen prediction tool utilizing traditional machine learning and deep learning techniques with protein language model features. Int J Biol Macromol 2024;273:133085. 10.1016/j.ijbiomac.2024.133085.
- 48. Pham NT, Zhang Y, Rakkiyappan R, et al. HOTGpred: enhancing human O-linked threonine glycosylation prediction using integrated pretrained protein language model-based features and multi-stage feature selection approach. Comput Biol Med 2024;179:108859. 10.1016/j.compbiomed.2024.108859.
- 49. Zheng L, Liu D, Li YA, et al. RaacFold: A webserver for 3D visualization and analysis of protein structure by using reduced amino acid alphabets. Nucleic Acids Res 2022;50:W633–8. 10.1093/nar/gkac415.
- 50. Shi XX, Wang ZZ, Wang YL, et al. AquaticTox: A web-based tool for aquatic toxicity evaluation based on ensemble learning to facilitate the screening of green chemicals. Environ Health (Wash) 2024;2:202–11. 10.1021/envhealth.4c00014.
- 51. Pires DE, Blundell TL, Ascher DB. pkCSM: predicting small-molecule pharmacokinetic and toxicity properties using graph-based signatures. J Med Chem 2015;58:4066–72. 10.1021/acs.jmedchem.5b00104.
- 52. Wei Y, Li S, Li Z, et al. Interpretable-ADMET: A web service for ADMET prediction and optimization based on deep neural representation. Bioinformatics 2022;38:2863–71. 10.1093/bioinformatics/btac192.
- 53. Zhang S, Yan Z, Huang Y, et al. HelixADMET: A robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer. Bioinformatics 2022;38:3444–53. 10.1093/bioinformatics/btac342.
- 54. Hsiao Y, Su BH, Tseng YJ. Current development of integrated web servers for preclinical safety and pharmacokinetics assessments in drug development. Brief Bioinform 2021;22:bbaa160.
- 55. Venkatraman V. FP-ADMET: A compendium of fingerprint-based ADMET prediction models. J Cheminform 2021;13:75.
- 56. Wei M, Zhang X, Pan X, et al. HobPre: accurate prediction of human oral bioavailability for small molecules. J Cheminform 2022;14:1.
- 57. Yi J, Shi S, Fu L, et al. OptADMET: A web-based tool for substructure modifications to improve ADMET properties of lead compounds. Nat Protoc 2024;19:1105–21. 10.1038/s41596-023-00942-4.
- 58. Myung Y, de Sá AGC, Ascher DB. Deep-PK: deep learning for small molecule pharmacokinetic and toxicity prediction. Nucleic Acids Res 2024;52:W469–W475. 10.1093/nar/gkae254.
- 59. George J, Singh R, Mahmood Z, et al. Toxicoproteomics: new paradigms in toxicology research. Toxicol Mech Methods 2010;20:415–23. 10.3109/15376511003667842.
- 60. Tanoli Z, Fernández-Torras A, Özcan UO, et al. Computational drug repurposing: approaches, evaluation of in silico resources and case studies. Nat Rev Drug Discov 2025;24:521–42. 10.1038/s41573-025-01164-x.
- 61. Kim S, Chen J, Cheng T, et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 2021;49:D1388–D1395. 10.1093/nar/gkaa971.
- 62. Zdrazil B, Felix E, Hunter F, et al. The ChEMBL database in 2023: A drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res 2024;52:D1180–D1192. 10.1093/nar/gkad1004.
- 63. Wu L, Yan B, Han J, et al. TOXRIC: A comprehensive database of toxicological data and benchmarks. Nucleic Acids Res 2023;51:D1432–D1445. 10.1093/nar/gkac1074.
- 64. Knox C, Wilson M, Klinger CM, et al. DrugBank 6.0: the DrugBank knowledgebase for 2024. Nucleic Acids Res 2024;52:D1265–D1275. 10.1093/nar/gkad976.
- 65. Schmidt U, Struck S, Gruening B, et al. SuperToxic: A comprehensive database of toxic compounds. Nucleic Acids Res 2009;37:D295–9. 10.1093/nar/gkn850.
- 66. Williams AJ, Grulke CM, Edwards J, et al. The CompTox chemistry dashboard: A community data resource for environmental chemistry. J Cheminform 2017;9:61.
- 67. Tomasulo P. ChemIDplus-super source for chemical and drug information. Med Ref Serv Q 2002;21:53–9. 10.1300/J115v21n01_04.
- 68. Fonger GC, Hakkinen P, Jordan S, et al. The National Library of Medicine's (NLM) Hazardous Substances Data Bank (HSDB): background, recent enhancements and future plans. Toxicology 2014;325:209–16. 10.1016/j.tox.2014.09.003.
- 69. Chen M, Suzuki A, Thakkar S, et al. DILIrank: the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans. Drug Discov Today 2016;21:648–53. 10.1016/j.drudis.2016.02.015.
- 70. Thakkar S, Li T, Liu Z, et al. Drug-induced liver injury severity and toxicity (DILIst): binary classification of 1279 drugs by human hepatotoxicity. Drug Discov Today 2020;25:201–8. 10.1016/j.drudis.2019.09.022.
- 71. Chen M, Vijay V, Shi Q, et al. FDA-approved drug labeling for the study of drug-induced liver injury. Drug Discov Today 2011;16:697–703. 10.1016/j.drudis.2011.05.007.
- 72. Du F, Yu H, Zou B, et al. hERGCentral: A large database to store, retrieve, and analyze compound-human ether-à-go-go related gene channel interactions to facilitate cardiotoxicity assessment in drug development. Assay Drug Dev Technol 2011;9:580–8. 10.1089/adt.2011.0425.
- 73. Fitzpatrick RB. CPDB: carcinogenic potency database. Med Ref Serv Q 2008;27:303–11. 10.1080/02763860802198895.
- 74. Sushko I, Novotarskyi S, Körner R, et al. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput Aided Mol Des 2011;25:533–54. 10.1007/s10822-011-9440-2.
- 75. Wishart D, Arndt D, Pon A, et al. T3DB: the toxic exposome database. Nucleic Acids Res 2015;43:D928–34. 10.1093/nar/gku1004.
- 76. Bell SM, Phillips J, Sedykh A, et al. An integrated chemical environment to support 21st-century toxicology. Environ Health Perspect 2017;125:054501.
- 77. Kuhn M, Letunic I, Jensen LJ, et al. The SIDER database of drugs and side effects. Nucleic Acids Res 2015;44:D1075–9.
- 78. Tanaka Y, Chen HY, Belloni P, et al. OnSIDES database: extracting adverse drug events from drug labels using natural language processing models. Med 2025;6:100642. 10.1016/j.medj.2025.100642.
- 79. Lindquist M. VigiBase, the WHO global ICSR database system: basic facts. Drug Inf J 2008;42:409–19. 10.1177/009286150804200501.
- 80. Shimabukuro TT, Nguyen M, Martin D, et al. Safety monitoring in the vaccine adverse event reporting system (VAERS). Vaccine 2015;33:4398–405. 10.1016/j.vaccine.2015.07.035.
- 81. Olker JH, Elonen CM, Pilli A, et al. The ECOTOXicology knowledgebase: A curated database of ecologically relevant toxicity tests to support environmental research and risk assessment. Environ Toxicol Chem 2022;41:1520–39. 10.1002/etc.5324.
- 82. Wexler P. TOXNET: an evolving web resource for toxicology and environmental health information. Toxicology 2001;157:3–10. 10.1016/S0300-483X(00)00337-1.
- 83. Connors KA, Beasley A, Barron MG, et al. Creation of a curated aquatic toxicology database: EnviroTox. Environ Toxicol Chem 2019;38:1062–73. 10.1002/etc.4382.
- 84. Lewis KA, Tzilivakis J, Warner DJ, et al. An international database for pesticide risk assessments and management. Hum Ecol Risk Assess 2016;22:1050–64.
- 85. Zhang Y, Yang Y, Ren L, et al. Predicting intercellular communication based on metabolite-related ligand-receptor interactions with MRCLinkdb. BMC Biol 2024;22:152.
- 86. Attene-Ramos MS, Miller N, Huang R, et al. The Tox21 robotic platform for the assessment of environmental chemicals: from vision to reality. Drug Discov Today 2013;18:716–23. 10.1016/j.drudis.2013.05.015.
- 87. Igarashi Y, Nakatsu N, Yamashita T, et al. Open TG-GATEs: A large-scale toxicogenomics database. Nucleic Acids Res 2015;43:D921–7. 10.1093/nar/gku955.
- 88. Nair SK, Eeles C, Ho C, et al. ToxicoDB: an integrated database to mine and visualize large-scale toxicogenomic datasets. Nucleic Acids Res 2020;48:W455–W462. 10.1093/nar/gkaa390.
- 89. Smith AJ. Norecopa: A global knowledge base of resources for improving animal research and testing. Front Vet Sci 2023;10:1119923.
- 90. Jennen DGJ, Magkoufopoulou C, Ketelslegers HB, et al. Comparison of HepG2 and HepaRG by whole-genome gene expression analysis for the purpose of chemical hazard identification. Toxicol Sci 2010;115:66–79. 10.1093/toxsci/kfq026.
- 91. Davis AP, Grondin CJ, Johnson RJ, et al. Comparative Toxicogenomics database (CTD): update 2021. Nucleic Acids Res 2021;49:D1138–d1143. 10.1093/nar/gkaa891. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92. Wang G, Wu H, Liao Y. et al. BioTD:An Online Database of Biotoxins. arXiv[Preprint]:2412.20038[q-bio.BM]. 10.48550/arXiv.2412.20038. [DOI]
- 93. Zhang D, Tian Y, Tian Y, et al. A data-driven integrative platform for computational prediction of toxin biotransformation with a case study. J Hazard Mater 2021;408:124810. 10.1016/j.jhazmat.2020.124810. [DOI] [PubMed] [Google Scholar]
- 94. Günthardt BF, Hollender J, Hungerbühler K, et al. Comprehensive toxic plants-Phytotoxins database and its application in assessing aquatic micropollution potential. J Agric Food Chem 2018;66:7577–88. 10.1021/acs.jafc.8b01639. [DOI] [PubMed] [Google Scholar]
- 95. Habauzit D, Lemée P, Fessard V. MycoCentral: an innovative database to compile information on mycotoxins and facilitate hazard prediction. Food Control 2024;159:110273. 10.1016/j.foodcont.2023.110273. [DOI] [Google Scholar]
- 96. He QY, He QZ, Deng XC, et al. ATDB: A uni-database platform for animal toxins. Nucleic Acids Res 2008;36:D293–7. 10.1093/nar/gkm832. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97. Tan PT, Veeramani A, Srinivasan KN, et al. SCORPION2: A database for structure-function analysis of SCORPION toxins. Toxicon 2006;47:356–63. 10.1016/j.toxicon.2005.12.001. [DOI] [PubMed] [Google Scholar]
- 98. Kaas Q, Yu R, Jin AH, et al. ConoServer: updated content, knowledge, and discovery tools in the conopeptide database. Nucleic Acids Res 2012;40:D325–30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99. Ji J, Zhang D, Ye J, et al. MycotoxinDB: A data-driven platform for investigating masked forms of mycotoxins. J Agric Food Chem 2023;71:9501–7. 10.1021/acs.jafc.3c01403. [DOI] [PubMed] [Google Scholar]
- 100. Danov A, Segev O, Bograd A, et al. Toxinome-the bacterial protein toxin database. MBio 2024;15:e0191123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101. Rodea-Palomares I, Bone AJ. Predictive value of the ToxCast/Tox21 high throughput toxicity screening data for approximating in vivo ecotoxicity endpoints and ecotoxicological risk in eco- surveillance applications. Sci Total Environ 2024;914:169783. [DOI] [PubMed] [Google Scholar]
- 102. Ahmed Z, Shahzadi K, Jin Y, et al. Identification of RNA-dependent liquid-liquid phase separation proteins using an artificial intelligence strategy. Proteomics 2024;24:2400044. [DOI] [PubMed] [Google Scholar]
- 103. Ma J, Motsinger-Reif A. Prediction of synergistic drug combinations using PCA-initialized deep learning. BioData Mining 2021;14:46. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104. Xu Y, Liu T, Yang Y, et al. ACVPred: enhanced prediction of anti-coronavirus peptides by transfer learning combined with data augmentation. Futur Gener Comput Syst 2024;160:305–15. 10.1016/j.future.2024.06.008. [DOI] [Google Scholar]
- 105. Liu T, Qiao H, Wang Z, et al. CodLncScape provides a self-enriching framework for the systematic collection and exploration of coding LncRNAs. Adv Sci 2024;11:2400009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106. Liu T, Huang J, Luo D, et al. Cm-siRPred: predicting chemically modified siRNA efficiency based on multi-view learning strategy. Int J Biol Macromol 2024;264:130638. 10.1016/j.ijbiomac.2024.130638. [DOI] [PubMed] [Google Scholar]
- 107. Gangwal A, Lavecchia A. Artificial intelligence in natural product drug discovery: current applications and future perspectives. J Med Chem 2025;68:3948–69. 10.1021/acs.jmedchem.4c01257. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108. Jaganathan K, Tayara H, Chong KT. An explainable supervised machine learning model for predicting respiratory toxicity of chemicals using optimal molecular descriptors. Pharmaceutics 2022;14:832. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109. Ribeiro MT, Singh S, Guestrin C. "Why should I trust you?": explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, 1135–44. arXiv[Preprint]:1602.04938[cs.LG]. 10.48550/arXiv.1602.04938. [DOI]
- 110. Gal Y, Ghahramani Z. Dropout as a Bayesian approximation: representing model uncertainty in deep learning. Published in ICML 2016. arXiv[Preprint]:1506.02142. 10.48550/arXiv.1506.02142. [DOI] [Google Scholar]
- 111. Abdar M, Pourpanah F, Hussain S, et al. A review of uncertainty quantification in deep learning: techniques, applications and challenges. Information Fusion 2021;76:243–97. 10.1016/j.inffus.2021.05.008. [DOI] [Google Scholar]
- 112. Wang S, Sun H, Liu H, et al. ADMET evaluation in drug discovery. 16. Predicting hERG blockers by combining multiple pharmacophores and machine learning approaches. Mol Pharm 2016;13:2855–66. 10.1021/acs.molpharmaceut.6b00471. [DOI] [PubMed] [Google Scholar]
- 113. Cai C, Guo P, Zhou Y, et al. Deep learning-based prediction of drug-induced cardiotoxicity. J Chem Inf Model 2019;59:1073–84. 10.1021/acs.jcim.8b00769. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114. Ryu JY, Lee MY, Lee JH, et al. DeepHIT: A deep learning framework for prediction of hERG-induced cardiotoxicity. Bioinformatics 2020;36:3049–55. 10.1093/bioinformatics/btaa075. [DOI] [PubMed] [Google Scholar]
- 115. Arab I, Egghe K, Laukens K, et al. Benchmarking of small molecule feature representations for hERG, Nav1.5, and Cav1.2 cardiotoxicity prediction. J Chem Inf Model 2024;64:2515–27. 10.1021/acs.jcim.3c01301. [DOI] [PubMed] [Google Scholar]
- 116. Kyro GW, Martin MT, Watt ED, et al. CardioGenAI: A machine learning-based framework for re-engineering drugs for reduced hERG liability. J Cheminform 2025;17:30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117. Guan J, Dong D, Xie P, et al. StackDILI: enhancing drug-induced liver injury prediction through stacking strategy with effective molecular representations. J Chem Inf Model 2025;65:1027–39. 10.1021/acs.jcim.4c02079. [DOI] [PubMed] [Google Scholar]
- 118. Lee S, Yoo S. InterDILI: interpretable prediction of drug-induced liver injury through permutation feature importance and attention mechanism. J Cheminform 2024;16:1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119. Amin SA, Kar S, Piotto S. pDILI_v1: A web-based machine learning tool for predicting drug-induced liver injury (DILI) integrating chemical space analysis and molecular fingerprints. ACS Omega 2025;10:13502–14. 10.1021/acsomega.5c00075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120. Seal S, Williams D, Hosseini-Gerami L, et al. Improved detection of drug-induced liver injury by integrating predicted in vivo and in vitro data. Chem Res Toxicol 2024;37:1290–305. 10.1021/acs.chemrestox.4c00015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121. Jin Y, Shou Y, Lei Q, et al. An entropy weight method to integrate big omics and mechanistically evaluate DILI. Hepatology 2024;79:1264–78. 10.1097/HEP.0000000000000628. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122. Chen X, Roberts R, Tong W, et al. Tox-GAN: an artificial intelligence approach alternative to animal studies—A case study with Toxicogenomics. Toxicol Sci 2022;186:242–59. 10.1093/toxsci/kfab157. [DOI] [PubMed] [Google Scholar]
- 123. Shi Y, Hua Y, Wang B, et al. In silico prediction and insights into the structural basis of drug induced nephrotoxicity. Front Pharmacol 2022;12:793332. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124. Gong Y, Teng D, Wang Y, et al. In silico prediction of potential drug-induced nephrotoxicity with machine learning methods. J Appl Toxicol 2022;42:1639–50. 10.1002/jat.4331. [DOI] [PubMed] [Google Scholar]
- 125. Mazumdar B, Sarma PKD, Mahanta HJ. Predicting renal toxicity of compounds with deep learning and machine learning methods. SN Computer Science 2023;4:812. [Google Scholar]
- 126. Nguyen-Vo TH, Bui L, Do TTT, et al. Identifying nephrotoxicity of small molecules using machine learning. In: TENCON 2024–2024 IEEE Region 10 Conference (TENCON). Singapore, Singapore: IEEE, 2024, pp. 482–85.
- 127. Su R, Yang H, Wei L, et al. A multi-label learning model for predicting drug-induced pathology in multi-organ based on toxicogenomics data. PLoS Comput Biol 2022;18:e1010402. 10.1371/journal.pcbi.1010402. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128. Ryu JY, Jang WD, Jang J, et al. PredAOT: A computational framework for prediction of acute oral toxicity based on multiple random forest models. BMC Bioinformatics 2023;24:66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129. Wijeyesakere SJ, Auernhammer T, Parks A, et al. Profiling mechanisms that drive acute oral toxicity in mammals and its prediction via machine learning. Toxicol Sci 2023;193:18–30. 10.1093/toxsci/kfad025. [DOI] [PubMed] [Google Scholar]
- 130. Lou S, Yu Z, Huang Z, et al. In silico prediction of chemical acute dermal toxicity using explainable machine learning methods. Chem Res Toxicol 2024;37:513–24. 10.1021/acs.chemrestox.4c00012. [DOI] [PubMed] [Google Scholar]
- 131. Borba JVB, Alves VM, Braga RC, et al. STopTox: an in silico alternative to animal testing for acute systemic and topical toxicity. Environ Health Perspect 2022;130:27012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132. Jain S, Siramshetty VB, Alves VM, et al. Large-scale modeling of multispecies acute toxicity end points using consensus of multitask deep learning methods. J Chem Inf Model 2021;61:653–63. 10.1021/acs.jcim.0c01164. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 133. Wang Y-W, Huang L, Jiang S-W, et al. CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens. Food Chem Toxicol 2020;135:110921. 10.1016/j.fct.2019.110921. [DOI] [PubMed] [Google Scholar]
- 134. Fradkin P, Young A, Atanackovic L, et al. A graph neural network approach for molecule carcinogenicity prediction. Bioinformatics 2022;38:i84–91. 10.1093/bioinformatics/btac266. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 135. Limbu S, Dakshanamurthy S. Predicting chemical carcinogens using a hybrid neural network deep learning method. Sensors (Basel) 2022;22:8185. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 136. Chen Z, Zhang L, Sun J, et al. DCAMCP: A deep learning model based on capsule network and attention mechanism for molecular carcinogenicity prediction. J Cell Mol Med 2023;27:3117–26. 10.1111/jcmm.17889. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 137. Mittal A, Mohanty SK, Gautam V, et al. Artificial intelligence uncovers carcinogenic human metabolites. Nat Chem Biol 2022;18:1204–13. 10.1038/s41589-022-01110-7. [DOI] [PubMed] [Google Scholar]
- 138. Huang R, Xia M, Nguyen D-T, et al. Tox21Challenge to build predictive models of nuclear receptor and stress response pathways as mediated by exposure to environmental chemicals and drugs. Front Environ Sci 2016;3:85. 10.3389/fenvs.2015.00085. [DOI] [Google Scholar]
- 139. Mayr A, Klambauer G, Unterthiner T, et al. DeepTox: toxicity prediction using deep learning. Front Environ Sci 2016;3:80. 10.3389/fenvs.2015.00080. [DOI] [Google Scholar]
- 140. Jiang X, Ji P, Li S. CensNet: Convolution with Edge-Node Switching in Graph Neural Networks. In: Kraus S (ed.), Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019. Macao, China, August 10-16, 2019, pp. 2656–62.
- 141. Baek J, Kang M, Hwang SJ. Accurate learning of graph representations with graph multiset pooling. arXiv[Preprint]:2102.11533.
- 142. Guo Z, Zhang C, Yu W, et al. Few-shot graph learning for molecular property prediction. In: Proceedings of the Web Conference 2021. Ljubljana, Slovenia. New York, NY, USA: Association for Computing Machinery, 2021, 2559–67.
- 143. Zhang X, Wang H, Du Z, et al. CardiOT: towards interpretable drug cardiotoxicity prediction using optimal transport and Kolmogorov–Arnold networks. IEEE J Biomed Health Inform 2025;29:1759–70. 10.1109/JBHI.2024.3510297. [DOI] [PubMed] [Google Scholar]
- 144. Limbu S, Dakshanamurthy S. Predicting chemical carcinogens using a hybrid neural network deep learning method. Sensors 2022;22:8185. 10.3390/s22218185. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145. Li Y, Zhang Y, Wang Y, et al. A strategy for the discovery and validation of toxicity quality marker of Chinese medicine based on network toxicology. Phytomedicine 2019;54:365–70. 10.1016/j.phymed.2018.01.018. [DOI] [PubMed] [Google Scholar]
- 146. Ren L, Xu Y, Ning L, et al. TCM2COVID: A resource of anti-COVID-19 traditional Chinese medicine with effects and mechanisms. iMeta 2022;1:e42. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147. Yang C-q, Lai C-c, Pan J-c, et al. Maintaining calcium homeostasis as a strategy to alleviate nephrotoxicity caused by evodiamine. Ecotoxicol Environ Saf 2024;281:116563. 10.1016/j.ecoenv.2024.116563. [DOI] [PubMed] [Google Scholar]
- 148. Singh D, Singh R. Pharmacological and therapeutic potential of a natural flavonoid Icariside II in human complication. Curr Drug Targets 2025;26:320–30. 10.2174/0113894501329810241117231839. [DOI] [PubMed] [Google Scholar]
- 149. Subhan I, Siddique YH. Effect of rotenone on the neurodegeneration among different models. Curr Drug Targets 2024;25:530–42. 10.2174/0113894501281496231226070459. [DOI] [PubMed] [Google Scholar]
- 150. Fan X, Zhao X, Jin Y, et al. Network toxicology and its application to traditional Chinese medicine. Zhongguo Zhong Yao Za Zhi 2011;36:2920–2. 10.4268/cjcmm20112104. [DOI] [PubMed] [Google Scholar]
- 151. Li S, Zhang B. Traditional Chinese medicine network pharmacology: theory, methodology and application. Chin J Nat Med 2013;11:110–20. 10.1016/S1875-5364(13)60037-0. [DOI] [PubMed] [Google Scholar]
- 152. Ru J, Li P, Wang J, et al. TCMSP: A database of systems pharmacology for drug discovery from herbal medicines. J Cheminform 2014;6:13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 153. Lewis DF, Bird MG, Jacobs MN. Human carcinogens: an evaluation study via the COMPACT and HazardExpert procedures. Hum Exp Toxicol 2002;21:115–22. 10.1191/0960327102ht233oa. [DOI] [PubMed] [Google Scholar]
- 154. Prival MJ. Evaluation of the TOPKAT system for predicting the carcinogenicity of chemicals. Environ Mol Mutagen 2001;37:55–69. [DOI] [PubMed] [Google Scholar]
- 155. Greene N, Judson PN, Langowski JJ, et al. Knowledge-based expert systems for toxicity and metabolism prediction: DEREK, StAR and METEOR. SAR QSAR Environ Res 1999;10:299–314. [DOI] [PubMed] [Google Scholar]
- 156. Franz M, Lopes CT, Fong D, et al. Cytoscape.js 2023 update: a graph theory library for visualization and analysis. Bioinformatics 2023;39. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157. Forli S, Huey R, Pique ME, et al. Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nat Protoc 2016;11:905–19. 10.1038/nprot.2016.051. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 158. Xue R, Fang Z, Zhang M, et al. TCMID: traditional Chinese medicine integrative database for herb molecular mechanism analysis. Nucleic Acids Res 2013;41:D1089–95. 10.1093/nar/gks1100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 159. Szklarczyk D, Gable AL, Lyon D, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2019;47:D607–D613. 10.1093/nar/gky1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 160. Chen Q, Zhang K, Jiao M, et al. Study on the mechanism of Mesaconitine-induced hepatotoxicity in rats based on Metabonomics and toxicology network. Toxins (Basel) 2022;14:486. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 161. Xi K, Zhang M, Li M, et al. Unveiling the mechanisms of nephrotoxicity caused by nephrotoxic compounds using toxicological network analysis. Mol Ther Nucleic Acids 2023;34:102075. 10.1016/j.omtn.2023.102075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 162. Lv L, Wang X, Wu H. Assessment of palmitic acid toxicity to animal hearts and other major organs based on acute toxicity, network pharmacology, and molecular docking. Comput Biol Med 2023;158:106899. 10.1016/j.compbiomed.2023.106899. [DOI] [PubMed] [Google Scholar]
- 163. Tian Y. Artificial intelligence image recognition method based on convolutional neural network algorithm. IEEE Access 2020;8:125731–44. 10.1109/ACCESS.2020.3006097. [DOI] [Google Scholar]
- 164. Li X, Lin L, Pang L, et al. Application and development trends of network toxicology in the safety assessment of traditional Chinese medicine. J Ethnopharmacol 2025;343:119480. 10.1016/j.jep.2025.119480. [DOI] [PubMed] [Google Scholar]
- 165. Tian Z, Peng X, Fang H, et al. MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms. Brief Bioinform 2022;23:bbac434. 10.1093/bib/bbac434. [DOI] [PubMed] [Google Scholar]
- 166. Chen CY. TCM database@Taiwan: the world's largest traditional Chinese medicine database for drug screening in silico. PLoS One 2011;6:e15939. 10.1371/journal.pone.0015939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 167. Zhang RZ, Yu SJ, Bai H, et al. TCM-mesh: the database and analytical system for network pharmacology analysis for TCM preparations. Sci Rep 2017;7:2821. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 168. Fang YC, Huang HC, Chen HH, et al. TCMGeneDIT: A database for associated traditional Chinese medicine, gene and disease information using text mining. BMC Complement Altern Med 2008;8:58. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 169. Kang H, Tang K, Liu Q, et al. HIM-herbal ingredients in-vivo metabolism database. J Cheminform 2013;5:28. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 170. Song L, Qian W, Yin H, et al. TCMSTD 1.0: A systematic analysis of the traditional Chinese medicine system toxicology database. Sci China Life Sci 2023;66:2189–92. 10.1007/s11427-022-2318-4. [DOI] [PubMed] [Google Scholar]
- 171. Lv Q, Chen G, He H, et al. TCMBank-the largest TCM database provides deep learning-based Chinese-Western medicine exclusion prediction. Signal Transduct Target Ther 2023;8:127. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 172. Liu X, Liu J, Fu B, et al. DCABM-TCM: A database of constituents absorbed into the blood and metabolites of traditional Chinese medicine. J Chem Inf Model 2023;63:4948–59. 10.1021/acs.jcim.3c00365. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 173. Kong X, Liu C, Zhang Z, et al. BATMAN-TCM 2.0: an enhanced integrative database for known and predicted interactions between traditional Chinese medicine ingredients and target proteins. Nucleic Acids Res 2024;52:D1110–D1120. 10.1093/nar/gkad926. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 174. Wei J, Zhuo L, Fu X, et al. DrugReAlign: A multisource prompt framework for drug repurposing based on large language models. BMC Biol 2024;22:226. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 175. Wang M, Lin T, Lin A, et al. Enhancing diagnostic accuracy in rare and common fundus diseases with a knowledge-rich vision-language model. Nat Commun 2025;16:5528. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 176. Pal S, Bhattacharya M, Islam MA, et al. ChatGPT or LLM in next-generation drug discovery and development: pharmaceutical and biotechnology companies can make use of the artificial intelligence-based device for a faster way of drug discovery and development. Int J Surg 2023;109:4382–4. 10.1097/JS9.0000000000000719. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 177. Goh E, Gallo R, Hom J, et al. Large language model influence on diagnostic reasoning: A randomized clinical trial. JAMA Netw Open 2024;7:e2440969. 10.1001/jamanetworkopen.2024.40969. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 178. Silberg J, Swanson K, Simon E, et al. UniTox: leveraging LLMs to curate a unified dataset of drug-induced toxicity from FDA labels. medRxiv [Preprint] 2024;2024.06.21.24309315. [Google Scholar]
- 179. Niu Z, Xiao X, Wu W, et al. PharmaBench: enhancing ADMET benchmarks with large language models. Scientific Data 2024;11:985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 180. Yang H, Xiu J, Yan W, et al. Large language models as tools for molecular toxicity prediction: AI insights into cardiotoxicity. J Chem Inf Model 2025;65:2268–82. 10.1021/acs.jcim.4c01371. [DOI] [PubMed] [Google Scholar]
- 181. Alber DA, Yang Z, Alyakin A, et al. Medical large language models are vulnerable to data-poisoning attacks. Nat Med 2025;31:618–26. 10.1038/s41591-024-03445-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 182. Hakim JB, Painter JL, Ramcharran D, et al. The need for guardrails with large language models in medical safety-critical settings: an artificial intelligence application in the pharmacovigilance ecosystem. arXiv[Preprint]:2407.18322[cs.CL]. 10.48550/arXiv.2407.18322. [DOI]
- 183. Ullah E, Parwani A, Baig MM, et al. Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology—a recent scoping review. Diagn Pathol 2024;19:43. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 184. Rajabi E, Etminani K. Knowledge-graph-based explainable AI: A systematic review. J Inf Sci 2024;50:1019–29. 10.1177/01655515221112844. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 185. Wang C, Li M, He J, et al. A survey for large language models in biomedicine. Artif Intell Med 2025;170:103268. 10.1016/j.artmed.2025.103268. [DOI] [PubMed] [Google Scholar]
- 186. Li M, Peng W, Zhu S, et al. The role of glycolipids and their toxicity in the context of nanomaterials and nanoparticles: A review of the literature. Curr Drug Targets 2025;26:571–85. 10.2174/0113894501347158250305074908. [DOI] [PubMed] [Google Scholar]
- 187. Jia X, Wang T, Zhu H. Advancing computational toxicology by interpretable machine learning. Environ Sci Technol 2023;57:17690–706. 10.1021/acs.est.3c00653. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 188. Wang X, Li F, Chen J, et al. Integration of computational toxicology, Toxicogenomics data mining, and omics techniques to unveil toxicity pathways. ACS Sustain Chem Eng 2021;9:4130–8. 10.1021/acssuschemeng.0c09196. [DOI] [Google Scholar]
- 189. Ford KA. Refinement, reduction, and replacement of animal toxicity tests by computational methods. ILAR J 2017;57:226–33. [DOI] [PubMed] [Google Scholar]
- 190. Kleinstreuer NC, Tetko IV, Tong W. Introduction to special issue: computational toxicology. Chem Res Toxicol 2021;34:171–5. 10.1021/acs.chemrestox.1c00032. [DOI] [PubMed] [Google Scholar]
- 191. Zhai Y, Liu L, Zhang F, et al. Network pharmacology: A crucial approach in traditional Chinese medicine research. Chin Med 2025;20:8. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
Data sharing is not applicable to this article as no new data were created or analyzed in this study.







