Briefings in Bioinformatics. 2025 Oct 6;26(5):bbaf533. doi: 10.1093/bib/bbaf533

Computational toxicology in drug discovery: applications of artificial intelligence in ADMET and toxicity prediction

Jiangyan Zhang 1,#, Haolin Li 2,#, Yuncong Zhang 3,#, Junyang Huang 4, Liping Ren 5, Chuantao Zhang 6, Quan Zou 7, Yang Zhang 8
PMCID: PMC12499773  PMID: 41052279

Abstract

Toxicity risk assessment plays a crucial role in determining the clinical success and market potential of drug candidates. Traditional animal-based testing is costly, time-consuming, and ethically controversial, which has led to the rapid development of computational toxicology. This review surveys over 20 ADMET prediction platforms, categorizing them into rule/statistical-based methods, machine learning (ML) methods, and graph-based methods. We also summarize major toxicological databases into four types: chemical toxicity, environmental toxicology, alternative toxicology, and biological toxin databases, highlighting their roles in model training and validation. Furthermore, we review recent advancements in ML and artificial intelligence (AI) applied to toxicity prediction, covering acute toxicity, organ-specific toxicities, and carcinogenicity. The field is transitioning from single-endpoint predictions to multi-endpoint joint modeling, incorporating multimodal features. We also explore the application of generative modeling techniques and interpretability frameworks to improve the accuracy and credibility of predictions. Additionally, we discuss the use of network toxicology in evaluating the safety of traditional Chinese medicines (TCMs) and the potential of large language models (LLMs) in literature mining, knowledge integration, and molecular toxicity prediction. Finally, we address current challenges, including data quality, model interpretability, and causal inference, and propose future directions such as multi-omics integration, interpretable AI models, and domain-specific LLMs, aiming to provide more efficient and precise technical support for preclinical toxicity assessments in drug development.

Keywords: drug discovery, ADMET prediction, computational toxicology, machine learning, toxin databases, large language models

Introduction

Drug discovery and development constitute a complex systems engineering endeavor that integrates scientific rigor, economic viability, and societal implications, where the primary challenge is balancing the therapeutic efficacy and safety thresholds of candidate compounds [1]. It has been reported that ~30% of preclinical candidate compounds (PCCs) fail due to toxicity issues, and adverse toxicological reactions are the leading cause of drug withdrawal from the market [2, 3]. This reality underscores the strategic importance of toxicity assessment within the drug development pipeline. Toxicological evaluation serves as a pivotal link between fundamental research and clinical translation, significantly influencing not only development timelines and cost control but also public health safety and optimal allocation of healthcare resources [4]. Consequently, establishing efficient, accurate toxicity prediction methodologies has emerged as a global technological imperative in innovative drug discovery.

Traditional toxicity assessment paradigms rely heavily on in vivo animal experiments, typically employing sequential toxicity tests (acute, subacute, and chronic toxicity assays) to characterize the risk profiles of candidate compounds [5]. This approach has extensive historical data, but it no longer meets modern ethical and efficiency standards. On one hand, animal experiments are hindered by uncertainties in cross-species extrapolation, protracted testing durations (typically 6–24 months), and extremely high costs per compound (often exceeding millions of dollars) [6]. On the other hand, the widespread adoption of the “3Rs principle” (replacement, reduction, and refinement) places significant ethical pressure on traditional animal-based methodologies [7]. These conflicting demands have spurred the rapid emergence of computational toxicology, which integrates quantum chemical calculations, molecular dynamics simulations, machine learning (ML) algorithms, and multi-omics datasets to develop mechanism-based predictive models, thereby shifting from an “experience-driven” to a “data-driven” evaluation paradigm [8–11].

The theoretical advances underpinning computational toxicology have arisen from a deeper understanding of the multiscale mechanisms driving toxicological effects. Modern toxicological research has elucidated that drug toxicity is essentially an emergent property stemming from multiscale interactions between small molecules and biological systems: at the molecular level, metabolic activation, covalent modifications, and off-target interactions serve as initial triggers of toxicity; at the cellular level, mitochondrial dysfunction, oxidative stress, and aberrant activation of cell-death pathways amplify toxic phenotypes; and at the systemic level, disruptions of inter-organ metabolic networks and disturbances in the immune microenvironment ultimately manifest as clinically observable pathological outcomes [12, 13]. This hierarchical progression of toxic mechanisms necessitates predictive models with comprehensive, multidimensional information integration capabilities.

Currently, computational methods such as quantitative structure–activity relationship (QSAR), molecular docking, and systems toxicology have achieved significant predictive accuracy in critical toxicity evaluations, including hepatotoxicity and cardiotoxicity. Under conditions of sufficient data availability, their predictive performance has approached or even surpassed that of traditional animal-based assays [14–16]. Meanwhile, the rapid advancement of artificial intelligence (AI) technologies has further enhanced the predictive capabilities of computational toxicology [17]. Deep learning algorithms, notably graph neural networks (GNNs), can automatically extract molecular structural features and identify latent relationships between molecular structures and toxicity profiles [18, 19]. Furthermore, transformer architectures effectively integrate multimodal data, including chemical structure, genomic perturbations, and pathological phenotypes, into end-to-end predictive pipelines, significantly improving model generalization [20]. Concurrently, the collaborative evolution of large-scale toxicity databases and cloud computing platforms has made virtual screening of millions of compounds feasible, improving screening efficiency by two to three orders of magnitude relative to traditional experimental approaches [21].

Despite these notable advances, computational toxicology continues to face substantial challenges. Current toxicity datasets often exhibit uneven quality and insufficient coverage, and the models trained on them offer limited interpretability. These weaknesses are most apparent when predicting novel or structurally complex multitarget compounds, where they lead to suboptimal predictive accuracy [22]. To address these bottlenecks, increasing research efforts have adopted multilayered, multidimensional integrated approaches, combining experimental data with network pharmacology and systems biology to construct more accurate and comprehensive toxicity prediction frameworks. Additionally, integrating predictive tools deeply with clinical drug data is essential to accurately identify potential toxicity risks during early drug discovery stages, thus providing reliable decision-making support for subsequent clinical development [23–25].

Given these considerations, this review systematically examines recent advancements in bioinformatics methods and technologies for drug toxicity research, focusing on applications of ML/AI methods. We begin by introducing the ADMET (absorption, distribution, metabolism, excretion, and toxicity) prediction system and prevalent computational platforms. Subsequently, we discuss key features and application scenarios of existing toxicology databases, critically analyzing the strengths and limitations of various toxicity task prediction algorithms. Furthermore, we provide an overview of network toxicology and its applications in assessing the safety of complex therapeutics, such as traditional Chinese medicine (TCM) formulations, and outline emerging potentials of large language models (LLMs) in toxicological research. Collectively, this review aims to provide theoretical and practical guidance for toxicity assessments in drug discovery and to inform the design of preclinical research and early-stage clinical trials.

Artificial intelligence in ADMET research

Framework of ADMET prediction methods

Adverse pharmacokinetic properties pose a significant threat to human health and environmental safety, representing one of the leading causes of drug development failure. Approximately 40% of preclinical candidate drugs fail due to inadequate ADMET profiles, while nearly 30% of marketed drugs are withdrawn due to unforeseen toxic reactions [26]. Early integration of ADMET factors into the evaluation of new chemical entities has been shown to significantly reduce attrition rates in drug discovery [27]. Therefore, it is crucial to predict and optimize the ADMET properties of candidate compounds in advance. ADMET evaluation encompasses the absorption, distribution, metabolism, excretion, and toxicity of drugs, providing a comprehensive assessment of their in vivo behavior and predicting their clinical efficacy and safety (Fig. 1).

Figure 1. The five key properties of ADMET (absorption, distribution, metabolism, excretion, and toxicity) and their related evaluation endpoints.

Toxicity is a critical component of drug safety assessment, with potential adverse effects including neurotoxicity, organ toxicity, genotoxicity, and carcinogenicity, among others [28]. Acute toxicity is typically assessed through in vivo metrics such as LD50 (median lethal dose) and in vitro endpoints like IGC50 (50% growth inhibitory concentration). Hepatotoxicity, nephrotoxicity, and cardiotoxicity are common drug-induced toxicities. Hepatic damage is generally characterized by elevated alanine aminotransferase (ALT), aspartate aminotransferase (AST), and bilirubin levels, while nephrotoxicity can be detected in clinical or preclinical settings via serum creatinine and blood urea nitrogen measurements. Cardiotoxicity is associated with hERG channel inhibition, potentially leading to fatal arrhythmias [28]. Therefore, comprehensive toxicological evaluation integrating both in vitro and in vivo endpoints is essential for ensuring drug safety and minimizing clinical adverse reactions.

With the continuous advancements in ML/AI technologies, numerous ADMET prediction platforms based on these approaches have emerged. These platforms significantly enhance the efficiency of drug discovery and development, offering several key advantages. Firstly, they can rapidly process extensive datasets of chemical compounds, such as their molecular structures and physicochemical properties, substantially reducing experimental costs and time. Secondly, AI models can leverage large-scale historical ADMET data from previous experiments to deliver more accurate predictions. Lastly, these platforms facilitate drug performance analysis from multiple perspectives by simulating various physiological conditions and environmental factors, thereby enabling researchers to make more scientifically informed decisions.

The fundamental framework of an ADMET prediction platform constitutes a multilayered system encompassing the complete workflow from data input and model training to predictive output (Fig. 2). This framework uses robust computational methods, big data, and multidimensional information to improve prediction accuracy and reliability. Specifically, the ADMET prediction platform typically comprises the following critical components:

Figure 2. The basic framework of ADMET prediction platforms, showing its key components (input, methods/tools, and endpoints).

Input component: the input component forms the foundation for the platform’s operation. It requires comprehensive chemical structural data and related molecular information, including molecular formulas, molecular weights, and molecular structures. Additionally, extensive ADMET experimental data, such as drug bioavailability, hepatic metabolic stability, and clearance rates, must be integrated. Experimental datasets are not limited to static data but also encompass reaction data under various conditions (e.g. different pH levels, physiological states). Moreover, literature-derived data is indispensable, enabling researchers to enrich the platform’s databases with previously published datasets and experimental outcomes, thereby ensuring the reliability and generalizability of platform predictions.

Tools/methods component: this component is the core of the platform and consists of two main submodules:

Physicochemical property calculation module: utilizing chemoinformatics software packages such as RDKit and Scopy, this module computes basic physicochemical properties of chemical compounds, including molecular weight, pKa, log P, TPSA, and hydrogen bond acceptors/donors. These fundamental physicochemical properties provide preliminary predictive information for various ADMET characteristics, such as drug absorption, distribution, and metabolism. The calculated results typically serve as foundational features for ML models.

ML/AI prediction module: based on substantial experimental data and computational chemical information, ML algorithms (e.g. support vector machines (SVMs), random forests (RFs), neural networks, gradient boosting trees) are applied to predict various ADMET properties. These models can be classified into regression and classification types, depending on specific prediction tasks. Regression models predict continuous ADMET parameters, such as in vivo drug t1/2 (half-life), VDss (volume of distribution at steady state), CL (clearance), and MRT (mean residence time). In contrast, classification models predict discrete ADMET indicators, such as in vitro BBB (blood–brain barrier) permeability assays and in vitro HLM (human liver microsome) stability metrics. These classification models identify potential risks and features by training on extensive datasets that encompass both in vitro and in vivo data. Furthermore, ML/AI models continuously improve prediction performance through techniques like feature selection and hyperparameter optimization, ensuring the platform’s adaptability across different drug types and both in vitro and in vivo ADMET characteristics.

Output component: the output component represents the final form of the platform’s predictions, generally referred to as “endpoints,” indicating the predictive values or classification results corresponding to each ADMET characteristic. The number and type of these endpoints vary among platforms. However, comprehensive ADMET prediction platforms can evaluate over 100 different endpoints, encompassing both in vitro properties (such as HLM stability and plasma protein binding (PPB)) and in vivo pharmacokinetic parameters (such as bioavailability (F), half-life (t1/2), and renal clearance (CLr)). These results are presented through intuitive data visualization and reporting tools, enabling researchers to swiftly acquire both in vitro and in vivo drug ADMET characteristics to support informed decision-making. The platform outputs extend beyond single predictive values and combine physicochemical properties with predictive outcomes to offer deeper analyses, such as directions for drug optimization, potential side effects, and optimal routes of administration [29]. Additionally, some platforms facilitate comparisons between candidate drugs and other compounds, assisting developers in drug screening and optimization processes [30].

In summary, the fundamental architecture of ADMET prediction platforms constitutes an integrated and intelligent system providing robust support for drug research and development through efficient data input, precise computational tools, and powerful ML algorithms. These platforms not only enhance development efficiency and reduce experimental costs but also accurately predict drug safety and efficacy in early development stages, thereby establishing a solid scientific foundation for successful new drug discovery.
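As a rough illustration of the physicochemical property calculation module described above, the sketch below computes the simplest descriptor, molecular weight, from a flat molecular formula. It is not how the platforms do it: they rely on toolkits such as RDKit, which derive descriptors from the full structure. The `ATOMIC_MASS` table and the `molecular_weight` function are hypothetical names introduced for this example, and the parser assumes formulas without parentheses, charges, or isotopes.

```python
import re

# Standard atomic masses (g/mol) for a handful of common elements.
# Hypothetical lookup table for this sketch; extend as needed.
ATOMIC_MASS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}

def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a flat formula such as 'C9H8O4'.

    Assumes no parentheses, charges, or isotope labels.
    """
    if not formula:
        raise ValueError("empty formula")
    total = 0.0
    consumed = 0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"unknown element: {symbol}")
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
        consumed += len(symbol) + len(count)
    # Reject input that the pattern could not fully account for.
    if consumed != len(formula):
        raise ValueError(f"could not parse formula: {formula}")
    return total

# Aspirin, C9H8O4
aspirin_mw = molecular_weight("C9H8O4")
print(round(aspirin_mw, 3))
```

In a real pipeline, a value like this would be one entry in a descriptor vector (alongside log P, TPSA, and hydrogen bond donor/acceptor counts) that is passed to the ML models as input features.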

Overview of ADMET prediction platforms

Numerous computational tools have been developed to predict various ADMET-related properties, ranging from broad-spectrum platforms to tools specialized in specific aspects (Table 1 and Supplementary Table 1). Broad-spectrum platforms, such as admetSAR 3.0 [31], ADMETlab 3.0 [32], vNN-ADMET [33], and ADMETboost [34], provide comprehensive coverage across all five ADMET dimensions. By integrating multiple predictive models, these platforms offer systematic assessments of compounds in terms of ADMET. In contrast, platforms like SwissADME [30], FAF-Drugs4.0 [35], and ADMET-AI [36] focus specifically on pharmacokinetic properties, emphasizing predictions related to absorption, distribution, metabolism, and excretion. Additionally, certain platforms concentrate solely on toxicity prediction; e.g. ProTox 3.0 [37] and VenomPred 2.0 [38] are tailored for evaluating toxicity endpoints such as hepatotoxicity, carcinogenicity, and mutagenicity. Furthermore, tools like CypRules [39], BioTransformer 3.0 [40], XenoSite [41], SOMP [42], and SMARTCyp 3.0 [43] specifically address interactions with cytochrome P450 enzymes (CYPs), playing critical roles in drug metabolism studies.

Table 1.

Overview of ADMET Prediction Platforms

| Platform | ADMET endpoint | ADME endpoint | T endpoint | Model |
|---|---|---|---|---|
| CypRules [39] | 5 | 5 | / | Rule-based C5.0 decision tree |
| SwissADME [30] | 37 | 9 | / | Rule-based method, SVM |
| FAF-Drugs4.0 [35] | 40 | / | 4 | Rule-based method |
| SMARTCyp 3.0 [43] | 3 | 3 | / | Rule-based method |
| SOMP [42] | 6 | 6 | / | Bayesian-based algorithm |
| XenoSite [41] | 9 | 9 | / | DNN |
| vNN-ADMET [33] | 15 | 9 | 6 | vNN |
| ProTox 3.0 [37] | 61 | / | 61 | RF, DNN |
| Virtual Rat [54] | 12 | 10 | 2 | RF, C5.0, SVM, CART |
| FP-ADMET [55] | 56 | 24 | 30 | RF |
| ICDrug [56] | 14 | 8 | 6 | RF |
| BioTransformer 3.0 [40] | 9 | 9 | / | ML |
| ADMETboost [34] | 29 | 18 | 4 | XGBoost |
| ADMET-AI [36] | 49 | 22 | 18 | GNN |
| VenomPred 2.0 [38] | 12 | / | 12 | RF, SVM, KNN, MLP |
| AquaticTox [50] | 5 | / | 5 | Ensemble model |
| OptADMET [57] | 32 | 14 | 15 | QSAR |
| pkCSM [51] | 36 | 20 | 10 | Graph-based signatures |
| Interpretable-ADMET [52] | 49 | 20 | 29 | GCNN, GAT |
| HelixADMET [53] | 52 | 14 | 16 | GNN |
| admetSAR 3.0 [31] | 119 | 39 | 43 | CLMGraph |
| ADMETlab 3.0 [32] | 119 | 34 | 36 | DMPNN |
| Deep-PK [58] | 73 | 30 | 34 | DMPNN |

Note: The number in the “ADMET endpoint” column represents the total number of ADMET property endpoints the tool can predict; the number in the “ADME endpoint” column represents the number of ADME property endpoints; the number in the “T endpoint” column represents the number of Toxicity (T) endpoints. The symbol “/” indicates that the tool does not provide predictions for that particular category of properties.

According to the computational methodologies employed, existing ADMET prediction platforms can generally be classified into three categories: rule/statistical-based methods, ML-based methods, and graph-based methods (Fig. 3). The characteristics of each category and its applications in ADMET prediction are briefly examined below.

Figure 3. Classification of computational modeling approaches in ADMET prediction platforms: rule/statistical-based methods, ML-based methods, and graph-based methods.

Rule-based/statistical methods

Rule-based and statistical methods represent an earlier paradigm of ADMET prediction. These methods stand out due to their high computational efficiency and interpretability. They typically leverage chemical rule databases, experimental data, and statistical inference to perform rapid pharmacokinetic evaluations. For example, SMARTCyp 3.0 [43] and CypRules [39] specialize in predicting cytochrome P450 metabolism sites by combining chemical heuristics with quantum chemical computations, enabling quick identification of potential metabolism sites within a molecule [44]. SwissADME [30], as a comprehensive prediction tool, integrates physicochemical property calculations (e.g. log P, TPSA), drug-likeness rules (Lipinski, Veber rules), and structural alerts (such as PAINS), supplemented with SVM-optimized key endpoints (log S, blood–brain barrier permeability, P-gp substrate specificity), providing an intuitive and comprehensive ADMET evaluation for early-stage drug screening. The FAF-Drugs platform [35], currently in version 4.0, calculates physicochemical descriptors for input molecules and then performs preliminary screening based on predefined thresholds and rules (e.g. Lipinski’s rules). Subsequently, it uses SMARTS pattern matching to identify potentially toxic or undesirable structures, relying entirely on expert-driven methodologies rather than ML algorithms. Its notable strengths include simplicity, efficiency, ease of customization, and scalability for large-scale screening efforts. Additionally, probabilistic approaches have been explored; for instance, the SOMP [42] tool generates all possible Sites of Labeled Atoms (SoLAs) from the 2D molecular structure and characterizes them using LMNA descriptors based on molecular neighborhoods. It then applies a Bayesian probabilistic model to rank and predict likely metabolic sites, effectively utilizing early-stage structural information. However, these rule-based/statistical methods inherently depend heavily on established chemical rules and experimental data, often exhibiting limited predictive capability when encountering novel or structurally complex compounds.
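The threshold-based screening described above can be sketched in a few lines. The example below applies the commonly cited Lipinski rule-of-five cutoffs (molecular weight ≤ 500, log P ≤ 5, ≤ 5 hydrogen bond donors, ≤ 10 hydrogen bond acceptors) and Veber cutoffs (≤ 10 rotatable bonds, TPSA ≤ 140 Å²) to precomputed descriptors. It is a minimal sketch rather than the implementation of any platform named here: the `Descriptors` class, the function names, the one-violation tolerance, and the descriptor values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Descriptors:
    """Precomputed physicochemical descriptors (hypothetical container)."""
    mol_weight: float      # g/mol
    log_p: float           # octanol-water partition coefficient
    h_donors: int          # hydrogen bond donors
    h_acceptors: int       # hydrogen bond acceptors
    rotatable_bonds: int
    tpsa: float            # topological polar surface area, in square angstroms

def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the names of the Lipinski rule-of-five criteria that are violated."""
    rules = [
        ("MW > 500", d.mol_weight > 500),
        ("logP > 5", d.log_p > 5),
        ("HBD > 5", d.h_donors > 5),
        ("HBA > 10", d.h_acceptors > 10),
    ]
    return [name for name, violated in rules if violated]

def passes_veber(d: Descriptors) -> bool:
    """Veber criteria: at most 10 rotatable bonds and TPSA of at most 140."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140

def passes_screen(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed screening policy: tolerate one Lipinski violation, require Veber."""
    return len(lipinski_violations(d)) <= max_violations and passes_veber(d)

# Illustrative descriptor values only (approximate, not taken from the source).
aspirin_like = Descriptors(180.16, 1.2, 1, 4, 3, 63.6)
large_lipophilic = Descriptors(720.0, 6.3, 6, 12, 14, 180.0)

print(passes_screen(aspirin_like), lipinski_violations(large_lipophilic))
```

Because every decision traces back to a named threshold, the output is directly interpretable, which is the main strength of this family of methods. The same property explains the weakness noted above: a compound outside the chemical space the rules were derived from is judged by cutoffs that may not apply to it.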

Machine learning-based methods

ML-based methods leverage chemical descriptors and algorithms such as RF, SVM, k-Nearest Neighbors (k-NN), and Gradient Boosting Trees (GBT) to predict ADMET properties [45–48]. The primary strength of these approaches lies in their ability to discern complex patterns within large datasets, making them well-suited for handling diverse chemical spaces and structurally intricate molecules [49]. For example, vNN-ADMET [33] utilizes a variant of the k-NN approach, variable Nearest Neighbor (vNN), for pharmacokinetic predictions. ADMETboost [34] employs XGBoost, a gradient boosting tree-based ensemble algorithm optimized for structured data handling. Some platforms adopt combined or hybrid approaches tailored to specific prediction tasks; ProTox 3.0 [37], specialized in toxicity prediction, uses eight data-sampling methods alongside RF and deep neural networks (DNNs) to predict up to 61 distinct toxicity endpoints, including hepatotoxicity, carcinogenicity, and mutagenicity. AquaticTox [50], the first tool dedicated specifically to aquatic toxicity predictions, employs a stacked ensemble methodology comprising six ML models [RF, AdaBoost, Gradient Boosting, SVM, Fully Connected Networks (FCNs), and Graph Convolutional Neural Networks (GCNNs)]. VenomPred 2.0 [38] predicts multiple toxicity endpoints by combining three different chemical fingerprints (Morgan, RDKit, PubChem) with four distinct algorithms (RF, SVM, k-NN, Multilayer Perceptron (MLP)). Despite their substantial predictive capabilities, these ML methods rely strongly on high-quality training data; inadequate or biased training datasets may significantly impair performance. Furthermore, some models, particularly deep learning approaches, often lack interpretability regarding their underlying chemical rationale.
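To make the nearest-neighbor idea concrete, the sketch below implements a variable nearest neighbor style predictor under stated assumptions: fingerprints are represented as sets of "on" bit indices, similarity is measured by Tanimoto distance, every training compound within a distance cutoff `d0` contributes with a Gaussian-like weight controlled by a smoothing factor `h`, and no prediction is returned when nothing falls inside the cutoff. The function names, parameter values, and toy data are hypothetical and are not taken from vNN-ADMET itself.

```python
import math
from typing import FrozenSet, List, Optional, Tuple

Fingerprint = FrozenSet[int]  # indices of bits set in a molecular fingerprint

def tanimoto_distance(a: Fingerprint, b: Fingerprint) -> float:
    """1 - |A and B| / |A or B|; defined as 0.0 when both fingerprints are empty."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union

def vnn_predict(
    query: Fingerprint,
    training: List[Tuple[Fingerprint, float]],
    d0: float = 0.6,   # assumed distance cutoff defining the neighborhood
    h: float = 0.3,    # assumed smoothing factor for the weights
) -> Optional[float]:
    """Weighted average of the labels of all neighbors within distance d0.

    Unlike fixed-k k-NN, the number of neighbors varies per query, and the
    function returns None when the query has no neighbor within d0.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for fingerprint, label in training:
        d = tanimoto_distance(query, fingerprint)
        if d <= d0:
            w = math.exp(-((d / h) ** 2))
            weighted_sum += w * label
            weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else None

# Toy training set: (fingerprint bits, label), with 1.0 = toxic and 0.0 = non-toxic.
training_set = [
    (frozenset({1, 2, 3, 4}), 1.0),
    (frozenset({1, 2, 3, 5}), 0.0),
    (frozenset({10, 11, 12}), 0.0),
]

in_domain = vnn_predict(frozenset({1, 2, 3, 4}), training_set)
out_of_domain = vnn_predict(frozenset({20, 21}), training_set)
print(in_domain, out_of_domain)
```

Returning `None` for queries with no sufficiently similar training compound acts as a simple applicability-domain check, which ties into the point above about dependence on training data: the model declines to extrapolate rather than produce an unsupported prediction.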

Graph-based methods

Graph-based computational methods represent a cutting-edge development in the ADMET prediction domain, characterized by their ability to analyze molecular graph structures in depth. By aggregating neighborhood information to update node representations, these deep learning models effectively capture complex relationships and structural features within molecules. Their primary advantage is superior performance in predicting complex molecular properties compared to traditional methods. For instance, pkCSM [51], introduced in 2015, represents molecules as molecular graphs, extracting interatomic distance and topological features using Graph-Based Signatures and subsequently leveraging these features for ML-based predictions of pharmacokinetic and toxicological properties. This approach offers significant flexibility and interpretability without relying on predefined structural fragments, providing a robust and accurate method for ADMET prediction.

With the proliferation of GNNs, several recent platforms have adopted these methodologies. ADMET-AI [36], for example, applies the Chemprop-RDKit GNN model, representing molecules as graphs and learning atomic-level features through message-passing neural networks, which are further enriched with physicochemical properties calculated by RDKit. This approach has demonstrated outstanding performance across 41 ADMET datasets. Similarly, Interpretable-ADMET [52] employs GCNN and Graph Attention Networks (GATs), incorporating Grad-CAM to explain predictions by identifying the molecular substructures that contribute most to specific ADMET properties, thus achieving both accuracy and interpretability. Recent research efforts, such as HelixADMET [53], have begun leveraging self-supervised pretraining strategies for GNN models on large compound datasets, transferring the learned knowledge to specific ADMET prediction tasks, resulting in robust and scalable prediction systems. ADMETlab 3.0 [32] combines multitask Deep Message Passing Neural Networks (DMPNNs, a GNN variant) with molecular descriptors, first pretraining the model to obtain general molecular features, then fine-tuning for multiple ADMET tasks, significantly expanding its applicability and performance. admetSAR 3.0 [31] also incorporates pretrained GNN models to extract molecular features, subsequently fine-tuning for specific ADMET properties, achieving notable performance in chemical exploration, prediction, and optimization tasks. Nevertheless, these graph-based methods are computationally demanding, particularly when dealing with large datasets, potentially requiring extensive computational resources during training and inference. Moreover, GNN-based methods typically necessitate substantial training datasets to reach optimal performance; insufficient data can lead to overfitting or degraded predictive accuracy.
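The neighborhood aggregation that underlies these GNN-based platforms can be illustrated with a deliberately simplified message-passing round. The sketch below uses sum aggregation with no learned weights or nonlinearities, and it passes messages between atoms rather than along directed bonds as D-MPNN variants do, so it shows the mechanism only and is not the architecture of any platform above. The function names and the toy ethanol graph (heavy atoms only, one-hot element features) are assumptions for illustration.

```python
from typing import Dict, List

def message_passing_round(
    features: Dict[int, List[float]],
    neighbors: Dict[int, List[int]],
) -> Dict[int, List[float]]:
    """Update each atom as: own features + sum of its neighbors' features."""
    updated = {}
    for atom, own in features.items():
        aggregated = list(own)
        for nb in neighbors[atom]:
            aggregated = [a + b for a, b in zip(aggregated, features[nb])]
        updated[atom] = aggregated
    return updated

def readout(features: Dict[int, List[float]]) -> List[float]:
    """Sum atom vectors into one molecule-level vector."""
    return [sum(column) for column in zip(*features.values())]

# Ethanol heavy atoms C0-C1-O2 with one-hot features [is_carbon, is_oxygen].
atom_features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
bonds = {0: [1], 1: [0, 2], 2: [1]}

after_one_round = message_passing_round(atom_features, bonds)
molecule_vector = readout(after_one_round)
print(after_one_round, molecule_vector)
```

After one round, the terminal carbon's vector `[2.0, 0.0]` records a carbon neighbor, while the central carbon's `[2.0, 1.0]` also records the oxygen. Stacking further rounds lets information from more distant atoms reach each node, and in a trained GNN the plain sums are replaced by learned transformations, which is where the data and compute requirements noted above originate.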

In the drug discovery process, various ADMET prediction tools exhibit certain complementarity due to their differences in functional emphasis, algorithmic architectures, and coverage of prediction endpoints. For early-stage compound screening and preliminary evaluation of multiparameter ADMET properties, integrated platforms such as ADMETlab 3.0 [32], admetSAR 3.0 [31], and Deep-PK [58] offer certain advantages. These platforms generally cover multiple endpoints, including physicochemical properties, pharmacokinetic characteristics, and toxicity, thereby supporting large-scale systematic virtual screening. If the research focuses on toxicity assessment, tools such as ProTox 3.0 [37] (for general chemical toxicity) or VenomPred 2.0 [38] (for peptide toxins) may be considered, as they often provide predictive models tailored to specific toxicity endpoints. For studies emphasizing prediction accuracy and model interpretability, tools based on more advanced ML algorithms, e.g. ADMETboost [34] (XGBoost) and Interpretable-ADMET [52] (GNNs), show promising potential. They not only deliver prediction results but also offer certain interpretative insights through methods like uncertainty estimation and atom contribution visualization. Furthermore, for specific subproblems in ADMET research, such as predicting compound metabolic pathways, BioTransformer 3.0 [40] may be more applicable, while SMARTCyp 3.0 [43] could be considered for identifying CYP metabolic sites.

In summary, when selecting ADMET prediction tools, it is advisable to align the choice with specific research needs: integrated platforms are often used for high-throughput preliminary screening, specialized tools are more suitable for in-depth analysis of specific endpoints, and advanced algorithmic tools hold certain value in scenarios requiring higher prediction accuracy, robustness, and mechanistic interpretation.

Artificial intelligence in toxicological research

Toxicological databases

Toxicological databases represent fundamental platforms for integrating, storing, and disseminating toxicity-related information, which are indispensable for chemical safety assessment and drug discovery. Toxins are substances capable of causing cellular damage or diseases following exposure via inhalation, ingestion, or dermal contact. Their toxicity arises from complex interactions between their chemical structures and biological systems [59]. Efficient and systematic management of toxicological data not only provides timely and accurate risk assessment tools for researchers and regulatory authorities but also significantly reduces clinical trial failures and developmental costs [60]. According to data content and application context, toxicological databases can be classified into four major categories: chemical toxicity databases, environmental toxicology databases, alternative toxicology databases, and biological toxin databases. The classification scheme is illustrated in Fig. 4, and each category’s main features and representative platforms are detailed below.

Figure 4. Four types of toxin databases: chemical toxicity databases, environmental toxicology databases, alternative toxicology databases, and biological toxin databases. The characteristics of each type of database and the toxicological information they contain are presented.

Chemical toxicity databases

Chemical toxicity databases focus on elucidating the potential health hazards of various chemical compounds, especially pharmaceuticals, by aggregating multidimensional data such as cytotoxicity, hepatotoxicity, nephrotoxicity, teratogenicity, carcinogenicity, genotoxicity, and reproductive toxicity. These resources provide scientific support for early-stage risk assessment in drug development, enabling research teams to promptly identify safety issues, guide molecular optimization, dose selection, and inform clinical strategies. Representative databases and their key characteristics are summarized in Table 2. PubChem [61], ChEMBL [62], TOXRIC [63], DrugBank [64], SuperToxic [65], and CompTox Chemicals Dashboard [66] are comprehensive chemical toxicity databases. Specifically, PubChem [61], maintained by the National Library of Medicine (NLM) under the U.S. National Institutes of Health (NIH), consolidates chemical data from over 750 sources and freely disseminates them publicly. ChemIDplus [67], HSDB [68], and CCRIS, originally sub-databases of TOXNET, have now been integrated into PubChem. Among these, HSDB includes toxicological information of ~5600 chemicals, covering pharmacological properties, environmental fate, emergency handling, and occupational health data, and is widely utilized globally. ChEMBL [62], an open-source bioactivity database maintained by the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL), provides extensive compound data tailored toward drug discovery and chemical biology research. Developed by the U.S. Environmental Protection Agency (EPA), the CompTox Chemicals Dashboard [66] collates extensive physicochemical, toxicity, and exposure data for numerous chemical substances, serving as a vital tool for environmental health research. 
TOXRIC [63] contains standardized toxicological attributes, molecular representations, practical benchmarks, and intuitive visualization interfaces for diverse chemical substances. SuperToxic [65] offers a comprehensive collection of toxic substances from diverse sources (animals, plants, synthetic origins, etc.), enabling detailed investigations into correlations between their chemical, functional, and structural properties. DrugBank [64], another comprehensive resource, integrates extensive chemical, pharmacological, pharmacokinetic, side effect, and toxicological data widely utilized across drug research and toxicological studies. Specialized toxicity databases such as DILIrank [69], DILIst [70], LTKB [71], hERGCentral [72], and LCDB [73] focus specifically on hepatotoxicity, cardiotoxicity, or carcinogenicity. For instance, DILIrank [69], DILIst [70], and LTKB [71] provide critical insights into mechanisms and targets of drug-induced liver injury (DILI). The hERGCentral [72] database delivers extensive information on drug interactions with the hERG potassium channel, pivotal for cardiac toxicity assessment, while LCDB [73] provides robust carcinogenicity data from long-term experiments involving 1726 chemicals and 7745 experimental records.
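As an illustration of how such a resource can feed a modeling pipeline, the sketch below builds a request against PubChem's PUG REST interface to retrieve basic identifiers for a compound by name. The helper names (`property_url`, `fetch_properties`) and the choice of compound and properties are illustrative assumptions, not part of any workflow described in this review; the fetch step requires network access and is not executed here.

```python
import json
import urllib.parse
import urllib.request

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name, properties):
    """Build a PUG REST URL that requests compound properties by name."""
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUG_REST}/compound/name/{quoted}/property/{','.join(properties)}/JSON"


def fetch_properties(name, properties, timeout=30):
    """Download and return the property records (needs network access)."""
    with urllib.request.urlopen(property_url(name, properties), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["PropertyTable"]["Properties"]


# Only the URL is built here; call fetch_properties(...) to query the service.
url = property_url("acetaminophen", ["MolecularFormula", "InChIKey"])
print(url)
```

Records retrieved this way would still need curation (unit harmonization, duplicate removal, label assignment) before use in model training.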

Table 2.

List of chemical toxicity databases

Database Compounds URL Key features
PubChem [61] >119 million https://pubchem.ncbi.nlm.nih.gov/ NIH-maintained, integrates three interlinked repositories (Substance, Compound, BioAssay) with rich, multi-dimensional data.
ChemIDplus [67] >420,000 Integrated into PubChem Covers a vast number of compounds and offers structure-based visualization tools.
HSDB [68] ~5600 Integrated into PubChem Focused on high-quality toxicology profiles; now includes nanomaterials and animal toxins.
CCRIS >9000 Integrated into PubChem Specializes in carcinogenicity and mutagenicity data for chemical substances.
ChEMBL [62] >2.1 million https://www.ebi.ac.uk/chembl EMBL-EBI-curated, aggregates and standardizes bioactivity data across millions of entries.
OCHEM [74] ~4 million http://www.ochem.eu Hosts chemical and biological measurement data plus an integrated QSAR modeling framework.
ECHA >360 000 https://chem.echa.europa.eu/ Authoritative REACH registry with 360 k substance dossiers linked directly to regulations.
TOXRIC [63] 113 372 https://toxric.bioinforai.tech/ Open-source, ML-ready toxicology platform supporting multiple endpoint predictions.
SuperToxic [65] ~60 000 http://bioinformatics.charite.de/supertoxic Multi-dimensional toxin assessments with target-prediction capability.
DrugBank [64] >10 000 https://go.drugbank.com Integrates drug chemistry, pharmacology, ADMET, and target-interaction data.
DILIrank [69] 1036 https://www.fda.gov/science-research/liver-toxicity-knowledge-base-ltkb/dili-rank Grades drugs by liver-injury severity and provides mechanistic annotations.
DILIst [70] 1279 https://www.fda.gov/science-research/liver-toxicity-knowledge-base-ltkb/drug-induced-liver-injury-severity-and-toxicity-dilist-dataset Binary classification of DILI risk, aggregating data from multiple sources.
T3DB [75] >3600 http://www.t3db.ca/ Includes NMR, MS/MS, and GC–MS spectra, and supports toxicity and target prediction.
eChemPortal >1.44 million https://www.echemportal.org/echemportal/ Aggregates multiple regulatory databases, covers existing and new chemicals, and offers GHS hazard look-up.
LTKB [71] 287 https://www.fda.gov/science-research/bioinformatics-tools/liver-toxicity-knowledge-base-ltkb Integrates multisource DILI research via systems-biology analyses to elucidate mechanisms.
ICE [76] ~1 million https://ice.ntp.niehs.nih.gov/ Compiles acute (oral, dermal, inhalation), and chronic (developmental, carcinogenic, reproductive) endpoints with reference compounds and predictive models.
SIDER [77] 1430 http://sideeffects.embl.de/ Focuses on marketed drug adverse reactions with standardized, visualized labels, and ontology links.
hERGCentral [72] >300 000 www.hergcentral.org Extensive hERG channel inhibition assays with flexible query options.
LCDB [73] 1726 https://carcdb.lhasalimited.org/ Contains GLP-compliant, long-term carcinogenicity bioassay data with high data integrity.
CompTox Chemicals Dashboard [66] >1 218 248 https://comptox.epa.gov/dashboard/ Curated chemical data with properties, exposure, hazard, and risk information from multiple public and government sources.
OnSIDES [78] 2783 https://onsidesdb.org/ Provides access to structured and standardized side effect data from drug labels.
VigiBase [79] >35 000 000 https://www.vigiaccess.org/ VigiBase is the world’s largest drug safety database, containing reports of adverse drug reactions.
VAERS [80] >2 000 000 https://vaers.hhs.gov/ VAERS provides a vast number of reports on adverse events following vaccination.
FAERS 31 770 750 https://www.fda.gov/drugs/drug-approvals-and-databases/fda-adverse-event-reporting-system-faers-database U.S. system containing postmarketing adverse event and medication error reports for drugs and therapeutic biologics.

Environmental toxicology databases

Environmental toxicology databases comprehensively document chemical behavior in environmental matrices (e.g. water, soil, and atmosphere) and their potential impacts on ecosystems. Beyond recording acute and chronic toxicity effects on diverse organisms (algae, benthic organisms, fish, and birds), these databases include parameters like environmental persistence, bioaccumulation, and transformation, thus serving as foundational tools for ecological risk assessments, pollution control strategies, and ecosystem conservation. In drug discovery contexts, these databases facilitate environmental risk evaluation of candidate drugs and their metabolites, promote green chemistry design, and inform postmarketing environmental risk management. Representative databases are listed in Table 3. The ECOTOX database [81], comprising decades of published ecological toxicological test data, elucidates cumulative distribution patterns of species, chemicals, and biological effects, supporting ecological risk assessment and ecosystem management decisions. Additionally, TOXNET [82] is extensively employed in environmental health research. EnviroTox [83] enables users to explore ecotoxicity patterns based on modes of action, analyze organism-specific sensitivities within chemical groups, and assess relative taxonomic sensitivity. Databases like AquaticTox [50] specialize in aquatic organism toxicity data, providing critical insights into pollution impacts on aquatic ecosystems. Collectively, these databases not only advance the understanding of environmental pollutants but also encourage public engagement in environmental protection, laying a solid foundation for ongoing environmental toxicological research.

Table 3.

List of Environmental Toxicology Databases

Database Compounds URL Key feature
ECOTOX [81] >13 000 http://www.epa.gov/ecotox Comprehensive EPA repository with >1 million peer-reviewed aquatic & terrestrial toxicity records.
EnviroTox[83] 4016 http://www.EnviroToxdatabase.org Quality-scored ecotoxicity dataset for ~4000 chemicals and 1500+ species, curated for QSAR development.
AquaticTox[50] >1000 https://chemyang.ccnu.edu.cn/ccb/server/AquaticTox/ Ensemble-learning web server offering rapid multi-endpoint aquatic toxicity predictions (fish, daphnia, algae).
Pesticide Info ~15 300 https://www.pesticideinfo.org Detailed pesticide active-ingredient profiles, including nontarget organism toxicity (e.g. bees, birds), and environmental fate.
PPDB[84] ~1500 https://sitem.herts.ac.uk/aeru/ppdb/ AERU’s relational database of pesticide physicochemical, ecotoxicological, and human-health properties with source quality tags.

Alternative toxicology databases

As ethical and animal welfare concerns increasingly constrain traditional animal-based toxicity testing, alternative toxicology databases have emerged as valuable resources leveraging in vitro high-throughput screening (HTS), high-content imaging, computational toxicology models, and multi-omics techniques. These databases typically integrate genomic, transcriptomic, metabolomic, and proteomic data with systems biology and ML approaches, thereby offering efficient, reproducible, and scalable solutions for chemical toxicity assessments. Early-stage drug development benefits from rapid screening and optimization of candidate molecules based on cell viability, gene expression profiles, and receptor activation data, dramatically reducing animal experimentation, associated costs, and timelines [85]. Key platforms are summarized in Table 4. The Tox21 initiative [86], jointly spearheaded by the U.S. EPA, National Toxicology Program (NTP), and NIH, utilizes high-throughput methodologies to drive toxicity assessments towards mechanism-based approaches. Tox21 compiles and publicly shares extensive in vitro screening data involving thousands of compounds tested across multiple biological pathways, including nuclear receptor signaling and cellular stress responses, providing invaluable resources for understanding chemical-biological interactions. Similarly, Open TG-GATEs [87] stores comprehensive toxicogenomic profiles for 170 compounds across various dosages and time points from in vivo (rats) and in vitro (rat and human primary hepatocytes) studies. ToxicoDB [88] integrates toxicogenomic data from Open TG-GATEs [87], DrugMatrix [89] and EMEXP2458 [90], facilitating queries and analyses of gene expression and signaling pathway perturbations induced by potential toxicants. The Comparative Toxicogenomics Database (CTD) [91] further enhances understanding of human health by integrating chemical, genetic, disease, and exposure information. 
These resources accelerate the transition from traditional animal-based studies toward more efficient and precise in vitro and computational toxicology methods, enhancing both environmental and public health protection.

Table 4.

List of Alternative Toxicology Databases

Database Compounds URL Key feature
ToxCast/Tox21 [86] ~1.2 million https://comptox.epa.gov/dashboard Aggregates high-throughput in vitro screening data, covering thousands of chemicals across hundreds of bioassay endpoints.
ToxicoDB [88] 231 https://toxicodb.ca Integrates three major in vitro toxicogenomic datasets with harmonized chemical annotations and interactive time- and dose-response gene-expression plots.
Open TG-GATEs [87] 170 https://toxico.nibiohn.go.jp/english/index.html Contains transcriptomic, biochemical, histopathology, and cytotoxicity data for 170 compounds in both rat in vivo and primary rat/human hepatocyte in vitro models.
CTD [91] >16 300 http://ctdbase.org/ Manually curates over 3.3 million chemical–gene interactions (covering >16 000 chemicals), integrated into chemical–gene–disease networks to illuminate exposure effects.
DrugMatrix [89] >600 https://norecopa.no/3r-guide/drugmatrix Comprehensive toxicogenomic reference from the US NTP, offering in vivo and in vitro gene–expression and pathology data for >600 compounds.

Biological toxin databases

Biological toxin databases (Table 5) specialize in collecting, annotating, and analyzing natural toxins derived from animals, plants, and microorganisms, playing a critical role in elucidating their toxicological mechanisms, pharmacological potential, and biodefense applications. These natural toxins exhibit distinctive bioactivities such as antihypertensive, analgesic, and antimicrobial effects, and some have transitioned into clinical or preclinical development [92]. Databases such as ToxinDB [93], the first comprehensive biological toxin database, which encompasses over 4836 toxins with associated molecular descriptors and ADMET properties, provide platforms for predicting toxin metabolites and developing detoxification enzymes. TPPT [94] catalogs 1586 plant toxins with ecological toxicological significance, providing extensive biological and chemical information, alongside computational property estimates. MycoCentral [95], with 904 mycotoxins and metabolites, integrates data on biosynthetic pathways, physicochemical properties, ADME predictions, and QSAR-derived medicinal chemistry parameters. ATDB [96] unifies structural and annotation data for animal toxins, standardizing functional annotations through a novel toxin ontology system. BioTD [92], the most comprehensive open-source biological toxin database to date, provides extensive annotations, sequence data, mutagenesis information, and biological activities derived from over 5220 publications and patents, spanning more than 900 species, underpinning toxin-based drug design and mechanistic studies.

Table 5.

List of Biological Toxin Databases

Database Compounds URL Key features
ToxinDB [93] >4836 http://www.rxnfinder.org/toxindb/ Defines a unified chemical space of 4836 toxins and their potential metabolites, combining in silico predictions with experimental validation.
TPPT [94] 1586 https://www.agroscope.admin.ch/agroscope/en/home/publications/apps/tppt.html Catalogs 1586 toxic plants and their phytotoxins, providing physicochemical properties and toxicity predictions for understudied compounds.
MycoCentral [95] 904 http://www.mycocentral.eu Integrates data on 904 fungal toxins and metabolites, using seven open-source QSAR/ADMET tools to predict 147 endpoints alongside experimental data.
ATDB [96] 3240 http://protchem.hunnu.edu.cn/toxin Stores chemical structures and annotations for 3240 animal toxins, introducing a “Toxin Ontology” for standardized functional annotation.
SCORPION2 [97] >800 http://sdmc.i2r.a-star.edu.sg/scorpion/ Specializes in structure–function analysis of scorpion toxins and predicts toxin–ion channel binding modes.
ConoServer [98] ~10 000 http://www.conoserver.org Provides sequences, structures, and functional data for ~10 000 conotoxins, with graphical visualization and receptor-targeted search.
MycotoxinDB [99] 189 http://www.mycotoxin-db.com/ Contains 189 mycotoxins and their masked forms, offering data-driven tools to predict masked mycotoxins.
Toxinome [100] 1 483 028 http://toxinome.pythonanywhere.com/ Comprehensive bacterial protein toxin database with >1.48 million entries and associated antitoxin information.
BioTD [92] 8975 http://biotoxin.net/ Covers 8975 biotoxins from over 900 species and provides multi-endpoint bioactivity and toxicity data.

Methodological workflow for drug toxicity prediction

In recent years, ML has garnered significant attention in the realm of computational drug toxicity prediction, establishing itself as a cutting-edge technique for toxicity assessment using computational models [23]. With the continuous expansion of large-scale toxicological databases and improvements in data quality, the predictive efficacy of ML models has seen substantial enhancement [101, 102]. The typical workflow for ML-driven drug toxicity prediction can be summarized into five core stages (Fig. 5) [3]:

Figure 5.

A schematic diagram of the AI/machine learning-based drug toxicity prediction workflow, encompassing key steps such as data collection, data processing, model construction and training, model evaluation, and model application.

Workflow for AI/machine learning–based drug toxicity prediction.

(1) Data collection: integration of heterogeneous datasets from multiple sources, including molecular structural data (SMILES, InChI), in vitro assay outcomes (e.g. hERG inhibition activity, cytotoxicity), in vivo toxicological endpoints (e.g. LD50 values, organ pathological phenotypes), and clinical adverse drug reaction reports.

(2) Data preprocessing: this stage encompasses molecular feature engineering, such as generating molecular fingerprints (ECFP4, MACCS) and calculating physicochemical descriptors, data cleaning (removal of duplicate compounds and balancing positive/negative samples), and dimensionality reduction techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) [103]. For sparse datasets, methods such as transfer learning and generative adversarial networks (GANs) can be utilized to generate synthetic data to mitigate data insufficiency [104].
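The data-cleaning step in stage (2) can be sketched as follows. This is a minimal illustration assuming each record already carries a structure identifier and a binary toxicity label; the field names (`inchikey`, `toxic`), the placeholder identifiers, and the choice of random undersampling as the balancing strategy are assumptions made for the example.

```python
import random


def deduplicate(records, key="inchikey"):
    """Keep the first record seen for each structure identifier."""
    seen, unique = set(), []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            unique.append(rec)
    return unique


def undersample(records, label="toxic", seed=0):
    """Randomly drop majority-class records until both classes have equal size."""
    pos = [r for r in records if r[label] == 1]
    neg = [r for r in records if r[label] == 0]
    n = min(len(pos), len(neg))
    rng = random.Random(seed)
    return rng.sample(pos, n) + rng.sample(neg, n)


# Hypothetical records; the identifiers are placeholders, not real InChIKeys.
raw = [
    {"inchikey": "KEY-A", "toxic": 1},
    {"inchikey": "KEY-A", "toxic": 1},  # duplicate entry from a second source
    {"inchikey": "KEY-B", "toxic": 0},
    {"inchikey": "KEY-C", "toxic": 0},
    {"inchikey": "KEY-D", "toxic": 0},
    {"inchikey": "KEY-E", "toxic": 1},
]
unique = deduplicate(raw)
balanced = undersample(unique)
print(len(raw), len(unique), len(balanced))  # 6 5 4
```

In practice, duplicates with conflicting labels need an explicit resolution rule rather than "first record wins", and oversampling or class weighting may be preferable when positive examples are scarce.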

(3) Model construction and training: traditional ML methodologies mainly include RF and SVM, whereas deep learning models, such as graph convolutional networks (GCNs) and Transformers, have emerged prominently due to their capability for automatic hierarchical feature extraction [105, 106]. For instance, DNNs can simultaneously predict multiple toxicity endpoints through multitask learning frameworks, and GNNs effectively capture inter-atomic interactions via molecular graph representations [107].

(4) Model evaluation: common metrics for model evaluation include accuracy, area under the ROC curve (AUC), and the F1-score for classification tasks. In continuous regression tasks, commonly used evaluation metrics include mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R2). Cross-validation methods (e.g. 10-fold cross-validation) and external testing datasets are employed to assess generalizability. Interpretability techniques (e.g. SHAP and LIME) are often applied to identify key toxicity-related features, thus enhancing model transparency and credibility [108, 109]. In addition to understanding the model’s decision-making mechanism, practical applications also require attention to the model’s confidence scores, which quantify the model’s certainty in its individual predictions. This is particularly important in high-stakes decision-making scenarios that demand high reliability, such as medical diagnosis and autonomous driving. Common methods for generating confidence scores include probability-based approaches (e.g. leveraging the inherent probabilities from models like Logistic Regression, though these often require calibration via Platt Scaling or Isotonic Regression to be reliable), Bayesian methods for uncertainty quantification, specific ensemble learning techniques like Deep Ensembles which measure disagreement among multiple models to estimate certainty, uncertainty estimation techniques like Monte Carlo Dropout [110], as well as distance-based methods that assess the similarity of a new input to the training data. These techniques can be integrated into the decision-making process to ensure that predictions with low confidence are flagged for further human review or other appropriate safety measures, thereby enhancing the transparency and trustworthiness of the model [111].
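The ranking and regression metrics named in stage (4), together with a k-fold split, can be written out in a few lines. The sketch below uses only the standard definitions; the toy labels, scores, and values are hypothetical and serve only to make the arithmetic checkable.

```python
import math


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and the coefficient of determination R^2."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return mae, rmse, 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Hypothetical toy data
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
mae, rmse, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
folds = list(k_fold_indices(10, 3))
print(auc, mae, round(rmse, 4), round(r2, 2))  # 0.75 0.25 0.3536 0.9
```

For real datasets the indices should be shuffled (or stratified by class, or grouped by chemical scaffold) before folding; contiguous folds are used here only to keep the example deterministic.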

(5) Algorithm application: the resultant predictions can guide drug design, facilitate early-stage toxicity screening, and support regulatory decision-making processes. Specifically: (i) They guide drug design by allowing medicinal chemists to prioritize or deprioritize specific compound series based on predicted ADMET profiles, thus focusing synthesis efforts on leads with higher probabilities of success. (ii) They facilitate early-stage toxicity screening by serving as a rapid, cost-effective virtual triage tool, highlighting high-risk molecules for in vitro or in vivo experimental validation and reducing the reliance on animal testing in initial phases. (iii) They support regulatory decision-making by providing supplementary, evidence-based data that can be used to assess compound risk, potentially informing the design of further necessary clinical trials or contributing to the weight of evidence in a regulatory submission.
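One way to combine virtual triage with the confidence scores discussed under model evaluation is to route each compound by the mean and the spread of an ensemble's predicted probabilities. The sketch below is a simplified illustration; the function name `triage`, the two thresholds, and the compound identifiers are hypothetical and would need to be set from validation data in any real application.

```python
from statistics import mean, pstdev


def triage(member_probs, risk_cutoff=0.5, max_spread=0.15):
    """Label one compound from the toxicity probabilities of ensemble members."""
    spread = pstdev(member_probs)  # disagreement among ensemble members
    if spread > max_spread:
        return "review"  # low confidence: flag for expert or experimental follow-up
    return "high_risk" if mean(member_probs) >= risk_cutoff else "low_risk"


# Hypothetical outputs of a three-member ensemble
predictions = {
    "cmpd_1": [0.91, 0.88, 0.95],
    "cmpd_2": [0.10, 0.05, 0.12],
    "cmpd_3": [0.20, 0.85, 0.55],
}
labels = {name: triage(probs) for name, probs in predictions.items()}
print(labels)
```

Here the third compound is flagged for review because its ensemble members disagree strongly, even though its mean probability sits near the decision cutoff.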

ML/AI applications across diverse toxicity prediction tasks

In recent years, the landscape of toxicity prediction has evolved significantly, encompassing a broad spectrum of endpoints, from acute toxicity to multi-organ damage, as well as chronic effects like carcinogenicity and genotoxicity [2, 3]. Acute toxicity assessments focus on immediate, life-threatening effects resulting from short-term, high-dose exposures. Organ-specific models delve into cardiotoxicity (e.g. in vitro hERG channel blockade assays), hepatotoxicity (DILI), nephrotoxicity (tubular damage), and neurotoxicity. Carcinogenicity studies consider both in vivo tumor induction and in vitro genotoxic endpoints, while genotoxicity analyses target molecular mechanisms such as mutagenesis, chromosomal aberrations, and DNA damage [2, 3].

This review highlights recent cutting-edge developments and representative achievements applying ML to predict these diverse toxicity endpoints, particularly acute toxicity, cardiotoxicity, hepatotoxicity, nephrotoxicity, and carcinogenicity, as well as innovative case studies from Tox21 challenges (see Table 6). It aims to provide a comprehensive overview of how methods such as multitask learning, GNNs, and generative models are shaping the future of toxicity screening.

Table 6.

Methods for different Toxicity Prediction Tasks

Method Model Endpoint/Task Data source Performance
Wang et al. [112] ensemble learning, NB, SVM hERG WOMBAT-PK, literatures AC = 84.7% (training set) AC = 82.1% (external test set) AC = 83.6% (hERG blockers in test set) AC = 78.2% (nonblockers in test set)
deephERG [113] multitask DNN hERG ChEMBL, literatures AUC = 0.944 (training set) AUC = 0.967 (validation set)
CToxPred [115] GCN hERG, Cav1.2, Nav1.5 ChEMBL, ChEMBL, BindingDB, hERGCentral, patents, literatures 1.hERG Prediction: AC = 81.4% (Eval-70 external set) AC = 71.2% (Eval-60 external set) SN = 86.7% (blockers in Eval-70) SP = 74.6% (nonblockers in Eval-70) 2.Nav1.5 Prediction: AC = 81.7% (Eval-70 external set) AC = 76.6% (Eval-60 external set) SN = 85.6% (blockers in Eval-70) SP = 73.3% (nonblockers in Eval-70) 3.Cav1.2 Prediction: AC = 86.4% (Eval-70 external set) AC = 69.4% (Eval-60 external set) SN = 96.2% (blockers in Eval-70) SP = 69.0% (nonblockers in Eval-70)
CardiOT [143] GNN, KAN hERG, Cav1.2, Nav1.5 ChEMBL, ChEMBL, BindingDB, hERGCentral, patents, literatures ACC = 74.7% F1 = 76.4% SEN = 82.2% SPE = 68.2% CCR = 75.2% MCC = 50.1%
CardioGenAI [116] Transformer, GAT hERG, Cav1.2, Nav1.5 ChEMBL, GuacaMol v1, MOSES, BindingDB 1.hERG Channel: AC = 83.5% SN = 86.2% SP = 80.3% F1 = 85.1% CCR = 83.2% MCC = 66.7% 2.Nav1.5 Channel: AC = 89.4% SN = 95.9% SP = 75.6% F1 = 92.5% CCR = 85.7% MCC = 75.1% 3.Cav1.2 Channel: AC = 91.4% SN = 96.2% SP = 82.8% F1 = 93.5% CCR = 89.5% MCC = 81.0%
InterDILI [118] RF, LGBM, LR, attention DILI DILIrank, NCTR, literatures AUROC = 0.88–0.97 AUPRC = 0.81–0.95
StackDILI [117] GA, Stacking Architecture DILI DILIrank, NCTR, literatures DILIrank test set: AC = 92.7% SN = 96.2% SP = 90.3% precision = 87.2% F1 = 91.5% 10-fold cross-validation: AC = 79.2% SN = 81.4% SP = 76.9% precision = 79.0% F1 = 80.1%
pDILI_v1 [119] LR, KNN, NB, RF, DT, QDA, MLP DILI DILIst FDR = 0.053 FOR = 0.230 SN = 82.9% (training set) SN = 78.6% (test set) FDR = 0.240 FOR = 0.378 (test set)
DILIPredictor [120] RF Human hepatotoxicity, Animal hepatotoxicity A, Animal hepatotoxicity B, Preclinical hepatotoxicity, Diverse DILI A, Diverse DILI C, BSEP, Mitotox, Reactive Metabolite DILIst, DILIrank, multi-Proxy-DILI Data Sets AUC-ROC = 0.79 detection capability = 2.68 LR+ score (top 25 toxic compounds)
Jin et al. [121] XGBoost-SHAP pathway Open TG-GATEs, DrugMatrix Precision = 86% (49 TP, 8 FP, 57 predicted positives) SP = 71% (20 TN/32 predicted negatives) AC overall = 78% (69 correct/89 total) precision = 91%, SP = 89% (cutoff = 1.22)
Tox-GAN [122] CGAN, WGAN Gene activities and expression profiles Open TG-GATEs rge = 0.997 ± 0.002 re = 0.740 ± 0.08
Shi et al. [123] Consensus Modeling, SVM, RF, DT, ASNN, RFR, XGBoost DIRI SIDER AUC = 0.93 (consensus model) Q = 86.24% (external validation) MCC = 0.82 (external validation) SE = 85.45% (external validation) SP = 87.04% (external validation) EF = 1.72% (external validation)
Gong et al. [124] ANN, LightGBM, SVM, RF, DT, KNN, NB, XGBoost DIRI SIDER, DrugBank, ChEMBL ANN_GraphFP (AUC = 0.870, ACC = 0.782, SE = 0.844) SVM_GraphFP (AUC = 0.856, ACC = 0.795, SE = 0.781) RF_GraphFP (AUC = 0.846, ACC = 0.808, SE = 0.719) LightGBM_GraphFP (AUC = 0.846, ACC = 0.782, SE = 0.719) LightGBM_KRFP (AUC = 0.812, ACC = 0.769, SE = 0.719) ANN_PubChemFP (AUC = 0.810, ACC = 0.769, SE = 0.750)
Mazumdar et al. [125] DNN, XGBoost, Extra-tree DIRI literatures ROC-AUC = 0.85–0.88 AC = 82% (DNN) ROC-AUC = 0.6 (Extra-tree) ROC-AUC = 0.7 (XG Boost)
Nguyen-Vo [126] AdaBoost, XGBoost, GB, ERT, RF, KNN, SVM, LR DIRI literatures Best models: AUC-ROC = 0.7583 ± 0.0189 AUC-PR = 0.8883 ± 0.0185
Att-RethinkNet [127] RNN 8 kidney pathological findings Open TG-GATEs ACC = 89.4% SPE = 98.2% SEN = 94.2% F1 = 93.8% AUC = 0.993 (liver data) ACC = 97.5% SPE = 99.5% SEN = 99.1% AUC = 0.9949 F1 = 99.2% (kidney data)
Jain et al. [132] MT-DNN, ST-DNN, Consensus Model 59 different end points ChemIDplus average RMSE = 0.65 average R2 = 0.57 (consensus B models)
STopTox [131] QSAR, RF Skin sensitization, Skin irritation/corrosion, Eye irritation/corrosion, Acute dermal, Acute inhalation, Acute oral ECHA, REACH, ICCVAM, ToxValDB, NICEATM, literatures Skin sensitization: CCR = 0.70, Se = 0.66, Sp = 0.75, PPV = 0.71, NPV = 0.75, Coverage = 0.96 Skin irritation/corrosion: CCR = 0.72, Se = 0.77, Sp = 0.66, PPV = 0.69, NPV = 0.74, Coverage = 0.94 Eye irritation/corrosion: CCR = 0.72, Se = 0.72, Sp = 0.71, PPV = 0.71, NPV = 0.71, Coverage = 0.95 Acute dermal: CCR = 0.76, Se = 0.74, Sp = 0.78, PPV = 0.77, NPV = 0.75, Coverage = 0.93, Acute inhalation: CCR = 0.74, Se = 0.69, Sp = 0.80, PPV = 0.77, NPV = 0.72, Coverage = 0.95 Acute oral: CCR = 0.77, Se = 0.85, Sp = 0.70, PPV = 0.79, NPV = 0.78, Coverage = 0.95
PredAOT [128] RF AOT OCHEM, literatures RMSE = 0.3806, R2 = 0.3557 (mice, toxic regressor) RMSE = 0.2923, R2 = 0.3881 (mice, nontoxic regressor) RMSE = 0.5323, R2 = 0.3065 (rats, toxic regressor) RMSE = 0.3863, R2 = 0.2702 (rats, nontoxic regressor)
DermalPred [130] RF, SVM, XGBoost, LightGBM, GCN, GAT, Attentive FP ADT ChemIDplus, eChemPortal AUC = 78.0% (species 1, 10-fold CV) AUC = 82.0% (species 2, 10-fold CV)
Wijeyesakere et al. [129] QSAR, RF AOT NTP ICE portal SN = 76.1% (GHS 1–2) SN = 76.6% (GHS 1–3) balanced AC = 73.7%
CapsCarcino [133] Capsule network Carcinogenicity CPDB, ISSCAN AC = 85.0% (external validation)
HNN-Cancer [144] HNN, CNN Carcinogenicity, pTD50 MEG, TG230, NTP, IARC, JSOH, NIOSH, CPDB, CCRIS, Drugbank AC = 74% (HNN-Cancer/RF/Bagging, binary classification) AUC ≈ 0.81 (binary classification, 7994 chemicals) SN = 79.5%, SP = 67.3% (HNN-Cancer, binary) AC = 70% (HNN-Cancer/RF/Bagging/AdaBoost, multiclass, 1618 chemicals) AUC = 0.7 (multiclass, 1618 chemicals) R ≈ 0.62 (HNN-Cancer/RF, regression)
CONCERTO [134] GNN, transfer learning Carcinogenicity CPDB, CCRIS, Hansen ROCAUC = 0.73
DCAMCP [136] Capsule network, graph attention Carcinogenicity CPDB, CCRIS, ISSCAN ACC = 0.718 ± 0.009 SE = 0.721 ± 0.006 SP = 0.715 ± 0.014 AUC = 0.793 ± 0.012 ACC = 0.750, SE = 0.778, SP = 0.727, AUC = 0.811 (external validation, 100 compounds)
Metabokiller [137] RF, MLP, KNN, SVM, SGD, LR, GCM, attentive FP, GCN, GAN electrophilic properties, epigenetic modifications, genomic instability, oxidative stress, proliferative properties, anti-apoptotic properties Literatures and databases: https://zenodo.org/records/6683106 AUC = 0.87 AC = 0.82 Recall = 0.89 F1 = 0.82 Precision = 0.76
DeepTox [139] DNN AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 Tox21 AUC: AhR = 0.923, AR = 0.778, AR-LBD = 0.825, ARE = 0.829, Aromatase = 0.804, ATAD5 = 0.775, ER = 0.791, ER-LBD = 0.811, HSE = 0.863, MMP = 0.930, p53 = 0.860, PPAR-gamma = 0.856
CensNet [140] GCN AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 Tox21, Lipophilicity, Cora, Citeseer, literatures Tox21: Train PCT 60%: val set: AUC = 0.76 ± 0.00 test set: AUC = 0.77 ± 0.00 Train PCT 70%: val set: AUC = 0.76 ± 0.00 test set: AUC = 0.77 ± 0.00 Train PCT 80%: val set: AUC = 0.76 ± 0.00 test set: AUC = 0.78 ± 0.00 Train PCT 90%: val set: AUC = 0.78 ± 0.01 test set: AUC = 0.79 ± 0.01 Lipophilicity: Train PCT 60%: val set: RMSE = 0.94 ± 0.01 test set: RMSE = 0.97 ± 0.01 Train PCT 70%: val set: RMSE = 0.92 ± 0.01 test set: RMSE = 0.95 ± 0.01 Train PCT 80%: val set: RMSE = 0.96 ± 0.01 test set: RMSE = 0.93 ± 0.01 Train PCT 90%: val set: RMSE = 0.94 ± 0.02 test set: RMSE = 0.83 ± 0.02
GMT [141] GMT AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 HIV, Tox21, ToxCast, BBBP HIV: ACC = 77.56 ± 1.25 Tox21: ACC = 77.30 ± 0.59 ToxCast: ACC = 65.44 ± 0.58 BBBP: ACC = 68.31 ± 1.62
Meta-MGNN [142] GNN, Meta learning AR, AhR, AR-LBD, ER, ER-LBD, aromatase, PPAR-gamma, ARE, ATAD5, HSE, MMP, p53 Tox21, SIDER Tox21:1-Shot: average AUC = 76.87% 5-Shot: average AUC = 78.02% SIDER: 1-Shot: average AUC = 73.34% 6-shot: average AUC = 74.72%
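Most classification entries in Table 6 derive from a confusion matrix: accuracy (AC/ACC), sensitivity (SN/SE/SEN), specificity (SP/SPE), precision, the correct classification rate (CCR, the mean of sensitivity and specificity), and the Matthews correlation coefficient (MCC). The sketch below applies the standard definitions to the counts reported for Jin et al. [121] (49 TP, 8 FP, 20 TN); the false-negative count of 12 is an inference from "32 predicted negatives" minus 20 TN rather than a value stated in the table.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    ccr = (sensitivity + specificity) / 2  # balanced accuracy
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "ccr": ccr,
        "mcc": mcc,
        "f1": f1,
    }


# Counts from the Jin et al. [121] row; fn = 32 - 20 is inferred, not reported.
m = confusion_metrics(tp=49, fp=8, tn=20, fn=12)
print({name: round(value, 2) for name, value in m.items()})
```

With these counts, precision, specificity, and accuracy round to the 86%, 71%, and 78% listed in the table. Because studies in Table 6 differ in datasets, splits, and class balance, such figures are not directly comparable across rows.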

Cardiotoxicity

In cardiotoxicity prediction, hERG, Cav1.2, and Nav1.5 are three key targets typically measured through in vitro assays. Historically, due to limited bioactivity data for the latter two targets, most research concentrated on hERG channel blockade prediction. In 2016, Wang et al. [112] combined naïve Bayes (NB) with SVM, selecting optimal pharmacophore subsets via recursive partitioning (RP), and integrated multiple pharmacophore features through ensemble learning, developing classification models with high accuracy. Their SVM model demonstrated excellent external test performance, elucidating complex hERG blocker interactions. In 2019, Cai et al. [113] introduced deephERG, a multitask DNN-based model trained on 7889 structurally diverse compounds, achieving highly accurate predictions by simultaneously learning chemical features and hERG inhibitory activity. In 2020, Ryu et al. [114] proposed DeepHIT, an end-to-end deep learning approach that transforms molecules into extended connectivity fingerprints (ECFP4) and graph representations, extracting global and local toxicological features using GCNs, optimized via binary cross-entropy loss. This approach effectively predicted hERG toxicity without traditional reliance on manual feature engineering. With accumulating data, researchers have extended their predictions to joint modeling of hERG, Nav1.5, and Cav1.2 channels. Issar et al. [115] developed CToxPred, systematically evaluating fingerprints, descriptors, and graph-based numerical representations within a deep-learning framework, significantly improving multitarget predictive capabilities. Additionally, Kyro et al. [116] developed CardioGenAI, which combines autoregressive Transformer generative models and discriminative deep-learning models. In a case study involving the high-affinity hERG inhibitor pimozide, their method generated candidate drug fluspirilene with over 700-fold reduction in hERG affinity, while preserving therapeutic activity. 
Collectively, current ML- and deep learning-based cardiotoxicity prediction methods have evolved from single-target models to comprehensive, multitarget, multifeature assessment systems, substantially enhancing prediction accuracy and generalizability. Future advancements in high-quality cross-target activity data and model interpretability techniques will drive cardiotoxicity prediction towards greater automation and precision. In practice, traditional ML approaches (e.g. SVM, ensemble learning) remain robust and interpretable for single-target predictions with limited data, whereas deep learning models (e.g. GCNs, Transformers) are more advantageous in multitarget modeling and large-scale compound screening due to their ability to capture complex molecular features.
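For reference, the binary cross-entropy objective mentioned above for DeepHIT has the following standard form for a blocker/nonblocker classifier. The notation is introduced here for illustration: $y_i \in \{0, 1\}$ is the label of compound $i$, $\hat{p}_i$ is the predicted probability of hERG blockade, and $N$ is the number of training compounds.

```latex
\mathcal{L}_{\mathrm{BCE}}
  = -\frac{1}{N} \sum_{i=1}^{N}
    \Bigl[\, y_i \log \hat{p}_i + (1 - y_i) \log\bigl(1 - \hat{p}_i\bigr) \Bigr]
```

With natural logarithms, a true blocker predicted at $\hat{p} = 0.9$ contributes about 0.105 to the sum, whereas the same compound predicted at $\hat{p} = 0.1$ contributes about 2.303, so confident errors are penalized far more heavily than confident correct predictions are rewarded.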

Hepatotoxicity

Prediction of drug-induced hepatotoxicity remains a critical bottleneck in drug development. Traditional serum biomarkers such as ALT and AST, which are measured in vivo, have limited sensitivity and specificity for detecting liver injury. Furthermore, discrepancies between in vitro assays and the complex in vivo metabolic environment, coupled with species-specific differences that limit the translational value of animal models, contribute significantly to the clinical failure of numerous drug candidates due to liver toxicity. Recent advancements in ML have markedly improved predictive accuracy by integrating multidimensional datasets. For instance, StackDILI [117] employs stacked ensemble learning to merge chemical structure and bioactivity data, effectively reducing bias inherent to single algorithms. InterDILI [118] combines permutation feature importance and attention mechanisms, achieving high prediction accuracy while identifying critical structural alerts, such as aniline derivatives, thereby providing explicit guidance for drug design. Additionally, pDILI_v1 [119] utilizes probabilistic modeling to analyze extensive drug-adverse reaction datasets, enabling automated extraction and quantification of liver toxicity risk from unstructured text data. Notably, DILIPredictor [120] integrates nine toxicity endpoints, including mitochondrial toxicity and bile acid transporter inhibition, along with chemical structure, pharmacokinetic parameters, and surrogate toxicity data. Using an RF model, it delivers superior predictive performance, discriminates cross-species toxicities, and captures a broader range of mechanisms than single-target models.
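Permutation feature importance, one of the interpretability tools mentioned above, can be illustrated generically: a feature matters to a model if shuffling its values degrades performance. The sketch below is not the InterDILI implementation; the function names, the toy classifier (which deliberately reads only the first feature), and the data are hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model output equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained classifier; it ignores feature 1."""
    return 1 if row[0] > 0.5 else 0


X = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.9], [0.1, 0.3], [0.7, 0.5], [0.3, 0.6]]
y = [1, 1, 0, 0, 1, 0]
importances = permutation_importance(toy_model, X, y)
print(importances)
```

The second feature receives an importance of exactly zero because the toy model never uses it, while shuffling the first feature lowers accuracy. In a toxicity setting the columns would be fingerprint bits or descriptors, and high-importance bits can be mapped back to substructures as candidate structural alerts.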

With the accumulation of toxicogenomics data, research has progressively shifted toward mechanism-based analyses rooted in gene expression and pathway perturbations. Jin et al. [121] proposed an entropy weight method (EWM) to quantify gene expression dispersion, facilitating the evaluation of pathway disruptions. This approach, combined with ML, effectively identifies key toxic pathways—such as ferroptosis during acetaminophen-induced liver injury—thus bridging predictive modeling with underlying biological processes. Moreover, in the realm of generative AI applications, the Tox-GAN model [122] leverages GANs to simulate drug-induced transcriptomic profiles, overcoming the limitations posed by limited experimental data availability. This innovative approach generates virtual toxicogenomic data, enabling the prediction of pathway activation patterns for unknown compounds, and thereby offering an efficient strategy for early-stage hepatotoxicity screening. Collectively, these advances are transitioning hepatotoxicity prediction from traditional single-marker assessments toward integrative mechanism-driven models that combine chemical structure, genomic signatures, and multi-omics data. Such efforts promise to accelerate the development of safer therapeutics and reduce toxicity-related clinical attrition. For practical applications, ensemble learning remains a reliable choice for integrating heterogeneous data sources, whereas mechanism-driven models leveraging toxicogenomics and generative approaches are particularly recommended when exploring novel compounds or elucidating underlying biological pathways.
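The entropy weight method can be sketched in a few lines. The version below is the textbook formulation and is only an assumption about how it might be applied to expression data; the exact procedure of Jin et al. [121] is not reproduced here. Each column (e.g. a gene) is normalized to proportions across samples, its Shannon entropy is scaled to [0, 1], and columns with lower entropy (greater dispersion) receive larger weights. The matrix values and the function name are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method on a non-negative samples x indicators matrix.

    Returns one weight per column; more dispersed columns get larger weights.
    """
    n = len(matrix)
    m = len(matrix[0])
    k = 1.0 / math.log(n)  # scales entropy so that a uniform column has entropy 1
    divergences = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        # p = x / total is the share of each sample; 0 * ln(0) is treated as 0.
        entropy = -k * sum((x / total) * math.log(x / total) for x in col if x > 0)
        divergences.append(max(0.0, 1.0 - entropy))
    s = sum(divergences)
    if s == 0:  # all columns uniform: fall back to equal weights
        return [1.0 / m] * m
    return [d / s for d in divergences]

# Hypothetical expression matrix: 4 samples x 2 genes.
# Gene 0 is flat across samples; gene 1 is strongly perturbed in one sample.
expression = [
    [1.0, 0.1],
    [1.0, 0.1],
    [1.0, 0.1],
    [1.0, 3.7],
]
weights = entropy_weights(expression)
```

Here nearly all of the weight falls on the perturbed gene, which is the behaviour one would want when ranking genes or pathways by dispersion.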

Nephrotoxicity

Nephrotoxicity is typically assessed in vivo by measuring blood urea nitrogen and serum creatinine. However, similar to hepatotoxicity, predicting nephrotoxicity faces two major challenges: the limited translatability of findings from animal models to humans, and the suboptimal specificity and accuracy of available biomarkers. Research on nephrotoxicity prediction has been less abundant than that on cardiotoxicity and hepatotoxicity. Shi et al. [123] compiled a real-world dataset comprising 565 compounds (287 nephrotoxic drugs and 278 non-nephrotoxic drugs). They developed predictive models utilizing five conventional ML methods (e.g. RF, Extreme Gradient Boosting (XGBoost)) and five deep learning algorithms, such as convolutional neural fingerprint networks (CNFNs). An ensemble of the top three individual models was further employed as a consensus predictor. Additionally, the study identified 87 structural alerts based on Klekota–Roth fingerprints (KRFP), calculating substructure frequency (f-score) and positivity rates. Among these alerts, 16 exhibited specificity for nephrotoxic drugs, offering mechanistic insights through structural signatures. Gong et al. [124] constructed a dataset consisting of 777 drugs, including 125 TCM components. They employed nine molecular fingerprints (e.g. Atom Pair, MACCS) combined with eight ML algorithms, generating a total of 72 classification models. In 10-fold cross-validation and external validation, the optimal model (SVM combined with CDK graph fingerprints) demonstrated robust generalizability for both TCM-derived and chemically synthesized drugs. Following OECD guidelines, the applicability domain was rigorously evaluated, and SARpy and information-gain methods identified eight potential nephrotoxicity-related structural alerts, including fluorinated benzene rings and polyamine derivatives, thus providing critical structural warnings for drug safety considerations.
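To make the notion of a structural alert concrete, the sketch below computes how often a substructure occurs and how often compounds carrying it are toxic. The definitions used here (hit count, positive rate, and enrichment over the dataset's baseline toxic fraction) are simple assumptions for illustration and are not the exact f-score formula of Shi et al. [123]. The substructure keys `A`, `B`, `C` stand in for fingerprint bits and are hypothetical.

```python
def alert_statistics(compounds, alert):
    """Summarize one candidate alert.

    compounds: list of (set_of_substructure_keys, is_toxic) pairs.
    """
    n_total = len(compounds)
    n_toxic = sum(1 for _, toxic in compounds if toxic)
    hits = [toxic for keys, toxic in compounds if alert in keys]
    n_hits = len(hits)
    positive_rate = sum(hits) / n_hits if n_hits else 0.0
    baseline = n_toxic / n_total  # toxic fraction of the whole dataset
    enrichment = positive_rate / baseline if baseline else 0.0
    return {"n_hits": n_hits, "positive_rate": positive_rate, "enrichment": enrichment}

# Hypothetical toy dataset: 8 compounds, 4 of them nephrotoxic.
compounds = [
    ({"A", "B"}, True),
    ({"A"}, True),
    ({"A", "C"}, True),
    ({"B"}, False),
    ({"C"}, False),
    ({"B", "C"}, True),
    ({"C"}, False),
    ({"B"}, False),
]

stats_a = alert_statistics(compounds, "A")  # present only in toxic compounds
stats_b = alert_statistics(compounds, "B")  # present in toxic and non-toxic alike
```

A substructure such as `A`, which appears only in toxic compounds, would be a candidate alert. `B` is no more frequent among toxic compounds than chance, so it carries no warning value.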

Mazumdar et al. [125] integrated deep learning and traditional ML methods, employing eight types of molecular fingerprints and RDKit descriptors to build 27 ML models and one DNN. Their DNN achieved optimal performance in five-fold cross-validation, whereas an Extra-tree model showed an accuracy of 82.1% on an independent test set. Innovatively, the study applied association rule mining, identifying 10 high-frequency substructures within nephrotoxic compounds, such as benzimidazole derivatives and fluorinated substituents, revealing that structural cooperativity significantly influences nephrotoxicity. Nguyen-Vo et al. [126] prioritized data quality optimization through meticulous data cleaning and class balancing, generating a dataset containing 604 positive and 228 negative samples. Utilizing eight algorithms (e.g. Extremely Randomized Trees (ERT), XGBoost) and three molecular representations (Mol2vec embeddings, RDKit descriptors, ECFP fingerprints), they developed 32 models. Their results revealed consistently superior performance of ERT models across different molecular representations, providing a reliable baseline for subsequent investigations.
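Association rule mining over substructures can be sketched with the usual support, confidence, and lift measures. This is a generic illustration, not the procedure of Mazumdar et al. [125]: the records, thresholds, and the function name `mine_rules` are hypothetical. It shows how a pair of substructures can predict toxicity more reliably than either one alone, which is one way to read "structural cooperativity".

```python
from itertools import combinations

def mine_rules(records, size=2, min_support=0.2, min_confidence=0.8):
    """Find rules of the form {substructures} -> toxic.

    records: list of (set_of_substructures, is_toxic) pairs.
    Returns (itemset, support, confidence, lift) tuples, best lift first.
    """
    n = len(records)
    toxic_rate = sum(1 for _, toxic in records if toxic) / n
    items = sorted(set().union(*(subs for subs, _ in records)))
    rules = []
    for itemset in combinations(items, size):
        covered = [toxic for subs, toxic in records if set(itemset) <= subs]
        if not covered:
            continue
        support = sum(covered) / n  # fraction of all records: itemset and toxic
        confidence = sum(covered) / len(covered)  # P(toxic | itemset)
        lift = confidence / toxic_rate
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, support, confidence, lift))
    return sorted(rules, key=lambda r: (-r[3], -r[1]))

# Hypothetical records: substructure sets with a nephrotoxicity label.
records = [
    ({"benzimidazole", "fluoro"}, True),
    ({"benzimidazole", "fluoro", "amide"}, True),
    ({"benzimidazole"}, False),
    ({"fluoro"}, False),
    ({"amide"}, False),
    ({"benzimidazole", "fluoro"}, True),
    ({"amide", "fluoro"}, False),
    ({"benzimidazole", "amide"}, False),
]

single_rules = mine_rules(records, size=1)
pair_rules = mine_rules(records, size=2)
```

With these toy records no single substructure reaches the confidence threshold, whereas the benzimidazole and fluoro pair does.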

The recent incorporation of toxicogenomic data has taken nephrotoxicity prediction in a new direction. Su et al. [127] introduced the Att-RethinkNet model, a multilabel learning framework based on gene expression data from the Open TG-GATEs database. By employing memory structures and attention mechanisms, the model effectively captures correlations between hepatic and renal pathological phenotypes. Furthermore, it integrates multidimensional parameters such as compound type, dosage, and administration duration, enabling simultaneous predictions of 20 renal/hepatic pathological phenotypes. The model performs robustly on in vivo rat datasets, and its attention mechanism highlights key gene features, thereby offering interpretability at the gene-expression level and facilitating deeper mechanistic insights into nephrotoxicity. Together, these advances signify a progressive evolution in nephrotoxicity prediction: from traditional structure-based analyses toward multimodal data integration and interpretability-driven modeling, thus supporting early-stage toxicity risk assessment in drug development. In practice, ensemble or tree-based models (e.g. ERT, RF) remain reliable baselines for small to medium datasets, whereas attention-based multi-omics frameworks are particularly suitable when exploring mechanistic insights or cross-organ toxicities.

Acute toxicity

Acute toxicity refers to harmful effects caused by chemical exposure within a short period (usually less than 24 h) through single or multiple administrations. Quantitative measures such as median lethal dose (LD50), median lethal concentration (LC50), and minimal lethal dose (MLD), which are classic in vivo endpoints, are employed to evaluate the severity of lethal toxicity. Acute toxicity assessments encompass various endpoints depending on administration routes (oral, inhalation, dermal, etc.) and symptom profiles, among which acute oral toxicity prediction remains a primary research focus. In the domain of acute oral toxicity prediction, the computational framework PredAOT [128], based on multiple RF models trained on acute oral toxicity data for mice and rats from the OCHEM database and the literature, categorizes compounds as “toxic” or “nontoxic/low-toxicity”. First, an “AOT classifier” determines the toxicity class of a compound; regression models then predict its LD50 (in vivo). The synthetic minority oversampling technique (SMOTE) was used to manage class imbalance, achieving predictive performance comparable to or better than that of existing tools. Wijeyesakere et al. [129] developed a mechanistically driven QSAR model by analyzing the U.S. NTP’s rat acute toxicity database. This model assigned primary mechanisms of action using various mechanistic analysis tools, and further employed RF for acute oral LD50 prediction, optimizing results by structure-mechanism similarity. The method exhibited enhanced sensitivity and balanced accuracy in identifying highly toxic compounds, enabling toxicity predictions based on specific mechanisms, such as aconitase inhibition.
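The core idea of SMOTE, as used above to handle class imbalance, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal standard-library sketch of that idea, not the PredAOT code. The descriptor vectors, the choice of k, and the function name are hypothetical, and production work would normally rely on an established implementation.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class vectors.

    Needs at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x) by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to the neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Hypothetical two-descriptor vectors for the under-represented class.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling should be applied only to the training folds, so that the evaluation data stay free of interpolated compounds.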

Regarding acute dermal toxicity prediction, DermalPred [130] exemplifies significant advancement. Integrating rabbit and rat experimental data, researchers constructed predictive models utilizing ML and deep learning algorithms. Structural alerts were extracted using tools such as SARpy, and multiple interpretation methods were combined to identify significant features and structural fragments correlated with acute dermal toxicity, culminating in an independent software tool for toxicity prediction. The optimal model demonstrated robust performance with high AUC scores in 10-fold cross-validation, effectively supporting safety assessments for pesticides, cosmetics, and pharmaceuticals. Moreover, several studies aim to predict multiple acute toxicity endpoints simultaneously. STopTox [131] integrated publicly available datasets to develop QSAR models targeting six acute endpoints, namely skin sensitization, skin irritation, eye irritation, acute oral toxicity (in vivo), acute inhalation toxicity (in vivo), and acute dermal toxicity, all available through a comprehensive online platform. Rigorously validated, these models effectively identify potentially toxic or nontoxic compounds across multiple endpoints. Jain et al. [132] compiled extensive public datasets and developed diverse single-task and multitask models using RFs and DNNs. They introduced consensus models derived from multiple multitask frameworks, demonstrating excellent predictive capability for 59 acute systemic toxicity endpoints, especially for less-represented endpoints. Collectively, acute toxicity prediction research continues to evolve, broadening from single-route acute oral predictions to multifaceted acute toxicity profiling. This work contributes to drug development, environmental chemical screening, and regulatory decisions, reducing reliance on animal testing and steering toxicological research toward enhanced accuracy and sustainability. In practice, QSAR and RF-based models remain suitable for rapid screening and regulatory purposes, whereas multitask and consensus deep learning frameworks are recommended for handling diverse endpoints and capturing underrepresented toxicity profiles.
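A consensus over several multi-endpoint models can be as simple as averaging their predicted probabilities per endpoint. The sketch below assumes plain averaging and a 0.5 decision threshold, which may differ from the consensus scheme of Jain et al. [132]. The endpoint labels and probabilities are hypothetical.

```python
def consensus_predict(model_outputs, threshold=0.5):
    """Average per-endpoint probabilities across models.

    model_outputs: list of dicts mapping endpoint -> predicted probability,
    one dict per model. A model may omit endpoints it does not cover.
    Returns endpoint -> (mean_probability, is_predicted_toxic).
    """
    endpoints = sorted(set().union(*model_outputs))
    consensus = {}
    for endpoint in endpoints:
        probs = [out[endpoint] for out in model_outputs if endpoint in out]
        mean = sum(probs) / len(probs)
        consensus[endpoint] = (mean, mean >= threshold)
    return consensus

# Hypothetical outputs of three models for one compound.
model_outputs = [
    {"oral": 0.75, "dermal": 0.25},
    {"oral": 0.5, "dermal": 0.25, "inhalation": 0.5},
    {"oral": 1.0, "dermal": 0.0},
]
consensus = consensus_predict(model_outputs)
```

Averaging only over the models that cover an endpoint lets sparsely represented endpoints still receive a prediction, at the cost of resting on fewer votes.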

Carcinogenicity

Accurate prediction of drug carcinogenicity is crucial for ensuring public medication safety. Traditional in vivo animal and in vitro cellular methods are constrained by high costs, prolonged durations, poor extrapolation accuracy to humans, and limited simulation of realistic physiological environments. Recent rapid advances in ML and deep learning have spurred the development of highly efficient carcinogenicity prediction models. For instance, CapsCarcino [133], leveraging the dynamic routing algorithm of capsule networks, achieved high accuracy in external validation datasets, demonstrating superior generalizability even in sparse data conditions. CONCERTO [134] combined graph Transformer architectures with molecular fingerprint representations, significantly enhancing prediction performance through iterative pretraining and transfer learning strategies. Limbu and Dakshanamurthy introduced the HNN-Cancer model [135], integrating convolutional neural networks (CNNs) and feedforward neural networks (FFNNs), using improved SMILES-based representations, and delivering robust results in binary, multiclass, and regression tasks across diverse chemical categories. Similarly, DCAMCP [136], incorporating capsule networks and attention mechanisms alongside various molecular fingerprints and graph-based descriptors, displayed impressive results in both cross-validation and external validation phases. Additionally, Metabokiller [137], integrating biochemical characteristics related to carcinogenicity and utilizing ensemble classification methods, attained high precision and recall when predicting carcinogenic potential for unknown compounds, with selected predictions experimentally validated. Nonetheless, current carcinogenicity prediction models still encounter substantial hurdles, including inadequate quantity and quality of training data, limited interpretability, inconsistent cross-species predictive capability, elevated risks of overfitting, and challenges achieving widespread acceptance from regulatory authorities. From an application standpoint, capsule network and graph-transformer models show strong promise for sparse or heterogeneous datasets, while ensemble approaches that integrate biochemical descriptors remain a practical choice for improving robustness and regulatory acceptance.

Tox21 data challenge

In 2014, the Tox21 initiative launched the Tox21 data challenge [138], utilizing quantitative HTS data from in vitro assays based on nuclear receptor signaling pathways and cellular stress response pathways from the Tox21 10K chemical library. The dataset comprised 12 060 training samples and 647 test samples, covering 12 toxicity endpoints related to nuclear receptor activation and cellular stress responses. The global competition invited bioinformatics, data science, and ML experts to collaboratively develop and validate innovative toxicity prediction models, addressing the high-cost and low-efficiency bottlenecks of traditional toxicity testing methodologies. Results revealed diverse methodologies applied by participating teams, including classical algorithms such as RFs, SVM, k-nearest neighbors, and NB, alongside state-of-the-art deep learning methods [138]. Among these, deep learning approaches, due to their automatic extraction of hierarchical chemical feature representations, generally demonstrated superior performance. Mayr et al. [139] notably pioneered the application of deep learning models in toxicity prediction, establishing hierarchical chemical feature models that significantly outperformed traditional approaches.
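A common way to score multi-endpoint predictions of this kind is to compute the ROC AUC separately for each endpoint and then average. In screening data a compound may lack a measurement for some assays, so the sketch below skips missing labels (encoded as `None`) per endpoint. The label and score matrices are hypothetical, and the two endpoint names are used only as illustrative column labels.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random active outscores a random inactive.

    Ties count as 0.5. Returns None if either class is absent.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(label_matrix, score_matrix, endpoints):
    """Per-endpoint AUC over labelled compounds only, plus the unweighted mean."""
    per_endpoint = {}
    for j, name in enumerate(endpoints):
        pairs = [
            (labels[j], scores[j])
            for labels, scores in zip(label_matrix, score_matrix)
            if labels[j] is not None  # compound not measured in this assay
        ]
        auc = roc_auc([l for l, _ in pairs], [s for _, s in pairs])
        if auc is not None:
            per_endpoint[name] = auc
    return per_endpoint, sum(per_endpoint.values()) / len(per_endpoint)

# Hypothetical data: 5 compounds x 2 endpoints, None marks a missing label.
endpoints = ["NR-AR", "SR-MMP"]
labels = [[1, 0], [0, None], [1, 1], [0, 0], [None, 1]]
scores = [[0.9, 0.2], [0.3, 0.5], [0.6, 0.4], [0.7, 0.6], [0.1, 0.8]]

per_endpoint, average = mean_auc(labels, scores, endpoints)
```

The pairwise AUC used here is quadratic in the number of compounds, which is fine for a sketch. For a dataset of the size described above a rank-based implementation would be preferable.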

As of March 2025, the Tox21 dataset has evolved into five benchmark tasks: Molecular Property Prediction, Drug Discovery, Graph Regression, Graph Classification, and Molecular Property Prediction (1-shot) (https://paperswithcode.com/dataset/tox21-1). Within these benchmarks, deep learning models, particularly GNNs, dominate in terms of performance. For example, Deep-CBN currently leads in molecular property prediction tasks, whereas CensNet [140] excels in graph regression tasks. GMT [141] demonstrates outstanding performance in graph classification tasks, and Meta-MGNN [142] dominates the single-shot molecular property prediction task, all based upon GNN architectures. Furthermore, within the drug discovery task, six out of the top 10 performing models leverage graph-based architectures, underscoring the unique strengths of GNN-based approaches in molecular representation learning.

Network toxicology and its application in toxicity evaluation of traditional Chinese medicine

TCM, characterized by a longstanding history and widespread clinical application, remains an essential component of China’s healthcare heritage. Compared to Western medicines, which typically act on defined targets through chemical synthesis or natural extraction, providing rapid onset yet usually targeting single symptoms or specific diseases, TCMs possess distinct advantages due to their multicomponent, multitarget nature [145]. TCM formulations consist of diverse bioactive ingredients that interact complexly to produce comprehensive therapeutic effects through multiple targets and pathways. Notably, during the COVID-19 pandemic, TCM demonstrated its irreplaceable role by intervening at multiple stages of viral infection and regulating immune responses, effectively leveraging its multicomponent, multitarget strategy [146]. Nevertheless, as TCM advances rapidly, increasing concerns regarding its potential toxicities have emerged [147–149], making TCM toxicity evaluation a critical area in contemporary research. However, the complex composition and multifaceted mechanisms of action of TCM pose considerable challenges for accurate safety assessment.

Network toxicology, an emerging approach derived from network pharmacology, has become a valuable tool for assessing drug toxicity. Liu and colleagues extended network modeling from “drug–target–efficacy” to “drug–target–adverse reaction,” transforming traditional pharmacological databases into specialized toxicological databases, thereby enhancing the precision of toxicological predictions [150]. Utilizing network analysis and prediction methodologies, network toxicology has successfully identified toxic constituents within TCMs and elucidated their molecular mechanisms of toxicity, providing novel insights into TCM safety evaluations [151].

The application of network toxicology in TCM toxicity evaluation generally comprises the following steps: (i) Data collection and curation: Initially, it is crucial to acquire detailed information about TCM constituents from literature reviews, databases (such as CTD [91] and TCMSP [152]), and experimental analyses. (ii) Target prediction and identification: Toxicity targets are predicted using computational tools (e.g. Hazard Expert [153], TOPKAT [154], and DEREK [155]), followed by comparative analyses with TCM-derived toxic compound targets, identifying potential toxic targets. (iii) Network construction and analysis: Toxic compounds and their targets serve as nodes within a network, visualized using tools such as Cytoscape [156], enabling topological analysis and identifying critical nodes. (iv) Toxicity mechanism and pathway analysis: The selected targets undergo Gene Ontology (GO) and KEGG enrichment analyses to clarify potential molecular mechanisms and signaling pathways underlying toxicity. (v) Molecular docking and experimental validation: Molecular docking simulations (e.g. using AutoDock [157]) assess compound-target binding interactions, while subsequent in vitro cellular or in vivo animal experiments validate predicted mechanisms and targets identified via network toxicology (Fig. 6).
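Steps (iii) and (iv) above can be sketched with the standard library alone. The example assumes that "critical nodes" are ranked by degree (the number of compounds linked to a target) and that pathway enrichment is tested with a one-sided hypergeometric test. Both are common choices, but neither is prescribed by the workflow itself. The compound names, target names, and gene counts are hypothetical.

```python
from collections import Counter
from math import comb

def target_degrees(edges):
    """Count distinct compounds linked to each target in a compound-target edge list."""
    return Counter(target for _, target in set(edges))

def hypergeom_enrichment(hits, selected, pathway_size, background):
    """One-sided enrichment p-value, P(X >= hits).

    Drawing `selected` genes from `background` genes, of which `pathway_size`
    belong to the pathway, X is the number of selected genes in the pathway.
    """
    upper = min(selected, pathway_size)
    favourable = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(background, selected)

# Hypothetical compound-target edges from steps (i) and (ii).
edges = [
    ("cpd_A", "T1"), ("cpd_B", "T1"), ("cpd_C", "T1"),
    ("cpd_A", "T2"), ("cpd_C", "T2"),
    ("cpd_B", "T3"),
]
degrees = target_degrees(edges)
hub, hub_degree = degrees.most_common(1)[0]

# Hypothetical enrichment: 3 of 5 selected targets fall in a 10-gene pathway,
# against a background of 100 genes.
p_value = hypergeom_enrichment(hits=3, selected=5, pathway_size=10, background=100)
```

In a real analysis the p-values would be computed for every GO term or KEGG pathway and corrected for multiple testing before interpretation.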

Figure 6.

Workflow for network toxicology-based toxicity prediction, encompassing five key steps: data collection and organization, target prediction and identification, network construction and analysis, toxicity mechanism and pathway analysis, and molecular docking with experimental validation.

Commonly utilized resources in network toxicology include toxicological databases, TCM-related constituent databases, predictive toxicological software, and visualization platforms. Reliable toxicological data and accurate TCM constituent information are vital. For example, TOXNET, developed by the U.S. NLM, provides extensive toxicological and chemical data relevant to environmental health and pharmacology. TCM databases such as TCMID [158] and TCMSP [152] (Table 7) facilitate access to constituent information. Predictive toxicology tools described previously (section 3.2) are extensively used for identifying toxic compounds, exogenous substance toxicity, carcinogenicity, and sensitization risks. Although visualization platforms like Cytoscape [156] and STRING [159] are ancillary, they significantly aid in the intuitive interpretation and analysis of complex biomolecular interaction networks.

Table 7.

Databases of TCM Ingredients

| Database | URL | Key features |
| --- | --- | --- |
| TCMID [158] | http://www.megabionet.org/tcmid/ | Integrates herbs, compounds, prescriptions, targets, and diseases; supports network visualization of herb–target–disease relationships. |
| TCMSP [152] | http://tcmsp-e.com/ | Covers 499 official Chinese herbs with compounds, protein targets, and diseases; provides ADME properties and built-in compound–target–disease network construction. |
| TCM@Taiwan [166] | http://tcm.cmu.edu.tw/ | Hosts >20 000 pure compounds from 453 herbs with curated 2D/3D structures; supports virtual screening and molecular docking. |
| TCM-Mesh [167] | http://mesh.tcm.microbioinformatics.org/ | Builds herb–compound–target–disease networks including side-effect and toxicity annotations for holistic safety assessment. |
| TCMGeneDIT [168] | http://tcm.lifescience.ntu.edu.tw/ | Curates literature-mined associations among herbs, genes, and diseases; enables exploration of TCM’s molecular mechanisms. |
| HIM [169] | http://www.bioinformatics.org.cn/ | Integrates metabolomic, bioactivity, toxicity, and ADMET data for TCM compounds; supports multi-omics safety evaluation. |
| TCMSTD [170] | https://www.bic.ac.cn/TCMSTD/ | The first system-toxicology database for TCM, with sections on five major toxicities and standardized toxic-target annotations. |
| TCM Bank [171] | https://TCMBank.cn/ | Aggregates active ingredients, 3D structures, gene targets, pathways, and disease associations into a unified, queryable platform. |
| DCABM-TCM [172] | http://bionet.ncpsb.org.cn/dcabm-tcm/ | Provides active compounds and targets, pharmacodynamic mechanisms, plus integrated ADMET profiles for each ingredient. |
| BATMAN-TCM 2.0 [173] | http://bionet.ncpsb.org.cn/batman-tcm/ | Predicts herb-compound–protein interactions and constructs chemical–target–disease networks for candidate selection. |

Network toxicology continues to evolve, showcasing distinct advantages in TCM toxicity research. Researchers initially predict toxic targets and mechanisms using network toxicology, followed by empirical validations to build comprehensive safety evaluation frameworks for TCM. For instance, one study investigated the hepatotoxicity of Mesaconitine (MA), a constituent of Aconitum species, using online databases and network toxicology. It identified 31 crucial hepatotoxic targets and suggested that MA may induce hepatic injury via oxidative stress activation, inflammatory response initiation, and apoptosis induction, offering critical insights into the toxicity of aconite-based TCMs [160]. Another study by Xi et al. [161] analyzed 42 nephrotoxic TCM compounds using network toxicology, identifying alkaloids as the predominant toxic class, followed by terpenoids and phenolics, highlighting the necessity of vigilance regarding nephrotoxic risks associated with these compound classes. Lv et al. [162] employed network toxicology and molecular docking to explore acute toxicity mechanisms associated with palmitic acid, a common TCM component, identifying 117 potential cardiac-toxicity-related targets.

Integrating AI with network toxicology represents an innovative research frontier, significantly enhancing toxicity prediction, risk assessment, and drug development processes. Currently, AI predominantly supports molecular interaction prediction, target identification, and molecular feature modeling within network toxicology. CNNs, a principal deep learning architecture, can predict pharmacological activities and toxicities by learning directly from structural representations, providing robust technological support for network toxicology [163]. Moreover, GNNs [164] facilitate the construction of drug-target-disease interaction networks, enabling predictions of drug-target affinities and potential toxicity profiles, thereby enriching scientific frameworks for safety evaluations. Tian et al. [165] developed an innovative approach, MHADTI, utilizing multi-view heterogeneous information network embeddings and hierarchical attention mechanisms to efficiently predict drug-target interactions, paving new avenues for mechanistic exploration in network toxicology. Although AI-driven methodologies in network toxicology still require extensive data accumulation and validation, their early successes underscore significant potential to accelerate TCM safety research and modernization efforts.

Drug toxicology research in the era of large language models

The rapid advancement of AI and natural language processing (NLP) technologies in recent years, exemplified by LLMs such as ChatGPT and DeepSeek, has profoundly impacted numerous scientific domains, becoming essential auxiliary tools for researchers [174–177]. Given the enormous, heterogeneous, and complex data involved in drug toxicology and pharmacokinetic safety evaluations, there is an urgent demand for automated analytical tools, where LLMs have demonstrated immense potential.

One of the most notable applications of LLMs in drug toxicity and safety assessment is automated literature analysis and knowledge integration. Traditionally, drug toxicity information scattered across extensive literature, clinical reports, and drug labeling documents requires laborious, subjective manual analyses. Silberg et al. [178] developed the UniTox platform employing the GPT-4 model to automatically extract and categorize toxicity information from drug labeling data of 2418 FDA-approved medications. This initiative successfully created a standardized toxicity database encompassing eight primary toxicological categories, including cardiotoxicity, hepatotoxicity, neurotoxicity, and nephrotoxicity, demonstrating LLMs’ advantages in rapidly and accurately extracting structured toxicity data, thus significantly enhancing toxicological data mining efficiency.

Additionally, LLMs contribute significantly to constructing and enhancing drug safety evaluation databases. Traditional toxicological and ADMET databases often suffer from insufficient scale, inconsistent annotation, and fragmented information, which limits their utility for deep learning model training. To overcome these challenges, Niu et al. [179] developed the PharmaBench platform, utilizing LLMs for automated extraction and integration of information from vast literature sources, drug labels, and public databases, thereby establishing a comprehensive ADMET benchmark dataset consisting of 11 sub-datasets totaling over 50 000 data entries. This initiative significantly improved traditional ADMET resources in both scope and quality, facilitating fair performance comparisons and evaluations of various predictive algorithms.

Furthermore, LLMs also demonstrate considerable potential in molecular toxicity prediction and toxicological mechanism exploration. Unlike conventional models relying on simplistic structural descriptors and statistical learning methods, LLMs leverage contextual understanding from textual molecular representations, yielding superior predictive generalizability. Yang et al. [180] benchmarked GPT-4 and its multimodal variant GPT-4o against conventional ML and deep learning approaches for molecular toxicity prediction. They found that GPT-4 outperformed all comparators across multiple evaluation metrics. Building on this, they integrated GPT-4 with molecular docking techniques to probe the potential cardiotoxicity of TCM compounds, successfully pinpointing several high-risk ingredients and their principal binding sites on cardiac targets. This pioneering work represents the first application of LLMs to molecular toxicity prediction, offering a streamlined, highly efficient workflow for early-stage drug safety screening and underscoring the immense potential of LLMs to accelerate drug development and enhance safety assessments.

Despite their strong text processing abilities, LLMs face challenges in accuracy and reliability due to noisy and inconsistently labeled data, as well as risks of data poisoning [181]. Moreover, their reasoning depends mainly on statistical patterns rather than true causal inference, which restricts their capacity to explain the detailed molecular mechanisms behind drug toxicity [182]. Furthermore, the inherent “black-box” nature and limited transparency of LLM decision-making processes fail to meet stringent interpretability and credibility demands required by drug safety assessments [183]. To mitigate these limitations and enhance trust, several strategies can be employed. These include utilizing post-hoc interpretation techniques (e.g. attention mechanism analysis, feature importance scoring, and counterfactual explanations) to decipher model predictions, and adopting inherently more interpretable model architectures or hybrid approaches that combine LLMs with knowledge graphs for explicit reasoning pathways [184]. Finally, general-purpose LLMs lack profound comprehension of specialized pharmacotoxicological terminology, necessitating domain-specific fine-tuning or the development of dedicated models to enhance prediction accuracy and domain understanding [185].

Looking ahead, LLMs are anticipated to continuously advance drug toxicology research in several key areas. In the domain of clinical applications and safety monitoring, real-time mining of electronic health records and adverse event reports by LLMs can facilitate early alerts and dynamic risk assessments of adverse drug reactions (Fig. 7). Regarding ADMET property prediction, multimodal foundation models integrating molecular structures, gene expression profiles, and metabolomics data are expected to significantly enhance prediction accuracy for complex compounds. For molecular toxicity mechanism exploration, the integration of knowledge graphs and causal inference frameworks will enable deeper elucidation of toxicity pathways. Additionally, automated agents utilizing LLMs can systematically mine and integrate emerging literature and experimental data, thus constructing continuously updated toxicology databases. Lastly, interpretability enhancement tools will improve the transparency and traceability of LLM-driven decision processes, further strengthening their credibility and applicability.

Figure 7.

Prospective applications of LLMs in drug safety evaluation and toxicology research, encompassing text mining, enhancement of model interpretability, toxicity prediction, ADMET property prediction, clinical application and safety monitoring, and knowledge organization.

Conclusion and perspectives

Computational toxicology is playing an increasingly pivotal role in drug development. Traditional toxicological assessments primarily rely on animal experiments, a methodology characterized by considerable time consumption, high cost, and significant ethical concerns regarding animal welfare. Computational toxicology, integrating advanced bioinformatics, cheminformatics, and ML techniques, facilitates rapid and efficient prediction of compound toxicities, substantially enhancing drug discovery efficiency and reinforcing drug safety [186–188]. In this review, we initially summarized 23 computational tools capable of predicting drug ADMET properties, covering various stages including data input, model training, and prediction output. These methodologies were categorized into three main approaches: rule-based/statistical models, ML-based models, and graph-based methods. Despite demonstrating powerful capabilities in pharmacokinetic prediction and toxicity evaluation, these approaches still face limitations such as inconsistent data quality, inadequate model transparency, and room for improvement in prediction accuracy.

Moreover, recent advances in toxicological databases and toxicity prediction tools were systematically discussed. Toxicological databases, classified according to their data types and application scenarios into chemical toxicity databases, environmental toxicology databases, alternative toxicology databases, and biological toxin databases, provide comprehensive data support for various predictive models. Concurrently, significant progress in ML methodologies has been achieved for toxicity prediction. Researchers have integrated heterogeneous multisource data to construct a series of high-performing models, accurately predicting multiple toxicity endpoints, including acute toxicity, organ-specific toxicity, carcinogenicity, and genetic toxicity. Furthermore, innovative technologies such as GANs have been utilized to generate virtual samples, effectively alleviating data scarcity. Overall, toxicity prediction is progressively evolving toward multimodal integration and generative AI: transitioning from single-endpoint to multi-endpoint joint modeling, expanding data dimensions from traditional chemical structures to clinical data and multi-omics information (e.g. genomics and metabolomics), and shifting from discriminative architectures towards generative network architectures such as GANs and diffusion models. This evolution offers novel paradigms for addressing toxicity evaluations in small sample sizes and complex biological systems [189, 190].

As an emerging paradigm, network toxicology exhibits distinct advantages in the safety assessment of TCM formulas. Researchers construct and analyze multidimensional “component-target-pathway” networks. This approach allows them to clarify the molecular mechanisms behind TCM toxicity, and further provides reliable evidence for safety evaluations. Notably, AI technologies show immense potential in network toxicology: intelligent reasoning based on knowledge graphs can automatically identify potential toxicity pathways, GNNs effectively capture nonlinear relationships within biological networks, and transfer learning offers promising solutions for overcoming modeling bottlenecks posed by limited TCM-specific data [191]. Additionally, this review has explored the prospective applications of LLMs in drug toxicology. Although LLMs have demonstrated substantial potential in automated literature mining, data integration, and toxicity prediction, further research is required to improve their capabilities in causal inference and model interpretability.
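The “component-target-pathway” network described above can be sketched as two mappings joined on shared targets. This is an illustrative sketch only: all identifiers are placeholders rather than real compound, gene, or pathway annotations, and ranking pathways by the number of supporting components is an assumed, deliberately simple prioritization rule.

```python
from collections import defaultdict

def pathway_support(component_targets, target_pathways):
    """Map each pathway to the set of components that reach it via a target."""
    support = defaultdict(set)
    for component, targets in component_targets.items():
        for target in targets:
            for pathway in target_pathways.get(target, ()):
                support[pathway].add(component)
    return dict(support)

def rank_pathways(support):
    """Sort pathways by number of supporting components; ties broken by name."""
    return sorted(support, key=lambda p: (-len(support[p]), p))

# Placeholder identifiers only (hypothetical network).
component_targets = {
    "component_A": {"target_1", "target_2"},
    "component_B": {"target_2"},
    "component_C": {"target_3"},
}
target_pathways = {
    "target_1": {"pathway_X"},
    "target_2": {"pathway_X", "pathway_Y"},
    "target_3": {"pathway_Z"},
}
support = pathway_support(component_targets, target_pathways)
ranking = rank_pathways(support)
```

In practice, the edges would come from curated databases and the ranking would typically rely on enrichment statistics or network topology measures rather than raw counts.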

Reflecting on current research progress, toxicity prediction methodologies still face several significant challenges: (i) Existing databases are limited in both data quality and coverage, and lack reliable data for novel compounds such as multi-component TCMs or multi-target drugs. (ii) The multidimensional nature of toxicity mechanisms (including metabolic pathways, target interactions, and dynamic cellular responses) remains difficult to characterize fully with a single predictive model. (iii) The “black-box” nature of many ML models hinders the biological interpretability of their predictions, thereby limiting their application in clinical decision-making. Future research therefore needs to integrate multidisciplinary techniques, including network toxicology, systems biology, and AI, to develop dynamic and interpretable toxicity prediction frameworks that more accurately reflect real biological systems. Specifically, future efforts should focus on the following directions:

Future development of prediction methods must prioritize rigorous, transparent, and continuous benchmarking against high-quality in vivo data. Building upon the foundation laid by the Tox21 initiative, we recommend the establishment of a community-driven, iterative evaluation framework. This framework would involve: (i) utilizing existing animal data not just for initial training but as a gold standard for ongoing model validation and refinement in an iterative loop; (ii) creating blinded challenge sets containing novel compounds to impartially assess the generalizability and predictive power of new algorithms against established benchmarks; and (iii) expanding the scope of toxicity endpoints beyond HTS outcomes to include more complex adverse outcomes, thus driving the development of models that can better predict human-relevant toxicity pathways.
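As a rough illustration of the blinded challenge set in point (ii), the sketch below scores a submission against labels that only the organizer holds. The compound identifiers and labels are hypothetical, and the Matthews correlation coefficient is an assumed choice of metric that the text does not prescribe.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def score_submission(hidden_labels, submission):
    """Score predictions (compound_id -> 0/1) against organizer-held labels."""
    missing = set(hidden_labels) - set(submission)
    if missing:
        raise ValueError(f"no prediction for: {sorted(missing)}")
    ids = sorted(hidden_labels)
    return matthews_corrcoef([hidden_labels[i] for i in ids],
                             [submission[i] for i in ids])

# Hypothetical challenge set: labels stay with the organizer.
hidden_labels = {"c1": 1, "c2": 1, "c3": 0, "c4": 0, "c5": 1, "c6": 0}
submission = {"c1": 1, "c2": 0, "c3": 0, "c4": 0, "c5": 1, "c6": 1}
mcc = score_submission(hidden_labels, submission)
```

Because participants never see `hidden_labels`, the score reflects generalization to unseen compounds rather than fit to a public benchmark.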

Multi-level modeling integrated with systems toxicology: Leveraging systems biology and experimental validation data, researchers should build comprehensive, multidimensional toxicity prediction models spanning molecular, cellular, organ, and organismal levels. For instance, single-cell sequencing combined with metabolomics could dynamically simulate drug distribution and metabolism in vivo, revealing fundamental mechanisms underpinning toxic reactions.

Deepening and innovating AI technologies: Natural language processing techniques can be employed to extract toxicological associations from unstructured literature, establishing dynamically updated toxicity knowledge networks to optimize predictive models. Furthermore, inverse design of low-toxicity molecules and toxicity avoidance strategies could be realized through molecular graph representations and GANs. Applying post hoc explanation methods (such as SHAP and LIME) alongside inherently transparent models will further enhance the biological interpretability of model predictions.
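SHAP attributions are based on Shapley values from cooperative game theory. To show what such an attribution means, the sketch below computes exact Shapley values by enumerating all feature subsets. The toy scoring function and its substructure features are hypothetical, and exhaustive enumeration is feasible only for a handful of features; SHAP itself relies on approximations and model-specific algorithms.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # marginal contribution of feature i to coalition s
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi

def toy_toxicity_score(present):
    """Hypothetical score: two additive terms plus one interaction term."""
    score = 0.0
    if "nitro" in present:
        score += 0.4
    if "halogen" in present:
        score += 0.1
    if "nitro" in present and "aromatic_amine" in present:
        score += 0.2  # interaction, shared equally between the two features
    return score

features = ["nitro", "halogen", "aromatic_amine"]
phi = shapley_values(toy_toxicity_score, features)
```

The attributions sum to the difference between the score with all features present and the score with none, which is the efficiency property that makes Shapley-based explanations additive.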

Modernizing toxicity evaluation of TCM: Given the inherent complexity of multicomponent and multitarget TCM formulas, it is imperative to establish integrated evaluation frameworks based on network toxicology. These frameworks should incorporate multidimensional “component-target-pathway-phenotype” data, combining traditional empirical toxicity knowledge with advanced ML models to effectively resolve challenges in tracing toxicity sources and clarifying dose-effect relationships within TCM preparations.

Data standardization and cross-platform collaboration: Promoting standardized global mechanisms for collecting and sharing toxicological data, building comprehensive toxicity databases that cover broader chemical spaces (e.g. natural products and nanomedicines), and applying transfer learning and federated learning could significantly mitigate the overfitting that arises from limited sample sizes.
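To make the federated learning idea concrete, the sketch below shows only the aggregation step of a FedAvg-style scheme: each site trains locally and shares model parameters rather than raw toxicity records, and a coordinator averages them weighted by local sample size. The site parameters and sizes are hypothetical, and local training, communication, and privacy safeguards are omitted.

```python
def federated_average(site_weights, site_sizes):
    """Sample-size-weighted average of per-site parameter vectors."""
    if len(site_weights) != len(site_sizes) or not site_weights:
        raise ValueError("need one non-empty parameter vector per site")
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]

# Hypothetical: three institutions, each with a two-parameter local model.
site_weights = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
site_sizes = [100, 300, 100]  # number of local training compounds per site
global_weights = federated_average(site_weights, site_sizes)
```

Weighting by sample size means a site with more compounds has proportionally more influence on the shared model, which is one common choice among several.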

In conclusion, continuous technological innovations and advancing research endeavors are anticipated to overcome existing challenges and drive further developments in computational toxicology. Such progress promises to provide more efficient and precise toxicity evaluation tools for drug development, facilitating successful drug discovery and clinical application, and ultimately propelling drug research towards a more precise and intelligent new era.

Declaration of generative AI and AI-assisted technologies in the writing process

During the preparation of this manuscript, the authors utilized DeepSeek R1 to enhance the language and readability. Following the use of this service, the authors thoroughly reviewed and revised the content as necessary, and take full responsibility for the final version of the publication.

Key Points

  • ADMET tools have improved with machine learning and graph neural networks. Future focus should be on multi-endpoint predictions integrating chemical, clinical, and multi-omics data, while addressing data quality and interpretability, particularly for complex multi-target drugs.

  • Toxicity prediction is moving towards multi-endpoint modeling and network toxicology. Future research should integrate genomics and metabolomics data to enhance understanding of toxicity mechanisms and improve model interpretability for regulatory and clinical applications.

  • LLMs have potential in literature mining, data integration, and molecular toxicity prediction. Future work should focus on optimizing LLMs for toxicology tasks and integrating them with multi-omics and network toxicology models to improve toxicity screening and safety evaluations.

Supplementary Material

Supplementary_Table_1_the_detail_information_of_ADMET_platforms_bbaf533

Acknowledgements

We would like to thank the anonymous reviewers for valuable suggestions.

Contributor Information

Jiangyan Zhang, School of Pharmacy/School of Modern Chinese Medicine Industry, Chengdu University of Traditional Chinese Medicine, No. 1166, Liutai Avenue, Wenjiang District, Chengdu City, Sichuan Province, 611137, China.

Haolin Li, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, No. 1166, Liutai Avenue, Wenjiang District, Chengdu City, Sichuan Province, 611137, China.

Yuncong Zhang, Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai Institute of Translational Medicine, Zhuhai People's Hospital (The Affiliated Hospital of Beijing Institute of Technology, Zhuhai Clinical Medical College of Jinan University), No. 79, Kangning Road, Xiangzhou District, Zhuhai City, Guangdong Province, 519000, China.

Junyang Huang, Department of Ophthalmology, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, No. 32, Section 2, West Yihuan Road, Qingyang District, Chengdu, Sichuan Province, 610072, China.

Liping Ren, School of Healthcare Technology, Chengdu Neusoft University, No. 1, Neusoft Avenue, Qingchengshan Town, Dujiangyan City, Chengdu, Sichuan Province, 611844, China.

Chuantao Zhang, Department of Respiratory Medicine, Hospital of Chengdu University of Traditional Chinese Medicine, No. 39, Shi'erqiao Road, Jinniu District, Chengdu, Sichuan Province, 610072, China.

Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, High-tech Zone (West Zone), Chengdu, Sichuan Province, 611731, China.

Yang Zhang, Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, No. 1166, Liutai Avenue, Wenjiang District, Chengdu City, Sichuan Province, 611137, China.

Author contributions

Y.Z., Q.Z. and C.Z. conceived the manuscript and outlined it. J.Z., Y.C.Z. and H.L. conducted the literature search and wrote the draft. J.H. and L.R. reviewed and edited the draft. All authors have approved the final review and the submission.

Conflict of interest: None declared.

Funding

This work was supported by the National Natural Science Foundation of China (62471071, 62202069) and the Chengdu Health Commission-Chengdu University of Traditional Chinese Medicine Joint Research Fund (WXLH202402041).

Availability of data and materials

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

References

  • 1. Pammolli  F, Magazzini  L, Riccaboni  M. The productivity crisis in pharmaceutical R&D. Nat Rev Drug Discov  2011;10:428–38. 10.1038/nrd3405. [DOI] [PubMed] [Google Scholar]
  • 2. Dowden  H, Munro  J. Trends in clinical success rates and therapeutic focus. Nat Rev Drug Discov  2019;18:495–6. 10.1038/d41573-019-00074-z. [DOI] [PubMed] [Google Scholar]
  • 3. Tran  TTV, Surya Wibowo  A, Tayara  H, et al.  Artificial intelligence in drug toxicity prediction: recent advances, challenges, and future perspectives. J Chem Inf Model  2023;63:2628–43. 10.1021/acs.jcim.3c00200. [DOI] [PubMed] [Google Scholar]
  • 4. Parboosing  R, Mzobe  G, Chonco  L, et al.  Cell-based assays for assessing toxicity: A basic guide. Med Chem  2016;13:13–21. 10.2174/1573406412666160229150803. [DOI] [PubMed] [Google Scholar]
  • 5. Khabib  MNH, Sivasanku  Y, Lee  HB, et al.  Alternative animal models in predictive toxicology. Toxicology  2022;465:153053. 10.1016/j.tox.2021.153053. [DOI] [PubMed] [Google Scholar]
  • 6. Wang  N, Li  X, Xiao  J, et al.  Data-driven toxicity prediction in drug discovery: current status and future directions. Drug Discov Today  2024;29:104195. 10.1016/j.drudis.2024.104195. [DOI] [PubMed] [Google Scholar]
  • 7. Prior  H, Sewell  F, Stewart  J. Overview of 3Rs opportunities in drug discovery and development using non-human primates. Drug Discov Today Dis Model  2017;23:11–6. [Google Scholar]
  • 8. Ekins  S. Progress in computational toxicology. J Pharmacol Toxicol Methods  2014;69:115–40. 10.1016/j.vascn.2013.12.003. [DOI] [PubMed] [Google Scholar]
  • 9. Wang  Y, Zeng  T, Tang  D, et al.  Integrated multi-omics analyses reveal lipid metabolic signature in osteoarthritis. J Mol Biol 2025;437:168888. [DOI] [PubMed] [Google Scholar]
  • 10. Meng  X, Yan  X, Zhang  K, et al.  The application of large language models in medicine: A scoping review. iScience  2024;27:109713. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Momanyi  BM, Zhou  YW, Grace-Mercure  BK, et al.  SAGESDA: multi-GraphSAGE networks for predicting SnoRNA-disease associations. Curr Res Struct Biol  2024;7:100122. 10.1016/j.crstbi.2023.100122. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Guengerich  FP. Mechanisms of drug toxicity and relevance to pharmaceutical development. Drug Metab Pharmacokinet  2011;26:3–14. 10.2133/dmpk.DMPK-10-RV-062. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Liu  Y, Li  H, Zeng  T, et al.  Integrated bulk and single-cell transcriptomes reveal pyroptotic signature in prognosis and therapeutic options of hepatocellular carcinoma by combining deep learning. Brief Bioinform  2023;25:bbad487. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Mulliner  D, Schmidt  F, Stolte  M, et al.  Computational models for human and animal hepatotoxicity with a global application scope. Chem Res Toxicol  2016;29:757–67. 10.1021/acs.chemrestox.5b00465. [DOI] [PubMed] [Google Scholar]
  • 15. Khan  MZI, Ren  JN, Cao  C, et al.  Comprehensive hepatotoxicity prediction: ensemble model integrating machine learning and deep learning. Front Pharmacol  2024;15:1441587. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Liu  J, Khan  MKH, Guo  W, et al.  Machine learning and deep learning approaches for enhanced prediction of hERG blockade: A comprehensive QSAR modeling study. Expert Opin Drug Metab Toxicol  2024;20:665–84. 10.1080/17425255.2024.2377593. [DOI] [PubMed] [Google Scholar]
  • 17. Mahapatra  M, Sahu  C, Mohapatra  S. Trends of artificial intelligence (AI) use in drug targets, discovery and development: current status and future perspectives. Curr Drug Targets 2025;26:221–42. [DOI] [PubMed] [Google Scholar]
  • 18. Zhang  Y, Liu  C, Liu  M, et al.  Attention is all you need: utilizing attention in AI-enabled drug discovery. Brief Bioinform 2023;25:bbad467. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Zulfiqar  H, Guo  Z, Ahmad  RM, et al.  Deep-STP: A deep learning-based approach to predict snake toxin proteins by using word embeddings. Front Med (Lausanne)  2024;10:1291352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Zhang  L, Zhang  H, Ai  H, et al.  Applications of machine learning methods in drug toxicity prediction. Curr Top Med Chem  2018;18:987–97. 10.2174/1568026618666180727152557. [DOI] [PubMed] [Google Scholar]
  • 21. Wu  Z, Jiang  D, Wang  J, et al.  Mining toxicity information from large amounts of toxicity data. J Med Chem  2021;64:6924–36. 10.1021/acs.jmedchem.1c00421. [DOI] [PubMed] [Google Scholar]
  • 22. Wu  Z, Chen  J, Li  Y, et al.  From black boxes to actionable insights: A perspective on explainable artificial intelligence for scientific discovery. J Chem Inf Model  2023;63:7617–27. 10.1021/acs.jcim.3c01642. [DOI] [PubMed] [Google Scholar]
  • 23. Guo  W, Liu  J, Dong  F, et al.  Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood)  2023;248:1952–73. 10.1177/15353702231209421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Masarone  S, Beckwith  KV, Wilkinson  MR, et al.  Advancing predictive toxicology: overcoming hurdles and shaping the future. Digit Discov  2024;4:303–15. [Google Scholar]
  • 25. Sharma  B, Chenthamarakshan  V, Dhurandhar  A, et al.  Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations. Sci Rep  2023;13:4908. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Wu  P, Lin  S, Cao  G, et al.  Absorption, distribution, metabolism, excretion and toxicity of microplastics in the human body and health implications. J Hazard Mater  2022;437:129361. 10.1016/j.jhazmat.2022.129361. [DOI] [PubMed] [Google Scholar]
  • 27. Waring  MJ, Arrowsmith  J, Leach  AR, et al.  An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nat Rev Drug Discov  2015;14:475–86. 10.1038/nrd4609. [DOI] [PubMed] [Google Scholar]
  • 28. Adamson  RH. The acute lethal dose 50 (LD50) of caffeine in albino rats. Regul Toxicol Pharmacol  2016;80:274–6. 10.1016/j.yrtph.2016.07.011. [DOI] [PubMed] [Google Scholar]
  • 29. Yi  J-C, Yang  Z-Y, Zhao  W-T, et al.  ChemMORT: an automatic ADMET optimization platform using deep learning and multi-objective particle swarm optimization. Brief Bioinform  2024;25:bbae008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Daina  A, Michielin  O, Zoete  V. SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep  2017;7:42717. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Gu  Y, Yu  Z, Wang  Y, et al.  admetSAR3.0: A comprehensive platform for exploration, prediction and optimization of chemical ADMET properties. Nucleic Acids Res  2024;52:W432–W438. 10.1093/nar/gkae298. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Fu  L, Shi  S, Yi  J, et al.  ADMETlab 3.0: an updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support. Nucleic Acids Res  2024;52:W422–W431. 10.1093/nar/gkae236. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Schyman  P, Liu  R, Desai  V, et al.  vNN web server for ADMET predictions. Front Pharmacol  2017;8:889. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Tian  H, Ketkar  R, Tao  P. ADMETboost: A web server for accurate ADMET prediction. J Mol Model  2022;28:408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Lagorce  D, Bouslama  L, Becot  J, et al.  FAF-Drugs4: free ADME-tox filtering computations for chemical biology and early stages drug discovery. Bioinformatics  2017;33:3658–60. 10.1093/bioinformatics/btx491. [DOI] [PubMed] [Google Scholar]
  • 36. Swanson  K, Walther  P, Leitz  J, et al.  ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. Bioinformatics 2024;40:btae416. 10.1101/2023.12.28.573531. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Banerjee  P, Kemmler  E, Dunkel  M, et al.  ProTox 3.0: A webserver for the prediction of toxicity of chemicals. Nucleic Acids Res  2024;52:W513–W520. 10.1093/nar/gkae303. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Di Stefano  M, Galati  S, Piazza  L, et al.  VenomPred 2.0: A novel In Silico platform for an extended and human interpretable toxicological profiling of small molecules. J Chem Inf Model  2024;64:2275–89. 10.1021/acs.jcim.3c00692. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Shao  CY, Su  BH, Tu  YS, et al.  CypRules: A rule-based P450 inhibition prediction server. Bioinformatics  2015;31:1869–71. 10.1093/bioinformatics/btv043. [DOI] [PubMed] [Google Scholar]
  • 40. Wishart  DS, Tian  S, Allen  D, et al.  BioTransformer 3.0-a web server for accurately predicting metabolic transformation products. Nucleic Acids Res  2022;50:W115–W123. 10.1093/nar/gkac313. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Matlock  MK, Hughes  TB, Swamidass  SJ. XenoSite server: A web-available site of metabolism prediction tool. Bioinformatics  2015;31:1136–7. 10.1093/bioinformatics/btu761. [DOI] [PubMed] [Google Scholar]
  • 42. Rudik  A, Dmitriev  A, Lagunin  A, et al.  SOMP: web server for in silico prediction of sites of metabolism for drug-like compounds. Bioinformatics  2015;31:2046–8. 10.1093/bioinformatics/btv087. [DOI] [PubMed] [Google Scholar]
  • 43. Olsen  L, Montefiori  M, Tran  KP, et al.  SMARTCyp 3.0: enhanced cytochrome P450 site-of-metabolism prediction server. Bioinformatics  2019;35:3174–5. 10.1093/bioinformatics/btz037. [DOI] [PubMed] [Google Scholar]
  • 44. Zhang  Y, Pan  X, Shi  T, et al.  P450Rdb: A manually curated database of reactions catalyzed by cytochrome P450 enzymes. J Adv Res  2024;63:35–42. 10.1016/j.jare.2023.10.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Zou  X, Ren  L, Cai  P, et al.  Accurately identifying hemagglutinin using sequence information and machine learning methods. Front Med (Lausanne)  2023;10:1281880. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Manavalan  B, Lee  J. FRTpred: A novel approach for accurate prediction of protein folding rate and type. Comput Biol Med  2022;149:105911. 10.1016/j.compbiomed.2022.105911. [DOI] [PubMed] [Google Scholar]
  • 47. Basith  S, Pham  NT, Manavalan  B, et al.  SEP-AlgPro: an efficient allergen prediction tool utilizing traditional machine learning and deep learning techniques with protein language model features. Int J Biol Macromol  2024;273:133085. 10.1016/j.ijbiomac.2024.133085. [DOI] [PubMed] [Google Scholar]
  • 48. Pham  NT, Zhang  Y, Rakkiyappan  R, et al.  HOTGpred: enhancing human O-linked threonine glycosylation prediction using integrated pretrained protein language model-based features and multi-stage feature selection approach. Comput Biol Med  2024;179:108859. 10.1016/j.compbiomed.2024.108859. [DOI] [PubMed] [Google Scholar]
  • 49. Zheng  L, Liu  D, Li  YA, et al.  RaacFold: A webserver for 3D visualization and analysis of protein structure by using reduced amino acid alphabets. Nucleic Acids Res  2022;50:W633–8. 10.1093/nar/gkac415. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Shi  XX, Wang  ZZ, Wang  YL, et al.  AquaticTox: A web-based tool for aquatic toxicity evaluation based on ensemble learning to facilitate the screening of green chemicals. Environ Health (Wash)  2024;2:202–11. 10.1021/envhealth.4c00014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Pires  DE, Blundell  TL, Ascher  DB. pkCSM: predicting small-molecule pharmacokinetic and toxicity properties using graph-based signatures. J Med Chem  2015;58:4066–72. 10.1021/acs.jmedchem.5b00104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Wei  Y, Li  S, Li  Z, et al.  Interpretable-ADMET: A web service for ADMET prediction and optimization based on deep neural representation. Bioinformatics  2022;38:2863–71. 10.1093/bioinformatics/btac192. [DOI] [PubMed] [Google Scholar]
  • 53. Zhang  S, Yan  Z, Huang  Y, et al.  HelixADMET: A robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer. Bioinformatics  2022;38:3444–53. 10.1093/bioinformatics/btac342. [DOI] [PubMed] [Google Scholar]
  • 54. Hsiao  Y, Su  BH, Tseng  YJ. Current development of integrated web servers for preclinical safety and pharmacokinetics assessments in drug development. Brief Bioinform  2021;22:bbaa160. [DOI] [PubMed] [Google Scholar]
  • 55. Venkatraman  V. FP-ADMET: A compendium of fingerprint-based ADMET prediction models. J Cheminform  2021;13:75. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Wei  M, Zhang  X, Pan  X, et al.  HobPre: accurate prediction of human oral bioavailability for small molecules. J Cheminform  2022;14:1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Yi  J, Shi  S, Fu  L, et al.  OptADMET: A web-based tool for substructure modifications to improve ADMET properties of lead compounds. Nat Protoc  2024;19:1105–21. 10.1038/s41596-023-00942-4. [DOI] [PubMed] [Google Scholar]
  • 58. Myung  Y, de  Sá  AGC, Ascher  DB. Deep-PK: deep learning for small molecule pharmacokinetic and toxicity prediction. Nucleic Acids Res  2024;52:W469–W475. 10.1093/nar/gkae254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. George  J, Singh  R, Mahmood  Z, et al.  Toxicoproteomics: new paradigms in toxicology research. Toxicol Mech Methods  2010;20:415–23. 10.3109/15376511003667842. [DOI] [PubMed] [Google Scholar]
  • 60. Tanoli  Z, Fernández-Torras  A, Özcan  UO, et al.  Computational drug repurposing: approaches, evaluation of in silico resources and case studies. Nat Rev Drug Discov  2025;24:521–42. 10.1038/s41573-025-01164-x. [DOI] [PubMed] [Google Scholar]
  • 61. Kim  S, Chen  J, Cheng  T, et al.  PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res  2021;49:D1388–D1395. 10.1093/nar/gkaa971. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Zdrazil  B, Felix  E, Hunter  F, et al.  The ChEMBL database in 2023: A drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res  2024;52:D1180–D1192. 10.1093/nar/gkad1004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Wu  L, Yan  B, Han  J, et al.  TOXRIC: A comprehensive database of toxicological data and benchmarks. Nucleic Acids Res  2023;51:D1432–D1445. 10.1093/nar/gkac1074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Knox  C, Wilson  M, Klinger  CM, et al.  DrugBank 6.0: the DrugBank knowledgebase for 2024. Nucleic Acids Res  2024;52:D1265–D1275. 10.1093/nar/gkad976. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Schmidt  U, Struck  S, Gruening  B, et al.  SuperToxic: A comprehensive database of toxic compounds. Nucleic Acids Res  2009;37:D295–9. 10.1093/nar/gkn850. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Williams  AJ, Grulke  CM, Edwards  J, et al.  The CompTox chemistry dashboard: A community data resource for environmental chemistry. J Cheminform  2017;9:61. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Tomasulo  P. ChemIDplus-super source for chemical and drug information. Med Ref Serv Q  2002;21:53–9. 10.1300/J115v21n01_04. [DOI] [PubMed] [Google Scholar]
  • 68. Fonger  GC, Hakkinen  P, Jordan  S, et al.  The National Library of Medicine's (NLM) hazardous substances data Bank (HSDB): background, recent enhancements and future plans. Toxicology  2014;325:209–16. 10.1016/j.tox.2014.09.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Chen  M, Suzuki  A, Thakkar  S, et al.  DILIrank: the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans. Drug Discov Today  2016;21:648–53. 10.1016/j.drudis.2016.02.015. [DOI] [PubMed] [Google Scholar]
  • 70. Thakkar  S, Li  T, Liu  Z, et al.  Drug-induced liver injury severity and toxicity (DILIst): binary classification of 1279 drugs by human hepatotoxicity. Drug Discov Today  2020;25:201–8. 10.1016/j.drudis.2019.09.022. [DOI] [PubMed] [Google Scholar]
  • 71. Chen  M, Vijay  V, Shi  Q, et al.  FDA-approved drug labeling for the study of drug-induced liver injury. Drug Discov Today  2011;16:697–703. 10.1016/j.drudis.2011.05.007. [DOI] [PubMed] [Google Scholar]
  • 72. Du  F, Yu  H, Zou  B, et al.  hERGCentral: A large database to store, retrieve, and analyze compound-human ether-à-go-go related gene channel interactions to facilitate cardiotoxicity assessment in drug development. Assay Drug Dev Technol  2011;9:580–8. 10.1089/adt.2011.0425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Fitzpatrick  RB. CPDB: carcinogenic potency database. Med Ref Serv Q  2008;27:303–11. 10.1080/02763860802198895. [DOI] [PubMed] [Google Scholar]
  • 74. Sushko  I, Novotarskyi  S, Körner  R, et al.  Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput Aided Mol Des  2011;25:533–54. 10.1007/s10822-011-9440-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Wishart  D, Arndt  D, Pon  A, et al.  T3DB: the toxic exposome database. Nucleic Acids Res  2015;43:D928–34. 10.1093/nar/gku1004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Bell  SM, Phillips  J, Sedykh  A, et al.  An integrated chemical environment to support 21st-century toxicology. Environ Health Perspect  2017;125:054501. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77. Kuhn  M, Letunic  I, Jensen  LJ, et al.  The SIDER database of drugs and side effects. Nucleic Acids Res  2015;44:D1075–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Tanaka  Y, Chen  HY, Belloni  P, et al.  OnSIDES database: extracting adverse drug events from drug labels using natural language processing models. Med  2025;6:100642. 10.1016/j.medj.2025.100642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Lindquist  M. VigiBase, the WHO global ICSR database system: basic facts. Drug Inf J  2008;42:409–19. 10.1177/009286150804200501. [DOI] [Google Scholar]
  • 80. Shimabukuro  TT, Nguyen  M, Martin  D, et al.  Safety monitoring in the vaccine adverse event reporting system (VAERS). Vaccine  2015;33:4398–405. 10.1016/j.vaccine.2015.07.035. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Olker  JH, Elonen  CM, Pilli  A, et al.  The ECOTOXicology knowledgebase: A curated database of ecologically relevant toxicity tests to support environmental research and risk assessment. Environ Toxicol Chem  2022;41:1520–39. 10.1002/etc.5324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82. Wexler  P. TOXNET: an evolving web resource for toxicology and environmental health information. Toxicology  2001;157:3–10. 10.1016/S0300-483X(00)00337-1. [DOI] [PubMed] [Google Scholar]
  • 83. Connors  KA, Beasley  A, Barron  MG, et al.  Creation of a curated aquatic toxicology database: EnviroTox. Environ Toxicol Chem  2019;38:1062–73. 10.1002/etc.4382. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Lewis  KA, Tzilivakis  J, Warner  DJ, et al.  An international database for pesticide risk assessments and management. Hum Ecol Risk Assess  2016;22:1050–64. [Google Scholar]
  • 85. Zhang  Y, Yang  Y, Ren  L, et al.  Predicting intercellular communication based on metabolite-related ligand-receptor interactions with MRCLinkdb. BMC Biol  2024;22:152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86. Attene-Ramos  MS, Miller  N, Huang  R, et al.  The Tox21 robotic platform for the assessment of environmental chemicals--from vision to reality. Drug Discov Today  2013;18:716–23. 10.1016/j.drudis.2013.05.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87. Igarashi  Y, Nakatsu  N, Yamashita  T, et al.  Open TG-GATEs: A large-scale toxicogenomics database. Nucleic Acids Res  2015;43:D921–7. 10.1093/nar/gku955. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88. Nair  SK, Eeles  C, Ho  C, et al.  ToxicoDB: an integrated database to mine and visualize large-scale toxicogenomic datasets. Nucleic Acids Res  2020;48:W455–W462. 10.1093/nar/gkaa390. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89. Smith  AJ. Norecopa: A global knowledge base of resources for improving animal research and testing. Front Vet Sci  2023;10:1119923. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Jennen  DGJ, Magkoufopoulou  C, Ketelslegers  HB, et al.  Comparison of HepG2 and HepaRG by whole-genome gene expression analysis for the purpose of chemical Hazard identification. Toxicol Sci  2010;115:66–79. 10.1093/toxsci/kfq026. [DOI] [PubMed] [Google Scholar]
  • 91. Davis  AP, Grondin  CJ, Johnson  RJ, et al.  Comparative Toxicogenomics Database (CTD): update 2021. Nucleic Acids Res  2021;49:D1138–D1143. 10.1093/nar/gkaa891. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Wang  G, Wu  H, Liao  Y, et al.  BioTD: an online database of biotoxins. arXiv [Preprint] 2412.20038 [q-bio.BM]. 10.48550/arXiv.2412.20038. [DOI]
  • 93. Zhang  D, Tian  Y, Tian  Y, et al.  A data-driven integrative platform for computational prediction of toxin biotransformation with a case study. J Hazard Mater  2021;408:124810. 10.1016/j.jhazmat.2020.124810. [DOI] [PubMed] [Google Scholar]
  • 94. Günthardt  BF, Hollender  J, Hungerbühler  K, et al.  Comprehensive toxic plants-Phytotoxins database and its application in assessing aquatic micropollution potential. J Agric Food Chem  2018;66:7577–88. 10.1021/acs.jafc.8b01639. [DOI] [PubMed] [Google Scholar]
  • 95. Habauzit  D, Lemée  P, Fessard  V. MycoCentral: an innovative database to compile information on mycotoxins and facilitate hazard prediction. Food Control  2024;159:110273. 10.1016/j.foodcont.2023.110273. [DOI] [Google Scholar]
  • 96. He  QY, He  QZ, Deng  XC, et al.  ATDB: A uni-database platform for animal toxins. Nucleic Acids Res  2008;36:D293–7. 10.1093/nar/gkm832. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Tan  PT, Veeramani  A, Srinivasan  KN, et al.  SCORPION2: A database for structure-function analysis of SCORPION toxins. Toxicon  2006;47:356–63. 10.1016/j.toxicon.2005.12.001. [DOI] [PubMed] [Google Scholar]
  • 98. Kaas  Q, Yu  R, Jin  AH, et al.  ConoServer: updated content, knowledge, and discovery tools in the conopeptide database. Nucleic Acids Res  2012;40:D325–30. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99. Ji  J, Zhang  D, Ye  J, et al.  MycotoxinDB: A data-driven platform for investigating masked forms of mycotoxins. J Agric Food Chem  2023;71:9501–7. 10.1021/acs.jafc.3c01403. [DOI] [PubMed] [Google Scholar]
  • 100. Danov  A, Segev  O, Bograd  A, et al.  Toxinome-the bacterial protein toxin database. MBio  2024;15:e0191123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. Rodea-Palomares  I, Bone  AJ. Predictive value of the ToxCast/Tox21 high throughput toxicity screening data for approximating in vivo ecotoxicity endpoints and ecotoxicological risk in eco-surveillance applications. Sci Total Environ  2024;914:169783. [DOI] [PubMed] [Google Scholar]
  • 102. Ahmed  Z, Shahzadi  K, Jin  Y, et al.  Identification of RNA-dependent liquid-liquid phase separation proteins using an artificial intelligence strategy. Proteomics  2024;24:2400044. [DOI] [PubMed] [Google Scholar]
  • 103. Ma  J, Motsinger-Reif  A. Prediction of synergistic drug combinations using PCA-initialized deep learning. BioData Mining  2021;14:46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104. Xu  Y, Liu  T, Yang  Y, et al.  ACVPred: enhanced prediction of anti-coronavirus peptides by transfer learning combined with data augmentation. Futur Gener Comput Syst  2024;160:305–15. 10.1016/j.future.2024.06.008. [DOI] [Google Scholar]
  • 105. Liu  T, Qiao  H, Wang  Z, et al.  CodLncScape provides a self-enriching framework for the systematic collection and exploration of coding LncRNAs. Adv Sci  2024;11:2400009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106. Liu  T, Huang  J, Luo  D, et al.  Cm-siRPred: predicting chemically modified siRNA efficiency based on multi-view learning strategy. Int J Biol Macromol  2024;264:130638. 10.1016/j.ijbiomac.2024.130638. [DOI] [PubMed] [Google Scholar]
  • 107. Gangwal  A, Lavecchia  A. Artificial intelligence in natural product drug discovery: current applications and future perspectives. J Med Chem  2025;68:3948–69. 10.1021/acs.jmedchem.4c01257. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108. Jaganathan  K, Tayara  H, Chong  KT. An explainable supervised machine learning model for predicting respiratory toxicity of chemicals using optimal molecular descriptors. Pharmaceutics  2022;14:832. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Ribeiro  MT, Singh  S, Guestrin  C. "Why should I trust you?": explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, 1135–44. arXiv[Preprint]:1602.04938[cs.LG]. 10.48550/arXiv.1602.04938. [DOI]
  • 110. Gal  Y, Ghahramani  Z. Dropout as a Bayesian approximation: representing model uncertainty in deep learning. Published in ICML 2016. arXiv[Preprint]:1506.02142. 10.48550/arXiv.1506.02142. [DOI] [Google Scholar]
  • 111. Abdar  M, Pourpanah  F, Hussain  S, et al.  A review of uncertainty quantification in deep learning: techniques, applications and challenges. Information Fusion  2021;76:243–97. 10.1016/j.inffus.2021.05.008. [DOI] [Google Scholar]
  • 112. Wang  S, Sun  H, Liu  H, et al.  ADMET evaluation in drug discovery. 16. Predicting hERG blockers by combining multiple pharmacophores and machine learning approaches. Mol Pharm  2016;13:2855–66. 10.1021/acs.molpharmaceut.6b00471. [DOI] [PubMed] [Google Scholar]
  • 113. Cai  C, Guo  P, Zhou  Y, et al.  Deep learning-based prediction of drug-induced cardiotoxicity. J Chem Inf Model  2019;59:1073–84. 10.1021/acs.jcim.8b00769. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114. Ryu  JY, Lee  MY, Lee  JH, et al.  DeepHIT: A deep learning framework for prediction of hERG-induced cardiotoxicity. Bioinformatics  2020;36:3049–55. 10.1093/bioinformatics/btaa075. [DOI] [PubMed] [Google Scholar]
  • 115. Arab  I, Egghe  K, Laukens  K, et al.  Benchmarking of small molecule feature representations for hERG, Nav1.5, and Cav1.2 cardiotoxicity prediction. J Chem Inf Model  2024;64:2515–27. 10.1021/acs.jcim.3c01301. [DOI] [PubMed] [Google Scholar]
  • 116. Kyro  GW, Martin  MT, Watt  ED, et al.  CardioGenAI: A machine learning-based framework for re-engineering drugs for reduced hERG liability. J Cheminform  2025;17:30. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117. Guan  J, Dong  D, Xie  P, et al.  StackDILI: enhancing drug-induced liver injury prediction through stacking strategy with effective molecular representations. J Chem Inf Model  2025;65:1027–39. 10.1021/acs.jcim.4c02079. [DOI] [PubMed] [Google Scholar]
  • 118. Lee  S, Yoo  S. InterDILI: interpretable prediction of drug-induced liver injury through permutation feature importance and attention mechanism. J Cheminform  2024;16:1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119. Amin  SA, Kar  S, Piotto  S. pDILI_v1: A web-based machine learning tool for predicting drug-induced liver injury (DILI) integrating chemical space analysis and molecular fingerprints. ACS Omega  2025;10:13502–14. 10.1021/acsomega.5c00075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120. Seal  S, Williams  D, Hosseini-Gerami  L, et al.  Improved detection of drug-induced liver injury by integrating predicted in vivo and in vitro data. Chem Res Toxicol  2024;37:1290–305. 10.1021/acs.chemrestox.4c00015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121. Jin  Y, Shou  Y, Lei  Q, et al.  An entropy weight method to integrate big omics and mechanistically evaluate DILI. Hepatology  2024;79:1264–78. 10.1097/HEP.0000000000000628. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122. Chen  X, Roberts  R, Tong  W, et al.  Tox-GAN: an artificial intelligence approach alternative to animal studies—A case study with Toxicogenomics. Toxicol Sci  2022;186:242–59. 10.1093/toxsci/kfab157. [DOI] [PubMed] [Google Scholar]
  • 123. Shi  Y, Hua  Y, Wang  B, et al.  In Silico prediction and insights into the structural basis of drug induced nephrotoxicity. Front Pharmacol  2022;12:793332. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124. Gong  Y, Teng  D, Wang  Y, et al.  In silico prediction of potential drug-induced nephrotoxicity with machine learning methods. J Appl Toxicol  2022;42:1639–50. 10.1002/jat.4331. [DOI] [PubMed] [Google Scholar]
  • 125. Mazumdar  B, Sarma  PKD, Mahanta  HJ. Predicting renal toxicity of compounds with deep learning and machine learning methods. SN Computer Science  2023;4:812. [Google Scholar]
  • 126. Nguyen-Vo  TH, Bui  L, Do  TTT, et al.  Identifying nephrotoxicity of small molecules using machine learning. In: TENCON 2024–2024 IEEE Region 10 Conference (TENCON). Singapore, Singapore: IEEE, 2024, pp. 482–85.
  • 127. Su  R, Yang  H, Wei  L, et al.  A multi-label learning model for predicting drug-induced pathology in multi-organ based on toxicogenomics data. PLoS Comput Biol  2022;18:e1010402. 10.1371/journal.pcbi.1010402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128. Ryu  JY, Jang  WD, Jang  J, et al.  PredAOT: A computational framework for prediction of acute oral toxicity based on multiple random forest models. BMC Bioinformatics  2023;24:66. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129. Wijeyesakere  SJ, Auernhammer  T, Parks  A, et al.  Profiling mechanisms that drive acute oral toxicity in mammals and its prediction via machine learning. Toxicol Sci  2023;193:18–30. 10.1093/toxsci/kfad025. [DOI] [PubMed] [Google Scholar]
  • 130. Lou  S, Yu  Z, Huang  Z, et al.  In Silico prediction of chemical acute dermal toxicity using explainable machine learning methods. Chem Res Toxicol  2024;37:513–24. 10.1021/acs.chemrestox.4c00012. [DOI] [PubMed] [Google Scholar]
  • 131. Borba  JVB, Alves  VM, Braga  RC, et al.  STopTox: an in Silico alternative to animal testing for acute systemic and topical toxicity. Environ Health Perspect  2022;130:27012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132. Jain  S, Siramshetty  VB, Alves  VM, et al.  Large-scale Modeling of multispecies acute toxicity end points using consensus of multitask deep learning methods. J Chem Inf Model  2021;61:653–63. 10.1021/acs.jcim.0c01164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133. Wang  Y-W, Huang  L, Jiang  S-W, et al.  CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens. Food Chem Toxicol  2020;135:110921. 10.1016/j.fct.2019.110921. [DOI] [PubMed] [Google Scholar]
  • 134. Fradkin  P, Young  A, Atanackovic  L, et al.  A graph neural network approach for molecule carcinogenicity prediction. Bioinformatics  2022;38:i84–91. 10.1093/bioinformatics/btac266. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135. Limbu  S, Dakshanamurthy  S. Predicting chemical carcinogens using a hybrid neural network deep learning method. Sensors (Basel)  2022;22:8185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136. Chen  Z, Zhang  L, Sun  J, et al.  DCAMCP: A deep learning model based on capsule network and attention mechanism for molecular carcinogenicity prediction. J Cell Mol Med  2023;27:3117–26. 10.1111/jcmm.17889. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137. Mittal  A, Mohanty  SK, Gautam  V, et al.  Artificial intelligence uncovers carcinogenic human metabolites. Nat Chem Biol  2022;18:1204–13. 10.1038/s41589-022-01110-7. [DOI] [PubMed] [Google Scholar]
  • 138. Huang  R, Xia  M, Nguyen  D-T, et al.  Tox21Challenge to build predictive models of nuclear receptor and stress response pathways as mediated by exposure to environmental chemicals and drugs. Front Environ Sci  2016;3:85. 10.3389/fenvs.2015.00085. [DOI] [Google Scholar]
  • 139. Mayr  A, Klambauer  G, Unterthiner  T, et al.  DeepTox: toxicity prediction using deep learning. Front Environ Sci  2016;3:80. 10.3389/fenvs.2015.00080. [DOI] [Google Scholar]
  • 140. Jiang  X, Ji  P, Li  S. CensNet: Convolution with Edge-Node Switching in Graph Neural Networks. In: Kraus S (ed.), Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019. Macao, China, August 10-16, 2019, pp. 2656–62.
  • 141. Baek  J, Kang  M, Hwang  SJ. Accurate Learning of Graph Representations with Graph Multiset Pooling. arXiv[Preprint]:2102.11533.
  • 142. Guo  Z, Zhang  C, Yu  W, et al.  Few-shot graph learning for molecular property prediction. In: Proceedings of the Web Conference 2021. Ljubljana, Slovenia. New York, NY, USA: Association for Computing Machinery, 2021, 2559–67.
  • 143. Zhang  X, Wang  H, Du  Z, et al.  CardiOT: towards interpretable drug cardiotoxicity prediction using optimal transport and Kolmogorov–Arnold networks. IEEE J Biomed Health Inform  2025;29:1759–70. 10.1109/JBHI.2024.3510297. [DOI] [PubMed] [Google Scholar]
  • 144. Limbu  S, Dakshanamurthy  S. Predicting chemical carcinogens using a hybrid neural network deep learning method. Sensors  2022;22:8185. 10.3390/s22218185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145. Li  Y, Zhang  Y, Wang  Y, et al.  A strategy for the discovery and validation of toxicity quality marker of Chinese medicine based on network toxicology. Phytomedicine  2019;54:365–70. 10.1016/j.phymed.2018.01.018. [DOI] [PubMed] [Google Scholar]
  • 146. Ren  L, Xu  Y, Ning  L, et al.  TCM2COVID: A resource of anti-COVID-19 traditional Chinese medicine with effects and mechanisms. iMeta  2022;1:e42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147. Yang  C-q, Lai  C-c, Pan  J-c, et al.  Maintaining calcium homeostasis as a strategy to alleviate nephrotoxicity caused by evodiamine. Ecotoxicol Environ Saf  2024;281:116563. 10.1016/j.ecoenv.2024.116563. [DOI] [PubMed] [Google Scholar]
  • 148. Singh  D, Singh  R. Pharmacological and therapeutic potential of a natural flavonoid Icariside II in human complication. Curr Drug Targets  2025;26:320–30. 10.2174/0113894501329810241117231839. [DOI] [PubMed] [Google Scholar]
  • 149. Subhan  I, Siddique  YH. Effect of rotenone on the neurodegeneration among different models. Curr Drug Targets  2024;25:530–42. 10.2174/0113894501281496231226070459. [DOI] [PubMed] [Google Scholar]
  • 150. Fan  X, Zhao  X, Jin  Y, et al.  Network toxicology and its application to traditional Chinese medicine. Zhongguo Zhong Yao Za Zhi  2011;36:2920–2. 10.4268/cjcmm20112104. [DOI] [PubMed] [Google Scholar]
  • 151. Li  S, Zhang  B. Traditional Chinese medicine network pharmacology: theory, methodology and application. Chin J Nat Med  2013;11:110–20. 10.1016/S1875-5364(13)60037-0. [DOI] [PubMed] [Google Scholar]
  • 152. Ru  J, Li  P, Wang  J, et al.  TCMSP: A database of systems pharmacology for drug discovery from herbal medicines. J Cheminform  2014;6:13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 153. Lewis  DF, Bird  MG, Jacobs  MN. Human carcinogens: an evaluation study via the COMPACT and HazardExpert procedures. Hum Exp Toxicol  2002;21:115–22. 10.1191/0960327102ht233oa. [DOI] [PubMed] [Google Scholar]
  • 154. Prival  MJ. Evaluation of the TOPKAT system for predicting the carcinogenicity of chemicals. Environ Mol Mutagen  2001;37:55–69. [DOI] [PubMed] [Google Scholar]
  • 155. Greene  N, Judson  PN, Langowski  JJ, et al.  Knowledge-based expert systems for toxicity and metabolism prediction: DEREK, StAR and METEOR. SAR QSAR Environ Res  1999;10:299–314. [DOI] [PubMed] [Google Scholar]
  • 156. Franz  M, Lopes  CT, Fong  D, et al.  Cytoscape.js 2023 update: a graph theory library for visualization and analysis. Bioinformatics  2023;39. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157. Forli  S, Huey  R, Pique  ME, et al.  Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nat Protoc  2016;11:905–19. 10.1038/nprot.2016.051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158. Xue  R, Fang  Z, Zhang  M, et al.  TCMID: traditional Chinese medicine integrative database for herb molecular mechanism analysis. Nucleic Acids Res  2013;41:D1089–95. 10.1093/nar/gks1100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159. Szklarczyk  D, Gable  AL, Lyon  D, et al.  STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res  2019;47:D607–D613. 10.1093/nar/gky1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 160. Chen  Q, Zhang  K, Jiao  M, et al.  Study on the mechanism of Mesaconitine-induced hepatotoxicity in rats based on Metabonomics and toxicology network. Toxins (Basel)  2022;14:486. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 161. Xi  K, Zhang  M, Li  M, et al.  Unveiling the mechanisms of nephrotoxicity caused by nephrotoxic compounds using toxicological network analysis. Mol Ther Nucleic Acids  2023;34:102075. 10.1016/j.omtn.2023.102075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 162. Lv  L, Wang  X, Wu  H. Assessment of palmitic acid toxicity to animal hearts and other major organs based on acute toxicity, network pharmacology, and molecular docking. Comput Biol Med  2023;158:106899. 10.1016/j.compbiomed.2023.106899. [DOI] [PubMed] [Google Scholar]
  • 163. Tian  Y. Artificial intelligence image recognition method based on convolutional neural network algorithm. IEEE Access  2020;8:125731–44. 10.1109/ACCESS.2020.3006097. [DOI] [Google Scholar]
  • 164. Li  X, Lin  L, Pang  L, et al.  Application and development trends of network toxicology in the safety assessment of traditional Chinese medicine. J Ethnopharmacol  2025;343:119480. 10.1016/j.jep.2025.119480. [DOI] [PubMed] [Google Scholar]
  • 165. Tian  Z, Peng  X, Fang  H, et al.  MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms. Brief Bioinform  2022;23:bbac434. 10.1093/bib/bbac434. [DOI] [PubMed] [Google Scholar]
  • 166. Chen  CY. TCM database@Taiwan: the world's largest traditional Chinese medicine database for drug screening in silico. PLoS One  2011;6:e15939. 10.1371/journal.pone.0015939. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 167. Zhang  RZ, Yu  SJ, Bai  H, et al.  TCM-mesh: the database and analytical system for network pharmacology analysis for TCM preparations. Sci Rep  2017;7:2821. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168. Fang  YC, Huang  HC, Chen  HH, et al.  TCMGeneDIT: A database for associated traditional Chinese medicine, gene and disease information using text mining. BMC Complement Altern Med  2008;8:58. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169. Kang  H, Tang  K, Liu  Q, et al.  HIM-herbal ingredients in-vivo metabolism database. J Cheminform  2013;5:28. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 170. Song  L, Qian  W, Yin  H, et al.  TCMSTD 1.0: A systematic analysis of the traditional Chinese medicine system toxicology database. Sci China Life Sci  2023;66:2189–92. 10.1007/s11427-022-2318-4. [DOI] [PubMed] [Google Scholar]
  • 171. Lv  Q, Chen  G, He  H, et al.  TCMBank-the largest TCM database provides deep learning-based Chinese-Western medicine exclusion prediction. Signal Transduct Target Ther  2023;8:127. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 172. Liu  X, Liu  J, Fu  B, et al.  DCABM-TCM: A database of constituents absorbed into the blood and metabolites of traditional Chinese medicine. J Chem Inf Model  2023;63:4948–59. 10.1021/acs.jcim.3c00365. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173. Kong  X, Liu  C, Zhang  Z, et al.  BATMAN-TCM 2.0: an enhanced integrative database for known and predicted interactions between traditional Chinese medicine ingredients and target proteins. Nucleic Acids Res  2024;52:D1110–D1120. 10.1093/nar/gkad926. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 174. Wei  J, Zhuo  L, Fu  X, et al.  DrugReAlign: A multisource prompt framework for drug repurposing based on large language models. BMC Biol  2024;22:226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 175. Wang  M, Lin  T, Lin  A, et al.  Enhancing diagnostic accuracy in rare and common fundus diseases with a knowledge-rich vision-language model. Nat Commun  2025;16:5528. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 176. Pal  S, Bhattacharya  M, Islam  MA, et al.  ChatGPT or LLM in next-generation drug discovery and development: pharmaceutical and biotechnology companies can make use of the artificial intelligence-based device for a faster way of drug discovery and development. Int J Surg  2023;109:4382–4. 10.1097/JS9.0000000000000719. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 177. Goh  E, Gallo  R, Hom  J, et al.  Large language model influence on diagnostic reasoning: A randomized clinical trial. JAMA Netw Open  2024;7:e2440969. 10.1001/jamanetworkopen.2024.40969. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 178. Silberg  J, Swanson  K, Simon  E, et al.  UniTox: leveraging LLMs to curate a unified dataset of drug-induced toxicity from FDA labels. medRxiv [Preprint] 2024;2024.06.21.24309315. [Google Scholar]
  • 179. Niu  Z, Xiao  X, Wu  W, et al.  PharmaBench: enhancing ADMET benchmarks with large language models. Scientific Data  2024;11:985. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 180. Yang  H, Xiu  J, Yan  W, et al.  Large language models as tools for molecular toxicity prediction: AI insights into cardiotoxicity. J Chem Inf Model  2025;65:2268–82. 10.1021/acs.jcim.4c01371. [DOI] [PubMed] [Google Scholar]
  • 181. Alber  DA, Yang  Z, Alyakin  A, et al.  Medical large language models are vulnerable to data-poisoning attacks. Nat Med  2025;31:618–26. 10.1038/s41591-024-03445-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 182. Hakim  JB, Painter  JL, Ramcharran  D, et al.  The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem. arXiv[Preprint]:2407.18322[cs.CL]. 10.48550/arXiv.2407.18322. [DOI]
  • 183. Ullah  E, Parwani  A, Baig  MM, et al.  Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology—a recent scoping review. Diagn Pathol  2024;19:43. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 184. Rajabi  E, Etminani  K. Knowledge-graph-based explainable AI: A systematic review. J Inf Sci  2024;50:1019–29. 10.1177/01655515221112844. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 185. Wang  C, Li  M, He  J, et al.  A Survey for Large Language Models in Biomedicine. Artif Intell Med  2025;170:103268. 10.1016/j.artmed.2025.103268. [DOI] [PubMed] [Google Scholar]
  • 186. Li  M, Peng  W, Zhu  S, et al.  The role of glycolipids and their toxicity in the context of nanomaterials and nanoparticles: A review of the literature. Curr Drug Targets 2025;26:571–85. 10.2174/0113894501347158250305074908. [DOI] [PubMed] [Google Scholar]
  • 187. Jia  X, Wang  T, Zhu  H. Advancing computational toxicology by interpretable machine learning. Environ Sci Technol  2023;57:17690–706. 10.1021/acs.est.3c00653. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 188. Wang  X, Li  F, Chen  J, et al.  Integration of computational toxicology, Toxicogenomics data mining, and omics techniques to unveil toxicity pathways. ACS Sustain Chem Eng  2021;9:4130–8. 10.1021/acssuschemeng.0c09196. [DOI] [Google Scholar]
  • 189. Ford  KA. Refinement, reduction, and replacement of animal toxicity tests by computational methods. ILAR J  2017;57:226–33. [DOI] [PubMed] [Google Scholar]
  • 190. Kleinstreuer  NC, Tetko  IV, Tong  W. Introduction to special issue: computational toxicology. Chem Res Toxicol  2021;34:171–5. 10.1021/acs.chemrestox.1c00032. [DOI] [PubMed] [Google Scholar]
  • 191. Zhai  Y, Liu  L, Zhang  F, et al.  Network pharmacology: A crucial approach in traditional Chinese medicine research. Chin Med  2025;20:8. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary_Table_1_the_detail_information_of_ADMET_platforms_bbaf533

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.


Articles from Briefings in Bioinformatics are provided here courtesy of Oxford University Press
