Abstract
Exposure to various chemicals found in the environment and in the context of drug development can cause acute toxicity. To provide an alternative to in vivo animal toxicity testing, the U.S. Tox21 consortium developed in vitro assays to test a library of approximately 10,000 drugs and environmental chemicals (Tox21 10K compound library) in a quantitative high-throughput screening (qHTS) approach. In this study, we assessed the utility of Tox21 assay data in comparison with chemical structure information in predicting acute systemic toxicity. Prediction models were developed using four machine learning algorithms, namely Random Forest, Naïve Bayes, eXtreme Gradient Boosting, and Support Vector Machine, and their performance was assessed using the area under the receiver operating characteristic curve (AUC-ROC). Both the chemical structure-based models and the Tox21 assay data-based models demonstrated good predictive power for acute toxicity, achieving AUC-ROC values ranging from 0.83 to 0.93 and from 0.73 to 0.79, respectively. We applied the models to predict the acute toxicity potential of the compounds in the Tox21 10K compound library, most of which were predicted to be non-toxic. In addition, we identified the Tox21 assays that contributed the most to acute toxicity prediction, such as acetylcholinesterase (AChE) inhibition and p53 induction. Chemical features including organophosphates and carbamates were also found to be significantly associated with acute toxicity. In conclusion, this study underscores the utility of in vitro assay data in predicting acute toxicity.
Keywords: Tox21, 10K Compound Library, In Vitro Assay, High-throughput Screening, Acute toxicity
1. Introduction
Exposure to various chemicals found in the environment can cause toxicity in humans (Damalas et al. 2016). The severity of toxicity varies depending on factors such as the type of chemical, its concentration, the route of exposure (inhalation, ingestion, or skin contact), the duration of exposure, and individual susceptibility. For example, pesticides can pose risks to human health and the environment. Symptoms of pesticide toxicity can vary widely and may include irritation of the eyes, skin, and respiratory tract, nausea, vomiting, headache, dizziness, muscle twitching, seizures, and even death in severe cases (Gurung et al. 2017). Moreover, toxicity is a major concern in drug development, as it is one of the important causes of the high failure rate of drug candidates during human clinical trials (Low et al. 2021).
Assessing acute toxicity plays a crucial role in evaluating the safety of existing chemicals and in the creation of new treatments (Lee et al. 2010). The effects of acute toxicity on health can vary from mild, temporary symptoms to severe and potentially life-threatening conditions (Zaitsu et al. 2014, Ye et al. 2022). Immediate symptoms of acute toxicity can range in severity depending on the toxic substance, encompassing nausea, vomiting, dizziness, headaches, respiratory distress, skin irritation, and in severe instances, seizures or loss of consciousness (Yanagisawa et al. 2006). Diverse toxic substances operate through distinct mechanisms of action, having different targets in physiological systems or specific affinity for certain organs in the body (Ashauer et al. 2017). Acute exposure to such substances can harm essential organs such as the liver, kidneys, lungs, heart, nervous system, or other vital bodily systems (Parrón et al. 2014). The extent of health consequences correlates with the degree of exposure: elevated doses or concentrations of a toxic substance usually increase the likelihood of severe symptoms and can escalate to life-threatening situations (Meyer et al. 2015). Although critical to human health, acute toxicity is largely unpredictable owing to multifaceted factors influencing the dose-response relationship, individual differences, exposure route, species variability, and interactions with other substances (Fotiou et al. 2020).
Acute toxicity testing is a pivotal component in environmental toxicity assessment and the innovation of new drugs. Regulatory agencies worldwide mandate acute toxicity testing to guide chemical hazard classification, labeling, and risk management. Traditionally, this testing is performed in vivo, following specific exposure routes (oral, dermal, or inhalation) during a defined observation period. Nevertheless, in vivo acute oral toxicity testing is prohibitively expensive and time-consuming, and it poses significant ethical concerns due to the extensive use of animals. Given the substantial number of new and existing substances requiring evaluation, there is a pressing demand for cost-effective, rapid, and non-animal alternatives (Mansouri et al. 2021).
Quantitative structure-activity relationship (QSAR) modeling and machine learning techniques have gained popularity in predicting compound properties, including toxicity (Ivanov et al. 2020). Thus far, QSAR modeling has been extensively utilized for the prediction of acute toxicity (Huang et al. 2021, Mansouri et al. 2021). Despite the recent development of numerous acute toxicity prediction models, many of them encounter issues stemming from imbalanced or restricted training/testing datasets, which has led to subpar predictive accuracy (Gupta et al. 2019). The etiology of acute toxicity is intricate, involving the interplay of numerous factors. Consequently, the scarcity of human acute toxicity data poses a significant challenge for QSAR modeling (Erhirhie et al. 2018). The limitations posed by small training datasets and the absence of validation further curtail the usefulness of these models (Idakwo et al. 2018). Moreover, most previously published models rely on in-house or proprietary data sourced from industry, which are not accessible to the public, whereas the minority of studies that use data extracted from public databases often contend with experimental uncertainties (Dong et al. 2015). These combined factors undermine the practical applicability of existing models for the reliable assessment of acute toxicity.
One significant initiative in utilizing machine learning in acute toxicity prediction is the Collaborative Acute Toxicity Modeling Suite (CATMoS) (Mansouri et al. 2021). CATMoS is an open-source, open-data tool developed with contributions from 35 international collaborators to support predictive modeling for acute toxicity. The CATMoS dataset comprises rat oral Lethal Dose 50 (LD50) values for over 15,000 substances, which were used to develop 139 QSAR models to facilitate the screening of new chemicals using a weight-of-evidence approach. Integrated predictions from the CATMoS models exhibited accuracy and robustness comparable to in vivo results, demonstrating the value of chemical structures in acute toxicity prediction (Mansouri et al. 2021).
Another effort to provide an alternative to traditional animal-based in vivo toxicity testing is the establishment of the Toxicology in the 21st Century (Tox21) consortium (NCATS 2022), a collaboration among four U.S. federal agencies: the U.S. Environmental Protection Agency (EPA), the National Toxicology Program (NTP) based at the National Institute of Environmental Health Sciences (NIEHS), the National Center for Advancing Translational Sciences (NCATS), and the Food and Drug Administration (FDA). The consortium is dedicated to advancing toxicology by devising techniques to assess the safety of various commercial chemicals, pesticides, food additives/contaminants, and medical products swiftly and effectively (Collins et al. 2008, Kavlock et al. 2009, Tice et al. 2013). Tox21 utilizes quantitative high-throughput screening (qHTS), an automated robotic procedure that evaluates each compound within extensive chemical libraries at various concentrations across multiple cell-based and biochemical assays (NCATS 2022, Attene et al. 2013, Lynch et al. 2023). To date, the Tox21 program has screened a collection of ~10,000 drugs and environmental chemicals (Tox21 10K compound library) (Richard et al. 2021) against nearly 80 in vitro cell-based and biochemical assays (Huang et al. 2016, NCATS 2021, PubChem 2021). Tox21 assay data as well as chemical structures have been applied to build models for in vivo toxicity prediction. For instance, machine learning models were developed for 14 human in vivo toxicity endpoints using chemical structure and Tox21 qHTS data (Xu et al. 2020). Another study compared the performance of Tox21 assay data with chemical structure information in building prediction models for human in vivo hepatotoxicity and cardiotoxicity (Ye et al. 2022).
These studies showed that, while chemical structure-based models exhibited good predictive performance, in vitro assay data alone had limited capacity to predict in vivo toxicity, although they could provide clues on the biological targets or mechanisms involved in various toxicity endpoints (Huang et al. 2016, Xu et al. 2020, Xu et al. 2021, Ye et al. 2022).
In this study, we aimed to evaluate the capacity of in vitro assay data in predicting acute toxicity. We built machine learning models for acute toxicity prediction using Tox21 assay data in comparison with chemical structure information. We then applied the models to predict the acute toxicity potential of compounds within the Tox21 10K compound library. Through this process, we also identified assay targets, pathways, and chemical features that exerted the greatest influence on model performance. Chemical structural features and assay targets significantly associated with acute toxicity merit further investigation as potential indicators of the underlying pathways and mechanisms of acute systemic toxicity.
2. Materials and methods
2.1. Acute toxicity data
We used the acute toxicity data assembled by the CATMoS (Collaborative Acute Toxicity Modeling Suite) project. An acute oral toxicity data inventory for 11,992 chemicals was compiled, and provided as Supplemental Material in the CATMoS research paper (The training set from Supplemental Material1 and Evaluation Set from Supplemental Material2) (Mansouri et al. 2021). The dataset includes rat oral LD50 values categorized into U.S. EPA hazard categories, GHS hazard categories, highly toxic chemicals (LD50 ≤ 50 mg/kg), and non-toxic chemicals (LD50 > 2,000 mg/kg). We used chemicals that were categorized as either “highly toxic” or “non-toxic” to build our training dataset. Specifically, we included chemicals (in Supplemental Material tables of the CATMoS paper) that were designated as “TRUE” under the “very_toxic” column and “FALSE” under the “nontoxic” column as toxic compounds, and chemicals that had a “FALSE” value under the “very_toxic” column and “TRUE” under the “nontoxic” column as nontoxic ones (Mansouri et al. 2021). The training set with “very toxic” and “nontoxic” classifications consists of 3,974 compounds, including 741 toxic and 3,233 nontoxic compounds. The testing set comprises 1,324 compounds, with 250 classified as toxic and 1,074 as nontoxic.
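The labeling rule described above can be sketched as follows. This is a minimal illustration only: the column names “very_toxic” and “nontoxic” come from the text, whereas the function name, the row layout, and the placeholder identifiers are assumptions made for the example.

```python
from typing import Optional


def label_compound(very_toxic: bool, nontoxic: bool) -> Optional[int]:
    """Return 1 (toxic), 0 (nontoxic), or None (excluded from modeling)."""
    if very_toxic and not nontoxic:
        return 1
    if nontoxic and not very_toxic:
        return 0
    # Neither extreme class (or contradictory flags): not used for modeling.
    return None


# Hypothetical rows; the identifiers are placeholders, not real CAS numbers.
rows = [
    {"cas": "placeholder-1", "very_toxic": True, "nontoxic": False},
    {"cas": "placeholder-2", "very_toxic": False, "nontoxic": True},
    {"cas": "placeholder-3", "very_toxic": False, "nontoxic": False},
]

labels = {r["cas"]: label_compound(r["very_toxic"], r["nontoxic"]) for r in rows}
# Keep only compounds that fall into one of the two modeled classes.
kept = {cas: y for cas, y in labels.items() if y is not None}
```

Compounds with intermediate LD50 values carry “FALSE” in both columns and therefore drop out, which is why the modeling sets are smaller than the full inventory.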
2.2. In vitro assay data
The qHTS data utilized in this study were obtained through screening the Tox21 10K compound library, comprising 78 assays and 263 readouts. All in vitro assay data and assay descriptions used in this study are publicly accessible on the NCATS website (https://tripod.nih.gov/tox21/pubdata/) and PubChem (PubChem 2021, PubChem 2022). Various cell lines were employed for the qHTS assays, with human cell lines comprising the majority (70.5%), followed by murine embryo fibroblast (6.4%), Chinese hamster ovary cell line (5.1%), and others (7.7%). The assay targets include nuclear receptor signaling (NR, 48.7%), stress response pathways (SR, 10.3%), cytotoxicity (7.7%), metabolism (6.4%), G-protein coupled receptors (GPCR, 5.1%), and other toxicity pathways (21.8%). These assays have undergone extensive validation for performance and reproducibility (Sakamuru et al. 2020, Lynch et al. 2023). Curve rank, ranging from −9 to 9, was used as a measure for compound activity, with positive values indicating activation and negative values indicating inhibition (Huang 2016). For modeling, compounds with an absolute curve rank greater than 0.5 were categorized as active (1), while others were considered inactive (0).
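As an illustration of the activity call described above, the following sketch binarizes curve ranks with the stated cutoff. The function name and the example values are hypothetical.

```python
def binarize_curve_rank(curve_rank: float, threshold: float = 0.5) -> int:
    """Return 1 (active) if |curve rank| exceeds the threshold, else 0."""
    return 1 if abs(curve_rank) > threshold else 0


# Hypothetical curve ranks for one compound across five assay readouts.
ranks = [-9.0, -0.5, 0.0, 0.6, 4.2]
activity = [binarize_curve_rank(r) for r in ranks]
```

Because the absolute value is taken, the direction of the effect (activation or inhibition) is not retained in the binary descriptor.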
2.3. Chemical structure data
Compounds in the Tox21 10K compound library, comprising approximately 10,000 chemical samples (8,545 unique compounds), were used for modeling. This library encompasses a wide range of substances, including industrial and consumer products, food additives, drugs, and chemical mixtures (Richard et al. 2021). Moreover, the Tox21 10K compound library contains the NCATS Pharmaceutical Collection (NPC) (Huang et al. 2011, Huang et al. 2019), which comprises roughly 3,000 small molecule drugs approved for clinical use or under investigational evaluation by regulatory authorities in the U.S., Europe, Japan, Australia, and Canada. In constructing classification models for this study, chemical structures were converted to ToxPrint fingerprints. These fingerprints encode publicly accessible ToxPrint chemotypes (v2.0_r711), which were generated using the associated ChemoTyper application (https://github.com/mn-am/chemotyper) (Yang et al. 2015). The ToxPrint chemotypes encompass 729 distinct chemical features.
2.4. Supervised machine learning
Models were trained and tested using selected datasets, that is, assay activity (activity-only models), chemical structure (structure-only models), and combinations of structure and activity data. Four machine learning classification algorithms were used for model construction: Random Forest (RF), Naïve Bayes (NB), eXtreme Gradient Boosting (XGB), and Support Vector Machine (SVM). The models were trained and tested in Python using a dedicated class for each classifier: “MultinomialNB” for NB, “SVC” for SVM, “RandomForestClassifier” for RF, and “XGBClassifier” for XGB. Model parameters were left at the package defaults. Model performance was evaluated by computing the area under the receiver operating characteristic curve (AUC-ROC), which compares predicted values against true labels in the testing set. For binary classification, a perfect model has an AUC-ROC of 1, whereas a value of 0.5 indicates a random classifier. Given the dataset size, 3-fold cross-validation was performed with 10 random splits of the data, and the resulting AUC-ROC values were averaged. The Synthetic Minority Oversampling Technique (SMOTE) was also used to address class imbalance by generating synthetic samples for the minority class in the training set, after which the AUC-ROC of the model on the testing set was calculated.
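To make the evaluation metric concrete, the sketch below computes AUC-ROC directly from its pairwise definition: the probability that a randomly chosen toxic compound receives a higher score than a randomly chosen nontoxic one, with ties counted as one half. It is a dependency-free illustration with made-up scores, not the routine used in the study, which relied on library implementations.

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = toxic) and predicted probabilities.
y = [1, 1, 0, 0, 0]
perfect = auc_roc(y, [0.9, 0.8, 0.3, 0.2, 0.1])
uninformative = auc_roc(y, [0.5, 0.5, 0.5, 0.5, 0.5])
mixed = auc_roc(y, [0.9, 0.2, 0.3, 0.1, 0.05])
```

The quadratic loop is adequate for datasets of a few thousand compounds; a rank-based formulation would be preferable at larger scale.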
2.5. Feature selection
Feature selection was employed to identify important features related to acute toxicity and to optimize model performance. Important features were identified using AUC-ROC values (obtained when each feature was used on its own as a single-descriptor model) and random forest (RF) importance scores. We used the “RandomForestClassifier” and “RocCurveDisplay” classes from the scikit-learn package for this analysis. Features with larger RF importance scores (closer to 1) or larger AUC-ROC values were considered to contribute more to the toxicity prediction.
To optimize model performance, the RF importance scores were ranked in descending order along with their corresponding features (assays or chemical structures). In the first step, the top 10% of features with the highest RF scores were selected to build the model. In each subsequent iteration, the next 10% of the most important features were added to train the prediction model, and the AUC-ROC score was calculated. The optimal segment of feature sets corresponding to each dataset was then determined based on the highest AUC-ROC score. The model was rebuilt using this optimal feature set to evaluate whether the combined dataset outperformed the individual assay or chemical structure dataset alone.
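The incremental procedure above can be sketched as a simple loop. Everything here is an assumption made for illustration: the function name, the `evaluate` callback (standing in for "train on the training set with these features and return the testing AUC-ROC"), the toy scoring function, and the integer rounding of each 10% increment, which the text does not specify.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def incremental_selection(
    importances: Dict[str, float],
    evaluate: Callable[[Sequence[str]], float],
    n_steps: int = 10,
) -> Tuple[List[str], float]:
    """Grow the feature set in ranked increments and keep the best-scoring one."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, n_steps + 1):
        # Top k/n_steps of the ranked features (floor rounding is an assumption).
        n_keep = max(1, len(ranked) * k // n_steps)
        subset = ranked[:n_keep]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy setup: ten features, of which only the three top-ranked are informative.
toy_importances = {f"f{i}": 1.0 / (i + 1) for i in range(10)}
informative = {"f0", "f1", "f2"}


def toy_evaluate(subset: Sequence[str]) -> float:
    """Stand-in for model training: rewards informative features, penalizes noise."""
    hits = sum(1 for f in subset if f in informative)
    noise = len(subset) - hits
    return 0.5 + 0.1 * hits - 0.01 * noise


best_features, best_auc = incremental_selection(toy_importances, toy_evaluate)
```

Passing the scoring step as a callback keeps the loop independent of the classifier, so the same sketch would apply to any of the four algorithms.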
2.6. Predicting the acute toxicity potential of the Tox21 10K compound library
Optimal chemical structure-based models were applied to predict the acute toxicity potential of all compounds in the Tox21 10K library. To assess the reliability of the models across different structural types, we determined their applicability domain (AD), i.e., the chemical space in which a model is intended to be applied. To evaluate the model AD, the nearest structural neighbor from the training set was identified for each compound within the Tox21 10K compound library. Structural similarity was determined by computing the Tanimoto coefficient using ToxPrint fingerprints, which quantifies the similarity between two compounds based on shared structural features. A compound whose nearest structural neighbor in the training set had a Tanimoto similarity (Tmax) of ≥ 0.6 was deemed to fall within the AD of the model. Predicted toxic compounds falling within the model AD were further stratified, and the prediction results for the Tox21 10K compound library with and without application of the AD were compared.
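The applicability domain check can be illustrated with a short sketch. Fingerprints are represented here as sets of "on" bit indices, which is an assumption made for compactness, as are the function names and the example fingerprints; the 0.6 cutoff is the one stated above.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(a | b)


def t_max(query: Set[int], training: Iterable[Set[int]]) -> float:
    """Similarity of the query to its nearest neighbor in the training set."""
    return max(tanimoto(query, t) for t in training)


def in_domain(query: Set[int], training: Iterable[Set[int]], cutoff: float = 0.6) -> bool:
    """A compound is inside the AD if its nearest neighbor reaches the cutoff."""
    return t_max(query, training) >= cutoff


# Hypothetical fingerprints (indices of chemotypes present in each compound).
training_set = [{1, 2, 3, 4, 5}, {10, 11, 12}]
close_query = {1, 2, 3, 4}      # shares 4 of 5 bits with the first training compound
distant_query = {1, 20, 21}     # shares 1 of 7 bits with the first training compound
```

With these values, `close_query` has Tmax = 0.8 and falls inside the AD, whereas `distant_query` has Tmax = 1/7 and falls outside.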
2.7. Consumer product use category analysis
The consumer product use categories (CPCat) covered by the Tox21 10K library compounds predicted to be acutely toxic were analyzed. The CPCat information was downloaded from the online Computational Toxicology (CompTox) Chemicals Dashboard (EPA 2022), which includes chemical information such as physicochemical properties, environmental fate and transport, exposure, usage, in vivo toxicity, and in vitro bioassay data integrated by EPA researchers. The proportion of compounds that fall into each CPCat among all predicted toxic compounds was calculated and ranked. The top ten categories with the highest enrichment of predicted toxic compounds were identified.
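The category ranking can be sketched as a simple count over the predicted toxic compounds. The function name, the input layout (compound identifier mapped to its list of categories), and the example data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def category_proportions(
    categories_by_compound: Dict[str, List[str]], top_n: int = 10
) -> List[Tuple[str, float]]:
    """Fraction of compounds in each category, highest first."""
    n = len(categories_by_compound)
    if n == 0:
        return []
    counts: Counter = Counter()
    for cats in categories_by_compound.values():
        counts.update(set(cats))  # count each category once per compound
    return [(cat, c / n) for cat, c in counts.most_common(top_n)]


# Hypothetical predicted toxic compounds and their use categories.
toxic_compounds = {
    "compound_A": ["pesticide", "agrochemical"],
    "compound_B": ["pesticide"],
    "compound_C": ["drug"],
    "compound_D": ["pesticide", "drug"],
}
ranking = category_proportions(toxic_compounds)
```

Because one compound can belong to several categories, the proportions need not sum to 1.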
3. Results
3.1. Data sets used for modeling
After merging, by CAS number, the ToxPrint structure data (8545 substances) with the CATMoS acute toxicity datasets for training (3974 substances) and testing (1324 substances), as shown in Fig. 1, a total of 1416 chemicals were retained for the training set and 480 for the testing set. Thus, for the chemical structure-based model, a total of 1896 chemicals encoded in 729 ToxPrint chemotypes were used as input. Similarly, after merging the in vitro assay data (8971 substances) with the CATMoS acute toxicity datasets, a total of 1442 chemicals (identified by CAS number) were retained for the training set and 485 for the testing set. Consequently, for the assay-based model, a total of 1927 chemicals with 263 Tox21 assay readouts were used as input. The chemical structure dataset and the assay activity dataset were concatenated to create the input for the structure-activity combined models.
Fig. 1.

Dataset compositions of the chemical structure-based, Tox21 assay data-based and combined models.
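The dataset assembly described in this section amounts to an inner join on CAS number followed, for the combined models, by concatenation of the two feature vectors. The sketch below shows one way this could look; the function names, the dictionary layout, and the placeholder identifiers are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def merge_on_cas(
    features: Dict[str, List[int]], labels: Dict[str, int]
) -> Tuple[List[str], List[List[int]], List[int]]:
    """Keep only compounds present in both the feature table and the label table."""
    shared = sorted(features.keys() & labels.keys())
    X = [features[cas] for cas in shared]
    y = [labels[cas] for cas in shared]
    return shared, X, y


def concatenate_features(
    structure: Dict[str, List[int]], assay: Dict[str, List[int]]
) -> Dict[str, List[int]]:
    """Combined descriptor: structure bits followed by assay activity bits."""
    return {cas: structure[cas] + assay[cas] for cas in structure.keys() & assay.keys()}


# Placeholder identifiers and tiny vectors (real inputs have 729 and 263 columns).
structure_bits = {"id-1": [1, 0, 1], "id-2": [0, 0, 1], "id-3": [1, 1, 0]}
assay_bits = {"id-1": [0, 1], "id-3": [1, 1], "id-4": [0, 0]}
toxicity = {"id-1": 1, "id-2": 0, "id-4": 0}

ids, X_structure, y_structure = merge_on_cas(structure_bits, toxicity)
combined = concatenate_features(structure_bits, assay_bits)
```

Compounds missing from either table are dropped, which is why the merged sets are smaller than both the Tox21 library and the CATMoS sets.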
3.2. Performance of Tox21 assay data-based models, chemical structure-based models, and combined models
We first used Tox21 assay data to build models for acute toxicity prediction. Model performance was evaluated using AUC-ROC scores on various datasets: the original training and testing sets, the resampled training sets, and feature-selected subsets. We performed 3-fold cross-validation (CV) on the combined training and testing dataset, then built models using the training set data and applied them to predict the acute toxicity potential of compounds in the testing set. Model performance, measured by AUC-ROC, is summarized in Table 1. The AUC-ROC scores from cross-validation for models constructed on assay data ranged from 0.73±0.03 to 0.80±0.03. The assay data-based models also showed good performance on the testing set, with AUC-ROC values between 0.73 and 0.79. When the training set was resampled using the SMOTE method, the AUC-ROC values on the testing set varied between 0.69 and 0.78. Models built using subsets of features selected by Random Forest (RF) importance exhibited optimal AUC-ROC values ranging from 0.78 to 0.82. These results demonstrate that the Tox21 in vitro assays have good predictive power for acute toxicity.
Table 1.
Performance of classification models constructed using four machine learning algorithms (RF, NB, XGB, and SVM) for acute toxicity prediction based on Tox21 assay, chemical structure, and combined data.
| Models | Machine learning method | AUC-ROC (3-fold CV; training and testing sets combined) | AUC-ROC (testing) | AUC-ROC (testing; training set resampled) | AUC-ROC (testing; feature-selected training set) |
|---|---|---|---|---|---|
| Tox21 assay data-based models | NB | 0.73±0.03 | 0.73 | 0.69 | 0.78 |
| Tox21 assay data-based models | SVM | 0.80±0.03 | 0.79 | 0.78 | 0.80 |
| Tox21 assay data-based models | RF | 0.78±0.03 | 0.78 | 0.75 | 0.82 |
| Tox21 assay data-based models | XGB | 0.76±0.03 | 0.75 | 0.74 | 0.79 |
| Chemical structure-based models | NB | 0.83±0.02 | 0.83 | 0.82 | 0.84 |
| Chemical structure-based models | SVM | 0.89±0.02 | 0.90 | 0.80 | 0.90 |
| Chemical structure-based models | RF | 0.90±0.02 | 0.93 | 0.85 | 0.93 |
| Chemical structure-based models | XGB | 0.88±0.02 | 0.86 | 0.80 | 0.86 |
| Combined models | NB | 0.87±0.03 | 0.87 | 0.83 | 0.87 |
| Combined models | SVM | 0.85±0.02 | 0.81 | 0.80 | 0.87 |
| Combined models | RF | 0.88±0.02 | 0.88 | 0.86 | 0.91 |
| Combined models | XGB | 0.87±0.02 | 0.86 | 0.84 | 0.87 |
In comparison, we also constructed machine learning models for acute toxicity prediction using chemical structure, and the performance statistics are summarized in Table 1. The cross-validation AUC-ROC scores of the four classification models (RF, NB, XGB, and SVM) based on ToxPrints ranged from 0.83±0.02 to 0.90±0.02. The ToxPrint-based models performed even better on the testing dataset with AUC-ROC values ranging from 0.83 to 0.93 when trained on the original training set, 0.80 to 0.85 with the training set resampled using SMOTE method, and 0.84 to 0.93 when trained on the feature-selected training sets. These results indicate that chemical structure can serve as a good predictor of acute toxicity.
Additionally, we applied the four machine learning methods to build combined models utilizing both Tox21 assay data and chemical structure. Feature selection was first performed to optimize model performance. The feature sets selected by RF score that gave optimal performance comprised 20%, 40%, and 10% of all features, that is, 52 features for the Tox21 assay data-based models, 288 features for the chemical structure-based models, and 99 features for the combined models, respectively. As shown in Table 1, the cross-validation AUC-ROC scores for the combined models ranged from 0.85±0.02 to 0.88±0.02. The combined models showed AUC-ROC scores ranging from 0.81 to 0.88 on the testing dataset, 0.80 to 0.86 when trained on the SMOTE-resampled training sets, and 0.87 to 0.91 with feature-selected training sets. Both the combined and the ToxPrint-based models outperformed those based solely on Tox21 assay data. However, the scores of the combined models were comparable to those of the models based solely on ToxPrints, suggesting that integrating in vitro assay data with chemical structure did not significantly enhance model performance.
3.3. Structural features contribute to acute toxicity
We identified chemical features, represented by ToxPrint chemotypes, that were enriched in acutely toxic compounds using RF importance scores. The most prominent features associated with acute toxicity (Fig. 1), with their RF importance scores in parentheses, include organophosphates (OPs), such as bond:P~S_generic (0.067), bond:P=O_phosphate_dithio (0.023), and bond:P=O_phosphate_thioate (0.016); carbamates (CBs), such as bond:C=O_carbonyl_ (0.015), bond:C(=O)N_carbamate (0.027), and bond:NC=O_urea_thio (0.012); and long linear alkane chains, such as chain:alkaneLinear_ethyl_C2(H_gt_1) (0.016) and chain:alkaneLinear_ethyl_C2_(connect_noZ_CN=4) (0.014). Moreover, small chemical fragments such as ring:aromatic_phenyl were found to be enriched in acutely toxic compounds. Example features and representative compounds that contain these acute toxicity features, as well as their corresponding RF importance scores, are shown in Table 2. More example toxic features are listed in Supplementary Table 2. In summary, these chemotypes fall largely into three categories, i.e., organophosphates, carbamates, and long alkane chains.
Table 2.
Top chemotypes associated with acute toxicity measured by RF importance and example compounds. Representative compounds that contain these chemotypes are displayed with significant chemotypes highlighted in red.
| RF importance | Feature | Example |
|---|---|---|
| 0.067 | bond:P~S_generic | |
| 0.0267 | bond:C(=O)N_carbamate | |
| 0.023 | bond:P=O_phosphate_dithio | |
| 0.016 | bond:P=O_phosphate_thioate | |
| 0.016 | bond:C=S_carbonyl_thio_generic | |
3.4. Assays predictive of acute toxicity
We examined the contribution of each assay within the Tox21 assay panel in predicting acute toxicity (Fig. 3). The top five assays that contributed the most to the prediction of acute toxicity, measured by their RF importance scores and listed in descending order of importance, are tox21-ms-p53-p1_ch2 (0.035), tox21-ms-p53-p1_ch1 (0.033), tox21-ache-p3_ratio (0.031), tox21-ms-ache-p2_ratio (0.028), and tox21-ms-p53-p2_ratio (0.027).
Fig. 3.

Top 20 Tox21 assays that contributed to the prediction of acute toxicity measured by RF importance.
Similarly, we evaluated the performance of each assay individually in the Tox21 assay panel in predicting acute toxicity (Table 3). The top five assays demonstrating the highest predictive power of acute toxicity, ranked in descending order based on their AUC-ROC scores, are as follows: tox21-ms-p53-p1_ch2 (0.67), tox21-ms-p53-p2_ratio (0.67), tox21-ms-p53-p1_ch1 (0.66), tox21-ms-p53-p2_ch2 (0.66), and tox21-ms-ache-p2_ratio (0.64). In summary, the major target groups covered by the top contributing assays include AChE, p53, CYP, and GR. Notably, all top 5 assays had microsomes added to introduce metabolic capacity (Li et al. 2019, Ooka et al. 2022), indicating that many inhibitors require metabolic activation to cause toxicity.
Table 3.
Top 20 Tox21 assays that are predictive of acute toxicity by AUC-ROC.
| Assay | Target | AUC-ROC |
|---|---|---|
| tox21-ms-p53-p1_ch2 | p53 (with rat microsomes) | 0.67 |
| tox21-ms-p53-p2_ratio | p53 (with human microsomes) | 0.67 |
| tox21-ms-p53-p1_ch1 | p53 (with rat microsomes) | 0.66 |
| tox21-ms-p53-p2_ch2 | p53 (with human microsomes) | 0.66 |
| tox21-ms-ache-p2_ratio | AChE (with human microsomes) | 0.64 |
| tox21-ms-p53-p2_ch1 | p53 (with human microsomes) | 0.64 |
| tox21-ache-p1_ratio | AChE | 0.63 |
| tox21-ms-p53-p1_ratio | p53 (with rat microsomes) | 0.63 |
| tox21-p450-2d6-p1_ratio | CYP2D6 | 0.63 |
| tox21-ache-p3_ratio | AChE (colorimetric) | 0.62 |
| tox21-ache-p5_ratio | AChE | 0.62 |
| tox21-dt40-p1_653 | Cell viability | 0.61 |
| tox21-er-bla-agonist-p2_ch1 | ER-BLA agonist | 0.58 |
| tox21-gh3-tre-antagonist-p1_viability | TR-beta antagonist | 0.57 |
| tox21-er-bla-antagonist-p1_viability | ER-BLA antagonist | 0.56 |
| tox21-gr-hela-bla-antagonist-p1_ratio | GR-BLA antagonist | 0.55 |
| tox21-gr-hela-bla-antagonist-p1_ch2 | GR-BLA antagonist | 0.53 |
| tox21-p450-2c19-p1_ratio | CYP2C19 | 0.53 |
| tox21-ahr-p1_ratio | AhR | 0.53 |
| tox21-dt40-p1_657 | Cell viability | 0.53 |
3.5. Predicting the acute toxicity potential of the Tox21 10K compound library
We applied the RF, NB, XGB, and SVM models, which demonstrated strong performance during both training and testing phases, to predict the potential acute toxicity of compounds within the Tox21 10K compound library. Each compound received a toxicity probability from each model, and the distributions of these predicted probabilities are shown in Fig. 4, with a complete listing provided in Supplementary Table 2. Among all 8545 compounds, those with an NB-predicted toxicity probability greater than 0.5 (predicted toxic) constitute 13% of the total, while the corresponding percentages are 6%, 6%, and 7% for SVM, RF, and XGB, respectively. Most compounds are predicted to be non-toxic: compounds with toxicity probabilities below 0.2 account for 79%, 88%, 83%, and 86% of the total for the four models. Therefore, most of the compounds in the Tox21 10K compound library are unlikely to exhibit acute toxicity.
Fig. 4.

The histograms present the distribution of the predicted toxicity probability of compounds in the Tox21 10K compound library. All four methods (RF, NB, XGB, and SVM) predicted that most compounds in the Tox21 10K compound library were non-toxic.
The top 10 compounds with the highest predicted acute toxicity probabilities in the Tox21 10K compound library that were not in the training dataset, listed in descending order of RF-predicted probability, are Thiocarbazide (1), Carbophenothion (1), Endrin (0.99), Parathion (0.99), Terbufos sulfone (0.98), Terbufos (0.98), Azinphos-ethyl (0.98), Phorate sulfone (0.98), Isoxathion (0.97), and Endosulfan sulfate (0.97). All of these compounds have been reported to be toxic in the literature (Petreski et al. 2020, Mermer et al. 2022, Karunarathne et al. 2020) (Supplementary Table 2), supporting the ability of our models to detect known toxicants.
To ensure the reliability of predictions from structure-based models, we calculated the similarity of each compound in the 10K collection to the compounds in the training dataset. Compounds with a sufficiently similar training-set neighbor (Tmax ≥ 0.6) are deemed to fall within the applicability domain (AD) of the model. The acute toxicity prediction results for the 8545 compounds in the Tox21 10K compound library with and without AD were compared. The results of acute toxicity predictions using the RF, NB, XGB, SVM, and consensus models with and without AD are outlined in Table 4. Without considering AD, the RF model flagged 502 compounds (6%) as toxic. Similarly, the XGB model indicated 604 compounds (7%) as potentially toxic. The NB model identified 1144 compounds (13%) as toxic, while the SVM model identified 474 toxic compounds (6%). Combining the results from all four models, 237 compounds (3%) in the 10K compound library were predicted as toxic by all the models (consensus model). When AD was considered, the NB method identified the highest number of toxic compounds (283; 3%), followed by the RF method (249; 3%) and XGB (220; 3%), whereas SVM recognized the lowest number (167; 2%) as toxic. Combining the predictions from all four methods further reduced the total number of predicted toxic compounds, resulting in the identification of 109 (1%) acutely toxic compounds (Table 4). When AD was taken into consideration, the proportion of compounds in the Tox21 10K compound library predicted as toxic decreased by 3–10 percentage points for the individual RF, NB, XGB, and SVM models. Because most compounds in the Tox21 10K compound library fell outside the model AD, focusing solely on the 3643 compounds within the AD (Tmax ≥ 0.6) led to an overall decrease in the predicted number of toxic compounds. Implementing the AD thus narrowed down the predicted toxic compounds by excluding those that deviated significantly from the structural characteristics of the model training set.
Table 4.
Distribution of model-predicted acute toxicity compounds in the Tox21 10K compound library with and without the application of the applicability domain (AD) using Random Forest (RF), Naïve Bayes (NB), eXtreme Gradient Boosting (XGB), Support Vector Machine (SVM), and a consensus of all four methods.
| | RF | NB | XGB | SVM | Consensus |
|---|---|---|---|---|---|
| without AD | 502 | 1144 | 604 | 474 | 237 |
| (8545) | 6% | 13% | 7% | 6% | 3% |
| with AD | 249 | 283 | 220 | 167 | 109 |
| (3643) | 3% | 3% | 3% | 2% | 1% |
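The consensus call summarized in Table 4 can be sketched as follows: a compound is flagged only if every model assigns it a toxicity probability above 0.5, optionally restricted to compounds inside the AD. The function name, the data layout, and the example probabilities are hypothetical.

```python
from typing import Dict, List, Optional


def consensus_toxic(
    probabilities: Dict[str, Dict[str, float]],
    in_ad: Optional[Dict[str, bool]] = None,
    cutoff: float = 0.5,
) -> List[str]:
    """Compounds predicted toxic by all models (and inside the AD, if given)."""
    flagged = []
    for compound, by_model in probabilities.items():
        if in_ad is not None and not in_ad.get(compound, False):
            continue  # outside the applicability domain: skip
        if all(p > cutoff for p in by_model.values()):
            flagged.append(compound)
    return flagged


# Hypothetical per-model toxicity probabilities for three compounds.
predicted = {
    "compound_A": {"RF": 0.98, "NB": 0.91, "XGB": 0.95, "SVM": 0.88},
    "compound_B": {"RF": 0.70, "NB": 0.95, "XGB": 0.40, "SVM": 0.65},
    "compound_C": {"RF": 0.90, "NB": 0.85, "XGB": 0.80, "SVM": 0.75},
}
ad_membership = {"compound_A": True, "compound_B": True, "compound_C": False}

without_ad = consensus_toxic(predicted)
with_ad = consensus_toxic(predicted, in_ad=ad_membership)
```

Requiring agreement among all four models is the strictest possible vote, which is consistent with the consensus column holding the smallest counts in Table 4.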
3.6. Consumer product use categories
We then analyzed compounds in the Tox21 10K compound library predicted to exhibit acute toxicity based on their consumer product use categories (EPA 2022). The proportion of compounds predicted to be acutely toxic in each category was determined. The top ten categories with an enrichment of predicted toxic compounds (listed in descending order based on the number of occurrences) are as follows: “pesticide” (45%), “drug” (35%), “chemical” (32%), “food residue” (27%), “industrial manufacturing” (27%), “active ingredient” (25%), “detected” (22%), “drinking water contaminant” (21%), “food additive” (19%), and “agrochemical” (16%). The term “pesticide” encompasses substances used to manage pests, including herbicides, insecticides, nematicides, fungicides, and various others, while “drug” pertains to any drug product or compound associated with drug manufacturing processes. “Chemicals” are often created by the transformation of organic and inorganic raw materials through chemical processes.
4. Discussion
In this study, we built machine learning models for the prediction of acute systemic toxicity using Tox21 in vitro assay data in comparison with traditional chemical structure-based models. Both types of data showed robust predictive power of acute toxicity with performance on par with the best CATMoS models (Mansouri et al. 2021). CATMoS was a large collaborative modeling effort, where the very toxic (VT) and not toxic (NT) acute toxicity endpoints were modeled using 32 and 33 chemical structure models, respectively. The balanced accuracies (BA) for VT and NT consensus predictions following the weight-of-evidence approach were 0.84 and 0.78, with sensitivities (Sn) of 0.70 and 0.67, and specificities (Sp) of 0.97 and 0.90, respectively. In our study, the AUC-ROC scores were all above 0.8 for the chemical structure-based models and >0.7 for assay-based models, thus both studies underscored the good predictive performance of chemical structures for acute toxicity. Additionally, our study complements the CATMoS findings by validating the value of in vitro assays in predicting in vivo acute toxicity.
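As a quick consistency check on the CATMoS figures quoted above, balanced accuracy is conventionally defined as the mean of sensitivity and specificity. Assuming that definition was used, the reported Sn and Sp values reproduce the reported BA values to within the rounding of the published inputs:

```latex
\mathrm{BA} = \frac{\mathrm{Sn} + \mathrm{Sp}}{2},
\qquad
\mathrm{BA}_{\mathrm{VT}} \approx \frac{0.70 + 0.97}{2} = 0.835,
\qquad
\mathrm{BA}_{\mathrm{NT}} \approx \frac{0.67 + 0.90}{2} = 0.785
```

These values agree with the reported 0.84 and 0.78, given that Sn and Sp are themselves rounded to two decimals.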
This research focused on predicting acute toxicity using Tox21 in vitro assay data. Similar efforts in toxicity prediction with Tox21 data were conducted previously. For instance, optimal models for various human in vivo and organ-level toxicity endpoints were developed using chemical structure and Tox21 qHTS assay data. Supervised machine learning algorithms were employed to model 14 in vivo human toxicity endpoints related to vascular, kidney, ureter, bladder, and liver systems (Xu et al. 2020). The top four models achieved AUC-ROC values above 0.8, while the best models for the remaining 10 endpoints had AUC-ROC values above 0.7. In contrast, assay data-based models were outperformed by both the structure-based and combined models. Another example is the prediction of drug-induced liver injury (DILI) and cardiotoxicity (DICT) (Ye et al. 2022). The results demonstrated that DILI and DICT could be reasonably predicted by chemical structure, with structure-based models achieving AUC-ROC scores from 0.65 to 0.75, compared to lower scores for assay-based models (0.56 to 0.61). In all previous studies, the assay data-based models achieved less-than-ideal performance compared to the chemical-structure based models (Huang et al. 2016, Xu et al. 2020, Xu et al. 2021, Ye et al. 2022). However, with AUC-ROC values ranging between 0.73 and 0.79, the current study demonstrates for the first time that Tox21 in vitro qHTS assay data can be highly predictive of an in vivo toxicity endpoint.
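For readers comparing the AUC-ROC values across these studies, the metric can be read as the probability that a randomly chosen toxic compound receives a higher predicted score than a randomly chosen non-toxic one, with ties counted as one half. The sketch below computes it directly from that definition; it is a generic illustration rather than the evaluation code of any of the cited studies, and the labels and scores are invented.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (toxic, non-toxic) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one toxic and one non-toxic compound")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # toxic compound ranked above non-toxic
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = toxic, 0 = non-toxic; scores are model outputs.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc_roc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of compounds, so a rank-based implementation would be preferable for large libraries.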
In addition, assays and chemical structural features that significantly contributed to acute toxicity prediction were identified. The top 20 assays with predictive capabilities for acute toxicity are listed in Table 4. Several of these assays target specific pathways or receptors, which are known to be associated with toxicity, such as p53, acetylcholinesterase (AChE), Cytochrome P450 (CYP), and the glucocorticoid receptor (GR). These assay targets can provide clues on the biological targets and pathways involved in chemical-induced acute toxicity.
P53 is a tumor suppressor protein that can be activated by DNA damage and other cellular stress (Aubrey et al. 2018) and that regulates the transcription of numerous genes in response to DNA damage. P53 receives signals from diverse stress sensors and acts to maintain cellular homeostasis; it is therefore a universal sensor of genotoxic stress and a guardian of genomic integrity (Zerdoumi et al. 2015). P53 signaling suppresses apoptosis following genotoxic stresses (Mirzayans et al. 2017), and p53-mediated apoptosis is determined by the severity of DNA damage (Ho et al. 2019). Several Tox21 assays that measure p53 activation (Witt et al. 2017), e.g., tox21-ms-p53-p1_ch2 and tox21-ms-p53-p2_ratio, are among the top 20 predictive assays of acute toxicity. These results indicate that genotoxic stress-induced p53 activation may be one of the mechanisms underlying acute toxicity. Interestingly, the most predictive p53 assays all had microsomes incorporated (Table 4), suggesting that many compounds require metabolic activation to cause toxicity.
Acetylcholinesterase (AChE) is the primary cholinesterase in the body and hydrolyzes a key neurotransmitter, acetylcholine (ACh), in the synaptic cleft. This process is crucial for terminating the action of ACh and allowing for precise control of neuronal signaling. Inhibition of AChE activity can lead to neurotoxicity; AChE can therefore be regarded as a crucial neurotoxicity marker (Gupta et al. 2015). The top 20 predictive assays of acute toxicity included several Tox21 assays that measure AChE inhibition (Li et al. 2017, Li et al. 2019), most of which are among the top 10, e.g., tox21-ms-ache-p2_ratio and tox21-ache-p1_ratio (Table 4). Again, the AChE assay with metabolic capacity (tox21-ms-ache-p2_ratio) ranked higher than most of the AChE assays that had no microsomes added (e.g., tox21-ache-p1_ratio), indicating the importance of metabolic activation in chemical-induced toxicity (Li et al. 2019).
CYPs are hemeproteins located in cell membranes that play a crucial role in metabolizing drugs and foreign substances (Zhao et al. 2021). Many chemicals undergo biotransformation, a detoxifying mechanism facilitated by cytochrome P450 enzymes to aid in the elimination of these substances (Zhang et al. 2021). Because CYP genes encode enzymes that are vital for the metabolism of foreign substances, they can be induced by chemicals to increase enzyme production, thereby influencing the equilibrium between detoxification and activation processes (Zhao et al. 2021). Inhibition of P450 occurs when a substrate forms a reactive intermediate, creating a stable enzyme–intermediate complex that irreversibly reduces enzyme activity (Hrycay et al. 2015), which may lead to acute toxicity. Notably, two Tox21 P450 assays, tox21-p450-2d6-p1 and tox21-p450-2c9-p1, were among the top predictive assays for acute toxicity, reinforcing the role of chemical metabolism in acute toxicity.
The glucocorticoid receptor (GR) is a member of the nuclear receptor superfamily of ligand-dependent transcription factors and is present in virtually every human cell type. GR has several different isoforms, which regulate glucocorticoid-responsive gene expression in both a positive and a negative manner (Kassel et al. 2007). GR plays a critical role in carbohydrate, protein, and lipid metabolism, and in programmed cell death (Gulliver et al. 2017). GR antagonists such as mifepristone have been shown to cause reversible hepatotoxicity following long-term administration of high doses (Xiao et al. 2016). The Tox21 assay that identifies GR antagonists, tox21-gr-hela-bla-antagonist-p1, is among the top 20 most predictive assays for acute toxicity, suggesting that GR inhibition may be one of the mechanisms leading to acute toxicity.
This study also identified chemotypes or chemical features that are significantly associated with acute toxicity, such as organophosphates (OPs), carbamates (CBs), and chemicals with long alkyl chains. The OPs are usually esters, amides, or thiol derivatives of phosphoric acid, having a phosphate or phosphorothioate structural formula (Ballantyne et al. 2017). They are used extensively as pesticides. OPs are highly toxic, and both accidental and intentional exposures to OPs resulting in deleterious health effects have been documented for decades. OP toxicity is predominantly explained by their major mode of action: inhibiting AChE and thereby promoting accumulation of the neurotransmitter acetylcholine, predominantly at peripheral nicotinic and muscarinic synapses. This promotes cholinergic hyperstimulation, with pupillary miosis, lacrimation, hypersalivation, diarrhea, loss of bladder control, bradycardia, and bronchospasm being prominent features of muscarinic receptor stimulation. Nicotinic receptor hyperstimulation leads to muscle spasm and hypercontraction with eventual paralysis. Central nervous system effects are also present, with confusion, nausea, and dizziness, and eventually coma, seizures, and centrally mediated cardiac and respiratory arrest (Davis et al. 1980). Other than neurobehavioral effects in humans, OPs cause cholinergic crisis, intermediate syndrome, OP-induced delayed neuropathy, and chronic organophosphate-induced neuropsychiatric disorders in a time- and dose-dependent manner (Ganie et al. 2022).
CBs are N-substituted esters of carbamic acid. Due to their broad spectrum of biological activity, carbamates can be used as insecticides, fungicides, nematocides, acaricides, molluscicides, sprout inhibitors, or herbicides (Ballantyne et al. 2017). CBs impair the enzymatic pathways involved in the metabolism of carbohydrates, fats, and proteins within the cytoplasm, mitochondria, and peroxisomes. It is believed that CBs exert this effect through inhibition of AChE or by affecting target organs directly (Karami et al. 2011). The symptoms of exposure to CBs and OPs are similar, although poisoning from CBs is of a shorter duration (Mdeni et al. 2022). CBs are also known to be potentially harmful to non-target organisms. The major concerns are toxic effects such as interference with the reproductive system and fetal development (Morais et al. 2012).
Alkyl chains, which are common components of organic compounds, can vary widely in terms of toxicity depending on their structure, length, and functional groups attached (Wang et al. 2015). The toxicity of long alkyl chains is often related to their ability to disrupt biological processes or membranes within living organisms (Liu et al. 2023). Some alkyl chains may have different metabolism mechanisms, and exposure routes can affect the toxicity of alkyl chains (Jeremias et al. 2021). However, research is still lacking on the role of alkyl-chain lengths in acute toxicity, and further exploration is warranted to investigate the impact of alkyl chains on acute toxicity in humans.
Finally, compounds predicted to be acutely toxic in the Tox21 10K compound library were analyzed for their consumer product use categories. Compounds categorized as “pesticides” have the potential to induce acute toxicity through various exposure routes (Tudi et al. 2022). For instance, disulfoton (CAS # 298-04-4) is a synthetic organic thiophosphate used as a pesticide to manage a range of harmful pests that affect numerous field and vegetable crops. Disulfoton is an acetylcholinesterase inhibitor and poses toxicity risks through inhalation, skin absorption, and/or ingestion, potentially leading to skin or eye burns upon contact (Veronesi et al. 2022). Another compound, thiotepa (CAS # 52-24-4), belongs to the alkylating agent class of drugs extensively employed in treating breast cancer, ovarian cancer, and bladder cancer. Thiotepa may induce pulmonary toxicity, which could compound the effects of other cytotoxic agents, and high doses might contribute to neurotoxicity (Maritaz et al. 2018). Chloroacetone (CAS # 78-95-5), categorized under “chemicals,” is used in color photography, insecticides, perfumes, organic synthesis, tear gas, and the polymerization of vinyl monomers. Chloroacetone is toxic through inhalation, ingestion, and dermal contact, causing immediate lacrimation at low concentrations. Exposure to chloroacetone can result in contact burns on the skin and eyes, nausea, bronchospasm, delayed pulmonary edema, and potentially death (National Research Council 2009).
A limitation of the current study is that the data used for model development included only chemicals that induce oral toxicity, with in vivo data derived from rat models rather than humans. Further research on developing acute toxicity prediction models based on acute toxicity data through other routes and from human models should be explored. Nonetheless, this study demonstrates for the first time that in vitro assay data can be highly predictive of in vivo toxicity endpoints. This finding highlights the potential of assay data as a robust indicator of in vivo toxicities. The results from this study complement and corroborate the findings of the CATMoS project, which used chemical structures primarily for acute toxicity prediction. However, the assay data-based models were still slightly outperformed by the chemical structure-based models in predicting acute toxicity. This could be attributed to the limited biological response space covered by the Tox21 in vitro assays, which may not be sufficient to represent all targets and pathways underlying acute toxicity (Huang et al. 2018). Further expansion of the Tox21 assay panel to provide a better coverage of the toxicological target space holds the promise of improving the predictive capacity of the in vitro assay data.
5. Conclusions
In summary, we developed machine learning models based on Tox21 in vitro assay data in comparison with chemical structure to predict acute toxicity. Both types of data produced robust models, indicating the utility of in vitro assay data in predicting acute toxicity. The best models were then applied to identify potentially toxic compounds within the Tox21 10K compound library, which can be prioritized for experimental validation and more in-depth toxicological evaluation. In addition, we identified significant chemotypes and assay targets that proved most predictive of acute toxicity. These findings offer insights into the pathways and mechanisms underlying chemical-induced acute toxicity and highlight structural cues that may serve as early indicators of such toxicities. Moreover, the superior predictive performance of assays with metabolic capacity also underlines the role of metabolic activation in chemical-induced toxicity. Importantly, these results underscore the value of in vitro assay data in predicting in vivo toxicity.
Fig. 2. Chemotypes that are significantly associated with acute toxicity. Chemotypes are sorted by RF importance.
Highlights:
Machine learning models were developed using Tox21 in vitro assay data and chemical structure information for acute toxicity prediction.
Tox21 assay data as well as chemical structure showed good predictive power for in vivo acute toxicity.
Top assay targets predictive of acute toxicity include AChE, p53, and cytochrome P450s.
Chemotypes found to be most predictive of acute toxicity include organophosphates and carbamates.
Most Tox21 10K compounds were predicted as non-toxic.
Acknowledgement
This work was supported by the Intramural Research Programs of the National Toxicology Program (Interagency agreement #Y2-ES-7020-01), National Institute of Environmental Health Sciences and the National Center for Advancing Translational Sciences, National Institutes of Health. The views expressed in this article are those of the authors and do not necessarily reflect the statements, opinions, views, conclusions, or policies of the National Center for Advancing Translational Sciences, the National Institutes of Health. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Supplementary Material
The public access link to the Python codes used in this study and the data that support the findings of this study are available in the Supplementary Material of this article.
References
- Ashauer R, O’Connor I and Escher BI (2017). “Toxic Mixtures in Time The Sequence Makes the Poison.” Environmental science & technology 51(5): 3084–3092.
- Attene-Ramos MS, Miller N, Huang R, Michael S, Itkin M, Kavlock RJ, Austin CP, Shinn P, Simeonov A, Tice RR and Xia M (2013). “The Tox21 robotic platform for the assessment of environmental chemicals - from vision to reality.” Drug Discov Today 18(15–16): 716–723.
- Aubrey BJ, Kelly GL, Janic A, Herold MJ and Strasser A (2018). “How does p53 induce apoptosis and how does this relate to p53-mediated tumour suppression?” Cell death & differentiation 25(1): 104–113.
- Ballantyne B and Marrs TC (2017). Clinical and experimental toxicology of organophosphates and carbamates, Elsevier.
- Collins FS, Gray GM and Bucher JR (2008). “Toxicology. Transforming environmental health protection.” Science 319(5865): 906–907.
- Damalas CA and Koutroubas SD (2016). Farmers’ exposure to pesticides: toxicity types and ways of prevention, MDPI. 4: 1.
- Davis CS and Richardson RJ (1980). “Organophosphorus compounds.” Experimental and clinical neurotoxicology 1.
- Dong Z, Liu Y, Duan L, Bekele D and Naidu R (2015). “Uncertainties in human health risk assessment of environmental contaminants: a review and perspective.” Environment international 85: 120–132.
- EPA (2022). “https://www.epa.gov/chemical-research/comptox-chemicals-dashboard.”
- Erhirhie Ihekwereme and Ilodigwe EE (2018). “Advances in acute toxicity testing: strengths, weaknesses and regulatory acceptance.” Interdisciplinary toxicology 11(1): 5–12.
- Escrivá L, Font G and Manyes L (2015). “In vivo toxicity studies of fusarium mycotoxins in the last decade: A review.” Food and Chemical Toxicology 78: 185–206.
- Fotiou D, Roussou M, Gakiopoulou C, Psimenou E, Gavriatopoulou M, Migkou M, Kanellias N, Dialoupi I, Eleutherakis-Papaiakovou E and Giannouli S (2020). “Carfilzomib-associated renal toxicity is common and unpredictable: a comprehensive analysis of 114 multiple myeloma patients.” Blood Cancer Journal 10(11): 109.
- Ganie SY, Javaid D, Hajam YA and Reshi MS (2022). “Mechanisms and treatment strategies of organophosphate pesticide induced neurotoxicity in humans: A critical appraisal.” Toxicology 472: 153181.
- Gulliver LS (2017). “Xenobiotics and the glucocorticoid receptor.” Toxicology and Applied Pharmacology 319: 69–79.
- Gupta VK, Pal R, Siddiqi NJ and Sharma B (2015). “Acetylcholinesterase from human erythrocytes as a surrogate biomarker of lead induced neurotoxicity.” Enzyme research 2015.
- Gupta VK and Rana PS (2019). “Toxicity prediction of small drug molecules of androgen receptor using multilevel ensemble model.” Journal of bioinformatics and computational biology 17(05): 1950033.
- Gurung S and Kunwar M (2017). “Awareness regarding health effects of pesticides use among farmers in a municipality of Rupandehi district.” Journal of Universal College of Medical Sciences 5(2): 18–21.
- Ho C-J, Lin R-W, Zhu W-H, Wen T-K, Hu C-J, Lee Y-L, Hung T-I and Wang C (2019). “Transcription-independent and-dependent p53-mediated apoptosis in response to genotoxic and non-genotoxic stress.” Cell death discovery 5(1): 131.
- Hrycay EG and Bandiera SM (2015). Monooxygenase, peroxidase and peroxygenase properties and reaction mechanisms of cytochrome P450 enzymes, Springer.
- Huang R (2016). “A quantitative high-throughput screening data analysis pipeline for activity profiling.” High-throughput screening assays in toxicology: 111–122.
- Huang R, Southall N, Wang Y, Yasgar A, Shinn P, Jadhav A, Nguyen D-T and Austin CP (2011). “The NCGC pharmaceutical collection: a comprehensive resource of clinically approved drugs enabling repurposing and chemical genomics.” Science translational medicine 3(80): 80ps16–80ps16.
- Huang R, Xia M, Sakamuru S, Zhao J, Lynch C, Zhao T, Zhu H, Austin CP and Simeonov A (2018). “Expanding biological space coverage enhances the prediction of drug adverse effects in human using in vitro activity profiles.” Sci Rep 8(1): 3783.
- Huang R, Xia M, Sakamuru S, Zhao J, Shahane SA, Attene-Ramos M, Zhao T, Austin CP and Simeonov A (2016). “Modelling the Tox21 10 K chemical profiles for in vivo toxicity prediction and mechanism characterization.” Nat Commun 7: 10425.
- Huang R, Zhu H, Shinn P, Ngan D, Ye L, Thakur A, Grewal G, Zhao T, Southall N and Hall MD (2019). “The NCATS Pharmaceutical Collection: a 10-year update.” Drug Discovery Today 24(12): 2341–2349.
- Huang T, Sun G, Zhao L, Zhang N, Zhong R and Peng Y (2021). “Quantitative structure-activity relationship (QSAR) studies on the toxic effects of nitroaromatic compounds (NACs): A systematic review.” International Journal of Molecular Sciences 22(16): 8557.
- Idakwo G, Luttrell J, Chen M, Hong H, Zhou Z, Gong P and Zhang C (2018). “A review on machine learning methods for in silico toxicity prediction.” Journal of Environmental Science and Health, Part C 36(4): 169–191.
- Ivanov J, Polshakov D, Kato-Weinstein J, Zhou Q, Li Y, Granet R, Garner L, Deng Y, Liu C and Albaiu D (2020). “Quantitative structure–activity relationship machine learning models and their applications for identifying viral 3CLpro-and RdRp-targeting compounds as potential therapeutics for COVID-19 and related viral infections.” ACS omega 5(42): 27344–27358.
- Jeremias G, Jesus F, Ventura SP, Gonçalves FJ, Asselman J and Pereira JL (2021). “New insights on the effects of ionic liquid structural changes at the gene expression level: Molecular mechanisms of toxicity in Daphnia magna.” Journal of Hazardous Materials 409: 124517.
- Karami-Mohajeri S and Abdollahi M (2011). “Toxic influence of organophosphate, carbamate, and organochlorine pesticides on cellular metabolism of lipids, proteins, and carbohydrates: a systematic review.” Human & experimental toxicology 30(9): 1119–1140.
- Karunarathne A, Gunnell D, Konradsen F and Eddleston M (2020). “How many premature deaths from pesticide suicide have occurred since the agricultural Green Revolution?” Clinical toxicology 58(4): 227–232.
- Kassel O and Herrlich P (2007). “Crosstalk between the glucocorticoid receptor and other transcription factors: molecular aspects.” Molecular and cellular endocrinology 275(1–2): 13–29.
- Kavlock RJ, Austin CP and Tice RR (2009). “Toxicity testing in the 21st century: implications for human health risk assessment.” Risk Anal 29(4): 485–487; discussion 492–487.
- Lee S, Park K, Ahn H-S and Kim D (2010). “Importance of structural information in predicting human acute toxicity from in vitro cytotoxicity data.” Toxicology and applied pharmacology 246(1–2): 38–48.
- Li S, Huang R, Solomon S, Liu Y, Zhao B, Santillo MF and Xia M (2017). “Identification of acetylcholinesterase inhibitors using homogenous cell-based assays in quantitative high-throughput screening platforms.” Biotechnol J 12(5).
- Li S, Zhao J, Huang R, Santillo MF, Houck KA and Xia M (2019). “Use of high-throughput enzyme-based assay with xenobiotic metabolic capability to evaluate the inhibition of acetylcholinesterase activity by organophosphorous pesticides.” Toxicol In Vitro 56: 93–100.
- Liu S, Wang P, Wang C, Chen J, Wang X, Hu B and Shan X (2023). “Disparate toxicity mechanisms of parabens with different alkyl chain length in freshwater biofilms: Ecological hazards associated with antibiotic resistome.” Science of The Total Environment 881: 163168.
- Low LA, Mummery C, Berridge BR, Austin CP and Tagle DA (2021). “Organs-on-chips: into the next decade.” Nature Reviews Drug Discovery 20(5): 345–361.
- Lynch C, Sakamuru S, Ooka M, Huang R, Klumpp-Thomas C, Shinn P, Gerhold D, Rossoshek A, Michael S, Casey W, Santillo MF, Fitzpatrick S, Thomas RS, Simeonov A and Xia M (2023). “High-Throughput Screening to Advance In Vitro Toxicology: Accomplishments, Challenges, and Future Directions.” Annu Rev Pharmacol Toxicol.
- Mansouri K, Karmaus AL, Fitzpatrick J, Patlewicz G, Pradeep P, Alberga D, Alepee N, Allen TE, Allen D and Alves VM (2021). “CATMoS: collaborative acute toxicity modeling suite.” Environmental health perspectives 129(4): 047013.
- Mansouri K, Karmaus AL, Fitzpatrick J, Patlewicz G, Pradeep P, Alberga D, Alepee N, Allen TEH, Allen D, Alves VM, Andrade CH, Auernhammer TR, Ballabio D, Bell S, Benfenati E, Bhattacharya S, Bastos JV, Boyd S, Brown JB, Capuzzi SJ, Chushak Y, Ciallella H, Clark AM, Consonni V, Daga PR, Ekins S, Farag S, Fedorov M, Fourches D, Gadaleta D, Gao F, Gearhart JM, Goh G, Goodman JM, Grisoni F, Grulke CM, Hartung T, Hirn M, Karpov P, Korotcov A, Lavado GJ, Lawless M, Li X, Luechtefeld T, Lunghini F, Mangiatordi GF, Marcou G, Marsh D, Martin T, Mauri A, Muratov EN, Myatt GJ, Nguyen DT, Nicolotti O, Note R, Pande P, Parks AK, Peryea T, Polash AH, Rallo R, Roncaglioni A, Rowlands C, Ruiz P, Russo DP, Sayed A, Sayre R, Sheils T, Siegel C, Silva AC, Simeonov A, Sosnin S, Southall N, Strickland J, Tang Y, Teppen B, Tetko IV, Thomas D, Tkachenko V, Todeschini R, Toma C, Tripodi I, Trisciuzzi D, Tropsha A, Varnek A, Vukovic K, Wang Z, Wang L, Waters KM, Wedlake AJ, Wijeyesakere SJ, Wilson D, Xiao Z, Yang H, Zahoranszky-Kohalmi G, Zakharov AV, Zhang FF, Zhang Z, Zhao T, Zhu H, Zorn KM, Casey W and Kleinstreuer NC (2021). “CATMoS: Collaborative Acute Toxicity Modeling Suite.” Environ Health Perspect 129(4): 47013.
- Maritaz C, Lemare F, Laplanche A, Demirdjian S, Valteau-Couanet D and Dufour C (2018). “High-dose thiotepa-related neurotoxicity and the role of tramadol in children.” BMC cancer 18: 1–9.
- Mdeni NL, Adeniji AO, Okoh AI and Okoh OO (2022). “Analytical evaluation of carbamate and organophosphate pesticides in human and environmental matrices: a review.” Molecules 27(3): 618.
- Mermer A and Alyar S (2022). “Synthesis, characterization, DFT calculation, antioxidant activity, ADMET and molecular docking of thiosemicarbazide derivatives and their Cu (II) complexes.” Chemico-Biological Interactions 351: 109742.
- Meyer-Baron M, Knapp G, Schäper M and van Thriel C (2015). “Meta-analysis on occupational exposure to pesticides–Neurobehavioral impact and dose–response relationships.” Environmental research 136: 234–245.
- Mirzayans R, Andrais B, Kumar P and Murray D (2017). “Significance of wild-type p53 signaling in suppressing apoptosis in response to chemical genotoxic agents: Impact on chemotherapy outcome.” International journal of molecular sciences 18(5): 928.
- Morais S, Dias E and Pereira ML (2012). “Carbamates: human exposure and health effects.” The impact of pesticides: 21–38.
- National Research Council, Committee on Acute Exposure Guideline Levels (2009). “Sixteenth Interim Report of the Committee on Acute Exposure Guideline Levels.”
- NCATS. (2021). “Tox21 Data Browser.” 2021, from https://tripod.nih.gov/tox21/pubdata.
- NCATS. (2022). “https://ncats.nih.gov/research/research-activities/Tox21.”
- Ooka M, Zhao J, Shah P, Travers J, Klumpp-Thomas C, Xu X, Huang R, Ferguson S, Witt KL, Smith-Roe SL, Simeonov A and Xia M (2022). “Identification of environmental chemicals that activate p53 signaling after in vitro metabolic activation.” Arch Toxicol 96(7): 1975–1987.
- Parrón T, Requena M, Hernández AF and Alarcón R (2014). “Environmental exposure to pesticides and cancer risk in multiple human organ systems.” Toxicology Letters 230(2): 157–165.
- Petreski T, Kit B, Strnad M, Grenc D and Svenšek F (2020). “Cholinergic syndrome: a case report of acute organophosphate and carbamate poisoning.” Archives of Industrial Hygiene and Toxicology 71(2): 163–166.
- PubChem (2022). “Tox21 phase II data.” http://www.ncbi.nlm.nih.gov/pcassay?term=tox21.
- Richard AM, Huang R, Waidyanatha S, Shinn P, Collins BJ, Thillainadarajah I, Grulke CM, Williams AJ, Lougee RR, Judson RS, Houck KA, Shobair M, Yang C, Rathman JF, Yasgar A, Fitzpatrick SC, Simeonov A, Thomas RS, Crofton KM, Paules RS, Bucher JR, Austin CP, Kavlock RJ and Tice RR (2021). “The Tox21 10K Compound Library: Collaborative Chemistry Advancing Toxicology.” Chem Res Toxicol 34(2): 189–216. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sakamuru S, Zhu H, Xia M, Simeonov A and Huang R (2020). “CHAPTER 8 Profiling the Tox21 Chemical Library for Environmental Hazards: Applications in Prioritisation, Predictive Modelling, and Mechanism of Toxicity Characterisation.” Big Data in Predictive Toxicology, The Royal Society of Chemistry: 242–263. [Google Scholar]
- Tice RR, Austin CP, Kavlock RJ and Bucher JR (2013). “Improving the human hazard characterization of chemicals: a Tox21 update.” Environ Health Perspect 121(7): 756–765.
- Tudi M, Li H, Li H, Wang L, Lyu J, Yang L, Tong S, Yu QJ, Ruan HD and Atabila A (2022). “Exposure routes and health risks associated with pesticide application.” Toxics 10(6): 335.
- Veronesi M, Rodriguez M, Marinho G, Bomfeti CA, Rocha BA, Barbosa F, Souza MCO, da Silva Faria MC and Rodrigues JL (2022). “Degradation of Praguicide Disulfoton Using Nanocompost and Evaluation of Toxicological Effects.” International Journal of Environmental Research and Public Health 20(1): 786.
- Wang C, Wei Z, Wang L, Sun P and Wang Z (2015). “Assessment of bromide-based ionic liquid toxicity toward aquatic organisms and QSAR analysis.” Ecotoxicology and environmental safety 115: 112–118.
- Witt KL, Hsieh JH, Smith-Roe SL, Xia M, Huang R, Zhao J, Auerbach SS, Hur J and Tice RR (2017). “Assessment of the DNA damaging potential of environmental chemicals using a quantitative high-throughput screening approach to measure p53 activation.” Environ Mol Mutagen 58(7): 494–507.
- Xiao Y, Zhu Y, Yu S, Yan C, JY Ho R, Liu J, Li T, Wang J, Wan L and Yang X (2016). “Thirty-day rat toxicity study reveals reversible liver toxicity of mifepristone (RU486) and metapristone.” Toxicology mechanisms and methods 26(1): 36–45.
- Xu T, Ngan DK, Ye L, Xia M, Xie HQ, Zhao B, Simeonov A and Huang R (2020). “Predictive Models for Human Organ Toxicity Based on In Vitro Bioactivity Data and Chemical Structure.” Chem Res Toxicol 33(3): 731–741.
- Xu T, Wu L, Xia M, Simeonov A and Huang R (2021). “Systematic Identification of Molecular Targets and Pathways Related to Human Organ Level Toxicity.” Chem Res Toxicol 34(2): 412–421.
- Yanagisawa N, Morita H and Nakajima T (2006). “Sarin experiences in Japan: acute toxicity and long-term effects.” Journal of the neurological sciences 249(1): 76–85.
- Yang C, Tarkhov A, Marusczyk J, Bienfait B, Gasteiger J, Kleinoeder T, Magdziarz T, Sacher O, Schwab CH, Schwoebel J, Terfloth L, Arvidson K, Richard A, Worth A and Rathman J (2015). “New publicly available chemical query language, CSRML, to support chemotype representations for application to data mining and modeling.” J Chem Inf Model 55(3): 510–528.
- Ye L, Ngan DK, Xu T, Liu Z, Zhao J, Sakamuru S, Zhang L, Zhao T, Xia M and Simeonov A (2022). “Prediction of drug-induced liver injury and cardiotoxicity using chemical structure and in vitro assay data.” Toxicology and applied pharmacology 454: 116250.
- Zaitsu K, Katagi M, Tsuchihashi H and Ishii A (2014). “Recently abused synthetic cathinones, α-pyrrolidinophenone derivatives: a review of their pharmacology, acute toxicity, and metabolism.” Forensic Toxicology 32: 1–8.
- Zerdoumi Y, Kasper E, Soubigou F, Adriouch S, Bougeard G, Frebourg T and Flaman J-M (2015). “A new genotoxicity assay based on p53 target gene induction.” Mutation Research/Genetic Toxicology and Environmental Mutagenesis 789: 28–35.
- Zhang R, Li P, Zhang R, Shi X, Li Y, Zhang Q and Wang W (2021). “Computational study on the detoxifying mechanism of DDT metabolized by cytochrome P450 enzymes.” Journal of Hazardous Materials 414: 125457.
- Zhao M, Ma J, Li M, Zhang Y, Jiang B, Zhao X, Huai C, Shen L, Zhang N and He L (2021). “Cytochrome P450 enzymes and drug metabolism in humans.” International journal of molecular sciences 22(23): 12808.
