Abstract
Assessing the mutagenicity of chemical compounds is crucial for ensuring their safety and minimizing potential environmental and public health risks. However, traditional mutagenicity assessments, such as the Ames test, are time-consuming, resource-intensive, and often limited in their capacity to screen a large number of compounds. To address this gap, predictive models powered by deep learning offer a promising alternative for rapid and cost-effective mutagenicity screening. In this study, we propose an integrated deep learning framework utilizing diverse molecular features to predict compound mutagenicity. Of the 5866 compounds collected, 5279 were used for model training and the remaining 587 for model evaluation. A total of 78 integrated models were developed by systematically combining 13 types of molecular descriptors and fingerprints. The MACCS-Mordred model demonstrated the best performance, achieving a balanced accuracy of 0.885 and a precision score of 0.922 in the testing data set. In addition, we performed an activity cliff analysis to examine potential sources of mispredictions. Applicability domain analysis further supported the robustness of the model, indicating that most compounds in our data set fell within the reliable prediction space. Notably, feature importance analysis revealed that mutagenic compounds are more likely to contain nitrogen-containing and ring-related substructures, offering insights into structural characteristics associated with mutagenic risk. Our results support AI-enabled screening tools for prioritizing hazardous compounds and improving early-stage chemical risk assessment. This work provides practical value for environmental monitoring and regulatory decision-making.
1. Introduction
Mutagenicity is a hazardous end point of concern that describes the capacity of a compound to induce mutations in deoxyribonucleic acid (DNA) sequences. Compounds inducing mutagenicity have the potential to pose long-term risks to living beings, causing heritable mutations in germ cells and cancer in somatic cells. Chemical-related organizations and policies worldwide have regulated the assessment of mutagenicity as an essential requirement for the safety of chemical compounds, drug candidates, and consumer products. Among all detection methods, the Ames test is considered the standard assay for mutagenicity. The Ames test adopts at least five different cell strains for mutagenicity evaluation, of which four should be the assigned strains (TA1535, TA1537 (or TA97a or TA97), TA98, and TA100). If at least one of the tested cell strains gives a positive result, the compound is considered a mutagen. These requirements have significantly enhanced the reliability and reproducibility of the Ames test, making it a widely utilized method for regulatory purposes before registering new and existing compounds.
The accumulated cost and time involved in the Ames assessment are becoming crucial as approximately 4000 emerging compounds are added to the registry daily. This has led to a growing interest in in silico methods because of their speed and cost-effectiveness. The most applied computational method for mutagenicity prediction is the quantitative structure–activity relationship (QSAR) model. In a QSAR model, molecular descriptors are used to represent the characteristics of chemicals, and computational approaches such as machine learning (ML) algorithms are applied to calculate the sophisticated quantitative relationship between mutagenicity and molecular descriptors. For instance, compound mutagenicity can be predicted by the presence of expert-rule-based substructures (i.e., structural alerts) and certain types of molecular fragments, or via the quantified statistical correlations between molecular descriptors and mutagenicity. In addition, a vast number of molecular descriptors have been explored in recent years, and various commercial QSAR models (e.g., CASE ultra and VEGA) have been established and have proven their effectiveness. Furthermore, an increasing use of machine learning and deep learning in cheminformatic modeling can be observed in the past decades. These studies showed that QSAR and cheminformatic models can mitigate the costs involved in assessment owing to their efficient and accurate predictive abilities.
Deep neural networks (DNNs) have become a common approach in various predictions due to their excellent ability to process large and complex data sets. Specifically, they excel in analyzing features and their connections to mutagenicity. For example, in a comparative study involving 4053 compounds, deep learning models outperformed traditional machine learning algorithms in predicting mutagenicity. Similarly, the application of a message passing neural network, which represents an advanced form of graph neural networks, demonstrated superior performance in predicting mutagenicity as well as six other types of toxicity. However, most models were constructed solely on one specific type of molecular feature. Integrated modeling approaches, which combine the outputs of multiple models built on diverse feature types, have been suggested to improve predictive performance. Building on this rationale, an integrated model has the potential to achieve superior accuracy in mutagenicity prediction.
In this study, we aimed to establish an integrated DNN model to predict mutagenicity (Figure 1). Compounds from three databases were collected and split into training and testing data sets to establish and evaluate the models. Various models were established following the engineering of molecular features. Each model was optimized by tuning the hyperparameters. Subsequently, these optimized models were combined pairwise to integrate their predictions, and the best integrated model was selected according to the score metrics at cross-validation. An analysis of feature importance was conducted afterward to unveil the relationship between molecular descriptors and mutagenicity. In addition, the applicability domain (AD) was implemented to identify the reliable prediction region of the constructed model. We believe that the integrated model will be useful in reducing costs and accelerating mutagenicity assessments.
Figure 1. Workflow of the study. Compounds with mutagenicity data were collected from public libraries. Molecular features were generated and used to establish deep-learning models. Models were optimized by tuning hyperparameters through cross-validation. Optimized models were combined in a pairwise manner to generate an integrated model. The best integrated model was then used to analyze feature importance and calculate the applicability domain, ensuring reliable predictions of mutagenicity.
2. Material and Methods
2.1. Data Collection and Preparation
Compound data sets were gathered from the ISSSTY, ISSCAN, and MicotoXilico data sets. The former two data sets were provided by Istituto Superiore di Sanità under the ISSTOX project, and MicotoXilico was compiled from the work of Aydın and Rencüzoğulları and open-source databases, such as CPDB, CCRIS, and OpenFoodTox. Duplicate compounds were first removed. Next, compounds with equivocal labels, as defined by the ISSSTY and ISSCAN data sets (e.g., dimethoate and tribromoacetic acid), were excluded from this study to allow binary classification. The largest components were retained, and compound structures were standardized and sanitized using the datamol toolkit. In total, 5866 compounds were included in our data set, with the training and testing data points randomly selected at a 90:10 ratio. The training data set included 5279 compounds, containing 2996 mutagens and 2283 nonmutagens. The testing data set included 587 compounds, containing 327 mutagens and 260 nonmutagens. Principal component analysis (PCA) was utilized for chemical space visualization, where the number of dimensions/principal components was set to two for the PCA plot, and the percentage of retained information was calculated.
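The 90:10 random split can be illustrated with a minimal sketch. The seed, the placeholder compound IDs, and the use of flooring to obtain the training size are assumptions made for illustration; the source only states the ratio and the resulting set sizes.

```python
import random


def random_split(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of `items` and cut it at the requested fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Flooring is an assumption; it reproduces the reported 5279/587 sizes.
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 5866 curated compounds.
compound_ids = range(5866)
train_ids, test_ids = random_split(compound_ids)
print(len(train_ids), len(test_ids))
```

With 5866 items and a 0.9 fraction, this yields 5279 training and 587 testing items, matching the sizes reported above.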
2.2. Feature Engineering
Molecular descriptors were applied to convert chemical structures into a data format compatible with the model. The details of these features and the number of features used for each model are summarized in Table S1. The fingerprints, including MACCS, Avalon, ECFP, FCFP, AtomPair, Topological, RDKit, Layered, and Pattern, were adopted due to their common application in drug discovery and biodegradability prediction models. The Mordred descriptors were selected for their variety of constitutional features, and pretrained fingerprints (e.g., Roberta-Zinc480M-102M, GPT2-Zinc480M-87M, and MOLT5) were selected for their novel generation approach based on language models. In total, 13 different types of descriptors were utilized. All features were generated using Molfeat, and the missing values were removed.
2.3. DNN Model Construction
DNN models were established with the Keras API. Each DNN model could be expressed as a fully connected network whose computing capability depends on various hyperparameters. To optimize the model, the following hyperparameters were tuned: the number of hidden layers ∈ {1, 2, 3}, the number of neurons per hidden layer ∈ {(512,*), (512, 128), (512, 128, 8)}, and the optimizer ∈ {Adamax, Adam}. Other hyperparameters, such as the learning rate and the application of batch normalization, were fixed because they had negligible effects on the learning process in preliminary experiments. In addition, the activation functions for the hidden and output layers were set as ReLU and sigmoid, respectively.
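The search space described above can be enumerated with a short sketch. It assumes that "(512,*)" denotes a single hidden layer of 512 neurons and that every feature type is paired with every architecture and optimizer; the dictionary keys are illustrative names, not Keras arguments.

```python
from itertools import product

# The 13 feature types named in Section 2.2.
FEATURES = [
    "MACCS", "Avalon", "ECFP", "FCFP", "AtomPair", "Topological", "RDKit",
    "Layered", "Pattern", "Mordred", "Roberta-Zinc480M-102M",
    "GPT2-Zinc480M-87M", "MOLT5",
]
# One, two, or three hidden layers (neurons per layer).
HIDDEN_LAYERS = [(512,), (512, 128), (512, 128, 8)]
OPTIMIZERS = ["Adamax", "Adam"]

configs = [
    {
        "feature": feature,
        "hidden_units": hidden_units,
        "optimizer": optimizer,
        "hidden_activation": "relu",
        "output_activation": "sigmoid",
    }
    for feature, hidden_units, optimizer in product(
        FEATURES, HIDDEN_LAYERS, OPTIMIZERS
    )
]
print(len(configs))  # 13 * 3 * 2 = 78
```

Under these assumptions the grid contains 78 configurations, which agrees with the number of single-feature models reported in Section 3.2.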
2.4. Integrated Model Construction
Integrated models were established by pairwise combining individual DNN models, which was similar to the assessment standard of OECD TG471 guidelines. A compound was labeled as positive (1) if at least one prediction was positive (mutagenic), and a compound was labeled as negative (0) if none of the models predicted it as positive, that is, if all models predicted it as negative (nonmutagenic). The integrated model that exhibited the best performance was evaluated using the testing data set and utilized for further analysis of feature importance. Six score metrics were used to evaluate the model performances: accuracy, balanced accuracy, precision, recall, F1 score, and the MCC (Matthews correlation coefficient). These metrics were calculated from the number of true positives (TPs), false positives (FPs), true negatives (TNs), and false negatives (FNs). The formulas for these metrics are listed below (eqs 1–6). Balanced accuracy and precision scores were calculated.
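$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\text{balanced accuracy} = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right) \tag{2}$$

$$\text{precision} = \frac{TP}{TP + FP} \tag{3}$$

$$\text{recall} = \frac{TP}{TP + FN} \tag{4}$$

$$F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{5}$$

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{6}$$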
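The pairwise rule and the six metrics of Section 2.4 can be sketched in plain Python. The function names and the toy predictions are illustrative assumptions, and the metric formulas are the standard definitions; the sketch does not guard against empty classes, which would cause a division by zero.

```python
from math import sqrt


def or_rule(pred_a, pred_b):
    """Label a compound 1 if either model predicts 1, otherwise 0."""
    return [int(a == 1 or b == 1) for a, b in zip(pred_a, pred_b)]


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Compute the six scores from the confusion counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Toy labels and predictions from two hypothetical single-feature models.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0]
model_b = [0, 1, 0, 1, 0, 0]
integrated = or_rule(model_a, model_b)
scores = metrics(*confusion_counts(y_true, integrated))
```

Because a positive call from either model is sufficient, the rule can only add positive predictions relative to each single model, so it tends to raise recall and may lower precision.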
2.5. Feature Importance
The importance and positive/negative contribution of each feature for mutagenicity predictions were derived from the analysis of the integrated model with the best cross-validation performance. The feature importance was analyzed using the SHAP (SHapley Additive exPlanations) method, an explainable artificial intelligence technique derived from cooperative game theory. The SHAP method expresses feature importance as a Shapley value. A feature with a higher absolute Shapley value has a greater influence on the mutagenicity prediction. Furthermore, a positive Shapley value can be interpreted as a feature that positively influences the final prediction; conversely, a negative Shapley value can be considered a negative contributing factor. In addition, how modification of the identified important features affects compound mutagenicity was explored using the Exmol package, a technique that generates analogs with a mutagenicity different from that of the given chemical based on Tanimoto similarity and a specified condition equation. The equation is stated below (eq 7)
$$\min_{x'} \; d(x, x') \quad \text{such that} \quad f(x) \neq f(x') \tag{7}$$
where x and x′ represent the molecular descriptor vectors, d(x, x′) is a measure of distance between the molecular descriptors, and f(x) and f(x′) are the mutagenicity predictions for the compounds.
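The selection condition built from d(x, x′), f(x), and f(x′) can be illustrated with a small sketch: among candidate analogs, keep those whose predicted class differs from that of the query and return the one at the smallest Tanimoto distance. This is not the Exmol API; the function names, the bit-set representation, and the toy predictor are hypothetical and serve only to show the idea.

```python
def tanimoto_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def closest_counterfactual(query, candidates, predict):
    """Return the nearest candidate whose prediction differs from the query's."""
    base = predict(query)
    flipped = [c for c in candidates if predict(c) != base]
    if not flipped:
        return None
    return min(flipped, key=lambda c: tanimoto_distance(query, c))


def toy_predict(bits):
    """Hypothetical stand-in for f(x): mutagenic (1) if bit 63 is set."""
    return int(63 in bits)


query = frozenset({22, 63, 125})
analogs = [
    frozenset({22, 125}),          # prediction flips, distance 1/3
    frozenset({22, 63, 84, 125}),  # same prediction as the query
    frozenset({1, 2}),             # prediction flips, distance 1
]
best = closest_counterfactual(query, analogs, toy_predict)
```

In this toy case the analog that differs from the query only by the absence of bit 63 is returned, since it changes the predicted class at the smallest distance.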
2.6. Applicability Domain
The prediction reliability of compounds was assessed by the AD, a theoretical chemical space determined by training sets. Reliable predictions are only generated within the AD; predictions outside the AD are considered unreliable. In this study, the pyADA package, an open-source Python toolkit, was utilized to determine the AD. The package employed the leverage approach to calculate the boundary, which is the critical hat value (h*), of the AD and the h value for each compound. The h* value is calculated as 3(p + 1)/n, where p is the feature number in the model and n is the compound number in the training sets. Compounds with a higher h value than h* represent a great structural difference compared to the training sets and are thus considered outside the AD.
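The leverage approach can be sketched without external libraries. The code below is a generic illustration and not the pyADA implementation: it computes h = xᵀ(XᵀX)⁻¹x for a compound by solving a linear system, and compares it with h* = 3(p + 1)/n. It assumes XᵀX is invertible (real fingerprint matrices often contain collinear columns and would need a pseudo-inverse), and the value p = 167 used at the end is an assumption about the MACCS fingerprint length.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def gram(X):
    """Return X^T X for a list-of-rows matrix."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]


def hat_value(x, xtx):
    """Leverage h = x^T (X^T X)^-1 x of one compound."""
    z = solve(xtx, list(x))
    return sum(xi * zi for xi, zi in zip(x, z))


def critical_hat(p, n):
    """Boundary h* = 3(p + 1)/n for p features and n training compounds."""
    return 3 * (p + 1) / n


train = [[1, 0], [0, 1], [1, 1]]   # toy training descriptors
xtx = gram(train)
h_star = critical_hat(p=2, n=3)
h_query = hat_value([3, 0], xtx)   # a compound far from the training data
outside_ad = h_query > h_star
```

For comparison, `critical_hat(167, 5279)` evaluates to roughly 0.0955, which is close to the MACCS boundary of 0.095 reported in Section 3.5.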
3. Results
3.1. Data Set Analysis
The training and testing sets were obtained independently through random splitting. We first performed a PCA to visualize the chemical space of the compounds. Three of the 13 molecular descriptors, the extended connectivity fingerprint (ECFP), MACCS, and Mordred features of the compounds, were utilized to construct the PCA, mapping the chemical distribution. The PCA results indicated a similar chemical distribution between the training (blue dots) and testing (orange dots) data sets (Figure 2). The chemical space represented by MACCS appeared more sparsely distributed compared to those based on ECFP and Mordred features. All testing data sets derived from these three feature types fell within the domain of their corresponding training data sets. In terms of explained variance, the first two principal components accounted for 1.59% for ECFP, 17.87% for MACCS, and 40.27% for Mordred descriptors. Furthermore, the PCA plots of mutagens and nonmutagens exhibited substantial overlap, suggesting that a nonlinear classification approach may be required to accurately distinguish between the two groups. The similarity between the training and testing distributions supported the robustness of the data set division and indicated a well-balanced representation for model training and evaluation.
Figure 2. Chemical distribution of the data set presented by the PCA plot. Training and testing sets are respectively presented in blue and orange dots. Mutagens and nonmutagens are respectively presented in red and blue dots.
3.2. Model Establishment
Next, using 13 types of molecular features representing various structural and property aspects of the compounds, we established DNN models on the training set. We also tested different network architectures by varying the number of hidden layers and the optimizer. In total, 78 models were established, and their performances were evaluated through 10-fold cross-validation. The results showed that models with three hidden layers exhibited better precision, ranging from 0.840 to 0.890 (Figure 3). In addition, the models using the Adamax optimizer outperformed those with the Adam optimizer on most metrics, achieving an average accuracy of 0.827, a balanced accuracy of 0.830, a precision of 0.876, a recall of 0.812, and an F1 score of 0.842 (Table 1). As a result, models with three hidden layers and the Adamax optimizer were selected for further study.
Figure 3. Performance comparison of models generated by various features and hyperparameters. (A, B) The 10-fold cross-validation results of each model for the Adamax and Adam optimizers and for the number of hidden layers utilized by each model. The balanced accuracy was labeled as “X hl cv-bac”, and the precision was labeled as “X hl cv-pre”. The blue-colored values represent the cross-validation balanced accuracy, and the orange-colored values represent the cross-validation precision. (C, D) The performance violin plots of each model for the Adamax and Adam optimizers.
Table 1. Model Performance Obtained by Averaging Results of 10-Fold Cross-Validation
| optimizer | accuracy | balanced accuracy | precision | recall | F1 |
|---|---|---|---|---|---|
| Adam | 0.818 ± 0.018 | 0.823 ± 0.017 | 0.877 ± 0.015 | 0.792 ± 0.026 | 0.832 ± 0.018 |
| Adamax | 0.827 ± 0.015 | 0.830 ± 0.015 | 0.876 ± 0.017 | 0.812 ± 0.024 | 0.842 ± 0.014 |
We further compared the performance of models established using various molecular features. The models utilizing substructure fingerprints (e.g., MACCS, RDKit, Layered, and Pattern) and the Mordred descriptor generally outperformed those based on circular fingerprints (e.g., ECFP and FCFP) and path-based fingerprints (e.g., Topological and AtomPair) (Figure 3). Among the models, those using the MACCS fingerprint and Mordred descriptor exhibited the best performance. The MACCS fingerprint model achieved the highest balanced accuracy of 0.846 and a precision of 0.900, while the Mordred descriptor model showed a balanced accuracy of 0.836 and a precision of 0.900. A previous study has noted that random data set splitting can sometimes result in higher error variance if the resulting subsets fail to capture the diversity of the original data set, particularly when the data are complex or unevenly distributed. Nevertheless, our findings indicate that models trained on randomly split data sets in this study achieved promising predictive performance. In addition, our models had low standard deviations during the cross-validation. Taken together, the comprehensive analysis of various molecular feature-based models suggested that the MACCS fingerprint and Mordred descriptor models were reliable and had the potential for predictive applications.
3.3. Integrated Model Performances
We combined the 13 individual models in a pairwise manner to develop integrated models, because merging different prediction models has the potential to enhance performance. In the integrated models, a compound was classified as positive for mutagenicity if any model identified the compound as mutagenic and as negative if all models concluded the compound was nonmutagenic. This classification aligned with the standards specified in OECD TG471 guidelines. A total of 78 integrated models were established, and their performances were evaluated using a 10-fold cross-validation method (Figure 4). The results showed that integrated models generally exhibited better balanced accuracy than one-feature models (Figure ). Several models demonstrated high accuracy and precision, exceeding 0.84. These models generally combined different feature types, including substructure fingerprints and the Mordred descriptor. We utilized a multicriteria decision-making strategy to select the best model among all established integrated models: the average of the cross-validation accuracy, balanced accuracy, and precision of each integrated model was used for the model decision. As shown in Table S2, the MACCS-Mordred model had an average performance of 0.857, superior to the other top 10 integrated models. This suggests that integrating various predictive models can enhance their predictive capabilities, and shows that the MACCS-Mordred model has the potential for further application.
Figure 4. Performance of integrated models combining pairwise models. The performance results are from 10-fold cross-validation. The upper right section (cells colored in blue) presents the balanced accuracy of the integrated models, and the bottom left section (cells colored in orange) shows the precision of the integrated models.
The results for the integrated MACCS and Mordred model and the individual MACCS and Mordred models are shown in Table 2, and the results for other single-feature models are shown in Table S3. For the training set, the integrated model achieved scores of 0.926 for accuracy, 0.929 for balanced accuracy, 0.963 for precision, 0.904 for recall, and 0.932 for F1 score. The models were then evaluated using the testing set. Consistent with the results of the training set, the integrated model had better overall performance than the individual models, achieving an accuracy of 0.882, a balanced accuracy of 0.885, a precision of 0.922, a recall of 0.862, and an F1 score of 0.891, although its precision was lower than that of either individual model. The results suggested that the integrated model was reliable for predicting mutagenicity.
Table 2. Performance of the Selected Integrated Model

| model | training Acc | training Bal acc | training precision | training recall | training F1 | training MCC | testing Acc | testing Bal acc | testing precision | testing recall | testing F1 | testing MCC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MACCS model | 0.911 | 0.918 | 0.974 | 0.866 | 0.917 | 0.828 | 0.872 | 0.879 | 0.944 | 0.820 | 0.877 | 0.753 |
| Mordred model | 0.896 | 0.905 | 0.976 | 0.837 | 0.902 | 0.804 | 0.850 | 0.860 | 0.944 | 0.777 | 0.852 | 0.717 |
| integrated model | 0.926 | 0.929 | 0.963 | 0.904 | 0.932 | 0.852 | 0.882 | 0.885 | 0.922 | 0.862 | 0.891 | 0.766 |
“Acc” represents the accuracy, and “Bal acc” represents the balanced accuracy.
3.4. Individual Compound Mutagenicity Impacts of Chemical Descriptors
Feature importance was analyzed using the SHAP method to identify critical features affecting the potential for mutagenicity of a compound. The features included MACCS fingerprints and Mordred descriptors. A feature was considered to have a greater impact on mutagenicity if it exhibited a high SHAP value. The top 20 key positively contributing features of MACCS fingerprints ranked by their SHAP values are shown in Figure 5A and can be categorized into four groups according to their structural characteristics: (1) halogen group, (2) nitrogen-containing group, (3) oxygen-containing group, and (4) ring-related group (Figure 5B). First, the halogen group consisted of features including the halogen atoms fluorine (F), chlorine (Cl), bromine (Br), and iodine (I). For example, the fluorine atom (i.e., MACCS134) of the compound N-(4-fluorophenyl)-5-nitro-3-thiophenecarboxamide was identified as a key feature inducing mutagenicity (Figure 5C). The nitrogen-containing group included eight features, such as NH2 (MACCS84), A$A!N (MACCS133), and NO (MACCS63). These features also had a notably high incidence of mutagenicity. For example, 86.2% of compounds containing the NO (MACCS63) feature were mutagens, as observed in the compound 4-amino-3-nitro-6-chloroaniline (Figure 5C). The oxygen-containing group included four features: QO (MACCS102), OA > 1 (MACCS136), OAAO (MACCS72), and A!O!A (MACCS126), which are branched functional groups. For instance, Menogaril contained two of the features, viz., OAAO (MACCS72) and A!O!A (MACCS126), both of which are connected to the main scaffold of the molecule. The last group, the ring-related group, comprised six features, such as the 3 M ring (MACCS22) and the aromatic ring (MACCS125). For example, the compound anti-5,7-dimethylchrysene-1,2-diol-3,4-epoxide simultaneously contained multiple features of this group (Figure 5C), contributing to its mutagenicity. The analysis highlighted the importance of the molecular features in understanding chemical mutagenicity.
Figure 5. Feature importance analysis for MACCS fingerprint. (A) Top 20 important features showing a positive contribution to mutagenicity. (B) MACCS keys, definition, SMARTS presentations, and groups of features. (C) Examples of mutagenic compounds. The red dashed circle indicates features inducing mutagenicity. Red text represents features with a positive contribution, while the blue text indicates negative contributions. (D) Definitions of the SMARTS pattern.
Next, we analyzed the importance of Mordred descriptors, and the top 20 positively contributing features ranked by SHAP values were identified (Figure 6A). These features were categorized into four groups using a classification approach similar to the method used for grouping MACCS fingerprints: (1) ring-related group, (2) nitrogen-containing group, (3) oxygen-containing group, and (4) others (Figure 6B). The ring-related group included five features, such as the total valence electrons (i.e., E-state values) for “aaCa” groups in compounds (SaaaC) and the number of 12-or-greater-membered fused rings (nG12FRing). For example, the compound 7-ethylbenz[a]anthracene 5,6-imine included multiple features from this group, contributing to its mutagenicity (Figure 6C). The nitrogen-containing group included two features: the sum of valence electrons in NH2 (SsNH2) and the number of Y-shaped connected nitrogen atoms (NsssN). In the compound 2-(butylnitroamino)ethyl nitrate, the presence of a nitrogen atom connected to the nitrite functional group was identified as a key factor in inducing mutagenicity (Figure 6C). The sum of valence electrons in the “–O–” group (SssO) was the only member of the oxygen-containing group; the compound fenitrothion contained this feature, which contributed to its mutagenic properties (Figure 6C). The last group included molecular features such as electrical state indices and van der Waals surface area contribution (VSA_EState4), log P and surface area contribution (SlogP_VSA8), and topological charges (JGI8). For example, the compound 2-iodo-9-acridinamine had a high VSA_EState4 value. This analysis identified key Mordred descriptors, thereby enhancing our understanding of molecular features crucial for predicting mutagenicity.
Figure 6. Feature importance analysis for Mordred descriptors. (A) Top 20 important features showing a positive contribution to mutagenicity. (B) Mordred keys, names, and groups of features. (C) Examples of mutagenic compounds. The red dashed circle indicates features inducing mutagenicity. Red and blue text respectively indicate features with positive and negative contributions.
3.5. Model Applicability Analysis
The MACCS and Mordred features applied in the final model were utilized to analyze the AD, which defines the confidence region for predicting new compounds. A compound was considered to be within the AD if its hat value (h) was below the threshold (h*) using the leverage method. The pyADA package was utilized to calculate the AD. The results showed that most compounds were located in the AD when analyzed using MACCS and Mordred features. For MACCS features, 127 compounds (i.e., 2.165% of the data set) had hat values that exceeded the threshold (h*) of 0.095. Meanwhile, for Mordred descriptors, only eight compounds were found outside the AD at a threshold (h*) of 0.6 (Figure 7). The AD details of other established models are summarized in Table S4. Furthermore, the “outliers” detected by the two ADs were not the same, suggesting that our integrated model has the potential to correct these mispredictions (Figure S1A). For instance, among the MACCS-AD-identified outliers, three compounds were mispredicted by the MACCS model but were correctly predicted by our final integrated model, since they were within the AD derived from Mordred descriptors (Figure S1B). In addition, all outliers identified by the Mordred-defined AD were correctly predicted by our final model, suggesting that their mutagenicity evaluation in the MACCS model was reliable due to their presence within the MACCS-defined AD (Figure S1C). The results suggested that the data set exhibited a high overlap in chemical space distribution, and new compounds within the AD could be predicted with high reliability.
Figure 7. Analysis of the applicability domain. (A) Utilizing the MACCS fingerprints, the applicability domain boundary (h*) was calculated to be 0.095. (B) Utilizing the Mordred descriptors, the applicability domain boundary (h*) was calculated to be 0.6.
3.6. Model Application of Chemicals beyond Our Data Set
To test the model on case studies not present in our data set, we selected chloranil, 5-nitro-2-propoxyaniline, and syn-dibenz[a,j]acridine-3,4-diol-1,2-epoxide and evaluated their mutagenicity. The results showed that the three chemicals were predicted to be mutagenic, consistent with their experimental mutagenicity. Furthermore, the chemicals were located within the AD defined by the MACCS and Mordred descriptors (Table S5). These findings demonstrated that our integrated model could be applied to known mutagenic compounds beyond our data set, and that the mutagenicity predictions from our integrated model were reliable when the compounds were within the AD.
4. Discussion
4.1. Comparison of Integrated Model and One-Feature Models
The analysis indicated that the integrated models performed better than the models utilizing a single type of molecular feature. The superior results included accuracy, balanced accuracy, recall, and F1 score. This suggested that a voting approach, similar to the integrated mechanism used in the Ames test, enhanced the accuracy of mutagenicity predictions. The improved performance of the integrated model may be due to the integration of diverse compound features, which provided complementary information for prediction. For instance, the MACCS and Mordred models generated 470 and 532 false predictions in the training set, respectively. When the two models were combined, the integrated model resulted in only 392 false predictions. Consistent with the training results, in predicting the testing set, the integrated model had only 69 false predictions, whereas the MACCS and Mordred models respectively had 75 and 88 false predictions (Figure 8). Our data set comprised 3323 mutagens and 2543 nonmutagens, which is slightly unbalanced. An unbalanced data set may yield high accuracy but low balanced accuracy. For example, Martinez et al. reported a mutagenicity prediction accuracy of 0.95 but a balanced accuracy of only 0.71, which was due to the presence of 3103 mutagens among 3334 compounds. Similarly, Li et al. reported a balanced accuracy of 0.69 when their data set contained 1480 mutagens and 8546 nonmutagens. In contrast, Pandey et al. reported an accuracy of 0.76. Based on their reported results and data set composition (3503 mutagens and 3009 nonmutagens), the corresponding balanced accuracy can reasonably be considered similar. In our study, the difference between the accuracy and balanced accuracy of the integrated model was only 0.003, indicating that the data set was only mildly unbalanced. These findings not only demonstrated the effectiveness of constructing an integrated model but also supported the complementary value of adding a second feature type.
Figure 8. Confusion matrices of the constructed models. The predictions of the training and testing sets are presented in blue and orange cells, respectively. The label “Positive” represents the mutagenic category, while “Negative” represents the nonmutagenic category.
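As a quick arithmetic check, the numbers of false predictions in the testing set can be converted to accuracies and compared with the testing accuracies reported in Section 3.3. The dictionary names below are illustrative; the counts and the testing-set size of 587 come from the text.

```python
N_TEST = 587  # number of compounds in the testing set

# False predictions (FP + FN) in the testing set, as stated in Section 4.1.
false_predictions = {"MACCS": 75, "Mordred": 88, "integrated": 69}

# accuracy = 1 - (false predictions / total predictions)
accuracy = {
    model: round(1 - errors / N_TEST, 3)
    for model, errors in false_predictions.items()
}

for model, value in accuracy.items():
    print(f"{model}: {value:.3f}")
```

The resulting values (0.872, 0.850, and 0.882) agree with the testing accuracies of the MACCS, Mordred, and integrated models, respectively.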
4.2. Comparison of Model Performances with/without Feature Selection
Herein, models with and without feature selection were compared. Feature selection first removed features with constant values. Next, among features with absolute Pearson correlation coefficients greater than 0.9, which were the dark red features in the correlation matrix (Figure S2), only the one with the largest variance was kept. Eventually, 140 features were retained for MACCS, and 346 features were retained for Mordred. The retained features were utilized to establish the models. The results showed that the models without feature selection had similar performances to those with feature selection (Table S6), where the differences in testing accuracy ranged from 0.002 to 0.015. This indicated that the highly correlated features, including the most correlated pair, contributed minimally to noise in the predictions of our models. The results were consistent with a study reporting that deep learning models trained without feature selection maintained better predictive performance, which may be due to feature selection inadvertently discarding valuable information.
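The two filtering steps can be sketched as follows. The greedy order (visiting features from the largest to the smallest variance and dropping any feature that correlates above the threshold with one already kept) is an assumption about how "keeping the one with the largest variance" was applied; the function names and toy columns are illustrative.

```python
from statistics import mean, pvariance


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def select_features(columns, threshold=0.9):
    """Drop constant columns, then thin out highly correlated ones.

    `columns` maps a feature name to its list of values.
    """
    varying = {n: v for n, v in columns.items() if pvariance(v) > 0}
    # Largest variance first, so the higher-variance member of a
    # correlated group is the one that survives.
    ordered = sorted(varying, key=lambda n: pvariance(varying[n]), reverse=True)
    selected = []
    for name in ordered:
        if all(abs(pearson(varying[name], varying[kept])) <= threshold
               for kept in selected):
            selected.append(name)
    return selected


toy_columns = {
    "a": [0, 1, 2, 3],
    "b": [0, 2, 4, 6.5],   # almost perfectly correlated with "a"
    "c": [1, 1, 1, 1],     # constant, removed in the first step
    "d": [1, 0, 1, 0],
}
kept = select_features(toy_columns)
```

In the toy data, "c" is removed as constant and "a" is removed because it correlates with the higher-variance "b", leaving "b" and "d".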
In addition, to examine whether the performances of the established models were based on the underlying relationship between the features and mutagenicity, a chance correlation test was performed. The evaluation applied X-randomization to the MACCS and Mordred features, where we randomly shuffled the features for all compounds while their mutagenicity labels were kept unchanged. The results showed that models with X-randomization had accuracy, balanced accuracy, and precision values ranging from 0.500 to 0.568. In comparison, our MACCS and Mordred models had accuracy, balanced accuracy, and precision values of 0.850 to 0.944, which were substantially higher than those of the randomly shuffled models (Table S7). The findings suggest that the established models predict compound mutagenicity based on underlying relationships as opposed to chance correlation, and that they exhibit reliable performance.
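One way to read the X-randomization step is as a permutation of the feature rows relative to the fixed label vector, which breaks any real feature-label relationship while preserving the feature distribution. The sketch below follows that reading; the function name, the seed, and the toy data are assumptions, and the source does not specify the exact shuffling scheme.

```python
import random


def x_randomize(features, seed=0):
    """Return the feature rows in a random order; labels stay untouched."""
    shuffled = [row[:] for row in features]
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Toy feature rows and mutagenicity labels (1 = mutagen, 0 = nonmutagen).
features = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0], [1, 1, 1]]
labels = [1, 0, 1, 0, 1]

randomized = x_randomize(features)
# A model retrained on (randomized, labels) should score near chance level
# if the original model learned a genuine feature-label relationship.
```

Scores close to 0.5 on the randomized data, as reported above, are the expected outcome for a balanced binary task when no chance correlation is being exploited.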
4.3. Model Misprediction Insights through Activity Cliffs Identification
To examine the activity cliffs in our data set, we utilized the MACCS and Mordred descriptors and the ARKA approach to generate the ARKA plot. In an ARKA plot, mutagens are expected to fall in the region where the ARKA_1 descriptor is greater than 0.5 and the ARKA_2 descriptor is lower than −0.5, while nonmutagens are expected to fall in the region where the ARKA_1 descriptor is lower than −0.5 and the ARKA_2 descriptor is greater than 0.5. When a mutagen or nonmutagen appears in the region expected for the other class, it is considered a potential activity cliff. In Figure S3A,B, most of the mutagens appeared in the first, third, and fourth quadrants, while a larger proportion of the nonmutagens fell within the second and third quadrants. In addition, the number of activity cliffs was summarized for each ARKA plot. In the ARKA plots using the MACCS features, the training data set contained 13 activity cliffs, while the testing data set had four. For the Mordred descriptors, the training data set also had 13 activity cliffs, but none were found in the testing data set.
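The region rule above can be expressed as a short sketch. It rests on one interpretation of the text: a compound is flagged when it lies in the region expected for the opposite class. The function name `flag_activity_cliffs`, the argument layout, and the toy coordinates are hypothetical, and the ARKA descriptors themselves are assumed to have been computed beforehand.

```python
def flag_activity_cliffs(arka_1, arka_2, is_mutagen, cutoff=0.5):
    """Flag compounds that lie in the region expected for the other class.

    Expected regions (taken from the text):
      mutagens:     ARKA_1 >  cutoff and ARKA_2 < -cutoff
      nonmutagens:  ARKA_1 < -cutoff and ARKA_2 >  cutoff
    """
    flags = []
    for a1, a2, mutagen in zip(arka_1, arka_2, is_mutagen):
        in_mutagen_region = a1 > cutoff and a2 < -cutoff
        in_nonmutagen_region = a1 < -cutoff and a2 > cutoff
        if mutagen:
            flags.append(in_nonmutagen_region)
        else:
            flags.append(in_mutagen_region)
    return flags


# Toy coordinates (illustrative only, not taken from the data set).
arka_1 = [0.8, -0.9, 0.7, -0.6, 0.1]
arka_2 = [-0.7, 0.8, -0.9, 0.9, 0.0]
is_mutagen = [True, True, False, False, True]

cliffs = flag_activity_cliffs(arka_1, arka_2, is_mutagen)
n_cliffs = sum(cliffs)
```

Compounds that fall in neither region, such as the last toy compound, are not flagged under this reading.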
Three examples of activity cliffs are shown in Figure S3C. 2-(dimethylamino)ethyl methacrylate, a mutagen, was identified as an activity cliff and falsely predicted as a nonmutagen by the MACCS, Mordred, and integrated models. This misprediction may be explained by its closest compound, identified by the sum of absolute differences in ARKA descriptors: Ethylenemisoctadecanamide, a nonmutagen that potentially influenced the prediction. Similarly, Di(n-octyl)tin-s,s′-bis(isooctylmercaptoacetate) and Tris(3,4-dibromo-2-butyl)phosphate are also mutagenic compounds but were predicted as nonmutagenic. These findings indicated that the analysis of activity cliffs could provide insights into the mispredictions of our model.
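The "closest compound" search described above amounts to a nearest-neighbor lookup under the sum of absolute differences (Manhattan distance) in the ARKA descriptors. A minimal sketch follows; the function name `closest_compound` and the toy coordinates are hypothetical and do not reproduce values from the data set.

```python
def closest_compound(query, reference):
    """Return (index, distance) of the reference point nearest to `query`.

    Distance is the sum of absolute differences across the descriptors,
    e.g. |ARKA_1 difference| + |ARKA_2 difference|.
    """
    distances = [
        sum(abs(q - r) for q, r in zip(query, point)) for point in reference
    ]
    best = min(range(len(reference)), key=distances.__getitem__)
    return best, distances[best]


# Toy (ARKA_1, ARKA_2) coordinates, illustrative only.
query = (0.2, -0.1)
reference = [(1.0, 1.0), (0.3, 0.0), (-0.5, 0.4)]

index, distance = closest_compound(query, reference)
```

If the nearest neighbor carries the opposite label, as in the example discussed above, that neighbor is a plausible contributor to the misprediction.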
4.4. Comparison to Other Models in Previous Studies
Herein, we compared our established model with the models reported in three previous studies. The comparison used their available performance metrics (i.e., accuracy, balanced accuracy, F1, and MCC) on the testing data set. Overall, our MACCS-Mordred model achieved a higher balanced accuracy and MCC than the other reported models (Table S8). In addition, similar to Kumar et al., the ratio of mutagens to nonmutagens in our study is close to 1.3:1, unlike the other two studies, whose extremely unbalanced data sets caused their performances to be unstable. These results suggest that our MACCS-Mordred model compares favorably with the previously reported models.
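For reference, the metrics used in this comparison can all be derived from a binary confusion matrix using their standard definitions. The sketch below is illustrative: the function name `classification_metrics` is hypothetical, and the counts are toy numbers rather than results from this study.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)          # sensitivity (true positive rate)
    specificity = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Mean of the per-class recalls, robust to class imbalance.
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # Matthews correlation coefficient, ranging from -1 to 1.
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Toy counts (illustrative only).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Balanced accuracy and MCC are less sensitive to class imbalance than plain accuracy, which is why they are useful when the compared data sets differ in their mutagen-to-nonmutagen ratios.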
4.5. Relationship between Feature Importance and Mutagenicity-Inducing Mechanisms
Understanding the importance of various chemical structures can help identify mutagenic compounds. The analysis of feature importance suggested that features with positive contributions could be divided into four categories: halogen, nitrogen-containing, oxygen-containing, and ring-related groups. Some compounds containing these groups have been demonstrated to induce mutagenicity. For example, the nitro group, aliphatic halogen, aromatic nitrogen, and heterocycle were shown to possess this capability. The mutagenic capability of the nitro group is attributed to its strong electrophilicity, which generates localized electron-deficient sites inside the molecule through electron-transfer interactions, subsequently causing damage to nucleic acids. In addition, compounds with aliphatic halogen groups were also observed to cause mutagenicity. For instance, the mutagenicity of vinyl chloride and vinylidene chloride was attributed to the formation of unsymmetric and highly electrophilic oxiranes, which can cause genotoxic effects by reacting directly with nucleophilic constituents. Furthermore, the mutagenic potential of aromatic nitrogen and heterocycles may result from their ability to undergo redox cycling, which induces oxidative stress and damages DNA. Last, the toxicity of oxygen-containing substances (e.g., O-PAHs) is related to their capability to generate reactive oxygen species and induce excessive oxidative stress. The important features identified in this study are consistent with known mutagenicity-inducing mechanisms.
4.6. The Application of Feature Importance
The important features identified by this study highlight how changes in functional groups influence mutagenicity. We utilized the Exmol package to design compounds with modifications that would change the predicted class of a molecule. For instance, acetaminophen and anthranilic acid are nonmutagenic and are predicted to become mutagenic through specific modifications (Figure S4), such as substituting an OH group with fluoride (FP-134) or adding halogen and nitrogen-containing groups (FP-84). Similarly, for the mutagenic compound isoniazid, replacing the amine group with sulfide (FP-59) or adding a methyl group (FP-149) could reduce its predicted mutagenicity, which aligns with our findings on the negative contributors. The modifications were derived from a virtual local chemical subspace whose compounds are generated from the query chemical, are not guaranteed to be stable or synthetically feasible, and are not included in our data set. Therefore, these virtual analogs serve only as theoretical examples to illustrate how feature importance can guide hypothesis generation, rather than as definitive predictions. This analysis of molecular analogs highlights how minor modifications can significantly affect predicted mutagenicity, enhancing our understanding of the mutagenic potential of various compounds.
4.7. The Potential of Mutagenic Compounds for Environmental Contamination
The feature importance analysis highlighted not only the mutagenic potential of certain chemical structures but also their environmental impact. For example, several nitropolycyclic aromatic hydrocarbons (N-PAHs), which belong to the nitrogen-containing and ring-related groups, were listed as priority pollutants by the US Environmental Protection Agency (EPA). The increased complexity attributed to the aromatic group hinders degradation, while the nitro group enhances adsorption affinity to particulate matter or soil due to a reduced octanol–water partition coefficient and Henry's law constant. These properties contribute to the persistence and ubiquity of such contaminants in the environment and elevate their mutagenic potential. Our results suggest that the feature importance analysis could be further utilized to assess environmental contaminants and evaluate their associated risks.
5. Conclusions
In this study, we established an integrated machine learning model to predict the mutagenicity of organic compounds, aiming to reduce the time and cost associated with traditional assays such as the Ames test. The integrated model, combining predictions based on MACCS fingerprints and Mordred descriptors, achieved strong predictive performance, with an accuracy of 0.882, a balanced accuracy of 0.885, a precision of 0.922, a recall of 0.862, and an F1-score of 0.895. The mispredictions of our integrated model were in part attributable to activity cliffs. Our feature analysis revealed that compounds containing structural elements such as aromatic rings, nitro groups, or aliphatic halogens tend to exhibit mutagenic properties; these structures are commonly found in known or suspected environmental contaminants. Additionally, applicability domain analysis confirmed that reliable predictions were obtained for compounds below the critical hat value of each feature set, which was 0.095 for MACCS descriptors and 0.6 for Mordred descriptors. Taken together, these findings suggest that our model provides a reliable and interpretable tool for mutagenicity assessment and may contribute to the early identification of potentially hazardous compounds in environmental monitoring and regulatory screening efforts.
Acknowledgments
We gratefully acknowledge the support from the National Science and Technology Council (NSTC 113-2621-M-002-011, 113-2321-B-002-041, 114-2320-B-038-045, 114-2320-B-038-002). This research was also partially supported by “TMU Research Center of Cancer Translational Medicine” from the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan (DP2-TMU-114-C-02). This research was supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences.
The data set supporting the conclusions of this article is available in the GitHub repository: https://github.com/CHAOHSUTW/Mutagenicity_Intergrated-Model.git.
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.5c01586.
Figures S1–S4; Tables S1–S8 (PDF)
C.-H.Y.: Writing–original draft, Data curation, Formal analysis, Investigation, Methodology, Visualization, Validation. T.E.L.: Writing–review and editing, Software, Visualization, Methodology. J.-H.H.: Methodology, Software. K.-C.H.: Writing–review and editing, Conceptualization, Formal analysis, Funding acquisition, Resources, Supervision, Validation, Visualization. P.-T.C.: Writing–review and editing, Supervision, Conceptualization, Funding acquisition.
The authors declare no competing financial interest.
References
- Martínez M. J., Sabando M. V., Soto A. J., Roca C., Requena-Triguero C., Campillo N. E., Paez J. A., Ponzoni I. Multitask deep neural networks for Ames mutagenicity prediction. J. Chem. Inf. Model. 2022;62(24):6342–6351. doi: 10.1021/acs.jcim.2c00532.
- Lui R., Guan D., Matthews S. Mechanistic task groupings enhance multitask deep learning of strain-specific Ames mutagenicity. Chem. Res. Toxicol. 2023;36(8):1248–1254. doi: 10.1021/acs.chemrestox.2c00385.
- Pandey S. K., Roy K. Development of a read-across-derived classification model for the predictions of mutagenicity data and its comparison with traditional QSAR models and expert systems. Toxicology. 2023;500:153676. doi: 10.1016/j.tox.2023.153676.
- Kumar R., Khan F. U., Sharma A., Siddiqui M. H., Aziz I. B. A., Kamal M. A., Ashraf G. M., Alghamdi B. S., Uddin M. S. A deep neural network–based approach for prediction of mutagenicity of compounds. Environ. Sci. Pollut. Res. 2021;28(34):47641–47650. doi: 10.1007/s11356-021-14028-9.
- Valencia A., Prous J., Mora O., Sadrieh N., Valerio L. G. A novel QSAR model of Salmonella mutagenicity and its application in the safety assessment of drug impurities. Toxicol. Appl. Pharmacol. 2013;273(3):427–434. doi: 10.1016/j.taap.2013.09.015.
- Toropova A. P., Toropov A. A., Roncaglioni A., Benfenati E. The enhancement scheme for the predictive ability of QSAR: A case of mutagenicity. Toxicol. In Vitro. 2023;91:105629. doi: 10.1016/j.tiv.2023.105629.
- ICH Guideline S2 (R1) on Genotoxicity Testing and Data Interpretation for Pharmaceuticals Intended for Human Use, International Council for Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, 2013. https://www.ema.europa.eu/en/documents/scientific-guideline/ich-guideline-s2-r1-genotoxicity-testing-and-data-interpretation-pharmaceuticals-intended-human-use-step-5_en.pdf.
- ICH Guideline M7(R1) on Assessment and Control of DNA Reactive (Mutagenic) Impurities in Pharmaceuticals to Limit Potential Carcinogenic Risk, International Council for Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, 2017. https://www.ema.europa.eu/en/documents/scientific-guideline/ich-guideline-m7r1-assessment-and-control-dna-reactive-mutagenic-impurities-pharmaceuticals-limit-potential-carcinogenic-risk-step-5_en.pdf.
- Organisation for Economic Co-operation and Development Test No. 471: Bacterial Reverse Mutation Test, 2020. https://www.oecd.org/en/publications/test-no-471-bacterial-reverse-mutation-test_9789264071247-en.html.
- Honma M., Kitazawa A., Cayley A., Williams R. V., Barber C., Hanser T., Saiakhov R., Chakravarti S., Myatt G. J., Cross K. P., et al. Improvement of Quantitative Structure–Activity Relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR International Challenge Project. Mutagenesis. 2019;34(1):3–16. doi: 10.1093/mutage/gey031.
- Muratov E. N., Bajorath J., Sheridan R. P., Tetko I. V., Filimonov D., Poroikov V., Oprea T. I., Baskin I. I., Varnek A., Roitberg A., et al. QSAR without borders. Chem. Soc. Rev. 2020;49(11):3525–3564. doi: 10.1039/D0CS00098A.
- Cherkasov A., Muratov E. N., Fourches D., Varnek A., Baskin I. I., Cronin M., Dearden J., Gramatica P., Martin Y. C., Todeschini R., et al. QSAR modeling: Where have you been? Where are you going to? J. Med. Chem. 2014;57(12):4977–5010. doi: 10.1021/jm4004285.
- Chakravarti S. Augmenting expert knowledge-based toxicity alerts by statistically mined molecular fragments. Chem. Res. Toxicol. 2023;36(6):848–858. doi: 10.1021/acs.chemrestox.2c00368.
- Kasamatsu T., Kitazawa A., Tajima S., Kaneko M., Sugiyama K.-i., Yamada M., Yasui M., Masumura K., Horibata K., Honma M. Development of a new Quantitative Structure–Activity Relationship model for predicting Ames mutagenicity of food flavor chemicals using StarDrop auto-Modeller. Genes Environ. 2021;43(1):16. doi: 10.1186/s41021-021-00182-6.
- Li Q., Yang H., Hao N., Du M., Zhao Y., Li Y., Li X. Biodegradability analysis of Dioxins through in silico methods: Model construction and mechanism analysis. J. Environ. Manage. 2023;345:118898. doi: 10.1016/j.jenvman.2023.118898.
- Hemmerich J., Ecker G. F. In silico toxicology: From structure–activity relationships towards deep learning and adverse outcome pathways. WIREs Comput. Mol. Sci. 2020;10(4):e1475. doi: 10.1002/wcms.1475.
- Herrmann K., Holzwarth A., Rime S., Fischer B. C., Kneuer C. (Q)SAR tools for the prediction of mutagenic properties: Are they ready for application in pesticide regulation? Pest Manage. Sci. 2020;76(10):3316–3325. doi: 10.1002/ps.5828.
- Honma M. An assessment of mutagenicity of chemical substances by (Quantitative) Structure–Activity Relationship. Genes Environ. 2020;42(1):23. doi: 10.1186/s41021-020-00163-1.
- Cavasotto C. N., Scardino V. Machine learning toxicity prediction: Latest advances by toxicity end point. ACS Omega. 2022;7(51):47536–47546. doi: 10.1021/acsomega.2c05693.
- Banerjee A., Roy K., Gramatica P. A bibliometric analysis of the Cheminformatics/QSAR literature (2000–2023) for predictive modeling in data science using the SCOPUS database. Mol. Diversity. 2025;29(4):3703–3715. doi: 10.1007/s11030-024-11056-8.
- Kalian A. D., Benfenati E., Osborne O. J., Gott D., Potter C., Dorne J.-L. C. M., Guo M., Hogstrand C. Exploring dimensionality reduction techniques for deep learning driven QSAR models of mutagenicity. Toxics. 2023;11(7):572. doi: 10.3390/toxics11070572.
- Habiballah S., Heath L. S., Reisfeld B. A deep-learning approach for identifying prospective chemical hazards. Toxicology. 2024;501:153708. doi: 10.1016/j.tox.2023.153708.
- Zhou Y., Ning C., Tan Y., Li Y., Wang J., Shu Y., Liang S., Liu Z., Wang Y. ToxMPNN: A deep learning model for small molecule toxicity prediction. J. Appl. Toxicol. 2024;44:953. doi: 10.1002/jat.4591.
- Wu J., Chen Y., Wu J., Zhao D., Huang J., Lin M., Wang L. Large-scale comparison of machine learning methods for profiling prediction of kinase inhibitors. J. Cheminf. 2024;16(1):13. doi: 10.1186/s13321-023-00799-5.
- Yang X., Zhang Z., Li Q., Cai Y. Quantitative structure–activity relationship models for genotoxicity prediction based on combination evaluation strategies for toxicological alternative experiments. Sci. Rep. 2021;11(1):8030. doi: 10.1038/s41598-021-87035-y.
- Zhang L., Ai H., Chen W., Yin Z., Hu H., Zhu J., Zhao J., Zhao Q., Liu H. CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods. Sci. Rep. 2017;7(1):2118. doi: 10.1038/s41598-017-02365-0.
- Benigni R., Battistelli C. L., Bossa C., Tcheremenskaia O., Crettaz P. New perspectives in toxicological information management, and the role of ISSTOX databases in assessing chemical mutagenicity and carcinogenicity. Mutagenesis. 2013;28(4):401–409. doi: 10.1093/mutage/get016.
- Tolosa J., Candelas E. S., Pardo J. L. V., Goya A., Moncho S., Gozalbes R., Schätzlein M. P. Micotoxilico: An interactive database to predict mutagenicity, genotoxicity, and carcinogenicity of mycotoxins. Toxins. 2023;15(6):355. doi: 10.3390/toxins15060355.
- Aydın M., Rencüzoğulları E. Genotoxic and mutagenic effects of Mycotoxins: A review. Commagene J. Biology. 2019;3(2):132–161. doi: 10.31594/commagene.633418.
- Mary H., Noutahi E., Moreau M., Zhu L., Pak S., Gilmour D., Whitfield S., Valence-JonnyHsu, Hounwanou H., Kumar I., Maheshkar S., Nakata S., Kovary K. M., Wognum C., Craig M., DeepSourceBot. Datamol-io, Ver. 0.12.0. doi: 10.5281/zenodo.10049297.
- Pearson K. LIII. On lines and planes of closest fit to systems of points in space. London, Edinburgh Dublin Philos. Mag. J. Sci. 1901;2(11):559–572. doi: 10.1080/14786440109462720.
- Hotelling H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933;24(6):417–441. doi: 10.1037/h0071325.
- Lee M., Min K. A comparative study of the performance for predicting biodegradability classification: The Quantitative Structure–Activity Relationship model vs the graph convolutional network. ACS Omega. 2022;7(4):3649–3655. doi: 10.1021/acsomega.1c06274.
- Moriwaki H., Tian Y.-S., Kawashita N., Takagi T. Mordred: A molecular descriptor calculator. J. Cheminf. 2018;10(1):4. doi: 10.1186/s13321-018-0258-y.
- Edwards C., Lai T., Ros K., Honke G., Cho K., Ji H. Translation between molecules and natural language, arXiv:2204.11817. arXiv.org e-Print archive, 2022. https://arxiv.org/abs/2204.11817.
- Heyer K. Roberta_ZINC_480m. https://huggingface.co/entropy/roberta_zinc_480m.
- Heyer K. GPT2_ZINC_87m. https://huggingface.co/entropy/gpt2_zinc_87m.s.
- Noutahi E., Wognum C., Mary H., Hounwanou H., Kovary K. M., Gilmour D., Burns J., St-Laurent J., DomInvivo, Maheshkar S. Molfeat, Ver. 0.9.4. doi: 10.5281/zenodo.8373019.
- Keras, Ver. 2.10.0, 2015.
- Lundberg S. M., Lee S. A unified approach to interpreting model predictions, arXiv:1705.07874. arXiv.org e-Print archive, 2017. https://arxiv.org/abs/1705.07874.
- Wellawatte G. P., Seshadri A., White A. D. Model agnostic generation of counterfactual explanations for molecules. Chem. Sci. 2022;13(13):3697–3705. doi: 10.1039/D1SC05259D.
- Roy K., Kar S., Ambure P. On a simple approach for determining applicability domain of QSAR models. Chemom. Intell. Lab. Syst. 2015;145:22–29. doi: 10.1016/j.chemolab.2015.04.013.
- Dias-Silva J. R., Oliveira V. M., Sanches-Neto F. O., Wilhelms R. Z., Queiroz Júnior L. H. K. SpectraFP: A new spectra-based descriptor to aid in cheminformatics, molecular characterization and search algorithm applications. Phys. Chem. Chem. Phys. 2023;25(27):18038–18047. doi: 10.1039/D3CP00734K.
- Gramatica P. Principles of QSAR models validation: internal and external. QSAR Comb. Sci. 2007;26(5):694–701. doi: 10.1002/qsar.200610151.
- Reitermanová Z. Data Splitting. In WDS’10 Proceedings of Contributed Papers; Šafránková J., Pavlů J., Eds.; Matfyzpress: Prague, 2010; pp 31–36.
- National Toxicology Program Genetic Toxicity Evaluation of Chloranil in Salmonella/E. coli Mutagenicity Test or Ames Test. Study 669858, 2018. https://cebs.niehs.nih.gov/cebs/study/002-01853-0001-0000-0.
- Arulanandam C. D., Babu V., Soorni Y., Prathiviraj R. Mutagenicity and carcinogenicity prediction of sugar substitutes: an in silico approach with compound-gene interactions network. Toxicol. Res. 2024;14(1):tfaf008. doi: 10.1093/toxres/tfaf008.
- Bonin A. M., Rosario C. A., Duke C. C., Baker R. S. U., Ryan A. J., Holder G. M. The mutagenicity of dibenz[a, j]acridine, some metabolites and other derivatives in bacteria and mammalian cells. Carcinogenesis. 1989;10(6):1079–1084. doi: 10.1093/carcin/10.6.1079.
- Li T., Liu Z., Thakkar S., Roberts R., Tong W. DeepAmes: A deep learning-powered Ames test predictive model with potential for regulatory application. Regul. Toxicol. Pharmacol. 2023;144:105486. doi: 10.1016/j.yrtph.2023.105486.
- Sabando M. V., Ponzoni I., Soto A. J. Neural-based approaches to overcome feature selection and applicability domain in drug-related property prediction. Appl. Soft Comput. 2019;85:105777. doi: 10.1016/j.asoc.2019.105777.
- Banerjee A., Roy K. ARKA: a framework of dimensionality reduction for machine-learning classification modeling, risk assessment, and data gap-filling of sparse environmental toxicity data. Environ. Sci. Processes Impacts. 2024;26(6):991–1007. doi: 10.1039/D4EM00173G.
- Yang H., Sun L., Li W., Liu G., Tang Y. Identification of nontoxic substructures: A new strategy to avoid potential toxicity risk. Toxicol. Sci. 2018;165(2):396–407. doi: 10.1093/toxsci/kfy146.
- Li S., Zhang L., Feng H., Meng J., Xie D., Yi L., Arkin I. T., Liu H. MutagenPred-GCNNs: A Graph Convolutional Neural Network-based classification model for mutagenicity prediction with data-driven molecular fingerprints. Interdiscip. Sci.: Comput. Life Sci. 2021;13(1):25–33. doi: 10.1007/s12539-020-00407-2.
- Yang H., Lou C., Li W., Liu G., Tang Y. Computational approaches to identify structural alerts and their applications in environmental toxicology and drug discovery. Chem. Res. Toxicol. 2020;33(6):1312–1322. doi: 10.1021/acs.chemrestox.0c00006.
- Nepali K., Lee H.-Y., Liou J.-P. Nitro-group-containing drugs. J. Med. Chem. 2019;62(6):2851–2893. doi: 10.1021/acs.jmedchem.8b00147.
- Greim H., Bonse G., Radwan Z., Reichert D., Henschler D. Mutagenicity in vitro and potential carcinogenicity of chlorinated ethylenes as a function of metabolic oxirane formation. Biochem. Pharmacol. 1975;24(21):2013–2017. doi: 10.1016/0006-2952(75)90396-2.
- Bonse G., Henschler D., Gehring P. J. Chemical Reactivity, Biotransformation, and Toxicity of Polychlorinated Aliphatic Compounds. CRC Crit. Rev. Toxicol. 1976;4(4):395–409. doi: 10.1080/10408447609164019.
- Simoneit B. R. T., Bi X., Oros D. R., Medeiros P. M., Sheng G., Fu J. Phenols and Hydroxy-PAHs (Arylphenols) as tracers for Coal Smoke Particulate Matter: Source tests and ambient aerosol assessments. Environ. Sci. Technol. 2007;41(21):7294–7302. doi: 10.1021/es071072u.
- Krzyszczak A., Czech B. Occurrence and toxicity of polycyclic aromatic hydrocarbons derivatives in environmental matrices. Sci. Total Environ. 2021;788:147738. doi: 10.1016/j.scitotenv.2021.147738.