The figures show performance of the ensemble learning based global QSTR models in predicting the toxicities of pesticides in multiple test species.
Abstract
Safety assessment processes require toxicity data for chemicals in multiple test species, which emphasizes the need for computational methods capable of predicting toxicity in multiple test species. Pesticides are toxic by design and find extensive applications worldwide. In this study, we have established local and global QSTR (quantitative structure–toxicity relationship) and ISC QSAAR (interspecies correlation quantitative structure activity–activity relationship) models for predicting the toxicities of pesticides in multiple aquatic test species using the toxicity data in crustacean (Daphnia magna, Americamysis bahia, Gammarus fasciatus, and Penaeus duorarum) and fish (Oncorhynchus mykiss and Lepomis macrochirus) species in accordance with the OECD guidelines. The ensemble learning based QSTR models (decision tree forest, DTF and decision tree boost, DTB) were constructed and validated using several statistical coefficients derived on the test data. In all the QSTR and QSAAR models, Log P was an important predictor. The constructed local, global and interspecies QSAAR models yielded high correlations (R2) of at least 0.941, 0.943 and 0.826, respectively, between the measured and model predicted endpoint toxicity values in the test data. The performances of the local and global QSTR models were comparable. Furthermore, the chemical applicability domains of these QSTR/QSAAR models were determined using the leverage and standardization approaches. The results suggest that the developed QSTR/QSAAR models can reliably predict the aquatic toxicity of structurally diverse pesticides in multiple test species and can be used for the screening and prioritization of new pesticides.
1. Introduction
The global chemical revolution of the last few decades has exposed the environment to a wide range of chemicals.1 Unrestricted release of chemicals into the environment has contributed to severe pollution problems worldwide.2 The increased use of agro-chemicals, pharmaceuticals, petrochemicals, and other industrial chemicals over the last few years has greatly aggravated the chemical pollution problem.3,4 Understanding chemical toxicity to different species is becoming a point of focus in environmental toxicology research.5,6 Effective environmental management must protect living species from the stresses arising from chemicals released into ecosystems.7 Consequently, regulatory agencies require comprehensive toxicity data prior to the registration of new chemicals for manufacture and use. Although experimental protocols for the toxicological evaluation of chemicals have been developed by industry and regulatory agencies, the toxicological screening of a large number of chemicals and the understanding of their complex cellular interactions require animal experimentation, which is unethical, time and cost intensive, and difficult to correlate with or interpret for the human system.8 The European Union REACH (Registration, Evaluation, Authorisation and Restriction of Chemicals) legislation requires toxicological hazard and risk assessments for all new and existing chemicals9 and advocates the use of sufficiently validated computational prediction models based on QSAR (quantitative structure–activity relationship) to fill toxicity data gaps, and thus to save time and money and to help reduce the number of animals used for experimental testing.10
QSAR offers an in silico tool for developing predictive models of various activity and property endpoints for a series of chemicals, using response data determined through experiments and molecular structure information derived computationally or sometimes from experiments.11,12 The guidelines for QSAR model development and validation proposed by the Organization for Economic Cooperation and Development (OECD) are expected to help increase the acceptability of QSAR models for regulatory purposes.13 Subsequently, a number of QSARs have been developed for the toxicity prediction of specific chemicals and are reported in the literature.14–18 However, the majority of such reports concern a single-species toxicity analysis, whereas a comprehensive safety evaluation of chemicals needs toxicity data in multiple test species of different trophic levels and complexities. In toxicological evaluations of chemicals, the aquatic test system offers better and more reliable options, as it constitutes a chain of different trophic level species for toxicity assays and is less cumbersome than other test methods. Aquatic toxicity is one of the most important parameters in the ecotoxicological risk assessment of chemicals.19
In aquatic toxicity studies, crustacean models are generally chosen because of their ecological relevance, the availability of well-developed test protocols, and their established use in standard toxicity testing.20 Of the crustaceans, Daphnia magna, mysid, scud, and pink shrimp have been proposed for the regulatory testing of chemicals.21,22 Daphnia magna is widely used as a standard test organism in aquatic toxicology. It is an important primary consumer of primitive plant life and is itself a major food source for vertebrate and invertebrate predators. It has been used as a representative of other freshwater animals in standard toxicity tests23 because of its high sensitivity, easy handling and high reproductive rate.24 The Daphnia acute toxicity (48 h) test is used for the short-term toxicity (EC50) assessment of chemicals.
Recently, a few studies25–30 have reported QSTR models for the aquatic toxicity estimation of chemicals in multiple test species; however, no attempt has yet been made to develop QSTRs for the toxicological evaluation of pesticides in multiple crustacean test species. Local QSTR (L-QSTR) models based on mode of action (MOA)31–34 and specific functional groups35–39 have been proposed for the toxicity assessment of chemicals. However, the application of such models is limited because they require prior information on the MOA and the functional groups (in the case of multiple groups) of the chemicals.40 Recently, L-QSTR models based on toxicity data in a single test species and global QSTR (G-QSTR) models based on the combined toxicity data for different test species have been proposed.29,40,41 Further, interspecies correlation (ISC) based quantitative structure activity–activity relationships (QSAARs)18,42–45 and G-QSTR models have been proposed for toxicity prediction of chemicals in multiple species.29 Global models have the advantage that they are applicable to large numbers of compounds across mechanisms of action and structural classes.46,47 The ISC QSAAR extrapolates the data for one toxicity endpoint to another toxicity endpoint and can be used to determine the species-specific toxicity of a chemical, whereas the G-QSTR model can simultaneously consider the toxicity endpoint in multiple test species for model building and prediction.46,47
In recent years, artificial neural networks (ANNs) and support vector machines (SVMs) have emerged as unbiased methods for predictive modeling. ANNs, although universal estimators, suffer from over-fitting of the data. SVMs, known to overcome the problem of over-fitting, make use of limited data points in the training phase. Ensemble learning (EL) methods48 have also emerged as unbiased tools for modeling the complex relationships between independent and dependent variables.49 These methods are designed to overcome the problems of weak predictors50 and of over-fitting the training data.51 Decision tree forest (DTF) and decision tree boost (DTB), which implement the bagging and boosting techniques, respectively, improve the accuracy of a predictive function.49 These methods are inherently non-parametric: they make no assumption regarding the underlying distribution of the values of the predictor variables and can handle numerical data that are highly skewed or multi-modal in nature.52,53
In this study, EL based local (L-QSTR) and global (G-QSTR) models were established for the aquatic toxicity predictions of structurally diverse pesticides in single and multiple crustacean test species in accordance with the OECD guidelines for QSAR modeling. The constructed QSTR models were rigorously validated using the internal and external validation procedures. Moreover, the ISC QSAAR models were also established using the toxicity data of pesticides in crustacean (D. magna) and fish (Oncorhynchus mykiss and Lepomis macrochirus) species. The applicability domains of the developed QSTR and ISC QSAAR models were defined using the leverage and standardization methods.
2. Materials and methods
Here, QSTR models were constructed for predicting the toxicities of chemicals in single and multiple test species following the OECD guidelines13 for QSAR modeling. A schematic diagram showing the modeling steps is presented in Fig. 1.
Fig. 1. A flow chart showing the QSTR/QSAAR modeling procedure.
2.1. Datasets
For the development of high quality QSAR models, high quality experimental data are essential.54 The aquatic toxicity data of chemical pesticides on different crustacean species (Daphnia magna, Americamysis bahia, Gammarus fasciatus, and Penaeus duorarum) were collected from the OPP Pesticide Ecotoxicity Database.55 This database contained well-defined experimental toxicity values of 3767 compounds in crustacean species. Here, 48-h EC50 (ppm) toxicity in D. magna, and 96-h LC50 (ppm) in the other three species were considered. The toxicity end-points were determined following the EPA guidelines (FIFRA 158.490). All the mixtures, duplicates, salts, and the compounds that have only qualitative end-point values were removed. Finally, a total of 445 pesticides (algaecide, fumigant, fungicide, growth regulator, herbicide, insecticide, microbiocide, miticide, molluscicide, nematicide, rodenticide, etc.) for D. magna were retained for the QSTR analysis. Further, the compounds that were common in other toxicity datasets were removed and chemicals that were uncommon in other species were retained for external validation. Accordingly, 43 pesticides in A. bahia, 15 in G. fasciatus, and 8 in P. duorarum test species were retained for multi-species QSTR analysis. For the development of interspecies QSAAR models, the pesticide toxicity data in fish was considered. Toxicity data of 318 pesticides in O. mykiss (96-h LC50) and 294 in L. macrochirus (96-h LC50) were taken. Prior to the QSTR modeling, the toxicity values were converted into the negative logarithmic scale. For different test species, the end-point toxicity values (pEC50/pLC50, mmol L–1) ranged between –1.77 and 7.63 (D. magna), –1.87 and 5.24 (A. bahia), 0.01 and 7.84 (G. fasciatus), –0.07 and 7.10 (P. duorarum), –1.55 and 6.84 (O. mykiss), and –1.16 and 7.20 (L. macrochirus), respectively (Tables S1 and S2; ESI†). 
The Box–Whisker plots of the toxicities of pesticides in different test species considered here are given in Fig. 2.
Fig. 2. Box–Whisker plots of the toxicity of pesticides in different test species.
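The conversion of the reported endpoints to the negative logarithmic scale can be illustrated with a minimal sketch. It assumes that a concentration in ppm is equivalent to mg L–1 for dilute aqueous solutions, so that dividing by the molecular weight (g mol–1) gives mmol L–1; the function name and the example compound are hypothetical and are not taken from the study.

```python
import math


def p_endpoint(conc_ppm, mol_weight):
    """Return -log10 of a concentration expressed in mmol/L.

    Assumption: 1 ppm is treated as 1 mg/L (dilute aqueous solution),
    so mg/L divided by g/mol gives mmol/L.
    """
    if conc_ppm <= 0 or mol_weight <= 0:
        raise ValueError("concentration and molecular weight must be positive")
    conc_mmol_per_l = conc_ppm / mol_weight
    return -math.log10(conc_mmol_per_l)


# Hypothetical compound: EC50 = 0.35 ppm, molecular weight = 350 g/mol
print(round(p_endpoint(0.35, 350.0), 2))
```

Under this convention a more toxic compound (lower EC50 or LC50) receives a larger pEC50/pLC50 value, and concentrations above 1 mmol L–1 give negative values, in line with the negative lower bounds of the ranges reported above.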
2.2. Molecular descriptors and data processing
For calculating the molecular descriptors (MDs), the SMILES (simplified molecular input line entry system) of the molecules were obtained using Chemspider.56 The Chemopy program57 was used to calculate the MDs. The program calculates 634 1D and 2D descriptors. These descriptors include the constitutional, connectivity, Basak, topology, Kappa, Burden, E-state, autocorrelation, molecular property, charge, and MOE-type descriptors. Relevant descriptors for QSTR analysis were selected using the model-fitting approach. The 380 MDs with low variation (≤0.5) were excluded from the pool, and the remaining 254 descriptors were retained to undergo subsequent descriptor selection for QSTR modeling. Prior to the model construction, the toxicity datasets were split into the training (80%) and test (20%) subsets using the random distribution method. In this approach, the compounds are selected randomly with a uniform distribution and each sample (x) has an equal probability (p) of selection. For the training subset Ttr:
$$p(x \in T_{tr}) = \frac{n_{tr}}{n},$$
where n and ntr are the total numbers of samples in the complete and training (Ttr) sets, respectively. The random distribution method leads to a low bias of the model performance.58,59 The relevant MDs and the optimal model parameters were determined using the training data through a 5-fold cross-validation (CV). The criterion of low root mean squared error (RMSE) was used to rank the contribution of the MDs in the current set. The lowest ranked descriptors (<5% contribution) were then removed29 in successive steps. The most significant descriptors were retained and the corresponding prediction accuracies were computed. The descriptor selection process was performed separately for the local QSTR, global QSTR and QSAAR modeling. The MDs finally retained for the QSTR and QSAAR models are presented in Table 1.
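The low-variation filter and the random 80/20 split described above can be sketched as follows. This is an illustration only: the text does not state which measure of variation was used, so the population variance is assumed here, and the function names, the seed and the toy descriptor values are hypothetical.

```python
import random
from statistics import pvariance


def drop_low_variance(columns, threshold=0.5):
    """Keep descriptor columns whose variance exceeds the threshold.

    Assumption: "low variation" is measured by the population variance.
    `columns` maps a descriptor name to its list of values.
    """
    return {name: vals for name, vals in columns.items()
            if pvariance(vals) > threshold}


def random_split(n, train_fraction=0.8, seed=0):
    """Uniform random split of n sample indices into training and test sets."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)  # every sample is equally likely to land in training
    n_train = round(train_fraction * n)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = random_split(445)  # 445 D. magna pesticides
print(len(train_idx), len(test_idx))
```

With 445 compounds this gives 356 training and 89 test compounds, so each compound enters the training subset with probability 356/445 = 0.8.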
Table 1. Descriptors used in QSTR modeling.
| Descriptors | Models | Description |
| nta | L-QSTR | Number of atoms |
| TPSA | L-QSTR, G-QSTR | Topological polar surface area |
| PEOEVSA13 | L-QSTR, G-QSTR | MOE-type descriptors using partial charges and surface area contributions |
| Log P | L-QSTR, G-QSTR, ISC QSAAR | Log P value based on the Crippen method |
| Aweight | L-QSTR | Average atomic weight (not including H) |
| J | L-QSTR, G-QSTR | Balaban's J index |
| Hy | G-QSTR | Hydrophilic index |
The structural diversity of the considered pesticide datasets was determined using the Tanimoto similarity index (TSI). It is a similarity measure used in topology-based chemical similarity studies and is calculated as the Tanimoto similarity between the fingerprint of a chemical and a consensus fingerprint.60 A cut-off of 0.7 or 0.8 is generally considered suitable for identifying biologically similar molecules. In this study, the average TSI values of the pesticides in the different toxicity datasets were 0.025 (D. magna), 0.027 (A. bahia), 0.027 (G. fasciatus), 0.042 (P. duorarum), 0.010 (L. macrochirus), and 0.009 (O. mykiss), respectively. These values lie far below the cut-off and suggest that the pesticides in these datasets had a sufficiently high structural diversity.
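As a minimal sketch of the similarity calculation, fingerprints can be represented as sets of "on" bit positions. The Tanimoto coefficient itself is standard; how the consensus fingerprint was built in the study is not stated, so the majority-vote rule used below (a bit is kept if it is set in at least half of the molecules) is purely an assumption, as are all function names and the toy fingerprints.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def consensus_fingerprint(fingerprints, min_fraction=0.5):
    """Assumed rule: keep bits set in at least `min_fraction` of the molecules."""
    counts = {}
    for fp in fingerprints:
        for bit in fp:
            counts[bit] = counts.get(bit, 0) + 1
    cutoff = min_fraction * len(fingerprints)
    return {bit for bit, count in counts.items() if count >= cutoff}


def mean_similarity_to_consensus(fingerprints):
    """Average Tanimoto similarity of each molecule to the consensus fingerprint."""
    consensus = consensus_fingerprint(fingerprints)
    return sum(tanimoto(fp, consensus) for fp in fingerprints) / len(fingerprints)


toy = [{1, 2, 3}, {2, 3, 4}, {7, 8}]  # hypothetical fingerprints
print(tanimoto(toy[0], toy[1]))
print(round(mean_similarity_to_consensus(toy), 3))
```

A dataset of near-identical molecules would score close to 1 on this average, whereas a structurally diverse set scores close to 0.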
2.3. QSTR analysis
Here, the EL-based L-QSTR and G-QSTR models were established for predicting the aquatic toxicities of structurally diverse chemical pesticides in different crustacean species (D. magna, A. bahia, G. fasciatus and P. duorarum). Further, the ISC based linear QSAAR models were also constructed using the aquatic toxicity data of crustacean (D. magna) and fish species. A brief account of these approaches is provided here.
2.3.1. EL-modeling methods
Ensemble learning (EL) is a machine learning paradigm in which multiple learners are trained to solve the same problem. An ensemble contains a number of learners, usually called base learners.61 The generalization ability of an ensemble is usually much stronger than that of its base learners, since ensemble learning is able to boost weak learners into accurate predictors. The DTF and DTB are ensembles of single decision trees (SDTs). The bagging technique implemented in DTF reduces the variance associated with prediction and improves the prediction accuracy. In this technique, several bootstrap samples are drawn from the data, the prediction method is applied to each bootstrap sample, and the results are combined to obtain the overall prediction.62 Theoretically, if a training set $D$ consists of data $\{(X_i, Y_i),\ i = 1, 2, \ldots, n\}$, where $Y_i$ is the real-valued response and $X_i$ is the $p$-dimensional predictor variable for the $i$th instance, a predictor of $E(Y \mid X = x) = f(x)$ is denoted by $C_n(x) = h_n(D_1, \ldots, D_n)(x)$, where $h_n$ is the $n$th hypothesis. Finally, the bagged predictor is obtained as $C_{n;B}(x) = E^{*}[C_n^{*}(x)]$, the expectation of the predictor refitted on bootstrap resamples of $D$.63 In DTF, a number of independent trees are grown in parallel, and they do not interact until all of them have been built.64
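The bagging idea can be sketched in a few lines. The example below is not the DTF implementation used in the study: it replaces full decision trees with one-split regression stumps on a single predictor, and all names and the toy data are hypothetical. It only shows the three steps described above (draw bootstrap samples, fit a base learner to each, average the predictions).

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on a single predictor."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_l, mean_r)
    if best is None:  # only one distinct x value: predict the mean
        mean_y = sum(ys) / len(ys)
        return lambda x: mean_y
    _, threshold, mean_l, mean_r = best
    return lambda x: mean_l if x <= threshold else mean_r


def bagged_predictor(xs, ys, n_trees=50, seed=0):
    """Average the predictions of stumps fitted to bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


# Toy data: a step in the response at x = 5
xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5
model = bagged_predictor(xs, ys)
print(model(1), model(8))
```

Because each stump sees a different bootstrap sample, the individual split points vary, and averaging them smooths the prediction near the step; this is the variance reduction attributed to bagging.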
In DTB, the stochastic gradient boosting technique improves the prediction accuracy by applying the function repeatedly in a series.65 For the overall prediction, boosting uses a weighted average of the results obtained by applying a prediction method to various samples. The DTB generates a series of trees, with the output of one tree going into the next tree in the series. The DTB algorithm minimizes the loss function on the training set $\{x, y\}$. After each iteration, $F$ represents the sum of all trees built so far: $F_m(x) = F_{m-1}(x) + \mathrm{Tree}_m(x)$, where $m$ is the number of trees built so far. Regularization is achieved by shrinkage, which modifies the update rule to $F_m(x) = F_{m-1}(x) + \nu \gamma_m h_m(x)$, $0 < \nu \le 1$, where $\nu$ is the learning rate, $\gamma_m$ is the step length and $h_m(x)$ is the base learner. The number of trees and the depth of each tree are the model parameters in the DTF and DTB analyses. Here, the DTF and DTB approaches were used to develop nonlinear L-QSTR and G-QSTR models for toxicity prediction of pesticides in multiple crustacean test species.
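The shrinkage update can be illustrated with a small sketch of gradient boosting for squared-error loss. It is not the DTB implementation used in the study: the base learners are one-split stumps on a single predictor, the step length is fixed at 1 (the leaf means already minimize the squared error of the residuals), and the function names, default parameter values and toy data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """One-split regression tree on a single predictor (least squares)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # a single distinct x value: predict the mean
        mean_y = sum(ys) / len(ys)
        return lambda x: mean_y
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr


def boost(xs, ys, n_trees=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting: F_m = F_(m-1) + learning_rate * h_m."""
    rng = random.Random(seed)
    n = len(xs)
    f0 = sum(ys) / n                      # initial constant model F_0
    current = [f0] * n                    # F_(m-1) evaluated on the training set
    stumps = []
    k = max(2, int(subsample * n))
    for _ in range(n_trees):
        idx = rng.sample(range(n), k)     # random subsample, without replacement
        residuals = [ys[i] - current[i] for i in idx]
        h = fit_stump([xs[i] for i in idx], residuals)
        stumps.append(h)
        current = [c + learning_rate * h(x) for c, x in zip(current, xs)]
    return lambda x: f0 + learning_rate * sum(h(x) for h in stumps)


xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5
model = boost(xs, ys)
print(round(model(0), 3), round(model(9), 3))
```

A smaller learning rate makes each tree contribute less, so more trees are needed to reach the same fit; this trade-off is why the number of trees is tuned together with the shrinkage.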
2.3.2. Model validation
The robustness of the developed nonlinear QSTR and QSAAR models was verified using several statistical validation metrics. Both internal and external validation strategies were adopted. For internal validation, a 5-fold CV procedure was used, whereas external validation was performed with the external test data, which were withheld during the training phase. Such test sets (when defined prior to analysis) are commonly accepted as the gold standard for assessing the real predictivity of a QSAR model.66 However, the external validation results of a QSAR model depend largely on the distribution of compounds in the training and test sets; a distribution of dissimilar compounds between the two sets may lead to poor external validation results. The validation strategies check the reliability of the developed models for possible application to a new set of data and assess the confidence of such predictions.67 Since the main objective of the QSTR/QSAAR analysis here is to develop robust models capable of making accurate and reliable predictions of the toxicological effect of an unknown chemical in multiple test species, the models were validated using the test set compounds to check their predictive power. Accordingly, the external validation metrics CCC (concordance correlation coefficient), Q2F1, Q2F2, Q2F3 and r2m were taken into account.68–73 The model fitness parameters R2 and root mean squared error (RMSE) were also reported for the developed models. Further, the Y-randomization test was performed to check for chance correlation in the data matrix. In this test, the dependent variable is randomly scrambled and a new model is developed using the original independent variable matrix.74 A coefficient of determination of the non-random model (R2) that exceeds the average value for the random models (R2r) argues against chance correlation. The extent of the difference between R2 and R2r, which signifies the reliability of the developed model, was determined in terms of cR2p.75 The threshold value of cR2p is 0.5, and a model exceeding this value can be considered unlikely to be the outcome of mere chance.
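The external validation coefficients can be written out explicitly. The sketch below uses the definitions of Q2F1, Q2F2, Q2F3, CCC and cR2p that are common in the QSAR literature; the study does not print its formulas, so these forms are an assumption, and r2m is omitted. Function names are hypothetical. The R2 and R2r inputs in the last lines are the values reported in this paper for the DTF L-QSTR model (training R2 from Table 3, R2r from the Y-scrambling test).

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _press(obs, pred):
    """Sum of squared prediction errors."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred))


def rmse(obs, pred):
    return math.sqrt(_press(obs, pred) / len(obs))


def q2_f1(obs, pred, train_obs):
    """Test-set errors relative to deviations from the training mean."""
    m = _mean(train_obs)
    return 1 - _press(obs, pred) / sum((o - m) ** 2 for o in obs)


def q2_f2(obs, pred):
    """Test-set errors relative to deviations from the test mean."""
    m = _mean(obs)
    return 1 - _press(obs, pred) / sum((o - m) ** 2 for o in obs)


def q2_f3(obs, pred, train_obs):
    """Mean squared test error relative to the training-set variance."""
    m = _mean(train_obs)
    train_var = sum((y - m) ** 2 for y in train_obs) / len(train_obs)
    return 1 - (_press(obs, pred) / len(obs)) / train_var


def ccc(obs, pred):
    """Concordance correlation coefficient."""
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    denom = (sum((o - mo) ** 2 for o in obs)
             + sum((p - mp) ** 2 for p in pred)
             + len(obs) * (mo - mp) ** 2)
    return 2 * cov / denom


def c_r2p(r2, r2_random):
    """cR2p = R * sqrt(R2 - R2r), with R the square root of R2."""
    return math.sqrt(r2) * math.sqrt(r2 - r2_random)


# Reported DTF L-QSTR values: training R2 = 0.938, R2r = 0.005
print(round(c_r2p(0.938, 0.005), 3))
```

Under this definition the reported R2 and R2r values reproduce the cR2p of 0.935 quoted for the DTF L-QSTR model, which suggests the study used the same form.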
2.3.3. Applicability domain analysis
The applicability domain (AD) of a QSTR model should be defined before it is used for screening new chemicals. The AD is the physico-chemical, structural or biological space on which the model has been trained and within which it is applicable for making predictions for new compounds.76 The AD of the constructed QSTR/QSAAR models was defined using the leverage method, in which the leverage of the ith compound is calculated as
$$h_i = x_i^{T}(X^{T}X)^{-1}x_i,$$
where $x_i$ is the vector of MDs for the ith compound and $X$ is the n × m matrix of the m model MD values for the n training set compounds. A value of $h_i$ greater than the critical value $h^*$ indicates that the structure of the compound differs substantially from those used for the calibration. The $h^*$ value can be calculated76 as
$$h^* = \frac{3(p + 1)}{n},$$
where p is the number of variables used in the model and n is the number of training data. However, a major limitation of this method is that the value of h*, and hence the number of compounds inside or outside the AD of a model, depends on the number of compounds in the training data. The AD of the QSTR models was also analyzed by the standardization approach.77
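The leverage calculation can be sketched without external libraries. The code below assumes the commonly used warning threshold h* = 3(p + 1)/n and works on the raw descriptor matrix without centering or an intercept column, since the text does not specify either; the function names and the small matrix are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def leverages(train_x, query_x):
    """h_i = x_i^T (X^T X)^-1 x_i for each row of query_x."""
    p = len(train_x[0])
    xtx = [[sum(row[i] * row[j] for row in train_x) for j in range(p)]
           for i in range(p)]
    inv = invert(xtx)
    result = []
    for x in query_x:
        tmp = [sum(inv[i][j] * x[j] for j in range(p)) for i in range(p)]
        result.append(sum(x[i] * tmp[i] for i in range(p)))
    return result


def critical_leverage(p, n):
    """Assumed warning threshold h* = 3(p + 1)/n."""
    return 3 * (p + 1) / n


train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]  # toy descriptors
print([round(h, 3) for h in leverages(train, train)])
print(round(critical_leverage(6, 356), 4))  # six descriptors, 356 training compounds
```

The leverages of the training compounds always sum to the number of descriptors, which is a convenient sanity check; it also shows why h* shrinks as the training set grows, the limitation noted above.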
3. Results and discussion
3.1. Local QSTR modeling
Here, EL-based L-QSTR models were constructed to predict the aquatic toxicities of diverse pesticides in D. magna using six descriptors (Table 1). A local model was constructed with toxicity data in a single crustacean species (D. magna) and applied to other crustacean species (A. bahia, G. fasciatus and P. duorarum). L-QSTR models were developed using the DTF and DTB algorithms. The optimal architectures and the model parameters of the two models for the D. magna toxicity data were determined using a 5-fold CV. The average RMSE in the training and CV data for the two models (DTF and DTB) were 0.49, 1.23 and 0.31, 1.24, respectively. In 5-fold Y-scrambling, the R2 and cR2p values were 0.005, 0.935 (DTF) and 0.004, 0.961 (DTB), respectively, which revealed that the original L-QSTR models are unlikely to arise as a result of chance correlation. The architectures and the optimal parameters of the constructed L-QSTR models determined through the internal and external validation are given in Table 2.
Table 2. Optimal parameters in L-QSTR and G-QSTR models.
| Model parameters | Local-QSTR models | Global-QSTR models |
| DTF-QSTR | ||
| Number of trees | 200 | 245 |
| Maximum depth of any tree in the forest | 26 | 26 |
| Average number of group splits in each tree | 218.7 | 248.3 |
| DTB-QSTR | ||
| Number of trees | 405 | 404 |
| Maximum depth of any tree in the series | 11 | 11 |
| Average number of group splits in each tree | 580.0 | 593.9 |
The L-QSTR models were applied to the test data and yielded RMSE and R2 values of 0.37, 0.941 (DTF) and 0.31, 0.958 (DTB), respectively. The models thus yielded high correlations between the measured and model-predicted endpoint toxicity values in both the training and test data (Table 3). Fig. 3 shows the plot of the model-predicted toxicity values against the experimental values. The measured and predicted values agree closely across the entire range. The closely matching pattern of variation of the measured and model-predicted values (Fig. 3) and the reasonably low prediction errors (Table 3) suggest a good fit of the developed L-QSTR models to the datasets and the adequacy of the selected models for predicting the toxicity of the pesticides. Both the DTF and DTB based L-QSTR models were then applied to the other three test species to predict the toxicities of pesticides. The results (Table 3) suggest that both L-QSTR models successfully predicted the toxicities of the pesticides in all three species. The high correlations (R2) between the measured and predicted toxicity values (Table 3) may be due to the high similarity among the training (TSI 0.026), test (TSI 0.021) and external test sets (0.027 A. bahia; 0.027 G. fasciatus; 0.042 P. duorarum) and to the interpolation capacity (possible over-fitting) of the model.
Table 3. Performance parameters for the L-QSTR models in multiple crustacean test species.
| Model/data set | RMSE | R2 | Q2F1 | Q2F2 | Q2F3 | CCC | r2m |
| Coefficient threshold | — | 0.5 (training), 0.6 (test) | 0.7 | 0.7 | 0.7 | 0.85 | 0.65 |
| DTF L-QSTR | |||||||
| Training set | 0.55 | 0.938 | — | — | — | — | — |
| Test set | 0.37 | 0.941 | 0.937 | 0.934 | 0.962 | 0.964 | 0.805 |
| A. bahia | 0.47 | 0.947 | 0.897 | 0.896 | 0.939 | 0.933 | 0.627 |
| G. fasciatus | 0.74 | 0.956 | 0.884 | 0.857 | 0.848 | 0.901 | 0.474 |
| P. duorarum | 0.95 | 0.971 | 0.909 | 0.885 | 0.747 | 0.922 | 0.584 |
| DTB L-QSTR | |||||||
| Training set | 0.43 | 0.963 | — | — | — | — | — |
| Test set | 0.31 | 0.958 | 0.956 | 0.955 | 0.974 | 0.977 | 0.906 |
| A. bahia | 0.23 | 0.987 | 0.974 | 0.974 | 0.985 | 0.985 | 0.862 |
| G. fasciatus | 0.44 | 0.983 | 0.958 | 0.948 | 0.945 | 0.969 | 0.746 |
| P. duorarum | 0.60 | 0.974 | 0.964 | 0.954 | 0.899 | 0.973 | 0.823 |
Fig. 3. Plot of the measured and model predicted endpoint toxicity values of pesticides in the training and test sets of (a) DTF L-QSTR, and (b) DTB L-QSTR models.
External validation coefficients (CCC, Q2F1, Q2F2, Q2F3 and r2m) were derived for the test data (D. magna) and the other three species. OECD principle 4 advocates a rigorous validation of the constructed QSTR models prior to applying them to new chemicals. The values of these coefficients, along with their respective thresholds78,79 and the quality metric R2, are given in Table 3. The validation metrics of the developed L-QSTR models meet the prescribed thresholds, with the exception of r2m for the DTF model in the three external species, which again demonstrates the high predictive ability of the L-QSTR models.
Some studies in the literature have reported linear QSAR models for pesticide toxicity prediction in Daphnia.15,80–82 An exact comparison of the present study with the previous ones is difficult because of the differences in the composition of the modeling and validation sets. However, these studies considered a limited number of compounds (n = 10–263) and reported correlation (R2) values in the range of 0.590 to 0.895, which are lower than those achieved in our study.
3.2. Global QSTR modeling
The EL-based G-QSTR models (DTF and DTB) were constructed using the combined toxicity dataset of all four crustacean test species (n = 511) and a set of five MDs commonly selected by the two approaches. The constructed G-QSTR models thus have wider application domains in terms of both the chemicals and the test species. Y-scrambling and external validation (test data) were performed to check for chance correlation and to verify the applicability of the constructed G-QSTR models. In CV, the average RMSE values in the training and CV data were 0.50, 1.28 (DTF) and 0.36, 1.29 (DTB), respectively. The low R2 and high cR2p values of 0.002, 0.931 (DTF) and 0.003, 0.961 (DTB), respectively, in the Y-randomization test indicated that the original G-QSTR models are unlikely to be the result of chance correlation. The statistical coefficients calculated for the test data are summarized in Table 4. The values of all the coefficients were above their respective thresholds.78,79 The plot of the actual and G-QSTR model predicted toxicity values (Fig. 4) in each of the test species considered here suggested an excellent agreement between them. The results show that the performances of the two G-QSTR models (DTF and DTB) are comparable. In the G-QSTR model, the high correlations (R2) between the measured and predicted toxicity values (Table 4) may be due to the high similarity of the compounds in the training (TSI 0.027) and test (TSI 0.024) sets.
Table 4. Performance parameters for the G-QSTR models.
| Model/data set | RMSE | R2 | Q2F1 | Q2F2 | Q2F3 | CCC | r2m |
| DTF G-QSTR | |||||||
| Training set | 0.57 | 0.932 | — | — | — | — | — |
| Test set | 0.38 | 0.943 | 0.939 | 0.939 | 0.960 | 0.967 | 0.824 |
| DTB G-QSTR | |||||||
| Training set | 0.41 | 0.962 | — | — | — | — | — |
| Test set | 0.31 | 0.960 | 0.959 | 0.959 | 0.973 | 0.980 | 0.930 |
Fig. 4. Plot of the measured and model predicted endpoint toxicity values of pesticides in the training and test sets of (a) DTF G-QSTR, and (b) DTB G-QSTR models.
An inter-comparison of the L-QSTR and G-QSTR models established in this study revealed that their performances were closely comparable and that both models successfully predicted the toxicities of structurally highly diverse pesticides in the different crustacean test species considered here (Tables 3 and 4). The excellent performance of the QSTR models could further be attributed to the fact that both EL methods (DTF and DTB) successfully captured the nonlinearities in the data. The bagging and boosting algorithms implemented in these models are known to improve model accuracy.
3.3. QSAAR modeling
The QSAAR model extrapolates data for one toxicity endpoint to another toxicity endpoint and can be used to determine the species-specific toxicity of a chemical.43 A QSAAR is a mathematical relationship between two different biological endpoints measured in the same species or the same endpoint in different species. This approach is widely used for the extrapolation of toxicological data from a surrogate species to a predicted species. In this study, it was investigated whether acute toxicity data for the invertebrate D. magna could be used to develop a model for predicting in vivo toxicity to vertebrate fish. The interspecies toxicity correlations were examined prior to the QSAAR modeling. The good interspecies correlation (R2) obtained in this study between D. magna and the fish species (O. mykiss, 0.773; L. macrochirus, 0.795) supports the idea that the toxicity data for one organism (D. magna) can be used to predict the toxicity to another organism (fish). Here, we used the MLR technique to develop linear QSAAR models and to select the descriptor. Hydrophobicity has been identified as an important parameter for describing the toxicity of compounds to D. magna. The QSAAR models were established for the pesticides common to D. magna and each of the two fish species. Accordingly, MLR based QSAAR models were developed between D. magna and O. mykiss and between D. magna and L. macrochirus. The numbers of pesticides common to D. magna and the fish species were 294 (L. macrochirus) and 318 (O. mykiss). The D. magna toxicity was taken as the independent variable and the toxicity in the other species (fish) as the dependent variable. The respective linear equations (training) obtained were
pLC50 (L. macrochirus) = 0.09 + 0.18 (Log P) + 0.67 (pEC50, D. magna); n = 235; R2 = 0.831; RMSE = 0.65; F = 570.92; p < 0.00;
pLC50 (O. mykiss) = 0.27 + 0.17 (Log P) + 0.67 (pEC50, D. magna); n = 254; R2 = 0.813; RMSE = 0.65; F = 545.77; p < 0.00.
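The two regression equations can be applied directly. The sketch below only encodes the coefficients reported above; the function name, the dictionary and the example compound (Log P = 3.0, D. magna pEC50 = 2.0) are hypothetical.

```python
# (intercept, Log P coefficient, D. magna pEC50 coefficient) as reported above
QSAAR_COEFFICIENTS = {
    "L. macrochirus": (0.09, 0.18, 0.67),
    "O. mykiss": (0.27, 0.17, 0.67),
}


def predict_fish_plc50(species, log_p, pec50_daphnia):
    """Predict the fish pLC50 from Log P and the D. magna pEC50."""
    intercept, b_log_p, b_daphnia = QSAAR_COEFFICIENTS[species]
    return intercept + b_log_p * log_p + b_daphnia * pec50_daphnia


# Hypothetical pesticide: Log P = 3.0, pEC50 (D. magna) = 2.0
for species in QSAAR_COEFFICIENTS:
    print(species, round(predict_fish_plc50(species, 3.0, 2.0), 2))
```

For this hypothetical compound the equations give 1.97 (L. macrochirus) and 2.12 (O. mykiss). Such a prediction is only meaningful when the compound lies inside the applicability domain of the respective model.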
The developed ISC QSAAR models applied to the respective test data yielded RMSE and R2 values of 0.68, 0.840 (L. macrochirus) and 0.68, 0.826 (O. mykiss), respectively. The values of the statistical coefficients for the test set (Table 5) were above their respective thresholds (except for r2m). The success of these QSAAR models may be attributed to a similar mode of action, which results in similar descriptors being required to predict the toxicity in each organism. For new pesticides that fit our defined selection criteria, the toxicological effects on D. magna and fish can be estimated using our developed models without any additional animal testing. Tremolada et al.83 and Zvinavashe et al.15 developed QSAAR models for predicting the toxicities of pesticides in fish (O. mykiss and C. carpio) using D. magna toxicity data and reported R2 values of 0.59 (n = 267) and 0.94 (n = 9), respectively. In both studies, simple toxicity–toxicity models were constructed. However, both studies considered fewer chemicals than the present report.
Table 5. Performance parameters for the QSAAR models.
| Data set/models | RMSE | R2 | Q2F1 | Q2F2 | Q2F3 | CCC | r2m |
| QSAAR-1 | |||||||
| Training set | 0.65 | 0.831 | — | — | — | — | — |
| Test set | 0.68 | 0.840 | 0.831 | 0.831 | 0.818 | 0.900 | 0.636 |
| QSAAR-2 | |||||||
| Training set | 0.65 | 0.813 | — | — | — | — | — |
| Test set | 0.68 | 0.826 | 0.817 | 0.817 | 0.794 | 0.894 | 0.598 |
3.4. Applicability domain analysis
Here, the leverage and standardization methods were used to define the ADs of the developed L-QSTR, G-QSTR and QSAAR models, and the corresponding Williams plots (Fig. 5) were used to detect the response outliers (standardized residuals >3) and the structurally influential chemicals (h > h*) in the training data. In both the DTF and DTB based L-QSTR models, a total of 9 high leverage compounds and 2 response outliers were detected, whereas in the G-QSTR models the corresponding numbers were 7 and 3, respectively. On the other hand, there was a single high leverage compound (fenbutatin oxide) in both the QSAAR models. The structures of the outliers and structurally influential compounds in each model are presented in Table S3 (ESI†). Further, the outliers in the training, test and external datasets were identified using the standardization approach.77 The analysis revealed that in the L-QSTR model fifteen compounds in the training, three in the test and one in the external dataset (A. bahia) were out of the AD. In G-QSTR, thirteen compounds in the training and four in the test set were out of the AD. In the QSAAR models (L. macrochirus and O. mykiss), three and four compounds in the training data and one and two compounds in the test data, respectively, were detected as outliers (Table S4, ESI†). The anomalous behavior of the compounds outside the ADs of the models may be due to the fact that the set of the selected MDs could not capture some relevant structural features present in these molecules and that their biological mechanism is different from that of the remaining chemicals. However, the average predicted toxicity for the test molecules that are inside the AD is close to the average predicted toxicity of the molecules in the training set. Moreover, the presence of a molecule inside or outside the AD reveals nothing regarding the difference or the correlation between the predicted and observed values of toxicity for molecules in the test set.
For future predictions, the developed local, global and QSAAR models can be used to predict the toxicity of a new compound, provided that the compound lies within the AD of the respective model.
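The leverage check described above can be sketched as follows. This is a minimal illustration, assuming the usual definitions h = x(X'X)^-1 x' and warning leverage h* = 3(p + 1)/n, with an intercept column added to the training descriptor matrix; the function names are hypothetical and the exact settings used in this study may differ.

```python
def _invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("descriptor matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def leverage_domain(train_descriptors, query_descriptors):
    """Return (h_star, leverages) for the query compounds.

    Assumption: an intercept column is prepended, so the number of model
    parameters is p + 1 and the warning leverage is h* = 3(p + 1)/n.
    """
    x_train = [[1.0] + list(row) for row in train_descriptors]
    k = len(x_train[0])
    xtx = [[sum(r[i] * r[j] for r in x_train) for j in range(k)] for i in range(k)]
    inv = _invert(xtx)
    h_star = 3.0 * k / len(x_train)

    def leverage(row):
        x = [1.0] + list(row)
        return sum(x[i] * inv[i][j] * x[j] for i in range(k) for j in range(k))

    return h_star, [leverage(row) for row in query_descriptors]


if __name__ == "__main__":
    # Illustrative single-descriptor training set (values are not from the study).
    train = [[1.0], [2.0], [3.0], [4.0]]
    h_star, h = leverage_domain(train, [[2.5], [10.0]])
    for value in h:
        print(f"h = {value:.2f}, inside AD: {value <= h_star}")
```

A compound with h > h* is structurally distant from the training data, so its prediction is an extrapolation and should be treated with caution. In a Williams plot these leverages are plotted against the standardized residuals.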
Fig. 5. Williams plot for the (a) L-QSTR, (b) G-QSTR, and (c) QSAAR models.
3.5. Mechanistic interpretation of QSTRs
Principle 5 of the OECD guidelines requires that a QSAR model be mechanistically interpretable. Here, a total of seven MDs (nta, TPSA, Log P, Aweight, J, Hy and PEOEVSA13) were considered for developing the L-QSTR, G-QSTR, and QSAAR models. The contributions of the selected MDs in the different QSTR models are presented in Fig. 6.
Fig. 6. Plots of the contributions of the MDs in (a) L-QSTR, and (b) G-QSTR models.
In the local and global QSTR models, Log P has the highest (100%) contribution, followed by PEOEVSA13. Except for TPSA and Hy, all the MDs were positively correlated with the end-point. A positive relationship between a descriptor and the endpoint (pEC50 or pLC50) indicates that the descriptor has a direct influence on the toxicity of the chemical, whereas a negative correlation reveals an inverse effect on the toxicity. Log P is a measure of the hydrophobicity of a chemical, reflecting the ability of a compound to form non-covalent interactions with its environment and to dissolve and persist in water or in a lipidic environment. A larger Log P indicates a stronger ability of a chemical to permeate the cell membrane of an organism and, therefore, to interact with its target in the organism.84 PEOEVSA13, a MOE-type descriptor, is calculated using partial charge and surface area contributions. These descriptors largely consist of physicochemical properties, subdivided surface areas, connectivity and shape indices, and atom and bond counts, featuring whole molecule properties.85 TPSA is defined as the part of the surface area of the molecule associated with N, O and S atoms and the H atoms bonded to any of these atoms.86 It correlates well with passive molecular transport through membranes and allows for the prediction of the transport properties of chemicals.87 Hy is a semi-empirical index related to the hydrophilicity of compounds based on count descriptors.88 A negative correlation of Hy with toxicity suggests that the presence of hydrophilic groups (OH, SH, NH) in a molecule would result in a decrease of the toxicity. The nta and Aweight are constitutional descriptors and represent the total number of atoms and the average atomic weight, respectively. These have a major role in defining the molecular density, molecular mass, rigidity and presence of individual atoms in the chemicals.
Balaban's J index represents branching in a molecule, and a higher value of J indicates higher branching in a compound.89 The more branched molecules are less toxic, probably due to their lower membrane penetration ability.24 To further investigate the relationships of the selected MDs with the end-point toxicity of pesticides, we selected the fractions (10% each) of the compounds that exhibited the highest and lowest toxicities (pEC50) in D. magna. For these compounds, the mean values of all the descriptors (except TPSA, J and Hy) were high in the high toxicity compounds and low in the low toxicity compounds. The Box–Whisker plots of the MD values for the pesticides exhibiting low and high toxicities (10%) are shown in Fig. S1 (ESI†). Thus, the selected descriptors show quantitative mechanistic relationships with the end-point properties investigated here.
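The comparison between the most and least toxic compounds can be sketched in Python as follows. This is an illustrative outline only: the function name `decile_descriptor_means`, the record layout and the example values are hypothetical, and rounding the 10% group size to the nearest integer is an assumption.

```python
def decile_descriptor_means(records, endpoint="pEC50", fraction=0.10):
    """Mean descriptor values in the lowest and highest toxicity fractions.

    `records` is a list of dicts holding the endpoint and the descriptor values.
    """
    ranked = sorted(records, key=lambda rec: rec[endpoint])
    # Group size: the given fraction of the compounds, at least one compound.
    k = max(1, round(len(ranked) * fraction))
    low, high = ranked[:k], ranked[-k:]
    names = [name for name in ranked[0] if name != endpoint]

    def group_mean(group, name):
        return sum(rec[name] for rec in group) / len(group)

    return {
        name: {"low": group_mean(low, name), "high": group_mean(high, name)}
        for name in names
    }


if __name__ == "__main__":
    # Synthetic records for illustration; these are not data from the study.
    data = [
        {"pEC50": float(i), "LogP": 0.5 * i, "TPSA": 100.0 - i}
        for i in range(1, 21)
    ]
    for name, means in decile_descriptor_means(data).items():
        print(name, means)
```

In the synthetic example, the descriptor that rises with toxicity has a higher mean in the high toxicity group, and the descriptor that falls with toxicity shows the opposite pattern, which mirrors the qualitative contrast described for Log P and TPSA.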
In this study, QSTR models were constructed for predicting the toxicities of diverse chemical pesticides in multiple test species strictly in accordance with the OECD guideline for QSAR modeling for regulatory purposes.90 Accordingly, for QSTR analysis, this study considered the databases reporting well-defined experimental values of the aquatic toxicities of diverse chemicals in crustacean and fish test species with experimental protocols. Further, we adopted an unambiguous and well established modeling procedure with clearly defined methodologies for the calculation and selection of MDs and data processing. The ADs of the developed QSTR models were adequately determined. Towards the statistical quality checks on the developed QSTR models, various stringent coefficients for model fitness, robustness, and validation metrics were computed, which were above their respective thresholds. Further, a convincing mechanistic interpretation of the models was offered and the relevance of the selected MDs in different aquatic toxicity QSTR models here was investigated.
4. Conclusions
For a comprehensive safety assessment of chemicals, toxicological profile data in multiple test species are required. Experimental toxicity testing of chemicals in multiple species is time and resource intensive. In this study, EL based QSTR and QSAAR models were developed for estimating the aquatic toxicities of diverse chemicals in multiple test species strictly in accordance with the OECD guidelines for QSAR modeling. A toxicity dataset of chemical pesticides in crustacean and fish species was considered; local and global QSTR models were established using the DTF and DTB modeling methods, whereas ISC QSAAR models were established to predict toxicity in fish from Daphnia toxicity data. A total of seven MDs were used, and several statistical validation tests performed on the constructed QSTR/QSAAR models revealed a high predictivity for these models and rendered high statistical confidence. The performances of both the local and global QSTR models were excellent and comparable. The QSTR models developed in the present study performed better than those reported earlier for the prediction of the toxicities of pesticides. The excellent predictivity and generalization achieved by the QSTR models here may be due to their ability to capture the nonlinearities in the data. The proposed models will help in reducing the cost and the number of animals used in toxicity testing of chemicals, and in generating reliable toxicity data in multiple test species to streamline the risk assessment process of diverse chemicals.
Acknowledgments
The authors thank the Director, CSIR-Indian Institute of Toxicology Research, Lucknow (India) for his keen interest in this work and providing all necessary facilities.
Footnotes
†Electronic supplementary information (ESI) available. See DOI: 10.1039/c5tx00321k
References
- Pramanik S., Roy K. Ecotoxicol. Environ. Saf. 2014;101:184–190. doi: 10.1016/j.ecoenv.2013.12.030.
- Scherb H., Voigt K. Environ. Sci. Pollut. Res. 2011;18:695–696. doi: 10.1007/s11356-010-0332-0.
- Rohr J. R., Schotthoefer A. M., Raffel T. R., Carrick H. J., Halstead N., Hoverman J. T., Johnson C. M., Johnson L. B., Lieske C., Piwoni M. D., Schoff P. K., Beasley V. R. Nature. 2008;455:1235–1239. doi: 10.1038/nature07281.
- Planson A. G., Carbonell P., Paillard E., Pollet N., Faulon J. L. Biotechnol. Bioeng. 2012;109:846–850. doi: 10.1002/bit.24356.
- Azarbad H., Niklinska M., vanGestel C. A., vanStraalen N. M., Roling W. F., Laskowski R. Environ. Toxicol. Chem. 2013;32:1992–2002. doi: 10.1002/etc.2269.
- Daouk S., Copin P. J., Rossi L., Chevre N., Pfeifer H. R. Environ. Toxicol. Chem. 2013;32:2035–2044. doi: 10.1002/etc.2276.
- Cardinale B. J., Emmett D. J., Gonzalez A., Hooper D. U., Perrings C., Venail P., Narwani A., Mace M. G., Tilman D., Wardle D. A., Kinzig A. P., Daily G. C., Loreau M., Grace J. B., Larigauderie A., Srivastava D. S., Naeem S. Nature. 2012;486:59–67. doi: 10.1038/nature11148.
- Ahrens A., Traas T. P. J. Exposure Sci. Environ. Epidemiol. 2007;17:S7–S15. doi: 10.1038/sj.jes.7500602.
- Worth A. P., Bassan A., DeBruijn J., Gallegos-Saliner A., Netzeva G., Patlewicz G., Pavan M., Tsakovska I., Eisenreich S. SAR QSAR Environ. Res. 2007;18:111–125. doi: 10.1080/10629360601054255.
- European Commission, Directive 2006/121/EC of the European Parliament and of the Council of 18 December 2006 amending Council Directive 67/548/EEC on the approximation of laws, regulations and administrative provisions relating to the classification, packaging and labelling of dangerous substances in order to adapt it to Regulation (EC) no. 1907/2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) and establishing a European Chemicals Agency. Off. J. Eur. Union (2006), L 396/850 of 30.12.2006, Office for Official Publications of the European Communities (OPOCE), Luxembourg
- Roy K., Kar S. and Das R. N., Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment, Academic Press, London, UK, 2015, ISBN: 978-0-12-801505-6.
- Roy K., Kar S. and Das R. N., A Primer on QSAR/QSPR Modeling Fundamental Concepts. Springer Briefs in Molecular Science, Springer Cham Heidelberg, New York, London, 2015, doi: 10.1007/978-3-319-17281-1.
- Organization for Economic Cooperation and Development (OECD). Guidance Document on the Validation of (Quantitative) Structure–activity Relationships [(Q)SAR] Models, ENV/JM/MONO 2 (2007), 2007, 1–154
- Huang C.-P., Wang Y.-J., Chen C.-Y. Ecotoxicol. Environ. Saf. 2007;67:439–446. doi: 10.1016/j.ecoenv.2006.06.007.
- Zvinavashe E., Du T., Griff T., van den Berg H. H., Soffers A. E., Vervoort J., Murk A. J., Rietjens I. M. Chemosphere. 2009;75:1531–1538. doi: 10.1016/j.chemosphere.2009.01.081.
- Aruoja V., Sihtmae M., Dubourguier H.-C., Kahru A. Chemosphere. 2011;84:1310–1320. doi: 10.1016/j.chemosphere.2011.05.023.
- Bertinetto C., Duce C., Solaro R., Tiné M. R., Micheli A., Héberger K., Miličević A., Nikolić S. MATCH. 2013;70:1005–1021.
- Cassani S., Kovarich S., Papa E., Roy P. P., vanderWal L., Gramatica P. J. Hazard. Mater. 2013;258–259:50–60. doi: 10.1016/j.jhazmat.2013.04.025.
- Lagunin A. A., Zakharov A. V., Filimonov D. A., Poroikov V. V. SAR QSAR Environ. Res. 2007;18:285–298. doi: 10.1080/10629360701304253.
- Verslycke T., Ghekiere A., Raimondo S., Janssen C. Ecotoxicology. 2007;16:205–219. doi: 10.1007/s10646-006-0122-0.
- EPA; US Environmental Protection Agency, Ecological Effects Test Guidelines, OPPTS 850.1010 Aquatic Invertebrate Acute Toxicity Test, Freshwater Daphnids, Prevention, Pesticides and Toxic Substances (7101), Washington, D.C. EPA 712-C-96-114, 1996
- US EPA; Marine Toxicity Identification Evaluation (TIE): Phase I Guidance Document. Office of Research and Development, EPA 600-R-96-054, September 1996
- OECD, Test No. 202: Daphnia sp. Acute Immobilisation Test, OECD Guidelines for the Testing of Chemicals, Section 2, OECD Publishing, Paris, France, 2004, doi: 10.1787/9789264069947-en.
- Katritzky A. R., Slavov S. H., Stoyanova-Slavova I. S., Kahn I., Karelson M. J. Toxicol. Environ. Health, Part A. 2009;72:1181–1190. doi: 10.1080/15287390903091863.
- Singh K. P., Gupta S., Rai P. Ecotoxicol. Environ. Saf. 2013;95:221–233. doi: 10.1016/j.ecoenv.2013.05.017.
- Singh K. P., Gupta S., Kumar A., Mohan D. Chem. Res. Toxicol. 2014;27:741–753. doi: 10.1021/tx400371w.
- Singh K. P., Gupta S., Basant N. RSC Adv. 2014;4:64443–64456.
- Singh K. P., Gupta S., Basant N. Chemosphere. 2015;120:680–689. doi: 10.1016/j.chemosphere.2014.10.025.
- Basant N., Gupta S., Singh K. P. Chemosphere. 2015;139:246–255. doi: 10.1016/j.chemosphere.2015.06.063.
- Basant N., Gupta S., Singh K. P. J. Chem. Inf. Model. 2015;55:1337–1348. doi: 10.1021/acs.jcim.5b00139.
- Russom C. L., Bradbury S. P., Broderius S. J., Hammermeister D. E., Drummond R. A. Environ. Toxicol. Chem. 1997;16:948–967. doi: 10.1002/etc.2249.
- Yuan H., Wang Y. Y., Cheng Y. Y. J. Mol. Graphics Modell. 2007;26:327–335. doi: 10.1016/j.jmgm.2006.12.009.
- Martin T. M., Grulke C. M., Young D. M., Russom C. L., Wang N. Y., Jackson C. R., Barron M. G. J. Chem. Inf. Model. 2013;53:2229–2239. doi: 10.1021/ci400267h.
- Lyakurwa F., Yang X., Li X., Qiao X., Chen J. Chemosphere. 2014;96:188–194. doi: 10.1016/j.chemosphere.2013.10.039.
- Kulkarni S. A., Raje D. V., Chakrabarti T. SAR QSAR Environ. Res. 2001;12:565–591. doi: 10.1080/10629360108039835.
- Toropov A. A., Benfenati E. J. Mol. Struc.: THEOCHEM. 2004;676:165–169.
- Martin Smiesko E. B. J. Chem. Inf. Comput. Sci. 2004;44:976–984. doi: 10.1021/ci034219j.
- Benfenati M. S. E. J. Chem. Inf. Model. 2005;45:378–389. doi: 10.1021/ci0496494.
- Lyakurwa F. S., Yang X., Li X., Qiao X., Chen J. Chemosphere. 2014;108:17–25. doi: 10.1016/j.chemosphere.2014.02.076.
- Sun L., Zhang C., Chen Y., Li X., Zhuang S., Li W., Lee P. W., Tang Y. Toxicol. Res. 2015;4:452–463.
- Gupta S., Basant N., Singh K. P. RSC Adv. 2015;5:71153–71163.
- Cronin M. T. D., Biological read-across: Mechanistically-based species-species and endpoint-endpoint extrapolations, in In Silico Toxicology: Principles and Applications, ed. M. T. D. Cronin and J. C. Madden, Royal Society of Chemistry, Cambridge, 2010, ch. 18, pp. 446–477.
- Furuhama A., Hasunuma K., Aoki Y. SAR QSAR Environ. Res. 2015;26:301–323. doi: 10.1080/1062936X.2015.1032347.
- Das R. N., Roy K., Popelier P. A. Ecotoxicol. Environ. Saf. 2015;122:497–520. doi: 10.1016/j.ecoenv.2015.09.014.
- Roy K., Das R. N., Popelier P. A. Environ. Sci. Pollut. Res. 2015;22:6634–6641. doi: 10.1007/s11356-014-3845-0.
- Cronin M. T. D., Enoch S. J., Hewitt M., Madden J. C. ALTEX. 2009;28:45–49. doi: 10.14573/altex.2011.1.045.
- Bassan A., Worth A. P. QSAR Comb. Sci. 2008;27:6–20.
- Snelder T. H., Lamouroux N., Leathwick J. R., Pella H., Sauquet E., Shanker U. J. Hydrol. 2009;373:57–67.
- Yang P., Yang Y. H., Zhou B. B., Zomaya A. Y. Curr. Bioinf. 2010;5:296–308.
- Hancock T., Put R., Coomans D., Vander Heyden Y., Everingham Y. A. Chemom. Intell. Lab. Syst. 2005;76:185–196.
- Dietterich T. G. Lect. Notes Comput. Sci. Eng. 2000;1857:1–15.
- Mahjoobi J., Etemad-Shahidi A. Appl. Ocean Res. 2008;30:172–177.
- Singh K. P., Gupta S., Rai P. Atmos. Environ. 2013;80:426–437.
- Cronin M. T. D., Schultz T. W. J. Mol. Struct. 2003;622:39–51.
- OPP Pesticide Ecotoxicity Database, 2014. Available at: http://www.ipmcenters.org/ecotox/ (accessed on October, 2014).
- ChemSpider. http://www.chemspider.com.
- The Chemopy program, http://www.scbdd.com/chemopy_desc/index/.
- Gupta S., Basant N., Singh K. P. Ecotoxicology. 2015;24:873–886. doi: 10.1007/s10646-015-1431-y.
- Reitermanov Z. .
- Zhao C. Y., Zhang H. X., Zhang X. Y., Liu M. C., Hu Z. D., Fan B. T. Toxicology. 2006;217:105–119. doi: 10.1016/j.tox.2005.08.019.
- Ishwaran H., Kogalur U. B. Stat. Probab. Lett. 2010;80:1056–1064. doi: 10.1016/j.spl.2010.02.020.
- Pino-Mejias R., Jimenez-Gamero M. D., Cubiles-de-la-Vega M. D., Pascual-Acosta A. Pattern Recognit. Lett. 2008;29:265–271.
- Bühlmann P., Yu B. Ann. Stat. 2002;30:927–961.
- Singh K. P., Gupta S., Mohan D. J. Hydrol. 2014;511:254–266.
- Friedman J. H. Comput. Stat. Data Anal. 2002;38:367–378.
- Benigni R., Netzeva T. I., Benfenati E., Bossa C., Franke R., Helma C., Hulzebos E., Marchant C., Richard A., Woo Y. P., Yang C. J. Environ. Sci. Health, Part C: Environ. Carcinog. Ecotoxicol. Rev. 2007;25:53–97. doi: 10.1080/10590500701201828.
- Roy K., Mandal A. S. J. Enzyme Inhib. Med. Chem. 2008;23:980–995. doi: 10.1080/14756360701811379.
- Lin L. I. Biometrics. 1992;48:599–604.
- Shi L. M., Fang H., Tong W., Wu J., Perkins R., Blair R. M., Branham W. S., Dial S. L., Moland C. L., Sheehan D. M. J. Chem. Inf. Comput. Sci. 2001;41:186–195. doi: 10.1021/ci000066d.
- Schuurmann G., Ebert R., Chen J., Wang B., Kuhne R. J. Chem. Inf. Model. 2008;48:2140–2145. doi: 10.1021/ci800253u.
- Consonni V., Ballabio D., Todeschini R. J. Chem. Inf. Model. 2009;49:1669–1678. doi: 10.1021/ci900115y.
- Chirico N., Gramatica P. J. Chem. Inf. Model. 2011;51:2320–2335. doi: 10.1021/ci200211n.
- Roy K., Chakraborty P., Mitra I., Ojha P. K., Kar S., Das R. N. J. Comput. Chem. 2013;34:1071–1082. doi: 10.1002/jcc.23231.
- Rücker C., Rücker G., Meringer M. J. Chem. Inf. Comput. Sci. 2007;47:2345–2357. doi: 10.1021/ci700157b.
- Mitra I., Saha A., Roy K. Mol. Simul. 2010;36:1067–1079.
- Netzeva T. I., Worth A. P., Aldenberg A., Benigni R., Cronin M. T. D., Gramatica P., Jaworska J. S., Kahn S., Klpoman G., Marchant C. A. ATLA, Altern. Lab. Anim. 2005;33:155–173. doi: 10.1177/026119290503300209.
- Roy K., Kar S., Ambure P. Chemom. Intell. Lab. Syst. 2015;145:22–29.
- Tropsha A., Golbraikh A., Cho W. J. Bull. Korean Chem. Soc. 2011;32:2397–2404.
- Chirico N., Gramatica P. J. Chem. Inf. Model. 2012;52:2044–2058. doi: 10.1021/ci300084j.
- Vighi M., Garlanda M. M., Calamari D. Sci. Total Environ. 1991;109/110:605–622.
- Toropov A. A., Benfenati E. Chemosphere. 2003;50:403–408.
- Amaury N., Benfenati E., Boriani E., Casalengo M., Chana A., Chaudhry Q., Chretien J. R., Cotterill J., Lemke F. and Piclin N., et al., Results of DEMETRA models, in Quantitative structure–activity relationship (QSAR) for pesticide regulatory purposes, ed. E. Benfenati, Elsevier B.V., 2007, ch. 7, pp. 201–282.
- Tremolada P., Finizio A., Villa S., Gaggi C., Vighi M. Aquat. Toxicol. 2004;67:87–103. doi: 10.1016/j.aquatox.2003.12.003.
- Jiang D. X., Li Y., Li J., Wang G. X. Int. J. Environ. Res. 2011;5:923–938.
- Sun H., Shahane S., Xia M., Austin C. P., Huang R. J. Chem. Inf. Model. 2012;52:1798–1805. doi: 10.1021/ci3001875.
- Ertl P., Rohde B., Selzer P. J. Med. Chem. 2000;43:3714–3717. doi: 10.1021/jm000942e.
- Afantitis A., Melagraki G., Koutentis P. A., Sarimveis H., Kollias G. Eur. J. Med. Chem. 2011;46:497–508. doi: 10.1016/j.ejmech.2010.11.029.
- Todeschini R., Consonni V. and Mannhold R., in Handbook of Molecular Descriptors, ed. H. Kubinyi and H. Timmerman, Wiley-VCH, Weinheim, 2000.
- Thakur M., Makwane P., Tiwari A., Jain L. and Thakur A..
- Fjodorova N., Novich M., Vrachko M., Smirnov V., Kharchevnikova N., Zholdakova Z., Novikov S., Skvortsova N., Filimonov D., Poroikov V., Benfenati E. J. Environ. Sci. Health, Part C: Environ. Carcinog. Ecotoxicol. Rev. 2008;26:201–236. doi: 10.1080/10590500802135578.