Table 14.
End point | Number of chemicals from the prediction set | Descriptors | Training (5-fold cross-validation) * | Test set * |
---|---|---|---|---|
VT | 23,767 | 21 | 0.79 | 0.77 |
NT | 30,971 | 11 | 0.90 | 0.89 |
EPA categories | 25,487 | 15 | 0.79 | 0.81 |
GHS categories | 25,720 | 15 | 0.78 | 0.79 |
LD50 values | 28,954 | 23 | 0.79 | 0.81 |
Balanced accuracy (BA) values are reported as the performance metric for all end points except LD50 values, for which the cross-validated equivalent of the coefficient of determination and the coefficient of determination itself are reported for the training and test sets, respectively. Note: EPA, U.S. Environmental Protection Agency; GHS, U.N. Globally Harmonized System of Classification and Labeling of Chemicals; LD50, dose of a substance that would be expected to kill half the animals in a test group; NT, nontoxic/toxic; VT, very toxic/not very toxic. The number of prediction set chemicals represents the total for that data set, not the number that fell beneath the threshold (e.g., for VT).
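
For readers unfamiliar with the two metrics named in the table note, the following is a minimal sketch of how they are commonly computed. It assumes balanced accuracy is taken as the mean of per-class recall (which reduces to the average of sensitivity and specificity for two classes) and that the coefficient of determination uses the usual residual and total sums of squares; the function names and the toy labels are illustrative only and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes this equals
    (sensitivity + specificity) / 2. Assumed definition, not the study's code."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy, made-up labels (1 = toxic, 0 = nontoxic), purely for illustration.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(balanced_accuracy(toy_true, toy_pred))
```

The cross-validated equivalent is typically obtained by applying the same `r_squared` formula to predictions made on held-out folds, so that each chemical is predicted by a model that did not see it during fitting.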