TABLE 5. Structural Diversity and Model Performance
| Data set | Average Tanimoto | Average Tmaxᵃ | Best model | Best model ROC |
| Cell viability (EPA test) | 0.15 | 0.38 | SMO (logistic) | 0.75 |
| Caspase activation | 0.15 | 0.33 | Naive Bayesian | 0.75 |
| Salmonella | 0.16 | 0.53 | SMO (logistic)/WFS | 0.78/0.77 |
| Hepatotoxicity | 0.13 | 0.32 | WFS | 0.67 |
ᵃ Tmax: Tanimoto score (Leadscope fingerprints) between a compound in the testing set and its most structurally similar toxic compound in the training set.
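
The Tmax definition in the footnote can be illustrated with a minimal sketch. This is not the authors' code. It assumes each fingerprint is represented as a set of on-bit indices, and the toy sets below are made-up stand-ins for Leadscope fingerprints, which are not reproduced here. The function names (`tanimoto`, `t_max`, `average_t_max`) are hypothetical.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(a | b)


def t_max(test_fp, toxic_train_fps):
    """Similarity of a test compound to its most similar toxic training compound."""
    return max(tanimoto(test_fp, fp) for fp in toxic_train_fps)


def average_t_max(test_fps, toxic_train_fps):
    """Mean of the per-compound Tmax values over the whole testing set."""
    return mean(t_max(fp, toxic_train_fps) for fp in test_fps)


# Toy fingerprints (illustrative only, not real data).
toxic_train = [frozenset({1, 2, 3, 4}), frozenset({2, 5, 6})]
test_set = [frozenset({1, 2, 3}), frozenset({5, 6, 7, 8})]

print(average_t_max(test_set, toxic_train))
```

Under this reading, a low average Tmax means that a typical test compound has no close toxic analogue in the training set.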