2022 Nov;135:105249. doi: 10.1016/j.yrtph.2022.105249

Table 1.

Summary of approaches to derive structure-activity relationships, and ultimately structural alerts, for predictive toxicology.

Method to derive the structural alert	Description	Characteristics in terms of data for the SAR, methodology and mechanistic understanding	Strengths	Weaknesses	Illustrative example
Expert Knowledge Based on Toxicological Data	Derived from the knowledge of toxicologists who have experience in assessing the data associated with toxicological properties of a series of chemicals	Data: small number of toxicological data on which to base a hypothesis; Methodology: Expert judgement and opinion; Mechanistic: Presumed high, though precise mechanistic definition may not be possible	Derived from knowledge based on experimental data, supported by mechanistic information	Slow to develop, no performance statistics; may be a misinterpretation from flawed data or a subjective interpretation of data	Ashby and Tennant (1988) who compiled knowledge on genotoxic carcinogens
Expert Knowledge Based on Mechanistic Understanding	Derived from expert knowledge following (non-statistical) analysis of a data set of chemicals using a mechanistic hypothesis	Data: large number of mechanistic data; Methodology: Expert judgement and opinion; Mechanistic: Clear mechanistic hypothesis	Based on expert knowledge (preferably from multiple sources) and potentially creating a broad set of alerts, supported by data or mechanistic understanding. Can be extended broadly without extensive toxicological data.	Labour intensive to develop and requires expert knowledge across a complete mechanism of action or dataset	Enoch and Cronin (2010) and Enoch et al. (2011) who derived alerts for DNA and protein binding respectively on the basis of electrophilic chemistry; Bauer et al. (2018) who derived a decision tree on six classes of mechanisms of action, termed MechoA
Data-Driven Approaches	Use of statistical analyses to determine fragments associated with a particular toxicity	Data: Large data sets required for analysis; Methodology: data mining and machine learning of toxicological data; Mechanistic: Not possible unless assigned after alert development	A rapid method, with readily available performance statistics. The data from which the alerts are derived are available	Requirement for large data sets to achieve significant results. Prone to limited validation (usually restricted to curation). Difficult to assign mechanistic knowledge or validity to the alerts derived as they may be in an uninterpretable “black box” form. Often the fragments are overlapping and require rationalisation	Wedlake et al. (2020) used a Bayesian approach to develop alerts for in vitro data related to MIEs; Claesson and Minidis (2018) developed alerts for reactive metabolite formation; Cui et al. (2019) derived alerts from fingerprints for drug-induced rhabdomyolysis
Chemotype Enrichment	Use of statistical analysis to determine which structural fragments may be significantly associated with a toxicity or effect	Data: Large data sets; Methodology: Data mining of high throughput data; Mechanistic: Driven by the mechanistic hypothesis of the data	Rapid to apply. Provides a statistical outcome to demonstrate the strength of relationship between the activity and structure. Use of readily available alerts.	Currently limited by the need for relatively large data sets and the fragments already available	Wang et al. (2019, 2021) investigated ToxCast endpoints using ToxPrint Chemotypes
Hybrid Approaches Combining Statistical Analysis and Expert Analysis	The purpose here is to use statistical analysis (such as clustering approaches) to find groups within data to be used as leads for expert analysis. This will not produce a comprehensive set of alerts but may find SARs (which can be optimised) that would not be obtained by expert knowledge alone.	Data: Many toxicological data; Methodology: Clustering of data followed by expert judgement and opinion; Mechanistic: No mechanistic understanding unless applied after alert development	A rapid approach to derive knowledge/hypotheses. Supported by data and mechanistic understanding	Evaluating the hypotheses from data mining can be slow and requires expert knowledge.	Hewitt et al. (2013) who applied expert knowledge to the results of cluster analyses on a database of hepatotoxicity data to derive useable alerts for liver toxicity. Wang et al. (2019) used a ToxPrint chemotype enrichment analysis to identify >20 distinct chemical substructural features significantly enriched for sodium-iodide symporter inhibition.
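
The chemotype enrichment row describes a statistical test of association between a structural fragment and an activity. The sketch below illustrates one common way such an association could be tested: a one-sided Fisher exact test on a 2x2 table of fragment presence against activity. It is an assumption-laden illustration, not the procedure of Wang et al. or any other cited study. The function names, the fragment labels ("nitro", "ester"), the toy records and the 0.05 threshold are all hypothetical, and no multiple-testing correction is applied, although one would normally be needed when many fragments are screened.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for enrichment of a fragment among actives.

    a: active chemicals containing the fragment
    b: inactive chemicals containing the fragment
    c: active chemicals lacking the fragment
    d: inactive chemicals lacking the fragment
    """
    n_frag = a + b          # chemicals containing the fragment
    n_active = a + c        # active chemicals
    n_total = a + b + c + d
    denom = comb(n_total, n_frag)
    p = 0.0
    # Sum hypergeometric probabilities of seeing a or more actives with the fragment.
    for k in range(a, min(n_frag, n_active) + 1):
        p += comb(n_active, k) * comb(n_total - n_active, n_frag - k) / denom
    return p


def enriched_fragments(records, alpha=0.05):
    """Return {fragment: p-value} for fragments with p <= alpha.

    records: list of (set_of_fragment_labels, is_active) pairs (hypothetical format).
    """
    fragments = set().union(*(frags for frags, _ in records))
    hits = {}
    for frag in sorted(fragments):
        a = sum(1 for frags, active in records if frag in frags and active)
        b = sum(1 for frags, active in records if frag in frags and not active)
        c = sum(1 for frags, active in records if frag not in frags and active)
        d = sum(1 for frags, active in records if frag not in frags and not active)
        p = fisher_exact_greater(a, b, c, d)
        if p <= alpha:
            hits[frag] = p
    return hits


# Toy data (invented for illustration): "nitro" occurs only in actives,
# "ester" is spread evenly between actives and inactives.
toy_records = [
    ({"nitro", "ester"}, True),
    ({"nitro", "ester"}, True),
    ({"nitro"}, True),
    ({"nitro"}, True),
    ({"ester"}, False),
    ({"ester"}, False),
    (set(), False),
    (set(), False),
]

if __name__ == "__main__":
    for frag, p in enriched_fragments(toy_records).items():
        print(f"{frag}: p = {p:.4f}")
```

On this toy set only the "nitro" label passes the threshold. A flagged fragment is a statistical lead rather than an alert in its own right; as the hybrid approaches row notes, expert review is still needed to assign a mechanistic rationale.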