2016 Mar 3;5(3):883–894. doi: 10.1039/c5tx00406c

Table 1. Overview of the descriptor sets from the chemical, protein target, and cytotoxicity domains used to model toxicity data in all possible combinations. In each modelling repeat, the feature selection and pre-processing procedure was applied to the data in the respective modelling set to select an optimal, similarly sized subset of descriptors from each domain.

| Data domain | Details | Source | Information encoded |
|---|---|---|---|
| Chemical | 192 2D descriptors | MOE | Chemical structure and physicochemical properties |
| Protein target | 477 human target-affinity descriptors | In silico algorithm trained on a dataset extracted from ChEMBL version 14 | Translation of chemical space into biological space; likelihood of interaction with a subset of the human proteome |
| Cytotoxicity | 182 dose–response datapoints (14 concentrations across 13 human, rat and mouse cell lines), scaled such that the maximum response for each curve equals 1 | Original data extracted from PubChem and processed to remove noise as per the study of Sedykh et al. (2011) | Experimental cell-viability outcomes of compound exposure |
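
As an illustration of two points in Table 1, the following minimal Python sketch enumerates the combinations of descriptor domains and rescales a dose–response curve. It rests on assumptions not confirmed by the table: that "all possible combinations" means every non-empty subset of the three domains, and that a curve is scaled by dividing it by its maximum response. The names `DOMAINS`, `domain_combinations` and `scale_curve` are hypothetical and are not taken from the study.

```python
from itertools import combinations

# Descriptor counts per domain, as listed in Table 1.
DOMAINS = {"chemical": 192, "protein_target": 477, "cytotoxicity": 182}


def domain_combinations(domains):
    """Return every non-empty combination of domain names, smallest first.

    Assumption: "all possible combinations" means all non-empty subsets.
    """
    names = list(domains)
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
    ]


def scale_curve(responses):
    """Scale one dose-response curve so that its maximum response equals 1.

    Assumption: scaling is a division by the curve maximum. Curves whose
    maximum is not positive are returned unchanged to avoid dividing by zero.
    """
    peak = max(responses)
    if peak <= 0:
        return list(responses)
    return [r / peak for r in responses]


combos = domain_combinations(DOMAINS)
for combo in combos:
    # Total descriptors available before any feature selection is applied.
    total = sum(DOMAINS[name] for name in combo)
    print(" + ".join(combo), "->", total, "descriptors")
```

Under these assumptions, three domains give seven candidate descriptor sets: three single-domain, three pairwise, and one combining all three. The totals printed are the raw descriptor counts before the per-domain feature selection described in the caption.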