Environmental Health Perspectives. 2000 Nov;108(11):1029–1033. doi: 10.1289/ehp.001081029

Data quality in predictive toxicology: identification of chemical structures and calculation of chemical properties.

C Helma, S Kramer, B Pfahringer, E Gottmann
PMCID: PMC1240158  PMID: 11102292

Abstract

Every technique for predicting toxicity and detecting structure-activity relationships relies on the accurate estimation and representation of chemical and toxicologic properties. In this paper we discuss potential sources of error in the identification of compounds, the representation of their structures, and the calculation of chemical descriptors. The discussion is based on a case study in which machine learning techniques were applied to data from noncongeneric compounds and a complex toxicologic end point (carcinogenicity). We propose methods applicable to the routine quality control of large chemical datasets, but our main intention is to raise awareness of this topic and to open a discussion about quality assurance in predictive toxicology. The accuracy and reproducibility of toxicity data will be reported in another paper.



