Abstract
The marriage of toxicology and genomics has created not only opportunities but also novel informatics challenges. As with the larger field of gene expression analysis, toxicogenomics faces the problems of probe annotation and data comparison across different array platforms. Toxicogenomics studies are generally built on standard toxicology studies that generate biological end point data, and as such, one goal of toxicogenomics is to detect relationships between changes in gene expression and changes in those biological parameters. These challenges are best addressed by collecting data in a well-designed toxicogenomics database. A successful publicly accessible toxicogenomics database will serve as a repository for data sharing and as a resource for analysis, data mining, and discussion. It will offer a vehicle for harmonizing nomenclature and analytical approaches and serve as a reference for regulatory organizations to evaluate toxicogenomics data submitted as part of registrations. Such a database would capture the experimental context of in vivo studies with enough fidelity that the dynamics of the dose response could be probed statistically with confidence. This review presents the collaborative efforts between the European Molecular Biology Laboratory-European Bioinformatics Institute ArrayExpress, the International Life Sciences Institute Health and Environmental Sciences Institute, and the National Institute of Environmental Health Sciences National Center for Toxicogenomics Chemical Effects in Biological Systems knowledge base. The goal of this collaboration is to establish public infrastructure on an international scale and examine other developments aimed at establishing toxicogenomics databases.
In this review we discuss several issues common to such databases: the requirement for identifying minimal descriptors to represent the experiment, the demand for standardizing data storage and exchange formats, the challenge of creating standardized nomenclature and ontologies to describe biological data, the technical problems involved in data upload, the necessity of defining parameters that assess and record data quality, and the development of standardized analytical approaches.