Comparative and Functional Genomics. 2004 Jul;5(5):432–444. doi: 10.1002/cfg.416

A Case Study on Choosing Normalization Methods and Test Statistics for Two-Channel Microarray Data

Yang Xie 1, Kyeong S Jeong 2, Wei Pan 1, Arkady Khodursky 2, Bradley P Carlin 1
PMCID: PMC2447464  PMID: 18629172

Abstract

DNA microarray analysis is a biological technology that permits the whole genome to be monitored simultaneously on a single slide. Microarray technology not only opens an exciting research area for biologists, but also poses significant new challenges for statisticians. Two very common questions in the analysis of microarray data are, first, should we normalize arrays to remove potential systematic biases, and if so, what normalization method should we use? Second, how should we then implement tests of statistical significance? Straightforward and uniform answers to these questions remain elusive. In this paper, we use a real data example to illustrate a practical approach to addressing these questions. Our data are taken from a DNA–protein binding microarray experiment aimed at furthering our understanding of transcription regulation mechanisms, one of the most important issues in biology. For preprocessing the data, we suggest looking at descriptive plots first to decide whether preliminary normalization is needed and, if so, how it should be accomplished. For subsequent comparative inference, we recommend an empirical Bayes method (the B statistic), since it performs much better than traditional methods, such as the sample mean (M statistic) and Student's t statistic, and it is also relatively easy to compute and explain. The false discovery rate (FDR) is used to evaluate the different methods, and our comparative results lend support to these suggestions.
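To make the statistics named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each gene has a handful of replicate log-ratios, computes the sample mean (M statistic) and a one-sample Student's t statistic, and applies the Benjamini-Hochberg adjustment as one common way to control the FDR; the abstract does not say which FDR procedure the paper uses, so that choice is an assumption. The function names and the data values are hypothetical, and the empirical Bayes B statistic is not reproduced here.

```python
import math
from statistics import mean, stdev


def m_statistic(log_ratios):
    """Sample mean of one gene's replicate log-ratios."""
    return mean(log_ratios)


def t_statistic(log_ratios):
    """One-sample Student's t: the mean divided by its standard error."""
    n = len(log_ratios)
    return mean(log_ratios) / (stdev(log_ratios) / math.sqrt(n))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical replicate log2 ratios for two genes (illustrative values only).
stable_gene = [1.0, 1.1, 0.9, 1.0]
noisy_gene = [3.0, -1.0, 2.5, -0.5]

if __name__ == "__main__":
    for name, gene in [("stable", stable_gene), ("noisy", noisy_gene)]:
        print(name, round(m_statistic(gene), 3), round(t_statistic(gene), 3))
```

Both hypothetical genes have essentially the same M statistic, yet their t statistics differ sharply because the replicates of the second gene disagree. With few replicates the per-gene variance estimate is itself unstable, which is the usual motivation for moderated approaches such as the empirical Bayes B statistic.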

Articles from Comparative and Functional Genomics are provided here courtesy of Wiley