Comparative and Functional Genomics. 2003 Jun;4(3):300–317. doi: 10.1002/cfg.298

Reproducibility Assessment of Independent Component Analysis of Expression Ratios From DNA Microarrays

David Philip Kreil 1,2, David J C MacKay 2
PMCID: PMC2448447  PMID: 18629283

Abstract

DNA microarrays allow the measurement of transcript abundances for thousands of genes in parallel. Most commonly, a particular sample of interest is studied next to a neutral control, examining relative changes (ratios). Independent component analysis (ICA) is a promising modern method for the analysis of such experiments. The condition of ICA algorithms can, however, depend on the characteristics of the data examined, making algorithm properties such as robustness specific to the given application domain. To address the lack of studies examining the robustness of ICA applied to microarray measurements, we report on the stability of variational Bayesian ICA in this domain. Microarray data are usually preprocessed and transformed. Hence, we first examined alternative transforms and data selections to find those giving the smallest modelling reconstruction errors. Log-ratio data are reconstructed better than non-transformed ratio data by our linear model with a Gaussian error term. To compare ICA results, we must allow for the invariance of ICA under rescaling and permutation of the extracted signatures, which hold the loadings of the original variables (gene transcript ratios) on particular latent variables. We introduced a method to optimally match corresponding signatures between sets of results. The stability of signatures was then examined after (1) repetition of the same analysis run with different random number generator seeds, and (2) repetition of the analysis with partial data sets. The effects of both dropping a proportion of the gene transcript ratios and dropping measurements for several samples were studied. In summary, signatures with a high relative data power were very likely to be retained, resulting in an overall stability of the analyses. Our analysis of 63 yeast wild-type vs. wild-type experiments, moreover, yielded 10 reliably identified signatures, demonstrating that the variance observed is not just noise.
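
The matching problem described in the abstract can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes that similarity between two signatures is measured by the absolute Pearson correlation (which is unchanged by rescaling and sign flips) and that the best one-to-one matching is found by exhaustive search over permutations. The function names `abs_correlation` and `match_signatures` and the toy data are hypothetical.

```python
from itertools import permutations
from math import sqrt


def abs_correlation(u, v):
    """Absolute Pearson correlation; invariant to rescaling and sign flips."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return abs(num) / den if den > 0 else 0.0


def match_signatures(run_a, run_b):
    """Match signatures of two runs that extracted the same number of signatures.

    Returns (perm, mean_similarity), where perm[i] is the index of the
    signature in run_b matched to run_a[i]. Exhaustive search is an
    assumption made for clarity; it is only feasible for a few signatures.
    """
    k = len(run_a)
    sim = [[abs_correlation(a, b) for b in run_b] for a in run_a]
    best_perm, best_score = None, -1.0
    for perm in permutations(range(k)):
        score = sum(sim[i][perm[i]] for i in range(k))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score / k


# Toy data: run_b holds the signatures of run_a, permuted and rescaled.
run_a = [[1.0, 2.0, 0.0, -1.0], [0.0, 1.0, 3.0, 1.0], [2.0, -1.0, 0.5, 0.0]]
run_b = [[0.0, -2.0, -6.0, -2.0], [4.0, -2.0, 1.0, 0.0], [0.5, 1.0, 0.0, -0.5]]

perm, mean_sim = match_signatures(run_a, run_b)
print(perm, round(mean_sim, 6))
```

For more than a handful of signatures, the exhaustive search would normally be replaced by a linear assignment solver such as the Hungarian algorithm, which finds the same optimum in polynomial time.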

Full Text

The Full Text of this article is available as a PDF (353.4 KB).


Articles from Comparative and Functional Genomics are provided here courtesy of Wiley
