Abstract
To study the molecular biological differences between normal and diseased tissues, it is desirable to classify diseases and stages of disease using microarray-based gene-expression values. Because these studies typically use only a limited number of microarrays, serious issues arise with respect to the design, performance and analysis of classifiers based on microarray data. This paper reviews some fundamental issues facing small-sample classification: classification rules, constrained classifiers, error estimation and feature selection. It discusses both unconstrained and constrained classifier design from sample data, and the contributions to classifier error from constrained optimization and from the lack of optimality owing to design from sample data. The difficulty of estimating classifier error when confined to small samples is addressed, particularly that of estimating the error from training data. The impact of small samples on the ability to include more than a few variables as classifier features is explained.
