Figure - PMC

Skip to main content

An official website of the United States government

Here's how you know

Here's how you know

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

View full-text article in PMC

. 2009 Nov 26;10:389. doi: 10.1186/1471-2105-10-389

Search in PMC
Search in PubMed
View in NLM Catalog
Add to search

Copyright ©2009 Sontrop et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PMC Copyright notice

Impact of preprocessing variability on feature selection for the Affymetrix datasets. Comparison of top-100 ranked features lists F_{100, m, k}and F_{100, m', k}, as obtained using different preprocessing strategies m and m', for different splits k. A) Percentage of the top-half of one list that is found in the other list, and vice vera. Each boxplot represents the distribution of such percentages over 50 splits, for a specific pair (m, m') (indicated on top of the figure). For each split, we determine the percentage of F_{50, m, k}found in F_{100, m', k}and the percentage of F_{50, m', k}found in F_{100, m, k}. Each distribution thus contains 50·2 = 100 points. All boxplots corresponding to the same preprocessing pair are colored similarly. In total there are 15 distinct pairs. The pairs are ordered by the observed median overlap over all six datasets. B) Distributions of the relative strength scores for top-ranked feature lists corresponding to the various preprocessing pairs. C) Relative strength of the top-100 multi-ranked gene lists with respect to the original rankings, for each preprocessing method and each Affymetrix dataset.