. 2014 Nov 14;2014:266–273.

Table 1.

Details of the 12 datasets that were analyzed.

| ID | G/P | P/D | Number of variables (Original) | Number of variables (PAIFE) | Sample (Class1, Class2) | Outcome variable | Reference |
|----|-----|-----|--------------------------------|-----------------------------|-------------------------|------------------|-----------|
| 1 | G | D | 6584 | 1972 | 61 (40, 21) | Colon cancer | Alon et al. [6] |
| 2 | G | P | 5372 | 858 | 86 (69, 17) | Lung cancer | Beer et al. [7] |
| 3 | P | D | 70 | 15 | 205 (66, 139) | Lung cancer | Bigbee et al. [5] |
| 4 | G | D | 7129 | 2288 | 72 (47, 25) | Leukemia | Golub et al. [8] |
| 5 | G | D | 7464 | 1880 | 36 (18, 18) | Breast cancer | Hedenfalk et al. [9] |
| 6 | G | P | 7129 | 699 | 60 (20, 40) | Hepatocellular carcinoma | Iizuka et al. [10] |
| 7 | G | P | 7399 | 1084 | 240 (138, 102) | Lymphoma | Rosenwald et al. [11] |
| 8 | G | D | 7129 | 1927 | 77 (58, 19) | Lymphoma | Shipp et al. [12] |
| 9 | G | P | 24481 | 4251 | 78 (44, 34) | Breast cancer | Van’t Veer et al. [13] |
| 10 | G | D | 7039 | 1230 | 39 (35, 4) | Ovarian cancer | Welch et al. [14] |
| 11 | G | P | 12625 | 1166 | 249 (201, 48) | Leukemia | Yeoh et al. [15] |
| 12 | P | D | 16 | 12 | 583 (184, 401) | Lung cancer | LungSPORE (unpublished) |

G/P indicates whether the data are genomic (G) or proteomic (P). P/D indicates whether the data are prognostic (P) or diagnostic (D). ‘Number of variables (Original)’ gives the total number of variables in the original dataset. ‘Number of variables (PAIFE)’ gives the number of variables that remain after the dataset is processed with our irrelevant feature elimination algorithm, PAIFE. ‘Sample (Class1, Class2)’ gives the total number of samples followed by the number of samples in each class, and ‘Reference’ gives the source of the dataset.
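
To make the effect of the elimination step easier to compare across datasets, the two variable counts in Table 1 can be turned into a retained fraction (variables after PAIFE divided by variables in the original dataset). The following is a minimal sketch of that arithmetic only; it does not implement PAIFE, and the names `TABLE_1`, `retained_fraction`, and `summarize` are illustrative assumptions rather than anything defined in the paper.

```python
# Illustrative sketch: names below are hypothetical and not from the paper.
# Each row is (dataset ID, variables in original dataset, variables after PAIFE),
# copied from Table 1.
TABLE_1 = [
    (1, 6584, 1972),
    (2, 5372, 858),
    (3, 70, 15),
    (4, 7129, 2288),
    (5, 7464, 1880),
    (6, 7129, 699),
    (7, 7399, 1084),
    (8, 7129, 1927),
    (9, 24481, 4251),
    (10, 7039, 1230),
    (11, 12625, 1166),
    (12, 16, 12),
]


def retained_fraction(original: int, after_paife: int) -> float:
    """Return the fraction of the original variables that remain after PAIFE."""
    if original <= 0:
        raise ValueError("original variable count must be positive")
    return after_paife / original


def summarize(rows):
    """Map each dataset ID to its retained fraction."""
    return {ds_id: retained_fraction(orig, kept) for ds_id, orig, kept in rows}


if __name__ == "__main__":
    for ds_id, frac in summarize(TABLE_1).items():
        print(f"Dataset {ds_id}: {frac:.1%} of variables retained")
```

For example, dataset 12 keeps 12 of its 16 variables, a retained fraction of 0.75.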