Table 1.
ID | G/P | P/D | Number of variables (Original) | Number of variables (PAIFE) | Sample (Class1, Class2) | Outcome variable | Reference |
---|---|---|---|---|---|---|---|
1 | G | D | 6584 | 1972 | 61(40,21) | Colon Cancer | Alon et al. [6] |
2 | G | P | 5372 | 858 | 86(69,17) | Lung Cancer | Beer et al. [7] |
3 | P | D | 70 | 15 | 205(66,139) | Lung Cancer | Bigbee et al. [5] |
4 | G | D | 7129 | 2288 | 72(47,25) | Leukemia | Golub et al. [8] |
5 | G | D | 7464 | 1880 | 36(18,18) | Breast Cancer | Hedenfalk et al. [9] |
6 | G | P | 7129 | 699 | 60(20,40) | Hepatocellular Carcinoma | Iizuka et al. [10] |
7 | G | P | 7399 | 1084 | 240(138,102) | Lymphoma | Rosenwald et al. [11] |
8 | G | D | 7129 | 1927 | 77(58,19) | Lymphoma | Shipp et al. [12] |
9 | G | P | 24481 | 4251 | 78(44,34) | Breast Cancer | Van’t Veer et al. [13] |
10 | G | D | 7039 | 1230 | 39(35,4) | Ovarian Cancer | Welch et al. [14] |
11 | G | P | 12625 | 1166 | 249(201,48) | Leukemia | Yeoh et al. [15] |
12 | P | D | 16 | 12 | 583(184,401) | Lung Cancer | LungSPORE (unpublished) |
G/P indicates whether the data are genomic (G) or proteomic (P). P/D indicates whether the dataset is prognostic (P) or diagnostic (D). Number of variables (Original) is the number of variables in the original dataset. Number of variables (PAIFE) is the number of variables that remain after the dataset has been processed by our irrelevant feature elimination algorithm, PAIFE. Sample (Class1, Class2) gives the total number of samples followed by the class distribution. Reference cites the source of each dataset.
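As a reading aid, the sketch below shows one way the columns of Table 1 could be handled programmatically: it splits the Sample (Class1, Class2) field into its three counts and computes the fraction of variables that remain after PAIFE. The function names (`parse_sample`, `class_counts_consistent`, `retained_fraction`) are hypothetical helpers introduced only for illustration and are not part of PAIFE. The three rows are copied from the table.

```python
import re

# Three rows copied from Table 1:
# (ID, original variables, variables after PAIFE, sample field)
ROWS = [
    (1, 6584, 1972, "61(40,21)"),
    (3, 70, 15, "205(66,139)"),
    (9, 24481, 4251, "78(44,34)"),
]

def parse_sample(field):
    """Split a 'total(class1,class2)' string into three integers."""
    match = re.fullmatch(r"\s*(\d+)\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*", field)
    if match is None:
        raise ValueError(f"unexpected sample field: {field!r}")
    total, class1, class2 = (int(g) for g in match.groups())
    return total, class1, class2

def class_counts_consistent(field):
    """Return True when the two class counts add up to the stated total."""
    total, class1, class2 = parse_sample(field)
    return total == class1 + class2

def retained_fraction(original, after_paife):
    """Fraction of the original variables that remain after elimination."""
    return after_paife / original

for dataset_id, original, after_paife, sample in ROWS:
    total, class1, class2 = parse_sample(sample)
    print(
        f"dataset {dataset_id}: {total} samples ({class1} vs {class2}), "
        f"{retained_fraction(original, after_paife):.1%} of variables retained"
    )
```

A check such as `class_counts_consistent` can be run over every row when transcribing the table, since it catches rows whose class counts do not add up to the stated total.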