Table 1.
Biomedical datasets used for the comparison experiments
| # | T | #C | #A | #S | M | Reference |
|---|---|---|---|---|---|---|
| 1 | D | 2 | 6584 | 61 | 0.651 | Alon et al. (1999) |
| 2 | D | 3 | 12 582 | 72 | 0.387 | Armstrong et al. (2002) |
| 3 | P | 2 | 5372 | 86 | 0.795 | Beer et al. (2002) |
| 4 | D | 5 | 12 600 | 203 | 0.657 | Bhattacharjee et al. (2001) |
| 5 | P | 2 | 5372 | 69 | 0.746 | Bhattacharjee et al. (2001) |
| 6 | D | 2 | 7129 | 72 | 0.650 | Golub et al. (1999) |
| 7 | D | 2 | 7464 | 36 | 0.500 | Hedenfalk et al. (2001) |
| 8 | P | 2 | 7129 | 60 | 0.661 | Iizuka et al. (2003) |
| 9 | D | 4 | 2308 | 83 | 0.345 | Khan et al. (2001) |
| 10 | D | 4 | 12 625 | 50 | 0.296 | Nutt et al. (2003) |
| 11 | D | 5 | 7129 | 90 | 0.642 | Pomeroy et al. (2002) |
| 12 | P | 2 | 7129 | 60 | 0.645 | Pomeroy et al. (2002) |
| 13 | D | 26 | 16 063 | 280 | 0.574 | Ramaswamy et al. (2001) |
| 14 | P | 2 | 7399 | 240 | 0.145 | Rosenwald et al. (2002) |
| 15 | D | 9 | 7129 | 60 | 0.506 | Staunton et al. (2001) |
| 16 | D | 2 | 7129 | 77 | 0.746 | Shipp et al. (2002) |
| 17 | D | 2 | 10 510 | 102 | 0.150 | Singh et al. (2002) |
| 18 | D | 11 | 12 533 | 174 | 0.150 | Su et al. (2001) |
| 19 | P | 2 | 24 481 | 78 | 0.562 | van 't Veer et al. (2002) |
| 20 | D | 2 | 7039 | 39 | 0.878 | Welsh et al. (2001) |
| 21 | P | 2 | 12 625 | 249 | 0.805 | Yeoh et al. (2002) |
| 22 | D | 2 | 11 003 | 322 | 0.784 | Petricoin et al. (2002) |
| 23 | D | 3 | 11 170 | 159 | 0.364 | Pusztai et al. (2004) |
| 24 | D | 2 | 36 778 | 52 | 0.556 | Ranganathan (2005) |
In the type (T) column, P denotes prognostic and D denotes diagnostic. #C is the number of classes, #A the number of attributes in the dataset, #S the number of samples, and M the fraction of samples that belong to the most frequent class (target value). The first 21 datasets contain genomic data, whereas the last three contain proteomic data.
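
The M column can be reproduced from the class labels alone. The following is a minimal sketch of that calculation, not the authors' code: the function name `majority_fraction` and the toy label list are hypothetical and are not taken from any dataset in the table.

```python
from collections import Counter


def majority_fraction(labels):
    """Return the fraction of samples carrying the most frequent class label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)  # number of samples per class
    return max(counts.values()) / len(labels)


# Hypothetical toy labels (not from the table): 3 of 4 samples share a class.
toy_labels = ["tumor", "tumor", "tumor", "normal"]
print(majority_fraction(toy_labels))  # 0.75
```

M is also the accuracy of a trivial classifier that always predicts the most frequent class on the same data, so it can serve as a baseline when reading per-dataset accuracies.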