Table 3.
Misclassification rate of RPMM cluster analysis to find 2 groups using different variable filtering methods (top 1000 features)
| | Data set 1 | Data set 2 | Data set 3 | Data set 4 | Data set 5 | Data set 6 | Data set 7 |
|---|---|---|---|---|---|---|---|
| Tissue type | Colon cancer | Glioblastoma | Glioblastoma | Kidney | Kidney | Breast | Breast |
| Platform | HM27 | HM27 | HM450 | HM27 | HM450 | HM27 | HM450 |
| # of samples | 20 non-CIMP vs. 6 CIMP | 74 non-CIMP vs. 12 CIMP | 93 non-CIMP vs. 6 CIMP | 50 KIRC vs. 45 non-cancer | 283 KIRC vs. 160 non-cancer | 37 Breast cancer vs. 20 non-cancer | 56 Breast cancer vs. 17 non-cancer |
| No filter | 0.31 | 0.22 | NA | 0 | NA | 0.12 | NA |
| Filter top 1000 by: | | | | | | | |
| Random* | 0.34 | 0.27 | 0.40 | 0.004 | 0.005 | 0.12 | 0.20 |
| SD-b | 0.19 | 0.07 | 0.49 | 0 | 0.02 | 0 | 0.12 |
| SD-m | 0.12 | 0.07 | 0.42 | 0.02 | 0.03 | 0.12 | 0.08 |
| MAD | 0.38 | 0.35 | 0.49 | 0 | 0.005 | 0 | 0.14 |
| DIP | 0.23 | 0.36 | 0.45 | 0 | 0.005 | 0 | 0.14 |
| Precision | 0.08 | 0 | 0.10 | 0.03 | 0.01 | 0.11 | 0.22 |
| BQ-GOF | 0.19 | 0 | 0.07 | 0 | 0.01 | 0.25 | 0.23 |
| TM-GOF | 0.08 | 0.02 | 0.06 | 0.36 | 0.47 | 0.44 | 0.49 |
| TQ-GOF | 0.08 | 0.03 | 0.06 | 0.35 | 0.47 | 0.44 | 0.48 |
| BR | 0.12 | 0.02 | 0.11 | 0.02 | 0.02 | 0.23 | 0.19 |
| AR | 0.08 | 0.06 | 0.11 | 0.02 | 0.02 | 0.25 | 0.19 |
| WAR | 0.12 | 0.07 | 0.45 | 0.02 | 0.01 | 0.11 | 0.10 |
| SD-b + TM-GOF** | 0.08 | 0.07 | 0.20 | 0.05 | 0.01 | 0.26 | 0.36 |
NA = not applicable (too many features for RPMM to run).
*Average from 10 analyses of randomly sampled feature sets.
**Combine top 500 SD-b + top 500 TM-GOF features.
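
As a reading aid for the table, the following is a minimal Python sketch of how a two-group misclassification rate and a standard-deviation filter could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names `misclassification_rate` and `top_k_by_sd` are hypothetical, both label vectors are assumed to be coded 0/1, the rate is assumed to be minimized over the two possible cluster-to-class mappings (consistent with no entry in the table exceeding 0.5), and the filter uses a plain per-feature standard deviation, which may differ from the SD-b and SD-m variants listed above. The clustering step itself (RPMM) is not shown.

```python
from statistics import pstdev


def misclassification_rate(true_labels, cluster_labels):
    """Fraction of misassigned samples for a two-cluster solution.

    Assumption: cluster labels are arbitrary, so the rate is the smaller
    of the two error rates obtained under the two possible mappings of
    cluster label to true class. Both inputs are sequences of 0/1.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label vectors must have the same length")
    n = len(true_labels)
    mismatches = sum(t != c for t, c in zip(true_labels, cluster_labels))
    # Swapping the cluster labels turns every mismatch into a match.
    return min(mismatches, n - mismatches) / n


def top_k_by_sd(beta, k):
    """Indices of the k features with the largest standard deviation.

    `beta` is a list of rows, one row per feature, one value per sample.
    Assumption: plain population SD is used as the ranking statistic.
    """
    sds = [pstdev(row) for row in beta]
    order = sorted(range(len(beta)), key=lambda i: sds[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Toy data: 3 features x 3 samples (values are made up for illustration).
    beta = [[0.1, 0.1, 0.1], [0.0, 1.0, 0.5], [0.4, 0.6, 0.5]]
    print(top_k_by_sd(beta, 2))
    print(misclassification_rate([0, 0, 0, 0, 1], [1, 0, 0, 0, 1]))
```

Under these assumptions, a combined feature set such as the one in the last table row would be formed by taking the union of two top-500 index lists produced by two different ranking statistics.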