Table 3.
Algorithm | Parameter: K | Parameter: α | Distribution: globally separable | Distribution: pairwise separable |
---|---|---|---|---|
Sparse PCA (Zou et al., 2006) | 1 | 0.38 ± 0.21 | 0.97 ± 0.03 | |
5.75 | 0.59 ± 0.21 | |||
10.5 | ||||
15.25 | 0.65 ± 0.08 | |||
20 | 0.59 ± 0.01 | |||
K true | 1 | 0.47 ± 0.29 | ||
5.75 | 0.56 ± 0.21 | |||
10.5 | ||||
15.25 | 0.48 ± 0.01 | 0.78 ± 0.07 | ||
20 | 0.63 ± 0.02 | |||
1 | 0.52 ± 0.32 | |||
5.75 | ||||
10.5 | ||||
15.25 | 0.47 ± 0.01 | 0.80 ± 0.7 | ||
20 | 0.64 ± 0.01 | |||
Sparse K-means (Witten and Tibshirani, 2010) | 1 | 0.80 ± 0.40 | ||
5.75 | 0.84 ± 0.32 | |||
10.5 | 0.87 ± 0.18 | |||
15.25 | 0.86 ± 0.23 | |||
20 | 0.80 ± 0.40 | |||
K true | 1 | 0.96 ± 0.08 | ||
5.75 | 0.88 ± 0.24 | |||
10.5 | 0.80 ± 0.40 | |||
15.25 | 0.94 ± 0.12 | |||
20 | ||||
1 | 0.91 ± 0.18 | |||
5.75 | 0.85 ± 0.30 | |||
10.5 | ||||
15.25 | 0.84 ± 0.32 | |||
20 | 0.83 ± 0.34 | |||
Sparse hierarchical clustering (Witten and Tibshirani, 2010) | N/A | 1 | 0 ± 0 | 0.54 ± 0.03 |
5.75 | 0.57 ± 0.04 | |||
10.5 | 0.59 ± 0.02 | |||
15.25 | 0.56 ± 0.03 | |||
20 | 0.59 ± 0.02 | |||
LFSBSS (Li et al., 2008) | N/A | |||
K true | ||||
Spectral selection (Zhao and Liu, 2007) | N/A | N/A | ||
SMD (hierarchical proposal clusters) | Unif | N/A | ||
Unif | 1.0 ± 0.02 | |||
Unif | ||||
SMD (K-means proposal clusters) | Unif | N/A | ||
Unif | 0.94 ± 0.05 | |||
Unif |
Note: We generate two classes of distributions: globally separable, where one dimension separates two clusters and the other dimensions are uninformative, and pairwise separable, where each dimension separates only one pair of clusters and the rest are uninformative. In both cases, the ratio of informative to uninformative dimensions is . For each class of distributions, we generated five instances and used the algorithm in the left column to infer a weight for each dimension. Some of the algorithms take input parameters, which are given in columns K (the number of clusters or, for Sparse PCA, the number of components) and α (a sparsity parameter). From these weights we calculated the AUROC score, and we report its mean and standard deviation over the five trials.
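
The scoring step described in the note can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that AUROC is computed by treating the inferred weights as scores and the informative/uninformative status of each dimension as the binary label, and that the reported spread is a population standard deviation; the source states neither. The function names `auroc` and `summarize` and all weights below are hypothetical.

```python
import statistics


def auroc(weights, informative):
    """Probability that a randomly chosen informative dimension receives a
    higher weight than a randomly chosen uninformative one (ties count 1/2).
    This equals the area under the ROC curve with the weights used as scores."""
    pos = [w for w, flag in zip(weights, informative) if flag]
    neg = [w for w, flag in zip(weights, informative) if not flag]
    if not pos or not neg:
        raise ValueError("need both informative and uninformative dimensions")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(scores):
    """Mean and population standard deviation over trials.
    Assumption: the source does not say whether the sample or the
    population form of the standard deviation was used."""
    return statistics.mean(scores), statistics.pstdev(scores)


# Hypothetical weights for five trials with one informative dimension
# (index 0) out of four; these numbers are made up for illustration.
informative = [True, False, False, False]
trials = [
    [0.9, 0.1, 0.0, 0.2],
    [0.7, 0.2, 0.1, 0.0],
    [0.8, 0.0, 0.3, 0.1],
    [0.6, 0.1, 0.2, 0.0],
    [0.0, 0.4, 0.5, 0.3],  # informative dimension ranked last
]
scores = [auroc(w, informative) for w in trials]
mean, std = summarize(scores)
print(f"{mean:.2f} ± {std:.2f}")
```

With a single informative dimension, each trial's AUROC is the fraction of uninformative dimensions that receive a lower weight than the informative one, so one failed trial out of five is enough to produce a large standard deviation.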