Author manuscript; available in PMC: 2019 May 16.
Published in final edited form as: J Am Stat Assoc. 2018 May 16;113(521):95–110. doi: 10.1080/01621459.2017.1330202

Table 4:

Variable selection results from 1000 data sets in each scenario. X1 and X2 are useful for clustering, X3 and X4 are independent of X1 and X2 and are not useful for clustering, and X5 and X6 are not useful for clustering after conditioning on X1 and X2.

Selection Algorithm    N  % Correct    %X1    %X2    %X3    %X4    %X5    %X6  % None

MSN Clusters

vscc                 200        3.3  100.0   95.6   62.1   59.8   95.8   69.4     0.0
vscc                 500        0.0  100.0   94.9   72.4   67.9   99.7   69.2     0.0
vscc                 800        0.1  100.0   89.5   76.5   67.0   99.8   67.4     0.0

clustvarsel          200       15.6  100.0   16.8   48.3    0.0    0.1    0.7     0.0
clustvarsel          500       13.5  100.0   28.7   85.8    0.0    0.0    0.0     0.0
clustvarsel          800        5.3  100.0   52.3   94.7    0.0    0.0    0.0     0.0

skewvarsel           200       43.2   99.3   47.9    0.0    0.6   12.7    7.4     0.7
skewvarsel           500       65.1  100.0   69.0    0.0    0.4    9.4    2.5     0.0
skewvarsel           800       84.2  100.0   89.4    0.0    0.5    6.2    1.2     0.0

MVN Clusters

vscc                 200        2.3  100.0   98.4   67.6   67.6   94.1   91.0     0.0
vscc                 500        0.1  100.0   93.4   78.6   78.6   98.6   90.2     0.0
vscc                 800        0.0  100.0   79.3   50.1   50.4   96.4   76.2     0.0

clustvarsel          200       74.0  100.0   76.4    0.5    0.3    1.5    2.3     0.0
clustvarsel          500       99.6  100.0   99.9    0.1    0.0    0.2    0.0     0.0
clustvarsel          800       99.8  100.0  100.0    0.1    0.0    0.1    0.0     0.0

skewvarsel           200       90.0  100.0   98.5    1.0    1.6    4.9    2.4     0.0
skewvarsel           500       98.0  100.0  100.0    0.6    0.3    1.1    0.0     0.0
skewvarsel           800       99.9  100.0  100.0    0.0    0.1    0.0    0.0     0.0
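
As a reading aid, the column definitions can be made concrete with a minimal Python sketch. It rests on assumptions that the table does not state explicitly: that "% Correct" counts the data sets in which exactly X1 and X2 were selected, that each %Xj column is the share of data sets in which Xj was selected, and that "% None" is the share in which no variable was selected. The function name summarize_selections and the toy input are hypothetical and are not taken from vscc, clustvarsel, or skewvarsel.

```python
from typing import Dict, Iterable, Set

VARIABLES = ["X1", "X2", "X3", "X4", "X5", "X6"]
# Per the caption, only X1 and X2 are useful for clustering.
USEFUL = frozenset({"X1", "X2"})


def summarize_selections(selections: Iterable[Set[str]]) -> Dict[str, float]:
    """Tabulate selection percentages over simulated data sets.

    Each element of `selections` is the set of variable names that a
    selection algorithm kept for one data set (hypothetical input format).
    """
    runs = [frozenset(s) for s in selections]
    n = len(runs)
    if n == 0:
        raise ValueError("at least one data set is required")

    # Assumed definition: a run is correct when exactly {X1, X2} is selected.
    summary = {"% Correct": 100.0 * sum(s == USEFUL for s in runs) / n}
    # Share of runs in which each individual variable was selected.
    for v in VARIABLES:
        summary["%" + v] = 100.0 * sum(v in s for s in runs) / n
    # Share of runs in which no variable was selected at all.
    summary["% None"] = 100.0 * sum(len(s) == 0 for s in runs) / n
    return summary


# Toy example with four made-up runs (not data from the paper).
toy = [{"X1", "X2"}, {"X1"}, {"X1", "X2", "X5"}, set()]
result = summarize_selections(toy)
```

Under these assumed definitions, "% Correct" can never exceed the smaller of %X1 and %X2, and it can never exceed 100 minus the largest of %X3 through %X6, which matches the pattern of the rows above.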