Author manuscript; available in PMC: 2019 May 16.

Published in final edited form as: J Am Stat Assoc. 2018 May 16;113(521):95–110. doi: 10.1080/01621459.2017.1330202

Table 4:

Variable selection results from 1000 data sets in each scenario. X₁ and X₂ are useful for clustering, X₃ and X₄ are independent of X₁ and X₂ and are not useful for clustering, and X₅ and X₆ are not useful for clustering after conditioning on X₁ and X₂.

Selection Algorithm	N	% Correct	% X₁	% X₂	% X₃	% X₄	% X₅	% X₆	% None
MSN Clusters

vscc	200	3.3	100.0	95.6	62.1	59.8	95.8	69.4	0.0
	500	0.0	100.0	94.9	72.4	67.9	99.7	69.2	0.0
	800	0.1	100.0	89.5	76.5	67.0	99.8	67.4	0.0

clustvarsel	200	15.6	100.0	16.8	48.3	0.0	0.1	0.7	0.0
	500	13.5	100.0	28.7	85.8	0.0	0.0	0.0	0.0
	800	5.3	100.0	52.3	94.7	0.0	0.0	0.0	0.0

skewvarsel	200	43.2	99.3	47.9	0.0	0.6	12.7	7.4	0.7
	500	65.1	100.0	69.0	0.0	0.4	9.4	2.5	0.0
	800	84.2	100.0	89.4	0.0	0.5	6.2	1.2	0.0

MVN Clusters

vscc	200	2.3	100.0	98.4	67.6	67.6	94.1	91.0	0.0
	500	0.1	100.0	93.4	78.6	78.6	98.6	90.2	0.0
	800	0.0	100.0	79.3	50.1	50.4	96.4	76.2	0.0

clustvarsel	200	74.0	100.0	76.4	0.5	0.3	1.5	2.3	0.0
	500	99.6	100.0	99.9	0.1	0.0	0.2	0.0	0.0
	800	99.8	100.0	100.0	0.1	0.0	0.1	0.0	0.0

skewvarsel	200	90.0	100.0	98.5	1.0	1.6	4.9	2.4	0.0
	500	98.0	100.0	100.0	0.6	0.3	1.1	0.0	0.0
	800	99.9	100.0	100.0	0.0	0.1	0.0	0.0	0.0
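
As a reading aid for the columns above, the following minimal sketch shows one way such a summary could be computed from per-data-set selections. It assumes that "% Correct" counts data sets in which exactly X₁ and X₂ were selected, and that "% None" counts data sets in which no variable was selected. These definitions, the function name `summarize_selections`, and the toy input are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, List

VARIABLES: List[str] = ["X1", "X2", "X3", "X4", "X5", "X6"]
# Per the table caption, only X1 and X2 are useful for clustering.
TRUE_SET: FrozenSet[str] = frozenset({"X1", "X2"})


def summarize_selections(selections: Iterable[Iterable[str]]) -> Dict[str, float]:
    """Summarize selected-variable sets (one per data set) as percentages.

    Assumed definitions (hypothetical, for illustration only):
      "% Correct": the selected set equals {X1, X2} exactly.
      "%Xj":       variable Xj appears in the selected set.
      "% None":    the selected set is empty.
    """
    sets = [frozenset(s) for s in selections]
    n = len(sets)
    if n == 0:
        raise ValueError("need at least one data set")

    summary = {"% Correct": 100.0 * sum(s == TRUE_SET for s in sets) / n}
    for v in VARIABLES:
        summary[f"%{v}"] = 100.0 * sum(v in s for s in sets) / n
    summary["% None"] = 100.0 * sum(len(s) == 0 for s in sets) / n
    return summary


# Toy input: four hypothetical data sets, not data from the paper.
toy = [{"X1", "X2"}, {"X1"}, {"X1", "X2", "X5"}, set()]
result = summarize_selections(toy)
```

Under these assumed definitions, "% Correct" can never exceed the smaller of the X₁ and X₂ selection rates, which is consistent with every row of the table.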