Table 2.
Prediction and variable selection results for Simulations 1–3. Each table cell gives average (SD) over 50 repetitions. “K” is the average number of selected clusters, “ER” is the average clustering error rate, “ER (correct K)” is the average error rate when K is set to the true value rather than selected by BIC, “Info” is the average proportion of selected informative variables, and “Noninfo” is the average proportion of selected non-informative variables. “High SNR” corresponds to σ² = 1, and “Low SNR” corresponds to σ² = 4.
| Sim. (SNR) | Method | K | ER (%) | ER (correct K) | Info (%) | Noninfo (%) |
|---|---|---|---|---|---|---|
| 1 (High) | GMM | 3 (0) | 25 (0) | 0 (0) | 100 (100) | 100 (100) |
| | AL1 | 4 (0) | 0 (0) | 0 (0) | 100 (100) | 7.1 (7.1) |
| | ALP | 4 (0) | 0 (0) | 0 (0) | 100 (100) | 2.4 (2.4) |
| | APFP | 4 (0) | 0 (0) | 0 (0) | 100 (100) | 0.5 (0.5) |
| 1 (Low) | GMM | 3 (0) | 33 (4.9) | 20.6 (8.5) | 100 (100) | 100 (100) |
| | AL1 | 3.8 (0.6) | 19.2 (14.9) | 14.2 (10.7) | 100 (100) | 6 (6) |
| | ALP | 3 (0) | 34.1 (14.5) | 14.4 (14) | 95.9 (95.9) | 4 (4) |
| | APFP | 3.7 (0.6) | 19.2 (16.7) | 15.1 (12.6) | 100 (100) | 2.3 (2.3) |
| 2 (High) | GMM | 3 (0) | 40 (0) | 0 (0.2) | 100 (100) | 100 (100) |
| | AL1 | 5 (0) | 0 (0) | 0 (0) | 100 (100) | 6.9 (6.9) |
| | ALP | 5 (0) | 0 (0.1) | 0 (0.1) | 100 (100) | 1.8 (1.8) |
| | APFP | 5 (0) | 0 (0) | 0 (0) | 100 (100) | 1.1 (1.1) |
| 2 (Low) | GMM | 3 (0) | 40.3 (0.7) | 15.3 (5.3) | 100 (100) | 100 (100) |
| | AL1 | 4.7 (0.6) | 11.7 (9.8) | 8.3 (5.3) | 100 (100) | 10 (10) |
| | ALP | 3 (0) | 40.1 (0.4) | 5.8 (3) | 100 (100) | 5.2 (5.2) |
| | APFP | 4.7 (0.5) | 11.7 (7.7) | 9.2 (5.5) | 100 (100) | 2.4 (2.4) |
| 3 | GMM | 3 (0) | 4.5 (0) | 0 (0) | 100 (100) | 100 (100) |
| | AL1 | 4 (0) | 0 (0) | 0 (0) | 100 (100) | 8.1 (8.1) |
| | ALP | 3.9 (0.2) | 0.3 (1.1) | 0 (0) | 100 (100) | 5.9 (5.9) |
| | APFP | 4 (0.1) | 0 (0) | 0 (0) | 100 (100) | 0.2 (0.2) |
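
The metrics reported in Table 2 could be computed along the following lines. This is a minimal sketch under stated assumptions, not the procedure used to produce the table: it assumes the clustering error rate is the fraction of misassigned points under the best one-to-one matching between estimated and true cluster labels, and that “Info” and “Noninfo” are the fractions of informative and non-informative variables that were selected. The function names `clustering_error_rate` and `selection_proportions` are hypothetical.

```python
from itertools import permutations


def clustering_error_rate(true_labels, pred_labels):
    """Fraction of misassigned points under the best one-to-one label matching.

    Assumption: estimated clusters are matched to true clusters so as to
    maximize agreement; points in unmatched clusters count as errors.
    Brute-force search over matchings, so only suitable for small K.
    """
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(pred_labels))

    # Contingency counts: how many points fall in (estimated, true) cluster pairs.
    counts = {(p, t): 0 for p in pred_ids for t in true_ids}
    for t, p in zip(true_labels, pred_labels):
        counts[(p, t)] += 1

    # Enumerate injective maps from the smaller label set into the larger one.
    if len(pred_ids) <= len(true_ids):
        best = max(
            sum(counts[(p, t)] for p, t in zip(pred_ids, perm))
            for perm in permutations(true_ids, len(pred_ids))
        )
    else:
        best = max(
            sum(counts[(p, t)] for p, t in zip(perm, true_ids))
            for perm in permutations(pred_ids, len(true_ids))
        )
    return 1.0 - best / len(true_labels)


def selection_proportions(selected, informative, n_variables):
    """Return (Info, Noninfo): fractions of informative and of
    non-informative variables that appear in `selected`.

    Variables are indexed 0..n_variables-1; both groups must be non-empty.
    """
    selected = set(selected)
    informative = set(informative)
    noninformative = set(range(n_variables)) - informative
    info = len(selected & informative) / len(informative)
    noninfo = len(selected & noninformative) / len(noninformative)
    return info, noninfo


# Illustrative data only: four equal-size true clusters, two of them merged.
true = [0] * 5 + [1] * 5 + [2] * 5 + [3] * 5
pred = [0] * 5 + [1] * 5 + [2] * 10
error = clustering_error_rate(true, pred)

# Illustrative selection: all 3 informative variables plus 1 of 7 others.
info, noninfo = selection_proportions([0, 1, 2, 7], [0, 1, 2], 10)
```

Under this matching convention, merging two of four equal-size clusters leaves a quarter of the points unmatched, so `error` is 0.25 in the toy example. For larger K, an assignment solver would be a more practical choice than enumerating permutations.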