Author manuscript; available in PMC: 2019 Nov 17.
Published in final edited form as: Stat Med. 2018 May 17:10.1002/sim.7697. doi: 10.1002/sim.7697
Case 1(a) n = 150, p = 10. The true model has M = 3 components with mixing proportions 0.5, 0.25, 0.25, respectively, and y | ϕ is multivariate normal with no censoring or missing data. Only two variables y(1) = [y1, y2]′ are informative, with means of (2, 0), (0, 2), (−1.5, −1.5), unit variances, and correlations of 0.5, 0.5, −0.5 in each component, respectively. The non-informative variables y(2) = [y3, . . . , y10]′ are generated as iid 𝒩(0, 1).
Case 1(b) Same as the setup in 1(a), except that the non-informative variables y(2) are correlated with y(1) through the relation y(2) = By(1) + ε, where B is an 8 × 2 matrix whose elements are distributed as iid 𝒩(0, 0.3), and ε ~ 𝒩(0, Q22⁻¹), with Q22 ~ 𝒲(I, 10).
Case 1(c) Same as in 1(b), but variables y1, y6 are discretized to the closest integer, variables y2, y9 are left censored at −1.4 (~8% of the observations), and y3, y10 are right censored at 1.4.
Case 1(d) Same as 1(c), but the even-numbered yj have ~30% of the observations MAR.
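The four Case 1 settings build on one another, and a minimal NumPy sketch of that layering is given here for illustration only. It is not the authors' code, and several details the text leaves open are filled in by assumption: 0.3 in 𝒩(0, 0.3) is read as a variance, 𝒲(I, 10) is read as a Wishart with identity scale and 10 degrees of freedom, a censored value is stored at its censoring bound, and the MAR mechanism (missingness probability 0.45 or 0.15 depending on whether the fully observed y5 lies above its median) is purely hypothetical, chosen to average 30%. The function name `simulate_case1` is likewise an invention of this sketch.

```python
import numpy as np

def simulate_case1(n=150, seed=0):
    """Illustrative generator for Cases 1(a)-(d); see the stated assumptions."""
    rng = np.random.default_rng(seed)
    weights = [0.5, 0.25, 0.25]
    means = np.array([[2.0, 0.0], [0.0, 2.0], [-1.5, -1.5]])
    rhos = [0.5, 0.5, -0.5]

    # Case 1(a): informative (y1, y2) from a 3-component mixture,
    # non-informative y3..y10 iid N(0, 1).
    z = rng.choice(3, size=n, p=weights)
    case_a = rng.standard_normal((n, 10))
    for m in range(3):
        idx = z == m
        cov = np.array([[1.0, rhos[m]], [rhos[m], 1.0]])
        case_a[idx, :2] = rng.multivariate_normal(means[m], cov, size=int(idx.sum()))

    # Case 1(b): y(2) = B y(1) + eps, eps ~ N(0, Q22^{-1}), Q22 ~ W(I, 10).
    B = rng.normal(0.0, np.sqrt(0.3), size=(8, 2))  # assumption: 0.3 is a variance
    A = rng.standard_normal((8, 10))
    Q22 = A @ A.T                                   # Wishart(I, 10) draw (8 x 8)
    L = np.linalg.cholesky(Q22)                     # Q22 = L L'
    # Solving L' eps = w with w ~ N(0, I) gives Cov(eps) = (L L')^{-1} = Q22^{-1}.
    eps = np.linalg.solve(L.T, rng.standard_normal((8, n))).T
    case_b = case_a.copy()
    case_b[:, 2:] = case_b[:, :2] @ B.T + eps

    # Case 1(c): discretize y1, y6; left-censor y2, y9; right-censor y3, y10.
    case_c = case_b.copy()
    case_c[:, [0, 5]] = np.rint(case_c[:, [0, 5]])
    case_c[:, [1, 8]] = np.maximum(case_c[:, [1, 8]], -1.4)  # stored at the bound
    case_c[:, [2, 9]] = np.minimum(case_c[:, [2, 9]], 1.4)   # stored at the bound

    # Case 1(d): even-numbered variables missing with probability depending
    # on the observed y5 (hypothetical MAR mechanism, about 30% on average).
    case_d = case_c.copy()
    p_miss = np.where(case_d[:, 4] > np.median(case_d[:, 4]), 0.45, 0.15)
    for j in range(1, 10, 2):                       # 0-based columns of y2, y4, ..., y10
        case_d[rng.random(n) < p_miss, j] = np.nan
    return z, case_a, case_b, case_c, case_d

z, case_a, case_b, case_c, case_d = simulate_case1(n=150, seed=1)
```

A real analysis would also carry censoring indicators alongside the clamped values. They are omitted here to keep the sketch short.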
Case 2(a) n = 300, p = 30. The true model has M = 3 components with mixing proportions 0.5, 0.25, 0.25, respectively, and y is multivariate normal with no censoring or missing data. Only four variables (y1, y2, y3, y4) are informative, with means of (0.6, 0, 1.2, 0), (0, 1.5, −0.6, 1.9), (−2, −2, 0, 0.6) and all variables with unit variance for each of the three components, respectively. All correlations among informative variables are equal to 0.5 in components 1 and 2, while component 3 has correlation matrix Σ311(i, j) = 0.5(−1)^(i+j) I{i ≠ j} + I{i = j}. The non-informative variables y(2) = [y5, . . . , y30]′ are generated as iid 𝒩(0, 1).
Case 2(b) Same as the setup in Case 2(a), except that the non-informative variables y(2) are correlated with y(1) through the relation y(2) = By(1) + ε, where B is a 26 × 4 matrix whose elements are distributed as iid 𝒩(0, 0.3), and ε ~ 𝒩(0, Q22⁻¹), with Q22 ~ 𝒲(I, 30).
Case 2(c) Same setup as in Case 2(b), but now variables y1, y6, y11 are discretized to the closest integer, variables y2, y9, y10, y11 are left censored at −1.4 (~8% of the observations), and variables y3, y12, y13, y14 are right censored at 1.4.
Case 2(d) Same as Case 2(c), but the even-numbered yj have ~30% of the observations MAR.
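The component 3 correlation structure in Case 2(a) is easiest to check by building it explicitly. The sketch below assumes the exponent in Σ311 is i + j and that the first indicator is I{i ≠ j}, so off-diagonal entries alternate between −0.5 and 0.5. It then confirms the matrix is positive definite and draws a Case 2(a) sample. The function names `component3_correlation` and `simulate_case2a` are hypothetical helpers, not taken from the paper.

```python
import numpy as np

def component3_correlation(k=4):
    """Sigma(i, j) = 0.5 * (-1)^(i+j) for i != j and 1 for i == j (1-based i, j)."""
    idx = np.arange(1, k + 1)
    corr = 0.5 * (-1.0) ** (idx[:, None] + idx[None, :])
    np.fill_diagonal(corr, 1.0)
    return corr

def simulate_case2a(n=300, p=30, seed=0):
    """Illustrative generator for Case 2(a): 4 informative, p - 4 noise variables."""
    rng = np.random.default_rng(seed)
    weights = [0.5, 0.25, 0.25]
    means = np.array([[0.6, 0.0, 1.2, 0.0],
                      [0.0, 1.5, -0.6, 1.9],
                      [-2.0, -2.0, 0.0, 0.6]])
    equi = np.full((4, 4), 0.5)          # components 1 and 2: all correlations 0.5
    np.fill_diagonal(equi, 1.0)
    corrs = [equi, equi, component3_correlation(4)]

    z = rng.choice(3, size=n, p=weights)
    y = rng.standard_normal((n, p))      # y5..y30 stay iid N(0, 1)
    for m in range(3):
        idx = z == m
        # Unit variances, so the correlation matrix is also the covariance.
        y[idx, :4] = rng.multivariate_normal(means[m], corrs[m], size=int(idx.sum()))
    return z, y

corr3 = component3_correlation()
eigvals = np.linalg.eigvalsh(corr3)      # all positive => valid correlation matrix
z2, y2 = simulate_case2a(n=300, p=30, seed=1)
```

Under this reading, the matrix is a sign-flipped version of the equicorrelation matrix used in components 1 and 2, so it has the same eigenvalues and is positive definite.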