Author manuscript; available in PMC: 2021 Jun 9.
Published in final edited form as: Stat Med. 2021 Mar 9;40(12):2800–2820. doi: 10.1002/sim.8929

Latent class mediator for multiple indicators of mediation

Kyaw Sint 1, Robert Rosenheck 2, Haiqun Lin 3
PMCID: PMC8187142  NIHMSID: NIHMS1706248  PMID: 33687101

Abstract

This paper demonstrates the utility of latent classes in evaluating the effect of an intervention on an outcome through multiple indicators of mediation. These indicators are observed intermediate variables that identify an underlying latent class mediator, with each class representing a different mediating pathway. The use of a latent class mediator allows us to avoid modeling the complex interactions between the multiple indicators and ensures the decomposition of the total mediating effects into additive effects from individual mediating pathways, a desirable feature for evaluating multiple indicators of mediation. This method is suitable when the goal is to estimate the total mediating effects that can be decomposed into the additive effects of distinct mediating pathways. Each indicator may be involved in multiple mediation pathways and at the same time multiple indicators may contribute to a single mediating pathway. The relative importance of each pathway may vary across subjects. We applied this method to the analysis of the first 6 months of data from a 2-year clustered randomized trial for adults in their first episode of schizophrenia. Four indicators of mediation are considered: individual resiliency training; family psychoeducation; supported education and employment; and a structural assessment for medication. The improvement in symptoms was found to be mediated by the latent class mediator derived from these four service indicators. Simulation studies were conducted to assess the performance of the proposed model and showed that the simultaneous estimation through the maximum likelihood yielded little bias when the entropy of the indicators was high.

Keywords: latent class mediator, mental health service, multiple mediators, potential outcome

1 |. INTRODUCTION

Mediation analysis is the statistical study of whether and to what extent an exposure exerts its effect on an outcome through one or more intermediate variables. In this paper, the term “exposure” refers to either an intervention, or a treatment condition that is being examined in either an experimental or observational study. The intermediate variable that is on a pathway to the outcome is termed “the mediator.” A mediation model proposes that the exposure variable affects the mediating variables, which in turn affect the outcome. The effect of the exposure that is exerted through the mediator(s) is called the indirect effect or the mediation effect. The effect of the exposure on an outcome that bypasses the mediator(s) is called the direct effect. Under certain circumstances, the total effect (TE) of an exposure can be decomposed into the additive effects of the direct and indirect effects. The proportion of the TE that is attributed to the mediators can be calculated and provides useful information for understanding how an intervention improves health outcomes through the mediators of interest.

Examples of mediation analysis include the effect of genetic variants on the incidence of lung cancer, whether directly and/or mediated through smoking behaviors1; the effect of a new mechanism of drug delivery on quality of life through improved medication adherence2; the effect of a job training program on employment outcomes through the development of new vocational skills and/or job search tools3; and the effect of housing subsidies on quality of life of homeless people that may be mediated through reduced days of homelessness or increased social support.4

Traditional mediation analysis estimates the mediation effect through the difference-in-coefficients method5 or the product-of-coefficients method.6 In the difference-in-coefficients method, the outcome is regressed on the exposure (with coefficient c) in a first model and on both the exposure (with coefficient c′) and the mediator (with coefficient b) in a second model. The indirect effect is estimated as the difference between the exposure coefficients from the two models, c − c′, where c is the TE. In the product-of-coefficients approach, the first model above is not fit; instead, an additional model is employed in which the mediator is regressed on the exposure (with coefficient a), together with the second model described above. The indirect effect is then estimated as the product of the two coefficients, ab. For a continuous mediator and outcome, the overall effect of the exposure can be additively decomposed into the direct and indirect effects. However, in the presence of a non-continuous mediator, an exposure-mediator interaction, multiple mediators, or a mediator-mediator interaction, such decomposition cannot be accomplished by traditional methods of mediation analysis.
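As an illustration of the two traditional estimators, the following minimal sketch fits the three linear regressions by ordinary least squares on simulated data. The data-generating values and the helper `ols` are hypothetical and chosen only for illustration; the point is that, for linear models fitted on the same sample, the difference in coefficients and the product of coefficients coincide.

```python
import numpy as np

# Simulated data (hypothetical effect sizes, for illustration only).
rng = np.random.default_rng(0)
n = 500
A = rng.integers(0, 2, size=n).astype(float)   # binary exposure
M = 0.8 * A + rng.normal(size=n)               # continuous mediator
Y = 0.5 * A + 1.2 * M + rng.normal(size=n)     # continuous outcome

def ols(y, *cols):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

c = ols(Y, A)[1]               # first model: Y ~ A (total effect)
_, c_prime, b = ols(Y, A, M)   # second model: Y ~ A + M
a = ols(M, A)[1]               # additional model: M ~ A

diff_in_coef = c - c_prime     # difference-in-coefficients estimate
prod_of_coef = a * b           # product-of-coefficients estimate
print(diff_in_coef, prod_of_coef)
```

The equality is an algebraic property of least squares with a continuous mediator and outcome; it does not carry over to the nonlinear or interaction settings mentioned above.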

The use of a potential or counterfactual outcome framework allows the decomposition of total exposure effects into additive natural direct and natural indirect effects (NIEs)7,8 for noncontinuous mediators and for outcomes in presence of certain interactions.9,10 The term “potential outcome” refers to an outcome for a subject under a combination of the exposure level and mediator levels, either observed or not observed, that is, the outcome that could be set for a hypothetical level of exposure and mediators. In the case of a binary exposure, we illustrate the mediation framework using the potential outcome approach below. Suppressing subscript for subject, we consider two levels of exposure, A = 0, the unexposed condition and A = 1, the exposed condition. Let M(a) denote the potential mediator level when the exposure A is set to level a and let Y(a, m) denote the potential outcome for response variable Y if the exposure A is set to level a and the mediator at level m. Pearl8 has defined the natural direct effect (NDE) as the mean effect of an exposure on an outcome when the mediator is held at the level it would have had without the exposure: that is,

$$\mathrm{NDE} = E[Y(1, M(0)) - Y(0, M(0))].$$

In contrast, the NIE has been defined as the mean effect of the potential mediator under the exposure condition as compared with the unexposed condition when the exposure is held at the exposed condition, that is,

$$\mathrm{NIE} = E[Y(1, M(1)) - Y(1, M(0))].$$

The TE is the effect of the exposure on the outcome, that is,

$$\mathrm{TE} = E[Y(1, M(1)) - Y(0, M(0))].$$

It is easily seen that TE = NDE + NIE. Imai and colleagues3,11 introduced a more general framework with identification assumptions not specific to any particular statistical model.
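The identity TE = NDE + NIE follows from the three definitions above by adding and subtracting E[Y(1, M(0))] and using linearity of expectation:

```latex
\begin{aligned}
\mathrm{TE} &= E[Y(1, M(1)) - Y(0, M(0))] \\
            &= E[Y(1, M(1)) - Y(1, M(0))] + E[Y(1, M(0)) - Y(0, M(0))] \\
            &= \mathrm{NIE} + \mathrm{NDE}.
\end{aligned}
```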

Using the potential outcome framework, mediation analysis with multiple mediators has been developed.2,12–15 We briefly review some previously published mediation analyses that used the potential outcome approach. Steen et al,15 Albert and Nelson16 and Daniel et al.17 conducted mediation analyses that estimated path-specific mediating effects for causally ordered multiple mediators. VanderWeele and Vansteelandt14 and Bellavia and Valeri18 decomposed direct and indirect effects in the presence of exposure-mediator and mediator-mediator interactions for causally nonordered mediators. Hong, Deutsch and Hill19 showed how the ratio-of-mediator-probability weighting (RMPW) method can be used to decompose the indirect effect into the additive effect of a pure indirect effect and a natural exposure-by-mediator interaction effect. However, a serious challenge exists regarding the additivity of the effect of multiple mediators: the sum of individual mediating effects when mediators are handled one at a time can be very different from the mediation effect when they are handled jointly. This is partly due to the fact that the mediators affect each other.14 As the number of mediators increases, the definition, identification and estimation of the mediating components become more and more of a challenge. The latent class mediation model proposed in this paper alleviates this problem and provides an alternative solution for decomposing the effect of multiple mediation pathways.

Before we introduce our latent class mediation model, we here briefly review two approaches to mediation that involve latent variables. Albert, Geng, and Nelson20 proposed a mediation model with a common latent continuous variable serving as the mediator for the multiple observed indicators. Witkiewitz and colleagues21 specified a latent class mediation model in which both the mediator and outcome variables were latent class variables. Specifically, they fit a three-latent class model for the indicators of the mediator to obtain a nominal variable “coping repertoire class”. They also fit a three-latent class model for two different drinking outcomes to obtain a nominal variable “drinking outcome class”. They then fit two multinomial logistic regression models for the mediator and outcome variables, respectively. They did not use a potential outcome for calculating indirect effects, but rather used the product of the coefficients obtained from the multinomial logistic regression for the mediator and outcome variables.

In this paper, we develop a latent class mediation model in which multiple intermediate variables are taken as indicators to identify a latent class mediator. Each level of the latent class variable represents a distinct mediation pathway in which multiple indicators of mediation exert pathway-specific effect. Our method is suitable for the situation in which there is an interest in decomposing the total mediating effects into additive effects of distinct mediating pathways when there are multiple mediating indicators that are possibly correlated and/or interacting. Though there may be a causal order of the mediating indicators it is not of particular interest. There are several advantages of using a latent class mediator. First, the use of a latent class mediator allows indicators of mediation that are of mixed types, correlated, interacting, causally ordered and/or high dimensional to share distinct mediation pathways rather than exerting their mediating effect as separate mediators whose effects overlap and are difficult or impractical to decompose. However, the causal structure between the mediators is often not known and analyzing the order of multiple mediators is not of particular interest in this paper. Second, an indicator of mediation may be involved in multiple mediation pathways and the relative importance of each pathway may vary across subjects. At the same time, each class-specific mediating pathway may receive contributions from multiple indicators and a subject is proportionally involved in each of the mediating pathways. Third, the latent class mediator model can reveal which mediators act together or alone in their mediating effect. Finally, the use of latent classes in mediation analysis provides a valuable tool for the decomposition of mediating effect into additive, class-specific effects that are practically useful and have not been achieved in previous studies of higher dimensions of mediators, that is, for more than two mediators.

In addition, under the potential outcome framework, latent classes have been used as principal strata representing partially unobserved compliance classes within each of which the principal causal effect of a treatment is evaluated.22 Jo, Wang and Ialongo23 used latent classes to characterize the outcome trajectory classes for a reference group that is not affected by the treatment. Egleston, Uzzo and Wang24 used two latent classes to represent survival classes with large and small hazard rates of death in each of two treatment arms. The main difference between our latent class mediator model and these latent class principal stratification models is that our latent classes serve as a post-treatment intermediate variable with discrete categories while the latent classes in the principal stratification models are regarded as a pre-treatment variable not affected by the treatment.

This paper is organized in the following way. In Section 2 we present the latent class mediation model and its estimation. In Section 3 we apply the latent class mediation model to the analysis of mediating effect of multiple service components provided to young people in a cluster-randomized trial during their first episode of schizophrenia. Finally, in Section 4 we present simulation studies to assess some properties of the model. Section 5 concludes the paper with a discussion.

2 |. LATENT CLASS MEDIATOR MODEL AND INFERENCE

2.1 |. Notations, model specification, and assumptions

Using subscript i to denote subject label, let Ai denote the exposure of subject i, Yi the outcome of interest, Zim the mth observed indicator of the latent class mediator with m = 1, 2, …, M, and Ui the latent class mediator taking value of 1, 2, …, C with C being the number of classes. The latent class mediator model is represented graphically as shown in Figure 1 and is specified as follows.

FIGURE 1.


Graph of latent class mediator with multiple mediation indicators. The circle indicates the latent class with a total of C levels. Zs are the indicators of the latent class. The notations are the same as those introduced in the text

We use a multinomial logistic regression model for probability of subject i belonging in class c under exposure Ai.

$$P(U_i = c \mid A_i) = \frac{\exp(\gamma_{0c} + \gamma_{1c} A_i)}{\sum_{k=1}^{C} \exp(\gamma_{0k} + \gamma_{1k} A_i)} \tag{1}$$

with c = 1, …, C and γ01 = γ11 = 0. Covariates are not included in (1) because they are included in each of the models for the indicators of mediation in (2) below. The observed m-th indicator of mediation is specified as a generalized linear model with link gm for given class c:

$$g_m\{E(Z_{im} \mid U_i = c, X_i)\} = \alpha_{0mc} + \alpha_{1m}^{T} X_i \tag{2}$$

for m = 1, 2, …, M and c = 1, 2, …, C. α0mc is the class-specific intercept for class c and $\alpha_{1m} = (\alpha_{1m1}, \alpha_{1m2}, \ldots, \alpha_{1m(q_z)})^{T}$ is the qz-vector of coefficients associated with the covariate vector Xi that contains exposure-mediator confounders and adjusting variables. For the outcome model, we use a generalized linear regression model with link function h:

$$h\{E(Y_i \mid U_i = c, A_i, W_i)\} = \beta_{0c} + \beta_{1c} A_i + \beta_w^{T} W_i, \tag{3}$$

where β0c is a class-specific intercept, β1c is the class-specific coefficient for exposure, and $\beta_w = (\beta_{w1}, \beta_{w2}, \ldots, \beta_{w(q_y)})^{T}$ is the qy-vector of coefficients associated with the covariate vector Wi containing possible exposure-outcome and mediator-outcome confounders and adjusting variables. This model includes the exposure-mediator interaction since there is a class-specific treatment effect. The inclusion of the exposure A in the outcome model acknowledges the direct effect of A on the outcome Y. The outcome is typically regarded as a distal outcome for latent classes.25
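Model (1) is a softmax over class-specific linear predictors with the first class as reference. The following minimal sketch evaluates the class membership probabilities under each exposure level; the function name and the coefficient values for C = 3 classes are hypothetical and serve only to illustrate the identification constraint γ01 = γ11 = 0.

```python
import numpy as np

def class_probabilities(gamma0, gamma1, a):
    """P(U = c | A = a) under model (1); class 1 is the reference class."""
    gamma0 = np.asarray(gamma0, dtype=float)
    gamma1 = np.asarray(gamma1, dtype=float)
    eta = gamma0 + gamma1 * a     # linear predictor for each class
    eta = eta - eta.max()         # subtract the maximum for numerical stability
    w = np.exp(eta)
    return w / w.sum()

# Hypothetical coefficients; the first entries are fixed at 0 for identification.
gamma0 = [0.0, 0.4, -0.3]
gamma1 = [0.0, 0.9, 1.5]
p_control = class_probabilities(gamma0, gamma1, a=0)
p_treated = class_probabilities(gamma0, gamma1, a=1)
print(p_control, p_treated)
```

With positive γ1c for the non-reference classes, exposure shifts probability mass away from the reference class, which is the mechanism through which the exposure acts on the latent mediator.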

Mediation analysis using the potential outcomes framework requires several assumptions as described in previous publications.7,8,26 Let a and a* denote the two values that can be taken by the exposure A. Omitting subject subscripts and given W and X we assume:

  1. No unmeasured confounding for the treatment-outcome relationship conditional on measured confounders, that is, Y(a, U) ⟂ A ∣ {X, W}.

  2. No unmeasured confounding for the mediator-outcome relationship conditional on treatment and measured confounders, that is, Y(a, U) ⟂ U ∣ {A, W}.

  3. No unmeasured confounding for the treatment-mediator relationship conditional on measured confounders, that is, U(a) ⟂ A ∣ X.

  4. No exposure-induced confounding for mediator-outcome relationship conditional on measured confounders, that is, Y(a, U) ⟂ U(a*) ∣ W.

  5. Outcome depends on Zs only through U.

Assumptions (i)–(iv) are similar to those stated by Vanderweele and Vansteelandt26 and collectively constitute the sequential ignorability assumption as termed by Imai (2010).11 For randomized experimental studies, the first and third assumptions hold since the exposure A is randomized. However, in most experiments the mediator is not randomized but observed during the experiment, and therefore, the second and fourth assumptions may not hold. The fifth assumption is an extra assumption for our latent class mediation model which is equivalent to the conditional independence of the outcome and the indicators given the latent class.27 Assumption (v) can be empirically checked as described in the second to the last paragraph in the Discussion section and is not as critical or central as assumptions (i)–(iv) for identifying causal mediation effects, because we can relax assumption (v) by additionally including observed mediation indicators into the Model (3) allowing local dependence of Y on Zs. The result of fitting the local dependence model is also given in the Online Supplement I. However, with local dependence, the interpretation of the results becomes more difficult and complicated because the mediating effect of an indicator with local dependence is quantified both as a standalone effect and as a contributing effect in each of the latent classes.

2.2 |. Estimation of mediation effect

The logarithm of the likelihood for simultaneous estimation of the latent class membership model, the indicators model, and the outcome model is given as:

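$$\log L(\alpha, \gamma, \beta \mid a, \mathbf{Z}, y, w) = \log\left\{\prod_{i} \sum_{c=1}^{C} P(U_i = c \mid A_i = a) \prod_{m=1}^{M} f(Z_{im} \mid U_i = c, X_i)\, f(Y_i \mid A_i = a, U_i = c, W_i)\right\} = \sum_{i} \log\left\{\sum_{c=1}^{C} P(U_i = c \mid A_i = a) \prod_{m=1}^{M} f(Z_{im} \mid U_i = c, X_i)\, f(Y_i \mid A_i = a, U_i = c, W_i)\right\}, \tag{4a}$$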

where f denotes the probability density function and P denotes the probability of a discrete latent class and the bold face Z denotes the M-vector of all Zs. If random effects are used (see the last paragraph in Section 3.3), the log-likelihood becomes:

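$$\sum_{k} \log\left\{\int \sum_{c=1}^{C} P(U_i = c \mid A_i = a) \prod_{m=1}^{M} f(Z_{im} \mid U_i = c, X_i) \prod_{k=1}^{n_k} f(Y_{ik} \mid A_i = a, U_i = c, W_i, b_k)\, f(b_k)\, db_k\right\}, \tag{4b}$$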

where the outcome model (3) can include a random effect, with subjects within the same cluster k of size nk sharing the same random effect bk with density f (bk) (as in the example of a clustered trial). Neither (4a) nor (4b) can be maximized in closed form. The expectation-maximization (EM) algorithm28 is typically used to obtain the maximum likelihood estimates for the latent class model.29–32 The E-step in the EM algorithm for (4a) has a closed-form expression. However, when random effects are present in the latent class model, as in the log-likelihood expression (4b) used in our data analysis to accommodate clustering effects (see the last paragraph in Section 3.4), the E-step does not have a closed-form expression and the computational burden becomes very high. We therefore used SAS PROC NLMIXED33 to obtain the estimates from the likelihood expressions (4a) and (4b). We used the built-in quasi-Newton optimization technique in SAS PROC NLMIXED for (4a) in the absence of random effects, and adaptive Gauss-Hermite quadrature34,35 with 30 quadrature points for (4b), which has random effects. Other authors have also used SAS PROC NLMIXED to estimate a mixture model and a latent class model.36,37
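To make the structure of (4a) concrete, the following is a minimal sketch of the log-likelihood without random effects. It is not the SAS PROC NLMIXED implementation used in this paper. It assumes Poisson indicators with a log link and a normally distributed outcome with an identity link and residual standard deviation `sigma`, and it omits the covariates X and W for brevity; all function and argument names are hypothetical. The same computation also returns the posterior class probabilities that an EM E-step would use.

```python
import numpy as np
from math import lgamma

def _logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def loglik_and_posterior(A, Z, Y, gamma0, gamma1, alpha0, beta0, beta1, sigma):
    """A: (n,) exposure; Z: (n, M) counts; Y: (n,) outcome.
    gamma0, gamma1, beta0, beta1: (C,); alpha0: (M, C) log-means; sigma > 0."""
    A, Z, Y = (np.asarray(v, dtype=float) for v in (A, Z, Y))
    gamma0, gamma1, alpha0, beta0, beta1 = (
        np.asarray(v, dtype=float) for v in (gamma0, gamma1, alpha0, beta0, beta1))
    # log P(U = c | A) from model (1)
    eta = gamma0[None, :] + gamma1[None, :] * A[:, None]             # (n, C)
    log_prior = eta - _logsumexp(eta, axis=1)[:, None]
    # Poisson log-density of the indicators given class (model (2), no covariates)
    log_fact = np.vectorize(lgamma)(Z + 1.0)                         # log(z!)
    log_fz = (Z[:, :, None] * alpha0[None, :, :]
              - np.exp(alpha0)[None, :, :]
              - log_fact[:, :, None]).sum(axis=1)                    # (n, C)
    # Normal log-density of the outcome given class and exposure (model (3))
    mu = beta0[None, :] + beta1[None, :] * A[:, None]
    log_fy = (-0.5 * np.log(2.0 * np.pi * sigma ** 2)
              - 0.5 * ((Y[:, None] - mu) / sigma) ** 2)
    joint = log_prior + log_fz + log_fy                              # (n, C)
    per_subject = _logsumexp(joint, axis=1)       # log of the inner sum over c
    posterior = np.exp(joint - per_subject[:, None])   # P(U = c | Z, Y, A)
    return per_subject.sum(), posterior

# Toy data and parameter values (hypothetical), C = 2 classes and M = 2 indicators.
A = np.array([0, 1, 1, 0])
Z = np.array([[0, 1], [5, 4], [6, 3], [1, 0]])
Y = np.array([0.2, -3.1, -2.7, 0.5])
ll, post = loglik_and_posterior(
    A, Z, Y, gamma0=[0.0, 0.5], gamma1=[0.0, 1.0],
    alpha0=[[0.0, 1.5], [0.0, 1.2]], beta0=[0.0, -1.0], beta1=[0.0, -2.0], sigma=1.0)
print(ll, post.round(3))
```

A general-purpose optimizer applied to the negative of the returned log-likelihood would play the role of the quasi-Newton step; handling the random effect in (4b) would additionally require numerical integration over bk, which this sketch does not attempt.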

Based on the parameter estimates, omitting the subscripts, the NDEs can be estimated as:

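$$\mathrm{NDE} = E[Y(1, U(0)) - Y(0, U(0))] = \sum_{c=1}^{C} \left[h^{-1}(\beta_{0c} + \beta_{1c} + \beta_w^{T} W) - h^{-1}(\beta_{0c} + \beta_w^{T} W)\right] P(U = c \mid A = 0). \tag{5}$$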

The NIE through the latent class mediator is

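$$\mathrm{NIE} = E[Y(1, U(1)) - Y(1, U(0))] = \sum_{c=1}^{C} h^{-1}(\beta_{0c} + \beta_{1c} + \beta_w^{T} W) \left[P(U = c \mid A = 1) - P(U = c \mid A = 0)\right]. \tag{6}$$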

We denote the NIE through class c as

$$\mathrm{NIE}(c) = E[Y(1, U(A_{!c} = 1, A_c = 1)) - Y(1, U(A_{!c} = 1, A_c = 0))], \tag{7a}$$

where !c indicates the classes other than c, Ak indicates the value of A set for class k, and U(A!c = a, Ac = a*) is the potential outcome of U with the value of A being a for classes other than c and being a* for class c. The class c-specific mediation effect is derived by calculating the counterfactual difference between the mean outcome for class c with exposure being an intervention (A = 1) and the exposure a control (A = 0) when other classes all have exposure at the level of intervention (A = 1). So, the above NIE (c) becomes

$$\mathrm{NIE}(c) = h^{-1}(\beta_{0c} + \beta_{1c} + \beta_w^{T} W) \left[P(U = c \mid A = 1) - P(U = c \mid A = 0)\right]. \tag{7b}$$

Therefore, the sum of the NIEs through each class equals the total NIE, $\sum_{c=1}^{C} \mathrm{NIE}(c) = \mathrm{NIE}$. The variances of the NDE and NIE can be estimated by the delta method (see Appendix A) or through bootstrap resampling. It can be verified here that the TE equals the sum of the NDE and the NIE, that is,

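$$\mathrm{TE} = E[Y(1, U(1)) - Y(0, U(0))] = \sum_{c=1}^{C} \left[h^{-1}(\beta_{0c} + \beta_{1c} + \beta_w^{T} W)\, P(U = c \mid A = 1) - h^{-1}(\beta_{0c} + \beta_w^{T} W)\, P(U = c \mid A = 0)\right]. \tag{8}$$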
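The formulas (5) to (8) can be evaluated directly from the parameter estimates. The sketch below does so for the identity link, with hypothetical parameter values and function names. One assumption of the sketch is that the class-specific terms NIE(c) are evaluated at W = 0; with the identity link the totals NDE and NIE do not depend on W, because the covariate term cancels in (5) and the class probability differences in (6) sum to zero.

```python
import numpy as np

def softmax(eta):
    w = np.exp(eta - np.max(eta))
    return w / w.sum()

def mediation_effects(gamma0, gamma1, beta0, beta1):
    """NDE (5), NIE (6), class-specific NIE(c) (7b) and TE (8), identity link, W = 0."""
    gamma0, gamma1, beta0, beta1 = (
        np.asarray(v, dtype=float) for v in (gamma0, gamma1, beta0, beta1))
    p0 = softmax(gamma0)             # P(U = c | A = 0)
    p1 = softmax(gamma0 + gamma1)    # P(U = c | A = 1)
    mean_exposed = beta0 + beta1     # class-specific mean outcome under A = 1
    mean_unexposed = beta0           # class-specific mean outcome under A = 0
    nde = np.sum((mean_exposed - mean_unexposed) * p0)
    nie_by_class = mean_exposed * (p1 - p0)
    nie = nie_by_class.sum()
    te = np.sum(mean_exposed * p1 - mean_unexposed * p0)
    return nde, nie, nie_by_class, te

# Hypothetical estimates for C = 3 classes.
nde, nie, nie_by_class, te = mediation_effects(
    gamma0=[0.0, 0.4, -0.3], gamma1=[0.0, 0.9, 1.5],
    beta0=[-1.0, -3.0, -5.0], beta1=[0.5, -1.0, -2.0])
print(nde, nie, nie_by_class, te)
```

Standard errors would come from the delta method or the bootstrap, as stated in the text; neither is implemented here.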

2.3 |. The extent outcome informs the estimation of the latent class mediator

The conceptualization of our model indicates that Y is a distal outcome of the latent classes,25 that is, Y does not define the latent classes but is a consequence of them. However, the simultaneous estimation approach we described in Section 2.2 seems to regard Y as another indicator of U. We provide a justification for using simultaneous estimation in this subsection.

Letting the bold face Z denote the vector of all Zs and omitting the individual subscript i, we compare the estimated class distribution given the information from both the mediation indicators and the outcome, f (U| Z, Y, A), with that given only the indicators, f (U| Z, A), when using the simultaneous estimation. We have

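$$f(U \mid \mathbf{Z}, Y, A) = \frac{f(Y \mid U, \mathbf{Z}, A)\, f(U \mid \mathbf{Z}, A)}{f(Y \mid \mathbf{Z}, A)} = \frac{f(Y \mid U, A)}{f(Y \mid \mathbf{Z}, A)}\, f(U \mid \mathbf{Z}, A).$$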

The first equality in the derivation follows from Bayes' rule and the second follows from assumption (v), the conditional independence between Y and Z given U, that is, f (Y| Z, U, A) = f (Y| U, A). In the last expression, the numerator f (Y| U, A) is the outcome informed by the latent class mediator and the exposure, and the denominator f (Y| Z, A) is the outcome informed by the indicators and the exposure without using latent classes. The distribution of the latent classes derived from the simultaneous estimation with {Z, Y} is close to that derived without Y, that is, f (U| Z, Y, A) ≈ f (U| Z, A), when the entropy38,39 of Z is high and much greater than the entropy of Y for U. We demonstrate this through a simulation study described in Section 4.3. The approximate equality, f (U| Z, Y, A) ≈ f (U| Z, A), is not an assumption but a consequence of the causal order in our model, A → U (Z) → Y, and the associated assumptions.
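A small numerical illustration of this argument, for a single subject and two classes, is sketched below. The likelihood values are hypothetical. When the indicators separate the classes sharply, adding the outcome barely changes the posterior class probabilities; when the indicators are only weakly informative, the outcome shifts the posterior substantially.

```python
import numpy as np

def posteriors(prior, lik_z, lik_y):
    """Return f(U | Z, A) and f(U | Z, Y, A) for one subject.
    prior: P(U = c | A); lik_z: f(Z | U = c); lik_y: f(Y | U = c, A)."""
    prior, lik_z, lik_y = (np.asarray(v, dtype=float) for v in (prior, lik_z, lik_y))
    post_z = prior * lik_z
    post_z = post_z / post_z.sum()            # f(U | Z, A)
    f_y_given_z = np.sum(lik_y * post_z)      # f(Y | Z, A), using assumption (v)
    post_zy = lik_y / f_y_given_z * post_z    # the identity displayed above
    return post_z, post_zy

prior = [0.5, 0.5]
lik_y = [0.30, 0.10]    # outcome moderately favours class 1 (hypothetical)

sharp_z, sharp_zy = posteriors(prior, [0.200, 0.001], lik_y)   # informative indicators
vague_z, vague_zy = posteriors(prior, [0.060, 0.050], lik_y)   # weakly informative indicators

shift_sharp = np.abs(sharp_zy - sharp_z).max()
shift_vague = np.abs(vague_zy - vague_z).max()
print(shift_sharp, shift_vague)
```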

The three-step estimation approach40 seems more intuitive given the conceptual premise regarding the causal order. In this approach, the parameter estimates for the measurement part of the latent class model, (1) and (2), are obtained in the first step; the posterior probabilities of latent class membership are derived exclusively from the indicators Zs in the second step; and the outcome model for Y is estimated in the third step. The usual three-step approach (called “classify-analyze”) typically ignores the variation and misclassification associated with the estimation of the parameters in the first two steps, resulting in increased bias and underestimated variation in the parameter estimates for the outcome in the third step. Several adjusted approaches for the three-step estimation have been proposed, among which the adjusted BCH method by Bolck, Croon and Hagenaars40 has worked well in various scenarios.41 However, all the existing three-step estimation approaches ignore the variation associated with the estimated parameters in the first step, and the effort has mainly been placed on correcting misclassification in the second step. Furthermore, there is currently no adjusted three-step approach available when the indicators are Poisson (as in the case of our indicators Zs). The adjusted three-step approaches are mainly available for categorical indicators, although we found one article for continuous indicators.42 It is recommended that the choice of approach for estimating the effect of latent classes on a distal outcome be based on substantive theory and research aims, as each approach has advantages and disadvantages. Estimation of latent class models has been an area of active research, and simultaneous estimation is still a viable approach in many situations in the presence of a distal outcome.40,41

We conduct a simulation study in Section 4.3 to compare the NIE obtained from our simultaneous estimation approach described in Section 2.2 with that obtained from the three-step estimation approach.

3 |. APPLICATION OF THE LATENT CLASS MEDIATION MODEL TO THE ANALYSIS OF RAISE

3.1 |. Data description and exploratory analysis

In this section, we demonstrate mediation analysis with a latent class mediator using the first 6 months of data from the 2-year cluster randomized trial—the Recovery After an Initial Schizophrenic Episode-Early Treatment Program (RAISE-ETP) study. RAISE-ETP evaluated a comprehensive multi-component intervention of patient-centered treatment of first-episode psychosis. The intervention, called NAVIGATE (NAV), is an enhanced coordinated specialty care service that has been shown to significantly improve quality of life and schizophrenia symptoms when compared to usual community care (CC) over a 2-year follow-up period.43

Patients in both NAV and CC groups had access to four specific service components of interest, with patients in NAV receiving more coordinated care and more intensive training and support from external experts in the use of the treatment components. We consider these four service elements as the indicators for the latent class mediator. They were measured by the number of visits, as reported by patients, for individual resiliency training (IRT), family psycho-education (FPE), supported employment and education (SEE), and structural assessment for medication (SAM). Sites randomly assigned to provide NAV emphasized these four components and were provided training and limited funding to support their implementation. Similar services were available to patients at CC sites as routinely available. The same set of questions was used to evaluate receipt of these services among patients at both NAV and CC sites on a monthly basis. The number of contact visits for each service was summed across the 6-month period. The trial represents a real-world scenario of service delivery. We consider the two outcomes of principal interest as measured at the first follow-up assessment, 6 months after the initiation of treatment: the Positive and Negative Syndrome Scale (PANSS), a standard measure of symptom severity in schizophrenia, and the Heinrichs-Carpenter Quality of Life Scale (QLS), a multi-item measure of functioning, social activity and intrapsychic well-being in people with schizophrenia.

At baseline, NAV participants had significantly worse (more severe and numerous) symptoms on the PANSS than CC (P = 0.02), but there was no significant difference on the QLS. Using the change from baseline as the outcome and a dichotomous indicator of NAV assignment as the predictor, the NAV group had significantly greater improvement in PANSS than the CC group at 6 months even after controlling for baseline PANSS and student status of attending school as these variables were both significantly different between the two groups at baseline and were associated with PANSS scores at 6 months. Using QLS at 6 months as the outcome and controlling for the above set of the same variables, NAV was not significantly different from CC at 6 months. Since the treatment effect is not statistically significant for QLS, in the following mediation analyses, we focus only on the PANSS score as the outcome measure of interest for this paper.

Preliminary analysis showed that NAV patients had significantly more visits for all four services in the first 6 months than CC patients (all P-values <0.01). For each of the four service components (IRT, FPE, SEE and SAM), CC patients reported a mean number of visits in the first 6 months (SE) at respectively 4.57 (0.42), 1.10 (0.25), 1.49 (0.24), and 0.80 (0.10) while NAV patients on average reported a substantially greater number of each of these services in the first 6 months with means (SE) of 9.20 (0.45), 4.76 (0.15), 4.57 (0.37), and 3.17 (0.14) respectively. We next investigate whether the symptom improvement was mediated through the use of these services.

3.2 |. Observed mediator model

In this subsection, we conduct the mediation analysis using the four indicators of mediation as the observed mediators. In the analysis using the number of visits for a single service as the mediator and the change in PANSS scores from baseline to 6 months as the outcome, two regression models were fit. The change in PANSS is regressed on the intervention, the mediator, and their interaction, controlling for the baseline PANSS and student status, as they were not balanced between the two treatment conditions and were significantly associated with the PANSS at 6 months. Potential confounders such as gender, race, mother’s education, antipsychotic use and days of illegal drug use were also controlled for. At the same time, the mediator is regressed on the intervention, controlling for baseline PANSS, student status and gender. The estimated NIE for each single mediator is calculated using the mediation formula E[Y(1, M(1)) − Y(1, M(0))], where the value of M under the mediator model is used as the regressor in the outcome model. FPE had a significant mediating effect of −1.63 (SE = 0.78, P-value = 0.037). SAM had a significant mediating effect of −3.58 (SE = 1.22, P-value = 0.0069). IRT and SEE did not reach statistical significance at the 0.05 level for their individual mediating effects.
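As a worked form of the mediation formula in the single-mediator case, suppose the outcome model is linear in the exposure, the mediator and their interaction. The coefficient symbols θ below are introduced only for this illustration and are not the paper's notation, and the form of the mediator model is left unspecified. Holding the covariates W fixed, the NIE reduces to the mediator slope under exposure times the exposure-induced shift in the mean mediator:

```latex
\begin{aligned}
E[Y \mid A, M, W] &= \theta_0 + \theta_1 A + \theta_2 M + \theta_3 A M + \theta_w^{T} W, \\
Y(1, m) &= \theta_0 + \theta_1 + (\theta_2 + \theta_3)\, m + \theta_w^{T} W, \\
\mathrm{NIE} = E[Y(1, M(1)) - Y(1, M(0))] &= (\theta_2 + \theta_3)\,\{E[M(1)] - E[M(0)]\}.
\end{aligned}
```

Under the identification assumptions, E[M(a)] is estimated by the fitted mean of the mediator model at A = a.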

In a further mediation analysis using the four observed indicators of mediation simultaneously, controlling for the same sets of covariates as above, the outcome of change in PANSS is regressed on the intervention, all the four observed indicators of mediation, and the four two-way exposure-indicator interactions, while each of the mediators is separately regressed on the intervention, baseline PANSS, gender, and student status. We again found that FPE and SAM each had a significant mediating effect, with NIEs of −2.03 (SE= 0.85, P-value = 0.017) and −4.32 (SE= 1.48, P-value = 0.0036) respectively, while IRT and SEE did not reach statistical significance at P-value <0.05 for their individual mediating effect.

3.3 |. Mediation analysis with latent class mediator and class-specific NIE’s

We next performed a latent class mediation analysis separately for models with two to six latent classes, using the same outcome and adjusting for the same covariates as above. Many criteria have been used to guide the decision on the number of latent classes in mixture modeling, but to date there is no consensus on the best criterion for determining the number of classes.44,45 BIC (Bayesian Information Criterion)46 has often been regarded as the best of the information criteria considered in many simulation studies,44 with the model with the smallest BIC being selected. Therefore, we take account of other model features together with BIC in deciding on the number of classes. The BIC of the latent class mediation model plateaued when the number of classes increased to five although it continued to decrease (ie, improve) with increasing numbers of latent classes up to seven (Figure 2A), indicating improved model fit as the number of latent classes increases. We also plotted the total NIE and NDE against the number of classes (Figure 2B), finding that the NIE plateaued at five classes. The total NIE in the five-class model was significant (P-value <0.001), as was that in the model with two classes (P-value <0.001). The total NIEs for the three-, four-, and six-class models were not significant due to relatively large standard errors. It is known that for a given number of latent classes, a smaller sample size is required to produce sufficiently precise estimates when the classes have a better separation (as measured by the entropy, the distribution of marginal and posterior class membership probabilities, the mean difference across the classes, and so on).47 If the estimated proportion of subjects within a given class is small, then a larger overall sample will likely be required to find a stable solution with sufficient precision for that class. So, since the precision of the NIE from the six-class solution was low, we selected the model with five classes.
For our data, we had a good class separation for the five-class solution and all the class proportions were reasonably sized. The min, mean, median, and max for participants with highest posterior probabilities in the five latent classes were respectively: (0.55, 0.98, 1, 1), (0.46, 0.93, 0.99, 1), (0.52, 0.94, 1, 1), (0.47, 0.92, 0.99, 1), and (0.58, 0.97, 1, 1). The median of these posterior probabilities in each of the five classes was close to one. These results demonstrated the existence of these classes and the distinctness of the mediating pathways they represent.

FIGURE 2. Plots of BIC, NDE, and NIE vs number of classes. (A) Plot of BIC against the number of classes. (B) Plot of NDE and NIE against number of classes

The five latent classes reflecting different patterns of NAV service use are numerically labeled and characterized as: (1) first class—“low class”, (2) second class—“moderate IRT class”, (3) third class—“moderate IRT and SEE class”, (4) fourth class—“moderate IRT and FPE class” and (5) fifth class—“high class” (Figure 3).

FIGURE 3. Latent class models of service use patterns. The class label is in ascending numerical order of overall service use from left to right in each panel

Table 1 presents the results of the latent class mediation model with identity link function for h in mean change in PANSS and log link for the Zs, which were modeled as Poisson variables. The TE was calculated by plugging the parameter estimates given in Table 2 into formula (8) with the identity link: $\sum_{c=1}^{C}[(\beta_{0c}+\beta_{1c})P(U=c \mid A=1)-\beta_{0c}P(U=c \mid A=0)]$. The estimated TE was −4.37 with an estimated SE of 1.80 calculated through the delta method and a P-value of 0.016. The total NIE was calculated by plugging the parameter estimates given in Table 2 into formula (6) with the identity link: $\sum_{c=1}^{C}(\beta_{0c}+\beta_{1c})[P(U=c \mid A=1)-P(U=c \mid A=0)]$. The estimated total NIE was −3.86 with an estimated SE of 1.60 calculated through the delta method and a P-value of 0.016. The NDE was calculated by plugging the parameter estimates given in Table 2 into formula (5) with the identity link: $\sum_{c=1}^{C}\beta_{1c}P(U=c \mid A=0)$. The estimated NDE was −0.50 with an estimated SE of 2.52 calculated through the delta method and a P-value of 0.84. Neither the NIE nor the NDE depends on W.
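The plug-in calculation described above can be reproduced approximately from the rounded estimates in Table 2. The sketch below assumes that the class membership model is a multinomial logit in the exposure with the fifth class as the reference (consistent with the N/A entries for γ0c and γ1c in that class); the function names are illustrative only and are not part of the original analysis. Because the inputs are rounded, the results should agree with the reported values only to about two decimal places.

```python
import math

# Rounded estimates copied from Table 2; class 5 is taken as the reference
# class (an assumption), so its gamma coefficients are set to 0.
gamma0 = [3.90, 3.04, 2.51, 1.32, 0.0]
gamma1 = [-3.94, -4.15, -2.67, -1.16, 0.0]
beta0 = [43.66, 39.21, 32.68, 39.73, 34.70]
beta1 = [-5.04, 7.12, 6.62, -6.16, -2.81]


def class_probs(a):
    """P(U = c | A = a) under the assumed multinomial logit model."""
    weights = [math.exp(g0 + g1 * a) for g0, g1 in zip(gamma0, gamma1)]
    total = sum(weights)
    return [w / total for w in weights]


def mediation_effects():
    """Return (NIE, NDE, TE) on the identity-link scale."""
    p0, p1 = class_probs(0), class_probs(1)
    # NIE: sum_c (beta0c + beta1c) [P(U=c|A=1) - P(U=c|A=0)]
    nie = sum((b0 + b1) * (q1 - q0)
              for b0, b1, q0, q1 in zip(beta0, beta1, p0, p1))
    # NDE: sum_c beta1c P(U=c|A=0)
    nde = sum(b1 * q0 for b1, q0 in zip(beta1, p0))
    # TE: sum_c [(beta0c + beta1c) P(U=c|A=1) - beta0c P(U=c|A=0)]
    te = sum((b0 + b1) * q1 - b0 * q0
             for b0, b1, q0, q1 in zip(beta0, beta1, p0, p1))
    return nie, nde, te


nie, nde, te = mediation_effects()
print(f"NIE = {nie:.2f}, NDE = {nde:.2f}, TE = {te:.2f}")
```

By construction TE equals NIE plus NDE, which is the additive decomposition used throughout this section.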

TABLE 1. Characteristics from the five-class mediation model

| Mediation class | 1: Low | 2: IRT | 3: IRT & SEE | 4: IRT & FPE | 5: High |
|---|---|---|---|---|---|
| Class, N (%) | 152 (37.7%) | 59 (14.6%) | 72 (17.8%) | 69 (17.1%) | 52 (12.9%) |
| CC | 103 (56.9%) | 42 (23.2%) | 26 (14.4%) | 8 (4.4%) | 2 (1.1%) |
| NAV | 49 (22.0%) | 17 (7.6%) | 46 (20.6%) | 61 (27.4%) | 50 (22.4%) |
| Ratio: % NAV/% CC | 0.39 | 0.33 | 1.43 | 6.22 | 20.4 |
| Total NIE (SE), P-value | −3.86 (1.60), 0.0163 | | | | |
| NIE(c) (SE), P-value | 2.57 (1.16), 0.027 | −0.03 (0.90), 0.97 | −0.39 (0.30), 0.20 | −2.87 (0.68), <0.0001 | −3.15 (0.66), <0.0001 |
| NDE (SE), P-value | −0.50 (2.52), 0.84 | | | | |
| TE (SE), P-value | −4.37 (1.80), 0.0159 | | | | |
| Proportion mediated | 88.43% (SE 54.18%) | | | | |
| 6-Month PANSS (SE) | 74.28 (1.77) | 68.77 (2.11) | 69.03 (2.38) | 64.19 (2.16) | 61.17 (2.04) |
| PANSS change from baseline (SE) | −2.11 (1.73) | −5.06 (2.19) | −10.22 (2.60) | −14.28 (2.17) | −13.98 (2.33) |
| 6-Month service use | | | | | |
| IRT | 1.11 (0.13) | 10.15* (0.66) | 9.88* (0.43) | 8.26* (0.46) | 15.71* (0.61) |
| FPE | 0.34 (0.05) | 0.60 (0.14) | 0.70 (0.14) | 6.40* (0.44) | 12.96* (0.57) |
| SEE | 0.40 (0.06) | 0.25 (0.11) | 8.36* (0.43) | 2.12* (0.28) | 8.51* (0.50) |
| SAM | 0.90 (0.10) | 1.04 (0.25) | 2.56 (0.22) | 3.24* (0.24) | 4.16* (0.29) |
| Baseline characteristics | | | | | |
| Baseline PANSS (SE) | 75.99 (1.25) | 74.83 (1.99) | 78.28 (1.70) | 79.12 (1.74) | 74.94 (2.09) |
| Male | 108 (71.1%) | 41 (69.5%) | 47 (65.3%) | 60 (87.0%) | 37 (71.2%) |
| Race: White | 62 (40.8%) | 36 (61.0%) | 29 (40.3%) | 51 (73.9%) | 40 (76.9%) |
| Race: Non-white | 90 (59.2%) | 23 (39.0%) | 43 (59.7%) | 18 (26.1%) | 12 (23.1%) |
| ** Mother’s education: Some college or more | 46 (30.3%) | 27 (45.8%) | 36 (50.0%) | 30 (43.5%) | 28 (53.9%) |
| ** Mother’s education: Completed high school | 45 (29.6%) | 11 (18.6%) | 19 (26.4%) | 23 (33.3%) | 13 (25.0%) |
| ** Mother’s education: Some school, no school or unknown | 61 (40.1%) | 21 (35.6%) | 17 (23.6%) | 16 (23.2%) | 11 (21.2%) |
| ** Days not taking antipsychotics: Few if any, <7 | 91 (60.3%) | 37 (62.7%) | 52 (73.2%) | 50 (72.5%) | 37 (71.2%) |
| ** Days not taking antipsychotics: 7 or more | 27 (17.9%) | 11 (18.6%) | 5 (7.0%) | 7 (10.1%) | 3 (5.8%) |
| ** Days not taking antipsychotics: Not prescribed | 33 (21.9%) | 11 (18.6%) | 14 (19.7%) | 12 (17.4%) | 12 (23.1%) |
| ** Number of days of illegal drugs (SE) | 4.05 (0.69) | 2.67 (0.92) | 2.25 (0.73) | 2.87 (0.86) | 2.54 (0.96) |

* The difference in use of the same service from the first class reached statistical significance at 0.05.

** The Type 3 test for the effect of the covariates across the five classes reached statistical significance at 0.05.

TABLE 2. Parameter estimates and values of the latent class mediation analysis and of simulation study

Class-specific parameters, estimate (SE), by latent class label c

| Parameter | 1: Low class | 2: IRT class | 3: IRT & SEE class | 4: IRT & FPE class | 5: High class |
|---|---|---|---|---|---|
| γ0c | 3.90 (0.72) | 3.04 (0.73) | 2.51 (0.74) | 1.32 (0.82) | N/A |
| γ1c | −3.94 (0.75) | −4.15 (0.81) | −2.67 (0.77) | −1.16 (0.85) | N/A |
| α01c | 0.25 (0.17) | 2.47 (0.14) | 2.44 (0.14) | 2.26 (0.15) | 2.90 (0.13) |
| α02c | −0.99 (0.25) | −0.41 (0.31) | −0.27 (0.27) | 1.95 (0.23) | 2.66 (0.21) |
| α03c | −1.42 (0.25) | −1.9 (0.5) | 1.61 (0.22) | 0.24 (0.26) | 1.62 (0.21) |
| α04c | 0.20 (0.23) | 0.35 (0.32) | 1.25 (0.23) | 1.49 (0.23) | 1.73 (0.22) |
| β0c | 43.66 (5.19) | 39.21 (5.72) | 32.68 (6.19) | 39.73 (8.48) | 34.7 (14.29) |
| β1c | −5.04 (3.75) | 7.12 (6.01) | 6.62 (4.16) | −6.16 (6.55) | −2.81 (13.77) |

Parameters shared across classes

| Parameter in Model (2) | X | Estimate (SE) | Parameter in Model (3) | W | Estimate (SE) |
|---|---|---|---|---|---|
| α1m1, m = 1 | Base PANSS | −0.0022 (0.0016) | βw1 | Base PANSS | −0.55 (1.83) |
| α1m1, m = 2 | Base PANSS | 0.0010 (0.0024) | βw2 | White Race | −1.2 (2.11) |
| α1m1, m = 3 | Base PANSS | 0.0019 (0.0024) | βw3 | College Education | −3.39 (2.29) |
| α1m1, m = 4 | Base PANSS | −0.0063 (0.0026) | βw4 | Below Highschool | 4.92 (2.69) |
| α1m2, m = 1 | Male Gender | 0.052 (0.051) | βw5 | <7d not taking drug | 4.92 (2.69) |
| α1m2, m = 2 | Male Gender | −0.19 (0.08) | βw6 | ≥7d not taking drug | 5.20 (2.12) |
| α1m2, m = 3 | Male Gender | 0.48 (0.08) | βw7 | Days of Illicit Drug | 5.2 (2.12) |
| α1m2, m = 4 | Male Gender | 0.25 (0.09) | βw8 | Male Gender | 0.2 (0.11) |
| α1m3, m = 1 | Student Status | −0.09 (0.06) | βw9 | Student Status | −1.64 (1.92) |
| α1m3, m = 2 | Student Status | −0.18 (0.09) | σ | Residual | 5.17 (2.13) |
| α1m3, m = 3 | Student Status | 0.13 (0.09) | | | |
| α1m3, m = 4 | Student Status | −0.08 (0.1) | | | |

Class-specific parameter values used for simulation study

| Parameter | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
|---|---|---|---|---|---|
| γ0c | 3.90 | 3.15 | 2.47 | 1.38 | 0 |
| γ1c | −3.99 | −4.01 | −2.73 | −1.26 | 0 |
| α01c | 0.04 | 2.27 | 2.30 | 2.08 | 2.77 |
| α02c | −1.06 | −0.58 | −0.35 | 1.81 | 2.57 |
| α03c | −0.87 | −0.96 | 2.19 | 0.91 | 2.14 |
| α04c | −0.14 | 0.22 | 0.91 | 1.19 | 1.44 |
| β0c | 41.32 | 34.08 | 27.25 | 33.22 | 25.67 |
| β1c | −5.56 | 9.28 | 7.03 | −4.79 | 1.98 |

The class-specific NIE(c) (c = 1, 2, …, 5) were calculated using formula (7b), which does depend on W. The NIE(1) for the first class (the overall low class) was estimated as 2.57 (SE = 1.16, P-value = 0.027). The NIE for the second class (the moderate IRT class) was estimated as −0.03 (SE = 0.90, P-value = 0.97); that of the third class (the moderate IRT and SEE class) was −0.39 (SE = 0.30, P-value = 0.20); that of the fourth class (the moderate IRT and FPE class) was −2.87 (SE = 0.68, P-value <0.0001); and that of the fifth class (the high overall class) was −3.15 (SE = 0.66, P-value <0.0001) (Table 1). Because the class-specific NIE(1) for the first class had a positive sign, while the other four class-specific NIE(c) and the total NIE used as the denominator had a negative sign, the estimated mediation proportion was negative for the first class and positive for the other four classes. The estimated mediation proportions for the five classes were −0.66, 0.01, 0.1, 0.74, and 0.81, respectively. Kenny, Kashy, and Bolger48 used the term “opposing mediation” for the situation in which multiple mediators have opposite signs in their mediating effects. However, the presence of “opposing mediation” makes the calculation of the proportion mediated by each mediator somewhat confusing. Here we provide an explanation of the opposite signs associated with the estimated mediation proportions corresponding to the class-specific NIE(c). By allowing a negative sign to be associated with a proportion, the five proportions of NIE(c) added up to 100%. The positive sign of the class-specific NIE(1) indicated that the effect of the NAV intervention mediated through the pathway represented by this class was to make symptoms worse (higher PANSS score), while the negative sign of the other NIE(c) indicated that the pathway represented by each of the other classes was to make symptoms better (lower PANSS score), with the fourth and fifth classes having significant mediating effects.
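The signed mediation proportions can be checked with a few lines of arithmetic. This sketch uses the rounded class-specific NIE(c) from Table 1 and, as an assumption made so that the proportions sum exactly to one, takes their sum as the denominator rather than the separately rounded total NIE.

```python
# Class-specific NIE(c) for classes 1..5, as reported in Table 1.
nie_c = [2.57, -0.03, -0.39, -2.87, -3.15]

# Assumption: the denominator is the sum of the rounded class-specific
# effects, so the signed proportions add up to exactly 100%.
total_nie = sum(nie_c)
proportions = [effect / total_nie for effect in nie_c]

for label, prop in zip(range(1, 6), proportions):
    print(f"class {label}: proportion of total NIE = {prop:+.2f}")
```

The first class receives a negative proportion because its effect opposes the direction of the total NIE, which is the “opposing mediation” pattern discussed above.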

3.4 |. Implication of the results from the latent class mediation model

We now connect the results from the observed mediator model in Section 3.2 to the five-class results from the latent class mediation model displayed in Figure 3. The significant mediating effect of FPE from the observed mediator models in Section 3.2 is consistent with the result from the latent class mediator model in that FPE use was significantly higher in the fourth and fifth classes, the “moderate IRT and FPE” class and the “high” class, both of which had a significant class-specific NIE(c) (c = 4, 5), while the first through third classes, with non-significant NIE(c), had low FPE use. The significant mediating effect of SAM from the observed mediator models in Section 3.2 is also consistent with the result from the latent class mediator model in that the five mediating classes had increasingly greater use of SAM and increasingly significant NIE(c) as the class label c increases. That IRT did not have a significant mediating effect based on its observed mediator model in Section 3.2 is consistent with the result from the latent class mediator model, which found that the second and third classes, the “moderate IRT” class and the “moderate IRT and SEE” class, did not have a significant NIE(c) (c = 2, 3) although both classes had significantly greater use of IRT than the first class. That SEE did not have a significant mediating effect based on its observed mediator model in Section 3.2 is also consistent with the result from the latent class mediator model in that the third class, “moderate IRT and SEE”, did not have a significant NIE(3) although the class had significantly greater use of SEE compared with the first, second, and fourth classes.

As FPE and SAM each had a significant mediating effect in their observed mediator models, the latent class mediation model indicated that FPE and SAM act together, but not alone, in their mediation effect: the fourth and fifth classes both had a significant NIE(c) with elevated use of both FPE and SAM. We did not see a class with a significant NIE(c) and elevated use of FPE or SAM alone. This illustrates a clear advantage of the latent class mediation model, as the observed mediator model is unable to determine whether FPE and SAM act together or alone.

After accounting for clustering through site-specific random intercepts in the indicator models and in the outcome model, the estimated total NIE and class-specific NIEs were close to those given in Table 1. The class-specific NIEs remained significant for the fourth and fifth classes. The NIE for the overall low class was significant in the model without site-specific random effects but did not reach statistical significance at 0.05 in the model with the random effects, where the NIE estimate for this class amounted to only 1.66 (SE = 1.37, P-value = 0.061).

3.5 |. Participant characteristics across classes

Among CC patients, 56.5% were in the first class (the low overall class), 24.0% in the second class (the moderate IRT class), 14.1% in the third class (the moderate IRT and SEE class), 4.3% in the fourth class (the moderate IRT and FPE class), and only 1.2% in the fifth class (the high overall class). In contrast, at NAV intervention sites each of the last four classes was associated with more service visits than CC. Among NAV patients far greater proportions were in the classes representing higher levels of service use: only 22.2% were in the low overall class, 7.6% in the moderate IRT class, 19.8% in moderate IRT and SEE class, 27.1% in moderate IRT and FPE class, and 23.2% in the high overall class (Table 1).

The participants showed improvement in PANSS from baseline to 6 months in all five classes, and the changes were all statistically significant except for the first (overall low) class. We further examined participant characteristics associated with the five classes with a multinomial logistic regression of the classes based on maximum posterior classification of each subject (Table 1 lower panel). Among these characteristics, race, mother’s education, baseline adherence to prescribed antipsychotic medication (“days not taking the first antipsychotic” in Table 1, lower panel), and illegal drug use were all significantly different across the five classes. Patients whose mothers had a higher level of education were more likely to be in the classes with higher service use (P-value = 0.03). For baseline antipsychotic adherence, the low overall and the moderate IRT-only groups had lower adherence, with proportions of 77.1% compared to 87.7% and 92.5% in the higher service use classes (P-value = 0.01). The number of days of illegal drug use differed between the five classes, with the low overall service use group having a greater mean number of days of illegal drug use than all other service use patterns (P-value = 0.05). The fourth and the fifth classes (the IRT with FPE class and the high overall class) had a greater proportion of white patients (73.9% and 76.9%) while the first class (the low overall class), the second class (the moderate IRT class), and the third class (the moderate IRT with SEE class) all had a lower proportion of white patients (40.8%, 61.0%, and 40.3%, respectively) (P-value = 0.09).

3.6 |. Comparison of our latent class mediator model to the latent continuous mediator model of Albert et al

In a further analysis, we compared the result using our latent class mediator with that using the latent variable mediator presented by Albert, Geng, and Nelson.20 Both data sets have four “intermediate” variables serving as mediating indicator. We here used a continuous latent variable with normal distribution as the mediator in place of the discrete latent class variable. We fit the following simultaneous models (9)–(11):

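$$U_i^{*} = \gamma_0^{*} + \gamma_1^{*} A_i + e_i, \tag{9}$$

$$g_m\{E(Z_{im} \mid U_i^{*}, X_i)\} = \alpha_{0m}^{*} + \alpha_{1m}^{*T} X_i + \alpha_{um}^{*} U_i^{*}, \tag{10}$$

$$h\{E(Y_i \mid U_i^{*}, A_i, W_i)\} = \beta_0^{*} + \beta_1^{*} A_i + \beta_w^{*T} W_i + \beta_u^{*} U_i^{*}. \tag{11}$$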

In Model (9) we model a continuous latent variable Ui* as the mediator with a linear regression model with the exposure A and an error term ei with standard normal distribution. In model (10) we use a generalized structural equation model with Ui* as a latent covariate for the mth mediating indicator Zim. Our joint models (9)–(11) are set up similarly to those specified in Models (9a)–(9d) in the paper by Albert et al.20

The total NIE, NDE, and TE from the model using the latent continuous mediator are −4.12 (SE = 1.46, P-value = 0.0052), 0.20 (SE = 2.18, P-value = 0.93), and −3.92 (SE = 1.82, P-value = 0.032), respectively. We then compared these effects to those presented in Table 1. Although the total NIEs from the two models look similar in magnitude, their interpretations are different. The total NIE for the latent continuous mediator is interpreted as the mediating effect that is shared by the four mediating indicators, while the total NIE of the latent class mediator is interpreted as the total NIE attributed to all five patterns of the four mediating indicators. That individual patterns can have opposite signs in their mediating effects possibly explains the smaller NIE of the latent class mediator compared with that of the latent continuous mediator. This explanation also applies to the NDE.

All four $\alpha_{um}^{*}$ (m = 1, …, 4) for the latent continuous mediator $U_i^{*}$ in model (10) are significantly different from zero, implying that all four mediating indicators are significantly associated with the latent continuous mediator $U_i^{*}$. A significant association between a mediating indicator and the latent continuous mediator does not necessarily imply that the mediating indicator is itself a significant mediator, as in the case of IRT, which is not a significant mediator as illustrated in Section 3.2.

4 |. SIMULATION STUDY

4.1 |. Generating data

To further evaluate the latent class mediation approach, we performed a number of simulations. For these simulations, a binary exposure, four observed indicators for the latent class, a continuous outcome, and a single continuous covariate were generated. We use parameter values close to the estimates from the results section. We generated samples of 400 participants with probability of 0.5 for Ai = 1. We then assign the latent class membership variable Ui to one of the five latent classes using the probabilities calculated with Equation (1). Then we generate four indicators of the latent class, Zi1, Zi2, Zi3, Zi4, using Equation (2) with a log link for the Poisson distribution and a class-specific mean:

$$Z_{im} \sim \mathrm{Poisson}\Big(\sum_{k=1}^{5}\exp(\alpha_{mk})\,I(U_i=k)\Big) \quad \text{for } m=1,2,3,4.$$

The outcome Yi is generated under two situations. The first is under a class-specific intercept, a class-specific exposure effect, and a confounder for the outcome, using Equation (3):

$$Y_i \sim \mathrm{Normal}\Big(\sum_{k=1}^{5}(\beta_{0,k}+\beta_{a,k}A_i+\beta_w w_i)\,I(U_i=k),\; 13.99\Big),$$

where the coefficient is set at −0.5504 for the continuous confounder wi, which is generated as a normally distributed variable with mean 76.62 and SD of 15.01. The parameter values used above for generating the data are given in Table 2 bottom panel. In the second situation the outcome Yi is generated using the indicators as the mediators directly, without the structure of latent classes; the exposure and the indicators are generated as above, and the following linear model is used:

$$Y_i=\lambda_0+\lambda_1 Z_{i1}+\lambda_2 Z_{i2}+\lambda_3 Z_{i3}+\lambda_4 Z_{i4}+\lambda_5 Z_{i1}Z_{i3}+\lambda_6 Z_{i2}Z_{i3}+\lambda_x A_i+\lambda_w w_i+e_i$$

with λ0 = 38.92, λ1 = −0.09, λ2 = −0.30, λ3 = −0.47, λ4 = −1.46, λ5 = 0.057, λ6 = −0.13, λx = 2.50, and λw = −0.56 and ei ~ N(0, 13.99).
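The first data-generating scenario (latent class mediator) can be sketched with the Python standard library as follows. This is an illustrative sketch rather than the simulation code used in the study. It assumes that Equation (1) is a multinomial logit in the exposure with class 5 as the reference, and that 13.99 is the residual standard deviation (the text does not state whether it is an SD or a variance); the function names `draw_poisson` and `simulate` are hypothetical.

```python
import math
import random

# Simulation values from the bottom panel of Table 2 (class 5 = reference).
GAMMA0 = [3.90, 3.15, 2.47, 1.38, 0.0]
GAMMA1 = [-3.99, -4.01, -2.73, -1.26, 0.0]
ALPHA = [  # ALPHA[m][k]: log mean of indicator m+1 in class k+1
    [0.04, 2.27, 2.30, 2.08, 2.77],
    [-1.06, -0.58, -0.35, 1.81, 2.57],
    [-0.87, -0.96, 2.19, 0.91, 2.14],
    [-0.14, 0.22, 0.91, 1.19, 1.44],
]
BETA0 = [41.32, 34.08, 27.25, 33.22, 25.67]
BETA1 = [-5.56, 9.28, 7.03, -4.79, 1.98]
BETA_W = -0.5504
RESIDUAL_SD = 13.99  # assumption: interpreted as a standard deviation


def draw_poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method.

    Adequate for the small means used here (all below about 20).
    """
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(n=400, seed=0):
    """Generate one data set of n subjects under the latent class model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = 1 if rng.random() < 0.5 else 0            # exposure, P(A=1) = 0.5
        weights = [math.exp(g0 + g1 * a) for g0, g1 in zip(GAMMA0, GAMMA1)]
        u = rng.choices(range(5), weights=weights)[0]  # latent class (0-based)
        z = [draw_poisson(rng, math.exp(ALPHA[m][u])) for m in range(4)]
        w = rng.gauss(76.62, 15.01)                    # continuous confounder
        y = rng.gauss(BETA0[u] + BETA1[u] * a + BETA_W * w, RESIDUAL_SD)
        rows.append({"A": a, "U": u + 1, "Z": z, "w": w, "Y": y})
    return rows


data = simulate(n=400, seed=1)
print(data[0])
```

Fixing the seed makes each replicate reproducible, which is convenient when the same data set is analyzed with several competing models.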

4.2 |. Impact of whether the true mediator is a latent class or not

When the outcomes were generated from the latent class mediator model with five classes, the latent class mediator model estimates of the NIEs had little bias. The NIEs calculated using the multiple indicators as mediators without latent classes in the outcome model were biased under various specifications of exposure and mediators for the outcome. This may be due to the incorrect specification of the outcome model using mediators without latent classes (Table 3A). When the outcomes were generated from the multiple mediator model with Zi1–Zi3 and Zi2–Zi3 interactions, we found that the latent class model had a much smaller bias than the observed multiple mediator models without interaction terms or with exposure-mediator interactions but no mediator-mediator interactions. No bias was observed when the outcome model was specified with the exact exposure-mediator and mediator-mediator interactions that generated the data (Table 3B). With the misspecified Zi1–Zi2 and Zi1–Zi3 interactions in the outcome model, the bias was still much greater than that of the latent class mediator model. This indicates that the latent class model was able to accommodate interactions.

TABLE 3. Bias study in NIE. (A) Simulation study for the outcome model generated using latent class mediator. (B) Simulation study for the outcome model generated using multiple mediators without latent class

(A)

| Model used in analysis | Specification | NIE Estimate | NIE SD | NIE Bias | NIE SE | NDE Estimate | NDE SD | NDE Bias | NDE SE |
|---|---|---|---|---|---|---|---|---|---|
| Observed multiple mediator model | No interaction | −3.80 | 0.87 | 0.94 | 0.03 | −0.86 | 1.65 | −0.94 | 0.05 |
| Observed multiple mediator model | Four Ai-Zim interactions | −3.05 | 0.91 | 1.69 | 0.03 | −1.59 | 1.69 | −1.67 | 0.05 |
| Latent class mediator model (TRUEa) | (i) Simultaneous estimation | −4.81 | 1.24 | −0.07 | 0.04 | 0.15 | 1.87 | 0.08 | 0.06 |
| Latent class mediator model (TRUEa) | (ii) Three-step estimation**: Pseudo-class draw | −4.35 | 1.27 | 0.39 | 0.03 | −0.27 | 2.09 | −0.35 | 0.05 |
| Latent class mediator model (TRUEa) | (ii) Three-step estimation**: Modal assignment | −4.44 | 1.27 | 0.30 | 0.03 | −0.18 | 2.09 | −0.26 | 0.05 |
| Latent class mediator model (TRUEa) | (ii) Three-step estimation**: Proportional assignment | −4.67 | 1.31 | 0.069 | 0.03 | 0.049 | 2.12 | −0.03 | 0.05 |

a True indirect effect = −4.74. True direct effect = 0.08.

** Posterior probabilities.

(B)

| Model used in analysis | Specification | NIE Estimate | NIE SD | NIE Bias | NIE SE | NDE Estimate | NDE SD | NDE Bias | NDE SE |
|---|---|---|---|---|---|---|---|---|---|
| Observed multiple mediator model | No interaction | −6.82 | 1.09 | −0.57 | 0.01 | 3.06 | 1.69 | 0.57 | 0.02 |
| Observed multiple mediator model | Four Ai-Zim interactions | −7.01 | 1.19 | −0.76 | 0.01 | 3.25 | 1.74 | 0.76 | 0.02 |
| Observed multiple mediator model | All six pairwise Zim two-way interactions | −6.25 | 1.10 | −0.01 | 0.01 | 2.50 | 1.71 | 0.01 | 0.02 |
| Observed multiple mediator model | Zi1–Zi3 & Zi2–Zi3 interactions (TRUE) | −6.25 | 1.24 | −0.01 | 0.01 | 2.50 | 1.80 | 0.01 | 0.02 |
| Observed multiple mediator model | Zi1–Zi2 & Zi1–Zi3 interactions | −6.72 | 1.11 | −0.46 | 0.01 | 2.96 | 1.70 | 0.46 | 0.02 |
| Latent class mediator model | | −6.35 | 1.34 | −0.10 | 0.01 | 2.60 | 1.89 | 0.10 | 0.02 |

True indirect effect = −6.25. True direct effect = 2.50.

4.3 |. Simultaneous vs stepwise estimation

In principle Y is downstream from U and therefore it is preferred that Y is not included as an indicator of U. However, technically Y seems to be like another indicator of U using our simultaneous estimation approach, except that Y is allowed to be directly influenced by A in Model (3) while Zs are not in Model (2). So, the outcome Y in our model is regarded as a distal outcome that is usually conceptualized as a consequence of latent classes rather than indicators of latent classes.25 Estimation of latent class models with distal outcomes has been an area of active research.25,40,41

In this subsection, we conduct a simulation study to investigate the impact on the mediation effect of including Y as a technical indicator for U in the simultaneous estimation described in Section 2.2. We drop the subscript for subjects from the notation in the following description. After a data set is generated as described in Section 4.1 from the latent class mediator model with five classes, we can estimate the mediation model through one of the following two approaches: (i) Y is included as an indicator of U through the simultaneous estimation approach; or (ii) Y is not an indicator of U and the outcome model for Y is estimated through the three-step estimation approach described in the third paragraph of Section 3.3. Specifically, in (ii), we adopted the following three ways of assigning class membership to a subject in the first step of the three-step estimation: (a) Pseudo-class draw. Assign membership to the subject by drawing a random sample from the multinomial distribution with the posterior probabilities (P**s) for that subject. This is similar to the method described by Wang, Jo and Brown.49 (b) Modal assignment. Assign a subject to the class for which the subject has the highest posterior probability. (c) Proportional assignment. Partially assign the subject to all the classes using the respective posterior probabilities. Here we present the result from the simulation study comparing the results from (i) and (ii).
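The three assignment rules in approach (ii) can be illustrated for a single subject as below. This is a minimal sketch: the posterior probabilities are made-up numbers, the function names are hypothetical, and the subsequent outcome-model step of the three-step estimation is not shown.

```python
import random


def pseudo_class_draw(posterior, rng):
    """(a) Draw one class label from the subject's posterior distribution."""
    return rng.choices(range(len(posterior)), weights=posterior)[0]


def modal_assignment(posterior):
    """(b) Assign the class with the highest posterior probability."""
    return max(range(len(posterior)), key=lambda c: posterior[c])


def proportional_assignment(posterior):
    """(c) Return normalized weights so the subject counts partially in each class."""
    total = sum(posterior)
    return [p / total for p in posterior]


# Hypothetical posterior probabilities for one subject over five classes.
posterior = [0.02, 0.05, 0.08, 0.80, 0.05]
rng = random.Random(2021)

print("pseudo-class draw (0-based class):", pseudo_class_draw(posterior, rng))
print("modal assignment (0-based class):", modal_assignment(posterior))
print("proportional weights:", proportional_assignment(posterior))
```

Modal assignment discards classification uncertainty, the pseudo-class draw reintroduces it through random sampling, and proportional assignment carries it forward as weights in the outcome model.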

The result of the simulation study is presented in the bottom part of Table 3A. When Y was included as an indicator of U in the estimation as in (i) simultaneous estimation, there was the least bias in NIE and NDE. When Y was not included as an indicator of U as in the (ii) three-step estimation, all three different ways of assigning memberships to subjects yielded a greater bias than that yielded by the simultaneous estimation. This is because the relative entropy38,39 of Y was 0.2 while that of the Zs was 0.98, so the U in our data was almost completely determined by the Zs and not by Y. Therefore, Y had little influence in the identification of the latent class. As a consequence, the simultaneous estimation with Y as a technical indicator of U had little to lose.

However, when the class separation was reduced by decreasing the entropy of the Zs, the sampling variation and the bias of the estimated NIE both increased (Online Supplement II). As long as the relative entropy stayed above 0.7 to 0.8, the bias was quite small. When the entropy dropped below 0.7 to 0.8, the bias became less acceptable.
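For readers who want to compute a relative entropy from posterior class probabilities, the sketch below uses the normalized entropy statistic that is common in the mixture modeling literature, 1 − Σi Σc (−pic log pic) / (n log C). That this is the exact definition used in the cited references is an assumption, and the posterior values are illustrative.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy in [0, 1]; 1 means perfectly separated classes.

    `posteriors` is a list of per-subject posterior probability vectors,
    each of length C (the number of classes).
    """
    n = len(posteriors)
    n_classes = len(posteriors[0])
    # Terms with p == 0 contribute nothing (the limit of p*log(p) is 0).
    raw = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - raw / (n * math.log(n_classes))


# Illustrative posteriors: two well-classified subjects and one ambiguous one.
example = [
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.40, 0.35, 0.25],
]
print(f"relative entropy = {relative_entropy(example):.3f}")
```

Values near one correspond to the well-separated situation reported for the Zs, and values near zero to posteriors that are close to uniform.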

4.4 |. Effects of the number of latent classes

Here we investigate the situation when the number of classes is not known. The data were generated from the three-class, four-class, and five-class latent class mediation models, respectively, and then analyzed using latent class mediation models with two to six latent classes for each of the generating models.

The estimates of the indirect and direct effects were unbiased when we analyzed the data using the correct number of latent classes. When the true number of classes was three, a latent class mediator model with four or five classes exhibited a relatively small bias for the indirect effect and the direct effect. A larger bias was observed for the two-class model (Table 4 top panel). Similar results were observed when the true number of classes was four (Table 4 middle panel). For the five-class simulation, the two-, three-, and four-class analyses exhibited large biases while the six-class solution had a relatively small bias (Table 4 bottom panel). These simulations suggest that a suitable number of latent classes is necessary to ensure an unbiased estimate of the NIE. One should err on the side of more classes when a suitable number of classes is not known.
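As a small illustration of BIC-based selection in this simulation, the sketch below takes the BIC values reported in Table 4 for each data-generating model and picks the number of classes with the smallest BIC. The dictionary layout and variable names are illustrative only.

```python
# BIC values from Table 4, keyed by the true number of classes and then by
# the number of classes used in the analysis.
bic = {
    3: {2: 10310.1, 3: 9504.9, 4: 9554.1, 5: 9592.1, 6: 9638.0},
    4: {2: 10819.4, 3: 10087.1, 4: 9581.1, 5: 9625.5, 6: 9666.8},
    5: {2: 11035.0, 3: 10355.2, 4: 9908.3, 5: 9636.9, 6: 9687.6},
}

# For each generating model, select the analysis model with the smallest BIC.
selected = {true_c: min(values, key=values.get) for true_c, values in bic.items()}

for true_c, chosen in selected.items():
    print(f"true classes = {true_c}, smallest BIC at {chosen} classes")
```

In these three settings the smallest reported BIC coincides with the true number of classes, while the bias columns of Table 4 show that overshooting by one class costs less than undershooting.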

TABLE 4. Simulation study of varying number of classes

Data generated under the three-class model

| # of classes in analysis | BIC | NIE Estimate | NIE SD | NIE Bias | NIE SE | NDE Estimate | NDE SD | NDE Bias | NDE SE |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 10 310.1 | −1.65 | 0.91 | 1.07 | 0.01 | −3.05 | 1.71 | −1.05 | 0.02 |
| 3 (TRUE) | 9504.9 | −2.73 | 1.01 | −0.01 | 0.01 | −1.98 | 1.77 | 0.03 | 0.02 |
| 4 | 9554.1 | −2.79 | 1.34 | −0.07 | 0.01 | −1.93 | 1.99 | 0.07 | 0.02 |
| 5 | 9592.1 | −2.83 | 1.51 | −0.11 | 0.02 | −1.87 | 2.09 | 0.13 | 0.02 |
| 6 | 9638.0 | −3.12 | 5.83 | −0.40 | 0.06 | −1.53 | 6.00 | 0.47 | 0.06 |

True indirect effect = −2.72. True direct effect = −2.01.

Data generated under the four-class model

| # of classes in analysis | BIC | NIE Estimate | NIE SD | NIE Bias | NIE SE | NDE Estimate | NDE SD | NDE Bias | NDE SE |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 10 819.4 | −2.16 | 0.97 | 0.85 | 0.01 | −2.43 | 1.75 | −0.85 | 0.02 |
| 3 | 10 087.1 | −3.06 | 1.03 | −0.04 | 0.01 | −1.53 | 1.79 | 0.05 | 0.02 |
| 4 (TRUE) | 9581.1 | −3.02 | 1.06 | −0.01 | 0.01 | −1.56 | 1.80 | 0.01 | 0.02 |
| 5 | 9625.5 | −3.06 | 1.16 | −0.04 | 0.01 | −1.53 | 1.86 | 0.05 | 0.02 |
| 6 | 9666.8 | −3.19 | 1.77 | −0.17 | 0.02 | −1.18 | 2.25 | 0.39 | 0.02 |

True indirect effect = −3.02. True direct effect = −1.58.

Data generated under the five-class model

| # of classes in analysis | BIC | NIE Estimate | NIE SD | NIE Bias | NIE SE | NDE Estimate | NDE SD | NDE Bias | NDE SE |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 11 035.0 | −3.85 | 1.17 | 0.89 | 0.01 | −0.82 | 1.84 | −0.90 | 0.02 |
| 3 | 10 355.2 | −3.36 | 1.10 | 1.38 | 0.01 | −1.31 | 1.82 | −1.38 | 0.02 |
| 4 | 9908.3 | −4.02 | 1.31 | 0.72 | 0.01 | −0.65 | 1.96 | −0.72 | 0.02 |
| 5 (TRUE) | 9636.9 | −4.75 | 1.23 | −0.01 | 0.01 | 0.08 | 1.88 | 0.01 | 0.02 |
| 6 | 9687.6 | −4.80 | 1.26 | −0.06 | 0.01 | 0.13 | 1.91 | 0.05 | 0.02 |

True indirect effect = −4.74. True direct effect = 0.08.

5 |. DISCUSSION

In this article, we considered a latent class mediation model that transforms a possibly high-dimensional set of mediation indicators of mixed types into a discrete latent class variable that is conceptually and practically easier to handle. The proposed method is suitable for situations in which the goal is to estimate the total mediating effects that can be decomposed into additive effects of distinct mediating pathways when there exist multiple mediating indicators that are possibly correlated and interacting. The causal order of the mediating indicators may exist but is not of particular interest. Each class-specific mediating pathway receives contributions from multiple indicators and a subject is proportionally involved in each of the mediating pathways.

An appealing feature of this approach is that individual class-specific indirect effects add up to the total NIE. This is a feature that has not been achieved with other methods of mediation analysis involving multiple mediators. For example, it is not possible to make the effect of the observed, significant indicators of mediation, FPE and SAM, sum to their joint mediation effect in the observed mediator model unless one imposes additional restrictive assumptions. An indicator of mediation, for example, SAM, may not act alone but only in combination with another indicator of mediation.

Use of a latent class mediator has an especially valuable advantage in the presence of interactions whose forms are often unknown or not apparent in the data. The observed mediator model using observed indicators as the mediators requires interactions to be correctly specified, which is often unrealistic in the presence of multiple indicators of mediation. Our latent class mediation model naturally incorporated interactions as class-specific parameters, and it showed the least bias when compared to the observed mediator model in the presence of interactions. Since the true number of classes is unknown from the data, our simulation studies suggest that the model should err in the direction of more classes to achieve a less biased indirect effect. Machine learning approaches can find the interactions among the indicators and treatment exposure so that an observed mediator model can be fit with the identified interactions. Although the machine learning approach may discover high-dimensional interactions among the indicators and the exposure, a machine learning model would be more difficult to interpret than a latent class model and is seen as “hypothesis generating”, while the latent class model can test a priori relationships. It is also difficult to achieve the additivity property using the machine learning approach.

A conceptual issue concerns the simultaneous estimation of all the parameters for the indicator model (2) and for the outcome model (3). This simultaneous estimation may allow the outcome to influence the estimation of the latent class mediators, which may not seem desirable because the exposure effect on the mediator should precede the mediator effect on the outcome. Although the inclusion of a direct effect of A on Y but not on Z would lessen such influence of Y on the latent classes, the three-step estimation40,41 seems more intuitive in the presence of distal outcomes. However, it has its own issues and limitations, as pointed out in Section 3.3. Estimation of latent class models has been an area of active research, and the simultaneous estimation is still a viable approach in the presence of a distal outcome. In the simulation studies presented in Section 4.3, our proposed simultaneous estimation did not seem to cause any additional bias in estimating the mediating effects, owing to the high entropy of the Zs and the low entropy of Y. Caution should be taken when a distal outcome has high entropy.

In our example, we have an identity link for h in the outcome model (3), so the estimated NIE was obtained directly from the estimated model parameters in (3). The NIE thus does not depend on the choice of W and can be interpreted in the original measurement scale of Y. For a nonidentity link, estimation of NIE may involve nonlinear calculation and numerical integration if a random effect is present in the model (3). The NIE thus may depend on the value of W. However, if the NIE is defined in the same link function as h, then the calculation of estimated NIE may also be obtained directly from the estimated model parameters in (3). For example, if the link function h is log-odds for a binary Y and NIE is defined using the log-odds ratio (the difference in log-odds) as well as NDE and TE, then the NIE and NDE still sum up to TE and the NIE(c) from each of the classes still sum up to NIE. We can see this by adding a link function h of log-odds to the formula for NIE:

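$$\mathrm{NIE}=h\{E[Y(1,U(1))]\}-h\{E[Y(1,U(0))]\}=\sum_{c=1}^{C}(\beta_{0c}+\beta_{1c}+\beta_w^{T}W^{*})[P(U=c \mid A=1)-P(U=c \mid A=0)]=\sum_{c=1}^{C}(\beta_{0c}+\beta_{1c})[P(U=c \mid A=1)-P(U=c \mid A=0)].$$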
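The last equality in the display above holds because the covariate term is common to all classes and the class membership probabilities sum to one under each exposure level. A short derivation of this step, written under the assumption that $\beta_w$ is not class-specific (as in Model (3)):

```latex
\sum_{c=1}^{C} \beta_w^{T} W^{*}\,\bigl[P(U=c \mid A=1) - P(U=c \mid A=0)\bigr]
  = \beta_w^{T} W^{*}\Bigl[\sum_{c=1}^{C} P(U=c \mid A=1) - \sum_{c=1}^{C} P(U=c \mid A=0)\Bigr]
  = \beta_w^{T} W^{*}\,(1 - 1) = 0 .
```

The same cancellation does not occur within a single class, which is why each class-specific NIE(c) can depend on W even though their sum does not.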

When the link function h is the identity, we found that the model was quite robust to departure from the normality assumption for the error term of Y, as indicated by the simulation study we performed in Online Supplement III. The robustness can be explained by the fact that our outcome Y has a mixture of normal distributions under our simultaneous estimation approach (in contrast to the three-step estimation approach, in which a continuous distal outcome has a normal distribution). The mixture of normal distributions is quite robust to the normality assumption for the error term.

The assumptions (i)–(iv) required for the latent class model in terms of model specification and estimation are similar to those presented by VanderWeele and Vansteelandt26 and collectively constitute the sequential ignorability assumption.11 Since our data are from a randomized trial, assumptions (i) and (iii) are likely to be satisfied. We would follow the approach of Albert et al,20 which dealt with a similar situation in which the estimation of U involves Y. That approach used the proposal of Albert and Wang50 that the sensitivity to sequential ignorability can be assessed equivalently by assessing the sensitivity to “mediator comparability,” which assumes:

$$
E\{Y_i(a^{*},c) \mid U_i(a)=c,\ A_i=a,\ W_i\} = E\{Y_i(a^{*},c) \mid U_i(a^{*})=c,\ A_i=a^{*},\ W_i\}.
$$

This assumption says that the mean potential outcome of Y for a given exposure level and mediator value is the same for subgroups of the two observed exposure groups (A = a, a*) that are observed at the same mediator level. The hybrid model for the sensitivity analysis to “mediator comparability” as stated by Albert, Geng and Nelson20 can also be applied to our model (3) as:

$$
h\left[E\{Y_i(a^{*},c) \mid U_i(a)=c,\ A_i=a,\ W_i\}\right] = \beta_{0c} + \beta_{1c}\{\phi a + (1-\phi)a^{*}\} + \beta_w^{T} W_i, \qquad (12)
$$

where the exposure as a causal factor with level a* is distinguished from an observed exposure represented by level a, and the sensitivity parameter ϕ is defined as the proportion of the exposure due to observed exposure as confounding rather than the causal effect. Model (12) is an extension of Model (3) by allowing the mean of potential outcome to be determined beyond the exposure, the mediator and the confounders because the effect of the confounders is additionally related to the exposure, a clear violation of the assumption (iv). The sensitivity of the direct and the indirect effects to the departure from sequential ignorability11 can be studied by varying the value of ϕ in a plausible range that is described by Albert and Wang.50
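As a quick check on how ϕ works in Model (12), the two limiting cases can be written out directly. This is a sketch only, and it assumes that Model (3) has the linear-predictor form β0c + β1cAi + βwTWi, which is the form implied by the extended outcome models discussed in this section.

$$
\begin{aligned}
\phi = 0:&\quad h\left[E\{Y_i(a^{*},c) \mid U_i(a)=c,\ A_i=a,\ W_i\}\right] = \beta_{0c} + \beta_{1c} a^{*} + \beta_w^{T} W_i, \\
\phi = 1:&\quad h\left[E\{Y_i(a^{*},c) \mid U_i(a)=c,\ A_i=a,\ W_i\}\right] = \beta_{0c} + \beta_{1c} a + \beta_w^{T} W_i .
\end{aligned}
$$

At ϕ = 0 the right-hand side does not involve the observed exposure a, so the mean potential outcome is the same in both observed exposure groups and mediator comparability holds. At ϕ = 1 the mean is driven entirely by the observed exposure group, which is the largest departure the model allows. Intermediate values of ϕ interpolate linearly between these two cases.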

An additional assumption required by our latent class model is assumption (v): the outcome is independent of the mediating indicators conditional on the latent classes.27 We use the following empirical approach to test this assumption. We fit Model (3) with the extra term $\gamma^{\#} Z_{im}$, that is, $h(E(Y_i \mid U_i=c, A_i, W_i)) = \beta_{0c} + \beta_{1c} A_i + \gamma^{\#} Z_{im} + \beta_w^{T} W_i$, using each subject's posterior probabilities of membership in the C classes as sampling weights. A significant coefficient $\gamma^{\#}$ would indicate a violation of assumption (v). We found that only the coefficient associated with SAM, the fourth indicator $Z_{i4}$, was statistically significant. We then modified Model (3) to allow local dependence of Y on SAM with an associated class-specific coefficient $\gamma_c^{\#}$:

$$
h(E(Y_i \mid U_i=c,\ W_i)) = \beta_{0c} + \beta_{1c} A_i + \gamma_c^{\#} Z_{i4} + \beta_w^{T} W_i. \qquad (13)
$$

We obtained results by using the simultaneous estimation of Models (1), (2) and (13). Only $\gamma_1^{\#}$ was statistically significant, while $\gamma_2^{\#}$ through $\gamma_5^{\#}$ were not. The estimated NIE and NIE(1)–NIE(5) from fitting Models (1), (2) and (13) are given in Online Supplement Table 1, and they are very comparable to those in Table 1. However, both the latent class variable and the observed indicators serve as mediators in Model (13). The interpretation of the results becomes more difficult because the mediating effect of an indicator with local dependence is quantified both as a standalone effect and as a contributing effect in each of the latent classes.

A possible limitation of our analysis is that some participants were missing PANSS data at 6 months. However, the likelihood estimation used in our model is valid under the missing at random assumption for the missing data.51 For data that are missing not at random, a model for the missing data52 can be incorporated into the simultaneous estimation of the latent mediation model. The data set we used is from a 2-year longitudinal trial, and a longitudinal version of our latent class mediation model, with its increased complexity, is a work in progress.

Supplementary Material

Supplement material

ACKNOWLEDGEMENTS

This work is partially supported by grant 1R03MH112053-01A1. We thank John Kane and the RAISE-ETP executive committee and investigators for supporting this research initiative. We also appreciate the helpful comments from Daniel Zelterman, Eva Petkova, Melanie Wall, Hendricks Brown, and Tyler VanderWeele. We are grateful for Robert Gibbons’ critical suggestions.

Funding information

National Institutes of Health, Grant/Award Number: 1R03MH112053-01A1

APPENDIX A

A.1 Approximation of SE by delta method

The NIE for the latent class model with continuous outcome is

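$$
\mathrm{NIE} = E[Y(1,U(1)) - Y(1,U(0))] = \sum_{k=1}^{C} (\beta_{0k} + \beta_{1k})\left(P(U=k \mid A=1) - P(U=k \mid A=0)\right)
$$

$$
P_{UA} = P(U=k \mid A=a) = \frac{\exp(\gamma_{0k} + \gamma_{1k} a)}{\sum_{j=1}^{C} \exp(\gamma_{0j} + \gamma_{1j} a)}
$$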

The partial derivatives of the latent class probability with respect to the γ parameters are:

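$$
\frac{\partial P_{UA}}{\partial \gamma_{0k}} = D_k \quad \text{and} \quad \frac{\partial P_{UA}}{\partial \gamma_{1k}} = a D_k
$$

with

$$
D_k = \frac{\exp(\gamma_{0k} + \gamma_{1k} a)\left(\sum_{j=1,\, j \neq k}^{C} \exp(\gamma_{0j} + \gamma_{1j} a)\right)}{\left(\sum_{j=1}^{C} \exp(\gamma_{0j} + \gamma_{1j} a)\right)^{2}}.
$$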
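The expression for Dk can be checked against central finite differences of the class probability. The sketch below does this for one class. The parameter values and the helper names `class_prob` and `d_k` are hypothetical, and all γ parameters are treated as free purely for illustration (in practice a reference class would typically be fixed for identifiability).

```python
import math

def class_prob(k, gamma0, gamma1, a):
    """P(U = k | A = a) under the multinomial-logit form of the appendix."""
    expd = [math.exp(g0 + g1 * a) for g0, g1 in zip(gamma0, gamma1)]
    return expd[k] / sum(expd)

def d_k(k, gamma0, gamma1, a):
    """Analytic D_k: exp_k * (sum of the other exp terms) / (sum of all)^2."""
    expd = [math.exp(g0 + g1 * a) for g0, g1 in zip(gamma0, gamma1)]
    total = sum(expd)
    return expd[k] * (total - expd[k]) / total ** 2

# Hypothetical values for C = 3 classes.
gamma0 = [0.0, 0.5, -0.3]
gamma1 = [0.0, 1.2, 0.8]
a, k, eps = 1, 1, 1e-6

def bumped(values, index, delta):
    """Copy of `values` with one entry shifted by `delta`."""
    out = list(values)
    out[index] += delta
    return out

# Central differences with respect to gamma_{0k} and gamma_{1k}.
fd_gamma0 = (class_prob(k, bumped(gamma0, k, eps), gamma1, a)
             - class_prob(k, bumped(gamma0, k, -eps), gamma1, a)) / (2 * eps)
fd_gamma1 = (class_prob(k, gamma0, bumped(gamma1, k, eps), a)
             - class_prob(k, gamma0, bumped(gamma1, k, -eps), a)) / (2 * eps)

analytic = d_k(k, gamma0, gamma1, a)
print(abs(fd_gamma0 - analytic) < 1e-7, abs(fd_gamma1 - a * analytic) < 1e-7)
```

Note that Dk is the same quantity as P(U = k | A = a){1 − P(U = k | A = a)}, which is the usual own-parameter derivative of a softmax probability.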

The partial derivatives of the NIE with respect to the parameters are:

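$$
\frac{\partial\, \mathrm{NIE}}{\partial \gamma_{ik}} = \sum_{k=1}^{C} \beta_{k}\left(\frac{\partial P_{UA=1}}{\partial \gamma_{ik}} - \frac{\partial P_{UA=0}}{\partial \gamma_{ik}}\right)
$$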

and

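$$
\frac{\partial\, \mathrm{NIE}}{\partial \beta_{0k}} = \frac{\partial\, \mathrm{NIE}}{\partial \beta_{1k}} = P(U=k \mid A=1) - P(U=k \mid A=0).
$$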

The gradient of the NIE with respect to the parameters is:

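$$
\Gamma = \left(\frac{\partial\, \mathrm{NIE}}{\partial \gamma_{01}}, \ldots, \frac{\partial\, \mathrm{NIE}}{\partial \gamma_{0K}}, \frac{\partial\, \mathrm{NIE}}{\partial \gamma_{11}}, \ldots, \frac{\partial\, \mathrm{NIE}}{\partial \beta_{0K}}, \frac{\partial\, \mathrm{NIE}}{\partial \beta_{11}}, \ldots, \frac{\partial\, \mathrm{NIE}}{\partial \beta_{1K}}\right)
$$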

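The delta-method approximation for the SE of the NIE is then $\sqrt{\Gamma \Sigma \Gamma^{T}}$, where Σ is the covariance matrix of the parameter estimates and $\Gamma \Sigma \Gamma^{T}$ is the approximate variance.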
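The delta-method calculation can be sketched in a few lines. This illustration swaps in a numerical (central-difference) gradient for the analytic derivatives given in this appendix, which avoids hand-coding each partial derivative. The parameter vector, its ordering, the diagonal covariance matrix and all function names are hypothetical placeholders, not estimates or code from the study.

```python
import math

def class_probs(gamma0, gamma1, a):
    """Multinomial-logit class probabilities P(U = k | A = a)."""
    expd = [math.exp(g0 + g1 * a) for g0, g1 in zip(gamma0, gamma1)]
    total = sum(expd)
    return [e / total for e in expd]

def nie(theta, n_classes):
    """NIE for a continuous outcome; theta = (gamma0, gamma1, beta0, beta1)."""
    c = n_classes
    gamma0, gamma1 = theta[:c], theta[c:2 * c]
    beta0, beta1 = theta[2 * c:3 * c], theta[3 * c:]
    p1 = class_probs(gamma0, gamma1, 1)
    p0 = class_probs(gamma0, gamma1, 0)
    return sum((b0 + b1) * (q1 - q0)
               for b0, b1, q1, q0 in zip(beta0, beta1, p1, p0))

def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def delta_method_se(grad, cov):
    """sqrt(grad' * cov * grad), the delta-method standard error."""
    n = len(grad)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(var)

# Hypothetical inputs for C = 2 classes.
n_classes = 2
theta = [0.0, 0.5,    # gamma_01, gamma_02
         0.0, 1.2,    # gamma_11, gamma_12
         0.2, -0.5,   # beta_01, beta_02
         0.3, 0.1]    # beta_11, beta_12
# Placeholder covariance: independent estimates, each with variance 0.01.
cov = [[0.01 if i == j else 0.0 for j in range(len(theta))]
       for i in range(len(theta))]

grad = numerical_gradient(lambda t: nie(t, n_classes), theta)
se = delta_method_se(grad, cov)
print(se)
```

In an actual analysis, `theta` and `cov` would come from the fitted model (for example, the estimates and their covariance matrix reported by the likelihood software), and the numerical gradient could be replaced by the analytic one.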

Footnotes

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of this article.

REFERENCES

  • 1. VanderWeele TJ, Asomaning K, Tchetgen EJ, et al. Genetic variants on 15q25.1, smoking, and lung cancer: an assessment of mediation and interaction. Am J Epidemiol. 2012;175:1013–1020.
  • 2. Preacher KJ, Hayes AF. SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behav Res Meth Instrum Comput. 2004;36(4):717–731.
  • 3. Imai K, Keele L, Tingley D. A general approach to causal mediation analysis. Psychol Methods. 2010;15(4):309–334.
  • 4. O’Connell M, Kyaw S, Rosenheck R. How do housing subsidies improve quality of life among homeless adults? a mediation analysis. Am J Com Psychol. 2018;61(3–4):433–444.
  • 5. Judd CM, Kenny DA. Process analysis: estimating mediation in treatment evaluations. Eval Rev. 1981;5:602–619.
  • 6. Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 1986;51(6):1173–1182.
  • 7. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3(2):143–155.
  • 8. Pearl J. Direct and indirect effects. In: Breese J, Koller D, eds. Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence. San Francisco, CA: Morgan Kaufmann; 2001:411–420.
  • 9. VanderWeele TJ, Vansteelandt S. Odds ratios for mediation analysis for a dichotomous outcome. Am J Epidemiol. 2010;172(12):1139–1148.
  • 10. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychol Methods. 2013;18(2):137–150.
  • 11. Imai K, Keele L, Yamamoto T. Identification, inference and sensitivity analysis for causal mediation effects. Stat Sci. 2010;25(1):51–71.
  • 12. Preacher KJ, Hayes AF. Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behav Res Methods. 2008;40(3):879–891.
  • 13. Wang W, Nelson S, Albert JM. Estimation of causal mediation effects for a dichotomous outcome in multiple-mediator models using the mediation formula. Stat Med. 2013;32(24):4211–4228.
  • 14. VanderWeele TJ, Vansteelandt S. Mediation analysis with multiple mediators. Epidemiol Methods. 2014;2(1):95–115.
  • 15. Steen J, Loeys T, Moerkerke B, Vansteelandt S. Flexible mediation analysis with multiple mediators. Am J Epidemiol. 2017;186(2):184–193.
  • 16. Albert JM, Nelson S. Generalized causal mediation analysis. Biometrics. 2011;67(3):1028–1038.
  • 17. Daniel RM, De Stavola BL, Cousens SN, Vansteelandt S. Causal mediation analysis with multiple mediators. Biometrics. 2015;71(1):1–14.
  • 18. Bellavia A, Valeri L. Decomposition of the total effect in the presence of multiple mediators and interactions. Am J Epidemiol. 2018;187(6):1311–1318.
  • 19. Hong G, Deutsch J, Hill HD. Ratio-of-mediator-probability weighting for causal mediation analysis in the presence of treatment-by-mediator interaction. J Educ Behav Stat. 2015;40(3):307–340.
  • 20. Albert JM, Geng C, Nelson S. Causal mediation analysis with a latent mediator. Biom J. 2016;58(3):535–548.
  • 21. Witkiewitz K, Roos CR, Tofighi D, van Horn ML. Broad coping repertoire mediates the effect of the combined behavioral intervention on alcohol outcomes in the COMBINE study: an application of latent class mediation. J Stud Alcohol Drugs. 2018;79(2):199–207.
  • 22. Lin JY, TenHave TR, Elliott MR. Longitudinal nested compliance class model in the presence of time-varying noncompliance. J Am Stat Assoc. 2008;103(482):462–473.
  • 23. Jo B, Wang CP, Ialongo NS. Using latent outcome trajectory classes in causal inference. Stat Interface. 2009;2(4):403–412.
  • 24. Egleston BL, Uzzo RG, Wong Y-N. Latent class survival models linked by principal stratification to investigate heterogenous survival subgroups among individuals with early-stage kidney cancer. J Am Stat Assoc. 2017;112:534–546.
  • 25. Nylund-Gibson K, Grimm RP, Masyn KE. Prediction from latent classes: a demonstration of different approaches to include distal outcomes in mixture models. Struct Equ Modeling. 2019;26:967–985.
  • 26. VanderWeele TJ, Vansteelandt S. Conceptual issues concerning mediation, interventions and composition. Stat Interface. 2009;2:457–468.
  • 27. Hagenaars JAP, McCutcheon AL. Applied Latent Class Analysis. Cambridge, UK: Cambridge University Press; 2002.
  • 28. McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. New York, NY: John Wiley & Sons; 1997.
  • 29. Lin H, Turnbull BW, McCulloch CE, Slate EH. Latent class models for joint analysis of longitudinal biomarker and event process data. J Am Stat Assoc. 2002;97(457):53–65.
  • 30. Lin H, McCulloch CE, Rosenheck RA. Latent pattern mixture models for informative intermittent missing data in longitudinal studies. Biometrics. 2004;60(2):295–305.
  • 31. Beunckens C, Molenberghs G, Verbeke G, Mallinckrodt C. A latent-class mixture model for incomplete longitudinal Gaussian data. Biometrics. 2008;64(1):96–105.
  • 32. Lai D, Xu H, Koller D, Foroud T, Gao S. A multivariate finite mixture latent trajectory model with application to dementia studies. J Appl Stat. 2016;43(14):2503–2523.
  • 33. SAS Institute Inc. SAS 9.4 Help and Documentation. Cary, NC: SAS Institute Inc; 2014.
  • 34. Golub GH, Welsch JH. Calculation of gauss quadrature rules. Math Comp. 1969;23(106):221–230.
  • 35. Pinheiro JC, Bates DM. Approximations to the log-likelihood function in the nonlinear mixed-effects model. J Comput Graph Stat. 1995;4(1):12–35.
  • 36. Ma X, Nie L, Cole SR, Chu H. Statistical methods for multivariate meta-analysis of diagnostic tests: an overview and tutorial. Stat Methods Med Res. 2016;25(4):1596–1619.
  • 37. Liu L, Huang X, Yaroshinsky A, Cormier JN. Joint models for zero-inflated recurrent events in presence of a terminal event. Biometrics. 2016;72(1):204–214.
  • 38. Celeux G, Soromenho G. An entropy criterion for assessing the number of clusters in a mixture model. J Classif. 1996;13:195–212.
  • 39. Li T, Ma S, Ogihara M. Entropy-based criterion in categorical clustering. In: Greiner R, Schuurmans D, eds. Proceedings of the 21st International Conference on Machine Learning. Banff, Alta: ACM Press; 2004:536–543.
  • 40. Bolck A, Croon M, Hagenaars J. Estimating latent structure models with categorical variables: one-step versus three-step estimators. Political Anal. 2004;12:3–27.
  • 41. Vermunt JK. Latent class modeling with covariates: two improved three-step approaches. Political Anal. 2010;18:450–469.
  • 42. Dziak JJ, Bray BC, Zhang J, Zhang M, Lanza ST. Comparing the performance of improved classify-analyze approaches for distal outcomes in latent profile analysis. Methodology (Gott). 2016;12(4):107–116.
  • 43. Kane JM, Robinson DG, Schooler NR, et al. Comprehensive versus usual community care for first-episode psychosis: 2-year outcomes from the NIMH RAISE early treatment program. Am J Psychiatry. 2016;173(4):362–372.
  • 44. Nylund KL, Asparouhov T, Muthen BO. Deciding on the number of classes in latent class analysis and growth mixture modeling: a Monte Carlo simulation study. Struct Equ Model. 2007;14(4):535–569.
  • 45. Tofighi D, Enders CK. Identifying the correct number of classes in growth mixture models. In: Hancock GR, Samuelsen KM, eds. Advances in Latent Variable Mixture Models. Charlotte, NC: Information Age Publishing, Inc; 2008:317–341.
  • 46. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–464.
  • 47. Jaki T, Kim M, Lamont A, et al. The effects of sample size on the estimation of regression mixture models. Educ Psychol Meas. 2019;79(2):358–384.
  • 48. Kenny DA, Kashy DA, Bolger N. Data analysis in social psychology. In: Gilbert D, Fiske S, Lindzey G, eds. The Handbook of Social Psychology. Vol 1. 4th ed. Boston, MA: McGraw-Hill; 1998:233–265.
  • 49. Wang CP, Jo B, Brown CH. Causal inference in longitudinal comparative effectiveness studies with repeated measures of a continuous intermediate variable. Stat Med. 2014;33:3509–3527.
  • 50. Albert JM, Wang W. Sensitivity analyses for parametric causal mediation effect estimation. Biostatistics. 2015;16(2):339–351.
  • 51. Little RJA, Rubin DB. Statistical Analysis with Missing Data. 3rd ed. New York, NY: Wiley; 2019.
  • 52. Gueorguieva R, Rosenheck R, Lin H. Joint modelling of longitudinal outcome and interval-censored competing risk dropout in a schizophrenia clinical trial. J R Stat Soc Ser A. 2012;175(2):417–433.
