Abstract
Behavioral science researchers have shown strong interest in disaggregating within-person relations from between-person differences (stable traits) using longitudinal data. In this paper, we propose a method of within-person variability score-based causal inference for estimating joint effects of time-varying continuous treatments by controlling for stable traits of persons. After explaining the assumed data-generating process and providing formal definitions of stable trait factors, within-person variability scores, and joint effects of time-varying treatments at the within-person level, we introduce the proposed method, which consists of a two-step analysis. Within-person variability scores for each person, which are disaggregated from stable traits of that person, are first calculated using weights based on a best linear correlation preserving predictor through structural equation modeling (SEM). Causal parameters are then estimated via a potential outcome approach, either marginal structural models (MSMs) or structural nested mean models (SNMMs), using calculated within-person variability scores. Unlike the approach that relies entirely on SEM, the present method does not assume linearity for observed time-varying confounders at the within-person level. We emphasize the use of SNMMs with G-estimation because of its property of being doubly robust to model misspecifications in how observed time-varying confounders are functionally related to treatments/predictors and outcomes at the within-person level. Through simulation, we show that the proposed method can recover causal parameters well and that causal estimates might be severely biased if one does not properly account for stable traits. An empirical application using data regarding sleep habits and mental health status from the Tokyo Teen Cohort study is also provided.
Supplementary Information
The online version contains supplementary material available at 10.1007/s11336-022-09879-1.
Keywords: longitudinal data, observational study, causal inference, marginal structural model, structural nested mean model
Introduction
Estimating the causal effects of (a sequence of) time-varying treatments/predictors on outcomes is a challenging issue in longitudinal observational studies because researchers must account for time-varying and time-invariant confounders. For this analytic purpose, potential outcome approaches such as marginal structural models (MSMs; Robins, 1999; Robins et al., 2000) have been widely used in epidemiology. Although actual applications have been relatively infrequent, structural nested models (SNMs; Robins, 1989, 1992) with G-estimation are in principle more suitable and robust for handling violation of the usual assumptions of no unobserved confounders and sequential ignorability (Robins, 1999; Robins & Hernán, 2009; Vansteelandt & Joffe, 2014).
Parallel with such methodological development, behavioral science researchers have shown interest in inferring within-person relations in longitudinally observed variables, namely, how changes in one variable influence another for the same person. Investigations based on within-person relations might produce conclusions opposite to those based on between-person relations. For example, a person is more likely to have a heart attack during exercise (within-person relation), despite people who exercise more having a lower risk of heart attack (between-person relation; Curran & Bauer, 2011).
Statistical inference for disaggregating within- and between-person (or within- and between-group) relations has been a concern in behavioral sciences for more than half a century. However, recent methodological development and extensive discussion (Cole et al., 2005; Hamaker, 2012; Hamaker et al., 2015; Hoffman, 2014; Usami et al., 2019a) have rapidly increased interest in this topic. In the psychometrics literature, along with multilevel modeling (e.g., Wang & Maxwell, 2015), structural equation modeling (SEM)-based approaches have become a popular means of uncovering within-person relations. Among these approaches, applications of the random-intercept cross-lagged panel model (RI-CLPM; Hamaker et al., 2015), which includes common factors called stable trait factors, have rapidly increased, reaching more than 1500 citations on Google as of June 2022. This model was originally proposed to uncover reciprocal relations among focal variables that arise at the within-person level (i.e., simultaneous investigation of the effects of a variable X on a variable Y, along with the effects of Y on X), without explicit inclusion of (time-varying) observed confounders L (however, Mulder and Hamaker (2021) discussed an extension that included a between-level predictor).
Despite the model's popularity and theoretical appeal, the concepts of stable traits and within-person relations in the RI-CLPM have not been fully characterized in the causal inference literature. This might be partly because psychometricians have used these terms vaguely and ambiguously in statistical models, without clarifying the assumed data-generating process (DGP) or providing clear mathematical definitions. For this reason, the RI-CLPM has not been contrasted with many other methodologies used for causal inference (e.g., MSMs and SNMs). One potential advantage of the RI-CLPM as an SEM-based model is that it can easily include and estimate measurement errors in statistical models under parametric assumptions. However, the RI-CLPM requires correctly specified linear regressions at the within-person level to link focal variables (as well as time-varying observed confounders, if included in the model). The linearity assumption typically imposed with respect to time-varying observed confounders in path modeling and SEM has often been criticized in the causal inference literature (e.g., Hong, 2015), and relaxing this assumption is often a key to consistently estimating the causal quantity of interest (e.g., Imai & Kim, 2019).
In this paper, we propose a method of within-person variability score-based causal inference for estimating joint effects of time-varying continuous treatments/predictors at the within-person level by controlling for stable traits (i.e., between-person differences), which are assumed to be uncorrelated with within-person relations as in the RI-CLPM. The proposed method is a two-step analysis. A within-person variability score for each person, which is disaggregated from the stable trait factor score of that person, is first calculated using weights based on a best linear correlation preserving predictor through SEM. Causal parameters are then estimated by MSMs or SNMs, using calculated within-person variability scores. The proposed method still requires specification of the structure for within-person variability scores over time for each variable (e.g., Y and L) in the first step. However, this approach is more flexible than the one that relies entirely on SEM (e.g., the RI-CLPM that includes time-varying observed confounders) in terms of modeling how time-varying observed confounders are functionally related to treatments/predictors and outcomes at the within-person level, without imposing the linearity assumption in these relations. We particularly emphasize the utility of SNMs with G-estimation because of its attractive property of being doubly robust to model misspecifications in how time-varying observed confounders are functionally related to treatments/predictors and outcomes at the within-person level.
The proposed method can be viewed as a synthesis of two traditions: factor analysis and SEM in psychometrics, and causal inference methods (MSMs or SNMs) in epidemiology. Because causal estimands that are defined at the within-person level are less common in the causal inference literature (Lüdtke & Robitzsch, 2021), the proposed method offers new insights for researchers in a broad range of disciplines who are interested in causal inference. Also, the idea of using within-person variability scores can be applied to many other issues that are closely relevant to causal hypotheses, including reciprocal effects and mediation effects.
The remainder of this paper is organized as follows. Because the concepts of stable traits and within-person relations have not been fully characterized in the causal inference literature, in Sect. 2 we start our discussion by introducing the two different DGPs in which time-invariant factors are included. After providing formal definitions of stable trait factors (for between-person relations) and within-person variability scores (for within-person relations), the definition of joint effects of time-varying treatments at the within-person level and their identification conditions are described in Sect. 3. We then introduce the proposed methodology in Sect. 4. In Sect. 5, we perform simulations and show that the proposed method can recover causal parameters well, and that causal estimates might be severely biased if stable traits are not properly accounted for. Section 6 describes an empirical application of the proposed method using data from the Tokyo Teen Cohort (TTC) study (Ando et al., 2019). The final section gives some concluding remarks and discusses our future research agenda.
Causal Models and Data-Generating Processes
In this section, we first explain two different DGPs and causal models in which time-invariant factors are included. In the first DGP, we assume that time-invariant factors have both direct and indirect effects on measurements; recent work by Gische et al. (2021), which provided a didactic presentation of the directed acyclic graph (DAG)-based approach and key concepts regarding causal inference based on a cross-lagged panel design, assumed this process. In the second DGP, we assume that time-invariant factors have only direct effects; this corresponds to the process that researchers (implicitly) assume in applying the RI-CLPM to infer within-person relations. This distinction of processes is inspired by Usami et al. (2019a), who highlighted how common factors included in the different statistical models to examine reciprocal relations have different conceptual and mathematical properties.
Below, we suppose that data are generated at fixed time points ,. Let denote a continuous treatment/predictor at time () for person i, and let denote time-varying observed confounders at that time for person i.1 Furthermore, is the outcome at time () for person i and is part of the time-varying confounders . Suppose that a time-varying confounder has three characteristics: it is independently associated with future outcomes , it predicts subsequent levels of treatment as well as future confounders, and it is affected by an earlier treatment and confounders (Vansteelandt & Joffe, 2014). In this paper, for the purpose of explanation, we assume a single confounder that is measured concurrently with the outcome at each time point and is measured before the treatment/predictor level is determined for each person. Thus, we presume that the variables are ordered as .2
Data-Generating Process 1: Time-Invariant Factors Have Both Direct and Indirect Effects on Measurements
Gische et al. (2021) introduced the DAG-based approach to causal inference, explaining how (SEM-based) statistical models can identify the causal models. Figure 1a is a DAG that expresses linear causal relations among variables in ; this is similar to the one presented by Gische et al. (2021) but with time-varying observed confounders L now included. Each solid single-headed arrow represents a direct causal relation, and a dashed double-headed arrow indicates the existence of an unobserved confounder. Dashed circles are used to express latent variables. To keep the illustration simple, here we assume (i) first-order (linear) lagged effects of variables and (ii) that mechanisms that are not directly targeted by treatment are not altered (modularity). More importantly, for the purpose of illustration, we temporarily assume that the process does not start prior to the initial measurement (), indicating that initial measurements are the beginning of the process. Thus, time-invariant factors as random intercepts do not have direct causal effects on the dynamics among variables that might be going on prior to the initial measurement. We will revisit this issue later.
Fig. 1.
The linear causal diagrams (DAGs) for two different data-generating processes in which time-invariant factors are included. Solid single-headed arrows (directed edges) are labeled with path coefficients that quantify direct causal effects. A dashed double-headed arrow (bidirected edge) represents a correlation due to an unobserved common cause. Time-invariant factors are represented in dashed circles, indicating that these are latent variables. For explanatory purposes, it is temporarily assumed that the process does not start prior to the initial measurement. a Time-invariant factors (: called accumulating factors) have both direct and indirect effects on measurements. b Time-invariant factors (I: called stable trait factors) have only direct effects on measurements.
In this linear causal DAG model, we suppose that time-invariant factors are additive to express person-specific differences in the mean levels of the respective variables (Y, A, and L), which do not change over time. Without loss of generality, we assume that these factors have zero means (). If we are interested in the longitudinal change of sleep time in adolescents, as will be investigated in the later empirical example, then reflects all time-invariant factors that might affect the level of sleep time in an adolescent during the course of study (e.g., sex, year of birth, constitution, genetic endowment, health, exercise habits, home environment including discipline, engagement in club/extracurricular activities in school). The values of coefficients corresponding to the paths from time-invariant factors to measurements are restricted to be equal to one. These restrictions (a) assign a scale to the latent random intercept and (b) reflect the assumption that the structural coefficients from the random intercepts to measurements do not change over time (Gische et al., 2021). The bidirected dashed edges between time-invariant factors indicate that they might show covarying relations due to unobserved confounding. Likewise, the bidirected dashed edge between initial measurements and indicates that they might show covarying relations due to unobserved confounding.
Under this linear causal DAG model, the DGP can be represented by the following set of linear equations ():
| 1 |
Here, , , and are (fixed) intercepts at time and are omitted in the DAG representation. Residual terms are denoted by d and are assumed to be uncorrelated with time-invariant factors; they are also usually omitted in the DAG representation. It is also assumed that there is no unobserved common cause among these residuals (i.e., concurrent residuals are mutually uncorrelated). Note that the equations assume homogeneity: the coefficients , and are fixed and constant across persons.
As suggested by Gische et al. (2021), a statistical model that captures the DAG depicted in Fig. 1a (i.e., Eq. 1) can be globally identified: all parameters can theoretically be estimated uniquely from observational data. Assuming sufficient sample size, correct model specification, and no excess multivariate kurtosis, the SEM-based maximum likelihood (ML) method provides estimates that are asymptotically unbiased, efficient, and consistent (Bollen, 1989). More details about causal identification and estimation in linearly parameterized causal DAG models are provided by Gische and Voelkle (in press).
A notable feature of this DGP is that time-invariant factors have both direct and indirect effects on measurements. For example, has a direct effect on (i.e., ), while is also caused by , which is again caused by (i.e., ). In addition, other time-invariant factors and also have indirect effects on (e.g., ). These indirect effects result from the fact that time-invariant factors are modeled with lagged regressions in Fig. 1a (or Eq. 1), rather than being modeled separately.
Usami et al. (2019a) compared several existing statistical models to examine reciprocal relations among variables, emphasizing that whether or not common factors are modeled with lagged regression makes substantial differences in the conceptual and mathematical roles of common factors. For example, the common factors included in the latent change score model (LCS; McArdle & Hamagami, 2001), the autoregressive latent trajectory model (ALT; Bollen & Curran, 2004), and the general cross-lagged panel model (GCLM; Zyphur et al., 2020a, 2020b), as well as the individual-specific effects that are often included in longitudinal panel models in econometrics (e.g., the dynamic panel model), are commonly modeled with lagged regressions, reflecting that they have both direct and indirect effects on measurements. This type of common factor is called an accumulating factor (Usami, 2021; Usami et al., 2019a) because its effects accumulate in measurements at later time points through the lagged regression. However, in the RI-CLPM (that includes time-varying observed confounders), which researchers are increasingly using to uncover within-person relations, common factors (i.e., stable trait factors) are not modeled with lagged regression, indicating that this statistical model cannot identify parameters in the causal DAG model as depicted in Fig. 1a (i.e., Eq. 1).
Data-Generating Process 2: Time-Invariant Factors Have Only Direct Effects on Measurements
To clarify this point, let us consider a different (linear and first-order) DGP in which time-invariant factors are included but have only direct effects on measurements. In Fig. 1b, directed edges from time-invariant factors I are drawn to the corresponding measurements. Also, directed edges from time-varying factors, which are expressed by writing the variable name with an asterisk (e.g., ), are drawn to the corresponding measurements. Directed edges are assumed between these time-varying factors, rather than between measurements as in Fig. 1a. Time-varying factors are also assumed to be uncorrelated with time-invariant factors. As a result, time-invariant factors I have only direct effects on measurements, and under the linearity assumption each measurement can be decomposed into the linear sum of time-invariant and time-varying factors that are mutually uncorrelated.
The values of coefficients corresponding to the paths from time-varying factors to measurements are all restricted to be one, and we assume that these time-varying factors have zero means. Under this linear causal DAG model, the DGP can be represented by the following linear equations (with the assumption of homogeneity of coefficients among persons) that have two major parts:
| 2 |
for , and
| 3 |
for . , , and are the temporal group means (rather than fixed intercepts) at time point and are omitted in the DAG representation. The residual terms d are assumed to be uncorrelated with both time-invariant and time-varying factors and are also omitted in the DAG representation. As Eq. (2) suggests, under these specifications the time-varying factors , , and represent temporal deviations from the expected score for person i at time point (i.e., , , and ), whereas time-invariant factors represent stable between-person differences over time. The time series , , and can thus be interpreted as within-person variations that are uncorrelated with the time-invariant factors, which represent stable between-person differences.
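For reference, the measurement part of this DGP can be sketched in symbols as follows. This is a reading of the verbal description above (unit loadings, temporal group means, zero-mean time-varying factors that are uncorrelated with the time-invariant factors), not a reproduction of the original display. The symbols $\mu$, $I$, the starred variables, and the time index $t$ are notation assumed for this sketch.

```latex
\begin{align*}
Y_{it} &= \mu^{(Y)}_{t} + I^{(Y)}_{i} + Y^{*}_{it}, \\
A_{it} &= \mu^{(A)}_{t} + I^{(A)}_{i} + A^{*}_{it}, \\
L_{it} &= \mu^{(L)}_{t} + I^{(L)}_{i} + L^{*}_{it},
\end{align*}
% Assumed side conditions, taken from the verbal description:
% E(Y^{*}_{it}) = E(A^{*}_{it}) = E(L^{*}_{it}) = 0, and each starred
% variable is uncorrelated with I^{(Y)}_i, I^{(A)}_i, and I^{(L)}_i.
```

Under this reading, the lagged relations link only the starred quantities to one another, so the time-invariant factors reach each measurement through a single direct path.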
In psychology, traits were originally considered as personality characteristics that are stable over time and in different situations. To express such latent constructs, common factors are explicitly included in psychometric models. In the context of the RI-CLPM, such common factors are called stable trait factors, and they have the same role as that of time-invariant factors I in the linear causal DAG model as depicted in Fig. 1b (i.e., Eq. 2). In the RI-CLPM, the initial deviations are modeled as exogenous variables, and their variances and covariances are estimated. Residuals in this statistical model are usually assumed to follow a multivariate normal distribution. Although the original motivation for the RI-CLPM was to infer reciprocal (rather than unidirectional) relations and the model does not usually assume time-varying observed confounders L and higher-order lagged effects of variables, it can be extended in a straightforward manner to investigate (joint) effects of continuous treatments/predictors A on outcomes Y, while including L in linear regressions. Therefore, such an extended version of the RI-CLPM as a statistical model can identify the causal parameters if the assumed DGP as in Fig. 1b (i.e., Eqs. 2 and 3) is correct and if (i.e., three or more time points; Usami et al., 2019a).
Usami et al. (2019a) explained that the conceptual and mathematical roles of common factors differ according to whether or not they are modeled with lagged regression in the statistical model. More specifically, in models that include accumulating factors, their influences on measurements at time (e.g., ) transmit to the future measurements (e.g., ) through the lagged regression, which is also influenced by the same accumulating factors (e.g., ); as a result, the magnitudes of impacts from these factors change over time. In contrast, in models that include stable trait factors (e.g., the RI-CLPM), their impacts are stable over time because they have only direct effects. In this way, the conceptual meaning and inferential results for (within-person) relations among variables being modeled differ in each statistical model according to whether researchers assume the inclusion of accumulating factors or stable trait factors; see Usami et al. (2019a, pp. 643–644) for a more detailed comparison.
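The contrast can be made concrete with a small path-tracing calculation. The sketch below assumes a univariate simplification that is not in the original: a single outcome with one first-order autoregressive coefficient `beta` (a hypothetical value), no treatments or confounders, and a process that starts at the initial measurement. Under that assumption, the total effect of an accumulating factor on the measurement at occasion t is 1 + beta + ... + beta^t, whereas the effect of a stable trait factor stays at its unit loading.

```python
def accumulating_total_effects(beta, n_occasions):
    """Total (direct + indirect) effect of an accumulating factor on each measurement.

    Assumed model (illustrative only): Y_0 = eta + d_0 and
    Y_t = eta + beta * Y_{t-1} + d_t for t >= 1.
    """
    effects = []
    total = 0.0
    for _ in range(n_occasions):
        # Unit direct effect plus the previous total effect carried
        # forward through the lagged regression.
        total = 1.0 + beta * total
        effects.append(total)
    return effects


def stable_trait_total_effects(n_occasions):
    """Effect of a stable trait factor: only the unit loading at every occasion."""
    return [1.0] * n_occasions


beta = 0.5  # hypothetical autoregressive coefficient
acc = accumulating_total_effects(beta, 4)
stable = stable_trait_total_effects(4)
print(acc)     # [1.0, 1.5, 1.75, 1.875]
print(stable)  # [1.0, 1.0, 1.0, 1.0]
```

With a positive `beta` the impact of the accumulating factor grows across occasions, while the stable trait factor contributes the same amount at every occasion.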
Implications of Comparing Different DGPs: Control for Time-Invariant Unobserved Confounders and Initial Conditions
Control for Time-Invariant Unobserved Confounders
As we have argued, the conceptual and mathematical roles differ between stable trait factors and accumulating factors. Importantly, the differences between these factors can also be characterized by whether or not they can be considered as time-invariant unobserved confounders. For example, the accumulating factors of outcomes () cause measurements () while also being associated with measurements of treatments/predictors at the previous time point (). In this sense, accumulating factors can be considered as unobserved confounders in evaluating causal effects of treatments/predictors. In contrast, the stable trait factors of outcomes () also cause measurements but are uncorrelated with within-person variations such as and . More specifically, when measurements are not conditioned on, does not confound the relations among within-person variations (e.g., the path from to ) because the path from to within-person variations is blocked by the measurements , which act as colliders. Therefore, stable trait factors cannot be viewed as time-invariant unobserved confounders; rather, they should be characterized as merely random intercepts that are uncorrelated with predictors (i.e., within-person variations). This view differs from that of Usami et al. (2019a), who described stable trait factors as time-invariant unobserved confounders.
If the assumed (linear and first-order) DGP as in Fig. 1b is correct and if all variables are observable, then controlling for only and is sufficient to evaluate the within-person relation between and . One could argue that controlling for stable trait factors is not required to identify causal parameters for treatment effects at the within-person level, the reason being that within-person processes (time-varying factors) and between-person differences (stable trait factors as time-invariant factors) are mutually uncorrelated. However, because all these factors are actually latent variables and unobservable, we need to use measurements (as colliders) to infer treatment effects at the within-person level, and appropriate control of stable trait factors as latent variables is required in the statistical model. If the assumed DGP as in Fig. 1b is correct, then not controlling for stable trait factors causes biased estimates of causal parameters for the within-person relation (e.g., Usami et al., 2019b and the later simulations).
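A small simulation can illustrate why the measurements alone are not enough. The sketch below is not the simulation reported later in the paper. It assumes a stripped-down version of the second DGP with hypothetical values: two occasions, no separate confounder other than the lagged outcome, a true within-person treatment effect of 0.3, and stable trait factors of the treatment and outcome that covary at 0.5. It compares an "oracle" regression on the latent within-person variations (unavailable in practice) with a naive regression on the raw measurements.

```python
import random


def ols_two_predictors(y, x1, x2):
    """Slopes from regressing y on x1 and x2 (with intercept), using centered moments."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(2024)   # fixed seed so the sketch is reproducible
TRUE_EFFECT = 0.3           # hypothetical within-person effect of A*_0 on Y*_1
n = 50_000

obs_y0, obs_a0, obs_y1 = [], [], []   # observed measurements
lat_y0, lat_a0, lat_y1 = [], [], []   # latent within-person variations

for _ in range(n):
    # Stable trait factors: unit variances, covariance 0.5 (assumed values).
    i_a = rng.gauss(0, 1)
    i_y = 0.5 * i_a + (0.75 ** 0.5) * rng.gauss(0, 1)
    # Within-person variations, generated independently of the stable traits.
    y0 = rng.gauss(0, 1)
    a0 = rng.gauss(0, 1)
    y1 = 0.5 * y0 + TRUE_EFFECT * a0 + rng.gauss(0, 0.5)
    lat_y0.append(y0); lat_a0.append(a0); lat_y1.append(y1)
    # Measurements are the sum of stable trait and within-person variation.
    obs_y0.append(i_y + y0); obs_a0.append(i_a + a0); obs_y1.append(i_y + y1)

_, oracle_effect = ols_two_predictors(lat_y1, lat_y0, lat_a0)
_, naive_effect = ols_two_predictors(obs_y1, obs_y0, obs_a0)
print(f"oracle (latent within-person variations): {oracle_effect:.3f}")
print(f"naive (raw measurements):                 {naive_effect:.3f}")
```

Under these assumed values the naive coefficient converges to 0.85/3.75, or about 0.23, rather than 0.3, because the raw measurements mix the stable traits into both the predictors and the outcome. The size and sign of the bias depend on the assumed variances and covariances.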
Initial Conditions
How to treat the initial measurements (i.e., , , and ) is also important for distinguishing between the two DGPs in Fig. 1. Special attention needs to be paid to the initial measurements because there are no incoming directed edges to these variables from variables prior to the initial time point (). Although we have assumed so far that the DGP does not start prior to the initial measurements, this assumption is not realistic in many applications, and the initial measurements must somehow account for the past of the process that is not explicitly modeled (see Fig. S1 in the Online Supplemental Material for more details). Assuming a DGP similar to that in Fig. 1a, Gische et al. (2021, Fig. 5) provided a straightforward and interpretable approach that freely estimates the coefficients (loading) from to all initial measurements. For example, for , the coefficients from this factor to , , and are freely specified rather than being fixed to either one or zero. In applying dynamic panel models in econometrics, one usually assumes that individual-specific components (i.e., accumulating factors) are correlated with the initial measurements to account for the past of the process.
Importantly, if the second DGP (Fig. 1b) is correct, then such special considerations are not required. This is because time-invariant factors I have only direct effects on measurements (rather than on temporal deviations as within-person variations) and no directed edges are assumed between observed variables. In other words, past time-varying factors (, , ) as variations in within-person processes cause observed variables separately from I as stable between-person differences (see also Fig. S1 in the Online Supplemental Material). Therefore, if the assumed DGP as depicted in Fig. 1b is correct, then the RI-CLPM, which includes time-varying observed confounders and assumes that initial variables at the within-person level (, ) are exogenous (and are mutually correlated) and that loadings from I to the corresponding initial measurements (i.e., , and ) are all set to one, can identify causal parameters for treatment effects, even if the DGP actually starts prior to the initial measurements.
Summary and Discussion
The critical difference between the two different DGPs in Fig. 1 is whether the assumed time-invariant factors have only (stable) direct effects (i.e., stable trait factors) or both direct and indirect effects on measurements (i.e., accumulating factors). Because of this difference, stable trait factors as merely random intercepts cannot be viewed as time-invariant unobserved confounders, while special considerations for initial measurements are not required if the assumed DGP includes only stable trait factors (i.e., Fig. 1b). Although researchers are increasingly using the RI-CLPM as a statistical model to uncover within-person relations, stable traits and (implicitly) assumed DGPs have not been fully characterized in the causal inference literature. Below, we assume a DGP that includes stable trait factors as in Fig. 1b and also assume that measurements can be decomposed into the linear sum of time-invariant factors (i.e., stable traits) and time-varying factors (i.e., within-person variability scores) that are mutually uncorrelated. The proposed method of within-person variability score-based causal inference for (joint) effects of time-varying treatments at the within-person level can be effectively applied if such a DGP can be assumed. The proposed approach is more flexible than the one that relies entirely on SEM (e.g., the RI-CLPM that includes time-varying observed confounders) in terms of the linearity assumption regarding observed confounders at the within-person level.
A causal DAG represents a researcher’s theory about the causal process and should be drawn based on subject-matter knowledge. However, in many cases, researchers do not exactly know the true DGP and how time-invariant factors (if they exist) influence measurements (e.g., linearly or nonlinearly, directly or indirectly, or both). Although it is ideal if one can unambiguously articulate the theoretically derived expected relations for variables, this can be quite challenging in practical applications (Curran & Bauer, 2011). If linear SEM-based statistical models are used, then as a data-driven approach one could compare model fit indices between two statistical models that appropriately represent the causal models of Eq. (1) and Eqs. (2) and (3) (i.e., the RI-CLPM that includes time-varying observed confounders), and this would be useful for investigating the sensitivity of the conclusions.
In the context of applying the RI-CLPM, Lüdtke and Robitzsch (2021) argued that including stable trait factors might be better suited for short-term studies that typically use shorter time lags between time points. In short-term studies, one might be more certain that there are no indirect effects from time-invariant factors (i.e., only stable trait factors exist). Even if a researcher is certain that stable trait factors exist, they might not be additive and/or might be correlated with within-person variability scores if they share common causes (e.g., genotype; see also McNeish and Kelly Muthén and Asparouhov (2019), who discussed the issue of endogeneity in applying mixed-effects models). This indicates that the model fails to perfectly disentangle the within- and between-person relations. Also, if time-invariant unobserved confounders (rather than random intercepts that merely represent between-person differences as stable trait factors) are likely to be present, then other statistical approaches that account for such confounders might be more suitable. However, in our opinion there are no clear criteria that delineate when and how to include (time-invariant) factors in the assumed DGP, and continued discussion that also considers empirical investigations of each research hypothesis and sensitivity of results (e.g., the later simulations) will be required in the future.
Formal Definitions of Stable Trait Factors, Within-Person Variability Scores and Joint Effects of Time-Varying Treatments
Definitions of Stable Trait Factors and Within-Person Variability Scores
The terms “(stable) traits” and “within-person relations” have been used vaguely and ambiguously in statistical models, despite the existence of mathematical and interpretative differences among models (e.g., Usami et al., 2019a). Inspired by the discussion so far, we provide the formal definitions of these below.
A stable trait factor of person i (say, for Y) is defined in this paper as (i) the time-invariant factor that has additive influence on measurements, and (ii) its quantity is equal to the difference between the expected value of measurement (i.e., true score) of this person at time point (expressed as ) and the temporal group mean (), which is invariant over time:
| 4 |
for , , and . Note that .
Next, the within-person variability score is defined as (i) a time-varying factor that has additive influence on measurements, and (ii) its quantity is equal to the difference between a measurement and its expected value:
| 5 |
with the assumptions of and independence between and . From this formulation, stable trait factors and within-person variability scores are uncorrelated because
| 6 |
Thus, variances of measurements at time point can be expressed as the sum of those of stable trait factor scores and within-person variability scores. This means that the time series for within-person variability scores have the following covariance structure:
| 7 |
In this paper, we use the terms within-person relation and between-person relation to describe the relations between variables that are based on within-person variability scores and stable trait factor scores, respectively.
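The additive decomposition defined above can be checked with a small simulation. The following is only a minimal sketch under assumed values: a stable trait variance of 5, a within-person variance of 10, temporal group means of zero, and within-person scores that are independent across two time points. The names PHI2, WITHIN_VAR, and cov are labels chosen here for illustration, not notation from this paper.

```python
import random

random.seed(2022)
N, PHI2, WITHIN_VAR = 20000, 5.0, 10.0  # assumed sample size and variances

# Stable trait: one draw per person, constant over time.
trait = [random.gauss(0.0, PHI2 ** 0.5) for _ in range(N)]
# Within-person variability scores: drawn independently of the trait
# (and, for simplicity only, independently across the two time points).
within = [[random.gauss(0.0, WITHIN_VAR ** 0.5) for _ in range(2)] for _ in range(N)]
# Measurement = temporal group mean (0 here) + stable trait + within-person score.
y = [[trait[i] + within[i][t] for t in range(2)] for i in range(N)]

def cov(x, z):
    """Unbiased sample covariance of two equally long lists."""
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    return sum((a - mx) * (b - mz) for a, b in zip(x, z)) / (len(x) - 1)

y1 = [row[0] for row in y]
y2 = [row[1] for row in y]
var_y1 = cov(y1, y1)      # close to PHI2 + WITHIN_VAR: the variances add
cov_y1_y2 = cov(y1, y2)   # close to PHI2: only the trait is shared over time
cov_trait_within = cov(trait, [w[0] for w in within])  # close to 0
```

Under these assumptions, subtracting the stable trait variance from every element of the covariance matrix of measurements leaves the covariance structure of the within-person variability scores.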
Definition of Joint Effects of Time-Varying Treatments at the Within-Person Level
Next, we explain the definition of joint (causal) effects of treatments at the within-person level using the potential outcome approach. We assume a similar causal DAG model to that in Fig. 1b: (i) measurements are expressed by the linear sum of stable trait factors and within-person variability scores, and (ii) within-person variability scores are expressed by functions (with assumption of homogeneity) of those in past time. However, unlike the presentation in Sect. 2, we relax some assumptions about the within-person variability scores to allow the following: (a) higher-order lagged effects and interaction effects of treatments/predictors can exist at the within-person level, and (b) time-varying observed confounders can be nonlinearly related with outcomes and treatments/predictors at the within-person level. The current focus is on evaluating the within-person relation between variables, that is, how the (joint) intervention of treatments/predictors influences future outcomes at the within-person level.
Below, we use overbars to denote the history of through and underbars to denote the future of this variable. Let () denote the within-person variability score for the outcome that would take at time point for person i were this person to receive treatment history at the within-person level through . Here, () indicates that the amount of treatments/predictors for person i is equal to the expected score of this person at time point (i.e., ). is a potential outcome, which we connect to the within-person variability score by the consistency assumption
| 8 |
if ; otherwise, is counterfactual. Note that is a latent variable and unobservable, while potential outcomes for measurements (i.e., observed variables) are assumed in the standard potential outcome approach.
In the potential outcome approach, a causal effect refers to a contrast between potential outcomes under different treatment values. Therefore, for each causal effect, we can imagine a (hypothetical) randomized experiment to quantify it (i.e., a target trial; Hernán & Robins, 2021). For example, the (average) causal effect on when a continuous treatment/predictor increases by one unit from the reference value at time can be expressed as
| 9 |
The standard assumption of no unobserved confounders or sequential ignorability indicates that
| 10 |
Here, is the counterfactual history, that is, the history that agrees with through time and is zero thereafter. Along with the assumed causal DAG above as well as consistency and sequential ignorability, we impose the stable unit treatment value assumption (SUTVA; no unmodeled spillovers, e.g., Hong, 2015) and assumptions of positivity (i.e., the probability of receiving each level of treatment conditional on past confounders and treatments is greater than zero) and modularity. Under these assumptions, the average causal effect in Eq. (9) can be expressed using the difference in conditional means given information on confounders and treatment history as
| 11 |
In other words, the causal effect of treatment at the within-person level can be evaluated by the difference in conditional means of between persons who receive (i.e., treatment levels that are larger than their expected scores ) and who receive , given information on confounders and treatment history.
Similarly, the average joint (causal) effects of a sequence of treatments/predictors on when they increase one unit from the reference values can be expressed as
| 12 |
As a simple example, suppose and that the DGP can be represented by linear and first-order models as in Eqs. (2) and (3) (assuming homogeneity and no interaction effects of treatments/predictors). Then, a conditional mean at can be expressed as the linear (weighted) sum of the terms and :
| 13 |
From this result, joint (causal) effects of treatments and when increasing one unit from the reference values and become Note that (the effect of intervention on ) can also be evaluated by tracing the two paths () and () that start at and end at in Fig. 1b.3 Likewise, at can be expressed as
| 14 |
thus the causal effect of treatment when increasing one unit from the reference values at the within-person level becomes , which is equivalent to the so-called cross-lagged parameter in Eq. (3).
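The path-tracing argument in this linear, first-order example can be sketched numerically. The code below is an illustration only: the coefficient values in COEF are hypothetical, and the assumed timing (treatment, confounder, and outcome at one wave each affect all three variables at the next wave) is one reading of the first-order model, not a statement of the exact DAG in Fig. 1b. Potential outcomes are computed without noise, so contrasts between them give the effects directly.

```python
# Hypothetical first-order coefficients: COEF[target][source] is the effect of
# the source variable at one wave on the target variable at the next wave.
COEF = {
    "Y": {"Y": 0.4, "A": 0.3, "L": 0.2},
    "A": {"Y": 0.1, "A": 0.5, "L": 0.3},
    "L": {"Y": 0.2, "A": 0.25, "L": 0.4},
}

def potential_outcome(a0, a1, y0=0.0, l0=0.0):
    """Noise-free within-person outcome at wave 2 when treatments are set to (a0, a1)."""
    y1 = COEF["Y"]["Y"] * y0 + COEF["Y"]["A"] * a0 + COEF["Y"]["L"] * l0
    l1 = COEF["L"]["Y"] * y0 + COEF["L"]["A"] * a0 + COEF["L"]["L"] * l0
    # a1 is fixed by intervention, so the equation for A is not used at wave 1.
    return COEF["Y"]["Y"] * y1 + COEF["Y"]["A"] * a1 + COEF["Y"]["L"] * l1

reference = potential_outcome(0.0, 0.0)  # both treatments at the person's expected level
effect_a0 = potential_outcome(1.0, 0.0) - reference  # via outcome path and confounder path
effect_a1 = potential_outcome(0.0, 1.0) - reference  # the cross-lagged coefficient
joint_effect = potential_outcome(1.0, 1.0) - reference
```

With these assumed coefficients, the effect of the earlier treatment equals the sum of the products along the two paths (0.3 × 0.4 through the intermediate outcome and 0.25 × 0.2 through the intermediate confounder), while the effect of the later treatment equals the cross-lagged coefficient alone.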
From Eqs. (5) and (6) (i.e., stable trait factors are uncorrelated with within-person variability scores), we have the relation . Because is the term that is not associated with treatments at the within-person level, it can be shown that
| 15 |
The right side of the equation can be interpreted as person-specific joint (causal) effects in the sense that it accounts for stable traits of persons (). Therefore, joint (causal) effects of on (i.e., Eq. 12; at the within-person level) can be interpreted as person-specific joint (causal) effects on Y under the assumed linear causal DAG such as in Fig. 1b.
Identification Conditions for Causal Parameters
So far, we have assumed a causal DAG model that is similar to that shown in Fig. 1b. In the proposed method, the assumptions for identifying parameters for joint (causal) effects can now be summarized as (i) measurements () are expressed by the linear sum of stable trait factors and within-person variability scores that are mutually uncorrelated, (ii) within-person variability scores are expressed by functions (with assumption of homogeneity) of those in past time, (iii) consistency, (iv) sequential ignorability, (v) SUTVA, (vi) positivity, (vii) modularity, and (viii) multivariate normality (if MLE is used in the first step). The proposed method can be used effectively in applications in which these assumptions are met.
Regarding the second assumption, if and the DGP can be represented by linear (and first-order) equations such as in Eqs. (2) and (3) (assuming homogeneity and no interaction effects of treatments/predictors), then in this special case, the RI-CLPM (that includes time-varying observed confounders) as a statistical model can identify parameters for joint (causal) effects. However, the linearity assumption that is typically imposed for time-varying observed confounders and outcomes (and treatments/predictors) in path modeling and SEM (including the RI-CLPM) has often been criticized in the causal inference literature (e.g., Hong, 2015), and relaxing this assumption is often key to consistently estimating the causal quantity of interest (e.g., Imai & Kim, 2019). In addition, ensuring a correct specification in terms of the linearity is very challenging in that many equations must be diagnosed in longitudinal designs.
As we will see, the proposed method still requires specifications of the structure for within-person variability scores over time in each variable (Y, A, and L in the first step) as well as parametric models for treatments and outcomes at the within-person level (in the second step). However, the assumption of linearity is not required for these parametric models in MSMs and SNMs, and in MSMs one does not need to model the relation between outcomes and time-varying observed confounders (at the within-person level) because it is the means of potential outcomes that are marginalized over these confounders that are of concern. Notably, SNMs with G-estimation have the property of being doubly robust to model misspecifications in how time-varying observed confounders are functionally related to treatments and outcomes (at the within-person level).
Proposed Methodology
We are now ready to introduce a method of within-person variability score-based causal inference for estimating joint effects of time-varying continuous treatments, assuming that the above conditions for identification are satisfied. The proposed method consists of a two-step analysis. First, within-person variability scores are calculated using weights through SEM that models only the measurement parts that include stable trait factors. Then, causal parameters are estimated by MSMs or SNMs, using the scores calculated in the first step. This approach is more flexible than the one that relies entirely on SEM (e.g., the RI-CLPM that includes time-varying observed confounders) in terms of modeling how time-varying observed confounders are functionally related to treatments/predictors and outcomes at the within-person level, without imposing the linearity assumption in these relations. Before explaining the proposed methodology, we briefly discuss the motivation for adopting a two-step method, rather than simultaneously estimating stable trait factors (or within-person variations) and causal parameters.
In general, partial misspecification in measurements and/or structural models is known to cause large biases in estimates of model parameters. In the present context, when a simultaneous estimation procedure such as the RI-CLPM is used, misspecification in the structural models at the within-person level may greatly affect parameter estimates in the measurement model ((co)variances of stable factors and within-person variability scores), and vice versa.
To avoid such confounding in interpreting the estimation results, in the SEM context Anderson and Gerbing (1988) proposed a two-step procedure that first confirms the measurement model with a saturated model, so that structural relations have no impact on the measurement model. Then, using an appropriate measurement model, the substantive structural relations model of interest is added (Hoshino & Bentler, 2013). Applications of similar multistep estimation procedures can be seen for diverse classes of latent variable models (Bakk & Kuha, 2017; Croon, 2002; Skrondal & Laake, 2001; Vermunt, 2010).
Another potential advantage of two-step estimation is its feasibility. MSMs and SNMs usually do not assume common factors, and the optimization procedure for these models is different from that in SEM. For this reason, fully customized programming is required if performing simultaneous estimation. However, in two-step estimation, parameters in measurement models can be estimated in the first step through various software packages for SEM, including Amos, SAS PROC CALIS, R packages (sem, lavaan, OpenMx), LISREL, EQS, and Mplus. MSMs and SNMs can be straightforwardly applied just by using calculated within-person variability scores instead of measurements.
Two-step estimation is also advantageous because it poses less risk of improper solutions. This problem is encountered relatively often when applying the RI-CLPM because of negative variance parameters and a singular approximate Hessian matrix for stable trait factor variance–covariance (e.g., Usami et al., 2019a), which is likely caused by misspecifications in linear regressions (i.e., the structural model). We will separately estimate stable trait factors for each variable (Y, A, and L) without influence from specified structural models, thus minimizing the risk of improper solutions.
Step 1: Estimation of Stable Trait Factors and Prediction of Within-Person Variability Scores
The first step of our method is divided into two sub-steps: (i) specification of the measurement models and parameter estimation and (ii) prediction of within-person variability scores.
Specification of the Measurement Models and Parameter Estimation
As stated earlier, we assume that measurements are expressed by the linear sum of stable trait factors and within-person variability scores that are mutually uncorrelated, as in Eq. (2). This equation can be viewed as a factor analysis model that includes a single common factor I (whose factor loadings are all one)4 and a unique factor as temporal deviations. In vector notation, the causal model of Eq. (2) for outcome Y becomes
| 16 |
where is a mean vector, , , , and . We denote as a variance–covariance matrix of within-person variability scores. This implies that the variance–covariance matrix of Y (denoted as ) is of the form .
Unlike the standard factor analysis model, has a dependence structure and is not diagonal. Therefore, in using SEM to estimate the parameters in Eq. (16), some structure—such as compound symmetry, a Toeplitz structure, or a (first-order) autoregressive (AR) structure—must be specified in for model identification. When the model is correctly specified, consistent estimators for , , and can be obtained by MLE in SEM (Jöreskog & Lawley, 1968).
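The covariance structures named above, and the way they enter the model-implied covariance matrix of measurements, can be sketched as follows. This is a hedged illustration: the function names are chosen here, the AR(1) builder assumes a stationary process with one variance and one autocorrelation (the simulation described later uses time-varying parameters instead), and all numeric values are hypothetical.

```python
def ar1(k, var, rho):
    """Stationary first-order autoregressive structure: var * rho^|lag|."""
    return [[var * rho ** abs(i - j) for j in range(k)] for i in range(k)]

def compound_symmetry(k, var, cov):
    """Equal variances on the diagonal and one common covariance elsewhere."""
    return [[var if i == j else cov for j in range(k)] for i in range(k)]

def toeplitz(first_row):
    """One free parameter per lag; first_row[lag] is the covariance at that lag."""
    k = len(first_row)
    return [[first_row[abs(i - j)] for j in range(k)] for i in range(k)]

def implied_cov(phi2, sigma_star):
    """Model-implied covariance of measurements: the stable trait variance is
    added to every element of the within-person covariance matrix."""
    return [[phi2 + s for s in row] for row in sigma_star]

sigma_star = ar1(3, var=10.0, rho=0.5)            # assumed within-person structure
sigma = implied_cov(phi2=5.0, sigma_star=sigma_star)
same_as_toeplitz = toeplitz([10.0, 5.0, 2.5]) == sigma_star
```

A stationary AR(1) matrix is a special case of the Toeplitz structure, which is why the last comparison holds for these values.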
In SEM, missing values can be easily handled by full information maximum likelihood (Enders & Bandalos, 2001) with the assumption of missing at random (MAR; Rubin, 1976). If data are suspected to be missing not at random (MNAR), then appropriate sensitivity analyses and/or multiple imputation should be considered (Resseguier et al., 2011). Models that account for MNAR can be easily estimated in popular software packages for SEM (see Enders, 2011; Newsom, 2015).
Another advantage of SEM is that validity of the specified model can be diagnosed via multiple model fit indices, along with model comparisons using information criteria. In this paper, we use three current major indices (e.g., Hu & Bentler, 1999; Kline, 2016): (a) the comparative fit index (CFI), (b) the root-mean-square error of approximation (RMSEA), and (c) the standardized root-mean-square residual (SRMR).
Similarly, we also set measurement models for treatments/predictors A and observed confounders L separately in this sub-step, then estimate parameters for mean vectors ( and ), stable trait factor variances ( and ), and variance–covariance matrices of within-person variability scores and .
Predicting Within-Person Variability Scores
Let and be vectors of measurements and within-person variability scores, respectively, and let be a mean vector. Also let and be covariance matrices for measurements and within-person variability scores .
We consider linear prediction of within-person variability scores under the condition that and are known. Consider a weight matrix W that provides within-person variability scores from measurements as
| 17 |
satisfying the relation
| 18 |
Unlike standard applications of factor analysis, we are interested in predicting within-person variability (unique factor) scores, rather than stable trait factor (common factor) scores. However, the problem of determining the weights W shares its motivation with the prediction of factor scores. In the factor analysis literature, a predictor that preserves the covariance structure of common factors has been developed as a linear correlation preserving predictor (Anderson & Rubin, 1956; Green, 1969; ten Berge et al., 1999).
With this point in mind, the W that provides the best linear predictor of by minimizing the risk function, defined as the trace of the residual covariance matrix (i.e., mean squared error MSE()=), subject to the relation in Eq. (18), can be obtained through singular value decomposition as
| 19 |
Here, for a positive (semi)definite matrix C, we denote as the positive (semi)definite matrix such that its square equals C. Matrices and are the inverse (if it exists) and the third power of , respectively. A derivation of W is provided in the Online Supplemental Material.
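A numerical check of the covariance-preserving property may help make the role of W concrete. The sketch below rests on an assumption: it takes the weight matrix to be W = Σ^{-1} Σ*^{3/2} (Σ*^{3/2} Σ^{-1} Σ*^{3/2})^{-1/2} Σ*^{1/2}, where Σ and Σ* are labels used here for the covariance matrices of measurements and of within-person variability scores. This form was reconstructed here to match the verbal description (an inverse and a third power of a matrix square root); the derivation in the Online Supplemental Material remains the authoritative source. Scores are then predicted as W'(x - mean). All numeric values and function names are hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product for small nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def eigh_jacobi(s, sweeps=100):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix,
    computed with cyclic Jacobi rotations."""
    n = len(s)
    a = [row[:] for row in s]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        if sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n)) < 1e-24:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                rot = [[float(i == j) for j in range(n)] for i in range(n)]
                rot[p][p] = rot[q][q] = math.cos(theta)
                rot[p][q], rot[q][p] = math.sin(theta), -math.sin(theta)
                a = matmul(transpose(rot), matmul(a, rot))  # zeroes the (p, q) entry
                v = matmul(v, rot)
    return [a[i][i] for i in range(n)], v

def sym_power(s, power):
    """Power of a symmetric positive definite matrix via its eigendecomposition."""
    vals, v = eigh_jacobi(s)
    n = len(s)
    scaled = [[v[i][j] * vals[j] ** power for j in range(n)] for i in range(n)]
    return matmul(scaled, transpose(v))

def cp_weights(sigma, sigma_star):
    """Assumed covariance-preserving weight matrix (see the lead-in text)."""
    s_inv = sym_power(sigma, -1.0)
    s_half = sym_power(sigma_star, 0.5)
    s_three_half = sym_power(sigma_star, 1.5)
    m = matmul(s_three_half, matmul(s_inv, s_three_half))
    n = len(m)
    m = [[0.5 * (m[i][j] + m[j][i]) for j in range(n)] for i in range(n)]  # symmetrize
    return matmul(s_inv, matmul(s_three_half, matmul(sym_power(m, -0.5), s_half)))

# Hypothetical inputs: AR(1)-like within-person covariance plus a trait variance of 5.
sigma_star = [[10.0, 5.0, 2.5], [5.0, 10.0, 5.0], [2.5, 5.0, 10.0]]
sigma = [[5.0 + s for s in row] for row in sigma_star]

w = cp_weights(sigma, sigma_star)
recovered = matmul(transpose(w), matmul(sigma, w))  # should reproduce sigma_star

# Predicted within-person scores for one hypothetical person with mean vector 0.
x, mean = [4.0, 1.0, -2.0], [0.0, 0.0, 0.0]
scores = matmul([[xi - mi for xi, mi in zip(x, mean)]], w)[0]
```

The check on recovered confirms only that the predicted scores reproduce the target covariance structure; it does not by itself demonstrate that the mean squared error is minimized.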
We use the sample means and covariance matrix S of X as estimators of and . As implied from the relation in Eq. (7), we use estimated stable trait factor variances to estimate as
| 20 |
where consists of estimated stable trait factor (co)variances. In the simple case where the initial measurement of Y () is missing and the number of measurements equals K for each variable, becomes
| 21 |
where is an estimator of a stable trait factor covariance matrix . Because stable trait factor covariances are not estimated in the previous sub-step, we use covariances between calculated linear correlation preserving predictors for variables. For example, this predictor for Y can be expressed as
| 22 |
and can be calculated in the same manner, whereby we obtain , , and . Predictors satisfy the relation if the model is correctly specified in the previous sub-step. From Eqs. (17) and (19)–(21), we can thus obtain without specifying the structural models that connect within-person variability scores from different variables (, , and ), successfully maintaining independence from the next step.
Step 2: Applying MSMs and SNMMs
The second step of the proposed method is straightforward, because we just need to apply MSMs or SNMs using calculated within-person variability scores. Robins and co-workers developed SNMs with G-estimation (Robins, 1989; Robins et al., 1992) and MSMs with an inverse probability weight (IPW) estimator (Robins, 1999; Robins et al., 2000). These methods have been extended to treat clustered outcomes (e.g., Brumback et al., 2014; He et al., 2015, 2019). However, (joint) causal effects under the control of stable trait factors have not been investigated in this area because inference for stable traits and within-person relations has been an issue in the psychometric and behavioral science literature, and these concepts have yet to be fully characterized in the causal inference literature.
MSMs are advantageous in that they can be easily understood and fit with standard, off-the-shelf software that allows for weights (e.g., He et al., 2019; Vansteelandt & Joffe, 2014). However, it is well known that MSMs can be highly sensitive to misspecification of the treatment assignment model, even when there is a moderate number of time points (e.g., Hong, 2015; Lefebvre et al., 2008). Imai and Ratkovic (2015) proposed a covariate balancing propensity score methodology for robust IPW estimation.
Because of the attractive property of being doubly robust in G-estimators, SNMs are a better approach for handling violation of the usual assumptions of no unmeasured confounders or sequential ignorability (Vansteelandt & Joffe, 2014). In addition, SNMs can allow direct modeling of the interactions and moderation effects of treatments/predictors A with observed confounders L. Another advantage of SNMs is that the variance of locally efficient IPW estimators in MSMs exceeds that of G-estimators in SNMs, unless A and L are independent. We therefore emphasize the utility of SNMs in this paper. Because we are now interested in evaluating the joint effects of treatments on the mean of an outcome, rather than those on the entire distribution of the outcome, we apply structural nested mean models (SNMMs; Robins, 1994).
Note that potential disadvantages of SNMs are the limited utility of G-estimation for logistic SNMs and the limited availability of off-the-shelf software. Regarding the latter point, Wallace, Moodie, and Stephens (2017) developed an R package for G-estimation of SNMMs.
MSMs Using Within-Person Variability Scores
MSMs are typically applied to evaluate the joint effects of a sequence of treatments on the outcome, which is measured only at the end of a fixed follow-up period (). For generality of discussion, as before we assume that the outcome is measured each time and that the primary interest is evaluation of effects of a sequence of past treatments on the outcome at each time point.
MSMs consider the marginal mean of potential outcomes that are marginalized over the observed confounders L. In the current context, we consider potential outcomes at the within-person level, namely, with treatment history . might take the form
| 23 |
with 5 The average joint (causal) effects of on when increasing one unit from the reference values in each treatment become . Parameters can be estimated by fitting a weighted conditional model with an IPW estimator. One useful option for calculating weights is to use stabilized weights for person i at time point (Hernán et al., 2002) as
| 24 |
where for all , if (the positivity assumption). Parameters will be biased if the treatment assignment model is misspecified, but misspecification of does not result in bias. In MSMs, unlike the RI-CLPM (that includes L), one does not need to model the relation between outcomes and time-varying observed confounders at the within-person level because marginal (joint) effects of treatments/predictors are the primary focus in applying this method. Also, one can allow a nonlinear relation between treatments/predictors and confounders in the treatment assignment model , although estimates are sensitive to this model misspecification.
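The stabilized weights described above can be sketched for a continuous treatment as follows. This is a minimal illustration that assumes normal conditional densities for the treatment at each time point, with fitted means supplied by the analyst (for example, from linear regressions on calculated within-person variability scores). The function name, argument names, and numeric values are hypothetical, and the exact conditioning sets of Eq. (24) are not reproduced here.

```python
from statistics import NormalDist

def stabilized_weight(a_hist, num_means, den_means, num_sd, den_sd):
    """Product over time points of
    f(a_k | past treatments) / f(a_k | past treatments and confounders),
    with both densities assumed normal.

    a_hist:    observed treatment values (within-person scores), one per time point
    num_means: fitted means from the numerator model (no confounders)
    den_means: fitted means from the treatment assignment model (with confounders)
    """
    weight = 1.0
    for a, m_num, m_den in zip(a_hist, num_means, den_means):
        weight *= NormalDist(m_num, num_sd).pdf(a) / NormalDist(m_den, den_sd).pdf(a)
    return weight

# If confounders carry no information about treatment, both models agree
# and the weight is exactly 1.
w_equal = stabilized_weight([0.3, -1.2], [0.0, 0.1], [0.0, 0.1], 1.0, 1.0)

# A treatment value that the confounder-based model predicts well is down-weighted.
w_example = stabilized_weight([1.0], [0.0], [1.0], 1.0, 1.0)
```

The resulting weights would then be passed to a weighted regression of the outcome scores on the treatment history; misspecifying the denominator model is what biases the estimates.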
SNMMs Using Within-Person Variability Scores
SNMMs simulate the sequential removal of an amount (blip) of treatment at on subsequent average outcomes, after having removed the effects of all subsequent treatments. SNMMs then model the effect of a blip in treatment at on the subsequent outcome means while holding all future treatments fixed at a reference level 0 (Vansteelandt & Joffe, 2014); in other words, the level that is equal to expected scores of a person in the current context.
SNMMs parameterize contrasts of and conditionally on treatments/predictors and confounder histories through as
| 25 |
for each , where is a known link function, and is a known -dimensional function, smooth in the finite-dimensional parameter (Vansteelandt & Joffe, 2014).
In the following empirical applications using the data of , a linear SNMM using the identity link is given by
| 26 |
Here, the first equation models the effect of on , the second models the effect of on , and the third models the effect of on . The (conditional) average joint effects of and on when increasing one unit from the reference values in each treatment become . This effect becomes if there are no interaction effects between confounders and treatments.
SNMMs consider a transformation of , the mean value of which is equal to the mean that would be observed if treatment were stopped from time onward, in the sense that
| 27 |
for . Here, is a vector with components for if is the identity link. For instance, in the above example of ,
| 28 |
The assumptions of sequential ignorability (Eq. 10) together with identity (Eq. 27) imply that
| 29 |
for . The parameters can therefore be estimated by solving the estimating equation
| 30 |
where is an arbitrary -dimensional function, with p the dimension of , and is a -dimensional vector that includes the reciprocal of the variance of each element in .
This estimating equation essentially sets the sum, across time points, of the conditional covariances between and the function , given and , to zero. If there is homoscedasticity in V, then local semiparametric efficiency under the SNMM is attained upon choosing
| 31 |
(Vansteelandt & Joffe, 2014). Solving estimating equation (30) requires a parametric model for the treatment/predictor : with . It also requires a parametric model for the conditional mean of , namely, . Notably, when the parameters and are variation-independent, G-estimators that solve Eq. (30), obtained by substituting and with consistent estimators, are doubly robust (Robins & Rotnitzky, 2001, cited from Vansteelandt & Joffe, 2014), meaning that estimates of causal parameters are consistent when either model or model is correctly specified. In addition, unlike the RI-CLPM (that includes L), one can allow nonlinear effects of time-varying observed confounders on treatments/predictors and outcomes in models and .
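The logic of G-estimation can be illustrated in a deliberately simplified setting. The sketch below is not the estimating equation (30) itself: it assumes a single time point, a linear blip with one parameter, a correctly specified linear treatment model, and no outcome model. All data are simulated under hypothetical values, and the function names are chosen here. The outcome depends on the confounder quadratically, which the estimator never has to model.

```python
import random

def ols_slope_intercept(x, y):
    """Simple least-squares regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def g_estimate(a, l, y):
    """Solve sum_i (a_i - E[a | l_i]) * (y_i - psi * a_i) = 0 for psi.

    E[a | l] is fitted by a linear regression of a on l. Because no outcome
    model is used, consistency here rests on the treatment model alone.
    """
    slope, intercept = ols_slope_intercept(l, a)
    resid = [ai - (intercept + slope * li) for ai, li in zip(a, l)]
    return sum(r * yi for r, yi in zip(resid, y)) / sum(r * ai for r, ai in zip(resid, a))

random.seed(1)
PSI_TRUE = 0.4  # assumed causal effect of the treatment
n = 20000
l = [random.gauss(0.0, 1.0) for _ in range(n)]                 # confounder
a = [0.5 * li + random.gauss(0.0, 1.0) for li in l]            # treatment, linear in l
y = [PSI_TRUE * ai + li ** 2 + 0.8 * li + random.gauss(0.0, 1.0)
     for ai, li in zip(a, l)]                                  # outcome, nonlinear in l

psi_g = g_estimate(a, l, y)                 # close to PSI_TRUE
psi_naive, _ = ols_slope_intercept(a, y)    # ignores l and is biased upward
```

Under these assumed values, the naive slope converges to about 0.72 rather than 0.4. Adding a model for the conditional mean of the transformed outcome, as in the full method, is what provides the second route to consistency when the treatment model is wrong.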
Simulation Studies
Method
This section describes a Monte Carlo simulation for systematically investigating how effectively the proposed method using calculated within-person variability scores can recover causal parameters, and it presents comparisons of estimation performance versus other potential (centering) methods to account for stable traits. We consider two different scenarios: (i) the assumed linear (and first-order) DGP of Fig. 1b (i.e., causal models represented in Eqs. 2 and 3) is correct and other assumptions of consistency, sequential ignorability, SUTVA, positivity, modularity, and multivariate normality are all satisfied, and (ii) some assumptions are violated and the statistical model contains misspecifications. In the whole simulation, for simplicity we also assume that causal effects are homogeneous among persons and interactions or moderation effects with observed confounders are not present.
In the first scenario, initial within-person variability scores (, , and ) are first generated so that they are normally distributed and their variances and covariances become 10 and 3, respectively. Then, within-person variability scores at succeeding times are sequentially generated via a first-order linear autoregressive model (i.e., Eq. 3) with the stationarity assumption6:
| 32 |
If , this setting produces (see also the calculation in Eq. 13)
| 33 |
Because no moderation effects are assumed, MSMs and SNMMs share the goal of estimating the same 10 causal parameters. The variance of the normal residual d was set to 5 for each variable, so that the variance of within-person variability scores for each variable is approximately 10 at each time point (i.e., the proportion of variance explained in Eq. (32) is approximately 50%).
Independently of generating within-person variability scores, three kinds of stable trait factors (, , and ) are generated from a multivariate normal distribution with correlations of 0.3. Observed values are then generated using the relation of Eq. (2),
| 34 |
where temporal group means are set to zero at each time point.
In this simulation, we systematically changed the total number of persons as , and 1000, the number of time points as and 8, and the size of stable trait factor variances as , and 10. This setting of stable trait factor variances indicates that the proportion of this variance to that of measurements becomes around 10%, 30%, and 50%, respectively, at each time point. To make it easier to compare the results between the and conditions, in we suppose only , , , and are intervened, while controlling for , , , and . This setting produces conditional means of (potential) outcomes as functions of treatments intervened: at , at , at , and at . There are a total of 10 causal parameters that are equal to those in the condition (i.e., ).
By crossing these factors, we generated 200 simulated datasets for each combination of factors. For comparison, each simulated dataset was analyzed by MSMs and SNMMs using four different scores: (1) true within-person variability scores (true factor score centering: e.g., for Y), (2) within-person variability scores predicted by the proposed method (Eq. 17), (3) scores based on observed person-specific means (observed-mean centering, e.g., , where ), and (4) observed scores (no centering, e.g., ). In the current scenario, the no-centering method totally ignores the presence of stable traits. On the other hand, because observed means include the components of both stable traits (between-person differences) and within-person variability, observed-mean centering fails to perfectly disentangle stable individual differences from within-person variability.7 Under each simulation condition, we calculated the bias and root-mean-squared error (RMSE) of 10 kinds of estimates of causal parameters from MSMs and SNMMs.
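Why observed-mean centering does not fully separate the two components can be seen from the covariance matrix it implies. The sketch below is a simplified illustration under assumed values: within-person scores are taken to be uncorrelated over time with variance 10, the stable trait variance is 5, and there are four time points. The function names are chosen here. Subtracting each person's observed mean corresponds to applying a centering matrix to the measurement vector.

```python
def matmul(a, b):
    """Plain-Python matrix product for small nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def centering_matrix(k):
    """Identity minus the averaging matrix: subtracts the person-specific mean."""
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)] for i in range(k)]

def cov_after_person_mean_centering(sigma):
    """Covariance matrix of measurements after observed-mean centering."""
    c = centering_matrix(len(sigma))
    return matmul(c, matmul(sigma, c))

k, within_var, phi2 = 4, 10.0, 5.0  # assumed values
# Measurements: trait variance in every cell, within-person variance on the diagonal.
sigma = [[phi2 + (within_var if i == j else 0.0) for j in range(k)] for i in range(k)]

centered = cov_after_person_mean_centering(sigma)
row_sums = [sum(row) for row in centered]  # all zero: the centered scores are linearly dependent
```

In this setting the trait variance is removed, but the true within-person covariances of zero are replaced by -within_var / k (here -2.5), and the variances shrink to within_var * (1 - 1 / k). The negative covariance grows in magnitude as the number of time points decreases, and the zero row sums reflect the rank deficiency of the centered scores.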
In the first step of the proposed method, to identify the measurement model (e.g., Eq. 16 for Y), an SEM that assumes a linear AR(1) structure with time-varying autoregressive parameters and residual variances is specified for within-person variability scores in each variable. Although the true model (i.e., AR(K) structure) cannot be specified because of the identification problem, we confirmed that the AR(1) structure generally provides acceptable model fits under the current parameter setting.
The results are discarded when improper solutions appear in the first step because of out-of-range parameter estimates (e.g., negative variance). In the current simulation, fewer than 0.1% of all estimates produced such improper solutions. We also confirmed that improper solutions were not found in the second step of applying MSMs and SNMMs. When applying MSMs, a first-order linear regression model is specified for the treatment assignment model, namely, (i.e., the correct specification). For SNMMs, models and are also specified in an appropriate manner.
In the second scenario where model misspecifications are present, we assume various DGPs in which (a) measurement errors are present, (b) time-invariant factors do not influence measurements as stable trait factors, and (c) quadratic effects of time-varying observed confounders are present in the treatment assignment model, keeping the other conditions the same as in the first scenario. More specifically, in (a), all measurements are influenced by normally distributed measurement errors with variances of 10% or 20% of those of the initial measurements (=10+). In (b), the relation between outcomes and time-invariant factors (I) is set as , (i.e., time-varying loadings from and those from other variables ( and ) are present), resulting from the assumed DGP such as that in Fig. 1a in which time-invariant factors have both direct and indirect effects on measurements. In (c), quadratic effects from time-varying observed confounders are included in the treatment assignment model as , indicating that the treatment assignment model that includes only linear effects of assumed in the current MSM and SNMM is misspecified. Note that causal parameters for time-varying treatments () remain unchanged even if quadratic effects exist in the treatment assignment model because time-varying treatments are now intervened on.
The simulation was conducted in R, using the lavaan package (Rosseel, 2012) to estimate parameters by SEM with MLE in the first step and the ipw package for MSMs in the second step. In SNMMs, we solve Eq. (30) via the Newton–Raphson method. Simulation code is available in the Online Supplemental Material.
Results
Because of space limitations, Fig. 2 shows only biases of estimates of causal parameters in MSMs and SNMMs when and 10. Because differences in the N value were minor in terms of bias, here we only show the result when . Results under other conditions are provided in the Online Supplemental Material (Figs. S3 and S4).
Fig. 2.
Biases of causal effect estimates (). Note: Because of rank deficiency, in estimates of are not available in the marginal structural model with observed-mean centering.
Figure 2 shows that true score conditions produce almost no biases in either MSMs or SNMMs. SNMMs show smaller RMSEs compared with MSMs on average (Fig. S2). In the proposed method, estimates show biases because of the biased estimates of stable trait factor (co)variances triggered by a model misspecification in the first step. However, the magnitude of biases is much smaller than in the observed-mean centering and no-centering methods. SNMMs again show smaller RMSEs than do MSMs (Fig. S2). The observed-mean centering method shows negative biases, and their magnitude becomes larger when . This result is caused by negatively biased covariances in variables resulting from subtracting observed means from measurements, and this impact increases as K decreases. Another critical aspect of this method is that linear dependence prevents identification of joint effects of all past treatments on (in this case, , , , in ). We therefore do not recommend use of observed-mean centering. The no-centering method shows serious negative biases when is not small, indicating that accounting for the presence of stable traits is critical to estimating causal effects. Magnitudes of stable trait factor variances should vary depending on the nature of variables and study period, but in the author’s experience many studies that applied the RI-CLPM have shown significant and moderate to large sizes of (e.g., the proportion of stable trait factor variance to that of measurements is above 30%). The following application also demonstrates large stable trait factor variance estimates.
As supplemental analyses, we additionally explored the performance of the methods under different parameter settings, as well as different model specification of SEM in the first step. From this, we find similar tendencies in the results (Figs. S5–S8): (a) SNMMs show smaller RMSEs than do MSMs, and (b) the proposed method shows adequate performance in terms of biases and RMSEs, and it works better than the no-centering method (especially when is larger) and the observed-mean centering method (especially when K is smaller). We also investigated the performance of linear correlation preserving predictor ( in Eq. 22) centering (e.g., ), confirming that the proposed method worked much better than this method on average (Figs. S5–S8).
Similar results were also observed in the second scenario, where model misspecifications are present (Figs. S9–S14). More specifically, when measurement errors were present, the biases and RMSEs became larger in all methods (Figs. S9 and S10). However, the proposed method still outperforms other centering methods. When time-invariant factors do not influence measurements as stable trait factors, the overall results of biases and RMSEs were not largely affected (Figs. S11–S12), regardless of the magnitude of . This result is a little surprising, considering that the specified time-varying loadings from factors ( at time ) are not small (i.e., the impact of this factor on the variance of measurement at time is almost twice that at time ). This may suggest that causal parameters can be recovered relatively well even when ignoring time-varying impacts from time-invariant factors that are actually present in the first step. However, future investigations are required in order to better clarify when estimated causal parameters are seriously biased under various scenarios for misspecified measurement models. When quadratic effects from time-varying observed confounders are present in the treatment assignment model but are ignored in analyses, biases and RMSEs in MSMs become larger on average. In the proposed method, this is salient in the RMSEs for the and conditions (Figs. S13–S14). SNMMs, which have the property of being doubly robust in G-estimators, were less influenced even if these by-no-means small quadratic effects are ignored, and in many conditions the proposed method again outperforms other centering methods.
Empirical Application
This section describes an empirical application of the proposed method using data from the Tokyo Teen Cohort (TTC) study (Ando et al., 2019). We assume a similar causal DAG model to that in Fig. 1b: (i) measurements are expressed by the linear sum of stable trait factors and within-person variability scores, (ii) within-person variability scores are expressed by functions (with assumption of homogeneity) of those in past time, along with (iii) consistency, (iv) sequential ignorability, (v) SUTVA, (vi) positivity, and (vii) modularity.
TTC was a multidisciplinary longitudinal cohort study on the psychological and physical development of adolescents who were 10 years old at enrollment and lived in municipalities in the Tokyo metropolitan area (Setagaya, Mitaka, Chofu). Datasets were collected in three waves: from 2012 to 2015, from 2014 to 2017, and from 2017 to 2019 (i.e., ). In total, 3171 children participated in the survey. See Ando et al. (2019) for more detailed information about measured variables, participant recruitment, and demographic characteristics of participants in the TTC study.
In this example, we estimate the (joint) causal effects of time-varying sleep duration (A) on later depressive symptoms (Y) in adolescents. Several epidemiological studies have suggested a relationship between sleep habits (sleep duration, bedtime, and bedtime regularity) and mental health status (depression and anxiety) in adolescents. For example, Matamura et al. (2014) applied the CLPM to data from 314 monozygotic twins living in Japan and showed that sleep duration had significant associations with mental health indices, even after controlling for genetic and shared environmental factors. However, to the author’s knowledge, no studies have investigated this relation while accounting for stable traits in sleep duration and symptoms (i.e., at the within-person level).
The Short Mood and Feelings Questionnaire (SMFQ; Angold et al., 1995) was used to measure depression in adolescents (Y). The SMFQ consists of 13 items assessing depressive symptoms rated on a three-point scale (0: not true, 1: sometimes true, 2: true) regarding feelings and actions over the preceding two weeks. Higher SMFQ scores suggest more severe symptoms. These data were measured at home by self-report questionnaires. In this example, sleep duration in hours (A) was measured by the question “How long do you usually sleep on weekdays?” Observed confounders were body mass index (BMI; ) and bedtime (), which was measured by the question “When do you usually go to bed on weekdays?” Because many adolescents reported no problems for all items on the SMFQ, the score distribution was positively skewed. In the present example, we focus on the clinical group comprising adolescents (13.1%) with SMFQ scores of 6 or higher during the study. Katon et al. (2008) reported 80% sensitivity and 81% specificity at this cutoff for diagnosis of major depression based on the Computerized Diagnostic Interview Schedule for Children (C-DISC). Missing data were primarily due to dropout. Of the 416 adolescents in this group, 113 provided all three responses in the study. Descriptive statistics of sleep duration, SMFQ score, bedtime, and BMI are available in the Online Supplemental Material (Table S1).
In the first step, we use generalized least squares in the lavaan package to estimate the model parameters for each variable. To identify the measurement model (e.g., Eq. 16 for Y), SEM that assumes an AR(1) structure with time-varying autoregressive parameters and residual variances is specified for within-person variability scores of each variable. Let be the total number of variables observed in adolescent i. weights are calculated from estimated parameters under the assumption of MAR. Within-person variability scores are then calculated using this weight and measurements for adolescent i as .
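To make the first step more concrete, the following Python sketch computes covariance-preserving weights for one variable of one person with complete data. It is an illustration under stated assumptions, not the code used in this analysis: the weight formula follows the general correlation-preserving construction of ten Berge et al. (1999), treating within-person variability scores as latent variables with identity loadings, and may differ in form from Eq. 22; the AR(1) parameters, the stable trait factor variance, and the centered measurements are hypothetical; and missing data under MAR are not handled.

```python
import numpy as np

def sym_power(m, p):
    """Power of a symmetric positive-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * vals**p) @ vecs.T

def ar1_covariance(var1, betas, resid_vars):
    """Covariance of within-person scores under AR(1) with time-varying parameters."""
    K = len(betas) + 1
    psi = np.zeros((K, K))
    psi[0, 0] = var1
    for t in range(1, K):
        b, v = betas[t - 1], resid_vars[t - 1]
        psi[t, :t] = b * psi[t - 1, :t]          # covariances with earlier time points
        psi[:t, t] = psi[t, :t]
        psi[t, t] = b**2 * psi[t - 1, t - 1] + v  # variance at time t
    return psi

def preserving_weights(psi, trait_var):
    """Weights W' such that Cov(W' y) equals psi (assumed form, see lead-in)."""
    K = psi.shape[0]
    sigma = psi + trait_var * np.ones((K, K))     # model-implied covariance of measurements
    sigma_inv = np.linalg.inv(sigma)
    psi_half = sym_power(psi, 0.5)
    middle = sym_power(psi_half @ sigma_inv @ psi_half, -0.5)
    return psi_half @ middle @ psi_half @ sigma_inv

# Hypothetical parameter estimates for K = 3 time points.
psi = ar1_covariance(1.0, betas=[0.4, 0.6], resid_vars=[0.8, 0.7])
trait_var = 1.5
Wt = preserving_weights(psi, trait_var)

# Hypothetical measurements of one person, minus the temporal group means.
y_centered = np.array([0.9, -0.2, 1.4])
within_scores = Wt @ y_centered

sigma = psi + trait_var * np.ones((3, 3))
print(np.allclose(Wt @ sigma @ Wt.T, psi))  # weights reproduce the within-person covariance
```

The design point is that the predicted scores have exactly the model-implied within-person covariance structure, which is the property that motivates a correlation preserving predictor over observed-mean centering.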
Causal parameters ( and ) of sleep duration at 10 and 12 years old ( and ) on later depressive symptoms (SMFQ scores and ) are estimated using calculated within-person variability scores by linear SNMM. In linear SNMM, blip functions and are set as in Eqs. (26) and (28), except that the two confounders and are present in this example. When applying SNMMs, models and are both specified using first-order linear regression models. All calculated within-person variability scores were used in the analysis under the assumption of MAR.
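The logic of G-estimation in a linear SNMM can be sketched for a single time point in Python. This is a simplified illustration on simulated data, not the two-time-point analysis reported here: the data-generating values, the variable names, and the single-confounder setup are all hypothetical. The treatment model is a first-order linear regression, while the outcome depends on the confounder nonlinearly and that dependence is left unmodeled.

```python
import numpy as np

rng = np.random.default_rng(0)
n, psi_true = 20000, 2.0

L = rng.normal(size=n)                       # observed confounder
A = 0.5 * L + rng.normal(size=n)             # continuous treatment, linear in L
Y = psi_true * A + 2.0 * L + L**2 + rng.normal(size=n)  # outcome, nonlinear in L

# Treatment model E[A | L]: first-order linear regression.
X = np.column_stack([np.ones(n), L])
coef, *_ = np.linalg.lstsq(X, A, rcond=None)
resid_A = A - X @ coef

# G-estimation for the blip function psi * a:
# solve sum_i resid_A_i * (Y_i - psi * A_i) = 0 for psi.
psi_hat = (resid_A @ Y) / (resid_A @ A)

# Naive slope of Y on A that ignores the confounder, for comparison.
naive = np.cov(A, Y)[0, 1] / np.var(A, ddof=1)

print(round(psi_hat, 2), round(naive, 2))
```

Because the treatment model is correctly specified in this sketch, the estimate stays close to the true value even though the outcome's dependence on the confounder is not modeled. The doubly robust variant additionally subtracts a fitted model for the treatment-free outcome given the confounders, so that consistency holds if either model is correct; that term is omitted here for brevity.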
We confirmed that the first step did not find improper solutions, and that current AR(1) models that assume time-varying parameters fit better than those that do not. Table S2 summarizes the model fit indices and estimated parameters in this step. All stable trait factor variance estimates are significant, indicating the necessity of controlling for stable traits. Specifically, the proportions of variances in measurements attributable to estimated stable trait factors at are 24.5%, 54.5%, 48.2%, and 74.8% for Y, A, , and , respectively.
Table 1 provides the estimation results of causal parameters, along with estimates based on the no-centering and observed-mean centering methods for comparison. As seen in Table 1, the proposed method reveals that intervention of longer sleep duration at 12 years old () has a positive effect (, 95% CI , .05) on later depressive symptoms at 14 years old () at the within-person level, but this estimate is not significant in the no-centering and observed-mean centering methods. Similar positive effects of sleep duration were found in previous studies (Matamura et al., 2014), but the present analysis newly investigates this causal hypothesis at the within-person level by controlling for stable traits of persons. When the no-centering method is applied, the causal effect estimate of on is significant, showing that intervention of longer sleep duration at 10 years old has a negative effect (, 95% CI [0.405, 2.981], .05) on later depressive symptoms at 12 years old. Considering that the magnitudes of the estimated stable trait factor variances were moderate or large for all variables, causal effect estimates in the no-centering method are unreliable and might be seriously biased.
Table 1.
Estimates of causal parameters of sleep duration on depression (SMFQ) ()
| Proposed method | Observed-mean centering | Observed scores (no centering) |
|---|---|---|
| 2.704 (1.140) | 0.095 (0.869) | 1.492 (1.080) |
| 0.603 (1.416) | 0.916 (1.857) | 0.336 (1.057) |
| 0.442 (0.638) | 0.748 (1.169) | 0.532 (0.399) |
| 0.293 (1.179) | 0.309 (1.021) | 0.185 (1.117) |
| 2.278 (1.918) | 3.012 (1.784) | 1.169 (1.792) |
| 1.856 (0.780) | 0.251 (0.986) | 0.306 (0.287) |
| 0.702 (0.686) | 0.279 (0.572) | 1.693 (0.657) |
| 0.315 (1.177) | 1.037 (1.069) | 0.918 (0.920) |
| 0.021 (0.473) | 0.773 (0.572) | 0.101 (0.212) |
Bold font indicates statistical significance.
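The interval reported in the text for the no-centering estimate is consistent with a normal-approximation (Wald) interval computed from the Table 1 entry 1.693 (0.657). The short Python check below assumes that the value in parentheses is a standard error and uses the magnitudes as printed in the table; the helper name `wald_ci` is hypothetical.

```python
from statistics import NormalDist

def wald_ci(estimate, se, level=0.95):
    """Normal-approximation confidence interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

# Table 1, no-centering column: 1.693 (0.657), magnitudes as printed.
lower, upper = wald_ci(1.693, 0.657)
print(f"[{lower:.3f}, {upper:.3f}]")  # [0.405, 2.981]
```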
In supplemental analyses, we confirmed that the major findings did not change even when using only data of adolescents who provided all three responses () and a different cutoff for SMFQ (Angold et al., 1995; Tables S2–S5). Again, statistical significance as well as sign and magnitude in estimates of causal parameters might change according to the choice of calculation (centering) methods for within-person variability scores, and ignoring the presence of stable traits of persons might lead to incorrect conclusions.
General Discussion
We proposed a two-step estimation method for within-person variability score-based causal inference to estimate joint effects of time-varying (continuous) treatments/predictors by controlling for stable traits. In the first step, a within-person variability score for each person, which is disaggregated from the stable trait factor score, is calculated using weights based on the best linear correlation preserving predictor through SEM. Causal parameters are then estimated by MSMs or SNMMs, using calculated within-person variability scores. The proposed method can be viewed as one that synthesizes the two traditions of factor analysis/SEM in psychometrics and a method of causal inference (MSMs or SNMMs) in epidemiology.
In this paper, we began by providing formal definitions of stable trait factors (for between-person relations) and within-person variability scores (for within-person relations), because these concepts have not been fully characterized in the causal inference literature despite the fact that they have been attracting increasing attention in psychometrics and behavioral science (e.g., Hamaker et al., 2015; Usami et al., 2019a). On the other hand, in epidemiology the conceptual and mathematical differences between stable trait factors and accumulating factors, along with which kind of time-invariant factor is included in each statistical model, have received less attention. This paper may help bridge the gap. We have also clarified the assumptions required to identify causal parameters for within-person variability score-based causal inference: (i) (as depicted in Fig. 1b) measurements are expressed by the linear sum of stable trait factor scores (defined as Eq. 4) and within-person variability scores (defined as Eq. 5) that are mutually uncorrelated, (ii) within-person variability scores are expressed by functions of those (with assumption of homogeneity) in past time, (iii) consistency, (iv) sequential ignorability, (v) SUTVA, (vi) positivity, (vii) modularity, and (viii) multivariate normality (if MLE is used in the first step).
As for the second assumption, our approach is more flexible than the RI-CLPM (that includes time-varying observed confounders), which researchers are becoming increasingly interested in for uncovering within-person relations among variables, in that the assumption of linearity is not required with respect to time-varying observed confounders at the within-person level. We particularly emphasize the utility of SNMMs with G-estimation, because of their property of being doubly robust to model misspecifications in how the time-varying observed confounders are functionally related to treatments/predictors and outcomes, along with their flexibility in allowing investigation of moderation effects of treatments with observed confounders.
Through simulation and empirical application, we illustrated that ignoring the presence of stable traits might lead to incorrect conclusions about causal effects. We also confirmed that the proposed approach is superior to observed-mean centering, a conventional method to predict stable traits of persons. Especially when K is small, observed-mean centering showed serious negative biases in estimates of causal parameters. Considering that most research applying the RI-CLPM to uncover within-person relations used longitudinal data with two or three time points (; e.g., Usami et al., 2019b), observed-mean centering cannot be recommended.
A recent study provided closed-form parametric expressions of causal effects for linear models (Gische et al., 2021), and Gische and Voelkle (in press) proposed asymptotically efficient estimators in the case of ML estimation. It is suggested that, at least in large samples, the parametric procedure proposed by Gische and Voelkle (in press) may give smaller standard errors compared with the proposed method in simulations in which data are generated by a linear model with normal residuals. Comparing the performances of these methods and the RI-CLPM under various conditions that account for nonlinear relations among variables is an important topic for future studies.
One caveat for the proposed method, which is relevant to the second assumption above, is that in the first step, one must correctly specify the structure (such as the AR(1) structure) for within-person variability scores in each variable so that the (identified) SEM can yield consistent estimates of parameters, which are required for consistent estimation of causal effects in the subsequent step. However, in general, how to establish the correct (or even a plausible) DAG model is a major challenge (Hamaker et al., 2020; see also the discussion in Sect. 2.4). Relatedly, the premise that stable trait factors exist and loadings from factors are equal to those in the DGP might be restrictive in actual applications; therefore, we need to carefully account for the consequences of possible model misspecifications in the first step on the results in the second step to precisely infer within-person relations. The good news is that in the present simulation, we confirmed that the specified time-varying AR(1) structure works well to recover causal parameters, and its performance was not largely influenced even when there were model misspecifications (i.e., time-varying effects from time-invariant factors). However, additional large-scale simulations to further clarify the robustness of the method regarding this point are needed in future studies.
We can also use model fit indices to evaluate how well the structure specified in the first step fits the data, especially when the number of time points is large. However, this procedure is not a fundamental solution. Even if a researcher is certain that an SEM that assumes stable trait factors for each variable can be specified in the first step, in general there is still a great risk of violating some assumptions. Notably, relating to the first assumption above, stable trait factors and within-person variability scores (temporal deviations) at each time point might be correlated if they share common causes. Future studies should investigate how this violation impacts the estimated causal parameters.
We used two-step estimation with feasibility in mind, but a practical hurdle remains in that one must still write programming code such as that in the Online Supplemental Material. We are planning to develop packages for the proposed method. Another potential limitation is that the proposed approach (as well as the RI-CLPM) demands longitudinal data with three or more time points () to identify the measurement model (SEM) in the first step unless strong parameter constraints are imposed.
Because we take an SEM approach in the first step, accounting for measurement errors, which is closely related to violation of the consistency assumption, is feasible under the parametric assumption. Although we expect that longitudinal data with large K are required for precisely estimating measurement-error variances, we plan to investigate how the proposed method works under measurement models that include measurement errors.
This paper opens a new avenue for exploring various other research questions that are closely relevant to causal hypotheses. For example, use of within-person variability scores can be extended to cases in which one is interested in uncovering reciprocal effects (e.g., Usami et al., 2019a) and mediation effects (e.g., Goldsmith et al., 2018; Tchetgen & Shpitser, 2012), as well as to multilevel modeling and hierarchical continuous time modeling (Driver & Voelkle, 2018). Note that there is still room for discussion on the issues of within-person relation and stable traits, as well as the issue about when and how to include (time-invariant) factors in the assumed DGP (see Sect. 2.4). The present paper is intended to promote substantial discussion about the conceptual and statistical properties of the time-invariant factors (e.g., stable trait factors or accumulating factors) included in the assumed DGP among researchers who wish to infer within-person relations and causality. The hope is that the proposed method helps researchers explore various causal hypotheses in longitudinal designs and guides better decision-making.
Funding
Funding was provided by Japan Society for the Promotion of Science (Grant No. 19K14378).
Footnotes
Time-invariant observed confounders can be included as a special case, but in the DGPs discussed herein, only time-varying observed confounders are assumed for simplicity.
Y is not shown explicitly here because it is part of L. We often omit Y in expressing time-varying observed confounders in this paper.
Here, the path does not need to be accounted for because time-varying treatments are now intervened and does not depend on ).
Although we defined stable trait factors as the (time-invariant) difference between the expected value of a given person’s measurement and the temporal group mean, one could argue for another definition that allows for time-varying influences on measurements. If this is the case, time-varying factor loadings can be freely specified in this step (except for one fixed factor loading for identification). However, there may be some cost in that the minimum number of time points required to identify the measurement model becomes larger than that in specifying time-invariant loadings.
Here, an intercept becomes zero. Other terms such as quadratic effects (e.g., ) for time-varying treatments can be included in MSMs. Also, one can include observed covariates/nonconfounders to assess effect modification (Hernán & Robins, 2021).
In this paper, we use the term stationarity assumption to indicate invariance of autoregressive parameters, cross-lagged parameters, and residual variance parameters over time, rather than indicating means and (co)variances in variables to be invariant over time.
As a similar problem, the risk of using observed person-specific (or cluster-specific) means to express cluster effects is recognized as Nickell’s bias and Lüdtke’s bias for estimates of regression coefficients in applications of multilevel models (e.g., Asparouhov & Muthén, 2018; Lüdtke et al., 2008; McNeish & Hamaker, 2020; Usami, 2017).
Change history
10/19/2022
An Erratum to this paper has been published: 10.1007/s11336-022-09890-6
References
- Anderson JC, Gerbing DW. Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin. 1988;103:411–423.
- Anderson TW, Rubin H. Statistical inference in factor analysis. In: Neyman J, editor. Proceedings of the 3rd Berkeley symposium on mathematical statistics and probability. Vol. 5. Berkeley; 1956. pp. 111–150.
- Ando S, Nishida A, Yamasaki S, Koike S, Morimoto Y, Hoshino A, Kanata S, Fujikawa S, Endo K, Usami S, Furukawa TA, Hiraiwa-Hasegawa M, Kasai K. Cohort profile: Tokyo Teen Cohort study (TTC). International Journal of Epidemiology. 2019;48:1414–1414g. doi: 10.1093/ije/dyz033.
- Angold A, Costello EJ, Messer SC, Pickles A, Winder F, Silver D. Development of a short questionnaire for use in epidemiological studies of depression in children and adolescents. International Journal of Methods in Psychiatric Research. 1995;5:237–249.
- Asparouhov T, Muthén B. Latent variable centering of predictors and mediators in multilevel and time-series models. Structural Equation Modeling: A Multidisciplinary Journal. 2018;26:1–24.
- Bakk Z, Kuha J. Two-step estimation of models between latent classes and external variables. Psychometrika. 2017;83:871–892. doi: 10.1007/s11336-017-9592-7.
- Bollen KA. Structural equations with latent variables. Wiley; 1989.
- Bollen KA, Curran PJ. Autoregressive latent trajectory (ALT) models: A synthesis of two traditions. Sociological Methods and Research. 2004;32:336–383.
- Brumback BA, He Z, Prasad M, Freeman MC, Rheingans R. Using structural-nested models to estimate the effect of cluster-level adherence on individual-level outcomes with a three-armed cluster-randomized trial. Statistics in Medicine. 2014;33:1490–1502. doi: 10.1002/sim.6049.
- Cole DA, Martin NC, Steiger JH. Empirical and conceptual problems with longitudinal trait-state models: Introducing a trait-state-occasion model. Psychological Methods. 2005;10:3–20. doi: 10.1037/1082-989X.10.1.3.
- Croon M. Using predicted latent scores in general latent structure models. In: Marcoulides G, Moustaki I, editors. Latent Variable and Latent Structure Modeling. Erlbaum; 2002. pp. 195–223.
- Curran PJ, Bauer DJ. The disaggregation of within-person and between-person effects in longitudinal models of change. Annual Review of Psychology. 2011;62:583–619. doi: 10.1146/annurev.psych.093008.100356.
- Driver CC, Voelkle MC. Hierarchical Bayesian continuous time dynamic modeling. Psychological Methods. 2018;23:774–799. doi: 10.1037/met0000168.
- Enders CK. Missing not at random models for latent growth curve analyses. Psychological Methods. 2011;16:1–16. doi: 10.1037/a0022640.
- Enders CK, Bandalos DL. The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling. 2001;8:430–457.
- Gische C, Voelkle MC. Beyond the mean: A flexible framework for studying causal effects using linear models. Psychometrika. In press.
- Gische C, West SG, Voelkle MC. Forecasting causal effects of interventions versus predicting future outcomes. Structural Equation Modeling. 2021;28:475–492. doi: 10.1080/10705511.2020.1780598.
- Goldsmith KA, MacKinnon DP, Chalder T, White PD, Sharpe M, Pickles A. Tutorial: The practical application of longitudinal structural equation mediation models in clinical trials. Psychological Methods. 2018;23:191–207. doi: 10.1037/met0000154.
- Green BF. Best linear composites with a specified structure. Psychometrika. 1969;34:301–318.
- Hamaker EL. Why researchers should think “within-person”: A paradigmatic rationale. In: Conner TS, editor. Handbook of research methods for studying daily life. Guilford Press; 2012. pp. 43–61.
- Hamaker EL, Kuiper RM, Grasman RPPP. A critique of the cross-lagged panel model. Psychological Methods. 2015;20:102–116. doi: 10.1037/a0038889.
- Hamaker EL, Mulder JD, van IJzendoorn MH. Description, prediction and causation: Methodological challenges of studying child and adolescent development. Developmental Cognitive Neuroscience. 2020;46:1–14. doi: 10.1016/j.dcn.2020.100867.
- He J, Stephens-Shields A, Joffe M. Structural nested mean models to estimate the effects of time-varying treatments on clustered outcomes. International Journal of Biostatistics. 2015;11:203–222. doi: 10.1515/ijb-2014-0055.
- He J, Stephens-Shields A, Joffe M. Marginal structural models to estimate the effects of time-varying treatments on clustered outcomes in the presence of interference. Statistical Methods and Medical Research. 2019;28:613–625. doi: 10.1177/0962280217732598.
- Hernán MA, Brumback B, Robins JM. Estimating the causal effect of zidovudine on CD4 count with a marginal structural model for repeated measures. Statistics in Medicine. 2002;21:1689–1709. doi: 10.1002/sim.1144.
- Hernán MA, Robins JM. Causal inference: What if. Chapman & Hall/CRC; 2021.
- Hoffman L. Longitudinal analysis: Modeling within-person fluctuation and change. Routledge/Taylor & Francis; 2014.
- Hong G. Causality in a social world: Moderation, mediation and spill-over. John Wiley & Sons Ltd; 2015.
- Hoshino T, Bentler PM. Bias in factor score regression and a simple solution. In: de Leon AR, Chough KC, editors. Analysis of mixed data: Methods & applications. Chapman & Hall; 2013. pp. 43–61.
- Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling. 1999;6:1–55.
- Imai K, Kim S. When should we use unit fixed effects regression models for causal inference with longitudinal data? American Journal of Political Science. 2019;63:467–490.
- Imai K, Ratkovic M. Robust estimation of inverse probability weights for marginal structural models. Journal of the American Statistical Association. 2015;110:1013–1023.
- Jöreskog KG, Lawley DN. New methods in maximum likelihood factor analysis. British Journal of Mathematical and Statistical Psychology. 1968;21:85–96.
- Katon W, Russo J, Richardson L, McCauley E, Lozano P. Anxiety and depression screening for youth in a primary care population. Ambulatory Pediatrics. 2008;8:182–188. doi: 10.1016/j.ambp.2008.01.003.
- Kline RB. Principles and practice of structural equation modeling. 4th ed. Guilford Press; 2016.
- Lefebvre G, Delaney JAC, Platt RW. Impact of mis-specification of the treatment model on estimates from a marginal structural model. Statistics in Medicine. 2008;27:3629–3642. doi: 10.1002/sim.3200.
- Lüdtke O, Marsh HW, Robitzsch A, Trautwein U, Asparouhov T, Muthén B. The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods. 2008;13:203–229. doi: 10.1037/a0012869.
- Lüdtke O, Robitzsch A. A critique of the random intercept cross-lagged panel model. 2021. doi: 10.31234/osf.io/6f85c.
- Matamura M, Tochigi M, Usami S, Yonehara H, Fukushima M, Nishida A, Togo F, Sasaki T. Associations between sleep habits and mental health status and suicidality in the longitudinal survey of monozygotic-twin adolescents. Journal of Sleep Research. 2014;23:290–294. doi: 10.1111/jsr.12127.
- McArdle JJ, Hamagami F. Latent difference score structural models for linear dynamic analyses with incomplete longitudinal data. In: Collins L, Sayer A, editors. New methods for the analysis of change. American Psychological Association; 2001. pp. 137–175.
- McNeish D, Hamaker EL. A primer on two-level dynamic structural equation models for intensive longitudinal data in Mplus. Psychological Methods. 2020;25:610–635. doi: 10.1037/met0000250.
- Mulder D, Hamaker EL. Three extensions of the random intercept cross-lagged panel model. Structural Equation Modeling. 2021;28:638–648.
- McNeish D, Kelley K. Fixed effects models versus mixed effects models for clustered data: Reviewing the approaches, disentangling the differences, and making recommendations. Psychological Methods. 2019;24(1):20–35. doi: 10.1037/met0000182.
- Newsom JT. Longitudinal structural equation modeling: A comprehensive introduction. Routledge; 2015.
- Resseguier N, Giorgi R, Paoletti X. Sensitivity analysis when data are missing not-at-random. Epidemiology. 2011;22:282. doi: 10.1097/EDE.0b013e318209dec7.
- Robins JM. The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Sechrest L, Freeman H, Mulley A, editors. Health service research methodology: A focus on AIDS. U.S. Public Health Service, National Center for Health Services Research; 1989. pp. 113–159.
- Robins JM. Estimation of the time-dependent accelerated failure time model in the presence of confounding factors. Biometrika. 1992;79:321–334.
- Robins JM. Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics-Theory and Methods. 1994;23:2379–2412.
- Robins JM. Marginal structural models versus structural nested models as tools for causal inference. Epidemiology. 1999;116:95–134.
- Robins JM, Blevins D, Ritter G, Wulfsohn M. G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology. 1992;3:319–336. doi: 10.1097/00001648-199207000-00007.
- Robins JM, Hernán MA. Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, editor. Handbooks of modern statistical methods: Longitudinal data analysis. CRC Press; 2009. pp. 553–599.
- Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–560. doi: 10.1097/00001648-200009000-00011.
- Robins JM, Rotnitzky A. Comment on Inference for semiparametric models: Some questions and an answer, by P.J. Bickel and J. Kwon. Statistica Sinica. 2001;11:920–936.
- Rosseel Y. lavaan: An R package for structural equation modeling. Journal of Statistical Software. 2012;48:1–36.
- Rubin DB. Inference and missing data. Biometrika. 1976;63:581–592.
- Skrondal A, Laake P. Regression among factor scores. Psychometrika. 2001;66:563–575.
- Tchetgen Tchetgen EJ, Shpitser I. Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness, and sensitivity analysis. Annals of Statistics. 2012;40:1816–1845. doi: 10.1214/12-AOS990.
- ten Berge TMF, Krijinen WP, Wansbeek TJ, Shapiro A. Some new results on correlation-preserving factor scores prediction methods. Linear Algebra and its Applications. 1999;289:311–318.
- Usami S. Generalized sample size determination formulas for investigating contextual effects by a three-level random intercept model. Psychometrika. 2017;82:133–157. doi: 10.1007/s11336-016-9532-y.
- Usami S. On the differences between general cross-lagged panel model and random-intercept cross-lagged panel model: Interpretation of cross-lagged parameters and model choice. Structural Equation Modeling. 2021;28:331–344.
- Usami S, Murayama K, Hamaker EL. A unified framework of longitudinal models to examine reciprocal relations. Psychological Methods. 2019;24:637–657. doi: 10.1037/met0000210.
- Usami S, Todo N, Murayama K. Modeling reciprocal effects in medical research: Critical discussion on the current practices and potential alternative models. PLOS ONE. 2019;14(9):e0209133. doi: 10.1371/journal.pone.0209133.
- Vansteelandt S, Joffe M. Structural nested models and g-estimation: The partially realized promise. Statistical Science. 2014;29:707–731.
- Vermunt JK. Latent class modeling with covariates: Two improved three-step approaches. Political Analysis. 2010;18:450–469.
- Wallace MP, Moodie EE, Stephens DA. An R package for G-estimation of structural nested mean models. Epidemiology. 2017;28:e18–e20. doi: 10.1097/EDE.0000000000000586.
- Wang L, Maxwell SE. On disaggregating between-person and within-person effects with longitudinal data using multilevel models. Psychological Methods. 2015;20:63–83. doi: 10.1037/met0000030.
- Zyphur MJ, Allison PD, Tay L, Voelkle MC, Preacher KJ, Zhang Z, Hamaker EL, Shamsollahi A, Pierides DC, Koval P, Diener E. From data to causes I: Building a general cross-lagged panel model (GCLM). Organizational Research Methods. 2020;23:651–687.
- Zyphur MJ, Voelkle MC, Tay L, Allison PD, Preacher KJ, Zhang Z, Hamaker EL, Shamsollahi A, Pierides DC, Koval P, Diener E. From data to causes II: Comparing approaches to panel data analysis. Organizational Research Methods. 2020;23:688–716. [Google Scholar]