Author manuscript; available in PMC: 2015 Sep 28.
Published in final edited form as: Stat Med. 2014 May 12;33(22):3905–3918. doi: 10.1002/sim.6199

Causal Inference for Community-based Multi-layered Intervention Study

P Wu 1, D Gunzler 2, N Lu 3,5, T Chen 3, P Wymen 4, XM Tu 3,4,5
PMCID: PMC4156555  NIHMSID: NIHMS594534  PMID: 24817513

Summary

Estimating causal treatment effects in Randomized Controlled Trials (RCTs) under post-treatment confounding, i.e., noncompliance and informative dropout, is becoming an important problem in intervention/prevention studies in which treatment exposure is not completely controlled. When such confounding is present, the traditional intention-to-treat (ITT) approach can underestimate the treatment effect because of insufficient exposure to treatment. Over the past two decades, many papers have addressed such confounders, using different modeling strategies to investigate the causal relationship between treatment and the outcome of interest. Most existing approaches, however, are suitable only for standard experiments. In this paper, we propose a new class of structural functional response models (SFRM) to address post-treatment confounding in complex multi-layered intervention studies within a longitudinal data setting. The new approach offers robust inference and is readily implemented. We illustrate and assess the performance of the proposed SFRM using both real and simulated data.

Keywords: Causal treatment effect, Noncompliance, Functional Response Models, Randomized Controlled Trials, Missing Data

1 Introduction

Although Randomized Controlled Trials (RCTs) remain the benchmark for clinical research and practice, observational studies and semi-RCTs (trials that initiate treatment dynamically when needed) have become more popular, especially in the behavioral and social sciences, health policy and health economics, because of the large amount of data generated by new web technologies and social media. Even within the confines of RCTs, we see more community-based, multi-layered and multi-modal, dynamic interventions that take advantage of both static (e.g., genetic traits) and dynamic (e.g., treatment response) information during treatment. In this paper, we focus on community-based multi-layered RCTs and introduce a new class of structural functional response models to address complex treatment noncompliance issues when evaluating treatment effects.

The proposed approach is motivated by a community-based multi-layered RCT–the Rochester Resilience Project (RRP), where post-treatment noncompliance arises from both the primary (subject) and supportive (support group) layer. The RRP is designed to promote behaviorally and emotionally healthy trajectories in 1st–3rd grade urban children who are showing aggressive-disruptive and school socialization problems, a group at elevated risk for future mental health disorders, substance abuse problems, reduced educational outcomes and costly services. The study involved 401 children randomized to the intervention and control groups. In addition, the study interventionists also worked with parents to teach children a set of skills to strengthen emotion self-regulation, adaptive social behavior and classroom conduct. Parent visits focus first on identifying parent goals for the child, then on introducing and preparing parents to use activity sets that teach and reinforce children’s use of emotion self-regulation skills and incorporating those skills into their everyday relationship.

Our initial intention-to-treat (ITT) analyses failed to show any treatment effect for the primary behavior outcomes. Since ITT estimates are defined based on treatment assignment at randomization, rather than what actually goes on during the trial, such estimates completely ignore issues pertaining to violations of treatment protocols such as treatment noncompliance. For example, had only a small fraction of subjects in the intervention condition taken the treatment as prescribed, ITT would unduly underestimate the effect of receiving the intervention. However, child participation over 18 months was, as expected, high due to skill lessons being delivered in the school setting; 97% of children in the intervention condition completed all 14 lessons in the first year, and 81% completed all 10 lessons in the second year. Of the 39 non-completers, 33 were children relocating to non-study schools. Non-participation was unrelated to any baseline outcome measure.

Parent participation, however, was significantly lower: as shown in Table 1, only 63.4% of parents (128 of 203 enrolled) participated in one or more intervention visits, and few completed the 15 scheduled sessions. Under this condition of lower participation, ITT analyses are less informative about the true causal effects of parent involvement in the program, especially if the effect of treatment on child outcomes is achieved in part through parental participation.

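Table 1.

Child Resilience Complete dataset

Total sessions attended by parents of children in intervention

| Sessions | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9+ | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 74 | 38 | 26 | 11 | 9 | 10 | 3 | 1 | 2 | 30 | 202 |
| Percent | 36.6 | 18.8 | 12.9 | 5.4 | 4.5 | 5.0 | 1.5 | 0.5 | 1 | 15 | 100 |
| Cumulative Percent | 36.6 | 55.4 | 68.3 | 73.8 | 78.2 | 83.2 | 84.7 | 85.1 | 86.1 | 100.0 | 100 |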

A number of approaches for addressing treatment noncompliance in RCTs have been developed based on the counterfactual outcome framework, such as the instrumental variable [1], principal stratification [2] and structural mean models [3]. None of the available methods address treatment noncompliance in multi-layered intervention studies. In this paper, we develop a new approach to extend the principles in these approaches to this new setting with treatment noncompliance from multiple layers of the intervention. In Section 2, we briefly review the counterfactual outcome based causal framework and introduce a class of structural functional response model (SFRM) to address both pre- and post-treatment confounding. In Section 3, the SFRM is extended to address treatment noncompliance in multi-layered interventions within a longitudinal study setting. Simulation studies are presented in Section 4 to evaluate the performance of the proposed SFRM. In Section 5, we apply the approach to address the variability in parent participation in the two-layered RRP study. We conclude with a discussion in Section 6.

2 Structural Functional Response Models for Causal Inference

2.1 Counterfactual Outcomes

The concept of counterfactual outcome, the underpinning of the modern causal inference paradigm, addresses the fundamental question of causal treatment effect [4]. Under this framework, every patient has a potential outcome for each treatment condition, and the treatment effect is defined as the difference between the outcomes of the same individual in response to the respective treatments. This definition is free of any confounding effect and provides a conceptual basis for causal effect without relying on the notion of randomization.

For example, if the two potential outcomes for the ith child in the RRP Study are yi1 and yi0 for the intervention and control condition, the difference Δi = yi1 − yi0 is the treatment effect for the child. Since this difference is based on the outcomes from the same child, it must be the result of the intervention. Unfortunately, since only the outcome from the treatment condition actually assigned is observed, this difference is unobservable. A large part of the causal inference literature centers on how to estimate the average, or population-level, causal treatment effect, Δ = E (yi1 − yi0).

In RCTs, treatment assignment is independent of potential outcomes, i.e., yik ⊥ zi, where zi denotes a binary indicator for treatment assignment and ⊥ denotes stochastic independence. In this case, the average causal effect E (yi1 − yi0) can be estimated by the difference between the two sample means from the intervention and control groups:

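$$
\hat{\Delta} = \bar{y}_{\cdot 1} - \bar{y}_{\cdot 0}, \quad \bar{y}_{\cdot 1} = \frac{1}{n_1} \sum_{i_1 = 1}^{n_1} y_{i_1 1}, \quad \bar{y}_{\cdot 0} = \frac{1}{n_0} \sum_{i_0 = 1}^{n_0} y_{i_0 0}, \tag{1}
$$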

where nk denotes the number of subjects assigned to the kth treatment group such that n = n1 + n0, and i_k denotes the ith subject within the kth treatment group. Note that y_{i_k k} refers to the observed outcome for the i_k-th subject in the assigned kth treatment, while yik denotes the potential outcome corresponding to the kth treatment.

The above shows that standard statistical models such as linear regression and mixed-effects models can be applied to RCTs to infer causal treatment effects. Randomization is key to the transition from the unobserved individual-level difference, yi1 − yi0, to the estimable average treatment effect by the computable sample means in (1). For non-randomized trials such as most epidemiological studies, exposure to treatment or agent is non-random, in which case (1) generally does not estimate the average causal effect Δ = E (yi1 − yi0). Thus, associations found in observational studies generally do not imply causation.
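The role of randomization in (1) can be illustrated with a small simulation. The following Python sketch is only an illustration under assumed values (the outcome distributions, the effect size of 2 and the function names are hypothetical and not taken from the study): both potential outcomes are generated for every subject, assignment is drawn independently of them, and the difference in sample means, which uses only the observed outcomes, is compared with the average of the unobservable individual differences.

```python
import random

def simulate_rct(n, effect=2.0, seed=0):
    """Simulate both potential outcomes and a randomized assignment (hypothetical values)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y0 = rng.gauss(5.0, 1.0)                 # potential outcome under control
        y1 = y0 + effect + rng.gauss(0.0, 0.5)   # potential outcome under intervention
        z = 1 if rng.random() < 0.5 else 0       # assignment drawn independently of (y0, y1)
        rows.append((y0, y1, z))
    return rows

def difference_in_means(rows):
    """Estimator (1): uses only the outcome observed under the assigned condition."""
    treated = [y1 for y0, y1, z in rows if z == 1]
    control = [y0 for y0, y1, z in rows if z == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate_rct(20000)
# The average individual difference is computable only because this is a simulation.
true_ate = sum(y1 - y0 for y0, y1, _ in rows) / len(rows)
estimate = difference_in_means(rows)
print(f"average of y1 - y0: {true_ate:.3f}; difference in sample means: {estimate:.3f}")
```

With a large simulated sample the two quantities should be close, which is the sense in which randomization links the observable sample means to the average causal effect.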

2.2 Structural Functional Response Models

Since only one of the potential outcomes yik is observable, we cannot model the yik’s directly using conventional regression models. One way around this is to model the observed outcomes such as y_{i_k k} as in the preceding section. Alternatively, we can circumvent this difficulty by constructing an observable response based on the unobserved yik and relating this response to the mean of yik as follows:

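$$
E\left(\frac{z_i^{k} (1 - z_i)^{1-k}\, y_{ik}}{\pi^{k} (1 - \pi)^{1-k}}\right) = \mu_k, \quad E(z_i) = \pi, \quad z_i = 0, 1, \quad 1 \le i \le n, \quad k = 0, 1, \tag{2}
$$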

where μk = E (yik) is the mean of potential outcome yik, since it is readily checked that

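$$
E\left(\frac{z_i^{k} (1 - z_i)^{1-k}\, y_{ik}}{\pi^{k} (1 - \pi)^{1-k}}\right) = \frac{1}{\pi^{k} (1 - \pi)^{1-k}}\, E\left(z_i^{k} (1 - z_i)^{1-k}\, y_{ik}\right) = \mu_k.
$$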

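Although the yik are not both observed, the functional response f (yi0, yi1, zi) = zi^k (1 − zi)^(1−k) yik / [π^k (1 − π)^(1−k)] in (2) is still well defined, since the factor zi^k (1 − zi)^(1−k) is nonzero only when zi = k, i.e., only for the potential outcome that is actually observed. If π is known, as in most RCTs, it is unnecessary to model zi and (2) reduces to the first equation.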
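As a hedged illustration of (2), the Python sketch below computes the functional responses for k = 1 and k = 0 from the observed outcome only and checks by simulation that their sample means approximate μ1 and μ0. The allocation probability, means and sample size are hypothetical values chosen for the example, and `functional_responses` is an illustrative name.

```python
import random

def functional_responses(y_obs, z, pi):
    """Responses in (2) for k = 1 and k = 0, plus z; y_obs is the outcome under the assigned arm."""
    f_k1 = z * y_obs / pi                # nonzero only when z = 1, where y_obs = y_i1
    f_k0 = (1 - z) * y_obs / (1 - pi)    # nonzero only when z = 0, where y_obs = y_i0
    return f_k1, f_k0, z

rng = random.Random(1)
pi, mu1, mu0, n = 0.3, 7.0, 5.0, 50000   # hypothetical design values
sums = [0.0, 0.0, 0.0]
for _ in range(n):
    z = 1 if rng.random() < pi else 0
    # Drawing the observed outcome by arm is equivalent to randomizing potential outcomes.
    y_obs = rng.gauss(mu1 if z == 1 else mu0, 1.0)
    for k, f in enumerate(functional_responses(y_obs, z, pi)):
        sums[k] += f

mu1_hat, mu0_hat, pi_hat = (s / n for s in sums)
print(f"mu1_hat={mu1_hat:.2f}, mu0_hat={mu0_hat:.2f}, pi_hat={pi_hat:.3f}")
```

Dividing by π (or 1 − π) compensates for the fact that each response is zero for subjects assigned to the other arm, which is why the simple averages over all n subjects target the potential-outcome means.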

The model in (2) is not a conventional regression model such as the generalized linear or non-linear models, since f (yi0, yi1, zi) is not a single linear response such as yik or zi. Rather, this model is a member of the following class of functional response models (FRM):

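$$
E\left[f(y_{i_1}, \ldots, y_{i_q}, \theta) \mid X_{i_1}, \ldots, X_{i_q}\right] = h(X_{i_1}, \ldots, X_{i_q}; \theta), \quad (i_1, \ldots, i_q) \in C_q^n, \tag{3}
$$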

where f (·) is some function, h (·) is some smooth function (e.g., with continuous second-order derivatives), yi and xi denote some response and explanatory variables, C_q^n denotes the set of all (n choose q) combinations of q distinct elements (i_1, …, i_q) from the integer set {1, …, n}, and θ is a vector of parameters. The response f (y_{i_1}, …, y_{i_q}, θ) in (3) for the general FRM can be quite a complex function of multiple outcomes (e.g., yik, zi in (2)) from different subjects as well as unknown parameters θ (e.g., π in (2)). By generalizing the response variable in this fashion, (3) provides a general framework for modeling a broad set of problems involving higher-order moments and between-subject attributes. The FRM has been applied to a range of methodological issues involving multi-subject responses such as extensions of the Mann-Whitney-Wilcoxon rank sum test to longitudinal and causal inference settings [5, 6], social network analysis [7, 8, 9], gene expression analysis [10], reliability coefficients [11, 12, 13, 14, 15, 16, 17] and complex response functions such as models for population mixtures [18] and structural equation models [19].

Because of its relationship to (3), the model in (2) will be referred to as the Structural FRM (SFRM):

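$$
\begin{aligned}
& E\left(f_{ik}(y_{i0}, y_{i1}, z_i)\right) = h_{ik}(\theta), \quad f_{i1} = \frac{z_i y_{i1}}{\pi}, \quad f_{i2} = \frac{(1 - z_i) y_{i0}}{1 - \pi}, \quad f_{i3} = z_i, \\
& h_{i1}(\theta) = \mu_1, \quad h_{i2}(\theta) = \mu_0, \quad h_{i3}(\theta) = \pi,
\end{aligned} \tag{4}
$$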

where θ = (μ1, μ0, π) denotes the collection of the parameters for this SFRM. Before adding more complexity to this SFRM to address treatment noncompliance within our context, let us first extend it to address selection bias in observational studies.

2.2.1 Selection Bias by Pretreatment Confounders

If subjects are not randomized with respect to the treatment condition (or exposure) as in observational studies (e.g., survey, epidemiologic studies), yik ⊥ zi is generally not true. In the presence of such selection bias, if wi is a vector of covariates containing all sources of confounding such that the ignorability condition [20], yik ⊥ zi | wi, holds, then we have:

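$$
E\left(\frac{z_i^{k} (1 - z_i)^{1-k}\, y_{ik}}{\pi(w_i)^{k} (1 - \pi(w_i))^{1-k}}\right) = E\left[E\left(\frac{z_i^{k} (1 - z_i)^{1-k}\, y_{ik}}{\pi(w_i)^{k} (1 - \pi(w_i))^{1-k}} \,\Big|\, w_i\right)\right] = \mu_k, \tag{5}
$$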

where π (wi) = E (zi | wi). We may model zi using a generalized linear model such as logistic regression:

$$
E(z_i \mid w_i) = \pi(w_i; \eta), \quad \operatorname{logit}(\pi(w_i; \eta)) = \eta^{\top} w_i, \quad 1 \le i \le n. \tag{6}
$$

By combining (5) and (6), we have the following SFRM to provide valid inference about θ = (μ1, μ0, η) under selection bias:

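$$
\begin{aligned}
& E\left(f_{ik}(y_{i0}, y_{i1}, z_i, w_i) \mid \mathcal{F}_k\right) = h_{ik}(\theta), \\
& f_{i1} = \frac{z_i y_{i1}}{\pi(w_i; \eta)}, \quad f_{i2} = \frac{(1 - z_i) y_{i0}}{1 - \pi(w_i; \eta)}, \quad f_{i3} = z_i, \\
& h_{i1}(\theta) = \mu_1, \quad h_{i2}(\theta) = \mu_0, \quad h_{i3}(w_i; \theta) = \pi(w_i; \eta), \quad \operatorname{logit}(\pi(w_i; \eta)) = \eta^{\top} w_i, \\
& \mathcal{F}_1 = \mathcal{F}_2 = \{0\}, \quad \mathcal{F}_3 = \{w_i\}, \quad 1 \le i \le n,
\end{aligned} \tag{7}
$$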

where ℱk = {0} (k = 1, 2) denotes the sigma field generated by the constant 0 and ℱ3 = {wi} denotes the sigma field generated by wi. Note that E (fik (yi0, yi1, zi, wi) | ℱk) = E (fik (yi0, yi1, zi, wi)) for k = 1, 2, since ℱk is contained in ℱ3 for k = 1, 2 (e.g., see Kowalski and Tu [12]).
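To make the weighting in (7) concrete, the following Python sketch simulates an observational study with a single confounder, fits the logistic model (6) by Newton-Raphson, and compares the weighted estimate of μ1 − μ0 with the naive difference in group means. It is a minimal sketch under stated assumptions: the data-generating values, the scalar covariate and the helper names `expit` and `fit_logistic` are hypothetical, and the propensity model is fitted separately here rather than jointly through the estimating equations discussed later in this section.

```python
import math
import random

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def fit_logistic(w, z, iters=15):
    """Newton-Raphson fit of logit P(z = 1 | w) = eta0 + eta1 * w for a scalar covariate."""
    e0 = e1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wi, zi in zip(w, z):
            p = expit(e0 + e1 * wi)
            g0 += zi - p                 # score for eta0
            g1 += (zi - p) * wi          # score for eta1
            v = p * (1.0 - p)
            h00 += v
            h01 += v * wi
            h11 += v * wi * wi           # entries of the information matrix
        det = h00 * h11 - h01 * h01
        e0 += (h11 * g0 - h01 * g1) / det
        e1 += (h00 * g1 - h01 * g0) / det
    return e0, e1

rng = random.Random(2)
n, effect = 40000, 1.5                   # hypothetical sample size and true effect
w = [rng.gauss(0.0, 1.0) for _ in range(n)]
z = [1 if rng.random() < expit(-wi) else 0 for wi in w]   # exposure depends on w
y = [5.0 + 2.0 * wi + effect * zi + rng.gauss(0.0, 1.0) for wi, zi in zip(w, z)]

eta0, eta1 = fit_logistic(w, z)
pi_hat = [expit(eta0 + eta1 * wi) for wi in w]
mu1_hat = sum(zi * yi / p for zi, yi, p in zip(z, y, pi_hat)) / n
mu0_hat = sum((1 - zi) * yi / (1.0 - p) for zi, yi, p in zip(z, y, pi_hat)) / n
weighted = mu1_hat - mu0_hat

n1 = sum(z)
naive = (sum(yi for yi, zi in zip(y, z) if zi == 1) / n1
         - sum(yi for yi, zi in zip(y, z) if zi == 0) / (n - n1))
print(f"naive: {naive:.2f}, weighted: {weighted:.2f}, true effect: {effect}")
```

In this setup subjects with larger w are less likely to be exposed but have larger outcomes, so the naive comparison is biased, while the weighted responses recover the effect when the propensity model is correctly specified.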

2.2.2 Treatment Noncompliance as Post-treatment Confounders

In many RCTs, even well-planned and executed ones, treatment effect may be significantly modified by levels of exposure of intervention (e.g., compliance or dosage) due to treatment noncompliance. One popular approach for addressing this primary post-treatment confounder is the structural mean model (SMM)[3]. Other competing approaches also address treatment noncompliance such as the Instrumental Variable[1] and Principal Stratification methods[2]. However, only SMM models treatment compliance on a continuous scale, which is more appropriate for session attendance within our context. We first frame this model within the FRM framework and then discuss its extensions to accommodate multi-layered interventions and missing data in Section 3.

Consider a randomized medication vs. placebo study and let di1 denote a continuous potential outcome of medication use, if the ith subject is assigned to the medication condition. The SMM models the dose effect on treatment difference as follows:

$$
E(y_{i1} - y_{i0} \mid d_{i1}) = g(d_{i1}), \tag{8}
$$

where g (·) is known up to a set of parameters (i.e., only the functional form of g (di1) is specified). However, the above model cannot be fit directly using conventional statistical methods, since only one of the potential outcomes (yi1, yi0) is observed. For RCTs, (yi1, yi0) ⊥ zi and it follows that:

$$
E(y_{i1} \mid d_{i1}, z_i = 1) = g(d_{i1}) + E(y_{i0} \mid d_{i1}, z_i = 0). \tag{9}
$$

By conditioning on the assigned treatment zi = k, yik in (9) represents the observed outcome from the kth treatment group (k = 0, 1). Thus, E (yi0 | di1, zi = 0) cannot be modeled directly, since di1 is not observed for the subjects assigned to the placebo condition.
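For completeness, the step from (8) to (9) can be written out as follows. This derivation assumes, as seems implicit in the argument above, that randomization makes zi independent of the potential dose di1 jointly with the potential outcomes (yi1, yi0), so that conditioning on zi does not change the conditional means given di1:

```latex
\begin{aligned}
g(d_{i1}) &= E(y_{i1} - y_{i0} \mid d_{i1})
           = E(y_{i1} \mid d_{i1}) - E(y_{i0} \mid d_{i1}) \\
          &= E(y_{i1} \mid d_{i1}, z_i = 1) - E(y_{i0} \mid d_{i1}, z_i = 0),
\end{aligned}
```

and rearranging the last line gives (9).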

If treatment compliance is tracked for the subjects in the placebo group, then di0, the potential outcome of placebo use if the subject is assigned to the placebo condition, is observed. Because of randomization and the fact that subjects cannot distinguish between medication and placebo, di0 has the same distribution as di1. Thus, we may replace di1 by di0 in E (yi0 | di1, zi = 0) to re-express (9) as:

$$
E(y_{i1} \mid d_{i1}, z_i = 1) = g(d_{i1}) + E(y_{i0} \mid d_{i0}, z_i = 0). \tag{10}
$$

Under this treatment compliance explainable condition, we will be able to model the right side to obtain estimates of dose-response relationships g (di1) [21].

Although applicable to medication studies, the SMM in (10) in general does not apply to psychosocial research. Many psychosocial intervention studies do offer attention or information controls and subjects in such control groups may also be tracked for their session attendance. However, unlike medication studies, compliance observed in the control group di0 generally does not have the same distribution as di1. For example, consider an HIV prevention intervention study for teenage girls at high risk for HIV infection, in which the intervention condition contains information on HIV infection, condom use and safe sex, while the control condition consists of nutritional and dietary information. Subjects with high compliance in the intervention group are generally different from their counterparts in the control condition; sexually active girls may form a majority of those with high attendance in the intervention group, while such girls might have low attendance rates, had they been assigned to the control condition. Thus, when assessing the effect of the prevention intervention using outcomes of HIV risk behavior such as the number of unprotected vaginal sex acts over the past month, it is not meaningful to compare compliant subgroups between the two treatment conditions.

Thus, for psychosocial research studies, we cannot simply replace di1 in E (yi0 | di1, zi = 0) by a measure of treatment compliance such as session attendance in the control group di0 as in medication trials. In many studies, it is reasonable to assume that there is sufficient information to predict di1, i.e., given a set of covariates xi, di1 is independent of yi0. For example, if xi contains information on sexuality and other information on a subject’s interest to attend sessions in the intervention condition of the HIV study example above, yi0 may no longer depend on di1 given xi. In this case, E (yi0 | di1, xi, zi = 0) = E (yi0 | xi, zi = 0). Thus, under this ignorability condition, yi0 ⊥ di1 | xi, (9) becomes:

$$
E(y_{i1} \mid x_i, d_{i1}, z_i = 1) = g(d_{i1}) + E(y_{i0} \mid x_i, z_i = 0). \tag{11}
$$

Note that the SMM in this case is essentially the same as the Principal Stratification Model, except that it requires neither discretization of di1 nor parametric distribution models for yik, since (11) only specifies the conditional mean of yik given di1, xi and zi.
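The content of (11) can be illustrated with a simple two-stage calculation. The Python sketch below is not the estimating-equation approach developed in this paper; it is a simplified stand-in that assumes a linear control-group mean, a linear dose effect g (d) = γd, and hypothetical simulation values. Stage 1 fits the control-group mean given x by least squares; stage 2 regresses the treated subjects' residuals on their observed dose. A version that ignores x is included for contrast.

```python
import random

def ols(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(3)
gamma_true, n = 0.5, 5000                      # hypothetical dose effect and sample size
treated, control = [], []
for _ in range(n):
    x = rng.gauss(0.0, 1.0)
    d1 = min(5.0, max(0.0, 2.5 + x + rng.gauss(0.0, 1.0)))  # potential dose, driven by x
    y0 = 5.0 + 2.0 * x + rng.gauss(0.0, 1.0)   # given x, y0 is independent of d1
    if rng.random() < 0.5:                     # randomized assignment
        treated.append((x, d1, y0 + gamma_true * d1))
    else:
        control.append((x, y0))                # d1 is not observed in the control arm

# Stage 1: model E(y_i0 | x_i, z_i = 0) from the control arm.
b0, b1 = ols([x for x, _ in control], [y for _, y in control])
# Stage 2: regress treated residuals on dose through the origin to estimate gamma.
den = sum(d * d for _, d, _ in treated)
gamma_hat = sum(d * (y - b0 - b1 * x) for x, d, y in treated) / den
# For contrast: subtract the overall control mean, ignoring x.
ybar0 = sum(y for _, y in control) / len(control)
gamma_naive = sum(d * (y - ybar0) for _, d, y in treated) / den
print(f"gamma_hat={gamma_hat:.3f}, gamma_naive={gamma_naive:.3f}, true={gamma_true}")
```

Because dose and the control outcome both depend on x in this simulation, the version that ignores x attributes part of the effect of x to dose, whereas conditioning on x as in (11) removes that bias.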

By modeling E (yi0 | xi, zi = 0) and casting (11) in the form of FRM, we obtain the following SFRM for modeling treatment compliance measured by a continuous dose variable di1 (for the intervention condition only):

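$$
\begin{aligned}
& f_{i1} = \frac{z_i y_{i1}}{\pi}, \quad f_{i2} = \frac{(1 - z_i) y_{i0}}{1 - \pi}, \quad f_{i3} = z_i, \quad 1 \le i \le n, \\
& h_{i1}(x, \beta) = h(x_i, \beta), \quad h_{i2}(x_i, d_{i1}, \theta) = g(d_{i1}, \gamma) + h(x_i, \beta), \quad \theta = (\beta, \gamma, \pi),
\end{aligned} \tag{12}
$$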

where h (x, β) (g (d, γ)) is some function of x (d) parameterized by β (γ). As before, n is the sample size of the study, i.e., the sample size of the intervention plus the control group. Although for RCTs it is not necessary to include π as a parameter, the general SFRM in (12) allows us to extend this model to observational studies. For example, for non-randomized studies, yik ⊥ zi in general is not true. If yik ⊥ zi holds conditional on a set of covariates wi (possibly overlapping with xi), then by modeling π as a function of wi as in (6), the following SFRM still provides consistent estimates in the face of selection bias:

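$$
\begin{aligned}
& f_{i1} = \frac{z_i y_{i1}}{\pi(w_i; \eta)}, \quad f_{i2} = \frac{(1 - z_i) y_{i0}}{1 - \pi(w_i; \eta)}, \quad f_{i3} = z_i, \quad 1 \le i \le n, \\
& h_{i1} = h_{i1}(x_i, \beta), \quad h_{i2}(x_i, d_{i1}, \beta, \gamma) = g(d_{i1}, \gamma) + h_{i1}(x_i, \beta), \\
& h_{i3} = \pi(w_i; \eta), \quad \operatorname{logit}(\pi(w_i; \eta)) = \eta^{\top} w_i, \quad \theta = (\beta, \gamma, \eta).
\end{aligned} \tag{13}
$$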

We can model h (xi, β) and g (di1, γ) in various ways. For example, we may model both as linear functions: h1 (xi, β) = xi⊤β and g (di1, γ) = γdi1. By specifying an appropriate form for g (di1, γ), we may also extend (12) to non-continuous dose variables such as categorical variables. Further, by appropriately specifying h1 (xi, β) and h2 (xi, di1, β), we can also generalize (12) to non-continuous responses. For example, for a binary yi, we may specify h1 (xi, β) and h2 (xi, di1, β) as follows:

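$$
h_1(x_i, \beta) = \operatorname{logit}^{-1}(x_i^{\top} \beta), \quad h_2(x_i, d_{i1}, \beta) = \operatorname{logit}^{-1}\left(g(d_{i1}, \gamma) + h_1(x_i, \beta)\right).
$$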

2.2.3 Inference for Structural Functional Response Models

We focus on inference about θ = (β, γ, η) for the SFRM in (13), from which (7) and (12) follow as a special case. Let

$$
f(y_i; z_i) = (f_{i1}, f_{i2}, f_{i3})^{\top}, \quad h_i(\theta) = (h_{i1}, h_{i2}, h_{i3})^{\top}, \quad 1 \le i \le n,
$$

where fik and hik are defined in (13). Then, consistent estimates of θ are readily obtained by using the Generalized Estimating Equations (GEE) for FRM [18, 12, 22]:

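$$
U(\theta) = \sum_{i=1}^{n} D_i V_i^{-1} S_i = 0, \quad S_i = f_i - h_i, \quad D_i = \frac{\partial}{\partial \theta} h_i, \quad V_i = A_i^{\frac{1}{2}} R(\alpha) A_i^{\frac{1}{2}}, \quad A_i = \operatorname{diag}_t\left(\operatorname{Var}(f_{it} \mid \mathcal{F}_{it})\right), \tag{14}
$$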

where R (α) denotes a choice of working correlation matrix.

The choice of R (α) and the associated properties of the GEE estimate of θ have been extensively discussed in the literature [23, 24]; we restate the key points here for ease of reference, without justification. In particular, the GEE estimate may not be consistent in the presence of time-varying covariates under working correlation structures other than the working independence model [23]. Thus, the working independence model may be used in general to ensure valid inference. This simple working correlation structure may incur some loss of efficiency for time-dependent covariates [24], so other models such as the uniform compound symmetry matrix may be used in specific applications to improve power; however, working independence suffices for the purpose of illustrating the proposed approach. We focus on the working independence model in what follows unless otherwise stated.

3 Extension to Complex Studies

We first extend the SFRM in Section 2 to longitudinal data and then to multi-layered intervention studies.

3.1 Longitudinal Data with Missing Values

Let yit = (yit1, yit0) denote the potential outcomes and xit a vector of explanatory variables of interest, with i indexing the subject and t the assessment time, for 1 ≤ i ≤ n and 1 ≤ t ≤ T. By applying (13) to each time point, we obtain a longitudinal version of the SFRM:

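$$
\begin{aligned}
& f_i = (f_{i1}^{\top}, \ldots, f_{iT}^{\top}, z_i)^{\top}, \quad h_i = (h_{i1}^{\top}, \ldots, h_{iT}^{\top}, \pi_i)^{\top}, \quad E(f_i \mid x_i) = h(x_i, \theta), \quad 1 \le i \le n, \\
& f_{it} = (f_{it1}, f_{it2})^{\top}, \quad f_{it1} = \frac{z_i}{\pi_i}\, y_{it1}, \quad f_{it2} = \frac{1 - z_i}{1 - \pi_i}\, y_{it0}, \\
& h_{it} = (h_{it1}, h_{it2})^{\top}, \quad h_{it1} = h_1(x_{it}, \beta), \quad h_{it2} = g_t(d_{i1}, \gamma) + h_{it1}, \\
& \pi_i = \operatorname{logit}^{-1}(\eta^{\top} w_i), \quad \theta = (\beta, \gamma, \eta).
\end{aligned} \tag{15}
$$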

Inference for the FRM above is based on the following GEE for FRM [18, 12, 22]:

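$$
U(\theta) = \sum_{i=1}^{n} D_i V_i^{-1} S_i = 0, \quad S_i = f_i - h_i, \quad D_i = \frac{\partial}{\partial \theta} h_i, \quad V_i = A_i^{\frac{1}{2}} R(\alpha) A_i^{\frac{1}{2}}, \quad A_i = \operatorname{diag}_t\left(\operatorname{Var}(f_{it} \mid x_{it})\right), \tag{16}
$$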

where Di and Vi are readily computed given (15) and R (α) denotes a choice of working correlation matrix.

Missing data is a common issue in longitudinal studies. The GEE in (16) generally yields biased estimates under the missing at random (MAR) mechanism [25, 26, 27]. The weighted generalized estimating equations (WGEE), a common approach for addressing this issue, has been extended to the FRM [18, 22]. We adapt this approach to the current context, with an alternative implementation to simplify the inference procedure. As in the literature, we assume Monotone Missing Data Patterns (MMDP) to facilitate inference [18, 22, 25, 26, 27].

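Let yit denote the observed potential outcome, i.e., yit = yitk if the subject is assigned the kth treatment. Let

$$
\tilde{y}_{it} = (y_{i1}, \ldots, y_{i(t-1)})^{\top}, \quad \tilde{x}_{it} = (x_{i1}^{\top}, \ldots, x_{i(t-1)}^{\top})^{\top}, \quad 2 \le t \le T,
$$

denote all individual responses ($\tilde{y}_{it}$) and explanatory variables ($\tilde{x}_{it}$) prior to time t. Let

$$
\begin{aligned}
r_{it} &= \begin{cases} 1 & \text{if the } i\text{th subject is observed at time } t \\ 0 & \text{otherwise} \end{cases}, \qquad
p_{it} = \begin{cases} 1 & \text{if } t = 1 \\ E(r_{it} = 1 \mid r_{i(t-1)} = 1, \tilde{x}_{it}, \tilde{y}_{it}) & \text{if } t > 1 \end{cases}, \\
\operatorname{logit}(p_{it}) &= \xi_{0t} + \xi_{xt}^{\top} \tilde{x}_{it} + \xi_{yt}^{\top} \tilde{y}_{it}, \qquad
\Psi_{it} = \Big(\prod_{s=1}^{t} p_{is}\Big)^{-1} r_{it} I_2, \\
\Psi_i(\xi) &= \operatorname{diag}_t(\Psi_{it}), \qquad \xi_t = (\xi_{0t}, \xi_{xt}^{\top}, \xi_{yt}^{\top})^{\top}, \qquad \xi = (\xi_2^{\top}, \ldots, \xi_T^{\top})^{\top}.
\end{aligned} \tag{17}
$$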

We assume no missing data at baseline such that ri1 ≡ 1 (1 ≤ i ≤ n). Under this assumption, together with the MAR and MMDP assumptions, pit in (17) is well defined for 1 ≤ t ≤ T. By integrating the weights Ψi into the GEE in (16), we obtain the following WGEE for inference about θ:

$$
U(\theta, \xi) = \sum_{i=1}^{n} D_i V_i^{-1} \Psi_i S_i = 0. \tag{18}
$$

In the extant literature, an estimate ξ̂ of ξ, obtained from a separate set of estimating equations, is substituted into the WGEE and (18) is then solved for θ to obtain the WGEE estimate θ̂ of θ. Since θ̂ is conditional upon ξ̂, its asymptotic variance is then adjusted to account for the sampling variability of ξ̂. If the estimate of α is √n-consistent and ξ̂ is asymptotically normal, the WGEE estimate θ̂ obtained from (18) is consistent and asymptotically normal [18, 22, 27]. The procedure for adjusting the asymptotic variance for the sampling variability of ξ̂ is quite complex, and thus we discuss an alternative approach that estimates ξ and θ simultaneously.
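The scalar part of the weights in (17), rit divided by the product of the conditional observation probabilities up to time t, is straightforward to compute. The Python sketch below shows this calculation and a small two-wave simulation in which missingness at follow-up depends on the baseline response (MAR). It is an illustration only: the function name `wgee_weights` and all simulation values are hypothetical, and the true observation probabilities are used in place of estimates from a fitted model.

```python
import math
import random

def wgee_weights(r, p):
    """Return r_it / prod_{s <= t} p_is for t = 1..T under a monotone missing pattern.

    r: 0/1 observation indicators with r[0] == 1; p: conditional probabilities with p[0] == 1.
    """
    weights, cum = [], 1.0
    for r_t, p_t in zip(r, p):
        cum *= p_t
        weights.append(r_t / cum)
    return weights

example = wgee_weights([1, 1, 0], [1.0, 0.8, 0.5])   # subject observed at t = 1, 2 only

rng = random.Random(4)
n = 40000
sum_weighted = sum_observed = n_observed = 0.0
for _ in range(n):
    y1 = rng.gauss(0.0, 1.0)                    # baseline response, always observed
    y2 = y1 + rng.gauss(0.0, 1.0)               # follow-up response, E(y2) = 0
    p2 = 1.0 / (1.0 + math.exp(-(1.0 + y1)))    # P(observed at t = 2 | y1), i.e. MAR
    r2 = 1 if rng.random() < p2 else 0
    w2 = wgee_weights([1, r2], [1.0, p2])[1]
    sum_weighted += w2 * y2
    sum_observed += r2 * y2
    n_observed += r2

weighted_mean = sum_weighted / n                 # inverse-probability-weighted mean of y2
complete_case_mean = sum_observed / n_observed   # mean of y2 among observed subjects only
print(f"weighted: {weighted_mean:.3f}, complete-case: {complete_case_mean:.3f}")
```

Because subjects with larger baseline values are more likely to be observed in this setup, the complete-case mean is biased upward, while the weighted mean stays close to the true value of zero.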

Let

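$$
\begin{aligned}
& f_i = (f_{i1}^{\top}, \ldots, f_{iT}^{\top}, z_i, r_{i2}, \ldots, r_{iT})^{\top}, \quad h_i = (h_{i1}^{\top}, \ldots, h_{iT}^{\top}, \pi_i, p_{i2}, \ldots, p_{iT})^{\top}, \\
& \theta = (\beta, \eta, \gamma), \quad 1 \le i \le n, \quad 1 \le t \le T,
\end{aligned} \tag{19}
$$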

where fit, hit and πi are defined in (15), and rit and pit are defined in (17). Consider the WGEE in (18), but with Di and Ψi redefined as follows to provide estimates for both θ and ξ:

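$$
\begin{aligned}
& D_i = \frac{\partial}{\partial \theta} h_i, \quad
V_i = \begin{pmatrix} V_{i11} & 0 & 0 \\ 0 & V_{i22} & 0 \\ 0 & 0 & V_{i33} \end{pmatrix}, \quad
V_{i11} = A_i^{\frac{1}{2}} R(\alpha) A_i^{\frac{1}{2}}, \quad V_{i22} = \pi_i (1 - \pi_i), \\
& V_{i33} = \operatorname{diag}\left(p_{i2}(1 - p_{i2}), \ldots, p_{iT}(1 - p_{iT})\right), \quad
\Psi_i = \begin{pmatrix} \Psi_{i11} & 0 & 0 \\ 0 & \Psi_{i22} & 0 \\ 0 & 0 & \Psi_{i33} \end{pmatrix}, \\
& \Psi_{i11} = \operatorname{diag}(\Psi_{it}), \quad \Psi_{it} = r_{it} \Big(\prod_{s=1}^{t} p_{is}\Big)^{-1} I_2, \quad \Psi_{i22} = 1, \quad \Psi_{i33} = \operatorname{diag}\left(r_{i1}, \ldots, r_{i(T-1)}\right),
\end{aligned} \tag{20}
$$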

where Ai is defined in (16). Unlike the two-step use of (18) described above, the WGEE based on (19) and (20) makes joint inference about θ and ξ. Thus, no adjustment is necessary for the asymptotic variance of the WGEE estimate of θ to account for the sampling variability of ξ̂ as in the standard approach above.

3.2 Multi-layered Intervention Study

We now extend the SFRM above to multi-layered interventions to address treatment noncompliance from different intervention layers, such as the child and parent layers of the RRP. For notational brevity, we focus on two-layered interventions, since extensions to multi-layered interventions with more than two layers are straightforward.

Consider a two-layered intervention study and let ui1 denote some (continuous) treatment compliance measure for the second layer. By taking into account both compliance measures di1 and ui1, we obtain from (11) the following dose-response relationship:

$$
E(y_{i1} \mid x_i, d_{i1}, u_{i1}, z_i = 1) = g(d_{i1}, u_{i1}) + E(y_{i0} \mid x_i, z_i = 0). \tag{21}
$$

We assume that the covariates xi sufficiently explain treatment compliance patterns for both the primary and secondary layers of the multi-layered intervention, i.e., yi0 ⊥ di1 | xi and yi0 ⊥ ui1 | xi. In some studies, treatment noncompliance may be limited to some intervention layers, in which case xi is only required to explain the affected layers. For example, in the RRP, noncompliance is a major issue only for the second parent support layer and the ignorability condition only needs to be assumed for parent participation.

By formulating (21) as an FRM as in the case of single-layered intervention study, we obtain the following SFRM for modeling the effect of treatment noncompliance on the outcome in a two-layered intervention study:

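$$
\begin{aligned}
& f_{i1} = \frac{z_i y_{i1}}{\pi(w_i; \eta)}, \quad f_{i2} = \frac{(1 - z_i) y_{i0}}{1 - \pi(w_i; \eta)}, \quad f_{i3} = z_i, \quad 1 \le i \le n, \\
& h_{i1} = h_1(x_i, \beta), \quad h_{i2}(x_i, d_{i1}, u_{i1}) = g(d_{i1}, u_{i1}, \gamma) + h_1(x_i, \beta), \\
& \pi_i = \pi(w_i; \eta), \quad E(z_i \mid x_i, d_{i1}, u_{i1}, \theta) = \pi_i, \quad \theta = (\beta, \gamma, \eta).
\end{aligned} \tag{22}
$$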

where 1 ≤ i ≤ n. The above has essentially the same form as the single-layered SFRM, except that the treatment effect g (di1, ui1, γ) is a function of compliance from both the primary and secondary intervention layers. Note that (22) applies to observational studies as well, in which case wi is assumed to account for all sources of selection bias.

We can model the treatment effect g (di1, ui1, γ) to reflect treatment compliance in both layers. For example, we may specify an additive effect function, g (di1, ui1, γ) = γ1di1 + γ2ui1, or we may also include a between-layer treatment compliance interaction di1ui1. If the treatment effect is moderated by some covariate ci, we may also include a treatment moderating effect by setting g (di1, ui1, ci, γ) = ci (γ1di1 + γ2ui1). If the moderating effect only occurs in one of the intervention layers, we may model g (di1, ui1, ci, γ) as γ1cidi1 + γ2ui1 or γ1di1 + γ2ciui1, depending on whether the moderating effect operates at the primary or secondary layer of the intervention.
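The alternative specifications of the treatment effect function listed above can be written as small functions, which may help when comparing them side by side. In this Python sketch the function names are illustrative, the third coefficient used for the interaction term is an assumed label (the text does not name it), and the numerical inputs are arbitrary.

```python
def g_additive(d, u, gamma):
    """Additive effect of primary-layer dose d and secondary-layer dose u."""
    return gamma[0] * d + gamma[1] * u

def g_interaction(d, u, gamma):
    """Additive effect plus a between-layer interaction (coefficient gamma[2] is an assumed label)."""
    return gamma[0] * d + gamma[1] * u + gamma[2] * d * u

def g_moderated(d, u, c, gamma):
    """Covariate c moderates the effect of both layers."""
    return c * (gamma[0] * d + gamma[1] * u)

def g_moderated_primary(d, u, c, gamma):
    """Covariate c moderates the primary layer only."""
    return gamma[0] * c * d + gamma[1] * u

def g_moderated_secondary(d, u, c, gamma):
    """Covariate c moderates the secondary layer only."""
    return gamma[0] * d + gamma[1] * c * u

d, u, c = 2.0, 3.0, 0.5          # arbitrary compliance levels and moderator value
gamma = (0.5, 0.5, 0.4)
values = {
    "additive": g_additive(d, u, gamma),
    "interaction": g_interaction(d, u, gamma),
    "moderated": g_moderated(d, u, c, gamma),
    "moderated_primary": g_moderated_primary(d, u, c, gamma),
    "moderated_secondary": g_moderated_secondary(d, u, c, gamma),
}
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```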

As in the case of single-layered intervention study, the cross-sectional SFRM in (22) is readily extended to longitudinal studies. For example, by replacing the treatment effect function gt (di1, γ) in (15) by gt (di1, ui1, γ) in (22), the SFRM in (15) can be applied to model the effect of treatment compliance for two-layered observational studies. As well, by modeling the missing data under MAR using (17), we can make joint inference about θ in (22) and ξ for the missing data model using a WGEE akin to (18), but with Di, Vi, Ψi and Si in (20) redefined based on (22).

In the above, we have assumed that both di1 and ui1 are continuous. The models are easily extended to non-continuous compliance variables, if either di1 or ui1 or both are non-continuous.

4 Simulation Studies

We carried out a series of simulation studies to assess the performance of the proposed SFRM for multi-layered intervention studies for the most general case under both pre-treatment and post-treatment confounders. Since our RRP is a two-layered intervention study, we only considered this special case for the simulation study. We assessed the performance of the models under both cross-sectional and longitudinal data.

We considered continuous and binary outcomes yi for both cross-sectional and longitudinal data settings, with a continuous treatment noncompliance variable for both the primary and secondary layer. To save space, we only report results for two sample sizes, n = 50 and 200, for a continuous response in cross-sectional data (Model I) and n = 100 and 400 for a binary response in longitudinal data (Model II). The larger sample sizes for the binary outcome were chosen to achieve more reliable estimates, because data are sparser in the binary response case, especially in the presence of missing data in the longitudinal data setting. All simulations were performed with a Monte Carlo (MC) sample of 1,000. All analyses were carried out using code developed by the authors for implementing the models considered using the R software platform [28].

For the cross-sectional data scenario, let yik (k = 0, 1) be a continuous outcome in Model I and let di (ui) denote a continuous treatment noncompliance variable for the primary (secondary) intervention layer. Model I for the continuous yik is defined as follows:

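Model I: Continuous yik for Cross-sectional Data

$$
\begin{aligned}
& y_{i0} \mid x_i, b_i = \mu(x_i; \beta) + b_i + e_{i0}, \quad \mu(x_i; \beta) = \beta_0 + \beta_1 x_i, \\
& y_{i1} = \begin{cases}
g_1(d_i, u_i, x_i; \gamma) + \mu(x_i; \beta) + b_i + e_{i1} & \text{given } d_i, u_i, x_i, b_i \\
g_2(d_i, u_i, x_i, c_i; \gamma) + \mu(x_i; \beta) + b_i + e_{i1} & \text{given } d_i, u_i, x_i, c_i, b_i
\end{cases}, \\
& g_1(d_i, u_i, x_i; \gamma) = \gamma_0 d_i + \gamma_1 u_i + \gamma_2 u_i d_i, \quad g_2(d_i, u_i, x_i, c_i; \gamma) = c_i\, g_1(d_i, u_i, x_i; \gamma), \\
& \pi_i = \operatorname{logit}^{-1}(\eta_0 + \eta_1 x_i), \quad d_i, u_i \sim U(0, 5), \quad x_i, c_i \sim N(0, 1), \\
& b_i \sim (\chi_1^2 - 1)\, \sigma_b^2 / 2, \quad e_{i1}, e_{i0} \sim (\chi_1^2 - 1)\, \sigma^2 / 2, \\
& \beta = (\beta_0, \beta_1)^{\top} = (5, 2)^{\top}, \quad \gamma = (\gamma_0, \gamma_1, \gamma_2)^{\top} = (0.5, 0.5, 0.4)^{\top}, \quad \eta = (\eta_0, \eta_1)^{\top} = (0, -1)^{\top}, \\
& \sigma_b^2 = \sigma^2 = 1, \quad \theta = (\beta, \gamma, \eta),
\end{aligned} \tag{23}
$$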

where zi is the indicator of treatment assignment, xi is a confounding variable (for both pre- and post-treatment), ci is a treatment moderator, g1 (g2) is a function modeling the effect of treatment noncompliance without (with) the treatment moderator, U (a, b) denotes a uniform distribution over the interval between a and b, and χ_p^2 denotes a chi-square distribution with p degrees of freedom. Since (yi0, yi1) share the same random effect bi, they are not independent. Note that to demonstrate robustness of the SFRM, both the random effect bi and the model errors eik followed non-normal distributions. In (23), we considered two treatment effect functions, g1(di, ui, xi; γ) and g2(di, ui, xi, ci; γ), with the latter moderating the former through a treatment moderator ci. This moderator ci can be associated with either the primary or secondary layer of the multi-layered intervention.
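A data generator for Model I might look like the following Python sketch. It is not the authors' simulation code (which was written in R) but an illustration under stated assumptions: η1 = −1 is taken from the parameter listing in Table 2, the scale factor of the centered chi-square terms is read literally from (23) as σ²/2 and may differ from the original typesetting, and the function and field names are hypothetical.

```python
import math
import random

BETA = (5.0, 2.0)
GAMMA = (0.5, 0.5, 0.4)
ETA = (0.0, -1.0)     # eta1 = -1, as listed in Table 2
SCALE = 0.5           # sigma^2 / 2 with sigma_b^2 = sigma^2 = 1, read literally from (23)

def centered_chisq(rng):
    """Centered, scaled chi-square(1) draw used for the random effect and the errors."""
    return (rng.gauss(0.0, 1.0) ** 2 - 1.0) * SCALE

def simulate_model_one(n, moderated=False, seed=0):
    """Generate observed data under Model I; set moderated=True to use g2 = c * g1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, c = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        d, u = rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0)
        b = centered_chisq(rng)                        # shared random effect
        mu = BETA[0] + BETA[1] * x
        g = GAMMA[0] * d + GAMMA[1] * u + GAMMA[2] * u * d
        if moderated:
            g *= c
        y0 = mu + b + centered_chisq(rng)              # potential outcome under control
        y1 = g + mu + b + centered_chisq(rng)          # potential outcome under intervention
        pi = 1.0 / (1.0 + math.exp(-(ETA[0] + ETA[1] * x)))
        z = 1 if rng.random() < pi else 0              # assignment depends on x
        data.append({"x": x, "c": c, "d": d, "u": u, "z": z, "y": y1 if z == 1 else y0})
    return data

sample = simulate_model_one(200)
share_treated = sum(row["z"] for row in sample) / len(sample)
print(f"n = {len(sample)}, share treated = {share_treated:.2f}")
```

Only the outcome under the assigned condition is stored, mirroring the fact that one potential outcome per subject is observed. In an actual analysis the compliance variables d and u would be used only for subjects in the intervention condition.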

Shown in Table 2 are the estimates of θ, along with their model-based (Mod. S.E.) and empirical (Emp. S.E.) standard errors for Model I. The model-based standard errors were computed based on the estimated asymptotic variance, while their empirical counterparts were calculated from the MC replicates. At the larger sample size n = 200, all parameter estimates were quite close to the true values of the respective parameters. The model-based standard errors also matched their empirical counterparts quite well. Although the differences between the parameter estimates and their true values, and between the model-based and empirical standard errors, all increased for the smaller sample size n = 50, the SFRM still performed quite well.

Table 2.

Parameter estimates and standard errors for Model I with a cross-sectional continuous response.

Parameter estimates and standard errors for Model I with treatment effect functions g1/g2

| Parameter | Est. (n = 50) | Mod. S.E. (n = 50) | Emp. S.E. (n = 50) | Est. (n = 200) | Mod. S.E. (n = 200) | Emp. S.E. (n = 200) |
|---|---|---|---|---|---|---|
| γ0 = 0.5 | 0.475/0.514 | 0.629/0.566 | 0.744/0.758 | 0.496/0.495 | 0.318/0.274 | 0.326/0.289 |
| γ1 = 0.5 | 0.459/0.535 | 0.655/0.553 | 0.808/0.760 | 0.505/0.515 | 0.315/0.271 | 0.323/0.300 |
| γ2 = 0.4 | 0.428/0.377 | 0.369/0.313 | 0.453/0.438 | 0.402/0.395 | 0.179/0.163 | 0.196/0.175 |
| β0 = 5 | 5.107/4.981 | 0.289/0.304 | 0.338/0.394 | 4.998/5.012 | 0.158/0.157 | 0.175/0.172 |
| β1 = 2 | 1.995/2.013 | 0.341/0.348 | 0.402/0.509 | 2.002/1.976 | 0.189/0.189 | 0.201/0.195 |
| η0 = 0 | 0.033/−0.003 | 0.325/0.326 | 0.329/0.343 | 0.001/−0.003 | 0.150/0.157 | 0.158/0.150 |
| η1 = −1 | −1.086/−1.107 | 0.394/0.395 | 0.422/0.461 | −1.017/−1.016 | 0.189/0.189 | 0.188/0.199 |

For the longitudinal data, as noted earlier, we only report results for a binary response. We extended both the mean for the control group, μt (xi; β), and the treatment effect function, gt (di, ui, xi, ci; γ), in the cross-sectional case to include a temporal trend. In addition, to reflect the treatment noncompliance patterns in the RRP study, where treatment noncompliance only occurred in the supportive parent layer, we only considered treatment noncompliance in the second layer. As in the cross-sectional data setting, we also included a treatment moderator ci in gt (di, ui, xi, ci; γ). For notational brevity, we only considered one treatment effect function and two assessments, with t = 1 (2) denoting the baseline (follow-up). We created about 22% missing data at the follow-up.

We discussed two approaches for longitudinal data analysis. The first employs the conventional WGEE that conditions on the estimates of the missing data model and adjusts the variance estimates of parameter estimates to account for the sampling variability in the estimates of the missing data model. Since the adjustment part is quite complex, we also discussed an alternative that utilized the flexibility of FRM to model both missing data and treatment effect simultaneously. We used this latter approach in the simulation study.
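The weighting idea behind the first (WGEE) approach can be illustrated in its simplest form, estimating a follow-up mean under MAR. This is a sketch under strong simplifying assumptions, not the estimator used in the paper: the observation probabilities are treated as known rather than estimated (so the variance adjustment mentioned above does not arise), and all variable names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Binary baseline and follow-up responses; follow-up depends on baseline.
y1 = rng.binomial(1, 0.5, size=n)
y2 = rng.binomial(1, np.where(y1 == 1, 0.8, 0.3))

# MAR: the probability that the follow-up is observed depends only on baseline.
p_obs = np.where(y1 == 1, 0.9, 0.6)
r = rng.binomial(1, p_obs)  # r = 1 means the follow-up is observed

true_mean = y2.mean()           # uses all data (unavailable in practice)
naive_mean = y2[r == 1].mean()  # complete cases only, biased under MAR
# Inverse-probability-weighted mean: observed cases weighted by 1 / p_obs.
ipw_mean = np.sum(r * y2 / p_obs) / np.sum(r / p_obs)
```

Here the complete-case mean overstates the follow-up mean because subjects with a positive baseline response are both more likely to respond positively and more likely to be observed, while the weighted mean recovers the full-data value.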

For the binary response yik, the SFRM is given by:

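Model II: Binary $y_{ik}$ for Longitudinal Data Setting

$$
\begin{aligned}
y_{it0} \mid x_i &= \operatorname{logit}^{-1}\bigl(\mu_t(x_i;\beta)\bigr), \qquad \mu_t(x_i;\beta) = \beta_0 + \beta_1 t + \beta_2 x_i + \beta_3 x_i t,\\
y_{it1} \mid d_i,u_i,x_i,c_i &= \operatorname{logit}^{-1}\bigl(g_t(d_i,u_i,x_i,c_i;\gamma)\bigr) + \mu_t(x_i;\beta),\\
\pi_i &= \operatorname{logit}^{-1}(\eta_0 + \eta_1 x_i), \qquad g_t(d_i,u_i,x_i,c_i;\gamma) = \gamma_0 u_i t + \gamma_1 c_i u_i t,\\
p_i &= \operatorname{logit}^{-1}\bigl(\xi_0 + \xi_1 y_{i0}^{o}\bigr), \qquad d_i, u_i \sim U(0,4), \quad x_i, c_i \sim N(0,1),\\
\beta &= (\beta_0,\beta_1,\beta_2,\beta_3)^{\top} = (-1,1,1,-1), \qquad \gamma = (\gamma_0,\gamma_1)^{\top} = (1,1),\\
\eta &= (\eta_0,\eta_1)^{\top} = (0,-1), \qquad \xi = (\xi_0,\xi_1)^{\top} = (1,1), \qquad \theta = (\beta,\gamma,\eta,\xi).
\end{aligned}
\tag{24}
$$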

where pi = E(ri1 = 1 | yi1) is the probability of missing data at the follow-up t = 2 for both the treatment and control groups. For the control group, we included a time effect as well as a time by covariate interaction. As indicated earlier, the treatment effect function gt (di, ui, xi, ci; γ) also included a treatment moderator ci. Since the probability of a missing response at post-treatment, pi, depends only on the baseline yi1, the missing data mechanism is missing at random (MAR). Under the specified ξ, about 22% of the follow-up data were missing. The correlated yitk were created by copula methods [29, 30]. The correlation between the two potential outcomes within each assessment time, as well as between the two assessments within the same potential outcome, was set at about 0.5 in our simulation study, unadjusted for any of the explanatory variables.
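The data-generating steps described above, correlated binary outcomes from a copula followed by baseline-dependent missingness, can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from the paper: a Gaussian copula is used (the text only says "copula methods"), the marginal probabilities 0.5 and 0.6 and the latent correlation 0.7 are hypothetical, the covariates of Model II are omitted, and r = 1 is taken to mark an observed follow-up.

```python
import numpy as np
from statistics import NormalDist

def correlated_binary(p1, p2, rho, n, rng):
    """Draw n pairs of binary outcomes with margins p1, p2 via a Gaussian copula.

    rho is the correlation of the latent normals (an assumption of this
    sketch); the induced correlation between the binary outcomes is smaller.
    """
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    # y = 1 when the latent normal falls below the p-quantile, so P(y = 1) = p.
    cut1 = NormalDist().inv_cdf(p1)
    cut2 = NormalDist().inv_cdf(p2)
    return (z[:, 0] < cut1).astype(int), (z[:, 1] < cut2).astype(int)

def expit(a):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(2)
y_base, y_follow = correlated_binary(p1=0.5, p2=0.6, rho=0.7, n=100_000, rng=rng)

# Observation indicator with xi = (1, 1): depends only on the baseline response.
xi0, xi1 = 1.0, 1.0
r = rng.binomial(1, expit(xi0 + xi1 * y_base))  # r = 1: follow-up observed
missing_rate = 1.0 - r.mean()
binary_corr = np.corrcoef(y_base, y_follow)[0, 1]
```

With a baseline prevalence of 0.5, this toy setup has an expected missing rate of about 0.19, and the latent correlation of 0.7 induces a correlation of roughly 0.5 between the two binary outcomes. The exact figures in the simulation study depend on the full Model II specification, which this sketch does not reproduce.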

Shown in Table 3 are the estimates of θ, along with their model-based (Mod. S.E.) and empirical (Emp. S.E.) standard errors for Model II. In comparison to the cross-sectional data case, Table 3 contains estimates for the additional parameters ξ = (ξ0, ξ1)𝖳 of the missing data model. As in the case of cross-sectional data, the parameter estimates were close to their true values and the model-based standard errors were close to their empirical counterparts.

Table 3.

Parameter estimates and standard errors for Model II with a longitudinal binary response.

            n = 100                              n = 400
Parameter   Est.        Mod. S.E.   Emp. S.E.    Est.        Mod. S.E.   Emp. S.E.
γ0 = 1      0.873       0.497       0.508        1.047       0.290       0.321
γ1 = 1      0.964       0.512       0.586        1.070       0.284       0.317
β0 = −1     −1.067      0.468       0.491        −1.022      0.216       0.213
β1 = 1      1.128       0.461       0.525        1.025       0.205       0.217
β2 = 1      1.089       0.589       0.605        1.016       0.264       0.279
β3 = −1     −1.182      0.583       0.690        −1.044      0.261       0.275
η0 = 0      0.021       0.209       0.224        −0.001      0.108       0.109
η1 = −1     −1.058      0.291       0.303        −1.008      0.144       0.148
ξ0 = 1      1.022       0.273       0.292        1.010       0.131       0.135
ξ1 = 1      1.088       0.689       0.702        1.033       0.325       0.346

5 Rochester Child Resilience Study

The Rochester Resilience Project (RRP) is a randomized two-layered intervention study with significant treatment noncompliance by the parent, whose participation forms the second, supportive layer of the intervention. The study's enrollment began in Fall 2006, with data collection for the final cohort completed by June 2011. The sample comprised 401 students in first through third grade from Rochester City School District elementary schools. The study examines how children at higher risk of developing behavioral problems improve in the intervention condition as compared to the control condition over a 30-month period. Each child was assessed at baseline and at 6, 18, and 30 months post-baseline.

Since treatment compliance was quite good for the children in the study, we only considered variability in parent participation. In order to apply the proposed SFRM to the data in this study, we first examined the baseline covariates to see if any of these variables effectively predicted the patterns of treatment noncompliance. We treated the second-layer noncompliance measure, ui, the number of sessions attended by the parent, as a continuous variable and applied linear regression.
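The screening step described above, regressing the number of sessions attended on baseline covariates, can be sketched with ordinary least squares. The data below are simulated and hypothetical (they are not the RRP data), the variable names and coefficient values are illustrative assumptions, and the school factor is coded with the last school as the reference level.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400

# Hypothetical baseline covariates (simulated, not the RRP data).
pnc = rng.normal(size=n)
parent_age = rng.normal(35, 6, size=n)
school = rng.integers(0, 5, size=n)  # 5 schools; the last one is the reference
school_dummies = (school[:, None] == np.arange(4)).astype(float)

# Simulated number of sessions attended, generated with known coefficients.
true_beta = np.array([2.0, 1.0, 0.1, -4.0, -3.0, -3.0, -4.0])
X = np.column_stack([np.ones(n), pnc, parent_age, school_dummies])
u = X @ true_beta + rng.normal(scale=2.0, size=n)

# Ordinary least squares fit with model-based standard errors.
beta_hat, *_ = np.linalg.lstsq(X, u, rcond=None)
resid = u - X @ beta_hat
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
z_stats = beta_hat / se  # large |z| flags a predictor of attendance
```

Covariates flagged in this way are the ones carried forward as adjustment variables in the outcome model, which is the role PNC, parent age, DomEX and school play in the analysis that follows.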

Shown in Table 4 are the estimated coefficients, standard errors and p-values of the variables that significantly predicted the number of sessions attended, ui, in the linear regression model. The variable School Number represents the different schools that the children attended. The variable PNC stands for the Perceived Need for Care scale, which assesses how frequently over the past six months the parent viewed her child as needing help for behavioral or emotional problems, including through communication with others about the child [31]. DomEX Baseline denotes the baseline value of the subscale of the Dominic Interactive self-report that assesses symptoms of three externalizing problems (oppositional defiant, conduct problems, and ADHD) [32]. The regression results show that session participation differed significantly across schools and across children with different PNC and DomEX baseline values. In addition, parent age significantly predicted session attendance.

Table 4.

Estimates, standard errors and p-values for significant predictors of parent participation for the Rochester Resilience Project from generalized linear models.

Explanatory Variable   Estimate    Standard Error   p-value
PNC                    0.9191      0.2698           0.0008
Parent Age             0.0882      0.0293           0.0030
DomEX Baseline         0.9127      0.0373           0.0154
School Number                                       <.0001
  School 19            −4.1065     0.8446           <.0001
  School 22            −3.3860     0.9122           0.0003
  School 30            −3.1342     0.9873           0.0018
  School 45            −4.3626     0.8440           <.0001
  School 50            0.0000

For our illustration of the model, we focused on two primary behavior outcomes of the study: the teacher rating of aggressive behavior (AthAcc) and the parent rating of internalizing behavior problems (PIntD). For both outcomes, higher values indicate fewer problems. For each of these behavior outcomes yit, let yit1 and yit0 denote the potential outcomes of yit at baseline (t = 1) and at each of the three follow-ups (2 ≤ t ≤ 4). We modeled the causal treatment effect as a function of treatment compliance in the parent layer using an SFRM as follows:

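$$
\begin{aligned}
E\!\left(\frac{1-z_i}{1-\pi}\,y_{it0} \,\Big|\, u_i\right) &= \mu_{it}, \qquad
E\!\left(\frac{z_i}{\pi}\,y_{it1} \,\Big|\, u_i\right) = g_{it} + \mu_{it}, \qquad
E(z_i) = \pi,\\
\mu_{it} &= \beta_0 + \beta_1 t + x_{i1}\beta_2 + \beta_3 x_{i1} t + \beta_4 x_{i2} + \beta_5 x_{i3} + \beta_6 x_{i4} + \beta_7 x_{i5} + \beta_8 x_{i6} + \beta_9 x_{i7} + \beta_{10} x_{i8},\\
g_{it} &= \gamma u_i t, \qquad 1 \le t \le 4,
\end{aligned}
\tag{25}
$$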

where zi is the indicator variable of treatment assignment with zi = 1 (0) for intervention (control), xi1 denotes the age of the child at baseline, xi2, ..., xi5 denote the four binary indicators of Schools 19, 22, 30 and 45, and xi6, xi7 and xi8 denote the PNC, DomEX Baseline and Parent Age variables, respectively. In addition, we included Age and the Age by time interaction, since our theory as well as preliminary analyses show that these behavioral outcomes have different trajectories for children of different ages.

Prior to fitting the SFRM, we examined the missing data mechanism using logistic regression to determine whether missing data at each of the follow-up times (6, 18, and 30 months post-baseline) depended on the observed outcomes at prior assessment times. Results indicated that missing data were not associated with the observed data for either of the two behavior outcomes considered. Thus, we assumed that the dropouts for these two behavior outcomes in the RRP study followed the Missing Completely at Random (MCAR) mechanism. The MCAR mechanism was also consistent with the excellent treatment compliance observed for the study subjects (children), since, unlike parent participation, both the intervention and the assessments took place during regular school time.

Shown in Table 5 are the estimates (Est.), standard errors (S.E.) and p-values (p-value) for the parameter γ in the treatment effect function git in (25) for the two behavior outcomes analyzed. Within the context of the study, this parameter γ measures the rate of change of the behavior outcome per month for each additional session attended by the parent. The results show that γ was highly significant for both behavior outcomes, with the positive estimate indicating that the intervention improved the child's behaviors and reduced the risk for future mental disorder and substance abuse. With the SFRM in (25), the causal treatment effect is given by γui. For example, if the parent of the child attended all 15 planned sessions, then ui = 15 and the causal effect is γui = 0.25 per month on the scale of the AthAcc outcome. Thus, at 18 months post-baseline, for instance, the intervention will on average increase the child's AthAcc outcome by 4.32 points.
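The arithmetic behind this interpretation can be traced with the rounded AthAcc estimate from Table 5. The sketch below assumes that t in git = γuit is measured in months, as the per-month interpretation of γ suggests; the variable names are hypothetical.

```python
# Values taken from Table 5 (AthAcc, causal effect) and the text.
gamma_hat = 0.0167  # estimated change per month per session attended
sessions = 15       # parent attended all planned sessions (u_i = 15)
months = 18         # time since baseline

effect_per_month = gamma_hat * sessions          # gamma * u_i
effect_at_18_months = effect_per_month * months  # gamma * u_i * t
print(round(effect_per_month, 4), round(effect_at_18_months, 3))
```

The per-month effect reproduces the 0.25 quoted in the text. The direct product of the rounded values at 18 months is roughly 4.5, slightly above the 4.32 quoted in the text, so the quoted figure presumably rests on unrounded estimates or on details not shown here.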

Table 5.

Estimation results for the treatment time effect (γ), Child Resilience complete dataset.

           Causal Effect                    ITT Effect
Outcome    Est.      S.E.      p-value      Est.      S.E.      p-value
AthAcc     0.0167    0.0014    <0.0001      0.0053    0.0069    0.2235
PIntD      0.1640    0.0163    <0.0001      0.0476    0.0663    0.2365

For comparison purposes, we also performed the intent-to-treat (ITT) analysis for the two behavior outcomes by setting ui = 1 in git of the SFRM in (25). The estimated γ, standard errors (S.E.) and p-values (p-value) are shown in Table 5 under the column “ITT Effect”. As seen, γ was not significant for either outcome. Thus, parent support played a significant role in improving the two child behavior outcomes in this two-layered intervention study.

6 Discussion

We developed an approach to address treatment noncompliance in multi-layered intervention studies. This approach extends the structural mean model (SMM) to multi-layered intervention and longitudinal data settings. We selected the SMM to develop our approach because of the need to model treatment noncompliance on a continuous scale. Competing approaches such as the Principal Stratification method characterize variability in treatment noncompliance using categorical outcomes. However, within the context of a multi-layered intervention study, such methods yield a large number of noncompliance categories, limiting their applications. For example, if a 4-level categorical outcome is used to characterize treatment noncompliance for each layer of a 2-layered intervention, we will need a 16-level categorical outcome to capture treatment noncompliance when considering interactions of noncompliance patterns between the two intervention layers. A large number of levels may cause problems for fitting models if there are only a limited number of subjects in one or more strata (defined by the levels of the categorical outcome). With the freedom to choose a continuous or categorical noncompliance measure, as in the SMM and the proposed SFRM, we can consider between-layer interactions in a more parsimonious and reliable fashion.
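The growth in the number of joint noncompliance categories noted above is simply the product of the numbers of levels across layers. A small sketch makes the count explicit; the helper name `joint_strata` and the three-layer case are hypothetical additions for illustration.

```python
from itertools import product

def joint_strata(levels_per_layer):
    """Enumerate joint noncompliance categories across intervention layers."""
    return list(product(*(range(k) for k in levels_per_layer)))

two_layers = joint_strata([4, 4])       # 4 levels in each of 2 layers
three_layers = joint_strata([4, 4, 4])  # hypothetical third layer
print(len(two_layers), len(three_layers))
```

With a fixed sample size, each additional layer divides the subjects among four times as many strata, which is why sparse strata quickly become a concern for categorical approaches.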

We also adopted the distribution-free framework of the SMM for inference with our proposed model. Using the framework of the FRM, we are able to provide robust inference about model parameters, as in the SMM, while accommodating noncompliance from multiple intervention layers as well as missing data under MAR. Our simulation studies show that the proposed approach performs quite well even for a sample size as small as 50 (for the combined intervention and control groups). In addition, the application of the proposed model to the Rochester Resilience Project demonstrates the importance of considering treatment noncompliance from the supportive parent layer in this two-layered intervention study.

References

1. Angrist J, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables (with discussion). Journal of the American Statistical Association. 1996;91:444–472.
2. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. doi: 10.1111/j.0006-341x.2002.00021.x.
3. Robins JM. Correcting for noncompliance in randomized trials using structural nested mean models. Communications in Statistics. 1994;23:2379–2412.
4. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. 1974;66:688–701.
5. Chen R, Chen T, Lu N, Zhang H, Wu P, Feng C, Tu XM. Extending the Mann-Whitney-Wilcoxon rank sum test to longitudinal data analysis with covariates. Applied Statistics. in press.
6. Wu P, Han Y, Chen T, Tu XM. Causal inference for Mann-Whitney-Wilcoxon rank sum and other nonparametric statistics. Statistics in Medicine. 2014;33(8):1261–1271. doi: 10.1002/sim.6026.
7. El-Sayed AM, Scarborough P, Seemann L, Galea S. Social network analysis and agent based modeling in social epidemiology. Epidemiologic Perspectives and Innovations. 2012;9:1–9. doi: 10.1186/1742-5573-9-1.
8. Lu N, White AM, Wu P, He H, Hu J, Feng C, Tu XM. Social Network Endogeneity and Its Implications for Statistical and Causal Inferences. In: Lu N, White AM, Tu XM, editors. Social Networking: Recent Trends, Emerging Issues and Future Outlook. New York: Nova Science; 2013.
9. Yu Q, Tang W, Kowalski J, Tu XM. Multivariate U-Statistics: A tutorial with applications. Wiley Interdisciplinary Reviews – Computational Statistics. 2011;3:457–471.
10. Kowalski J, Powell J. Nonparametric inference for stochastic linear hypotheses: Application to high-dimensional data. Biometrika. 2004;91(2):393–408.
11. King TS, Chinchilli VM. A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine. 2001;20:2131–2147. doi: 10.1002/sim.845.
12. Kowalski J, Tu XM. Modern Applied U Statistics. New York: Wiley; 2007.
13. Tu XM, Feng C, Kowalski J, Tang W, Wang H, Wan C, Ma Y. Correlation analysis for longitudinal data: Applications to HIV and psychosocial research. Statistics in Medicine. 2007;26:4116–4138. doi: 10.1002/sim.2857.
14. Ma Y, Tang W, Feng C, Tu XM. Inference for Kappas for longitudinal study data: Applications to sexual health research. Biometrics. 2008;64:781–789. doi: 10.1111/j.1541-0420.2007.00934.x.
15. Ma Y, Tang W, Yu Q, Tu XM. Modeling concordance correlation coefficient for longitudinal study data. Psychometrika. 2010;75:99–119.
16. Ma Y, Alejandro GD, Hui Z, Tu XM. A U-statistics based approach for modeling Cronbach Coefficient Alpha within a longitudinal data setting. Statistics in Medicine. 2011;29(6):659–670. doi: 10.1002/sim.3853.
17. Lu N, Chen T, Wu P, Gunzler D, Zhang H, He H, Tu XM. Functional response models for intraclass correlation coefficients. Applied Statistics. in press.
18. Yu Q, Chen R, Tang W, He H, Gallop R, Crits-Christoph P, Hu J, Tu XM. Distribution-free models for longitudinal count responses with over-dispersion and structural zeros. Statistics in Medicine. 2013;32:2390–2405. doi: 10.1002/sim.5691.
19. Gunzler D, Tang W, Lu N, Wu P, Tu XM. A class of distribution-free models for longitudinal mediation analysis. Psychometrika. doi: 10.1007/s11336-013-9355-z. in press.
20. Rubin DB. Bayesian inference for causal effects: The Role of Randomization. Annals of Statistics. 1978;6:34–58.
21. Fischer K, Goetghebeur E. Structural mean effects of noncompliance. Journal of the American Statistical Association. 2004;99(468):918–928.
22. Gunzler D, Tang W, Lu N, Wu P, Tu XM. A class of distribution-free models for longitudinal mediation analysis. Psychometrika. doi: 10.1007/s11336-013-9355-z. in press.
23. Pepe MS, Anderson GL. A Cautionary Note on Inference for Marginal Regression Models with Longitudinal Data and General Correlated Response Data. Communication in Statistics-Simulation. 1994;23:939–951.
24. Fitzmaurice GM. A caveat concerning independence estimating equations with multiple multivariate binary data. Biometrics. 1995;51:309–317.
25. Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association. 1995;90:106–121.
26. Lu N, Tang W, He H, Yu Q, Crits-Christoph P, Zhang H, Tu XM. On the impact of parametric assumptions and robust alternatives for longitudinal data analysis. Biometrical Journal. 2009;51:627–643. doi: 10.1002/bimj.200800186.
27. Wu P, Tu XM, Kowalski J. On Assessing Model Fit for Distribution-Free Longitudinal Models under Missing Data. Statistics in Medicine. 2014;33(1):143–157. doi: 10.1002/sim.5908.
28. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2010. ISBN 3-900051-07-0, URL http://www.R-project.org.
29. Nelsen RB. An introduction to Copulas. New York: Springer; 2006.
30. Zhang H, Lu N, Feng C, Thurston SW, Xia Y, Tu XM. On Fitting Generalized Linear Mixed-effects Models for Binary Responses using Different Statistical Packages. Statistics in Medicine. 2011;30:2562–2572. doi: 10.1002/sim.4265.
31. Meadows G, Burgess P, Fossey E, Harvey C. Perceived need for mental health care, findings from the Australian National Survey of Mental Health and Wellbeing. Psychological Medicine. 2000;30:645–656. doi: 10.1017/s003329179900207x.
32. Valla JP, Bergeron L, Smolla N. The Dominic-R: A pictorial interview for 6- to 11-year old children. Journal of the American Academy of Child and Adolescent Psychiatry. 2000;39:85–93. doi: 10.1097/00004583-200001000-00020.
