2024 Dec 4;80(4):ujae135. doi: 10.1093/biomtc/ujae135

Estimating marginal treatment effect in cluster randomized trials with multi-level missing outcomes

Chia-Rui Chang 1, Rui Wang 2,3
PMCID: PMC11629964  PMID: 39656746

ABSTRACT

Analyses of cluster randomized trials (CRTs) can be complicated by informative missing outcome data. Methods such as inverse probability weighted generalized estimating equations have been proposed to account for informative missingness by weighting the observed individual outcome data in each cluster. These existing methods have focused on settings where missingness occurs at the individual level and each cluster has partially or fully observed individual outcomes. In the presence of missing clusters, for example, when all outcomes from a cluster are missing due to drop-out of the cluster, these approaches ignore this cluster-level missingness and can lead to biased inference if the cluster-level missingness is informative. Informative missingness at multiple levels can also occur in CRTs with a multi-level structure where study participants are nested in subclusters such as healthcare providers, and the subclusters are nested in clusters such as clinics. In this paper, we propose new estimators for estimating the marginal treatment effect in CRTs accounting for missing outcome data at multiple levels based on weighted generalized estimating equations. We show that the proposed multi-level multiply robust estimator is consistent and asymptotically normally distributed provided that one of the multiple propensity score models postulated at each clustering level is correctly specified. We evaluate the performance of the proposed method through extensive simulations and illustrate its use with a CRT evaluating a malaria risk-reduction intervention in rural Madagascar.

Keywords: cluster randomized trials, expectation-maximization (EM) algorithm, generalized estimating equation (GEE), inverse probability weighting (IPW), multi-level missing data, multiply robust, propensity score

1. INTRODUCTION

Cluster-randomized trials (CRTs), with groups of individuals as randomization units, are commonly used in biomedical research for intervention evaluation (Hayes and Moulton, 2017). Because outcomes from individuals within the same cluster are likely to be correlated, analysis of CRTs must account for this within-cluster dependence. The generalized estimating equation (GEE) approach has often been adopted to estimate the marginal treatment effect in CRTs (Liang and Zeger, 1986). Compared to mixed effects models, GEE targets the population marginal effect parameter and requires fewer parametric assumptions on the outcome distribution (Hubbard et al., 2010). It renders valid inference provided that the mean model is correctly specified and is robust to misspecification of the correlation structure. However, in the presence of informative missing outcome data, the GEE estimator based on the complete data may be biased (Hossain et al., 2017a, b).

Here, we consider the setting where outcome missingness is independent of unobserved and observed outcomes, conditional on baseline covariates and exposure. Such a missingness mechanism has been termed “covariate-dependent missingness (CDM)” (Hossain et al., 2017a, b) or “restricted missing at random” (Prague et al., 2016). Two common approaches to address CDM data include multilevel multiple imputation (Schafer and Yucel, 2002; Diaz-Ordaz et al., 2016; Hossain et al., 2017a, b) and inverse probability weighting (IPW) (Robins et al., 1995). We adopt the IPW framework, which avoids the need to correctly specify the joint distribution of the clustered outcomes.

IPW methods accounting for missing outcomes in CRTs have been proposed to handle the setting where missingness occurs at the individual level and each cluster has partially or fully observed individual outcomes (Prague et al., 2016). In the presence of missing clusters, for example, when all outcomes from a cluster are missing due to drop-out of the cluster, these approaches ignore this cluster-level missingness and can lead to biased inference if the cluster-level missingness is informative (Giraudeau and Ravaud, 2009).

Missing clusters are not uncommon in CRTs. A systematic review of CRTs reported that 31% of 86 trials had missing clusters (Fiero et al., 2016). Multi-level missingness can also occur in CRTs with a multi-level structure. For example, in a study to evaluate whether proactive community case management (pro-CCM) is effective in reducing malaria burden in a rural endemic area of Madagascar, 22 fokontanies (smallest administrative units) were randomized to pro-CCM or conventional integrated community case management (iCCM) (Ratovoson et al., 2022). The study participants were nested in households, which were nested in each fokontany. About 24% of the study participants and 22% of the households were lost to follow-up due to moving away, absence, death, or refusal to participate.

In this paper, we develop new estimators for estimating the marginal treatment effect in CRTs with multi-level missing outcomes based on the GEE framework. We derive the multi-level weights to account for informative missingness and incorporate a multiply robust estimation approach (Han, 2014), which allows analysts to specify multiple propensity score (PS) models at each of the clustering levels. We note that the term “multiply robust” has been used in other contexts (Tchetgen Tchetgen and Shpitser, 2012). The proposed multi-level multiply robust GEE (MMR-GEE) estimator enjoys the multiple robustness property in the sense that the target parameter will be consistently estimated as long as one of the multiple PS models postulated at each clustering level is correctly specified.

The remainder of the article is organized as follows. Sections 2.1 and 2.2 introduce the notation, setting, and assumptions. Sections 2.3 and 2.4 present the proposed MMR-GEE estimator. Section 2.5 establishes the theoretical properties of the MMR-GEE estimator. Section 2.6 addresses the misclassification issue of the observed missingness indicator at the cluster level. For notational simplicity, we anchor our presentation around 2-level CRTs, where informative missingness may occur both at the cluster and at the individual level in Sections 2.1-2.6. In Section 2.7, we present an extension to 3-level CRTs where informative missingness can occur both at the subcluster and at the individual level. In Section 3, we illustrate the use of the proposed methods with the Pro-CCM study (Ratovoson et al., 2022). Results from extensive simulation based on the Pro-CCM study are reported in Section 4. The paper is concluded with practical considerations and discussions in Section 5.

2. METHODS

2.1. Notation and models

We consider a 2-arm parallel CRT where study participants are followed over time with the outcome Inline graphic, a vector of Inline graphic cluster-level baseline covariates Inline graphic, and a vector of Inline graphic individual-level baseline covariates Inline graphic for participant Inline graphic in cluster Inline graphic. Let Inline graphic be the binary treatment indicator for cluster Inline graphic (treated Inline graphic and control Inline graphic); the treatment assignment probability is known and given by Inline graphic. The vector of cluster-level covariates Inline graphic and matrix of individual-level covariates Inline graphic are assumed to be fully observed before randomization. Note that Inline graphic can include cluster-specific information (e.g., cluster location or size) as well as summary statistics of individual-level covariates (e.g., average age or the proportion of male/female within a cluster). Here, we use Inline graphic to denote the vector of individual-level missingness indicators and Inline graphic to denote the cluster-level missingness indicator for outcomes Inline graphic. Inline graphic when Inline graphic is observed and Inline graphic when Inline graphic is missing. Inline graphic if the cluster Inline graphic drops out of the study so that no individual outcomes in that cluster can be observed, and Inline graphic otherwise.

Let Inline graphic be the number of observed clusters, and Inline graphic be the number of observed outcomes in cluster Inline graphic. Without loss of generality, let Inline graphic be the indexes for observed clusters and Inline graphic be the indexes for missing clusters; for participants in observed cluster Inline graphic, let Inline graphic be the indexes of participants whose outcomes are observed and Inline graphic be the indexes of participants whose outcomes are missing. See Table 1 for an illustration of the data structure under the multi-level missingness setting.

TABLE 1.

Data structure of multi-level missingness in cluster randomized trials, including outcome Inline graphic, cluster-level missingness indicator Inline graphic (Inline graphic if the cluster is observed and Inline graphic if the cluster is missing), and individual-level missingness indicator Inline graphic (Inline graphic if Inline graphic is observed and Inline graphic if Inline graphic is missing).

Cluster | Unit | Outcome | Individual-level missingness indicator | Cluster-level missingness indicator
1 | 1 | Inline graphic | 1 | 1
1 | Inline graphic | Inline graphic | 1 | 1
1 | Inline graphic | Inline graphic | 1 | 1
1 | Inline graphic | Inline graphic | 0 | 1
1 | Inline graphic | Inline graphic | 0 | 1
1 | Inline graphic | Inline graphic | 0 | 1
Inline graphic | Inline graphic | Inline graphic | Inline graphic | 1
s | 1 | Inline graphic | 1 | 1
s | Inline graphic | Inline graphic | 1 | 1
s | Inline graphic | Inline graphic | 1 | 1
s | Inline graphic | Inline graphic | 0 | 1
s | Inline graphic | Inline graphic | 0 | 1
s | Inline graphic | Inline graphic | 0 | 1
s+1 | 1 | Inline graphic | 0 | 0
s+1 | Inline graphic | Inline graphic | 0 | 0
s+1 | Inline graphic | Inline graphic | 0 | 0
Inline graphic | Inline graphic | Inline graphic | 0 | 0
M | 1 | Inline graphic | 0 | 0
M | Inline graphic | Inline graphic | 0 | 0
M | Inline graphic | Inline graphic | 0 | 0

Our primary interest lies in the estimation of and inference about the parameters in the marginal mean model Inline graphic with link function Inline graphic, where Inline graphic targets the marginal treatment effect. When there is no missing data, an estimator of Inline graphic can be obtained by solving the following estimating equation (Liang and Zeger, 1986):

2.1. (1)

Inline graphic is the design matrix with Inline graphic. Inline graphic is the covariance matrix with Inline graphic. Inline graphic is the working correlation matrix indexed by non-diagonal elements Inline graphic.
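
For reference, the complete-data estimating equation of Liang and Zeger (1986) is usually written in the form below. The symbol names are assumptions chosen for illustration ($M$ clusters, outcome vector $\mathbf{Y}_i$, mean vector $\boldsymbol{\mu}_i$, design matrix $\mathbf{D}_i$, covariance matrix $\mathbf{V}_i$, diagonal matrix of variance functions $\mathbf{U}_i$, working correlation $\mathbf{C}_i(\boldsymbol{\alpha})$, dispersion $\phi$) and may differ from the notation of Equation (1).

```latex
\sum_{i=1}^{M} \mathbf{D}_i^{\top}\,\mathbf{V}_i^{-1}\,
  \bigl\{\mathbf{Y}_i - \boldsymbol{\mu}_i(\boldsymbol{\beta})\bigr\} = \mathbf{0},
\qquad
\mathbf{D}_i = \frac{\partial \boldsymbol{\mu}_i}{\partial \boldsymbol{\beta}^{\top}},
\qquad
\mathbf{V}_i = \phi\,\mathbf{U}_i^{1/2}\,\mathbf{C}_i(\boldsymbol{\alpha})\,\mathbf{U}_i^{1/2}.
```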

When outcomes are missing under CDM, fitting Model (1) with complete data could lead to biased inference for Inline graphic (Prague et al., 2016). Provided that all clusters are observed, that is, Inline graphic for all Inline graphic, one can attempt to correct the bias through IPW-GEE (Robins et al., 1995):

2.1. (2)

Model (2) recovers population moments by reweighting the complete data according to the weighting matrix Inline graphic. The conditional probability that Inline graphic is observed, also called the PS, is denoted by Inline graphic. In practice, the true PS is unknown. One can postulate a logistic regression model that regresses the missingness indicator on the treatment indicator and baseline covariates. A consistent and asymptotically normal estimator of Inline graphic can be obtained when the PS model is correctly specified for the CDM missingness mechanism.
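
The reweighting idea can be illustrated with a minimal sketch. It assumes a special case not spelled out in the text: a logit link, a marginal mean model containing only an intercept and the treatment indicator, and an independence working correlation. In that case the weighted estimating equation is solved by the inverse-probability-weighted outcome proportion in each arm. The function name and record layout are hypothetical.

```python
import math


def ipw_marginal_log_odds_ratio(records):
    """Inverse-probability-weighted marginal log odds ratio.

    records: iterable of (treatment, outcome, observed, propensity) where
      treatment  is 0 or 1 (assigned at the cluster level),
      outcome    is 0 or 1 (ignored when observed == 0),
      observed   is 1 if the outcome was observed and 0 if it is missing,
      propensity is the estimated probability that the outcome is observed.
    """
    weighted_events = [0.0, 0.0]
    weighted_totals = [0.0, 0.0]
    for treatment, outcome, observed, propensity in records:
        if observed == 1:
            weight = 1.0 / propensity  # each observed unit stands in for 1/PS units
            weighted_events[treatment] += weight * outcome
            weighted_totals[treatment] += weight
    p0 = weighted_events[0] / weighted_totals[0]
    p1 = weighted_events[1] / weighted_totals[1]
    return math.log(p1 / (1.0 - p1)) - math.log(p0 / (1.0 - p0))


# Toy data (illustrative only): observed units with a low propensity count more.
toy = [
    (1, 1, 1, 0.5), (1, 0, 1, 1.0), (1, 1, 0, 0.5),
    (0, 1, 1, 1.0), (0, 0, 1, 0.5), (0, 0, 0, 0.5),
]
toy_estimate = ipw_marginal_log_odds_ratio(toy)
```

With covariates in the mean model or a non-independence working correlation, the weighted equation has no closed form and would be solved iteratively.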

2.2. Multi-level missingness processes and assumptions

We consider the following multi-level missingness processes: Clusters drop out or withdraw from the study after randomization and before outcome data collection, where this cluster-level missingness is induced by the model Inline graphic with parameters Inline graphic. For clusters that remain throughout the study, the outcomes of individual participants may be missing, where this individual-level missingness is induced by another model Inline graphic with parameters Inline graphic. In such a setting, the IPW-GEE method ignores the cluster-level missingness process and may lead to biased estimates of Inline graphic. Throughout the paper, we make the following assumptions:

  1. Non-informative cluster size: Outcomes do not depend on cluster sizes.

  2. Multi-level CDM: The multi-level missingness processes depend on neither observed nor missing outcomes, conditional on baseline covariates and treatment:

    Inline graphic .

  3. Positivity: The probabilities of both cluster- and individual-level missingness are bounded away from zero: Inline graphic and Inline graphic.

2.3. Multi-level IPW-GEE

We adapt weighting methods from the longitudinal drop-out setting (Robins et al., 1995; Mitani et al., 2022) to estimate Inline graphic under the multi-level CDM setting. The conditional probability of observing Inline graphic can be expressed as

2.3. (3)

where Inline graphic corresponds to the cluster-level missingness process induced by Inline graphic and Inline graphic corresponds to the individual-level missingness process induced by Inline graphic. By modifying the weighting matrix of (2), we propose the multi-level IPW-GEE (MIPW-GEE) estimator as follows:

2.3. (4)

The consistency of the MIPW-GEE estimator requires correct specification of both PS models: that is, Inline graphic and Inline graphic for some Inline graphic and Inline graphic.
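
As a minimal sketch of the multi-level weighting, the code below assumes that the probability of observing an outcome factorizes into a cluster-level PS times an individual-level PS given that the cluster remains in the study, so that each observed outcome receives the inverse of that product as its weight. The data layout and function names are hypothetical, and the PS values are taken as given rather than estimated.

```python
def multilevel_weight(cluster_observed, individual_observed,
                      ps_cluster, ps_individual):
    """Weight for one participant under multi-level missingness.

    An outcome contributes only if its cluster stayed in the study AND the
    individual outcome was observed; its weight is the inverse of the
    product of the two propensity scores.
    """
    if cluster_observed == 1 and individual_observed == 1:
        return 1.0 / (ps_cluster * ps_individual)
    return 0.0


def weights_for_trial(clusters):
    """clusters: list of dicts with keys
      'observed'   : cluster-level indicator (1 = cluster observed),
      'ps_cluster' : cluster-level propensity score,
      'members'    : list of (individual_observed, ps_individual) pairs.
    Returns one list of weights per cluster."""
    return [
        [multilevel_weight(c["observed"], r, c["ps_cluster"], ps)
         for r, ps in c["members"]]
        for c in clusters
    ]


# Toy example (illustrative values): the second cluster dropped out entirely.
trial = [
    {"observed": 1, "ps_cluster": 0.8, "members": [(1, 0.5), (0, 0.5), (1, 0.9)]},
    {"observed": 0, "ps_cluster": 0.6, "members": [(0, 0.7), (0, 0.7)]},
]
trial_weights = weights_for_trial(trial)
```

Setting every cluster-level PS to 1 recovers the single-level inverse probability weights, which is the sense in which the single-level approach ignores cluster-level missingness.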

2.4. Multi-level multiply robust GEE

To protect against misspecification of the PS models, we propose a multiply robust estimator of Inline graphic, denoted by Inline graphic, based on the empirical likelihood theory (Owen, 2001; Han, 2014). The proposed MMR-GEE estimator allows analysts to specify multiple sets of PS models, and Inline graphic will be a consistent estimator for Inline graphic provided that one of the cluster-level and one of the individual-level PS models are correctly specified.

The MMR-GEE estimator can be obtained by solving estimating Equation (2) with weights replaced by the multiply robust weights:

2.4. (5)

We derive the multiply robust weights by extending the method from the independent data setting in Han and Wang (2013) to the clustered data setting where informative missingness can occur at multiple levels. Let Inline graphic denote the set of Inline graphic postulated individual-level PS models for Inline graphic and Inline graphic denote the set of Inline graphic postulated cluster-level PS models for Inline graphic, where Inline graphic and Inline graphic are vectors of parameters for the Inline graphicth and Inline graphicth models. Let Inline graphic and Inline graphic be the estimators for Inline graphic and Inline graphic, respectively. Now define Inline graphic. The multiply robust weights for individuals with observed outcomes Inline graphic, Inline graphic can be obtained from solving a constrained optimization problem:

2.4. (6)

subject to the following constraints:

2.4.

The first constraint requires that the weights are non-negative. The second constraint imposes that the weights sum up to 1. The third constraint calibrates the weights so that, for each postulated model, the weighted average over the observed (biased) sample represents the population mean. For Inline graphic, Inline graphic, now define Inline graphic, Inline graphic, and

2.4.

The constrained optimization problem (6) can be solved through the Lagrange multiplier technique, which yields

2.4. (7)

where Inline graphic is a Inline graphic vector obtained by solving the following equation:

2.4. (8)

The detailed derivation can be found in Web Appendix A. There may be multiple roots to Equation (8). We apply convex minimization from Han (2014) to obtain Inline graphic.
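
The computation of the weights can be sketched as follows, assuming the generic empirical-likelihood form used by Han (2014): with m observed units, each weight is 1 / {m (1 + rho' g_i)}, where g_i stacks the fitted probabilities of the postulated models for unit i, each centred at its average over all units, and rho minimizes the convex function -sum log(1 + rho' g_i). The exact definition of the constraint vector in this article is given in Web Appendix A, so the construction of g below, the function names, and the toy fitted values are illustrative assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def objective(rho, g):
    """Convex objective -sum log(1 + rho'g_i); infinite outside its domain."""
    total = 0.0
    for gi in g:
        t = 1.0 + dot(rho, gi)
        if t <= 0.0:
            return math.inf
        total -= math.log(t)
    return total


def centred_constraints(fitted_all, observed):
    """fitted_all[k][i]: fitted probability from postulated model k for unit i
    (all units); observed[i]: 1 if the outcome of unit i is observed."""
    n = len(observed)
    theta = [sum(col) / n for col in fitted_all]
    g = [[col[i] - theta[k] for k, col in enumerate(fitted_all)]
         for i in range(n) if observed[i] == 1]
    return g, theta


def multiply_robust_weights(g, tol=1e-12, max_iter=50):
    """Damped Newton minimization of the convex objective, then the weights."""
    m, d = len(g), len(g[0])
    rho = [0.0] * d
    for _ in range(max_iter):
        t = [1.0 + dot(rho, gi) for gi in g]
        grad = [-sum(gi[a] / ti for gi, ti in zip(g, t)) for a in range(d)]
        if max(abs(v) for v in grad) < tol:
            break
        hess = [[sum(gi[a] * gi[b] / ti ** 2 for gi, ti in zip(g, t))
                 for b in range(d)] for a in range(d)]
        step = solve_linear(hess, [-v for v in grad])
        f0, scale = objective(rho, g), 1.0
        for _ in range(60):  # step halving keeps 1 + rho'g_i positive
            trial = [r + scale * s for r, s in zip(rho, step)]
            if objective(trial, g) <= f0:
                rho = trial
                break
            scale /= 2.0
    return [1.0 / (m * (1.0 + dot(rho, gi))) for gi in g], rho


# Toy fitted probabilities from two postulated models (illustrative values);
# the first six units are observed, the last four are not.
model_1 = [0.9, 0.8, 0.6, 0.5, 0.7, 0.4, 0.5, 0.4, 0.6, 0.6]
model_2 = [0.9, 0.7, 0.95, 0.8, 0.6, 0.85, 0.7, 0.8, 0.6, 0.9]
is_observed = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
g_toy, theta_toy = centred_constraints([model_1, model_2], is_observed)
w_toy, rho_toy = multiply_robust_weights(g_toy)
```

Minimizing the convex objective, rather than root-finding directly, is one way to pick out the root that keeps every weight positive when the equation has several roots.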

2.5. Consistency and asymptotic normality of MMR-GEE

Below, we first demonstrate that the MMR-GEE estimator has the multiply robust property, that is, Inline graphic is a consistent estimator for Inline graphic when both Inline graphic and Inline graphic contain a correctly specified PS model. We then establish the asymptotic distribution of Inline graphic. We consider the case where the number of clusters grows to infinity and the cluster sizes are bounded above. For notational simplicity, all clusters have the same cluster size. The results can be generalized to varying cluster sizes by invoking the Lindeberg–Feller central limit theorem. In what follows, we use subscript asterisk to denote probability limits, Inline graphic to denote the true parameters of the PS models, and Inline graphic to denote the true parameters of the marginal mean model.

2.5.1. Multiple robustness of Inline graphic

We first show that, when both Inline graphic and Inline graphic contain a correctly specified PS model, the multiply robust weights of Equation (5) are asymptotically equivalent to the multi-level inverse probability weights of Equation (4) when the true correct models are known. This asymptotic equivalence can be established by building the connection between the multiply robust weights and another version of the empirical likelihood weights conditional on the observed sample assuming that the correct PS models are known, which we will derive below.

Without loss of generality, let Inline graphic and Inline graphic, the first model in Inline graphic and Inline graphic, respectively, be the correctly specified models. Furthermore, let Inline graphic be the empirical probability of Inline graphic conditional on Inline graphic for Inline graphic. The estimator for Inline graphic, denoted by Inline graphic, can be obtained by solving the empirical version of the constrained optimization problem (6) using the same Lagrange multipliers method as in Section 2.4. With some algebraic manipulation, Inline graphic can be expressed as (see Web Appendix B for derivation)

2.5.1. (9)

Plugging Inline graphic back to the weighting matrix of (5), we establish the relationship that

2.5.1. (10)

which are asymptotically proportional to the weights of the correctly specified MIPW-GEE estimator of Equation (4). As the number of clusters Inline graphic goes to infinity, we have

2.5.1. (11)

which proves the consistency of Inline graphic. The results are summarized below:

Theorem 1

When Inline graphic contains a correct model for Inline graphic and Inline graphic contains a correct model for Inline graphic, as Inline graphic, Inline graphic.

The proof is provided in Web Appendix B.

2.5.2. Asymptotic distribution

We derive the asymptotic distribution of Inline graphic by following the approach from Theorem 2 of Han (2014) assuming that the correct models are known. Without loss of generality, let Inline graphic and Inline graphic be the correctly specified models for Inline graphic and Inline graphic. The score functions of Inline graphic and Inline graphic, denoted by Inline graphic and Inline graphic, are

2.5.2.

Let

2.5.2.

where Inline graphic and for any matrix Inline graphic, Inline graphic. Furthermore, write Inline graphic, Inline graphic, and Inline graphic. The following theorem gives the asymptotic distribution of Inline graphic.

Theorem 2

When both Inline graphic and Inline graphic contain a correctly specified model for Inline graphic and Inline graphic, respectively, Inline graphic has an asymptotic normal distribution with mean Inline graphic and variance var(Inline graphic), where

Theorem 2

See Web Appendix C for detailed derivation and proof.

The asymptotic variance of the MMR-GEE estimator requires knowledge of the correct PS models, which is usually unavailable. Therefore, the asymptotic variance formula cannot easily be used to obtain the standard error estimates. We recommend using the non-parametric “clustered bootstrap” approach for inference. The “clustered bootstrap” approach samples Inline graphic clusters with replacement, with all individuals from the resampled clusters included in the bootstrap sample (Field and Welsh, 2007).
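
The resampling scheme can be sketched as below. The sketch assumes that the number of clusters drawn equals the number of clusters in the original sample and that the estimator is passed in as a function of the resampled clusters; the function names and the toy estimator are hypothetical.

```python
import random


def clustered_bootstrap_sample(clusters, rng):
    """clusters: dict mapping cluster id -> list of individual records.

    Draws as many clusters as there are in the original data, with
    replacement, and keeps every individual of each drawn cluster.
    Returns a list of (new_index, original_id, records) tuples so that a
    cluster drawn twice is treated as two distinct clusters.
    """
    ids = list(clusters)
    drawn = [rng.choice(ids) for _ in ids]
    return [(k, cid, list(clusters[cid])) for k, cid in enumerate(drawn)]


def bootstrap_standard_error(clusters, estimator, n_boot=200, seed=1):
    """Standard deviation of the estimator across cluster-level resamples.

    estimator: function taking a list of clusters (each a list of records)
    and returning a single number.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = clustered_bootstrap_sample(clusters, rng)
        estimates.append(estimator([records for _, _, records in sample]))
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5


# Toy usage (illustrative data): standard error of the overall outcome mean.
toy_clusters = {"a": [1, 0, 1], "b": [0, 0], "c": [1, 1, 1, 0], "d": [0, 1]}


def overall_mean(cluster_list):
    values = [y for records in cluster_list for y in records]
    return sum(values) / len(values)


toy_se = bootstrap_standard_error(toy_clusters, overall_mean)
```

In an analysis with estimated PS models, the PS models and the weights would be re-estimated inside `estimator` for every resample so that their uncertainty is reflected in the standard error.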

2.6. An EM algorithm to address misclassification of cluster-level missingness indicators

The consistency of the proposed MMR-GEE estimator requires parameters of the PS models to be consistently estimated. When no individual outcomes from a cluster are available, it is possible that outcome data from this cluster are missing by the cluster-level missingness process (that is, the true cluster-level missingness indicator Inline graphic); it is also possible that the cluster remains in the study, but all individual outcomes from this cluster are missing, especially when cluster size is small (that is, Inline graphic, but Inline graphic for all Inline graphic). Let Inline graphic denote the observed cluster-level missingness indicator. In both cases, we observe Inline graphic, but Inline graphic can be either 0 or 1. Because consistent estimation of parameters in the PS models requires knowing Inline graphic, potential misclassification can occur if one naively assigns Inline graphic. We summarize all possible patterns of Inline graphic, which include

  1. Inline graphic : When a cluster drops out after randomization and before outcome data collection, all participants’ outcomes in that cluster cannot be observed so Inline graphic is 0.

  2. Inline graphic : When the cluster does not drop out, we might still observe Inline graphic if all individual outcomes within the cluster are missing. Such a scenario is more likely to occur when cluster sizes are small.

  3. Inline graphic : When the cluster does not drop out, the observed cluster-level missingness indicator is the true cluster-level missingness indicator when at least one participant’s outcome in cluster Inline graphic is observed.

The patterns are also summarized in Table S1 in Web Appendix D.1. Under pattern (2), the observed Inline graphic misclassifies the true Inline graphic, leading to bias in the estimated parameters for the PS models. More specifically, suppose that the PS models are

2.6. (12)

where Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Because Inline graphic and Inline graphic, estimators of Inline graphic and Inline graphic based on Inline graphic may be biased even if the PS models are correctly specified. To address this potential misclassification problem, we treat Inline graphic as partially observed data and propose an EM algorithm (Dempster et al., 1977) to estimate the parameters in the PS models.

In the current setting, the “complete” data are Inline graphic, which is denoted by (Inline graphic) for simplicity of notation. The complete data log likelihood is

2.6. (13)

The conditional expectation of the Expectation step (E-step) at iteration Inline graphic given the observed data Inline graphic is

2.6. (14)

For the Maximization step (M-step), we recommend using optimization software such as the Optimr function in R (Nash, 2016; R Core Team, 2021) to maximize the complete data likelihood. By applying the E-step and M-step iteratively, the EM estimators Inline graphic and Inline graphic can be obtained after the algorithm converges. When the PS models are correctly specified, Inline graphic and Inline graphic would be consistent for the true parameters of the PS models despite misclassification of the cluster-level missingness indicators, that is, Inline graphic and Inline graphic. The detailed derivation of the complete data likelihood and E-step as well as pseudo code for the algorithm can be found in Web Appendix D.
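
The logic of the E- and M-steps can be illustrated with a deliberately simplified sketch. It assumes intercept-only PS models (one probability that a cluster stays in the study, one probability that an individual outcome is observed given that the cluster stays), in which case the M-step has a closed form and no numerical optimizer is needed. With covariates, the M-step would instead be a weighted logistic regression or a numerical maximization as described above. All names are hypothetical, and the sketch assumes at least one cluster has an observed outcome.

```python
def em_intercept_only(cluster_sizes, observed_counts, tol=1e-10, max_iter=1000):
    """EM for misclassified cluster-level indicators, intercept-only models.

    cluster_sizes[i]  : number of participants in cluster i
    observed_counts[i]: number of observed outcomes in cluster i
                        (0 means the cluster appears to be missing)
    Returns (pi_cluster, pi_individual, expected_indicators).
    """
    n_clusters = len(cluster_sizes)
    seen = [r > 0 for r in observed_counts]
    total_observed = sum(observed_counts)

    # Naive starting values that treat every apparently missing cluster
    # as a true cluster-level drop-out.
    pi_c = sum(seen) / n_clusters
    pi_r = total_observed / sum(n for n, s in zip(cluster_sizes, seen) if s)

    expected = [1.0 if s else 0.0 for s in seen]
    for _ in range(max_iter):
        # E-step: posterior probability that the cluster truly stayed in the
        # study, given that none of its outcomes were observed.
        expected = []
        for n, s in zip(cluster_sizes, seen):
            if s:
                expected.append(1.0)
            else:
                stayed_but_all_missing = pi_c * (1.0 - pi_r) ** n
                expected.append(
                    stayed_but_all_missing
                    / ((1.0 - pi_c) + stayed_but_all_missing)
                )
        # M-step: closed-form maximizers of the expected complete-data
        # log likelihood for the two intercept-only models.
        new_pi_c = sum(expected) / n_clusters
        new_pi_r = total_observed / sum(
            e * n for e, n in zip(expected, cluster_sizes)
        )
        change = max(abs(new_pi_c - pi_c), abs(new_pi_r - pi_r))
        pi_c, pi_r = new_pi_c, new_pi_r
        if change < tol:
            break
    return pi_c, pi_r, expected


# Toy data (illustrative): small clusters are the ones most likely to be
# misclassified as cluster-level drop-outs.
sizes = [1, 1, 1, 1, 2, 2, 3, 3, 5, 5]
counts = [0, 1, 0, 1, 1, 0, 2, 0, 3, 0]
pi_c_hat, pi_r_hat, posterior = em_intercept_only(sizes, counts)
```

In this toy example the naive estimate of the cluster-level probability is the fraction of clusters with at least one observed outcome, and the EM estimate can only be equal to or larger than that fraction, because each apparently missing cluster contributes its posterior probability of having stayed in the study.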

2.7. Extension to 3-level CRTs

In 3-level CRTs, study participants are nested in subclusters such as households or healthcare providers, and subclusters are nested in clusters such as regions or clinics. Below, we extend the proposed methods to address informative outcome missingness at both the subcluster and the individual levels.

Let Inline graphic be the outcome and Inline graphic be a vector of Inline graphic baseline covariates for participant Inline graphic from subcluster Inline graphic in cluster Inline graphic. Here, the baseline covariates Inline graphic can contain individual-, subcluster-, and cluster-level information and are fully observed. We consider a 2-arm parallel CRT using the same binary treatment indicator notation Inline graphic. For the multi-level missingness processes, Inline graphic denotes the vector of individual-level missingness indicators, and Inline graphic denotes the subcluster-level missingness indicator for outcomes Inline graphic. Inline graphic when Inline graphic is observed and Inline graphic when Inline graphic is missing. Inline graphic when all participants’ outcomes in subcluster Inline graphic are missing and Inline graphic otherwise. Essentially, Table 1 represents the data structure of one cluster (except with the same treatment status), and the data structure for 3-level CRTs is the concatenation of all clusters.

Under the multi-level missingness setting for 3-level CRTs, the estimating equation for MIPW-GEE and MMR-GEE can be modified as follows:

2.7. (15)

Inline graphic is the design matrix with Inline graphic. Inline graphic is the covariance matrix with Inline graphic. Under 3-level CRTs, exchangeable and block exchangeable correlation structures are common choices for the specification of Inline graphic. To account for informative missing subclusters, the multi-level weighting matrix takes the following form:

2.7. (16)

where Inline graphic is the individual-level missingness process for Inline graphic and Inline graphic is the subcluster-level missingness process for Inline graphic. The extension of the MMR-GEE estimator can be obtained by replacing the weighting matrix of Equation (15) by

2.7. (17)

The estimation of Inline graphic follows the same strategy as in Section 2.4.

3. APPLICATION

We illustrate our proposed methods using data from the Pro-CCM study (Ratovoson et al., 2022), which investigated the efficacy of the pro-CCM intervention in reducing the prevalence of malaria in the Mananjary district of Madagascar. A total of 22 clusters (i.e., fokontany) were randomized to pro-CCM (treatment) or iCCM (control). Study participants were nested in households, which were nested in each fokontany. The disease status of each participant was assessed at baseline and at endline. Here, we focus on the individual-level diagnostic test result (Inline graphic) at endline for participant Inline graphic from household Inline graphic in fokontany Inline graphic (Inline graphic if positive, 0 if negative). The dataset consists of 29,683 participants with 7 individual-level baseline covariates (male indicator Inline graphic, age Inline graphic, primary school indicator Inline graphic, secondary school indicator Inline graphic, high level school indicator Inline graphic, sleep in mosquito nets indicator Inline graphic, and sleep in the yard indicator Inline graphic) and 4 household-level baseline covariates (household size Inline graphic, % of male Inline graphic, highest education level Inline graphic, and indoor residual spraying indicator Inline graphic).

The overall missingness of the individual-level outcome at endline was 31%, corresponding to 22.3% of missing households. Results based on a mixed effects model adjusting for socio-demographic characteristics suggested no statistically significant difference in test positivity at endline between participants in the intervention and control arms (OR = 0.71; 95% CI: 0.36-1.43) (Ratovoson et al., 2022). We reanalyze this dataset using the GEE approaches to estimate the marginal treatment effect assuming outcomes have CDM. The specification of the PS models in Inline graphic and Inline graphic would ideally be based on knowledge about the underlying causal structure of the missingness processes. In the absence of such information, we apply a backward step-wise procedure based on the AIC to select covariates for the PS, yielding the following models:

3. (18)
3. (19)
3. (20)

The model fitting results are provided in Table S2 in Web Appendix E.1. We carry out the following 4 analyses: CC-GEE based on participants with observed test results at endline, IPW-GEE using Model (18) for the PS, MIPW-GEE using Models (19) and (20) for the subcluster- and individual-level PS, and MMR-GEE with 2 sets of PS models Inline graphic and Inline graphic. Inline graphic contains Model (19) and another model that includes (Inline graphic, Inline graphic, Inline graphic); Inline graphic contains Model (20) and another model that includes (Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic). The parameters in the PS models for MIPW-GEE and MMR-GEE are estimated with the EM algorithm proposed in Section 2.6.

CC-GEE yields a marginal treatment effect estimate (Inline graphic = 0.75, 95% CI: 0.38-1.50) that is similar to the original finding. Approaches that incorporate potentially informative missing outcomes lead to effect estimates slightly closer to the null (IPW-GEE Inline graphic = 0.81, 95% CI: 0.36-1.82; MIPW-GEE Inline graphic = 0.85, 95% CI: 0.28-2.41; MMR-GEE Inline graphic = 0.82, 95% CI: 0.38-1.77). Nevertheless, the conclusion remains the same as confidence intervals from all approaches include the null. The IPW-GEE estimator without explicitly modeling the subcluster-level missingness yields effect estimates similar to our proposed MIPW-GEE and MMR-GEE estimators. Such similarity suggests that subclusters may be missing completely at random. Indeed, even though 22.3% of households are missing, the estimated subcluster-level PSs are all close to 1 (i.e., mean of Inline graphic = 0.99 with range 0.96-1.00). While in this particular application all approaches lead to the same conclusions, the availability of the proposed methods permits assessment of the impact of potentially informative missingness on effect estimates under a range of assumptions about the outcome missingness mechanisms at multiple levels.

4. SIMULATION STUDIES

We conduct 2 sets of simulation studies to assess the finite-sample performance of our proposed MMR-GEE estimator. The first set, presented in this section, is structured under a 3-level clustering setting based on the Pro-CCM study, whereas the second set, presented in Web Appendix F, is structured under a 2-level clustering setting.

4.1. Simulation design and data generating processes

We consider 3 designs: the original Pro-CCM design (Org-Pro-CCM) that replicates the Pro-CCM study, as well as alternative design 1 (Alt-1) and alternative design 2 (Alt-2) under varying cluster sizes, outcome models, and missingness processes (see Table 2).

TABLE 2.

Details of the outcome models, missingness models, and proportion of missingness under the Org-Pro-CCM, Alt-1, and Alt-2 designs.

Org-Pro-CCM Alt-1 Alt-2
Outcome model
Inline graphic (−1.22, −0.34) (−0.50, 0.50) (−0.50, 0.50)
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic (−0.99, −1.32) (1.00, 0.50) (1.00, 0.50)
Inline graphic (0.29, 0.87) (1.00, 0.50) (1.00, 0.50)
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic (0.28, −0.30, 0.67, 1.39, 0.22) (−0.50, −0.10) (−0.50, −0.10)
Inline graphic (0, 0, −0.22, −0.70, −0.36) (−0.50, −0.10) (−0.50, −0.10)
Missingness models
Inline graphic (0.90, 0) (4.50, −0.50) (4.50, −0.50)
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic (0.03, 0.02) (−0.50, −0.50) (−1.00, −0.50, −0.50)
Inline graphic (−3.50, −2.00) (0, −3.50, −2.00)
Inline graphic (2.42, −0.22) (1.50, −0.50) (0, −0.50)
Inline graphic Inline graphic Inline graphic
Inline graphic (−0.07, −0.28, 0.22) 1
Inline graphic (0.04, 0.13, 0)
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic (−0.21, 0.03, 0.24) 0.2 0.2
Inline graphic 0.2 0.2
Proportion of missingness
% of missing households 0.26 0.18 0.16
% of overall missing individuals 0.36 0.31 0.36

Baseline information is resampled from the Pro-CCM study: we sample 22 clusters (i.e., fokontany) in the Org-Pro-CCM and Alt-1 designs and 50 clusters in the Alt-2 design with replacement. Within each cluster, we further sample 30 subclusters (i.e., households) with replacement, yielding a total of 660 households in the Org-Pro-CCM and Alt-1 designs and 1,500 households in the Alt-2 design. To better understand the impact of small subcluster sizes, a constraint is applied in the Alt-2 design, limiting household size to 1–3 members. Treatment assignment (Inline graphic) is simulated from a Bernoulli distribution with probability Inline graphic at the cluster level. The primary outcome Inline graphic is generated from the following model:

4.1. (21)

with nested block exchangeable correlation matrix defined by the within- and between-household intracluster correlation coefficients (ICCs), denoted as Inline graphic. Inline graphic characterizes the dependency between 2 outcomes within the same household, whereas Inline graphic characterizes the dependency between 2 outcomes from distinct households within the same fokontany. Here, we choose Inline graphic and generate the binary outcome using the SimCorMultRes package in R (Touloumis, 2016; R Core Team, 2021).

Our interest lies in estimating Inline graphic from the marginal model

4.1. (22)

which is different from the conditional treatment effect parameter Inline graphic in model (21). To obtain the true Inline graphic, we fit the (unadjusted) GEE model with an independent working correlation matrix to 20,000 full data sets. We then obtain Inline graphic by averaging over the 20,000 estimates. The outcome missingness processes are induced through the following models:
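The Monte Carlo approximation of the true marginal effect can be illustrated with a small sketch. It assumes a hypothetical full-data generator (a random-intercept logistic model with made-up parameters) rather than the nested exchangeable structure generated with SimCorMultRes in the paper. It relies on the fact that an unadjusted logistic GEE with an independence working correlation and treatment as the only covariate is a saturated mean model, so its treatment coefficient equals the log odds ratio of the pooled arm-specific outcome proportions.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate_arm_counts(rng, n_clusters=22, cluster_size=30,
                        beta0=-0.5, beta_trt=0.5, sd_cluster=1.0):
    """Hypothetical full-data generator (random-intercept logistic model).

    Returns {arm: [n_individuals, n_events]} for arms 0 and 1.
    """
    counts = {0: [0, 0], 1: [0, 0]}
    for _ in range(n_clusters):
        arm = 1 if rng.random() < 0.5 else 0   # cluster-level randomization
        p = expit(beta0 + beta_trt * arm + rng.gauss(0.0, sd_cluster))
        events = sum(1 for _ in range(cluster_size) if rng.random() < p)
        counts[arm][0] += cluster_size
        counts[arm][1] += events
    return counts

def independence_gee_log_or(counts):
    """Treatment coefficient of the unadjusted independence logistic GEE."""
    (n0, y0), (n1, y1) = counts[0], counts[1]
    if n0 == 0 or n1 == 0 or y0 in (0, n0) or y1 in (0, n1):
        return None                            # log odds ratio undefined
    return math.log(y1 / (n1 - y1)) - math.log(y0 / (n0 - y0))

def approximate_truth(n_datasets, seed=1):
    """Average the marginal estimates over many simulated full data sets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_datasets):
        est = independence_gee_log_or(simulate_arm_counts(rng))
        if est is not None:                    # skip rare degenerate data sets
            estimates.append(est)
    return sum(estimates) / len(estimates)

# The paper averages over 20,000 data sets; 1,000 keeps this sketch fast.
truth = approximate_truth(n_datasets=1000)
```

In this toy generator the averaged marginal log odds ratio is expected to be smaller in magnitude than the conditional coefficient `beta_trt`, which is why the marginal target has to be computed separately from the conditional parameter.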

4.1. (23)
4.1. (24)

The parameter values and the proportion of missingness under the 3 designs are provided in Table 2. Overall, the proportion of missing households ranges from 16% to 18%, and the proportion of overall missing individuals ranges from 31% to 36%.

4.2. Analysis approaches

To demonstrate the importance of correcting potential bias due to informative missing outcomes, we compare the following 4 approaches. First, we carry out an unweighted CC-GEE analysis based on Model (1). Second, we apply the IPW-GEE method based on Model (2), where the PS is estimated by an unconditional logistic regression model that has the same functional form as Model (24) but ignores subcluster-level missingness:

4.2. (25)

Third, we employ the MIPW-GEE method, where both the subcluster- and individual-level PS models are correctly specified. Lastly, we implement our proposed MMR-GEE estimator by specifying Inline graphic and Inline graphic. Both Inline graphic and Inline graphic contain one correctly specified model and one misspecified model. To estimate the parameters in the PS models, we fit the standard logistic regression model based on Inline graphic (referred to as MIPW-GEE-no-EM and MMR-GEE-no-EM) and also apply the EM algorithm (referred to as MIPW-GEE-EM and MMR-GEE-EM). In the Org-Pro-CCM and Alt-1 designs, the misclassification rate of Inline graphic is less than 1%, rendering the application of the EM algorithm unnecessary. For all approaches, we adopt an independent working correlation structure and use the “clustered bootstrap” method to obtain standard error estimates. All results are based on 1,000 replicates and 200 bootstrap resamples.
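A clustered bootstrap standard error can be sketched as below. The sketch assumes that whole clusters are resampled with replacement and that the standard error is the standard deviation of the re-estimated coefficients. The toy data and the simple unweighted log odds ratio estimator are hypothetical stand-ins: in the paper, the weighted GEE fit and the PS models would be re-estimated within each resample.

```python
import math
import random

def log_odds_ratio(rows):
    """rows: (treatment, outcome) pairs; None if an arm or a cell is empty."""
    n = {0: 0, 1: 0}
    y = {0: 0, 1: 0}
    for arm, outcome in rows:
        n[arm] += 1
        y[arm] += outcome
    if any(y[a] == 0 or y[a] == n[a] for a in (0, 1)):
        return None
    return math.log(y[1] / (n[1] - y[1])) - math.log(y[0] / (n[0] - y[0]))

def clustered_bootstrap_se(clusters, estimator, n_boot=200, seed=0):
    """Resample whole clusters with replacement and re-apply the estimator."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled = rng.choices(clusters, k=len(clusters))
        rows = [row for cluster in resampled for row in cluster]
        est = estimator(rows)
        if est is not None:                    # skip degenerate resamples
            estimates.append(est)
    mean = sum(estimates) / len(estimates)
    var = sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
    return math.sqrt(var)

# Hypothetical toy data: 22 clusters of 10 individuals, alternating arms.
rng = random.Random(42)
clusters = []
for c in range(22):
    arm = c % 2
    p = 0.35 + 0.10 * arm + rng.uniform(-0.1, 0.1)   # cluster-specific risk
    clusters.append([(arm, 1 if rng.random() < p else 0) for _ in range(10)])

se = clustered_bootstrap_se(clusters, log_odds_ratio)
```

Resampling at the cluster level keeps each cluster's outcomes together, so the within-cluster dependence is carried into every bootstrap data set.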

4.3. Simulation results

Table 3 summarizes the estimates of Inline graphic, empirical and bootstrap standard errors, and empirical coverage probability. In the Org-Pro-CCM design, all methods produce comparable results, exhibiting unbiased estimates of the marginal odds ratio. The percentage of 95% CIs covering the true parameter values is close to 95%. In the Alt-1 and Alt-2 designs, CC-GEE and IPW-GEE provide biased estimates of the marginal odds ratio, with biases ranging from −0.46 to −0.34 for CC-GEE and −0.37 to −0.26 for IPW-GEE, whereas the averages of estimates from MIPW-GEE and MMR-GEE are very close to the true parameter values. The averages of bootstrap standard errors closely match the empirical standard errors in general. In the Alt-2 design, the bootstrap standard errors for MIPW-GEE-EM and MMR-GEE-EM deviate slightly from their empirical counterparts, likely due to numerical instability from the EM algorithm. When Inline graphic is consistently estimated (i.e., MIPW-GEE-no-EM and MMR-GEE-no-EM for Alt-1 and MIPW-GEE-EM and MMR-GEE-EM for Alt-2), the percentage of 95% CIs that cover the true value is close to 95%. The empirical coverage associated with CC-GEE and IPW-GEE can be substantially lower than the nominal level (e.g., 57% for CC-GEE and 71% for IPW-GEE in the Alt-2 design).
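The summary measures reported for each method can be computed from the per-replicate estimates and standard errors along the following lines. This is a sketch that assumes Wald-type intervals (estimate ± 1.96 × standard error); the function name and the four toy replicates are hypothetical and are not results from the paper.

```python
import math

def summarize_replicates(estimates, std_errors, truth, z=1.96):
    """Mean estimate, empirical SE, mean estimated SE, and CI coverage."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    emp_se = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (n - 1))
    mean_se = sum(std_errors) / n
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    return {
        "mean_estimate": mean_est,
        "mean_odds_ratio": sum(math.exp(e) for e in estimates) / n,
        "empirical_se": emp_se,
        "mean_estimated_se": mean_se,
        "coverage": covered / n,
    }

# Hypothetical toy replicates (log odds ratio estimates and bootstrap SEs).
estimates = [0.30, 0.10, 0.45, 0.20]
std_errors = [0.10, 0.05, 0.10, 0.10]
summary = summarize_replicates(estimates, std_errors, truth=0.27)
```

Here three of the four toy intervals contain the assumed true value, so the coverage is 0.75. Comparing `empirical_se` with `mean_estimated_se` is the same check the text applies to the bootstrap standard errors.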

TABLE 3.

Empirical estimates of Inline graphic and its corresponding odds ratio, empirical standard errors, mean of the bootstrap-based standard error estimates obtained with the “clustered bootstrap” method, and empirical coverage probability, based on 1,000 replicates and 200 bootstrap resamples. The coverage probability is the proportion of replicates in which the 95% CI constructed from the bootstrap-based standard errors contains the true Inline graphic.

Est. S.E. and Cov. prob. are obtained with the clustered bootstrap.

Method            Est. Inline graphic   Est. OR   Emp. S.E.   Est. S.E.   Cov. prob.

Inline graphic Org-Pro-CCM
Full data         −0.24                 0.87      0.43        0.42        0.93
CC-GEE            −0.24                 0.87      0.45        0.44        0.93
IPW-GEE           −0.24                 0.87      0.45        0.44        0.93
MIPW-GEE-no-EM    −0.24                 0.87      0.45        0.44        0.93
MMR-GEE-no-EM     −0.24                 0.88      0.45        0.44        0.93

Inline graphic Alt-1
Full data         0.41                  1.57      0.28        0.27        0.94
CC-GEE            0.17                  1.23      0.28        0.28        0.84
IPW-GEE           0.23                  1.31      0.28        0.28        0.87
MIPW-GEE-no-EM    0.41                  1.57      0.29        0.28        0.94
MMR-GEE-no-EM     0.41                  1.58      0.29        0.28        0.95

Inline graphic Alt-2
Full data         0.27                  1.35      0.23        0.23        0.94
CC-GEE            −0.15                 0.89      0.27        0.25        0.57
IPW-GEE           −0.06                 0.98      0.26        0.25        0.71
MIPW-GEE-no-EM    0.12                  1.21      0.36        0.33        0.88
MIPW-GEE-EM       0.26                  1.37      0.32        0.42        0.96
MMR-GEE-EM        0.26                  1.37      0.31        0.28        0.91

Figure 1 presents the empirical distribution of the Inline graphic estimates for the Alt-2 design, with figures from the other 2 settings provided in Web Appendix E.2. The solid line denotes the true marginal effect, and the dashed line is the empirical mean across 1,000 coefficient estimates. Overall, the proposed MMR-GEE-EM and MIPW-GEE-EM estimators lead to estimates that are centered at the true Inline graphic. Because CC-GEE ignores informative missing data and IPW-GEE fails to account for subcluster-level missingness, they both result in biased estimates. MIPW-GEE-no-EM attempts to adjust for the multi-level missingness processes. However, without the EM algorithm to correct the misclassification in Inline graphic, the estimates from MIPW-GEE-no-EM are biased. In contrast, MIPW-GEE-EM appropriately accounts for the misclassification in Inline graphic, and the bias disappears.

FIGURE 1.

Empirical distribution of Inline graphic based on 1,000 replicates for the Alt-2 design. The solid line denotes the truth (0.28). The dashed line denotes the empirical mean of the estimated Inline graphic.

We further compare strategies for estimating the parameters in the correctly specified PS models with and without using the EM algorithm and present the estimated parameter values in Web Appendix E.3. Under all scenarios, Inline graphic and Inline graphic are centered at the true values, whereas the parameters estimated by the standard logistic regression model using Inline graphic can be substantially biased for the Alt-2 setting. As mentioned in Section 2.6, the misclassification in Inline graphic due to the missingness of all individual outcomes in the (sub)cluster is more likely to happen for small (sub)cluster sizes. As (sub)cluster size increases, the misclassification in Inline graphic becomes less probable because the probability of all individual outcomes within a (sub)cluster being missing, i.e., Inline graphic, would be very small. Therefore, a large (sub)cluster size obviates the need to apply the EM algorithm.
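The dependence of this probability on (sub)cluster size can be illustrated with a short calculation. It uses a simplifying assumption that is not the paper's missingness model: each individual outcome is missing independently with a common probability, here a hypothetical value of 0.3, so the probability that all outcomes in a (sub)cluster of a given size are missing is that probability raised to the size.

```python
def prob_all_missing(p_missing, size):
    """P(every outcome in the (sub)cluster is missing) under independence."""
    return p_missing ** size

# Hypothetical individual-level missingness probability of 0.3.
for size in (1, 2, 3, 10, 30):
    print(size, prob_all_missing(0.3, size))
```

Under this assumption the probability is about 0.027 for a household of 3 members and is negligible for a (sub)cluster of 30, which matches the qualitative point that misclassification mainly matters when (sub)clusters are small.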

5. DISCUSSION

Drawing upon the empirical likelihood theory, this paper proposes a new estimation procedure for the marginal treatment effect in CRTs with multi-level missing outcomes that guards against the partial misspecification of the PS models. The proposed MMR-GEE estimator allows analysts to specify multiple sets of PS models and leads to consistent treatment effect estimates provided that one of the multiple PS models specified at each clustering level is correctly specified and the parameters in the PS models are consistently estimated. We assume that outcome missingness is driven by baseline variables. If outcome missingness also depends on post-baseline variables, one could consider including post-randomization variables in the PS models (Carpenter and Kenward, 2007). The 2-stage targeted minimum loss-based estimation approach proposed by Balzer et al. (2023) considered outcome missingness driven by both baseline information and post-randomization variables. To guide the selection of the PS models, graphical models such as the missingness directed acyclic graphs (m-DAGs), which integrate information on the missingness process alongside context-specific knowledge, can be used to assess the causes of missingness (Moreno-Betancur et al., 2018; Mohan and Pearl, 2021; Lee et al., 2023). One potential strategy analysts can consider is to specify multiple potential m-DAGs (Lee et al., 2023); the PS models inferred from these m-DAGs can then be incorporated into the MMR-GEE estimator. The selection of the number of PS models for the MMR-GEE estimator involves balancing two considerations: including enough candidate models to improve the chances of correctly specifying the missingness process, and controlling the dimension of the Lagrange multipliers. As the number of PS models increases, so does the complexity of the Lagrange multipliers, which can lead to numerical challenges when solving the constrained optimization problem (Han and Wang, 2013).

When using the GEE approach to analyze clustered data through a marginal model, the correlation among cluster members is modeled to determine the weight assigned to the data from each member. Here, we assume that the cluster size is non-informative (i.e., the outcome is independent of the cluster size), so that clustering does not affect the marginal mean model specification. In such settings, the participant-average and the cluster-average treatment effect estimands coincide (Kahan et al., 2023), and the proposed weighted-GEE estimators are consistent provided that the PS models for outcome missingness are correctly specified. However, in the presence of informative cluster size, the choice of working covariance matrix may correspond to different fitted marginal models, potentially resulting in treatment effect estimators targeting different estimands (Kahan et al., 2023; Williamson et al., 2003). For example, commonly used estimators such as GEEs with an exchangeable correlation structure can be biased for both the participant-average and the cluster-average treatment effect estimands. Therefore, it is recommended to carefully consider the target estimand and the likelihood of informative cluster size when choosing an appropriate analysis method.

We create a flowchart to guide the selection of an appropriate method for handling missing outcome data at multiple levels (Figure 2). First, our proposed approach is targeted toward the multi-level CDM setting. If one believes that clusters are missing completely at random, it may be sufficient to apply the IPW-GEE method to account for informative missingness at the individual level. Second, although MMR-GEE provides the flexibility to specify multiple sets of PS models, analysts can apply the MIPW-GEE method if they have substantial knowledge about the true multi-level missingness processes. Finally, the goal of the EM algorithm is to address the challenge in estimating parameters of the PS models due to misclassification in Inline graphic, which is more likely to happen for small cluster sizes. When cluster sizes are large, the likelihood of all individual outcomes within a cluster being missing diminishes, making the misclassification in Inline graphic highly improbable. Thus, the application of the EM algorithm becomes unnecessary. When clusters contain a mixture of different sizes and the likelihood of this misclassification is uncertain, the EM algorithm is recommended. In the absence of misclassification in Inline graphic, the EM algorithm would converge very quickly, so the added computational burden is minimal.

FIGURE 2.

Recommendations for modeling and estimating the propensity score under various missingness mechanisms and scenarios.
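The recommendations in the preceding paragraph can be encoded as a small decision helper. This is only a reading of the prose, not a transcription of Figure 2; the function and argument names are hypothetical, and each argument is a judgment the analyst has to make.

```python
def recommend_method(cluster_missingness_informative,
                     true_ps_models_known,
                     cluster_sizes_large):
    """Map the three judgments discussed in the text to a suggested estimator."""
    if not cluster_missingness_informative:
        # Clusters believed to be missing completely at random.
        return "IPW-GEE"
    # Multi-level CDM setting: one PS model set per level versus several candidates.
    base = "MIPW-GEE" if true_ps_models_known else "MMR-GEE"
    # The EM algorithm is advised unless all (sub)clusters are large.
    suffix = "no-EM" if cluster_sizes_large else "EM"
    return f"{base}-{suffix}"

# Small or mixed cluster sizes and uncertain missingness models:
choice = recommend_method(cluster_missingness_informative=True,
                          true_ps_models_known=False,
                          cluster_sizes_large=False)
```

Setting `cluster_sizes_large=False` whenever the sizes are mixed or uncertain follows the text's advice that the EM algorithm adds little computational cost when there is no misclassification.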

We consider parametric logistic regression models when modeling the outcome missingness processes. Machine learning methods such as classification and regression trees could also be used (Lee et al., 2010). Augmented IPW-GEEs, which combine a PS model and a covariate-conditional mean outcome model, have been developed for handling missing outcome data in CRTs (Prague et al., 2016). The inclusion of an outcome model adds an extra layer of protection against model misspecification, and can also enhance the estimation efficiency if correctly specified. Extension of the MMR-GEE estimator to incorporate outcome models in the multi-level missing outcome settings requires further investigation. Lastly, we assume the missingness mechanism to be multi-level CDM. Developing sensitivity analysis methods to evaluate the impact of violating missing data assumptions would be useful.

Supplementary Material

Web Appendices, Tables, and Figures referenced in Sections 2.4, 2.5, 2.6, 3, and 4, and source code are available with this paper at the Biometrics website on Oxford Academic. Software in the form of R code is available at: https://github.com/JerryChiaRuiChang/MMR-GEE.

ACKNOWLEDGMENTS

We thank the editor, associate editor, and two reviewers for their helpful feedback and comments.

Contributor Information

Chia-Rui Chang, Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA 02115, United States.

Rui Wang, Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA 02115, United States; Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA 02215, United States.

FUNDING

Research in this article was in part supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health (NIH) R01 AI136947.

CONFLICT OF INTEREST

None declared.

DATA AVAILABILITY

The data that support the findings in this paper are openly available at https://doi.org/10.7910/DVN/IIDE2B.

References

  1. Balzer L. B., van der Laan M., Ayieko J., Kamya M., Chamie G., Schwab J. et al. (2023). Two-stage TMLE to reduce bias and improve efficiency in cluster randomized trials. Biostatistics, 24, 502–517.
  2. Carpenter J. R., Kenward M. G. (2007). Missing data in randomised controlled trials: a practical guide. Health Technology Assessment Methodology Programme, Birmingham, p. 199. https://researchonline.lshtm.ac.uk/id/eprint/4018500.
  3. Dempster A. P., Laird N. M., Rubin D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39, 1–22.
  4. Diaz-Ordaz K., Kenward M., Gomes M., Grieve R. (2016). Multiple imputation methods for bivariate outcomes in cluster randomised trials. Statistics in Medicine, 35, 3482–3496.
  5. Field C. A., Welsh A. H. (2007). Bootstrapping clustered data. Journal of the Royal Statistical Society: Series B, 69, 369–390.
  6. Fiero M. H., Huang S., Oren E., Bell M. L. (2016). Statistical analysis and handling of missing data in cluster randomized trials: a systematic review. Trials, 17, 72.
  7. Giraudeau B., Ravaud P. (2009). Preventing bias in cluster randomised trials. PLoS Medicine, 6, e1000065.
  8. Han P. (2014). Multiply robust estimation in regression analysis with missing data. Journal of the American Statistical Association, 109, 1159–1173.
  9. Han P., Wang L. (2013). Estimation with missing data: beyond double robustness. Biometrika, 100, 417–430.
  10. Hayes R. J., Moulton L. H. (2017). Cluster Randomised Trials. New York: Chapman and Hall/CRC.
  11. Hossain A., Diaz-Ordaz K., Bartlett J. W. (2017a). Missing binary outcomes under covariate-dependent missingness in cluster randomised trials. Statistics in Medicine, 36, 3092–3109.
  12. Hossain A., Diaz-Ordaz K., Bartlett J. W. (2017b). Missing continuous outcomes under covariate dependent missingness in cluster randomised trials. Statistical Methods in Medical Research, 26, 1543–1562.
  13. Hubbard A. E., Ahern J., Fleischer N. L., Van der Laan M., Satariano S. A., Jewell N. et al. (2010). To GEE or not to GEE: comparing population average and mixed models for estimating the associations between neighborhood risk factors and health. Epidemiology, 21, 467–474.
  14. Kahan B. C., Li F., Copas A. J., Harhay M. O. (2023). Estimands in cluster-randomized trials: choosing analyses that answer the right question. International Journal of Epidemiology, 52, 107–118.
  15. Lee B. K., Lessler J., Stuart E. A. (2010). Improving propensity score weighting using machine learning. Statistics in Medicine, 29, 337–346.
  16. Lee K. J., Carlin J. B., Simpson J. A., Moreno-Betancur M. (2023). Assumptions and analysis planning in studies with missing data in multiple variables: moving beyond the MCAR/MAR/MNAR classification. International Journal of Epidemiology, 52, 1268–1275.
  17. Liang K.-Y., Zeger S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.
  18. Mitani A. A., Kaye E. K., Nelson K. P. (2022). Accounting for drop-out using inverse probability censoring weights in longitudinal clustered data with informative cluster size. The Annals of Applied Statistics, 16, 596–611.
  19. Mohan K., Pearl J. (2021). Graphical models for processing missing data. Journal of the American Statistical Association, 116, 1023–1037.
  20. Moreno-Betancur M., Lee K. J., Leacy F. P., White I. R., Simpson J. A., Carlin J. B. (2018). Canonical causal diagrams to guide the treatment of missing data in epidemiologic studies. American Journal of Epidemiology, 187, 2705–2715.
  21. Nash J. C. (2016). optimr: A Replacement and Extension of the ‘optim’ Function. http://cran.r-project.org/package=optimr.
  22. Owen A. B. (2001). Empirical Likelihood. New York: Chapman and Hall/CRC.
  23. Prague M., Wang R., Stephens A., Tchetgen Tchetgen E., DeGruttola V. (2016). Accounting for interactions and complex inter-subject dependency in estimating treatment effect in cluster-randomized trials with missing outcomes. Biometrics, 72, 1066–1077.
  24. R Core Team (2021). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
  25. Ratovoson R., Garchitorena A., Kassie D., Ravelonarivo J. A., Andrianaranjaka V., Razanatsiorimalala S. et al. (2022). Proactive community case management decreased malaria prevalence in rural Madagascar: results from a cluster randomized trial. BMC Medicine, 20, 1–15.
  26. Robins J. M., Rotnitzky A., Zhao L. P. (1995). Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association, 90, 106–121.
  27. Schafer J. L., Yucel R. M. (2002). Computational strategies for multivariate linear mixed-effects models with missing values. Journal of Computational and Graphical Statistics, 11, 437–457.
  28. Tchetgen Tchetgen E. J., Shpitser I. (2012). Semiparametric theory for causal mediation analysis: efficiency bounds, multiple robustness, and sensitivity analysis. Annals of Statistics, 40, 1816.
  29. Touloumis A. (2016). Simulating correlated binary and multinomial responses under marginal model specification: the SimCorMultRes Package. The R Journal, 8, 79.
  30. Williamson J. M., Datta S., Satten G. A. (2003). Marginal analyses of clustered data when cluster size is informative. Biometrics, 59, 36–42.

Articles from Biometrics are provided here courtesy of Oxford University Press
